Technical Information

Marsden Online Archive Platform

The Marsden Online Archive was created using:

  • Fedora Commons: provides the backend repository to store and manage all of the unique digital material.
  • Islandora: provides functionality via modules to deal with different content types.
  • Drupal: provides the front-end access layer for the site.
  • Solr: provides the search functionality, including being able to ‘Refine’ your search.

Amount of Material

Marsden’s letters and journals, as well as the papers of other early NZ missionaries, have been made available on the Marsden Online Archive.

The amount of material currently available is detailed in the table below.

          Items   Pages
Letters     590   2,607
Journals      9   1,121
Total       599   3,728

Metadata Standards

Three metadata standards were selected for the Marsden Online Archive because they support digital preservation, are appropriate for the format of the source documents, and are interoperable.

Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)

OAI-PMH provides structured metadata for every item in the Marsden Online Archive. You can use it to harvest our metadata, which has already been crosswalked from Metadata Object Description Schema (MODS) to Dublin Core (DC). For more information on OAI-PMH, see the Open Archives website.
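As a rough illustration of harvesting, the sketch below builds a standard OAI-PMH ListRecords request for Dublin Core (the oai_dc metadata prefix) and reads identifiers and titles out of a response. The base URL, the record identifier and the sample response are placeholders invented for this example, not the archive's real endpoint or data; only the verb, the metadataPrefix parameter and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint: the archive's real OAI-PMH base URL is not given here.
BASE_URL = "https://example.org/oai2"

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        results.append((identifier, title))
    return results


# Invented sample response, shaped like a minimal ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:MS_0054_001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample letter title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(parse_titles(SAMPLE_RESPONSE))
```

A real harvest would fetch the URL, then follow any resumptionToken in the response to page through the full record set; that step is left out here to keep the sketch short.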

Text Encoding Initiative Schema

The Text Encoding Initiative (TEI) Schema describes how TEI has been used in marking up the transcripts in the Marsden Online Archive. The transcripts have been marked up for:

  • Ship: <tei:shipName>
  • Date: <tei:date>
  • Person: <tei:persName>
  • Place: <tei:geogName>
  • Alternative Spelling: <tei:corr>
  • Corrected Words: <tei:sic>
  • Terms: <tei:term>
  • Cross Out: <tei:del>
  • Underline: <tei:hi rend="underline">
  • Paragraph Marks: <tei:p>
  • Page Marks: <tei:page>

A TEI XML file for each letter and journal is available in the 'Downloads' section, along with the other versions of the transcripts and the images. For more information on TEI, see the Text Encoding Initiative website.
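To show how the mark-up listed above might be used once a TEI file is downloaded, here is a minimal sketch that collects the ships, people, places and dates tagged in a transcript. It assumes the tei: prefix is bound to the standard TEI namespace (http://www.tei-c.org/ns/1.0). The sample sentence is made up for illustration and is not quoted from the archive, and the function name extract_entities is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the archive's files use the standard TEI namespace.
TEI_NAMESPACE = "http://www.tei-c.org/ns/1.0"

# Invented sample paragraph using the element names listed above.
SAMPLE_TEI = """<tei:p xmlns:tei="http://www.tei-c.org/ns/1.0">Sailed on the
<tei:shipName>Active</tei:shipName> with <tei:persName>Ruatara</tei:persName>
for the <tei:geogName>Bay of Islands</tei:geogName>.</tei:p>"""


def extract_entities(tei_xml, tags=("shipName", "persName", "geogName", "date")):
    """Return a dict mapping each tag name to the text of its occurrences."""
    root = ET.fromstring(tei_xml)
    found = {tag: [] for tag in tags}
    for tag in tags:
        # ElementTree addresses namespaced elements as "{namespace}localname".
        for element in root.iter("{%s}%s" % (TEI_NAMESPACE, tag)):
            # Join nested text and collapse runs of whitespace.
            text = " ".join("".join(element.itertext()).split())
            found[tag].append(text)
    return found


if __name__ == "__main__":
    for tag, values in extract_entities(SAMPLE_TEI).items():
        print(tag, values)
```

For a downloaded file, the same function could be given the file's contents as a string; if the real files declare a different namespace or no prefix, TEI_NAMESPACE would need adjusting.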

‘Clean’ .txt Data

The Marsden Online Archive also provides a number of clean data sets that you can use. ‘Clean’ means that all of the metadata, page numbers and TEI mark-up have been removed, so each .txt file contains only the transcript text.

Note: These text files are organised by their Hocken Reference Number (e.g., MS_0054_001).
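If you would rather derive plain text yourself from a TEI file, the sketch below shows one simple way to strip mark-up. It is not the process used to produce the archive's clean files, which is not described here. The sample input is invented and the function name strip_markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample paragraph; not quoted from the archive.
SAMPLE_TEI = (
    '<tei:p xmlns:tei="http://www.tei-c.org/ns/1.0">Arrived '
    "<tei:date>Monday</tei:date> at the "
    '<tei:hi rend="underline">settlement</tei:hi>.</tei:p>'
)


def strip_markup(tei_xml):
    """Drop all tags and return the text with whitespace normalised."""
    root = ET.fromstring(tei_xml)
    # itertext() yields every text fragment in document order, tags removed.
    text = "".join(root.itertext())
    return " ".join(text.split())


if __name__ == "__main__":
    print(strip_markup(SAMPLE_TEI))
```

Note that this keeps the text inside every element, including crossed-out passages marked with <tei:del>. Whether the archive's clean files keep or drop such text is not stated, so compare against a provided .txt file before relying on this approach.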