Digitization Methods
Last updated: May 20, 2014
Minnesota Historical Society digitization projects aim to promote access to a wide variety of content from the Society's archival and manuscripts collections, including correspondence, memoranda, diaries, speeches, reports, subject files, graphic materials, schedules, notes, minutes, maps, certificates, catalogs, membership records, and newsletters. Public use of these access copies is encouraged to help preserve the condition of the originals.
Online Digital Products
The internet provides cultural institutions with the opportunity to make their collections accessible to a broad audience outside the walls of their institution. The Society has been digitizing collections and providing online access to a variety of historical material since 1999. Large-scale digitization projects linked to searchable databases include the 1.75 million textual images in the Birth Records index, more than 240,000 photographic and fine art images in our Visual Resources Database, and the nearly 6,000 object images in our Collections Online. Over 120 finding aids to printed materials, government records, and manuscript collections are linked to more than 5,000 digital files representing tens of thousands of original pages. Other digital products range from online exhibits centered on a specific theme or topic, to educational resources for teachers.
Selection Process and Project Plan
Digitization projects are identified and proposed in several ways. Project proposals are normally identified by MNHS staff and are funded internally or through special grants. Joint collaborations with other cultural institutions and private sector organizations interested in creating content online also generate a number of projects. The selection process takes into consideration access and use restrictions, the material's physical condition and suitability for scanning, and the extent to which the material has been described at the series or item level.
A project plan identifies the main tasks involved in creating a digital product, the deliverables and their dates, general costs, and the team members with their roles and responsibilities.
Selection of Material
Selections from individual collections are based on user and researcher interest, curatorial input, historic significance, and copyright status.
Digital Production
Every effort is made to ensure that digital versions are as legible as original documents while balancing considerations of digital image size and system memory.
Collection contents have most often been scanned as groups of documents filed at the folder or series level to provide researchers with as full a context as possible. A thumbnail image of the first page in each folder or item that has been scanned serves as a graphical link to open the digital version. Extremely large files have been broken into parts to facilitate rapid download speeds.
Finding aids also list material that has not been scanned. To request photocopies or scans of material that has not been digitized, please see the Reproduction Request Form.
Portable Document Format/Archival - (PDF/A)
To keep up with new technology and file formats, the Society has begun exploring the use of PDF/A-1 as an archival and access standard. This new standard (ISO 19005-1), defined in 2005, provides a mechanism for representing electronic documents in a manner that preserves their intellectual content over time, independent of the tools and systems used for creating, storing, or rendering the files. The standard identifies a profile for electronic documents that ensures they can be reproduced for years to come.
All of the information necessary for displaying the document in the same manner every time is embedded in the file. Embedded information includes all visible content: text, images, vector graphics, color information, and fonts. PDF/A-1 is only part of an archiving solution; on its own, it guarantees neither long-term archiving nor that information will be displayed as desired. However, since the Society has decided to link PDF files to finding aids in order to display textual images, the PDF/A format makes the most practical sense because it defines a set of requirements that make long-term archiving possible.
Hardware and Software
- Hardware
- Epson Expression 10000 XL
- Fujitsu fi-6230
- Software
- Adobe Design Suite CS5
Standards and Content
Most textual materials are scanned in 8-bit gray-scale at 300 dpi (dots per inch) and some images are cropped to maximize quality while minimizing file size. Scans in 24-bit color at 300 dpi are used only when color is necessary to understand the content of an image and for the restricted items in our Reserve storage. Images are not modified in any manner by the application of descreening or filtering, but may be automatically or manually corrected for document clarity. Documents are scanned to a PDF file format, edited to include metadata pertaining to the specific folder or document, and converted to PDF/A.
Scanning Specifications | Gray-scale | Color |
---|---|---|
Resolution | 300 dpi | 300 dpi |
Bit Depth | 8-bit gray-scale | 24-bit color |
Compression | JBIG2 | JBIG2 |
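To give a sense of why bit depth matters for the balance between legibility and file size, the following Python sketch computes the uncompressed size of a single scanned page under the two modes in the table. The `ScanMode` class, the function name, and the US Letter page size (8.5 x 11 inches) are assumptions made for illustration; they do not come from MNHS documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanMode:
    """One column of the scanning specifications table (hypothetical type)."""
    name: str
    dpi: int
    bits_per_pixel: int


GRAYSCALE = ScanMode("gray-scale", dpi=300, bits_per_pixel=8)
COLOR = ScanMode("color", dpi=300, bits_per_pixel=24)


def uncompressed_bytes(mode: ScanMode, width_in: float, height_in: float) -> int:
    """Raw image size in bytes, before any compression is applied."""
    width_px = round(width_in * mode.dpi)
    height_px = round(height_in * mode.dpi)
    return width_px * height_px * mode.bits_per_pixel // 8


# Assumed page size: US Letter, 8.5 x 11 inches (not stated in the source).
gray_size = uncompressed_bytes(GRAYSCALE, 8.5, 11)
color_size = uncompressed_bytes(COLOR, 8.5, 11)

print(f"gray-scale: {gray_size:,} bytes")
print(f"color:      {color_size:,} bytes")
```

Under these assumptions a color page holds three times as much raw data as a gray-scale page, which is consistent with reserving color for cases where it is needed to understand the content.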
Most typewritten material is processed using OCR (optical character recognition) to support text searching and access for visually impaired users. Files in which the text is handwritten, faint, or obscured by annotation are generally not subjected to OCR. Due to high costs, OCR output is not corrected. Document compression is done using a PDF optimizer, keeping the document compatible with Acrobat v.7.0 and later. All images are compressed losslessly using the JBIG2 scheme.
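The OCR rule of thumb described above can be expressed as a small Python check. This is only a sketch: the function and its flags are hypothetical, and in practice the decision is a judgment made by staff rather than an automated test.

```python
def should_ocr(is_typewritten: bool,
               is_faint: bool = False,
               is_obscured: bool = False) -> bool:
    """Return True when a file is a reasonable OCR candidate.

    Hypothetical flags: handwritten material is represented by
    is_typewritten=False; is_obscured covers text hidden by annotation.
    """
    return is_typewritten and not (is_faint or is_obscured)


print(should_ocr(True))                    # clean typescript
print(should_ocr(True, is_faint=True))     # faint typescript
print(should_ocr(False))                   # handwritten material
```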
Minimal metadata is used to describe the document using the built-in fields provided in the document properties available for each PDF. Data fields are created for item title, creator, description, preferred citation, copyright status, copyright notice, and copyright informational URL. Additional fields could be utilized; however, in order to increase output and decrease processing costs, these seven fields are the minimum required descriptive metadata. Technical information about capture devices and software applications was once tracked in the files, but now resides in project documentation.
- Document Title
  - Mandatory
  - Used to display the title of the file as it corresponds to the file name in the finding aid.
  - Examples:
    - Jimmy Carter to Walter F. Mondale, February 16, 1978
    - Foreign Policy Breakfasts, 1977-1978, folder 2, part 3
- Author
  - Mandatory
  - Used to define who or what organization/department/person created the file.
  - Examples:
    - Walter F. Mondale (1928-)
    - Charles John LaVine, Translator
    - Great Northern Railway Company (U.S.)
    - Minnesota. Dept. of Education
- Description
  - Optional
  - Used to give a short description of the file contents. Can be used to highlight relevant dates, additional creators, and/or any remarkable content.
  - Examples:
    - Regarding the role of the Vice President.
    - Agendas, briefing materials, and handwritten note regarding foreign policy.
- Keywords
  - Mandatory
  - Used to describe the physical location of the content.
  - Examples:
    - Vice Presidential Papers. Walter F. Mondale. Minnesota Historical Society.
    - Correspondence of Walter Mondale's Office. Attorney General. State Archives. Minnesota Historical Society.
    - Subject files. Louis W. Hill Papers. Minnesota Historical Society.
- Copyright Status
  - Required
  - Used to define copyright status as Copyrighted, Public Domain, or Unknown.
- Copyright Notice
  - Required
  - Used to give notice that copyright is held in the file.
  - Examples:
    - The copyright of this digital version belongs to the Minnesota Historical Society.
    - Permission was given by Northwest Area Foundation to make this document accessible.
- Copyright URL
  - Required
  - Used to provide external link to additional copyright information.
  - Example:
    - http://www.mnhs.org/copyright
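As an illustration of how the fields above fit together, the following Python sketch models one metadata record and checks it against the mandatory/optional rules in the list. The class name, attribute names, and the `problems` method are hypothetical, and the sample record simply combines example values from the list; it does not describe an actual MNHS file or workflow.

```python
from dataclasses import dataclass

# The three copyright status values named in the field list.
ALLOWED_STATUSES = {"Copyrighted", "Public Domain", "Unknown"}

# Every field except Description is marked Mandatory or Required.
REQUIRED_FIELDS = (
    "title", "author", "keywords",
    "copyright_status", "copyright_notice", "copyright_url",
)


@dataclass
class PdfMetadata:
    """Hypothetical container for the seven descriptive fields."""
    title: str
    author: str
    keywords: str            # holds the preferred citation / location
    copyright_status: str
    copyright_notice: str
    copyright_url: str
    description: str = ""    # the only optional field

    def problems(self) -> list[str]:
        """Return a list of rule violations; empty means the record passes."""
        issues = [f"missing {name}" for name in REQUIRED_FIELDS
                  if not getattr(self, name).strip()]
        status = self.copyright_status.strip()
        if status and status not in ALLOWED_STATUSES:
            issues.append(f"unrecognized copyright status: {status}")
        return issues


# Illustrative record assembled from the example values in the list above.
record = PdfMetadata(
    title="Foreign Policy Breakfasts, 1977-1978, folder 2, part 3",
    author="Walter F. Mondale (1928-)",
    keywords="Vice Presidential Papers. Walter F. Mondale. "
             "Minnesota Historical Society.",
    copyright_status="Copyrighted",
    copyright_notice="The copyright of this digital version belongs to "
                     "the Minnesota Historical Society.",
    copyright_url="http://www.mnhs.org/copyright",
    description="Agendas, briefing materials, and handwritten note "
                "regarding foreign policy.",
)

print(record.problems())
```

A check of this kind could run before conversion to PDF/A so that files with missing mandatory fields are caught early, although the source does not say whether the Society automates this step.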
Copyright Restrictions and Usage
The Minnesota Historical Society claims copyright to the digital versions of content provided in the inventories to its collections. Content may not be copied without express written permission of the Society. However, users may print, download, link to, or email content for individual use.
If you have a question about copyright restrictions and usage, especially for publication purposes, please visit the MNHS Copyright and Use Information page. Permission and license to publish, display, or broadcast photographic and digital reproductions is granted via a Request for Permission form (see p. 2) and may involve a fee for commercial use.
For documents and templates related to MNHS digitization please visit the CM Toolkit.