Recommended Digital Data Formats for Long Term Digital Archives and Web Access
The following table lists the digital formats that MNHS recognizes and encourages grant projects to use when transferring records into digital archives. These formats, and their corresponding confidence levels, represent MNHS's preferences for long-term preservation. Grantees may request to use other formats (including those not listed) for active use, as long as there is a clear rationale and a plan for monitoring and updating record retention. However, applications that use the preferred formats will stand a better chance.
The confidence levels identified in the table below are determined by a combination of sustainability factors including:
- Disclosure. Degree to which complete specifications and tools for validating technical integrity exist and are accessible to those creating and sustaining digital content. A spectrum of disclosure levels can be observed for digital formats. What is most significant is not approval by a recognized standards body, but the existence of complete documentation.
- Adoption. Degree to which the format is already used by the primary creators, disseminators, or users of information resources. This includes use as a master format for delivery to end users, and as a means of interchange between systems.
- Transparency. Degree to which the digital representation is open to direct analysis with basic tools, including human readability using a text-only editor.
- Self-documentation. Self-documenting digital objects contain basic descriptive, technical, and other administrative metadata.
- External Dependencies. Degree to which a particular format depends on particular hardware, operating system, or software for rendering or use and the predicted complexity of dealing with those dependencies in future technical environments.
- Impact of Patents. Degree to which the ability of archival institutions to sustain content in a format will be inhibited by patents.
- Technical Protection Mechanisms. Implementation of mechanisms such as encryption and digital rights management tools that prevent the preservation of content by a trusted repository.
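The Transparency factor is easiest to see with plain text: a file in one of the encodings that the format table names for plain text (US-ASCII, UTF-8, or UTF-16 with a BOM) can be opened and read in any basic text editor. The following Python sketch is an illustration only, not an MNHS tool; the function name `detect_text_encoding` is hypothetical, and the check is a heuristic (for example, it does not distinguish a UTF-32 byte order mark from a UTF-16 one).

```python
from typing import Optional

# Byte order marks for little-endian and big-endian UTF-16.
UTF16_BOMS = (b"\xff\xfe", b"\xfe\xff")


def detect_text_encoding(data: bytes) -> Optional[str]:
    """Return which of the table's plain-text encodings fits `data`, else None.

    Hypothetical helper for illustration; not part of any MNHS tooling.
    """
    if data.startswith(UTF16_BOMS):
        try:
            data.decode("utf-16")  # the codec reads the BOM to pick the byte order
            return "UTF-16 with BOM"
        except UnicodeDecodeError:
            return None
    # US-ASCII is a strict subset of UTF-8, so test the narrower encoding first.
    for codec, label in (("ascii", "US-ASCII"), ("utf-8", "UTF-8")):
        try:
            data.decode(codec)
            return label
        except UnicodeDecodeError:
            continue
    return None
```

A file would be read in binary mode and passed in whole, for example `detect_text_encoding(open(path, "rb").read())`; a result of `None` suggests the file needs a closer look before it is treated as plain text.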
The Preferred and Possible confidence levels represent archival formats that MNHS considers the most sustainable over time. Applicants should avoid creating archival files in formats listed in the Unacceptable column. The Web Access Only column lists formats that are acceptable for public access on the web.
| Media | Preferred | Possible | Unacceptable | Web Access Only |
|---|---|---|---|---|
| Text | Plain text (encoding: US-ASCII, UTF-8, UTF-16 with BOM); PDF/A-1a (*.pdf) | PDF/A-1b (*.pdf) (embedded fonts); Rich Text Format (*.rtf) | Microsoft Word (*.doc); PDF (external fonts); all other text formats not listed here | Plain text; PDF/A-1a (*.pdf); PDF/A-1b (*.pdf) |
| Vector Graphics (e.g., line art) | Scalable Vector Graphics (*.svg) | Computer Graphics Metafile (*.cgm); Design Web Format (*.dwf) | Adobe Flash (*.swf); Encapsulated PostScript (*.eps); all other vector image formats not listed here | Adobe Flash (*.swf) |
| Raster Images (e.g., photos) | TIFF (*.tiff) (uncompressed, single-page); JPEG2000 (uncompressed) (*.jp2, *.jpx) | BMP (*.bmp) | TIFF (*.tiff) (with LZW compression, in planar format, or multi-page); GIF (*.gif); JPEG/JFIF (*.jpg, *.jpeg); PNG (*.png); Photoshop (*.psd); all other raster image formats not listed here | JPEG/JFIF (*.jpg); PNG (*.png); GIF (*.gif) |
| Audio | WAVE (*.wav) | Standard MIDI (*.mid, *.midi); Ogg Vorbis (*.ogg); AIFF (uncompressed) (*.aif, *.aiff) | RealNetworks 'Real Audio' (*.ra, *.rm, *.ram); QuickTime (*.mov); Windows Media Audio (*.wma); WAVE (compressed) (*.wav); MP3 (*.mp3); AIFC (*.aifc); NeXT SND (*.snd); all other audio formats not listed here | MP3 (*.mp3); Windows Media Audio (*.wma); QuickTime (*.mov); Ogg Vorbis (*.ogg); Adobe Flash (*.fla) |
| Video | AVI; QuickTime Movie; JPEG2000 (uncompressed); Material Exchange Format (*.mxf) | Ogg Theora (*.ogg) | AVI (compressed) (*.avi); QuickTime Movie (compressed) (*.mov); Windows Media Video (*.wmv); RealNetworks 'Real Video' (*.rv, *.rm, *.ram); MPEG-1, MPEG-2 (*.mpg, *.mpeg); all other video formats not listed here | MPEG-4 (*.mp4); QuickTime Movie (*.mov); Windows Media Video (*.wmv); Ogg Theora (*.ogg); Adobe Flash (*.fla) |
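The TIFF entries in the table separate uncompressed single-page files (Preferred) from LZW-compressed, planar, or multi-page ones (Unacceptable), and the file extension alone cannot tell these apart. As an illustration only, the Python sketch below reads the first image file directory (IFD) of a classic TIFF and reports those three properties. It relies on the TIFF 6.0 tag numbers 259 (Compression) and 284 (PlanarConfiguration). The function names `inspect_tiff` and `is_preferred_tiff` are hypothetical and not part of any MNHS tooling, and treating any additional IFD as "multi-page" is a simplifying assumption, since extra IFDs can also hold thumbnails.

```python
import struct

COMPRESSION_TAG = 259           # TIFF 6.0: 1 = uncompressed, 5 = LZW
PLANAR_CONFIGURATION_TAG = 284  # TIFF 6.0: 1 = chunky, 2 = planar


def inspect_tiff(data: bytes) -> dict:
    """Report the TIFF properties the format table mentions (hypothetical helper)."""
    if data[:2] == b"II":
        order = "<"  # little-endian file
    elif data[:2] == b"MM":
        order = ">"  # big-endian file
    else:
        raise ValueError("not a TIFF: unknown byte-order mark")
    magic, ifd_offset = struct.unpack_from(order + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF (magic number is not 42)")

    # The IFD is a 2-byte entry count followed by 12-byte entries.
    (entry_count,) = struct.unpack_from(order + "H", data, ifd_offset)
    tags = {}
    for i in range(entry_count):
        entry = ifd_offset + 2 + 12 * i
        tag, field_type, count = struct.unpack_from(order + "HHI", data, entry)
        if field_type == 3 and count == 1:  # a single SHORT is stored inline
            (tags[tag],) = struct.unpack_from(order + "H", data, entry + 8)

    # After the entries comes the offset of the next IFD (0 means none).
    (next_ifd,) = struct.unpack_from(
        order + "I", data, ifd_offset + 2 + 12 * entry_count
    )
    return {
        "uncompressed": tags.get(COMPRESSION_TAG, 1) == 1,          # default is 1
        "planar": tags.get(PLANAR_CONFIGURATION_TAG, 1) == 2,       # default is 1
        "multi_page": next_ifd != 0,  # assumption: any further IFD counts as a page
    }


def is_preferred_tiff(props: dict) -> bool:
    """True when the file matches the table's Preferred TIFF description."""
    return props["uncompressed"] and not props["planar"] and not props["multi_page"]
```

A caller would read the file in binary mode and pass the bytes to `inspect_tiff`; only the first IFD is examined, so a production check would likely use an established imaging or validation tool instead.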
Notes:
- These requirements do not apply to non-archival files used in the grant documentation process.
- This document will be updated when new information regarding file formats emerges.
QuickTime Movie and AVI are proprietary formats owned by Apple Computer, Inc. and Microsoft Corporation, respectively. Although they are proprietary, their wide use and distribution, together with the lack of a solid non-proprietary alternative, make it necessary to recommend them for the time being. Organizations converting to these formats should expect to convert to a non-proprietary format in the future, once there is a clearer choice.