Minnesota State Archives

Center for Archival Resources On Legislatures (CAROL)

BagIt

Overview

BagIt was created by the California Digital Library, the Library of Congress, and Stanford University to package digital content for transfer and to automate its receipt, storage, and retrieval. The BagIt specification states that "BagIt is a hierarchical file packaging format designed to support disk-based or network-based storage and transfer of generalized content." The package itself is often referred to as a "bag".

A "bag" contains the content (the payload, kept in a data directory) as well as metadata files that document the storage and transfer of the content. These metadata files are created automatically by BagIt tools. They include a .txt file listing a checksum for each content file (the manifest, for example manifest-md5.txt), a .txt file for information about the bag (bag-info.txt), a .txt file for information about the version of BagIt being used (bagit.txt), and a .txt file containing checksums for the .txt files themselves (the tag manifest, for example tagmanifest-md5.txt).
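To make the layout described above concrete, the following is a minimal sketch in Python that builds a small bag by hand using only the standard library. It is not the BagIt Library itself. The function names (sha256_of, write_manifest, make_bag), the choice of SHA-256, the version string "0.97", and the Source-Organization value are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(bag_dir, manifest_name, files):
    """Write one 'checksum  relative/path' line per file."""
    lines = [
        f"{sha256_of(f)}  {f.relative_to(bag_dir).as_posix()}\n"
        for f in sorted(files)
    ]
    (bag_dir / manifest_name).write_text("".join(lines), encoding="utf-8")


def make_bag(bag_dir, payload):
    """Create a minimal bag from a {relative name: bytes} mapping (hypothetical helper)."""
    bag_dir = Path(bag_dir)
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # The content files go under data/.
    for name, content in payload.items():
        target = data_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)

    # Information about the version of BagIt being used (version is an assumption).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    # Information about the bag (the value is a placeholder).
    (bag_dir / "bag-info.txt").write_text(
        "Source-Organization: Example Archive\n", encoding="utf-8"
    )
    # Checksums of the content files.
    payload_files = [p for p in data_dir.rglob("*") if p.is_file()]
    write_manifest(bag_dir, "manifest-sha256.txt", payload_files)
    # Checksums of the .txt files themselves.
    tag_files = [
        bag_dir / name
        for name in ("bagit.txt", "bag-info.txt", "manifest-sha256.txt")
    ]
    write_manifest(bag_dir, "tagmanifest-sha256.txt", tag_files)
```

Calling make_bag("example_bag", {"report.txt": b"hello"}) would produce a directory holding data/report.txt alongside the four .txt metadata files. In practice a BagIt tool generates these files, so a sketch like this is mainly useful for understanding what a bag contains.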

Bags are ideal for digital content normally kept as a collection of files.

To use BagIt as described here, both the sender and the receiver need to have the BagIt Library installed on their systems. Java is required to run the program. The program is run from the command line (the DOS environment on Windows).

 

Content Acquisition and Authentication

After a bag is created, it can be transferred on physical media (hard disk drives, CD-ROM, DVD) or over a network (FTP, HTTP, etc.). Once received, the bag can be validated to confirm that the data did not change during transport or after receipt.

BagIt records a checksum for each individual file within the bag, and a separate tag manifest holds checksums for the bag's own metadata files. After a transfer, the receiver recalculates the hash value of every file and compares it with the recorded value; if all values match, the bag and its individual files did not change during transmission.
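The comparison of hash values can be sketched as follows. This is a simplified illustration in Python using only the standard library, not the BagIt Library's own validation routine. The function names (file_digest, validate_manifest) and the default manifest name manifest-sha256.txt are assumptions; the hash algorithm is read from the manifest's file name.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm):
    """Return the hex digest of a file using the named hashlib algorithm."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_manifest(bag_dir, manifest_name="manifest-sha256.txt"):
    """Return a list of problems; an empty list means every file matches."""
    bag_dir = Path(bag_dir)
    # "manifest-sha256.txt" -> "sha256", "tagmanifest-md5.txt" -> "md5"
    algorithm = Path(manifest_name).stem.split("-", 1)[1]
    problems = []
    listed = set()

    text = (bag_dir / manifest_name).read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line holds a checksum, whitespace, then a path relative to the bag.
        expected, rel_path = line.split(None, 1)
        rel_path = rel_path.strip()
        listed.add(rel_path)
        target = bag_dir / rel_path
        if not target.is_file():
            problems.append(f"missing: {rel_path}")
        elif file_digest(target, algorithm) != expected.lower():
            problems.append(f"checksum mismatch: {rel_path}")

    # For the payload manifest, also flag content files that were never listed.
    data_dir = bag_dir / "data"
    if manifest_name.startswith("manifest-") and data_dir.is_dir():
        for path in sorted(data_dir.rglob("*")):
            rel_path = path.relative_to(bag_dir).as_posix()
            if path.is_file() and rel_path not in listed:
                problems.append(f"not in manifest: {rel_path}")
    return problems
```

A receiver could call validate_manifest(bag_dir) for the content files and validate_manifest(bag_dir, "tagmanifest-sha256.txt") for the metadata files. Running the same check again later is one way to revalidate a stored bag.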

Bags can also be revalidated periodically as part of managing the digital files they contain; if the recalculated checksums still match the recorded ones, the files are unchanged.

By ensuring the files being sent from one place to another remain unchanged, BagIt could be used as a method of content authentication.

One possible workflow could involve receiving content in a bag. Copies of the bag's content could be pulled out and used as needed, while the entire bag could be stored in a dark archive as a 'master copy'. Its validity could then be checked periodically, which helps demonstrate the authenticity and trustworthiness of the files over time.
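One way this workflow might look in practice is sketched below in Python, using only the standard library. The function names (payload_digests, make_working_copy, master_unchanged), the use of SHA-256, and the directory arguments are assumptions for illustration. The sketch copies the content out of the bag for everyday use and later checks that the stored master still matches what was recorded when the copy was made.

```python
import hashlib
import shutil
from pathlib import Path


def payload_digests(bag_dir):
    """Map each content file (relative to data/) to its SHA-256 digest."""
    data_dir = Path(bag_dir) / "data"
    # read_bytes() loads whole files into memory; acceptable for a sketch.
    return {
        path.relative_to(data_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def make_working_copy(bag_dir, work_dir):
    """Copy the bag's content to work_dir and return the master's digests."""
    # Only data/ is copied; the bag itself is left untouched as the master.
    shutil.copytree(Path(bag_dir) / "data", work_dir, dirs_exist_ok=True)
    return payload_digests(bag_dir)


def master_unchanged(bag_dir, recorded_digests):
    """Return True if the master's content still matches the recorded digests."""
    return payload_digests(bag_dir) == recorded_digests
```

The working copy can be edited freely without affecting the result of master_unchanged, since only the bag held in the archive is rechecked. A fuller check would also compare the master against the checksums in the bag's own manifest.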

 

More Information

The Digital Signal blog entry "From There to Here, from Here to There, Digital Content is Everywhere!" (January 3, 2012) discusses file transfers and BagIt. In addition to general information and a video on BagIt, the comments provide links to additional user guides: one for BagIt created by GeoMapp and one for Bagger created by the Mississippi Department of Archives and History.

The California Digital Library hosts a Confluence page on BagIt with links to the specification as well as known implementations and related projects.

The Digital Curation Google Group also discusses BagIt.

 


February 15, 2012; links verified February 21, 2012