Digital Preservation UnConference

This Tuesday we hosted our first Digital Preservation UnConference at the John F. Kennedy Presidential Library. We had a great turnout from a number of institutions around Boston and the larger New England community. Attendees discussed a range of digital preservation topics, from social media and web archiving to wrangling data for a system migration.

As you may know, part of the residency program includes hosting an event at your institution. At the JFK Library we immediately knew we wanted to host a public unconference. This actually came up in discussions between me and my host mentor, Erica Boudreau, before I even arrived in Boston. I had been to a few digital humanities and library-themed unconferences and I was excited to see how this format could be used to address the issues specific to digital preservation.

In planning for the event we created a WordPress site, including an UnConference 101, registration information, and directions to the event. We used the website's commenting function to allow attendees to propose sessions ahead of time. We also created a Twitter handle, @jfkdigipres, for sharing updates and event information.

Attendees get ready for the day of UnConference-ing!

The day started with brief opening remarks from me, followed by session proposals from the attendees. Then we broke for coffee and attendees voted on the proposals. Once voting was over, a few dedicated volunteers and I entered session proposals into the schedule. And with that, the sessions were underway!

Volunteers and attendees collaborated on community notes which recorded the main points and resources discussed in each session.  If you couldn’t make it to the event, or are curious what happened in the sessions you missed, I highly recommend checking out these collaborative notes.  There are some great tools and ideas discussed there.

Since I’m currently looking into plans for a potential system migration, I led a discussion on content migration for digital preservation. It was great to hear how others are dealing with, and have dealt with, large-scale migrations like this. One great point was that a system migration is iterative: you always have to keep an eye on the horizon because your current system might lose support or fail to meet your collection requirements in the future.

One twitter user said…

Fellow resident Jeff Erickson led a discussion on preparing to use ArchivesDirect, a tool he’s currently researching for preserving content collected through the Mass. Memories Road Show. The group also discussed the evolution of the tool Archivematica and the necessity of exit strategies when working with cloud storage providers.

The last session I attended was on personal digital collections and how public history is changing in the digital world. Now that few people are writing physical letters, how will day-to-day communication be preserved in the future? Will Twitter accounts and email inboxes be included in future donations of personal collections? There were differing opinions on who is responsible for preserving these kinds of collections. Historical societies and community archives have traditionally taken on these roles, but with limited staff and technical expertise, can they continue doing so in the born-digital world?

The event had a strong presence on Twitter, where tweets were shared with the hashtag #jfkdigipres. We collected these tweets on a Storify page so we can preserve this discussion around digital preservation.

Overall the event was a great success!  I hope the conversations started here will continue both online and through future digital preservation events.

 

National Digital Stewardship Residents, past and present, in front of the John F. Kennedy Presidential Library

 

Digital Commonwealth Visit

Last week, a group of brave NDSR-ers trekked out during a snowstorm to visit with staff from the Digital Commonwealth, which is based at the Boston Public Library.

As the blizzard raged outside, we met Tom Blake in the lobby of the library. Tom and his staff were kind enough to meet with us that morning – and they fought high winds and T delays to get there!

Blizzarding outside the BPL. It was like a scene from The Revenant.

For those who are not familiar, here’s a short description of the Digital Commonwealth, taken from their website:

“Digital Commonwealth is a non-profit collaborative organization that provides resources and services to support the creation, management, and dissemination of cultural heritage materials held by Massachusetts libraries, museums, historical societies, and archives. Digital Commonwealth currently has over 130 member institutions from across the state.

This site provides access to thousands of images, documents, and sound recordings that have been digitized by member institutions so that they may be available to researchers, students, and the general public.”

The Digital Commonwealth both hosts and harvests materials. That is, they may store digitized or born-digital material on their own servers (hosting), or they may include material hosted elsewhere (such as on DSpace, CONTENTdm, etc.) as part of their collections and then link out to its original location (harvesting).

They will also digitize materials for organizations, which I think is a pretty amazing service to provide! This means that organizations can get their materials digitized without having to buy expensive equipment or allocate staff to digitizing – which, I’m sure many of you know, can be a time-consuming task.  They will also help organizations to create and clean up metadata. They use a MODS metadata schema. You can read more about their metadata requirements here.

During our visit, Tom and his staff emphasized that the Digital Commonwealth is very access-driven, and this is reflected in their collecting. He said that if an organization comes to them with materials and makes a case for why users would want to access those materials, they will almost always take those materials in. In fact, I believe one of the Digital Commonwealth staff members at our meeting said that the phrase “But someone will want to use this!” is kind of like their kryptonite. (Hope I’m not giving away a big secret by saying that.) I thought this commitment to access and their focus on users was really admirable!

 

I was initially interested in visiting the Digital Commonwealth because, over the course of the residency, I’ve begun to wonder about how smaller organizations with limited resources can participate in digital preservation. To me, digital preservation seems like a resource-demanding endeavor. You’ve got to pay for storage, pay for staff to process and preserve digital materials, pay for digitizing or technologies to manage born-digital materials – plus you need to have the expertise in your staff and the support from your administration. I was concerned that small organizations, such as local historical societies, wouldn’t be able to participate in digital preservation because of their limited resources. But it’s not as though they could just ignore digital preservation – they probably want to digitize materials, or they might have a donor with born-digital materials. So what are small organizations to do?

I think the Digital Commonwealth is a great example of a solution to this problem. It allows small organizations to benefit from the resources and expertise available at larger organizations. It also gives smaller organizations a wider audience, because their materials are available on the Digital Commonwealth website alongside materials from a variety of other organizations.

At the meeting, we discussed examples of this kind of resource sharing in other places, such as the Connecticut Digital Archive. I would be curious to hear if you, reader, know of any others, or know of examples where many small organizations have come together to pool their resources. Also, are you concerned about small organizations and digital preservation? Why or why not?

Thanks for reading!

Harvard Yard in the snow

Having FITS Over Digital Preservation?

This week my fellow residents and I were fortunate to receive an introduction to the File Information Tool Set (FITS) from Andrea Goethals. Andrea is the Manager of Digital Preservation and Repository Services at Harvard Library, Director of the NDSR Boston program, and a developer of the FITS tool. Released in 2009, FITS is a digital preservation tool designed and developed at Harvard Library to identify and validate a wide assortment of file formats, determine technical characteristics, and extract embedded metadata. The technical metadata generated and collected by FITS can be exported in a variety of XML schemas and may be included in other files for digital preservation purposes; Harvard Library, for example, includes FITS output in the METS files in its preservation repository.

Digital preservation repositories accept into their care electronic files that are created and saved in a growing number of file formats. Proper identification of a file’s format and the extraction of embedded technical metadata are key aspects of preserving digital objects. Proper identification helps determine how digital objects will be managed and extracting embedded technical metadata provides information that future repository staff or users need to render, transform, access and use the digital objects.

There are several tools available that can identify and validate file formats and extract technical metadata. The great thing about FITS is that it bundles many of them together. The current version of FITS, 0.10.0, includes the following applications:

[Image: list of the tools bundled with FITS]

An explanation of each tool can be found on the FITS web site.

While these tools can be used individually, using them under the FITS umbrella is more efficient. FITS runs all the tools simultaneously, saving you time. FITS knows the strengths and weaknesses of the applications and which tools support which file formats. You benefit by installing and running a single application and receiving output from multiple applications that is appropriate to each file format.

Receiving output from multiple tools can help you verify accurate information when the tools agree, or flag a concern when they don’t. It is also helpful that FITS consolidates and normalizes the output, providing a homogenized data set that is easier to interpret. Each tool’s output is converted to a common FITS XML schema ensuring labels and terminology are used consistently. The extracted metadata can then be exported to different technical metadata schemas such as MIX for images, TextMD for text and DocumentMD for documents. Any of these schemas can then be inserted into other files like METS to provide repository documentation suitable for digital preservation.
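To make the agree-or-conflict idea concrete, here is a minimal Python sketch that reads a FITS-style XML document and reports which tools stand behind each format identification. The sample XML is hand-written and heavily simplified for illustration; it is not real FITS output. The element names, the `status="CONFLICT"` attribute, and the namespace are my assumptions based on the FITS output schema, so check them against what your own FITS version produces.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the FITS output schema; confirm against real output.
FITS_NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}

# Hand-written, simplified sample. Real FITS output carries many more
# elements (file info, technical metadata, tool versions, and so on).
SAMPLE = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <identification status="CONFLICT">
    <identity format="Portable Document Format" mimetype="application/pdf">
      <tool toolname="Jhove"/>
      <tool toolname="Droid"/>
    </identity>
    <identity format="Plain text" mimetype="text/plain">
      <tool toolname="file utility"/>
    </identity>
  </identification>
</fits>"""


def summarize_identification(xml_text):
    """Return whether the tools disagreed and which tools back each format."""
    root = ET.fromstring(xml_text)
    ident = root.find("fits:identification", FITS_NS)
    identities = []
    for identity in ident.findall("fits:identity", FITS_NS):
        tools = [t.get("toolname") for t in identity.findall("fits:tool", FITS_NS)]
        identities.append((identity.get("format"), tools))
    return {
        "conflict": ident.get("status") == "CONFLICT",
        "identities": identities,
    }


if __name__ == "__main__":
    summary = summarize_identification(SAMPLE)
    print("Tools disagree!" if summary["conflict"] else "Tools agree.")
    for fmt, tools in summary["identities"]:
        print(f"  {fmt}: reported by {', '.join(tools)}")
```

A report like this could be used to route conflicting files to a person for review while letting the files where every tool agrees pass straight through.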

FITS is an open-source, Java-based application that is freely available from GitHub or the FITS web site. Because it is Java-based, it runs on Windows, Mac, or Linux platforms from a command-line interface. It also provides an API and can be embedded in other applications; it is one of the microservices included in Archivematica. Using a command-line interface can sometimes be intimidating and confusing, but FITS employs a limited number of intuitive commands.
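If you would rather script FITS than type commands by hand, a small wrapper is one way to do it. The Python sketch below assumes the launcher script is called `fits.sh` (`fits.bat` on Windows) and that `-i` and `-o` name the input and output paths, which is how I understand the FITS documentation; the install location and file names are hypothetical. Run the launcher with `-h` to confirm the options for your version.

```python
import subprocess
from pathlib import Path


def build_fits_command(fits_home, input_path, output_path=None):
    """Build the argument list for one FITS run.

    Assumption: the launcher is <fits_home>/fits.sh and accepts
    -i <input> and -o <output>. Without -o, FITS is expected to
    write its XML to standard output.
    """
    cmd = [str(Path(fits_home) / "fits.sh"), "-i", str(input_path)]
    if output_path is not None:
        cmd += ["-o", str(output_path)]
    return cmd


def run_fits(fits_home, input_path, output_path=None):
    """Run FITS and return the completed process (raises on failure)."""
    return subprocess.run(
        build_fits_command(fits_home, input_path, output_path),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Hypothetical paths, shown only to illustrate the resulting command.
    print(" ".join(build_fits_command("/opt/fits", "photo.tif", "photo.fits.xml")))
```

Keeping the command-building step separate from the step that runs it makes it easy to print the command and check it before pointing the script at a whole directory of files.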

FITS configuration is managed with several XML files that are easily edited with a text editor. The main configuration file, fits.xml, allows you to prioritize tools, include or exclude certain file formats from processing, enable or disable additional features like generating checksums, and determine the various output options. Another positive for the digital preservation community is that FITS is actively maintained, so there is a procedure for addressing bugs and a schedule for releasing updates.
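As a rough illustration of what that configuration looks like, the Python sketch below reads a hand-written fragment in the style of fits.xml and lists each tool, the file extensions it skips, and whether checksums are turned on. The fragment is my own simplified assumption of the file's shape (a list of `tool` elements whose order sets priority, an `exclude-exts` attribute, and an `enable-checksum` setting); the fits.xml that ships with your FITS version is the authority and may differ.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment in the style of fits.xml, NOT copied from a real
# install. Assumption: tools listed first take priority over later ones.
SAMPLE_CONFIG = """<fits_configuration>
  <tools>
    <tool class="edu.harvard.hul.ois.fits.tools.jhove.Jhove" exclude-exts="dng,mbx"/>
    <tool class="edu.harvard.hul.ois.fits.tools.droid.Droid"/>
  </tools>
  <output>
    <enable-checksum>true</enable-checksum>
  </output>
</fits_configuration>"""


def describe_config(xml_text):
    """Return ([(tool_name, excluded_extensions), ...], checksum_enabled)."""
    root = ET.fromstring(xml_text)
    tools = []
    for tool in root.findall("./tools/tool"):
        # Use the last part of the Java class name as a short label.
        name = tool.get("class", "").rsplit(".", 1)[-1]
        excluded = [ext for ext in tool.get("exclude-exts", "").split(",") if ext]
        tools.append((name, excluded))
    checksum_text = root.findtext("./output/enable-checksum", default="false")
    return tools, checksum_text.strip().lower() == "true"


if __name__ == "__main__":
    tools, checksum_enabled = describe_config(SAMPLE_CONFIG)
    for rank, (name, excluded) in enumerate(tools, start=1):
        skipped = ", ".join(excluded) if excluded else "nothing"
        print(f"{rank}. {name} (skips: {skipped})")
    print("Checksums enabled:", checksum_enabled)
```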

The FITS web site (fitstool.org) is well organized and fully documents the installation, configuration, use, and output options.

I know my post pales in comparison to a live demo of the application. But if it piques your interest, take it for a test drive. You’ve got nothing to lose and you might add a new tool to your digital preservation tool box.

Thanks for reading, Jeff

NDSR Tour of the Massachusetts State Archives

This week, my fellow residents, our hosts, and members of the NDSR community visited the Massachusetts State Archives. Located on Columbia Point, the Archives house, preserve, and make accessible public records of the Massachusetts government. We talked with the Electronic Records Archivist Veronica Martzahl about digital preservation efforts and learned about the Archives’ amazing collections from Executive Director Michael Comeau. Thanks to you both and the Archives staff for having us!

Veronica shared what led to the creation of her role at the Archives and informed us about some digital preservation initiatives that are underway. When previous Massachusetts governor Mitt Romney left office, his hard drives were swept clean and no electronic records were transferred to the Archives. This alone would be an issue in terms of government transparency and the importance of leaving a historical record (and definitely not in line with best archival practice!), but it became even more critical when Romney ran for president. This provided the impetus for the Archives to develop a digital preservation program that would ensure better procedures moving forward.

For about two years now, Veronica has been working tirelessly to implement a new digital repository, which included testing, cost analysis, and training, and has had her hands in several other projects as well. In the end, the Archives chose the Preservica Standard Edition for their digital collections. The big take-away from this is that the process was long and challenging. Factors such as IT constraints, budgeting, and the usual politics involved in government work presented some hurdles, but there was strong institutional commitment to the project, which is such an important factor in digital preservation. This taught us much about the reality of selecting systems for your institution – something I’m sure all of the residents will deal with sooner or later! We were all very impressed with the amount of work Veronica has accomplished, and can see the long-term positive impact that this repository will have for the Archives.

As the resident at the State Library, I was particularly interested in what we can learn from another government agency working to preserve digital government information. Veronica was kind enough to spend some time with me last October discussing the current state of digital preservation at the State Archives, and I was excited to expand on that today and to hear updates since we last talked. One question I often get is, why don’t the library and archives collaborate on digital preservation? In a case of maddening bureaucracy, the Library reports to the Department of Administration and Finance, while the Archives report to the Secretary of the Commonwealth. This fracture often results in some confusion, but the staff at both institutions are very supportive (we often refer users to the Archives for research, and vice versa). I hope the Archives and Library staff can continue to find opportunities for collaboration, especially with regard to digital preservation.

After Veronica caught us up on the digital projects, Michael provided us with some interesting background information about the Archives, its vast collection, and some detail about their emergency preparedness plan. Columbia Point, where the Archives are located, is very close to the water, and susceptible to some serious damage from natural disasters. Michael explained that this is partially why the building is designed to be so strong – it has to withstand some intense weather!

We were able to see the original versions of some founding documents in Massachusetts history and the Bill of Rights, on display in the Commonwealth Museum. As a history nerd, I was pretty jazzed to be in the same room as these materials. Hearing Michael discuss the process of designing a proper space to house these documents was equally interesting. They worked with a scientist at MIT to create a home for these materials, thus protecting their longevity. The encasements they designed have allowed these crucial pieces of history to be well preserved. Though our focus may be on digital preservation, it was a great chance for us to hear a case study around preservation of print materials and to consider how necessary preservation is, regardless of format.

This month we also get to hear about the Digital Commonwealth, get a demonstration in the FITS tool from Harvard, and attend an UnConference at the JFK Library. Looking forward to it!

CurateGear 2016 & the BitCurator User Forum

Hi everyone! A few weeks ago I traveled to Chapel Hill to attend two events: CurateGear 2016, hosted by the School of Information and Library Science at the University of North Carolina, and the BitCurator User Forum. This post chronicles my observations. I enjoyed having the time to listen and take in information about the projects that are underway. I was excited to attend both events because I wanted to gain additional insight into the methods, projects, and tools that are being used and developed. While the technical aspects were sometimes difficult to grasp, the general ideas were impactful and provided me with topics for future research.

CurateGear 2016 was a one day event packed with presentations describing ongoing projects and technology centered on digital curation methods, projects and tools.

The following presentations connected in some way to my project — using digital preservation standards and evolving practice to identify and evaluate possible options for improving preservation storage at MIT Libraries:

A few presentations that interested me, but are not in my purview at the moment:

I decided to attend the BitCurator User Forum because I wanted to gain valuable insight into digital forensic tools and their application. The event was sponsored by the BitCurator Consortium and hosted by the School of Information and Library Science at the University of North Carolina at Chapel Hill. Through my experience at the forum, I have started to understand the best/good practices behind digital forensics, how the BitCurator environment is used, what people are looking for in future developments of the software, and what tools are currently being developed and how they will be applied. The event had a friendly atmosphere, enthusiastic participation, and passionate attendees.

During the panel, Beyond Disk Imaging, Bertram Lyons from AVPreserve introduced a great new application called Exactly. The tool securely transfers any born-digital material from a sender to a recipient over a LAN, using Dropbox, or via FTP, with the benefit of establishing provenance and fixity at the beginning of the acquisition process. I find it exciting because I used to work in oral history, and this tool was first designed for the Louie B. Nunn Center for Oral History at the University of Kentucky Libraries. I can see how this tool would be incredibly useful not just for oral history programs, but also for cultural repository institutions. Kari Smith, Digital Archivist in the Institute Archives and Special Collections at MIT Libraries, is planning to run a study on secure digital data transfer options, including an experiment using the Exactly tool in the Digital Sustainability Lab.
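For anyone wondering what "establishing fixity" at the start of a transfer means in practice, here is a minimal Python sketch of the general idea: record a checksum for every file before it leaves the sender, then recompute and compare on the recipient's side. This is only an illustration of the concept and not a description of how Exactly works internally; the function names and the choice of SHA-256 are my own assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def verify_manifest(folder, manifest):
    """Return the relative paths that are missing or whose content changed."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

The sender would call `build_manifest` before the transfer and send the result along with the files; the recipient would call `verify_manifest` on arrival, and an empty list means every file arrived exactly as it was sent.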

If you would like to check out the presentation slides from both events, they are online on the BitCurator User Forum and CurateGear 2016 websites.