Money Talks: Part I, Government Funders

Samantha DeWitt
NDSR Resident | Tufts University Tisch Library & Digital Collections and Archives

What can be done to encourage data preservation among university researchers? U.S. science and technology research agencies and the offices that oversee them have had strong ideas on the subject and have been making pronouncements for over a decade. Twelve years ago, the National Institutes of Health (NIH) began asking grant-seekers to plan for the management of their research data. The National Science Foundation (NSF) followed in 2011. In 2013, the White House went a step further than both agencies with a memorandum from the Office of Science and Technology Policy (OSTP) announcing the government’s commitment to “increase access to federally funded published research and digital scientific data…”

Money talks. The NSF invests about $7 billion in American research annually and the NIH allocates about $30 billion to medical research. To put these numbers in perspective for Tufts, that translated into about $8 million from the NSF and just under $62 million from the NIH last year.*

Because the agencies that disburse federal R&D funds have been requiring data management plans from their applicants, the research universities that rely on those funds have responded. Many, including Tufts, have looked to their libraries to provide support in:

  • Assisting researchers in the creation and implementation of data management plans
  • Helping researchers find the right data repository
  • Creating metadata for datasets
  • Encouraging best practices in data management

Universities have taken other steps as well. Some have created new data repositories, or have augmented their existing institutional repositories in order to accommodate and support the long-term preservation of research data.

Have government data access directives had their intended effect? So far the data are sparse. A 2014 Drexel University study did find NIH mandates, along with those implemented by scientific journals, seemed to be “meeting the goal of increasing the sharing of scientific resources among life science investigators.” But the point I want to make here is that, at the very least, these mandates have served to publicize the issue of data management to a degree that has encouraged debate and discussion among researchers and others within the university.

While the government still provides the greatest portion of funding to U.S. research universities, the automatic spending cuts of the 2013 budget sequestration have reduced the flow of money enough to make grant-seekers nervous. Researchers are increasingly appealing to foundations, corporations and philanthropic organizations to fill in the gaps. I hope you’ll join me in April for Part II, as we “follow the money” and look at which non-government funders are advocating for data management as well!

Till then,


* In research project grants

That Workflow’s A Monster: Updating Ingest Procedures at WGBH

[Screenshot: diagram of the current accessioning workflow]

This was our first effort at charting out the current accessioning workflow for all the different material collected by WGBH, step-by-step – from its arrival (in boxes carried over from production departments, or on hard drives dropped off by production assistants, or copied to a server or sent to a dedicated inbox or – you get the idea), through the various complex data and rights management processes designed to make sure that every tape, shot, transcript, and contract is appropriately accounted for, all the way through to physical storage and the burgeoning digital preservation program the department is now putting into place. Peter Higgins (my partner in workflow documentation crime) and I spent a long time trying to make it as clear and easy to follow as possible, but it can’t be denied that it’s kind of a beast.

Don’t bother trying to read the tiny text in the boxes; for now, just relax and enjoy the wash of incomprehensible geometry. After all, as of this writing, thirty days after Peter and I presented this document to the rest of the WGBH team, it’s already outdated. WGBH, as I’ve mentioned before on this blog, is an archive in transition, which is one of the reasons working on this project now was so crucial. The current workflow involves a lot of emailing back and forth, entering information into Excel spreadsheets (unless someone else is using them first), moving folders around in a share drive, and assigning files color-coded labels. As a newbie myself, I can vouch for the fact that this is a difficult system to explain to newcomers, which is a problem for an archive that hosts several new interns every year. It also tends to result in a lot of this:

[Screenshot]

Everyone at WGBH wants to streamline the current workflow, get rid of unnecessary steps and outdated practices, and figure out better tracking – having an easy way to tell who’s doing what when is key for ensuring that work isn’t duplicated and material doesn’t slip through the cracks. However, two major changes are coming down the pipeline, which may also alter the workflow significantly.

The first is that the Media Archive Research System, aka MARS – the complex FileMaker database which currently stores and links together all of WGBH’s data about its programs, physical media assets, original video content, and licensed media – is being replaced. In the long run, the new database should be friendlier for WGBH production departments, making it easier for them to enter and store metadata for the use of the rights department and the archives, and to retrieve the data they need on the other end. In the short term, however, there are still a lot of question marks about how exactly the new database is going to link up with the current archival workflows around metadata management.

The second factor is the adoption of the HydraDAM system for digital preservation, which I talked about in a previous blog post. HydraDAM isn’t intended to be the primary source for content metadata, but again, in theory, it should be able to automate a lot of the processes that archivists are currently doing manually, such as generating and comparing checksums to ensure safe file transfers. But until HydraDAM is ready to kick into full gear, we won’t know for sure how nicely it’s going to play with the rest of the systems that are already in place in the archives.
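For readers who haven’t met checksums before, here is a minimal Python sketch of the generate-and-compare step. It is not HydraDAM’s code or the procedure WGBH actually runs; the function names `file_checksum` and `verify_transfer` and the choice of SHA-256 are assumptions made purely for illustration.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading in chunks so large media files
    do not have to fit in memory. The algorithm choice is an assumption."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source, destination, algorithm="sha256"):
    """Return True when the copy at `destination` has the same checksum as `source`."""
    return file_checksum(source, algorithm) == file_checksum(destination, algorithm)
```

If `verify_transfer` returns False, the copy differs from the original and the transfer should be repeated before the source is deleted or reused.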

We want to make sure these transitions go as smoothly as possible, and as we’re cleaning up the workflow to make it easier for the staff, we also want to make sure we’re not making any changes we’ll regret when the new systems come online. That’s why we let workflows eat our brains for most of December, throwing ourselves into the task of creating the most monstrously in-depth diagram of the current workflow that we possibly could.

Then we used the original beast to build this:

[Screenshot: diagram of the proposed workflow]

OK, yes, it still looks pretty beastly, but in theory it’s the first step on the road to a better, stronger, faster beast. However, the real focus here is all those gold-colored boxes on the chart. They represent elements of the workflow that we know we’re going to need, but don’t yet know exactly how they’re going to happen in the oncoming Hydra-PYM future – whether that’s because we haven’t finished developing the tool that’s going to do it, or because we need to research new tools that we haven’t yet got. We’re using this proposed workflow itself as a tool for discussion and planning, to help target the areas to focus on as we move forward in implementing new systems, and make sure that we’re not leaving out any crucial functionality that we’re going to need later down the line.

Although there’s still a long timeframe for the WGBH workflow overhaul, the work we’ve done has already had some immediate results – including the adoption of a new project management system to help the department keep better track of our ingest process. We’ve decided to use Trello, a flexible online application that allows us to create ‘cards’ representing each delivery of production material that comes into the archive, and add lists of tasks that need to be accomplished before the accession process is complete.
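To make the card-and-checklist idea concrete, here is a rough Python sketch of the data model. It does not use Trello’s API and is not how our board is actually configured; the `DeliveryCard` class and the task names are hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class DeliveryCard:
    """One delivery of production material, plus the tasks left before accession is done."""
    title: str
    tasks: dict = field(default_factory=dict)  # task name -> True once finished

    def add_task(self, name):
        # Adding a task twice leaves its current status untouched.
        self.tasks.setdefault(name, False)

    def complete(self, name):
        if name not in self.tasks:
            raise KeyError(f"unknown task: {name}")
        self.tasks[name] = True

    def remaining(self):
        return [name for name, done in self.tasks.items() if not done]

    def is_accessioned(self):
        # A card with no tasks at all does not count as finished.
        return bool(self.tasks) and not self.remaining()


# Hypothetical delivery and task names, for illustration only.
card = DeliveryCard("Hard drive from production")
for task in ("log delivery", "verify checksums", "enter metadata"):
    card.add_task(task)
card.complete("log delivery")
print(card.remaining())
```

Keeping the outstanding tasks on the card itself is what makes it easy to see at a glance who still needs to do what before a delivery is fully accessioned.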

[Screenshot: Trello board]

When we presented our Trello test space to the MLA team at WGBH, we thought we would have to build in some time for testing and for people to get used to the idea of tracking their work in an entirely different system. However, everyone we spoke to was so excited about the change that the decision to jump on board right away was pretty much unanimous.

Inspired by the activity on the main MLA Trello, I’ve also created a Trello board to keep track of my own next phase of NDSR — which will be a topic for my next post, in which I solemnly swear to have minimal geometry.

– Rebecca