Since I will be spending my residency here at Tufts considering new ways for users to access university-associated research data, today’s meal comes fresh from the kitchen of research data management. Data sharing among researchers is certainly not a new topic of discussion, but it has gained notable fervor in light of expanding digital technologies over the past decade. Scholarly journals, advocating for improved access to research data, have become increasingly stringent in their data submission policies (Dryad has compiled a good list); and U.S. funding agencies such as the National Institutes of Health (NIH) and the National Science Foundation (NSF) have required data management plans from grant recipients for several years now. Non-government funders have also followed suit. Like the NIH, the Bill and Melinda Gates Foundation requires a data management plan for grants over $500,000. The Gordon and Betty Moore Foundation has published an extensive “data sharing philosophy” and encourages all funded researchers to share their data as soon as possible (in a manner consistent with applicable laws).
What effect policy and discourse are having on the research community is still open to debate, however. Last spring, Patrick Andreoli-Versbach and Frank Mueller-Langer published a study showing that out of 82 empirical researchers, only 12 shared research data regularly. Conversely, a study by Genevieve Pham-Kanter, Darren Zinner and Eric Campbell published in PLOS ONE last month suggested that data sharing among life scientists was boosted by changes in data management policy and by the growth of third-party data repositories and online data supplements.
It is generally accepted that some fields – physics for example – are better at curating and sharing data than others. Last month, the Signal (the terrific digital preservation and access blog of the Library of Congress) posted an interview with Elizabeth Griffin, an astrophysicist at the Canada-based Dominion Astrophysical Observatory. Griffin describes the astronomical community as being generally more advanced in data sharing and management for reasons that include: a smaller community size compared to other natural sciences, an “attendant international nature [that]… also requires careful attention to systems that have no borders,” and a history of making sure analog data was curated and accessible.
A multicolored mosaic of topics and issues emanates from the subject of preserving and sharing research data. (I would make a “like a forest of autumn leaves” analogy but I don’t want to push it.) I have touched upon only a few notes in my first post, but I am looking forward to considering a great many more. I hope you will chime in for discussion; there is hot cider waiting and plenty of room at the table!