Shira Peltzman over at NDSR NY has already written up an excellent post about AMIA’s inaugural open-source track this year, but since it so heavily informs the work that I’m doing at WGBH, I wanted to take a little bit of a closer look at some of the themes that were running through this section of the conference.
(…ok, maybe I just wanted a reason to post a lot of puppy gifs. But we’ll get there.)
To recap, open-source software is built upon essentially free code — it’s available publicly and collaboratively developed, so anybody can, in theory, download it, implement it and improve upon it. Hack Day, which Joey and I posted about last week, is supposed to result in the development of open-source tools that can be freely used by the community. (The main site used for collaborative coding is GitHub, and knowledge of how to use GitHub is almost essential for working in the open-source community; the track this year actually started with a demo from LoC’s Lauren Sorensen about how to dip your feet into the GitHub waters. GitHub’s tools for submitting comments and changes to a project and tracking their implementation can actually be pretty useful for other things besides code — the PBCore Committee, for example, is using GitHub to receive comments and review updates to the PBCore metadata standard — but that’s a whole other story.)
Anyway, there are a lot of reasons why open-source is a pretty great thing for archives. For one thing, the fact that the code isn’t locked away behind a proprietary license makes it much more likely that people ten, twenty or fifty years in the future will be able to figure out how it worked and how to recreate or emulate it — reassuring, if you don’t want to run the risk of losing content due to obsolescence, or of uploading a bunch of material and metadata into a system that you can’t then get it back out of. Additionally, open-source technology provides a lot more opportunities for archives to customize and control their own preservation solutions; as an archivist, it’s always fairly unnerving to feel like the survival of your content is completely in somebody else’s hands.
Open-source also tends to sound like a great solution financially for archives. Who wants to pay software licensing fees when you can just download code from GitHub that will do the same thing for free? However, this is where it gets a little tricky. WGBH’s Karen Cariani, in one of the most quotable moments of the open-source stream, explained it like this:
What people expect from open-source software is something like free beer.
However, what you actually tend to get is more like a free puppy.
At this point you may be thinking, ‘but puppies are great! Way better than beer!’ It’s true, puppies are pretty great. They’re cute and they’re cuddly and when you contribute to their support you get a warm feeling inside of generally doing the right thing for the universe. Still, when someone gives you a puppy, it’s not exactly ‘free’ — now that you’ve got the puppy, you have the responsibility of shelling out a significant amount of cash on food, equipment, vet’s bills … not to mention the responsibility of housetraining it, taking it for daily walks, and cleaning up after it when it forgets all the training you gave it and pees on the floor. And this is all now going to be your job for pretty much the rest of the puppy’s lifespan.
That’s basically what open-source software is like. You’re getting the initial code for free, and that’s pretty great — but once you’ve got the code, getting it to work is going to involve a significant amount of time, and probably a significant amount of money as well. When you work with a proprietary software company, training the dog and walking it and taking it to the vet (in other words, customizing it, updating it and checking it for bugs) are all part of the company’s job; you’re paying them to take care of all that hassle for you. If you’re jumping on the open-source train, either you then have to hire someone else to make your open-source software behave — and a lot of open-source companies fund themselves by hiring out developers to do that — or figuring out how to do it becomes your job. And if you’re an archivist in a financially-strapped institution, odds are you’re already doing at least two jobs.
The idea here isn’t to discourage people from using open-source tools; far from it! Karen Cariani made this analogy as part of her presentation about WGBH’s decision to work with Hydra, an open-source repository solution that’s being adopted by a number of large archives. (I talked about this a little in my post on change management, too.) All the great reasons that I mentioned above for archives to invest in open-source software remain really solid reasons to invest in open-source software. It’s just important to be aware that it is an investment, and not go in expecting to get something exciting for nothing.
The thing about open-source software, though, is that the more people become aware of the options, and start talking about them and using them and documenting them and contributing to them, the better and easier they all become for everybody. The power of open-source comes from an informed community. The importance of AMIA’s open-source track this year wasn’t even so much about the actual tools presented, although of course there were a lot of fantastic open-source tools presented (in addition to Hydra, the WebVTT standard for time-aligning metadata with web-streaming content got a lot of buzz, and I’ll never pass up an opportunity to give the QCTools project a shout-out, since it’s going to be a godsend for anyone whose job involves error-checking digital video files). But specific projects aside, in order to be part of the open-source community, it’s important to really understand what open-source is and what it means, and the frank and open discussions about open-source at AMIA this year played a huge role in broadening that understanding.
– Rebecca (who does really like free puppies)