By Jason Casden and Brian Dietz
Social media platforms have become a venue where serious discourse takes place, but much of this critical and ephemeral content is lost to researchers because few institutions are collecting and preserving it. Tools have surfaced that enable cultural heritage institutions to harvest and preserve the social media record at a much lower cost than was previously possible. While these tools are becoming more sophisticated, two problems remain: there is limited documentation regarding archival collecting practices, and these tools are not easily deployed by institutions with limited IT support, such as smaller cultural heritage organizations with close community ties.
In 2015, the NCSU (North Carolina State University) Libraries was awarded a Library Services and Technology Act (LSTA) EZ Innovation grant from the State Library of North Carolina for the project “New Voices and Fresh Perspectives: Collecting Social Media,” which sought to address these problems. Project staff created a documentary toolkit that begins to address curatorial, legal, and ethical issues associated with archiving harvested social media data. We surveyed archival researchers about the value they saw in social media data for archival research, and we surveyed cultural heritage professionals about their current and anticipated collecting practices.
In addition to this outward-facing work, during the period of the grant, the Libraries collected over 1.2 million tweets from over 380,000 Twitter accounts, and 29,000 Instagram photographs and associated metadata records from approximately 18,000 Instagram accounts. Finally, as an outgrowth of our work and feedback from the community, we’ve developed a virtual social media harvesting environment. The Social Media Combine assembles social media data harvesting software, namely NCSU Libraries’ Lentil Instagram harvester and GWU Library’s Social Feed Manager Twitter harvester, along with the web servers and databases necessary for their use, into a single package that can be easily deployed to desktop and laptop computers by those with limited access to IT resources.
One particularly rewarding part of this project was the opportunity to work closely with computer science, library science, and public history graduate students from NC State and UNC at Chapel Hill. With the four students hired with grant funds, we implemented a paired work structure, with one pair of students placed in each of our Special Collections and Digital Library Initiatives departments. These collaborative teams were able to identify and implement adjustments to our project plan based on newly discovered information, which led to improved project outputs and high-quality, cited additions to their professional portfolios.
These accomplishments have positioned us to contribute further to this area of work. The documentary toolkit, while extensive, reflects a single institution’s attempt to establish guidelines for the curation and stewardship of social media data. The next step in the process will be to collaborate with colleagues. The Documenting the Now project develops, proposes, and refines a set of best practices and guidelines for archiving harvested social media data. We need to address each step in an archival object’s lifecycle (selection and appraisal, ingest, arrangement and description, discoverability, access by researchers, and preservation) while building a refined understanding of the legal and ethical considerations of harvesting and preserving these materials and making them accessible to researchers.
We’d like to express our gratitude to the State Library of North Carolina for its guidance and support throughout the grant process, as well as to the Institute of Museum and Library Services for making funding available through the Library Services and Technology Act.
Jason Casden is the Interim Associate Head, Digital Library Initiatives, and Brian Dietz is the Digital Program Librarian for Special Collections at NCSU Libraries. They were both Principal Investigators of “New Voices and Fresh Perspectives.”