Munoz – Question 2
In your role, are the main pressures on and needs of data curation the same across the sciences, social sciences, and humanities?
Associate Director, Maryland Institute for Technology in the Humanities (MITH); Assistant Dean of Digital Humanities Research, University Libraries – University of Maryland
In many ways the needs of and pressures on data curation are the same across the social sciences, humanities, and sciences. Moreover, since data curation, like other information sciences, is a “meta-discipline”[1] rather than being strictly of the humanities, social sciences, or hard sciences, there are shared concepts and terminology that we can apply to the problems of data curation across disciplines.[2] Certainly, the technical challenges of preserving digital information (unstable media, basic bit-level data integrity, rendering many different file formats over time, and the architectures and economics of data storage) are shared problems. These pressures may differ in intensity across communities. Many humanists do not (yet) have to contemplate data volumes measured in exabytes as astronomers do. Already, though, humanists are working with volumes of data too large to manipulate and analyze with conventional, widely available tools; this trend is likely only to grow as researchers devote increasing attention to the contemporary moment.[3] The basic structure of the scholarly communication economy is another shared challenge for data curation across the disciplines. Data curation must engage with which scholarly outputs are “published,” how credit is apportioned for them (the citation of datasets, for example), and the role of for-profit corporations in the dissemination of research (in relation to open access and open data movements).
Even in the areas where there may be the most divergence about the needs of and pressures on data curation (the evidentiary role of information, the place of “interpretation,” and the nature of “proof”), I think there is more commonality than difference. As the amount of information grows and as research in many disciplines increasingly involves managing and analyzing larger and larger amounts of information, human judgment is increasingly paired with (automated) machine analysis. (Part of data curation is working better with “robots.”) More rigorous conceptual and semantic modeling is needed to incorporate automated algorithmic processes into research, from the search engines everyone uses to the sophisticated data-filtering techniques necessary for working with streams of data coming from large telescopes.[4] Because machines increasingly and inevitably mediate our work with data, the challenges of preserving (human) meaning and interpretation in computational environments for data curation are significant, and this is true across the social sciences, humanities, and the sciences.[5]
1. Marcia J. Bates, “The Invisible Substrate of Information Science,” Journal of the American Society for Information Science 50, no. 12 (1999): 1043–50.
2. See Allen H. Renear, Molly Dolan, Kevin Trainor, and Melissa H. Cragin, “Towards a Cross-Disciplinary Notion of Data Level in Data Curation,” November 11, 2009, https://www.ideals.illinois.edu/handle/2142/14547.
3. “Big Data,” Wikipedia, the Free Encyclopedia, February 9, 2013, http://en.wikipedia.org/w/index.php?title=Big_data&oldid=537356997.
4. I am indebted to Sayeed Choudhury for the telescope example; any failing in the paraphrasing is mine. Personal conversation.
5. Compare David Dubin, Karen M. Wickett, and Simone Sacchi, “Content, Format, and Interpretation,” Balisage: The Markup Conference 2011, August 2–5, 2011, http://www.balisage.net/Proceedings/vol7/html/Dubin01/BalisageVol7-Dubin01.html, doi:10.4242/BalisageVol7.Dubin01; and Julia Flanders and Trevor Muñoz, “An Introduction to Humanities Data Curation,” DH Curation Guide, September 22, 2011, http://guide.dhcuration.org/intro/.