Publishing the Archive: Definitions and Typologies
In “Disrespect des Fonds: Rethinking Arrangement and Description in Born-Digital Archives,” Jefferson Bailey describes the need to redefine our thinking with regard to arrangement and description, as well as notions of the fonds, in light of the new conceptual and practical structures that digital materials enable.1 Further, our current practices of digital publishing and the dissemination of archival materials via online digital collections also call into question fundamental archival principles.
This essay will address shifts in definitions of the archival “collection” and “publishing” resulting from the proliferation of digital archival materials available online. What does it mean to disseminate the collection itself, rather than simply distributing the products of archival research? How does this relate to other modes of scholarly communication derived from archives, such as scholarly editions and the publication of selected collection items? What roles do structure and narrative play in these definitions? While there are no easy answers to these questions, some examples of what has been done can point us towards next steps and future approaches.
To elucidate my discussion, I will refer to digital collections and exhibitions created in the Cornell University Library by the Division of Rare and Manuscript Collections,2 and to several multi-institutional projects. These collections and exhibitions will serve as case studies for my analysis, which will be grounded in some of the practicalities of managing and maintaining collections and their metadata at Cornell. In particular, I will focus on the ambiguous definition of “collection.” By presenting a typology of digital collections, I will explain how the archival principle of provenance is tested and will address issues of replication and re-use. What are the conceptual implications of a digital collection existing in multiple online locations? How does all of this relate to the physical collection?
Since the nineteenth century, archivists have defined archival collections through such concepts as respect des fonds, original order, and the related concept of provenance. The creation of digital collections, however, is testing these definitions. Given that a digitized collection often comprises a selection of materials intended to represent the entire collection (sometimes selected simply because those portions do not have copyright restrictions), how are we to regard this “artificial” collection? Should we treat it as part of the original, or as a sort of parallel entity?
Respect des fonds, an approach articulated in the mid-nineteenth century by the French National Archives, means that an archival collection is defined, described, and maintained as a group based on the creator of the records. The related concept of original order, defined by the Prussian State Archives later in the nineteenth century, emphasized the importance of maintaining the existing internal structure of the fonds. In the United States, the principle of provenance is used interchangeably with these two concepts.3 But these defining principles are challenged when we consider the creation and online publication of digital surrogates. The matter is further complicated when we think about born-digital materials, their physical and intellectual separation from the paper materials in a collection, and their arrangement.
- Curated selections from a single collection. Created for access reasons and often funded by a grant or a donor, but with insufficient resources to digitize the entire collection, or with copyright restrictions that preclude some images from being posted online; see Andrew D. White Architectural Photographs Collection, in which the digitized images represent 10 percent of the archival collection.4
- Subject-based aggregation of digital or digitized media from many collections. Created for access reasons, frequently for a particular scholarly or popular audience; see Films from the University Archives, intended to aggregate digitized films from many collections related to Cornell history.5
- Aggregations of digitized images based on patron requests. Accrual of digital images based on researcher requests for on-demand scanning; see Images from Cornell’s Rare Book and Manuscript Collections.6
- Born-digital additions to physical archival collections. Accessioned as part of an existing physical collection, but accessible only in digital form; see web sites archived by the Division of Rare and Manuscript Collections at Cornell, which are related to organizations whose records we collect.7
- Born-digital collections. Collections arriving in entirely digital form; see Rose Goldsen Archive of New Media Art, a collection of digital and interactive art with a concentration on twenty-first-century developments in cinema, video, installation, photography, and sound.8
- Digital exhibitions. Curated assemblages with a thematic narrative, which often draw from many collections; see Wardrobes and Rabbit Holes: A Dark History of Children’s Literature.9
The fact that each of these collection types employs different access mechanisms from physical special-collections material (and even from other digital collection types) results in the fragmentation of archival collections.
These digitized, artificial, and fragmented collections can take on lives of their own in different formats. For example, each of our digital collections has its own catalog record and listing in Cornell’s Registry of Digital Collections (Figure 1)10; in other words, they are treated as entities separate from, or perhaps parallel to, the archival collections from which they were derived. This practice was developed largely for practical reasons: to provide researchers access via the library catalog, and to have a single list of available digital resources. Given the distributed nature of digital work in the library, this practice also evolved because digital collections are maintained by the library’s central digital projects group, which is organizationally separate from the archives; they are viewed as independent entities that require maintenance and migration based on their technology platform. In contrast, born-digital materials are treated as additions to their collections, managed by the archives, and described using archival methods.
Additionally, the web sites for our exhibitions11 are not treated with any descriptive formality, even though they are highly curated and often close in structure to traditional narrative publications. Exhibitions and their accompanying web sites, which function like print catalogs, are not treated as bibliographically separate from our web site and do not have catalog records. Again, this is a practice brought on by organizational structures, since creating web exhibitions has been the responsibility of our in-house web designer, and our digital exhibitions are published as traditional HTML web pages.
Archivists (among them, me) are accustomed to being involved in the process of scholarship and scholarly publishing. I consider archival collections to be the raw materials for historical interpretation via academic articles, books, and, more recently, re-use in various forms online. At Cornell, in addition to numerous textual citations from our collections, several hundred images from collections are published each year in academic and non-academic works.12 Occasionally, entire texts or edited groups of texts are published in scholarly editions or printed manuscript collections. In that sense, we are very familiar with the circulation of our collection materials in print and electronic form.
If “publishing” means issuing for public distribution or even the process of “making public,” then digital collections are also publications. If this is the case, then what are we to make of the fact that we are not publishing interpretation, but the collection itself? And what are the conceptual implications of a digital collection existing in multiple online locations? For example, what if a collection is available via a local database, Flickr, and ARTstor? How does the online presence of a digital collection relate to the corresponding physical collection? These are questions we ask whenever we create new digital collections, and the answers help determine collection descriptions, metadata structures and syncing, linking within finding aids and elsewhere, use patterns, and how we explain collections to researchers in person and online.
I would posit that posting the “raw data” of archival collections online does indeed constitute publication, even in an interpretive sense. Although we are accustomed to narrative forms of publication (e.g., books and exhibitions), digital collections enable new, non-linear structures to emerge. Image collections are a good example, and the art and architectural history communities have embraced the idea; for instance, the Society of Architectural Historians (SAH) has been a leader in scholarly publishing via SAHARA (Society of Architectural Historians Architecture Resources Archive), a digital image collection curated and peer reviewed for content.13 This is an artificial collection created collectively for use by a specific scholarly community, and the process of collecting, curating, and metadata creation is crowdsourced among SAH members and editors.
Even these nuanced definitions of “collection” are complicated, in both a practical and a conceptual sense, when we begin to think about different “editions” of digital collections. Archivists now have legacy digital collections from fifteen years ago that they have migrated from one format to another. We might move a collection from a legacy database to a more up-to-date digital asset management system, for example. In addition to legacy collections and migration, there are also cases of multiplication across different systems and collections. For instance, we might share a collection on our local servers with a multi-institutional repository like HathiTrust, Internet Archive, or ARTstor. We might post images in Flickr, HistoryPin, and on social media sites. The Samuel J. May Anti-Slavery Collection,14 for example, is available on local servers and, in part, in HistoryPin via a multi-institutional PBS project, the Abolitionist Map of America, which in turn was created in conjunction with an American Experience documentary.15 Several other collections, such as the Icelandic and Faroese Photographs of Frederick W.W. Howell, are simultaneously hosted on a local library server and presented via Luna Insight (Figure 2),16 available in Shared Shelf Commons (Figure 3),17 and accessible via Flickr Commons (Figure 4),18 in each case with different features available for display, searching, navigation, mapping, and grouping.
This multiplicity and dissemination of editions is not new to archives. Consider, for example, the mass-microfilming projects of the late twentieth century and their role in distributing access to archival collections. These projects required archives to duplicate their holdings for the publication and distribution of both entire collections and selected portions of them. Microfilming projects still inform the digital content we produce today; digital collections are created from previously microfilmed collections based on their historical importance, and it is remarkably easy to digitize materials from the microfilm format. An example at Cornell is our Witchcraft Collection, which we hold in part on microfilm and in part as a digital collection created from microfilm.19 The digitization of collections raises the same issues as the earlier microfilming projects. However, the ease with which digital collections can be redesigned, remixed, and reissued is new to archives.
This replication is not a cause for concern in archives as long as we have preserved any original paper documents and digital files. However, it does bring about inequality in access to different parts of the collection. Depending upon how you look at it, either the physical collection or the digital portion of it can be orphaned from the rest. As archivists continue to make materials accessible online, and feel more pressure from donors and researchers to digitize their collections, this fragmentation will continue.
The data-like structure of online digital collections (standardized units of images and metadata) provides endless opportunities for re-use and rearrangement. A researcher can parse by metadata fields, including subject, date, and copyright status, and is thus empowered to repurpose and play with archival materials in digital format, whether in a group or presentation made in ARTstor for academic teaching, through re-use of a public-domain image from Flickr, or through the use of such multi-institutional repositories as Shared Shelf Commons, which allow for searching across the boundaries of institutions or archival collections.20
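To make this kind of parsing concrete, the following is a minimal sketch in Python. It assumes a deliberately simplified record structure; the field names, rights values, and sample records are invented for illustration and are not drawn from any actual Cornell, ARTstor, Flickr, or Shared Shelf Commons schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """One digitized item with a few illustrative (hypothetical) metadata fields."""
    identifier: str
    collection: str
    subject: str
    year: int
    rights: str  # hypothetical values: "public domain" or "restricted"


def select(records, subject=None, year_range=None, rights=None):
    """Return the records that match every criterion that was supplied."""
    matches = []
    for record in records:
        if subject is not None and record.subject != subject:
            continue
        if year_range is not None and not (year_range[0] <= record.year <= year_range[1]):
            continue
        if rights is not None and record.rights != rights:
            continue
        matches.append(record)
    return matches


# Invented sample records from two hypothetical archival collections.
SAMPLE = [
    ImageRecord("img-001", "Collection A", "architecture", 1885, "public domain"),
    ImageRecord("img-002", "Collection A", "architecture", 1962, "restricted"),
    ImageRecord("img-003", "Collection B", "landscape", 1900, "public domain"),
    ImageRecord("img-004", "Collection B", "architecture", 1899, "public domain"),
]

# Public-domain architecture images from 1880 to 1900, regardless of collection.
reusable = select(
    SAMPLE,
    subject="architecture",
    year_range=(1880, 1900),
    rights="public domain",
)
print([record.identifier for record in reusable])
```

In this sketch the selection returns items from both hypothetical collections, so the researcher's grouping is defined by subject, date, and rights rather than by the collection of origin.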
The parsing of collections and metadata also provides for cross-collection data analysis on a large scale via automated textual analysis and visual pattern recognition. One example is the growing collection of collective bargaining agreements posted by the Kheel Center for Labor-Management Documentation and Archives at Cornell.21 In this case, collective bargaining agreements were digitized from several archival collections both for public access and for scholars wanting to conduct full-text and metadata analysis of the corpus of agreements.
Another example is the repurposing and visualization of metadata that is part of the multi-institutional Social Networks and Archival Context (SNAC) project. SNAC aims to aggregate archival name authority data in order to improve the discoverability of archival collections and to provide access to socio-historical contexts (which include people, families, and corporate bodies) in which the records were created.22 On the one hand, this project emphasizes the importance of provenance, allowing archivists and researchers to locate collections at multiple institutions, collating them by creator. On the other hand, the radial graph visualizations encourage the study of complex networks of people and archival collections, allowing new forms of historical research and interpretation to emerge.
When digital objects are available in multi-collection or even multi-institutional repositories, cross-collection searching and analysis obliterate our sense of the primacy of the fonds. Importantly, though, researchers can sort and define collections by multiple arrangements. Although the archival context may be lost, new opportunities for the use of archives arise. As Paul Conway has noted in his user study of digitized photographic archives, “a new theory of the use of archives—modes of seeing—might emerge from in-depth engagement with experienced users.”23 As historical researchers engage with digital archival materials, archivists will continue to reassess not only how we define our collections, but how our collections are redefined by their use.
The fragmentation of original collection structures requires us to adjust our definitions of the archival collection and of publishing, with practical implications for archivists and researchers. Digital collections provide greater access to materials, but often without the context so important to archives and archivists. Ultimately, this fragmentation creates new opportunities, including some that could be empowering to researchers and archives. In order for digital archival collections to support scholarly and archival needs and goals, archivists should continue to open themselves to the reconfiguration of collections, and to uses that enable this multiplicity of possibilities.
1. Jefferson Bailey, “Disrespect des Fonds: Rethinking Arrangement and Description in Born-Digital Archives,” Archive Journal 3 (Summer 2013), http://www.archivejournal.net/issue/3/archives-remixed/disrespect-des-fonds-rethinking-arrangement-and-description-in-born-digital-archives/.
2. “Overview,” Division of Rare and Manuscript Collections, Cornell University Library, accessed September 15, 2013, http://rmc.library.cornell.edu/.
3. For early- to mid-twentieth-century American interpretations of these concepts, see the writings of Waldo G. Leland, Ernst Posner (who worked in both the Prussian and American contexts), and T. R. Schellenberg. For thorough historical discussions and analyses of these concepts, see Bailey, “Disrespect des Fonds”; Nancy Bartlett, “The Origins of the Modern Archival Principle of Provenance,” in Bibliographical Foundations of French Historical Studies (New York: Haworth Press, 1992); and Luciana Duranti, “Origin and Development of Archival Description,” Archivaria 35 (1993).
4. “Andrew D. White Architectural Photographs Collection,” Cornell University Library, accessed September 15, 2013, http://resolver.library.cornell.edu/misc/4077228.
5. “Films from the University Archives,” MediaSpace, Cornell University Library, accessed September 15, 2013, http://media.library.cornell.edu/category/2_Collections%3EUniversity+Archives.
6. “Images from Cornell’s Rare Book and Manuscript Collections,” Division of Rare and Manuscript Collections, Cornell University Library, accessed September 15, 2013, http://resolver.library.cornell.edu/misc/5862713.
7. “Division of Rare and Manuscript Collections, Cornell University Library,” Archive-It.org, accessed September 15, 2013, http://www.archive-it.org/collections/3134.
8. “Rose Goldsen Archive of New Media Art,” Cornell University Library, accessed September 15, 2013, http://goldsen.library.cornell.edu/.
9. “Wardrobes and Rabbit Holes: A Dark History of Children’s Literature,” Division of Rare and Manuscript Collections, Cornell University Library, accessed September 15, 2013, http://rmc.library.cornell.edu/rabbithole/.
10. “Registry of Digital Collections,” Cornell University Library, accessed September 15, 2013, http://rdc.library.cornell.edu/search/index.php?mode=browse&type=Collection.
11. “Previous and Online Exhibitions,” Division of Rare and Manuscript Collections, Cornell University Library, accessed September 15, 2013, http://rmc.library.cornell.edu/events/previous_exhibitions.html.
12. In the Division of Rare and Manuscript Collections, we record permissions responses in our reference tracking system. This allows us to quantify the number of inquiries we receive and to understand the nature of the use of materials for publication.
13. “SAHARA,” Society of Architectural Historians, accessed September 15, 2013, http://www.sah.org/publications-and-research/sahara.
14. “Samuel J. May Anti-Slavery Collection,” Division of Rare and Manuscript Collections, Cornell University Library, accessed September 15, 2013, http://resolver.library.cornell.edu/misc/4270009.
15. “Abolitionist Map of America,” PBS, accessed September 15, 2013, http://www.pbs.org/wgbh/americanexperience/features/interactive-map/abolitionists-map/.
16. “Icelandic and Faroese Photographs of W.W. Howell,” Cornell University Library, accessed September 15, 2013, http://library24.library.cornell.edu:8280/luna/servlet/CORNELL~2~1.
17. Cornell University Library, “Icelandic and Faroese Photographs of W.W. Howell,” ARTstor, accessed September 15, 2013, http://resolver.library.cornell.edu/COLLECTION/501.
18. Cornell University Library, “Icelandic and Faroese Photographs of W.W. Howell,” Flickr, accessed September 15, 2013, http://www.flickr.com/photos/cornelluniversitylibrary/collections/72157623945676076/.
19. “Cornell University Library Witchcraft Collection,” Division of Rare and Manuscript Collections, Cornell University Library, accessed September 15, 2013, http://resolver.library.cornell.edu/misc/5618992.
20. “Shared Shelf Commons,” ARTstor, accessed September 15, 2013, http://www.sscommons.org.
21. “Collective Bargaining Agreements,” ILR School, Cornell University, accessed September 15, 2013, http://digitalcommons.ilr.cornell.edu/cba.
22. “SNAC: The Social Networks and Archival Context Project,” Institute for Advanced Technology in the Humanities, University of Virginia, accessed September 15, 2013, http://socialarchive.iath.virginia.edu/.
23. Paul Conway, “Modes of Seeing: Digitized Photographic Archives and the Experienced User,” American Archivist 73 (Fall/Winter 2010): 460.
Assistant Director, Technical Services and Curator of Digital and Media Collections, Division of Rare and Manuscript Collections – Cornell University