Opportunities for Encoding EAD for Linked Data Extraction and Publication

 

ABSTRACT

This article follows the trajectory of support for linked data in Encoded Archival Description (EAD). While EAD3 brings new avenues for encoding URIs that may be extracted and transformed into linked data, support for encoding such URIs has existed in the standard from the beginning. The new version of the standard expands on methods that those already using EAD2002 may wish to begin implementing. Its new element <relations> provides several additional alternatives for encoding data that can be extracted as linked data. Indeed, one may now encode linked data directly within an EAD document for reuse and publication.

View correction statement:
Erratum

Author Bio

Ruth Kitchin Tillman is the principal cataloger and linked data strategist at the Penn State University Libraries. She served as a reviewer and proofreader of the EAD3 standard and its accompanying tag library and maintains pedagogically oriented tag libraries for EAD2002 and EAD3 at eadiva.com. Her research areas include practical reuse of resource description, applications for linked data, and labor in libraries, archives, and museums.

Notes

1. Society of American Archivists Technical Subcommittee for Encoded Archival Description, Encoded Archival Description Tag Library - Version EAD3, (Chicago: Society of American Archivists, 2015), http://www.loc.gov/ead/EAD3taglib/index.html (accessed September 5, 2017).

2. Wim van Dongen, “Report to the Technical Subcommittee for Encoded Archival Standards.” (Presented at Society of American Archivists Annual Meeting, Portland, Oregon, July 26, 2017).

3. Society of American Archivists Technical Subcommittee for Encoded Archival Description, Preface.

4. The schema.org ontology was developed by a partnership including major search engines for better interchange between a variety of data standards and a format which they could recognize and index. http://schema.org/docs/about.html

5. Proposal submitted by Richard Wallis on behalf of the Schema Architypes W3C Community Group at https://github.com/schemaorg/schemaorg/issues/1758 (accessed September 29, 2017).

6. This paper references ArchivesSpace exclusively when providing examples of software used for URI encoding in archival description. It is one of the few under active development and, in the author's location of North America, the only one of those to allow for the encoding of URIs and publication of those URIs as linked data.

7. Mark A. Matienzo, Elizabeth Russey Roke, and Scott Carlson, “Creating a Linked Data-Friendly Metadata Application Profile for Archival Description” (Poster presentation at DCMI International Conference on Dublin Core and Metadata Applications, Washington, DC, October 26–29, 2017).

8. Daniel Pitti, Bill Stockting, and Florence Clavaud, “Records in Contexts (RiC): a standard for archival description developed by the ICA Experts Group on Archival Description.” https://www.ica.org/en/records-in-contexts-ric-a-standard-for-archival-description-presentation-congress-2016 (accessed September 23, 2017).

9. Society of American Archivists, Encoded Archival Description Working Group, Encoded Archival Description Application Guidelines for Version 1.0, (Encoded Archival Description (EAD), Document Type Definition (DTD), Version 1.0, Technical Document No. 3), (1999). http://www.loc.gov/ead/tglib1998/tlprinc.html (accessed September 9, 2017).

10. Ibid., http://loc.gov/ead/tglib/elements/runner.html (accessed September 9, 2017).

11. Ibid., http://loc.gov/ead/tglib/elements/frontmatter.html (accessed September 9, 2017).

12. Ibid., http://loc.gov/ead/tglib/elements/persname.html (accessed September 9, 2017). The entry's description of @role also reads as though the author had MARC relators and/or Dublin Core on their mind.

13. Ibid., http://loc.gov/ead/tglib/att_gen.html (accessed September 9, 2017).

14. Or RDF, a series of W3C recommendations available at https://www.w3.org/standards/techs/rdf.

15. Ed Jones, “Introduction,” in Linked Data for Cultural Heritage, ed. Ed Jones and Michele Seikel (Chicago: ALA Editions, 2016), x.

16. National Information Standards Organization, Issues in Vocabulary Management (Baltimore: NISO, 2017), http://www.niso.org/apps/group_public/download.php/18410/NISO_TR-06-2017_Issues_in_Vocabulary_Management.pdf (accessed September 27, 2017).

17. Jones, xi.

18. A string literal (referred to in casual documentation as either a “string” or a “literal”) is a set of characters, including letters, numbers, and special characters. In a program, these are enclosed in quotation marks and treated as a functional unit, much as one would treat multiple words, dates, and symbols in a term from a controlled vocabulary as a unit. When an access point is text, rather than a URI, it is a string literal from a controlled vocabulary.

19. James Kim and Michael Hausenblas, “5-star Open Data,” 5-Star Open Data, http://5stardata.info/en/ (accessed September 12, 2017).

20. The examples in the Records in Contexts - Conceptual Model draft v0.1 demonstrate a conception of graphs as simply an alternative way to encode text, not as an opportunity to create linked data. None include URIs as the triple's object. https://www.ica.org/sites/default/files/RiC-CM-0.1.pdf (accessed September 21, 2017).

21. The technical specification for Turtle can be found at https://www.w3.org/TR/turtle/ (accessed September 18, 2017).

22. Although the Samvera URI Working Group's Predicate Decision tree was designed for use within a particular software ecosystem, its list of common ontologies may also be useful for archivists, https://wiki.duraspace.org/display/samvera/URI+Management+Working+Group?preview=/87460991/87462917/PredicateDecisionTree.pdf (accessed September 22, 2017).

23. One aspect of the linked open web is that anyone may say anything about anything. Multiple viewpoints about the same Thing may be represented. Or malicious, misinformed, or otherwise inaccurate assertions may cause improper inferences to be drawn. This should not deter cultural heritage organizations from engaging and attempting to contribute the best data possible.

24. Karen F. Gracy, “Archival Description and Linked Data: A Preliminary Study of Opportunities and Implementation Challenges,” Archival Science 15 (2015): 239–294.

25. Hillel Arnold, “Implementing Schema.org at the Rockefeller Archive Center,” Bits and Bytes (blog), October 17, 2013, http://blog.rockarch.org/?p=826 (accessed September 8, 2017).

26. Winona Salesky and Hillel Arnold, XTF-RAC, https://github.com/RockefellerArchiveCenter/XTF-RAC (accessed September 8, 2017).

27. Aaron Rubinstein, “Sharing Archival Metadata,” in Putting Descriptive Standards to Work (Chicago: Society of American Archivists, 2017).

28. Ed Jones and Michele Seikel, Linked Data for Cultural Heritage (Chicago: ALA Editions, 2016).

29. Society of American Archivists Technical Subcommittee for Encoded Archival Description, Preface.

30. ...or necessarily in the intention of its creators.

31. For example, Princeton's award-winning finding aid site allows one to view its finding aids as XML or RDF by appending .xml and .rdf respectively. One may see examples similar to those in this section in its EAD2002 finding aids and how they are then extracted as RDF.

32. Access points in EAD 1.0 differ, in some cases, from those in EAD2002. However, since the relevant attributes @authfilenumber, @source, and @role were all present in EAD 1.0, their inclusion from the beginning should be acknowledged.

33. Society of American Archivists, Encoded Archival Description Working Group, http://www.loc.gov/ead/tglib1998/tlatt1.html

34. <corpname>, <famname>, <geogname>, <name>, and <persname>

35. Society of American Archivists, Encoded Archival Description Working Group, http://www.loc.gov/ead/tglib1998/tlatt1.html

36. Library of Congress, LC Linked Data Service: Authorities and Vocabularies, http://id.loc.gov (accessed September 5, 2017).

37. Rubinstein, 339–340.

38. The Public User Interface release 2.1.0 includes these changes. https://github.com/archivesspace/archivesspace/releases/tag/v2.1.0 (accessed September 24, 2017).

39. In fact, other than matching the element to extract the URIs, the script does not need to know that what it's handling is a personal name.

40. Note that the two choose different ontologies to express a very similar type of relationship between their records, and thus the statements are equivalent only insofar as the similar types of relationship are nearly equivalent.

41. This is distinct from @arcrole, which will be noted and used in the section on <relation>.

42. Although these examples use @relator within <subject>, a repository's data manager may decide non-name access points have default URI mappings (such as always mapping <subject> to http://purl.org/dc/terms/subject) and make them part of scripted extraction.

43. Technical Subcommittee for Encoded Archival Context of the Society of American Archivists, Encoded Archival Context—Corporate Bodies, Persons, and Families (EAC-CPF Tag Library), 2014, Relations, http://eac.staatsbibliothek-berlin.de/fileadmin/user_upload/schema/cpfTagLibrary.html#d0e650 (accessed September 20, 2017).

44. Generally to be considered as defined in the International Standard for Describing Functions (ISDF), which may then be constrained to “functions of corporate bodies associated with the creation and maintenance of archives,” https://www.ica.org/en/isdf-international-standard-describing-functions (accessed September 8, 2017).

45. Any kind of resource, from an archival collection to a catalog record for their work to a digitized edition of that work and more. Of the attributes within relations, @resourcerelationtype is the only one to include the value “other.”

46. Society of American Archivists Technical Subcommittee for Encoded Archival Description, Preface.

47. Note that the prefix of the relation speaks to its type of Thing, rather than the type of relation between those two things.

48. The <objectxmlwrap> element was introduced from EAC-CPF and is not supported for those using the XML DTD instead of the XML Schema or RNG.

49. Although N-Triples and other methods of encoding RDF may resemble XML, only RDF/XML will validate within the <objectxmlwrap> element.

50. Society of American Archivists Technical Subcommittee for Encoded Archival Description, <source>, http://www.loc.gov/ead/EAD3taglib/index.html#elem-source (accessed September 17, 2017).

51. Although <source> does not include child elements to encode dates or geographic information separately from the <descriptivenote>.

53. “Introduction to Structured Data,” Google Developers, https://developers.google.com/search/docs/guides/intro-structured-data (accessed September 20, 2017).

54. Because of the variety of languages in which they exist and the speed with which such tools may be written and become obsolete, this paper does not make recommendations on transformational tooling.

55. Arnold, “Implementing Schema.org at the Rockefeller Archive Center.”
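
The scripted extraction mentioned in notes 39 and 42 can be illustrated with a minimal sketch. It is not the author's method or any particular tool's implementation. It assumes an un-namespaced, EAD3-style fragment in which access points carry a URI in @identifier and, optionally, a predicate URI in @relator; the sample data, the example.org URIs, and the function names are all hypothetical. The default mapping of <subject> to http://purl.org/dc/terms/subject follows the suggestion in note 42.

```python
import xml.etree.ElementTree as ET

# Hypothetical EAD3-style fragment; every URI below is a placeholder.
SAMPLE = """
<controlaccess>
  <persname identifier="http://example.org/names/1"
            relator="http://purl.org/dc/terms/creator">
    <part>Example, Person</part>
  </persname>
  <subject identifier="http://example.org/subjects/2">
    <part>Example topic</part>
  </subject>
  <subject><part>Uncontrolled local term</part></subject>
</controlaccess>
"""

# Assumed repository-level defaults for elements lacking @relator (see note 42).
DEFAULT_PREDICATES = {"subject": "http://purl.org/dc/terms/subject"}


def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}persname', if present."""
    return tag.rsplit("}", 1)[-1]


def extract_triples(xml_text, collection_uri):
    """Return (subject, predicate, object) URI triples for access points."""
    triples = []
    for element in ET.fromstring(xml_text).iter():
        obj = element.get("identifier")
        # Only URIs are extracted; string literals are skipped here.
        if not obj or not obj.startswith(("http://", "https://")):
            continue
        # The script never inspects what kind of name it is handling (note 39);
        # the element name is used only to look up a default predicate.
        predicate = element.get("relator") or DEFAULT_PREDICATES.get(
            local_name(element.tag)
        )
        if predicate is not None:
            triples.append((collection_uri, predicate, obj))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(extract_triples(SAMPLE, "http://example.org/collections/1")))
```

Access points without a URI, such as the third entry in the sample, are simply passed over; a repository could instead choose to emit them as string literals.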
