
Dacura: A new solution to data harvesting and knowledge extraction for the historical sciences

Pages 165-174 | Published online: 20 Mar 2018
 

ABSTRACT

New advances in computer science address problems that historical scientists face in gathering and evaluating the now vast data sources available through the Internet. As an example, we introduce Dacura, a dataset curation platform designed to assist historical researchers in harvesting, evaluating, and curating high-quality information sets from the Internet and other sources. Dacura uses semantic knowledge graph technology to represent data as complex, interrelated knowledge, allowing rapid search and retrieval of highly specific data without the need for a lookup table. Dacura automates the generation of tools to help non-experts curate high-quality knowledge bases over time and to integrate data from multiple sources into its curated knowledge model. Together, these features allow rapid harvesting and automated evaluation of Internet resources. We provide an example of Dacura in practice as the software employed to populate and manage the Seshat databank.

Acknowledgments

The authors wish to thank the participants in a workshop held at the Santa Fe Institute on May 4–6, 2015, during which the needs for harvesting and integrating quality information were discussed and the Seshat meta-model was developed. We gratefully acknowledge the contributions of our team of research assistants, post-doctoral researchers, consultants, and experts. Additionally, we have received invaluable assistance from our collaborators. Please see the Seshat website (www.seshatdatabank.info) for a comprehensive list of private donors, partners, experts, and consultants and their respective areas of expertise. Finally, we want to thank the anonymous reviewers whose insightful comments allowed us to substantially improve this paper.

Notes

1. The Dacura software was developed as part of the European ALIGNED Horizon 2020 project. All of the software, as well as other useful tools for managing semantic datasets and knowledge graphs, is available under an open source license through the project's web site at http://aligned-project.eu/open-source-tools/. However, this still requires users to install and configure their own knowledge-graph server, which remains a complex undertaking. Our goal is also to make the system available to researchers through the web as a service, in such a way that no technical knowledge is required to use it. We anticipate releasing a pilot version of this service in the middle of 2018. In the meantime, for any researchers who are particularly interested in seeing and using the web service, we are running an ongoing series of trials with Seshat and other collaborators and are open to new research collaborations. For updates, check the Dacura website at http://dacura.cs.tcd.ie/ or contact us by email at [email protected].

2. The version of Dacura shown on the site is a mock-up that lacks full functionality and is intended solely to give readers a sense of what the Dacura interface and output might look like when installed on their institutional computer. This version searches only within a portion of the Seshat databank and returns only simple HTML output. As discussed in the article, the fully functional version of Dacura does much more than the mock-up provided here.

Additional information

Funding

This work was supported by a John Templeton Foundation grant to the Evolution Institute, entitled “Axial-Age Religions and the Z-Curve of Human Egalitarianism,” a Tricoastal Foundation grant to the Evolution Institute, entitled “The Deep Roots of the Modern World: The Cultural Evolution of Economic Growth and Political Stability,” an ESRC Large Grant to the University of Oxford, entitled “Ritual, Community, and Conflict” (REF RES-060-25-0085), a grant from the European Union Horizon 2020 research and innovation program (grant agreement No 644055 [ALIGNED, www.aligned-project.eu]), and a European Research Council Advanced Grant to the University of Oxford, entitled “Ritual Modes: Divergent modes of Ritual, Social Cohesion, Prosociality, and Conflict.”
