DBpedia Blog

LOD activities at the National Archives of the Netherlands


By Ed de Heer

About the National Archives

This article describes the Linked Open Data (LOD) activities of the National Archives of the Netherlands and is based on my presentation at Semantics 2022 in Vienna.

At the National Archives, people find information about their lives, about Dutch political and administrative history, and about society. Our mission is: “we serve every person’s right to information, and we offer insight into the history of our country.” The National Archives believes in the power of open data, and we want to offer as much open data as possible, not only to the government and historians but also to third parties that develop new applications and websites. In this way the general public can participate, and new ways of disclosing heritage information can arise. We publish our data (archives, indexes, and photographs) under a CC0 license as CSV and XML and through APIs. Below is an overview of our overall collection and services.

Linked open data

We have been working on Linked Open Data since 2018, when we started our first LOD experiments and bought an ETL tool to transform our data to RDF. In 2019 we developed a URI strategy and started to model the indexes. Our indexes cover subjects such as enslaved people and slavery, fishing rights, emigrants, and finance, so we had to develop several LOD models and use different ontologies. We have just finished publishing our 400,000 digitized pictures under a CC0 license as RDF through our SPARQL endpoint (Beta): https://www.nationaalarchief.nl/onderzoeken/linked-open-data/sparql-interface
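For readers who have not used a SPARQL endpoint before, here is a minimal sketch of how a query can be turned into a request. It assumes the endpoint accepts GET requests with a `query` parameter, as described in the SPARQL 1.1 Protocol. The endpoint address `https://example.org/sparql` is a placeholder, because the link above points to the interface page rather than to a machine endpoint, and the query is deliberately generic because the data model is not described here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder endpoint (assumption): replace with the real endpoint address.
ENDPOINT = "https://example.org/sparql"


def build_query(limit: int = 10) -> str:
    """Return a generic SPARQL query that lists a few arbitrary triples."""
    return f"SELECT ?s ?p ?o WHERE {{ ?s ?p ?o }} LIMIT {int(limit)}"


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request URL (SPARQL 1.1 Protocol style)."""
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Only builds and prints the URL; no network request is made.
    print(build_request_url(ENDPOINT, build_query(5)))
```

The resulting URL can be opened with any HTTP client. Which result formats are available depends on the endpoint.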

Challenges of linked open data

When transforming our data to RDF we faced some challenges. One is data quality. We do not improve the quality of our data: to curate it, we would have to check the original archives, which would take a lot of effort. And what is right or wrong? When a particular archive says “Amsteredam” instead of “Amsterdam”, the record states “Amsteredam”, because that is the original spelling in the archive. Another challenge is that within an organization such as the National Archives many stakeholders are involved: IT, Collection, Services, and management. It takes a lot of time and effort to get all the priorities straight.
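To illustrate what keeping the original spelling means in RDF terms, the sketch below writes a value to an N-Triples statement exactly as it appears in the source, without any normalization. The record URI and the property `placeAsWritten` are hypothetical names chosen for this example; they are not the identifiers or ontology terms we actually use.

```python
# Hypothetical identifiers, for illustration only.
RECORD = "https://example.org/record/1"
PLACE_AS_WRITTEN = "https://example.org/def/placeAsWritten"


def literal(value: str) -> str:
    """Serialize a string as an N-Triples literal, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one N-Triples statement; subject and predicate are IRIs."""
    return f"<{subject}> <{predicate}> {obj} ."


if __name__ == "__main__":
    # The spelling from the archive is copied verbatim, not corrected.
    print(triple(RECORD, PLACE_AS_WRITTEN, literal("Amsteredam")))
```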

The Verkaufsbücher

One of our most successful LOD projects is the Verkaufsbücher. This is an administration kept by the Nazis during World War II in which they recorded the expropriation of Jewish properties in the Netherlands. These houses were “bought” from their Jewish owners far below the real price, and the owners were often deported shortly afterwards. The National Archives wanted to visualize this story and this data. We worked with the land registry of the Netherlands (Kadaster) and developed a data story: https://labs.kadaster.nl/stories/verkaufsbucher/index.html. A Dutch broadcasting company noticed this data story and aired an item about it on national television. The broadcast generated a lot of exposure and drew the attention of Dutch government agencies and municipalities. As a result, local governments have started to investigate what happened to these properties during and directly after the war, and some municipalities are going to compensate the victims or their next of kin.

Digital Heritage Network and the Dataset Register

None of these LOD developments thrive on their own. Working together with other institutions and professionals is vital. The Dutch Digital Heritage Network is a partnership of cultural heritage agencies in the Netherlands. It focuses on developing a system of national facilities and services for improving the visibility, usability, and sustainability of digital heritage based on linked open data. The network is open to all Dutch institutions and organizations in the digital heritage field.

The Dataset Register is an initiative of the Digital Heritage Network and is hosted by the National Archives. The register provides insight into the availability of datasets in the heritage field and thus stimulates their use. It also makes it easier to publish information about heritage datasets. By analyzing the datasets we can build a knowledge graph of heritage data that makes the data easier to use, and the Dataset Register can help software such as Google find collections.

Heritage institutions are encouraged to make their datasets available, to describe them, to publish them online, and to submit the URLs of the dataset descriptions to the Dataset Register. The Dataset Register then retrieves the dataset descriptions, creating an overall picture of what is available. See also https://datasetregister.netwerkdigitaalerfgoed.nl/?lang=en
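As an illustration of what a dataset description might look like, the sketch below builds a small JSON-LD document. It assumes the description uses the schema.org Dataset type, which is an assumption made for this example and not a statement of the register's exact requirements. The dataset URL, name, and download URL are placeholders.

```python
import json


def dataset_description(dataset_url, name, license_url, publisher_name,
                        download_url, media_type):
    """Build a minimal schema.org Dataset description as a JSON-LD dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": dataset_url,
        "name": name,
        "license": license_url,
        "publisher": {"@type": "Organization", "name": publisher_name},
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": download_url,
                "encodingFormat": media_type,
            }
        ],
    }


# Placeholder values (assumptions), apart from the CC0 license URL.
description = dataset_description(
    dataset_url="https://example.org/datasets/photos",
    name="Example photo collection",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    publisher_name="Example Archive",
    download_url="https://example.org/datasets/photos.nt",
    media_type="application/n-triples",
)

if __name__ == "__main__":
    print(json.dumps(description, indent=2))
```

An institution would publish a document like this at a stable URL and submit that URL to the register, which then retrieves the description.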

Drs. Ed de Heer MIM is an advisor and project manager for Linked and Open Data at the National Archives and administrator of the Dataset Register.