Knowledge Graphs

Sneak Preview for DBpedia Fusion Knowledge Graph (BETA)

We are informally publishing the dev/beta version of our new dataset, DBpedia Fusion KG, which is available under the Business Source License (BSL) 1.1. Slides with detailed information: https://tinyurl.com/dbpedia-fusion-sneak

Download URL: https://data.dbpedia.io/databus.dbpedia.org/dbpedia-enterprise/dev/fusion-sneak-preview

We are still working on a small number of issues:

  • download is via browser only for now; we will publish bash / python code soon
  • registration, login and access are still being tested, e.g. some users did not receive a registration link by email
  • data is based on Wikimedia dumps that are some months old
  • databus metadata needs to be created
  • some more data checks are needed
  • only EN DBpedia and Wikidata so far; the next target is DE DBpedia and DNB data, then the rest of LOD step by step

Feedback is welcome via https://forum.dbpedia.org, or contact us directly for professional inquiries at dbpedia [at] infai.org

Snapshot Release

We announced the DBpedia Snapshot 2022-12 Release on our blog. Please read more information about the previous DBpedia Snapshot 2022-09 Release here.

Datasets

Do you want to query DBpedia’s Largest Diamond Dataset or the Latest Core Release? No problem! Below is an updated list of DBpedia’s brand new datasets.

  • The Latest Core Release (branded tiny diamond) is our smallest dataset collection based on the English Wikipedia. This is the DBpedia that you have known for 14 years.
  • The Marvin Bot releases 21 billion triples per month (that’s 5500 triples per second) from 140 Wikipedia languages + Commons + Wikidata + the full article text. Please see the progress dashboard here.
  • DBpedia Archivo is a BETA prototype. We detect and crawl all available ontologies every 8 hours and store them persistently on the Databus. The ultimate fallback solution for the Web of Data. It also has a 4-star rating for each ontology and SHACL tests.
  • DBpedia Largest Diamond, also in BETA, is our skyrocketing dataset describing 220 million entities using 1.45 billion triples from DBpedia, Geonames, DNB, Musicbrainz, etc., and it is continuously growing.
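
For readers who want to try a query right away, here is a minimal Python sketch. It assumes the public SPARQL endpoint at https://dbpedia.org/sparql, which this page does not name, and it makes no claim about which of the datasets above that endpoint serves. The example resource (Leipzig) and the helper names build_request_url and run_query are illustrative only.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia SPARQL endpoint (not named on this page).
ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: fetch the English label of one example resource.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Leipzig> rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 1
"""


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a SPARQL 1.1 Protocol GET request URL."""
    return f"{endpoint}?{urllib.parse.urlencode({'query': query})}"


def run_query(query: str, endpoint: str = ENDPOINT, timeout: float = 30.0) -> list:
    """Send the query and return the list of result bindings (needs network access)."""
    request = urllib.request.Request(
        build_request_url(query, endpoint),
        # Ask for the standard SPARQL JSON results format.
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling run_query(QUERY) returns a list of binding dictionaries in the SPARQL JSON results format; each value sits under the variable name, e.g. binding["label"]["value"].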