Improve DBpedia

There are many ways to contribute and help improve DBpedia. Below you can find general ideas for contributions. In addition, feel free to post ideas for improving DBpedia in the DBpedia Forum.

Fix Monthly Releases

You can improve the monthly releases in several ways:

  • Provide patches to the extraction framework. The changes take effect on the next run of the MARVIN Release Bot. Feel free to make a pull request!
  • Improve the mappings at http://mappings.dbpedia.org. Any change made to the mappings will be reflected in the data behind the Mappings and Ontology data groups.
  • Improve the Wikidata mappings. These improvements will be reflected in the Wikidata data group (every artifact with mapping).
  • Improve the configuration for your language.

Write Tests for Minidumps

Minidumps are small Wikipedia XML dumps used to test the extraction framework. Any error found in the big dumps can be tested against the minidumps on every commit. The defined tests are then executed to check the extraction against the minidumps.

Here are several options on how you can improve the testing:

Learn more about how Testing on Minidumps works or how to Integrate custom SHACL tests.

Improve the DBpedia Ontology with DBpedia Archivo

Archivo automatically crawls and tests ontologies, so check the info page of the DBpedia ontology for the results of each version. A red ✘ marks a failed test, and a click on it reveals the report with the problems. Since most of the tests are SHACL tests, here is a quick tutorial on how to evaluate SHACL reports:

  • Every instance of sh:ValidationResult is a failed test.
  • sh:focusNode points to the problematic Node.
  • sh:resultPath is the property where the problem occurred.
  • sh:resultSeverity is the severity of the problem:
    • sh:Info is purely informational and does not necessarily need a fix
    • sh:Warning is a non-critical warning and should probably be fixed
    • sh:Violation is a critical problem and should be fixed as soon as possible
  • sh:sourceConstraintComponent points out what the problem is. For example, sh:NodeKindConstraintComponent means that the value of the resultPath property on the focusNode is not the right kind of value (e.g. xsd:string instead of IRI).
  • sh:resultMessage gives a short human-readable explanation of what the problem is.
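
The reading steps above can be sketched in a few lines of Python. This is only an illustration, not Archivo's own tooling: it assumes the report is available as simple N-Triples, it uses a regular expression instead of a full RDF parser, and the report excerpt with its example.org IRIs is made up. The function name failed_tests is hypothetical as well.

```python
import re
from collections import defaultdict

SH = "http://www.w3.org/ns/shacl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical report excerpt in N-Triples; all example.org IRIs are made up.
REPORT = """\
_:report <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/shacl#ValidationReport> .
_:report <http://www.w3.org/ns/shacl#conforms> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
_:r1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/shacl#ValidationResult> .
_:r1 <http://www.w3.org/ns/shacl#focusNode> <http://example.org/ontology/Person> .
_:r1 <http://www.w3.org/ns/shacl#resultPath> <http://www.w3.org/2000/01/rdf-schema#label> .
_:r1 <http://www.w3.org/ns/shacl#resultSeverity> <http://www.w3.org/ns/shacl#Warning> .
_:r1 <http://www.w3.org/ns/shacl#resultMessage> "Label has no language tag" .
_:r2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/shacl#ValidationResult> .
_:r2 <http://www.w3.org/ns/shacl#focusNode> <http://example.org/ontology/birthPlace> .
_:r2 <http://www.w3.org/ns/shacl#resultPath> <http://www.w3.org/2000/01/rdf-schema#range> .
_:r2 <http://www.w3.org/ns/shacl#resultSeverity> <http://www.w3.org/ns/shacl#Violation> .
_:r2 <http://www.w3.org/ns/shacl#sourceConstraintComponent> <http://www.w3.org/ns/shacl#NodeKindConstraintComponent> .
_:r2 <http://www.w3.org/ns/shacl#resultMessage> "Range must be an IRI" .
"""

# Matches 'subject <predicate> <iri> .' or 'subject <predicate> "literal" .'
# (a simplification: triples with blank-node objects are skipped).
TRIPLE = re.compile(
    r'^(\S+)\s+<([^>]+)>\s+(?:<([^>]+)>|"((?:[^"\\]|\\.)*)"\S*)\s*\.$'
)

# Most urgent severity first, following the list above.
SEVERITY_RANK = {SH + "Violation": 0, SH + "Warning": 1, SH + "Info": 2}

KEYS = ("resultSeverity", "focusNode", "resultPath",
        "sourceConstraintComponent", "resultMessage")


def failed_tests(ntriples):
    """Return one dict per sh:ValidationResult, most severe first."""
    nodes = defaultdict(dict)
    for line in ntriples.splitlines():
        match = TRIPLE.match(line.strip())
        if match:
            subject, predicate, iri, literal = match.groups()
            # Keeps one value per predicate, which is enough for this sketch.
            nodes[subject][predicate] = iri if iri is not None else literal
    results = [props for props in nodes.values()
               if props.get(RDF_TYPE) == SH + "ValidationResult"]
    results.sort(
        key=lambda props: SEVERITY_RANK.get(props.get(SH + "resultSeverity"), 3)
    )
    return [{key: props.get(SH + key) for key in KEYS} for props in results]


for result in failed_tests(REPORT):
    print(result["resultSeverity"].replace(SH, "sh:"),
          result["focusNode"], result["resultMessage"])
```

For real reports, which are often serialized as Turtle, a proper RDF library is the safer choice; the regular expression here only keeps the example free of dependencies.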

Contribute Additional Datasets

We are open to data contributions from the community. Feel free to contribute and publish your data on the DBpedia Databus using the Databus Maven Plugin.

Many datasets have already been contributed by the community. Here are a few examples:

Upload External Datasets or Linksets

  • NOTE: We are still working on a good process here, but it is easy to grep 'sameAs' from selected, vetted artifacts.
  • See geonames.
  • These can be loaded into the ID index and Fusion later.
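
As a rough sketch of the "grep 'sameAs'" step, the following Python filters owl:sameAs links out of N-Triples lines. It assumes the artifact is available as uncompressed N-Triples with one triple per line. The sample lines and their example.org and example.com IRIs are made up, and extract_same_as is a hypothetical helper, not part of any DBpedia tool.

```python
import re

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Matches '<subject> <owl:sameAs> <object> .' and captures both IRIs.
LINK = re.compile(
    r"^<([^>]+)>\s+<" + re.escape(OWL_SAME_AS) + r">\s+<([^>]+)>\s*\.$"
)


def extract_same_as(lines):
    """Yield (source, target) pairs for every owl:sameAs triple."""
    for line in lines:
        match = LINK.match(line.strip())
        if match:
            yield match.group(1), match.group(2)


# Hypothetical sample data; any iterable of lines works, e.g. an open file.
SAMPLE = [
    '<http://example.org/resource/Berlin> <http://www.w3.org/2002/07/owl#sameAs> <http://example.com/places/123> .',
    '<http://example.org/resource/Berlin> <http://www.w3.org/2000/01/rdf-schema#label> "Berlin"@en .',
    '<http://example.org/resource/Paris> <http://www.w3.org/2002/07/owl#sameAs> <http://example.com/places/456> .',
]

links = list(extract_same_as(SAMPLE))
for source, target in links:
    print(source, "->", target)
```

Matching on the predicate position rather than on the bare string 'sameAs' avoids picking up triples that merely mention the word in a literal.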

DBpedia Wiki

The DBpedia Wiki describes more ways to improve DBpedia. We are looking forward to your contribution!