Linked People

Abstract

Linked People is a data visualization tool that draws on the wealth of family relationships in Wikidata and Wikipedia to create family trees of characters in movies, TV shows, and books. The required data is extracted from Wikidata and then enriched with Wikipedia summaries and images such as movie posters and book covers. The newest addition to the project is family trees for all known people and fictional characters on Wikipedia, which is even more exciting because these trees are built on the fly when users request them. Explore the world of family trees with Linked People!

Introduction

Linked People is a data visualization tool that draws on the wealth of family relationships in Wikidata and Wikipedia to create family trees. The idea for this project struck me when I was watching TV shows during quarantine and felt the need for family trees to understand the relationships between characters. My specific case was the German science fiction TV series “Dark” [https://en.wikipedia.org/wiki/Dark_(TV_series)], which was almost impossible to follow without a family tree at hand. Many movies and TV shows have complicated character relationships or include less emphasized family relationships that viewers usually overlook. While there was no reference source for such family trees, I found that many of these relationships are already documented in open data sources such as Wikidata. So, as an open data lover and researcher, I felt the need to do something with this beautiful data, which is contributed by thousands of users around the globe. As the quarantine dragged on, we started this project as a family effort and tried to turn the Corona disaster into an opportunity to work together. My wife, Dr. Ferial Shayeganfar, implemented the image processing and visualization parts; my son, Kian, took over the quality control tasks; and I extracted and processed the required data.

To this end, we built a pipeline that pulls data from both Wikidata and Wikipedia and constructs the family trees. The pipeline first extracts movies, TV shows, books, and their properties using Wikidata’s query service, and then completes the information with Wikipedia summaries and images such as movie posters, logos, and book covers. Finally, image processing techniques (e.g., face detection) are used to adjust the images so that they fit the family tree visualization.
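
As a rough illustration of the Wikidata extraction step, the following Python sketch builds a SPARQL query for the direct relatives of a single entity and sends it to the public Wikidata query service. It is not the project's actual pipeline code: the function names, the selection of family properties, and the User-Agent string are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Wikidata properties for direct family relationships.
# Which properties the real pipeline uses is an assumption of this sketch.
FAMILY_PROPERTIES = {
    "P22": "father",
    "P25": "mother",
    "P26": "spouse",
    "P40": "child",
    "P3373": "sibling",
}


def build_family_query(entity_id: str) -> str:
    """Return a SPARQL query listing the direct relatives of one entity."""
    if not (entity_id.startswith("Q") and entity_id[1:].isdigit()):
        raise ValueError(f"not a Wikidata item identifier: {entity_id!r}")
    values = " ".join(f"wdt:{pid}" for pid in FAMILY_PROPERTIES)
    return (
        "SELECT ?property ?relative ?relativeLabel WHERE {\n"
        f"  VALUES ?property {{ {values} }}\n"
        f"  wd:{entity_id} ?property ?relative .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def fetch_relatives(entity_id: str) -> list:
    """Run the query and return (relation, relative id, label) tuples.

    Performs a network request, so it is defined here but not called.
    """
    params = urllib.parse.urlencode(
        {"query": build_family_query(entity_id), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WIKIDATA_SPARQL_ENDPOINT}?{params}",
        headers={"User-Agent": "family-tree-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        bindings = json.load(response)["results"]["bindings"]
    rows = []
    for binding in bindings:
        # Property and entity URIs end with the identifier, e.g. ".../P22".
        property_id = binding["property"]["value"].rsplit("/", 1)[-1]
        relative_id = binding["relative"]["value"].rsplit("/", 1)[-1]
        label = binding.get("relativeLabel", {}).get("value", relative_id)
        rows.append((FAMILY_PROPERTIES[property_id], relative_id, label))
    return rows
```

Calling fetch_relatives with a Wikidata identifier such as Q937 would return one tuple per documented relative; applying the same query to each relative in turn is one way a tree could be grown outward from a starting entity.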

The newest addition to the Linked People project is family trees for all known people and fictional characters on Wikipedia. This is even more exciting because, unlike the family trees of movies, TV series, and books, these trees are built on the fly when users request them.

Figure 1. Family tree of Albert Einstein on the Linked People website

It is worth mentioning that the family tree URLs accept both Wikipedia/DBpedia page names and Wikidata entity identifiers. To look up the family tree of Albert Einstein, for example [cf. Figure 1], users can either find it via the search box (also powered by Wikipedia) or use either of the following URL patterns:

https://linkedpeople.net/person/[WIKIDATA IDENTIFIER] 

https://linkedpeople.net/person/[WIKIPEDIA/DBPEDIA PAGE NAME] 

So, in the case of Albert Einstein, the family tree URLs are as follows:

https://linkedpeople.net/person/Q937

https://linkedpeople.net/person/Albert_Einstein
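
The two URL patterns can be produced by a small helper like the Python sketch below. The function name is hypothetical, and the handling of spaces and special characters (spaces become underscores, as in Wikipedia page names, and other reserved characters are percent-encoded) is an assumption, not documented behavior of the website.

```python
from urllib.parse import quote

BASE_URL = "https://linkedpeople.net/person/"


def family_tree_url(identifier: str) -> str:
    """Build a family tree URL from a Wikidata identifier or a page name.

    Both kinds of identifier share the same URL pattern, so no
    distinction between them is needed here.
    """
    identifier = identifier.strip()
    if not identifier:
        raise ValueError("identifier must not be empty")
    # Wikipedia page names use underscores in place of spaces.
    page_name = identifier.replace(" ", "_")
    return BASE_URL + quote(page_name, safe="(),':")


print(family_tree_url("Q937"))
print(family_tree_url("Albert Einstein"))
```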

Because the Linked People family trees are built on structured data, they also follow best practices in web content creation: below the surface, they use structured metadata to annotate the entities and their relationships via machine-readable tags. A closer look at this metadata (e.g., using the OpenLink Structured Data Sniffer browser extension) reveals the structured data embedded within the family trees, which describes the family members and their relationships based on the schema.org taxonomy [cf. Figure 2].

Figure 2. Metadata of Linked People family trees based on schema.org taxonomy.
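
To give a feel for what such schema.org annotations can look like, the Python sketch below assembles a JSON-LD description of a person and a few relatives using the schema.org Person type and its parent, spouse, children, and sibling properties. The serialization actually used on the website (JSON-LD, Microdata, or RDFa) and the exact set of properties are not stated here, so the format and the helper function are assumptions for illustration only.

```python
import json


def person_jsonld(name, wikidata_id=None, parents=(), spouses=(),
                  children=(), siblings=()):
    """Describe a person and their relatives as a schema.org JSON-LD dict."""

    def people(names):
        return [{"@type": "Person", "name": n} for n in names]

    doc = {"@context": "https://schema.org", "@type": "Person", "name": name}
    if wikidata_id:
        # Link the person to the corresponding Wikidata entity.
        doc["sameAs"] = f"http://www.wikidata.org/entity/{wikidata_id}"
    relations = (
        ("parent", parents),
        ("spouse", spouses),
        ("children", children),
        ("sibling", siblings),
    )
    for schema_property, names in relations:
        if names:  # omit relationship types with no known members
            doc[schema_property] = people(names)
    return doc


einstein = person_jsonld(
    "Albert Einstein",
    wikidata_id="Q937",
    parents=["Hermann Einstein", "Pauline Koch"],
    children=["Hans Albert Einstein"],
)
print(json.dumps(einstein, indent=2))
```

Embedded in a page inside a script element of type application/ld+json, a document like this can be read by structured data tools in the same way as the metadata shown in Figure 2.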

Finally, to make family trees more accessible, we have implemented browser plugins for popular browsers that connect the Wikipedia, Wikidata, and DBpedia pages of known people to their corresponding family trees on the Linked People website, as depicted in Figure 3.

Figure 3. Linked People browser plugin connecting a DBpedia page to the corresponding family tree

So far, we have received very positive feedback from both domain experts and ordinary users, which motivates us to continue the project. I have worked and conducted research in the Semantic Web and Linked Data domains for more than a decade, but this project was special: on the one hand, the results are communicated directly to the general public, who engage with the entertaining aspect of the project; on the other hand, the project serves as an interface to the open data world and will hopefully raise awareness of the importance of open data resources. In turn, Wikidata and Wikipedia can benefit from the increased interest and the larger number of data contributors, who help identify missing and incorrect information and improve the quality of the datasets.