After almost 10 years of being part of the Google Summer of Code (GSoC) program, it’s time for a recap! We started with only a couple of projects and mentors, and over the years we grew to a total of 65 successful projects, 65 students and 59 mentors. In 2021 alone, 10 projects by students from all over the world have already been accepted, and they are supervised by more than 20 mentors.
DBpedia has to thank Google Summer of Code a lot for its continued growth. We are happy to be part of the open source community and to help students hone their skills and make an impact with their talent. None of this would have been possible without the generosity and hard work of our mentors. We would like to take this opportunity to thank all mentors and students and recognize them in this post. Here’s a list of DBpedia projects and people at the GSoC in recent years. Have fun reading!
2021
Even though this season of Google Summer of Code has just begun, we want to mention the current projects, students and mentors here.
Projects and Students | Mentors |
Towards a neural extraction framework by Ziwei XU | Tommaso Soru (now at Scalpel / Liber AI), Thiago Castro Ferreira (now at Universidade Federal de Minas Gerais), Zheyuan BAI (now at Huawei) |
Neural QA Model for DBpedia by Siddhant Jain | Tommaso Soru (now at Scalpel / Liber AI), Anand Panchbhai (now at Logy.AI) |
Lifecycle Management of DBpedia Neural QA Models by Sahan Dilshan | Edgard Marx (now at eccenca & HTWK), Lahiru Hinguruduwa (now at CERN), Nausheen Fatma (now at Tealbook) |
DBpedia Spotlight Dashboard: an integrated statistical information tool from the Wikipedia dumps and the DBpedia Extraction Framework artifacts by José Manuel Díaz Urraco | Said Polanco-Martagón (now at Universidad Politécnica de Victoria), Maribel Angelica Marin Castro (now at Universidad Politécnica de Victoria), Beyza Yaman (now at ADAPT Centre), Julio Hernandez (now at ADAPT Centre), Jan Forberg (now at InfAI) |
Update DBpedia Sparql for newly updated wiki resources and specifically related to pandemic, healthcare, and health AI fields by Guang Zhang | Marvin Hofer (now at ScaDS.AI), Sebastian Hellmann (now at InfAI/DBpedia), Alexander Winter (now at InfAI) |
Social Knowledge Graph: Employing SNA measures to Knowledge Graph by Zhipeng Zhao | Luca Virgili (now at Polytechnic University of Marche) |
Modular DBpedia Chatbot by Jayesh Desai | Andreas Both (now at DATEV eG), Aleksandr Perevalov (Anhalt University of Applied Science), Ram G Athreya (now at Samsung Research America), Ricardo Usbeck (now at Hamburg University) |
DBpedia Live Neural Question Answering Chatbot by Ashutosh Kumar | Edgard Marx (now at eccenca GmbH), Diego Moussallem (now at Globo), Thiago Castro Ferreira (now at Universidade Federal de Minas Gerais), Nausheen Fatma (now at Tealbook) |
User Centric Knowledge Engineering and Data Visualization by Karan Kharecha | Jan Forberg (now at InfAI), Luca Virgili (now at Polytechnic University of Marche), Krishanu Konar (now at Media.net) |
Web app to generate RDF from DBpedia abstracts by Fernando Casabán Blasco | Mariano Rico (now at UPM and Dylan Q) |
2020
Oh, what a year! This year set a new record: DBpedia received 45 proposals for the Google Summer of Code. For the 9th year in a row, we were part of this incredible journey of young, ambitious developers who joined us as an open source organization to work on a GSoC coding project all summer. Read an interesting post about DBpedia’s participation on the DBpedia blog.
Projects and Students | Mentors |
A multilingual Neural RDF Verbalizer by Marco Sobrevilla (now at University of Sao Paulo) | Stuart Chan, Diego Moussallem (now at Globo), Thiago Castro Ferreira (now at Universidade Federal de Minas Gerais) |
A Neural QA model for DBpedia: Compositionality by Zheyuan BAI (now at Huawei) | Tommaso Soru (now at Scalpel / Liber AI), Anand Panchbhai (now at Logy.AI) |
Combine DBpedia/Databus with IPFS by Kirill Yankov (now at InfAI) | Amandeep Srivastava (now at Goldman Sachs), Sebastian Hellmann (InfAI/DBpedia) |
Dashboard for Language/National Knowledge Graphs by Karan Kharecha | Luca Virgili (now at Polytechnic University of Marche), Jan Forberg (now at InfAI), Sebastian Hellmann (now at InfAI/DBpedia) |
DBpedia Neural Multilingual QA by Lahiru Hinguruduwa (now at CERN/DBpedia) | Edgard Marx (now at eccenca), Renato Fabbri, Akshay Jagatap (now at Amazon) |
Extending Extraction Framework with Citations, Commons and Lexeme Extractors by Mykola Medynskyi (now at InfAI) | Beyza Yaman (now at ADAPT Centre), Julio Hernandez (now at ADAPT Centre), Sebastian Hellmann (now at InfAI/DBpedia) |
RDF-to-Text using Generative Adversarial Networks by Niloy Purkait (now at MindLabs) | Diego Moussallem (now at Globo), Thiago Castro Ferreira (now at Universidade Federal de Minas Gerais), Mariana Dias da Silva |
2019
And for the 8th time, DBpedians were coding as part of the GSoC program. For more information about 2019, have a look at the 6 projects below or check out more details on the DBpedia blog.
Projects and Students | Mentors |
A neural QA Model for DBpedia by Anand Panchbhai (now at Logy.AI) | Tommaso Soru (now at Scalpel / Liber AI), Aashay Singhal (now at Indeed.com), Aman Mehta (now at Scaler Academy) |
Multilingual Neural RDF Verbalizer for DBpedia by Dwaraknath Gnaneshwar (now at aixplain, inc.) | Diego Moussallem (now at Globo), Thiago Castro Ferreira (now at Universidade Federal de Minas Gerais) |
Predicate Detection using Word Embeddings for Question Answering over Linked Data by Yajing Bian | Ram G Athreya (now at Samsung Research America), Rricha Jalota (now at Paderborn University), Ricardo Usbeck (now at Hamburg University) |
Tool to generate RDF triples from DBpedia abstract by Jayakrishna Sahit (now at Stealth Mode) | Mariano Rico (now at UPM and Dylan Q), Amandeep Srivastava (now at Goldman Sachs) |
Transformer of Attention Mechanism for Long-text QA by Stuart Chan | Bharat (now at Brandvise: Business and Finance), Nausheen Fatma (now at Tealbook), Rricha Jalota (now at Paderborn University) |
Workflow for linking External datasets by Jaydeep Chakraborty (now at Arizona State University) | Krishanu Konar (now at Media.net), Beyza Yaman (now at ADAPT Centre), Luca Virgili (now at Polytechnic University of Marche) |
2018
In 2018 we started out with six students who committed to GSoC projects. However, over the course of the summer, some dropped out or did not pass the midterm evaluation. In the end, three finalists made it through the program. Read Sandra’s recap on the DBpedia blog.
Projects and Students | Mentors |
A neural QA Model for DBpedia by Aman Mehta (now at Tavisca) | Tommaso Soru (now at Scalpel), Ricardo Usbeck (now at Hamburg University) |
Complex Embeddings for OOV Entities by Bharat Suri (now at Salesforce) | Tommaso Soru (now at Scalpel / Liber AI), Thiago Galery (now at Optimizely), Peng Xu (now at BorealisAI) |
Web application to detect incorrect mappings across DBpedias in different languages by Víctor Fernández | Mariano Rico (now at UPM and Dylan Q), Dimitris Kontokostas (now at Diffbot), Nandana (now at Logy.AI) |
2017
“GSoC is the perfect opportunity to learn from experts, get to know new communities, design principles and workflows.” (Ram G Athreya) On the DBpedia blog you can read more details about this year’s 7 GSoC project ideas.
Projects and Students | Mentors |
DBpedia Mapping Front-End Administration by Ismael Rodriguez (now at Hedyla) | Anastasia Dimou (now at imec), Dimitris Kontokostas (now at Diffbot), Wouter Maroy (now at Deloitte Belgium) |
First Chatbot for DBpedia by Ram Ganesan Athreya (now at Samsung Research America) | Ricardo Usbeck (now at Hamburg University) |
Knowledge base embeddings for DBpedia by Nausheen Fatma (now at Tealbook) | Sandro Athaide Coelho (now at Translucent Computing Inc), Tommaso Soru (now at Scalpel / Liber AI) |
Knowledge Base Embeddings for DBpedia by Akshay Jagatap (now at Amazon) | Pablo Mendes (now at Apple Inc), Sandro Athaide Coelho (now at Translucent Computing Inc), Tommaso Soru (now at Scalpel / Liber AI) |
The table extractor by Luca Virgili (now at Polytechnic University of Marche) | Emanuele Storti (now at Polytechnic University of Marche), Domenico Potena (now at Polytechnic University of Marche) |
Unsupervised Learning of DBpedia Taxonomy by Shashank Motepalli (University of Toronto) | Marco Fossati (now at Wikimedia Foundation), Dimitris Kontokostas (now at Diffbot) |
Wikipedia List-Extractor by Krishanu Konar (now at Media.net) | Marco Fossati (now at Wikimedia Foundation) |
2016
DBpedia participated for a fifth time in the Google Summer of Code program. This was quite a competitive year (like every year), with more than forty students applying for a DBpedia project. In the end, 8 great students from all around the world were selected and worked on their projects during the summer of 2016.
Projects and Students | Mentors |
A Hybrid Classifier/Rule-based Event Extractor for DBpedia Proposal by Vincent Bohlen (Fraunhofer FOKUS) | Marco Fossati (now at Wikimedia Foundation) |
Automatic mappings extraction by Aditya Nambiar (now at Facebook) | Markus Freudenberg (now at Datadrivers GmbH) |
Combining DBpedia and Topic Modelling by wojtuch | Alexandru Todor (now at FU Berlin) |
DBpedia Lookup Improvements by Kunal.Jha | Axel Ngonga (now at University of Paderborn) |
Inferring infobox template class mappings from Wikipedia + Wikidata by Peng_Xu (now at BorealisAI) | Nilesh Chakraborty (now at Fraunhofer IAIS) |
Integrating RML in the DBpedia extraction framework by Wouter Maroy (now at Deloitte Belgium) | Dimitris Kontokostas (now at Diffbot), Anastasia Dimou (now at imec) |
The List Extractor by FedBai | Marco Fossati (now at Wikimedia Foundation) |
The Table Extractor by s.papalini | Marco Fossati (now at Wikimedia Foundation) |
2015
This year, for the first time, a former student is now one of our mentors! Dimitris Kontokostas, Marco Fossati, Thiago Galery, Joachim Daiber and Ruben Verborgh, members of the DBpedia community, mentored the following great students from around the world. Check more details on the DBpedia blog!
Projects and Students | Mentors |
Fact Extraction from Wikipedia Text by Emilio Dorigatti (now at LMU Munich) | Marco Fossati (now at Wikimedia Foundation) |
Better context vectors for disambiguation by Philipp Dowling (now at Bain & Company) | Thiago Galery (now at Optimizely), Joachim Daiber (now at Apple Inc), David Przybilla (now at Optimizely) |
Wikipedia Stats Extractor by Naveen Madhire (now at Amazon Web Services (AWS)) | Thiago Galery (now at Optimizely), Joachim Daiber (now at Apple Inc), David Przybilla (now at Optimizely) |
DBpedia Live scaling and new interface by Andre Pereira | Dimitris Kontokostas (now at Diffbot), Magnus Knuth (now at eccenca GmbH) |
Adding live-ness to the Triple Pattern Fragments server by Pablo Estrada (now at Google) | Ruben Verborgh (now at Ghent University – imec), Dimitris Kontokostas (now at Diffbot) |
DBpedia Schema Enrichment on Web Protege by laparmakerli | Alexandru Todor (now at FU Berlin) |
Keyword Search on DBpedia by Kartik Singhal | Axel Ngonga (now at University of Paderborn) |
Parallel processing in DBpedia extraction Framework by Ritesh Kumar Singh | Nilesh Chakraborty (now at Fraunhofer IAIS) |
Remodeling pignlproc for generating Named entity recognition models by Naveen Madhire (now at Amazon Web Services (AWS)) | Joachim Daiber (now at Apple) |
2014
For a third time, DBpedia was part of the Google Summer of Code because, as you know, all good things come in threes.
Projects and Students | Mentors |
Abbreviation Base – A multilingual knowledge base for abbreviations by Divyum Rastogi (now at Amazon) | Martin Brümmer (now at GitLab Inc.), Sebastian Hellmann (now at InfAI/DBpedia) |
Distributed extraction of Wikipedia data dumps for DBpedia by Nilesh Chakraborty (now at Fraunhofer IAIS) | Sang Venkatraman (now at Accenture), Nicolas Torzec (now at Verizon Media), Dimitris Kontokostas (now at Diffbot), Andrea Di Menna (now at Immobiliare.it) |
Fine grained massive extraction of wikipedia content by Roberto Bampi (now at RISE SICS) | Michel Dumontier (now at University Maastricht), Marco Fossati (now at Wikimedia Foundation) |
Media Extractor for DBpedia by Leandro Doctors (now at University of Mons) | Alexandru Todor (now at FU Berlin), Magnus Knuth (now at eccenca GmbH), Marco Fossati (now at Wikimedia Foundation) |
Natural language question answering engine with DBpedia by Wencan | Axel Ngonga (now at University of Paderborn), Marco Fossati (now at Wikimedia Foundation), Michel Dumontier (now at University Maastricht) |
New DBpedia Interfaces: Resource Widgets by Jorge Cruz (now at Hipmunk) | Magnus Knuth (now at eccenca GmbH), Patrick Westphal (now at Leipzig University), Dimitris Kontokostas (now at Diffbot) |
Wikimedia Commons extraction by Gaurav Vaidya (now at RENCI, an institute of UNC at Chapel Hill) | Dimitris Kontokostas (now at Diffbot), Andrea Di Menna (now at Immobiliare.it), Jimmy O’Regan (now at Trinity College Dublin) |
2013
Whereas only DBpedia Spotlight took part in the first year of GSoC, in 2013 the whole DBpedia family participated, including DBpedia, DBpedia Spotlight and DBpedia Wiktionary. Check more details on the DBpedia blog.
Projects and Students | Mentors |
DBpedia: Design a better / interactive display page (+ Search) by El Denis | Claus Stadler (now at Smart Data Analytics (SDA) Research), Dimitris Kontokostas (now at Diffbot) |
HadyElsahar – WikiData+DBpedia Idea proposal by Hady Elsahar (now at NAVER LABS Europe) | Dimitris Kontokostas (now at Diffbot) |
Input Formats Generalization and Graph-Based Disambiguation Integration and Improvements by Zhiwei Cai | Max Jakob (now at Ecosia), Chris Hokamp (now at AYLIEN) |
Interface / Power tool for DBpedia testing metadata by Lazaros Ioannidis (now at POWERPHARM) | Dimitris Kontokostas (now at Diffbot) |
Type inference to extend coverage by Kasun Perera | Marco Fossati (now at Wikimedia Foundation) |
2012
2012 was our first year ever as part of the Google Summer of Code. All projects focused on DBpedia Spotlight and were co-mentored by Pablo N. Mendes (now at Apple Inc) and Max Jakob (now at Ecosia). Get more insights on the DBpedia blog.
- A database-backed core system with an improved model for estimation of annotation probability by Joachim Daiber (now at Apple Inc)
- DBpedia Spotlight for collective linking of entities in HTML pages by Hector Liu (now at Petuum Inc)
- Hadoop Indexing and Concept-Space Disambiguation Models for DBpedia Spotlight by Chris Hokamp (now at AYLIEN)
- Topical Classification by Dirk Weissenborn (now at Google Research)