Open Master Theses

Constructing Situation-Specific Knowledge Bases

Motivation

Today’s knowledge bases such as DBpedia, Wikidata, or Yago contain millions of entities that are organized in large and complex type hierarchies. However, many applications require only a small part of the knowledge base and have specific requirements regarding its type hierarchy, granularity, and schema. For example, a stock broker might be interested in companies, their relations, and a meaningful categorization of companies; a car manufacturer in all car-related entities; and a developer of a flight booking service in all flight-related entities. Additional requirements originate from the concrete use case: a stock broker using the knowledge base for text mining might have different requirements than a car manufacturer using it to enhance search results on its website or a cloud provider using it for semantic service matching.

Description of the Task

  • Identify and describe important scenarios for which a situation-specific knowledge base is required (only one scenario for a bachelor’s thesis)
  • Identify the requirements that emerge from those scenarios
  • Develop an approach to adapt an existing knowledge base to those requirements, e.g., compute a subset of the knowledge base and refactor the schema and type hierarchy (a more sophisticated approach is expected for a master’s thesis)
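
As a rough illustration of the last step, the following sketch computes a subset of a knowledge base by keeping only entities whose type lies below a chosen root type. The toy data, the type names, and the helpers `descendants` and `extract_subset` are hypothetical and serve only as an example; a real knowledge base would be read from its dumps or query interface, and refactoring the schema is not covered here.

```python
from collections import defaultdict

# Toy data (hypothetical): subclass edges (child, parent) and typed entities.
SUBCLASS_OF = [
    ("Company", "Organization"),
    ("Airline", "Company"),
    ("CarManufacturer", "Company"),
    ("University", "Organization"),
]
INSTANCE_OF = [
    ("ExampleAir", "Airline"),
    ("ExampleMotors", "CarManufacturer"),
    ("ExampleUniversity", "University"),
]


def descendants(root, subclass_edges):
    """Return the root type plus all types reachable via subclass edges."""
    children = defaultdict(set)
    for child, parent in subclass_edges:
        children[parent].add(child)
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def extract_subset(root, subclass_edges, instance_edges):
    """Keep only entities whose type lies below the root type."""
    keep = descendants(root, subclass_edges)
    return sorted(entity for entity, type_ in instance_edges if type_ in keep)


# A stock broker scenario might start from the (assumed) root type "Company".
companies = extract_subset("Company", SUBCLASS_OF, INSTANCE_OF)
```

Here `companies` contains the two company entities but not the university, because `University` is not a subtype of `Company` in the toy hierarchy.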

Contact

Stefan Heindorf

Evaluation of Open-Domain Knowledge Bases for Semantic Service Matching

Motivation

Automatic service discovery aims to find reusable web services that conform to a service request describing the desired properties of the web service. This technique requires semantic specifications of services and requests, which form the basis of service matching. Part of the semantic specification is an ontology that describes the entities and relations of a domain such as flight booking, maps, or music services.

However, matching requests and services across domains using heterogeneous ontologies is a major challenge and often results in poor matching quality. To overcome this problem, we envision the usage of a large-scale, open-domain ontology as provided by the knowledge bases DBpedia, Wikidata, Yago, schema.org, or ConceptNet. In this thesis, the suitability of these open-domain knowledge bases for semantic service matching is to be evaluated.

Description of the Task

  • Select some typical web services from different domains and describe each of them in the ontologies of different knowledge bases (more web services and knowledge bases for a master’s thesis)
  • Use existing service matchers to perform the service matching based on those ontologies
  • Compare the quality of the service matching depending on the ontology and its properties
  • Derive requirements on knowledge bases for semantic service matching
  • Investigate which knowledge base best fulfills those requirements

Contact

Stefan Heindorf, Simon Schwichtenberg

Personal Edit Suggestions for Crowdsourced Knowledge Bases

Motivation

Today’s knowledge bases such as Wikidata are incomplete. For example, many entities representing people have no date of birth specified, many entities related to geographic locations have no super type specified, and many football players lack a FIFA Player ID. Furthermore, in a pilot study, we observed that many volunteers editing the knowledge base Wikidata like to specialize in certain topics and properties. The goal of this thesis is to develop an approach that makes personalized suggestions to volunteers about what to edit next. These suggestions should be based on the user’s personal edit history.

Description of the Task

  • Identify missing data in Wikidata via an existing tool
  • Develop an approach that makes personalized edit suggestions to volunteers based on their edit history (more sophisticated approach for master’s thesis than for bachelor’s thesis)
  • Implement your approach in a prototype
  • Evaluate the quality of your edit suggestions (possibly in a crowdsourcing experiment)
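
A very simple baseline for such suggestions could rank missing statements by how often a volunteer has edited the same property before. The sketch below assumes this frequency heuristic; the edit history, the list of missing statements, the item identifiers, and the function `suggest_edits` are all hypothetical placeholders, not the output of an actual Wikidata tool.

```python
from collections import Counter

# Hypothetical edit history: (user, item, property) for past edits.
EDIT_HISTORY = [
    ("alice", "item-1", "date of birth"),
    ("alice", "item-2", "date of birth"),
    ("alice", "item-3", "FIFA player ID"),
    ("bob", "item-4", "super type"),
]
# Hypothetical output of a tool that detects missing data: (item, property).
MISSING = [
    ("item-10", "super type"),
    ("item-11", "date of birth"),
    ("item-12", "FIFA player ID"),
]


def suggest_edits(user, history, missing, top_k=2):
    """Rank missing statements by how often the user edited that property."""
    prefs = Counter(prop for u, _, prop in history if u == user)
    scored = [(prefs[prop], item, prop) for item, prop in missing if prefs[prop] > 0]
    # Highest count first; ties are broken by item identifier for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(item, prop) for _, item, prop in scored[:top_k]]


suggestions = suggest_edits("alice", EDIT_HISTORY, MISSING)
```

A thesis would likely go beyond this heuristic, for example by also considering the topics of the edited items, which is where recommender-system techniques come in.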

Prerequisites

  • Interest in exploratory data analysis and recommender systems

Contact

Stefan Heindorf

Ranking Constraint Violations in Knowledge Bases

Motivation

Knowledge bases such as Wikidata are used for a wide range of applications, e.g., quick answer boxes in search engines (e.g., Google, Bing), personal assistants (e.g., Siri, Google), or question answering systems (e.g., IBM Watson). However, today’s knowledge bases suffer from quality problems. For example, Wikidata currently reports over 10 million constraint violations. In traditional databases, all data that violates constraints is simply discarded. However, this approach is not applicable to real-world, large-scale knowledge bases: for almost every constraint there is an exception in the real world, and strictly enforcing constraints prevents an agile and flexible development of the knowledge base. Nevertheless, constraint violations often point to quality problems. To overcome this dilemma, we envision a semi-automatic approach: constraint violations are ranked by the severity of their consequences, thus enabling the volunteers of the knowledge base to manually review and fix the most important violations first.

Description of the Task

  • Investigate some examples of constraint violations in Wikidata, and manually order them by the severity of their consequences
  • Develop systematic criteria to rank constraint violations in knowledge bases (more/better criteria for a master’s thesis)
  • Develop a prototype for automatically ranking the constraint violations
  • Evaluate your prototype by comparing its result with your initial, manual ranking (or even perform a crowdsourcing experiment for master’s thesis)
  • For the most common and severe types of constraint violations, offer suggestions on how to fix them (semi-)automatically
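
To make the ranking and evaluation steps more concrete, the sketch below scores each violation and compares the resulting order with a manual ranking. The scoring rule (a per-constraint weight times an item usage count), the weights, the example violations, and the manual ranking are assumptions made purely for illustration; finding suitable criteria is precisely the subject of the thesis.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Violation:
    item: str          # placeholder item identifier
    constraint: str    # type of the violated constraint
    usage_count: int   # assumed proxy for how widely the item is used


# Assumed weights per constraint type (hypothetical values).
CONSTRAINT_WEIGHTS = {"single value": 1.0, "format": 0.5}
DEFAULT_WEIGHT = 0.1


def severity(v):
    """Score a violation as constraint weight times item usage."""
    return CONSTRAINT_WEIGHTS.get(v.constraint, DEFAULT_WEIGHT) * v.usage_count


def rank(violations):
    """Order violations from most to least severe."""
    return sorted(violations, key=severity, reverse=True)


def pairwise_agreement(predicted, manual):
    """Fraction of item pairs that both rankings order the same way."""
    pos_p = {item: i for i, item in enumerate(predicted)}
    pos_m = {item: i for i, item in enumerate(manual)}
    pairs = list(combinations(predicted, 2))
    same = sum((pos_p[a] < pos_p[b]) == (pos_m[a] < pos_m[b]) for a, b in pairs)
    return same / len(pairs) if pairs else 1.0


violations = [
    Violation("item-a", "single value", 10),
    Violation("item-b", "format", 30),
    Violation("item-c", "other", 50),
]
predicted = [v.item for v in rank(violations)]
manual = ["item-b", "item-c", "item-a"]  # hypothetical manual ranking
agreement = pairwise_agreement(predicted, manual)
```

Pairwise agreement is only one possible way to compare two rankings; a rank correlation coefficient such as Kendall's tau would be a common alternative.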

Contact

Stefan Heindorf

The History and Future of Data Quality in Knowledge Bases

Motivation

Knowledge bases such as DBpedia, Wikidata, or Yago are used for a wide range of applications, e.g., quick answer boxes in search engines (e.g., Google, Bing), personal assistants (e.g., Siri, Google), or question answering systems (e.g., IBM Watson). However, today’s knowledge bases suffer from quality problems. For example, the error rate is estimated to be between 5 % and 10 %, and the data is often incomplete. Over the last couple of years, all major knowledge bases have taken countermeasures to improve data quality. For example, DBpedia developed better extractors to extract information from Wikipedia, Yago combined the data from many Wikipedias, and Wikidata introduced semi-automatic editing tools. However, it has not been systematically studied how these measures affected data quality and what we can learn from them for the future.

Description of the Task

  • Identify major events in the history of DBpedia, Wikidata, and Yago which potentially affected data quality (only Wikidata for a bachelor’s thesis)
  • Identify important quality metrics that might have been affected
  • Develop a prototype to compute quality metrics in knowledge bases over time
  • Interpret your results and compile ‘lessons learned’ to improve the data quality of knowledge bases in the future
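
As a minimal sketch of computing a quality metric over time, the example below measures completeness, i.e., the share of entities that have a value for a given property, for each snapshot of a knowledge base. The snapshot format, the dates, the entity identifiers, and the functions `completeness` and `metric_over_time` are hypothetical; real dumps are far larger and would require a scalable processing setup.

```python
# Hypothetical snapshots: date -> {entity: {property: value}}.
SNAPSHOTS = {
    "2015-01": {
        "e1": {"date of birth": "1970"},
        "e2": {},
    },
    "2016-01": {
        "e1": {"date of birth": "1970"},
        "e2": {"date of birth": "1980"},
        "e3": {},
    },
}


def completeness(entities, prop):
    """Share of entities in one snapshot that have a value for prop."""
    if not entities:
        return 0.0
    filled = sum(1 for props in entities.values() if prop in props)
    return filled / len(entities)


def metric_over_time(snapshots, prop):
    """Compute completeness for every snapshot in chronological order."""
    return [(date, completeness(snapshots[date], prop)) for date in sorted(snapshots)]


timeline = metric_over_time(SNAPSHOTS, "date of birth")
```

Plotting such a timeline next to the dates of major events is one conceivable way to relate those events to changes in data quality.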

Prerequisites

  • Interest in big data and scalable systems

Contact

Stefan Heindorf

Development of Patterns for Technology-Oriented Process Innovations

Motivation

Digitalization and the associated technological innovations can be applied in many ways to business processes, e.g., in production, internal logistics, quality assurance, and maintenance.

The use of new technologies and methods such as artificial intelligence, robotics, big data analysis, or IoT offers the potential to innovate processes and optimize the value chain. Employees’ workload can be reduced, allowing them to take on higher-value tasks.

Analogous to the St. Gallen Business Model Innovation (BMI) approach, we assume that an innovation can always be traced back to one or more innovation patterns, which in this case are technology-driven.

This raises several questions:

  • Which technologies are relevant and particularly suitable for optimizing which business processes?
  • Which innovation patterns can be identified on the basis of these technologies?
  • How should these patterns be described? Which examples can be drawn upon?
  • How does the BMI method have to be adapted in order to identify process innovations in a pattern-based way?

The thesis is carried out in cooperation with a company from the region.

Description of the Task

  • Analyze and describe selected business processes
  • Conduct a literature review and interviews to identify typical innovation patterns for these business processes
  • Describe the innovation patterns and the state of the art of the associated technologies
  • Adapt the pattern-based BMI method to pattern-based process innovation

Contact

Florian Rittmeier
