Our group includes postdocs, PhD students, and student assistants, and is headed by Prof. Felix Naumann. If you are interested in joining our team, please contact Felix Naumann.

For bachelor students, we offer German-language lectures on database systems in addition to paper- or project-oriented seminars. Within a one-year bachelor project, students finalize their studies in cooperation with external partners. For master students, we offer courses on information integration, data profiling, and information retrieval, complemented by specialized seminars and master projects, and we advise master theses.

Most of our research is conducted in the context of larger research projects, in collaboration across students, across groups, and across universities. We strive to make most of our datasets and source code available.

Please do not hesitate to reach out to us directly if you cannot find a paper, slides, or other research artifacts.

04.09.2018

Workshop Paper Accepted at KONVENS 2018

Julian Risch, Eva Krebs, Alexander Löser, Alexander Riese, Ralf Krestel

Our paper "Fine-Grained Classification of Offensive Language" has been accepted for presentation at the workshop of the GermEval Task 2018: Shared Task on the Identification of Offensive Language, which is co-located with the Conference on Natural Language Processing / "Die Konferenz zur Verarbeitung natürlicher Sprache" (KONVENS). This system description paper is part of our comment analysis project and originated during our seminar on text mining in practice. The paper can be downloaded here.

Fine-Grained Classification of Offensive Language

Social media platforms receive massive amounts of user-generated content that may include offensive text messages. In the context of the GermEval task 2018, we propose an approach for fine-grained classification of offensive language. Our approach comprises a Naive Bayes classifier, a neural network, and a rule-based approach that categorize tweets. In addition, we combine the approaches in an ensemble to overcome weaknesses of the single models. We cross-validate our approaches with regard to macro-average F1-score on the provided training dataset.
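
To make the evaluation measure and the ensemble idea concrete, here is a minimal sketch in plain Python. It is not the system from the paper: the abstract does not say how the ensemble combines its members, so majority voting is an assumption here, and the label names, the example predictions, and the function names `majority_vote` and `macro_f1` are purely illustrative. Cross-validation is left out; the sketch only scores one fixed set of predictions.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote.

    Assumption: ties go to the model that is listed first.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        best = max(counts.values())
        # The first vote (in model order) that reaches the top count wins.
        combined.append(next(v for v in votes if counts[v] == best))
    return combined


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores.

    Every class that occurs in gold or predicted counts equally,
    regardless of how many examples it has.
    """
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: four tweets, three illustrative classes.
gold = ["INSULT", "OTHER", "ABUSE", "OTHER"]
naive_bayes = ["INSULT", "OTHER", "OTHER", "OTHER"]
neural_net = ["INSULT", "ABUSE", "ABUSE", "OTHER"]
rules = ["OTHER", "OTHER", "ABUSE", "OTHER"]

ensemble = majority_vote([naive_bayes, neural_net, rules])
print("Naive Bayes macro-F1:", macro_f1(gold, naive_bayes))
print("Ensemble macro-F1:", macro_f1(gold, ensemble))
```

In this toy data each single model makes one mistake on a different tweet, so the vote corrects all of them. Macro-averaging gives rare classes the same weight as frequent ones, which matters when most tweets are not offensive.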