ULMFiT at GermEval-2018: A Deep Neural Language Model for the Classification of Hate Speech in German Tweets

    Kristian Rother, Achim Rettberg

KONVENS 2018 - GermEval Proceedings, pp. 113-119, 2018/10/02

14th Conference on Natural Language Processing - KONVENS 2018


Abstract

This paper describes the entry hshl_coarse_1.txt for Task I (Binary Classification) of GermEval 2018, the Shared Task on the Identification of Offensive Language. For this task, German tweets were classified as either offensive or non-offensive. The entry employs a task-specific classifier built on top of a medium-specific language model, which is in turn built on top of a universal language model. The approach uses a deep recurrent neural network, specifically the AWD-LSTM architecture. The universal language model was trained on 100 million unlabeled articles from the German Wikipedia, and the medium-specific language model was trained on 303,256 unlabeled tweets. The classifier was trained on the labeled tweets provided by the organizers of the shared task.
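
The three-stage transfer chain described in the abstract can be sketched as a small data structure. This is only an illustration of the order in which the models build on one another, not the authors' implementation: the stage names (`wiki_lm`, `tweet_lm`, `classifier`), the `Stage` class, and the `training_order` helper are hypothetical, and no actual AWD-LSTM training is performed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str                  # hypothetical label for the stage
    corpus: str                # data the stage is trained on (per the abstract)
    labeled: bool              # True only for the supervised classifier stage
    init_from: Optional[str]   # stage whose weights seed this one, if any


# Deliberately listed out of order; training_order() recovers the chain.
STAGES = [
    Stage("classifier", "labeled GermEval 2018 tweets", True, "tweet_lm"),
    Stage("tweet_lm", "303,256 unlabeled tweets", False, "wiki_lm"),
    Stage("wiki_lm", "unlabeled German Wikipedia text", False, None),
]


def training_order(stages):
    """Order stages so each follows the stage it is initialized from.

    Assumes a single linear chain with exactly one stage trained from scratch.
    """
    by_parent = {s.init_from: s for s in stages}
    ordered = []
    current = by_parent.get(None)  # the stage with no predecessor
    while current is not None:
        ordered.append(current)
        current = by_parent.get(current.name)
    return ordered


if __name__ == "__main__":
    for step, stage in enumerate(training_order(STAGES), start=1):
        source = stage.init_from or "random initialization"
        print(f"{step}. {stage.name}: train on {stage.corpus} (weights from {source})")
```

The point of the sketch is the dependency order: only the final stage needs labeled data, while the two language-model stages rely on unlabeled text.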