Lexicons

HurtLex

HurtLex is a multilingual computational lexicon of hate words. Starting from the Italian lexicon "Le Parole per Ferire" by Tullio De Mauro, we developed a computational lexicon and semi-automatically translated it into more than 50 languages.

The development of HurtLex is described in this paper:
Elisa Bassignana, Valerio Basile, Viviana Patti. Hurtlex: A Multilingual Lexicon of Words to Hurt. In Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018).

The resource is available in its GitHub repository.

Corpora

Hate Speech Corpus

This is a corpus of hate speech on Twitter towards migrants and ethnic and religious minorities (Roma and Muslims in particular).

The corpus development is described in the following papers:
Fabio Poletto, Marco Stranisci, Cristina Bosco, Manuela Sanguinetti, Viviana Patti. Hate Speech Annotation: Analysis of an Italian Twitter Corpus. In Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017).
Manuela Sanguinetti, Fabio Poletto, Cristina Bosco, Viviana Patti, Marco Stranisci. An Italian Twitter Corpus of Hate Speech against Immigrants. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018).