We are currently working to organize a datathon on hate speech detection.
The proposed challenge for the datathon is to create a model that can automatically classify text messages from Twitter as racist or not racist. The focus will be on tweets written in Spanish, a language that currently lacks robust datasets for this topic. We are considering repeating the project in the future with data in Catalan, for which resources are even scarcer.
Since we also want this project to have a positive social impact, we are gathering input and recommendations on what should be considered racism from members of SOS Racisme and CNNAE (Comunidad Negra Africana y Afrodescendiente en España).
About the event
Although the exact date of the datathon has not yet been decided, we plan to hold it during the first quarter of 2022.
The goal is to have between 8 and 10 teams competing during the event. Each team will be provided with a dataset and given a challenge to address. There will be two prizes for teams that solve the proposed challenge. Each team is expected to have 4-5 members, and diversity and inclusion criteria will be taken into account when selecting the participants.
If you are interested in getting more details or becoming a sponsor, please send us an email at email@example.com.