Towards NLP

NLP: all the n-grams about text analysis. For any further questions:

5 years ago
RUSSE 2022 Detoxification Competition

There have been no posts in the channel for a month because we were preparing something quite interesting: the first shared task on text detoxification based on a parallel dataset! The shared task is held as part of the Dialogue-2022 conference.

So, what is going on? The task of text detoxification is quite straightforward: given a toxic text as input, you need to generate its non-toxic version. For example:

Well today i fucking fracking learned something. -> I have learned something new today.
Go ahead ban me, i don’t give a shit. -> It won’t matter to me if I get banned.

Interesting, right?

Previously, I posted a lot of content here about detoxification and our experiments [the first Russian detoxification experiments, SOTA unsupervised English models]. However, all of that was mostly about unsupervised methods. We have now collected a unique parallel dataset for detoxification, and you are very welcome to experiment with it! Moreover, your model's results will be evaluated manually: we aim to find truly strong detoxification systems! What is needed from you is to train, find, or create a seq2seq model that will pass the human test.

This post is a teaser for the track, which will start on December 15. More details here: https://russe.nlpub.org/2022/tox/
Telegram group for communication: https://t.me/joinchat/Ckja7Vh00qPOU887pLonqQ

See you in two days.