Towards NLP (@towards_nlp)

Text Detoxification using Large Pre-trained Neural Models

~~this whole article is bullshit .~~ This article’s not a good deal.

We continue our story about the detoxification task. Today at the EMNLP conference, David Dale (@cointegrated) will present our work on detoxification for the English language: paper, GitHub.

The paper presents two models adapted for text style transfer:

- ParaGeDi: built on GeDi, a model that generates text from scratch under the guidance of a language model informed about specific attributes of the text, e.g. style or topic. We extend this model by enabling it to paraphrase the input text.
- CondBERT: as is well known, BERT was pretrained on several tasks, one of which is the prediction of masked tokens. We can use this task: mask the tokens that carry the attributes of the original style, then predict substitutes for them in the target style.

In addition, a T5 detoxification model was trained on a pseudo-parallel corpus; you can try it via the HuggingFace interface 🤗.

The proposed models achieve the current SOTA in style transfer for the detoxification task! You are welcome to test the models and open GitHub issues 🙂
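
To make the mask-and-substitute idea behind CondBERT more concrete, here is a minimal toy sketch in plain Python. It is not the implementation from the paper: the `TOXICITY` lexicon, the `toy_propose` function (a stand-in for a masked language model such as BERT), the scoring rule, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

MASK = "[MASK]"

# Hypothetical toy lexicon (token -> toxicity in [0, 1]); a real system would
# use a trained toxicity classifier instead of a hand-written table.
TOXICITY: Dict[str, float] = {"bullshit": 0.95, "stupid": 0.9, "nonsense": 0.3, "weak": 0.1}

# A proposer maps (masked tokens, mask position) to candidate words with scores.
Proposer = Callable[[List[str], int], Dict[str, float]]


def mask_toxic(tokens: List[str], threshold: float = 0.5) -> List[str]:
    """Replace every token whose toxicity exceeds the threshold with [MASK]."""
    return [MASK if TOXICITY.get(t.lower(), 0.0) > threshold else t for t in tokens]


def fill_masks(masked: List[str], propose: Proposer, toxicity_penalty: float = 1.0) -> List[str]:
    """Fill each [MASK] with the candidate maximizing score - penalty * toxicity."""
    result = []
    for position, token in enumerate(masked):
        if token != MASK:
            result.append(token)
            continue
        candidates = propose(masked, position)
        best = max(
            candidates,
            key=lambda w: candidates[w] - toxicity_penalty * TOXICITY.get(w.lower(), 0.0),
        )
        result.append(best)
    return result


def toy_propose(masked: List[str], position: int) -> Dict[str, float]:
    """Stand-in for a masked language model: fixed candidates, made-up scores."""
    return {"bullshit": 0.9, "nonsense": 0.7, "weak": 0.4}


tokens = "this whole article is bullshit .".split()
detoxified = " ".join(fill_masks(mask_toxic(tokens), toy_propose))
print(detoxified)  # this whole article is nonsense .
```

In this sketch the toxicity penalty is what steers the substitution toward the target style: without it, the most likely candidate for the masked position would simply be the original toxic word.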