Towards NLP

NLP: all the n-grams about text analysis. For any additional questions:

4 years ago
ParaDetox: Detoxification with Parallel Data

At this ACL I presented the current culmination of our detoxification project: a parallel dataset of "toxic sentence <-> non-toxic paraphrase" pairs, together with the pipeline used to collect such a dataset.

What for? The detoxification task can now be solved as a typical machine translation task, which makes it possible to achieve quite good quality with text style transfer models. Moreover, the dataset collection pipeline can be reused for any other text style transfer task.

What we release:
* ParaDetox dataset in a HuggingFace🤗 repo;
* New SOTA model for detoxification, also in 🤗 here.

All other details can be found in our GitHub repo. You are very welcome to play with our detoxifier!
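
To make the "detoxification as machine translation" framing concrete, here is a minimal sketch of how parallel pairs could be turned into source/target training examples, with the toxic sentence playing the role of the source language and the neutral paraphrase the target. The names `Seq2SeqExample` and `build_examples` are hypothetical, and the toy pairs are made up for illustration; none of this is taken from the ParaDetox dataset or its collection pipeline.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Seq2SeqExample:
    """One training example in the machine-translation framing (hypothetical type)."""
    source: str  # toxic sentence, treated as the "source language"
    target: str  # neutral paraphrase, treated as the "target language"


def build_examples(pairs: Iterable[Tuple[str, str]]) -> List[Seq2SeqExample]:
    """Turn (toxic, neutral) pairs into seq2seq examples.

    The filtering rules here are assumptions made for this sketch:
    rows with an empty side and rows where nothing was rewritten are skipped.
    """
    examples = []
    for toxic, neutral in pairs:
        toxic, neutral = toxic.strip(), neutral.strip()
        if not toxic or not neutral or toxic == neutral:
            continue
        examples.append(Seq2SeqExample(source=toxic, target=neutral))
    return examples


# Made-up toy pairs, for illustration only.
toy_pairs = [
    ("this idea is stupid", "this idea is not good"),
    ("  shut up and listen  ", "please listen"),
    ("same text", "same text"),   # unchanged pair, skipped
    ("", "empty source"),         # empty source, skipped
]

examples = build_examples(toy_pairs)
for ex in examples:
    print(f"{ex.source!r} -> {ex.target!r}")
```

Examples in this shape can be fed to any standard encoder-decoder training setup: the model reads `source` and is trained to generate `target`, exactly as in translation.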