Automatic Paraphrasing via Sentence Reconstruction and Back-translation.
Abstract
Paraphrase generation plays a key role in NLP tasks such as question answering, machine translation, and information retrieval. In this paper, we propose a novel framework for paraphrase generation that simultaneously decodes the output sentence using a pretrained wordset-to-sequence model and a back-translation model. We evaluate this framework on Quora, WikiAnswers, MSCOCO, and Twitter, and show that it outperforms previous state-of-the-art unsupervised and distantly supervised methods by significant margins on all datasets. On Quora and WikiAnswers, our framework even performs better than some strongly supervised methods with domain adaptation. Further, we show that the generated paraphrases can be used to augment the training data for machine translation, yielding substantial improvements.
Notes
[Online; accessed 31. May 2024]