
The detection of distributional discrepancy for language GANs

Pages 1736-1750 | Received 22 Jan 2022, Accepted 11 May 2022, Published online: 14 Jun 2022
 

ABSTRACT

A pre-trained neural language model (LM) is usually used to generate text. Due to exposure bias, the generated text is not as good as real text. Many researchers have claimed that Generative Adversarial Nets (GANs) alleviate this issue by feeding reward signals from a discriminator back to update the LM (generator). However, other researchers have argued that GANs do not work, based on evaluating the generated text with quality-diversity metrics such as Bleu versus self-Bleu and language model score versus reverse language model score. Unfortunately, these two-dimensional metrics are not reliable. Furthermore, the existing methods assess only the final generated text and thus neglect dynamic evaluation of the adversarial learning process. In contrast to these methods, we adopt the most recent metric functions, which measure the distributional discrepancy between real and generated text. In addition, we design a comprehensive experiment to investigate performance during the learning process. First, we evaluate a language model with two functions and identify a large discrepancy. Then, we try several methods that use the detected discrepancy signal to improve the generator. Experimenting with two language GANs on two benchmark datasets, we find that the distributional discrepancy increases with more adversarial learning rounds. Our research provides convincing evidence that language GANs fail.

Acknowledgments

We thank the anonymous reviewers for their valuable comments.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 An exception is RelGAN, which does not need to pre-train D.

4 According to Section 3, we sample as many generated instances as test instances.

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [grant numbers 61936012, 61976114, 81373056] and the National Key Research and Development Program of China [grant number 2018YFB1005102].