[2] ERNIE 2.0: A Continual Pre-Training Framework for Language Understanding[J]. Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, Haifeng Wang. Proceedings of the AAAI Conference on Artificial Intelligence, 2020(05).
[3] RoBERTa: A Robustly Optimized BERT Pretraining Approach[J]. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. CoRR, 2019.
[4] GPT-based Generation for Classical Chinese Poetry[J]. Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang. CoRR, 2019.
[5] Cross-lingual Language Model Pretraining[J]. Guillaume Lample, Alexis Conneau. CoRR, 2019.
[6] Pre-Training with Whole Word Masking for Chinese BERT[J]. Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu. CoRR, 2019.
[7] Domain-Adversarial Training of Neural Networks[J]. Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Victor S. Lempitsky. Journal of Machine Learning Research, 2016.
[8] Adam: A Method for Stochastic Optimization[J]. Diederik P. Kingma, Jimmy Ba. CoRR, 2014.
[9] ERNIE: Enhanced Language Representation with Informative Entities[C]. Zhang Zhengyan, Han Xu, Liu Zhiyuan, et al. Proc of the 57th Annual Meeting of the ACL, 2019.
[10] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[C]. Devlin J, Chang Mingwei, Lee K, et al. Proc of the 2019 Conf of the North American Chapter of the Association for Computational Linguistics, 2019.