共 6 条
[2]
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension.[J] . Mike Lewis,Yinhan Liu,Naman Goyal,Marjan Ghazvininejad,Abdelrahman Mohamed,Omer Levy,Veselin Stoyanov,Luke Zettlemoyer. CoRR . 2019
[3]
A Neural Probabilistic Language Model.[J] . Yoshua Bengio,Réjean Ducharme,Pascal Vincent,Christian Janvin. Journal of Machine Learning Research . 2003
[4]
Language models are unsupervised multitask learners .2 RADFORD A,JEFFREY W,CHILD R,et al. https://cdn.openai.com/better-language-models/languagemodelsareunsupervisedmultitasklearners.pdf . 2022
[5]
BERT:pre-training of deep bidirectional transformers for language understanding .2 DEVLIN J,CHANG M W,LEE K,et al. https://aclanthology.org/N19-1423.pdf . 2022
[6]
XLNet:generalized autoregressive pretraining for language understanding .2 YANG Z L,DAI Z H,YANG Y M,et al. https://arxiv.org/abs/1906.08237 . 2022