Compressing Large Language Generation Models with Sequence-Level Knowledge Distillation
By Brendan Chambers, David Silin, and Kevin Gimpel of QuillBot Research