Natural language processing (NLP) has undergone revolutionary changes in recent years. Thanks to the development and use of pre-trained language models, remarkable achievements have been made in many applications. Pre-trained language models offer two major advantages. One advantage is that they can significantly boost the accuracy of many NLP tasks. For example, one can exploit the BERT model to achieve performances higher than humans in language understanding.8 One can also leverage the GPT-3 model to generate texts that resemble human writings in language generation.3 A second advantage of pre-trained language models is that they are universal language processing tools. To conduct a machine learning-based task in traditional NLP, one had to label a large amount of data to train a model. In contrast, one currently needs only to label a small amount of data to fine-tune a pre-trained language model because it has already acquired a significant amount of knowledge necessary for language processing.
This article offers a brief introduction to language modeling, particularly pre-trained language modeling, from the perspectives of historical development and future trends for general readers in computer science. It is not a comprehensive survey but an overview, highlighting the basic concepts, intuitive explanations, technical achievements, and fundamental challenges. While positioned as an introduction, this article also helps knowledgeable readers to deepen their understanding and initiate brainstorming. References on pre-trained language models for beginners are also provided.
Thank you for this article! I enjoyed learning about the different NLP models and the history behind them. Previously, I had only heard the names and occasionally played with demos, but I never dove into how they work as I figured it would be too complex. However this article makes understanding the basic idea of them very easy! It was timely and very well written.
Displaying 1 comment