Comparative Analysis of Transformer based Language Models

Aman Pathak, Medi-Caps University, India

Authors

Aman Pathak, Medi-Caps University, India

Abstract

Natural language processing (NLP) has seen substantial advances in the past three years. With the introduction of the Transformer and the self-attention mechanism, language models can now learn better representations of natural language. These attention-based models have achieved state-of-the-art results on various NLP benchmarks. One contributing factor is the growing use of transfer learning: models are first pre-trained on unsupervised objectives using large, rich datasets to develop fundamental natural language abilities, and are then fine-tuned on supervised data for downstream tasks. Surprisingly, recent research has led to a new era of powerful models that no longer require fine-tuning. The objective of this paper is to present a comparative analysis of some of the most influential language models. The criteria of the comparison are problem-solving methodology, model architecture, compute power, accuracy on standard NLP benchmarks, and shortcomings.

Keywords

Natural Language Processing, Transformers, Attention-Based Models, Representation Learning, Transfer Learning.

CS&IT Conference Proceedings