An Improved mT5 Model for Chinese Text Summary Generation

Authors

Fuping Ren², Jian Chen¹, and Defu Zhang¹, ¹Xiamen University, China, ²Shenzhen Comtech Technology Co. Ltd, China

Abstract

Complex policy documents are difficult to understand, which motivates intelligent interpretation of Chinese policies. To improve Chinese text summarization, this study uses the mT5 model as both the core framework and the source of initial weights. It reduces model size through parameter clipping, adopts Gap Sentence Generation (GSG) as its unsupervised training method, and improves the Chinese tokenizer. Training on a carefully processed 30 GB Chinese corpus yields the enhanced mT5-GSG model. For fine-tuning on Chinese policy texts, the study adopts the "Dropout Twice" approach and uses the Wasserstein distance to merge the probability distributions produced by the two dropout passes. Experimental results show that the proposed model achieves ROUGE-1, ROUGE-2, and ROUGE-L scores of 56.13%, 45.76%, and 56.41%, respectively, on the Chinese policy text summarization dataset.
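The "Dropout Twice" idea can be illustrated with a small pure-Python sketch. The abstract does not specify how the two distributions are combined, so everything here is an assumption made for illustration: the helper names (`dropout_twice_loss`, `wasserstein_1d`), the weighting factor `alpha`, the single linear layer standing in for the model, and the choice to treat token indices as an ordered one-dimensional support when computing the Wasserstein distance. It is one plausible reading, not the authors' implementation.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def wasserstein_1d(p, q):
    """1-Wasserstein distance between two distributions on the same ordered
    support with unit spacing: the sum of absolute CDF differences.
    ASSUMPTION: token indices are treated as positions on a line."""
    cdf_p = cdf_q = 0.0
    dist = 0.0
    for pi, qi in zip(p, q):
        cdf_p += pi
        cdf_q += qi
        dist += abs(cdf_p - cdf_q)
    return dist


def dropout_twice_loss(hidden, weights, target, rate=0.1, alpha=1.0, rng=None):
    """Run the same input through two independent dropout passes, average the
    cross-entropy of both, and add a Wasserstein consistency penalty.
    `weights` is a hypothetical output projection (one row per token)."""
    if rng is None:
        rng = random.Random(0)
    probs = []
    for _ in range(2):
        h = dropout(hidden, rate, rng)  # a fresh dropout mask on each pass
        logits = [sum(w * x for w, x in zip(row, h)) for row in weights]
        probs.append(softmax(logits))
    ce = -0.5 * (math.log(probs[0][target]) + math.log(probs[1][target]))
    return ce + alpha * wasserstein_1d(probs[0], probs[1])
```

With `rate=0.0` the two passes coincide, so the penalty vanishes and the loss reduces to ordinary cross-entropy; with a nonzero rate the penalty pushes the two predictions toward agreement.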

Keywords

Natural Language Processing, Text Summarization, Transformer model

CS&IT Conference Proceedings
