Ping An Sets World Record In General Language Understanding Evaluation (GLUE) Benchmark

Ping An Group

30 Mar 2020

Better Interaction Between Computers and Human Languages Has Widespread Applications Across Various Sectors

(Hong Kong, Shanghai, 30 March 2020) Ping An Insurance (Group) Company of China, Ltd. (hereafter “Ping An” or the “Group”, HKEx:2318; SSE:601318) announced that Ping An Technology (Shenzhen) Co., Ltd. (Ping An Technology) set a world record in the prestigious General Language Understanding Evaluation (GLUE) benchmark for Natural Language Processing (NLP). As of 30 March 2020, Ping An Technology’s record-breaking score of 90.6 is the highest in the world, with Baidu in second place and Alibaba in third.

GLUE is the top competition in the field of NLP and one of the most important measures of the technical level of NLP technologies.

Source: https://gluebenchmark.com/leaderboard

NLP, which allows computers to understand human speech and text, is one of the core technologies supporting artificial intelligence applications. The GLUE benchmark comprises nine tasks that test NLP models, including question answering, sentiment analysis, logical semantic analysis and textual entailment, among others. The final score is the average of the scores on the nine tasks. Since the launch of the GLUE benchmark in 2018, world-renowned companies, including Google, Microsoft, Facebook, Huawei and Alibaba, as well as well-known research institutions, have participated in the competition.
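As a minimal sketch of how an overall score of this kind is formed, the snippet below averages nine per-task scores. The task names are those of the public GLUE benchmark; the function name `glue_score` and the per-task numbers are illustrative placeholders, not Ping An Technology's actual results. The sketch also assumes each task contributes a single number, which ignores details such as tasks that report more than one metric.

```python
from statistics import mean

# The nine GLUE tasks (names taken from the public benchmark).
GLUE_TASKS = ("CoLA", "SST-2", "MRPC", "STS-B", "QQP", "MNLI", "QNLI", "RTE", "WNLI")


def glue_score(task_scores):
    """Return the unweighted average of the nine per-task scores."""
    missing = [task for task in GLUE_TASKS if task not in task_scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    return mean(task_scores[task] for task in GLUE_TASKS)


# Placeholder scores for illustration only; these are NOT real leaderboard results.
example_scores = {task: 80.0 + i for i, task in enumerate(GLUE_TASKS)}
print(round(glue_score(example_scores), 1))
```

Because the average is unweighted, a weak result on any single task lowers the overall score by the same amount regardless of how large that task's dataset is.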

In this competition, the model submitted by Ping An Technology combines a pre-trained language model (ALBERT), the Data Augmentation and Auxiliary Feature (DAAF) framework and Neural Architecture Search (NAS). DAAF, a learning framework developed by Ping An, played a key role in the test. It contains forward algorithms that draw augmentation data from external data and backward algorithms that filter out data that has a negative impact on the augmentation. Ping An uses the framework widely in functions such as smart customer service, telemarketing, training and interviews.
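The release does not describe how DAAF works internally, so the following is only a generic illustration of the forward-then-backward idea mentioned above, not Ping An's method. A forward step admits external examples that pass a cheap relevance check, and a backward step drops any admitted example whose removal improves an evaluation score. All names (`forward_augment`, `backward_filter`, `RELEVANCE`, `EFFECT`) and all numbers are hypothetical toy values.

```python
def forward_augment(base, external, relevance, threshold=0.5):
    """Forward step: add external examples whose relevance meets the threshold."""
    return list(base) + [ex for ex in external if relevance(ex) >= threshold]


def backward_filter(data, base, evaluate):
    """Backward step: drop added examples whose removal raises the score."""
    protected = set(base)  # original data is never removed
    kept = list(data)
    for ex in data:
        if ex in protected:
            continue
        without = [e for e in kept if e != ex]
        if evaluate(without) > evaluate(kept):
            kept = without
    return kept


# Toy stand-ins (assumptions): a relevance heuristic and each example's
# effect on a validation score. A real system would train and evaluate a model.
RELEVANCE = {"ext_good": 0.9, "ext_harmful": 0.8, "ext_off_topic": 0.1}
EFFECT = {"base_1": 1.0, "base_2": 1.0, "ext_good": 0.5,
          "ext_harmful": -0.4, "ext_off_topic": 0.2}


def toy_evaluate(examples):
    """Hypothetical validation score: sum of per-example effects."""
    return sum(EFFECT[ex] for ex in examples)


base = ["base_1", "base_2"]
augmented = forward_augment(base, list(RELEVANCE), RELEVANCE.get)
final = backward_filter(augmented, base, toy_evaluate)
print(final)
```

In this toy run the forward step admits two external examples and skips the off-topic one, and the backward step then removes the one that lowers the score.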

In addition to GLUE, Ping An Technology also beat competitors and average human performance on the latest Stanford Question Answering Dataset 2.0 (SQuAD 2.0), another leading benchmark for testing NLP algorithms. Ping An is the second company, after Google, to take the top ranking in both tests.