Guideline to develop AI-backed Chinese language database

Digitalization of ancient texts promotes cultural heritage, Mandarin learning

By Zhao Yimeng | China Daily | Updated: 2025-04-01 09:10

China is accelerating the digitalization of ancient texts and boosting access to oracle bone script data, aiming to integrate cultural heritage with digital Chinese, officials said on Monday.

The Ministry of Education, the National Language Commission and the Cyberspace Administration of China issued a guideline to promote the digitalization of the Chinese language and characters. The focus is on developing national language resources and large-scale Chinese language models to support artificial intelligence.

The guideline aims to establish a national corpus and strategic language resources information database by 2027. By 2035, the country hopes it will have significantly expanded the presence of the Chinese language in global digital and generative AI scenarios.

Liu Peijun, head of the Department of Language Information Management at the Ministry of Education, said the guideline calls for the digitalization of linguistic and cultural heritage, while promoting the construction of a national digital language and script museum.

It emphasizes advancing key technologies for ancient text digitalization, enhancing the accessibility of oracle bone script data and launching a multilingual digital education program to facilitate Chinese language learning globally, Liu said at a news conference.

A key aspect of this initiative is the development of large-scale linguistic data resources. The guideline outlines a plan to build a national corpus with extensive Chinese language datasets to support AI applications.

Among the pilot projects, Beijing Normal University has launched a large-scale Classical Chinese language model, an AI-driven initiative that sets a new benchmark in the field, Liu said.

Kang Zhen, vice-president of BNU, said the university has developed a range of digital language databases, including a comprehensive holographic Chinese character database, a digital resource of the ancient Chinese dictionary Shuowen Jiezi, and repositories for ancient inscriptions and handwritten texts.

These resources have played a crucial role in linguistic research and cultural preservation, Kang added.

The university's AI Taiyan, a Classical Chinese large language model with 1.8 billion parameters, is designed for high-accuracy interpretation of ancient texts. It supports tasks such as word and phrase explanation and classical-to-modern Chinese translation.

China is also spearheading the construction of a new national corpus to strengthen linguistic infrastructure in the AI era, said Wang Hui, deputy head of the Ministry of Education's Department of Language Application and Administration.

"Currently, most linguistic datasets remain limited to single-text formats and specific academic domains, lacking the scale and diversity required for AI applications," Wang said.

The department has begun planning the corpus this year and aims to launch two flagship databases: the Chinese civilization corpus for AI-assisted teaching and research, and the Chinese grand reading system corpus, Wang said.
