China to ensure rare characters survive digital era
Chinese authorities and tech companies are working together to speed up the digitalization of rarely used Chinese characters so that computers can recognize them as the country's banks, hospitals, and government agencies move transactions online.
The China Electronics Standardization Institute, under the Ministry of Industry and Information Technology (MIIT), tech giant Tencent and others on Sunday announced new initiatives to solicit rare characters from the public and promote cross-industry recognition of such characters.
"Government and public services in China are accelerating their digital transformation and moving operations online. However, due to insufficient technical support, many people's names and place names cannot be input," said Sun Wenlong, deputy head of the China Electronics Standardization Institute.
"This has caused troubles in many transactions, such as opening bank accounts and purchasing transportation tickets," Sun added.
The institute is inviting members of the public to photograph unusual characters and submit them via a mini program embedded in WeChat, a popular messaging app operated by Tencent. Characters that pass the experts' assessment will be added to GB18030, China's official coded character set.
Since its launch on April 20, the mini program has received more than 1,400 rare characters, said Zeng Yu, vice president of Tencent. The character with the most submissions is "biang," as in "biang biang noodle," a popular snack from the northwestern province of Shaanxi.
Also on Sunday, Tencent's Sogou Keyboard launched an industrial solution to make rare characters more easily accessible in public services such as finance, transportation, education, and health care.
The latest version of GB18030 has registered more than 80,000 Chinese characters. However, most computers only support the input and display of about 30,000 commonly used characters, according to Sogou Keyboard.
Roughly 60 million people in China have names that contain rare characters, experts said, and a significant number of place names and ancient texts are difficult to digitalize because they contain unrecognized characters.
Earlier this month, a village in Yunnan Province drew wide attention because a local surname, "Nia," meaning "a flying bird," could not be typed into computers, forcing many villagers with this family name to switch to a similar-looking character meaning "duck" when registering for ID cards.
Lin Sumiao, a lawyer based in Beijing, said her name, which contains the rare character "su," was constantly shown as "Lin ? miao" on exam admission cards while she was in school, and the problem still occurs when she prints flight boarding passes.
Lin is reluctant to change her given name as it carries special meaning. The character "su" consists of the characters "geng" and "sheng," which form the latter half of the phrase "zi li geng sheng," meaning "to rely on oneself."
"The character contains my parents' wish for me to grow up into an independent soul," she said.
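One common way a rare character ends up printed as a question mark is that a system stores text in an older, smaller character set and substitutes "?" for anything it cannot encode. The Python sketch below illustrates that behavior under stated assumptions: the name is hypothetical, U+20000 stands in for a rare character, and GB2312 stands in for a legacy system's character set. The systems Lin encountered may have failed in a different way.

```python
# Hypothetical three-character name: two common characters around a rare one.
# U+20000 (CJK Extension B) is a stand-in for a character outside older charsets.
RARE = chr(0x20000)
name = "林" + RARE + "明"


def as_system_sees(text: str, charset: str) -> str:
    """Round-trip text through a charset, replacing unencodable characters with '?'."""
    return text.encode(charset, errors="replace").decode(charset)


def is_supported(ch: str, charset: str) -> bool:
    """Return True if the charset can encode the character."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


legacy_view = as_system_sees(name, "gb2312")    # the rare character is replaced
modern_view = as_system_sees(name, "gb18030")   # the name survives intact
```

Here `legacy_view` carries a question mark in place of the middle character, while `modern_view` is identical to the original name, because Python's `gb18030` codec can encode every Unicode code point.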
Because Chinese is an ideographic writing system, its characters are difficult to digitalize: each character must be assigned a unique code and have a corresponding glyph in a font, said Huang Shanshan, director of the Chinese information lab of the China Electronics Standardization Institute.
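The "unique code" requirement can be made concrete with a short Python sketch. It looks up the Unicode code point of a character and the byte sequence the `gb18030` codec assigns to it. The helper name `describe` and the two sample characters are illustrative choices, not part of any standard or of the institute's tooling.

```python
def describe(ch: str) -> dict:
    """Return the Unicode code point and GB18030 byte sequence for one character."""
    encoded = ch.encode("gb18030")
    return {
        "code_point": f"U+{ord(ch):04X}",
        "gb18030_hex": encoded.hex(),
        "byte_length": len(encoded),
    }


common = describe("中")          # a common character: two bytes in GB18030
rare = describe(chr(0x20000))    # a CJK Extension B character: four bytes
```

A code is only half of the requirement. To display the character, a device also needs a font that contains a glyph for that code, which an encoding lookup like this one cannot check.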
Tencent's Zeng said the digitalization of uncommon characters is a complex, system-wide project that requires government guidance and joint contributions from developers of input methods, font libraries, hardware, software, and operating systems.
Despite the difficulties involved, the digitalization effort must press ahead, not only to solve practical problems but also to preserve and pass down Chinese culture, said Tan Jingchun, a researcher with the Chinese Academy of Social Sciences.
"Every rare character is a part of the cultural heritage. They shall not be lost in the digital era nor become an obstacle to the digital society," said Zeng.