Riding the wind of rebirth

Chapter 744 I Need It Too

However, Chinese character encoding in his previous life also had a problem: Unicode arrived too late. Microsoft had to adopt an extended encoding based on GB 13000, and for that reason the national standard also had to build on the GB 13000 encoding, patching it into GBK and then extending it again into GB 18030.

The final version, the national standard GB 18030-2005 "Information Technology: Chinese Coded Character Set", is fully compatible with GB 2312-1980, basically compatible with GBK, supports all the unified Chinese characters of GB 13000 and Unicode, and includes a total of 70,244 Chinese characters.

At that time, Unicode did not include as many Chinese characters as GB 18030-2005. In theory it could easily accommodate every Chinese character, but countless code points sat empty.

The end result was an old system covered in patch upon patch, and a new system with a large number of empty code points and no one doing the work of filling them. As a result, even decades later, information systems were still plagued by Chinese character transcoding incompatibilities.

Zhou Zhi, who had been a programmer at a state-owned enterprise in his previous life, had suffered a lot from this. He believed that the key to solving the problem was for the country to abandon the cramped ISO/IEC 10646 from the very beginning and first grab enough space for Chinese characters in the Unicode standard, at least [-] code points to fill in first, and then adopt it as the only mandatory standard, one used all over the world.

So he said: "Isn't it just right that there is no horoscope? Only if there is no horoscope, we can participate deeply. As long as we can occupy three code points and leave it to us, we can accommodate [-] Chinese characters."

"And Unicode only has the concept of encoding, and the purpose of its design itself is to hold all kinds of characters in the world."

"Chinese character coding is undoubtedly the most complicated text coding work in the world. After we complete this work, we can also have sufficient voice in the organization. In the future, we can also guide the work of other countries and organizations and help us write other ethnic languages. , also serves as a foundation.”

Now it was Gu Lao's literary and historical experts' turn to fail to understand what Zhou Zhi and Li Hongjiang were discussing.

Mr. Gu interrupted the lively discussion between the two: "Elbow, Xiao Li, which one of you will explain in words that we old men can understand?"

Mai Mingchuan said with a smile: "I understand the general idea. Let me explain first and see if I have it right. Xiao Li and Zhou Zhi can correct me if I don't."

"Right now there are two sets of standards. One is ISO/IEC 10646. This system is mature. Although only the first part has been promulgated, our country has developed GB 13000 on the basis of it, and that can be implemented quickly."

"But this system has a big problem, that is, there are too few code points, and it can only accommodate [-] Chinese characters. Now it seems that there is still a considerable distance from fully meeting the needs."

"There is another set of standards, which is Unicode."

"As long as the coding range allocated to Chinese characters is sufficient, this set of standards can accommodate all our Chinese characters, and in the future, we can continue to win more coding ranges for further expansion, or use them for coding other ethnic minority characters. .”

"From the perspective of design principles, the Unicode standard is actually better than ISO/IEC 1064. However, this standard is only a half-baked one. The first version has not yet been released. If we want to use the Unicode standard, we must first improve the standard. , and then we can discuss the allocation of intervals and the next step."

"Xiao Li's meaning is that we first use GB 13000. We already have the foundation for GB2312 before. We are familiar with this method and get quick results."

"The meaning of Elbow is that we started working on Unicode from the very beginning, and it was done in one step. Since the Unicode standard has not yet been finalized, then we will actively participate in it and work on the standard together!"

"Of course it would be the best result if we can really do what Elbow said. But, do we have the strength?" Mr. Gu still has the impression that the country's information industry is catching up at the beginning, and he is worried about relying on the country's current technology Power, can't do the job.

"In fact, they have basically completed this work." Li Hongjiang said: "Most computers use the American Standard Code for Information Interchange, which is ASCII code, which is a 7-bit code representing all uppercase and lowercase letters, numbers, punctuation marks and control characters. Scheme. The Unicode has been compiled for the ASCII code, and '\u0000' to '\u007F' correspond to all 128 ASCII characters."

"That is to say, the computer system can already use Unicode encoding, but it has not yet formed a large standard?"

"There are still many areas that need to be improved." Li Hongjiang said: "Of course, since the ACSII problems have been solved, at least the architecture is mature, and the rest are minor problems."

"If, I mean if, we could have a [-]-level code point space content for them to fill, I believe the league would be very interested."

There is a saying from later years: "First-rate companies make standards, second-rate companies make brands, and third-rate companies make products." The current contest between GBK and Unicode is in fact a dispute over standards.

Zhou Zhi added: "This is a major matter in which a single move affects everything. To put it bluntly, it is a dispute over standards."

"China's discourse power in the world's information industry can be said to be insignificant, but the Chinese character library can be called a special resource."

"I'm afraid that in countries with all alphabetic languages ​​in the world, if you add up all the symbols, there are not as many Chinese characters as there are in China."

"If we complete this font first, then for Unicode, it can be shown to the world as its absolute advantage."

"It's like GBK is still using tank cannons, and Unicode has detonated a hydrogen bomb."

"We can take our achievements, pay membership fees, and become members of the organization."

Li Hongjiang did some research on this organization and said: "The Unicode Consortium is a Unicode organization located in California, USA. They actually allow any company or individual who is willing to pay the membership fee to join."

"Two organizations were established at the end of the 80s, one is the commercial organization of the Unicode organization, and the other is the International Standardization Organization for international cooperation. Under the needs of computer popularization and information internationalization, they respectively established the Unicode organization and the ISO-10646 work group.""

"They soon discovered the existence of each other, and everyone worked for the same purpose, so the two organizations worked together to develop universal codes applicable to all languages, and quite tacitly published Unicode and ISO-10646 character sets. Although In fact, the character set encodings of the two are the same, but in essence the two are indeed two different standards."

"The Unicode Consortium released The Unicode Standard for the first time in the previous year. The development of Unicode combined with the ISO/IEC 10646 developed by the International Organization for Standardization, that is, the universal character set. The two are actually the same in the principle of encoding."

"But The Unicode Standard contains more detailed implementation information, covering more detailed topics such as bit encoding, proofreading, and rendering. It even enumerates many character characteristics, including those that must support both reading directions. , such as the left-to-right direction of ordinary reading, and the right-to-left direction of Arabic."

Whoa! The eyes of Zhou Zhi, Gu Kailai, and Dan Zeng met in the air in an instant. Ancient Chinese classics are also read from right to left!

If Arabic can be supported, so can our Chinese classics!
