Crossover: 2014

Chapter 267 Accelerating Data Utilization Compliance

Seeing that Lin Hui could be so considerate of his subordinates' feelings, Huang Jing felt that following him was the right choice.

Of course, Lin Hui had no idea that Huang Jing had talked himself into this conclusion deep in his heart.

After that, Lin Hui and Huang Jing said nothing more about working online.

Instead, they chatted about news concerning the American technology giants.

Most of it was gossip and other trivial news, but not all of it.

At least Lin Hui did not come away empty-handed.

From the rest of the conversation, Lin Hui learned one very important piece of information from Huang Jing.

That is, Apple seems to be committed to seeking a large data transaction with a total amount of about [-] million to [-] million US dollars.

Huang Jing was a little vague when describing this news.

He seemed afraid of accidentally misleading Lin Hui.

The information Huang Jing had passed along before was usually stated with certainty.

It was rare for him to sound so unsure.

Regarding this transaction, Huang Jing first said it was a data transaction and later said it was not a data transaction.

Lin Hui was a little confused.

Lin Hui took even gossip seriously for whatever value it might hold; after all, there is often no smoke without fire.

After further questions and repeated cross-checking of what Huang Jing had said, and after further deliberation, Lin Hui finally figured it out.

The so-called two to three billion US dollars did involve data, but this was not an ordinary type of data transaction.

The data acquisition that Apple is seeking this time is actually a rather special data transaction.

Based on information gathered through various channels, Lin Hui felt that Apple's real target was:

"Dark data".

Judging from this, Apple seemed to be openly repairing the plank road while secretly marching on Chencang, making a show of one thing while quietly pursuing another.

Dark data is also sometimes called data dust.

Dark data or "data dust" is made up of all redundant, often forgotten data.

This data is collected by companies and organizations in the course of their activities but not used afterwards.

Dark data is often unstructured, unlabeled, and unanalyzed information.

Compared with the labeled data that Lin Hui had overlooked before, dark data has even less of a presence.

This type of data is almost entirely ignored.

After all, it sits on networks and servers and seems to do nothing but take up valuable space.

Generally speaking, there are three main types of dark data:

The first is traditional text-based data. This may include emails, logs and documents.

The second type is non-traditional data.

This includes untagged audio and video files, still images, and sound files.

The third type is deep data.

This includes information in the deep web that search engines cannot reach.

Much of this deep data is private and controlled by governments or private institutions.

It includes data curated by academics, government agencies and local communities, as well as medical records, legal records, financial information and organization-specific databases.

All of the above data can be called dark data.

……

Data such as dark data is more obscure than data in the traditional sense.

Although unlabeled data such as dark data cannot be used directly, its potential cannot be denied.

In any case, it cannot be said that this information is unimportant.

As for why Apple is interested in this kind of thing: gathering this type of data has not always been regarded as collecting data at all.

In fact, with intensive processing, it can yield results similar to those of traditional data.

And with a little messaging, a company that uses this kind of data can even leave consumers with the impression that it has never dabbled in ordinary data.

Isn't that very useful for building a corporate image?

In short, for a company that wants to have it both ways, the temptation is hard to deny.

In any case, Lin Hui thought that starting with dark data fit the style of many technology giants.

Consider it by analogy with the price Lin Hui had estimated earlier.

If tens of millions of dollars can buy tens of millions of entries of bilingual annotated data, then the dark data that Apple is seeking for two to three billion US dollars must be a huge amount of data.

A big difference between labeled data and dark data is that labeled data is structured and processed data.

Dark data is largely unstructured or even "messy" data.

Structured data is generally data that has a fixed format and a limited length.

A filled-in form, for example, is structured data.

For example, "Nationality, Florist, Ethnicity: Han, Gender: Male, Name: Zhang San, Age:..."

This kind of record is called structured data.

This type of data is easily stored in a database in a fixed format.

Semi-structured data is typically data in a format such as XML or HTML.

This type of data can be processed as structured data as needed, or its plain text can be extracted and processed as unstructured data.

Unstructured data, finally, is data with variable length and no fixed format.

For example, web pages and emails are sometimes very long and sometimes so short that they end after a few sentences. This type of data is typical unstructured data.

Word documents, voice recordings, videos, and pictures are likewise unstructured data.

Semi-structured data and unstructured data are generally combined into one and collectively referred to as "dark data".
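
To make the three categories concrete, here is a minimal Python sketch. The XML snippet, the email text, and the helper names `name_from_xml` and `word_count` are invented for illustration; only the fields Ethnicity, Gender, and Name echo the form example above.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Structured: fixed fields of bounded length, ready to store as a database row.
structured = {"ethnicity": "Han", "gender": "Male", "name": "Zhang San"}

# Semi-structured: tagged, but the set of fields is not fixed in advance.
semi_structured = (
    "<person><name>Zhang San</name><note>prefers email</note></person>"
)

# Unstructured: free text of arbitrary length with no fixed format.
unstructured = "Hi, this is Zhang San. I never got a reply about my order."


def name_from_xml(xml_text: str) -> Optional[str]:
    """Treat semi-structured data as structured by pulling out one field."""
    node = ET.fromstring(xml_text).find("name")
    return node.text if node is not None else None


def word_count(text: str) -> int:
    """About all that can be said of raw text without further processing."""
    return len(text.split())
```

The structured record can be queried by field name right away, the XML needs a parser before a field can be read, and the free text offers nothing but its raw content until someone processes it.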

Nor was this a term Lin Hui had coined himself.

Dark data and structured data such as labeled data do not carry the same value.

The value of a unit of labeled data is often dozens or even hundreds of times that of a unit of dark data.

Even if two to three billion US dollars were spent on the more expensive cross-language labeled data, it could buy hundreds of millions of entries.

How much more, then, could that kind of money buy in dark data?

It is conceivable that the dark data involved in a deal of two to three billion US dollars is a considerable amount.

Lin Hui did have a great deal of information from his previous life.

But he could not possibly have enough dark data to satisfy Apple's appetite.

Never mind the information from Lin Hui's previous life.

Even the dark data held by some of the top domestic Internet giants might not be enough to satisfy Apple's appetite.

In this case, if Lin Hui was interested in Apple's huge acquisition, it seemed he could only collect dark data himself.

As for how to collect it?

Dark data is collected in a variety of ways.

Dark data includes user activity logs, customer conversation and email records, server monitoring logs, video files, and machine and sensor information generated by the Internet of Things.

Dark data can also include data that is no longer accessible because it is stored on obsolete devices.

As a result, when cleaning up activity logs or collecting storage fragments, it is often possible to pick up some dark data along the way.

In addition, there are many other ways to collect dark data.

That is easy to say, though.

As the saying goes, talking about toxicity without regard to dosage is hooliganism.

By the same token, talking about mining data without regard to the scale of the data is also hooliganism.

Traditional data mining methods are definitely not up to dark data on the scale Apple is after.

And there seems to be no good way to mine dark data at present.

When traditional companies deal with dark data, they fall back on clumsy methods, looking for ways to convert unstructured data into structured data.

This method is time-consuming and labor-intensive.
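
Converting unstructured text into structured records can be sketched in a few lines of Python. The log format, the field names, and the function `parse_log_line` are assumptions made purely for illustration. Real dark data rarely follows one tidy template, which is exactly why this approach costs so much time and labor at scale.

```python
import re
from typing import Optional

# Hypothetical log format (an assumption, not taken from the story):
# "<date> <time> <LEVEL> user=<id> <free-text message>"
LOG_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) user=(?P<user>\w+) (?P<message>.*)"
)


def parse_log_line(line: str) -> Optional[dict]:
    """Return a structured record, or None if the line fits no template."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


raw_lines = [
    "2014-03-01 09:15:02 INFO user=u1001 opened settings page",
    "totally free-form text that fits no template",
]

# Every line is tried against the template; lines that fail stay "dark".
records = [parse_log_line(line) for line in raw_lines]
structured = [record for record in records if record is not None]
```

Each new data source needs its own hand-written pattern, and anything that does not match is simply left behind.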

But that is only the case for today's technology companies.

Lin Hui, for his part, had many more data mining methods at his disposal.

No one knew how to mine data better than Lin Hui.

For large-scale data mining, it seems that the most convenient way is to mine with the help of artificial intelligence.

Even Lin Hui's computer in his previous life had some ready-made ways to mine dark data.

Their efficiency might be greatly reduced by the limits of current hardware.

But compared with today's traditional mining methods, they would still land like a dimensionality-reduction strike.

That raised a new problem, though: where to mine dark data?

As mentioned earlier, some deep data is private and controlled by governments or private institutions.

This category includes data curated by academics, government agencies, and local communities, medical records, legal records, financial information, and organization-specific databases.

Even though it counts as dark data, Lin Hui would not dare to dig into it even if he had ten times the courage.

After all, another name for this thing is state secrets.

After thinking for a while, Lin Hui came up with a few ideas.

But on careful consideration, every one of them seemed to carry risk and to be infeasible in the short term.

Admittedly, almost nothing one does is completely risk-free.

But it seemed unnecessary to take risks for a mere $[-] million.

After all, with the information in Lin Hui's mind, it really doesn't take long to earn [-] million US dollars.

In short, there is no need to take risks at all.

There was no need to take risks, and Lin Hui always sought stability.

So why had Lin Hui come up with a bunch of risky methods when it came to data mining?

Could it be that Lin Hui had gotten carried away?

A few years in the future, there would be nothing wrong with Lin Hui's ideas.

At least by the thinking of late 2021, what Lin Hui had just come up with was not reckless at all.

The method Lin Hui thought of can be operated in accordance with the rules.

But unfortunately, it is now 2014, and it is impossible to perform the same operation.

The most fundamental reason is that in the previous life, a few years from now, everything to do with data mining had been given set procedures and standards.

There was a clear "Data Security Law" governing the use of data and data security.

The first article of the law clearly stated: "This law is formulated in order to regulate data processing activities, ensure data security, promote data development and utilization, protect the legitimate rights and interests of individuals and organizations, and safeguard national sovereignty, security, and development interests."

The "Data Security Law" sets out many matters of data utilization and data security quite clearly.

Needless to say, it covers the data security emergency response mechanism, data security review, data export control, and so on.

The key point is that this law establishes the system for categorizing and grading data and for protecting core data, and also establishes the data security risk assessment and work coordination mechanisms.

Taken together, these two points undoubtedly mean that security risk assessments will be carried out at the national level so that data can be classified.

The controls on data may seem stricter, but this is good news for those who are truly down-to-earth.

Why?

What people should fear is not clear regulation but having no law to go by.

Having no law to go by means a gray area. Some people are happy to walk in the gray area. One can only say that they are foolhardy and truly unafraid of a reckoning after the fact.

Anyway, Lin Hui thinks that it is better to have a clear system for things that involve major interests.

Having a clear system represents formalization and rationalization.

This is of great benefit to practitioners.

Take the content of this law as an example again. It clearly states that the state supports the development and use of data to improve the intelligence level of public services.

It clearly states that the state supports research on data development and utilization and on data security technology, encourages technology promotion and commercial innovation in those fields, and cultivates and develops the related products and industrial systems.

In addition, it clearly states that the state promotes the construction of standard systems for data development and utilization technology and for data security.

All of this is undoubtedly good news for a dedicated technical practitioner like Lin Hui, because it represents the country's clear recognition of reasonable and compliant data utilization.

Under the law, once key data related to national security and national interests has been screened out, ordinary data can actually be put to reasonable use, even commercial use, as long as the regulations are not violated.

This is a major boon for well-behaved technicians.

It can be said that after this law was passed, the use of data in our country truly got on the right track.

It was better than the situation now [in 2014], anyway.

At present, there are basically no clear laws on data mining and utilization.

Never mind laws on data mining and data utilization.

Even clear legislative definitions of "data", "data processing", and "data security" were not officially released until 2021 in the previous life.

With nothing clearly defined, offline data utilization, whether data mining or data processing, is actually a gray area in our country at present.

Although it is said that for ordinary people, "you can do what is not prohibited by the law", but when it comes to data, Lin Hui thinks it is better not to be too capricious.

Ordinary people may be indifferent to data, but technicians dare not underestimate the value of data.

As human society enters the digital age, cyberspace, the physical world, and human society begin to achieve deep integration.

Data is not only a product of the operation of cyberspace itself, but also a digital portrait of the operation of the physical world and human society, containing the laws of operation of the digital world.

In the digital age, data has multiple attributes such as national security, digital economy, social governance, and personal privacy.

In this case, many times the data is of great significance.

That being so, even without a dedicated data law, many data-related matters can still land you in trouble.

If you cause a big enough stir, there is always some law that can be applied to you.

For this reason, Lin Hui felt that one should not be too reckless with data.

If possible, Lin Hui even felt that the "Data Security Law" should be pushed forward as soon as possible.

Even if it costs a certain amount, it is worth it.

(End of this chapter)
