Big data in China
Chapter 32 Master big data and be the master of the future world!
Chapter 32 Master big data and be the master of the future world! (2)
Just as people needed electricity in the era of the industrial revolution and computers in the computer age, in the era of big data everyone needs to be able to obtain data freely and legally. This changes the form of competition and, with it, the way we think about competition: we must not only make sure we obtain data in a timely manner, but also respect and support the relevant rights of others and meet each other's needs for data.
For example, in 2009 the US government created the Data.gov website, which opened the door to the popularization of big data and the disclosure of data. The public can obtain a wide range of government data through this website. If China wants to catch up with the big data transformation, it must first launch a far-reaching "data disclosure" campaign, starting with the government, followed by enterprises, and finally reaching each of us.
Data Control Principle [-]: Ensure maximum availability of data.
Data Control Principle [-]: Ensure maximum understandability of data.
We all know that there are gold mines in data. But as I mentioned above, in fields from genomics, astronomy, ecology, and clinical medicine to high-energy physics, the core question is this: when data floods in, how do we collect and manage it while ensuring that it remains comprehensible and available?
The complexity of big data lies in the speed of its delivery and use. It must be real-time: if it lags by an hour or two, or even a day or two, acquiring it may no longer mean anything. Data flowing in real time has the greatest value. The liquidity of data is therefore the most important basis on which big data realizes its personalized applications. In other words, data has no value in itself; it has value only when it is sufficiently liquid.
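The point about lag can be made concrete with a small freshness check. This is only a sketch: the function name `is_fresh`, the one-hour limit, and the timestamps are assumptions chosen for illustration, not values from any real system.

```python
from datetime import datetime, timedelta, timezone

def is_fresh(recorded_at, now, max_age=timedelta(hours=1)):
    """Return True if the record is no older than max_age at time `now`."""
    return now - recorded_at <= max_age

# Hypothetical timestamps for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
recent = now - timedelta(minutes=10)   # arrived ten minutes ago
stale = now - timedelta(days=2)        # arrived two days ago

print(is_fresh(recent, now))
print(is_fresh(stale, now))
```

A consumer of the data could use such a check to refuse to act on records that arrive too late to matter, which is the practical meaning of "liquidity" here.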
For example, a computer at a US military base in Afghanistan stores information on a group of terrorists, covering almost every detail, such as their photos, weapons and equipment, and logistics. A group of unidentified people has been spotted, and it must be confirmed whether they are the targets. The base therefore needs to send the stored data on these people to a drone in real time over a data link, so that the comparison can be made in the air. If the data cannot be obtained in real time, then an hour later the terrorists may have fled and no one will be there, or the drone may have run out of fuel and had to return. In the worst case, the drone fires missiles at the people below, only for it to emerge days later that it bombed civilians by mistake. This is a classic case of data liquidity and availability.
The fact is that the data around us has no practical value when we cannot use it, that is, when it has no liquidity. This is a defining characteristic of the big data era.
Of course, I firmly believe that the free and legal flow of data will come. It is an irreversible trend that human interference cannot stop. Along the way, a steady stream of people will step forward to offer their technical processing capabilities and methods to more people, at a reasonable and low cost, and together they will drive the flow of data and make personalized applications possible.
Avoid dead ends - wrong premises lead to wrong conclusions
Why do I say that false premises lead to false conclusions? Does the basis for data analysis have to be determined by some human motivation? Those who work with data every day may be able to sum up the reason in one sentence, but those who are new to big data may be confused: since correlations in the data are enough to reflect a certain "fact", why set premises for the analysis at all?
The answer may be one that many people do not want to accept, because it means admitting that they have drawn too many wrong conclusions. Sometimes the cause is poor-quality data or too little data; more often, it is precisely that we misuse the results of the analysis. Our own mistakes make the analysis go wrong. When people themselves are the problem, big data either lets the problem persist or aggravates its consequences, pushing the results further and further in the wrong direction.
The "data" produced by big data technology is not necessarily good data. You must understand this first and judge it clearly, or you will sink into a quagmire of blind faith. More and more experts now agree with my analysis: big data does not automatically produce good analytical results. The results depend on the conditions you set in advance, such as the analytical logic or the focus of the data.
In specific applications, incomplete, out-of-context, or corrupted data may lead us to make wrong decisions. To some extent such disastrous results are even bound to happen, weakening the value of data and affecting the competitiveness of enterprises or our daily lives.
Mr. Green is a professor at Harvard University in the United States and an expert in quantitative analysis. He once misread the data during an analysis, which led to wildly wrong results. A few years ago he launched a big data analysis project that aimed to predict the US unemployment rate by detecting keywords such as "job", "unemployment", and "classification" in tweets and other social media posts.
His team used sentiment analysis techniques to collect massive amounts of content containing these keywords, and judged from the rise and fall in the number of such posts how they correlated with the monthly unemployment rate. During collection and analysis, the team found that content containing the keyword "job" had increased sharply, which seemed to mean that more people were discussing work that month. But then they discovered that the surge had nothing to do with the unemployment rate. Steve Jobs had died, and his surname, Jobs, also matches the keyword "job".
Green therefore said that people should learn from this example and not believe that big data can unconditionally hand you a conclusion and magically make the decision for you. Every analysis must rest on a credible and precise premise; otherwise the conclusion may end up somewhere that has nothing to do with the facts.
In other words, without the necessary causal support, correlations in data can lead you to catastrophic failure. There are many ways to address this. For example, we can tighten the premise of the analysis by adding further keywords, but this often requires a great deal of human work.
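A minimal sketch of what "adding further keywords" can look like in practice follows. The sample posts and the exclusion terms are invented for illustration and are not data or methods from the project described above; the idea is only that a bare keyword match counts posts about the person as posts about employment, while a hand-maintained exclusion list removes them.

```python
import re

# Hypothetical sample posts; not data from the study described in the text.
POSTS = [
    "Lost my job last week, looking for work",
    "Steve Jobs has died, a sad day for Apple",
    "Any job openings in Boston?",
    "RIP Jobs, thank you for the iPhone",
]

# Matches "job" or "jobs" as a whole word, ignoring case.
KEYWORD = re.compile(r"\bjobs?\b", re.IGNORECASE)
# Assumed exclusion terms that suggest the post is about the person.
EXCLUDE = re.compile(r"\b(steve|apple|rip|iphone)\b", re.IGNORECASE)

def naive_count(posts):
    """Count posts that mention the keyword, with no further premise."""
    return sum(1 for p in posts if KEYWORD.search(p))

def refined_count(posts):
    """Count keyword posts that contain none of the exclusion terms."""
    return sum(1 for p in posts if KEYWORD.search(p) and not EXCLUDE.search(p))

print(naive_count(POSTS), refined_count(POSTS))
```

The exclusion list is exactly the "human work" the text mentions: someone has to notice the interference and decide which terms to add.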
When we first fix a set of keywords, the analysis returns mostly relevant material and very little that is irrelevant. But over time, if you do not update the query and do not revisit the premise and the context of the data, you will find that the topics containing those keywords somehow drift off topic. Sometimes they deviate slightly; at other times they deviate so far that you can hardly find any connection at all.
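One way to notice this kind of drift, sketched here under stated assumptions, is to track what share of keyword matches still appear alongside words from the intended topic. The context word list, the 0.5 threshold, and the two batches of posts are all hypothetical.

```python
import re

KEYWORD = re.compile(r"\bjobs?\b", re.IGNORECASE)
# Assumed words that indicate a post really is about employment.
CONTEXT = {"unemployment", "hiring", "laid", "work", "fired", "salary"}

def on_topic_share(posts):
    """Share of keyword-matching posts that also contain a context word."""
    matched = [p for p in posts if KEYWORD.search(p)]
    if not matched:
        return None  # nothing matched, so no share can be computed
    on_topic = sum(
        1 for p in matched if CONTEXT & set(re.findall(r"[a-z]+", p.lower()))
    )
    return on_topic / len(matched)

def drifted(posts, threshold=0.5):
    """Flag a batch whose on-topic share falls below the threshold."""
    share = on_topic_share(posts)
    return share is not None and share < threshold

# Hypothetical batches from two different periods.
batch_a = ["Lost my job, unemployment is rough", "New job, better salary", "job hunting again"]
batch_b = ["Jobs has died", "RIP Jobs", "Got fired from my job"]

print(drifted(batch_a), drifted(batch_b))
```

A flagged batch does not say what went wrong. It only tells a human analyst that the premise behind the query needs to be looked at again.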
But Green also admitted: "In general, a lot of big data analysis has produced useful content. The important thing is that we set the necessary start-up procedures for the analysis and guide it onto the right track. Then it will deliver the intended results and accomplish tasks that traditional methods cannot."
Data itself is not wisdom; only after correct analysis does data reveal its significance. If people believe that large amounts of data can magically produce good analytical results without any human intervention, its negative side may come to the fore and keep us from making sound judgments.
The name Jobs is a classic case. When he died (that is, when the background and premise of the information changed), the same keyword interfered heavily with the results of the analysis, leading it to a conclusion that had nothing whatsoever to do with where it started.
A report in the "Wall Street Journal" also argues that more and more data without content is now driving people's decision-making. But the truth is not that data is useless. It is that people's motives for using data have subtly changed. It is like putting the wrong seasoning in a dish: even a small mistake completely changes the taste.
The negative side of "correlation" when no correct premise has been set has always been a hot topic in big data research. Franck, a Belgian big data expert, pointed out in an article that in some cases banks refuse a loan to a user because of that user's contacts on a website. Although the man in question had good credit, some of his friends had a record of bad debts, and this affected the bank's judgment of him. Here "correlation" hurt a citizen who would otherwise have qualified for a bank loan.
This shows that when we draw conclusions directly from correlations in data without any premise, further analysis is necessary; otherwise it may cause trouble. For example, some criminal data from the 20th century in the United States indicated that Hispanic and black men between the ages of 20 and 27 who drove entry-level luxury cars were the most likely to be drug dealers. In actual police work, however, it turned out that many African-Americans who met these criteria were not criminals but good citizens. Many of them were placed under close police surveillance, and in the end it was a false alarm.
In short, big data is an analytical tool, but we should not treat it as a solution that is always right in every situation. It can help you narrow the field, from perhaps millions down to 150 or so. But even 200 years from now, we will not be able to hand the power of "judging everything" over to the computer. We cannot rely on data alone for analysis, and we cannot ignore the unique judgment of human intelligence in the process.
If you rely on data alone, it will certainly bring you troubles that are hard to shake off, and big data will become a deadly problem in your life. Some of my friends have experienced this, and I hope others will not repeat the mistake.
Solving the problem - defining people's roles
Now that the whole book has shown, "relentlessly" or "selectively", how data is applied in many fields today, we have a clear understanding of what the era of big data means. It is an era of correlation, one that no longer puts blind faith in sample surveys but pursues analysis of the whole.
- It rests on the excellent scientific and technological foundation and the vast network platforms that human beings have built over hundreds of years.
- It leaves us with almost no secrets. This is the power technology has given it, and it reflects a choice between technology shaping people and people shaping technology.
- It makes computers more and more "smart". They can even filter for the patterns or information that suit them best and automatically improve their own operating modules, although they cannot yet rule humans.
- It makes up for the inaccuracy of individual cases with massive amounts of data, and so arrives at more accurate results.
- It creates a dialectical relationship between correlation and causation. Depending on which of the two they rely on, data processors will produce different results when predicting people's behavior, the occurrence of disease, and the arrival of disasters.
- It inevitably leads to changes in business models and political landscapes.
- It gives its users enormous authority, but this pervasive authority frightens people and may even pose a serious threat to the existing order of human civilization.
- It has changed legal thinking and created a profound problem for the judiciary concerning the presumption of innocence and the prediction of guilt, as shown in the American drama "Person of Interest".
In the era of big data, we expose a huge amount of personal information every day. Its great value lies in its secondary use, and this is precisely the level that we currently cannot supervise or remedy. How to protect necessary personal privacy and guard against collection by the big data giants is an urgent topic that everyone is discussing.
Of course, what we gain from this book may not be limited to the points above. It also lies in broadening the worldview of Chinese people and in thinking about the relationship between data and people. In the era of big data, how should ordinary Chinese people define their roles? Big data is like a powerful, ferocious beast that has just grown up and has not yet been caged. It can guard the house, but it can also harm its owner. So how should we control it?
It all comes down to the roles we choose for ourselves. In the era of big data, each of us has the opportunity to play one of four roles, but not everyone is able to make the choice that is in their best interest.
●The uninformed
They live in ignorance, unaware of the series of changes that big data has brought. They are ignorant but simple and naive, and they become the first targets of data collection. At the same time, they are detached, "happy victims" in their state of ignorance.
●Insiders
They understand what is going on in the world; for example, they follow topics and read books related to big data. In their personal lives, they know that data collectors have targeted them and that they are becoming providers of such data. They are deeply uneasy about it, but powerless.
●Participants
Participants or researchers in the big data industry know how to protect themselves and how to keep their personal information from being collected. In the eyes of such people, however, the world is always dark. They are pessimistic about the future and wary of technological progress.
●Controllers
These people are at the top of the pyramid. They control huge data resources and are the elite of the big data era. They can not only protect themselves but also act as smart data collectors and profit from it. At the very least, they will not be victims of the big data era, and they also determine the direction in which it develops. Whether to be a devil or an angel is a choice they must make for themselves.
Although people in the latter three roles may not feel happy, we all strive to be such people rather than the muddled "uninformed", who are destined to be abandoned by this era and swept by the tide of change into the most inconspicuous corner. Big data will not wait for you to mature. It will mercilessly push you aside and walk away.
For us, big data is a new gold mine and a new opportunity. Although it also carries risks, its benefits are far greater.
How do we treat it correctly without going to extremes? Let us neither exaggerate nor underestimate the impact it will have on our lives.
If it is harmful to you at this stage, stay away from it, carefully. If it is beneficial, embrace it, carefully. Become the master of big data, take control of its influence on your life, and let it become a new starting point for you!
(End of this chapter)