Big data in China
Chapter 19 Big Data and Technological Change (3)
After some thought, Greg Linden, then an engineer at Amazon, came up with a solution: there is no need to compare users with one another at all; it is enough to find the correlations between products. Because the relationships between products can be analyzed in advance, recommendations can be served very quickly, the method works for all kinds of products, and it can even recommend across product categories.
Linden said: "The book review team was defeated and disbanded. I am very sad about this. However, the data does not lie, and the cost of human review is very high."
He compared the sales driven by the book critics with those generated by the recommendation system and found that the recommendation system sold far more. This comparison led directly to Amazon disbanding the book review team; the system took their place, recommending the products most likely to appeal to each user.
Following Amazon's lead, more and more companies adopted this kind of personalized recommendation system, which rapidly advanced the development of e-commerce. Recommendation based on massive data was also one of the earliest applications of big data.
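The idea of relating products to each other instead of comparing users can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Amazon's actual system: the function names, the sample baskets, and the choice of cosine similarity over co-purchase counts are all made up for the example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_item_similarity(baskets):
    """Precompute a similarity score for every pair of items bought together."""
    item_count = defaultdict(int)   # number of baskets containing each item
    pair_count = defaultdict(int)   # number of baskets containing both items
    for basket in baskets:
        items = sorted(set(basket))
        for item in items:
            item_count[item] += 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] += 1

    similarity = defaultdict(dict)
    for (a, b), together in pair_count.items():
        # Cosine similarity on co-occurrence: 1.0 means "always bought together".
        score = together / sqrt(item_count[a] * item_count[b])
        similarity[a][b] = score
        similarity[b][a] = score
    return similarity


def recommend(similarity, item, top_n=3):
    """Look up the precomputed neighbours of one item; no user comparison needed."""
    neighbours = similarity.get(item, {})
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical purchase baskets, one list per order.
baskets = [
    ["novel", "cookbook", "bookmark"],
    ["novel", "bookmark"],
    ["cookbook", "frying pan"],
    ["novel", "bookmark", "reading lamp"],
]
similarity = build_item_similarity(baskets)
print(recommend(similarity, "novel"))
```

The expensive step, building the similarity table, runs ahead of time; serving a recommendation is then a simple lookup, which is why this kind of method can respond quickly.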
☆Big data is the basis for personalization
In fact, sufficient data is the essential foundation of a personalized business model. Without massive data, personalization is out of the question: it is hard to characterize even a small number of users, let alone the majority.
You may have heard the classic "beer and diapers" story: in supermarkets, diapers sell better when they are placed next to the beer. This conclusion came from in-depth analysis of shoppers' purchasing behavior. Such "rules" lie quietly hidden in the data; they say nothing and simply wait to be discovered. We always have to dig deep to bring them to the surface.
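A rule like "diapers and beer go together" is usually judged with three numbers: support (how often the two items appear together), confidence (how often beer appears when diapers do), and lift (how much more often than chance). The sketch below computes them on a tiny, invented set of shopping baskets; the data and the function name are assumptions for illustration only.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    total = len(transactions)
    n_a = sum(1 for t in transactions if antecedent in t)
    n_c = sum(1 for t in transactions if consequent in t)
    n_both = sum(1 for t in transactions if antecedent in t and consequent in t)

    support = n_both / total                         # share of baskets with both items
    confidence = n_both / n_a if n_a else 0.0        # P(consequent | antecedent)
    lift = confidence / (n_c / total) if n_c else 0.0  # above 1 means better than chance
    return support, confidence, lift


# Invented baskets; real analyses run over millions of receipts.
transactions = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "chips"},
    {"milk"},
    {"bread", "chips"},
]
support, confidence, lift = rule_metrics(transactions, "diapers", "beer")
print(support, confidence, lift)
```

In this toy data the lift is above 1, meaning shoppers who buy diapers pick up beer more often than shoppers in general, which is the kind of hidden "rule" the story describes.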
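Big data, however, goes a step further than traditional data mining. Just how big is big data? A set of figures titled "A Day on the Internet" tells us that in a single day, the content generated on the Internet could fill 168 million DVDs; 294 billion emails are sent (equivalent to two years of paper mail in the United States); 2 million community posts are published (equivalent to 770 years of text in Time magazine); and 378,000 mobile phones are sold, more than the 371,000 babies born worldwide each day. Big data has the following characteristics: large volume, many types of data, latent correlations among the data, high speed, and high timeliness.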
Ubiquitous data, ubiquitous networks, and large-scale distributed storage and computing power (cloud computing) faithfully record our clothing, food, housing, transportation, and social lives. Today, the amount of data humanity creates in a single day is equivalent to the amount created in the whole of the year 2000.
Do you post on Weibo, WeChat, Renren, and similar sites every day? Within 1 minute, more than 10 new microblogs were posted on Weibo; the page views of the social networking site Facebook exceeded 600 million... The users of the entire Internet and all the products on it already form a very large data space; add potentially related factors such as location, time, and weather, and the amount of data needed to understand each user's preferences is enormous. The more data there is, the more accurately users can be understood, but the harder the analysis becomes.
☆Technical challenges of internet big data processing
In fact, while you are still using social platforms such as Weibo to express your feelings or post comments, Wall Street's money-makers are already mining the "data wealth" of the Internet, using it to predict market trends ahead of others and earning handsome profits.
However, processing Internet big data is full of challenges. As mentioned above, the data is vast and complex. How can we effectively find the correlations within it, and how can we make full use of big data to meet personalized needs? These are questions each of us needs to think about.
The first thing to be clear about is what capabilities processing big data requires. For data to be consumed faster than it is generated, sufficient computing resources are necessary. The core capabilities of big data processing are high-quality computing frameworks, robust program design, and accurate algorithms, and all of these depend on skilled computing professionals.
The second is timeliness. Users generate data very quickly. How can we detect the useful data in time, respond before the user's next action, and ultimately make things more convenient for the user? This kind of timeliness requires the system to process data as a stream, which leads to a technical approach completely different from traditional batch processing of big data.
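The difference between stream and batch processing can be shown with a small sketch. Instead of recomputing statistics over all stored data, the class below updates its counts as each click arrives and forgets clicks that fall outside a recent time window. The class name, the 60-second window, and the sample events are assumptions for illustration, and the sketch assumes events arrive in time order.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Keep per-item click counts for the most recent `window_seconds` only."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, item) pairs in arrival order
        self.counts = Counter()  # current count per item inside the window

    def add(self, timestamp, item):
        """Process one event immediately, without waiting for a batch job."""
        self.events.append((timestamp, item))
        self.counts[item] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that are older than the window and undo their counts.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_item = self.events.popleft()
            self.counts[old_item] -= 1
            if self.counts[old_item] == 0:
                del self.counts[old_item]

    def top(self, n=1):
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [item for item, _ in ranked[:n]]


counter = SlidingWindowCounter(window_seconds=60)
counter.add(0, "soup")
counter.add(10, "soup")
counter.add(20, "salad")
print(counter.top())   # the most-clicked item so far
counter.add(65, "salad")
print(counter.top())   # the oldest "soup" click has expired by now
```

Each event costs only a small, constant amount of work, so the answer is up to date before the user's next action, which a nightly batch job cannot offer.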
Finally, to meet individual needs as fully as possible, the system must also offer sufficiently powerful customization. Each user's customization needs may be small, but the number of users is huge and their needs vary widely. How can each be met promptly and effectively? This requires giving users as much freedom as SQL (Structured Query Language) gives database users, so that even the smallest need can be met with a simple operation. Such customization should extend to data storage, computation, querying, and display.
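To give a feel for the freedom SQL offers, here is a minimal sketch using Python's built-in sqlite3 module. The table name, columns, and sample rows are hypothetical; the point is only that a very specific need ("the two most viewed soup pages") becomes a short query instead of a new program.

```python
import sqlite3

# In-memory database with a hypothetical page-view table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_views (user_id TEXT, url TEXT, category TEXT, viewed_at INTEGER)"
)
rows = [
    ("u1", "/soup/tomato", "soup", 1),
    ("u2", "/soup/tomato", "soup", 2),
    ("u1", "/soup/miso", "soup", 3),
    ("u3", "/rice/fried", "rice", 4),
    ("u3", "/soup/tomato", "soup", 5),
]
conn.executemany("INSERT INTO page_views VALUES (?, ?, ?, ?)", rows)


def top_pages(conn, category, limit):
    """One custom need expressed as one parameterized query."""
    query = """
        SELECT url, COUNT(*) AS views
        FROM page_views
        WHERE category = ?
        GROUP BY url
        ORDER BY views DESC, url ASC
        LIMIT ?
    """
    return conn.execute(query, (category, limit)).fetchall()


print(top_pages(conn, "soup", 2))
```

Changing the category, the limit, or the ordering changes only the query, not the system underneath, which is the kind of flexibility the paragraph above asks for.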
☆Alibaba Cloud's solution: cloud recommendation
Whether it is the computing and storage capacity needed to collect big data or the real-time computing and algorithms needed for personalization, these are not problems that webmasters and developers can solve quickly. Have you heard of the concept of the "cloud"? It amounts to storing the data each user generates in one large store and then retrieving it from the "cloud" as needed; popular examples include "cloud TV", "cloud storage", and "cloud services". Alibaba Cloud is trying to lower the threshold for personalized services through cloud services so that more webmasters and developers can have their own personalized services at low cost, and cloud recommendation is a typical example.
What is cloud recommendation? To give a simple example: on a website that presents recipes, if related recipes can be recommended while users browse a particular dish or soup, users will stay on the site longer and view more pages.
How is cloud recommendation implemented? In fact, there are several ways to find content that users are interested in:
First, mine the user access logs. Every user generates browsing records, and cloud recommendation analyzes these records to recommend a series of relevant content.
Second, recommend other popular recipes on the website, and consolidate and link identical content across different websites.
Third, look for content of a different type. If users are browsing recipes for a certain kind of soup, recipes for a certain kind of main dish can be recommended.
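The three approaches above can be sketched as three small candidate generators whose results are merged. This is an illustration only, not the service's actual implementation: the function names, the recipe titles, and the session data are invented, and the sketch covers only the popularity part of the second approach, not the linking of identical content across websites.

```python
from collections import Counter


def _rank(counts, limit):
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ordered[:limit]]


def co_visited(sessions, current, limit=2):
    """First approach: pages seen in the same browsing sessions as the current page."""
    counts = Counter()
    for pages in sessions:
        if current in pages:
            counts.update(p for p in set(pages) if p != current)
    return _rank(counts, limit)


def most_popular(sessions, current, limit=2):
    """Second approach: the most visited pages on the site overall."""
    counts = Counter(p for pages in sessions for p in set(pages) if p != current)
    return _rank(counts, limit)


def cross_category(sessions, categories, current, limit=2):
    """Third approach: popular pages from a different category than the current page."""
    counts = Counter(
        p for pages in sessions for p in set(pages)
        if categories[p] != categories[current]
    )
    return _rank(counts, limit)


def candidates(sessions, categories, current):
    """Merge the three lists, keeping the first occurrence of each page."""
    merged = []
    sources = (
        co_visited(sessions, current),
        most_popular(sessions, current),
        cross_category(sessions, categories, current),
    )
    for source in sources:
        for page in source:
            if page not in merged:
                merged.append(page)
    return merged


# Hypothetical access log: one list of viewed recipes per visit.
sessions = [
    ["tomato soup", "miso soup"],
    ["tomato soup", "miso soup", "fried rice"],
    ["fried rice", "noodles"],
    ["noodles", "fried rice", "chicken soup"],
]
categories = {
    "tomato soup": "soup", "miso soup": "soup", "chicken soup": "soup",
    "fried rice": "meal", "noodles": "meal",
}
print(candidates(sessions, categories, "tomato soup"))
```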
However, with traditional approaches, such recommendations require a great deal of manual editing; they cannot be produced instantly, and good results are hard to guarantee. An accurate recommendation model must weigh both the overall effectiveness of each method and each user's preference for the results of the various methods, so as to find the model that suits each user and ultimately let users enjoy a "thousand people, thousand faces" personalized service in the recommendation slot.
So, can ordinary people use the cloud recommendation service? Yes. If you want to try it, simply register and apply on the cloud recommendation website, obtain a ten-digit application ID such as "1000001234", and embed the code generated by the system in your web page to get personalized recommendations. The process usually takes about 1 minute.
The rest, of course, is handed over to the cloud system. It begins an in-depth analysis of the website and continually and automatically adjusts the models and weights of the recommendation methods based on how often the displayed recommendations are clicked.
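How weights might follow clicks can be sketched with a very simple heuristic: track how often each recommendation method's results are shown and clicked, and give each method a weight proportional to its click-through rate. The text does not say how the cloud system actually does this, so the class below, its name, and its smoothing values are purely assumptions for illustration.

```python
class MethodWeights:
    """Weight each recommendation method by its smoothed click-through rate."""

    def __init__(self, methods, prior_clicks=1, prior_impressions=2):
        # The small prior avoids division by zero and gives every
        # method a neutral starting rate of 0.5 before data arrives.
        self.stats = {m: [prior_clicks, prior_impressions] for m in methods}

    def record(self, method, clicked):
        """Update the statistics after one recommendation was displayed."""
        stats = self.stats[method]
        stats[1] += 1          # one more impression
        if clicked:
            stats[0] += 1      # one more click

    def weights(self):
        """Return weights that sum to 1, proportional to click-through rate."""
        rates = {m: clicks / shown for m, (clicks, shown) in self.stats.items()}
        total = sum(rates.values())
        return {m: rate / total for m, rate in rates.items()}


tracker = MethodWeights(["co_visited", "popular"])
for clicked in [True] * 6 + [False] * 2:
    tracker.record("co_visited", clicked)
for clicked in [True] * 2 + [False] * 6:
    tracker.record("popular", clicked)
print({m: round(w, 2) for m, w in tracker.weights().items()})
```

Methods whose recommendations are clicked more often gradually receive more weight, with no manual editing involved; a production system would likely use a more careful scheme that keeps testing weaker methods as well.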
In the cloud recommendation management interface, website developers can customize parameters such as the size of the recommendation slot, the number of recommended items, the URL range, and the display style. Webmasters can also view clicks on the recommendation slots and adjust the slot parameters accordingly to improve performance.
If you are a professional website operator or administrator, you should know that the cloud recommendation service also provides plug-ins for mainstream site-building tools such as WordPress. After installing the plug-in, developers can operate and manage the cloud recommendation features from the tool's own management interface. According to back-end statistics, a website's overall traffic increases by 10% after cloud recommendation is enabled. This kind of personalized service feels like earning interest on money in the bank, and it shows the appeal of big data. I believe that as data and users continue to accumulate, personalized services in the era of big data can bring far greater surprises than a 10% increase in traffic!
Now you can probably see what lies behind cloud recommendation. Its basis is still the data each user generates: which websites you browse, what you publish online, and which articles you click all form the foundation of this massive data. Just as the sea takes in all rivers, the data generated by thousands upon thousands of netizens flows together and becomes "big data". Its nature is that simple, yet that vast.
Once big data has been analyzed, integrated, and put to use by specialized analysis software, it ends up as the "Guess you like" section every user sees. By now you are probably no longer surprised. The secret is that simple!
☆Is personalization really safe?
Everything has a good side and a bad side, and big data is no exception. However wonderful the applications it enables and however convenient the functions it provides, it has drawbacks that we need to avoid and correct.
The one that concerns you most is surely privacy. While big data makes our lives far more convenient, it also seriously threatens our personal privacy. We have to use the Internet, yet we fear our privacy will be laid bare. The PRISM incident made us rethink security in the age of big data. As the PRISM affair unfolded, everyone was surprised to find how far US network surveillance had gone in today's big data world, leaving every other country far behind.
There is also the issue of root servers. In 1969, four host computers at four institutions in the southwestern United States (the University of California, Los Angeles; the Stanford Research Institute; the University of California, Santa Barbara; and the University of Utah) were connected. This network was the earliest form of the Internet. Today China's Internet users number in the hundreds of millions, the most in the world, yet of the 13 root servers worldwide that manage the Internet's main directory, the 1 primary root server is in the United States, and of the remaining 12 secondary root servers, 9 are in the United States; none of them is in China.
The United States has fewer than half as many Internet users as China, yet it has 28 times as many web hosts. With the United States controlling most of the root servers, China's network security is a cause for concern. This is also one of the decisive reasons personalization carries risk: we cannot control the status quo, like a person being held by the throat.
First, media tycoon Rupert Murdoch's News of the World was closed over the phone-hacking scandal; then came the Snowden incident. Both put our already worrying network environment to the test. As a result, everyone is left with questions: is the network secure in a big data environment? Is our privacy still private? How to use the Internet safely and strike a balance between the use of big data and user privacy is a question we need to think about further.
(End of this chapter)