China’s oceans of everyday data

In his book AI Superpowers, Kai-Fu Lee outlines what the current generation of artificial intelligence (AI) needs to succeed, and why China has the ingredients to lead in the area. Lee is CEO of Sinovation Ventures, a firm that finances Chinese high-tech startups. He was educated in the United States and was previously president of Google China.

Data-based AI

In the early days of AI, researchers directed their attention to extracting and coding rules, heuristics and methods of logical inference to automate smart decision making. Now rival models have taken over, based on “learning” from lots of data. The result of this learning is a black box of numerical values, held in a networked data structure, that generates outputs for given inputs.

There’s no “knowledge base” of scrutable rules, and no one interviews experts to elicit their knowledge and experience. Think of these neural network methods as automated classification.

In the case of image recognition, you feed a computer learning algorithm lots of photographs of houses, labelled as such. In the process the algorithm adjusts its neural network so that when it is fed a new image the system can correctly identify that there’s a house in the picture. You don’t expose the algorithm only to house examples, but also to many other pictures labelled with different content: trees, dogs, people, dragons, CT scans.
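The training process described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any real image recognition system: it uses a single-layer perceptron rather than a deep network, only two labels, and made-up four-value “images”. The names TRAINING_SET, train and classify are hypothetical.

```python
# Toy "images": four brightness values each, labelled by hand.
# The data is invented purely for illustration.
TRAINING_SET = [
    ([0.9, 0.8, 0.1, 0.2], "house"),
    ([0.8, 0.9, 0.2, 0.1], "house"),
    ([0.7, 0.9, 0.0, 0.3], "house"),
    ([0.1, 0.2, 0.9, 0.8], "tree"),
    ([0.2, 0.1, 0.8, 0.9], "tree"),
    ([0.3, 0.0, 0.9, 0.7], "tree"),
]


def train(examples, epochs=20, rate=0.1):
    """Adjust the weights a little each time a labelled example is misclassified."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, label in examples:
            target = 1 if label == "house" else 0
            activation = sum(w * p for w, p in zip(weights, pixels)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # 0 when the guess was right
            weights = [w + rate * error * p for w, p in zip(weights, pixels)]
            bias += rate * error
    return weights, bias


def classify(model, pixels):
    """Label a new, unseen image using the learned numbers."""
    weights, bias = model
    activation = sum(w * p for w, p in zip(weights, pixels)) + bias
    return "house" if activation > 0 else "tree"


model = train(TRAINING_SET)
print(classify(model, [0.85, 0.75, 0.15, 0.10]))
print(classify(model, [0.10, 0.15, 0.85, 0.90]))
```

The learned model is nothing more than a short list of numbers, which is the sense in which the result is a black box: the numbers classify new inputs, but they do not read as rules.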

AI developers apply similar techniques to sound files, voices, handwriting, EEG brain signals, clicks on a computer screen, social media feeds, and symptom data. The successes are obvious in the now ubiquitous automated face recognition on smartphones, in photo sharing, and in voice-activated commands. The technique is often called “deep learning.”

The term “deep” refers to the many layers of the neural network, which enhance its ability to detect features in the data, even those that a human being would not be able to detect. A seminal technical paper by Geoffrey Hinton and colleagues demonstrated the method in reading handwritten digits.
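As a rough sketch of what layering adds, the Python below passes an input through a stack of layers, each feeding the next. The weights are hand-picked for illustration rather than learned, and the names step, layer, forward and LAYERS are hypothetical. A real deep network has many more layers and units, and learns its weights from data. The hidden layer here detects two simple features (“at least one input is on” and “both inputs are on”), and the output layer combines them to detect “exactly one input is on”, a pattern that no single layer of this kind can pick out on its own.

```python
def step(x):
    """A unit fires (1) when its weighted input is positive, else stays off (0)."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """Compute the output of every unit in one layer."""
    return [
        step(sum(w * i for w, i in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Each entry is (weights, biases) for one layer. Values are hand-picked, not learned.
LAYERS = [
    # Hidden layer: unit 0 detects "at least one on", unit 1 detects "both on".
    ([[1, 1], [1, 1]], [-0.5, -1.5]),
    # Output layer: fires for "at least one on" but not "both on".
    ([[1, -1]], [-0.5]),
]


def forward(inputs, layers=LAYERS):
    """Pass the input through each layer in turn; deeper networks simply add layers."""
    values = inputs
    for weights, biases in layers:
        values = layer(values, weights, biases)
    return values[0]


for pair in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pair, forward(pair))
```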

What AI wants

For everyday applications, AI needs several ingredients according to Lee: (i) an application domain amenable to its methods, (ii) people who can develop smart learning algorithms tweaked for the domain under consideration, (iii) powerful computers with lots of storage capacity, and (iv) lots of data that comes already identified and classified.

There are many domains to which AI techniques can be applied, but one of the most interesting is in the markets for consumer products. The returns are high and there is data in abundance. According to Lee, this is where China is at a great advantage.

China already has the pool of talent able to develop the algorithms, with many well-trained engineers and entrepreneurs returning home from education and experience abroad to complement home-grown talent. Powerful computing is now ubiquitous, with much of it built in Chinese factories. But the main advantage is the quantity of data.

“China’s alternate digital universe now creates and captures oceans of new data about the real world. That wealth of information on users — their location every second of the day, how they commute, what foods they like, when and where they buy groceries and beer — will prove invaluable in the era of AI implementation. It gives these companies a detailed treasure trove of these users’ daily habits, one that can be combined with deep-learning algorithms to offer tailor-made services ranging from financial auditing to city planning” (17).

Integrated apps and data

Top-down governance, loose privacy rules, and a large population all contribute to this plethora of data. Lee also shows how fierce competition between Chinese software developers drives them to incorporate more and more features into their apps. WeChat provides a potent example.

WeChat, a product of the Chinese company Tencent, is ostensibly a social media messaging app for smartphones. For Chinese residents, it now constitutes a “super-app installed on virtually everyone’s smartphone, an all-in-one portal to the Chinese mobile ecosystem” (61). Tencent managed to induce consumers to link the app to their bank accounts. So, there’s a WeChat e-wallet, payment and financial services, and mobile phone top-up. The app can also read QR codes and provides maps, location services, gaming, and online dating, as well as the usual features of text, video, and picture messaging.

Lee thinks that, added to China’s other advantages, this profusion of interconnected data will inevitably accelerate opportunities for AI learning algorithms there. I’ll look at some of his observations about the impact of cultural differences in the next post.


  • Hinton, Geoffrey, Simon Osindero, and Yee-Whye Teh. 2006. A fast learning algorithm for deep belief nets. Neural Computation, 18(7): 1527-1554.
  • Lee, Kai-Fu. 2018. AI Superpowers: China, Silicon Valley, and the New World Order. Boston, MA: Houghton Mifflin Harcourt.


  • The photograph above was taken on 26 January 2020 during the weekend Chinese New Year celebrations in the Chinatown area of London. The pandemic hadn’t yet hit the UK.
