Data is often cited as a strength of the Chinese AI ecosystem, but little attention has been paid to the exact nature and source of that strength. Data is a multi-dimensional input into AI systems. Its dimensions include breadth (the quantity of users), depth (the amount and nature of data on each user), quality (how well-structured and well-labelled the data is), and diversity (the heterogeneity of the user base), among others.
Chinese companies are not equally capable across these dimensions; for instance, they often lag behind their American peers in the quality and diversity of their data. But the Chinese ecosystem is particularly strong in the depth of data on each person. In practice, this means that many aspects of a Chinese citizen's daily life are captured in digital form and can then be used either to optimize AI-powered products or to support government surveillance.
WeChat is primarily a messaging app with over 1 billion monthly active users, but in recent years it has consolidated other functions and become a super-app. It is developed by Tencent, one of China's biggest technology companies, which dominates social media and online gaming.
China’s sprawling network of surveillance cameras has long served as a tool of traditional surveillance and urban management, but AI has now removed the need for human involvement in monitoring some of those cameras.
Facial and object recognition software turns the pixels of that footage into data points that AI can understand, noting each bike, car, and face that passes through the frame. Although AI has been applied to only a portion of the country’s surveillance cameras, this “data-tization” of China’s public spaces and roads still yields rich data that can be used for purposes both mundane and alarming.
In the mundane case, Alibaba’s “City Brain” uses data from surveillance cameras to optimize traffic and route emergency responders in Hangzhou. Similar technology has been used to catch alleged criminals at concerts and beer festivals, as well as to name and shame jaywalkers around the country. On the most alarming end of the spectrum, facial recognition has been combined with authoritarian policing in a campaign of widespread surveillance and detention of ethnic Uyghurs in “re-education” camps in Xinjiang. Future updates to ChinAI will look further at the ethical implications of facial recognition and other surveillance technologies.