Communications of the ACM

China Region Special Section: Big Trends

People Logistics in Smart Cities

circuit board cityscape

Credit: Tostphoto

Cities in China are growing rapidly in terms of both size and complexity. Governments have been searching for new technologies to make cities more efficient, and smart mobility has been the top priority in all solutions.

The past few years have seen a paradigm shift for smart mobility in China: data-centric companies, mostly Internet companies, are taking the leading role in such initiatives in place of governments and academic researchers. For example, Alibaba, Baidu, Tencent, Ctrip, and Didi, among others, are spearheading smart mobility initiatives. The driving force is twofold: these companies have accumulated a huge volume of data, and they have invested a great deal of resources in the AI arena. Their main focus is on AI systems that can predict city traffic conditions with full spatiotemporal coverage and optimize transportation systems accordingly.

Smart mobility initiatives are blossoming because the potential of big city traffic data has not yet been fully mined. Researchers are still pursuing a better way to break down data silos while also preserving privacy. Fortunately, we have seen significant efforts from both industry and governments in China to promote data sharing for innovation.

Here, we focus on the smart mobility scenarios that are representative of cities in China. We elaborate on two aspects: in-city and intercity transport. For the in-city scenario, we take Alibaba's City Brain program as an example to show how big city data can be used to optimize traffic signal scheduling and accelerate access for life-saving emergency vehicles (EVs). For the intercity scenario, we focus on the weekend/holiday crowdedness unique to China, and show how big tourism data can be leveraged to divert tourists to more suitable attractions and avoid traffic congestion.

Real-Time and Holistic Situational Awareness on City Traffic

High-quality, real-time, and holistic traffic condition sensing is the prerequisite for further transport optimization. To overcome the limited spatiotemporal coverage of traditional sensor data, data from active navigation apps, used in both private cars and taxis, is employed to rebuild trajectories using map-matching techniques, which can cover almost the entire road network of a city. Trajectory data can be used for various purposes, such as traffic parameter extraction (speed/volume) or origin-destination (OD) analysis. Various traffic indices can be generated from these parameters, such as a delay index that measures the travel time delay compared with the uncongested situation, and an imbalance index that measures the speed difference between upstream and downstream road links. By monitoring the indices, abnormal traffic conditions can be detected and flagged for attention in real time.
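As a rough illustration of the two indices just described, the sketch below computes a delay index and an imbalance index from per-link speed aggregates. The `LinkObservation` type, its field names, and the example numbers are hypothetical, and normalizing the imbalance by free-flow speed is an assumption rather than a detail confirmed here.

```python
from dataclasses import dataclass


@dataclass
class LinkObservation:
    """Aggregated trajectory statistics for one road link in one time window."""
    length_m: float        # link length in meters
    speed_mps: float       # observed mean speed in the window
    free_speed_mps: float  # free-flow (uncongested) speed of the link


def delay_index(link: LinkObservation) -> float:
    """Observed travel time divided by free-flow travel time (1.0 means no delay)."""
    if link.speed_mps <= 0 or link.free_speed_mps <= 0:
        raise ValueError("speeds must be positive")
    actual_time_s = link.length_m / link.speed_mps
    free_time_s = link.length_m / link.free_speed_mps
    return actual_time_s / free_time_s


def imbalance_index(upstream: LinkObservation, downstream: LinkObservation) -> float:
    """Normalized speed (observed / free-flow) of the upstream link minus the downstream link."""
    return (upstream.speed_mps / upstream.free_speed_mps
            - downstream.speed_mps / downstream.free_speed_mps)


# Hypothetical example: a congested upstream link feeding a freely flowing downstream link.
upstream = LinkObservation(length_m=500.0, speed_mps=5.0, free_speed_mps=15.0)
downstream = LinkObservation(length_m=500.0, speed_mps=12.0, free_speed_mps=15.0)
print(delay_index(upstream))
print(imbalance_index(upstream, downstream))
```

A monitoring job could compare such values against per-link thresholds to raise alerts; how those thresholds are chosen is not specified in the text.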

Navigation data is in the hands of the leading map/transport service providers in China, for example, AutoNavi (Alibaba), Baidu, Tencent, and Didi. All of these companies have participated in smart mobility initiatives. For example, in the City Brain project, AutoNavi (footnote a) provides near real-time traffic data services (traffic parameters are updated at two-minute intervals) for traffic sensing and optimization purposes.

Another important type of data is CCTV video. Though not a new data source, it is currently undergoing an AI transformation. Tens of thousands of CCTV cameras are deployed in each big city in China for traffic surveillance. Traditionally, they only capture snapshots rather than perform analytical jobs, so human intervention is still required to review and interpret current traffic conditions. This is a very tedious and error-prone job, and it becomes even more challenging as the number of cameras continues to increase.

Thanks to advancements in computer vision technology and elastic cloud computing, it is possible to empower computers to do the more analytical work usually handled by humans.2 Alibaba's City Brain project adopted a cloud-based architecture to stream the large volume of video data into the cloud, where it is processed in parallel to generate structured results such as speed/volume, queue length, and incidents. The results can be used in three ways: data fusion and cross-validation with other data sources, for example, speed/volume; input for other applications, for example, queue length can be used by traffic signal optimization; and incident detection/tracking, which greatly improves the efficiency of traffic surveillance and frees up human labor (see Figure 1).

Figure 1. Incident detection and tracking in Hangzhou City Brain.

AI-based CCTVs can also be used in many other scenarios such as security and policing. For example, if a car flees the scene of an accident, the cameras can collaboratively track it in real time. Face recognition technology can also be applied to CCTV data: even though a human face is typically a very small and blurry part of the camera image that is difficult for humans to recognize, computers can still achieve a clearer view. Many unicorn start-up companies have emerged in this area, such as SenseTime, Hikvision, and Dahua.

About 80% of serious traffic congestion is caused by accidents. A quick response to accidents and anomalies is therefore a very effective way to improve traffic conditions. Moreover, once multiple data sources are integrated to generate a holistic view of city traffic, this not only improves the performance of traffic management but also enables further applications that can systematically use the data to optimize city transport.

Self-Adaptive Traffic Signal Optimization

The traffic signal is one of the most important means of directing city traffic. The dominant traffic signal systems run on a cycle/split-based scheme: each split corresponds to a phase within which only certain directions are allowed. The problem is then how much (green) time to allocate to each phase. Existing signal systems either run on fixed timing schedules or rely on fixed loop detectors for self-adaptive scheduling. However, loop detectors and speed cameras are fixed at certain locations, so their data is nearsighted, while the root cause of a traffic jam may lie far away. As holistic traffic sensing is now possible by fusing multiple data sources, there are a few new ideas for optimizing traffic signals.

The first idea is to balance the traffic conditions of the upstream and downstream roads. As depicted in Figure 2, each driving direction, for example, east to north, has an upstream link and a downstream link. The imbalance value of each driving direction is defined as the difference between the normalized speeds (actual speed over free speed) of the upstream and downstream road links.


Given m signal phases, n driving directions, and the many-to-many relations between them, Equation 1 specifies the objective function for the optimization. The goal is to find the best incremental green time allocation for the m signal phases. Here, $v^{\mathrm{up}}_{i,t}$ and $v^{\mathrm{down}}_{i,t}$ are respectively the actual speeds of the upstream and downstream roads of driving direction i at time t; $\bar{v}^{\mathrm{up}}_{i}$ and $\bar{v}^{\mathrm{down}}_{i}$ are the corresponding free speeds; $d_{i,t}$ is the imbalance value for driving direction i at time t; and $e_{i,t}$ represents the expected incremental green time for direction i, where β is a hyperparameter. The objective is to minimize the total (volume-weighted sum) difference between the real allocation and the expected allocation over all driving directions, where $q_{i,t}$ is the volume of direction i at time t, and $s_i$ is the set of all phases related to direction i. Finally, A and B, which are also hyperparameters of the signal system, ensure the total cycle time of the new schedule stays within an acceptable range.
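One plausible way to write the objective described above is sketched below. All notation is assumed for illustration: $x_j$ is the incremental green time of phase j, $v^{\mathrm{up}}_{i,t}$ and $v^{\mathrm{down}}_{i,t}$ are the actual upstream and downstream speeds of direction i at time t, $\bar{v}^{\mathrm{up}}_{i}$ and $\bar{v}^{\mathrm{down}}_{i}$ the free speeds, $d_{i,t}$ the imbalance value, $e_{i,t}$ the expected incremental green time, $q_{i,t}$ the volume, and $s_i$ the set of phases related to direction i. The absolute-value penalty, the linear form $e_{i,t} = \beta d_{i,t}$, and the sign convention of the imbalance are likewise assumptions, not details confirmed by the text.

```latex
\begin{aligned}
d_{i,t} &= \frac{v^{\mathrm{up}}_{i,t}}{\bar{v}^{\mathrm{up}}_{i}}
         - \frac{v^{\mathrm{down}}_{i,t}}{\bar{v}^{\mathrm{down}}_{i}},
\qquad e_{i,t} = \beta \, d_{i,t}, \\
\min_{x_1,\dots,x_m} \;& \sum_{i=1}^{n} q_{i,t} \,
   \Bigl| \sum_{j \in s_i} x_j - e_{i,t} \Bigr|
\quad \text{subject to} \quad
A \le \sum_{j=1}^{m} x_j \le B .
\end{aligned}
```

Under this reading, the sign of β decides whether a slower upstream link earns extra green time, and the constraint keeps the net change of the cycle length between A and B.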

Figure 2. Intersection with four entrances.

Another idea is based on the partition-and-conquer paradigm that applies to large-scale optimization, for example, of a city or district. One of its applications is the so-called greenwave, in which multiple traffic signals are coordinated to reduce the number of stops. The coordination is achieved by setting an appropriate phase difference between two traffic signals with the same cycle time, so that a vehicle traveling at normal speed can drive through the next traffic signal without stopping. A greenwave is normally applied to arterial roads, where a large volume of traffic crosses consecutive traffic signals. The key to implementing a greenwave is to identify the route that maximizes the performance gain (normally the route with the maximal volume), which can be identified from trajectory data.
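The phase difference mentioned above can be illustrated with a minimal sketch, assuming a constant "normal" travel speed and a shared cycle time. The function names and the example distances are hypothetical; a production system would also account for queues and turning traffic.

```python
def greenwave_offset(distance_m: float, speed_mps: float, cycle_s: float) -> float:
    """Phase difference (seconds) so a vehicle leaving one signal at the start
    of green reaches the next signal at the start of its green."""
    if speed_mps <= 0 or cycle_s <= 0:
        raise ValueError("speed and cycle time must be positive")
    travel_time_s = distance_m / speed_mps
    return travel_time_s % cycle_s


def arterial_offsets(distances_m, speed_mps, cycle_s):
    """Offsets relative to the first signal for consecutive signals on one route."""
    offsets = [0.0]
    for distance_m in distances_m:
        step = greenwave_offset(distance_m, speed_mps, cycle_s)
        offsets.append((offsets[-1] + step) % cycle_s)
    return offsets


# Hypothetical arterial road: four signals spaced 400 m, 600 m, and 500 m apart,
# a normal speed of 12.5 m/s (45 km/h), and a common 90-second cycle.
offsets = arterial_offsets([400, 600, 500], speed_mps=12.5, cycle_s=90)
print(offsets)
```

Offsets are taken modulo the cycle time because a signal's schedule repeats every cycle.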

The philosophy underlying the arterial-road greenwave can be ported to an area of arbitrary shape. Figure 3 shows the result of the traffic signal partition in the Huangpu district of Shanghai. Navigation trajectories can be used to derive the volume data between each pair of traffic signals, which are used as the input of the network partition algorithms.1,5 The result is a good starting point for further optimizations. For example, for the in-group coordination, a very simple idea is to rank the adjacent traffic signal pairs by their connectivity and set appropriate phase differences one by one; for the intergroup coordination, we focus on the traffic signals on the boundary, and take into consideration the overall traffic conditions in each group.
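The "very simple idea" for in-group coordination can be sketched as a greedy procedure: rank adjacent signal pairs by volume and fix phase differences one pair at a time, letting stronger pairs win any conflict. The tuple layout, the use of travel time as the desired phase difference, and the example data are assumptions for illustration only.

```python
def coordinate_group(pairs, cycle_s):
    """Greedy in-group coordination.

    pairs: list of (upstream_signal, downstream_signal, volume, travel_time_s).
    Returns a dict mapping each signal to its offset (seconds) within the cycle.
    """
    ranked = sorted(pairs, key=lambda p: p[2], reverse=True)
    offsets = {}
    while True:
        # Strongest pair with exactly one coordinated endpoint extends the plan.
        step = next((p for p in ranked if (p[0] in offsets) != (p[1] in offsets)), None)
        if step is None:
            # Otherwise start from the strongest pair that is still uncoordinated.
            step = next((p for p in ranked
                         if p[0] not in offsets and p[1] not in offsets), None)
            if step is None:
                return offsets  # every signal has an offset
            offsets[step[0]] = 0.0
        up, down, _volume, travel_s = step
        if up in offsets:
            offsets[down] = (offsets[up] + travel_s) % cycle_s
        else:
            offsets[up] = (offsets[down] - travel_s) % cycle_s


# Hypothetical group of four signals; volumes would come from trajectory data.
pairs = [
    ("A", "B", 120, 30.0),
    ("B", "C", 300, 40.0),
    ("C", "D", 80, 50.0),
    ("A", "D", 20, 25.0),
]
plan = coordinate_group(pairs, cycle_s=90)
print(plan)
```

In this example the lowest-volume pair (A, D) does not get its ideal phase difference, because both of its signals were already fixed by higher-volume pairs.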

Figure 3. Enhancing traffic signals.

More and more cities in China have benefitted from such efforts, including Hangzhou, Suzhou, Guangzhou, Shanghai, and Wuhan. Take Hangzhou as an example: its City Brain system processes a large volume of data, including more than one million trajectories, video streams from more than 2,000 cameras, and data from many other traditional sensors. It reports around 2,500 events daily with 95% accuracy. The average travel time of all trips in the city has been reduced by 15.3%.

On-Demand Greenwave for Emergency Vehicles

The response time of emergency vehicles (EVs) is critical to saving lives. Governments across the globe set ambitious response time targets. The National Health Service (NHS) of the U.K. set a target of eight minutes for the most serious medical calls. New York City mandates a 10-minute response time on emergency calls. In Singapore, in 87.1% of cases, an EV arrives within 11 minutes. As the last few years have seen rapid urbanization in China, the demand for faster EV response times continues to rise.

Functionally, there are two basic approaches to reducing response time: optimizing the route for EVs to avoid traffic, obstacles, and any other risks; and preempting traffic signal systems to allow EVs to pass swiftly through intersections. Both approaches remain challenging: the estimated time of arrival (ETA) used by routing algorithms can be delayed by ever-changing traffic conditions, and signal preemption must be dynamic and precise with respect to traffic conditions to avoid a negative impact on the overall traffic flow.

The time-dependent vehicle-routing problem (TDVRP) has long been researched.4,6 Traditional time-varying path searching algorithms are too optimistic: vehicles are expected to drive exactly at the predicted speed. In reality, the actual travel time at each individual road link can slightly vary from expected values. As illustrated in Figure 4, the speed prediction has a significant variance, which indicates a variable speed for actual driving. Cumulatively, this can lead to a large difference between the ETA and the actual arrival time. The higher the variance between arrival and ETA, the higher the risk for real people in critical conditions. Therefore, the question is: How best to plan a route that is fast and robust on ETA?

Figure 4. Exemplar speed observation generated by AutoNavi.

An improved route-searching algorithm can answer that question. Instead of trying to minimize merely the overall travel time, the variance of ETA is also taken into consideration.


Equation 2 is the revised objective function for selecting a path j from a total of N candidates to minimize the weighted sum of the mean (μ) and standard deviation (σ) of travel time. A path $p_j$ is represented by a sequence of nodes $(v^j_1, v^j_2, \dots, v^j_{n_j})$, where $n_j$ is the number of nodes of path $p_j$, and α is the weight, which is a hyperparameter. $t^j_i$ is the arrival time at the ith node of path j, and thus $t^j_{n_j}$ is the ending time of path $p_j$.
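To make the trade-off concrete, the following sketch scores each candidate path by mean travel time plus α times its standard deviation and picks the minimum. It assumes travel-time samples are already available per path; the path names and numbers are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def robust_cost(travel_times_s, alpha):
    """Mean travel time plus alpha times its (population) standard deviation."""
    return mean(travel_times_s) + alpha * pstdev(travel_times_s)


def select_path(candidates, alpha):
    """Return the name of the candidate path with the lowest robust cost.

    candidates: dict mapping a path name to a list of sampled travel times (s).
    """
    return min(candidates, key=lambda name: robust_cost(candidates[name], alpha))


# Hypothetical samples: one path is faster on average but far less predictable.
candidates = {
    "fast_but_risky": [300, 420, 310, 500, 290],
    "steady": [380, 390, 385, 395, 380],
}
print(select_path(candidates, alpha=0.0))
print(select_path(candidates, alpha=1.0))
```

With α = 0 the rule reduces to plain shortest expected travel time; a larger α increasingly favors the predictable route.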

The key to solving the equation is to calculate the distribution of the ending time. Consider a simple case of traveling from node A to node B: given the arrival time distribution at A, a time-varying speed function on edge AB, and a random perturbation imposed on the speed (a speed offset that follows a normal distribution with mean 0), compute the arrival time distribution at node B. Once this is solved, the whole search algorithm can apply it iteratively, that is, from the arrival time $t^j_i$ at the ith node to the ending time $t^j_{n_j}$ of the path, where $t^j_i$ is a fixed value. This problem can be modeled as a continuous Markov process.
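The A-to-B step can be approximated numerically. The sketch below swaps the continuous Markov formulation for plain Monte Carlo sampling, and it simplifies further by fixing the speed at the moment the vehicle enters the edge. The speed function, noise level, and edge lengths are hypothetical.

```python
import random
from statistics import mean, pstdev


def propagate_arrivals(arrivals_at_a, edge_length_m, speed_at, speed_sigma, rng,
                       min_speed_mps=1.0):
    """Turn sampled arrival times at node A into sampled arrival times at node B.

    speed_at(t) is the predicted speed (m/s) when entering the edge at time t;
    a zero-mean normal offset with standard deviation speed_sigma is added.
    """
    arrivals_at_b = []
    for t_a in arrivals_at_a:
        speed = max(min_speed_mps, speed_at(t_a) + rng.gauss(0.0, speed_sigma))
        arrivals_at_b.append(t_a + edge_length_m / speed)
    return arrivals_at_b


def predicted_speed(t_s):
    """Hypothetical time-varying speed: slower during the first ten minutes."""
    return 8.0 if t_s < 600 else 14.0


rng = random.Random(42)
at_a = [0.0] * 1000  # fixed departure time at the first node
at_b = propagate_arrivals(at_a, 800.0, predicted_speed, 1.5, rng)
at_c = propagate_arrivals(at_b, 1200.0, predicted_speed, 1.5, rng)
print(mean(at_c), pstdev(at_c))
```

Chaining the call edge by edge gives an empirical distribution of the ending time, from which the mean and standard deviation used by the route search can be read off.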

As the EV travels along the planned route, it constantly communicates with the control center and shares its location and speed (via GPS devices). The control center fuses the real-time feedback with historical data to predict the ETA at the next traffic signal junction, and informs the signal control system to prioritize the EV's driving direction. The key challenge of this task is twofold: how to determine the most appropriate time to start the green signal so that the residual vehicle queue is cleared before the EV arrives; and how to minimize the impact on the opposing driving directions.

The residual queue length is defined as the length of the queue of vehicles that fail to pass the junction within one cycle. Video analytics is one way to detect the queue length, and trajectory data can also be used to estimate the queue length where cameras are missing. Once the queue length is determined, the control system can gradually allocate extra green time to clear the queue before the EV arrives.
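A back-of-the-envelope version of the queue-clearing step is sketched below. The vehicle spacing, discharge headway, start-up lost time, and safety margin are textbook-style placeholder values chosen for illustration, not parameters reported for City Brain.

```python
import math


def queue_clearance_time_s(queue_length_m, vehicle_spacing_m=7.0,
                           discharge_headway_s=2.0, startup_lost_time_s=2.0):
    """Green time needed to discharge a standing queue of the given length."""
    vehicles = math.ceil(queue_length_m / vehicle_spacing_m)
    if vehicles <= 0:
        return 0.0
    return startup_lost_time_s + vehicles * discharge_headway_s


def latest_green_start_s(ev_eta_s, queue_length_m, safety_margin_s=5.0):
    """Latest time (seconds from now) to turn green so the queue clears before the EV."""
    needed = queue_clearance_time_s(queue_length_m) + safety_margin_s
    return max(0.0, ev_eta_s - needed)


def extra_green_per_cycle_s(queue_length_m, cycles_until_arrival):
    """Spread the clearance time evenly over the cycles left before the EV arrives."""
    if cycles_until_arrival < 1:
        raise ValueError("need at least one cycle before arrival")
    return queue_clearance_time_s(queue_length_m) / cycles_until_arrival


# Hypothetical case: a 70 m residual queue and an EV expected in 60 seconds.
print(queue_clearance_time_s(70.0))
print(latest_green_start_s(60.0, 70.0))
print(extra_green_per_cycle_s(70.0, 2))
```

Spreading the extra green over several cycles reflects the gradual allocation described above, at the cost of needing an earlier and therefore less certain ETA.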

To minimize the negative impact introduced by signal preemption, the algorithm dynamically searches for an optimized solution that balances the overall green time allocated to each phase, rather than simply dwelling on the target phase and causing problems for other directions. This problem is modeled as a mixed integer programming problem, which aims at smoothing the change of signal scheduling by starting the preemption as soon as the ETA's variance is limited to a certain range.

Our test in the Xiaoshan District of Hangzhou has shown a significant improvement in EV travel times, as illustrated in the accompanying table. The test was conducted on a route from (30.138384, 120.280503) to (30.186592, 120.266079) (lat/lon), along which there are 19 traffic signals.

Table. Reduction of response time from the field test in Hangzhou City's Xiaoshan District.

Tourism Recommendation to Solve the Holiday Crowdedness

During public holidays in China, popular tourist cities are flooded with visitors, whose numbers can swell to several times the number of residents. Increased needs for accommodation, food, and entertainment exert extreme pressure on the local environment and public services, especially transportation. Take China's National Day (an annual weeklong holiday beginning October 1) for example: In 2015, visitors to Huangshan mountain spent nine hours on average waiting in line. In 2016, more than 25 million tourists visited Chongqing, a western metropolis with a resident population of 30 million. In 2017, the traffic congestion on the Hukun expressway stretched to a maximum of 49.73 km. Such overcrowding leads to problems such as pollution, congestion, and loss of open spaces, and causes inconvenience and negative experiences for both tourists and local residents.

As the largest online travel agency in China, Ctrip discovered an insight from its big data: the distribution of tourists is out of balance with the collective capacity of attractions. The agency envisioned that a good recommendation system could help divert tourists to less-crowded attractions to resolve the problem. The basic idea is to build a tourist prediction component; once an attraction is predicted to be overcrowded, a recommendation component is triggered to try to divert tourists to other places.

However, online tourism products are very different from regular commodities due to several factors, including: holiday travel is a low-frequency event, most people travel only 1–2 times per year; and, numerous travel packages generate different combinations of transport means, restaurants, and hotels. Thus, most travel products have very few or even zero customers, and it is very difficult to simply apply traditional recommendation algorithms to this scenario.

Ctrip's solution for recommendation is twofold: user-profiling based on its big tourism data accumulated over the last 18 years, and developing a hybrid collaborative filtering model that specifically targets the sparse data and cold-start problem.

Figure 5 shows the user preference tree built from historical travel data. The short-term profile has the same structure as the long-term one, but is limited to the latest 30 days' data. The system can quickly traverse the tree and generate a preference vector for a user as the input for the recommendation system.

Figure 5. User preference tree built from Ctrip's big tourism data.

The key to the enhanced recommendation algorithm is the so-called Additional Stacked Denoising Autoencoder (aSDAE),3 which employs a deep learning model to learn the latent variables of users and products and combines it with classic matrix factorization. The latent variables learned from the two models are used to fit the product-scoring table, which is initialized from users' feedback. Moreover, the overall loss function is a linear combination of the two models' loss functions. Lastly, a text-generation AI component creatively generates poetry to characterize the recommended attraction and pushes it to users. Tests have shown the algorithm performs better than traditional ones in sparse-data and cold-start scenarios.

The system has been deployed for governments including those of Henan province and Guiyang City, among many others. In Henan province, for example, the recommendation system was deployed last March. According to Ctrip's online travel booking data, during the Labor Day holiday (a period of three days starting around May 1), the total number of tourists in Henan province reached 2.04 million, a 41.5% increase over Labor Day in 2017. To evaluate the effect of the recommendation system, the tourist distribution over 18 areas throughout the province was calculated. The standard deviation (SD) of the distribution was 234,355 in 2017 and 202,208 in 2018. The decrease in SD suggests a more balanced distribution of visitors across the province's many attractions, which benefits both tourists and local residents.
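The evaluation metric above is simple to reproduce for any per-area visitor counts. The sketch below computes the SD of a distribution and the relative change between the two reported SD values; whether the reported figures use the population or the sample standard deviation is not stated, so the population form here is an assumption, and the per-area counts are made up.

```python
from statistics import pstdev


def distribution_sd(visitors_by_area):
    """Population standard deviation of visitor counts across areas."""
    return pstdev(visitors_by_area)


def relative_change(before, after):
    """Fractional change from before to after (negative means a decrease)."""
    return (after - before) / before


# Reported SD values for the 18 areas of Henan province, 2017 versus 2018.
change = relative_change(234_355, 202_208)
print(f"{change:.1%}")

# Hypothetical per-area counts, only to show how an SD would be computed.
print(distribution_sd([120_000, 80_000, 40_000, 200_000]))
```

On the reported numbers this works out to a drop of roughly 14% in SD.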


References

1. Blondel, V.D., Guillaume, J-L., Lambiotte, R., and Lefebvre, E. Fast unfolding of communities in large networks. J. Statistical Mechanics: Theory and Experiment 10 (2008), P10008.

2. Chu, W., Liu, Y., Shen, C., Cai, D., and Hua, X-S. Multitask vehicle detection with region-of-interest voting. IEEE Trans. Image Processing 27, 1 (2018), 432–441.

3. Dong, X., Yu, L., Wu, Z., Sun, Y., Yuan, L., and Zhang, F. A hybrid collaborative filtering model with deep structure for recommender systems. In Proceedings of AAAI (2017), 1309–1315.

4. Gao, S. and Chabini, I. Optimal routing policy problems in stochastic time-dependent networks. Transportation Research Part B: Methodological 40, 2 (2006), 93–122.

5. Lambiotte, R., Delvenne, J-C., and Barahona, M. Laplacian dynamics and multiscale modular structure in networks. arXiv preprint arXiv:0812.1770 (2008).

6. Malandraki, C. and Daskin, M.S. Time dependent vehicle routing problems: Formulations, properties and heuristic algorithms. Transportation Science 26, 3 (1992), 185–200.


Authors

Wanli Min is Chief Data Scientist and Senior Director at Alibaba Cloud Computing in Hangzhou.

Liang Yu is Senior Data Scientist at Alibaba Cloud Computing in Hangzhou.

Lei Yu is head of the AI Department at Ctrip in Shanghai.

Shubo He is manager of the AI Department at Ctrip in Shanghai.


Footnotes

a. AutoNavi is one of the largest web mapping, navigation and location based services providers, founded in 2001 and acquired by Alibaba Group in 2014.







©2018 ACM  0001-0782/18/11

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from permissions@acm.org or fax (212) 869-0481.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.

