BREATHE LONDON
Categorizing Non-Traffic Emissions Sources using Peak Clustering Technique on Distributed Network
Collaborators: Michelle S. Hui, Jintao Gu, Timothy Baker, Mohammed I. Mead, K. Max Zhang

Overview
Cities are investing in distributed air quality monitoring networks to combat PM2.5 emissions, which are linked to 5 million deaths each year. Alarmingly, 97% of global cities still fail to meet the WHO PM2.5 guidelines. Yet few studies investigate how to extract spatiotemporal insights from these networks. With the rise of traffic-reduction initiatives, non-traffic emissions have become a significant, yet complex, contributor to air quality issues.
This research develops approaches for cities to systematically identify hotspots and emission sources from their distributed networks. A major accomplishment of this study is identifying sites of non-traffic emission sources, which have traditionally been challenging to characterize in emission inventories. This is a key step toward protecting the health of urban residents and understanding emissions beyond traffic-related sources.
Network Analysis

The main goal of our research is to (1) group monitoring sites with varying concentration profiles by similar sources of pollution and (2) identify sites of interest. We employ a two-part approach. The first part, which we call network analysis, identifies sites with consistently high PM2.5 pollution, or hotspots. The second part, peak analysis with k-means clustering, groups sites by their likely emission sources.
The goal of network analysis is to identify sites with consistently elevated PM2.5 concentrations. PM2.5 is widely used as an indicator of local pollution sources beyond traffic, such as industrial emissions, construction activities, and residential heating, and so provides insight into the various contributors to air quality issues.
The analysis first ranks days against one another (intra-ranking) to filter out days with extreme regional phenomena. It then ranks sites against one another (inter-ranking) to identify hotspot sites. A key principle of the network analysis is to explore the variability among different sites within the network, which is indicative of both areas and emission sources of interest. Sites with consistently high PM2.5 concentrations are identified using an annual ranking threshold: monitoring sites that exceed the threshold are hotspots.
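The two ranking steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function name `find_hotspots`, the use of the network-wide daily median to flag regional days, and all three threshold parameters are hypothetical choices made for the example.

```python
from statistics import median


def find_hotspots(daily, regional_quantile=0.9, top_rank=1, min_share=0.5):
    """Return sites that rank among the highest on most non-regional days.

    daily: {day: {site: daily-mean PM2.5}}, with the same sites on every day.
    All thresholds are illustrative assumptions, not values from the study.
    """
    # Step 1: rank days by their network-wide median and drop the highest
    # ones, treating them as regional episodes that lift every site together.
    by_level = sorted(daily, key=lambda d: median(daily[d].values()))
    n_keep = max(1, int(len(by_level) * regional_quantile))
    kept = by_level[:n_keep]

    # Step 2: on each remaining day, rank sites from highest to lowest and
    # count how often each site appears in the top ranks.
    top_counts = {}
    for day in kept:
        ranked = sorted(daily[day], key=daily[day].get, reverse=True)
        for site in ranked[:top_rank]:
            top_counts[site] = top_counts.get(site, 0) + 1

    # Step 3: apply the ranking threshold over the whole period.
    return sorted(s for s, c in top_counts.items() if c / len(kept) >= min_share)


# Synthetic example: site "A" carries a local increment on ordinary days,
# while day 9 is a regional episode in which every site is elevated.
daily = {}
for day in range(10):
    base = 8.0 + day % 3  # background level shared by all sites
    daily[day] = {"A": base + 6, "B": base + 1, "C": base, "D": base + 0.5}
daily[9] = {"A": 40.0, "B": 41.0, "C": 42.0, "D": 45.0}

hotspots = find_hotspots(daily)
print(hotspots)
```

Dropping the regional day first keeps site "D", which only tops the network during the episode, from being counted toward the hotspot ranking.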
Peak Analysis K-Means Clustering

The goal of clustering on air pollution is to group sites by their emissions patterns to identify shared hyperlocal sources.
I tackled the challenging task of isolating hyperlocal emissions sources against a backdrop of strict emission standards (e.g. ULEZ) and high transboundary pollution. I began with traditional methods of clustering PM2.5 concentrations, yet these approaches produced virtually uninterpretable clusters.
I shelved clustering and turned my efforts towards peak analysis, homing in on moments of heightened emissions within our time-series data. This approach, however, wasn't immediately fruitful. It took several iterations, experimenting with higher-dimensional relationships and peak shapes, before my breakthrough.
I discovered that the key lay in the diurnal patterns of the pollution peaks. This insight led me to refine peak analysis and merge it with clustering techniques, crafting an algorithm that could differentiate between traffic-related and non-traffic-related emissions sites. This is the key achievement of this research.
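The idea of clustering sites on the diurnal pattern of their peaks can be sketched as below. The details are assumptions for illustration only, since the text does not specify them: the peak rule (a local maximum at least `min_rise` above the site median), the 24-bin hour-of-day profile, the deterministic farthest-point initialization of k-means, and the synthetic site series are all hypothetical.

```python
from statistics import median


def diurnal_profile(series, min_rise=5.0):
    """Share of detected peaks falling in each hour of the day.

    series: hourly PM2.5 values starting at midnight.
    A peak is a local maximum at least min_rise above the series median
    (an assumed rule, chosen for simplicity).
    """
    base = median(series)
    counts = [0.0] * 24
    for i in range(1, len(series) - 1):
        is_local_max = series[i] > series[i - 1] and series[i] >= series[i + 1]
        if is_local_max and series[i] - base >= min_rise:
            counts[i % 24] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means with farthest-point initialization (deterministic)."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def synthetic_site(peak_hours, days=7, background=10.0, height=15.0):
    """Hypothetical hourly series with a spike at each hour in peak_hours."""
    return [background + (height if h % 24 in peak_hours else 0.0)
            for h in range(days * 24)]


# Two sites peaking at commute hours and two peaking late at night.
sites = {
    "roadside_1": synthetic_site({8, 18}),
    "roadside_2": synthetic_site({7, 18}),
    "late_night_1": synthetic_site({22}),
    "late_night_2": synthetic_site({22, 1}),
}
profiles = [diurnal_profile(s) for s in sites.values()]
labels = kmeans(profiles, k=2)
print(dict(zip(sites, labels)))
```

Clustering the hour-of-day profiles rather than the raw concentrations means sites are grouped by when their peaks occur, not by how high their readings are. In practice, library routines such as `scipy.signal.find_peaks` and `sklearn.cluster.KMeans` could stand in for the hand-written pieces.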

While traffic remained the dominant source, our algorithm unearthed smaller, yet significant, non-traffic clusters. These included substantial contributions from construction activities and late-night commercial cooking. Both are traditionally overlooked in urban emission policies, yet they are the third and fourth largest sources in London, and these clusters contain multiple PM2.5 hotspots. These findings not only reveal underrecognized sources of pollution but also underscore the necessity of broadening policy focus beyond traffic emissions. Our findings could have profound implications for cities worldwide: they offer a new approach to identifying and targeting diverse pollution sources for more effective air quality management, and they address a crucial gap in our collective efforts to create cleaner, healthier urban environments.