Uncovering Our ‘Hidden Visits’ With Cell Phone Data and Machine Learning

April 23, 2025

If you've ever wondered how researchers track our movements across a country without relying solely on phone calls, a fascinating study by researchers from China and the United States offers some insight. Their collaborative work delves into the use of machine learning to uncover the 'hidden visits' we make—those trips that don't show up in standard telecom data because we're not using our phones enough.

The study, titled **Identifying Hidden Visits From Sparse Call Detail Record Data**, is spearheaded by Zhan Zhao from the University of Hong Kong, alongside Haris N. Koutsopoulos from Northeastern University in Boston, and Jinhua Zhao from MIT. Their goal? To leverage the mobile connectivity records—such as mobile data, SMS, and voice calls—from highly active users to model and predict the movement patterns of those who use their phones less frequently.

*A rough schematic for extracting trip information from Call Detail Record (CDR) data.* Source: https://arxiv.org/pdf/2106.12885.pdf

While the team acknowledges the potential privacy concerns their work raises, they emphasize that their aim is to gain a more generalized understanding of movement patterns, rather than zooming in on individual journeys. They also point out that Call Detail Record (CDR) data, which is the backbone of such studies, has its limitations. It's often low in spatial resolution and susceptible to 'positioning noise' due to the user's changing position relative to cell phone towers. However, they argue that this inaccuracy actually serves as a privacy safeguard:

**‘The target application of our study is trip detection and OD estimation\[\*\], which are done at aggregate level, not individual level. The developed models can be directly deployed on the database servers of telecom carriers, without need for data transfer. Furthermore, compared to other forms of big data, such as social media or credit card transaction data, CDR data is relatively less intrusive in terms of personal privacy. In addition, its localization error helps to mask the exact user locations, providing another layer of privacy preservation.’**

Elapsed Time Intervals (ETIs)

CDR data only captures a location when a phone (not necessarily a smartphone) is actually used, which limits how well it can pinpoint someone who is on the move. Elapsed Time Intervals (ETIs), the periods during a journey in which we make or receive no calls, are crucial markers for tracking movement, because during these intervals of 'silence' we temporarily vanish from the grid.

The researchers highlight how these gaps interfere with analytical systems trying to make sense of A-to-B journeys: when records are sparse, a gap may be hiding an 'unobserved trip'. Their new method tackles this by analyzing the spatiotemporal context of each ETI and considering 'the individual characteristics of the user'.
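As a rough illustration of the idea, the sketch below treats an ETI as the gap between two consecutive records for one user and flags unusually long gaps as candidates for a hidden visit. The function names, the sample timestamps, and the two-hour cutoff are all assumptions made for this example; the study itself relies on trained classifiers rather than a fixed threshold.

```python
from datetime import datetime, timedelta


def elapsed_time_intervals(timestamps):
    """Return (start, end, gap) for each pair of consecutive records."""
    ordered = sorted(timestamps)
    return [(a, b, b - a) for a, b in zip(ordered, ordered[1:])]


def long_gaps(timestamps, min_gap=timedelta(hours=2)):
    """Keep only intervals at least `min_gap` long (hypothetical cutoff)."""
    return [eti for eti in elapsed_time_intervals(timestamps) if eti[2] >= min_gap]


# Made-up timestamps for a single user on one day (deliberately unsorted).
records = [
    datetime(2013, 11, 4, 8, 5),
    datetime(2013, 11, 4, 13, 15),
    datetime(2013, 11, 4, 8, 40),
    datetime(2013, 11, 4, 13, 20),
]

etis = elapsed_time_intervals(records)
candidates = long_gaps(records)
for start, end, gap in candidates:
    print(f"No records between {start:%H:%M} and {end:%H:%M} ({gap})")
```

Here only the gap between the 08:40 and 13:15 records exceeds the cutoff, so it is the only interval that could plausibly hide an unobserved trip.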

Dataset

To build their core training set, the researchers used data from a major cellular service operator in a Chinese city with a population of 6 million. This dataset included over two billion mobile phone transactions from three million users in November 2013, focusing solely on voice calls and data access records. Notably, they did not include SMS data, which added to the challenge of dealing with sparse data.

Each record included an encrypted unique user ID, a Location Area Code (LAC), a timestamp, a cell ID that, together with the LAC, identifies the specific cell tower involved in the transaction, and an Event ID indicating whether the event was an outgoing call, an incoming call, or data usage.

*Process tree for the identification of hidden visits.*

This information was cross-referenced with a cell tower operation database, enabling the researchers to pinpoint the longitude and latitude coordinates of the tower associated with each communication event. They identified 9000 cell towers within the dataset.
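The cross-referencing step can be pictured as a simple lookup keyed on the LAC and the cell identifier. The sketch below is only an illustration: the class and field names, the event label, and the tower coordinates are hypothetical and do not come from the study's data.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CdrRecord:
    user_id: str    # encrypted unique ID
    lac: int        # Location Area Code
    cell_id: int    # identifies the tower within the LAC
    timestamp: str  # kept as text here for simplicity
    event_id: str   # type of event; the labels used below are hypothetical


# Hypothetical tower table: (LAC, cell ID) -> (latitude, longitude).
TowerTable = Dict[Tuple[int, int], Tuple[float, float]]


def locate(record: CdrRecord, towers: TowerTable) -> Optional[Tuple[float, float]]:
    """Return the tower coordinates for a record, or None if the tower is unknown."""
    return towers.get((record.lac, record.cell_id))


towers: TowerTable = {(101, 7): (30.0, 120.0)}  # made-up coordinates
known = CdrRecord("a1b2", 101, 7, "2013-11-04T08:05:00", "data")
unknown = CdrRecord("a1b2", 101, 8, "2013-11-04T09:30:00", "data")

print(locate(known, towers))
print(locate(unknown, towers))
```

Returning `None` for an unmatched tower keeps the missing location explicit instead of silently dropping the record.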

The researchers noted that it is difficult to infer trip destinations accurately from call records alone. Call volumes peak in the morning and afternoon, coinciding with typical travel times, and a phone call can precede a journey or even trigger it, so the place where a call is made is not necessarily where the trip ends. This can skew destination estimation.

*Mobile usage patterns over the course of a day.*
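A daily usage profile of this kind can be produced by counting records per hour of the day. The following is a minimal sketch with made-up timestamps; the function name is an assumption for this example and is not taken from the paper.

```python
from collections import Counter
from datetime import datetime


def records_per_hour(timestamps):
    """Count records in each hour of the day (index 0 = midnight to 01:00)."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


# Made-up sample: two morning records, one at noon, one in the evening.
sample = [
    datetime(2013, 11, 4, 8, 5),
    datetime(2013, 11, 4, 8, 40),
    datetime(2013, 11, 4, 12, 10),
    datetime(2013, 11, 4, 18, 30),
]

hourly = records_per_hour(sample)
print(hourly)
```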

Similar challenges arise with user-initiated data usage, like messaging apps. However, it's the 'automated' data usage—like the systematic polling of APIs for new messages or other data, including GPS and telemetry across apps—that helps in identifying these hidden movements.

Processing

The researchers employed a variety of machine learning classifiers to tackle this problem, including logistic regression, support vector machines (SVM), random forests, and a gradient boosting ensemble approach. These were implemented in Python using scikit-learn with default settings.
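For readers unfamiliar with this workflow, here is a minimal sketch of comparing the four classifier families in scikit-learn with default settings. Everything about the data is an assumption: the three features, the labelling rule, and the sample size are synthetic stand-ins invented for this example, not the study's features or results.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 600

# Synthetic, illustrative features (NOT the study's data or feature set).
eti_hours = rng.uniform(0.1, 12.0, n)   # length of the gap between records
start_hour = rng.integers(0, 24, n)     # hour of day at which the gap starts
n_locations = rng.integers(1, 15, n)    # distinct locations observed for the user

# Made-up labelling rule: longer gaps raise the chance of a hidden visit,
# more observed locations lower it.
logit = 0.5 * eti_hours - 0.3 * n_locations
prob = 1.0 / (1.0 + np.exp(-logit))
y = (rng.uniform(size=n) < prob).astype(int)
X = np.column_stack([eti_hours, start_hour, n_locations])

# All four models are left at their scikit-learn defaults.
models = {
    "logistic regression": LogisticRegression(),
    "SVM": SVC(),
    "random forest": RandomForestClassifier(),
    "gradient boosting": GradientBoostingClassifier(),
}

# Mean 3-fold cross-validated accuracy for each model.
scores = {name: cross_val_score(model, X, y, cv=3).mean() for name, model in models.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because the labels here are generated from a made-up rule, the printed accuracies say nothing about how the models performed in the study; the sketch only shows the shape of the comparison.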

Among these, logistic regression provided the most interpretable model parameters. The team also found that longer ETIs increased the likelihood of a hidden visit occurring, with a higher incidence in the morning. Conversely, when a user's CDR data clearly showed a high number of destinations or waypoints, the likelihood of a hidden visit was lower. This finding supports the core principle of their research—that the most active users provide a detailed picture of their movements, from which the behavior of less active users can be inferred.
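The interpretability of logistic regression comes from its form. As a sketch, with illustrative feature names that are assumptions rather than the paper's notation, the model estimates the probability of a hidden visit from a weighted sum of features:

```latex
\[
P(\text{hidden visit} \mid x) = \frac{1}{1 + e^{-z}}, \qquad
z = \beta_0 + \beta_{\text{ETI}}\, x_{\text{ETI}} + \beta_{\text{dest}}\, x_{\text{dest}} + \dots
\]
\[
\frac{\text{odds}(x_{\text{ETI}} + 1)}{\text{odds}(x_{\text{ETI}})} = e^{\beta_{\text{ETI}}}
\]
```

Each coefficient therefore has a direct reading: increasing a feature by one unit multiplies the odds of a hidden visit by the exponential of its coefficient. In these terms, the findings described above correspond to a positive coefficient on ETI length and a negative coefficient on the number of observed destinations.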

In their conclusion, the researchers suggest that their approach could be applied to other types of transit data, such as smart card data and geo-located social media information.

The research was supported by funding from Energy Foundation China and the China Sustainable Transportation Center.

*\* Origin-Destination*
