Uncovering Our ‘Hidden Visits’ With Cell Phone Data and Machine Learning

April 23, 2025
If you've ever wondered how researchers track our movements across a country without relying solely on phone calls, a fascinating study by researchers from China and the United States offers some insight. Their collaborative work delves into the use of machine learning to uncover the 'hidden visits' we make—those trips that don't show up in standard telecom data because we're not using our phones enough.

The study, titled **Identifying Hidden Visits From Sparse Call Detail Record Data**, is spearheaded by Zhan Zhao from the University of Hong Kong, alongside Haris N. Koutsopoulos from Northeastern University in Boston, and Jinhua Zhao from MIT. Their goal? To leverage the mobile connectivity records—such as mobile data, SMS, and voice calls—from highly active users to model and predict the movement patterns of those who use their phones less frequently.

*A rough schematic for extracting trip information from Call Detail Record (CDR) data.* Source: https://arxiv.org/pdf/2106.12885.pdf

While the team acknowledges the potential privacy concerns their work raises, they emphasize that their aim is to gain a more generalized understanding of movement patterns, rather than zooming in on individual journeys. They also point out that Call Detail Record (CDR) data, which is the backbone of such studies, has its limitations. It's often low in spatial resolution and susceptible to 'positioning noise' due to the user's changing position relative to cell phone towers. However, they argue that this inaccuracy actually serves as a privacy safeguard:

**'The target application of our study is trip detection and OD estimation\[\*\], which are done at aggregate level, not individual level. The developed models can be directly deployed on the database servers of telecom carriers, without need for data transfer. Furthermore, compared to other forms of big data, such as social media or credit card transaction data, CDR data is relatively less intrusive in terms of personal privacy. In addition, its localization error helps to mask the exact user locations, providing another layer of privacy preservation.'**

Elapsed Time Intervals (ETIs)

When we're on the move with our mobile phones (not necessarily smartphones), the limitations of CDR data as a tool for pinpointing our location become clear. Elapsed Time Intervals (ETIs), the gaps between consecutive records during which we neither make nor receive calls, are crucial markers for tracking our movements. During a journey, these intervals of 'silence' can make us temporarily vanish from the grid.

The researchers highlight how these gaps interfere with analytical systems trying to make sense of A>B journeys. The sparsity of data might be hiding an 'unobserved trip'. Their new method tackles this by analyzing the spatiotemporal context of ETIs and considering 'the individual characteristics of the user'.
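As a rough illustration of what an ETI looks like in data, the sketch below computes the gap between each pair of consecutive records for the same user. It assumes an ETI is simply the time between two consecutive observations. The record layout, the tower labels, and the 120-minute cutoff are all hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical CDR rows: (user_id, ISO timestamp, tower_id).
# The layout is illustrative only, not the schema used in the study.
records = [
    ("u1", "2013-11-04T08:05:00", "A"),
    ("u1", "2013-11-04T08:40:00", "A"),
    ("u1", "2013-11-04T12:10:00", "B"),
    ("u2", "2013-11-04T09:00:00", "C"),
    ("u2", "2013-11-04T18:30:00", "C"),
]


def elapsed_time_intervals(rows):
    """Return one ETI per pair of consecutive records of the same user."""
    by_user = defaultdict(list)
    for user, ts, tower in rows:
        by_user[user].append((datetime.fromisoformat(ts), tower))

    etis = []
    for user, events in by_user.items():
        events.sort(key=lambda e: e[0])  # order each user's events by time
        for (t0, tower0), (t1, tower1) in zip(events, events[1:]):
            etis.append({
                "user": user,
                "start": t0,
                "end": t1,
                "minutes": (t1 - t0).total_seconds() / 60,
                "from_tower": tower0,
                "to_tower": tower1,
                "moved": tower0 != tower1,  # did the observed tower change?
            })
    return etis


etis = elapsed_time_intervals(records)

# Long gaps are the natural candidates for an unobserved trip.
# The 120-minute cutoff is an arbitrary value chosen for this sketch.
long_gaps = [e for e in etis if e["minutes"] >= 120]
```

Here the second user has a gap of more than nine hours with the same tower at both ends, which is exactly the kind of interval that could conceal a trip out and back.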

Dataset

To build their core training set, the researchers used data from a major cellular service operator in a Chinese city with a population of 6 million. This dataset included over two billion mobile phone transactions from three million users in November 2013, focusing solely on voice calls and data access records. Notably, they did not include SMS data, which added to the challenge of dealing with sparse data.

Each record included an encrypted unique ID, a Location Area Code (LAC), a timestamp, a cell ID that, combined with the LAC, identifies the specific cell tower involved in the transaction, and an Event ID indicating whether the event was an outgoing call, an incoming call, or data usage.

*Process tree for the identification of hidden visits.*

This information was cross-referenced with a cell tower operation database, enabling the researchers to pinpoint the longitude and latitude coordinates of the tower associated with each communication event. They identified 9000 cell towers within the dataset.
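The cross-referencing step can be pictured as a simple lookup from tower identifiers to coordinates. The sketch below is a minimal version under stated assumptions: towers are keyed by a (LAC, cell ID) pair, the field names are hypothetical, and the coordinates are invented. The haversine helper is a standard great-circle distance formula, included only to show how displacement between two located events might be measured.

```python
import math

# Hypothetical tower table keyed by (LAC, cell ID); coordinates are made up.
towers = {
    (1001, 17): (30.0, 110.0),
    (1001, 18): (30.0, 110.1),
}

# Hypothetical communication events; field names are illustrative.
events = [
    {"user": "u1", "lac": 1001, "cell_id": 17, "ts": "2013-11-04T08:05:00"},
    {"user": "u1", "lac": 1001, "cell_id": 18, "ts": "2013-11-04T12:10:00"},
    {"user": "u1", "lac": 1002, "cell_id": 5, "ts": "2013-11-04T13:00:00"},
]


def geolocate(events, towers):
    """Attach tower coordinates to each event; skip events with unknown towers."""
    located = []
    for e in events:
        coords = towers.get((e["lac"], e["cell_id"]))
        if coords is None:
            continue  # tower missing from the operations table
        lat, lon = coords
        located.append({**e, "lat": lat, "lon": lon})
    return located


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


located = geolocate(events, towers)
distance_km = haversine_km(
    located[0]["lat"], located[0]["lon"],
    located[1]["lat"], located[1]["lon"],
)
```

Note that the distance is between towers, not between the user's true positions, which is the localization error the researchers describe.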

The researchers noted the difficulty in accurately guessing trip destinations based solely on call records, as these records peak in the morning and afternoon, which aligns with typical travel patterns. Since phone calls can precede a journey and may even trigger it, this can skew destination estimation.

*Mobile usage patterns over the course of a day.*

Similar challenges arise with user-initiated data usage, like messaging apps. However, it's the 'automated' data usage—like the systematic polling of APIs for new messages or other data, including GPS and telemetry across apps—that helps in identifying these hidden movements.

Processing

The researchers employed a variety of machine learning classifiers to tackle this problem, including logistic regression, support vector machines (SVM), random forests, and a gradient boosting ensemble approach. These were implemented in Python using scikit-learn with default settings.
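To give a feel for this kind of setup, here is a minimal sketch that fits the four classifier families with scikit-learn defaults. Everything about the data is an assumption: the three features, the label rule, and the sample size are synthetic and invented for illustration, and `GradientBoostingClassifier` is just one plausible choice for the gradient boosting approach. The `random_state` arguments are set only to make the sketch reproducible.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 400

# Synthetic, made-up features for each ETI (not the study's feature set).
eti_hours = rng.uniform(0.1, 12.0, n)          # length of the gap
start_hour = rng.integers(0, 24, n)            # hour of day the gap starts
n_observed_visits = rng.integers(1, 10, n)     # visits already seen that day

# Synthetic label rule, invented for this sketch: longer gaps and morning
# starts raise the chance of a hidden visit, more observed visits lower it.
logit = 0.5 * eti_hours - 0.4 * n_observed_visits + 0.5 * (start_hour < 12) - 1.0
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
X = np.column_stack([eti_hours, start_hour, n_observed_visits])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Otherwise-default estimators for the four model families named above.
models = {
    "logistic_regression": LogisticRegression(),
    "svm": SVC(),
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # held-out accuracy
```

Because the labels are generated by a made-up rule, the resulting accuracies say nothing about the study's results; the point is only the shape of the workflow.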

Among these, logistic regression provided the most interpretable model parameters. The team also found that longer ETIs increased the likelihood of a hidden visit occurring, with a higher incidence in the morning. Conversely, when a user's CDR data clearly showed a high number of destinations or waypoints, the likelihood of a hidden visit was lower. This finding supports the core principle of their research—that the most active users provide a detailed picture of their movements, from which the behavior of less active users can be inferred.
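The interpretability of logistic regression comes from the fact that each coefficient shifts the log-odds of the outcome in a fixed direction. The sketch below uses hypothetical coefficients, chosen only so that their signs match the directions reported above. The magnitudes are invented and are not the study's fitted values.

```python
import math

# Hypothetical coefficients: signs follow the reported directions,
# magnitudes are made up for illustration.
coef = {
    "intercept": -1.0,
    "eti_hours": 0.4,         # longer gap -> hidden visit more likely
    "is_morning": 0.5,        # morning gap -> hidden visit more likely
    "observed_visits": -0.3,  # more observed visits -> hidden visit less likely
}


def hidden_visit_probability(eti_hours, is_morning, observed_visits):
    """Logistic model: probability = sigmoid(linear combination of features)."""
    z = (
        coef["intercept"]
        + coef["eti_hours"] * eti_hours
        + coef["is_morning"] * is_morning
        + coef["observed_visits"] * observed_visits
    )
    return 1.0 / (1.0 + math.exp(-z))


short_gap = hidden_visit_probability(eti_hours=1, is_morning=0, observed_visits=3)
long_gap = hidden_visit_probability(eti_hours=6, is_morning=0, observed_visits=3)
morning_gap = hidden_visit_probability(eti_hours=6, is_morning=1, observed_visits=3)
busy_user = hidden_visit_probability(eti_hours=6, is_morning=0, observed_visits=8)

# exp(coefficient) is the multiplicative change in odds per unit of the feature.
odds_ratio_per_hour = math.exp(coef["eti_hours"])
```

With these invented numbers, each extra hour of silence multiplies the odds of a hidden visit by roughly 1.5, which is the kind of direct reading that tree ensembles and SVMs do not offer.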

In their conclusion, the researchers suggest that their approach could be applied to other types of transit data, such as smart card data and geo-located social media information.

The research was supported by funding from Energy Foundation China and the China Sustainable Transportation Center.

*\* Origin-Destination*
