Advancing Responsible Societal-Scale Use of Mobility Data in High-Stakes Settings

Talk
Naman Awasthi
Time: 12.16.2025 11:00 to 12:30
Location: 

Mobility (location) data from mobile phones or GPS sensors can be used to identify and solve societal-scale challenges like infrastructure planning, pandemic preparedness or public transit planning. This proposal presents a set of projects focused on the responsible use of human mobility data for public good while addressing key questions around privacy, fairness, and model generalization.
First, I present work on understanding how US residents perceive the privacy implications of using location data from third-party vendors. I uncover differences in perceptions of the use of multiple trajectory and place-of-visit features across several actors and purposes. Notably, participants rated negatively both the sharing of detailed features, such as frequent trajectories and places visited, and the sharing of data for vague purposes. Using a vignette approach, I also observed that people were comfortable with academic researchers accessing such data for specific application areas, including public transit and public health. Building on this finding, I next focused on two projects that use mobility data to support decision making: one on public transit in Baltimore City and another on COVID-19.
Second, in the public health context, I focused on evaluating bias introduced by mobility data used in COVID-19 forecasting models. I identified and quantified biases in prediction accuracy, with most minority counties being associated with larger errors. I then proposed DemOpts, a novel framework for fair regression for COVID-19 data. This training framework will enable future deep learning models incorporating mobility to provide fair forecasts across the US.
Third, I developed a privacy-respectful app (BALTO, available on the Google Play Store) to crowdsource location data from people using public transit in Baltimore City. This ongoing study, with over 200 participants and 1200 trips, has collected both trips and the qualitative experiences of transit riders, which will be used to evaluate door-to-door experiences. In addition, I compared users' observed travel durations with theoretical times from routing platforms (Transit API and Google Maps API) and uncovered socio-economic and structural differences between observed and theoretical travel times that could bias public transit decisions. I will propose correction formulas to address these differences.
Finally, I am also working on understanding memorization of mobility data in LLMs. Current LLM-based trajectory forecasting models such as UrbanGPT and CityGPT outperform graph-neural-network-based models on mobility tasks like next-stop prediction. I hypothesize that this could be due, in part, to the memorization of mobility benchmark datasets, and I am currently working on approaches to evaluate both memorization and generalization.