👋 I’m a second-year PhD student at UCL SpaceTimeLab, researching conversational systems (large language models) for complex routing problems, supervised by Dr James Haworth, Dr Aldo Lipani, and Dr Stefano Cavazzi. I am broadly interested in how language models can be adapted to geospatial data, and how geospatial data can be adapted for LLMs. My research is funded by UK Research and Innovation (UKRI/EPSRC) and the Ordnance Survey.

Recent publications

  • Quantifying Geospatial in the Common Crawl Corpus (📄 arXiv; SIGSPATIAL’24 proceedings)

  • CC-GPX: Extracting High-Quality Annotated Geospatial Data from Common Crawl (📄 arXiv; SIGSPATIAL’24 proceedings)

  • CycleTrajectory: An End-to-End Pipeline for Enriching and Analyzing GPS Trajectories to Understand Cycling Behavior and Environment (📄 arXiv; SuMob @ SIGSPATIAL’24 proceedings)

  • Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text? (📄 arXiv; GeoExT @ ECIR 2024 proceedings)

Teaching

At UCL, I have been a postgraduate teaching assistant for CEGE0096 Geospatial Programming (2023/24, 2024/25), CEGE0097 Spatial Analysis and Computation (2023/24), and CEGE0042 Spatial-Temporal Data Analysis and Data Mining (2023/24).

Experience

Prior to UCL, I spent five years in industry, first as a full-stack developer for an open data consultancy, CTData Collaborative, and then as a data engineer for a location planning firm, Geolytix.

I completed an MSc in Geographic Information Science at the University of Leeds 🇬🇧, and a BSc in Computer Science and Studio Arts at Trinity College (Hartford) 🇺🇸. I spent one year of my undergraduate degree at Worcester College (University of Oxford) 🇬🇧, where I focused on machine learning.

Achievements ☄️

Together with Jack Dougherty, I co-authored O’Reilly’s Hands-On Data Visualization: Interactive Storytelling from Spreadsheets to Code, which was translated from English into Korean and Traditional Chinese.
