You can't unit test for taste

a day ago
  • #Data Processing
  • #AI Development
  • #Geolocation Apps
  • The author built In the Long Run, an app that lets runners virtually traverse famous routes using Strava data, and wanted to add points of interest (POIs) to its maps.
  • GeoNames, which is available under a Creative Commons license, served as the data source. It was processed through a pipeline built with Python, Parquet files, and DuckDB, with Claude assisting.
  • Initial filtering excluded administrative divisions and kept only specific feature codes, such as parks and historic sites, with additional population and elevation filters.
  • Wikipedia links from GeoNames provided notability signals and summaries, but they introduced bias because of anglophone editing patterns.
  • An LLM (Anthropic's Haiku) was used to rate POIs for significance, but it hallucinated details, so the author relied on Wikipedia summaries for factual correctness.
  • Per-route parameters were added to adjust filtering, ranking, and geographic spread, because the types of POIs available vary from region to region.
  • Evaluation was difficult because there are no objective metrics for taste or POI relevance, so the work required iterative tweaks and manual overrides.
  • The project shifted from viewing AI as a core feature to using it as a supplementary tool alongside traditional data processing methods.
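
The filtering, ranking, and per-route spread described above can be sketched roughly as follows. This is a minimal illustration, not the author's pipeline: it uses plain Python instead of DuckDB over Parquet so that it stays self-contained, and the names `Poi`, `RouteParams`, and `select_pois`, the ranking key, and all thresholds are assumptions. The feature class `A` and the feature codes `PRK`, `HSTS`, and `PPL` are real GeoNames values; the sample records are made up.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Poi:
    # Hypothetical record shape; fields loosely mirror GeoNames columns.
    name: str
    feature_class: str  # "A" marks administrative divisions in GeoNames
    feature_code: str   # e.g. "PRK" (park), "HSTS" (historical site)
    lat: float
    lon: float
    population: int = 0
    has_wikipedia: bool = False


@dataclass(frozen=True)
class RouteParams:
    # Hypothetical per-route knobs; defaults are illustrative only.
    allowed_codes: frozenset
    min_population: int = 0      # applied to populated places (class "P") only
    min_spacing_km: float = 5.0  # geographic spread between chosen POIs
    max_pois: int = 10


def haversine_km(a: Poi, b: Poi) -> float:
    """Great-circle distance between two POIs in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def select_pois(pois, params: RouteParams):
    """Filter by feature code, rank, then greedily enforce spacing."""
    candidates = [
        p for p in pois
        if p.feature_class != "A"                  # drop administrative divisions
        and p.feature_code in params.allowed_codes
        and (p.feature_class != "P" or p.population >= params.min_population)
    ]
    # Assumed ranking: Wikipedia-linked entries first, then larger population.
    candidates.sort(key=lambda p: (p.has_wikipedia, p.population), reverse=True)

    chosen = []
    for p in candidates:
        if len(chosen) >= params.max_pois:
            break
        # Keep a POI only if it is far enough from everything already chosen.
        if all(haversine_km(p, q) >= params.min_spacing_km for q in chosen):
            chosen.append(p)
    return chosen


# Made-up sample data, purely for illustration.
sample = [
    Poi("Region X", "A", "ADM1", 0.0, 0.0, 1_000_000, True),
    Poi("Park A", "L", "PRK", 0.0, 0.0, 0, True),
    Poi("Park B", "L", "PRK", 0.0, 0.01, 0, False),   # about 1 km from Park A
    Poi("Ruin C", "S", "HSTS", 0.0, 1.0, 0, False),   # about 111 km from Park A
    Poi("Hamlet D", "P", "PPL", 1.0, 1.0, 50, True),
]
params = RouteParams(
    allowed_codes=frozenset({"PRK", "HSTS", "PPL"}),
    min_population=1000,
)
picked = [p.name for p in select_pois(sample, params)]
```

Giving each route its own `RouteParams` is one way to express the per-route tuning the summary mentions: a sparse rural route might loosen `min_population` and widen `allowed_codes`, while a dense urban route might raise `min_spacing_km`. None of this addresses the taste problem itself, which still needs manual review.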