cairn.
v0.1.0 · beta

Comparison

How does Cairn stack up?

Cairn is the newest entrant in the OSM-stack geocoder space. This page is the honest read on what we share with Pelias, Photon, and Nominatim — and where we make different tradeoffs. Numbers are from public docs and our own criterion benches; nothing here is marketing.

The landscape, in one paragraph each

Pelias

Geocoder originally built at Mapzen, now community-maintained. Multi-service Node.js stack on top of Elasticsearch. Strong on global coverage, breadth of importers (OSM, Who's On First, OpenAddresses, Geonames, and others), and ranking quality. Production-ready but operationally heavy: a real Pelias deploy is roughly ten cooperating services plus an Elasticsearch cluster. MIT-licensed.

Photon

Single-JAR Java geocoder built on Elasticsearch by komoot. Smaller scope than Pelias: forward search, autocomplete, and reverse geocoding on OSM data, with no structured queries and no reverse on OpenAddresses. Very fast on the autocomplete path because it's purpose-built for it. Apache-2.0. Still ships an Elasticsearch alongside the JAR.

Nominatim

Reference OpenStreetMap geocoder. PHP application backed by PostgreSQL + PostGIS. The canonical answer for OSM-only deployments and the engine behind nominatim.openstreetmap.org. Best-in-class reverse geocoding, careful address normalization, but forward search latency is bound by the database. GPL-2.0.

Cairn

Single static Rust binary plus a flat-file bundle. tantivy for text, custom rkyv tile blobs for places, R*-tree on bbox plus geo::Contains for reverse. No daemon dependencies, no cluster, no database. MIT or Apache-2.0. Alpha — not yet production-tested at planet scale; country-scale bundles work today.
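
The reverse path described above (a bounding-box index lookup followed by an exact containment test) can be sketched in a few lines. This is a minimal illustration in Python, not Cairn's Rust code: a linear scan stands in for the R*-tree, ray casting stands in for geo::Contains, and the names AdminPolygon, in_bbox, in_ring, and reverse_lookup are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdminPolygon:
    name: str
    ring: list  # outer ring as [(lon, lat), ...]; holes are ignored in this sketch

    def bbox(self):
        xs = [x for x, _ in self.ring]
        ys = [y for _, y in self.ring]
        return min(xs), min(ys), max(xs), max(ys)


def in_bbox(point, box):
    """Cheap rejection test against an axis-aligned bounding box."""
    min_x, min_y, max_x, max_y = box
    return min_x <= point[0] <= max_x and min_y <= point[1] <= max_y


def in_ring(point, ring):
    """Ray casting: toggle 'inside' each time a rightward ray crosses an edge."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray, so y1 != y2
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def reverse_lookup(point, polygons):
    # Step 1: bbox prefilter (a spatial index would avoid the full scan).
    candidates = [p for p in polygons if in_bbox(point, p.bbox())]
    # Step 2: exact containment only on the survivors.
    return [p.name for p in candidates if in_ring(point, p.ring)]


square = AdminPolygon("square", [(0, 0), (4, 0), (4, 4), (0, 4)])
triangle = AdminPolygon("triangle", [(0, 0), (4, 0), (0, 4)])
matches = reverse_lookup((3, 3), [square, triangle])
```

The point (3, 3) passes both bounding boxes but only the square's exact test, which is why the second step is needed.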

Architecture matrix

| Feature | Cairn | Pelias | Photon | Nominatim |
| --- | --- | --- | --- | --- |
| Language | Rust | Node.js + Java (ES) | Java | PHP + C + SQL |
| Storage | flat files: tantivy + rkyv + bincode | Elasticsearch | Elasticsearch | PostgreSQL + PostGIS |
| Runtime deps | none | ES cluster + ~10 services | ES cluster + JVM | PG + PHP-FPM + nginx |
| Single binary | yes (musl-statable) | no | JAR (needs JVM) | no |
| Airgap | bundle copy | full stack mirror | JAR + ES + index dir | full stack mirror |
| Diff updates | cairn-build diff/apply | minutely OSM replication | periodic full rebuild | minutely OSM replication |
| Forward search | yes | yes | yes | yes |
| Autocomplete | yes (prefix-ngram) | yes | yes (specialty) | limited |
| Reverse geocode | yes (PIP + nearest fallback) | yes | yes | yes (best-in-class) |
| Structured queries | yes | yes | no | yes |
| libpostal parsing | optional FFI | yes (built-in service) | no | partial |
| Importers | OSM, WoF, OA, Geonames, Overture, GeoParquet, MS Buildings | OSM, WoF, OA, Geonames, ++ | OSM (via Nominatim DB) | OSM |
| Wikidata enrichment | yes (multilingual labels + P31 kind + cross-refs) | limited (label join only) | limited (label join only) | limited |
| Building footprints | yes (MS Buildings, ODbL) | no | no | no |
| Stable identifier | structured gid (<source>:<type>:<id>) + bundle-local place_id | structured gid | OSM id only | OSM id + place_id |
| Lookup-by-id endpoint | /v1/place?ids=… (gid or u64) | /v1/place | n/a | /lookup |
| License | MIT or Apache-2.0 | MIT | Apache-2.0 | GPL-2.0 |
| Maturity | alpha | production | production | production (reference) |
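
The matrix lists two identifier forms for Cairn lookups: a structured gid of the shape <source>:<type>:<id> and a numeric u64 place_id. A client might tell them apart roughly as follows. This is a sketch under assumptions: the helper name parse_place_ref, the Gid tuple, and the sample value "osm:node:123" are hypothetical, and the real server may validate differently.

```python
from typing import NamedTuple, Union


class Gid(NamedTuple):
    source: str
    kind: str  # the <type> component; renamed to avoid shadowing the builtin
    id: str


U64_MAX = 2**64 - 1


def parse_place_ref(raw: str) -> Union[Gid, int]:
    """Return an int for a bare u64 place_id, or a Gid for <source>:<type>:<id>."""
    if raw.isascii() and raw.isdigit():
        value = int(raw)
        if value > U64_MAX:
            raise ValueError(f"place_id out of u64 range: {raw}")
        return value
    # Split at most twice so that the id part may itself contain ':'.
    parts = raw.split(":", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <source>:<type>:<id>, got {raw!r}")
    return Gid(*parts)


numeric_ref = parse_place_ref("42")
gid_ref = parse_place_ref("osm:node:123")  # hypothetical gid value
```

Splitting with a maximum of two separators keeps the id component intact even when an upstream source uses colons inside its own identifiers.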

Cairn measured latency

Numbers from cargo bench on an Apple M-series laptop in release mode. Synthetic in-memory datasets, single-thread, no network round-trip. They isolate the engine cost; real end-to-end latency adds HTTP framing on top (~50–200 µs).

| Workload | Dataset | Time |
| --- | --- | --- |
| Exact forward search | 5,000 places | 6.1 µs |
| Autocomplete (3-char prefix) | 5,000 places | 48.8 µs |
| Fuzzy search (1 typo) | 5,000 places | 54.8 µs |
| Point-in-polygon (PIP) | 1,024 admin polygons | 25 ns |
| Nearest-k (k = 10) | 4,096 points | 514 µs |
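
For readers unfamiliar with the nearest-k workload in the last row, the operation is "return the k points closest to a query coordinate". The sketch below is a brute-force Python illustration of that operation using great-circle distance. It is not the benchmarked Rust code, and the function names and sample coordinates are assumptions made for the example.

```python
import heapq
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def nearest_k(query, points, k=10):
    """points is a list of (name, lon, lat); returns the k closest, nearest first."""
    qlon, qlat = query
    return heapq.nsmallest(k, points, key=lambda p: haversine_m(qlon, qlat, p[1], p[2]))


places = [
    ("Geneva", 6.1432, 46.2044),
    ("Zurich", 8.5417, 47.3769),
    ("Bern", 7.4474, 46.9480),
]
closest_two = [name for name, _, _ in nearest_k((8.55, 47.37), places, k=2)]
```

A full scan like this is linear in the number of points; an indexed implementation would prune most candidates before computing any distances.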

End-to-end Switzerland benchmark

Same 506 MB Switzerland OSM PBF, same 6 408-query workload (Geonames-derived city + postcode + composite queries), same Apple Silicon host (36 GB RAM, OrbStack). Sequential single-client p50/p95/p99 from curl's time_total (-w '%{time_total}'); peak RPS from an ab -k keepalive sweep. Reproducible: see benchmarks/ in the repo for the harness.
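
As a reminder of what the p50/p95/p99 columns mean, the sketch below reduces a list of per-request timings to percentiles using the nearest-rank definition. Whether the harness in benchmarks/ uses nearest-rank or an interpolating definition is not stated here, so treat the method and the function name as assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Illustrative timings in milliseconds, not measured data.
timings_ms = [0.4, 0.5, 0.6, 0.7, 5.0]
p50 = percentile(timings_ms, 50)
p99 = percentile(timings_ms, 99)
```

With only five samples the single slow request determines p99, which is one reason tail percentiles need a workload of thousands of queries to be stable.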

| Engine | Build | Disk | Hot RSS | p50 | p95 | p99 | Peak RPS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cairn | 24 s | 195 MB | 80 MB | 0.51 ms | 0.63 ms | 0.74 ms | 57 554 |
| Pelias | 3 m 46 s | 3.5 GB | 2.7 GB | 13.76 ms | 34.41 ms | 57.23 ms | 362 |
| Nominatim | 3 h 13 m | 9.2 GB | 2.4 GB | 9.51 ms | 15.15 ms | 23.00 ms | 1 109 |
| Photon | 2 m 1 s | 1.3 GB | 2.1 GB | 5.88 ms | 15.26 ms | 25.18 ms | 2 406 |

Cairn serves the workload 31–77× faster at p99 and sustains 24–159× higher peak RPS, with 26–34× smaller hot RSS and 6.7–48× smaller disk than the incumbents. Build wall-clock is 5–483× faster on the same 506 MB PBF input.
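
The multipliers quoted above can be rederived from the Switzerland table. The sketch below does so for the three columns that need no unit conversion (p99, peak RPS, build time); the dictionary layout and the helper name ratio_range are illustrative choices, while the numbers are copied from the table.

```python
cairn = {"build_s": 24, "p99_ms": 0.74, "rps": 57_554}
incumbents = {
    "Pelias": {"build_s": 3 * 60 + 46, "p99_ms": 57.23, "rps": 362},
    "Nominatim": {"build_s": 3 * 3600 + 13 * 60, "p99_ms": 23.00, "rps": 1_109},
    "Photon": {"build_s": 2 * 60 + 1, "p99_ms": 25.18, "rps": 2_406},
}


def ratio_range(metric, higher_is_better=False):
    """Return (min, max) of Cairn's advantage over each incumbent for one metric."""
    ratios = []
    for stats in incumbents.values():
        if higher_is_better:
            ratios.append(cairn[metric] / stats[metric])
        else:
            ratios.append(stats[metric] / cairn[metric])
    return min(ratios), max(ratios)


p99_lo, p99_hi = ratio_range("p99_ms")
rps_lo, rps_hi = ratio_range("rps", higher_is_better=True)
build_lo, build_hi = ratio_range("build_s")
print(f"p99 {p99_lo:.1f}-{p99_hi:.1f}x, RPS {rps_lo:.1f}-{rps_hi:.1f}x, build {build_lo:.1f}-{build_hi:.1f}x")
```

The low end of each range comes from the strongest incumbent on that metric and the high end from the weakest, so the two ends usually refer to different engines.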

Country scale — Germany

Same harness, about 9.3× the input (4.7 GB Geofabrik PBF versus 506 MB; ~3 M places). Nominatim DE was not run: projected ~28 h on arm64 single-thread osm2pgsql.

| Engine | Build | Disk | Hot RSS | p50 | p95 | p99 | Peak RPS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cairn | 8 m 7 s | 1.54 GB | 359 MB | 0.57 ms | 1.08 ms | 2.35 ms | 39 664 |
| Pelias | 37 m | 5.28 GB | 5.15 GB | 10.29 ms | 17.10 ms | 23.16 ms | 506 |
| Photon | 19 m 22 s | 9.10 GB | 2.13 GB | 9.69 ms | 43.91 ms | 92.94 ms | 1 919 |
| Nominatim | not run; ~28 h projected on arm64 single-thread osm2pgsql | | | | | | |

On 3 M places Cairn keeps a 10–40× p99 lead, 21–78× peak RPS, and 6–14× smaller hot RSS vs each incumbent. Build is 2.4–4.6× faster in wall-clock and produces a 3.4–5.9× smaller on-disk artifact.

Recall on noisy queries

1 153 deliberate typo / ASCII-fold variants, top-1 hit rate. Cairn's ?phonetic=true (DoubleMetaphone) single-handedly recovers nearly every typo. No incumbent ships a phonetic toggle today.

| Variant | Hits | Recall |
| --- | --- | --- |
| baseline | 257 / 1 153 | 22.3 % |
| ?fuzzy=1 | 865 / 1 153 | 75.0 % |
| ?phonetic=true | 1 147 / 1 153 | 99.5 % |
| ?semantic=true | 257 / 1 153 | 22.3 % |
| all flags on | 1 153 / 1 153 | 100.0 % |
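
To make the "ASCII-fold variant" and recall columns concrete, the sketch below shows one common way to fold accented names to a plain form and how the recall percentages follow from the hit counts. The folding approach (Unicode NFKD plus dropping combining marks) is an assumption for illustration; it is not necessarily how Cairn or the benchmark generator folds text.

```python
import unicodedata


def ascii_fold(text: str) -> str:
    """Decompose accented letters, drop the combining marks, then case-fold."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def recall_pct(hits: int, total: int) -> float:
    """Top-1 recall as a percentage, rounded to one decimal like the table."""
    return round(100 * hits / total, 1)


folded = [ascii_fold(name) for name in ("Zürich", "Genève", "Neuchâtel")]
baseline = recall_pct(257, 1153)
fuzzy = recall_pct(865, 1153)
phonetic = recall_pct(1147, 1153)
```

Letters without a canonical decomposition, such as "ø", pass through this function unchanged, so a production folder typically adds an explicit mapping table on top.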

Numbers shift on different disks / kernels / host load. Re-run with ./run.sh <engine> from the benchmarks/ directory to validate on your own hardware. The full report including per-engine setup gotchas is at benchmarks/results/REPORT.md.

Continent scale — Europe

First continent-scale run. Host: 48 cores, 124 GB RAM, NVMe storage. Input: Geofabrik europe-latest.osm.pbf (32 GB), 7× larger than Germany.

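| Engine | Build | Peak build RSS | Bundle disk | Places | Hot RSS | p95 |
| --- | --- | --- | --- | --- | --- | --- |
| Cairn | 7 h 4 m | 94 GB | 12 GB | 25.45 M | 482 MB | 45 ms |
| Pelias | not run yet | | | | | |
| Photon | not run yet | | | | | |
| Nominatim | not run yet | | | | | |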

Cairn's pipeline on Europe runs in three macro-phases:

1. OSM ingest (4 h 28 m): pass 0 max-id scan → pass 1 flatnode write → pass 2a node-place emit → pass 2b0 relation way-ref scan → pass 2b1 parallel way-place emit (458 M ways scanned, 22.16 M emitted) → pass 2b2 admin relations (363 K polygons produced).
2. Admin enrichment (3 h 23 m): per-place point-in-polygon against the admin index in two passes, Place admin_path then AdminFeature admin_path.
3. Bundle write (~30 m): simplify polygons → write 9 172 admin tiles → sort 25.45 M places → emit 21 478 place tiles + 21 478 point tiles in one parallel pass → stream tantivy index from the just-written tile blobs → manifest + SBOM.

Bundle is 12 GB total: 1.3 GB place tiles, 4.0 GB spatial layers (admin + nearest), 6.6 GB tantivy text index, plus the manifest and CycloneDX SBOM. Peak build heap reached 94 GB during the place-tile + tantivy overlap; runtime hot RSS settles at 482 MB after the warmup window, mmap working set bounded by the touched portion of the bundle.

Runtime ab against representative queries on a single cairn-serve process bound to localhost:

| Endpoint | RPS | p50 | p95 | p99 |
| --- | --- | --- | --- | --- |
| /v1/search?q=Berlin | 1 289 | 38 ms | 47 ms | 64 ms |
| /v1/search?q=Paris | 1 323 | 37 ms | 45 ms | 66 ms |
| /v1/search?q=Madrid | 1 324 | 36 ms | 48 ms | 61 ms |
| /v1/reverse (basic) | 8 157 | 6 ms | 7 ms | 8 ms |

Reverse-geocode latency stays in single-digit ms thanks to the per-tile mmap'd admin polygon archive + sorted-edge PIP. Forward-search latency is markedly higher than in the country-scale runs (CH p95 0.63 ms, DE p95 1.08 ms); the bulk of that comes from tantivy's multi-segment index (the single-segment merge step was disabled after a build-time race with the LogMergePolicy at continent scale). Re-enabling a deterministic single-segment final state without the race is the next perf item.
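
Throughput and latency in the runtime table are tied together by Little's law (requests in flight = throughput × time in system), which helps when comparing the forward and reverse rows. The sketch below applies it to the reported figures. Two assumptions are made: p50 is used as a stand-in for mean latency, and the result is only an estimate of the load level, since the ab concurrency setting is not stated on this page.

```python
def implied_in_flight(rps: float, latency_ms: float) -> float:
    """Little's law: L = lambda * W, with W converted from milliseconds to seconds."""
    return rps * latency_ms / 1000.0


# (RPS, p50 in ms) copied from the runtime table.
rows = {
    "/v1/search?q=Berlin": (1289, 38),
    "/v1/search?q=Paris": (1323, 37),
    "/v1/search?q=Madrid": (1324, 36),
    "/v1/reverse (basic)": (8157, 6),
}

estimates = {ep: implied_in_flight(rps, p50) for ep, (rps, p50) in rows.items()}
for endpoint, in_flight in estimates.items():
    print(f"{endpoint}: about {in_flight:.0f} requests in flight")
```

All four rows land at a similar estimate, which suggests the RPS gap between forward search and reverse reflects per-request cost rather than a different load level.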

When to pick what

Pick Pelias when

Pick Photon when

Pick Nominatim when

Pick Cairn when

Honesty notes

Security posture

Trivy scans the production container image on every push, every PR, and once a day at 06:17 UTC. The latest CRITICAL/HIGH/MEDIUM findings are mirrored from the GitHub Code Scanning tab.

Triage policy and reporting: SECURITY.md.