cairn.
v0.1.0 · beta
For organizations + institutions

Geocoding,
on your own terms.

Every map app, every delivery, every emergency dispatch, every property listing, every voter registration starts with the same question: "Where is this place?" Today most teams answer it by sending the address to Google's, Mapbox's, or HERE's servers, and accepting the cost, the lock-in, and the data exposure that come with it. Cairn is what you reach for when you'd rather not.

01

What is geocoding, exactly?

Geocoding is the translation layer between human-readable place descriptions ("Place de la Concorde, Paris", "8044 Zürich", "Heidelberg University Library") and machine-friendly location data (latitude / longitude, polygon shapes, postal codes). It comes in five flavors:

i.

Forward

Address or place name → coordinates + canonical record. "Vaduz Castle" → (47.142, 9.523), country = Liechtenstein, kind = castle.

ii.

Reverse

Coordinates → containing admin chain + nearest features. (47.142, 9.523) → "Vaduz, Oberland, Liechtenstein" + nearest road, POI, address.

iii.

Autocomplete

Partial input → ranked suggestions, keystroke-grain. The search box on every shipping form, every map UI, every store-locator page.

iv.

Structured

Field-by-field ("road = X, city = Y, country = Z") → best match. Used when the upstream system already separates components — payment forms, CRM imports, government records.

v.

Parsing

Free-text address → structured components (house_number, road, postcode, city). Pre-step for normalization, deduplication, validation.

Sounds boring. Powers everything from Uber's pickup field to a national census's address index. A geocoder that's slow, wrong, or unavailable is a geocoder that quietly costs the rest of the product roadmap.

02

Why it matters — by industry

Geocoding is invisible until it isn't. These are the moments where the choice between "send the address to Google" and "run the geocoder we own" stops being a tooling preference and starts being a strategic decision.

Logistics & delivery

Last-mile cost is dominated by stop-density × per-stop time. A geocoder that returns the wrong rooftop, or the closest crossroads instead of the building entrance, burns minutes per stop and fuels customer-support tickets. At fleet scale, 10 ms × 200 K calls/day ≈ 33 minutes of hot-path latency per dispatcher per day.
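The arithmetic behind the fleet-scale figure above, as a quick check. The 10 ms and 200 K values are the ones quoted in the text; the variable names are only illustrative.

```python
# Cumulative hot-path latency from the figures quoted above.
per_call_latency_ms = 10      # latency added by each geocoding call
calls_per_day = 200_000       # fleet-scale call volume

total_seconds = per_call_latency_ms * calls_per_day / 1000
total_minutes = total_seconds / 60

print(f"{total_seconds:.0f} s = {total_minutes:.1f} min per day")
```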

Public sector & emergency dispatch

911 / 112 dispatch needs a reverse geocode in under a second on a constrained network, against a vetted authoritative dataset (national address registry, not "whatever Google has"). Air-gapped or sovereignty-bounded networks can't call the public cloud. Cairn ships a single static binary that runs on the dispatch box.

Healthcare & insurance

Patient addresses, claim locations, provider directories: all subject to data-protection and data-residency rules (HIPAA, GDPR, Schrems II, sectoral). Sending an address to a US-hosted API for resolution can itself be the breach event. Self-hosted geocoding keeps PHI on-premise.

Real estate & property tech

Listings, comparables, parcel boundaries, school catchments, flood zones. Every overlay needs a coordinate. Pricing models depend on consistent geocoding across millions of historical records — outsourced APIs change ranking and coverage without notice, breaking time-series analysis.

Telecom & utilities

Coverage maps, outage notifications, service requests, smart meter assignment. Operators run their own GIS — the geocoder needs to integrate with internal datasets (cell towers, fiber routes, substation polygons) that no public API has ever heard of.

Retail & storefront search

Store locators, BOPIS pickup, curbside delivery. Every "find a store near me" is a forward + reverse geocode pair. High-traffic days scale per-call billing into surprise invoices that the CFO didn't sign off on.

Banking & risk

KYC address validation, AML location screening, fraud scoring, branch routing, mortgage underwriting (flood / fire / earthquake risk by parcel). Auditable, reproducible geocoding is a regulatory requirement — auditors want to know which version of which database returned which answer.

Aid & humanitarian

Refugee camp logistics, disaster response, public-health outreach. Often offline-first by necessity — bandwidth is scarce, the cloud is down, OSM is the only ground truth. A geocoder that runs on a rugged laptop in a tent is the difference between a delivered shipment and a stuck convoy.

Academia & research

Reproducibility demands a frozen geocoder version. A paper that says "we used Google's API in 2023" can't be replicated in 2026 because Google's index moved on. A versioned, content-hashed bundle pins the exact answers.

03

The hidden costs of an incumbent.

Per-call billing turns into surprise invoices.

Google's geocoding API is $5 / 1 000 calls. Mapbox is $0.75 / 1 000. Sounds small until autocomplete drives 200 calls per filled form. A retail Black Friday spike, a viral campaign, an integration bug, a misconfigured retry loop, and the bill finds you.

Cairn: zero per-call cost. The bill is the box you run it on.
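A rough sketch of how per-call billing compounds, using the list prices and the 200-calls-per-form figure quoted above. The monthly form volume is a hypothetical input, and free tiers and volume discounts are deliberately ignored.

```python
# Per-call billing sketch; prices are the per-1,000-call figures quoted above.
PRICE_PER_1000_USD = {"google": 5.00, "mapbox": 0.75}
CALLS_PER_FORM = 200  # autocomplete calls per filled form, as quoted above


def monthly_bill_usd(provider: str, forms_per_month: int) -> float:
    """Return the monthly geocoding bill for a given number of filled forms."""
    calls = forms_per_month * CALLS_PER_FORM
    return calls / 1000 * PRICE_PER_1000_USD[provider]


# Hypothetical volume: 50,000 filled forms in a month (10 million calls).
for provider in PRICE_PER_1000_USD:
    print(provider, f"${monthly_bill_usd(provider, 50_000):,.2f}")
```

At that hypothetical volume the same traffic costs $50,000 at the first price and $7,500 at the second; a retry loop that doubles the call count doubles both.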

Vendor lock-in eats negotiating leverage.

Once a thousand integrations call maps.googleapis.com, switching is a re-architecture. The vendor knows. Pricing can rise 30 % year over year, and you have ninety days to comply.

Cairn: open-source MIT / Apache-2.0. Fork it. Ship it. Ten years from now it still runs.

Data leaves the perimeter.

Every query — addresses of customers, patients, refugees, voters, suspects — gets sent over TLS to a third-party log you don't control. Subpoenas, breaches, government requests, internal misuse. "Just metadata" is a useful fiction until it isn't.

Cairn: queries never leave the host. The geocoder runs in your VPC, your cluster, your laptop.

Latency is a function of the WAN.

Even on a perfect day, a transatlantic round-trip is ~120 ms before the API does any work. Inside a datacenter with a local geocoder, p99 stays under a millisecond. The difference shows up in checkout abandonment, dispatcher reflex time, and conversion funnels.

Cairn: p99 = 0.74 ms on Switzerland-scale data (benchmark).

Reproducibility is non-existent.

Hosted APIs ship continuous updates. The geocode you got today is not necessarily the one you'll get in six months, and the vendor offers no version pin. For audit trails, scientific replication, and regulator submissions, that's disqualifying.

Cairn: every bundle ships with a blake3 manifest, an ed25519 signature, and a CycloneDX SBOM.

The dataset doesn't fit your domain.

Google's index doesn't know about your internal sites, non-public utility infrastructure, custom address aliases, or yet-to-be-published street names. Adding them is impossible.

Cairn: bundle merges OSM + WhosOnFirst + OpenAddresses + Geonames + your own CSV / PBF / WoF SPR.

04

What Cairn offers — operationally.

·

Single static binary

One file, one config flag, one bundle directory. Runs on a Mac, a Hetzner box, an AWS Fargate task, a k3s pod, a rugged ARM laptop. No Postgres, no Elasticsearch, no Java, no Node.

·

Airgap-first

The bundle is a directory you pre-build and copy in. No outbound network calls at runtime. Required for sovereign networks, classified environments, ships, embassies, disaster zones.

·

Sub-millisecond p99

Switzerland: p99 = 0.74 ms, peak 57 554 RPS, hot RSS 80 MB. Every query is mmap'd rkyv tile reads + tantivy term lookups — no daemon round-trip, no DB query plan.

·

Signed + auditable

Every bundle ships with a manifest.toml carrying blake3 of every artifact, an ed25519 detached signature, and a CycloneDX 1.5 SBOM. Pin a bundle ID in your audit trail; reproduce the exact answer years later.

·

Hot reload

POST /admin/reload swaps to a new bundle without a process restart. Atomic ArcSwap-backed indices, in-flight queries finish on the previous snapshot. K8s rolling updates without rolling.
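The snapshot-swap idea can be illustrated with a small Python analogy. This is not Cairn's code: ArcSwap is a lock-free Rust primitive, while the sketch below uses a plain lock, and the `SnapshotHolder` class and the bundle dictionaries are hypothetical. It only shows why an in-flight query keeps its old snapshot after a reload.

```python
import threading


class SnapshotHolder:
    """Holds the current index snapshot; readers never see a half-swapped state."""

    def __init__(self, snapshot):
        self._lock = threading.Lock()
        self._snapshot = snapshot

    def load(self):
        # A query grabs a reference once and uses it for its whole lifetime.
        with self._lock:
            return self._snapshot

    def swap(self, new_snapshot):
        # Reload replaces the reference; the old snapshot stays alive
        # for as long as any in-flight query still holds it.
        with self._lock:
            old, self._snapshot = self._snapshot, new_snapshot
        return old


holder = SnapshotHolder({"bundle": "v1"})
in_flight = holder.load()          # query starts on the v1 snapshot
holder.swap({"bundle": "v2"})      # reload happens mid-query
print(in_flight["bundle"], holder.load()["bundle"])
```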

·

Federation

cairn-serve --bundles a/,b/,c/ shards a planet across continents without standing up multiple processes. Search queries fan out and merge by score; point-in-polygon (PIP) lookups fan out and merge finest-first.
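A minimal sketch of a merge-by-score step, assuming each bundle returns its hits already sorted by descending score. The hit dictionaries, the `score` field name, and the `merge_by_score` helper are hypothetical; this illustrates the idea rather than Cairn's internal merge.

```python
import heapq
from itertools import islice


def merge_by_score(per_bundle_hits, limit=10):
    """Merge result lists that are each sorted by descending score."""
    merged = heapq.merge(
        *per_bundle_hits, key=lambda hit: hit["score"], reverse=True
    )
    return list(islice(merged, limit))


# Hypothetical hits from two bundles.
bundle_a = [{"gid": "osm:way:1", "score": 0.9}, {"gid": "osm:way:2", "score": 0.4}]
bundle_b = [{"gid": "osm:way:3", "score": 0.7}]

top = merge_by_score([bundle_a, bundle_b], limit=2)
print([hit["gid"] for hit in top])
```

Because the merge is lazy, only the first `limit` hits are ever pulled from the per-bundle lists.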

·

Custom data merge

Bundles ingest OSM + WhosOnFirst + OpenAddresses + Geonames out of the box. Add your own CSV (internal facilities, non-public infrastructure, custom aliases) and re-build — it's one cairn-build build away.

·

Time-aware queries

?valid_at=YYYY filters by OSM start_date / end_date. Historical geocoding for archives, journalism, longitudinal research. No incumbent ships this.
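One way to picture the filter, as a sketch under stated assumptions: years are plain integers, both bounds are inclusive, and a missing start_date or end_date is treated as open-ended. Real OSM date tags can be more detailed than a year, and the page does not specify how Cairn handles those cases; `valid_in_year` is a hypothetical helper.

```python
from typing import Optional


def valid_in_year(year: int, start: Optional[int], end: Optional[int]) -> bool:
    """True if a feature existed in `year`; a missing bound is open-ended."""
    return (start is None or start <= year) and (end is None or year <= end)


# Hypothetical feature that existed from 1890 to 1965.
print(valid_in_year(1920, 1890, 1965), valid_in_year(1990, 1890, 1965))
```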

·

Familiar HTTP API

Standard REST shape: /v1/search, /v1/reverse, /v1/structured, ?text=, ?focus.point.lon. Each endpoint returns a clean JSON envelope you can hit with curl, Postman, or any HTTP client. Teams already running an open-source geocoder typically migrate with a DNS swap.
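A small sketch of building a /v1/search request URL with the standard library. The host and port are assumptions, and `focus.point.lat` is assumed by symmetry with the `focus.point.lon` parameter named above; parameters for the other endpoints are not spelled out on this page, so they are not shown.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # assumption: wherever cairn-serve listens


def search_url(text: str, focus: Optional[Tuple[float, float]] = None) -> str:
    """Build a forward-geocoding URL for /v1/search."""
    params = {"text": text}
    if focus is not None:
        lat, lon = focus
        params["focus.point.lat"] = lat  # assumed counterpart of focus.point.lon
        params["focus.point.lon"] = lon
    return f"{BASE_URL}/v1/search?{urlencode(params)}"


print(search_url("Vaduz Castle", focus=(47.142, 9.523)))
```

The resulting string can be passed to curl or any HTTP client.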

·

Stable bookmark IDs

Every Place ships a global identifier (gid) of shape <source>:<type>:<id> (osm:way:12345, wof:locality:101748479) that survives bundle rebuilds and federation hops. Save a record once; reload it years later via /v1/place?ids=… — same identifier, same answer. Audit trails, share links, and cross-system joins keep working across releases.
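Since the gid has a fixed three-part shape, client code can split it before storing or joining on it. A minimal sketch: the `Gid` tuple and `parse_gid` helper are hypothetical names, and splitting on the first two colons is an assumption about how an id that itself contains a colon should be treated.

```python
from typing import NamedTuple


class Gid(NamedTuple):
    source: str  # e.g. "osm", "wof"
    kind: str    # e.g. "way", "locality"
    ident: str   # source-specific identifier, kept as a string


def parse_gid(gid: str) -> Gid:
    """Split a <source>:<type>:<id> identifier into its three parts."""
    parts = gid.split(":", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <source>:<type>:<id>, got {gid!r}")
    return Gid(*parts)


print(parse_gid("osm:way:12345"))
print(parse_gid("wof:locality:101748479").kind)
```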

·

Building-precision addressing

Microsoft Building Footprints (~1.4 B polygons globally), layered alongside addresses, give rooftop-level resolution where OSM is sparse. Reverse-geocode a coordinate and return the actual building it sits inside, not just the nearest address point.

05

Where Cairn fits.

If you are… | Cairn is the right fit when…
--- | ---
Government / EU institution | Data sovereignty + air-gap + reproducibility + auditable supply chain are non-negotiable.
Bank / insurer | KYC, AML, claims, and risk pipelines need on-premise geocoding under regulatory audit (BCBS 239, Solvency II, GDPR Art. 32).
Healthcare provider | PHI may not transit a third-party API; provider directories + patient outreach need accurate, local-authority geocodes.
Logistics / fleet operator | Per-call API costs are eating margin and you've already maxed out caching.
Telecom / utility | Geocoder must merge with internal GIS (cell towers, fiber, substations) that public APIs don't know about.
Aid / humanitarian org | Operating offline-first in disaster zones with OSM as the ground truth.
Researcher | Need a frozen, cite-able geocoder version that returns the same answer in five years.
SaaS engineering team | Auto-completion + checkout flow latency matters and the API bill is climbing.
OSS / hobby / indie | Just want a good geocoder that runs on your laptop.

06

Where to start.