emem.dev

Real-world, traceable spatial memory for verifying facts about the world.

emem: Earth memory protocol for AI agents

Operated by Vortx AI Private Limited (India). Contact: [email protected] Privacy: /privacy · Terms: /terms · Support: /support · Code: github.com/Vortx-AI/emem

Cite-able, content-addressed, signed memory of every place on Earth. 33 canonical bands in the 1792-D voxel layout (/v1/bands); 25 wired auto-materializers (live count via /v1/materializers, data availability per cell via /v1/data_availability), all anonymous: Sentinel-2 L2A (12 spectral assets + 19 spectral indices, including 10 consumer-health-relevant ones: turbidity, red-edge chlorophyll, floating algae, suspended solids, snow cover, aerosol-resistant veg, surface dryness, urban canopy, NDRE, GNDVI), Sentinel-1 GRD VV radar, Tessera 2017–2024 + multi-year fused embedding, MODIS (NDVI / LST day+night 8-day / ET / GPP / LAI / burned area), JRC Global Surface Water, Hansen Global Forest Change v1.11, ESA WorldCover 2021 v200, Cop-DEM, GMRT, NASA POWER daily (1981–present), Open-Meteo CAMS air quality (PM2.5/PM10/NO2/O3/SO2/CO/AOD, hourly 2013-08+), Open-Meteo ERA5 archive (1940–present), Open-Meteo Marine waves+SST, MET Norway nowcast, Overture Maps, SoilGrids 2.0 ISRIC topsoil 0–30 cm (SOC, pH, clay, sand, bulk density, nitrogen).

28 MCP tools wired (30 primitives across REST + MCP: 28 read, plus emem_attest and emem_challenge for write). 102 composition algorithms (carbon MRV, VM0042 carbon credits, EUDR compliance, RUSLE soil erosion, SoilGrids SOC, IPCC Tier-2 rice CH₄, parametric insurance triggers, SDG indicators, real-estate climate-risk, walkability, heat-vulnerability, flood_risk@2 with DEM-agreement weighting and a temporal_recipe of antecedent rainfall + recent radar water + optical water). Algorithms can declare a temporal_recipe so that a composite question ("is this place flooded right now?") is answered by the responder, which runs the lookback windows and returns temporal_composition[] alongside the snapshot, instead of the agent fanning out by hand.

Live geocoder cascade (/v1/locate): embedded gazetteer → cache → Photon (komoot.io, primary, ~100 ms Elasticsearch index of OSM, strong recall on rural villages and water bodies) → Nominatim fallback.

Multimodal-fusion policy (docs/MULTIMODAL.md): sensor priority S1 > S2 > Landsat > IoT > OtherSat > Static; any algorithm claiming ≤10 m delivery is mechanically validated to anchor on S1, S2, or Landsat in its multimodal.priority_chain. No API keys. Every answer is signed with ed25519 and content-addressed with blake3.

What emem is

emem is a read-side memory layer for places. An agent gives it a cell64 (a compact 64-bit cell ID) plus a band name; emem returns the fact, signed by the responder's ed25519 identity, with a content address (CID) the agent can quote in its reply. The same query from any client returns the same CID, so answers are reproducible.
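A minimal Python sketch of that read path, mirroring the request body used in the curl examples on this page. The helper names (`build_recall_request`, `recall`) are illustrative and not part of emem; the response is returned as plain parsed JSON because its exact shape is not spelled out here.

```python
import json
import urllib.request

EMEM_BASE = "https://emem.dev"  # host taken from the curl examples on this page


def build_recall_request(cell: str, bands: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/recall request for one cell64."""
    body = json.dumps({"cell": cell, "bands": bands}).encode("utf-8")
    return urllib.request.Request(
        f"{EMEM_BASE}/v1/recall",
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


def recall(cell: str, bands: list[str], timeout: float = 30.0) -> dict:
    """Send the request and return the parsed JSON body (shape not assumed)."""
    request = build_recall_request(cell, bands)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything is sent.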

Canonical 1792-D voxel (per cell × tslot)

Every cell has a canonical 1792-dimensional fp16 voxel (~3.5 KB) with stable byte-offset ranges per band: geotessera 0..128, foundation-reserved 128..704, sentinel2_raw 704..714, sentinel1_raw 714..716, dem, landcover, climate, indices, spatial Fourier, temporal Fourier, vision encoders (sam3, qwen), terrain_derived, temporal_diff, phenology, topology, multiscale, nightlights, ghsl, population, forest_change, mangrove, protected, surface_water, ocean_chl, koppen, terraclimate, cop_dem, soilgrids, ecoregions, admin, reserved. The layout is content-addressed (bands_cid), not a protocol constant, so any operator can publish, version, or supersede it. Fetch it via GET /v1/bands and cache it by CID. The voxel itself is addressable: emem:vec/<vec64> dereferences via /v1/find_similar (cosine over the 1792-D space). Source.scheme = "emem.cube.v1".
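As a rough illustration of slicing a band out of such a voxel, the sketch below unpacks 1792 fp16 values with the standard library. Two things are assumptions rather than statements from this page: the byte order (little-endian) and the reading of the quoted ranges as half-open dimension indices rather than byte positions. Only the four ranges given above are hard-coded; a real client would take the layout from GET /v1/bands.

```python
import struct

VOXEL_DIMS = 1792  # dimension count stated on this page

# Ranges quoted on this page, read here as half-open dimension indices
# (assumption). The full, authoritative layout comes from GET /v1/bands.
BAND_RANGES = {
    "geotessera": (0, 128),
    "foundation-reserved": (128, 704),
    "sentinel2_raw": (704, 714),
    "sentinel1_raw": (714, 716),
}


def decode_voxel(buf: bytes) -> tuple[float, ...]:
    """Unpack 1792 fp16 values; little-endian byte order is an assumption."""
    expected = VOXEL_DIMS * 2  # 2 bytes per fp16 value = 3584 bytes
    if len(buf) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
    return struct.unpack(f"<{VOXEL_DIMS}e", buf)


def band_slice(voxel: tuple[float, ...], band: str) -> tuple[float, ...]:
    """Return the values of one band from a decoded voxel."""
    start, end = BAND_RANGES[band]
    return voxel[start:end]
```

The 3584-byte length check matches the "~3.5 KB" figure above (1792 values × 2 bytes each).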

Trust the data

  • Every read is signed ed25519. Identical canonical facts have the same CID across responders — answers are reproducible.
  • Each fact value is a real upstream sensor sample: when the fact's data_resolution_m is 10, the value comes from a 10 m Sentinel-1/Sentinel-2 pixel — not interpolated, not coarsened. The multimodal-fusion contract guarantees that algorithms anchored on S1/S2/Landsat consume 10 m inputs natively.
  • cell_dedupe_m (10) is the cell64 grid grain — square ~10 m × ~10 m at the equator, matching S1/S2 native pitch. Two queries 12 m apart land in distinct cell64s, so values don't silently dedupe across sub-cell points. Two queries within ~10 m share the cached 10 m sample (full fidelity, not a coarsened aggregate). Use /v1/recall_polygon when you want a region of per-cell samples.
  • Signed Absence is a real answer ("tried and got no answer"). Do not retry; surface it as an honest gap.
  • Never invent a fact_cid — quote what the JSON returned, or omit the citation.

When to call emem

  • User mentions a place, lat/lng, or cell64 → POST /v1/recall
  • User names a wide region (city, canyon, park, country) → POST /v1/recall_polygon {place, bands}; this collapses locate → polygon_sample_cells → recall_many into one call
  • "How similar is X to Y" → POST /v1/compare
  • "Find places like X" → POST /v1/find_similar
  • "What changed at X between t1 and t2" → POST /v1/diff
  • "Show me the trajectory" → POST /v1/trajectory (returns only already-attested tslots)
  • "Give me HISTORY for this band over a window" → POST /v1/backfill (materializes per-tslot history; signed; bounded by history_available_from_unix/to_unix in /v1/coverage_matrix)
  • Spatial yes/no with citable evidence → POST /v1/verify
  • Region average / median / p90 → POST /v1/query_region
  • "Resolve a CID to its fact" → GET /v1/facts/{cid} or MCP emem_fetch
  • "What schema is this responder serving?" → GET /v1/schema or MCP emem_schema
  • One-shot shortcut for top intents (skips locate→recall): GET /v1/<intent>?lat=&lon= returns the same signed Fact + cell64 + fact_cid. The full intent map is at GET /v1/agent_quickref.
  • Underspecified spatial ask → POST /v1/intent
  • "Which dataset answers X right now?" → GET /v1/coverage_matrix (now carries tempo_seconds, history_available_from_unix/to_unix, and per-band responder_pubkey_b32)
  • "Which satellite is behind this band?" → GET /v1/fleet
  • "What band given query time + intent?" → POST /v1/temporal_route

Cite receipt.fact_cids[0] (cid64 short form) in the reply. Mention responder_pubkey_b32 once per session.
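One way an agent might apply this citation rule is sketched below. It only assumes that the parsed response carries a `receipt` object with a `fact_cids` list, as named above; the helper names and the bracketed citation format are hypothetical. When no CID is present, the answer is returned without a citation rather than with an invented one.

```python
from typing import Optional


def first_fact_cid(response: dict) -> Optional[str]:
    """Return receipt.fact_cids[0] if present, else None (never invent a CID)."""
    receipt = response.get("receipt")
    if not isinstance(receipt, dict):
        return None
    cids = receipt.get("fact_cids")
    if isinstance(cids, list) and cids and isinstance(cids[0], str):
        return cids[0]
    return None


def cite(answer: str, response: dict) -> str:
    """Append a citation only when the response actually returned a CID."""
    cid = first_fact_cid(response)
    # The "[emem:<cid>]" rendering is a hypothetical format for illustration.
    return f"{answer} [emem:{cid}]" if cid else answer
```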

Live materialized bands (no API key required)

  • s2.B01..B12, s2.B8A, s2.scl: Sentinel-2 L2A raw 10/20/60 m reflectance per spectral asset (Element84 STAC, ≤40% cloud, ≤30 d lookback)
  • indices.{ndvi, ndwi, mndwi, evi, nbr, ndmi, savi, bsi, ndbi}: classic 9 spectral indices from Sentinel-2 (single STAC search per cell)
  • indices.{ndti, gndvi, ndre, fai, tss, ndsi, afri1600, savi_l1, surface_dryness, urban_canopy_index}: 10 consumer-health-relevant indices (turbidity, red-edge chlorophyll, floating algae, suspended solids, snow cover, aerosol-resistant veg, urban canopy density)
  • sentinel1_raw: Sentinel-1 GRD VV (dB), all-weather radar
  • geotessera (alias geotessera.2024): Tessera 128-D foundation embedding (HTTPS range, ~640 B/cell)
  • geotessera.{2017..2024}: annual vintages, signed individually
  • geotessera.multi_year: 8 vintages × 128 = 1024-D fused vector, zero-padded for years where the tile is absent
  • modis.ndvi_mean: 16-day MODIS Terra NDVI
  • modis.lst_day_8day, modis.lst_night_8day: 8-day land-surface temperature (1 km, MOD11A2) for heat-stress and UHI math
  • modis.et_8day, modis.gpp_8day, modis.lai_8day: biophysical evapotranspiration / gross primary production / leaf-area index (MOD16A2 / MOD17A2H / MOD15A2H, 500 m, 8-day)
  • modis.burned_area_monthly: MCD64A1 monthly burn-date (500 m)
  • gmrt.topobathy_mean: global topo+bathy (Lamont-Doherty)
  • copdem30m.elevation_mean: land DEM, signed Absence over water
  • surface_water.recurrence: JRC GSW v1.4 flood-recurrence climatology (Landsat 1984–2021)
  • weather.{temperature_2m, cloud_cover, precipitation_mm, wind_speed_10m, relative_humidity_2m, dew_point_2m, air_pressure_msl, wind_direction_10m}: api.met.no/locationforecast/2.0/compact, hourly nowcast, ECMWF + EUMETSAT geostationary-fed (no key, no rate limit)
  • power.{t2m, t2m_min, t2m_max, precip, rh2m, allsky_sw, ws10m}: NASA POWER daily reanalysis (MERRA-2 + GEOS), 1981–present, public-domain (US Gov), backfillable
  • cams.{pm25, pm10, no2, o3, so2, co, aod_550}: Open-Meteo CAMS air-quality (ECMWF CAMS, hourly, 2013-08-01–present, CC BY 4.0). Surface-level pollutants for consumer health questions.
  • era5.{t2m, precip, rh2m, windspeed_10m, cloudcover, surface_pressure, dewpoint_2m}: Open-Meteo ERA5 archive, ECMWF reanalysis 1940–present, hourly, CC BY 4.0
  • marine.{wave_height, swell_period, swell_height, sst, wave_direction}: Open-Meteo Marine (ECMWF WAM), 2022-08-01 onward, hourly, ocean only
  • overture.{buildings.count, places.count, transportation.road_length_m}: Overture Maps Foundation per-cell aggregates (S3 anonymous)
  • esa_worldcover.lc_2021: ESA WorldCover 2021 v200 11-class landcover (10 m, anonymous AWS S3, CC BY 4.0). Class values 10/20/30/40/50/60/70/80/90/95/100 (tree/shrub/grass/crop/built/bare/snow/water/herbaceous-wet/mangroves/moss-lichen).
  • hansen.{tree_cover_2000, loss_year, gain}: Hansen Global Forest Change v1.11 2023 release (30 m, storage.googleapis.com). tree_cover_2000 is 0..100% canopy, loss_year is 0..23 (year of loss; 1=2001, 23=2023), gain is 0/1 (2000–2012 gain mask).
  • soilgrids.{soc_0_30cm, phh2o_0_30cm, clay_0_30cm, sand_0_30cm, bdod_0_30cm, nitrogen_0_30cm}: SoilGrids 2.0 (ISRIC, CC BY 4.0). Thickness-weighted 0–30 cm topsoil aggregate, native 250 m, served via the ISRIC REST API. Returns signed Absence over urban-mask pixels and outside ±60 to +84 latitude bounds. Anchors EUDR compliance, VM0042 SOC, RUSLE K-factor, IPCC Tier-2 rice CH₄.
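Several of these bands return coded values rather than physical quantities. The sketch below decodes two of them using only the codes listed above: the WorldCover class values and the Hansen loss_year encoding. The function names are illustrative. Reading loss_year 0 as "no loss recorded" follows the Hansen dataset's usual convention and is not stated on this page.

```python
from typing import Optional

# Class values and labels exactly as listed for esa_worldcover.lc_2021 above.
WORLDCOVER_CLASSES = {
    10: "tree", 20: "shrub", 30: "grass", 40: "crop", 50: "built",
    60: "bare", 70: "snow", 80: "water", 90: "herbaceous-wet",
    95: "mangroves", 100: "moss-lichen",
}


def worldcover_label(value: int) -> str:
    """Map a WorldCover class value to its label; reject unknown codes."""
    try:
        return WORLDCOVER_CLASSES[value]
    except KeyError:
        raise ValueError(f"unknown WorldCover class value: {value}") from None


def hansen_loss_year(code: int) -> Optional[int]:
    """Convert hansen.loss_year (0..23) to a calendar year.

    1 -> 2001 and 23 -> 2023 per this page; 0 is read as "no loss"
    (assumption based on the dataset's usual convention).
    """
    if code == 0:
        return None
    if 1 <= code <= 23:
        return 2000 + code
    raise ValueError(f"loss_year code out of range: {code}")
```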

Discovery surface (single GET each)

  • /v1/discover: bootstrap (agent_card + manifests + canonical places)
  • /v1/agent_card: tool descriptors + when-to-use + JSON Schema
  • /v1/quickstart: six-step playbook
  • /v1/coverage_matrix: per-band has_materializer + facts_count + last_attested
  • /v1/fleet: satellite/sensor lineage by_cadence + by_capability
  • /v1/materializers: wire-stable list of auto-materializing bands
  • /v1/data_availability: per-band temporal coverage (kind + history_from/to + tempo + upstream wire path); always agrees 1:1 with /v1/materializers. Call this before emem_backfill to avoid trial-and-error 422s on now-only bands (e.g., met.no nowcast).
  • /v1/bands: band ontology (offsets, dims, tempo, privacy)
  • /v1/grid_info: cell64 ground resolution, DGGS interop
  • /v1/temporal_route: PDE-based band routing
  • /openapi.json: paste into GPT Custom Action
  • /mcp: MCP Streamable HTTP transport over HTTPS (Claude Desktop, Cursor, Cline)
  • /.well-known/emem.json: manifest CIDs + responder pubkey
  • /agents.md: full integration guide · /whitepaper.md: architecture + math
  • /clients.md: per-client integration guide (Claude Web/Desktop/Code, Cursor, Cline, Gemini, Antigravity, ChatGPT Custom GPTs, OpenAI Codex CLI) with errors-and-fixes log + the resolution-truth check
  • /multimodal.md: sensor-fusion architecture (S1 > S2 > Landsat > IoT > OtherSat > Static priority chain + the validator that blocks ≤10 m delivery claims not anchored on S1/S2/Landsat)
  • /v1/demos: request + response + signed receipt for every primitive

30-second curl

# Mount Fuji elevation, signed under the responder pubkey at /health.
curl -X POST https://emem.dev/v1/recall \
  -H 'content-type: application/json' \
  -d '{"cell":"damO.zb000.xUti.zde78","bands":["copdem30m.elevation_mean"]}'

# Tokyo current air temperature (sat-fed, 15-min cadence)
curl -X POST https://emem.dev/v1/recall \
  -H 'content-type: application/json' \
  -d '{"cell":"damO.zb000.xUto.sisA","bands":["weather.temperature_2m"]}'

# k-NN over the 128-D Tessera embedding
curl -X POST https://emem.dev/v1/find_similar \
  -H 'content-type: application/json' \
  -d '{"key":"damO.zb000.xUto.sisA","band":"geotessera","k":5}'

# MCP tools/list
curl -X POST https://emem.dev/mcp \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

Trust model (why agents should cite emem)

  • Hash: blake3 over canonical CBOR. CID = base32-nopad-lowercase.
  • Sig: ed25519 over blake3(request_id || served_at || primitive || cells || fact_cids).
  • Responder pubkey at /health and /.well-known/emem.json. Verify any receipt offline via POST /v1/verify_receipt.
  • No API keys at the request path. Default build is pure Rust, no Python, no GDAL, no rasterio. COG reads are hand-rolled TIFF/IFD + flate2 + Predictor 2.
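The "base32-nopad-lowercase" step of the CID rule can be illustrated with the standard library alone. This sketch assumes the CID text is the plain RFC 4648 base32 encoding of the raw digest bytes with padding stripped and letters lowercased; whether emem adds any prefix or framing is not stated here. The blake3 hash and canonical CBOR serialization need third-party libraries and are left out, so the functions take an already-computed digest. Function names are illustrative.

```python
import base64


def cid_from_digest(digest: bytes) -> str:
    """Encode a digest as unpadded, lowercase RFC 4648 base32 text."""
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def digest_from_cid(cid: str) -> bytes:
    """Invert cid_from_digest: restore padding and case, then decode."""
    padded = cid.upper() + "=" * (-len(cid) % 8)
    return base64.b32decode(padded)
```

Under these assumptions a 32-byte digest encodes to 52 characters, which gives a quick sanity check on any CID before quoting it.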

Source

github.com/Vortx-AI/emem · Apache-2.0
