This repository downloads Stadia satellite imagery tiles and uses them to create Label Studio tasks.

stadia-tile-creator

A Go CLI tool for downloading high-resolution satellite imagery from Stadia Maps, uploading tiles to S3-compatible object storage (Garage), and producing Label Studio task files for annotation.

Architecture

                          ┌──────────────────────────────────────┐
                          │         stadia-tile-creator           │
                          │                                      │
  Stadia API ──▶  download  ──▶  tiles/  +  tiles.jsonl         │
                          │        │                             │
                          │        ▼                             │
                          │  generate-tasks  ──▶  tasks.json     │
                          │        │                             │
                          │        ▼                             │
                          │      upload  ──▶  S3 bucket          │
                          │   (tiles + JSON files)               │
                          └──────────────────────────────────────┘
                                               │
                                               ▼
                                        Label Studio

Three pipeline stages (or four commands run manually):

| Stage | Input | Output |
| --- | --- | --- |
| download | Bounding box + zoom + API key | {output_dir}/{style}/{z}/{x}/{y}.{fmt} + tiles.jsonl |
| generate-tasks | tiles.jsonl + base URL | label-studio-tasks.json (JSON array) |
| upload | Local tile directory | Tiles, manifest, and task files in S3 bucket |

The manifest subcommand also exists for inspecting or rebuilding the manifest independently of a full pipeline run.

Prerequisites

  • Go 1.23+ (built and tested with Go 1.26)
  • Stadia Maps API key — stored in STADIA_MAPS_API_KEY env var
  • S3-compatible storage: Garage is the primary target; MinIO, Ceph RGW, and AWS S3 are also supported
  • S3 credentials — stored in AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY
  • Reverse proxy in front of Garage — required to add CORS headers for Label Studio's browser-based annotation interface (see Label Studio integration)

Installation

# Build from source
cd stadia-tile-creator
go build -o stadia-tile-creator ./cmd/stadia-tile-creator/

# Or via Nix
nix build

Quick start

# Set credentials
export STADIA_MAPS_API_KEY="your-key-here"
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export S3_ENDPOINT="https://your-garage.example.com"

# Run full pipeline for Visalia, CA at zoom 19
./stadia-tile-creator pipeline \
  --north 36.35021 --south 36.30616 \
  --west -119.35014 --east -119.26937 \
  --zoom 19 \
  --bucket stadia-tiles \
  --prefix visalia/zoom19/

This produces:

  • tiles/alidade_satellite/19/... — tile images on disk
  • tiles/tiles.jsonl — georeferencing manifest
  • tiles/label-studio-tasks.json — Label Studio task file
  • S3 objects at s3://stadia-tiles/visalia/zoom19/... — tiles + all JSON files

Typical workflow (manual commands)

The upload command now uploads tiles and all top-level JSON files (tiles.jsonl, label-studio-tasks.json). The recommended order is:

# 1. Download tiles from Stadia Maps
stadia-tile-creator download \
  --north 36.35021 --south 36.30616 \
  --west -119.35014 --east -119.26937 \
  --zoom 19 \
  --output-dir tiles/

# 2. Generate Label Studio task file (uses s3:// URL format)
stadia-tile-creator generate-tasks \
  --manifest tiles/tiles.jsonl \
  --base-url s3://stadia-tiles/visalia/zoom19/ \
  --output tiles/label-studio-tasks.json

# 3. Upload everything to S3 (tiles + manifest + task file)
stadia-tile-creator upload \
  --source tiles/ \
  --bucket stadia-tiles \
  --endpoint $S3_ENDPOINT \
  --prefix visalia/zoom19/

After step 3, Label Studio has everything it needs in S3:

  • s3://stadia-tiles/visalia/zoom19/alidade_satellite/19/{x}/{y}.jpg — tile images
  • s3://stadia-tiles/visalia/zoom19/tiles.jsonl — georeferencing manifest
  • s3://stadia-tiles/visalia/zoom19/label-studio-tasks.json — task file

Zoom level guidance

For pool-detection labeling, zoom 19 is the recommended level. Zoom 20 produces tiles that are too close for reasonable annotation (individual pools may span multiple tiles, breaking continuity). At zoom 19, a residential swimming pool typically fits within 2-4 tiles, which is ideal for bounding-box annotation.

If you omit --zoom (or pass --zoom 0), the tool auto-discovers the maximum available zoom via the Stadia API and uses that. You can override to force a specific zoom.
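The auto-discovery described above can be pictured as a descending probe. The following is a minimal sketch, not the tool's actual code: it assumes the probe starts at zoom 22 (the range mentioned in the integration testing guide) and that a hypothetical `tileExists` callback stands in for a real request to the Stadia API.

```go
package main

import (
	"errors"
	"fmt"
)

// maxProbeZoom is the first zoom level tried. The value 22 follows the
// "zoom 22 → 0" range mentioned in the testing guide; treat it as an assumption.
const maxProbeZoom = 22

// discoverMaxZoom walks zoom levels from high to low and returns the first
// level for which tileExists reports true. tileExists is a hypothetical
// callback; a real implementation would query the tile API.
func discoverMaxZoom(tileExists func(zoom int) (bool, error)) (int, error) {
	for z := maxProbeZoom; z >= 0; z-- {
		ok, err := tileExists(z)
		if err != nil {
			return 0, fmt.Errorf("probe zoom %d: %w", z, err)
		}
		if ok {
			return z, nil
		}
	}
	return 0, errors.New("no zoom level has tiles for this location")
}

func main() {
	// Fake probe: pretend imagery exists up to zoom 20.
	probe := func(zoom int) (bool, error) { return zoom <= 20, nil }
	z, err := discoverMaxZoom(probe)
	if err != nil {
		panic(err)
	}
	fmt.Println("max zoom:", z)
}
```

A linear scan costs at most 23 probes, which is consistent with the "few seconds" noted in the testing guide.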

Command reference

Every command accepts -h for detailed flags.

download

Download tiles for a geographic bounding box.

stadia-tile-creator download \
  --north 36.35021 --south 36.30616 \
  --west -119.35014 --east -119.26937 \
  --zoom 19 \
  --output-dir tiles/ \
  --style alidade_satellite \
  --format jpg \
  --concurrency 8

| Flag | Default | Description |
| --- | --- | --- |
| --north, --south, --east, --west | (required) | Bounding box in decimal degrees |
| --zoom | 0 | Zoom level (0 = auto-discover via Stadia API, 19 recommended) |
| --output-dir | tiles | Local output directory |
| --style | alidade_satellite | Stadia raster style |
| --format | jpg | Image format (jpg, png, webp) |
| --concurrency | 8 | Parallel download workers |
| --api-key | $STADIA_MAPS_API_KEY | Stadia Maps API key |

Output: Tiles written to {output-dir}/{style}/{z}/{x}/{y}.{fmt}. A tiles.jsonl manifest is also written to {output-dir}/tiles.jsonl.

Behavior: 404s (no tile at coordinate — common at bbox edges) are silently skipped. Rate limits (429) cause all workers to back off. Download progress is logged every 50 tiles. A summary of downloaded / skipped / failed is printed at completion.
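One way to read the behavior above is as a mapping from HTTP status to an action, plus a running summary. The sketch below is illustrative only; the `outcome` type, `classify`, and `summary` are hypothetical names, and the real worker pool and back-off logic are not shown.

```go
package main

import (
	"fmt"
	"net/http"
)

// outcome is a hypothetical label for what to do with one tile response.
type outcome int

const (
	downloaded  outcome = iota // 200: tile body can be written to disk
	skipped                    // 404: no tile at this coordinate
	rateLimited                // 429: all workers should back off, then retry
	failed                     // anything else
)

// classify maps an HTTP status code to an outcome.
func classify(status int) outcome {
	switch status {
	case http.StatusOK:
		return downloaded
	case http.StatusNotFound:
		return skipped
	case http.StatusTooManyRequests:
		return rateLimited
	default:
		return failed
	}
}

// summary mirrors the downloaded / skipped / failed counts printed at completion.
type summary struct{ Downloaded, Skipped, Failed int }

// record updates the counters. rateLimited is not counted because the
// request is expected to be retried after the back-off.
func (s *summary) record(o outcome) {
	switch o {
	case downloaded:
		s.Downloaded++
	case skipped:
		s.Skipped++
	case failed:
		s.Failed++
	}
}

func main() {
	var s summary
	for _, status := range []int{200, 404, 200, 500} {
		s.record(classify(status))
	}
	fmt.Printf("downloaded=%d skipped=%d failed=%d\n", s.Downloaded, s.Skipped, s.Failed)
}
```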

generate-tasks

Generate label-studio-tasks.json from a manifest. Must be run before upload so the task file is uploaded alongside the tiles.

stadia-tile-creator generate-tasks \
  --manifest tiles/tiles.jsonl \
  --base-url s3://stadia-tiles/visalia/zoom19/ \
  --output tiles/label-studio-tasks.json \
  --image-key image

| Flag | Default | Description |
| --- | --- | --- |
| --manifest | tiles.jsonl | Input manifest |
| --base-url | (required) | URL prefix for tile images. Use s3://bucket/prefix/ format for pre-signed URL mode |
| --output | label-studio-tasks.json | Output task file |
| --image-key | image | Key name for image URL in task data |
| --limit | 0 | Max tasks to include (0 = all) |
| --shuffle | false | Shuffle task order |

--base-url format: When using Garage with Label Studio's S3 storage connector, pass the URL in s3://bucket/prefix/ format (no domain). Label Studio will remap the s3:// prefix to an HTTPS URL and sign the request with its configured access key. See Label Studio integration for details.

upload

Upload tiles, the manifest (tiles.jsonl), and any top-level *.json / *.jsonl files (including label-studio-tasks.json) to S3.

stadia-tile-creator upload \
  --source tiles/ \
  --bucket stadia-tiles \
  --endpoint https://s3.example.com \
  --prefix visalia/zoom19/

| Flag | Default | Description |
| --- | --- | --- |
| --source | tiles | Local tile directory |
| --bucket | (required) | S3 bucket name |
| --endpoint | $S3_ENDPOINT | S3-compatible endpoint URL |
| --region | us-east-1 | Used for Signature V4 signing region |
| --signing-region | garage (for custom endpoints) | SigV4 signing region override |
| --prefix | "" | S3 key prefix |
| --concurrency | 8 | Parallel upload workers |
| --force-path-style | true | Path-style addressing (required for Garage) |
| --dry-run | false | Show what would be uploaded without doing it |

S3 key layout:

s3://{bucket}/{prefix}{style}/{z}/{x}/{y}.{fmt}     ← tiles
s3://{bucket}/{prefix}tiles.jsonl                    ← manifest
s3://{bucket}/{prefix}label-studio-tasks.json        ← task file
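
The layout above concatenates the prefix directly with the style, so the prefix needs its own trailing slash. A small sketch of the key construction, where `tileKey` is a hypothetical helper and not a function from this repository:

```go
package main

import "fmt"

// tileKey builds the object key for one tile following the documented
// layout {prefix}{style}/{z}/{x}/{y}.{fmt}. The prefix is used verbatim,
// so "visalia/zoom19/" must already end with a slash.
func tileKey(prefix, style string, z, x, y uint32, format string) string {
	return fmt.Sprintf("%s%s/%d/%d/%d.%s", prefix, style, z, x, y, format)
}

func main() {
	key := tileKey("visalia/zoom19/", "alidade_satellite", 19, 104857, 228476, "jpg")
	fmt.Println("s3://stadia-tiles/" + key)
}
```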

Object metadata written: Content-Type (image/jpeg, application/json), Cache-Control: max-age=31536000, immutable, plus custom x-amz-meta-* headers (z, x, y, north, south, east, west, crs, sha256) on tiles.

Behavior: Uploads tiles with concurrent workers. Then uploads all top-level JSON files in a second pass. Retries 5xx and network errors up to 3× with exponential backoff. 400/403/404 are fatal. Appends successful records to {source}/tiles.jsonl.
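
The retry policy above can be sketched as follows. This is an illustration under assumptions: "up to 3×" is read here as three retries after the first attempt, `statusError` is a hypothetical error type carrying the HTTP status, and any error without a status is treated as a network error.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// statusError is a hypothetical error type carrying an HTTP status code.
type statusError struct{ Code int }

func (e *statusError) Error() string { return fmt.Sprintf("http status %d", e.Code) }

// retryable reports whether an error is worth retrying: 5xx responses and
// errors without a status (assumed to be network errors) are retried,
// while 4xx responses such as 400/403/404 are treated as fatal.
func retryable(err error) bool {
	var se *statusError
	if errors.As(err, &se) {
		return se.Code >= 500
	}
	return true
}

// withRetry runs op once and then retries up to maxRetries times,
// doubling the delay between attempts.
func withRetry(maxRetries int, base time.Duration, op func() error) error {
	delay := base
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err = op(); err == nil || !retryable(err) {
			return err
		}
		if attempt < maxRetries {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}

func main() {
	calls := 0
	err := withRetry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return &statusError{Code: 503} // transient failure
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```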

manifest

Scan a local tile directory and generate tiles.jsonl. Normally not needed if you use download (which writes its own manifest), but useful for inspecting or rebuilding.

stadia-tile-creator manifest \
  --source tiles/ \
  --output tiles.jsonl

| Flag | Default | Description |
| --- | --- | --- |
| --source | tiles | Local tile directory |
| --output | tiles.jsonl | Output manifest path |

Behavior: Walks {source}/{style}/{z}/{x}/{y}.{fmt}, computes SHA-256 for each tile, derives geographic bounds via go-stadia's tile math. If --output already exists, it merges: new tiles are added, changed tiles (different SHA-256) are updated, unchanged tiles are left as-is. Output is sorted by z, y, x.
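
The merge and sort rules above might look like the sketch below. It is a simplified illustration, not the repository's code: `record` keeps only the fields needed for the merge, and tiles are keyed by (z, x, y) alone, which assumes a single style per manifest.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a trimmed-down manifest entry (assumption: one style per manifest).
type record struct {
	Z, X, Y uint32
	SHA256  string
}

// tileID identifies a tile within the manifest.
type tileID struct{ Z, X, Y uint32 }

// mergeRecords adds new tiles, replaces tiles whose SHA-256 changed,
// keeps unchanged tiles as-is, and returns the result sorted by z, y, x.
func mergeRecords(existing, scanned []record) []record {
	byID := make(map[tileID]record, len(existing)+len(scanned))
	for _, r := range existing {
		byID[tileID{r.Z, r.X, r.Y}] = r
	}
	for _, r := range scanned {
		id := tileID{r.Z, r.X, r.Y}
		if old, ok := byID[id]; ok && old.SHA256 == r.SHA256 {
			continue // unchanged: keep the existing record
		}
		byID[id] = r // new or changed
	}
	out := make([]record, 0, len(byID))
	for _, r := range byID {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if a.Z != b.Z {
			return a.Z < b.Z
		}
		if a.Y != b.Y {
			return a.Y < b.Y
		}
		return a.X < b.X
	})
	return out
}

func main() {
	existing := []record{{19, 2, 1, "aaa"}, {19, 1, 1, "bbb"}}
	scanned := []record{{19, 1, 1, "ccc"}, {19, 1, 0, "ddd"}}
	for _, r := range mergeRecords(existing, scanned) {
		fmt.Printf("z=%d y=%d x=%d sha256=%s\n", r.Z, r.Y, r.X, r.SHA256)
	}
}
```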

pipeline

Run all stages in sequence: download → generate-tasks → upload.

stadia-tile-creator pipeline \
  --north 36.35021 --south 36.30616 \
  --west -119.35014 --east -119.26937 \
  --zoom 19 \
  --bucket my-bucket \
  --prefix visalia/zoom19/

Accepts all flags from download, upload, and generate-tasks.

Behavior: Runs download → manifest → generate-tasks → upload. The task file is generated before the upload stage so it gets uploaded to S3 alongside the tiles. If tiles already exist on disk (from a previous run), download is skipped. If tiles are already uploaded (have a URL in the manifest), upload is skipped. This makes pipeline safe to restart after interruption.

serve

Serve a local tile directory over HTTP for local development and testing.

stadia-tile-creator serve \
  --dir tiles/ \
  --port 8080 \
  --cors

| Flag | Default | Description |
| --- | --- | --- |
| --dir | tiles | Directory to serve |
| --port | 8080 | HTTP listen port |
| --host | localhost | Bind address |
| --cors | false | Enable CORS headers |

Output files

tiles.jsonl — Georeferencing manifest

Newline-delimited JSON. One line per tile.

{"style":"alidade_satellite","z":19,"x":104857,"y":228476,"north":36.315,"south":36.312,"east":-119.285,"west":-119.289,"crs":"EPSG:3857","size":18432,"sha256":"e3b0c44...","url":"https://s3.example.com/visalia/zoom19/alidade_satellite/19/104857/228476.jpg"}

| Field | Type | Description |
| --- | --- | --- |
| style | string | Stadia raster style |
| z, x, y | uint | Tile coordinates |
| north, south, east, west | float64 | Geographic bounds (degrees) |
| crs | string | Coordinate reference system (EPSG:3857) |
| size | int64 | File size in bytes |
| sha256 | string | Hex-encoded SHA-256 |
| url | string | S3 URL (populated after upload) |
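
The north, south, east, and west fields can be derived from z, x, y with standard Web Mercator tile math. The sketch below uses the common slippy-map formulas; it is an assumption that go-stadia computes bounds the same way, and `tileBounds` is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// latFromY converts a (possibly fractional) tile row to latitude in degrees.
// n is the number of tiles per axis at the zoom level, i.e. 2^z.
func latFromY(y, n float64) float64 {
	return math.Atan(math.Sinh(math.Pi*(1-2*y/n))) * 180 / math.Pi
}

// tileBounds returns the geographic bounds of tile (z, x, y) in degrees.
// Row 0 is the northernmost row, so the tile's top edge is at y and its
// bottom edge at y+1.
func tileBounds(z, x, y uint32) (north, south, east, west float64) {
	n := math.Exp2(float64(z))
	west = float64(x)/n*360 - 180
	east = float64(x+1)/n*360 - 180
	north = latFromY(float64(y), n)
	south = latFromY(float64(y+1), n)
	return north, south, east, west
}

func main() {
	north, south, east, west := tileBounds(15, 5525, 12830)
	fmt.Printf("north=%.5f south=%.5f east=%.5f west=%.5f\n", north, south, east, west)
}
```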

label-studio-tasks.json — Label Studio task file

[
  {
    "data": {
      "image": "s3://stadia-tiles/visalia/zoom19/alidade_satellite/19/104857/228476.jpg",
      "z": 19,
      "x": 104857,
      "y": 228476,
      "north": 36.315,
      "south": 36.312,
      "east": -119.285,
      "west": -119.289,
      "crs": "EPSG:3857",
      "sha256": "e3b0c44..."
    }
  }
]

Label Studio integration

Architecture

                    ┌─────────────────┐
  Label Studio ────▶│  Reverse Proxy  │────▶  Garage (S3)
  (browser)         │  (CORS headers) │
                    └─────────────────┘

Garage does not natively emit CORS headers. Label Studio's browser-based annotation interface requires CORS headers on image responses. The recommended setup:

  1. Place a reverse proxy in front of Garage (nginx, Caddy, or similar) that adds the necessary Access-Control-Allow-* headers to tile responses.

  2. Configure Label Studio with an S3 cloud storage connection using its own Garage access key. Enable pre-signed URLs.

  3. Use s3:// format for --base-url when generating tasks:

    stadia-tile-creator generate-tasks \
      --base-url s3://stadia-tiles/visalia/zoom19/
    

    Label Studio maps this s3:// prefix to the HTTPS endpoint configured in its S3 storage connection and signs the resulting URLs with its access key.

Labeling configuration (Label Studio XML)

<View>
  <Header value="Tile ($z, $x, $y) — bounds $north $west to $south $east"/>
  <Image name="image" value="$image" zoom="true" zoomControl="true" rotateControl="true"/>
  <RectangleLabels name="pool" toName="image" showInline="true">
    <Label value="Swimming Pool" background="#00FF00"/>
    <Label value="Not a Pool" background="#FF0000"/>
  </RectangleLabels>
</View>

Task file management

Label Studio does not refresh task files from cloud storage. Once a task file has been imported, Label Studio will not re-read it even if the object later changes in the bucket.

To add new tasks (e.g., a new batch of tiles for a different region):

  1. Generate a uniquely-named task file:

    stadia-tile-creator generate-tasks \
      --manifest batch2/tiles.jsonl \
      --base-url s3://stadia-tiles/batch2/ \
      --output tiles/label-studio-tasks-batch2.json
    
  2. Upload it alongside the tiles:

    stadia-tile-creator upload --source tiles/ ... --prefix batch2/
    
  3. In Label Studio, configure the S3 cloud storage source to match a prefix pattern that captures all task files:

    Regex: label-studio-tasks.*\.json
    

    Label Studio will pick up any new files matching the pattern on the next sync.
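
Before relying on the filter, it can help to check the pattern against a few object keys. The sketch below uses Go's `regexp` package, whose `MatchString` searches anywhere in the key. Whether Label Studio anchors the pattern at the start of the key is not confirmed here; if it does, a leading `.*` may be needed when keys carry a prefix such as `batch2/`. The sample keys are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// taskFilePattern is the regex from the step above. In Go it matches
// anywhere in the string because it is not anchored.
var taskFilePattern = regexp.MustCompile(`label-studio-tasks.*\.json`)

// isTaskFile reports whether an object key looks like a task file.
func isTaskFile(key string) bool {
	return taskFilePattern.MatchString(key)
}

func main() {
	keys := []string{
		"visalia/zoom19/label-studio-tasks.json",
		"batch2/label-studio-tasks-batch2.json",
		"batch2/tiles.jsonl",
		"batch2/alidade_satellite/19/1/2.jpg",
	}
	for _, key := range keys {
		fmt.Printf("%-45s %v\n", key, isTaskFile(key))
	}
}
```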

Alternatively, import the file manually through the Label Studio UI (Projects → Import → Upload file) each time you add tasks.

Environment variables

| Variable | Used by | Description |
| --- | --- | --- |
| STADIA_MAPS_API_KEY | download, pipeline | Stadia Maps API key |
| AWS_ACCESS_KEY_ID | upload, pipeline | S3 access key |
| AWS_SECRET_ACCESS_KEY | upload, pipeline | S3 secret key |
| AWS_REGION | upload, pipeline | S3 region (default: us-east-1) |
| S3_ENDPOINT | upload, pipeline | S3 endpoint URL |
| S3_SIGNING_REGION | upload, pipeline | SigV4 signing region (default: garage for custom endpoints) |

Integration testing guide

What to test

The pipeline has three independent stages. Test each one in isolation before running the full pipeline.

1. Download

Test: Download a small bounding box (4-9 tiles) at a low zoom level.

# Small area at zoom 15 — should produce ~6 tiles
stadia-tile-creator download \
  --north 36.33 --south 36.31 \
  --west -119.30 --east -119.28 \
  --zoom 15 \
  --output-dir test-tiles/

Verify:

  • Tiles exist at test-tiles/alidade_satellite/15/{x}/{y}.jpg
  • Each tile is exactly 256×256 pixels (use identify or file)
  • Each tile is a valid JPEG (can be opened in an image viewer)
  • test-tiles/tiles.jsonl exists with one record per downloaded tile
  • Records contain non-empty sha256 and correct size
  • Geographic bounds are sensible (north > south, east > west)
  • Summary reports downloaded / skipped / failed counts
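
The size and format checks in this list can also be scripted instead of using identify or file. A minimal sketch using the standard library's image decoders; `checkTile` and `blankJPEG` are hypothetical helpers, and the in-memory JPEG only stands in for a downloaded tile.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/jpeg" // also registers the JPEG decoder used by image.DecodeConfig
)

// checkTile verifies that data is a JPEG of wantSize x wantSize pixels.
// Only the image header is decoded, so the check is cheap.
func checkTile(data []byte, wantSize int) error {
	cfg, format, err := image.DecodeConfig(bytes.NewReader(data))
	if err != nil {
		return fmt.Errorf("decode header: %w", err)
	}
	if format != "jpeg" {
		return fmt.Errorf("format is %q, want jpeg", format)
	}
	if cfg.Width != wantSize || cfg.Height != wantSize {
		return fmt.Errorf("size is %dx%d, want %dx%d", cfg.Width, cfg.Height, wantSize, wantSize)
	}
	return nil
}

// blankJPEG encodes an empty square image so the sketch runs without tile files.
func blankJPEG(size int) []byte {
	var buf bytes.Buffer
	img := image.NewRGBA(image.Rect(0, 0, size, size))
	if err := jpeg.Encode(&buf, img, nil); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	fmt.Println("256px tile:", checkTile(blankJPEG(256), 256))
	fmt.Println("128px tile:", checkTile(blankJPEG(128), 256))
}
```

For real tiles, the bytes would come from `os.ReadFile` on each path under test-tiles/alidade_satellite/15/.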

Test auto-zoom: Omit --zoom and verify it discovers the max available zoom. This probes zoom 22 → 0 and may take a few seconds.

2. Generate tasks

Test: Generate Label Studio task file.

stadia-tile-creator generate-tasks \
  --manifest test-tiles/tiles.jsonl \
  --base-url s3://test-bucket/test/zoom15/ \
  --output test-tiles/label-studio-tasks.json

Verify:

  • label-studio-tasks.json is a valid JSON array
  • Each task has data.image with an s3://bucket/prefix/... URL
  • Each task includes z, x, y, north, south, east, west, crs, sha256

Test with --image-key: Use --image-key photo and verify the image key in task data is "photo" instead of "image".

Test with --limit: Use --limit 3 and verify exactly 3 tasks are output.

Test with --shuffle: Run twice with --shuffle and verify the order differs.

3. Upload (dry-run)

Test: Verify against S3 without writing.

stadia-tile-creator upload \
  --source test-tiles/ \
  --bucket your-test-bucket \
  --endpoint $S3_ENDPOINT \
  --dry-run

Verify:

  • Prints the number of tiles that would be uploaded
  • No files are actually uploaded to S3

4. Upload (real)

Test: Upload to S3. This uploads tiles, manifest, and the task file.

stadia-tile-creator upload \
  --source test-tiles/ \
  --bucket your-test-bucket \
  --endpoint $S3_ENDPOINT \
  --prefix test/zoom15/

Verify:

  • Objects exist at s3://bucket/test/zoom15/alidade_satellite/15/{x}/{y}.jpg
  • Content-Type header is image/jpeg
  • Cache-Control header is max-age=31536000, immutable
  • Custom metadata (x-amz-meta-z, etc.) matches the tile coordinates
  • test/zoom15/tiles.jsonl exists in S3 with application/json content type
  • test/zoom15/label-studio-tasks.json exists in S3
  • tiles.jsonl now has url fields populated with S3 URLs
  • Upload progress is logged every 50 tiles

5. Manifest (standalone)

Test: Generate a manifest from an existing tile directory.

stadia-tile-creator manifest \
  --source test-tiles/ \
  --output test-tiles/tiles.jsonl

Verify:

  • tiles.jsonl is valid JSONL (one JSON object per line)
  • Running again produces the same file (idempotent)
  • Adding a new tile to the directory and re-running adds the new record
  • SHA-256 values match those from the download step
  • Records are sorted by z, y, x

6. Serve

Test: Serve tiles locally.

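stadia-tile-creator serve --dir test-tiles/ --port 8081 --cors &
# Tile 15/0/0 lies outside the test bounding box, so this request should return 404.
curl -I http://localhost:8081/alidade_satellite/15/0/0.jpg
# For the 200 case, substitute an x/y pair that exists under test-tiles/alidade_satellite/15/.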

Verify:

  • HTTP 200 for tiles that exist
  • HTTP 404 for tiles that don't
  • Access-Control-Allow-Origin: * header present (CORS enabled)
  • Tile served as image/jpeg

7. Pipeline (end-to-end)

Test: Run the full pipeline on a small area.

stadia-tile-creator pipeline \
  --north 36.33 --south 36.31 \
  --west -119.30 --east -119.28 \
  --zoom 15 \
  --bucket your-test-bucket \
  --endpoint $S3_ENDPOINT \
  --prefix test/full-pipeline/

Verify:

  • Download completes with summary (X downloaded, Y skipped, Z failed)
  • tiles.jsonl exists with correct records
  • label-studio-tasks.json exists with all tiles as tasks (uses s3:// URLs)
  • S3 objects exist at the expected keys with correct metadata
  • tiles.jsonl and label-studio-tasks.json are uploaded to S3 alongside tiles
  • Pipeline summary prints total tiles, MB, and elapsed time

Test resume: Kill the pipeline mid-download (Ctrl+C), then re-run the same command. Verify it skips already-downloaded and already-uploaded tiles.

8. Error handling

Test missing API key:

unset STADIA_MAPS_API_KEY
stadia-tile-creator download --north 36 --south 35 --west -119 --east -118

Should exit with a clear error about the missing API key.

Test invalid bbox: (north < south)

stadia-tile-creator download --north 35 --south 36 --west -119 --east -118

Should report 0 tiles and exit gracefully.
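
To see why an inverted bounding box yields zero tiles, and to estimate tile counts before a run, the bounding box can be converted to tile ranges with standard Web Mercator math. This is a sketch under assumptions: the tool may compute ranges differently, `tileCount` is a hypothetical helper, and boxes crossing the antimeridian are not handled.

```go
package main

import (
	"fmt"
	"math"
)

// lonToX returns the tile column containing a longitude at the given zoom.
func lonToX(lon float64, zoom uint) int {
	return int(math.Floor((lon + 180) / 360 * math.Exp2(float64(zoom))))
}

// latToY returns the tile row containing a latitude at the given zoom.
// Row numbers grow southward.
func latToY(lat float64, zoom uint) int {
	rad := lat * math.Pi / 180
	frac := (1 - math.Asinh(math.Tan(rad))/math.Pi) / 2
	return int(math.Floor(frac * math.Exp2(float64(zoom))))
}

// tileCount returns how many tiles intersect the bounding box, or 0 when
// the box is inverted (north < south or east < west).
func tileCount(north, south, east, west float64, zoom uint) int {
	if north < south || east < west {
		return 0
	}
	cols := lonToX(east, zoom) - lonToX(west, zoom) + 1
	rows := latToY(south, zoom) - latToY(north, zoom) + 1
	return cols * rows
}

func main() {
	// The small test box from the download test at zoom 15.
	fmt.Println("test box:", tileCount(36.33, 36.31, -119.28, -119.30, 15))
	// The inverted box from this section.
	fmt.Println("inverted box:", tileCount(35, 36, -118, -119, 15))
}
```

With these formulas, the small test box covers 2 columns by 3 rows, which agrees with the "~6 tiles" noted in the download test.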

Test bad S3 credentials:

AWS_ACCESS_KEY_ID=bad AWS_SECRET_ACCESS_KEY=bad \
stadia-tile-creator upload --source test-tiles/ --bucket test --endpoint $S3_ENDPOINT

Should fail with a non-retryable error (400/403) within 1-2 attempts (not retry 3×).

Test missing S3 bucket:

stadia-tile-creator upload --source test-tiles/ --bucket nonexistent --endpoint $S3_ENDPOINT

Should fail with a non-retryable error (404 NoSuchBucket).

Expected performance

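For a run of about 300 tiles (by standard Web Mercator tile math, the full Quick start bounding box for Visalia, CA at zoom 19 spans roughly 9,600 tiles, so scale these figures accordingly):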

| Stage | Expected |
| --- | --- |
| Download | 15-60 seconds depending on concurrency and network |
| Upload | 5-30 seconds depending on bandwidth to S3 |
| Generate tasks | < 1 second |
| Total pipeline | 1-3 minutes |

Cleanup

rm -rf test-tiles/
# Also delete test objects from S3:
aws s3 rm s3://your-test-bucket/test/ --recursive --endpoint-url $S3_ENDPOINT