GeoJSON Sequence Streaming

High-performance, chunked binary streaming for geospatial features using the Apache Arrow-based GeoJSON Sequence Format (GSF).

OGC Standard · Stable

Overview

Traditional GeoJSON is human-readable but inefficient for large datasets and real-time pipelines. GeoJSON Sequence Streaming delivers features as compressed binary chunks over HTTP, reducing payload size by up to 70% while maintaining full OGC compliance.

The format leverages Apache Arrow's columnar structure to enable zero-copy deserialization, making it ideal for:

  • Real-time telemetry & IoT sensor networks
  • High-frequency financial geospatial data
  • Large-scale vector tile generation
  • Server-to-server feature sync pipelines

Specification & HTTP Headers

Streaming requires specific headers to negotiate chunked transfer and binary encoding. The server pushes Arrow batches wrapped in a GSF container.

Request Headers

Header             Description              Example
Accept             Request GSF format       application/x-geojson-seq
Transfer-Encoding  Enable chunked delivery  chunked
Accept-Encoding    Compression support      gzip, br, zstd

Response Headers

Header                Description            Example
Content-Type          GSF binary stream      application/x-geojson-seq
X-GSF-Features-Total  Estimated total count  124580
X-GSF-Batch-Size      Features per chunk     1000
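As a rough illustration of how a client might use the two X-GSF response headers, the sketch below estimates how many batches a stream will deliver. The function name `expected_batches` is hypothetical, and the plain `dict` stands in for the case-insensitive header mapping a real HTTP client would provide.

```python
import math
from typing import Mapping, Optional


def expected_batches(headers: Mapping[str, str]) -> Optional[int]:
    """Estimate the number of batches in a stream, or None if unknown."""
    total = headers.get("X-GSF-Features-Total")
    batch_size = headers.get("X-GSF-Batch-Size")
    if total is None or batch_size is None:
        return None
    size = int(batch_size)
    if size <= 0:
        return None
    # Round up: a final, partially filled batch still counts as one batch.
    return math.ceil(int(total) / size)


# Example values taken from the response header table.
headers = {"X-GSF-Features-Total": "124580", "X-GSF-Batch-Size": "1000"}
print(expected_batches(headers))  # 125
```

Because the total is documented as an estimate, the result is better suited to progress reporting than to deciding when the stream has ended.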
💡 Protocol Note

The stream follows HTTP/1.1 chunked transfer encoding. Each chunk contains a self-describing Arrow IPC message with schema metadata in the initial batch.
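To make the framing concrete, here is a minimal sketch of how HTTP/1.1 chunked transfer encoding delimits a body: each chunk is a hexadecimal size line, the payload, and a CRLF, and a zero-size chunk ends the stream. HTTP clients normally undo this framing for you, so the hypothetical `decode_chunked` helper is for illustration only, and the payloads are placeholder bytes rather than real Arrow IPC messages.

```python
from typing import List


def decode_chunked(raw: bytes) -> List[bytes]:
    """Split an HTTP/1.1 chunked message body into its chunk payloads."""
    chunks = []
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # A size line may carry extensions after ';'; only the size is used here.
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; any trailer fields after it are ignored
        chunks.append(raw[pos:pos + size])
        pos += size + 2  # skip the payload and its trailing CRLF
    return chunks


body = b"7\r\nbatch-1\r\n8\r\nbatch-02\r\n0\r\n\r\n"
print(decode_chunked(body))  # [b'batch-1', b'batch-02']
```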

Implementation Examples

Use the following examples to request and consume GeoJSON Sequence streams in your application.

    # Shell: request a stream with curl
    curl -X GET \
      -H "Accept: application/x-geojson-seq" \
      -H "Transfer-Encoding: chunked" \
      "https://api.geoserver.com/v1/layers/realtime-sensors/features?stream=true" \
      --output sensor_stream.gsf

    // JavaScript: Fetch & Process Stream
    const response = await fetch('/v1/layers/sensors/features?stream=true', {
      headers: { Accept: 'application/x-geojson-seq' }
    });
    if (!response.ok) throw new Error(`Stream request failed: ${response.status}`);

    // Read chunks using ReadableStream
    const reader = response.body.getReader();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      const features = GSF.parseChunk(value); // Binary deserializer
      processBatch(features);
    }

    # Python: Stream Processing with aiohttp & pyarrow
    import asyncio

    import aiohttp
    import geojson_sequence as gsf

    URL = "https://api.geoserver.com/v1/layers/realtime-sensors/features?stream=true"

    async def main():
        async with aiohttp.ClientSession() as session:
            # aiohttp needs an absolute URL unless the session sets a base_url
            async with session.get(URL, headers={"Accept": "application/x-geojson-seq"}) as resp:
                resp.raise_for_status()
                async for chunk in resp.content.iter_chunked(1024):
                    table = gsf.decode_stream(chunk)
                    # Assumes to_geopandas() returns a GeoDataFrame; iterate its rows
                    for feature in table.to_geopandas().itertuples():
                        analyze_feature(feature)  # your own handler

    asyncio.run(main())

Performance Benchmarks

The benchmarks compare the three formats on 1M point features with 12 attributes each. All tests ran on AWS r5.2xlarge instances.

Metric             Traditional GeoJSON  GeoJSON Sequence  GeoParquet
Payload Size       284 MB               86 MB (↓69%)      72 MB
Parse Time (CPU)   1.8 s                0.4 s (↓77%)      0.6 s
Memory Footprint   620 MB               115 MB            98 MB
Streaming Latency  Block-based          Sub-10 ms chunks  File-based
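The percentage reductions in the table can be checked against the raw figures with a few lines of arithmetic. The helper name `reduction_percent` is only illustrative.

```python
def reduction_percent(baseline: float, value: float) -> float:
    """Relative reduction of `value` against `baseline`, in percent."""
    return (baseline - value) / baseline * 100


payload = reduction_percent(284, 86)  # payload size in MB
parse = reduction_percent(1.8, 0.4)   # parse time in seconds
print(f"payload: {payload:.1f}%  parse: {parse:.1f}%")
# payload: 69.7%  parse: 77.8%
```

The table's ↓69% and ↓77% match these values truncated to whole percentages.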
⚠️ Considerations

While GSF excels in streaming and incremental parsing, GeoParquet remains superior for offline analytics and columnar querying. Choose GSF for real-time pipelines and GeoParquet for batch processing.

FAQ & Troubleshooting

Q: Does GeoServer support partial stream resumption?

Yes. Use the Range: bytes=start- header alongside the GSF format. The server tracks chunk boundaries via the X-GSF-Chunk-ID trailer.
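A client-side sketch of resumption, assuming the client has counted the bytes it has already stored: it builds the headers for the follow-up request. The name `resume_headers` is hypothetical, and how the X-GSF-Chunk-ID trailer maps to byte offsets is not covered here.

```python
from typing import Dict

GSF_MEDIA_TYPE = "application/x-geojson-seq"


def resume_headers(bytes_received: int) -> Dict[str, str]:
    """Build request headers that resume a stream after `bytes_received` bytes."""
    if bytes_received < 0:
        raise ValueError("bytes_received must not be negative")
    headers = {"Accept": GSF_MEDIA_TYPE}
    if bytes_received > 0:
        # Byte offsets are zero-based, so the next byte wanted is at this index.
        headers["Range"] = f"bytes={bytes_received}-"
    return headers


print(resume_headers(4096))
# {'Accept': 'application/x-geojson-seq', 'Range': 'bytes=4096-'}
```

In standard HTTP, a 206 Partial Content response indicates the range was honored. A plain 200 means the server is sending the stream from the beginning, so the client should discard what it had.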

Q: How do I handle schema evolution in live streams?

GSF embeds schema metadata in every batch header. Clients should verify the schema_version field before deserializing. Breaking changes trigger a 410 Gone with migration guidance.
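One way a client might perform that check is sketched below. It assumes the batch header metadata has already been decoded into a plain dictionary. The `SchemaGuard` and `SchemaChanged` names are hypothetical, and the rule used here (any change of version stops processing) is a conservative placeholder, not behavior defined by GSF.

```python
from typing import Mapping, Optional


class SchemaChanged(Exception):
    """Raised when a batch reports a different schema_version mid-stream."""


class SchemaGuard:
    """Remembers the first schema_version seen and rejects any later change."""

    def __init__(self) -> None:
        self._version: Optional[str] = None

    def check(self, metadata: Mapping[str, str]) -> None:
        version = metadata.get("schema_version")
        if version is None:
            raise KeyError("batch metadata has no schema_version field")
        if self._version is None:
            self._version = version  # first batch sets the expected version
        elif version != self._version:
            raise SchemaChanged(f"schema_version changed: {self._version} -> {version}")


guard = SchemaGuard()
guard.check({"schema_version": "1"})  # first batch: accepted
guard.check({"schema_version": "1"})  # same version: accepted
try:
    guard.check({"schema_version": "2"})
except SchemaChanged as exc:
    print(exc)  # schema_version changed: 1 -> 2
```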

Q: Is compression applied automatically?

Compression is negotiated via Accept-Encoding. We recommend zstd or brotli for optimal CPU/memory tradeoffs. Uncompressed chunks are available for debugging.
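The negotiation can be sketched as building an Accept-Encoding value in preference order. The helper name and the q-value spacing are illustrative choices. `identity` is the standard HTTP token for "no content coding", which is how a client would ask for uncompressed chunks while debugging.

```python
from typing import Sequence


def accept_encoding(preferred: Sequence[str] = ("zstd", "br", "gzip"),
                    debug: bool = False) -> str:
    """Build an Accept-Encoding header value, most preferred coding first."""
    if debug:
        return "identity"  # ask for uncompressed chunks
    parts = []
    for rank, name in enumerate(preferred):
        # Lower the q-value by 0.1 per rank so the order is explicit.
        q = max(1.0 - 0.1 * rank, 0.1)
        parts.append(name if rank == 0 else f"{name};q={q:.1f}")
    return ", ".join(parts)


print(accept_encoding())            # zstd, br;q=0.9, gzip;q=0.8
print(accept_encoding(debug=True))  # identity
```

A client should advertise only the codings it can actually decode, since the server may pick any of them.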

Q: Can I filter features during streaming?

Server-side filtering supports OGC CQL2 expressions. Add a filter=cql:expression query parameter to your request, joining it with & when the URL already has a query string such as ?stream=true. Note that complex spatial predicates may increase initial chunk latency.
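A small sketch of attaching the filter to an existing stream URL with the standard library, so the expression is percent-encoded and existing parameters are kept. The `with_filter` name and the `temperature > 30` expression (including the `temperature` property) are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter(url: str, expression: str) -> str:
    """Return `url` with a filter=cql:<expression> query parameter appended."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("filter", f"cql:{expression}"))
    # urlencode percent-encodes spaces and operators in the expression.
    return urlunsplit(parts._replace(query=urlencode(query)))


base = "https://api.geoserver.com/v1/layers/realtime-sensors/features?stream=true"
print(with_filter(base, "temperature > 30"))
```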