GeoJSON Sequence Streaming
High-performance, chunked binary streaming for geospatial features using the Apache Arrow-based GeoJSON Sequence Format (GSF).
Status: OGC Standard, Stable
Overview
Traditional GeoJSON is human-readable but inefficient for large datasets and real-time pipelines. GeoJSON Sequence Streaming delivers features as compressed binary chunks over HTTP, reducing payload size by up to 70% while maintaining full OGC compliance.
The format leverages Apache Arrow's columnar structure to enable zero-copy deserialization, making it ideal for:
- Real-time telemetry & IoT sensor networks
- High-frequency financial geospatial data
- Large-scale vector tile generation
- Server-to-server feature sync pipelines
Specification & HTTP Headers
Streaming requires specific headers to negotiate chunked transfer and binary encoding. The server pushes Arrow batches wrapped in a GSF container.
Request Headers
| Header | Description | Example |
|---|---|---|
| Accept | Request GSF format | application/x-geojson-seq |
| Transfer-Encoding | Enable chunked delivery | chunked |
| Accept-Encoding | Compression support | gzip, br, zstd |
Response Headers
| Header | Description | Example |
|---|---|---|
| Content-Type | GSF binary stream | application/x-geojson-seq |
| X-GSF-Features-Total | Estimated total count | 124580 |
| X-GSF-Batch-Size | Features per chunk | 1000 |
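The two X-GSF response headers can be combined into a rough progress estimate. The following is a minimal sketch, assuming the headers arrive as a plain mapping with the exact names shown in the table; the function name `expected_batches` is hypothetical and not part of any GSF client library. Because X-GSF-Features-Total is only an estimate, the result should be treated as a hint rather than a guarantee.

```python
from typing import Mapping, Optional


def expected_batches(headers: Mapping[str, str]) -> Optional[int]:
    """Estimate how many batches a stream will deliver.

    Returns None when either header is missing or the batch size is not
    a positive integer. Header lookup here is case-sensitive; a real HTTP
    client usually exposes a case-insensitive mapping instead.
    """
    total = headers.get("X-GSF-Features-Total")
    size = headers.get("X-GSF-Batch-Size")
    if total is None or size is None:
        return None
    total_count, batch_size = int(total), int(size)
    if batch_size <= 0 or total_count < 0:
        return None
    # Integer ceiling division: a final partial batch still counts as one.
    return (total_count + batch_size - 1) // batch_size


example = {"X-GSF-Features-Total": "124580", "X-GSF-Batch-Size": "1000"}
print(expected_batches(example))
```

With the example values from the table (124580 features, 1000 per chunk), the last batch would be a partial one.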
The stream uses HTTP/1.1 chunked transfer encoding. Each chunk contains a self-describing Arrow IPC message, and the initial batch carries the schema metadata.
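To make the framing concrete, here is a minimal sketch of an HTTP/1.1 chunked-transfer decoder written against a raw byte stream. It is illustrative only: most HTTP clients (including Python's `http.client`) remove chunk framing transparently, so code like this is needed only when reading from a raw socket. The function name `iter_chunks` is hypothetical, and the sketch does not parse the Arrow IPC payload inside each chunk.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the payload of each HTTP/1.1 chunk until the zero-size chunk."""
    while True:
        size_line = stream.readline()
        if not size_line:
            raise EOFError("stream ended before the terminating chunk")
        # The chunk size is hexadecimal; anything after ';' is a chunk
        # extension, which this sketch ignores.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            # Skip the optional trailer section, which ends at a blank line.
            while stream.readline() not in (b"\r\n", b"\n", b""):
                pass
            return
        data = stream.read(size)
        if len(data) != size:
            raise EOFError("chunk was truncated")
        stream.read(2)  # consume the CRLF that follows the chunk data
        yield data


raw = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
for payload in iter_chunks(raw):
    print(len(payload), payload)
```

In a real client, each yielded payload would be handed to an Arrow IPC reader rather than printed.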
Implementation Examples
Use the following examples to request and consume GeoJSON Sequence streams in your application.
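The sketch below shows one way to request a stream and read its body using only the Python standard library. Several things are assumptions rather than documented behaviour: the endpoint URL is a placeholder, the helper names `build_request` and `iter_body` are hypothetical, and only gzip is requested because the standard library can decode it without extra packages (br and zstd would need third-party modules). The Transfer-Encoding request header from the table is deliberately not set here, since in HTTP/1.1 it describes the request body and a GET has none; add it if your server requires it.

```python
import urllib.request
import zlib
from typing import Iterator

GSF_MEDIA_TYPE = "application/x-geojson-seq"  # media type from the header tables


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for a gzip-compressed GSF stream."""
    return urllib.request.Request(
        url,
        headers={"Accept": GSF_MEDIA_TYPE, "Accept-Encoding": "gzip"},
    )


def iter_body(response, block_size: int = 65536) -> Iterator[bytes]:
    """Yield body bytes, gunzipping incrementally when the server used gzip.

    `response` is any object with a `headers` mapping and a `read(n)` method,
    such as the object returned by urllib.request.urlopen.
    """
    encoding = (response.headers.get("Content-Encoding") or "").lower()
    # wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(zlib.MAX_WBITS | 16) if encoding == "gzip" else None
    while True:
        block = response.read(block_size)
        if not block:
            break
        data = decoder.decompress(block) if decoder else block
        if data:
            yield data
    if decoder:
        tail = decoder.flush()
        if tail:
            yield tail


if __name__ == "__main__":
    # Placeholder URL: replace with a real GSF endpoint before running.
    request = build_request("https://example.com/collections/demo/items")
    with urllib.request.urlopen(request) as response:
        received = sum(len(part) for part in iter_body(response))
        print(f"received {received} bytes of GSF data")
```

`urllib` removes chunked framing on its own, so `iter_body` sees a continuous byte stream; splitting that stream into Arrow IPC messages is left to an Arrow reader.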
Performance Benchmarks
Comparative analysis across 1M point features with 12 attributes each. Tests were run on AWS r5.2xlarge instances.
| Metric | Traditional GeoJSON | GeoJSON Sequence | GeoParquet |
|---|---|---|---|
| Payload Size | 284 MB | 86 MB (↓69%) | 72 MB |
| Parse Time (CPU) | 1.8s | 0.4s (↓77%) | 0.6s |
| Memory Footprint | 620 MB | 115 MB | 98 MB |
| Streaming Latency | Block-based | Sub-10ms chunks | File-based |
While GSF excels in streaming and incremental parsing, GeoParquet remains superior for offline analytics and columnar querying. Choose GSF for real-time pipelines and GeoParquet for batch processing.
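The percentage reductions in the table can be reproduced from the raw figures. This short sketch recomputes them for the GeoJSON Sequence column relative to Traditional GeoJSON; the function name `reduction_percent` is hypothetical. The table's ↓69% and ↓77% values correspond to these ratios truncated to whole percentages.

```python
def reduction_percent(baseline: float, value: float) -> float:
    """Return how much smaller `value` is than `baseline`, in percent."""
    return (1.0 - value / baseline) * 100.0


# Figures taken from the benchmark table (Traditional GeoJSON vs GSF).
payload = reduction_percent(284, 86)    # payload size in MB
parse = reduction_percent(1.8, 0.4)     # parse time in seconds
memory = reduction_percent(620, 115)    # memory footprint in MB

for label, value in (("payload", payload), ("parse", parse), ("memory", memory)):
    print(f"{label}: {value:.1f}% smaller than Traditional GeoJSON")
```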
FAQ & Troubleshooting
Q: Does GeoServer support partial stream resumption?
Yes. Send a Range: bytes=start- header, where start is the byte offset to resume from, alongside the GSF Accept header. The server tracks chunk boundaries via the X-GSF-Chunk-ID trailer.
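A resumed request could be assembled as in the sketch below. The helper names `resume_headers` and `is_resumed` are hypothetical, and the sketch assumes the client has counted the bytes it already received. In standard HTTP, a 206 Partial Content status means the server honoured the range, while a 200 means it is sending the stream again from the beginning.

```python
from typing import Dict


def resume_headers(bytes_received: int) -> Dict[str, str]:
    """Build request headers that ask the server to continue from an offset."""
    if bytes_received < 0:
        raise ValueError("bytes_received must be non-negative")
    return {
        "Accept": "application/x-geojson-seq",
        # Open-ended range: everything from the given offset to the end.
        "Range": f"bytes={bytes_received}-",
    }


def is_resumed(status_code: int) -> bool:
    """True when the server answered with 206 Partial Content."""
    return status_code == 206


print(resume_headers(1_048_576))
```

If `is_resumed` returns False for a successful response, the client should discard its partial data and read the stream from the start.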
Q: How do I handle schema evolution in live streams?
GSF embeds schema metadata in every batch header. Clients should verify the schema_version field before deserializing. Breaking changes trigger a 410 Gone response with migration guidance.
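The version check described above might look like the following sketch. It assumes the batch metadata has already been extracted into a key/value mapping (Arrow libraries often expose metadata keys and values as bytes, so both text and bytes are accepted). The names `SchemaVersionError` and `verify_schema_version`, and the version strings in the example, are hypothetical.

```python
from typing import Mapping, Union

Text = Union[str, bytes]


class SchemaVersionError(RuntimeError):
    """Raised when a batch advertises a schema version the client does not expect."""


def _as_text(value: Text) -> str:
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def verify_schema_version(metadata: Mapping[Text, Text], expected: str) -> None:
    """Raise SchemaVersionError unless metadata['schema_version'] equals `expected`."""
    normalised = {_as_text(k): _as_text(v) for k, v in metadata.items()}
    found = normalised.get("schema_version")
    if found is None:
        raise SchemaVersionError("batch metadata has no schema_version field")
    if found != expected:
        raise SchemaVersionError(
            f"expected schema_version {expected!r}, got {found!r}"
        )


# Passes silently when the versions match; call this before deserializing a batch.
verify_schema_version({b"schema_version": b"2"}, expected="2")
```

Raising an exception rather than returning a flag keeps a mismatched batch from being deserialized by accident.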
Q: Is compression applied automatically?
Compression is negotiated via Accept-Encoding. We recommend zstd or brotli for optimal CPU/memory tradeoffs. Uncompressed chunks are available for debugging.
Q: Can I filter features during streaming?
Server-side filtering supports OGC CQL2 expressions. Add ?filter=cql:expression to your request. Note that complex spatial predicates may increase initial chunk latency.
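CQL2 expressions usually contain spaces and operators that must be percent-encoded in a URL. The sketch below appends the filter parameter using the standard library; the base URL, the attribute name `speed`, and the helper name `with_filter` are placeholders chosen for illustration. The colon is kept literal so the `cql:` prefix stays readable, which assumes the server accepts both forms.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter(url: str, cql_expression: str) -> str:
    """Return `url` with a filter=cql:<expression> query parameter appended."""
    parts = urlsplit(url)
    # Preserve any query parameters that are already present.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("filter", f"cql:{cql_expression}"))
    # safe=":" leaves the prefix separator unescaped; everything else is encoded.
    return urlunsplit(parts._replace(query=urlencode(query, safe=":")))


print(with_filter("https://example.com/collections/demo/items?limit=10", "speed > 30"))
```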