Jelly RDF Binary Codec
A Jelly RDF binary serialization codec for locorda_rdf_core. Jelly is a high-performance, streaming binary format for RDF data based on Protocol Buffers, achieving significantly better compression and throughput than text-based formats like Turtle or N-Triples.
Part of the Locorda RDF monorepo with additional packages for core RDF functionality, canonicalization, object mapping, vocabulary generation, and more.
Installation
dart pub add locorda_rdf_jelly
Quick Start
Batch (non-streaming)
import 'package:locorda_rdf_core/core.dart';
import 'package:locorda_rdf_jelly/jelly.dart';
// Use the pre-configured global codec directly
final graph = jellyGraph.decode(jellyBytes);
final bytes = jellyGraph.encode(graph);
// Dataset (named graphs)
final dataset = jelly.decode(jellyBytes);
final encoded = jelly.encode(dataset);
Frame-level streaming
import 'package:locorda_rdf_jelly/jelly.dart';
// Encode a stream of triple batches. Lookup tables are shared across all
// frames for maximum compression efficiency
final encoded = JellyTripleFrameEncoder().bind(frameStream);
await encoded.pipe(file.openWrite());
// Decode: emits one List<Triple> per physical Jelly frame
final triples = JellyTripleFrameDecoder()
    .bind(byteStream)
    .expand((frame) => frame);
Integration with RdfCore
import 'package:locorda_rdf_core/core.dart';
import 'package:locorda_rdf_jelly/jelly.dart';
// Register Jelly alongside other codecs for content-type dispatching
final rdfCore = RdfCore.withStandardCodecs(
  additionalBinaryGraphCodecs: [JellyGraphCodec()],
  additionalBinaryDatasetCodecs: [JellyDatasetCodec()],
);
// Codec-agnostic decode/encode via content type
final graph = rdfCore.decodeGraph(jellyBytes, contentType: jellyMimeType);
final bytes = rdfCore.encodeGraph(graph, contentType: jellyMimeType);
Features
- High performance: Fastest encoder and decoder in the suite. Writes protobuf wire format directly on both paths (no GeneratedMessage allocation), with IRI/datatype lookup-table compression, O(1) LRU eviction, and repeated-term delta encoding. See benchmarks
- Frame-level streaming: JellyTripleFrameEncoder / JellyTripleFrameDecoder (and quad equivalents) are idiomatic StreamTransformerBase instances, composable with .bind() and .expand()
- Cross-frame table sharing: In streaming mode the lookup tables accumulate across frames, giving better compression for continuous streams than independent per-frame encoding
- All three physical stream types: TRIPLES, QUADS, and GRAPHS, selectable via JellyEncoderOptions.physicalType
- Batch API: JellyGraphCodec / JellyDatasetCodec implement the RdfBinaryGraphCodec / RdfBinaryDatasetCodec interfaces for drop-in use with RdfCore
- Configurable: JellyEncoderOptions exposes table sizes, frame size, physical/logical stream type, and optional stream name; JellyDecoderOptions follows the same pattern for extensibility
- Conformance tested: Verified against 82 official Jelly-RDF conformance tests (RDF 1.1)
Standards Compliance
This package is tested against the official jelly-protobuf conformance test suite, executed via git submodule. All RDF 1.1 test cases pass. RDF-star cases are tracked as a roadmap item (see below). Generalized RDF is intentionally out of scope: this library strictly follows the official RDF 1.1 specification.
Decoding (from_jelly)
| Stream type | Positive | Negative | Total |
|---|---|---|---|
| TRIPLES | 17 | 10 | 27 |
| QUADS | 8 | 3 | 11 |
| GRAPHS | 11 | 2 | 13 |
| Total | 36 | 15 | 51 |
Encoding (to_jelly)
| Stream type | Positive |
|---|---|
| TRIPLES | 16 |
| QUADS | 6 |
| GRAPHS | 9 |
| Total | 31 |
The to_jelly suite defines conformance in terms of RDF isomorphism: the encoded output is decoded and compared to the expected result rather than requiring bit-identical output. This allows flexibility in lookup-table layout while still verifying semantic correctness.
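To illustrate why a semantic comparison tolerates different encodings, here is a deliberately simplified sketch. It assumes ground triples only (no blank nodes) and represents terms as plain strings; real RDF isomorphism must also find a consistent mapping between blank nodes, which this sketch does not attempt. The Triple typedef and sameGroundTriples are hypothetical names, not part of the conformance suite or of locorda_rdf_core.

```dart
/// Hypothetical stand-in for an RDF triple: three term strings.
typedef Triple = (String s, String p, String o);

/// True when both collections contain the same ground triples,
/// regardless of row order or duplicates.
/// Assumption: no blank nodes, so plain set equality is sufficient.
bool sameGroundTriples(Iterable<Triple> a, Iterable<Triple> b) {
  final setA = a.toSet();
  final setB = b.toSet();
  return setA.length == setB.length && setA.containsAll(setB);
}

void main() {
  final expected = <Triple>[
    ('ex:alice', 'ex:knows', 'ex:bob'),
    ('ex:bob', 'ex:knows', 'ex:alice'),
  ];
  // Same statements, emitted in a different order by another encoder.
  final decoded = expected.reversed.toList();
  print(sameGroundTriples(expected, decoded)); // true
}
```

Dart records compare structurally, so they can be used directly as set elements without a custom equality.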
Advanced Usage
Encoder options
The physical stream type is determined by the encoder class you choose: JellyTripleFrameEncoder always writes TRIPLES, and JellyQuadFrameEncoder always writes QUADS. For these encoders, JellyEncoderOptions therefore only controls table sizes, frame size, and metadata:
import 'package:locorda_rdf_jelly/jelly.dart';
final opts = JellyEncoderOptions(
  // Lookup table sizes: larger → better compression, more memory
  maxNameTableSize: 256, // default: 128, min: 8 (spec requirement)
  maxPrefixTableSize: 64, // default: 32
  maxDatatypeTableSize: 32, // default: 16
  // Maximum rows per frame: smaller → lower latency, more overhead
  maxRowsPerFrame: 512, // default: 256
  // Optional informational metadata
  streamName: 'my-export',
);
final encoder = JellyTripleFrameEncoder(options: opts);
The one exception is JellyDatasetEncoder, which can emit either the flat QUADS physical type (default) or the GRAPHS physical type that preserves named-graph boundaries in the stream. Choose via physicalType in JellyEncoderOptions. This requires importing the internal proto enum:
import 'package:locorda_rdf_jelly/jelly.dart';
import 'package:locorda_rdf_jelly/src/proto/rdf.pbenum.dart';
// GRAPHS physical type keeps graph boundaries in the encoded stream
final encoder = JellyDatasetEncoder(
  options: JellyEncoderOptions(
    physicalType: PhysicalStreamType.PHYSICAL_STREAM_TYPE_GRAPHS,
  ),
);
Streaming pipeline (quads / named graphs)
import 'package:locorda_rdf_jelly/jelly.dart';
// Quad (dataset) streaming
final Stream<Iterable<Quad>> quadFrameStream = ...;
final encoded = JellyQuadFrameEncoder().bind(quadFrameStream);
await encoded.pipe(sink);
final decoded = JellyQuadFrameDecoder()
    .bind(byteStream)
    .expand((frame) => frame);
Multi-frame input files (encoding)
When a logical dataset spans multiple source files or chunks (for example when streaming from a database in pages), feed each chunk as a separate batch to the frame encoder. The shared lookup table state means later frames benefit from IRIs already seen in earlier frames:
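import 'dart:async'; // StreamController
import 'package:locorda_rdf_core/core.dart';
import 'package:locorda_rdf_jelly/jelly.dart';
final encoder = JellyTripleFrameEncoder();
final controller = StreamController<Iterable<Triple>>();
final outputStream = encoder.bind(controller.stream);
// Forward encoded frames to the sink and close it when the stream ends
outputStream.listen(sink.add, onDone: sink.close);
for (final page in pages) {
  controller.add(page);
}
await controller.close();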
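The pages iterable in the loop above is left undefined. When the source is an in-memory collection rather than a paged database query, one way to produce it is a small chunking helper. This is a sketch only: pagesOf is a hypothetical function, not part of locorda_rdf_jelly, and integers stand in for triples to keep it self-contained.

```dart
/// Hypothetical helper: splits [items] into consecutive pages of at most
/// [pageSize] elements, preserving order. Pages are produced lazily.
Iterable<List<T>> pagesOf<T>(Iterable<T> items, int pageSize) sync* {
  if (pageSize <= 0) {
    throw ArgumentError.value(pageSize, 'pageSize', 'must be positive');
  }
  var page = <T>[];
  for (final item in items) {
    page.add(item);
    if (page.length == pageSize) {
      yield page;
      page = <T>[]; // start a fresh list so yielded pages are never mutated
    }
  }
  // Emit the final, possibly shorter, page.
  if (page.isNotEmpty) yield page;
}

void main() {
  print(pagesOf([1, 2, 3, 4, 5], 2).toList()); // [[1, 2], [3, 4], [5]]
}
```

Each yielded list can be passed to controller.add as one batch.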
Error Handling
Decoding throws an RdfDecoderException (from locorda_rdf_core) for:
- Malformed protobuf frames (truncated or corrupt data)
- Violations of the Jelly specification's protocol rules (e.g. invalid stream type combinations, missing options row)
- Lookup table index out of range
Encoder constraint violations, such as specifying a maxNameTableSize below the spec minimum of 8, are caught by a Dart assert at construction time. Asserts only run when assertions are enabled (for example in debug builds or with dart run --enable-asserts), so these violations surface during development and are not checked in production builds.
try {
  final graph = jellyGraph.decode(corruptBytes);
} on RdfDecoderException catch (e) {
  // e.message contains a human-readable description of the violation
}
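Because Dart skips assert checks unless assertions are enabled, option values that come from untrusted configuration may be worth validating explicitly before they reach the encoder. A minimal sketch, assuming a hypothetical helper checkedNameTableSize that is not part of locorda_rdf_jelly; only the minimum of 8 is taken from the text above.

```dart
/// Hypothetical guard, not part of locorda_rdf_jelly.
/// Returns [size] unchanged if it satisfies the spec minimum of 8,
/// otherwise throws regardless of whether assertions are enabled.
int checkedNameTableSize(int size) {
  if (size < 8) {
    throw ArgumentError.value(size, 'maxNameTableSize', 'must be at least 8');
  }
  return size;
}

void main() {
  print(checkedNameTableSize(256)); // 256
  try {
    checkedNameTableSize(4);
  } on ArgumentError catch (e) {
    print(e.message); // must be at least 8
  }
}
```

The returned value could then be passed as maxNameTableSize when constructing the options object.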
Performance
Jelly consistently outperforms all text-based RDF codecs in both encoding speed and output size. On the large benchmark (17.2k triples), Jelly encodes in 81% of Turtle's time and decodes in 7% of Turtle's time while producing 75% of the output size. See the full benchmark results for detailed comparisons across all codecs and dataset sizes.
Why is it fast?
Encoder: JellyRawFrameWriter writes protobuf wire format directly to byte buffers instead of constructing intermediate GeneratedMessage objects. Each proto object allocation costs ~200-400 ns due to internal field arrays, type checking, and BuilderInfo setup; for a large graph this adds up to tens of thousands of unnecessary allocations. By computing field tags and varint-encoding in-place, the encoder eliminates this overhead entirely while producing byte-identical output (verified by dedicated wire-format equivalence tests).
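The idea of computing field tags and varints in place can be sketched with the standard protobuf wire rules: a field key is (fieldNumber << 3) | wireType, and integers are written as base-128 varints. This is an illustration only and not the package's JellyRawFrameWriter; all function names below are hypothetical, and the field numbers are arbitrary rather than taken from the Jelly schema.

```dart
import 'dart:typed_data';

/// Protobuf wire types used in this sketch.
const int wireVarint = 0;
const int wireLengthDelimited = 2;

/// Appends [value] as a base-128 varint: seven payload bits per byte,
/// least-significant group first, high bit set on all bytes but the last.
/// Assumption: non-negative values only.
void writeVarint(BytesBuilder out, int value) {
  if (value < 0) throw ArgumentError.value(value, 'value', 'must be >= 0');
  while (value > 0x7f) {
    out.addByte((value & 0x7f) | 0x80);
    value >>= 7;
  }
  out.addByte(value);
}

/// Writes the field key, which is itself a varint.
void writeTag(BytesBuilder out, int fieldNumber, int wireType) {
  writeVarint(out, (fieldNumber << 3) | wireType);
}

/// Writes an integer field straight into the buffer, no message object.
void writeVarintField(BytesBuilder out, int fieldNumber, int value) {
  writeTag(out, fieldNumber, wireVarint);
  writeVarint(out, value);
}

/// Writes a length-delimited field (strings, bytes, nested messages).
void writeBytesField(BytesBuilder out, int fieldNumber, List<int> payload) {
  writeTag(out, fieldNumber, wireLengthDelimited);
  writeVarint(out, payload.length);
  out.add(payload);
}

void main() {
  final out = BytesBuilder();
  writeVarintField(out, 1, 300); // key 0x08, then 300 as 0xAC 0x02
  writeBytesField(out, 2, 'hi'.codeUnits); // key 0x12, length 2, 'h', 'i'
  print(out.toBytes()); // [8, 172, 2, 18, 2, 104, 105]
}
```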
Decoder: The raw frame parser reads protobuf wire format directly as well β no GeneratedMessage is allocated on the decode hot path either. On top of that, three structural optimisations eliminate the remaining overhead:
- O(1) lookup table: IRIs and datatypes are stored in a fixed-size List indexed directly by their Jelly ID (range 1..maxSize). The previous Map<int,String>-based implementation required an O(n) minimum-key scan on every eviction, causing super-linear scaling with dataset size.
- IriTerm cache: A flat List<IriTerm?> indexed by (nameId, prefixId) caches fully-constructed IriTerm objects. Cache hits bypass table lookups, string concatenation, and IriTerm allocation entirely.
- Datatype cache: A small list keyed by datatype ID caches the IriTerm for each datatype. With typically ≤5 distinct datatypes per stream, every typed literal after the first resolves without touching the datatype table.
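The first two optimisations can be sketched as follows. This is a simplified illustration under stated assumptions, not the package's decoder: IdTable and IriCache are hypothetical names, plain strings stand in for IriTerm objects, and the invalidation policy shown is one plausible choice rather than the documented behaviour.

```dart
/// Fixed-size table indexed directly by ID (valid range 1..maxSize).
class IdTable {
  IdTable(int maxSize) : _entries = List<String?>.filled(maxSize + 1, null);

  final List<String?> _entries;

  int get maxSize => _entries.length - 1;

  /// Writing to an occupied slot replaces (evicts) the old entry in O(1).
  void set(int id, String value) {
    RangeError.checkValueInInterval(id, 1, maxSize, 'id');
    _entries[id] = value;
  }

  String get(int id) {
    RangeError.checkValueInInterval(id, 1, maxSize, 'id');
    final value = _entries[id];
    if (value == null) throw StateError('No entry for id $id');
    return value;
  }
}

/// Flat cache of joined IRIs keyed by (nameId, prefixId).
class IriCache {
  IriCache(this.names, this.prefixes)
      : _cache = List<String?>.filled(
            (names.maxSize + 1) * (prefixes.maxSize + 1), null);

  final IdTable names;
  final IdTable prefixes;
  final List<String?> _cache;
  int misses = 0;

  /// Maps the ID pair to a single index in the flat list.
  int _slot(int nameId, int prefixId) =>
      prefixId * (names.maxSize + 1) + nameId;

  String resolve(int nameId, int prefixId) {
    final slot = _slot(nameId, prefixId);
    final hit = _cache[slot];
    if (hit != null) return hit; // no lookup, no concatenation
    misses++;
    return _cache[slot] = prefixes.get(prefixId) + names.get(nameId);
  }

  /// Assumed policy: drop cached IRIs when a name slot is overwritten.
  void invalidateName(int nameId) {
    for (var p = 0; p <= prefixes.maxSize; p++) {
      _cache[_slot(nameId, p)] = null;
    }
  }
}

void main() {
  final names = IdTable(8)..set(1, 'name');
  final prefixes = IdTable(4)..set(1, 'http://example.org/');
  final cache = IriCache(names, prefixes);
  print(cache.resolve(1, 1)); // http://example.org/name
  cache.resolve(1, 1); // served from the cache
  print(cache.misses); // 1
}
```

A datatype cache would follow the same pattern with a single ID as the index.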
On top of the raw encoding, two protocol-level mechanisms provide the compression advantage:
- Lookup tables: IRIs and datatypes are assigned small integer IDs on first occurrence and referenced by ID thereafter. The encoder caches up to maxNameTableSize name entries, maxPrefixTableSize prefix entries, and maxDatatypeTableSize datatype entries simultaneously.
- Repeated-term delta encoding: Subject, predicate, and object terms that repeat between consecutive triples are omitted entirely from the encoded row. Dense datasets with high locality (e.g. all triples about the same subject grouped together) benefit most.
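The repeated-term mechanism can be sketched in a few lines. This is a conceptual illustration only: terms are plain strings, a null slot stands for "omitted, same as the previous row", and encodeRows / decodeRows are hypothetical names rather than the package's API or the actual wire layout.

```dart
/// Hypothetical stand-in for a triple: three term strings.
typedef Triple = (String s, String p, String o);

/// One encoded row; a null slot means "same term as in the previous row".
typedef Row = (String? s, String? p, String? o);

/// Omits every term that equals the term in the same position
/// of the preceding triple.
List<Row> encodeRows(Iterable<Triple> triples) {
  String? lastS, lastP, lastO;
  final rows = <Row>[];
  for (final (s, p, o) in triples) {
    rows.add((
      s == lastS ? null : s,
      p == lastP ? null : p,
      o == lastO ? null : o,
    ));
    lastS = s;
    lastP = p;
    lastO = o;
  }
  return rows;
}

/// Restores omitted terms by carrying the last seen value forward.
List<Triple> decodeRows(Iterable<Row> rows) {
  String? lastS, lastP, lastO;
  final triples = <Triple>[];
  for (final (s, p, o) in rows) {
    lastS = s ?? lastS;
    lastP = p ?? lastP;
    lastO = o ?? lastO;
    if (lastS == null || lastP == null || lastO == null) {
      throw FormatException('First row must carry all three terms');
    }
    triples.add((lastS, lastP, lastO));
  }
  return triples;
}

void main() {
  final triples = <Triple>[
    ('ex:alice', 'rdf:type', 'ex:Person'),
    ('ex:alice', 'ex:name', '"Alice"'),
    ('ex:bob', 'ex:name', '"Bob"'),
  ];
  final rows = encodeRows(triples);
  final (s, p, o) = rows[1];
  print('$s $p $o'); // null ex:name "Alice"
  final (s2, p2, o2) = decodeRows(rows)[2];
  print('$s2 $p2 $o2'); // ex:bob ex:name "Bob"
}
```

In the example the second row drops the repeated subject and the third drops the repeated predicate, which is why subject-grouped input compresses best.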
Tune maxRowsPerFrame based on your latency vs. throughput trade-off:
- Larger frames → fewer frame headers → higher throughput
- Smaller frames → lower end-to-end latency for streaming consumers
The Jelly specification recommends keeping frames under 1 MB.
API Overview
| Symbol | Kind | Description |
|---|---|---|
| jellyGraph | RdfBinaryGraphCodec | Pre-configured global codec for single-graph Jelly streams |
| jelly | RdfBinaryDatasetCodec | Pre-configured global codec for dataset Jelly streams |
| jellyMimeType | String | MIME type application/x-jelly-rdf |
| JellyGraphCodec | RdfBinaryGraphCodec | Instantiable graph codec for use with RdfCore |
| JellyDatasetCodec | RdfBinaryDatasetCodec | Instantiable dataset codec for use with RdfCore |
| JellyGraphEncoder | RdfBinaryGraphEncoder | Batch graph encoder (single Uint8List output) |
| JellyGraphDecoder | RdfBinaryGraphDecoder | Batch graph decoder (single Uint8List input) |
| JellyDatasetEncoder | RdfBinaryDatasetEncoder | Batch dataset encoder |
| JellyDatasetDecoder | RdfBinaryDatasetDecoder | Batch dataset decoder |
| JellyTripleFrameEncoder | Converter / StreamTransformerBase | Stateful frame-level triple encoder |
| JellyTripleFrameDecoder | Converter / StreamTransformerBase | Frame-level triple decoder |
| JellyQuadFrameEncoder | Converter / StreamTransformerBase | Stateful frame-level quad encoder |
| JellyQuadFrameDecoder | Converter / StreamTransformerBase | Frame-level quad decoder |
| JellyEncoderOptions | Value object | Lookup table sizes, frame size, stream type, stream name |
| JellyDecoderOptions | Value object | Extensibility hook for future decoder configuration |
Roadmap / Next Steps
- RDF-star support: Decode and encode RDF-star (quoted triples) once locorda_rdf_core adds RDF-star term types
- Negative encoder conformance tests: Exercise the 2 to_jelly negative cases (invalid stream-type requests) once the error-mapping API is stabilised
- Formal conformance reporting: Submit results to the Jelly conformance registry
References
- Jelly RDF specification
- Jelly-protobuf conformance test suite
- locorda_rdf_core: RDF model and binary codec registry
- W3C RDF 1.1 Concepts
Contributing
Contributions, bug reports, and feature requests are welcome!
- Fork the repo and submit a PR
- See CONTRIBUTING.md for guidelines
- Join the discussion in GitHub Issues
AI Policy
This project is proudly human-led and human-controlled, with all key decisions, design, and code reviews made by people. At the same time, it stands on the shoulders of LLM giants: generative AI tools are used throughout the development process to accelerate iteration, inspire new ideas, and improve documentation quality. We believe that combining human expertise with the best of AI leads to higher-quality, more innovative open source software.
© 2025-2026 Klas Kalaß. Licensed under the MIT License. Part of the Locorda RDF monorepo.