rag 1.2.4
Retrieval Augmented Generation for Agentic
rag #
rag is a Dart package for adding retrieval-augmented generation to
agentic agents.
This repository ships two things:

- the `rag` library under `lib/`
- a small Flutter test harness under `rag_example/`
It does not ship a standalone server, CLI, or opinionated ingestion pipeline.
What The Library Provides #
- `RagAgent`, an `Agent` subclass that exposes retrieval through a built-in `query` tool.
- `QueryTool`, the tool implementation and generated schema the model calls for retrieval.
- `VectorSpace`, a backend-agnostic abstraction for indexing, querying, listing, and deleting records.
- `PineconeVectorSpace`, a concrete Pinecone implementation.
- `FirestoreVectorSpace`, a Firestore-backed embedding vector implementation.
- `QdrantVectorSpace`, a Qdrant-backed embedding vector implementation.
- `VectorUpsert.fromIChunk(...)`, a helper for turning `agentic` chunk objects into indexable vector records.
Installation #
```shell
dart pub add rag
```
Imports #
The primary import is:
```dart
import 'package:rag/rag.dart';
```
`package:rag/rag.dart` currently re-exports:

- the `agentic` public surface
- `RagAgent`
- `QueryTool`
- `VectorSpace`, `VectorUpsert`, `VectorResult`, and `VectorSpaceResult`
- `PineconeVectorSpace`
- `FirestoreVectorSpace`
- `QdrantVectorSpace`
Backend-specific imports still exist if you prefer them:
```dart
import 'package:rag/spaces/fire_space.dart';
import 'package:rag/spaces/pinecone_space.dart';
import 'package:rag/spaces/qdrant_space.dart';
```
How Retrieval Works #
At runtime, `RagAgent.rag()` does the same work as `Agent.call()`, but it
always injects the built-in `query` tool.
The flow is:
- You create a `RagAgent` with an `agentic` chat model, a chat provider, and a `VectorSpace`.
- You add messages to the agent as usual.
- You call `agent.rag()`.
- The model can call the `query` tool with a `queries` array.
- `QueryTool` calls `vectorSpace.queryAll(queries)`.
- The matching records are returned to the model as tool output.
- The model answers using the retrieved context.
`QueryTool` is built for full-text questions, not keyword bags. If the model
needs multiple queries, they should be distinct sub-questions ordered by
importance.
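For example, a question that needs two distinct facts is better expressed as two full-text sub-questions than as one keyword string (the values below are illustrative):

```dart
// Good: distinct full-text sub-questions, most important first.
final List<String> queries = [
  'What medications does Jane Doe currently take?',
  'Does Jane Doe have any documented drug allergies?',
];

// Avoid: a single keyword bag such as 'jane doe medications allergies'.
```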
Quick Start #
This is the simplest supported shape: index some records, create a RagAgent,
and let the model retrieve before it answers.
```dart
import 'package:rag/rag.dart';

const openAiKey = String.fromEnvironment('OPENAI_API_KEY');
const pineconeKey = String.fromEnvironment('PINECONE_API_KEY');

Future<void> main() async {
  VectorSpace vectorSpace = PineconeVectorSpace(
    namespace: 'patients',
    host: 'https://your-index-host.pinecone.io',
    apiKey: pineconeKey,
    rerank: true,
  );

  await vectorSpace.upsertAll([
    VectorUpsert(
      id: 'jane-doe/demographics',
      content: 'Jane Doe was born on March 14, 1986.',
      metadata: {'patient': 'Jane Doe', 'record': 'demographics'},
    ),
    VectorUpsert(
      id: 'jane-doe/allergies',
      content: 'Jane Doe has a documented penicillin allergy.',
      metadata: {'patient': 'Jane Doe', 'record': 'allergies'},
    ),
    VectorUpsert(
      id: 'jane-doe/medications',
      content: 'Jane Doe currently takes 10 mg of lisinopril daily.',
      metadata: {'patient': 'Jane Doe', 'record': 'medications'},
    ),
  ]);

  RagAgent agent = RagAgent(
    user: 'caregiver-portal',
    llm: OpenAIConnector(apiKey: openAiKey).connect(ChatModel.openai4_1Mini),
    chatProvider: MemoryChatProvider(
      messages: [
        Message.system(
          'You answer caregiver questions about Jane Doe. '
          'Always search the record before answering.',
        ),
      ],
    ),
    vectorSpace: vectorSpace,
  );

  await agent.addMessage(Message.user('What is Jane Doe allergic to?'));

  AgentMessage answer = await agent.rag();
  print(answer.content);
}
```
Core API #
RagAgent #
`RagAgent` extends `Agent` and adds a required `vectorSpace` field plus two
convenience helpers:
```dart
QueryTool getQueryTool()

Future<AgentMessage> rag({
  ToolSchema? responseFormat,
  List<Tool> tools = const [],
  int maxRecursiveToolCalls = 1,
})
```
`rag()` prepends the retrieval tool to any extra tools you pass in.
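As a sketch of that behavior, passing extra tools still leaves retrieval available. Here `weatherTool` is a hypothetical tool of your own, not part of the package:

```dart
// rag() runs with the query tool first, then your extras.
AgentMessage answer = await agent.rag(
  tools: [weatherTool],
  maxRecursiveToolCalls: 3,
);
```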
QueryTool #
`QueryTool` is a `Tool` with:

- default tool name: `query`
- a generated schema with one required field: `queries`
- a plain-text return value summarizing the matched records
Internally it does:
```dart
vectorSpace.queryAll(qt.queries)
```
That means the default `QueryTool` behavior uses the `VectorSpace.queryAll()`
implementation provided by the active backend.
VectorSpace #
`VectorSpace` is the storage interface behind the package:

- `upsertAll(List<VectorUpsert> upserts)`
- `query(String query, {int maxResults = 10})`
- `queryAll(List<String> queries, {int maxResults = 100, int compactQueriesTo = 10, int maxTokens = 50000})`
- `list({int batchSize = 100, String? prefix})`
- `deleteAll(List<String> ids, {int batchSize = 999})`
- `deleteAllPrefixed(String prefix, {int deleteBatchSize = 999, int listBatchSize = 100})`
- `purgeAll()`
The default `queryAll()` implementation:

- compacts long query lists down to `compactQueriesTo`
- runs `query()` for each query
- merges the results
- sorts by descending score
- trims the baked result set to `maxTokens`
- trims again to `maxResults`
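A direct `queryAll()` call with tuned limits might look like the sketch below; the parameter values are illustrative, and `vectorSpace` is any constructed backend:

```dart
VectorSpaceResult result = await vectorSpace.queryAll(
  [
    'What is the refund window?',
    'How long does shipping take?',
  ],
  maxResults: 20,
  compactQueriesTo: 5,
  maxTokens: 8000,
);
```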
VectorUpsert, VectorResult, and VectorSpaceResult #
- `VectorUpsert` is the write model: `id`, `content`, and `metadata`
- `VectorResult` extends `VectorUpsert` with `score` and `contentTokenCount`
- `VectorSpaceResult` wraps `List<VectorResult>` and can merge or bake results
`VectorUpsert.fromIChunk(...)` converts an `agentic.IChunk`-shaped object into
an upsert with metadata for:

- `index`
- `start`
- `length`
- `lod`
- `from`

plus any metadata you supply.
Backend Notes #
PineconeVectorSpace #
`PineconeVectorSpace` is ready to use as-is. It requires:

- `namespace`
- `host`
- `apiKey`
It also supports:

- optional reranking through `rerank`
- rerank tuning through `rerankTopK` and `rerankModel`
- `list()`, `deleteAll()`, `deleteAllPrefixed()`, and `purgeAll()`
On upsert, it stores your content in Pinecone's text field and merges in
your metadata. On query, it uses Pinecone text search and estimates token
counts with `bpe`.
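A rerank-tuned construction might look like the sketch below; the `rerankTopK` value and the `rerankModel` string are illustrative assumptions, not documented defaults:

```dart
VectorSpace vectorSpace = PineconeVectorSpace(
  namespace: 'docs',
  host: 'https://your-index-host.pinecone.io',
  apiKey: pineconeKey,
  rerank: true,
  rerankTopK: 25,                    // illustrative value
  rerankModel: 'bge-reranker-v2-m3', // assumed model name
);
```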
FirestoreVectorSpace #
`FirestoreVectorSpace` is ready to use when you provide an embedder and a
Firestore collection:
```dart
FirestoreVectorSpace(
  embedder: OpenAIConnector(
    apiKey: openAiKey,
  ).asEmbedder('text-embedding-3-small'),
  collection: FirestoreDatabase.instance.collection('vectors'),
)
```
It stores each upsert as a Firestore document where:

- the Firestore document ID is the vector ID
- `contentField` stores the text body
- `metadataField` stores arbitrary metadata
- `vectorField` stores the embedding vector
Configurable fields:

- `vectorField`, default `vector`
- `contentField`, default `content`
- `metadataField`, default `metadata`
- `distanceMeasure`, default `VectorDistanceMeasure.cosine`
Its `query()` implementation uses Firestore nearest-neighbor search, preserves
metadata, and assigns finite descending scores based on result rank.
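The interface-level housekeeping calls have the same shape for every backend. A sketch, assuming your chosen `space` implements `list()` and `deleteAllPrefixed()`:

```dart
// Enumerate every vector ID stored under the 'jane-doe/' prefix...
await for (String id in space.list(prefix: 'jane-doe/')) {
  print(id);
}

// ...or remove them all in one call.
await space.deleteAllPrefixed('jane-doe/');
```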
QdrantVectorSpace #
`QdrantVectorSpace` is ready to use when you provide an embedder plus Qdrant
connection details:
```dart
QdrantVectorSpace(
  embedder: OpenAIConnector(
    apiKey: openAiKey,
  ).asEmbedder('text-embedding-3-small'),
  host: 'your-cluster.us-east-1.aws.cloud.qdrant.io',
  port: 6334,
  apiKey: qdrantKey,
  organization: 'docs',
  chunkSize: 1024,
  dimension: 1536,
)
```
Important behavior:

- `organization` is used as the collection name and namespace
- `ensureCollection()` creates the collection and adds `entry` and `record` indexes if it does not exist
- `destroyCollection()` removes the collection
- `deleteEntry()` deletes all points matching an `entry`
- `deleteRecord()` deletes all points matching a `record`
- caller IDs are converted to deterministic UUIDs internally
- the original caller ID is also stored in the Qdrant payload under `id`
- `list()` and query results round-trip the original caller IDs
- `queryAll()` performs batched multi-query retrieval, hydrates point payloads, and safely decodes nested metadata payloads
The Qdrant payload currently stores:

- `id`
- `text`
- optional `record`
- optional `entry`
- optional `metadata`
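The collection lifecycle and targeted deletes described above can be sketched as follows; `space` is assumed to be a constructed `QdrantVectorSpace`, and the string arguments are illustrative:

```dart
// Create the collection (plus entry/record indexes) if it is missing.
await space.ensureCollection();

// Delete every point tagged with a given record or entry.
await space.deleteRecord('returns-policy');
await space.deleteEntry('faq');

// Tear the whole collection down.
await space.destroyCollection();
```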
Using The Query Tool Directly #
You do not have to go through `rag()`: the query tool can be called directly if you want retrieval by itself.
```dart
import 'package:rag/rag.dart';

Future<void> main() async {
  RagAgent agent = RagAgent(
    user: 'demo',
    llm: OpenAIConnector(apiKey: 'sk-...').connect(ChatModel.openai4_1Mini),
    chatProvider: MemoryChatProvider(),
    vectorSpace: PineconeVectorSpace(
      namespace: 'docs',
      host: 'https://your-index-host.pinecone.io',
      apiKey: 'pcsk-...',
    ),
  );

  String toolOutput = await agent.getQueryTool().call(
    agent: agent,
    arguments: {
      'queries': [
        'What is the refund window?',
        'How long does shipping take?',
      ],
    },
  );

  print(toolOutput);
}
```
Indexing agentic Chunks #
If you already have chunk objects from `agentic`, `VectorUpsert.fromIChunk(...)`
is the easiest way to preserve chunk metadata while indexing:
```dart
VectorUpsert upsert = VectorUpsert.fromIChunk(
  chunk: chunk,
  idPrefix: 'faq/',
  metadata: {'record': 'returns-policy'},
);
```
That helper expects a chunk-like object with fields compatible with
`agentic.IChunk`, including `index`, `charStart`, `charEnd`, `fullContent`,
`lod`, and `from`.
Bring Your Own Vector Store #
If Pinecone, Firestore, and Qdrant are not the right fit, you can implement
`VectorSpace` yourself.
```dart
import 'package:rag/rag.dart';

class MemoryVectorSpace extends VectorSpace {
  MemoryVectorSpace({Map<String, VectorUpsert>? items})
      : items = items ?? <String, VectorUpsert>{};

  final Map<String, VectorUpsert> items;

  @override
  Future<void> upsertAll(List<VectorUpsert> upserts) {
    for (VectorUpsert upsert in upserts) {
      items[upsert.id] = upsert;
    }
    return Future.value();
  }

  @override
  Future<void> deleteAll(List<String> ids, {int batchSize = 999}) {
    for (String id in ids) {
      items.remove(id);
    }
    return Future.value();
  }

  @override
  Future<void> purgeAll() {
    items.clear();
    return Future.value();
  }

  @override
  Stream<String> list({int batchSize = 100, String? prefix}) async* {
    for (String id in items.keys) {
      if (prefix == null || id.startsWith(prefix)) {
        yield id;
      }
    }
  }

  @override
  Future<VectorSpaceResult> query(String query, {int maxResults = 10}) {
    // Naive term matching: split the query on whitespace and keep any
    // record whose content contains at least one term.
    List<String> terms = query.toLowerCase().split(RegExp(r'\s+'));
    Iterable<VectorUpsert> matches = items.values.where((item) {
      String haystack = item.content.toLowerCase();
      return terms.any(haystack.contains);
    }).take(maxResults);
    return Future.value(
      VectorSpaceResult(
        results: [
          for (VectorUpsert item in matches)
            VectorResult(
              id: item.id,
              content: item.content,
              metadata: item.metadata,
              score: 1.0,
              contentTokenCount: item.content.split(RegExp(r'\s+')).length,
            ),
        ],
      ),
    );
  }
}
```
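A custom space drops into `RagAgent` exactly like the bundled backends. A sketch, reusing the Quick Start wiring:

```dart
RagAgent agent = RagAgent(
  user: 'demo',
  llm: OpenAIConnector(apiKey: 'sk-...').connect(ChatModel.openai4_1Mini),
  chatProvider: MemoryChatProvider(),
  vectorSpace: MemoryVectorSpace(),
);

await agent.addMessage(Message.user('What is the refund window?'));
AgentMessage answer = await agent.rag();
```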
Repository Layout #
```
lib/
  rag.dart
  rag_agent.dart
  query_tool.dart
  vector_space.dart
  gen/
    artifacts.gen.dart
    exports.gen.dart
  spaces/
    pinecone_space.dart
    fire_space.dart
    qdrant_space.dart
test/
  vector_space_test.dart
  spaces/
    fire_space_test.dart
    qdrant_space_test.dart
rag_example/
  lib/main.dart
```
About rag_example #
`rag_example/` is a lightweight Flutter harness for experimenting with the
package. It currently wires up:

- Firebase and Firestore
- a `FirestoreVectorSpace` subclass
- an `OpenRouterConnector` embedding implementation
- a one-shot chunk indexing and retrieval smoke test on startup
It is useful as a local sandbox, but the package itself is the main product in this repository.