# rag

Retrieval-Augmented Generation for `agentic`.

Current version: 1.2.3
`rag` is a small Dart package for adding retrieval-augmented generation to
`agentic` agents.

It gives you:

- `RagAgent`, a thin `Agent` wrapper that automatically exposes retrieval as a tool call.
- `QueryTool`, an LLM tool schema for searching a vector store with natural-language questions.
- `VectorSpace`, a backend-agnostic interface for indexing, querying, listing, and deleting records.
- `PineconeVectorSpace`, a ready-to-use Pinecone implementation.
- `FirestoreVectorSpace`, an abstract Firestore-backed base class.
- `QdrantVectorSpace`, an abstract Qdrant-backed base class.
## What This Project Is

This package is a library, not a standalone app or CLI.

You use it when you already have:

- an `agentic` chat model,
- a chat history provider,
- and a vector store containing searchable text records,

and you want the model to retrieve relevant context before answering.
## What It Does

At runtime, `RagAgent.rag()` works like this:

1. Your agent receives a user message.
2. The model can call the built-in `query` tool.
3. The `query` tool searches your `VectorSpace` using one or more full-text questions.
4. The best-matching results are returned to the model as tool output.
5. The model answers using the retrieved context.
The package does not prescribe how you chunk or author your documents. You can upsert any text you want into the vector store and retrieve it later.
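For example, a simple ingestion routine might split a long document into paragraph-sized chunks and upsert each one. This is only an illustrative sketch; the blank-line chunking strategy and the `doc`/`chunk` metadata keys are choices made for this example, not anything the package prescribes:

```dart
import 'package:rag/rag.dart';

/// Splits [text] into paragraph-sized chunks and builds one VectorUpsert
/// per chunk, using a stable, prefixable ID scheme. The chunking strategy
/// here (split on blank lines) is an arbitrary illustration.
List<VectorUpsert> chunkDocument(String docId, String text) {
  final paragraphs = text
      .split(RegExp(r'\n\s*\n'))
      .map((p) => p.trim())
      .where((p) => p.isNotEmpty)
      .toList();
  return [
    for (var i = 0; i < paragraphs.length; i++)
      VectorUpsert(
        id: '$docId/chunk-$i',
        content: paragraphs[i],
        metadata: {'doc': docId, 'chunk': i},
      ),
  ];
}
```

The resulting list can be passed straight to `VectorSpace.upsertAll()`.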
## Features

- Automatic retrieval tool wiring through `RagAgent.rag()`.
- Backend abstraction via `VectorSpace`, so you can swap Pinecone for your own implementation.
- Natural-language multi-query search through `QueryTool`.
- Token-aware result baking: results are merged, sorted by score, and trimmed to fit a token budget.
- Query compaction when too many search prompts are supplied.
- Pinecone support for text upserts, search, list, delete, prefix delete, and full namespace purge.
- Firestore support through an abstract `FirestoreVectorSpace` that stores content, metadata, and embeddings in a collection.
- Qdrant support through an abstract `QdrantVectorSpace` with collection helpers and batched multi-query retrieval.
- Optional Pinecone reranking.
- Re-exports from `agentic`, so common types like `Agent`, `Message`, `MemoryChatProvider`, `Tool`, and `ChatModel` are available from `package:rag/rag.dart`.
## Installation

```sh
dart pub add rag
```

The core package surface is:

```dart
import 'package:rag/rag.dart';
```

Backend implementations live under `lib/spaces/` and are imported separately:

```dart
import 'package:rag/spaces/fire_space.dart';
import 'package:rag/spaces/pinecone_space.dart';
import 'package:rag/spaces/qdrant_space.dart';
```
## Core API

### RagAgent

`RagAgent` extends `Agent` and adds a `vectorSpace` plus a convenience
`rag()` method:

```dart
Future<AgentMessage> rag({
  ToolSchema? responseFormat,
  List<Tool> tools = const [],
  int maxRecursiveToolCalls = 1,
})
```

`rag()` behaves like `Agent.call()`, but always includes the retrieval tool.
### QueryTool

`QueryTool` is the tool the model calls to search your records. It expects a
`queries` array of full-text questions, ordered by importance.
### VectorSpace

`VectorSpace` is the storage interface behind the retrieval flow:

- `upsertAll()`
- `query()`
- `queryAll()`
- `list()`
- `deleteAll()`
- `deleteAllPrefixed()`
- `purgeAll()`
### PineconeVectorSpace

`PineconeVectorSpace` implements `VectorSpace` with Pinecone's text search API.
It also estimates token counts using `bpe` so multi-query results can be trimmed
before they are handed back to the model.
### FirestoreVectorSpace

`FirestoreVectorSpace` is an abstract `VectorSpace` backed by a Firestore
collection. You subclass it and implement `onEmbed()` to provide embeddings.

It stores:

- the document ID as the vector ID,
- `content` in `contentField`,
- arbitrary metadata in `metadataField`,
- embedding vectors in `vectorField`.
### QdrantVectorSpace

`QdrantVectorSpace` is an abstract `VectorSpace` backed by a Qdrant collection.
You subclass it and implement `onEmbed()` to return `DenseVector`s.

It also includes collection helpers such as:

- `ensureCollection()`
- `destroyCollection()`
- `deleteEntry()`
- `deleteRecord()`

The current `queryAll()` implementation expects payloads that include `text`,
`record`, `entry`, and `metadata.index` / `metadata.length`.
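For orientation only, here is a sketch of what such a payload might look like. The field names come from the sentence above; the example values and their interpretations are assumptions, not the package's documented schema:

```dart
// Hypothetical payload shape read by queryAll(). Field names match the
// note above; the values and the comments on their meaning are assumed.
final payload = <String, dynamic>{
  'text': 'Jane Doe has a documented penicillin allergy.',
  'record': 'allergies',
  'entry': 'jane-doe',
  'metadata': {
    'index': 0,  // assumed: position of this chunk within the record
    'length': 1, // assumed: total number of chunks in the record
  },
};
```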
## Quick Start

This example indexes a few records into Pinecone, creates a `RagAgent`, and
asks a question that should trigger retrieval.
```dart
import 'package:rag/rag.dart';
import 'package:rag/spaces/pinecone_space.dart';

const openaiKey = String.fromEnvironment('OPENAI_API_KEY');
const pineconeKey = String.fromEnvironment('PINECONE_API_KEY');

Future<void> main() async {
  final vectorSpace = PineconeVectorSpace(
    namespace: 'patients',
    host: 'https://your-index-host.pinecone.io',
    apiKey: pineconeKey,
    rerank: true,
  );

  await vectorSpace.upsertAll([
    VectorUpsert(
      id: 'jane-doe/demographics',
      content: 'Jane Doe was born on March 14, 1986.',
      metadata: {'patient': 'Jane Doe', 'record': 'demographics'},
    ),
    VectorUpsert(
      id: 'jane-doe/allergies',
      content: 'Jane Doe has a documented penicillin allergy.',
      metadata: {'patient': 'Jane Doe', 'record': 'allergies'},
    ),
    VectorUpsert(
      id: 'jane-doe/medications',
      content: 'Jane Doe currently takes 10 mg of lisinopril daily.',
      metadata: {'patient': 'Jane Doe', 'record': 'medications'},
    ),
  ]);

  final agent = RagAgent(
    user: 'caregiver-portal',
    llm: OpenAIConnector(apiKey: openaiKey).connect(ChatModel.openai4_1Mini),
    chatProvider: MemoryChatProvider(
      messages: [
        Message.system(
          'You answer caregiver questions about Jane Doe. '
          'Search the record before answering.',
        ),
      ],
    ),
    vectorSpace: vectorSpace,
  );

  await agent.addMessage(
    Message.user('What is Jane Doe allergic to?'),
  );

  final answer = await agent.rag();
  print('Assistant: ${answer.content}');
}
```
## Example: Indexing and Managing Records

You can use `VectorSpace` directly for ingestion and lifecycle operations.

```dart
import 'package:rag/rag.dart';
import 'package:rag/spaces/pinecone_space.dart';

Future<void> main() async {
  final vectorSpace = PineconeVectorSpace(
    namespace: 'knowledge-base',
    host: 'https://your-index-host.pinecone.io',
    apiKey: 'pcsk-...',
  );

  await vectorSpace.upsertAll([
    VectorUpsert(
      id: 'faq/shipping',
      content: 'Standard shipping takes 3 to 5 business days.',
      metadata: {'category': 'faq'},
    ),
    VectorUpsert(
      id: 'faq/returns',
      content: 'Returns are accepted within 30 days of delivery.',
      metadata: {'category': 'faq'},
    ),
  ]);

  await for (final id in vectorSpace.list(prefix: 'faq/')) {
    print(id);
  }

  await vectorSpace.deleteAllPrefixed('faq/');
}
```
## Example: Using The Query Tool Directly

If you want retrieval without calling `rag()`, you can invoke the query tool
yourself.

```dart
import 'package:rag/rag.dart';
import 'package:rag/spaces/pinecone_space.dart';

Future<void> main() async {
  final agent = RagAgent(
    user: 'demo',
    llm: OpenAIConnector(apiKey: 'sk-...').connect(ChatModel.openai4_1Mini),
    chatProvider: MemoryChatProvider(),
    vectorSpace: PineconeVectorSpace(
      namespace: 'docs',
      host: 'https://your-index-host.pinecone.io',
      apiKey: 'pcsk-...',
    ),
  );

  final toolOutput = await agent.getQueryTool().call(
    agent: agent,
    arguments: {
      'queries': [
        'What is the refund window?',
        'How long does shipping take?',
      ],
    },
  );

  print(toolOutput);
}
```
## Example: Firestore Backend

`FirestoreVectorSpace` is abstract, so you provide the embedding function.

```dart
import 'package:fire_api/fire_api.dart';
import 'package:rag/rag.dart';
import 'package:rag/spaces/fire_space.dart';

class OpenAIFirestoreVectorSpace extends FirestoreVectorSpace {
  final OpenAIConnector connector;
  final String embeddingModel;

  OpenAIFirestoreVectorSpace({
    required super.collection,
    required this.connector,
    this.embeddingModel = 'text-embedding-3-small',
    super.vectorField,
    super.contentField,
    super.metadataField,
    super.distanceMeasure,
  });

  @override
  Future<List<List<double>>> onEmbed(List<String> inputs) {
    return connector.embedMultiple(
      model: embeddingModel,
      texts: inputs,
    );
  }
}

Future<void> main() async {
  final connector = OpenAIConnector(
    apiKey: const String.fromEnvironment('OPENAI_API_KEY'),
  );

  final vectorSpace = OpenAIFirestoreVectorSpace(
    connector: connector,
    collection: FirestoreDatabase.instance.collection('rag-records'),
  );

  await vectorSpace.upsertAll([
    VectorUpsert(
      id: 'faq/refunds',
      content: 'Refunds are available within 30 days of purchase.',
      metadata: {'category': 'billing'},
    ),
  ]);

  final result = await vectorSpace.query('What is the refund policy?');
  print(result.results.first.content);
}
```
## Example: Qdrant Backend

`QdrantVectorSpace` is also abstract, so you provide the embedding function and
typically ensure the collection exists before indexing or querying.

```dart
import 'package:qdrant/qdrant.dart' hide Value, ListValue, Struct, Uri;
import 'package:rag/rag.dart';
import 'package:rag/spaces/qdrant_space.dart';

class OpenAIQdrantVectorSpace extends QdrantVectorSpace {
  final OpenAIConnector connector;
  final String embeddingModel;

  OpenAIQdrantVectorSpace({
    required this.connector,
    required super.host,
    required super.port,
    required super.apiKey,
    required super.organization,
    required super.dimension,
    required super.chunkSize,
    this.embeddingModel = 'text-embedding-3-small',
  });

  @override
  Future<List<DenseVector>> onEmbed(List<String> inputs) async {
    final vectors = await connector.embedMultiple(
      model: embeddingModel,
      texts: inputs,
    );
    return vectors.map((vector) => DenseVector(data: vector)).toList();
  }
}

Future<void> main() async {
  final vectorSpace = OpenAIQdrantVectorSpace(
    connector: OpenAIConnector(
      apiKey: const String.fromEnvironment('OPENAI_API_KEY'),
    ),
    host: 'qdrant.example.com',
    port: 6334,
    apiKey: const String.fromEnvironment('QDRANT_API_KEY'),
    organization: 'knowledge-base',
    dimension: 1536,
    chunkSize: 800,
  );

  await vectorSpace.ensureCollection();
}
```
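Continuing the Qdrant example above: once `ensureCollection()` has run, the space can be used like any other `VectorSpace`. This sketch reuses the `upsertAll()`/`query()` calls shown in the Firestore example; the record contents are placeholders:

```dart
// Assumes the `vectorSpace` from the example above, after ensureCollection().
await vectorSpace.upsertAll([
  VectorUpsert(
    id: 'handbook/vacation',
    content: 'Employees accrue 1.5 vacation days per month.',
    metadata: {'category': 'handbook'},
  ),
]);

final result = await vectorSpace.query('How many vacation days do I get?');
print(result.results.first.content);
```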
## Example: Add Other Tools Alongside Retrieval

`rag()` can expose your own tools in addition to the built-in retrieval tool.

```dart
final answer = await agent.rag(
  tools: [
    MyCalculatorTool(),
    MyDatabaseWriteTool(),
  ],
  maxRecursiveToolCalls: 2,
);
```

That gives the model access to:

- the built-in `query` tool for retrieval,
- plus whatever extra tools you supply.
## Example: Bring Your Own Vector Store

If you do not want Pinecone, implement `VectorSpace` yourself.

This naive in-memory example is useful for tests and local demos:

```dart
import 'package:rag/rag.dart';

class MemoryVectorSpace extends VectorSpace {
  final Map<String, VectorUpsert> _items = {};

  @override
  Future<void> upsertAll(List<VectorUpsert> upserts) async {
    for (final upsert in upserts) {
      _items[upsert.id] = upsert;
    }
  }

  @override
  Future<void> deleteAll(List<String> ids, {int batchSize = 999}) async {
    for (final id in ids) {
      _items.remove(id);
    }
  }

  @override
  Future<void> purgeAll() async => _items.clear();

  @override
  Stream<String> list({int batchSize = 100, String? prefix}) async* {
    for (final id in _items.keys) {
      if (prefix == null || id.startsWith(prefix)) {
        yield id;
      }
    }
  }

  @override
  Future<VectorSpaceResult> query(String query, {int maxResults = 10}) async {
    // Naive term matching: a record matches if it contains any query term.
    final terms = query.toLowerCase().split(RegExp(r'\s+'));
    final matches = _items.values.where((item) {
      final haystack = item.content.toLowerCase();
      return terms.any(haystack.contains);
    }).take(maxResults);

    final results = <VectorResult>[];
    for (final item in matches) {
      results.add(
        VectorResult(
          id: item.id,
          content: item.content,
          metadata: item.metadata,
          score: 1,
          contentTokenCount: item.content.split(RegExp(r'\s+')).length,
        ),
      );
    }

    return VectorSpaceResult(results: results);
  }
}
```
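A quick usage sketch for the in-memory store, handy in unit tests where no network backend is available (the FAQ record is a placeholder):

```dart
Future<void> main() async {
  final space = MemoryVectorSpace();

  await space.upsertAll([
    VectorUpsert(
      id: 'faq/shipping',
      content: 'Standard shipping takes 3 to 5 business days.',
      metadata: {'category': 'faq'},
    ),
  ]);

  final result = await space.query('how long does shipping take');
  print(result.results.first.id); // faq/shipping
}
```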
## How To Use This Package Well

- Store focused, retrieval-friendly chunks instead of entire books or huge documents.
- Write `content` fields as plain-language passages the model can quote or summarize.
- Use stable, prefixable IDs like `customer/123/profile` so cleanup is easy.
- Put filters, tags, or record identity into `metadata` during upsert.
- Keep the system prompt explicit: tell the model to search before answering.
- Use multiple queries only when asking distinct sub-questions.
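For instance, given any `VectorSpace` instance (called `vectorSpace` here), a prefixable ID scheme makes it straightforward to remove everything belonging to one customer in a single call; the `customer/123/...` ID layout is just an illustrative convention:

```dart
// Illustrative: with IDs like 'customer/123/profile' and
// 'customer/123/orders', one prefix delete cleans up the whole customer.
await vectorSpace.deleteAllPrefixed('customer/123/');
```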
## Notes

- `package:rag/rag.dart` exports the shared agent and vector abstractions, not the backend implementations.
- Import backend classes from `package:rag/spaces/pinecone_space.dart`, `package:rag/spaces/fire_space.dart`, and `package:rag/spaces/qdrant_space.dart`.
- `QueryTool` is designed around full-text search questions, not keyword lists.
- `VectorSpace.queryAll()` merges the results from every query, sorts by score, then trims the final context to `maxTokens` and `maxResults`.
- `PineconeVectorSpace` writes `content` into the `text` field and includes any provided metadata during upsert.
- `FirestoreVectorSpace` uses Firestore document IDs as vector IDs.
- `QdrantVectorSpace` uses `organization` as the collection/namespace name.
## Package Layout

```text
lib/
  rag.dart            // primary export surface
  rag_agent.dart      // Agent wrapper with built-in retrieval tool
  query_tool.dart     // Tool schema and tool implementation
  vector_space.dart   // backend abstraction and result models
  spaces/
    pinecone_space.dart // Pinecone-backed VectorSpace
    fire_space.dart     // Firestore-backed VectorSpace base class
    qdrant_space.dart   // Qdrant-backed VectorSpace base class
```