
Retrieval Augmented Generation for Agentic

rag #

rag is a Dart package that adds retrieval-augmented generation to agents built with the agentic package.

This repository ships two things:

  • the rag library under lib/
  • a small Flutter test harness under rag_example/

It does not ship a standalone server, CLI, or opinionated ingestion pipeline.

What The Library Provides #

  • RagAgent, an Agent subclass that exposes retrieval through a built-in query tool.
  • QueryTool, the tool implementation and generated schema the model calls for retrieval.
  • VectorSpace, a backend-agnostic abstraction for indexing, querying, listing, and deleting records.
  • PineconeVectorSpace, a concrete Pinecone implementation.
  • FirestoreVectorSpace, a Firestore-backed embedding vector implementation.
  • QdrantVectorSpace, a Qdrant-backed embedding vector implementation.
  • VectorUpsert.fromIChunk(...), a helper for turning agentic chunk objects into indexable vector records.

Installation #

dart pub add rag

Imports #

The primary import is:

import 'package:rag/rag.dart';

package:rag/rag.dart currently re-exports:

  • the agentic public surface
  • RagAgent
  • QueryTool
  • VectorSpace, VectorUpsert, VectorResult, and VectorSpaceResult
  • PineconeVectorSpace
  • FirestoreVectorSpace
  • QdrantVectorSpace

Backend-specific imports still exist if you prefer them:

import 'package:rag/spaces/fire_space.dart';
import 'package:rag/spaces/pinecone_space.dart';
import 'package:rag/spaces/qdrant_space.dart';

How Retrieval Works #

At runtime, RagAgent.rag() does the same work as Agent.call(), but it always injects the built-in query tool.

The flow is:

  1. You create a RagAgent with an agentic chat model, a chat provider, and a VectorSpace.
  2. You add messages to the agent as usual.
  3. You call agent.rag().
  4. The model can call the query tool with a queries array.
  5. QueryTool calls vectorSpace.queryAll(queries).
  6. The matching records are returned to the model as tool output.
  7. The model answers using the retrieved context.

QueryTool is built for full-text questions, not keyword bags. If the model needs multiple queries, they should be distinct sub-questions ordered by importance.
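As an illustration of that guidance, here is a hypothetical arguments payload the model might send to the query tool (the map shape mirrors the tool's queries field; the specific questions are invented):

```dart
// Full-text sub-questions, ordered by importance -- the shape
// QueryTool is designed for.
const Map<String, dynamic> goodArguments = {
  'queries': [
    'What is Jane Doe allergic to?',
    'Which medications does Jane Doe currently take?',
  ],
};

// A keyword bag like this is the shape to avoid:
const Map<String, dynamic> poorArguments = {
  'queries': ['jane doe allergy medication'],
};
```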

Quick Start #

This is the simplest supported shape: index some records, create a RagAgent, and let the model retrieve before it answers.

import 'package:rag/rag.dart';

const openAiKey = String.fromEnvironment('OPENAI_API_KEY');
const pineconeKey = String.fromEnvironment('PINECONE_API_KEY');

Future<void> main() async {
  VectorSpace vectorSpace = PineconeVectorSpace(
    namespace: 'patients',
    host: 'https://your-index-host.pinecone.io',
    apiKey: pineconeKey,
    rerank: true,
  );

  await vectorSpace.upsertAll([
    VectorUpsert(
      id: 'jane-doe/demographics',
      content: 'Jane Doe was born on March 14, 1986.',
      metadata: {'patient': 'Jane Doe', 'record': 'demographics'},
    ),
    VectorUpsert(
      id: 'jane-doe/allergies',
      content: 'Jane Doe has a documented penicillin allergy.',
      metadata: {'patient': 'Jane Doe', 'record': 'allergies'},
    ),
    VectorUpsert(
      id: 'jane-doe/medications',
      content: 'Jane Doe currently takes 10 mg of lisinopril daily.',
      metadata: {'patient': 'Jane Doe', 'record': 'medications'},
    ),
  ]);

  RagAgent agent = RagAgent(
    user: 'caregiver-portal',
    llm: OpenAIConnector(apiKey: openAiKey).connect(ChatModel.openai4_1Mini),
    chatProvider: MemoryChatProvider(
      messages: [
        Message.system(
          'You answer caregiver questions about Jane Doe. '
          'Always search the record before answering.',
        ),
      ],
    ),
    vectorSpace: vectorSpace,
  );

  await agent.addMessage(Message.user('What is Jane Doe allergic to?'));

  AgentMessage answer = await agent.rag();
  print(answer.content);
}

Core API #

RagAgent #

RagAgent extends Agent and adds a required vectorSpace field plus two convenience helpers:

QueryTool getQueryTool()

Future<AgentMessage> rag({
  ToolSchema? responseFormat,
  List<Tool> tools = const [],
  int maxRecursiveToolCalls = 1,
})

rag() prepends the retrieval tool to any extra tools you pass in.
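For example, a call that mixes retrieval with your own tools might look like the sketch below. It assumes an agent constructed as in Quick Start; lookupWeather is a hypothetical Tool of yours, not part of this package:

```dart
// The retrieval tool is prepended automatically, so lookupWeather
// is available alongside the built-in query tool.
AgentMessage answer = await agent.rag(
  tools: [lookupWeather],
  maxRecursiveToolCalls: 3,
);
```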

QueryTool #

QueryTool is a Tool with:

  • default tool name: query
  • a generated schema with one required field: queries
  • a plain-text return value summarizing the matched records

Internally it does:

vectorSpace.queryAll(qt.queries)

That means the default QueryTool behavior uses the VectorSpace.queryAll() implementation provided by the active backend.

VectorSpace #

VectorSpace is the storage interface behind the package:

  • upsertAll(List<VectorUpsert> upserts)
  • query(String query, {int maxResults = 10})
  • queryAll(List<String> queries, {int maxResults = 100, int compactQueriesTo = 10, int maxTokens = 50000})
  • list({int batchSize = 100, String? prefix})
  • deleteAll(List<String> ids, {int batchSize = 999})
  • deleteAllPrefixed(String prefix, {int deleteBatchSize = 999, int listBatchSize = 100})
  • purgeAll()

The default queryAll() implementation:

  • compacts long query lists down to compactQueriesTo
  • runs query() for each query
  • merges the results
  • sorts by descending score
  • trims the baked result set to maxTokens
  • trims again to maxResults
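The steps above can be sketched in plain Dart. This is a simplified illustration, not the package's implementation: it uses a minimal local result type instead of VectorResult, and it assumes the merge step deduplicates by record ID, keeping the highest score per record.

```dart
class _Result {
  _Result(this.id, this.score, this.tokens);
  final String id;
  final double score;
  final int tokens;
}

List<_Result> queryAllSketch(
  List<List<_Result>> perQueryResults, {
  int maxResults = 100,
  int maxTokens = 50000,
}) {
  // 1. Merge results from every query, deduplicating by id.
  final Map<String, _Result> merged = {};
  for (final results in perQueryResults) {
    for (final r in results) {
      final existing = merged[r.id];
      if (existing == null || r.score > existing.score) {
        merged[r.id] = r;
      }
    }
  }

  // 2. Sort by descending score.
  final sorted = merged.values.toList()
    ..sort((a, b) => b.score.compareTo(a.score));

  // 3. Trim to the token budget, then to maxResults.
  final List<_Result> kept = [];
  var tokens = 0;
  for (final r in sorted) {
    if (tokens + r.tokens > maxTokens) break;
    kept.add(r);
    tokens += r.tokens;
  }
  return kept.take(maxResults).toList();
}
```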

VectorUpsert, VectorResult, and VectorSpaceResult #

  • VectorUpsert is the write model: id, content, and metadata
  • VectorResult extends VectorUpsert with score and contentTokenCount
  • VectorSpaceResult wraps List<VectorResult> and can merge or bake results

VectorUpsert.fromIChunk(...) converts an agentic.IChunk-shaped object into an upsert with metadata for:

  • index
  • start
  • length
  • lod
  • from

plus any metadata you supply.

Backend Notes #

PineconeVectorSpace #

PineconeVectorSpace is ready to use as-is.

It requires:

  • namespace
  • host
  • apiKey

It also supports:

  • optional reranking through rerank
  • rerank tuning through rerankTopK and rerankModel
  • list()
  • deleteAll()
  • deleteAllPrefixed()
  • purgeAll()

On upsert, it stores your content in Pinecone's text field and merges in your metadata. On query, it uses Pinecone text search and estimates token counts with bpe.

FirestoreVectorSpace #

FirestoreVectorSpace is ready to use when you provide an embedder and a Firestore collection:

FirestoreVectorSpace(
  embedder: OpenAIConnector(
    apiKey: openAiKey,
  ).asEmbedder('text-embedding-3-small'),
  collection: FirestoreDatabase.instance.collection('vectors'),
)

It stores each upsert as a Firestore document where:

  • the Firestore document ID is the vector ID
  • contentField stores the text body
  • metadataField stores arbitrary metadata
  • vectorField stores the embedding vector

Configurable fields:

  • vectorField, default vector
  • contentField, default content
  • metadataField, default metadata
  • distanceMeasure, default VectorDistanceMeasure.cosine

Its query() implementation uses Firestore nearest-neighbor search, preserves metadata, and assigns finite descending scores based on result rank.

QdrantVectorSpace #

QdrantVectorSpace is ready to use when you provide an embedder plus Qdrant connection details:

QdrantVectorSpace(
  embedder: OpenAIConnector(
    apiKey: openAiKey,
  ).asEmbedder('text-embedding-3-small'),
  host: 'your-cluster.us-east-1.aws.cloud.qdrant.io',
  port: 6334,
  apiKey: qdrantKey,
  organization: 'docs',
  chunkSize: 1024,
  dimension: 1536,
)

Important behavior:

  • organization is used as the collection name and namespace
  • ensureCollection() creates the collection, along with the entry and record indexes, if the collection does not already exist
  • destroyCollection() removes the collection
  • deleteEntry() deletes all points matching an entry
  • deleteRecord() deletes all points matching a record
  • caller IDs are converted to deterministic UUIDs internally
  • the original caller ID is also stored in the Qdrant payload under id
  • list() and query results round-trip the original caller IDs
  • queryAll() performs batched multi-query retrieval, hydrates point payloads, and safely decodes nested metadata payloads

The Qdrant payload currently stores:

  • id
  • text
  • optional record
  • optional entry
  • optional metadata
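A typical collection lifecycle, sketched from the behavior listed above (the exact signatures of ensureCollection(), deleteRecord(), and destroyCollection() are assumptions based on this description; keys and hosts are placeholders):

```dart
final qdrant = QdrantVectorSpace(
  embedder: OpenAIConnector(apiKey: openAiKey)
      .asEmbedder('text-embedding-3-small'),
  host: 'your-cluster.us-east-1.aws.cloud.qdrant.io',
  port: 6334,
  apiKey: qdrantKey,
  organization: 'docs',
  chunkSize: 1024,
  dimension: 1536,
);

// Create the collection and its indexes if they don't exist yet.
await qdrant.ensureCollection();

await qdrant.upsertAll([
  VectorUpsert(
    id: 'returns-policy/0',
    content: 'Refunds are accepted within 30 days of delivery.',
    metadata: {'record': 'returns-policy'},
  ),
]);

// Later: delete all points for one record, or tear everything down.
await qdrant.deleteRecord('returns-policy');
await qdrant.destroyCollection();
```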

Using The Query Tool Directly #

You do not have to call rag() if you want retrieval by itself.

import 'package:rag/rag.dart';

Future<void> main() async {
  RagAgent agent = RagAgent(
    user: 'demo',
    llm: OpenAIConnector(apiKey: 'sk-...').connect(ChatModel.openai4_1Mini),
    chatProvider: MemoryChatProvider(),
    vectorSpace: PineconeVectorSpace(
      namespace: 'docs',
      host: 'https://your-index-host.pinecone.io',
      apiKey: 'pcsk-...',
    ),
  );

  String toolOutput = await agent.getQueryTool().call(
    agent: agent,
    arguments: {
      'queries': [
        'What is the refund window?',
        'How long does shipping take?',
      ],
    },
  );

  print(toolOutput);
}

Indexing agentic Chunks #

If you already have chunk objects from agentic, VectorUpsert.fromIChunk(...) is the easiest way to preserve chunk metadata while indexing:

VectorUpsert upsert = VectorUpsert.fromIChunk(
  chunk: chunk,
  idPrefix: 'faq/',
  metadata: {'record': 'returns-policy'},
);

That helper expects a chunk-like object with fields compatible with agentic.IChunk, including index, charStart, charEnd, fullContent, lod, and from.

Bring Your Own Vector Store #

If Pinecone, Firestore, and Qdrant are not the right fit, you can implement VectorSpace yourself.

import 'package:rag/rag.dart';

class MemoryVectorSpace extends VectorSpace {
  MemoryVectorSpace({Map<String, VectorUpsert>? items})
    : items = items ?? <String, VectorUpsert>{};

  final Map<String, VectorUpsert> items;

  @override
  Future<void> upsertAll(List<VectorUpsert> upserts) {
    for (VectorUpsert upsert in upserts) {
      items[upsert.id] = upsert;
    }

    return Future.value();
  }

  @override
  Future<void> deleteAll(List<String> ids, {int batchSize = 999}) {
    for (String id in ids) {
      items.remove(id);
    }

    return Future.value();
  }

  @override
  Future<void> purgeAll() {
    items.clear();
    return Future.value();
  }

  @override
  Stream<String> list({int batchSize = 100, String? prefix}) async* {
    for (String id in items.keys) {
      if (prefix == null || id.startsWith(prefix)) {
        yield id;
      }
    }
  }

  @override
  Future<VectorSpaceResult> query(String query, {int maxResults = 10}) {
    List<String> terms = query.toLowerCase().split(RegExp(r'\s+'));
    Iterable<VectorUpsert> matches = items.values.where((item) {
      String haystack = item.content.toLowerCase();
      return terms.any(haystack.contains);
    }).take(maxResults);

    return Future.value(
      VectorSpaceResult(
        results: [
          for (VectorUpsert item in matches)
            VectorResult(
              id: item.id,
              content: item.content,
              metadata: item.metadata,
              score: 1.0,
              contentTokenCount: item.content.split(RegExp(r'\s+')).length,
            ),
        ],
      ),
    );
  }
}

Repository Layout #

lib/
  rag.dart
  rag_agent.dart
  query_tool.dart
  vector_space.dart
  gen/
    artifacts.gen.dart
    exports.gen.dart
  spaces/
    pinecone_space.dart
    fire_space.dart
    qdrant_space.dart

test/
  vector_space_test.dart
  spaces/
    fire_space_test.dart
    qdrant_space_test.dart

rag_example/
  lib/main.dart

About rag_example #

rag_example/ is a lightweight Flutter harness for experimenting with the package. It currently wires up:

  • Firebase and Firestore
  • a FirestoreVectorSpace subclass
  • an OpenRouterConnector embedding implementation
  • a one-shot chunk indexing and retrieval smoke test on startup

It is useful as a local sandbox, but the package itself is the main product in this repository.


Publisher: arcane.art

License: unknown

Dependencies: agentic, artifact, bpe, fast_log, fire_api, fixnum, grpc, pineconedb, qdrant, toxic, uuid
