Flutter OCR Kit

A Flutter FFI plugin for OCR (Optical Character Recognition) with Edge AI support. Runs AI inference directly on mobile devices using ONNX Runtime and native OCR engines.

Screenshots

Real-time Text Search · Invoice Scanner · Quotation Scanner

Demo Video


The demo includes 4 examples:

  1. Real-time text search - Find specific text strings in camera view
  2. Real-time KIE - Extract specific information types (dates, phone numbers, amounts)
  3. Invoice scanner - Scan Taiwan e-invoices and extract invoice number, date, amount
  4. Quotation scanner - Scan custom delivery notes with Layout Detection + OCR to extract items, prices, totals

Features

  • Native OCR Engine: Uses Apple Vision (iOS) and Google ML Kit (Android) for text recognition
  • Layout Detection: ONNX-based document layout analysis (PP-Layout model) to identify tables, text blocks, titles, and figures
  • Edge AI: All processing runs locally on device - no internet required
  • Cross-platform: Supports both iOS and Android

Supported Platforms

Platform | OCR Engine | Layout Detection | Native Library
iOS | Apple Vision | ONNX Runtime + OpenCV | Static (.a)
Android | Google ML Kit | ONNX Runtime + OpenCV | Dynamic (.so)

Installation

1. Add Dependency

dependencies:
  flutter_ocr_kit:
    git:
      url: https://github.com/robert008/flutter_ocr_kit.git

2. Download AI Model

Download the ONNX model from GitHub Releases:

Model | Size | Description
pp_doclayout_l.onnx | 123 MB | Layout detection model

Steps:

  1. Create assets/ folder in your project root
  2. Download the model file and place it in assets/
  3. Register in pubspec.yaml:
flutter:
  assets:
    - assets/pp_doclayout_l.onnx
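
OcrKit.init() (see Quick Start) takes a filesystem path, while Flutter assets live inside the app bundle. Below is a minimal sketch of one way to bridge the two by copying the bundled model to a temporary file at startup; extractModel is a hypothetical helper, not part of the plugin API.

import 'dart:io';
import 'package:flutter/services.dart' show rootBundle;

// Hypothetical helper: copy the bundled ONNX model out of the asset bundle
// so it can be passed to OcrKit.init() as a regular file path.
Future<String> extractModel() async {
  final file = File('${Directory.systemTemp.path}/pp_doclayout_l.onnx');
  if (!await file.exists()) {
    final data = await rootBundle.load('assets/pp_doclayout_l.onnx');
    await file.writeAsBytes(
      data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes),
      flush: true,
    );
  }
  return file.path;
}

Call OcrKit.init(await extractModel()); once before the first detectLayout call.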

3. Platform Setup

iOS: Run pod install in your app's ios/ directory. The native libraries will be downloaded automatically.

Android: The native libraries are bundled with the package.

Quick Start

Basic OCR

import 'package:flutter_ocr_kit/flutter_ocr_kit.dart';

// Recognize text from image file
final result = await OcrKit.recognizeNative('/path/to/image.jpg');

print('Full text: ${result.fullText}');
for (final line in result.textLines) {
  print('${line.text} (confidence: ${line.score})');
}

Layout Detection

// Initialize layout model
OcrKit.init('/path/to/pp_doclayout_l.onnx');

// Detect document layout
final layout = OcrKit.detectLayout('/path/to/document.jpg');

for (final region in layout.detections) {
  print('${region.className}: (${region.x1}, ${region.y1}) - (${region.x2}, ${region.y2})');
}

// Release model when done to free memory
OcrKit.releaseLayout();

Note (iOS): Layout detection uses Core ML Execution Provider for faster inference, but consumes more memory (~1.5GB). Always call OcrKit.releaseLayout() when you no longer need layout detection to free memory. To disable Core ML and use CPU only (lower memory, slower speed), modify src/detect/doc_detector.cpp and rebuild the static library.

Supported layout classes: Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, Equation

Combined: Layout + OCR

// Step 1: Detect layout to find table regions
final layout = OcrKit.detectLayout(imagePath);
final tableRegions = layout.detections.where((d) => d.className == 'Table');

// Step 2: Run OCR on the entire image
final ocrResult = await OcrKit.recognizeNative(imagePath);

// Step 3: Filter OCR results within table regions
for (final table in tableRegions) {
  final tableTexts = ocrResult.textLines.where((line) {
    // Check if the text box overlaps the table region
    return line.rect.overlaps(Rect.fromLTRB(table.x1, table.y1, table.x2, table.y2));
  });
  print('Table content: ${tableTexts.map((t) => t.text).join(' ')}');
}

Example App

The example app includes 4 tabs demonstrating different use cases:

Tab 1: OCR

Basic OCR demonstration:

  • Pick image from gallery or capture with camera
  • Display recognized text with bounding boxes (see the overlay sketch below)
  • Show confidence scores for each text line
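
A minimal sketch of how such an overlay can be drawn, assuming the overlay widget has the same aspect ratio as the source image. OcrBoxPainter is a hypothetical name, not part of the plugin:

import 'package:flutter/material.dart';
import 'package:flutter_ocr_kit/flutter_ocr_kit.dart';

// Hypothetical painter: scales each TextLine's rect from image coordinates
// to canvas coordinates and draws it as an outlined box.
class OcrBoxPainter extends CustomPainter {
  OcrBoxPainter(this.result);

  final OcrResult result;

  @override
  void paint(Canvas canvas, Size size) {
    final scaleX = size.width / result.imageWidth;
    final scaleY = size.height / result.imageHeight;
    final boxPaint = Paint()
      ..style = PaintingStyle.stroke
      ..strokeWidth = 2
      ..color = Colors.greenAccent;

    for (final line in result.textLines) {
      final r = line.rect;
      canvas.drawRect(
        Rect.fromLTRB(
            r.left * scaleX, r.top * scaleY, r.right * scaleX, r.bottom * scaleY),
        boxPaint,
      );
    }
  }

  @override
  bool shouldRepaint(OcrBoxPainter oldDelegate) => oldDelegate.result != result;
}

Stack a CustomPaint(painter: OcrBoxPainter(result)) on top of the displayed image, e.g. inside an AspectRatio matching imageWidth / imageHeight.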

Tab 2: KIE (Key Information Extraction)

Simple regex-based entity extraction:

  • Extract dates, amounts, phone numbers from OCR results
  • Demonstrates how to post-process OCR output (a minimal sketch follows)
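
For reference, this kind of post-processing can be as simple as running a few regular expressions over fullText. The patterns below are illustrative, not the exact ones used in the example app:

import 'package:flutter_ocr_kit/flutter_ocr_kit.dart';

// Illustrative regex-based KIE: collect every date, phone number, and amount
// found in the recognized text.
Map<String, List<String>> extractEntities(OcrResult result) {
  final patterns = {
    'date': RegExp(r'\d{4}[/-]\d{1,2}[/-]\d{1,2}'),
    'phone': RegExp(r'09\d{2}[- ]?\d{3}[- ]?\d{3}'), // Taiwan mobile format (assumption)
    'amount': RegExp(r'(?:NT\$|\$)\s?\d[\d,]*'),
  };
  return {
    for (final entry in patterns.entries)
      entry.key:
          entry.value.allMatches(result.fullText).map((m) => m.group(0)!).toList(),
  };
}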

Tab 3: Invoice Scanner

Taiwan e-invoice scanner demo:

  • Real-time camera scanning
  • Extracts invoice number (XX-12345678 format)
  • Extracts amount and period
  • Auto-deduplication by invoice number

Important: This is a specialized demo for Taiwan e-invoice format. It demonstrates how to combine real-time OCR with custom extraction logic. You will need to modify the extraction rules (invoice_extractor.dart) for your own document format.
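
For illustration, the invoice-number rule and the deduplication step could look roughly like this (a sketch, not the actual contents of invoice_extractor.dart):

import 'package:flutter_ocr_kit/flutter_ocr_kit.dart';

// Taiwan e-invoice numbers follow the XX-12345678 pattern: two uppercase
// letters, a dash, and eight digits. Deduplicate across camera frames by
// remembering numbers that were already seen.
final _invoicePattern = RegExp(r'[A-Z]{2}-\d{8}');
final _seenNumbers = <String>{};

String? extractInvoiceNumber(OcrResult result) {
  final match = _invoicePattern.firstMatch(result.fullText);
  if (match == null) return null;
  final number = match.group(0)!;
  return _seenNumbers.add(number) ? number : null; // null = already scanned
}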

Tab 4: Quotation Scanner

Quotation/delivery note scanner demo:

  • Uses Layout Detection to find Table regions
  • Runs OCR within detected regions
  • Extracts quotation number, date, customer, items, and totals
  • Supports both photo mode and real-time camera mode

Important: This is a specialized demo for a specific quotation format. It demonstrates how to combine Layout Detection + OCR for structured document extraction. The extraction logic (quotation_extractor.dart) is tailored for the demo documents and will need customization for your own document format.
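
As a rough sketch of the region-based step (the field parsing in quotation_extractor.dart is more involved), the OCR lines that fall inside a detected Table region can be grouped into rows by vertical position:

import 'dart:ui' show Rect;
import 'package:flutter_ocr_kit/flutter_ocr_kit.dart';

// Sketch: keep only OCR lines that overlap the detected table region, group
// them into rows whose tops lie within `rowTolerance` pixels of each other,
// and sort each row left-to-right. The tolerance value is an assumption.
List<List<TextLine>> groupTableRows(
  LayoutDetection table,
  OcrResult ocr, {
  double rowTolerance = 12,
}) {
  final region = Rect.fromLTRB(table.x1, table.y1, table.x2, table.y2);
  final lines = ocr.textLines.where((l) => region.overlaps(l.rect)).toList()
    ..sort((a, b) => a.rect.top.compareTo(b.rect.top));

  final rows = <List<TextLine>>[];
  for (final line in lines) {
    if (rows.isNotEmpty &&
        (line.rect.top - rows.last.first.rect.top).abs() < rowTolerance) {
      rows.last.add(line);
    } else {
      rows.add([line]);
    }
  }
  for (final row in rows) {
    row.sort((a, b) => a.rect.left.compareTo(b.rect.left));
  }
  return rows;
}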

Demo Files

Editable demo files are provided for testing and customization:

example/assets/demo/
  invoices/              # Sample Taiwan e-invoice images
    invoice_1.jpg
    invoice_2.jpg
    invoice_3.jpg
  quotations/            # Sample quotation PDFs (editable)
    宏達科技_出貨單_HD-2024120001.pdf
    宏達科技_出貨單_HD-2024120015.pdf

You can modify the PDF files to test with your own data, then convert to images for scanning.

How to Build Your Own Document Scanner

The Invoice and Quotation demos show the pattern for building custom document scanners:

  1. Define your extraction rules - Create an extractor class (see invoice_extractor.dart or quotation_extractor.dart)

  2. Use regex patterns - Define patterns for the fields you want to extract:

// Example: Extract order number like "ORD-2024-001234"
final orderPattern = RegExp(r'ORD-\d{4}-\d{6}');
final match = orderPattern.firstMatch(ocrResult.fullText);

  3. Use Layout Detection (optional) - For structured documents with tables:

// Find table regions first
final tables = layout.detections.where((d) => d.className == 'Table');
// Then extract data from table area only

  4. Handle confidence scores - Filter low-confidence results:

final reliableText = ocrResult.textLines.where((line) => line.score > 0.8);
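
Putting these steps together, a minimal extractor class might look like the sketch below. Field names and patterns are placeholders to adapt for your own documents:

import 'package:flutter_ocr_kit/flutter_ocr_kit.dart';

// Placeholder extractor: filter out low-confidence lines, then apply the
// field patterns to the remaining text.
class OrderExtractor {
  static final _orderPattern = RegExp(r'ORD-\d{4}-\d{6}');
  static final _totalPattern = RegExp(r'Total[:\s]*\$?([\d,]+)');

  Map<String, String?> extract(OcrResult ocr) {
    final text = ocr.textLines
        .where((line) => line.score > 0.8)
        .map((line) => line.text)
        .join('\n');

    return {
      'orderNumber': _orderPattern.firstMatch(text)?.group(0),
      'total': _totalPattern.firstMatch(text)?.group(1),
    };
  }
}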

Project Structure

lib/
  flutter_ocr_kit.dart              # Main API (OcrKit class)
  src/
    models.dart                     # Data models (TextLine, OcrResult, LayoutResult)
    ocr_service.dart                # Async OCR service with isolate support
    invoice_extractor.dart          # Taiwan e-invoice extraction (demo)
    quotation_extractor.dart        # Quotation extraction (demo)

src/                                # Native C++ code (FFI)
  native_lib.cpp                    # FFI exported functions
  detect/
    doc_detector.cpp                # Layout detection with ONNX
  ocr/
    ocr_engine.cpp                  # OCR engine (backup, not used by default)

ios/
  Classes/
    OcrKitPlugin.m                  # iOS plugin entry
    VisionOcr.m                     # Apple Vision OCR implementation
  static_libs/                      # Pre-built static libraries (.a)
  Frameworks/                       # ONNX Runtime & OpenCV frameworks

android/
  src/main/
    kotlin/.../OcrKitPlugin.kt      # Android plugin (ML Kit OCR)
    jniLibs/                        # Pre-built dynamic libraries (.so)
    cpp/include/                    # ONNX Runtime & OpenCV headers

API Reference

OcrKit

Method | Description
init(modelPath) | Initialize ONNX layout model
detectLayout(imagePath) | Detect document layout regions
recognizeNative(imagePath) | OCR using native engine (Vision/ML Kit)
recognizeFromFile(imagePath) | OCR using ONNX model (backup)

OcrResult

Property | Type | Description
fullText | String | Concatenated text from all lines
textLines | List<TextLine> | Individual text lines with positions
imageWidth | int | Source image width
imageHeight | int | Source image height

TextLine

Property | Type | Description
text | String | Recognized text content
score | double | Confidence score (0.0 - 1.0)
rect | Rect | Bounding box position
wordBoxes | List<Rect> | Word-level bounding boxes

LayoutResult

Property | Type | Description
detections | List<LayoutDetection> | Detected regions
count | int | Number of detected regions

LayoutDetection

Property | Type | Description
className | String | Region type (Table, Text, Title, Figure, etc.)
x1, y1, x2, y2 | double | Bounding box coordinates
score | double | Detection confidence

Building from Source

Prerequisites

  • Flutter SDK 3.7+
  • Xcode 14+ (for iOS)
  • Android NDK (for Android)

Build Commands

# Run example app
cd example && flutter run

# Analyze code
flutter analyze

# Build Android native library (.so)
./scripts/build_android_so.sh

# Build iOS static library (.a)
./scripts/build_ios_static.sh

License

MIT License