receipt_recognition 0.2.8
receipt_recognition: ^0.2.8 copied to clipboard
A Flutter package for scanning and extracting structured data from supermarket receipts using Google's ML Kit.
Changelog #
All notable changes to this project will be documented in this file.
0.2.8 β 2026-01-08 #
π Fixed #
- Unit Parsing: Improved logic for separating unit prices from product text, specifically handling cases where unit information might appear as trailing text or between other tokens more reliably.
0.2.7 β 2025-12-28 #
π οΈ Changed #
- Bounding Box Stability: Added a default zero-degree angle to bounding boxes generated from multi-line entities, ensuring more consistent geometry during receipt reconstruction.
β Tests #
- DM Integration: Expanded assertions for dm-drogerie markt receipts to include detailed validation of unit quantities and unit prices for line items.
0.2.6 β 2025-12-27 #
β¨ Added #
- Aldi-style unit support: Added logic to detect and parse unit information (quantity/price) when printed on the line above the product name (common on Aldi receipts).
- Expanded Currency Support: Improved monetary regex to handle currency symbols appearing after the amount (e.g.,
0,49 β¬).
π οΈ Changed #
- Skew Angle Optimization: Implemented a more robust, weighted-median approach for skew estimation across multiple frames, significantly improving alignment stability during video scans.
- Parsing Bounds: Refined
rightBoundandcenterBoundcalculations (3/4 and 1/2 of receipt width respectively) to better distinguish between item amounts and unknown metadata. - Unit Detection: Enhanced
_tryParseUnitand_tryParseUnknownto use more precise horizontal center-bounds, preventing misclassification of product text as unknown entities. - Integer Parsing: Improved integer extraction logic to ensure quantity detection doesn't accidentally pick up parts of decimal numbers.
β Tests #
- ALDI Integration: Added comprehensive integration tests for ALDI receipts, validating store name, total, purchase date, and multi-line unit/quantity parsing.
0.2.5 β 2025-12-14 #
β¨ Added #
- New store support: Added Kaufland and dm store support (including improved/real-world handling).
- Parsing support: Added handling for asterisk-marked receipt lines.
π οΈ Changed #
- Normalization: Normalize spaces more consistently to improve matching under OCR spacing glitches.
- Total-label handling: Refined label-related conditions to improve robustness.
- Position creation:
- Prefer creating positions from the same row when possible.
- Apply non-strict position creation only for unused amounts to avoid accidental matches.
- Build/CI hygiene: Ignore other build paths to reduce noise and avoid unintended file interactions.
- Internal: Dynamic script language adjustments and minor method renaming/cleanup.
β Tests #
- Added and reorganized tests, including expanded integration test assertions (e.g., more unit/quantity/price checks).
0.2.4 β 2025-12-04 #
β¨ Added #
- Extended integration test coverage: Added some comprehensive integration tests with special validation for deposit positions.
π Fixed #
- Total identification: Fixed a critical bug where totals were not correctly identified when total labels were present.
- Unit parsing tolerance: Adjusted vertical tolerance for better multi-line receipt handling.
0.2.3 β 2025-12-03 #
β¨ Added #
- Extended integration tests: Added comprehensive integration tests for multiple receipt types (EDEKA, LIDL, and ALDI) with validation of store names, totals, purchase dates, positions, and specific line items.
- Text processor testing support: Added
debugRunSynchronouslyForTestsflag toReceiptTextProcessorfor synchronous parsing in test environments. - Truncated frequency calculation: New
calculateTruncatedFrequencymethod that merges truncated leading-token alternatives into their longer counterparts before counting frequency.
π οΈ Changed #
- Unified unit parsing: Merged
_tryParseUnitQuantityand_tryParseUnitPriceinto a single_tryParseUnitmethod for more efficient unit detection. - Total label detection improvements:
- Now uses normalized keys for better matching accuracy
- Added prefix matching for total labels (e.g., "SUMME EUR" will match "summe")
- Parser refinements:
- Fixed bug where
_findTotalLabelwas incorrectly called instead of_findTotalwhen searching for totals - Improved unit text extraction to avoid including numeric portions
- Enhanced unknown entity filtering to prevent lines with too many numeric characters from being classified as unknown
- Fixed bug where
- Normalization improvements:
- Updated frequency calculation for alternative postfix texts to use truncated frequency method.
- Moved truncated frequency calculator to the correct normalization method for alternative texts.
- Refined alternative text frequency calculation in
RecognizedProduct.
- Parser simplification: Reduced complexity by removing overly complicated logic and making the codebase simpler and more maintainable.
- OCR autocorrection: Reduced OCR autocorrection aggressiveness for better accuracy.
- Skew calculation: Removed outliers from skew angle calculation for more stable alignment.
- Ignored keywords: Added "Subtotal" to the list of keywords to ignore during product recognition.
- Timeout handling: Improved timeout behavior for better responsiveness.
π Fixed #
- Entity type preservation: Fixed issue where entities were being incorrectly cast, now properly creates new
instances with correct types when filtering (e.g.,
RecognizedTotalβRecognizedAmount,RecognizedTotalLabelβRecognizedUnknown) - Unit boundary detection: Changed unit parsing to use
_rightL(line)instead of_leftL(line)for proper right-bound checking - Unit parsing: Fixed a critical bug where unit prices with negative total prices were not correctly handled with proper sign adjustment.
- Unit validation: Corrected unit price-quantity validation logic to properly check tolerance thresholds.
- Parsing stability: Multiple fixes to improve parsing reliability and reduce edge-case errors.
0.2.2 β 2025-11-19 #
π οΈ Changed #
- Product recognition threshold: Increased strictness for position confirmation β now requires 80% of positions to pass validation (up from 75%) for more reliable receipt parsing.
- Product name handling: Improved handling of leading digits β now allows product names with leading digits if they contain 6 or more letters total, reducing false rejections.
π Fixed #
- Normalization: Removed verbose debug comments from OCR correction logic, improving code clarity.
- Integration tests: Corrected asset path reference in integration tests.
π§Ή Housekeeping #
- Removed
.metadatatracking file from version control in example app. - Updated
.gitignoreto exclude metadata files going forward.
0.2.1 β 2025-11-13 #
β¨ Added #
- Added Spar to the default store list.
π οΈ Changed #
- Store and total-label detection is now more tolerant to OCR spacing glitches.
- Store extraction now only considers candidates above the first amount line, reducing false positives.
- Improved skew angle estimation for more stable item alignment.
- Parser now performs a fallback pass to assign unmatched amounts, reducing missing line items.
π Fixed #
- Purchase date handling no longer locks onto early, unstable detections; continuous scans refine it correctly.
- Minor stability improvements in the optimizer and parser.
0.2.0 β 2025-11-08 #
β¨ Added #
- Unit price & quantity parsing: extract quantity and price-per-unit for line items.
allowedProductGroupsoption to control which product groups are accepted during parsing.- Position bounding box exposure for recognized positions.
- Minor improvements around pseudo positions for better preview/UX.
π οΈ Changed #
- Tuned defaults for stability and performance (cache size, thresholds, timeouts).
- Normalization tweaks: treat
+as a normal character; refined space handling; extended patterns. - Removed legacy keyword groups (discount/deposit) in favor of
allowedProductGroups. - Internal type/use cleanups (prefer
double/intwhere appropriate). - Purchase date recognition is tuned for better robustness.
π Fixed #
- Lost groups during merging: fixed issues where groups could disappear or be dropped in long/unstable scans.
- Deposit handling: corrected parsing so deposit lines donβt corrupt totals or item classification.
- Product group classification: fixes to ensure correct group assignment and consistent downstream behavior.
- General stability fixes across optimizer and parser components.
0.1.9 β 2025-10-27 #
β¨ Added #
- Unit tests covering additional normalization scenarios.
π οΈ Changed #
- Adjusted
onScanTimeoutsignature inReceiptRecognizer. - Improved normalization and output: use normalized text consistently and refined purchase date output.
- Tuned optimizer cache for better stability and responsiveness.
π Fixed #
- Corrected skew angle handling and bounds calculation for accurate receipt alignment.
- Prevented misclassification of prices as product names.
0.1.8 - 2024-10-25 #
0.1.7 β 2025-10-24 #
β¨ Added #
- Integration test for single-image recognition in the example app:
example/integration_test/receipt_recognizer_test.dart
purchaseDatedetection and exposure onRecognizedReceipt.- Configurability via the new options/tuning system:
ReceiptOptionsto control parsing, validation, and scanning behavior.ReceiptTuningfor adjustable merge/stability thresholds.kReceiptDefaultOptionsas a sensible baseline.ReceiptRuntimehelper to apply options during parsing.
π οΈ Changed #
processImagenow always returns aRecognizedReceipt(nevernull), even when incomplete or invalid.- Clarified result semantics for continuous scanning (video):
isValid: essential structure present and totals are coherent.isConfirmed: result stabilized across frames (finalized by optimizer).
- Scanning guidance:
- Video: stop when
receipt.isValid && receipt.isConfirmed. - Single image:
isConfirmedtypically not achievable; treatisValidas sufficient or prompt re-capture.
- Video: stop when
- Parser and Optimizer refactored and improved for more robust extraction and more stable results across frames.
π§Ή Deprecated #
minValidScansinReceiptRecognizer(constructor parameter) in favor of stability/confirmation rules and tuning viaReceiptOptions/ReceiptTuning.- Legacy
sum/sumLabelnaming: prefertotal(RecognizedTotal) andtotalLabel(RecognizedTotalLabel). RecognizedCompanyis considered legacy; preferRecognizedStore.
0.1.6 - 2025-09-22 #
0.1.5 - 2025-09-21 #
β¨ Added #
-
Configurable stores support in the parsing pipeline, recognizer now accepts an
optionsmap that is forwarded to the receipt parser -
Example app updates:
- Introduces
opencv_dartin the exampleβs dependencies used for image preprocessing
- Introduces
π οΈ Changed #
ReceiptRecognizerconstructor: new optionaloptionsparameter- Default scanning cadence:
scanIntervalincreased from 50 ms β 100 ms
0.1.4 - 2025-09-20 #
0.1.3 - 2025-09-19 #
0.1.2 - 2025-09-18 #
π Fixed #
- Scan Validation: Fixed the issue where scans could complete prematurely before reaching the minimum valid scan count
- Match Percentage Calculation: Corrected calculation logic for sum match percentage to use actual values instead of scan-weighted values
- Ignore Keywords: Added "Subtotal" to the list of keywords to ignore during product recognition
π οΈ Changed #
- Product Display: Updated debug output to use normalized product text for better consistency
- Recognition Accuracy: Enhanced filtering of non-product lines to improve overall scan quality
0.1.1 - 2025-09-09 #
π Documentation #
- Streamlined README to reflect current capabilities and scope
- Removed outdated roadmap/status references and aligned wording with the present feature set
- Clarified Implementation Status and reorganized sections for better readability
- Minor copyedits for consistency (terminology, headings, and formatting)
0.1.0 - 2025-06-06 #
β¨ Added #
- Stable Sum Detection: The scanner now identifies the most likely total amount by combining confidence and position tracking across multiple frames
- Live Bounding Boxes: Detected receipt items are now outlined live on the camera preview, making it easier to see what's recognized in real time
π οΈ Changed #
- Metadata Filtering: The engine is better at ignoring irrelevant lines like footer notes or fiscal metadata, keeping the product list clean
- Enhanced Merge Logic: Optimized how multi-frame receipts are combined, improving structure and reliability even in fragmented scans
π Fixed #
- Incorrect Totals: Fixed an issue where prices from unrelated lines could falsely be interpreted as the receiptβs sum
- Layout Glitches: Resolved occasional misalignments or line skips when processing longer or poorly lit receipts
0.0.9 - 2025-06-03 #
β¨ Added #
- Smarter Scan Confidence Filtering: The recognition engine can now automatically ignore low-confidence items that might distort the total sum
- Status Getter: Added a simple flag to detect if scanning is still in progress
- Improved Example App: The example app now includes best practice instructions, live scan overlays, and a clearer UI for scanning and viewing receipts
π οΈ Changed #
- Internal Optimizations: Minor refactoring and documentation improvements to ensure better readability and maintainability of the code
π Fixed #
- Sum Calculation Accuracy: Fixed cases where incorrect items could cause mismatches with the printed total
- Scan Stability: Improved reliability of the scan process for varied receipt lengths and layouts
0.0.8 - 2024-05-29 #
β¨ Added #
- Manual Receipt Acceptance: Added
acceptReceipt()method to manually accept receipts with validation discrepancies - Enhanced Validation Flow: Improved receipt validation with configurable thresholds and detailed status reporting
- Validation State Tracking: Added comprehensive validation states (complete, nearlyComplete, incomplete, invalid)
- Documentation: Added detailed documentation on validation workflow and manual acceptance
π οΈ Changed #
- Implementation Status Table: Improved formatting of status table in documentation
- Progress Reporting: Enhanced
ScanProgressmodel with more detailed validation information - Validation Logic: Refined match percentage calculation between calculated and declared sums
π Fixed #
- Receipt Merging: Fixed edge cases in receipt merging that could lead to inaccurate total sums
- Stability Thresholds: Adjusted default stability thresholds for more reliable scanning
0.0.6 - 2024-05-18 #
β¨ Added #
- Intelligent product name normalization via
ReceiptNormalizer.normalizeFromGroup - Graph-based item ordering using
PositionGraphwith fuzzy matching and topological sort CachedReceiptconsolidation with improved trust-based merging logic- Parallelized parsing and graph resolution using Dart's
compute()isolates - Outlier filtering for OCR noise and trailing receipt lines
- Unit tests for name cleaning and group resolution
π οΈ Changed #
- Restructured
core/folder for better modularity - Improved
ReceiptParser._shrinkEntities()to robustly remove lines below sum/total - Enhanced debug output when scanning receipts in
ReceiptRecognizer
π Documentation #
- Added concise DartDoc to all public APIs, data models, and helpers
β Tests #
- Added group-based name cleaning tests
- Validated space vs. no-space consensus logic (e.g. "Co ke Zero" vs. "Coke Zero")
0.0.5 β 2025-05-05 #
π Documentation #
- Added comprehensive
dartdoccomments for all model classes inlib/src/receipt_models.dart
0.0.4 β 2025-05-04 #
β¨ Added #
- Optimizer Module: Introduced a new optimizer module to enhance the performance and efficiency of the receipt recognition pipeline
- Product Name Detection Enhancements: Implemented advanced techniques for more accurate extraction and recognition of product names from receipts
π οΈ Changed #
- Company Regex Enhancements: Updated regular expressions related to company name detection to enhance parsing accuracy
π Fixed #
- Product Name Extraction: Resolved issues where certain product names were not being accurately extracted due to inconsistent formatting
- Company Name Detection: Fixed bugs in the regular expressions that led to incorrect company name identification in specific receipt formats
- Parser Stability: Addressed edge cases in the parser that previously caused errors when processing receipts with uncommon layouts
0.0.3 - 2025-04-24 #
β Added #
- Unit Tests: Introduced unit tests to enhance code reliability and ensure consistent behavior across updates
π οΈ Changed #
- Parser Refactoring:
- Moved parsing methods to a dedicated parser module, improving code organization and maintainability
- Converted parser methods to static, facilitating easier access without instantiating classes
π Fixed #
- Company Name Recognition: Refined regular expressions related to company name detection, enhancing accuracy in identifying company names from receipts
0.0.2 β 2025-04-23 #
β¨ Added #
ReceiptWidget: A customizable Flutter widget that displays parsed receipt data in a layout resembling a real supermarket receipt- Advanced example demonstrating integration with a live video feed using the
camerapackage
π οΈ Changed #
- Updated
README.mdto include usage examples and details about models etc.
π Fixed #
- Minor bug fixes and performance improvements