flutter_onnxruntime_genai 0.1.6
flutter_onnxruntime_genai: ^0.1.6
A Flutter FFI plugin wrapping Microsoft ONNX Runtime GenAI C-API for on-device multimodal inference on Android and iOS devices.
0.1.6 #
- Fixed multimodal text-only inference: Always process through `OgaProcessorProcessImages` even without images (required for vision models like Phi-3.5).
- Added KV-cache memory management: Set `max_length` to 2048 tokens to prevent OOM crashes on mobile devices.
- Added ONNX GenAI internal logging callback for better debugging.
- Added signal handlers (SIGSEGV, SIGABRT, etc.) for crash debugging.
- Enhanced debug logging with granular step tracking around critical API calls.
- Fixed crash during `OgaGenerator_SetInputs` caused by prompt size exceeding `max_length`.
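
The `max_length` entries above can be illustrated with a small pre-flight check. This is a minimal sketch, not the plugin's actual code: `check_prompt_fits`, `PromptCheck`, and `kMaxLength` are hypothetical names, and the sketch assumes `max_length` bounds the total sequence (prompt plus generated tokens), so a prompt must be strictly shorter than it to leave room for output.

```cpp
#include <cstddef>
#include <string>

// Value taken from the 0.1.6 changelog entry; the constant name is hypothetical.
constexpr std::size_t kMaxLength = 2048;

// Result of the check: whether generation may proceed, how many tokens
// remain for output, and a human-readable error when it may not.
struct PromptCheck {
    bool ok;
    std::size_t tokens_left;
    std::string error;
};

// Rejects prompts that would fill or exceed max_length before any inputs
// are handed to the generator (assumption: max_length covers prompt + output).
PromptCheck check_prompt_fits(std::size_t prompt_tokens,
                              std::size_t max_length = kMaxLength) {
    if (prompt_tokens >= max_length) {
        return {false, 0,
                "prompt has " + std::to_string(prompt_tokens) +
                " tokens, max_length is " + std::to_string(max_length)};
    }
    return {true, max_length - prompt_tokens, ""};
}
```

Running such a check before setting generator inputs would turn an oversized prompt into a reportable error rather than a native crash.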
0.1.5 #
- Internal testing release.
0.1.4 #
- Added comprehensive debug logging to trace native C++ execution step-by-step.
- Debug logs use Android Logcat (`__android_log_print`) on Android and `stderr` on other platforms.
- Added multi-image inference support via `run_inference_multi` and `runInferenceMultiAsync`.
- Debug logging can be disabled by setting `ONNX_DEBUG_LOG` to `0` in `flutter_onnxruntime_genai.cpp`.
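
The logging behavior described above might look roughly like the following. This is a sketch under stated assumptions: only `ONNX_DEBUG_LOG`, `__android_log_print`, and `stderr` come from the changelog; the `DEBUG_LOG` macro name, the `"OnnxGenAI"` log tag, and `kDebugLoggingEnabled` are hypothetical.

```cpp
#include <cstdio>

// Compile-time switch named in the changelog; defaulting it to 1 is an assumption.
#ifndef ONNX_DEBUG_LOG
#define ONNX_DEBUG_LOG 1
#endif

#if ONNX_DEBUG_LOG
  #if defined(__ANDROID__)
    // Android: route messages to Logcat.
    #include <android/log.h>
    #define DEBUG_LOG(...) \
        __android_log_print(ANDROID_LOG_DEBUG, "OnnxGenAI", __VA_ARGS__)
  #else
    // Other platforms: printf-style message to stderr, newline appended.
    #define DEBUG_LOG(...)                    \
        do {                                  \
            std::fprintf(stderr, __VA_ARGS__); \
            std::fputc('\n', stderr);         \
        } while (0)
  #endif
#else
  // Logging disabled: the macro expands to nothing.
  #define DEBUG_LOG(...) do {} while (0)
#endif

// Lets callers check the switch at compile time (hypothetical helper).
constexpr bool kDebugLoggingEnabled = (ONNX_DEBUG_LOG != 0);
```

A call such as `DEBUG_LOG("step %d: creating generator", 3);` then compiles on every platform and costs nothing when the switch is set to `0`.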
0.1.3 #
- Fixed Android runtime crash: Added missing `libonnxruntime.so` dependency to jniLibs.
- Updated build script to automatically copy ONNX Runtime library alongside GenAI library.
0.1.2 #
- Fixed C++ API compatibility with ONNX Runtime GenAI C header.
- Updated `OgaTokenizerEncode` to use pre-created sequences.
- Replaced deprecated `OgaGeneratorParamsSetInputSequences` with `OgaGenerator_AppendTokenSequences`.
- Replaced non-existent `OgaGenerator_ComputeLogits` with a direct call to `OgaGenerator_GenerateNextToken`.
- Replaced `OgaGenerator_GetLastToken` with `OgaGenerator_GetNextTokens`.
- Fixed `OgaProcessorProcessImages` function name.
- Fixed `OgaGenerator_SetInputs` to be called on generator instead of params.
0.1.1 #
- Include prebuilt stripped native libraries for Android and iOS.
- Reduced package size for pub.dev compatibility.
- Updated documentation.
0.1.0 #
- Initial experimental release.
- Support for ONNX Runtime GenAI C-API.
- Multimodal inference support (Text + Image) for models like Phi-3.5 Vision.
- Support for Android (with 16KB page alignment) and iOS.
- Async inference via background isolates.
- Token-by-token streaming output.