flutter_llama 0.1.0
A Flutter plugin for running LLM inference with llama.cpp and GGUF models on Android and iOS.
Changelog #
0.1.0 - 2025-10-21 #
Added #
- Initial release of flutter_llama
- Support for GGUF model loading (see the usage sketch after this list)
- Blocking text generation API
- Streaming text generation API
- GPU acceleration support (Metal on iOS, Vulkan on Android)
- Configurable model parameters (threads, GPU layers, context size, etc.)
- Configurable generation parameters (temperature, top-p, top-k, etc.)
- Model info retrieval
- Stop generation functionality
- Full iOS (Swift) implementation
- Full Android (Kotlin + JNI) implementation
- Comprehensive documentation and examples
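
The snippet below is a minimal usage sketch of the API surface listed above: loading a GGUF model with configurable model parameters, blocking and streaming generation with sampling parameters, model info retrieval, and stopping generation. The class, method, and parameter names (`FlutterLlama`, `loadModel`, `generate`, `generateStream`, `getModelInfo`, `stopGeneration`) are assumptions made for illustration, not the plugin's confirmed API; consult the package documentation for the exact signatures.

```dart
import 'dart:io';

import 'package:flutter_llama/flutter_llama.dart';

// NOTE: all class, method, and parameter names below are hypothetical.
Future<void> runLlamaDemo() async {
  final llama = FlutterLlama();

  // Load a GGUF model with configurable model parameters.
  await llama.loadModel(
    modelPath: '/path/to/model.gguf',
    threads: 4,        // CPU threads used for inference
    gpuLayers: 32,     // layers offloaded to Metal (iOS) / Vulkan (Android)
    contextSize: 2048, // context window in tokens
  );

  // Retrieve basic model info (architecture, parameter count, etc.).
  final info = await llama.getModelInfo();
  print('Loaded model: $info');

  // Blocking generation with configurable sampling parameters.
  final reply = await llama.generate(
    prompt: 'Explain GGUF in one sentence.',
    temperature: 0.7,
    topP: 0.9,
    topK: 40,
    maxTokens: 128,
  );
  print(reply);

  // Streaming generation: tokens are emitted as they are produced.
  final stream = llama.generateStream(prompt: 'Write a haiku about Flutter.');
  await for (final token in stream) {
    stdout.write(token);
  }

  // stopGeneration() could be called (e.g. from a cancel button) to abort
  // an in-flight generation; dispose() releases the native model.
  await llama.stopGeneration();
  await llama.dispose();
}
```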
Features #
- Native llama.cpp integration
- High-performance inference
- Cross-platform support (iOS and Android)
- Easy-to-use Dart API
- Production-ready code with error handling (see the error-handling sketch after this list)
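
As a sketch of how error handling might look on the Dart side, the example below wraps loading and generation in a try/catch. It assumes, without confirmation from the plugin, that native failures surface as `PlatformException`s, and it reuses the hypothetical `FlutterLlama`, `loadModel`, `generate`, and `dispose` names from the sketch above.

```dart
import 'package:flutter/foundation.dart';
import 'package:flutter/services.dart';
import 'package:flutter_llama/flutter_llama.dart';

// Sketch only: class/method names and the exception type are assumptions.
Future<String?> safeGenerate(String prompt) async {
  final llama = FlutterLlama();
  try {
    await llama.loadModel(modelPath: '/path/to/model.gguf');
    return await llama.generate(prompt: prompt);
  } on PlatformException catch (e) {
    // Native-side failures (missing model file, out of memory, ...)
    // are assumed to surface as PlatformExceptions in Dart.
    debugPrint('Inference failed: ${e.message}');
    return null;
  } finally {
    // Release the native model regardless of success or failure.
    await llama.dispose();
  }
}
```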