llm_llamacpp 0.1.0

llama.cpp backend implementation for LLM interactions. Enables local on-device inference with GGUF models on Android, iOS, macOS, Windows, and Linux.
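
A minimal sketch of what on-device generation could look like with this package. The import path and the `LlamaCppBackend`, `loadModel`, `generateStream`, and `dispose` names are assumptions made for illustration, not the published API; see the API reference for the actual classes.

```dart
// Hypothetical usage sketch; all class and method names below are assumed,
// not taken from the package's documented API.
import 'dart:io';

import 'package:llm_llamacpp/llm_llamacpp.dart';

Future<void> main() async {
  // Load a local GGUF model from disk (the path is illustrative).
  final backend = LlamaCppBackend();
  final model =
      await backend.loadModel('/models/llama-3-8b-instruct.Q4_K_M.gguf');

  // Stream tokens as they are generated; inference runs off the main isolate.
  await for (final token
      in model.generateStream('Explain GGUF in one sentence.')) {
    stdout.write(token);
  }

  // Release the model when finished.
  await model.dispose();
}
```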

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

0.1.0 - 2025-01-19

Added

  • Initial release
  • Local on-device inference with GGUF models via llama.cpp
  • Cross-platform support: Android, iOS, macOS, Windows, Linux
  • Streaming token generation with isolate-based inference
  • Multiple prompt templates: ChatML, Llama2, Llama3, Alpaca, Vicuna, Phi-3 (a ChatML example follows this list)
  • Tool calling support via prompt convention
  • GPU acceleration support (CUDA, Metal, Vulkan)
  • Model management features:
    • Model discovery in directories
    • Model loading with pooling (reference counting)
    • GGUF metadata reading without loading
    • HuggingFace model downloading
    • Safetensors to GGUF conversion
  • Native Assets build hook for automatic binary management
  • Prebuilt binaries available via GitHub Releases
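
The ChatML template referenced above wraps each turn in `<|im_start|>role … <|im_end|>` markers. Below is an illustrative Dart helper for that convention; it shows the format only and is not the package's own formatter.

```dart
/// Renders a conversation in the ChatML format:
///   <|im_start|>role\ncontent<|im_end|>
/// and leaves the assistant turn open so generation continues from there.
String renderChatMl(List<MapEntry<String, String>> messages) {
  final buffer = StringBuffer();
  for (final m in messages) {
    buffer
      ..writeln('<|im_start|>${m.key}')
      ..writeln('${m.value}<|im_end|>');
  }
  buffer.write('<|im_start|>assistant\n');
  return buffer.toString();
}

void main() {
  print(renderChatMl([
    MapEntry('system', 'You are a helpful assistant.'),
    MapEntry('user', 'What is a GGUF file?'),
  ]));
}
```
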
0 likes · 160 points · -- downloads

Publisher

unverified uploader


Repository (GitHub)
View/report issues
Contributing

Topics

#llamacpp #llama #llm #flutter #ffi

Documentation

API reference

License

MIT

Dependencies

code_assets, ffi, flutter, hooks, http, llm_core, logging, path

More

Packages that depend on llm_llamacpp

Packages that implement llm_llamacpp