core/chat library
Classes
- InferenceChat
- StopTokenFilter
- Filters stop tokens from the model response stream. For .litertlm on iOS, MediaPipe doesn't handle <end_of_turn>; this filter detects and terminates the stream at the stop token, with buffering for partial tag matches.
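The buffering behaviour described for StopTokenFilter can be illustrated with a small, language-neutral sketch. This is not the library's implementation: the function name `filter_stop_token` and its parameters are hypothetical, and Python is used only to keep the example short and runnable. It shows one way to end a chunked stream at the stop tag while holding back any trailing text that might be the beginning of that tag.

```python
from typing import Iterable, Iterator

STOP_TOKEN = "<end_of_turn>"  # stop tag named in the StopTokenFilter description


def filter_stop_token(chunks: Iterable[str], stop: str = STOP_TOKEN) -> Iterator[str]:
    """Yield text from `chunks` up to, but not including, the first `stop`.

    Hypothetical helper for illustration; not part of the documented API.
    """
    buffer = ""  # text held back because it may be the start of the stop tag
    for chunk in chunks:
        buffer += chunk
        index = buffer.find(stop)
        if index != -1:
            # Full stop tag found: emit what precedes it and end the stream.
            if index > 0:
                yield buffer[:index]
            return
        # Hold back the longest suffix that is a proper prefix of the stop tag.
        keep = 0
        for size in range(min(len(stop) - 1, len(buffer)), 0, -1):
            if buffer.endswith(stop[:size]):
                keep = size
                break
        safe = buffer[: len(buffer) - keep]
        if safe:
            yield safe
        buffer = buffer[len(buffer) - keep:]
    # Stream ended without a stop tag: the held-back text was ordinary output.
    if buffer:
        yield buffer
```

Holding back only the longest suffix that could still grow into the stop tag keeps latency low: ordinary text is forwarded immediately, and a false start such as a lone `<` is released as soon as the next chunk shows it was not the tag.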
Constants
- defaultMaxFunctionBufferLength → const int
- Default maximum length of the function call buffer before its contents are flushed as plain text. Must accommodate verbose formats (DeepSeek tags, parallel calls).
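The role of such a limit can be sketched as follows. Everything here is an assumption made for illustration: the `route` function, the brace markers used to recognise a call, and the value 32 are hypothetical, and the actual default and detection logic are not stated on this page. The sketch buffers text that looks like the start of a function call and, if the buffer grows past the limit without completing, gives up and releases it as ordinary text.

```python
from typing import Iterable, Iterator, Tuple

MAX_BUFFER_LENGTH = 32  # hypothetical limit for this sketch, not the library default


def route(
    chunks: Iterable[str],
    start: str = "{",   # assumed marker that opens a function call
    end: str = "}",     # assumed marker that closes a function call
    max_len: int = MAX_BUFFER_LENGTH,
) -> Iterator[Tuple[str, str]]:
    """Yield ("call", text) for completed calls and ("text", text) otherwise."""
    buffer = ""
    buffering = False
    for chunk in chunks:
        if not buffering:
            if chunk.lstrip().startswith(start):
                buffering = True  # looks like a call: start holding text back
            else:
                yield ("text", chunk)
                continue
        buffer += chunk
        if buffer.rstrip().endswith(end):
            # The call closed before the limit was reached.
            yield ("call", buffer)
            buffer, buffering = "", False
        elif len(buffer) > max_len:
            # Too long to be a plausible call: flush the buffer as text.
            yield ("text", buffer)
            buffer, buffering = "", False
    if buffer:
        # Stream ended mid-buffer: release what was held back.
        yield ("text", buffer)
```

A limit that is too small would flush legitimate but verbose calls as text, which is presumably why the description asks for room for tagged and parallel call formats.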