Tiktoken class

Encodes and decodes text strings and counts the number of tokens they contain.

It implements the Tiktoken tokeniser, a byte pair encoding (BPE) tokeniser used by OpenAI's models.

The supported models are:

  • GPT-4
  • GPT-4o
  • GPT-4o-mini
  • o1
  • o1-mini
  • o1-preview

Splitting text strings into tokens is useful because GPT models see text in the form of tokens. Knowing how many tokens are in a text string tells you:

  • Whether the text is too long for a model to process.
  • How much an OpenAI API call will cost (usage is priced by token).
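
The two checks above can be sketched in Dart as follows. This is a minimal illustration rather than part of the library: the import path, the helper names fitsWithinLimit and estimateCostUsd, the 8192-token limit, and the price per million tokens are all assumptions chosen for the example, so substitute the real values for your model and package setup.

```dart
// Assumed import path; adjust it to match the package name in your pubspec.yaml.
import 'package:tiktoken_tokenizer_gpt4o_o1/tiktoken_tokenizer_gpt4o_o1.dart';

/// Hypothetical helper: true if [tokenCount] does not exceed [maxTokens].
bool fitsWithinLimit(int tokenCount, int maxTokens) => tokenCount <= maxTokens;

/// Hypothetical helper: cost of [tokenCount] tokens at a price quoted
/// per one million tokens.
double estimateCostUsd(int tokenCount, double usdPerMillionTokens) =>
    tokenCount * usdPerMillionTokens / 1000000;

void main() {
  final tiktoken = Tiktoken(OpenAiModel.gpt_4);
  const prompt = 'hello world';

  // count() returns the number of tokens in the string.
  final tokens = tiktoken.count(prompt);

  // Illustrative numbers only, not real model limits or prices.
  const maxTokens = 8192;
  const usdPerMillionTokens = 30.0;

  print('tokens: $tokens');
  print('fits: ${fitsWithinLimit(tokens, maxTokens)}');
  print('estimated cost: ${estimateCostUsd(tokens, usdPerMillionTokens)} USD');
}
```

Keeping the limit and price as parameters, rather than hard-coding them inside the helpers, makes it easy to swap in the figures for whichever model you are calling.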

Note that different models use different encodings. See TiktokenEncodingType.

Example usage of encode, decode and count methods:

var tiktoken = Tiktoken(OpenAiModel.gpt_4);
var encoded = tiktoken.encode("hello world");
var decoded = tiktoken.decode(encoded);
int numberOfTokens = tiktoken.count("hello world");
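
As a quick sanity check, encoding a string and decoding the result would be expected to give back the original text. The sketch below makes that check explicit and spells out the types from the method signatures. The import path and the helper name roundTrips are assumptions made for this example, not part of the library.

```dart
import 'dart:typed_data';

// Assumed import path; adjust it to match the package name in your pubspec.yaml.
import 'package:tiktoken_tokenizer_gpt4o_o1/tiktoken_tokenizer_gpt4o_o1.dart';

/// Hypothetical helper: encodes [text], decodes the result, and reports
/// whether the decoded string equals the original.
bool roundTrips(
  String text,
  Uint32List Function(String) encode,
  String Function(Uint32List) decode,
) {
  final Uint32List tokens = encode(text);
  return decode(tokens) == text;
}

void main() {
  final tiktoken = Tiktoken(OpenAiModel.gpt_4o);
  const text = 'hello world';

  // encode() yields token ids as a Uint32List; decode() turns them back
  // into a String.
  final Uint32List encoded = tiktoken.encode(text);
  print('token ids: $encoded');

  // Pass the instance methods as tear-offs to the helper.
  print('round trip ok: ${roundTrips(text, tiktoken.encode, tiktoken.decode)}');
}
```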

Alternatively, you can use the static helper functions getEncoder and getEncoderForModel to get a TiktokenEncoder first:

var encoder = Tiktoken.getEncoder(TiktokenEncodingType.o200k_base);
var encoderForModel = Tiktoken.getEncoderForModel(OpenAiModel.gpt_4o);

Note that TiktokenEncoder gives you more fine-grained control over the encoding process.

Visit the online Tiktokenizer: https://tiktokenizer.vercel.app/?model=gpt-4o

Constructors

Tiktoken(OpenAiModel model)

Properties

hashCode → int
The hash code for this object.
no setter, inherited
model → OpenAiModel
final
runtimeType → Type
A representation of the runtime type of the object.
no setter, inherited

Methods

count(String text) → int
decode(Uint32List encoded) → String
encode(String text) → Uint32List
noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toString() → String
A string representation of this object.
inherited

Operators

operator ==(Object other) → bool
The equality operator.
inherited

Static Methods

getEncoder(TiktokenEncodingType encodingType) → TiktokenEncoder
Returns the tiktoken encoding for the given encodingType.
getEncoderForModel(OpenAiModel model) → TiktokenEncoder
Returns the tiktoken encoding used by the given model.