B4J Library jtokkit - Java Tokenizer Kit

This is a wrapper for the following GitHub project.

JTokkit is a Java tokenizer library designed for use with OpenAI models.
jtokkit.knuddels.de/

Download the jar from https://mvnrepository.com/artifact/com.knuddels/jtokkit/1.1.0 and put it into your additional libraries folder.
Do the same with the files in the attachment.

Introduction

JTokkit aims to be a fast and efficient tokenizer for natural language processing tasks that use the OpenAI models. It provides an easy-to-use interface for tokenizing input text, for example to count the tokens a request to the GPT-3.5 model will require. The library grew out of the need for capabilities in the JVM ecosystem similar to those the tiktoken library provides for Python.

Features

✅ Implements encoding and decoding via r50k_base, p50k_base, p50k_edit, cl100k_base and o200k_base
✅ Easy-to-use API
✅ Easy extensibility for custom encoding algorithms
✅ Zero Dependencies
✅ Supports Java 8 and above
✅ Fast and efficient performance
 

Attachments

  • jtokkitV0.02.zip
    3.3 KB

Tim Chapman

I have made the sub with the select case statement below based on this page:

Counting Tokens:
Private Sub lblInputTokens_MouseClicked (EventData As MouseEvent)

    tokkit.Initialize("") ' Event name is not used

    Dim selectedModel As String = cmbSelectModel.Items.Get(cmbSelectModel.SelectedIndex)
    Dim encoding As String

    Select selectedModel
        Case "o1-preview", "o1-mini", "gpt-4o", "gpt-4o-mini"
            encoding = "o200k_base"
            
        Case "gpt-4", "gpt-4-32k", "gpt-4-turbo", "gpt-4-turbo-32k", "gpt-3.5-turbo", "gpt-3.5-turbo-16k"
            encoding = "cl100k_base"

        Case "text-curie-001", "text-babbage-002", "text-ada-001", "davinci", "curie", "babbage", "ada","dall-e", "dall-e-inpainting","text-davinci-003"
            encoding = "r50k_base"

        Case "whisper-1", "tts-1", "tts-1-hd"
            encoding = "p50k_base"
            
        Case Else
            xui.MsgboxAsync("Selected model does not have a defined encoding.", "Error")
            Return
    End Select
    
    tokkit.SetModelEncoding(encoding) ' Set the encoding based on the selected model

    Dim tokens As List = tokkit.encode(txtAIInput.text)
    Log(tokens)
    
    Dim decoded As String = tokkit.decode(tokens)
    Log($"Decoded ${decoded}"$)
    
    ' Count the tokens in the decoded string
    Dim decodedTokens As List = tokkit.encode(decoded)
    Dim tokenCount As Int = decodedTokens.Size
    Log($"Token Count: ${tokenCount}"$)
    
    lblInputTokens.Text = "Tokens: " & tokenCount
End Sub

I am getting the error below.

OpenAI says this about the error:

The error you're encountering indicates that the tokkit.SetModelEncoding(encoding) line is trying to set an encoding that is not recognized by the tokkit library. Specifically, the error message states that there is no enum constant for com.knuddels.jtokkit.api.ModelType.o200k_base. This suggests that the encoding string you're using does not match any of the expected values in the tokkit library.

Error occurred on line: 875 (Main)
java.lang.IllegalArgumentException: No enum constant com.knuddels.jtokkit.api.ModelType.o200k_base
at java.base/java.lang.Enum.valueOf(Enum.java:273)
at com.knuddels.jtokkit.api.ModelType.valueOf(ModelType.java:9)
at de.donmanfred.EncodingRegistrywrapper.SetModelEncoding(EncodingRegistrywrapper.java:58)
at b4j.example.main._lblinputtokens_mouseclicked(main.java:7383)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at anywheresoftware.b4a.shell.Shell.runMethod(Shell.java:629)
at anywheresoftware.b4a.shell.Shell.raiseEventImpl(Shell.java:234)
at anywheresoftware.b4a.shell.Shell.raiseEvent(Shell.java:167)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at anywheresoftware.b4a.BA.raiseEvent2(BA.java:111)
at anywheresoftware.b4a.shell.ShellBA.raiseEvent2(ShellBA.java:100)
at anywheresoftware.b4a.BA.raiseEvent(BA.java:98)
at anywheresoftware.b4j.objects.NodeWrapper$1.handle(NodeWrapper.java:109)
at anywheresoftware.b4j.objects.NodeWrapper$1.handle(NodeWrapper.java:1)
at javafx.base/com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:86)
at javafx.base/com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:234)
at javafx.base/com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:191)
at javafx.base/com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
at javafx.base/com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
at javafx.base/com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at javafx.base/com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at javafx.base/com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at javafx.base/com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at javafx.base/com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at javafx.base/com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
at javafx.base/com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:54)
at javafx.base/javafx.event.Event.fireEvent(Event.java:198)
at javafx.graphics/javafx.scene.Scene$ClickGenerator.postProcess(Scene.java:3597)
at javafx.graphics/javafx.scene.Scene$MouseHandler.process(Scene.java:3899)
at javafx.graphics/javafx.scene.Scene.processMouseEvent(Scene.java:1885)
at javafx.graphics/javafx.scene.Scene$ScenePeerListener.mouseEvent(Scene.java:2618)
at javafx.graphics/com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:409)
at javafx.graphics/com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:299)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:391)
at javafx.graphics/com.sun.javafx.tk.quantum.GlassViewEventHandler.lambda$handleMouseEvent$2(GlassViewEventHandler.java:447)
at javafx.graphics/com.sun.javafx.tk.quantum.QuantumToolkit.runWithoutRenderLock(QuantumToolkit.java:412)
at javafx.graphics/com.sun.javafx.tk.quantum.GlassViewEventHandler.handleMouseEvent(GlassViewEventHandler.java:446)
at javafx.graphics/com.sun.glass.ui.View.handleMouseEvent(View.java:556)
at javafx.graphics/com.sun.glass.ui.View.notifyMouse(View.java:942)
at javafx.graphics/com.sun.glass.ui.win.WinApplication._runLoop(Native Method)
at javafx.graphics/com.sun.glass.ui.win.WinApplication.lambda$runLoop$3(WinApplication.java:174)
at java.base/java.lang.Thread.run(Thread.java:832)

Any ideas how to fix this? It seems like it should work.
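
The trace above shows `SetModelEncoding` handing its string argument to `ModelType.valueOf`, and `Enum.valueOf` only accepts the exact name of a declared constant. Here is a minimal Java sketch of that behaviour. The `ModelType` enum in it is a hypothetical stand-in with a few illustrative constants, not jtokkit's real enum, and the `lookup` helper is only an assumption about how a caller might guard the call.

```java
import java.util.Optional;

public class EnumLookupSketch {

    // Hypothetical stand-in enum; NOT the real com.knuddels.jtokkit.api.ModelType.
    public enum ModelType { GPT_4, GPT_4O, GPT_3_5_TURBO }

    // Returns the constant whose name matches exactly (case-sensitive),
    // or an empty Optional instead of letting IllegalArgumentException escape.
    public static Optional<ModelType> lookup(String name) {
        try {
            return Optional.of(ModelType.valueOf(name));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A string that is the exact name of a constant resolves.
        System.out.println(lookup("GPT_4O").isPresent());     // prints true
        // An encoding name is not a constant of this enum, so valueOf rejects it.
        System.out.println(lookup("o200k_base").isPresent()); // prints false
    }
}
```

For the code in this post, this suggests that the string passed to `SetModelEncoding` would have to be one of the constant names defined in jtokkit's `ModelType`, not an encoding name such as `o200k_base`. Which constants exist depends on the jtokkit version in use.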
 

DonManfred


Tim Chapman

I am sorry. I thought this would be OK since it pertained to tiktoken, but you are right. Should I create a new thread, or continue here?
 

Tim Chapman

New thread is here: