B4A Library [B4X] [Chargeable] AI Embeddings - Turn SQLite and EVERY DB into a vector database

We are all familiar with AI Embeddings: they are the basis of RAG (Retrieval Augmented Generation) for AI. This library turns SQLite, and every other database, into a vector database. The embeddings of the chunked text must be saved as text (the json list toCompactString) in SQLite or the other non-vector DBs. This means you can do RAG directly in your iOS, Android or Windows/Linux/Mac app without accessing external vector databases (in B4J you can also use other non-vector databases). The library contains all four ways of calculating embedding distance, because different AI models use different embedding approaches.
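
The library's internal code is not shown here, but the four distance/similarity measures commonly used with embeddings (cosine similarity, dot product, Euclidean distance, Manhattan distance) can be sketched in plain Python to illustrate the idea. The function names below are my own, not the library's API:

```python
import math

def cosine_similarity(a, b):
    # Higher = more similar; 1.0 for vectors pointing the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def dot_product(a, b):
    # Higher = more similar; equals cosine similarity when vectors are normalized.
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    # Lower = more similar (straight-line distance).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    # Lower = more similar (sum of per-dimension differences).
    return sum(abs(x - y) for x, y in zip(a, b))

v1 = [1.0, 0.0, 0.0]
v2 = [0.0, 1.0, 0.0]
print(cosine_similarity(v1, v1))   # 1.0
print(dot_product(v1, v2))         # 0.0
print(manhattan_distance(v1, v2))  # 2.0
```

Note that the first two measures grow when texts are similar, while the last two shrink; this difference is what makes the library's single threshold parameter act as either a minimum or a maximum.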

The object is AIEmbeddings. When you declare the object, you can start typing, for example:

B4X:
Dim aiemb As AIEmbeddings
aiemb.Instructions

and you get a code suggestion that you can copy to get started right away.

You will notice that in the AIQuery method you have to define an acceptable distance (MinimumOrMaximumDistance). For two of the distance calculations this is the minimum acceptable similarity; for the other two it is the maximum acceptable distance. You will also notice that it is extremely fast, as it uses techniques from numerical methods to return the result. The price is €49.99; you can send me a private message if you want to acquire this b4xlib.
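
How the library interprets this parameter internally is not published; the general idea of one threshold serving as a minimum for similarity measures and a maximum for distance measures can be illustrated like this (all names here are my own invention, not the library's API):

```python
def passes_threshold(score, measure, threshold):
    # For similarity measures (cosine, dot product), the threshold is the
    # MINIMUM acceptable similarity: keep results scoring at or above it.
    # For distance measures (Euclidean, Manhattan), the threshold is the
    # MAXIMUM acceptable distance: keep results scoring at or below it.
    if measure in ("cosine", "dot"):
        return score >= threshold
    return score <= threshold

# A cosine similarity of 0.85 passes a minimum-similarity threshold of 0.8:
print(passes_threshold(0.85, "cosine", 0.8))    # True
# A Euclidean distance of 1.2 fails a maximum-distance threshold of 1.0:
print(passes_threshold(1.2, "euclidean", 1.0))  # False
```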

In post #4 there is a video showcasing everything.
 

hatzisn

would you mind posting some examples of what it actually does?

Sure, over the weekend I will prepare a video showcasing everything.
 

hatzisn

Here is a video showcasing the AI Embeddings b4xlib in conjunction with SQLite. The b4xlib uses numerical methods to do the distance calculation between embedding vectors and return the matching results. You have to write the code that calculates the vectors themselves, because they are calculated differently by different companies and models. Please change the video resolution to 1080p and watch it maximized.

 

Magma

hi there...
i am curious ... to create a vector db in sqlite ... is there no need for an AI model to create the embedding numbers for the different/similar texts and save them together with the chunks or parts of texts?

which ai model is it compatible with... or who must we pay for AI use? openai?

2nd, the sqlite db result ... how will we use it? no need for any GGUF? which one do you suggest?
 

hatzisn

hi there...
i am curious ... to create a vector db in sqlite ... is there no need for an AI model to create the embedding numbers for the different/similar texts and save them together with the chunks or parts of texts?

There is no point in using SQLite as a vector database unless you fill it with embedding vectors calculated by dedicated AI models, together with the corresponding chunks of text.
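
To make the idea concrete, here is a rough sketch of storing chunks plus their embedding vectors as JSON text in SQLite and then filtering by similarity. The table name, column names, and functions are all hypothetical illustration, not the b4xlib's actual schema or API:

```python
import json
import math
import sqlite3

# Hypothetical schema: each row holds a text chunk and its embedding
# serialized as a compact JSON list in a TEXT column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding TEXT)")

def add_chunk(text, vector):
    # In real use the vector comes from an embeddings model (e.g. an API call).
    conn.execute("INSERT INTO chunks (text, embedding) VALUES (?, ?)",
                 (text, json.dumps(vector)))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def query(query_vector, min_similarity):
    # Deserialize each stored vector and keep the chunks above the threshold,
    # best match first.
    results = []
    for text, emb in conn.execute("SELECT text, embedding FROM chunks"):
        score = cosine(query_vector, json.loads(emb))
        if score >= min_similarity:
            results.append((score, text))
    return sorted(results, reverse=True)

add_chunk("cats are mammals", [0.9, 0.1, 0.0])
add_chunk("stock prices fell", [0.0, 0.2, 0.9])
print(query([1.0, 0.0, 0.0], 0.5))  # only the cat chunk passes the threshold
```

This brute-force linear scan is only for illustration; the library is described as using faster numerical-methods techniques.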

which ai model is it compatible with... or who must we pay for AI use? openai?

The example was created using OpenAI's model "text-embedding-3-small". There is no specific embeddings model with which it is compatible: it is compatible with every embeddings model, as it covers all ways of calculating embedding distance.
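
For reference, a sketch of the request and response shape of OpenAI's documented embeddings endpoint (POST https://api.openai.com/v1/embeddings), which the example model belongs to. No network call is made here; check the current OpenAI API docs before relying on this, as the shapes below are reproduced from memory:

```python
import json

def build_request_body(text, model="text-embedding-3-small"):
    # The endpoint accepts a JSON body with "model" and "input" fields.
    return json.dumps({"model": model, "input": text})

def extract_vector(response_json):
    # The endpoint returns the vector under data[0].embedding.
    return json.loads(response_json)["data"][0]["embedding"]

body = build_request_body("the quick brown fox")
print(body)

# A trimmed-down example of the response structure:
sample_response = '{"data": [{"embedding": [0.01, -0.02, 0.03]}]}'
print(extract_vector(sample_response))  # [0.01, -0.02, 0.03]
```

The returned list of floats is what would be serialized to text and stored alongside its chunk in SQLite.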

2nd the sqlite db result ... how will use it ? no need any GGUF ? which suggesting?

That is consulting; I suppose it is reserved for the clients that will acquire the b4xlib. I politely ask you to forgive me, but I cannot answer this.
 

Magma

So first...

1. We must feed the vector/sqlite db with at least one question/answer pair to calculate the similarities... and for this part we need to pay a model to do that job... any...
2. I think we need at least a local model like ollama... or an API subscription to one, because it can't work without one... (I mean for production/results, to give answers to our app users)

am I understanding this correctly?
 

Magma

It would be nice - in the video tutorial... to fill the AI with some texts, 4-5, and also 2-3 questions...

then take your computer offline (capture that moment), run your local model (without saying which one you are using)... and then ask similar questions to see what answers you get...

That would be super!

* Also show this in realtime - to see the speed.
* Also write us the minimum PC/computer configuration needed.

In any case you've done great work! 🇬🇷💕
 

hatzisn

It would be nice - in the video tutorial... to fill the AI with some texts, 4-5, and also 2-3 questions...

then take your computer offline (capture that moment), run your local model (without saying which one you are using)... and then ask similar questions to see what answers you get...

That would be super!

* Also show this in realtime - to see the speed.
* Also write us the minimum PC/computer configuration needed.

In any case you've done great work! 🇬🇷💕

Just a hint: you missed something. Different embeddings models calculate embeddings differently. You cannot create the SQLite embedding vectors with one model and then try to do RAG with another, offline model; you have to create the SQLite vectors and do RAG with the same model. As for minimum requirements, if you do not use an embeddings model that runs in Ollama and use OpenAI instead, the processing requirements are minimal. As for cost, the OpenAI embeddings model used in the example costs $0.02 per million tokens (which is next to nothing). The main target, though (but not limited to), is iOS and Android apps, where you cannot use Ollama - at least not directly.
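
To put the quoted $0.02-per-million-tokens price in perspective, a quick back-of-the-envelope calculation (the chunk count and chunk size below are made-up illustration values, not figures from the thread):

```python
# Rough cost arithmetic for embedding a hypothetical knowledge base.
price_per_million_tokens = 0.02  # USD, per the price quoted above

chunks = 10_000          # hypothetical number of text chunks
tokens_per_chunk = 200   # hypothetical average chunk length in tokens
total_tokens = chunks * tokens_per_chunk  # 2,000,000 tokens

cost = total_tokens / 1_000_000 * price_per_million_tokens
print(f"${cost:.2f}")  # $0.04
```

Even a fairly large knowledge base embeds for a few cents; the recurring cost is embedding each user query at the same rate.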
 