B4A Library [B4X] [Chargeable] AI Embeddings - Turn SQLite and EVERY DB into a vector database

We are all familiar with AI embeddings. They are the basis of RAG (Retrieval-Augmented Generation). This library converts SQLite, and in fact any database, into a vector database. The embeddings of the chunked text must be saved as text (the JSON list's ToCompactString) in SQLite or any other non-vector DB. This means you can do RAG directly in your iOS, Android or Windows/Linux/Mac app without accessing external vector databases (in B4J you can also use other non-vector databases). It includes all four ways of calculating embedding distance, because different AI models use different embedding approaches.
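For reference, the four common ways of measuring how close two embedding vectors are (cosine similarity, dot product, Euclidean distance and Manhattan distance) can be sketched as follows. This is an illustrative Python sketch of the underlying math only, not the b4xlib's actual code; the function names are mine:

```python
import math

def cosine_similarity(a, b):
    # Higher means more similar (range -1..1 for arbitrary vectors).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def dot_product(a, b):
    # Higher means more similar; equals cosine similarity when
    # the vectors are already normalized to unit length.
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    # Lower means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    # Lower means more similar.
    return sum(abs(x - y) for x, y in zip(a, b))
```

Note the direction difference: for the first two measures a higher score means more similar, while for the last two a lower score means more similar.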

The object is AIEmbeddings. After you declare it, start typing, for example:

B4X:
Dim aiemb As AIEmbeddings
aiemb.Instructions

and you get a code suggestion that you can copy to start directly.

You will notice that the AIQuery method requires an acceptable distance (MinimumOrMaximumDistance). For two of the distance calculations this is the minimum acceptable similarity, while for the other two it is the maximum acceptable distance. You will also notice that it is extremely fast, as it uses arithmetical methods to return the result. The price is 49.99 EUR; you can send me a private message if you want to acquire this b4xlib.
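The minimum-or-maximum idea can be illustrated like this (a hedged Python sketch of the concept, not the library's internals): for similarity-style measures the threshold acts as a minimum acceptable similarity, and for distance-style measures it acts as a maximum acceptable distance.

```python
def passes_threshold(score, measure, threshold):
    # For similarity measures (cosine similarity, dot product) a HIGHER
    # score is better, so the threshold is a minimum acceptable similarity.
    if measure in ("cosine", "dot"):
        return score >= threshold
    # For distance measures (Euclidean, Manhattan) a LOWER score is
    # better, so the threshold is a maximum acceptable distance.
    return score <= threshold
```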

In post #4 there is a video showcasing everything.
 

hatzisn

Expert
Licensed User
Longtime User
Would you mind posting some examples of what it actually does?

Sure, over the weekend I will prepare a video showcasing everything.
 

hatzisn

Here is a video showcasing the AI Embeddings b4xlib in conjunction with SQLite. The b4xlib uses arithmetical methods to calculate the distance between embedding vectors and returns the results. You have to write the code that calculates the vectors themselves, because they are calculated differently by different companies and models. Please change the video resolution to 1080p and watch it maximized.

 

Magma

Hi there...
I am curious... to create a vector DB in SQLite... don't we need an AI model to create the different/similar text-number vectors and save them as chunks or parts of texts?

Which AI model is it compatible with... or whom must we pay for the AI use? OpenAI?

Second, the SQLite DB result... how will we use it? Don't we need a GGUF? Which one do you suggest?
 

hatzisn

Hi there...
I am curious... to create a vector DB in SQLite... don't we need an AI model to create the different/similar text-number vectors and save them as chunks or parts of texts?

There is no use in SQLite as a vector database if you do not fill it with embedding vectors calculated by specialized AI models, together with the corresponding chunks of text.

Which AI model is it compatible with... or whom must we pay for the AI use? OpenAI?

The example is created using OpenAI's model "text-embedding-3-small". There is no specific embeddings model it is tied to; it is compatible with every embeddings model, as it covers all the ways of calculating embedding distance.
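To illustrate the storage idea from the first post, saving each embedding vector as a compact JSON string in an ordinary TEXT column, here is a minimal Python sketch (the library itself is B4X; the table and column names here are hypothetical, and the vector shown is a stand-in for what an embeddings API would return):

```python
import json
import sqlite3

# An in-memory database; a real app would use a file path instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding TEXT)")

def store_chunk(conn, chunk_text, vector):
    # Serialize the embedding vector to a compact JSON string so it
    # can live in an ordinary TEXT column of any non-vector DB.
    conn.execute("INSERT INTO chunks (text, embedding) VALUES (?, ?)",
                 (chunk_text, json.dumps(vector)))

def load_chunks(conn):
    # Deserialize the JSON text back into lists of floats.
    rows = conn.execute("SELECT text, embedding FROM chunks").fetchall()
    return [(text, json.loads(emb)) for text, emb in rows]

store_chunk(conn, "B4X is a suite of RAD tools.", [0.12, -0.03, 0.88])
rows = load_chunks(conn)
```

The round trip through `json.dumps`/`json.loads` is lossless for lists of floats, so the stored text reproduces the original vector exactly.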

Second, the SQLite DB result... how will we use it? Don't we need a GGUF? Which one do you suggest?

That is consulting, I suppose, and it is reserved for the clients who acquire the b4xlib. I politely ask you to forgive me, but I cannot answer this.
 

Magma

So first...

1. We must feed the vector/SQLite DB with questions (at least one per answer) to calculate the similarities... and for this part we need to pay a model to do that job...
2. I think we also need at least a local model like Ollama... or an API subscription to one, because it cannot work without one (I mean for production, to give answers to our app users).

Am I understanding it well?
 

Magma

It would be nice, in the video tutorial, to fill the AI with some texts (4-5) and also 2-3 questions...

Then take your computer offline (capture that moment), run your local model (without saying which one you are using)... and then ask similar questions to see what answers you get...

That would be super!

* Also show this in real time, to see the speed.
* Also write for us the minimum PC configuration needed.

In any case, you have done great work! 🇬🇷💕
 

hatzisn


Just a hint: you missed something. Different embeddings models calculate embeddings differently. You cannot create the SQLite embedding vectors with one model and then try to do RAG with a different offline model; you have to create the SQLite vectors and do RAG with the same model. As for minimum requirements: if you do not use an embeddings model that runs in Ollama but use OpenAI instead, the processing requirements are minimal. As for cost, the OpenAI embeddings model used in the example costs $0.02 per million tokens (which is nothing). The main target, though (but not limited to it), is iOS and Android apps, where you cannot use Ollama, at least not directly.
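The constraint above, index and query with the same embeddings model, together with the retrieval step itself, can be sketched like this in illustrative Python (cosine similarity chosen as the example measure; `indexed_chunks` stands for rows read back from SQLite, whose vectors must come from the same model as the query vector):

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(query_vector, indexed_chunks, k=3):
    # indexed_chunks: list of (chunk_text, vector) pairs whose vectors
    # were produced by the SAME embeddings model as query_vector.
    scored = [(cosine_similarity(query_vector, vec), text)
              for text, vec in indexed_chunks]
    scored.sort(reverse=True)
    # The top-k chunks would then be pasted into the LLM prompt as context.
    return [text for _, text in scored[:k]]
```

In a full RAG flow, the returned chunks are appended to the user's question and sent to the LLM, which generates the final answer.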
 

Mashiane

Hey

How can this be used for building a RAG which might lead to an LLM model in B4X? For example, there are some B4X sources on GitHub with material Erel built to teach AI models.
One way is that this content could be fed into a RAG system; it would be awesome if that could be automated, though.

I was reading about LlamaIndex a bit, just to understand how this whole process works.

Any thoughts?
 

hatzisn

RAG is based on embeddings models and LLM models. You get the relevance with the first (given that you have calculated the embeddings for the text chunks) and you use the second to generate the response.

An embedding is a representation of meaning in an N-dimensional space. I cannot be sure, as I lack complete knowledge of this, but if you feed code into the text chunks following the previous procedure, it will divide it into tokens and try to extrapolate "meaning" from it. Since code is not spoken English, I think there will be some glitches in the derived "meaning".

If you then try to connect the code "meaning" with spoken English, I imagine the relevance would be really low. Again, this answer is derived intuitively and is not based on solid knowledge.

Although it gave me a headache, watch the following video, as it will answer a lot of your questions.

 

Magma


Good morning!

There are LLM models that get the meaning across different languages... even if you speak/type something in Greek, it will also check in English, Bulgarian, etc... the maths are incredible... If the BLOBs have many bytes, it is always better than smaller ones...
 

Mashiane

How can this be used for building a RAG which might lead to an LLM model in B4X?
Hi...

I think I have an idea of what I was thinking about. In ChatGPT, when searching for B4X, one is able to see these GPTs which, in whatever way it was done, include RAG building. I then ended up asking ChatGPT how these had been created, and a step-by-step process was provided. So I actually don't have to reinvent the wheel here.

It would just be awesome if these members would put their heads together and create a "master" version. I'm curious as to how each one has been built; I am not even sure if there is some kind of teamwork when it comes to these things. It's very interesting anyway.

 

Magma

Those have nothing to do with RAG... they are just Agents filled with PDFs and technical documentation...

Vectors and embeddings come from specialized LLMs, like Google's Embedding 001 for example, that need CPU/GPU power to extract the "meaning" of sentences/words... of course they can also run locally, some with special hardware and some with just a good CPU (like an i5 6th gen or newer)...

Of course, if you want to use "vector databases", you need to check whether the DB supports vectors and has the right extensions... you can find a lot of information for all the well-known databases!

So one thing is the DB's support for vectors, and another thing is how to embed/create the vectors to fill the DB with them...
 

Mashiane

Those have nothing to do with RAG... they are just Agents filled with PDFs and technical documentation...
Well, I stand to be corrected: Step 3 in this chat indicates that it creates some form of RAG system when creating those GPTs. Anyway, at least I have an idea of what I need; here is the ChatGPT link...

Step 3 — Add knowledge (optional but powerful)

You can upload:

Documentation PDFs
Code examples
API specs
Your own project files

For your use case (B4X), this is where it becomes powerful:

Upload B4X docs
Your own reusable components
Example apps

This effectively creates a RAG (retrieval-augmented generation) system without coding it manually.
 

Magma

Well, I stand to be corrected; let me quote what ChatGPT gave me when I asked it how to create these GPTs..

Step 3 — Add knowledge (optional but powerful)

You can upload:

Documentation PDFs
Code examples
API specs
Your own project files

For your use case (B4X), this is where it becomes powerful:

Upload B4X docs
Your own reusable components
Example apps

This effectively creates a RAG (retrieval-augmented generation) system without coding it manually.

Anyway, at least I have an idea of what I need; here is the ChatGPT link... https://chatgpt.com/share/69dfb4d9-bda8-83ea-b54a-06c29b5457ac
...ohh, I see where you are going with it now... yes, it creates a RAG, but ChatGPT creates it automatically, with its own logic...
 

Magma

Embeddings, and the meaning of those vectors, change from time to time... because new embedding-creation routines are developed: better math, different BLOB sizes, better meanings... so better answers. So if someone created something before, recreating it now from scratch with the help of the new ChatGPT, Gemini or another tool would produce something better... but all these embeddings are so different... :-(
 