Biswajit

Active Member
Licensed User
Longtime User
So, an update. I have a class that extracts the model. The async methods don't work correctly, but the non-async methods do. I believe I have solved the issue.
Ok. So that was the issue with the archiver library.
 

DonManfred

Expert
Licensed User
Longtime User
Why not just use the regular Archiver library without the resumable sub stuff? With Archiver the program will simply not continue until the unzipping is done, i.e. exactly what is needed in this speech recognition project. Or am I wrong?
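A minimal sketch of what that blocking call could look like. The parameter order of UnZip (zip folder, zip file name, destination folder, event name) is an assumption from memory and should be checked against the Archiver library documentation; the sub name and the "arc" event name are made up for illustration.

```b4x
'Sketch only: blocking unzip with the Archiver library.
'Assumption: UnZip takes (ZipDir, ZipFilename, DestDir, EventName).
Sub UnzipModelBlocking
    Dim arc As Archiver
    'The call is expected to return only after extraction has finished,
    'so no Wait For / resumable sub is involved.
    arc.UnZip(File.DirInternal, "model.zip", File.DirInternal, "arc")
    Log("Unzip finished")
End Sub
```

The trade-off is that a blocking unzip of a large model freezes the calling thread for the duration of the extraction.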
 

drgottjr

Expert
Licensed User
Longtime User
there is an updated library? i donated to you. don't the rest of us get to see it?
 

drgottjr

Expert
Licensed User
Longtime User
i don't know what you see when you look at post #102 (or any around it), but there is nothing there but a message to you.
 

Attachments

  • 102.png
    45.4 KB · Views: 128

Biswajit

Active Member
Licensed User
Longtime User
i don't know what you see when you look at post #102 (or any around it), but there is nothing there but a message to you.
Compare the date of post #102 with the last-edited date of post #1. Posting an update in the first post is the correct way; otherwise new users have to hunt for the update all over the thread. You can subscribe to the B4A library updates thread so that you get a notification when someone posts an update.
 
There are three more issues with this STT I'd like to report:
1. If I don't use the speech recognizer, but leave it on in a noisy room (e.g. with my TV on) for about half an hour, the STT does not respond immediately when I start to use it again. It takes dozens of spoken words before it catches up and works properly and at full speed again. The noise in the room seems to keep filling a buffer, which slows down the STT's responsiveness. Any idea how this can be fixed?
2. The largest and most elaborate US English model, vosk-model-en-us-0.22 (1.8 GB), can be neither installed nor unzipped. I guess it is too big (?)
3. All models take up three times their size in storage, probably because they reside in File.DirAssets, are copied to File.DirInternal and are then unzipped.
The only thing I can do is delete the zipped model from File.DirInternal after it has been successfully unzipped and installed.
I think the best way to minimize storage usage is to unzip the model beforehand on a computer, upload the entire folder structure to a website and then download all the files and folders directly to File.DirInternal, thus avoiding the unzip routine inside the app.
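The clean-up step could be sketched roughly as follows. It only uses the core File methods (Exists, IsDirectory, Delete); the sub name and its two parameters are hypothetical, and it assumes the model was unzipped into a folder directly under File.DirInternal.

```b4x
'Sketch only: delete the copied zip once the unzipped model folder exists.
'ZipName and FolderName are illustrative parameters, e.g. "model.zip" and "model".
Sub DeleteModelZip(ZipName As String, FolderName As String)
    'Only delete the archive if the extracted folder is really there.
    If File.Exists(File.DirInternal, FolderName) = False Then Return
    If File.IsDirectory(File.DirInternal, FolderName) = False Then Return
    If File.Exists(File.DirInternal, ZipName) Then
        File.Delete(File.DirInternal, ZipName)
        Log("Deleted " & ZipName)
    End If
End Sub
```

This frees the copy in File.DirInternal only; the zip bundled in File.DirAssets stays part of the installed APK.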
 

Biswajit

Active Member
Licensed User
Longtime User
1. If I don't use the speech recognizer, but leave it on in a noisy room
I don't think this library is meant to be left running in a noisy room. It is for converting voice to text continuously. If you use it to listen for a wake-up word like "Hey Google", it might show unexpected behaviour.


I think the best way to minimize storage usage is to unzip the model beforehand on a computer, upload the entire folder structure to a website
It's up to the developer how to optimise the app to use minimal storage.
 
You don't seem to know what causes this problem?
I am indeed using it for an elaborate personal assistant which listens all the time for questions and commands. I might also add a hotword to make it sleep and wake up again. For that purpose I have put the STT in a service, so that my assistant responds regardless of which app the user is using. It works great, apart from the problem I mentioned. I will probably write a function that destroys the service roughly every 10 minutes (once the user has not spoken for about a minute, assuming that he/she won't need it for the next couple of seconds) and then automatically restarts it in order to clear caches etc. What do you think?
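A rough sketch of that restart idea, meant to live in the service module. The names idleTimer, startedAt, lastSpeechAt and NoteSpeech are made up for illustration, and NoteSpeech would have to be called from whatever result event the STT library raises. Whether destroying the service actually clears the recognizer's buffers is an assumption that would need testing, and newer Android versions may restrict restarting a service in the background.

```b4x
'Sketch only: restart the service after ~10 minutes of uptime
'once nothing has been recognised for ~1 minute.
Sub Process_Globals
    Private idleTimer As Timer
    Private startedAt As Long
    Private lastSpeechAt As Long
End Sub

Sub Service_Create
    startedAt = DateTime.Now
    lastSpeechAt = DateTime.Now
    idleTimer.Initialize("idleTimer", 10000) 'check every 10 seconds
    idleTimer.Enabled = True
End Sub

'Call this whenever the recogniser reports speech.
Sub NoteSpeech
    lastSpeechAt = DateTime.Now
End Sub

Sub idleTimer_Tick
    Dim uptime As Long = DateTime.Now - startedAt
    Dim silence As Long = DateTime.Now - lastSpeechAt
    If uptime > 10 * DateTime.TicksPerMinute And silence > DateTime.TicksPerMinute Then
        idleTimer.Enabled = False
        'Schedule a fresh start two seconds from now, then stop this instance.
        StartServiceAt(Me, DateTime.Now + 2000, True)
        StopService(Me)
    End If
End Sub
```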
 

Biswajit

Active Member
Licensed User
Longtime User
That's what I mentioned in my previous comment. This library is not optimized for voice assistants. If you keep it running, it will consume your battery and system resources. It also doesn't support any sleep or wake-up-on-hotword functionality. You can add this functionality to your own app so that it processes something when a hotword is detected while the app is running.
 
Under Process_Globals you should define:
Public model_folder_name As String = "model"
Private model_zip_name As String = "model.zip"
and make sure that you have renamed the chosen language model to "model.zip" (in lowercase)

PS: It might take some time before you read my reply: for some strange reason all my replies are "under moderation, awaiting approval". No idea why that is.
Is that how new accounts are usually treated on this forum?
 

Paolo Pini

Member
Licensed User
Longtime User
Hi,
this is a great library.

I have developed an app that recognises speech and responds to certain messages.
I am trying to change the speech recognition model at runtime in order to change the language to recognize. The change succeeds, but the STT engine does not restart until I restart the app.
After renaming and reloading the speech models at runtime, I tried the following commands to restart the engine after the model change:
B4X:
'test combination of:

        'STT.shutdown
        'STT.stop
        'STT.Initialize("STT", File.DirInternal & "/" & model_folder_name)
        'STT.startListening(-1)

'but this sub is only called if I restart the application:

Sub STT_ReadyToListen
    Log("READY")
    STT.stop
    If STT.startListening(-1) Then
        Log("STT ready...")
    Else
        Log("Start failed...")
        MsgboxAsync("Start failed","")
    End If
End Sub

How can I restart the STT engine at runtime after reloading the voice model?

Thanks in advance

Paolo
 