Biswajit

Active Member
Licensed User
Longtime User
Can I feed a YouTube URL as the input file instead of a WAV file?
No. The SDK only supports WAV files. You can download the YouTube audio, convert it to WAV format, and pass in that file.

to make off-line speech-to-text.
This is an offline speech-to-text SDK; you do not need an internet connection.
 

JohnC

Expert
Licensed User
Longtime User
Just tried the sound+record feature of the demo - works great!

Sending you a well-earned donation!
 

JohnC

Expert
Licensed User
Longtime User
I increased the AudioStreamer sample rate from 16000 to 22050, but when I play back the audio it sounds like Mickey Mouse (it plays too fast):

B4X:
    'player.Initialize("player",16000,True,16,player.VOLUME_MUSIC)
    player.Initialize("player",22050,True,16,player.VOLUME_MUSIC)

Is the STT_MicrophoneBuffer event hard-coded to only work with a 16000 sample rate?
 

Biswajit

Active Member
Licensed User
Longtime User
Yes. It's hard-coded. I will add an option to change that.
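
For anyone wondering why the playback sounds sped up: the microphone buffers are captured at 16000 samples per second, so a player initialized at 22050 consumes them faster than they were recorded. A rough sketch of the arithmetic in Python (the function name is made up for illustration; it is not part of the library):

```python
import math


def playback_distortion(recorded_rate_hz, playback_rate_hz):
    """Return (speed factor, pitch shift in semitones) when audio captured at
    recorded_rate_hz is played back as if it were playback_rate_hz."""
    speed = playback_rate_hz / recorded_rate_hz
    semitones = 12 * math.log2(speed)
    return speed, semitones


# Buffers captured at 16000 Hz, player initialized at 22050 Hz (the case above).
speed, semitones = playback_distortion(16000, 22050)
print(f"plays {speed:.3f}x too fast, about {semitones:.1f} semitones higher")
```

For the two rates in this post the speed factor comes out at roughly 1.38, which matches the "Mickey Mouse" effect. Matching the player's rate to the capture rate brings the factor back to 1.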
 

JohnC

Expert
Licensed User
Longtime User
Yes. It's hard-coded. I will add an option to change that.
Would it be possible for the new change to also allow "reading" WAV files that were not encoded at a 16000 sample rate?
 

Biswajit

Active Member
Licensed User
Longtime User
Would it be possible for the new change to also allow "reading" WAV files that were not encoded at a 16000 sample rate?
Yes, it will work for both types of recognition (live speech and WAV file).
 

Biswajit

Active Member
Licensed User
Longtime User
Is the STT_MicrophoneBuffer event hard-coded to only work with a 16000 sample rate?
Would it be possible for the new change to also allow "reading" WAV files that were not encoded at a 16000 sample rate?

Please check the new update. After initializing the library, you can change the sampleRate value at any time before starting the recognition.
 

JohnC

Expert
Licensed User
Longtime User
Please check the new update. After initializing the library, you can change the sampleRate value at any time before starting the recognition.
The SampleRate setting works great!
 

JohnC

Expert
Licensed User
Longtime User
I tried it on some audio files where the speaker stops talking and then continues
Just make sure that the sample rate you set for this library matches the sample rate of the WAV audio file, and make sure the audio file is a true WAV file, not another format that was simply renamed with a WAV file extension.
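
If you want to verify both points on a desktop before handing the file to the library, here is a minimal sketch using Python's standard `wave` module. The helper name `check_wav` is hypothetical and has nothing to do with the library itself. Note that `wave` only reads PCM WAV data, so other encodings inside a WAV container are reported as invalid too.

```python
import wave


def check_wav(path, expected_rate_hz):
    """Return a list of problems found; an empty list means the file looks usable."""
    try:
        with wave.open(path, "rb") as wav:
            actual_rate = wav.getframerate()
    except (wave.Error, EOFError) as exc:
        # wave.open rejects files without a RIFF/WAVE header, for example
        # an MP3 that was only renamed to .wav, and files that are too short.
        return [f"not a valid PCM WAV file: {exc}"]
    if actual_rate != expected_rate_hz:
        return [f"sample rate is {actual_rate} Hz, expected {expected_rate_hz} Hz"]
    return []
```

Calling `check_wav("speech.wav", 16000)` (the file name is a placeholder) returns an empty list when the file is a real WAV at the expected rate. Otherwise the returned message tells you whether to re-encode the file or change the library's sample rate setting.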
 

Khalid.

Member
Just make sure that the sample rate
Yes, the conversion works. My question is about pauses: the speaker may pause in the audio and then continue speaking. How can a pause be represented by inserting a space (proportional to the time elapsed) between the spoken segments?
 

Biswajit

Active Member
Licensed User
Longtime User
I tried it on some audio files where the speaker stops talking and then continues
a space (proportional to the time elapsed)
I don't think there is any such thing as a SPACE in voice recognition. When you pause or stop talking, the recognizer waits for the next word instead of adding a blank space. I believe this holds for all types of voice recognition. Try Google Assistant, Siri, Alexa, or any other voice-to-text app; in each case it will just wait for your voice when you temporarily stop talking.

The library supports multiple languages, but only one at a time.
 

drgottjr

Expert
Licensed User
Longtime User
a very tiny issue re an otherwise good job:
shutdown releases the audio recorder. if the user exits the app without having tapped the start button (maybe they changed their mind), a null object exception is thrown. i would suggest an "if (recorder != null) release recorder" in shutdown, or recommend that users add a try/catch in activity_pause.
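
a rough sketch of the guard, written in Python only to show the pattern. the class and method names are hypothetical stand-ins; the library's real internals are not shown in this thread.

```python
class Recognizer:
    """Hypothetical stand-in for the library's recognizer wrapper."""

    def __init__(self):
        # The recorder is only created once recognition is started.
        self.recorder = None

    def start(self, recorder):
        self.recorder = recorder

    def shutdown(self):
        # Guard: release the recorder only if start() ever created one.
        if self.recorder is not None:
            self.recorder.release()
            self.recorder = None
```

with this check, calling shutdown before start is a harmless no-op, and calling it twice releases the recorder only once.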
 

Biswajit

Active Member
Licensed User
Longtime User
Ok I will check.
 