B4J Question Speech to text + use API of Google

Magma

Expert
Licensed User
Longtime User
Hi there...

I'd like to know about the Google Speech-to-Text API (recognize voice and write text)...
Actually this is not only about B4J... it is a more general question...

1) Does it have a cost? https://cloud.google.com/speech-to-text/pricing
2) I want to know how to use it asynchronously: record a WAV file, upload it, and get the text back...

Is there any web service / API example that will return the text I want?

Thanks again...
 

JordiCP

Expert
Licensed User
Longtime User
Use this class. I experimented with it some time ago and it works really well.
There is a SYNC and an ASYNC method at the bottom. The ASYNC method raises an event named after the prefix that you pass to the function call, plus a "_finished" suffix.
Both variants return a map with the following keys: Success (bool), Transcription (string), Confidence (float), and ErrorCause (string).

(You will need your own API key.)

--(edited)--
CL_SR Class:
Sub Class_Globals
   
    Private mBaseLink As String = "https://speech.googleapis.com/v1/speech:recognize?key="
    Private Const GOOGLE_SPEECH_API_KEY As String = "xxxxxx..."  '<- Your key here!
End Sub


' Google Cloud Speech API Reference: https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize
' Sample interesting read: https://www.raviyp.com/embedded/229-google-speech-recognition-http-api-tutorial-with-live-demo

'Initializes the object. You can add parameters to this method if needed.
Public Sub Initialize
   
End Sub


Private Sub getDataBytesFromAudio( dir As String, filename As String) As String
    Dim su As StringUtils
    Dim s As String = su.EncodeBase64(File.ReadBytes(dir,filename))
    Return s  
End Sub


Private Sub recognizeAudio(cbk As Object, event As String, dir As String, fileName As String) As ResumableSub
   
    ' Create default response Map
    Dim resultMap As Map = CreateMap("Success":False, "Transcription":"", "Confidence": 0.0f, "ErrorCause":"Other" )
   
    ' Prepare the data.      
    Dim link As String = mBaseLink & GOOGLE_SPEECH_API_KEY
    Dim data As String = $"{"audio": {"content":"${getDataBytesFromAudio(dir,fileName)}"},"config": {"languageCode": "en-US"}}"$
   
    Log(data)

'    {"audio": {"content":"//8CAP//AAD//wIA/v8C/AgD+/wIA/////wIA/f8EAP3/....dummy.../wMA/f8DAP7/AAAAAAAAAQD//wA"
'    },"config": {"encoding": "LINEAR16","sampleRateHertz": 16000,"languageCode": "en-US"}}
       

    ' There are some limitations:
    '     If WAV or FLAC, we don't need to specify encoding or sample rate (included in the header)
    '     If encoding is LINEAR16 (linear PCM), samples must be 16-bit

    ' Prepare the HTTP job  
    Dim j As HttpJob
    j.Initialize("recognize", Me)
    j.PostString(link, data)
    j.GetRequest.SetHeader("Content-Type", "application/json")    ' Important to add these lines, otherwise it will not work!!
    j.GetRequest.SetContentType("application/json")
   
    Wait For (j) JobDone(j As HttpJob)
    If j.Success Then
       
        Dim res As String = j.GetString
        Log("Received answer: "&res)
       
        Dim JSON As JSONParser
        JSON.Initialize(res)

        If res.StartsWith("{") Then        ' [: list, {:map
            Dim m As Map = JSON.NextObject
            Dim resultList As List = m.Get("results")
            If resultList<>Null And resultList.IsInitialized Then
                Dim m2 As Map =  resultList.Get(0)    ' Supposed to return only 1 result?
                Dim l2 As List = m2.GetDefault("alternatives",Null)
                If l2<>Null And l2.IsInitialized Then
                    Dim m3 As Map = l2.Get(0)
                    Dim transcription As String = m3.GetDefault("transcript","Not found")
                    Dim confidence As Float = m3.GetDefault("confidence", 0)
                       
                    resultMap.Put("Success", True)
                    resultMap.Put("Transcription", transcription)
                    resultMap.Put("Confidence", confidence)
                    resultMap.Put("ErrorCause","")
                   
                End If
                   
            End If
           
        Else
            resultMap.Put("ErrorCause", "Could not parse result: "&res)
        End If
               
    Else
        resultMap.Put("ErrorCause", "HTTP Job did not succeed")
    End If
    j.Release    ' Release the job's resources once the response has been handled
   
    If event.Length>0 Then  
        CallSubDelayed2(cbk, event&"_finished", resultMap)     '<-- ASYNC case
    End If
    Return resultMap   '<-- SYNC case. Won't hurt when ASYNC
   
End Sub


'===================================================================================
' Public interface
'===================================================================================
Public Sub recognizeAudioSYNC( dir As String, filename As String) As ResumableSub
    Wait For (recognizeAudio(Null, "", dir, filename)) Complete (res As Map)
    Return res
End Sub

Public Sub recognizeAudioASYNC(callBack As Object, event As String, dir As String, fileName As String)
    recognizeAudio(callBack, event, dir, fileName)
End Sub
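
The limitations noted in the comments above suggest that headerless raw audio needs the encoding and sample rate spelled out in the request config. A minimal sketch of how that body could be built, assuming the JSON library's JSONGenerator is referenced; `buildRawPcmRequest` is a hypothetical helper and not part of the original class:

```b4x
' Hypothetical helper (not in the original class): builds the JSON request body
' for raw 16-bit PCM audio, where encoding and sample rate are given explicitly.
' The field names are the ones shown in the commented sample request above.
Private Sub buildRawPcmRequest(base64Audio As String, sampleRate As Int, languageCode As String) As String
    Dim audio As Map = CreateMap("content": base64Audio)
    Dim config As Map = CreateMap("encoding": "LINEAR16", "sampleRateHertz": sampleRate, "languageCode": languageCode)
    Dim gen As JSONGenerator
    gen.Initialize(CreateMap("audio": audio, "config": config))
    Return gen.ToString
End Sub
```

Building the body from maps instead of a string literal also avoids quoting mistakes if the language code or other values come from user input.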


Usage example from outside the class.
B4X:
Sub recognize
   Dim mSR As CL_SR   ' Can be a global so that it is reused.
   mSR.Initialize
   ' File.DirInternal is a B4A folder; in B4J use for example File.DirApp.
   mSR.recognizeAudioASYNC(Me, "mSR", File.DirInternal, mySavedAudioFileName)
End Sub

' This event is raised when recognition finishes
Sub mSR_finished(result As Map)
   'Log(result)
End Sub
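
The SYNC variant can be used in a similar way from a resumable sub. This is only a sketch of how the returned map might be read, assuming the keys listed earlier (Success, Transcription, Confidence, ErrorCause); `recognizeAndLog` is a hypothetical name and `mySavedAudioFileName` is the same placeholder as in the example above:

```b4x
' Sketch: call the SYNC variant and read the keys of the result map.
Sub recognizeAndLog
    Dim sr As CL_SR
    sr.Initialize
    ' Wait For pauses this sub until the class returns its result map.
    Wait For (sr.recognizeAudioSYNC(File.DirInternal, mySavedAudioFileName)) Complete (result As Map)
    Dim ok As Boolean = result.Get("Success")
    If ok Then
        Dim confidence As Float = result.Get("Confidence")
        Log("Transcription: " & result.Get("Transcription"))
        Log("Confidence: " & confidence)
    Else
        ' ErrorCause holds the reason set by the class, or "Other" if none was set.
        Log("Recognition failed: " & result.Get("ErrorCause"))
    End If
End Sub
```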
 

hanyelmehy

Active Member
Licensed User
Longtime User
Thank you for this code. Do you have other code for reading streamed audio from the mic?
 