Hi,
As I understand it, you want to recognize streaming voice, right?
Google has an undocumented limit of 60 seconds on how much speech it will recognize at one time, so
you need to design around that limit, but there should be a way.
One way, though I haven't tested it, is to process the audio twice. On the first pass, look for long silences (the gaps
between utterances) lasting longer than a second, and log their duration and position in the timeline. On the second pass,
stop the recognition at the beginning of each silence, append the recognized text, and start the recognizer again.
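
To make the first pass more concrete, here is a rough sketch in Python. I haven't run it against real recordings. It assumes the audio is already decoded into a list of signed 16-bit samples, and the names find_silences and speech_segments, the amplitude threshold of 500 and the 20 ms frame size are all my own guesses, not anything from Google's API.

```python
def find_silences(samples, sample_rate, threshold=500, min_silence=1.0, frame_ms=20):
    """Return (start_seconds, duration_seconds) for each quiet stretch
    that lasts at least min_silence seconds."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    silences = []
    run_start = None  # sample index where the current quiet run began

    def close_run(end):
        duration = (end - run_start) / sample_rate
        if duration >= min_silence:
            silences.append((run_start / sample_rate, duration))

    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        # A frame counts as quiet when its peak amplitude is below the threshold.
        quiet = max(abs(s) for s in frame) < threshold
        if quiet:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            close_run(pos)
            run_start = None
    if run_start is not None:
        close_run(len(samples))  # the recording ended during a quiet run
    return silences


def speech_segments(silences, total_seconds):
    """Turn the logged silences into (start, end) ranges to recognize one by one."""
    segments, cursor = [], 0.0
    for start, duration in silences:
        if start > cursor:
            segments.append((cursor, start))
        cursor = start + duration
    if cursor < total_seconds:
        segments.append((cursor, total_seconds))
    return segments
```

On the second pass you would feed each (start, end) range to the recognizer separately and append the text. You would still need to check that no single range is longer than the 60 second limit, and the threshold will probably need tuning for noisy recordings.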
Another solution, if you have an audio editor, is to cut the recording into chunks of less than a minute each and
save them as separate files.
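
If you would rather script the cutting than do it by hand, something like the following might work. It is only a sketch that assumes the recording is an uncompressed WAV file and uses Python's standard wave module. The function name split_wav and the 55 second default are my own choices, picked to stay a little under the limit.

```python
import wave


def split_wav(path, out_prefix, max_seconds=55):
    """Split a WAV file into pieces of at most max_seconds each
    and return the list of file names that were written."""
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * max_seconds)
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the recording
            out_path = f"{out_prefix}_{len(out_paths):03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channels, sample width and rate; the frame count
                # in the header is corrected when the file is closed.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
    return out_paths
```

Calling split_wav("speech.wav", "speech_part") would write speech_part_000.wav, speech_part_001.wav and so on. Note that cutting at fixed intervals can split a word in half, so cutting at a silence is likely to give better text.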
Good luck