B4J Question [PyBridge] TTS with Pocket (Slow to play)

zed

Well-Known Member
Licensed User
The idea is to use streaming. The goal is for the sound to start playing after 200–400 ms, instead of waiting for the entire sentence to be generated.

Pocket-TTS offers a function called "tts_model.generate_audio_stream(voice_state, text)".

It returns small audio chunks as they are generated.
So you can receive a chunk, play it immediately, and continue receiving the next ones.

Script: python_streaming.py:
from pocket_tts import TTSModel
import numpy as np

# Load the model and voice only once
tts_model = TTSModel.load_model()
voice_state = tts_model.get_state_for_audio_prompt("alba")

def stream_tts(text):
    # Returns a list of audio chunks (numpy arrays)
    chunks = []
    for chunk in tts_model.generate_audio_stream(voice_state, text):
        chunks.append(chunk)
    return chunks

With this, we can play each chunk as soon as it arrives, which greatly reduces latency.

B4X:
Private Sub Button1_Click
    StreamSpeak(TextArea1.Text)
End Sub

Private Sub StreamSpeak(Text As String)
    Dim PyStream As PyWrapper = Py.ImportModule("python_streaming")
    Dim Result As PyWrapper = PyStream.Run("stream_tts").Arg(Text)

    ' Résultat = liste de chunks numpy
    Dim chunks As List = Result.ToList

    For Each chunk As PyWrapper In chunks
        PlayChunk(chunk)
    Next
End Sub

Private Sub PlayChunk(chunk As PyWrapper)
    ' Convertir chunk numpy → bytes WAV
    Dim buffer As PyWrapper = IO.Run("BytesIO")
    ScipyWavfile.Run("write").Arg(buffer).Arg(TTSModel.GetField("sample_rate")).Arg(chunk)
    Winsound.Run("PlaySound").Arg(buffer.Run("getvalue")).Arg(Winsound.GetField("SND_MEMORY"))
End Sub

With this code, the first chunk arrives in 200–400 ms. The sound starts immediately, and the phrase continues to generate during playback.
The perceived latency is very low.

Untested code.

 
Last edited:
Upvote 0

LGS

Member
Licensed User
Longtime User
Hi Zed, thanks for your support.

I initially tried testing your B4X code in the example, and on line
Warning #22:
    ' Résultat = liste de chunks numpy
    Dim chunks As List = Result.ToList
I get a "Type mismatch" warning.

I'm not really familiar enough with Python code to implement it in the example.

Thanks again for your help.
 
Upvote 0

zed

Well-Known Member
Licensed User
This is just an example. You need to adapt it to your code.
Posting your project would help us understand what isn't working correctly.
 
Upvote 0
Top