Android Question Human Voice Recognition

khwarizmi

Active Member
Licensed User
Longtime User
Hi all
I want to develop an automated proctoring system for student exams. The system monitors the student's face during the exam to verify that they are the real student. This is a paid service. It also captures sound in the room to detect any human voices.
Can I use B4A to analyze the audio and determine whether it contains a human voice, regardless of what is being said?
I tried speech recognition systems, but they only work when the voice is clear and loud, and a student in an exam might whisper or speak in a low voice.
 

aeric

Expert
Licensed User
Longtime User
Maybe a B4R solution using a PIR motion sensor and a sound sensor.
 

khwarizmi

Thanks for the suggestion!


Actually, I was able to build a working solution using B4A only. I'm not using speech recognition; I just detect the presence of a human voice by analyzing the microphone buffer.


💡 I check for:

  • Average amplitude over a short time window (2 seconds),
  • Variance of the amplitude within that window.

If both are above thresholds for several intervals, I consider it a human voice.
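
The "several intervals" rule can be sketched as a small streak counter. This is only an illustration: the helper name UpdateStreak and the choice of 3 windows are assumptions, and the thresholds 1000 and 500 are simply the ones used in the simplified code in this post.

B4X:
' Hypothetical helper (an assumption, not part of the code in this post).
' Call once per 2-second window; returns the updated number of consecutive
' windows in which both statistics exceeded their thresholds.
Sub UpdateStreak(streak As Int, avg As Float, variance As Float) As Int
    If avg > 1000 And variance > 500 Then
        Return streak + 1
    Else
        Return 0   ' a single quiet window resets the streak
    End If
End Sub

' Possible use inside timer1_Tick, assuming "Private streak As Int" in Process_Globals:
'     streak = UpdateStreak(streak, avg, variance)
'     If streak >= 3 Then Log("Sustained voice-like sound")

Returning the new count instead of updating a global keeps the helper easy to reuse; the number of windows to require would have to be tuned by experiment.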


❗ Limitation: Continuous loud noises like a fan can trigger false detections, since they may have similar patterns.


Here is a simplified version of the logic:

B4X:
Sub Process_Globals
    Private audio As AudioStreamer
    Private timer1 As Timer
    Private bufferList As List
End Sub

Sub Globals
End Sub

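Sub Activity_Create(FirstTime As Boolean)
    ' Note: the RECORD_AUDIO permission must be granted (at runtime on Android 6+) before recording
    ' 16 kHz sample rate, mono, 16-bit samples
    audio.Initialize("audio", 16000, True, 16, audio.VOLUME_VOICE_CALL)
    bufferList.Initialize
    timer1.Initialize("timer1", 2000) ' analyze the collected buffers every 2 seconds
    timer1.Enabled = True
    audio.StartRecording
End Sub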

Sub audio_RecordBuffer(Buffer() As Byte)
    bufferList.Add(Buffer)
End Sub

Sub timer1_Tick
    If bufferList.Size = 0 Then Return

    Dim amplitudes As List
    amplitudes.Initialize

    For Each buf() As Byte In bufferList
        amplitudes.Add(ComputeAmplitude(buf))
    Next

    bufferList.Clear

    Dim avg As Float = Average(amplitudes)
    Dim variance As Float = ComputeVariance(amplitudes, avg)

    Log("Avg: " & avg & "  Var: " & variance)

    If avg > 1000 And variance > 500 Then
        Log("👂 Possible human speech detected!")
        ' You could trigger an alert or record this timestamp
    End If
End Sub

' Returns the mean absolute value of the 16-bit PCM samples in one buffer.
Sub ComputeAmplitude(buffer() As Byte) As Float
    Dim samples As Int = buffer.Length / 2
    If samples = 0 Then Return 0 ' avoid dividing by zero on an empty buffer
    ' Read the samples straight from memory instead of going through a temporary file
    Dim raf As RandomAccessFile
    raf.Initialize3(buffer, True) ' True = little endian
    Dim total As Long = 0
    For i = 0 To samples - 1
        total = total + Abs(raf.ReadShort(i * 2))
    Next
    raf.Close
    Return total / samples
End Sub

Sub Average(l As List) As Float
    Dim sum As Float = 0
    For Each f As Float In l
        sum = sum + f
    Next
    Return sum / l.Size
End Sub

Sub ComputeVariance(l As List, avg As Float) As Float
    Dim sumSq As Float = 0
    For Each f As Float In l
        sumSq = sumSq + Power(f - avg, 2)
    Next
    Return sumSq / l.Size
End Sub

So far, this works reliably in quiet environments like exam rooms. I'm considering adding frequency analysis (FFT) in future iterations to help distinguish fan noise from human voice more effectively.
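
Short of a full FFT, one lightweight way to probe a single frequency is the Goertzel algorithm, which returns the signal power at one chosen frequency. The sketch below is a stand-in technique, not an FFT. The name GoertzelPower and its parameters are hypothetical, and it assumes the buffers hold little-endian 16-bit mono PCM, read here through RandomAccessFile.Initialize3 (which wraps a byte array).

B4X:
' Hypothetical helper (an assumption, not part of the code above).
' Returns the squared magnitude of the component at targetHz in one buffer.
Sub GoertzelPower(buffer() As Byte, sampleRate As Int, targetHz As Double) As Double
    Dim n As Int = buffer.Length / 2
    If n = 0 Then Return 0
    Dim raf As RandomAccessFile
    raf.Initialize3(buffer, True) ' little-endian 16-bit PCM, matching ComputeAmplitude
    Dim coeff As Double = 2 * Cos(2 * cPI * targetHz / sampleRate)
    Dim s1 As Double = 0
    Dim s2 As Double = 0
    For i = 0 To n - 1
        ' Goertzel recurrence: s(i) = x(i) + coeff * s(i-1) - s(i-2)
        Dim s0 As Double = raf.ReadShort(i * 2) + coeff * s1 - s2
        s2 = s1
        s1 = s0
    Next
    raf.Close
    Return s1 * s1 + s2 * s2 - coeff * s1 * s2
End Sub

' Possible use: Log(GoertzelPower(buf, 16000, 200))

Comparing the power at a few probe frequencies from window to window is one possible way to tell a steady hum from speech; which frequencies actually separate the two would have to be found by experiment.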

Let me know if you'd like to collaborate or improve this further! 😊
 

khwarizmi

Thanks, epiCode, for the enlightenment, but it looks like I'll have to make one last attempt using frequency. The fundamental frequency of the human voice typically ranges from 85 to 180 Hz for men and from 165 to 255 Hz for women.
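
A frequency-based check along these lines could be sketched with a simple autocorrelation pitch estimate restricted to the 85 to 255 Hz range. Everything here is an assumption for illustration: the name EstimatePitch, the 0.3 confidence cutoff, and the expectation that the samples were already decoded from the 16-bit buffer into a Float array.

B4X:
' Hypothetical helper: estimates the dominant pitch (Hz) between 85 and 255 Hz.
' Returns 0 when the frame is too short, silent, or shows no clear periodicity.
Sub EstimatePitch(samples() As Float, sampleRate As Int) As Double
    Dim minLag As Int = sampleRate / 255 ' shortest period of interest
    Dim maxLag As Int = sampleRate / 85  ' longest period of interest
    If minLag < 1 Or samples.Length <= maxLag Then Return 0
    Dim energy As Double = 0
    For i = 0 To samples.Length - 1
        energy = energy + samples(i) * samples(i)
    Next
    If energy = 0 Then Return 0
    Dim bestLag As Int = 0
    Dim bestCorr As Double = 0
    For lag = minLag To maxLag
        ' Correlate the frame with a copy of itself shifted by "lag" samples
        Dim corr As Double = 0
        For i = 0 To samples.Length - lag - 1
            corr = corr + samples(i) * samples(i + lag)
        Next
        If corr > bestCorr Then
            bestCorr = corr
            bestLag = lag
        End If
    Next
    ' 0.3 is an illustrative cutoff, not a tuned value
    If bestLag = 0 Or bestCorr / energy < 0.3 Then Return 0
    Return sampleRate / bestLag
End Sub

At 16000 Hz the search covers lags of roughly 62 to 188 samples, so a frame of about 1024 samples is enough. A non-zero result only means the frame is periodic in that range; other periodic sounds can produce one too, so this is a hint rather than proof of speech.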
 