Android Code Snippet B4XOCRPage - Specific OCR capture area (using MLKIT GMS Vision OCR)

Hi there...

If you have already seen Erel's barcode/QR code scanner example using the ML Kit / GMS Vision library, you will appreciate how simple it is. I wanted to do the same with OCR, using a specific rectangular region to crop the image so that text is recognized only from that area.

* B4XOCRPage is based on the OCR TextRecognition library, which uses ML Kit
** It also needs the Camera2 library and CamEx2

The result is saved in a public String variable, "Main.ocrresult".

w and h are the size of the capture rectangle. They are used both to crop the image and to show the capture region on screen.

Here we go:

B4X:
Sub Class_Globals
    Private Root As B4XView 'ignore
    Private xui As XUI 'ignore
 
    Private pnlPreview As B4XView
    Private camex As CamEx2
    Private MyTaskIndex As Int
    Private camtimer As Timer
    Private IntervalBetweenPreviewsMs As Int = 500

    Private recognizer As TextRecognizer
    Private textResultLabel As Label
    Private okreturn As Label
 

    Dim w As Int = 200dip 'pnlPreview.Width / 2
    Dim h As Int = 200dip 'pnlPreview.Height / 3

    Private alreadyworks As Int = 0

End Sub

'You can add more parameters here.
Public Sub Initialize As Object
    Return Me
End Sub

'This event will be called once, before the page becomes visible.
Private Sub B4XPage_Created (Root1 As B4XView)
    Root = Root1
    'load the layout to Root
    Root.LoadLayout("camera_layout") ' Layout with only a Panel named pnlPreview

    B4XPages.SetTitle(Me, "OCR Recognizing")

    ' Create label for OCR text
    textResultLabel.Initialize("")
    okreturn.Initialize("okreturn")

 
    okreturn.TextColor=Colors.White
    okreturn.TextSize=42
    okreturn.Typeface = Typeface.MATERIALICONS
    okreturn.Text=Chr(0xE86C)
    okreturn.Gravity=Gravity.CENTER
 
 
 
    textResultLabel.TextColor = Colors.White
    textResultLabel.TextSize = 22
    textResultLabel.Gravity = Gravity.CENTER
    okreturn.Width=100dip
    okreturn.Height=100dip

 
    Root.AddView(okreturn,50%x-(okreturn.Width/2),100%y-(okreturn.Height*1),100dip,100dip)
 

    Root.AddView(textResultLabel, 0, 20dip, 100%x, 50dip)
 
    alreadyworks=0
    recognizer.Initialize("Latin")
 
    StartCamera
 
End Sub

Private Sub StartCamera
    pnlPreview.Visible = True

    camex.Initialize(pnlPreview)

    Wait For (camex.OpenCamera(False)) Complete (TaskIndex As Int)
    If TaskIndex > 0 Then
        MyTaskIndex = TaskIndex
        For Each size As CameraSize In camex.GetSupportedPreviewSizes
            If size.Height <= 480 Then
                camex.PreviewSize = size
                Exit
            End If
        Next

        Wait For (camex.PrepareSurface(MyTaskIndex)) Complete (Success As Boolean)
        If Success Then
            camex.StartPreview(MyTaskIndex, False)
            AddCropOverlay
            camtimer.Initialize("camtimer", IntervalBetweenPreviewsMs)
            alreadyworks=1
            camtimer.Enabled = True
        Else
            Log("Failed to prepare camera surface.")
        End If
    End If
End Sub

Private Sub AddCropOverlay
 
    For i = pnlPreview.NumberOfViews - 1 To 0 Step -1
        Dim v As View
        v = pnlPreview.GetView(i)
        If v.Tag.As(String).Contains("CropOverlay") Then
            v.RemoveView
        End If
    Next

    'Position of the capture rectangle, centered in the preview panel
    Dim left As Int = (pnlPreview.Width - w) / 2
    Dim top As Int = (pnlPreview.Height - h) / 2
    Dim dimColor As Int = xui.Color_ARGB(180, 0, 0, 0)

    'Four semi-transparent panels darken everything outside the capture rectangle.
    'They are sized from pnlPreview (not 100%x / 100%y) so they also fit a panel that does not fill the screen.
    Dim overlay1 As B4XView = xui.CreatePanel("")
    overlay1.SetColorAndBorder(dimColor, 0dip, 0, 0dip)
    overlay1.Tag = "CropOverlay1"
    pnlPreview.AddView(overlay1, 0, 0, pnlPreview.Width, top) 'above

    Dim overlay2 As B4XView = xui.CreatePanel("")
    overlay2.SetColorAndBorder(dimColor, 0dip, 0, 0dip)
    overlay2.Tag = "CropOverlay2"
    pnlPreview.AddView(overlay2, 0, top, left, h) 'left

    Dim overlay3 As B4XView = xui.CreatePanel("")
    overlay3.SetColorAndBorder(dimColor, 0dip, 0, 0dip)
    overlay3.Tag = "CropOverlay3"
    pnlPreview.AddView(overlay3, left + w, top, pnlPreview.Width - (left + w), h) 'right

    Dim overlay4 As B4XView = xui.CreatePanel("")
    overlay4.SetColorAndBorder(dimColor, 0dip, 0, 0dip)
    overlay4.Tag = "CropOverlay4"
    pnlPreview.AddView(overlay4, 0, top + h, pnlPreview.Width, pnlPreview.Height - (top + h)) 'below

    'The capture rectangle itself: transparent with a red border.
    'CropFromOverlay looks for the exact tag "CropOverlay".
    Dim overlay As B4XView = xui.CreatePanel("")
    overlay.SetColorAndBorder(xui.Color_Transparent, 3dip, xui.Color_Red, 0dip)
    overlay.Tag = "CropOverlay"
    pnlPreview.AddView(overlay, left, top, w, h)
 
    If alreadyworks=1 Then camtimer.Enabled=True

End Sub

Private Sub camtimer_Tick
    'Pause the timer while this frame is processed so that ticks do not overlap
    alreadyworks=0
    camtimer.Enabled=False
    Try
        Dim bmp As Bitmap = camex.GetPreviewBitmap(800, 1280)
        If bmp.IsInitialized Then
            Dim b4xBmp As B4XBitmap = bmp
            Dim cropped As B4XBitmap = CropFromOverlay(b4xBmp)
            If cropped.IsInitialized Then
                Wait For (recognizer.Recognize(cropped)) Complete (Result As TextRecognizerResult)
                'Keep the previous text when nothing is recognized in this frame
                If Result.Success And Result.Text.Trim <> "" Then
                    textResultLabel.Text = Result.Text.Replace(CRLF, " ")
                    okreturn.Text = Chr(0xE86C)
                End If
            End If
        End If
    Catch
        Log(LastException)
    End Try
    'Always resume, also when no bitmap or crop was available
    '(the original returned early here and left the timer disabled)
    camtimer.Enabled=True
    alreadyworks=1
End Sub

Private Sub CropFromOverlay(sourceBmp As B4XBitmap) As B4XBitmap
    For i = 0 To pnlPreview.NumberOfViews - 1
        Dim v As B4XView = pnlPreview.GetView(i)
        If v.Tag <> Null And v.Tag = "CropOverlay" Then
            'Scale factors from panel coordinates to bitmap pixels
            Dim scaleX As Float = sourceBmp.Width / pnlPreview.Width
            Dim scaleY As Float = sourceBmp.Height / pnlPreview.Height

            'Local names differ from the globals w and h so that they do not hide them
            Dim cropX As Int = v.Left * scaleX
            Dim cropY As Int = v.Top * scaleY
            Dim cropW As Int = v.Width * scaleX
            Dim cropH As Int = v.Height * scaleY

            'Refuse to crop outside the bitmap
            If cropX + cropW > sourceBmp.Width Or cropY + cropH > sourceBmp.Height Then Return Null
            Return sourceBmp.Crop(cropX, cropY, cropW, cropH)
        End If
    Next
    Return Null
End Sub


Private Sub B4XPage_Disappear
    If camtimer.IsInitialized Then camtimer.Enabled = False
    If camex.IsInitialized Then camex.Stop
End Sub

Private Sub B4XPage_Appear
    If camex.IsInitialized Then StartCamera
End Sub



Private Sub okreturn_Click
    Main.ocrresult=textResultLabel.Text
    B4XPages.ClosePage(Me)
End Sub
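
A note on the cropping arithmetic in CropFromOverlay: the rectangle is defined in panel coordinates and then scaled to bitmap pixels. This assumes that the preview bitmap covers exactly the same area as pnlPreview (no letterboxing). Below is a minimal sketch of that mapping with made-up sizes. LogCropRect and all the numbers in it are hypothetical and not part of the page above.

B4X:
'Hypothetical helper, for illustration only: logs the crop rectangle in bitmap pixels.
Sub LogCropRect
    'Assumed sizes in pixels (not dip): a 400 x 640 panel and an 800 x 1280 preview bitmap
    Dim panelW As Int = 400
    Dim panelH As Int = 640
    Dim bmpW As Int = 800
    Dim bmpH As Int = 1280
    'Capture rectangle of 200 x 200, centered in the panel as in AddCropOverlay
    Dim rectW As Int = 200
    Dim rectH As Int = 200
    Dim rectLeft As Int = (panelW - rectW) / 2 '100
    Dim rectTop As Int = (panelH - rectH) / 2  '220
    'Same scaling as CropFromOverlay
    Dim scaleX As Float = bmpW / panelW '2
    Dim scaleY As Float = bmpH / panelH '2
    Dim cropX As Int = rectLeft * scaleX '200
    Dim cropY As Int = rectTop * scaleY  '440
    Dim cropW As Int = rectW * scaleX    '400
    Dim cropH As Int = rectH * scaleY    '400
    Log($"Crop: ${cropX}, ${cropY}, ${cropW} x ${cropH}"$)
End Sub

If the preview and the panel have different aspect ratios, scaleX and scaleY will differ and the cropped area may not match exactly what is shown inside the red rectangle.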
 

Theera

Expert
Licensed User
Longtime User
Hi Magma,
I have a problem with Main.ocrresult after testing your code. What does Main refer to in this line?
Main.ocrresult=textResultLabel.Text
 

Magma

Expert
Licensed User
Longtime User
Yes, this is only a snippet (cropped from a larger project, you could say).

You have to store the value of textResultLabel.Text somewhere. I chose a public String so that it can be used from other B4XPages. Change it to anything you want.
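
For example, a minimal sketch of one way to make Main.ocrresult resolve: declare the variable as Public in Process_Globals of the Main module. The variable name comes from the snippet; where you read it afterwards is up to you, and the B4XPage_Appear sub shown here is only an assumed place to do so.

B4X:
'In the Main module
Sub Process_Globals
    'Holds the last OCR result; written in okreturn_Click of the OCR page
    Public ocrresult As String
End Sub

'In any other page (assumption: you read the value when that page becomes visible again)
Private Sub B4XPage_Appear
    If Main.ocrresult <> "" Then
        Log("OCR result: " & Main.ocrresult)
    End If
End Sub

A public variable of the calling page would work just as well; Main is simply reachable from every module.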
 

Theera

Expert
Licensed User
Longtime User
Thank you, that is kind of you.
 

Surreal

Member
Licensed User
Longtime User
Hi Magma,

Good morning, I am very interested in your work, but do you have a small project file to try?

thank you very much
 

Magma

Expert
Licensed User
Longtime User

Surreal

Member
Licensed User
Longtime User
...done...

attaching here..
Hi Magma,
I saw your code and it is excellent, I like it a lot.
I would like to ask you a question. In one of my projects I use clsCameraIntent.
I know that it is obsolete, but I cannot change it right now. I was wondering whether your AddCropOverlay procedure can be applied to this class in some way.

Thanks for your patience and Have a nice day.
 

Surreal

Member
Licensed User
Longtime User
Can you share the part of code or project and your thought?
Unfortunately the project is substantial, but I'll try to explain what it does.
Among its various functions, I use clsCameraIntent to start the system camera (after activating the GPS and enabling the camera's Save Position option, so that the coordinates are recorded). I photograph a car and its license plate, which gives me a geolocated photo of the car that I save on the smartphone. Later I send it to a desktop application developed in WPF, which looks up the license plate and the geolocation and shows it on the map.
Everything works perfectly.
Looking at your project, I tried using the OCR to identify the license plate. Your system is perfect because I frame only the portion that interests me, that is, the license plate, and as you can see from my modified version of your project it works very well.
I then use the license plate that was found to check in the database, with SQL and DBUtils and its functions, whether it is present and registered!
Now the problem is that with clsCameraIntent I can't work out how to crop the image as you do.

I would have to implement Camera2 and CamEx2 in my project.

I hope I explained myself.
Thank you.

I attach your project, with the very small changes made by me.
 

Attachments

  • Magma_scanner.rar
    30.2 KB

Magma

Expert
Licensed User
Longtime User
Well, it has nothing to do with which camera library you use (Camera2 is of course a must today).

Cropping applies to the picture/bitmap that you grab.

I haven't seen your modified project. I am away this weekend.

But for plates you will need Tesseract or, better, a YOLO project, which is a Python project, so you need to run a service somewhere.

The text recognizer is for simple uses, and it does not have every language pack, for example Arabic.
 

drgottjr

Expert
Licensed User
Longtime User
while i myself like the idea of a pre-cropped frame, it might be of interest to forum members how android handles ocr.

android ocr engine "automatically" identifies blocks of text. the extracted text is made available to the client in several forms. in addition,
the location of the various blocks of text found within the image is also returned to the client. in other words, although the user can pre-crop
a section of the image for text extraction, the api crops all the blocks found. it is possible to display all the blocks or only those of interest.
this is not quite the same as pre-cropping a particular block, but if you pre-crop a particular block, you will only get the pre-cropped block.
if you let android do its thing, you get all the blocks. if the image has 4 or 5 blocks, and you pre-crop a block, you will have to take 4 or 5
pictures of the same page if you wanted the various blocks (or you have to code a way to move the pre-crop frame around).

anyway, and for what it's worth, here is an example of how android handles text extraction when left to its own devices:

[Screenshots: 1.png, 2.png, 3.png]
 


Magma

Expert
Licensed User
Longtime User
B4X:
Sub IsValidPlate(valore As String) As Boolean
    Dim matcher1 As Matcher
    '
    'ITALIAN LICENSE PLATE
    '
    matcher1 = Regex.Matcher("[A-Z]{2}[0-9]{3}[A-Z]{2}", valore)
    If matcher1.Find = True Then        
        Return True
    Else
        Return False
    End If
End Sub

You mean you added this? Yes, that is one way. The problem comes when a plate is old, belongs to a public vehicle (which does not use the same numbering scheme) or to an agricultural vehicle, or is perhaps rusty.

Those cases can be handled more easily with the help of Tesseract and trained models, or with an AI (using a ChatGPT assistant will have a cost, but it will surely help).

Tesseract will need a set of captured plates for training (more than a hundred).

(A Python solution as a backend server would also be helpful.)
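
For the simple case that the regex does cover, here is a small sketch that returns the matched plate instead of only True/False. ExtractPlate is a hypothetical helper, not part of the attached project. It assumes that the OCR text may contain spaces or line breaks inside the plate and removes them before matching.

B4X:
'Hypothetical helper: returns the first substring that looks like an Italian plate, or "" if there is none.
Sub ExtractPlate(ocrText As String) As String
    'OCR output often contains spaces and line breaks inside the plate
    Dim cleaned As String = ocrText.ToUpperCase.Replace(" ", "").Replace(CRLF, "")
    Dim m As Matcher = Regex.Matcher("[A-Z]{2}[0-9]{3}[A-Z]{2}", cleaned)
    If m.Find Then Return m.Match
    Return ""
End Sub

For example, ExtractPlate("ab 123 cd") returns "AB123CD". Note that removing the spaces can also join unrelated words and produce a false match, so this is only suitable when the crop frame contains little more than the plate.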
 

TILogistic

Expert
Licensed User
Longtime User
This is called automatic number plate recognition (ANPR)
I developed what the member @Surreal mentions some time ago with TextRecognition based on MLKit, and there are techniques you should use with the MLKIT APIs and the Camera.

Yolo:
 

TILogistic

Expert
Licensed User
Longtime User
Tips for Post #15
Use ML Kit's OCR to process an image and obtain a text object containing the text blocks (TextBlock).
Traverse the blocks and obtain their coordinates (boundingBox or cornerPoints).
Draw rectangles on the original image at the position of each block to highlight them.
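
The three steps above could be sketched roughly as follows, calling the ML Kit Java API through the JavaObject library. This is only a sketch under assumptions: DrawTextBlocks is a hypothetical sub, mlText is assumed to be the ML Kit Text object (how to get it depends on the OCR wrapper you use and is not shown), and cvs is assumed to be a B4XCanvas with the same pixel size as the recognized image.

B4X:
'Sketch only. mlText: assumed ML Kit Text object (com.google.mlkit.vision.text.Text).
'cvs: assumed B4XCanvas with the same pixel size as the recognized image.
Sub DrawTextBlocks(mlText As JavaObject, cvs As B4XCanvas)
    Dim blocks As List = mlText.RunMethod("getTextBlocks", Null)
    For Each block As JavaObject In blocks
        'getBoundingBox returns an android.graphics.Rect, or null when no box is available
        Dim box As JavaObject = block.RunMethod("getBoundingBox", Null)
        If box.IsInitialized Then
            Dim r As B4XRect
            r.Initialize(box.GetField("left"), box.GetField("top"), box.GetField("right"), box.GetField("bottom"))
            cvs.DrawRect(r, Colors.Red, False, 2dip)
            Log(block.RunMethod("getText", Null))
        End If
    Next
    cvs.Invalidate
End Sub

The box coordinates are in image pixels, so if the image is shown scaled they have to be scaled in the same way before drawing.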
 