Android Question [B4X] [B4A] Add OCR features

GMan

Well-Known Member
Licensed User
Longtime User
We tried to get the 2013 example project from @Erel to work, but without success. It can be found here: https://www.b4x.com/android/forum/t...r-features-to-your-android-application.27080/

Given that the code is rather old and many things have changed since then, we tried several approaches to make it work.
Finally, removing httpJob.bas and HttpUtils2.bas (replacing the latter with OkHttpUtils2), as Erel also suggested, gave no different result.
Depending on the image quality, more or fewer characters are recognized.
@DonManfred also tried to get it to work, with some success: it works, but the online part may have issues.
[Example] Add OCR features to your Android application: https://www.b4x.com/android/forum/t...-android-application.27080/page-2#post-656725

Maybe someone has tried it in the past or has another solution.
 

GMan

Recognition of the numbers and the text isn't good enough... For the text it doesn't really matter, but if a 5 is recognized as a 3, that's not so good.
I am experimenting with different resolutions when taking the picture; at about 760 x 1280, most letters and numbers are scanned correctly.

It also depends on the light... scanning the bill in an "ambient" restaurant will probably not work very well ;-)
 

GMan

I am using this "bill-like" printed paper (VERY old-school thermal printer ;-) ) from a German post office (an international company).
Here is one result (from the log entry):
After some tests I see that the number of PAGES changes, sometimes significantly... As I understand it, the lower the PAGES counter, the better the recognition result.

Another result with sunlight on the sheet:
 

Attachments

  • IMG_20190414_102203.jpg
    232.1 KB · Views: 280

GMan

First I tried the parameter getWords=false (no recognition of "words", as I understand it).
B4X:
job.PostBytes("https://www.ocrwebservice.com/restservices/processDocument?gettext=true&getWord=false@language=german,english", ReadFile(VideoFileDir, "1.jpg"))

But the result still contains text:
 

GMan

I'll define zones, but it doesn't really make sense: every coffee shop prints its own "design"... date on top, date on bottom, total on top, total on bottom, etc.
A solution may be to ask the user (right when scanning the bill) WHICH fields/areas should be scanned (by tapping).
 

DonManfred

Expert
Licensed User
Longtime User
&getWord=false@language=german,english
1. The parameter name is "getwords"!
2. You should not use an @ here at all; query parameters are separated with &.
B4X:
job.PostBytes("https://www.ocrwebservice.com/restservices/processDocument?gettext=true&getwords=false&language=german", ReadFile(VideoFileDir, "1.jpg"))
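To avoid separator mistakes like the @ above, the query string can be assembled from a Map. This is only a minimal sketch: BuildOcrUrl is a hypothetical helper (not part of OkHttpUtils2), and the values are appended without URL encoding, which is assumed to be fine for simple values such as "true" or "german".

```b4x
' Hypothetical helper: joins the parameters with "?" and "&".
' Values are NOT URL-encoded (assumption: they contain no special characters).
Sub BuildOcrUrl (BaseUrl As String, Params As Map) As String
    Dim sb As StringBuilder
    sb.Initialize
    sb.Append(BaseUrl)
    Dim first As Boolean = True
    For Each key As String In Params.Keys
        If first Then
            sb.Append("?")
            first = False
        Else
            sb.Append("&")
        End If
        sb.Append(key).Append("=").Append(Params.Get(key))
    Next
    Return sb.ToString
End Sub
```

Usage, with the same parameters as in the call above:

```b4x
Dim url As String = BuildOcrUrl("https://www.ocrwebservice.com/restservices/processDocument", _
    CreateMap("gettext": "true", "getwords": "false", "language": "german"))
job.PostBytes(url, ReadFile(VideoFileDir, "1.jpg"))
```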
 

DonManfred

every coffee shop prints its own "design"... date on top, date on bottom, total on top, total on bottom, etc.
Sure, you are right. I guess you need to play around with it.
Knowing the positions of each "block", you could define base parameters and run a few OCR passes on each bill.
I don't know whether you can get the touch position using the cameraex2 example and whether you can build a Rect which could then be scanned by the REST API. You need to get the right zone(s) for the touch Rect.

I guess defining one or more zones can help you get better results. But you need to define the zones first, for sure.
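As a rough sketch of how a touch Rect could be turned into a zone value: the helper below scales a rectangle from preview coordinates to image pixels. ZoneFromTouchRect and all its parameters are hypothetical, the field order top:left:height:width is an assumption that should be checked against the service documentation, and it assumes the preview and the photo have the same orientation and aspect ratio.

```b4x
' Hypothetical helper. The field order top:left:height:width is an ASSUMPTION.
Sub ZoneFromTouchRect (TouchLeft As Int, TouchTop As Int, TouchWidth As Int, TouchHeight As Int, _
        PreviewWidth As Int, PreviewHeight As Int, ImageWidth As Int, ImageHeight As Int) As String
    ' Scale factors from preview coordinates to image pixels.
    Dim sx As Float = ImageWidth / PreviewWidth
    Dim sy As Float = ImageHeight / PreviewHeight
    ' Assigning to Int drops the fractional part.
    Dim zTop As Int = TouchTop * sy
    Dim zLeft As Int = TouchLeft * sx
    Dim zHeight As Int = TouchHeight * sy
    Dim zWidth As Int = TouchWidth * sx
    Return $"${zTop}:${zLeft}:${zHeight}:${zWidth}"$
End Sub
```

The returned string could then be passed as the value of the zone parameter in the request URL.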


I cannot really help here, as I only created a trial account yesterday and tried to get the old example working. I completely rewrote it to use the REST API with OkHttpUtils2. In the end I got it working (not just partially; it is working. But I'm not responsible for the results you get from this web API ;-))

PS: If you know a better alternative which offers a REST API and a trial account, I can help you get it running.
 

GMan

Yeah, that seems the right way (at this time).

It could be a step-by-step dialog system with a fixed zone:

1. scan company label
2. scan bill date
3. scan bill number
4. etc

I played around with the zones... this gives a reliable result after all:
In this case the tracking number MUST be scanned 100% correctly, and it was.

Using this:
B4X:
job.PostBytes("https://www.ocrwebservice.com/restservices/processDocument?gettext=true&getWord=false@language=german,english@outputFormat=txt@zone=150:230:160:50", ReadFile(VideoFileDir, "1.jpg"))
This creates a small rectangle, which was scanned correctly.

BTW: the returned content always includes the text, even if getWord=false is set ;-)
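For comparison, here is a sketch of the same request with & as separator and the parameter name getwords, as suggested earlier in the thread. The wrong name and the @ separators may be the reason why getWord=false appeared to have no effect. The zone and outputFormat values are copied from the call above and are not verified against the service documentation.

```b4x
' Same request with "&" separators. Parameter values are taken from the post above (unverified).
Dim url As String = "https://www.ocrwebservice.com/restservices/processDocument" & _
    "?gettext=true&getwords=false&language=german,english" & _
    "&outputFormat=txt&zone=150:230:160:50"
job.PostBytes(url, ReadFile(VideoFileDir, "1.jpg"))
```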

The last result:
 

DonManfred

Here is another OCR REST API: https://ocr.space/ocrapi
They offer a free account too.

Here is an example using this API:

B4X:
    ' ReadFile is a helper sub (not shown here) that returns the file content as a byte array.
    Dim bytes() As Byte = ReadFile(File.DirAssets, "IMG_20190414_102203.jpg")
    Dim su As StringUtils ' from the StringUtils library; omit if already declared globally
    Dim base64 As String = su.EncodeBase64(bytes)
 
    Dim job As HttpJob
    job.Initialize("ocr", Me)
    ' The image is sent as a base64 data URI in the "base64Image" field.
    Dim m As Map = CreateMap("language": "ger", "isOverlayRequired": True, "filetype": "jpg", "isTable": True, "base64Image": $"data:image/jpeg;base64,${base64}"$)
    job.PostMultipart("https://api.ocr.space/parse/image", m, Null)
    job.GetRequest.SetHeader("apikey", "yourapikey") ' get your own API key and use it here
    ' Wait for the job created above; the result arrives in j.
    Wait For (job) JobDone(j As HttpJob)
    If j.Success Then
        'File.WriteString(File.DirRootExternal, "JobResult.txt", j.GetString)
        Log(j.GetString)
    Else
        Log(j.ErrorMessage)
    End If
    j.Release

Especially the parameter isTable = true is something you might need. Also isOverlayRequired, to request the bounds of each result.

In this case I uploaded the image you posted here, "Die Quittung von der Post" (the receipt from the post office).

This is the resulting JSON

This is the IDno


All in all, I think this API is more appropriate.
Give it a try with your bills...
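A minimal sketch of reading the recognized text out of the JSON response with the B4X JSON library. LogOcrResult is a hypothetical sub, and the field names (ParsedResults, ParsedText, TextOverlay, Lines, Words, WordText, Left, Top) are assumptions about the response layout; check them against the JSON you actually get back.

```b4x
' Requires the JSON library. Sub name and field names are assumptions.
Sub LogOcrResult (JsonText As String)
    Dim parser As JSONParser
    parser.Initialize(JsonText)
    Dim root As Map = parser.NextObject
    If root.ContainsKey("ParsedResults") = False Then
        Log("No ParsedResults in response")
        Return
    End If
    Dim results As List = root.Get("ParsedResults")
    If results.IsInitialized = False Then Return
    For Each result As Map In results
        ' Full recognized text of this result.
        Log(result.Get("ParsedText"))
        ' With isOverlayRequired = True each word should carry its bounds.
        If result.ContainsKey("TextOverlay") = False Then Continue
        Dim overlay As Map = result.Get("TextOverlay")
        Dim overlayLines As List = overlay.Get("Lines")
        For Each ln As Map In overlayLines
            Dim words As List = ln.Get("Words")
            For Each word As Map In words
                Log(word.Get("WordText") & " at " & word.Get("Left") & "," & word.Get("Top"))
            Next
        Next
    Next
End Sub
```

It could be called as LogOcrResult(j.GetString) in the success branch of the example above.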
 

GMan

Quick tip: I just saw this on Twitter
 