I'm working on a project where a ticket is read by OCR. With conventional libraries I did not get good results.
(Only with Google Cloud Vision were the results good, but that service is paid.)
So I'm trying to build my own OCR. The ticket is always the same size and the font is known.
Where am I:
- Get an image and binarize it. Ok
- Select the areas of interest. Ok
- Split the image using a "connected components" algorithm. Ok
- Resize the elements found. Ok
- Compare each element against a template and compute scores. Ok
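For reference, my binarization step is just a fixed threshold. Roughly like this (a Python sketch for illustration, not my actual code; I assume grayscale values 0-255 and a threshold I picked by hand):

```python
def binarize(gray, threshold=128):
    """Turn a 2D list of grayscale values (0-255) into a 2D bit matrix.

    Dark pixels (below the threshold) become 1 (ink), lighter pixels
    become 0 (background).
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]
```

So a dark pixel like 100 maps to 1, and a light pixel like 200 maps to 0.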
The bit matrix generated from each element is 16x16.
Something like this:
0000000000000000
0000000000000000
0011111111111100
0011111111111100
0000000000011000
0000000000110000
0000000001100000
0000000011000000
0000000110000000
0000001100000000
0000011000000000
0000110000000000
0011111111111100
0011111111111100
0000000000000000
0000000000000000
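To score a matrix like the one above against a template, I flatten it to a 256-element vector and count matching bits. Something like this (again a Python sketch; the function names are my own):

```python
def to_vector(rows):
    """Flatten a list of '0'/'1' strings (one per row) into a flat
    list of ints, e.g. a 16x16 matrix becomes 256 values."""
    return [int(c) for row in rows for c in row]

def match_score(candidate, template):
    """Fraction of positions where two equal-length bit vectors agree.

    1.0 means a perfect match, 0.0 means every bit differs.
    """
    hits = sum(1 for a, b in zip(candidate, template) if a == b)
    return hits / len(candidate)
```

This flat vector is also exactly the shape of input a perceptron would take: 256 inputs, one per pixel.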
Yes, that seems to work, but it needs to be better.
So I thought: "What could I use to make the recognition more intelligent?" I found neural networks and think they are very interesting.
I would like to learn basic neural networks, for this project and for others, because the topic really interests me.
My difficulties:
How do I apply a "perceptron" to recognize these patterns?
How do I do backpropagation to calibrate the weights?
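From what I've read so far, a single sigmoid neuron trained with the delta rule (which I understand is the one-layer special case of backpropagation) would look roughly like this. This is a Python sketch of my understanding, not working B4X; all names and parameter values are my own guesses:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_perceptron(samples, n_inputs, epochs=1000, lr=0.5, seed=0):
    """Train one sigmoid neuron with the delta rule.

    samples: list of (input_vector, target) pairs, target is 0 or 1.
    Returns the learned weights and bias.
    """
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    b = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for x, t in samples:
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Delta rule: gradient of the squared error (t - y)^2
            # pushed back through the sigmoid's derivative y * (1 - y).
            delta = (t - y) * y * (1.0 - y)
            w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
            b += lr * delta
    return w, b

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

For the 16x16 characters, I imagine one such neuron per character class, each taking the 256-bit vector as input, with the highest output winning. Is that the right idea?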
Could someone write example B4X perceptron code along these lines, including backpropagation?
Thank you very much.