Share My Creation At The Threshold

JohnC · Feb 11, 2023

It looks like the screen pixels are making it harder to read because even when a letter is white, it's speckled with blue dots.

I wonder whether running the image through a despeckle process first would improve the OCR success rate after thresholding?
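
To make the despeckle idea concrete, here is a minimal sketch of one common approach, a 3x3 median filter applied to a grayscale image before thresholding. It is an assumption for illustration only: the function name `median_filter_3x3` and the list-of-rows image layout are hypothetical and are not taken from the app discussed in this thread.

```python
def median_filter_3x3(gray):
    """Return a copy of `gray` (rows of 0-255 ints) with each pixel
    replaced by the median of its 3x3 neighbourhood.
    Neighbourhoods are clipped at the image border."""
    h = len(gray)
    w = len(gray[0]) if h else 0
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Collect the neighbours that actually exist around (x, y).
            window = [
                gray[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            ]
            window.sort()
            # Middle element of the sorted window (upper median if even).
            row.append(window[len(window) // 2])
        out.append(row)
    return out


if __name__ == "__main__":
    # A white patch with a single dark speck in the middle.
    speckled = [
        [255, 255, 255],
        [255, 0, 255],
        [255, 255, 255],
    ]
    for line in median_filter_3x3(speckled):
        print(line)
```

An isolated speck is outvoted by its neighbours and disappears, while larger shapes such as letter strokes mostly survive. Whether that actually helps OCR on a given photo is something to test rather than assume.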

Johan Schoeman · Feb 11, 2023

I messed around with Otsu thresholding a few years ago and am not sure whether it would yield a more OCR-able result. It first grayscales the image and then calculates the threshold value used to binarize it.

Otsu Thresholding - binarization of images

This project demonstrates grayscaling of a multi colored image and then the binarization (i.e black or white pixels only) of the grayscaled image. The binarization is based on Otsu Thresholding as explained at www.labbookpages.co.uk/software/imgProc/otsuThreshold.html The image on the left is...

www.b4x.com
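
For readers who have not seen the method, the two steps described above (grayscale first, then compute a single global threshold) can be sketched roughly as follows. This is a plain-Python illustration under stated assumptions, not the code from the linked project: the function names, the integer luminance weights, and the list-of-rows image layout are all hypothetical choices.

```python
def to_gray(pixels):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values.
    The 299/587/114 weights are one common choice, assumed here."""
    return [
        [(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
        for row in pixels
    ]


def otsu_threshold(gray):
    """Return the threshold t that maximises the between-class variance
    of the two groups 'value <= t' and 'value > t'."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below t
    sum_bg = 0    # sum of their gray values
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already background
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, t):
    """Map every pixel to 0 (at or below t) or 255 (above t)."""
    return [[255 if v > t else 0 for v in row] for row in gray]


if __name__ == "__main__":
    gray = [[10, 10, 200], [200, 10, 200]]
    t = otsu_threshold(gray)
    print(t, binarize(gray, t))
```

Because the threshold is a single value for the whole image, this approach tends to struggle with uneven lighting, which is one reason local methods exist.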

drgottjr · Feb 11, 2023

of course, i've seen and tried the otsu thresholding app (thanks, as usual, for a fine job), but it wouldn't run with the image
i had at hand. it complained about the particular format of the image (the same image which i had no problems with otherwise
and which i use in my example, just a .jpg taken by my device). so rather than wrestle with it, i put it aside for the moment.

in any case, otsu is one of many thresholding methods. they all share the same behavior that you describe: convert the base
image to grayscale and then work out the thresholding point. there are a lot of people who have spent a lot of time working
on thresholding. if you read their scholarly papers and look at their examples, otsu is not always the best. so i present
a representative selection for the user's convenience.

i spent a lot of time with leptonica in my tesseract days. i had some amazing results with sauvola thresholding. even leptonica's
base thresholding and tesseract's own minimal thresholding and zxing's hybrid binarizer all produce good results on the types of
images that we usually try to extract text from. i could have added them (at great additional app size). but, frankly, any ocr app
already includes one of them, invoked automatically before attempting to extract the text.
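
for anyone curious how sauvola differs from a global method like otsu, here is a rough plain-python sketch. it is not leptonica's implementation: the function name, the window radius and the values of `k` and `R` are illustrative assumptions. the idea is that each pixel gets its own threshold `T = m * (1 + k * (s / R - 1))`, where `m` and `s` are the mean and standard deviation of the gray values in a window around that pixel.

```python
from math import sqrt


def sauvola_binarize(gray, radius=7, k=0.2, R=128.0):
    """Binarize `gray` (rows of 0-255 ints) with a per-pixel threshold
    T = m * (1 + k * (s / R - 1)), computed over a square window of
    side 2 * radius + 1 that is clipped at the image border.
    radius, k and R are tuning parameters (values here are assumptions)."""
    h = len(gray)
    w = len(gray[0]) if h else 0
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                gray[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            n = len(window)
            mean = sum(window) / n
            std = sqrt(sum((v - mean) ** 2 for v in window) / n)
            threshold = mean * (1 + k * (std / R - 1))
            row.append(255 if gray[y][x] > threshold else 0)
        out.append(row)
    return out


if __name__ == "__main__":
    # Light background with one dark "ink" pixel in the middle.
    page = [
        [220, 220, 220],
        [220, 30, 220],
        [220, 220, 220],
    ]
    for line in sauvola_binarize(page, radius=1):
        print(line)
```

this naive version recomputes every window from scratch, so it is slow on real photos; practical implementations usually rely on integral images to get the window statistics cheaply.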

the purpose of my example was to produce a kind of "point and shoot" way of examining thresholding from different points of
view. if text extraction didn't work initially, the user could ask for a set of variously binarized images, and she could
choose one (if one such image was appropriate) without being a thresholding expert. maybe it's the otsu version. some images
produce no completely satisfactory result. some images need to be binarized by breaking up the image and dealing with each part
separately by hand, in stages. in such cases (e.g., digitization of older, stained, skewed manuscripts) opencv and a lot of custom
thresholding techniques are required. in addition, other operations, such as edge finding, despeckling, deskewing, denoising,
rotating, inverting, etc., may be required. leptonica, for example, has hundreds of such methods. having them all run without guidance
is beyond the scope of the example. and who, running an ocr app on his phone, is going to know exactly which arbitrary values are to
be assigned to the many, many variables required to binarize all images optimally? if these are the types of images one has to deal with
on a daily basis, then something other than a representative sampling is needed. i get that. i thought there might be some value to
my example; i can't be right all the time.
