python - Improve the quality of the letters in an image -


I'm working with images that contain text. The problem is that these images are receipts and, after a lot of transformations, the text has lost quality. I'm using Python and OpenCV. I have been trying a lot of combinations of morphological transformations from the OpenCV doc "Morphological Transformations", but I don't get satisfactory results.

This is what I'm doing right now (I've commented out what I've tried, and left uncommented what I'm using):

    import cv2
    import numpy as np

    # img is the receipt image, already loaded
    kernel = np.ones((2, 2), np.uint8)
    # opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
    # closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
    # dilation = cv2.dilate(opening, kernel, iterations=1)
    # kernel = np.ones((3, 3), np.uint8)
    erosion = cv2.erode(img, kernel, iterations=1)
    # gradient = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)
    # img = erosion.copy()

With this, from this original image:

enter image description here

I get this:

enter image description here

It's a little bit better, as you can see, but it's still bad. The OCR (Tesseract) doesn't recognize the characters here very well. I've trained it but, as you can note, every "e" is different, and so on.

I get good results, but I think that if I resolve this problem they would be even better.

Maybe I can do something else, or use a better combination of morphological transformations. If there is another tool (PIL, ImageMagick, etc.) that would help, I can use it.

Here's the whole image, so you can see how it looks:

enter image description here

As I said, it's not that bad, but a little more "optimization" of the letters would be perfect.

In my experience, erode impairs OCR quality. If you have a grayscale image (not binary), you can use a better binarization algorithm. I use the Sauvola algorithm for binarization. If you only have a binary image, the best thing you can do is remove noise (remove small dots).
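The two suggestions above can be sketched roughly as follows. This is not the answerer's code, just a minimal NumPy-only illustration of Sauvola thresholding (local threshold T = m * (1 + k * (s / R - 1)), with m and s the mean and standard deviation in a window around each pixel) followed by removal of small dark specks. The function names and the values window=25, k=0.2, r=128 and min_area=6 are assumptions for illustration and would need tuning on real receipts.

```python
import numpy as np


def sauvola_binarize(gray, window=25, k=0.2, r=128.0):
    """Binarize dark text on a light background.

    Returns a uint8 image with 255 for background and 0 for text.
    window, k and r are illustrative defaults, not tuned values.
    """
    if window % 2 == 0:
        window += 1  # use an odd window so it is centered on the pixel
    img = gray.astype(np.float64)
    h, w = img.shape
    pad = window // 2
    padded = np.pad(img, pad, mode="edge")

    def integral(a):
        # Summed-area table with a leading row and column of zeros.
        return np.pad(a, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

    def window_sum(s):
        # Sum over the window x window neighborhood of every pixel.
        return (s[window:window + h, window:window + w]
                - s[:h, window:window + w]
                - s[window:window + h, :w]
                + s[:h, :w])

    n = float(window * window)
    mean = window_sum(integral(padded)) / n
    mean_sq = window_sum(integral(padded ** 2)) / n
    std = np.sqrt(np.clip(mean_sq - mean ** 2, 0.0, None))

    threshold = mean * (1.0 + k * (std / r - 1.0))
    return np.where(img > threshold, 255, 0).astype(np.uint8)


def remove_small_dots(binary, min_area=6):
    """Turn dark 8-connected components smaller than min_area into background."""
    out = binary.copy()
    dark = binary == 0
    seen = np.zeros(dark.shape, dtype=bool)
    h, w = dark.shape
    for y, x in zip(*np.nonzero(dark)):
        if seen[y, x]:
            continue
        # Flood fill one component with an explicit stack.
        stack, component = [(y, x)], []
        seen[y, x] = True
        while stack:
            cy, cx = stack.pop()
            component.append((cy, cx))
            for ny in range(max(cy - 1, 0), min(cy + 2, h)):
                for nx in range(max(cx - 1, 0), min(cx + 2, w)):
                    if dark[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        stack.append((ny, nx))
        if len(component) < min_area:
            ys, xs = np.array(component).T
            out[ys, xs] = 255
    return out
```

With a grayscale receipt loaded as a 2-D array (for example via cv2.imread with cv2.IMREAD_GRAYSCALE), calling remove_small_dots(sauvola_binarize(gray)) would give a cleaned binary image to pass to Tesseract. The pure-Python flood fill is slow on large scans; in practice scikit-image's threshold_sauvola and OpenCV's cv2.connectedComponentsWithStats cover the same two steps much faster.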

