TY - GEN
T1 - Enhancement of optical character recognition through neighbor embedding
AU - Smith, David C.
PY - 2012
Y1 - 2012
AB - Optical character recognition (OCR) is the electronic translation of scanned images of text documents. Frequently, text documents are scanned at low resolution (LR) to conserve memory, and OCR engines have difficulty translating such LR images. In this article we apply the neighbor embedding (NE) single-image super-resolution (SISR) technique to LR images of text documents and obtain high-resolution (HR) versions, which we subsequently process with OCR. We repeat this experimental procedure using bicubic interpolation (BI) in the preprocessing step. We report our experimental findings comparing the character error rates (CER) of OCR translations before and after NE and BI preprocessing. Our experiments with Latin fonts in the 6pt-10pt range show that at 3x (LR scanning at 100 dpi) and 4x (LR scanning at 75 dpi) magnification, CER after NE preprocessing was nearly an order of magnitude lower than CER after BI preprocessing. We also observed that in this point range, OCR applied to LR images scanned at 75 dpi completely failed, and CER was at least 94% for OCR applied to LR images scanned at 100 dpi. By contrast, at 3x and 4x magnification, CER after NE preprocessing was under 10% at 6pt and under 3% at 8pt and 10pt.
KW - Bicubic interpolation
KW - Image enhancement
KW - Neighbor embedding
KW - Optical character recognition
KW - Single image super-resolution
UR - http://www.scopus.com/inward/record.url?scp=84884178664&partnerID=8YFLogxK
U2 - 10.2316/P.2012.786-016
DO - 10.2316/P.2012.786-016
M3 - Conference contribution
AN - SCOPUS:84884178664
SN - 9780889869493
T3 - Proceedings of the IASTED International Conference on Signal and Image Processing, SIP 2012
SP - 1
EP - 8
BT - Proceedings of the IASTED International Conference on Signal and Image Processing, SIP 2012
T2 - 14th IASTED International Conference on Signal and Image Processing, SIP 2012
Y2 - 20 August 2012 through 22 August 2012
ER -