UDiNet: A dilated U-net for improving OCR performance

Fardinfar, Mahyar; Rashidikia, Pouya; Rezaie, Mohammadreza; Zolfy Lighvan, Mina

doi:10.46254/GC02.20240035

Image denoising is a critical task in the field of computer vision. This paper introduces the UDiNet architecture, a dilated variant of the U-Net, specifically designed to address image denoising challenges. We present a novel dataset comprising book sheet images to rigorously evaluate the performance of the proposed method. Experimental results demonstrate that UDiNet significantly enhances the performance of established Optical Character Recognition (OCR) systems, such as Tesseract and Genome. The model effectively mitigates severe noise while preserving essential structural details of English characters. This capability positions UDiNet as a valuable preprocessing technique for various applications, including classification, detection, and OCR tasks. To promote further research in this domain, we have made the code and trained models publicly accessible.
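As a generic illustration of what "dilated" means here, and not the UDiNet implementation itself, the sketch below shows how dilation spaces out the taps of a convolution kernel so that it covers a wider context without adding weights. The function names `effective_kernel_size` and `dilated_conv1d` are hypothetical and chosen only for this example. The 1-D, pure-Python setting is an assumption made for brevity, since the abstract does not specify layer details.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input samples) covered by one dilated kernel.

    A kernel with k taps and dilation d has (d - 1) gaps between
    neighbouring taps, so it spans k + (k - 1) * (d - 1) samples.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated cross-correlation (no padding, stride 1).

    Illustrative only: real denoising networks apply the 2-D
    equivalent with learned weights.
    """
    span = (len(kernel) - 1) * dilation  # distance from first to last tap
    out = []
    for start in range(len(signal) - span):
        # Each tap i reads the input at offset i * dilation from `start`.
        acc = sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        out.append(acc)
    return out


if __name__ == "__main__":
    for d in (1, 2, 4):
        print(f"3-tap kernel, dilation {d}: spans {effective_kernel_size(3, d)} samples")
    print(dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], dilation=2))
```

With a 3-tap kernel, increasing the dilation from 1 to 2 to 4 widens the covered span from 3 to 5 to 9 samples while the number of weights stays at three. This is the general reason dilated convolutions are used when a wider context is needed at low parameter cost.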

Publisher: IEOM Society International