Client just sent me three years of paper bills that need to be digitized for analysis. Looking for reliable free OCR software that can handle utility bill formats reasonably well. Pacific Power bills have a fairly standard layout but some are faded copies. Anyone have good experiences with free solutions before I consider paid options?
Free OCR software recommendations for bill scanning
I've had decent results with NAPS2 combined with Tesseract OCR. It's completely free and handles most utility bill formats well. The key is getting good scans: scan at 300 DPI minimum and raise the contrast if the originals are faded. For PGE bills, I typically get 85-90% accuracy on the important numbers. You still need to verify manually, but it beats typing everything.
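If you'd rather script the OCR step than click through a GUI, here is a rough sketch of calling Tesseract directly. It assumes the `tesseract` command-line binary is installed and on your PATH. The function names, the file name in the usage note, and the choice of page segmentation mode 6 (treat the page as a single uniform block of text) are my own assumptions for illustration, not what NAPS2 does internally.

```python
import subprocess

def build_tesseract_cmd(image_path, dpi=300, psm=6, lang="eng"):
    """Build a Tesseract CLI command that prints recognized text to stdout."""
    return [
        "tesseract", image_path, "stdout",
        "--dpi", str(dpi),   # tell Tesseract the scan resolution
        "--psm", str(psm),   # page segmentation mode (assumed: 6)
        "-l", lang,          # recognition language
    ]

def ocr_image(image_path, **kwargs):
    """Run Tesseract on one scanned page and return the recognized text."""
    cmd = build_tesseract_cmd(image_path, **kwargs)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `ocr_image("bill_2021_03.png")` (a made-up file name) would return the page text as a string, which you can then search for the numbers you care about.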
Adobe Acrobat can OCR scanned PDFs if you're willing to work with PDFs, though as far as I know the text recognition feature belongs to the paid Acrobat tiers rather than the free Reader, so check what your install actually offers. It's not as accurate as dedicated OCR software but convenient, since most bills end up as PDFs anyway. Puget Sound Energy bills work reasonably well with it. For better accuracy, try preprocessing the images: increase the contrast and maybe run them through a denoising filter.
Thanks for the suggestions! Downloaded NAPS2 and it's working better than expected. I'm getting about 80% accuracy on the kWh readings, which saves significant time. The demand charges are trickier since Pacific Power uses a smaller font for those details, but it's still much faster than manual entry. Beatrice, what denoising filter do you recommend?
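Since the accuracy is well short of 100%, it may help to have a script pull the kWh figures out of the OCR text and flag the doubtful ones for manual checking. This is only a sketch: the list of letter-for-digit confusions, the plausible range, and the assumption that readings appear as a number followed by "kWh" are all guesses on my part, so adjust them to what your bills actually look like.

```python
import re

# Letters that OCR sometimes produces in place of digits (assumed list).
DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

# A number that starts with a digit, followed by "kWh" in any letter case.
KWH_PATTERN = re.compile(r"\b(\d[\dOolISB,.]*)\s*(?i:kwh)")

def extract_kwh(ocr_text, low=0.0, high=100_000.0):
    """Return (value, needs_review) pairs for every 'NNN kWh' found in the text."""
    results = []
    for match in KWH_PATTERN.finditer(ocr_text):
        raw = match.group(1).replace(",", "")
        cleaned = raw.translate(DIGIT_FIXES)
        try:
            value = float(cleaned)
        except ValueError:
            # Could not make a number out of it at all: always review.
            results.append((None, True))
            continue
        # Review if a letter had to be swapped for a digit or the value is implausible.
        needs_review = cleaned != raw or not (low <= value <= high)
        results.append((value, needs_review))
    return results
```

Anything that comes back with `needs_review` set to `True` goes on the pile to check against the paper copy. The rest still deserves a spot check.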
For denoising, I use GIMP's noise reduction filter, which is also free. Open the scanned image and go to Filters > Enhance > Noise Reduction (that's where it lives in GIMP 2.10; the Filters > Noise submenu is for adding noise). Start with the default settings and adjust if needed. Sometimes a slight Gaussian blur (1-2 pixel radius) before OCR helps with really poor quality scans. The goal is to remove the speckle without smearing the actual numbers.
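If you want to see what this kind of cleanup does to the pixels, here is a toy sketch in plain Python. To be clear, this is not GIMP's algorithm: I've swapped in a simple 3x3 median filter for speckle removal plus a linear contrast stretch for faded scans, and it works on a grayscale image represented as a list of rows of 0-255 values. For real batches you'd use GIMP or an image library instead. The function names are made up for the example.

```python
from statistics import median

def median_filter(pixels):
    """Apply a 3x3 median filter to a grayscale image (list of rows, values 0-255)."""
    height, width = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]
    for y in range(height):
        for x in range(width):
            # Gather the neighborhood around (x, y), clipped at the image edges.
            window = [
                pixels[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = int(median(window))
    return out

def stretch_contrast(pixels):
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:
        # Flat image: nothing to stretch.
        return [row[:] for row in pixels]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in pixels]
```

A median filter removes isolated specks without averaging them into their neighbors, which is why it tends to be gentler on digit edges than a blur. The contrast stretch is the scripted equivalent of pulling faded gray text back toward black.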