====== Optical character recognition systems ([[wp>OCR]]) ======

===== Libraries =====

  * [[http://www.gnu.org/software/ocrad/|OCRAD]]
  * [[http://jocr.sourceforge.net/index.html|GOCR]]
  * [[googlecode>p/tesseract-ocr/|Tesseract]] (and [[http://tess4j.sourceforge.net/|Tess4J]])
    * [[googlecode>p/tesseract-ocr/wiki/TrainingTesseract3|Training Tesseract]]
    * [[http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseracticdar2007.pdf|An Overview of the Tesseract OCR Engine]]
    * [[http://vorba.ch/2014/tesseract-3.03-vs2013.html|How to build Tesseract 3.03 with Visual Studio 2013]]
    * [[https://github.com/charlesw/tesseract/tree/master/src/lib/TesseractOcr|Pre-built x32/x64 Tesseract 3.02 DLLs]]
    * [[habrahabr>262483|Building Tesseract OCR under MinGW]] (in Russian)
  * [[https://launchpad.net/cuneiform-linux|Cuneiform]] ([[http://cognitiveforms.ru/products/cuneiform/|CuneiForm on CognitiveForms]], [[http://www.cuneiform.ru/products/index.html|original web site]])
  * [[http://www.aspose.com/java/ocr-component.aspx|Aspose.OCR for Java]]
  * [[googlecode>p/ocropus/|OCRopus]]
  * [[http://www.claraocr.org/|Clara OCR]]
  * [[http://www.topocr.com/|TopOCR]] -- free OCR for digital cameras
  * [[habrahabr>172651|Building an optical recognition system for structural information, using Imago OCR as an example]] (in Russian)
  * [[http://www.beyondrecognition.net/|BeyondRecognition]]
  * [[http://www.digitisation.eu/tools/|IMPACT OCR Engines]]
  * [[stackoverflow>1813881|Java OCR implementation]]
  * [[http://asprise.com/product/ocr/index.php?lang=java|Asprise OCR SDK v4.0 for Java]]
  * [[stackoverflow>5656462|OCR error correction algorithms]]

[[http://www.linkedin.com/groupItem?view=&srchtype=discussedNews&gid=1928744&item=131187252&type=member&ut=1gT7_ZE2nBIRk1|Chinese OCR]]:

  * [[http://www.novodynamics.com/products/software/image-plus/|MovoImage]]
  * [[http://www.irislink.com/c2-2115-189/Readiris-14--OCR-Software--Scan--Convert---Manage-your-Documents-.aspx|Readiris]]
  * [[http://www.newsoftinc.com/products/product_page.php?P_Id=46|NewSoft OCR]]

==== Image pre-processing ====

  * [[habrahabr>218195|ImageMagick script]] -- processes photos of a classroom whiteboard, stripping everything but the "content" (in Russian)
  * [[habrahabr>273159|A non-local algorithm for image smoothing]] (in Russian)
  * [[habrahabr>278435|Image binarization: the Bradley algorithm]] (in Russian)

See also topics about images: {{topic>image}}

===== Services =====

  * [[http://www.abbyy.com/recognition_server/|ABBYY Recognition Server]]
  * [[http://ocrsdk.com/|ABBYY Cloud OCR SDK]]
  * [[http://www.abbyy.com/screenshot_reader/|ABBYY Screenshot Reader]]
  * [[http://www.ocrterminal.com/|OCR Terminal]]((is based on [[http://finereader.abbyy.com/|ABBYY engine]]))
  * [[http://www.wisetrend.com/wisetrend_ocr_cloud.shtml|WiseTREND OCR Cloud]]((is based on [[http://finereader.abbyy.com/|ABBYY engine]]))
  * [[http://saaspose.com/docs/display/ocr/|Saaspose.OCR REST service documentation]]
  * [[http://www.onlineocr.ru/support/SupportMain.aspx|Online OCR]] -- online text recognition service (in Russian)
  * [[http://maggie.ocrgrid.org/nhocr/|NHocr]]
  * [[http://www.newocr.com/|NewOCR]]
  * [[http://weocr.ocrgrid.org/|WeOCR]]
  * [[http://blog.evernote.com/tech/2013/07/18/how-evernotes-image-recognition-works/|Evernote]]

===== Products =====

  * [[http://www.novodynamics.com/novoverus/|NovoVerus]] -- advanced OCR with global language support
  * [[http://www.aspose.com/categories/product-family-packs/aspose.ocr-product-family/default.aspx|Aspose.OCR]] for Java and .NET
  * [[habrahabr>255699|Label recognition technology, using IKEA tags as an example]] (in Russian)
  * [[itunes>id972688836|SmartHelper]]

===== Comparisons =====

  * [[http://rus-linux.net/nlib.php?name=/MyLDP/office/OCR/OCR_review.html|Optical text recognition systems on Linux: a review and comparative testing]] (in Russian)
  * [[http://www.mathstat.dal.ca/~selinger/ocr-test/|Ocrad, GOCR, Tesseract comparison]]
  * [[http://www.splitbrain.org/blog/2010-06/15-linux_ocr_software_comparison|ABBYY OCR, Cuneiform, GOCR, Ocrad, Tesseract comparison]]
  * [[http://wellcomedigitallibrary.blogspot.co.at/2012/07/ocring-typescript-benchmarking-test.html|OCRing typescript: A benchmarking test with PRImA]] -- a comparison of ABBYY FineReader and Tesseract on a selection of 20 documents
  * [[wp>List of optical character recognition software]]

===== Books =====

  * [[http://www.kanungo.com/ocr.html|Tapas Kanungo's Optical Character Recognition (OCR) Page]]
  * [[http://www.aiim.org/Shop/Product/2132|Optical Character Recognition Checklist]]
  * [[http://books.google.com/books/about/Handbook_of_character_recognition_and_do.html?id=yn6DN5hAPywC|Handbook of character recognition and document image analysis]]
  * [[http://www.amazon.com/Reading-Brain-Science-Evolution-Invention/dp/B003H4RAOU/|Reading in the Brain: The Science and Evolution of a Human Invention [2009]]] by Stanislas Dehaene
  * [[http://openlibrary.org/books/OL21133376M/Using_forms_automation_to_boost_enterprise_productivity|Using forms automation to boost enterprise productivity [2006]]] (ISBN: 0892584106)
  * [[http://www.amazon.com/Character-Recognition-Systems-Students-Practitioners/dp/0471415707|Character Recognition Systems: A Guide for Students and Practitioners [2007]]] by Cheriet, Kharma, Liu, Suen (ISBN: 0471415707)
  * [[http://www.amazon.com/Handbook-Character-Recognition-Document-Analysis/dp/981022270X|Handbook of Character Recognition and Document Image Analysis [1997]]] by H. Bunke, P.S.P. Wang (ISBN: 981022270X)
  * [[http://www.stephenvrice.com/images/AT-1996.pdf|The Fifth Annual Test of OCR Accuracy [1996]]] by Rice, Jenkins, and Nartker
  * [[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.40.8661|Recognition Of Handwritten Numerals Using Elastic Matching [1995]]] by Patrice Scattolin
  * [[http://www.dlib.org/dlib/july09/munoz/07munoz.html|Measuring Mass Text Digitization Quality and Usefulness]] (2009) by Simon Tanner et al.

{{tag>OCR cpp image}}