The History of OCR
Optical character recognition, or OCR, is commonly known as the technology that can distinguish printed or handwritten text inside digital images of a physical document. It is used as a form of data entry for printed documents such as passports, invoices, bank statements or mortgage documents. Once text is digitised, it can be easily searched within a document and stored for key administrative tasks such as invoicing or sales processing.
OCR’s development began in the early 20th century, and the technology continues to evolve even now in 2022. OCR’s roots go back as far as 1914, when Emanuel Goldberg invented a machine capable of reading characters and converting them into telegraph code. He used movie projector technology to handle microfilm and a photoelectric cell for pattern recognition in finding the right record. Goldberg continued to improve his early OCR technology over the following years by developing a “statistical machine.” This device was, in effect, the world’s first search engine, and used OCR to search microfilm archives for particular patterns of characters. The U.S. patent for his “statistical machine” was later acquired by IBM.
Moving roughly 60 years ahead, Kurzweil Computer Products, Inc., founded in 1974, created an omni-font OCR product. This was originally created as a reading machine for the blind. Fast forward another 30 years and OCR had grown in popularity; in the 2000s, it was made available online as a service. OCR’s scope broadened and the recognition algorithms became more sophisticated, with optical scanners handling higher resolutions. A good example of this is that a photo of a passport could be processed by an OCR app, which could recognise the name and passport number to verify credentials.
Current State of Document Processing
In the digital age, with its focus on automation, OCR has advanced to a new level, introducing intelligent document processing (IDP). IDP grew out of OCR and now serves as the foundation for the next generation of AI-driven process automation. It has been designed to deal with an influx of data in a more intelligent way, by extracting the most important information and managing and storing it more efficiently. IDP uses the best of OCR and applies further advancements such as AI and machine learning to expand its document processing capabilities to semi-structured, unstructured and handwritten documents.
The ability to handle unstructured data is becoming increasingly important to organisations, as they seek to move away from traditional paper-based documentation. According to a new report by nRoad, analysts predict the global datasphere will grow to 163 zettabytes (1 zettabyte = 1,000,000,000,000 gigabytes) by 2025, and about 80% of that will be unstructured.
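To put those figures in perspective, the arithmetic can be worked through in a few lines of Python. This is only an illustration, assuming decimal (SI) units; the variable names are invented for the example and the inputs are simply the numbers quoted above.

```python
# Decimal (SI) units: 1 zettabyte = 10**21 bytes, 1 gigabyte = 10**9 bytes.
BYTES_PER_ZB = 10**21
BYTES_PER_GB = 10**9

# Gigabytes in one zettabyte (integer division keeps the result exact).
gb_per_zb = BYTES_PER_ZB // BYTES_PER_GB

projected_zb = 163          # forecast datasphere size quoted above
unstructured_share = 0.80   # "about 80%" quoted above

# Estimated volume of unstructured data, in zettabytes.
unstructured_zb = projected_zb * unstructured_share

print(f"1 ZB = {gb_per_zb:,} GB")
print(f"Unstructured data by 2025: about {unstructured_zb:.1f} ZB")
```

On these numbers, roughly 130 of the 163 zettabytes would be unstructured.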
The biggest difference between OCR and IDP is in the results. OCR can successfully process only a small percentage of documents because of its limited capabilities: it works with machine-printed text that has sufficient spacing between words, and only with high-quality documents. Given the complexity of the use cases we see in today’s world, OCR routes the majority of the work to humans to extract and validate the data.
Meanwhile, IDP can fully automate the processing of 80%+ of documents, flagging only a handful of documents for humans to handle. IDP embeds machine learning and natural language processing (NLP) to handle more than OCR can: semi-structured and unstructured documents, handwritten text, images, barcodes, stamp detection, signature matching and so on.
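One common way to decide which documents get flagged for a human is a confidence threshold on the extracted fields. The sketch below is a minimal illustration of that idea, not a description of any particular IDP product; the `ExtractedField` class, the `REVIEW_THRESHOLD` value and the sample batch are all hypothetical.

```python
from dataclasses import dataclass

# Assumed cut-off: fields scored below this go to a human reviewer.
REVIEW_THRESHOLD = 0.85


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical extraction model


def needs_human_review(fields, threshold=REVIEW_THRESHOLD):
    """Return True if any extracted field falls below the threshold."""
    return any(field.confidence < threshold for field in fields)


def split_batch(batch, threshold=REVIEW_THRESHOLD):
    """Split document IDs into fully automated and flagged-for-review lists."""
    automated, flagged = [], []
    for doc_id, fields in batch.items():
        if needs_human_review(fields, threshold):
            flagged.append(doc_id)
        else:
            automated.append(doc_id)
    return automated, flagged


# Made-up sample data for illustration only.
batch = {
    "invoice-001": [ExtractedField("total", "1,250.00", 0.98),
                    ExtractedField("supplier", "Acme Ltd", 0.95)],
    "invoice-002": [ExtractedField("total", "8?0.00", 0.41),
                    ExtractedField("supplier", "Acme Ltd", 0.93)],
}

automated, flagged = split_batch(batch)
print("Automated:", automated)
print("Flagged for review:", flagged)
```

Raising the threshold sends more documents to people and lowers the risk of bad data passing through; lowering it does the opposite.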
Some of the ideal use cases for IDP are email processing, compliance, invoice processing, claims processing and so on. These use cases have something in common, which is why IDP excels at them. They all have identifiable information and clear business rules. Users are able to train the IDP tool to identify and correctly route documents without the need to read them. This eliminates the need for a user to manually enter the data and automates it entirely.
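The combination of identifiable information and clear business rules can be pictured as a small routing table. The following is a simplified sketch under assumed names: the document fields (`type`, `amount`), the queue names and the 10,000 approval limit are all invented for illustration, and a real IDP tool would learn or configure these rather than hard-code them.

```python
# Each rule pairs a condition on the extracted data with a destination queue.
# Rules are checked in order, so the more specific rule comes first.
ROUTING_RULES = [
    (lambda doc: doc.get("type") == "invoice" and doc.get("amount", 0) > 10_000,
     "finance-approval"),
    (lambda doc: doc.get("type") == "invoice", "accounts-payable"),
    (lambda doc: doc.get("type") == "claim", "claims-processing"),
]

# Anything that matches no rule falls back to a person.
DEFAULT_QUEUE = "manual-triage"


def route(doc):
    """Return the queue name for a document based on the first matching rule."""
    for condition, queue in ROUTING_RULES:
        if condition(doc):
            return queue
    return DEFAULT_QUEUE


print(route({"type": "invoice", "amount": 25_000}))
print(route({"type": "invoice", "amount": 300}))
print(route({"type": "claim"}))
print(route({"type": "newsletter"}))
```

Because the rules read only the extracted fields, nobody has to open the document to decide where it goes.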
The Future of Document Processing
Document processing continues to evolve and AI is playing a major part in advancing this field. There are two main areas to keep an eye out for in the coming years.
Firstly, as the number of use cases and the variability of document quality grow for IDP, AI models will need to keep up. The knowledge base of these AI models will need to continue to adapt, learn and improve so that they can handle such variants. This will include reading highly complex tables and processing government-issued IDs with holograms or watermarks. IDP will be pushed to ensure it remains accurate, timely and relevant.
Secondly, the use of audio and video is becoming increasingly popular, and it is only a matter of time before organisations start recognising this as a potential fit for IDP. Use cases such as insurance claims and police incidents are two that will be on the rise in the coming year, if they are not already here.
ISG helps enterprises understand and leverage the latest IDP technologies to make smart IT decisions and improve business operations. Contact us to find out how we can help you.
About the author
Sohaib Waqar is a Principal Consultant at ISG.