What results to expect when using AI to automate document processing

What OCR is, why paperwork automation is rarely used, and what we have learned in the process of developing our own RPA solution.

Artificial intelligence covers more and more areas nowadays. It checks pizza readiness, searches for criminals and analyzes the origin of the Big Bang. In this article, we will talk about a much more everyday use case: how to automate document flow and cut processing time to a few seconds, and why the growth of almost any company may stall if you do not.

OCR as part of RPA — what’s that?

As paperwork grows (endless document preparation, contracts, filings, agreements, and forms), back-office costs grow, too. The department needs management of its own and costs a pretty penny. Robotic process automation (RPA) can dramatically decrease your expenses by optimizing processes without changing their structure. At the heart of RPA-based document processing is optical character recognition, or OCR for short.

Let’s imagine you’re taking out a loan. Your passport is quickly scanned, the data is checked automatically, and after a few minutes, your application is approved. This becomes possible once routine employee actions, such as manual data entry, are removed. That is exactly what OCR technology does. It recognizes the picture, divides it into separate fields in a second, extracts the necessary data and automatically enters it into forms, contracts, CRM, applications, and so on. This way, manual work is kept to a minimum while all document processing speeds up dramatically.
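The last step of that flow, turning recognized text into form fields, can be sketched as follows. This is a minimal illustration, not our production code: the sample lines, the label names and the `extract_fields` helper are all assumptions, and the OCR engine that produces the lines is left out.

```python
import re

# Hypothetical output of an OCR engine: one string per recognized line.
ocr_lines = [
    "Surname: DOE",
    "Given names: JANE",
    "Passport No: X1234567",
    "Date of birth: 1990-04-12",
]

# Illustrative mapping from printed labels to form field names.
FIELD_LABELS = {
    "Surname": "last_name",
    "Given names": "first_name",
    "Passport No": "passport_number",
    "Date of birth": "birth_date",
}

def extract_fields(lines):
    """Turn 'Label: value' lines into a dict keyed by form field name."""
    record = {}
    for line in lines:
        match = re.match(r"\s*([^:]+):\s*(.+?)\s*$", line)
        if not match:
            continue  # not a labeled line, skip it
        label = match.group(1).strip()
        if label in FIELD_LABELS:
            record[FIELD_LABELS[label]] = match.group(2)
    return record

# The resulting dict is what would be written into a form or CRM record.
form = extract_fields(ocr_lines)
print(form)
```

Real documents rarely carry such clean labels, so a production system would rely on field positions or a trained model rather than a regular expression.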

As a result, the technology reduces back-office costs, in some cases to the point where the department can be closed entirely, and at the same time increases the satisfaction of operators: now they can devote more time to customers.

Use cases

In everyday life, OCR is used in a variety of cases. Here are some of them:

automatic bank card reading;
instant passport recognition;
autofilling payment data in your online account;
quick entry of data into contracts;
reconciliation of customer data from different sources;
autocompletion of CRM records;
and much more. A bit later, we’ll come back to that.

Now, let’s talk about the downsides of the technology.

Text recognition accuracy

The first variation of OCR was invented in 1950 in the United States. Today, many market players offer it, but while developing our own technology and doing customer development, we realized that the existing solutions are not suitable for every client. Here’s why.

Currently, the quality of text recognition in, say, an ID does not exceed 85%. The algorithm still makes errors when processing blurry, overexposed or creased pictures. All these factors degrade quality and prevent the system from recognizing the text correctly. To improve it, we have implemented two new features in the technology.

Context analysis. The recognized text is additionally run through a neural network that is trained to consider the context and to correct errors automatically. This is very similar to how search engines correct typos when you google something.
Human-in-the-loop. The text extracted by the system is passed for verification to the qualified crowdworkers we have on board. They complement the work of the AI and eliminate possible errors. The combination of the algorithm and human work increases recognition accuracy from 85% to 99% on any text. The cherry on the cake is that manual verification also handles handwritten text: it teaches the algorithm to find and correct errors, and over time, the quality of recognition increases while costs remain the same.
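The two features can be illustrated with a small sketch. It is a simplification under stated assumptions: fuzzy matching from Python's standard `difflib` stands in for the neural network, and the vocabulary, the per-field confidence score and the 0.90 threshold are made up for the example.

```python
import difflib

# Assumed vocabulary of values this field may contain.
KNOWN_CITIES = ["LONDON", "BERLIN", "MADRID", "LISBON"]

# Illustrative threshold; a real one would be tuned on labeled data.
REVIEW_THRESHOLD = 0.90

def correct_with_context(text, vocabulary, cutoff=0.8):
    """Replace text with the closest vocabulary entry if it is similar enough."""
    matches = difflib.get_close_matches(text, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else text

def route(field_text, confidence):
    """Return the corrected text and whether a human should verify it."""
    corrected = correct_with_context(field_text, KNOWN_CITIES)
    needs_review = confidence < REVIEW_THRESHOLD
    return corrected, needs_review

# A low-confidence field is corrected automatically and still flagged for review.
print(route("L0NDON", 0.72))
print(route("BERLIN", 0.98))
```

Only the flagged fields reach a human, and the corrections they make can be stored as training data for the automatic step.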

Data security

Because we use the human-in-the-loop concept and text is recognized on our infrastructure, the question of data transfer and proper storage arises. How do we guarantee data safety to customers? We anonymize the data and avoid storing it on our servers. Recognition can also be done using the client’s servers and staff.

Here is an example. The algorithm blurs the picture and splits the passport into several fields on the client side. The information gets to our servers in an anonymous form: it is impossible to determine which field belongs to a particular person. The fields are recognized separately and are sent back to the client using HTTPS encryption. The whole process takes less than a second.
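The splitting step can be sketched like this. It is a minimal illustration that assumes each field has already been cropped out of the image on the client side; blurring is omitted, and the `anonymize` and `reassemble` helpers are hypothetical names, not part of our API.

```python
import random
import uuid

def anonymize(fields):
    """Split a record into unlabeled snippets with random IDs.

    Returns (payload, key). The payload is what leaves the client;
    the key maps random IDs back to field names and never leaves it.
    """
    key = {}
    payload = []
    for name, image_crop in fields.items():
        token = uuid.uuid4().hex
        key[token] = name
        payload.append({"id": token, "image": image_crop})
    random.shuffle(payload)  # remove ordering that hints at the field layout
    return payload, key

def reassemble(recognized, key):
    """Map recognized snippets back to their fields, on the client only."""
    return {key[item["id"]]: item["text"] for item in recognized}

# Placeholder bytes stand in for the cropped field images.
payload, key = anonymize({"last_name": b"crop-1", "passport_number": b"crop-2"})
print(payload)
```

Because the server only ever sees random IDs and isolated snippets, it cannot tell which snippets belong to the same document.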

You can connect the technology via a REST API, which is very simple because almost all systems support it.
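A call to such an API could look roughly like the sketch below. The endpoint URL, the JSON shape and the bearer-token header are assumptions made for illustration, not the actual interface; only the standard library is used.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the real one from the API documentation.
API_URL = "https://ocr.example.com/v1/recognize"

def build_request(payload, api_key):
    """Prepare an HTTPS POST request carrying the anonymized field snippets."""
    body = json.dumps({"fields": payload}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

request = build_request([{"id": "abc123", "image": "<base64 crop>"}], "YOUR_API_KEY")

# Sending is left commented out because the endpoint above is a placeholder:
# with urllib.request.urlopen(request, timeout=5) as response:
#     result = json.load(response)
```

Any language with an HTTP client can do the same, which is what makes this kind of integration simple.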

In a nutshell

OCR as part of RPA reduces back-office costs and can dramatically speed up company performance. Even when AI works with personal data, you need not worry about the results, as there are solutions that guarantee complete data safety and text recognition quality of up to 99%. The technology is applicable to a variety of everyday cases that will be detailed in our next article.