Automated data entry with OCR and machine learning

The need for automated data entry software is on the rise. “Why?”, you might ask. Well, today’s businesses run on data, and having quick and reliable access to data is a significant competitive advantage. If you know better than your competitor where to optimize and streamline processes, you’re on the winning team.

Did you miss this bandwagon? Is your company’s data still hidden inside PDF files or, even worse, in printed paper documents? Does your team devote valuable time to manually copy-pasting data from these sources?

Fortunately, it’s not too late for a change! Recent technological developments in OCR and machine learning make it easier to turn away from manual data entry, using smart software solutions that automatically extract data from images and documents.

In this blog, we will explain the concept of automated data entry, discuss its advantages and elaborate on the most common use cases for data entry automation. In the end, you will know exactly how to automate your data entry processes with OCR and machine learning.

Manual data entry

Do companies still enter data manually?

The short answer is yes. Manual data entry is a centuries-old practice, and one that has proven to be quite persistent.

In a 2020 study on the state of automation within the manufacturing industry, for example, nearly half (48 percent) of the surveyed companies indicated that they still used spreadsheets or other manual data entry documents.

One of the reasons why data entry is not completely automated is habitual behavior. It may have become a habit to spend a day per week on manual data entry. Other reasons are a lack of insight into the actual costs of the process and too little knowledge of automation tools. After all, manually copying data is a process that requires little effort to implement, use and maintain.

We see this, however, as the high price of short-term savings. The more manual data entry your company has to carry out, the more it’s costing you over time. According to Goldman Sachs, the total costs of manual, paper-based invoice processing even amounted to $2.7 trillion annually, which is a big burden in terms of time and money.

With an automated solution, these costs can be lowered by 50-70%, while some studies even estimate cost reductions of up to 80%. That seems to be a good deal, so let’s have a closer look at the concept of data entry automation!

What is automated data entry?

Automated data entry can best be described as the use of software to remove repetitive manual administrative tasks from employee workloads, saving companies time and money to improve performance and increase revenue.

Automated data entry software reads information from a difficult-to-access source, such as a PDF file or printed document, and passes the data on to another system or data store (e.g. a database or spreadsheet).

Several techniques can be used in this process. Broadly speaking, we can distinguish between:

  • Rule-based techniques
  • Machine learning techniques

Rule-based techniques

The traditional rule-based approach looks for data in very specific locations on a document, dictated by pre-defined logic and rules. These rules are coded into a system in the form of if-then-else statements.

The main idea of a rule-based system is to capture the knowledge of a human expert in a specialized field (e.g. accounts payable) and incorporate it into a computer system. That’s it. No more, no less. It’s a bit like a human being born with fixed knowledge.

Rule-based systems have been around for many years and have been reasonably effective in converting data from documents and thereby reducing manual data entry.

However, while rule-based techniques have worked well with highly structured forms, they struggle with semi-structured documents (e.g. invoices and receipts) and unstructured documents (e.g. the body of an e-mail). These kinds of documents are less predictable and are therefore not well suited for the rule-based approach.
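To make this concrete, here is a minimal Python sketch of a rule-based extractor. The field names, patterns and sample texts are our own illustrative assumptions, not taken from any real system. It shows why fixed rules work on a predictable layout and fail as soon as the wording changes:

```python
import re

# Hypothetical rules: each field is found with one fixed, hand-written pattern.
RULES = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number)\s*[:#]?\s*(\w[\w-]*)", re.IGNORECASE
    ),
    "total": re.compile(
        r"Total\s*:?\s*€?\s*(\d+(?:[.,]\d{2})?)", re.IGNORECASE
    ),
}


def extract_with_rules(text):
    """Apply every rule; if a rule matches, keep the value, else store None."""
    result = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            result[field] = match.group(1)
        else:
            result[field] = None
    return result


# A layout the rules were written for.
structured = "Invoice No: INV-2041\nTotal: € 121.00"
# The same information, worded differently.
reworded = "Reference INV-2041\nAmount due: 121.00 EUR"

print(extract_with_rules(structured))
print(extract_with_rules(reworded))
```

The first document yields both fields. The second one contains the same data, but neither rule fires, so someone would have to write and maintain a new rule for every new layout.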

Machine learning techniques

In contrast to the rule-based approach, machine learning systems have adaptive intelligence. They interpret and identify patterns from huge amounts of data, which can then be used to learn and improve from experience. This allows software applications to become more accurate in predicting outcomes over time without being explicitly programmed for it.

A simple example would be the word suggestions you receive on your phone while you are typing. These word suggestions are made based on the inputs in the past and predict what you might want to say at that moment.

If a machine learning technique is used for a data entry program, a similar approach is used to find the data to be entered. Based on the data entered in the past, the right data points are found, extracted and automatically entered in the desired system.

As such, machine learning techniques can take on less structured documents, learn the patterns in them and create methods for turning these documents into structured data.
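The word-suggestion example above can be sketched as a toy model in Python. This is a deliberately simple illustration that only counts which word followed which in past inputs. Real keyboards and data extraction models are far more sophisticated, and all names and sample sentences here are made up:

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count, for every word, which words followed it in past inputs."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers


def suggest(followers, word, k=2):
    """Return up to k words that most often followed `word`."""
    return [w for w, _ in followers[word.lower()].most_common(k)]


# Hypothetical typing history.
history = [
    "please send the invoice today",
    "please send the receipt",
    "please send the invoice again",
]
model = train(history)
print(suggest(model, "the"))
```

Nobody wrote a rule saying that “invoice” should come after “the”. The suggestion follows from the examples, and it changes as more examples are added.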

The best of both worlds

Of course, technology is not always perfect, and machine learning is no exception. Although it evolves very quickly, you will usually not reach an accuracy of 100%. Luckily, there are solutions for this as well. By combining software with the power of humans, you get the best of both worlds. This is called human-in-the-loop (HITL) automation.

With an HITL solution you can automate the biggest portion of a procedure, followed by a human review to complete the task. Meanwhile, the software learns from any human-made changes and improves over time.

So while you might start at 90% automation, which is already great, HITL processing will get you as close to 100% automation as possible. Below you can find an example of such a workflow:

Human in the loop workflow
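In code, the routing step of such a workflow might look like the sketch below. We assume here that the extraction model returns a confidence score between 0 and 1 for every field. The 0.90 threshold, the class and the function names are illustrative assumptions, not a description of a specific product:

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.90  # assumed cut-off; tune it per use case


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed model score between 0 and 1


def route(fields, threshold=REVIEW_THRESHOLD):
    """Split fields into auto-accepted ones and ones that need a human check."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


def apply_correction(field, corrected_value, training_examples):
    """Record the human correction so it can serve as a future training example."""
    training_examples.append((field.name, field.value, corrected_value))
    return ExtractedField(field.name, corrected_value, 1.0)


# Hypothetical extraction result for one invoice.
fields = [
    ExtractedField("total", "121.00", 0.98),
    ExtractedField("vat_number", "NL0O1234", 0.62),
]
accepted, needs_review = route(fields)

training_examples = []
fixed = apply_correction(needs_review[0], "NL001234", training_examples)
```

Only the low-confidence field reaches a human, and the correction is stored so that the model can be retrained on it later.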

What are the benefits of data entry automation?

Thanks to the development of automated data entry software that uses OCR and machine learning, there are very few justifications for preferring manual data entry over automation. Automated tools make you more efficient and give you more time to concentrate on more important tasks and strategic direction.

To give you a good understanding of the advantages of data entry automation, we will list the most important ones below:

  • Save loads of time (and thus money)
  • Reduce errors
  • Happier employees
  • Less paperwork

Save loads of time (and thus money)

The biggest selling point of an automated data entry system is the reduction of employee hours. And since time is money, this results in significant cost savings.

There’s no magic formula for this, but studies found that intelligent automation typically results in cost savings of 40% to 75%, with the payback period ranging from several months to several years.

As an example, consider the time needed for manually processing a single invoice. On average, experienced bookkeepers process 50 invoices per hour (i.e. 1.2 minutes per invoice). With an average hourly wage of €35, this means that processing just one invoice costs approximately €0.70.

That same bookkeeper using OCR software is able to process at least 200 invoices per hour (≈ 18 seconds per invoice). This leads to a processing cost of approximately €0.18 per invoice. Add the price of the software of €0.05 per invoice (i.e. the rate for Klippa’s OCR technology) and you get a total cost of €0.23 per invoice. That’s a direct improvement of more than 60%!
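The arithmetic above can be reproduced with a few lines of Python, which also makes it easy to plug in your own numbers. The wage, throughput and software fee are the figures from this example. The function and variable names are our own:

```python
HOURLY_WAGE = 35.00   # euros per hour, from the example above
SOFTWARE_FEE = 0.05   # euros per invoice, from the example above


def cost_per_invoice(invoices_per_hour, hourly_wage=HOURLY_WAGE, fee=0.0):
    """Labour cost per invoice plus an optional per-invoice software fee."""
    return hourly_wage / invoices_per_hour + fee


manual = cost_per_invoice(50)                         # 35 / 50
automated = cost_per_invoice(200, fee=SOFTWARE_FEE)   # 35 / 200 + 0.05
saving = (manual - automated) / manual

print(f"manual: {manual:.3f}, automated: {automated:.3f}, saving: {saving:.1%}")
```

Without intermediate rounding, the automated cost is €0.175 + €0.05 = €0.225 per invoice, so the saving comes to roughly two thirds of the manual cost.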

Reduce errors

We all make mistakes, whether we like it or not. Error rates in manual data entry typically range from 0.55% to 3.6%, with outliers as high as 26.9%.

Luckily, automated data entry using machine learning can significantly lower these error rates by eliminating the risk of distractions, keystroke errors and other mistakes commonly found in manual data entry. This translates into better, more accurate data that can be used to make well-informed business decisions.

Happier employees

Manual data entry can be a very time-consuming and boring process for employees. Excessive data entry work can cause physical and psychological issues as well, such as eye strain, carpal tunnel syndrome, tenosynovitis and emotional stress.

Automation, on the other hand, has been found to improve employee satisfaction and engagement, allow employees to focus on more meaningful and valuable tasks and give instant gratification from daily work. That sounds like a pretty good deal, right?

Less paperwork

Managing data manually can take a lot of resources, such as filing cabinets, printers, ink and office space. With the right data entry software, you can free up these resources and use them for what really matters.

Meanwhile you’re saving a lot of trees and preparing your business for a sustainable future.

Use cases for automated data entry with OCR and machine learning

In principle, any high-volume repetitive task that includes data entry can be automated. We will highlight a few use cases below to inspire you to start looking for similar procedures within your own organization:

  • Invoice processing and accounts payable
  • Purchase/sales order processing
  • HR and recruitment
  • Loyalty campaigns
  • Know Your Customer (KYC) automation

Invoice processing and accounts payable

Invoice processing is a textbook example of a process that screams for automation. Its repetitive nature, especially for recurring invoices, and large volumes lead to many hours of boring work for employees.

Accounts payable automation can lend you a helping hand here. Some solutions, for example, read invoices with OCR and AI and extract information themselves, which means no more data entry at all.

These solutions process invoices when they come in. All the invoice data is automatically parsed and placed in the respective fields in the accounting or ERP software. It is even possible to let the software perform soft decision making for things such as invoices for food or travel. This minimizes the human input, making the process less prone to errors.

Purchase/sales order processing

Another area of the Finance department that can be automated is purchase or sales order processing. While sales representatives spend a lot of time on tracking down all the client data and entering it into the CRM and ERP system, the Finance department has to replicate all the data and enter it into the accounting system.

Of course, this process is never flawless and can result in duplicates, which are detrimental to productivity. But if you use an automated system with OCR and machine learning instead, you can perform sales activities end-to-end by automating tasks such as sales order entry and invoicing. This will leave you with a clean database and improve customer experience.

HR and recruitment

Ask your HR colleagues about the most repetitive and time-consuming task they have and they will most likely answer that it’s processing payrolls and payslips. Every month, they have to make sure that everyone is paid correctly and on time. This also involves filing reports and paying employment taxes to the tax authorities.

With the right automated solution, however, they will breeze through these tasks. They will be assured of consistent employee data across all different enterprise systems, and they will be able to validate timesheets and easily load or update earnings and deductions.

A software solution can also be of great assistance with less structured documents, such as resumes. If it includes OCR and machine learning, you can filter and categorize incoming resumes based on a set of rules and keywords. This lets you identify the most interesting applications at a glance and filter out unsuitable ones.
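As a rough illustration, keyword-based filtering of resume text (after OCR has turned the document into plain text) could look like this in Python. The keyword sets and the function name are hypothetical, and the matching is a naive substring check:

```python
# Hypothetical screening rules for one vacancy.
REQUIRED_KEYWORDS = {"python", "sql"}
NICE_TO_HAVE = {"ocr", "machine learning"}


def score_resume(text):
    """Return (meets_requirements, bonus_count) for OCR-extracted resume text.

    Matching is a simple case-insensitive substring check, so "sql" would
    also match "mysql". A real setup would need smarter matching.
    """
    lowered = text.lower()
    meets = all(keyword in lowered for keyword in REQUIRED_KEYWORDS)
    bonus = sum(1 for keyword in NICE_TO_HAVE if keyword in lowered)
    return meets, bonus


print(score_resume("Experienced in Python, SQL and OCR pipelines"))
print(score_resume("Java developer"))
```

Resumes that meet the requirements can then be sorted by their bonus count, so the most relevant applications appear at the top.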

And that’s not all. Even the onboarding process can go a lot faster. With the help of OCR and machine learning, you can extract all the data from forms and identity documents that new employees send over, and place it directly into your HR system.

Loyalty campaigns

Loyalty programs come in many forms, but they all have one thing in common: they involve a lot of back-office work. Most often, this consists of checking and validating a receipt, entering the data into a system and releasing the reward. With an old-fashioned, manually-operated process in place, it may take days or even weeks before this is carried out.

With the Klippa OCR SDK, on the other hand, you can simply use your mobile phone to take a picture of the receipt and let the software do the work. The data from the receipt is extracted, interpreted and processed within seconds!

This will not only lead to faster payouts and happier customers, but you will also be better able to deal with peak volumes, make fewer mistakes and experience fewer fraud incidents, such as accepting duplicate or photoshopped receipts.
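To give an idea of how duplicate receipts can be caught once the data has been extracted, here is a minimal Python sketch. It is one possible approach under our own assumptions (merchant, date and total are enough to identify a receipt), not a description of how any particular product does it:

```python
import hashlib


def receipt_fingerprint(merchant, date, total):
    """Build a stable fingerprint from normalized receipt fields."""
    normalized = "|".join(
        [merchant.strip().lower(), date.strip(), f"{float(total):.2f}"]
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateChecker:
    """Remembers fingerprints of receipts that were already submitted."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, merchant, date, total):
        """Return True if this receipt was seen before; otherwise remember it."""
        fingerprint = receipt_fingerprint(merchant, date, total)
        if fingerprint in self._seen:
            return True
        self._seen.add(fingerprint)
        return False


checker = DuplicateChecker()
first = checker.is_duplicate("Supermarket X", "2023-01-05", "12.5")
second = checker.is_duplicate(" supermarket x ", "2023-01-05", 12.50)
```

Here `first` is `False` and `second` is `True`: the second submission differs only in spacing, capitalization and number formatting, so it is recognized as the same receipt.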

If you want to know more about automating loyalty campaigns, we highly recommend one of our other blogs, which is entirely dedicated to this topic.

Know Your Customer (KYC) automation

Many companies, especially in the rental, telecom, banking and insurance industries, are obligated to verify their customers’ identity. They have to do this to comply with KYC regulations.

An old-school example of this process is going to a bank, presenting your ID card and signing some forms. Subsequently, a banking employee will check this data and enter it into their systems. As you can imagine, this is a very costly and hard-to-scale process. Outsourcing to low-wage countries might be an option, but then questions arise about privacy regulations and the risk of data leaks.

Automating KYC checks is of course the better choice. Rather than entering data manually, you can scan a customer’s ID or passport by taking a picture with your smartphone, combine this with a selfie and a signature and the software will do the rest. The authenticity of the scans is automatically determined and all the required data points are extracted.

Learn more about document-based KYC checks with OCR and AI in one of our other blogs.

These are only a few of the many applications of data entry automation software in practice. We hope that they have inspired you to explore how such a solution could work for your company. In the final section of this blog, we will cover how you can automate your data entry tasks with Klippa.

How to automate your data entry with Klippa

OCR and machine learning are at the base of our data entry automation software. With OCR, we can identify the text in documents and images. As soon as we have the text, we can start gaining an understanding.

By using machine learning, we can identify data points that are interesting within the text and mimic human behaviour by learning from previous examples.

OCR text recognition

Let’s have a look at an example of what our technology can do for you:

Simplify credit card payments

Through OCR, the pixels that contain text are identified and converted into digital text, so OCR directly replaces the act of manually copying data. All text is extracted with an accuracy of more than 95%, whereas manual data copying would be significantly less accurate and take much more time.

Data extraction and structured output

Next, our machine learning model comes into play.

As we’ve outlined, machine learning grows in effectiveness when fed with more and more examples. An AI is trained with numerous examples of documents and specified data sets so that it can automatically locate and identify specific text at a specific position in the document. Over time, our machine learning model can therefore only get better.

In the example above, all data is automatically contextualized and converted to a structured JSON format.
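To show what “structured” means in practice, the Python sketch below turns a handful of extracted values into a JSON document. The field names and values are hypothetical and chosen for illustration only. They are not the actual response format of the Klippa API:

```python
import json


def to_structured_json(fields):
    """Map loosely extracted values onto a fixed, hypothetical invoice schema."""
    document = {
        "document_type": "invoice",
        "merchant_name": fields.get("merchant_name"),
        "invoice_number": fields.get("invoice_number"),
        "date": fields.get("date"),
        "total_amount": fields.get("total_amount"),
        "currency": fields.get("currency"),
    }
    return json.dumps(document, indent=2, sort_keys=True)


# Hypothetical values found by the extraction step.
extracted = {
    "merchant_name": "Example Supplies B.V.",
    "invoice_number": "INV-2041",
    "date": "2023-01-05",
    "total_amount": "121.00",
}
print(to_structured_json(extracted))
```

Because every document ends up in the same schema, with missing values set to `null`, downstream systems always know where to find each data point.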

Automated data entry

Last but not least, you want to have this data entered into the right fields in your system. Below you can see an example of one of our data entry interfaces, which your team can use in a human-in-the-loop setup to process the documents that could not be handled fully automatically. In this case, all the relevant data from the invoice is read, extracted and already entered into the corresponding fields, ready for a final human check.

This interface by itself can already greatly improve your current data entry procedures, because it has error prevention and automated suggestions built in.

Since we have many integrations with major ERP and bookkeeping systems, and can link with basically any system through smart import and export functionalities, you can easily pass the data on to other systems that you may use.

It’s also possible to integrate our technologies into your existing workflows. Many companies already use RPA vendors like Automation Anywhere, UiPath, Blue Prism, Mendix or others to automate certain workflows. From a workflow perspective, these solutions are all good. But if you look at their OCR and machine learning capabilities, you will often see that their built-in solutions are not adaptable enough to reach a high degree of automation for your niche use case.

That’s why Klippa offers its technologies as plugins for all major RPA vendors, such as Automation Anywhere, UiPath, or Mendix. Simply switch on our plugin and integrate it into your workflows. Our solutions can be used for classification, data entry and verification on many different use cases. A few examples are automated receipt processing, automated bank card scanning and document-based KYC checks.

Get in touch with Klippa

We hope that we sparked your interest with this blog, but you still might have some questions. Is Klippa the right solution for your business? Can we help you with your unique use case? How difficult is it to implement and start using our software? And so on.

Our experienced product specialists are more than happy to answer all your questions. They can tell you everything about what our solution can do for your company.

You can reach out to us directly or you can plan a free 30-minute demo below, in which we show you how our software works and how it will benefit your organization.

 Schedule a free online demonstration

A clear overview of Klippa in only 30 minutes.