The adoption of Artificial Intelligence (AI) is rapidly growing. According to a survey from McKinsey, AI adoption increased by 50% from 2020 to 2021. Furthermore, the use of AI significantly impacted the bottom line of surveyed companies, with an increase of 22% in profits from the previous year.
Are you among the companies that already use AI to automate workflows? If yes, that is great! There are a lot of awesome technologies out there that can automate many tedious, error-prone, and repetitive tasks.
However, these technologies alone may not always solve the whole puzzle, and using them certainly doesn’t mean that you need to get rid of the human aspect completely.
Take data extraction as an example. Even with the most advanced technology, it is nearly impossible to extract data from documents with 100% accuracy all the time. In some industries, an error rate of just 1% in data extraction can already cost your business millions of euros.
That’s why, in many cases, combining the best of humans and the best of Artificial Intelligence can yield the best results. Such an approach is called Human-In-The-Loop (HITL) automation, which we will discuss in detail within this blog.
So, make sure to stay with us until the end (or skip to the benefits if you already know what HITL is).
Automated vs. Manual workflows
Automating workflows brings a level of operational efficiency that manual workflows cannot always achieve: it saves time, minimizes errors, and reduces overhead costs. Many manual tasks are repetitive, time-consuming, and error-prone, which creates unnecessary overhead costs.
Take, for instance, a corporation that has to deal with a high volume of documents. A document must be verified; back-office staff must scan the document to digitize it for record keeping; then the data entry clerk must extract and enter the data into the desired system; another person must validate that the data has been entered correctly, etc.
Many things can go wrong within this manual workflow, and it’s simply not scalable. This is why organizations often look for solutions to automate these types of document workflows.
For example, solutions like Intelligent Document Processing (IDP) can eliminate many manual tasks by automating data extraction, categorization, conversion, and validation.
Although automating manual workflows solely with IDP sounds like a great idea, there are limitations that even AI and machines aren’t yet able to solve.
For instance, machines alone can’t cope well with complicated workflows and low-quality data input. At worst, their mistakes may eat into your organization’s bottom line.
Imagine a scenario where machines automatically extract data from an invoice and enter €100,000 into a system instead of €10,000. It could lead to a significant financial loss if you don’t have any safeguards to prevent it.
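One possible safeguard for a scenario like this is a simple plausibility check that sends unusual amounts to a human. The sketch below is only an illustration: the function name, the per-supplier history, and the factor of 5 are our own assumptions and do not describe any specific product.

```python
from statistics import median


def needs_human_review(extracted_amount, past_amounts, max_ratio=5.0):
    """Flag an extracted invoice amount that is far above what this supplier usually bills."""
    if not past_amounts:
        # No history to compare against, so let a human look at it.
        return True
    typical = median(past_amounts)
    return extracted_amount > typical * max_ratio


# Hypothetical history of earlier invoices from the same supplier (in euros).
history = [9_500.0, 10_000.0, 10_400.0]

print(needs_human_review(100_000.0, history))  # True: ten times the usual amount
print(needs_human_review(10_000.0, history))   # False: in line with earlier invoices
```

A check like this does not replace validation by a person. It only decides which invoices deserve a second look.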
Often, this fact alone makes fully automated solutions fragile.
Luckily, human involvement can overcome this challenge and help you reach the highest possible accuracy and outcome. Thus, many organizations have adopted human-in-the-loop automation.
But what exactly do we mean by human-in-the-loop automation? Keep reading, and you’ll find the answer in the following sections.
What is human-in-the-loop?
Human-in-the-loop (HITL) usually refers to a mechanism that leverages human interaction to train, fine-tune, or test systems such as AI models or machines in order to get the most accurate results possible.
Take the workflow in supermarkets as an example. Even though many supermarkets have self-scanning machines, there is often an employee (the human in the loop) stationed near these machines.
The employee is there to help customers when necessary and to validate that products have been scanned properly, which helps prevent fraud and theft.
Using self-scanning machines does help cut back the waiting lines and the number of staff supermarkets would need to employ. However, the machines are not flawless enough to be left fully unattended.
Therefore, the human-in-the-loop approach in these types of situations works best.
Next, let’s have a look at human-in-the-loop automation with AI solutions.
Human-in-the-loop & AI
Even though modern technologies are advanced, they are not perfect. Perhaps they can never be “perfected” as goals, needs, and demands change over time, which is why human-in-the-loop automation is so essential in achieving the best possible results.
So how does HITL work in the context of AI? With human-in-the-loop, you can train AI models to become more accurate in identifying, classifying, and predicting an object.
Say you wanted to train AI algorithms to recognize shapes (e.g., squares, circles, and triangles). You would need a human to label images of these shapes correctly.
When the AI makes a mistake in a prediction or identification, the human in the loop is there to correct it. This feedback loop helps improve the accuracy of AI models.
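To make the feedback loop a bit more concrete, here is a minimal sketch in Python. The class and field names are hypothetical, and no real model is trained. The sketch only shows how human-confirmed labels could be collected as new training examples.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackLoop:
    # Human-confirmed (image_id, label) pairs that can later be used for retraining.
    training_examples: list = field(default_factory=list)

    def review(self, image_id, predicted, human_label):
        """Store the human-confirmed label and report whether the model was wrong."""
        self.training_examples.append((image_id, human_label))
        return predicted != human_label


loop = FeedbackLoop()
loop.review("img-001", predicted="circle", human_label="circle")    # model was right
loop.review("img-002", predicted="square", human_label="triangle")  # human corrects it
print(loop.training_examples)
```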
In general, AI models can’t consistently make predictions with 100% confidence.
The same challenge applies to data extraction. Many out-of-the-box OCR solutions can extract data with an accuracy of 97% (99% in rare cases) at best, but the average data extraction accuracy is still around 80% for most solutions.
In most cases, that leaves about 20% of the data inaccurate. Even though such a solution automates most of the manual data entry work, the remaining errors can become a devastating issue for your organization.
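To get a feeling for what these percentages mean in practice, here is a small back-of-the-envelope calculation in Python. The volume of 1,000 extracted fields is an assumption chosen purely for illustration.

```python
def fields_needing_correction(total_fields, accuracy_pct):
    """Number of extracted fields expected to be wrong at a given accuracy (in whole percent)."""
    return total_fields * (100 - accuracy_pct) // 100


print(fields_needing_correction(1_000, 80))  # 200 fields to correct at 80% accuracy
print(fields_needing_correction(1_000, 97))  # 30 fields to correct at 97% accuracy
```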
Luckily, human intelligence can easily fill in this gap.
Human-in-the-loop allows for fast identification of problems and improvements through a feedback loop, which can be referred to as HITL annotation. Without it, errors and mistakes are hard to detect.
Let’s have a look at this process in more detail below.
Human-in-the-loop annotation or data labeling is often part of the process when developing AI models.
AI models require large volumes of raw data (e.g., documents, images, text files, and other objects) to identify objects and make predictions accurately.
Annotating, building data sets, and collecting data require a great deal of time, money, and effort.
So how does it work? A data annotator, a human-in-the-loop, labels datasets that enable the AI models to focus on specific data fields repeatedly until they can optimally recognize and make the best predictions.
For instance, if your organization would like the AI model to recognize and extract the line items from receipts, you may end up feeding the model thousands of labeled receipts to get decent results.
To have a labeled data set to train the AI models, you’d need to collect raw data and build out a team of experts for annotation.
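As an illustration, a single labeled receipt in such a data set could look like the Python sketch below. All class and field names are hypothetical. Real annotation tools use their own formats.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float


@dataclass
class LabeledReceipt:
    image_path: str   # the raw data: a scanned or photographed receipt
    line_items: list  # the labels added by the human annotator
    annotator: str    # who labeled it, useful for quality checks


example = LabeledReceipt(
    image_path="receipts/0001.jpg",
    line_items=[LineItem("Cola 1.5L", 2, 1.89), LineItem("Bread", 1, 2.49)],
    annotator="annotator-01",
)
print(len(example.line_items))  # 2
```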
So why would you prefer HITL automation when there are solutions that can get you to 97% accuracy?
You can find the answers to this question in the following section.
The benefits of HITL automation
Why do we still need to rely on human involvement? Simply because fully automated solutions can still be flawed, just like manual ones. We all know that achieving zero errors is impossible. This holds for both manual and fully automated workflows.
Compared to AI, the human brain works great in situations where data or information is limited. For example, if we observe the tail of a tiger, that is sufficient information for us to identify whether it’s a tiger or not.
However, this is not the case with machines, as they need extensive development to achieve that. Therefore, human-in-the-loop automation is used to fill in the gap.
There are various benefits to leveraging HITL to train AI models or improve workflows, which include:
- Increases prediction, extraction, classification, and validation accuracy, as well as the quality of the results
- Allows machines to be trained on complex data that they have not yet encountered
- Gradually improves algorithms through human input
- Is not limited by the quality of the data on which the AI models are trained
- Saves valuable time for developers
- Deals with incomplete and challenging datasets better and more efficiently
However, there are some limitations to keep in mind as well, which we cover next.
Limitations of the HITL approach
Even though human-in-the-loop automation combines the best of human intelligence and artificial intelligence, it has some limitations. These limitations include:
- Human-in-the-loop identification – Organizations need to identify who is going to interact with which interface and section within the automation loop
- Large volumes of data – HITL does not always cope well with large volumes of data as more humans are needed within the automation loop
- Limited scalability – When a human is involved in a process, scalability can become an issue
These limitations are still minor compared to those of manual or fully automated workflows. As long as you are aware of these concerns and address them properly, HITL remains effective.
Human at the beginning or at the end of a loop?
Not sure when to leverage human-in-the-loop within your workflows? No problem. We’ve got you covered. In our experience, it makes the most sense to have a human in the loop at the beginning or the end of a loop. Let’s take a look at the following options:
- HITL at the beginning of a loop
- HITL at the end of a loop
HITL at the beginning of a loop
In cases where there are no out-of-the-box solutions available, you should consider using the HITL approach and integrating a human at the beginning of the loop. Why?
Let’s say that you currently have no AI models or algorithms to automate certain processes, but you do have a lot of raw data.
With that raw data, you can create labeled data with a human-in-the-loop, who makes sure that data is cleaned (inaccurate data is removed or corrected) and labeled correctly.
Once the data is labeled, you can use it to train your own AI models to recognize invoices or even extract data from them.
For instance, if you have tons of different invoices, you could label that data to train AI models to recognize invoices.
Such an approach allows you to start from 0% automation and move towards 80% automation or more. So in which situations should you consider placing a human at the beginning of a loop?
- You want to build your datasets
- You want to create your own AI models
- You have little or no automation in place, but you want to move towards 80% automation or more
- You have in-house data annotators and AI experts
HITL at the end of a loop
Using human-in-the-loop at the end of a loop is more common in many business cases. This approach leverages automation to do repetitive tasks and human intelligence to ensure that everything runs correctly.
We often see that 80% of the workflow is automated, and 20% is left for humans to complete. So when would you choose this approach over the previous one?
- You want to achieve as close to 100% accuracy as possible (e.g., in data extraction, prediction, verification, or anonymization)
- You want to reduce the share of work that needs human intervention below 20% to lower overhead costs
- You want to minimize costly errors (e.g., inaccurate data or duplicate entries)
- You are looking to improve turnaround time while maintaining high accuracy
- There are solutions on the market that can automate most of the tasks for you with high accuracy
To show you further the difference between a human at the beginning and end of the loop, we have picked two real-life examples.
Many recognized brands use HITL automation to improve their systems. Below, we provide a couple of human-in-the-loop examples in action.
Facebook uses HITL creatively to improve its DeepFace algorithm, which can achieve an accuracy of 97.35%. It does so by letting users confirm or reject the faces recognized in their photos. The end users are the humans in the loop (beginning of a loop), and their annotations help improve the algorithm.
Another big brand, Coca-Cola, created a loyalty program, MyCokeRewards, which used a human-in-the-loop strategy to make it successful. Coca-Cola built an app that integrated Optical Character Recognition (OCR) technology.
With such technology, users could simply take pictures of the codes printed on bottle caps and other surfaces instead of manually entering the codes.
The app then provided a confidence level for each character. If the code failed, the application highlighted the characters with low confidence levels so that users could correct them. Customers’ input trained the model, which improved data extraction accuracy (end of the loop).
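The idea of highlighting low-confidence characters can be sketched in a few lines of Python. This is not Coca-Cola’s implementation. The function name, the 0.8 threshold, and the sample OCR output are assumptions made for illustration.

```python
def highlight_low_confidence(chars, threshold=0.8):
    """Wrap characters whose confidence is below the threshold in brackets."""
    return "".join(ch if conf >= threshold else f"[{ch}]" for ch, conf in chars)


# Hypothetical OCR output: one (character, confidence) pair per recognized character.
ocr_result = [("A", 0.99), ("8", 0.55), ("K", 0.97), ("0", 0.62), ("P", 0.95)]

print(highlight_low_confidence(ocr_result))  # A[8]K[0]P
```

The user only has to check the bracketed characters, for example whether "8" should be "B" or "0" should be "O", instead of retyping the whole code.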
Unfortunately, involving the end users in the process is not always possible.
If that’s the case for you, you can look for external data annotation experts or human-in-the-loop systems. However, keep in mind that in some use cases, it may be mandatory to have a self-managed HITL system to comply with data privacy regulations.
Let’s have a look at the differences between the two alternatives.
External vs. Self-managed HITL
You are convinced that human-in-the-loop automation is beneficial for you. Now what? Before jumping ship or trying tons of different APIs from SaaS providers, it is crucial to understand that there are two ways to go about the HITL approach:
- Externally managed HITL
- Self-managed HITL
Externally managed HITL
Externally managed HITL means that the human in the loop is provided by an external party (e.g., a SaaS vendor or a data annotation service provider). Having externally managed HITL has its pros and cons.
Pros:
- Can deal with high peak volumes of data
- Fast and often comes with 24/7 availability
- Cheaper because the experts know what they are doing
- No need to dedicate time to training staff
Cons:
- Data goes to the external party
- Security measures may depend on the external party
- Regulatory compliance concerns
Self-managed HITL
Self-managed HITL, as the name suggests, refers to companies allocating a human into the loop by themselves. Let’s cover the pros and cons of the self-managed human-in-the-loop approach.
Pros:
- Data stays within the company
- Beneficial in the long term as staff gets more knowledgeable
- An excellent way to build up data
Cons:
- Requires a team of experts
- It can become costly with training and implementation
It’s up to you to think about the decisive factor. Do you want to keep your costs low, or is it more important to keep the data within your infrastructure?
In the end, it boils down to what is more critical for you and your use case.
Use cases of HITL Automation
There are various use cases for effective HITL automation. We generally encounter the following use cases:
- Receipt processing for loyalty campaigns
- Invoice processing for accounts payable
- Anonymization of sensitive information for compliance
- ID card verification for KYC processes
If you don’t see your use case listed above, don’t worry. Of course, there are many more use cases for HITL automation. For this blog, we have chosen to break down the first use case in more detail.
Let’s get into it!
Receipt processing for loyalty programs
In loyalty campaigns, customers submit their receipts as proof of purchase. The marketing agency or retail shop will check the receipt data to see if all conditions are met, for instance whether the items bought are part of the campaign and whether they were purchased during the campaign period.
If all conditions are met, the customer will receive a reward.
With successful campaigns, you may end up processing thousands of receipts per day. Therefore, automating repetitive tasks can weed out inefficiencies and save time.
However, automating the whole process has its shortcomings because machines and AI are not perfect. For instance, data extraction accuracy drops when receipts are scanned in poor quality, leading to significant errors.
Combining automation and human intelligence helps minimize inaccurate data, errors, and even document fraud, which leads to the desired financial outcomes for your campaigns.
So how does HITL automation come into play? That’s explained to you next.
Moving from the traditional receipt processing workflow
Traditionally, the workflow of receipt processing for loyalty campaigns would consist of the following steps:
- Receive proofs of purchase
- Match the document with a customer in the database (for personalization)
- Read every single receipt
- Confirm on a receipt that the campaign items have been purchased within the campaign period
- Enter the data into the database
- Determine the number of loyalty points attributed to the customer
- Send the rewards
This workflow is costly and time-consuming, especially when your employees are assigned to these administrative tasks.
So how does it look with human-in-the-loop automation?
First, the receipt can be uploaded via FTP, email, web application, or scanned with a mobile phone.
Once the receipt is scanned, solutions such as Intelligent Document Processing classify the document with AI, determining whether it is a receipt or something else.
After the classification, relevant data fields are extracted from the document.
This step is followed up by automated validation, which checks whether the campaign items have been purchased within the campaign period. This is also known as receipt clearing.
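The automated validation step could look roughly like the Python sketch below. The campaign dates, product names, and function name are made up for illustration. A real setup would read them from the campaign configuration.

```python
from datetime import date

# Hypothetical campaign configuration.
CAMPAIGN_START = date(2024, 6, 1)
CAMPAIGN_END = date(2024, 6, 30)
CAMPAIGN_ITEMS = {"cola 1.5l", "cola zero 1.5l"}


def clear_receipt(purchase_date, items):
    """Return True if a campaign item was bought within the campaign period."""
    in_period = CAMPAIGN_START <= purchase_date <= CAMPAIGN_END
    has_campaign_item = any(item.lower() in CAMPAIGN_ITEMS for item in items)
    return in_period and has_campaign_item


print(clear_receipt(date(2024, 6, 15), ["Cola 1.5L", "Bread"]))  # True
print(clear_receipt(date(2024, 7, 2), ["Cola 1.5L"]))            # False: outside the period
```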
If the OCR and AI models give a low confidence score, the file is passed on to the human in the loop, who validates whether the data is accurate.
You can determine the confidence threshold yourself. For instance, if a document receives a confidence score below 70, it is forwarded to the human in the loop for additional validation.
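In code, this routing rule comes down to a single comparison. Here is a minimal Python sketch, assuming a score between 0 and 100 and using hypothetical queue names:

```python
def route(confidence_score, threshold=70.0):
    """Send low-confidence documents to a human, and let the rest pass automatically."""
    return "human_review" if confidence_score < threshold else "auto_approve"


print(route(65))  # human_review
print(route(92))  # auto_approve
```

Raising the threshold sends more documents to your reviewers and catches more errors. Lowering it does the opposite, so the right value depends on how costly a mistake is in your use case.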
From there, the data is converted into the desired format (Excel sheet, PDF, or JSON) and passed on to a database.
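For the JSON option, the conversion can be done with Python’s standard `json` module. The field names in this sketch are illustrative assumptions and not a fixed output schema.

```python
import json

# Hypothetical validated receipt data after extraction and (optional) human review.
validated_receipt = {
    "document_type": "receipt",
    "purchase_date": "2024-06-15",
    "line_items": [{"description": "Cola 1.5L", "quantity": 2, "unit_price": 1.89}],
    "reviewed_by_human": True,
}

# Serialize to a JSON string that can be stored or sent on to a database or API.
payload = json.dumps(validated_receipt, indent=2)
print(payload)
```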
With the human-in-the-loop automation, you can:
- Increase data extraction accuracy
- Speed up the receipt processing time
- Reduce overhead costs
- Enhance employee engagement
- Minimize costly human error
Now that we have covered one of the most common use cases, we hope that you have a better understanding of the benefits of HITL automation.
If you are convinced that human-in-the-loop automation is for you, read the following section with some tips on what things to consider.
How to get started?
Before researching different solutions and vendors, it might be wise to take a minute. Ask yourself the following questions:
- Does your organization need to achieve close to 100% data extraction accuracy?
- Do you need externally or internally managed HITL?
- Do you have in-house AI experts?
- How important is the fact that data stays 100% within your internal infrastructure?
- What is essential for your use case?
- Do you want to build your own data sets?
To give you a little bit of extra help, we have compared cost savings with both fully automated and HITL automated workflows. Check out our free cost savings calculator!
Whatever your answers are, we are confident that Klippa can solve your document-related problems with our Intelligent Document Processing solutions.
The best part of AI is that it replicates human capabilities to scan and understand the key insights with high speed and accuracy.
Whatever your business case is, an OCR solution powered by AI can help you make the data work for you.
Automating document-related workflows with Klippa
Klippa specializes in automating document-related workflows. Whether you want to automate only a tiny part of your workflow with the HITL approach or the entire workflow, we can help you.
Klippa DocHorizon automates data extraction, classification, document conversion, anonymization, and verification with AI-embedded OCR technology. No matter what document automation challenges you face, Klippa can make you a document processing champion.
If you already have a web application, you can choose to integrate our technology into your solution via API. For mobile applications, we provide a mobile scanning solution, which you can easily integrate with a well-documented SDK.
All of our solutions can be connected to our HITL interface to leverage human-in-the-loop automation for more accurate results.
Building and designing any user interface can take up your resources – both time and money. If you don’t have in-house experts such as UI & UX designers, you would need to outsource the work to a third party. Your time to market then depends on this third party.
This is why we offer our clients a human-in-the-loop interface integrated into our or your solutions – so that you don’t have to build one yourself!
You can either take advantage of it with a self-managed HITL workflow or even use our back office to do the last bit of annotation for you.
The interface allows the user to check, verify, validate, and label (annotate) the data effortlessly within seconds. It can be triggered when the confidence score is lower than your configured threshold, or in situations where you know that you will receive low-quality images or complex documents from a certain supplier.
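Conceptually, these two triggers can be combined into one rule. The Python sketch below is a generic illustration with hypothetical names and values. It does not show the actual configuration of our interface.

```python
def needs_review(score, supplier, threshold=70.0, always_review=frozenset()):
    """Trigger human review for low scores or for suppliers known to send difficult documents."""
    return supplier in always_review or score < threshold


difficult_suppliers = {"supplier-a"}  # hypothetical list maintained by your team

print(needs_review(95, "supplier-a", always_review=difficult_suppliers))  # True
print(needs_review(95, "supplier-b", always_review=difficult_suppliers))  # False
print(needs_review(50, "supplier-b", always_review=difficult_suppliers))  # True
```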
HITL automation saves time and increases the data extraction accuracy, which leads to a significantly improved bottom line. If used at the beginning of a loop, you can annotate data and create your data sets.
Curious how to apply it in your use case?
Book a spot for a demo using the form below to get you started (or contact our team of experts if you have any additional questions)!