4 ways to perform document-based KYC checks with OCR and AI


Digital customer onboarding is only now starting to modernize at pace, as COVID-19 pushes people to handle everyday tasks such as banking entirely online. Among customers aged 18 to 24, as many as 82% perform all their banking tasks on their mobile devices. As this generation grows increasingly influential, financial institutions need to keep up.

In the olden days, if a customer wanted to open a bank account, they’d simply go to a bank, show their ID, have the bank clerk manually copy the information on their ID, and they’d be confirmed as the person opening the account. This was customer onboarding in black and white.

While customer onboarding has now evolved into a digital process, it often still requires a laborious and time-consuming amount of manual data entry. Doing this safely and securely, without making errors, is an increasingly difficult task. The life of a compliance officer can be quite difficult. Thankfully, Klippa’s OCR and AI are here to help.

Automating the KYC process can save a company tons of time and resources, as the hard work is done by an AI. So how does this work and which KYC checks can actually be done automatically?

Let’s grab a loupe and start examining!

Jump to:

How do OCR and AI work for KYC?
Identity documents (ID/Passport/Driver’s License)
What data can be extracted from an identity document?
Is the extracted data safe and in compliance with KYC and AML?
Tax statements
Salary slips
Utility bills
KYC at Klippa


How do OCR and AI work for KYC?

With Optical Character Recognition (OCR), all sorts of documents containing text can be read and data can be extracted from them automatically. An OCR API is therefore extremely handy for taking over the manual process of reading and checking documents, which is usually performed by clerks. Any kind of document containing information about the potential customer can be run through the OCR API. It automatically recognizes the text that it needs and makes it extractable. The extracted data can then be used according to your needs, such as passing it on to any desired system, database or tool for further processing.

This automated recognition is backed by an artificial intelligence (AI), which is trained extensively and specifically on these sorts of documents. The AI is fed thousands of examples of IDs, tax statements, salary slips, and so on, and is trained to recognize specified data. This makes it highly specialized in finding exactly those data fields.

The KYC process usually involves two proofs: a proof of identity and a proof of residency. In essence, this means that the AI needs to be able to detect a name, a photo, an address, and in some cases personal numbers, contact details or even bank account numbers in order to distill the right information for these forms of proof. At present, Klippa’s OCR API can do this with at least 95% accuracy, bordering on 100%.

So just to clarify the collaboration between OCR and AI: OCR reads the text, and the AI recognizes the context and extracts the required data. This data can then be returned in a specific data format, such as JSON, so that it can connect with an existing database. This makes validation of identity or address a lot easier. Three steps are taken in the OCR process:

  1. A scan of the ID is sent to the OCR API
  2. The scan is converted into a raw text file that can be read by the AI
  3. The AI interprets what the information means and returns the data in a structured JSON format

All of this is completed fully automatically within a matter of seconds and provides a failsafe way of checking one’s identity, as human error is removed from the process. 
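The three steps above can be sketched roughly as follows. This is a minimal illustration, not Klippa’s actual API: the function names `run_ocr`, `extract_fields` and `process_scan`, the field labels and the sample text are all assumptions, and the OCR step is replaced by a stub so the example runs on its own.

```python
import json
import re


def run_ocr(scan_bytes: bytes) -> str:
    """Hypothetical stand-in for the OCR step (scan -> raw text).

    A real system would call an OCR engine or API here; this stub only
    decodes the bytes so that the example stays runnable.
    """
    return scan_bytes.decode("utf-8")


# Illustrative patterns only; real ID layouts differ per country and document.
FIELD_PATTERNS = {
    "surname": re.compile(r"Surname:\s*(.+)"),
    "given_names": re.compile(r"Given names:\s*(.+)"),
    "date_of_birth": re.compile(r"Date of birth:\s*(\d{4}-\d{2}-\d{2})"),
    "document_number": re.compile(r"Document no\.?:\s*([A-Z0-9]+)"),
}


def extract_fields(raw_text: str) -> dict:
    """Stand-in for the AI step: map raw text to named fields (None if absent)."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(raw_text)
        fields[name] = match.group(1).strip() if match else None
    return fields


def process_scan(scan_bytes: bytes) -> str:
    """Step 1: receive the scan. Step 2: raw text. Step 3: structured JSON."""
    raw_text = run_ocr(scan_bytes)
    return json.dumps(extract_fields(raw_text), indent=2)


# Made-up sample input that plays the role of a scanned ID.
sample_scan = (
    b"Surname: DOE\n"
    b"Given names: JANE\n"
    b"Date of birth: 1990-01-31\n"
    b"Document no.: X1234567\n"
)
print(process_scan(sample_scan))
```

In practice the regular expressions would be replaced by a trained model, but the shape of the flow stays the same: scan in, raw text in between, structured JSON out.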

Let’s take a closer look at four document types on which you can deploy OCR and AI to establish a proof of identity or a proof of residency.


Identity documents (ID/Passport/Driver’s License)

The first and primary type of document that comes to mind when talking about KYC checks is the identity document, in this case meaning ID cards, passports and driver’s licenses. It is the most obvious first way of checking one’s identity for onboarding purposes, but also involves the most security requirements.

What data can be extracted from an identity document?

Any required data that is on an ID card or passport can be extracted. Usually, the data is placed in a similar position on the card and supplemented with a description in multiple languages. This means that all regular text data can be extracted easily and accurately.

Non-textual data can of course be extracted as well: the ID photo, the signature, and the machine-readable zone (MRZ). This information provides the most complete picture of the potential customer, so that their identity can be confirmed. The photo can, for instance, be compared with a recent selfie, in combination with a signature check, to further increase security.
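One reason the MRZ is useful is that its fields carry check digits defined in the public ICAO Doc 9303 standard, so an extracted value can be validated arithmetically. The sketch below shows that calculation; whether and how Klippa applies it is not stated here, and the function names are chosen for illustration only.

```python
def mrz_char_value(char: str) -> int:
    """Numeric value of one MRZ character per ICAO Doc 9303."""
    if "0" <= char <= "9":
        return int(char)
    if "A" <= char <= "Z":
        return ord(char) - ord("A") + 10  # A=10 ... Z=35
    if char == "<":
        return 0  # filler character
    raise ValueError(f"Invalid MRZ character: {char!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(c) * weights[i % 3] for i, c in enumerate(field))
    return total % 10


def field_is_valid(field: str, check_digit: str) -> bool:
    """True if the printed check digit matches the computed one."""
    if len(check_digit) != 1 or not "0" <= check_digit <= "9":
        return False
    return mrz_check_digit(field) == int(check_digit)


# Document number and check digit from the ICAO 9303 specimen passport.
print(field_is_valid("L898902C3", "6"))  # True
# A single misread character (3 read as 8) is caught.
print(field_is_valid("L898902C8", "6"))  # False
```

A failed check digit usually points to an OCR misread or a manipulated document, both of which are good reasons to send the document to a human reviewer.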

Is the extracted data safe and in compliance with KYC and AML?

Fraud can be prevented by detecting image tampering via incoherent pixel patterns or detecting duplicates via image hashing. In order to comply with KYC and Anti-Money Laundering (AML) guidelines, you’ll need a system that is watertight.
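To give an idea of how duplicate detection via image hashing can work, here is a minimal sketch of an "average hash", one common perceptual hashing technique. It is an assumption that this resembles what runs in production; the article does not say which hash is used. The sketch also assumes the image has already been reduced to an 8x8 grid of grayscale values, and the threshold of 5 bits is an arbitrary illustrative choice.

```python
def average_hash(pixels):
    """64-bit hash of an 8x8 grayscale grid: one bit per pixel, set to 1
    when the pixel is brighter than the mean of the grid."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bits in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_probable_duplicate(hash_a, hash_b, max_distance=5):
    """Treat two images as duplicates if their hashes differ in few bits.
    The default threshold is an illustrative assumption."""
    return hamming_distance(hash_a, hash_b) <= max_distance


# Synthetic test images: a gradient, a brighter copy, and an inverted copy.
original = [[(row * 8 + col) * 3 for col in range(8)] for row in range(8)]
brighter = [[value + 10 for value in row] for row in original]
inverted = [[189 - value for value in row] for row in original]

print(is_probable_duplicate(average_hash(original), average_hash(brighter)))  # True
print(is_probable_duplicate(average_hash(original), average_hash(inverted)))  # False
```

Unlike a cryptographic hash of the file, this kind of hash stays the same under small changes such as a brightness shift, which is what makes it suitable for spotting a re-submitted document.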

In terms of privacy, you can rest assured that the extracted personal data is not stored by Klippa, allowing you to fully comply with GDPR restrictions. It is even possible to have data extraction performed within the physical borders of your own country, as we can set up a server in your country of choice. 

So this is the general way in which a potential customer’s identity is confirmed accurately and safely for onboarding. But what other ways are there to check someone’s identity, and especially their proof of residency?


Tax statements

One thing we all have in common is that we need to pay our taxes, no matter what country or region we live in. This is why a tax statement is an excellent way to check a proof of residency or sometimes even a proof of identity in a KYC process. A person’s name, address, county or region, depending on how tax is governed in a particular country, can all be identified and extracted from a tax statement.

Purely in terms of KYC standards, this information combined with another form of identification can safely confirm the residency or sometimes identity of a potential customer. Because the position of the required information differs from statement to statement, a trained AI can automatically locate it and deploy OCR to extract the text, which is then verified through either a human check or an automatic database check.
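An automatic database check of this kind could look roughly like the sketch below. The field names, the sample values and the normalization rules are assumptions made for illustration; a real integration would depend on your own database and matching policy.

```python
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    cleaned = "".join(
        c if c.isalnum() else " " for c in without_accents.lower()
    )
    return " ".join(cleaned.split())


def compare_with_record(extracted: dict, record: dict,
                        fields=("name", "address")) -> dict:
    """Per-field match result; a missing value counts as a mismatch."""
    results = {}
    for field in fields:
        a, b = extracted.get(field), record.get(field)
        results[field] = bool(a and b) and normalize(a) == normalize(b)
    return results


# Hypothetical extraction result and database record.
extracted = {"name": "José  Pérez", "address": "Main St. 12, Springfield"}
record = {"name": "jose perez", "address": "main st 12 springfield"}

print(compare_with_record(extracted, record))
```

Any field that comes back as a mismatch can then be routed to a human check instead of being rejected outright.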

Tax statements can also indicate other information, such as a confirmation of payment of said taxes to the relevant tax authorities, a check for potential tax fraud in line with AML regulations, and much more data. 


Salary slips

A salary slip, or payslip, is another easy way to perform your KYC processing. Although it is generally used to confirm a person’s name, address and even confirm that the person works at a specific company, it can also serve as a confirmation of someone’s monthly income. This is particularly important when a customer applies for a loan, an allowance or subsidy of some sort. 

So by extracting data from a salary slip, you can confirm a person’s name, address, gross and net salary, hours worked, tax rate, and more. The AI is trained to identify where this information is on the payslip, and turns it into usable data for KYC purposes.

Governments and financial institutions often require this information to determine whether a person is eligible for government benefits. They often process thousands of such payslips per month, which takes time and manpower to validate. With OCR and AI, this information is validated within seconds, so that authorisation can be initiated quickly. Someone in dire need of benefits can then be quickly paid. 


Utility bills

Any form of utility bill, be it electricity, water, gas, telephone, or internet service, can be deployed as a proof of residency. Payment of such utility services proves the address of a specific person, as it is directly related to their property. Usually, this bill can’t be more than 3 months old, and in some cases it must be a bill from the month in which the application is made. Quickly and effectively processing these bills is therefore paramount.
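Once the bill date has been extracted, the age requirement is easy to check automatically. The sketch below assumes that "3 months" means calendar months counted back from the application date; the exact rule differs per institution, and the function names are illustrative.

```python
import calendar
from datetime import date


def subtract_months(day: date, months: int) -> date:
    """Go back a number of calendar months, clamping to the month's last day."""
    month_index = day.year * 12 + (day.month - 1) - months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def bill_is_recent(bill_date: date, application_date: date,
                   max_age_months: int = 3) -> bool:
    """True if the bill is dated within the allowed window and not in the future."""
    cutoff = subtract_months(application_date, max_age_months)
    return cutoff <= bill_date <= application_date


application = date(2024, 4, 10)
print(bill_is_recent(date(2024, 1, 15), application))  # True
print(bill_is_recent(date(2024, 1, 5), application))   # False
```

Setting `max_age_months=0` narrows the window to bills dated on the application date itself, so the stricter "same month" rule would need its own comparison on year and month.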

A proof of residency in the form of a utility bill indicates that the potential customer is financially solvent and has been so for a period of time, and of course, has been living in a specific region for a certain time. This allows a confirmation that this person falls under the jurisdiction of a certain area. So a utility bill provides a lot of information for companies to comply with KYC regulations.

With OCR and machine learning, not only the name and address can be extracted, but also data fields containing the location, payment requirements, the type of utility, and so on. Any document tampering can be detected, and the data can be checked against an existing database.


The benefits of automating your KYC processes

Releasing the power of automation on your KYC processing comes with numerous benefits. These benefits are present for any type of industry, be it in banking or in the public sector. Let’s look at the primary benefits that make automation come highly recommended to comply with current and future KYC, AML, and GDPR requirements.

Process documents within seconds

KYC often requires companies to validate thousands of documents per month, whilst the data is needed to, for instance, pay out benefits by the end of the month. With automation via OCR and AI, processing a document takes a couple of seconds, and large numbers of documents can be processed simultaneously. That is not something a human employee can easily do.

Achieve a ~100% accuracy rate

We can only speak of Klippa’s accuracy in terms of data extraction, and we do so gladly. Data extraction at Klippa is near 100% accurate, meaning that you can rest assured that the required data on a provided document is the exact data you need. This is especially the case when there is a short human check afterwards.

Be fully GDPR compliant

Our OCR and AI service for KYC is fully GDPR compliant. Within the European Union, we only use ISO certified servers for processing invoices and a data processor agreement is in place. Outside of the EU, we can open a local server that aligns with local legislation and compliance regulations. We never store data; we only process it for you. Therefore, we provide full security for all privacy-sensitive data that you need to process.

Applicable to almost every document type

We mentioned four types of documents above, but this service extends to any document that may provide a proof of identity or proof of residency. You can think of bank cards or statements, permits, contracts, P&L statements, you name it. An AI can be trained to identify data on any form of document.

Easily adaptable

Changing your way of working or the data you aim to identify and process can be a laborious process when you’re dealing with a large team of employees. For employees, changing their daily tasks can easily lead to errors and to specific data being overlooked. Changing the focus of an AI and OCR goes much more smoothly. An AI can also absorb far more adaptations in a short period of time than a human effectively could.


KYC at Klippa

If you are looking to automate and secure your KYC processing, you’re at the right address. Klippa offers swift and secure KYC processing services with OCR and machine learning. Contact us by email or schedule a demo with our product experts. We can help you find the solution that is right for you.
