The first step is to provide an image or PDF file of a document containing an email address to our API. Usually this is done from a mobile app, email, FTP, or web application.
The document can be sent either cropped (without background) or uncropped (with background). If the image is sent uncropped, our API automatically crops it to the right size. You can also use the Klippa scanning SDK in mobile apps.
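As a rough illustration of what submitting a document from a web application could look like, the sketch below builds a multipart upload request with Python's standard library. The endpoint URL, the `document` form field, and the `X-Auth-Key` header are placeholders chosen for this example, not confirmed Klippa API details; check the API documentation for the real values.

```python
import urllib.request
import uuid


def build_upload_request(file_bytes: bytes, filename: str,
                         api_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST carrying one document."""
    boundary = uuid.uuid4().hex
    # Assumption: the API expects the file in a form field named "document".
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="document"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=head + file_bytes + tail,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            # Assumption: the API key travels in a custom header.
            "X-Auth-Key": api_key,
        },
    )
```

The returned request could then be sent with `urllib.request.urlopen(request)` once the real endpoint and credentials are filled in.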
Step 2: Image to text with OCR
As soon as an image or PDF with an email address has been submitted, it is converted to a TXT file. In this step, all text from the document is extracted, but not yet structured.
Step 3: JSON output from the API
The Klippa parser takes the text obtained in step 2 and converts it into structured JSON using machine learning. The JSON is returned as output from the API, with the email address from the document extracted as a field.
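To make the step from unstructured text to structured JSON concrete, here is a deliberately simplified sketch. It uses a regular expression as a stand-in for the parsing step; it is not Klippa's machine learning parser, and the `email_addresses` field name is an assumption made for this example.

```python
import json
import re

# Simplified pattern; it does not cover every valid email address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def text_to_json(ocr_text: str) -> str:
    """Collect unique email addresses from raw OCR text and return them as JSON."""
    emails = []
    for match in EMAIL_PATTERN.findall(ocr_text):
        email = match.lower()  # normalise case so duplicates are detected
        if email not in emails:
            emails.append(email)
    # Hypothetical output shape, used for illustration only.
    return json.dumps({"email_addresses": emails})
```

A trained model can also use layout and context, for example to tell a sender's address from a recipient's, which a plain pattern match cannot do.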
Now, the field can easily be processed into your database. Whether you are processing email addresses to update CRM systems or to enhance security and privacy by performing cross-checks with a database, Klippa is here to help you.
The image on the left is a simplified example of the JSON response.
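As a minimal sketch of processing the extracted field into a database, the example below cross-checks each address against a SQLite table and stores only the ones that are new. The response shape (an `email_addresses` list) and the `contacts` table are assumptions made for this example; the actual API response may be structured differently.

```python
import json
import sqlite3


def store_emails(conn: sqlite3.Connection, response_json: str) -> list:
    """Store addresses from a (hypothetical) JSON response; return the new ones."""
    conn.execute("CREATE TABLE IF NOT EXISTS contacts (email TEXT PRIMARY KEY)")
    added = []
    for raw in json.loads(response_json)["email_addresses"]:
        email = raw.strip().lower()
        # Cross-check: skip addresses that are already in the database.
        known = conn.execute(
            "SELECT 1 FROM contacts WHERE email = ?", (email,)
        ).fetchone()
        if known is None:
            conn.execute("INSERT INTO contacts (email) VALUES (?)", (email,))
            added.append(email)
    conn.commit()
    return added
```

Calling `store_emails(sqlite3.connect("crm.db"), response_body)` would update a local file database; the same pattern applies to any CRM backend that can be queried before inserting.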
Why use Klippa’s data capture software?
Spend less on extracting email addresses from documents and images.
Process email addresses automatically within seconds.
Prevent manual data entry errors with high-quality extraction of email addresses.
Automatically recognize errors, duplicates, and fraud.
Which documents can you scan for email addresses?
Below is a list of documents from which you can extract email addresses. The list is not exhaustive; which documents are processed depends on what the client requires.
Frequently Asked Questions
Which document types can be processed automatically?
Klippa can extract email addresses from any type of document. Commonly processed documents for email addresses are invoices, applications, and contact forms.
What are common use cases?
Email address extraction can be done for several reasons, for example in email marketing, or for updating CRM systems.
Manually retyping email addresses into your database is neither efficient nor reliable. Whatever your use case is, you can replace manual data entry with Klippa.
Which languages does Klippa support?
Klippa supports all European languages. Our engine performs best on documents in English, Dutch, Norwegian, Danish, Swedish, Finnish, Italian, Portuguese, Spanish, German, Hebrew, and French.
Other languages can be supported on request. We are happy to train our machine learning models to assist you.
Is Klippa email address processing GDPR-compliant?
All services that Klippa offers are fully GDPR-compliant. By default, we use ISO-certified servers within the European Union to process documents.
If you are not located within the EU, we can also set up servers in a region of your choice.
In addition, a data processor agreement is always in place. We don’t store any of your data or your customers’ data after processing.
Can Klippa convert documents to CSV, XLSX, XML or JSON?
Yes, we can.
Klippa takes pictures of documents, or their PDF equivalents, and converts them to readable text using OCR. From there, we use machine learning to turn text into structured data.
Most of the time, we use JSON, but depending on your preference we can also convert documents to CSV, XLSX or XML.
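If you receive JSON and want CSV on your own side, the conversion is straightforward. The sketch below flattens a response into one CSV row per email address; the `document` and `email_addresses` field names are assumptions made for this example, not the documented response format.

```python
import csv
import io
import json


def json_to_csv(response_json: str) -> str:
    """Flatten a (hypothetical) JSON response into CSV, one row per address."""
    data = json.loads(response_json)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["document", "email_address"])
    for email in data["email_addresses"]:
        writer.writerow([data["document"], email])
    return buffer.getvalue()
```

The resulting string can be written to a `.csv` file and opened directly in spreadsheet software.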
Do you have any questions about Klippa DocHorizon? Get in touch by mail, phone or chat!
+31 50 2111631