

If you are in the business of large-scale document processing, you are probably facing problems related to bad data quality.
Bad data quality issues can take many forms – blurry images, incorrect file types, missing information – but one of the most common and frustrating problems is incorrectly rotated documents.
Whether you’re processing receipts, invoices, forms, or other types of paperwork, rotated pages can slow down workflows, reduce data accuracy, and increase labor costs due to manual corrections.
This is where Klippa and its platform come in handy. With the power of Artificial Intelligence (AI) and Optical Character Recognition (OCR), Klippa specialises in smart document processing and is able to solve these issues for you.
In this blog, we will explain how Klippa can help to automatically correct document rotations on a large scale. This optimizes processing efficiency and reduces processing costs.
Key Takeaways
- Rotated documents are a common pain point in large-scale document processing workflows. They negatively impact data quality, slow down review processes, and often require manual correction.
- Manual correction at scale is costly. For example, rotating just 10% of 100,000 monthly documents manually can cost around €20,000 a year in labor. Automating this process can reduce those costs by up to 90%.
- Klippa’s approach uses OCR and AI to detect text orientation and rotate documents accurately, regardless of shape or size. The automated rotation process consists of three main steps: image optimization, text extraction with OCR, and smart rotation.
- Klippa’s DocHorizon platform offers much more than just rotation – features like OCR, classification, anonymization, and data extraction make it a complete Intelligent Document Processing (IDP) solution.
An Example Use Case
Let’s assume you work for a company that processes financial documents on a large scale, for example receipts and invoices for loyalty purposes. This is a common use case in, among others, cashback automation.
You have a data entry team that has to check receipts in an interface and extract certain data or perform certain approvals. Manually checking documents is already a time-consuming task for normal documents, let alone for bad-quality documents.
If you are processing 100,000 documents a month, and 10% of the documents are rotated, manually rotating 10,000 documents a month can be a time-consuming and annoying task.
The yearly cost of rotating 120,000 documents will easily be €20,000 in labor alone. Luckily, automation can reduce these costs by up to 90%, saving you up to €18,000 a year. A great business case!
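The arithmetic behind this business case can be written out as a short sketch. The function name and its parameters are our own, chosen for illustration; the input figures are the ones from the example above.

```python
def rotation_business_case(docs_per_month, rotated_share, yearly_labor_cost, automation_savings):
    """Return (rotated documents per year, labor cost per rotation, yearly savings)."""
    rotated_per_year = round(docs_per_month * rotated_share * 12)
    cost_per_document = yearly_labor_cost / rotated_per_year
    yearly_savings = yearly_labor_cost * automation_savings
    return rotated_per_year, cost_per_document, yearly_savings


# 100,000 documents a month, 10% rotated, EUR 20,000 labor a year, 90% saved
rotated, per_doc, savings = rotation_business_case(100_000, 0.10, 20_000, 0.90)
print(f"{rotated:,} rotated documents per year")
print(f"about EUR {per_doc:.2f} of labor per manual rotation")
print(f"EUR {savings:,.0f} saved per year")
```

In other words, the example assumes roughly 17 cents of labor for every document that has to be rotated by hand.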
Below is an example of the type of data you can expect from customers:


As you can see, there’s a solid business case for automating something as simple as document rotation. But how do you make that happen? Let’s take a closer look at how you can automatically detect and correct the orientation of receipts and invoices in a smart, scalable way.
How to Automatically Correct Document and Image Orientation?
Documents like these can have many different types of problems. In this blog, we will focus on the automated rotation of receipts and invoices, but the same approach applies to any document type.
If you are interested in other document processing solutions like automated document sorting, document classification, image to text, or searchable PDF conversion, read our relevant articles on these topics.
So, let’s focus on rotating the incorrect images automatically to the correct orientation. A simple approach that many people would think of first is just to check for the height and width of the documents and rotate them to a vertical orientation so that the height is larger than the width.
While this sounds simple and effective, sadly, it is error-prone. Receipts and invoices come in many different shapes and sizes. Sometimes rectangles, sometimes squares.
This approach can cause documents that are already in the right orientation to be turned into the wrong one. It can also leave documents rotated by 180 degrees, so upside down, because height and width alone cannot tell which way is up. Luckily, there is another solution: doing it based on the text content of a document.
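To see why the size-based approach breaks down, here is a minimal sketch of that heuristic. The function and the sample dimensions are made up for illustration and do not come from real customer data.

```python
def naive_rotation(width, height):
    """Size-only heuristic: rotate 90 degrees whenever the page is wider than tall."""
    return 90 if width > height else 0


# (description, width, height, rotation actually needed in degrees)
samples = [
    ("portrait receipt, upright", 600, 1800, 0),
    ("portrait receipt, upside down", 600, 1800, 180),
    ("landscape invoice, upright", 1754, 1240, 0),
    ("square receipt, lying sideways", 900, 900, 90),
]

mistakes = []
for description, width, height, needed in samples:
    guess = naive_rotation(width, height)
    if guess != needed:
        mistakes.append(description)
    print(f"{description}: heuristic says {guess}, needed {needed}")

print(f"{len(mistakes)} of {len(samples)} samples handled incorrectly")
```

Only the upright portrait receipt comes out right. The heuristic has no way to notice an upside-down page, it turns a correct landscape invoice on its side, and it has nothing to go on for a square document.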
To get there, our software takes 3 important steps:
Step 1: Optimizes image quality
This step is done by cropping the receipt pictures, correcting perspective, and improving the contrast. This already gives us more readable images, which is relevant for the second step. You can see an example result of the first step below:
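To make one of these operations concrete, here is a minimal sketch of contrast improvement using min-max contrast stretching on a tiny grayscale image. This is a generic textbook technique picked for illustration, not necessarily the method Klippa applies, and a real pipeline would work on full images with an imaging library rather than nested lists.

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:
        # A completely flat image has no contrast to stretch; return a copy.
        return [row[:] for row in pixels]
    return [[round((value - lo) * 255 / (hi - lo)) for value in row] for row in pixels]


# A washed-out scan: all values are crammed between 100 and 150.
washed_out = [
    [100, 110, 120],
    [130, 140, 150],
]
print(stretch_contrast(washed_out))  # prints [[0, 51, 102], [153, 204, 255]]
```

After stretching, the values use the full 0 to 255 range, so dark text stands out more clearly from a light background.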


Step 2: Converts documents and images to text using OCR
Converting the documents and images to text is the second step. If the document is a PDF, it will first be converted into an image and then into text. This creates a searchable document and reveals what the text orientation is.
Of course, most text is read from left to right and, in some cases, from right to left, rather than from top to bottom. On some documents, you will have text in multiple orientations. In these cases, we perform a text count and choose the orientation that most of the text is in.
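The text count can be sketched as a simple vote. We assume here that the OCR step returns each detected line together with the angle it was found at; that input format and the function name are hypothetical and only serve the example.

```python
from collections import Counter


def dominant_orientation(text_lines):
    """Return the angle (in degrees) that carries the most characters.

    `text_lines` is a list of (text, angle) pairs, where angle is the
    orientation a line was detected in: 0, 90, 180, or 270.
    """
    characters_per_angle = Counter()
    for text, angle in text_lines:
        # Count non-whitespace characters so long lines weigh more than short ones.
        characters_per_angle[angle % 360] += len("".join(text.split()))
    if not characters_per_angle:
        return None
    angle, _ = characters_per_angle.most_common(1)[0]
    return angle


ocr_lines = [
    ("SUPERMARKET EXAMPLE", 90),
    ("Milk 1.29", 90),
    ("Bread 2.49", 90),
    ("TOTAL 3.78", 90),
    ("Thank you!", 0),  # for example a stamp printed in another direction
]
print(dominant_orientation(ocr_lines))  # prints 90
```

Counting characters instead of lines keeps a single stamp or margin note from outvoting the body of the document.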
Step 3: Rotates the document
Now that we know the text orientation, we are almost ready to rotate the document. The document should be rotated so that you can read left to right for most languages, but for some languages, you have to read from right to left. This is a determining factor in the rotation.
So, we first use a machine learning classifier to determine the country of origin and language of the document. Once this is done, the image or document can be rotated and stored in the desired format.
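The rotation itself can be sketched on a small grid that stands in for an image. This example only covers undoing a detected rotation in steps of 90 degrees; the language and reading-direction check described above is left out, and the function names are our own.

```python
def rotate_clockwise(grid, degrees):
    """Rotate a 2D grid (list of rows) clockwise by a multiple of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    for _ in range((degrees // 90) % 4):
        # One clockwise quarter turn: reverse the row order, then transpose.
        grid = [list(row) for row in zip(*grid[::-1])]
    return [list(row) for row in grid]


def correct_orientation(grid, detected_angle):
    """Undo a detected clockwise rotation by turning the rest of the way round."""
    return rotate_clockwise(grid, (360 - detected_angle) % 360)


upright = [
    ["A", "B"],
    ["C", "D"],
    ["E", "F"],
]
sideways = rotate_clockwise(upright, 90)  # simulate a scan turned by 90 degrees
restored = correct_orientation(sideways, 90)
print(restored == upright)  # prints True
```

Working in quarter turns keeps every pixel intact, which is why this kind of correction does not degrade image quality.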
In many cases, this is the original file format, so for images, that would often be a JPEG, but we can also convert it into a format of choice, for example, a PDF. Now that you have good-quality images in the correct orientations, you might already have what you need. The result looks something like this:


Bonus steps
We can even take it one or two steps further: we can give you the OCR results in a TXT format, but we can also give you the results in a structured format like JSON. Below, you can see a simplified example of those two additional steps:
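As a rough illustration of the difference between the two outputs, the sketch below turns the same OCR lines into plain text and into JSON. The sample receipt, the field names, and the parsing rule are all hypothetical and do not represent Klippa’s actual output schema.

```python
import json

# Hypothetical OCR lines for one receipt.
ocr_lines = ["SUPERMARKET EXAMPLE", "Milk 1.29", "Bread 2.49", "TOTAL 3.78"]


def to_txt(lines):
    """Plain-text output: one OCR line per text line."""
    return "\n".join(lines)


def to_structured(lines):
    """Structured output: first line is the merchant, the rest are 'name amount' pairs."""
    merchant, *rest = lines
    items, total = [], None
    for line in rest:
        name, _, amount = line.rpartition(" ")
        if name.upper() == "TOTAL":
            total = float(amount)
        else:
            items.append({"description": name, "amount": float(amount)})
    return {"merchant": merchant, "items": items, "total": total}


structured = to_structured(ocr_lines)
print(to_txt(ocr_lines))
print(json.dumps(structured, indent=2))
```

The TXT version is enough for search and archiving, while the JSON version can be fed directly into other systems without any further parsing.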


Going Beyond Image-to-Text OCR Software with Klippa


As you can see, automatically rotating documents is a bit of a technical process involving computer vision, OCR, and document conversion techniques.
Luckily, you don’t have to build these tools yourself: the Klippa platform already provides them.
Klippa DocHorizon is an AI-based OCR solution, also known as Intelligent Document Processing (IDP), that automates all of your document-related workflows, including image-to-text conversions.
Automatically rotating images, documents, and pages is just the tip of the iceberg. Using a modern solution like DocHorizon enables you to do the following:
- Mobile scanning – Scanning documents from mobile devices at any place, any time.
- OCR – Turning scanned documents and images into text and structured data formats.
- Data extraction – Real-time extraction of important data points.
- Classification – Classifying and sorting documents according to your needs.
- Data Parsing – Turning JPG, PNG, and PDF files into searchable text and exporting them to formats like PDF or structured CSV, XLSX, XML, and JSON.
- Anonymization – Masking sensitive data, from anonymization to removal.
- Verification – Verifying the authenticity and validity of documents and data.
Does it sound like a solution that suits your needs? Schedule a demo or contact our specialists for more information. We are happy to see how we can help you reach your goals.
FAQ
Correct orientation ensures that OCR software can accurately detect and read text. Misrotated documents may lead to incorrect character recognition, misidentified fields, or completely unreadable results.
OCR extracts text from images. By analyzing the direction and alignment of detected text lines, software can determine the most likely reading direction (e.g., left-to-right or top-to-bottom) and rotate the document accordingly.
Klippa utilizes Optical Character Recognition (OCR) combined with Artificial Intelligence (AI) to detect the text orientation within a document. By analyzing the direction of the text, the system can accurately rotate the document to its correct orientation, ensuring readability and proper processing.
Yes, Klippa’s AI-driven OCR technology is designed to handle a wide variety of document types, layouts, and formats. It can adapt to different structures without the need for manual template creation, making it suitable for diverse document processing needs.