Automating data collection & extraction for cartel damage claims

Although antitrust rules and regulations exist, many companies still abuse their market power to limit competition. Total cartel fines, for example, reached $4.6 billion globally in 2021. In the same year, 26 companies were awaiting a verdict on cartel damage claims in Europe alone.

Typically, law firms and economic experts handle cartel damage claims on behalf of the parties harmed by antitrust violations.

An essential success factor for these claims is the consistent evaluation and processing of relevant data from documents (invoices, receipts, bank statements, delivery notes, etc.). Unfortunately, in many cases, this is still done manually.

Law firms need a great deal of time and resources to check, extract, and validate the information in documents sent by their clients. They also need to filter out fraudulent documents and anonymize information that cannot be disclosed.

For these processes, Intelligent Document Processing (IDP) has become a big game-changer. By combining Optical Character Recognition (OCR) and AI, IDP solutions allow you to automate document processing to increase the success of cartel damage claims. 

This blog will help you uncover the benefits of IDP and how it helps automate data collection and extraction in cartel damage claims.

The challenges of cartel damage claims

The initial proceedings must be handled carefully to give cartel damage claims a higher chance of success. For instance, the law firm leading the case must ensure that its clients provide all necessary documents. Once the data is collected, law firms can estimate the size of the claims with the help of economic experts.

While it may sound simple, it is essential to note that there are many challenges that both the client and the law firms face when it comes to cartel damage claims. 

These challenges include: 

  • Large document volumes
  • Data collection
  • Data extraction
  • Document verification
  • Data accessibility
  • Document redaction

Now, let’s take a look at them one by one.

Large document volumes

One of the major challenges in cartel damage claims is the sheer number of documents to be processed. The more information clients provide to law firms handling their damage claims, the higher the chance for success.

More documents mean more data, but they also arrive in various formats and types, which does not make the work any easier for law firms. These documents include:

  • Proofs of purchase
  • Delivery notes
  • Bank statements
  • Financial reports
  • Tax statements
  • Expert opinions

In most cartel damage claims, law firms bundle claims from multiple clients to increase the chance of winning the legal battle. This, of course, increases the number of documents they need to process.

Adding more lawyers to deal with growing numbers of documents is not scalable or sustainable for any law firm.

Data collection

The data collection process for cartel damage claims can be time-consuming. In most cases, it entails collecting, sorting, labeling, and making the documents searchable. 

Often clients send their documents to law firms via email. Either a paralegal or a lawyer must download the documents from emails and store them in a system.

But what if these documents come in different types and large volumes? It is not easy to bring the data together when the documents arrive in various formats such as PDFs, scans, images, and many others.

Having this process handled manually is far from efficient and optimal, considering how expensive it is to have paralegals and lawyers on the payroll. 

When the data is all over the place, it can be challenging to make a strong case or even grasp the size of the claim.

Data extraction

Once the law firms receive the documents, the next step is to extract the relevant data. The problem is that the data is often unstructured and collected through various sources. Extracting data from such unstructured forms causes unnecessary delays and bottlenecks. Longer turnaround times are costly for both law firms and their clients.

Before the data from documents can be used, it must be converted into structured data. Traditionally, this is done by, for instance, entering data manually into a spreadsheet or a specific system for further analysis. This task can be tedious and time-consuming.

Additionally, mistakes can happen with long working hours and large document volumes. There may be a situation where a lawyer or a paralegal overlooks a crucial piece of information that could help strengthen the cartel damage claim.

Document verification

Part of the document processing in cartel damage claims involves document verification: law firms must verify the authenticity of the documents they receive from their clients.

To avoid any potential civil or criminal liability, the content of these documents must be accurate, genuine, and not misleading. Authentication can be done by reviewing stamps, watermarks, fonts, and carrier materials.

Can you imagine the time needed to manually verify thousands or even hundreds of thousands of different documents?

Furthermore, there are limits to what the human eye can detect. When verification is done manually, the risk of missing fraudulent, forged, or duplicate documents is therefore higher.

While each check may be simple, it consumes valuable time from the litigators. The more time it takes, the higher the litigation costs for the clients.

Data accessibility

The number of documents received from claimants can become overwhelming to any law firm, especially when they come in numerous forms and sources. 

You can imagine how time-consuming it is to find relevant information from documents and label them for the litigation process when the text is in an unsearchable format. 

For example, searching for invoices over €1,000 from the last ten years for a single client can easily take hours, if not days. Multiplying this by the number of clients takes away profitable time that could be better spent elsewhere.

Document redaction

As in many legal proceedings, information that is considered confidential should be redacted to comply with local privacy regulations such as the GDPR in Europe.

In general, law firms must ensure that confidential information is sufficiently blacked out or redacted from the documents used to prove the antitrust violation. However, in some cases, such information may be disclosed if the need outweighs the harm.

Document redaction is not a simple task, as there are grave consequences when it is done improperly. It therefore typically requires one lawyer or paralegal to redact sensitive information and another to double-check that it has been done accurately.

This, of course, increases the turnaround time and costs. You can check out our comprehensive data masking blog to read more about document redaction.

Moving away from traditional document processing

Traditional document processing brings a host of challenges in cartel damage claims and other antitrust proceedings.

First and foremost, it takes up valuable time from all the parties involved: claimants, lawyers, paralegals, economic experts, and sometimes even forensic experts.

Secondly, the turnaround time becomes exceptionally long with large volumes of documents. The higher the volume, the more time is wasted completing error-prone manual tasks such as data conversion, entry, verification, and anonymization. 

Thirdly, the longer the initial proceedings of cartel damage claims take, the more costs will be incurred. 

Luckily, there is AI technology that can help automate many document-related tasks. This technology is called Intelligent Document Processing (IDP), and it enables organizations to move away from traditional document processing.

Let’s have a more detailed look at it and how it works.

Introducing Intelligent Document Processing

So what is Intelligent Document Processing (IDP)? It is a modern solution that automates various document processing workflows.

It uses technologies such as Artificial Intelligence (AI), Optical Character Recognition (OCR), and Natural Language Processing (NLP). 

With IDP software such as Klippa DocHorizon, organizations can automate data extraction, classification, conversion, anonymization, and verification of documents. Thus, it helps organizations in multiple industries (legal, banking, retail, healthcare, travel, transportation, manufacturing, etc.) eliminate manual document processes.  

It brings value by transforming unstructured and semi-structured information into structured data. As most business data comes in unstructured formats such as PDFs, emails, or images, it is challenging to manually convert all that data into a unified form.

An IDP solution can simply label documents, make them searchable, and convert unstructured data into machine-readable formats such as XLSM, JSON, CSV, or XLS. Structured data can be effortlessly passed on to a database or ERP system for further analysis.
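To make the conversion step concrete, here is a minimal sketch of turning extracted records from JSON into CSV using only the Python standard library. The record layout and field names (`document_id`, `supplier`, `date`, `total_eur`) are assumptions made up for this illustration and do not represent the output format of any particular IDP product.

```python
import csv
import io
import json

# Hypothetical extraction output; the field names are illustrative only.
extracted_json = """
[
  {"document_id": "inv-001", "supplier": "Example Trucks BV", "date": "2015-03-12", "total_eur": 48250.00},
  {"document_id": "inv-002", "supplier": "Example Trucks BV", "date": "2016-07-01", "total_eur": 910.50}
]
"""


def json_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat records into CSV text."""
    records = json.loads(json_text)
    # Use the keys of the first record as the CSV header.
    fieldnames = list(records[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = json_to_csv(extracted_json)
print(csv_text)
```

The resulting CSV text could then be loaded into a spreadsheet, database, or ERP system for further analysis.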

Next, let’s look at how Intelligent Document Processing can help law firms with cartel damage claims through automation.

The roles of IDP in cartel damage claims

There are several steps in document processing involved in cartel damage claims that IDP can automate. These steps include: 

  • Data collection & organization
  • Document scanning
  • Document classification
  • Automated data extraction and conversion
  • Anonymization
  • Verification

Let’s take a look at each one in more detail.

Data collection & organization

Some IDP vendors, such as Klippa, offer their clients a way to collect data from claimants through a portal or an application. With this portal or application, law firms can provide claimants with a more accessible and secure way to upload the documents instead of sending them via chains of emails.

As documents (mostly proofs of purchase) come in large quantities, in different formats, and from various sources, it is crucial to keep all the data organized and accessible.

Once the documents are uploaded to the platform, their data can easily be extracted or passed on for further examination.

Document scanning

One of the major tasks that Intelligent Document Processing can help law firms with is document scanning.  

IDP enables lawyers and back-office clerks to scan paper documents even when no scanner is nearby. For instance, they can simply photograph documents with a mobile phone, anywhere and anytime.

The same mobile scanning solution can be provided to the claimants. 

Instead of having the clients upload or send the documents through various sources and formats, they can simply take photos of the documents themselves. 

The scanned documents can then be sent to the desired location or passed on for further processing, where the data will be accessible to the parties that need it. 

The image pre-processing features such as quality detection and enhancement will improve the quality of scanned documents. Because of that, automated document processing can be done more accurately.

Document classification

After receiving each scanned document, lawyers need to interpret the text and assign each document to a category. This is done to get a better overview of the document collection.

By doing so, finding relevant information becomes more efficient and reduces unnecessary bottlenecks. IDP can automate this process with the ability to: 

  • Recognize labels and predict new categories
  • Organize documents based on similar words and phrases
  • Apply the linguistic rules of a text to drive automatic categorization

Assigning categories to a document makes it easier for law firms to manage and find relevant data for cartel damage claims. See an example of a document classification below.

Invoice Classification Example
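As a rough illustration of organizing documents based on similar words and phrases, the sketch below scores a text against keyword lists and picks the best-matching category. The categories and keywords are hypothetical, and a keyword count is a deliberately simple stand-in; IDP products typically rely on trained models rather than fixed lists.

```python
# Hypothetical keyword lists per category; a real system would usually
# learn such signals from labeled example documents.
CATEGORY_KEYWORDS = {
    "invoice": ["invoice", "amount due", "vat"],
    "delivery_note": ["delivery note", "delivered", "consignment"],
    "bank_statement": ["account statement", "iban", "balance"],
}


def classify(text: str) -> str:
    """Return the category whose keywords occur most often, or 'unknown'."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(keyword) for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # If no keyword matched at all, do not guess a category.
    return best if scores[best] > 0 else "unknown"


print(classify("INVOICE no. 2041, amount due: EUR 1,250.00 incl. VAT"))
print(classify("Delivery note: 3 pallets delivered to warehouse"))
print(classify("Meeting minutes"))
```

Returning "unknown" when nothing matches keeps unrecognized documents visible for manual review instead of silently mislabeling them.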

Automated data extraction & conversion

While working on cartel damage claims, lawyers need to work with many document and file types (PDF, JPG, PNG, etc.). They need to collect, combine, filter, and select appropriate data to meet litigation and compliance demands.

Doing that manually is both time-consuming and error-prone, which reduces the chance of winning the court case. Although humans can easily read scanned documents, the text within them is not searchable by computers.

Therefore, these documents need to be converted into a format that makes the data easily accessible for lawyers.

Example of data extraction on a purchase order and conversion to searchable text

Optical Character Recognition, a part of IDP, is used to automate data extraction and document conversion into a machine-readable format. With OCR, content from scanned documents can be extracted within seconds and with very few errors.

Example of data extraction on a purchase order and conversion to machine-readable format (JSON)

This is thanks to the Artificial Intelligence component of IDP, which can be trained to recognize patterns. As a result, data extraction accuracy and document coverage can increase over time.
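The step from OCR text to JSON can be sketched as follows. This is a simplified, rule-based illustration: the sample text, field labels, and regular expressions are assumptions for this example, whereas an AI-based solution would learn to locate fields across many different layouts.

```python
import json
import re

# Illustrative OCR output for a purchase order; layout and labels are assumptions.
ocr_text = """Purchase Order
PO Number: PO-2014-0087
Date: 2014-05-21
Supplier: Example Steel GmbH
Total: EUR 12,480.00
"""

# Hypothetical field patterns; real documents need patterns (or models) per layout.
FIELD_PATTERNS = {
    "po_number": r"PO Number:\s*(\S+)",
    "date": r"Date:\s*(\d{4}-\d{2}-\d{2})",
    "supplier": r"Supplier:\s*(.+)",
    "total_eur": r"Total:\s*EUR\s*([\d,]+\.\d{2})",
}


def extract_fields(text: str) -> dict:
    """Pull known fields out of OCR text; missing fields become None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1).strip() if match else None
    # Convert the amount from "12,480.00" to a number for later analysis.
    if fields["total_eur"] is not None:
        fields["total_eur"] = float(fields["total_eur"].replace(",", ""))
    return fields


record = extract_fields(ocr_text)
print(json.dumps(record, indent=2))
```

Fields that cannot be found are set to `None` rather than guessed, so incomplete extractions can be flagged for manual review.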


Anonymization

Just as IDP can classify documents, it can recognize which fields need to be removed or redacted from them.

With the automated data masking feature of IDP solutions, law firms can reduce the workload of lawyers or paralegals by hundreds of hours.


Instead of manually masking or removing sensitive information in documents, lawyers can focus on more critical tasks.
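A minimal sketch of automated masking on extracted text is shown below. The two patterns (e-mail addresses and IBAN-like account numbers) are simplified assumptions for illustration; they will not catch every real-world format, and production redaction would also need to cover names, addresses, and the document images themselves.

```python
import re

# Simplified patterns for illustration; they do not cover every valid format.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


sample = "Pay to NL91ABNA0417164300, questions to j.doe@example.com."
print(redact(sample))
```

Labeled placeholders make it easier for a second reviewer to see what kind of information was masked and where.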


Verification

What is the use of documents if they are forged or not authentic? Unfortunately, the naked eye cannot always detect document fraud, as forgery methods such as deepfakes have become more sophisticated.

Fortunately, OCR technology has also evolved thanks to AI. Together, they enable Intelligent Document Processing solutions to detect signs of forgery through image analysis, duplicate detection, and EXIF analysis.

The possible forgeries or duplicates can be detected early, and verification can be done without investing lawyers’ time to do these tasks manually. For cartel damage claims, this would mean a higher chance of succeeding in antitrust proceedings. 
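Of the checks mentioned above, duplicate detection is the simplest to sketch. The example below groups uploaded files whose content is byte-for-byte identical by comparing SHA-256 digests. The file names and contents are invented for illustration, and this approach only catches exact copies; re-scanned or edited versions of the same document would need image or content comparison instead.

```python
import hashlib
from collections import defaultdict


def find_duplicates(documents: dict[str, bytes]) -> list[list[str]]:
    """Group document names whose content is byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(name)
    # Only digests shared by more than one file indicate duplicates.
    return [names for names in by_digest.values() if len(names) > 1]


# Hypothetical uploads; in practice the bytes would be read from files.
uploads = {
    "invoice_a.pdf": b"%PDF-1.4 invoice 1001",
    "invoice_a_copy.pdf": b"%PDF-1.4 invoice 1001",
    "invoice_b.pdf": b"%PDF-1.4 invoice 1002",
}
print(find_duplicates(uploads))
```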

Now that we have covered the primary roles of IDP in cartel damage claims, let’s take a look at the main benefits.

The benefits of IDP in cartel damage claims

So what does this all mean to law firms? Clearly, IDP can help law firms automate document processing to increase the success and efficiency of antitrust proceedings.

The main benefits that come with the use of IDP in cartel damage claims include:

  • Operational excellence
  • Faster turnaround time 
  • Cost reduction
  • Organized data 

All of these benefits can be crucial for successful cartel damage claims. 

In the following sections, we will dive deeper into these benefits.

Operational excellence

Like many lawsuits, cartel damage claims revolve heavily around the evidence provided in documents. Unfortunately, documents do not sort themselves, nor is the relevant data magically extracted.

One of the main benefits of IDP is that it eliminates the hassle of collecting, sorting, extracting, verifying, and anonymizing documents. Law firms no longer need to deal with document processing the traditional way or think about IT-related issues. 

These expensive hours can now be dedicated to more critical tasks. In addition, IDP software does not get tired the way people do. Law firms can truly achieve operational excellence with such a solution.

Faster turnaround time

As mentioned earlier in this blog, time is of the essence. For cartel damage claims, the longer the legal proceedings take, the more costly they become for law firms and their clients.

One of the main benefits of using IDP is the elimination of manual document processing, which would result in faster turnaround times. 

Let’s assume that for cartel damage claims, you would need to manually process 100,000 documents, with each one averaging 5 pages.

Let’s also presume that a lawyer or back-office clerk can process two pages per minute, which equals 120 pages per hour. With five people working on this project at the same speed (600 pages per hour combined), processing all 500,000 pages would take about 833 hours.

In comparison, an IDP solution that can process 10,000 pages per hour would complete the project within 50 hours. That would equal a time reduction of about 94%.
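The comparison can be reproduced with a few lines of arithmetic. The sketch below uses the figures from the example; the one assumption is that the IDP throughput is read as 10,000 pages per hour, since that is the reading that yields the 50 hours stated above.

```python
documents = 100_000
pages_per_document = 5
pages_per_person_per_hour = 2 * 60  # two pages per minute
team_size = 5
idp_pages_per_hour = 10_000  # assumption: throughput counted in pages

total_pages = documents * pages_per_document

# Elapsed hours when the whole team works in parallel.
manual_hours = total_pages / (pages_per_person_per_hour * team_size)
idp_hours = total_pages / idp_pages_per_hour
reduction = 1 - idp_hours / manual_hours

print(f"Manual: {manual_hours:.0f} h, IDP: {idp_hours:.0f} h, reduction: {reduction:.0%}")
```

Note that the manual figure is elapsed time for the team as a whole; the total effort across all five people is five times higher.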

Let’s look at how this would impact the costs next.

Cost reduction

As the document processing time is reduced, the costs will also be decreased. 

Let’s assume that, on average, each paralegal or lawyer working on cartel damage claims earns €150 per hour. Five paralegals or lawyers each working the 833 hours needed to process 500,000 pages add up to roughly 4,167 billable hours, or about €625,000 in costs.

The price of IDP vendors depends on various factors. However, for this example, a rough estimate for a tailored solution for cartel damage claims based on 500,000 pages processed would be €30,000.

Based on the assumptions above, the cost reduction with the use of IDP would be about 95%. The Return on Investment (ROI) of using IDP would be close to 20:

ROI = (€625,000 - €30,000) / €30,000 ≈ 19.8

The cost reduction would also lower the threshold for law firms and their claimants to initiate litigation.

High costs involved in cartel damage claims proceedings are often why undertakings shy away from legal action against cartels. 

Thanks to Intelligent Document Processing, law firms can now significantly reduce the expenditure for time-consuming and error-prone administrative tasks.

Organized data

With data coming in from multiple sources and formats, it is essential to keep data organized in a secured location.

The major benefit of IDP for law firms is that it can do that and more. With the ability to automate classification with AI and machine learning, Intelligent Document Processing software can label documents based on rules or pattern recognition.

This makes it easier to find the data needed to substantiate the damage suffered and the causal link between the damage and the infringement.

In addition to labeling and classifying each document, IDP transforms all scanned documents into a format that makes text and data searchable for lawyers. This is not the case when document processing is done manually.

Organized data reduces both the chance of missing crucial information and the time spent searching for relevant documents.
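Once documents are available as structured records, the invoice search mentioned earlier (invoices over €1,000 from roughly the last ten years for a single client) reduces to a short filter. The sketch below assumes hypothetical records with `client`, `date`, and `total_eur` fields and a fixed cutoff date chosen for the example.

```python
from datetime import date

# Hypothetical structured records as an extraction pipeline might deliver them.
invoices = [
    {"client": "Client A", "date": date(2016, 4, 2), "total_eur": 1850.00},
    {"client": "Client A", "date": date(2009, 1, 15), "total_eur": 5200.00},
    {"client": "Client A", "date": date(2019, 9, 30), "total_eur": 640.00},
    {"client": "Client B", "date": date(2018, 6, 11), "total_eur": 3100.00},
]


def find_invoices(records, client, min_total, since):
    """Return one client's invoices above min_total dated on or after since."""
    return [
        record
        for record in records
        if record["client"] == client
        and record["total_eur"] > min_total
        and record["date"] >= since
    ]


# Cutoff date is an example value standing in for "the last ten years".
matches = find_invoices(invoices, "Client A", 1000, date(2012, 1, 1))
for invoice in matches:
    print(invoice["date"].isoformat(), invoice["total_eur"])
```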

Now that we have covered the key benefits of IDP for cartel damage claims, let’s look at how Klippa can help you.

The Klippa solutions for cartel damage claims

At Klippa, we have developed two solutions that can be used separately or combined to tackle document processing challenges in cartel damage claims.

The first solution is a white-label portal in which the claimants can upload their documents to prove the infringement of antitrust rules. Its value includes easy data accessibility and the elimination of manual data collection procedures. 

The second one is Klippa DocHorizon, the one-stop Intelligent Document Processing solution to automate all document-related workflows. It can extract data, convert, classify, anonymize, verify, and collect documents from various sources. 

Our solutions are scalable and proficient in processing large quantities of documents quickly. On top of that, we have made it our top priority to comply with data privacy regulations such as the GDPR.

If you are worried about privacy or security, we can always provide hosting options in your country or even on-premise if needed.

Whether you are looking for an end-to-end solution or a custom one, we at Klippa can help you with your business case. Are you ready for a demonstration? Fill in the form below! 

 Schedule a free online demonstration

A clear overview of Klippa in only 30 minutes.
