Reducto have parsed over 400,000 millions docs for their customers

This isn't AI hype. That is thousands of hours being saved for businesses as we speak.

Published: July 24th 2025

Reducto have parsed over 400,000 millions docs for their customers to help with backend processing.

This isn't AI hype.

That is thousands of hours being saved for businesses as we speak.

From a first-principles perspective, this makes sense.

AI can now read and parse docs the way a human can.

That was not possible previously.

That is a real unlock in terms of technological capability.

There must be a bunch of enterprises just waiting to roll something like that out. They are probably waiting for some more proof of concept.

Sure, you can upload a PDF to Gemini or ChatGPT, but you'll get maybe 80% extraction accuracy.

That doesn't cut it when you're dealing with medical, legal and financial docs.

That's where you need cracked AI teams like Reducto, super focused on a specific problem like this.

They are leveraging a bunch of the latest tools around Vision Language Models (VLMs), Computer Vision (CV) and Optical Character Recognition (OCR) to hit accuracy rates of 95% to 99%.
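To get a feel for why the jump from roughly 80% to the high 90s matters, here is a rough back-of-the-envelope sketch. It assumes "accuracy" means per-field accuracy, that errors are independent, and that a document has 50 fields to extract. All three are my own assumptions for illustration, not figures from Reducto.

```python
# Rough arithmetic only. Assumes "accuracy" is per-field and errors are
# independent; both are simplifying assumptions, not claims from the talk.

def expected_errors(fields_per_doc: int, accuracy: float) -> float:
    """Expected number of wrongly extracted fields in one document."""
    return fields_per_doc * (1.0 - accuracy)


def chance_doc_is_clean(fields_per_doc: int, accuracy: float) -> float:
    """Probability that every field in the document is extracted correctly."""
    return accuracy ** fields_per_doc


FIELDS = 50  # hypothetical form size, chosen for illustration

for acc in (0.80, 0.95, 0.99):
    print(
        f"{acc:.0%} accuracy: ~{expected_errors(FIELDS, acc):.1f} wrong fields, "
        f"{chance_doc_is_clean(FIELDS, acc):.1%} chance of a fully clean doc"
    )
```

Under these assumptions, 80% accuracy means about ten wrong fields on every 50-field document, which is a lot of manual checking on a medical or financial form.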

And for docs the AI is unsure about, it can flag them for human review. The humans can then label those docs, which gives more data to further train the AI and improve the extraction process.
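The flag-for-review loop could look something like the sketch below. This is not Reducto's implementation or API. The `Extraction` class, the `route` and `to_training_example` helpers, and the 0.90 threshold are all hypothetical names and values I made up to show the idea.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.90  # hypothetical cut-off, tune per use case


@dataclass
class Extraction:
    doc_id: str
    fields: dict        # extracted field name -> value
    confidence: float   # model's self-reported confidence, 0.0 to 1.0


def route(extractions, threshold=REVIEW_THRESHOLD):
    """Split extractions into auto-accepted and flagged-for-human-review."""
    accepted, needs_review = [], []
    for ex in extractions:
        if ex.confidence >= threshold:
            accepted.append(ex)
        else:
            needs_review.append(ex)
    return accepted, needs_review


def to_training_example(ex, corrected_fields):
    """Pair the model's guess with the human's correction for later training."""
    return {"doc_id": ex.doc_id, "predicted": ex.fields, "label": corrected_fields}


batch = [
    Extraction("invoice-1", {"total": "120.00"}, confidence=0.98),
    Extraction("form-7", {"consent_checkbox": "unchecked"}, confidence=0.55),
]
accepted, needs_review = route(batch)

# A reviewer fixes the flagged doc, and the fix becomes a labelled example.
examples = [
    to_training_example(ex, {"consent_checkbox": "checked"}) for ex in needs_review
]
```

The threshold is the main dial here: set it higher and more docs go to humans, set it lower and more errors slip through unreviewed.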

Complicated layouts like tables with merged cells, multi-column documents, and text and charts that run across pages are why you need dedicated AI teams and businesses to solve this problem just now.

Funnily, one thing AI struggles with processing is the innocent little checkbox.

Excel sheets are also apparently quite difficult to do well. It's one of the areas we still need to figure out, and one where we can imagine there is a lot of demand!

Got the above info from a great talk the Reducto founder @aditabraham gave on a podcast episode with Jason Liu, called Better RAG Through Better Data.