As much as I adore Jellyfin, Nextcloud, and other popular self-hosted services, FOSS tools that make annoying tasks simpler have a special place in my heart. Paperless-ngx is a prime example of these apps, as this document management tool is the sole reason my invoices, tax filings, receipts, and other records are properly organized instead of staying stuck in a mess of emails and folders.
The best part? Paperless-ngx’s OCR, analysis, and other document-processing features can be further enhanced via LLM-powered companion services like Paperless-GPT. But don’t let the name fool you; although it supports cloud LLM providers, it works exceedingly well with my local Ollama LLM collection.
Paperless-GPT can add AI-powered processing features to Paperless-ngx
Its OCR capabilities are terrific
Before I talk about Paperless-GPT, let me add that the base ngx utility is perfectly fine on its own, and that’s pretty much how I’d been using it until last year. However, there are limits to the built-in optical character recognition. When a document has too many visual elements or tables, the OCR gets confused and fills the extracted text with random, meaningless symbols. Before encountering Paperless-GPT, I had to manually edit the OCR output or rely on custom tags, titles, and other document attributes to help me find those documents in the future.
Paperless-GPT overhauls the OCR capabilities by throwing LLMs into the mix. While it accepts cloud APIs, my preferred method involves relying on local vision LLMs, and it works really well. Of course, there were a couple of situations where even Paperless-GPT failed to recognize haphazardly arranged data. But for the most part, its “experimental” OCR provisions are rock-solid, especially when it comes to extracting text from low-quality photographs.
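To give a rough idea of what vision-LLM OCR looks like against a local Ollama server, here is a minimal sketch that sends one image to Ollama's /api/generate endpoint and reads back the transcription. This is only an illustration and not Paperless-GPT's own code; the server address, the model name, the prompt, and the helper names are all assumptions made for the example.

```python
import base64
import json
import urllib.request

# Assumptions: the address of the Ollama host and a vision model already pulled into it.
OLLAMA_URL = "http://192.168.1.50:11434"
VISION_MODEL = "minicpm-v"


def build_ocr_request(image_bytes: bytes, model: str = VISION_MODEL) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        # Illustrative prompt, not the one Paperless-GPT ships with.
        "prompt": "Transcribe all text in this image. Keep tables as rows of text.",
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for a single JSON reply instead of a token stream.
        "stream": False,
    }


def ocr_image(path: str, base_url: str = OLLAMA_URL) -> str:
    """Send one image file to the vision model and return the generated text."""
    with open(path, "rb") as fh:
        body = json.dumps(build_ocr_request(fh.read())).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    # Vision models on older GPUs can be slow, hence the generous timeout.
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["response"]
```

Calling ocr_image("receipt.jpg") with a real file and a reachable server would return whatever text the model managed to read, which is the raw material a tool like Paperless-GPT then pushes back to Paperless-ngx.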
Since Paperless-GPT connects to my Paperless-ngx instance, it can pull any files with the paperless-gpt tag, and I’ve created a workflow to append this tag to every new upload. Once Paperless-GPT has worked its magic on my documents, I can hit the Apply button to automatically send the freshly generated content to my Paperless-ngx instance, so I don’t have to manually copy and paste it later.
It can also generate titles, correspondents, and custom fields
Another neat aspect of Paperless-GPT is that it can leverage my local LLMs for more than just OCR. For example, I can use the tool to generate titles, tags, correspondents, dates, and even custom fields for my documents and push these changes to the actual files stored on Paperless-ngx. So far, I’ve had decent results on all of them, though I must admit that Paperless-AI has better automatic tag generation (and I plan to go over this in a future article).
Paperless-GPT also includes an ad-hoc analysis tool, which comes in handy when I want some context about a multi-page document but I’m too tired to sift through all the technical jargon. The default analysis prompt only covers invoices, but Paperless-GPT lets me modify it to generate summaries for practically everything. Likewise, I can create my own prompts for custom fields, which is particularly useful for warranty pamphlets, press releases, and other non-monetary documents. In fact, I can modify the prompts for practically every tool on Paperless-GPT, be it the title generator, auto tags, or even OCR.
It needs a Paperless-ngx instance and local (or cloud) LLMs
And you’ll have to link them in its config file
Since Paperless-GPT is a companion utility, it needs a Paperless-ngx container to pull documents from. I’ve had a Paperless-ngx server running for several months, so all I had to do was grab its URL and an API key. I got the latter by hitting the Generate button under the API Auth Key section of the My Profile page.
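Before handing the URL and token to another service, it can help to confirm that they work. The sketch below queries the Paperless-ngx REST API for documents carrying the paperless-gpt tag, using token authentication. The server address and token are placeholders, the helper names are hypothetical, and the tags__name__iexact filter is my assumption about the right query parameter, so treat this as a starting point rather than a reference.

```python
import json
import urllib.parse
import urllib.request

# Placeholders: swap in your own Paperless-ngx URL and the generated API token.
PAPERLESS_URL = "http://192.168.1.40:8000"
API_TOKEN = "replace-with-your-token"


def tagged_documents_url(base_url: str, tag: str = "paperless-gpt") -> str:
    """Build the documents endpoint URL filtered by tag name (assumed filter name)."""
    query = urllib.parse.urlencode({"tags__name__iexact": tag})
    return f"{base_url.rstrip('/')}/api/documents/?{query}"


def auth_headers(token: str) -> dict:
    """Paperless-ngx token auth uses an 'Authorization: Token <token>' header."""
    return {"Authorization": f"Token {token}", "Accept": "application/json"}


def list_tagged_titles(base_url: str = PAPERLESS_URL, token: str = API_TOKEN) -> list:
    """Return the titles on the first page of results for the tag."""
    req = urllib.request.Request(
        tagged_documents_url(base_url), headers=auth_headers(token)
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    # The API paginates; only the first page of "results" is read here.
    return [doc["title"] for doc in payload["results"]]
```

If the token is wrong, the request should fail with an HTTP error instead of returning a list, which makes this a quick sanity check before wiring up the companion container.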
Likewise, I needed an LLM provider to harness Paperless-GPT’s AI tools, as I couldn’t just pass a GPU to the container and call it a day. I used an Ollama LXC (or rather, a Debian LXC armed with Ollama and its LLMs) that’s powered by my aged Pascal card. By default, it wasn’t reachable from my Paperless-GPT container, so I had to edit the ollama.service config file and add the following lines:
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
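After a change like this, systemd typically needs a daemon-reload and the Ollama service a restart before the new binding takes effect. A quick way to confirm that the server is reachable from another machine is to ask it for its model list via the /api/tags endpoint, as in the sketch below. The IP address is an assumption (11434 is Ollama's default port), and the function names are made up for the example.

```python
import json
import urllib.request

# Assumption: the LXC's address on the local network; 11434 is Ollama's default port.
OLLAMA_URL = "http://192.168.1.50:11434"


def model_names(tags_payload: dict) -> list:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_remote_models(base_url: str = OLLAMA_URL) -> list:
    """Fetch the model list; a connection error suggests Ollama is not listening on the network."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return model_names(json.load(resp))
```

Running list_remote_models() from the machine that hosts the Paperless-GPT container should return the models pulled into Ollama; a refused connection usually means the service is still bound to localhost only.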
Once I’d completed all the prerequisites, it was time to spin up a Paperless-GPT container. I could’ve gone down the Proxmox VE Helper-Scripts repo route, but since there are too many variables to modify (and troubleshoot when things go wrong), using a compose.yml file seemed like the best option. Fortunately, Paperless-GPT’s GitHub page includes a detailed Compose config, and all I did was modify the Paperless-ngx URL, API token, and Ollama variables (and comment out the OpenAI parameters using a ‘#’) before running docker compose up -d.
Throw in Paperless-AI to make your document management even simpler
Besides Paperless-GPT, Paperless-AI is another handy tool for folks who love Paperless-ngx as much as I do. Although the former has better OCR capabilities, Paperless-AI supports RAG chat and can find documents stored on my Paperless-ngx instance using just the context. It also has better tagging provisions and pairs just as well with my self-hosted LLMs.
