
As your files and folders grow, it becomes increasingly difficult to find what you need, and this is not a Windows-specific problem: I have run into it on both Linux and macOS. On Windows, you have tools like Everything that make searching for files much easier and more efficient. If you are more comfortable with the command line, you can also use a terminal-based file manager. Another handy tool I recently discovered is Diskover. It is an open-source file system indexer and search engine that uses Elasticsearch to index and manage data across different storage systems. It creates a centralized catalog of all your file names and metadata, including sizes and dates, from various locations, allowing you to quickly search and analyze your data in one place.

Diskover packs features that matter

Plus, it's open-source and can be self-hosted

Diskover is an open-source tool that crawls your storage, builds an index of all files, and lets you search and analyze that data from a browser. It runs a file system crawler across directories you specify, whether that is local storage, attached drives, network shares, or mounted cloud buckets, and collects metadata for every file and folder, including names, paths, sizes, timestamps, and owners. All of this is stored in an Elasticsearch database, which enables fast and scalable searches. Because the crawler collects only metadata, it does not read file contents, and indexing does not modify or delete your files. Once the initial scan is complete, you can update the index at any time by rescanning or scheduling updates, so the index keeps up with the current state of your storage.
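To make the metadata-only idea concrete, here is a minimal Python sketch of what such a crawl collects. It is not Diskover's actual crawler: the function name crawl_metadata and the record fields are illustrative assumptions. It walks a directory tree and records names, paths, sizes, timestamps, and numeric owner IDs without reading any file's contents.

```python
import os
import tempfile


def crawl_metadata(root):
    """Yield one metadata record per file under root (hypothetical helper).

    Only os.lstat() is called on each file, so file contents are never read.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # Skip files that vanish or cannot be inspected mid-crawl.
                continue
            yield {
                "name": name,
                "path": path,
                "size": st.st_size,       # bytes
                "mtime": st.st_mtime,     # seconds since the epoch
                "owner_uid": st.st_uid,   # numeric owner ID on POSIX systems
            }


if __name__ == "__main__":
    # Demo on a throwaway directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "notes.txt"), "w") as fh:
            fh.write("hello")
        for record in crawl_metadata(root):
            print(record["name"], record["size"])
```

A real indexer would send records like these to a search backend instead of printing them.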

The web interface provides a simple search bar for quick queries as well as advanced filtering options to refine results by size, type, date range, ownership, and more. You can also sort results by fields such as size, modification date, or owner, making it easy to locate files or identify large and unused data. The built-in dashboard gives a high-level view of your storage, showing total files and folders, occupied space, and breakdowns by file type and age.
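The kind of breakdown the dashboard shows can be approximated over any list of metadata records. The sketch below is an illustration in Python, not Diskover's code or its query syntax. The record layout, the sample data, and the function names summarize_by_extension and largest_files are all assumptions.

```python
import os
from collections import defaultdict


def summarize_by_extension(records):
    """Return {extension: (file_count, total_bytes)} for metadata records."""
    totals = defaultdict(lambda: [0, 0])
    for rec in records:
        # Lowercase the extension so ".MKV" and ".mkv" are counted together.
        ext = os.path.splitext(rec["name"])[1].lower() or "(none)"
        totals[ext][0] += 1
        totals[ext][1] += rec["size"]
    return {ext: tuple(pair) for ext, pair in totals.items()}


def largest_files(records, min_size=0, limit=10):
    """Return up to `limit` records of at least min_size bytes, largest first."""
    matches = [rec for rec in records if rec["size"] >= min_size]
    return sorted(matches, key=lambda rec: rec["size"], reverse=True)[:limit]


# Hypothetical sample records standing in for a real index.
SAMPLE = [
    {"name": "movie.MKV", "size": 4_000_000_000},
    {"name": "clip.mkv", "size": 700_000_000},
    {"name": "report.pdf", "size": 250_000},
    {"name": "README", "size": 1_200},
]

if __name__ == "__main__":
    print(summarize_by_extension(SAMPLE))
    print([rec["name"] for rec in largest_files(SAMPLE, min_size=1_000_000)])
```

Filtering by a size threshold and sorting largest-first is the same idea as using the size filter and sort options in the web interface to spot large files.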

Diskover scales well. Its crawler scans in parallel, which lets it index millions of files. It works with virtually any storage source, including Windows and Linux file systems, network shares over NFS or SMB, and mounted cloud storage. As long as the crawler can access a location, it can include it in the index.

Diskover is useful for everyone

Whether you're a normal user or an IT admin

Diskover is built for anyone dealing with large amounts of data and struggling to keep it organized, whether that is a personal collection at home or a massive storage setup in a company. For most of us, it solves the familiar problem of having too many drives, folders, and files scattered everywhere. Instead of manually digging through hard drives, NAS devices, or media servers, Diskover enables you to search across all of them simultaneously and find what you need in seconds.

If you have large media libraries or long-term archives, you will appreciate Diskover's ability to reveal where space is being wasted, highlight duplicate files, and show exactly how your storage is used.

It is also quite useful for small businesses, offering the same capabilities at scale to make shared storage far more manageable. Over the years, companies accumulate millions of files across different servers and department folders. Diskover consolidates all of that into a single searchable index, allowing anyone to instantly locate the documents, spreadsheets, or project files they need.

Setting up Diskover isn’t exactly child’s play

It takes some effort

Setting up Diskover requires some technical effort, but it’s manageable. First, you’ll need a host machine. This can be a physical server, a virtual machine, or an always-on PC or NAS. Diskover runs on Linux, macOS, or Windows, though most users prefer Linux or Docker because setup is simpler there. Make sure the system has enough memory and storage to run Elasticsearch, which stores the search index. For a small home setup, 4–8 GB of RAM is usually sufficient.

Next, install Elasticsearch and Diskover. The easiest method is Docker Compose, which sets up two containers, one for Diskover and one for Elasticsearch, and links them automatically. Edit the provided compose file to point to your directories or network shares, then launch it. Alternatively, you can install Elasticsearch manually and run Diskover as a Python application, but the Docker route is faster to set up and requires less configuration. I, for one, decided to torture myself with a manual install, and while it worked, I ended up sacrificing a few hours.

Once Diskover is running, start the first crawl to build the index. The community edition does not have a “scan now” button, so you run a command, for example:

docker exec diskover python3 /app/diskover/diskover.py -i myindex /data

This scans the /data directory and stores the results in an index called myindex. How long the first crawl takes depends on how many files you have, though it is usually quick.

After indexing, open the web interface at http://your-server:8015. Log in with the default community credentials (diskover / darkdata) and select your index. You can now browse files or use search and filters to explore your storage.

To keep the index updated, schedule regular crawls using cron jobs on Linux or Task Scheduler on Windows. You can also add new storage locations at any time by creating additional indices or rescanning existing directories. The community edition searches one index at a time, so it is easiest to keep all folders you want indexed under a single root folder.
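One way to script the recurring crawl is a small Python wrapper that a cron job or Task Scheduler entry can call. This is a sketch under assumptions: it reuses the docker exec command shown earlier in this article, the default container name diskover comes from that example, and the function names and the file name recrawl.py are hypothetical.

```python
import subprocess
import sys


def build_crawl_command(index_name, crawl_path, container="diskover"):
    """Build the article's docker exec crawl command as an argument list."""
    return [
        "docker", "exec", container,
        "python3", "/app/diskover/diskover.py",
        "-i", index_name, crawl_path,
    ]


def run_crawl(index_name, crawl_path, container="diskover"):
    """Run one crawl and return the process exit code (0 means success)."""
    command = build_crawl_command(index_name, crawl_path, container)
    try:
        return subprocess.run(command).returncode
    except FileNotFoundError:
        # The docker executable is not installed or not on PATH.
        print("docker not found", file=sys.stderr)
        return 127


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: recrawl.py INDEX_NAME PATH")
    else:
        sys.exit(run_crawl(sys.argv[1], sys.argv[2]))
```

Saved as recrawl.py, a crontab entry such as `0 2 * * * python3 /path/to/recrawl.py myindex /data` would run the crawl every night at 02:00. The script path is a placeholder, and myindex and /data are the values from the earlier example.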

Self-hosting software is easy

Self-hosting software is simpler than most people think, and it gives you complete control over your setup and data. If you’re not sure where to start, try these six free self-hosted services. With a little time and effort, you can even cut costs. I recently put this into practice by setting up an open-source TV streaming server on my own.

URL: https://www.xda-developers.com/diskover-free-self-hosted-file-indexer/