dotnet add package Hangfire.Community.Raft --version 0.0.1
Hangfire job storage backed by a DotNext Raft cluster. Job state lives in replicated memory; durability comes from a per-node write-ahead log and snapshots on local disk. No SQL Server, no Redis, no external database.
Each application node is simultaneously a Hangfire client/server and a Raft cluster member. A cluster of one works too and still survives restarts through the WAL.
var options = new RaftStorageOptions
{
    SelfEndpoint = "10.0.0.1:7000",            // this node's Raft endpoint
    WalPath = "/var/lib/myapp/hangfire-raft",  // node-local persistent directory
};
options.Members.Add("10.0.0.1:7000"); // identical list on every node,
options.Members.Add("10.0.0.2:7000"); // including the node itself
options.Members.Add("10.0.0.3:7000");
await using var storage = await RaftJobStorage.StartAsync(options);
GlobalConfiguration.Configuration.UseStorage(storage);
using var server = new BackgroundJobServer(storage);
BackgroundJob.Enqueue(() => Console.WriteLine("Hello from the cluster"));
Every node needs two reachable TCP ports: the Raft port you configure and the
command-forwarding port directly above it (Raft port + RpcPortOffset, default +1).
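
As a minimal sketch of the port arithmetic described above, the helper below derives the forwarding port from a `host:port` endpoint so firewall rules can be generated for each member. `ForwardingPort` is a hypothetical helper written for this illustration and is not part of the package; only the "Raft port + RpcPortOffset" rule comes from the text.

```csharp
using System;

// Hypothetical helper (not part of the package): compute the command-forwarding
// port for a "host:port" Raft endpoint, given RpcPortOffset (default 1).
static int ForwardingPort(string endpoint, int rpcPortOffset = 1)
{
    // Split on the last ':' so the host part is left untouched.
    int colon = endpoint.LastIndexOf(':');
    if (colon <= 0 || colon == endpoint.Length - 1)
        throw new FormatException($"Expected host:port, got '{endpoint}'.");

    int raftPort = int.Parse(endpoint[(colon + 1)..]);
    int forwardingPort = raftPort + rpcPortOffset;
    if (forwardingPort > 65535)
        throw new ArgumentOutOfRangeException(nameof(rpcPortOffset), "Forwarding port exceeds 65535.");

    return forwardingPort;
}

// List the two ports each member needs to have reachable.
foreach (string member in new[] { "10.0.0.1:7000", "10.0.0.2:7000", "10.0.0.3:7000" })
{
    Console.WriteLine($"{member} -> command forwarding on port {ForwardingPort(member)}");
}
```

Because the offset is applied to every member, choosing Raft ports that are adjacent on the same host (for example in a localhost test cluster) would make one node's forwarding port collide with the next node's Raft port.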
The dashboard works as usual (app.UseHangfireDashboard() after UseStorage); it reads from
the local node's replica.
Try it locally with the sample:
dotnet run --project samples/Hangfire.Raft.Sample # single node
dotnet run --project samples/Hangfire.Raft.Sample -- 0 # three terminals: nodes 0, 1, 2
Hangfire API call (enqueue, state change, fetch, lock, ...)
|
v
Command (binary-serialized batch of ops)
|
| leader? -> append to Raft log -> replicate to majority -> commit
| follower? -> forward over TCP to the leader, which appends/commits
v
every node applies the committed entry to its in-memory store (deterministic)
|
v
the submitting node waits for ITS OWN apply, then returns the result
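
The last step of the flow, where the submitter returns only after its own node has applied the entry, can be illustrated with a toy single-process model. This is a sketch of the pattern under simplifying assumptions (no network, no leader election, appending to a list stands in for replicate + commit); `Submit`, `ApplyCommitted`, and the `key=value` command format are invented for the example and are not the package's API or wire format.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Toy model of the write path; every name here is hypothetical.
var log = new List<string>();                                    // committed entries, in order
var store = new Dictionary<string, string>();                    // stands in for the in-memory store
var waiters = new Dictionary<int, TaskCompletionSource<int>>();  // log index -> waiting submitter
int applied = 0;                                                 // number of entries applied locally

// Submit: append the command (stand-in for replicate + commit) and return a
// task that completes only when this node has applied that entry.
Task<int> Submit(string command)
{
    log.Add(command);
    int index = log.Count; // 1-based index of the new entry
    var tcs = new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);
    waiters[index] = tcs;
    return tcs.Task;
}

// Apply: deterministic transition for each committed entry not yet applied.
void ApplyCommitted()
{
    while (applied < log.Count)
    {
        string[] parts = log[applied].Split('=', 2); // "key=value"
        store[parts[0]] = parts[1];
        applied++;
        if (waiters.Remove(applied, out var tcs))
            tcs.SetResult(applied); // release the submitter after its entry is applied
    }
}

Task<int> pending = Submit("job:1=Enqueued");
Console.WriteLine($"completed before apply: {pending.IsCompleted}");
ApplyCommitted();
int appliedIndex = await pending;
Console.WriteLine($"applied index {appliedIndex}, state = {store["job:1"]}");
```

Waiting for the local apply rather than for the commit alone is what lets a node read back its own write from its local replica immediately after the call returns.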
A job held by a crashed worker becomes fetchable again (after FetchInvisibilityTimeout plus up to MaintenanceInterval, and only while a leader has quorum). This gives at-least-once execution, the same model as the SQL storage's sliding invisibility timeout.

The node that handles a write flushes it synchronously to its log under WalPath before it is acknowledged, so an acknowledged write survives a crash of that node. The log is periodically compacted into snapshots, and on restart a node replays snapshot + log before serving, then catches up from the leader. On a multi-node cluster the synchronous flush covers only the node that handled the write; its peers persist the entry through a background flush a moment later. A simultaneous crash of the whole committing majority before that background flush (for example a single-rack power loss) can therefore still lose a just-committed entry, so spread members across failure domains.

| Option | Default | Meaning |
|---|---|---|
| `SelfEndpoint` | required | This node's Raft endpoint (host:port). |
| `Members` | required | Raft endpoints of all members, identical on every node, including self. |
| `WalPath` | `<app>/hangfire-raft` | Node-local directory for log + snapshots. |
| `RpcPortOffset` | 1 | Forwarding port = Raft port + offset. |
| `SubmitTimeout` | 30 s | Max time for a single write (replication + local apply). |
| `LockLeaseTimeout` | 2 min | Distributed lock lease; renewed at a third of it. |
| `FetchInvisibilityTimeout` | 5 min | A crashed worker's job becomes fetchable again on the first maintenance pass after this (so up to FetchInvisibilityTimeout + MaintenanceInterval, and only with quorum). |
| `MaintenanceInterval` | 30 s | Leader cleanup cadence. |
| `SnapshotInterval` | 4096 | Applied log entries between state-machine snapshots; the log compacts up to each snapshot. A tuning/testing knob. |
| `LowerElectionTimeoutMs` / `UpperElectionTimeoutMs` | 1500 / 3000 | Raft election timeouts, in milliseconds. |
| `LoggerFactory` | none | Diagnostics for the cluster and storage. |
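
To make the timing rows in the table concrete, the sketch below works out two derived values from the listed defaults: the worst-case delay before a crashed worker's job can be fetched again, and the lock renewal period. It is plain `TimeSpan` arithmetic; the function names are made up for the example and nothing here calls the package.

```csharp
using System;

// Defaults copied from the option table; local names mirror the option names.
TimeSpan fetchInvisibilityTimeout = TimeSpan.FromMinutes(5);
TimeSpan maintenanceInterval = TimeSpan.FromSeconds(30);
TimeSpan lockLeaseTimeout = TimeSpan.FromMinutes(2);

// Reclaim happens on the first maintenance pass after the invisibility timeout,
// so the worst case adds one full maintenance interval (assuming quorum is held).
static TimeSpan WorstCaseRefetchDelay(TimeSpan invisibility, TimeSpan maintenance) =>
    invisibility + maintenance;

// The lock lease is renewed at a third of its length.
static TimeSpan LockRenewalPeriod(TimeSpan lease) => lease / 3;

Console.WriteLine($"worst-case re-fetch delay: {WorstCaseRefetchDelay(fetchInvisibilityTimeout, maintenanceInterval)}");
Console.WriteLine($"lock renewal period:       {LockRenewalPeriod(lockLeaseTimeout)}");
```

With the defaults this gives 5 min 30 s and 40 s respectively. Jobs that routinely run longer than the invisibility timeout are not affected by it as long as their worker stays alive; the figure only bounds recovery after a crash.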
Cluster membership is fixed by Members. Replacing a node means restarting the cluster processes with the updated list (the WAL keeps the data).

[...] (ExpireJob defaults to 24 h in Hangfire) plus recurring job metadata.

[...] (ThreadPool.SetMinThreads) so those continuations are not starved and a write does not stall until SubmitTimeout; the default floor grows by only ~1 thread/second.

GetHealth() reports AppliedIndex and CommitIndex; their difference is the local apply lag, which lets a readiness probe detect a node serving stale reads even while it still sees a leader. A Hangfire.Raft meter publishes counters for ambiguous writes, fetch-lease reclaims (possible duplicate executions) and lock losses, for an OpenTelemetry pipeline or dotnet-counters.

A lock holder that fails to renew its lease within LockLeaseTimeout may lose the lock to another owner while still executing its critical section. The renewal loop logs a warning when this happens, but there is no fencing token, so do not rely on the lock for correctness of non-idempotent external side effects.

Both cluster ports must be confined to a trusted private network. Each node listens on its
Raft port and on the command-forwarding port (Raft port + RpcPortOffset). Neither is
authenticated or encrypted (this matches the default posture of the underlying DotNext Raft
transport), so any host that can reach them can participate in consensus and submit storage
writes.
This matters more than for a typical service because Hangfire executes serialized job payloads: anyone who can submit a write can enqueue a job that runs arbitrary code on a worker — the same exposure as write access to any Hangfire storage (SQL Server, Redis, …). Run the cluster on a private subnet, VPC, or overlay network, and never expose these ports to untrusted clients. Undecodable forwarded commands are rejected before they enter the log, but that is a robustness guard, not an authentication boundary.
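
Returning to the thread-pool note earlier in this section: one way to raise the pool's minimum at startup with `ThreadPool.SetMinThreads` is sketched below. The value 64 is a placeholder assumption, not a recommendation from the package; size it to the burst of concurrent writes you expect.

```csharp
using System;
using System.Threading;

// Hypothetical floor for worker threads; tune for your workload.
const int desiredWorkerFloor = 64;

ThreadPool.GetMinThreads(out int workerMin, out int ioMin);
ThreadPool.GetMaxThreads(out int workerMax, out _);

// Never lower an existing floor, and stay within the pool's maximum.
int newWorkerMin = Math.Min(Math.Max(workerMin, desiredWorkerFloor), workerMax);

// SetMinThreads returns false if the runtime rejects the requested values.
bool changed = ThreadPool.SetMinThreads(newWorkerMin, ioMin);
Console.WriteLine(changed
    ? $"worker thread floor set to {newWorkerMin}"
    : "thread pool rejected the new minimum");
```

Running this once before starting the storage and the Hangfire server keeps the I/O completion floor unchanged and only affects worker threads.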
Run the cluster as a StatefulSet behind a headless Service with a per-pod PersistentVolume for the
WAL. Host names are kept as DnsEndPoints and re-resolved on reconnect, so rescheduled pods rejoin on
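their own (within roughly one DNS TTL), and startup tolerates not-yet-resolvable peers. See
docs/kubernetes.md for the full guide. Ready-to-use pieces: the Kubernetes-ready ASP.NET host in samples/Hangfire.Raft.K8sSample and the example manifests in deploy/kubernetes.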
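
Pods of a StatefulSet behind a headless Service get stable DNS names of the form `<statefulset>-<ordinal>.<service>`, so the member list can be generated rather than hard-coded. The sketch below assumes a StatefulSet named `hangfire` and a headless Service named `hangfire-raft`; both names and the `BuildMembers` helper are hypothetical, and the repository's own guide may wire this differently.

```csharp
using System;
using System.Linq;

// Hypothetical helper: build the identical member list every pod should use.
// Short names like "hangfire-0.hangfire-raft" resolve within the same namespace.
static string[] BuildMembers(string statefulSet, string headlessService, int replicas, int raftPort) =>
    Enumerable.Range(0, replicas)
        .Select(i => $"{statefulSet}-{i}.{headlessService}:{raftPort}")
        .ToArray();

string[] members = BuildMembers("hangfire", "hangfire-raft", replicas: 3, raftPort: 7000);
foreach (string member in members)
{
    Console.WriteLine(member);
}
```

Each pod would then pick its own entry as SelfEndpoint (for example by matching its pod host name) and add the whole list to Members.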
src/Hangfire.Raft the storage implementation
Commands/ replicated op set + binary wire format
State/ deterministic in-memory store + snapshot format
Cluster/ DotNext state machine, Raft host, leader forwarding
Monitoring/ dashboard read API
tests/Hangfire.Raft.Tests unit tests (store, serializer) + cluster integration tests
samples/Hangfire.Raft.Sample runnable console demo (single node or 3-node localhost cluster)
samples/Hangfire.Raft.K8sSample Kubernetes-ready ASP.NET host (Hangfire server + dashboard)
deploy/kubernetes example manifests
docs/kubernetes.md Kubernetes deployment guide
dotnet test runs everything, including tests that boot real single- and three-node clusters
on loopback ports.
| Product | Compatible and computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. Computed: net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows. |
| Version | Downloads | Last Updated |
|---|---|---|
| 0.0.1 | 48 | 6/16/2026 |