A granular look at what the JVM is quietly doing with your strings at the native level, when that work genuinely saves you memory, and when it simply burns CPU for nothing.
1. Wait — It’s Already Running?
If you are running Java 8u20 or any later version with the G1 garbage collector, the JVM ships with a subsystem called String Deduplication. It is disabled by default in OpenJDK, so whether it is active in your production services right now depends on your JVM flags: a base image, a startup script, or a colleague may have switched it on without you noticing, and if not, it is one flag away.
According to JEP 192, which introduced this feature, the motivation was simple: measurements showed that String objects account for roughly 25% of the live heap in many large Java applications, and about half of those strings are duplicates. The JVM team decided it was worth doing something about that automatically.
String Deduplication was introduced in Java 8 Update 20 (August 2014) and was originally implemented only for the G1 garbage collector. Shenandoah later shipped with its own support, and since JDK 18 the same flag also works with ZGC, Serial, and Parallel. The retired CMS collector never supported it.
To enable it explicitly — or to check whether it is already active — you can use these flags:
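```bash
# JDK 8: enable String Deduplication with G1 and print its statistics
java -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics -jar yourapp.jar

# JDK 9 and later: the Print flag was replaced by unified logging
java -XX:+UseG1GC -XX:+UseStringDeduplication '-Xlog:stringdedup*=debug' -jar yourapp.jar

# Verify it is running on an already-started process (requires JDK tools)
jcmd <pid> VM.flags | grep StringDedup
```
With statistics enabled, the JVM writes a statistics block to its log output after each deduplication cycle, not just once at exit. Now, before we dive into whether those statistics will make you happy or worried, let us understand exactly what is happening under the hood.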
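If you would rather check the flag from inside the application, for example in a startup log line, the following is a minimal sketch. It assumes a HotSpot-based JVM, where `com.sun.management.HotSpotDiagnosticMXBean` is available; the class name `DedupFlagCheck` and the `"unknown"` fallback are illustrative choices, not part of any JDK API.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class DedupFlagCheck {

    /** Returns "true" or "false" on HotSpot, or "unknown" if the option cannot be read. */
    static String dedupFlag() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseStringDeduplication").getValue();
        } catch (RuntimeException e) {
            // Non-HotSpot JVMs, or builds without this option, end up here.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseStringDeduplication = " + dedupFlag());
    }
}
```

This reads the same value that `jcmd <pid> VM.flags` reports, so it is only a convenience when attaching JDK tools to the process is awkward.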
2. What Actually Happens at the Native Level
A Java String object has two parts: the object itself (a thin shell with metadata) and an underlying char[] or byte[] array that holds the actual characters. As of Java 9 and the Compact Strings change (JEP 254), strings that contain only Latin-1 characters are stored one byte per character in a byte[], which already halves the raw memory footprint versus the old UTF-16 storage. String Deduplication takes a different angle entirely.
Rather than compressing characters, it looks for separate string objects that happen to hold identical character sequences and then makes them all share a single backing array. The string objects themselves remain distinct — your == comparisons are unaffected — but their internal value fields are silently rewired to point at the same heap array. Only one copy of those bytes survives. The rest get garbage collected.
This is not the same as String.intern(). Interning makes the String objects themselves identical (same reference). Deduplication keeps distinct String objects while merging only their internal byte[] storage. Your object identity is preserved; only the backing memory is shared.
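The difference in object identity can be seen in a few lines. This is a small illustration with a hypothetical class name; it shows only the reference semantics, because the shared backing array that deduplication produces is not observable from application code.

```java
final class IdentityDemo {
    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal content.
        String a = new String("EUR");
        String b = new String("EUR");

        System.out.println(a == b);                   // false: two separate objects
        System.out.println(a.equals(b));              // true: same content
        System.out.println(a.intern() == b.intern()); // true: one canonical object
    }
}
```

Whether deduplication is on or off, the first line prints `false`. Only `intern()` changes which object you hold.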
The GC-Integrated Pipeline
The deduplication process is tightly coupled to the G1 collection cycle, and understanding that coupling is key to understanding the performance cost. Here is how it flows, step by step:
| # | Step | Where it runs | Cost type |
|---|---|---|---|
| 1 | String objects that survive a GC are added to a deduplication queue | GC thread (inline) | Tiny allocation overhead |
| 2 | A background deduplication thread drains the queue concurrently | Background (off GC thread) | CPU contention with app |
| 3 | Thread computes a hash of each string’s byte[] | Background thread | Memory bandwidth |
| 4 | Hash is looked up in an internal string dedup table | Background thread | Hash table overhead |
| 5 | On match, content is compared byte-by-byte to confirm equality | Background thread | Memory read, potential cache miss |
| 6 | The value field of the duplicate is CAS-swapped to the canonical array | Background thread | Write barrier, minimal |
| 7 | Old duplicate array is now unreachable, freed on next GC cycle | GC thread | Standard collection |
Notice that steps 2 through 6 run concurrently — they do not stop your application. However, “concurrent” does not mean “free.” The background thread still shares CPU cores and memory bandwidth with your application threads. On a constrained container with, say, 2 vCPUs, that cost is very real. More on that in the trade-off section below.
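To make steps 3 through 6 concrete, here is a toy model in plain Java. It is a sketch of the idea only, not HotSpot's implementation: `Str` is a hypothetical stand-in for `java.lang.String`, the table is an ordinary `HashMap`, and there is no queue, concurrency, or garbage collector involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ToyDedup {

    /** Hypothetical stand-in for java.lang.String: a shell plus a backing array. */
    static final class Str {
        byte[] value;
        Str(String s) { value = s.getBytes(StandardCharsets.ISO_8859_1); }
    }

    // Toy dedup table: content hash -> canonical arrays that share this hash.
    private final Map<Integer, List<byte[]>> table = new HashMap<>();
    long bytesSaved = 0;

    /** Applies steps 3 to 6 to one candidate taken off the queue. */
    void deduplicate(Str candidate) {
        int hash = Arrays.hashCode(candidate.value);                  // step 3: hash the content
        List<byte[]> bucket =
                table.computeIfAbsent(hash, h -> new ArrayList<>());  // step 4: table lookup
        for (byte[] canonical : bucket) {
            if (canonical == candidate.value) {
                return;                                               // already sharing this array
            }
            if (Arrays.equals(canonical, candidate.value)) {          // step 5: confirm equality
                bytesSaved += candidate.value.length;
                candidate.value = canonical;                          // step 6: repoint the field
                return;
            }
        }
        bucket.add(candidate.value); // first sighting: this array becomes the canonical copy
    }

    public static void main(String[] args) {
        ToyDedup dedup = new ToyDedup();
        Str a = new Str("viewer");
        Str b = new Str("viewer");
        Str c = new Str("editor");
        for (Str s : Arrays.asList(a, b, c)) {
            dedup.deduplicate(s);
        }
        System.out.println(a != b);              // true: the shells stay distinct
        System.out.println(a.value == b.value);  // true: one shared backing array
        System.out.println(dedup.bytesSaved);    // 6
    }
}
```

Even in this simplified form, every candidate pays for a hash and a lookup, while only actual duplicates give anything back. That asymmetry is what the trade-off section is about.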
3. When It Genuinely Helps: The Microservice Case
String Deduplication shines brightest in a specific type of workload. If you are running microservices that repeatedly deserialize structured data — think JSON APIs, Kafka consumers, gRPC services — you are almost certainly generating thousands of string objects per second that are textually identical.
Consider a user-profile service that reads from Kafka. Every message might contain fields like "country": "DE", "currency": "EUR", "role": "viewer". At ten thousand messages per second, any of those values that are retained long enough to become deduplication candidates exist as thousands of separate byte[] arrays, each spelling out the same few characters. Deduplication collapses each group of identical arrays into a single array.
[Figure: Heap Memory: Before vs After String Deduplication]
The savings in this category of workload are not trivial. JEP 192 itself estimated an average heap reduction of around 10% across the applications it measured, and string-heavy services can see considerably more. Furthermore, because the live heap is smaller, the collector tends to run less often and has less data to move, a secondary benefit that compounds the first.
Good candidates: JSON/XML parsing services · Kafka consumers · REST APIs with repeated domain values · Database result-set processors · Any service where the same string values recur across many objects.
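To reproduce the Kafka-style duplication from this section in a test harness, a sketch like the following may help. The class name and the three field values are illustrative, and real deserializers differ in how they allocate strings, but decoding bytes into a fresh String per field is the common pattern that produces the duplicates.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class DuplicateFieldValues {

    private static final String[] FIELD_VALUES = {"DE", "EUR", "viewer"};

    /** Mimics a deserializer: every message decodes its field values into fresh String objects. */
    static List<String> decode(int messages) {
        List<String> retained = new ArrayList<>(messages * FIELD_VALUES.length);
        for (int i = 0; i < messages; i++) {
            for (String v : FIELD_VALUES) {
                byte[] wire = v.getBytes(StandardCharsets.UTF_8);
                retained.add(new String(wire, StandardCharsets.UTF_8)); // always a new object
            }
        }
        return retained;
    }

    public static void main(String[] args) {
        List<String> strings = decode(10_000);

        // Count objects by identity and by content to see the gap deduplication targets.
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        identities.addAll(strings);
        Set<String> contents = new HashSet<>(strings);

        System.out.println(identities.size() + " String objects, "
                + contents.size() + " distinct contents");
    }
}
```

Running a retained workload like this once with `-XX:+UseStringDeduplication` and once without, and comparing live heap after a few collections, gives a rough upper bound on what the feature can do for similar data.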
4. When It Is a CPU Trade-Off Not Worth Making
Here is the part that most articles skip. String Deduplication is not universally beneficial, and blindly leaving it on is not always the right call. There are at least three scenarios where the cost quietly outweighs the benefit.
Scenario 1 — Short-lived strings that never survive GC
The deduplication queue only receives strings that have survived enough young collections to reach the deduplication age threshold (-XX:StringDeduplicationAgeThreshold, 3 by default) or that are promoted to the old generation. If most of your strings are request-scoped and die in the young generation, which is the ideal situation for GC performance, they will never be deduplicated at all. The hashing and table-lookup cost is zero, but so are the savings. Deduplication neither helps nor hurts here; it is simply neutral.
Scenario 2 — Unique-content strings (UUIDs, timestamps, log lines)
Generating a UUID per request? Building a timestamp string every second? Each one is unique, so the dedup table will record a hash, find no match, and store the entry — only to evict it on the next GC-linked table cleanup. The net result is CPU and memory bandwidth spent on hash computation and table writes that produce zero savings.
Scenario 3 — CPU-constrained environments
This is the most dangerous scenario in today's containerised world. If your pod or VM has a limited CPU quota, say 0.5 to 1 vCPU, the background deduplication thread is competing directly with your request-serving threads. You may observe elevated 99th-percentile latencies that are almost impossible to attribute without profiling, because the culprit is not your code.
CPU Overhead of String Deduplication Across Workload Types
| Workload type | String repetition | Dedup memory saving | CPU cost (1–2 vCPU) | Verdict |
|---|---|---|---|---|
| Kafka consumer (domain enums) | Very high | 15–30% | Low–Medium | Enable |
| REST API (JSON with common fields) | High | 8–20% | Low | Enable |
| Computation service (UUIDs / hashes) | Very low | <1% | Medium | Disable |
| Log aggregator (unique log lines) | Very low | <2% | Medium–High | Disable |
| Batch processor (mixed data) | Medium | 5–12% | Low–Medium | Measure first |
| Database ORM (repeated column names) | High | 10–25% | Low | Enable |
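The "String repetition" column can be estimated from a sample of your own data before touching any JVM flag. The helper below is a rough sketch with a hypothetical name; it measures repetition in a sample you collect yourself (for example, field values from a batch of messages) and says nothing about how long those strings live.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.UUID;

final class RepetitionEstimate {

    /** Fraction of the sample that repeats an earlier value: 0.0 = all unique, near 1.0 = highly repetitive. */
    static double duplicateFraction(List<String> sample) {
        if (sample.isEmpty()) {
            return 0.0;
        }
        int distinct = new HashSet<>(sample).size();
        return 1.0 - (double) distinct / sample.size();
    }

    public static void main(String[] args) {
        String[] roles = {"viewer", "editor", "admin", "owner"};
        List<String> enumLike = new ArrayList<>();
        List<String> uuids = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            enumLike.add(roles[i % roles.length]);
            uuids.add(UUID.randomUUID().toString());
        }
        // Enum-like values: 4 distinct out of 1000, so the fraction is close to 1.
        System.out.println(String.format(Locale.ROOT, "%.3f", duplicateFraction(enumLike)));
        // Random UUIDs: effectively all unique, so the fraction is 0 unless two collide.
        System.out.println(String.format(Locale.ROOT, "%.3f", duplicateFraction(uuids)));
    }
}
```

A high fraction only says the data is a good candidate. The strings still have to live long enough to reach the deduplication queue, which is why the measurement steps in the next section matter.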
5. How to Measure Its Effect with JFR
Opinions about performance are worthless without data. Fortunately, the JVM ships with Java Flight Recorder (JFR), a production-safe, low-overhead profiling mechanism built directly into the JDK since Java 11 (and backported to 8u262). Which events a recording contains depends on the JDK version and the recording settings, so confirm that deduplication data is actually present before drawing conclusions from it.
Here is the cleanest way to capture a JFR recording for dedup analysis:
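```bash
# Start a 120-second recording on a running process
jcmd <pid> JFR.start duration=120s filename=dedup-profile.jfr

# Dump an already-running recording
jcmd <pid> JFR.dump filename=dedup-profile.jfr

# List the event types the recording actually contains
jfr summary dedup-profile.jfr

# Print matching events in human-readable form (prints nothing if the recording has no such events)
jfr print --events StringDeduplication dedup-profile.jfr
```
Once you have the .jfr file, open it in JDK Mission Control (JMC), the official GUI for JFR analysis. Under the Memory tab, look for the String Deduplication section. If your recording contains no deduplication data, the same figures appear in the deduplication statistics log described in section 1. The numbers that matter most are: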
| JFR metric | What it tells you | Healthy range |
|---|---|---|
| Last Deduplication Time | CPU time the background thread spent on one dedup pass | <5 ms per pass |
| Deduplicated Bytes | Total bytes freed by merging duplicate arrays | Should grow steadily if it’s worth running |
| Dedup Table Size | Number of unique strings currently tracked | Should stabilise; unbounded growth is a red flag |
| New Table Entries | How many new unique strings were seen in this pass | High value with low savings = unique-string workload |
If Deduplicated Bytes is large and growing while Last Deduplication Time stays under a few milliseconds, String Deduplication is earning its keep. On the other hand, if the table keeps adding new entries without accumulating much freed memory, you are in the unique-string scenario — and you should disable it with -XX:-UseStringDeduplication.
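You can also get a quick console view without JFR. On JDK 8, add -XX:+PrintStringDeduplicationStatistics to your JVM flags; on JDK 9 and later, use -Xlog:stringdedup*=debug instead. A statistics block is printed after each deduplication cycle. Compare the Deduplicated line with the Inspected line to get an instant signal.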
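If you want to check programmatically which event types a recording holds, the `jdk.jfr.consumer` API can read the file directly. The sketch below uses a hypothetical class name and simply counts events per type, then prints any whose name contains "dedup". It makes no assumption about which deduplication events your JDK emits, if any.

```java
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class JfrEventCounter {

    /** Counts events per event-type name in a JFR recording file. */
    static Map<String, Long> countEvents(Path recording) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        try (RecordingFile file = new RecordingFile(recording)) {
            while (file.hasMoreEvents()) {
                RecordedEvent event = file.readEvent();
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java JfrEventCounter.java <recording.jfr>");
            return;
        }
        Map<String, Long> counts = countEvents(Paths.get(args[0]));
        counts.forEach((name, n) -> {
            if (name.toLowerCase(Locale.ROOT).contains("dedup")) {
                System.out.println(name + " -> " + n + " events");
            }
        });
        System.out.println(counts.size() + " event types in total");
    }
}
```

If nothing matching is printed, fall back to the deduplication log output for the statistics.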
A Realistic Diagnostic Workflow
Rather than guessing whether deduplication is helping, follow this three-step process in your staging environment before touching production:
| Step | Action | Tool | Decision signal |
|---|---|---|---|
| 1 | Capture heap composition | jmap -histo:live <pid> | What % of heap is [B (byte arrays)? |
| 2 | Run JFR recording with dedup events | jcmd + JMC | Is Deduplicated Bytes significant? |
| 3 | Compare GC pause times | JFR GC events or -Xlog:gc* | Are pauses shorter with dedup on? |
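For step 3, a coarse in-process signal is available from the standard `GarbageCollectorMXBean` interface. The sketch below sums collection counts and accumulated collection time across all collectors; the class name is hypothetical, and the numbers are cumulative totals rather than individual pause durations, so treat them as a sanity check next to JFR or `-Xlog:gc*`, not a replacement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcTotals {

    /** Returns {collections, accumulated collection time in ms} summed over all collectors. */
    static long[] snapshot() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new long[] {count, millis};
    }

    public static void main(String[] args) {
        long[] before = snapshot();

        // Placeholder workload: replace with the code path you want to compare.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            checksum += junk.length;
        }

        long[] after = snapshot();
        System.out.println("allocated bytes: " + checksum);
        System.out.println("collections: " + (after[0] - before[0])
                + ", GC time: " + (after[1] - before[1]) + " ms");
    }
}
```

Run the same workload with deduplication on and off and compare the deltas. Differences of a few milliseconds are within noise, so look for a consistent trend across several runs.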
6. One More Surprising Detail: the Dedup Table Has Its Own Memory Cost
Here is something that catches people off guard. The internal hash table that String Deduplication uses to track known string arrays is not free. HotSpot allocates it in native memory outside the Java heap, so it does not count against -Xmx, but it does add to the process footprint, and the collector has to maintain its entries as part of its normal work. This means that, in workloads with very high string cardinality (lots of unique strings), the dedup table can grow to consume meaningful memory, sometimes enough that its upkeep costs more than the deduplication is saving.
The JVM attempts to resize and clean the table in sync with GC cycles, but it does so lazily. In practice, if your dedup table’s New Table Entries metric keeps climbing without stabilising, you have found a workload where the feature is actively working against you. That is the signal to disable it — not as a pessimisation, but as a correction.
7. ZGC and Shenandoah Users: Check Your JDK Version
If you have already migrated to ZGC or Shenandoah, both of which target much shorter pauses than G1, String Deduplication is not necessarily off the table. Shenandoah has shipped with its own implementation, and ZGC (along with Serial and Parallel) gained support for -XX:+UseStringDeduplication in JDK 18. On older releases, such as ZGC on JDK 17, the feature is not available for that collector.
If you are on a collector and JDK combination without support, the closest alternative is String.intern() used deliberately on high-repetition domain values, or application-level caching (an interning map) on the specific fields you know are repeated. That approach is more surgical, and more transparent in a profiler.
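An application-level interning map can be as small as the sketch below. The class name is hypothetical and the map is unbounded, so it is only appropriate for fields with a known, small set of values (country codes, roles, status enums). For anything with open-ended cardinality you would need a bounded cache instead.

```java
import java.util.concurrent.ConcurrentHashMap;

final class Canonicalizer {

    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();

    /** Returns the first instance seen with this content, so equal strings share one object. */
    String canonical(String s) {
        if (s == null) {
            return null;
        }
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        Canonicalizer roles = new Canonicalizer();
        String a = roles.canonical(new String("viewer"));
        String b = roles.canonical(new String("viewer"));
        System.out.println(a == b);       // true: the second copy was discarded
        System.out.println(roles.size()); // 1
    }
}
```

Unlike deduplication, this shares the whole String object, so the duplicate shell becomes garbage too. The cost is an explicit call at every site where the field is decoded.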
8. What We Have Learned
- String Deduplication is a JVM feature introduced for G1 in Java 8u20 that silently merges the internal byte[] arrays of identical string objects, freeing memory without changing object identity.
- It runs on a background thread integrated with the GC cycle. It is concurrent but not free, and it carries a real CPU cost that matters on containers with limited vCPUs.
- It delivers its best results on microservices that repeatedly deserialize structured data with recurring field values: JSON APIs, Kafka consumers, ORM-heavy services. It is actively harmful in workloads dominated by high-cardinality, unique strings like UUIDs or log lines.
- Java Flight Recorder gives you the exact data you need — Deduplicated Bytes, Last Deduplication Time, and Dedup Table Size — to make the enable/disable decision with evidence rather than intuition.
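- ZGC and Shenandoah users are not excluded: Shenandoah supports deduplication, and ZGC, Serial, and Parallel gained support in JDK 18. On older JDKs, an application-level interning map is the closest substitute.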
Eleftheria Drosopoulou · May 29th, 2026 · Last Updated: May 24th, 2026
