top -H + multiple thread dumps (with real snapshots)
Find the Java PID first:
ps -ef | grep java
# or
pgrep -f 'java|tomcat|catalina'
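If the JDK tools are on your PATH, jps also lists JVM processes directly (an optional shortcut; output format varies slightly by JDK):
jps -lvm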
Now capture per-thread CPU:
PID=<pid>
# refresh every 5 seconds (good default)
top -H -p $PID -d 5
# or capture 6 samples at 10-second interval (for evidence)
for i in {1..6}; do
date
top -b -n 1 -H -p $PID | head -n 40
sleep 10
done | tee top_threads_${PID}.log
Example output (top -H -p <pid>)
This is what you’re looking for (the key column is the Linux thread id, a.k.a. LWP/TID):
PID LWP USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2176 2294 tomcat 20 0 18.3g 6.1g 1.2g R 98.7 9.8 12:34.56 java
2176 2301 tomcat 20 0 18.3g 6.1g 1.2g R 75.3 9.8 9:12.01 java
Here, 2294 (LWP) is the hottest thread at that instant.
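As a quick sanity check, you can peek at that thread’s name straight from /proc (a sketch assuming a reasonably recent JDK, which sets the native thread name on Linux; the name is truncated to 15 characters):
# 2294 is the example LWP from the snapshot above
cat /proc/$PID/task/2294/comm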
Next, take thread dumps. Use one of these (pick what your access allows):
# best: jcmd (ships with the JDK)
jcmd $PID Thread.print -l > threaddump_1.txt
sleep 10
jcmd $PID Thread.print -l > threaddump_2.txt
sleep 10
jcmd $PID Thread.print -l > threaddump_3.txt
Alternatives:
# jstack (also fine)
jstack -l $PID > threaddump_1.txt
# kill -3 writes to stdout/stderr of the process (often catalina.out)
kill -3 $PID
Map the top thread id (decimal) to the JVM thread nid (hex)
In HotSpot thread dumps, the native thread id is printed as nid=0x....
From top, you get LWP/TID in decimal. Convert it to hex:
TID_DEC=2294
printf "0x%x\n" $TID_DEC
# => 0x8f6 (example)
Now search in the thread dumps for that nid:
grep -n "nid=0x8f6" threaddump_*.txt
You already stated the most important rule: if the stack is not moving across multiple dumps, it’s a strong signal that this code path is burning CPU (tight loop / heavy regex / busy spin / excessive logging / crypto / serialization, etc.).
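To see the actual frames instead of just the matching header line, a little context around each hit is usually enough (adjust -A to taste):
grep -A 30 "nid=0x8f6" threaddump_*.txt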
In your dump, the thread:
name: Log4j2-TF-1-AsyncLogger...
nid=0x8f6
state: RUNNABLE
stack: heavy regex activity inside your custom PII obfuscator, with frames going through java.util.regex into LMSThreadNamePIIObfuscator.format(LMSThreadNamePIIObfuscator.java:42).
A second view of the same hot path shows the same thread still in regex-heavy matching (lots of Pattern$GroupHead.match, loop/tail/greedy char property frames).
That’s a classic “CPU sink” pattern: regex backtracking + large input + frequent invocation.
In a dump from another timestamp, the same nid=0x8f6 thread is WAITING on the disruptor wait strategy, meaning it was not the hot CPU thread at that moment (it’s blocked/parked).
So the correct conclusion comes only after correlating:
top -H hottest LWP at that exact time, and
matching nid in dumps taken in the same window.
PID=<pid>
# 1) capture top thread CPU for 60 seconds
for i in {1..6}; do
date "+%F %T"
top -b -n 1 -H -p $PID | head -n 60
sleep 10
done > topH_${PID}.log
# 2) capture 3 thread dumps during same minute
for i in 1 2 3; do
date "+%F %T" > threaddump_${i}.txt
jcmd $PID Thread.print -l >> threaddump_${i}.txt
sleep 10
done
From topH_${PID}.log, pick the top LWP (decimal).
Convert: printf "0x%x\n" <LWP>
grep for that "nid=0x..." in all thread dumps (see the sketch below).
Compare stacks:
same method(s) repeating ⇒ likely CPU culprit
stack changes ⇒ transient / contention / periodic workload
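A small loop makes that comparison quick: grepping only the thread header line (name, cpu=, nid, state) in each dump shows at a glance whether the thread stays RUNNABLE or drops into WAITING between samples. A minimal sketch, using the example LWP 2294:
LWP=2294                               # decimal thread id taken from topH_${PID}.log
NID=$(printf '0x%x' "$LWP")            # HotSpot prints it as nid=0x... in the dump
grep -H "nid=$NID" threaddump_*.txt    # one header line per dump, including the thread state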
Use these simple heuristics:
Hot RUNNABLE in app code → fix code path (algorithm/regex/loop/logging).
Hot RUNNABLE in GC/VM threads → check GC logs, allocation rate, humongous objects; may need heap/GC tuning.
Hot JIT compiler threads → warmup / new code paths / too many generated classes / instrumentation.
Hot agent threads (APM/security) → verify agent overhead/config; consider sampling settings.
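To see which of those thread families has accumulated the most CPU over the life of the process (GC, C2 compiler, agent threads, and so on), per-thread cumulative CPU time from ps is a quick cross-check. A sketch assuming procps ps and a JDK recent enough to propagate Java thread names to the native comm field:
# cumulative CPU time per thread, busiest first (thread names truncated to 15 chars)
ps -L -o lwp,time,pcpu,comm -p $PID --sort=-time | head -n 15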
Your thread dump also reports very high cumulative CPU time for OneAgent threads (oneagentautosensor, for example).
That doesn’t automatically mean “agent is the root cause”, but it’s an important branch in the investigation: if top -H consistently points to those agent LWPs, you focus on APM configuration rather than app logic.
Using your PII-obfuscation / regex example:
If you consistently see java.util.regex.Pattern...match() and Matcher.find() leading into LMSThreadNamePIIObfuscator.format(...):
Check the regex pattern for catastrophic backtracking (nested quantifiers, ambiguous alternations).
Reduce input size (truncate thread names / sanitize earlier).
Cache precompiled patterns (avoid compiling per call).
Lower the rate of logging or move expensive formatting off the hot path (e.g., avoid heavy lookups/regex in layout pattern for every event).
Confirm by re-running the same evidence loop and showing reduced %CPU for that LWP.
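To make that before/after comparison concrete, you can pull the %CPU samples for that one LWP out of the 60-second top log. A sketch assuming top's default batch-mode layout:
# assumes thread id in field 1 and %CPU in field 9 (the default top -b -H layout);
# if your layout has a separate LWP column like the snapshot above, use $2 and $10 instead
awk '$1 == 2294 {print $9}' topH_${PID}.log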
Thread dumps + top -H are the traditional, reliable first pass. When you need exact method-level CPU attribution, add JFR (low overhead on JDK 11+):
jcmd $PID JFR.start name=cpu settings=profile duration=60s filename=/tmp/cpu.jfr
This will usually confirm (or refute) whether the top CPU is coming from regex/logging, GC, crypto, JSON serialization, Kafka polling, or instrumentation.
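Once the recording finishes, you don't necessarily need JMC for a first read; on newer JDKs (12+) the jfr command-line tool can dump the sampled stacks directly:
jfr summary /tmp/cpu.jfr                                       # quick overview of event counts
jfr print --events jdk.ExecutionSample /tmp/cpu.jfr | less     # sampled stacks = the CPU profile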