I hope I'm not the only one who sometimes wonders: "What is my Java application doing right now?", especially when it doesn't produce any output. Yes, you could take a simple thread dump via jstack, but it's hard to tell which threads are actually consuming CPU and making any sort of progress. This is where my tiny tool JStall comes in:
JStall is a small command-line tool for one-shot inspection of running JVMs using thread dumps and short, on-demand profiling. It essentially takes multiple thread dumps of your application and uses the per-thread CPU-time information to find the threads consuming the most CPU time.
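To get an intuition for the underlying idea: the JVM tracks how much CPU time each thread has consumed, and diffing two samples tells you who is busy. Here is a minimal sketch of that principle using the standard ThreadMXBean API; it measures the current JVM instead of attaching to another process the way JStall does, and it's my own illustration, not JStall's actual code:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

public class CpuTopSketch {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        // First sample: per-thread CPU time in nanoseconds.
        Map<Long, Long> first = new HashMap<>();
        for (long id : threads.getAllThreadIds()) {
            first.put(id, threads.getThreadCpuTime(id));
        }

        Thread.sleep(5_000); // wait between the two samples

        // Second sample: the difference tells us who burned CPU in between.
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id);
            Long start = first.get(id);
            if (info == null || start == null || start < 0) continue; // thread died or CPU time unsupported
            long deltaNanos = threads.getThreadCpuTime(id) - start;
            if (deltaNanos > 0) {
                System.out.printf("%-30s %.2f ms CPU%n", info.getThreadName(), deltaNanos / 1e6);
            }
        }
    }
}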
First, download the JStall executable from the GitHub releases page. Then let's start by listing the currently running JVMs:
> ./jstall
Usage: jstall <command> <pid|files> [options]
Available commands:
status - Show overall status (deadlocks + most active threads)
list - List and filter JVMs
deadlock - Check for deadlocks
most-work - Show threads doing the most work
flame - Generate flame graph
threads - List all threads
Available JVMs:
1223 <unknown>
8136 ./renaissance-gpl-0.16.0.jar
6138 org.jetbrains.idea.maven.server.RemoteMavenServer36
5597 DeadlockDemo
49294 com.intellij.idea.Main

This gives us a list of the available commands, as well as a list of JVM processes and their corresponding main classes.
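If you're curious how such a process list can be obtained programmatically: the JDK's attach API exposes it. Here's a small sketch (my own illustration, requiring the jdk.attach module; JStall's implementation may differ):

import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;

public class ListJvms {
    public static void main(String[] args) {
        // Enumerates the JVMs visible to the current user via the attach mechanism.
        for (VirtualMachineDescriptor vmd : VirtualMachine.list()) {
            String label = vmd.displayName().isEmpty() ? "<unknown>" : vmd.displayName();
            System.out.println(vmd.id() + " " + label);
        }
    }
}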
Process 5597 is an instance of the DeadlockDemo program we saw in the list above. Let's obtain its status (a combination of the deadlock and most-work analyses):
> ./jstall 5597
=== deadlock ===
Deadlock detected:
Found one Java-level deadlock:
=============================
"DeadlockThread-1":
waiting to lock monitor 0x00006000029215f0 (object 0x000000052f817880, a java.lang.Object),
which is held by "DeadlockThread-2"
"DeadlockThread-2":
waiting to lock monitor 0x00006000029280d0 (object 0x000000052f817878, a java.lang.Object),
which is held by "DeadlockThread-1"
Java stack information for the threads listed above:
===================================================
"DeadlockThread-1":
at DeadlockDemo.lambda$main$0(scratch_2.java:15)
- waiting to lock <0x000000052f817880> (a java.lang.Object)
- locked <0x000000052f817878> (a java.lang.Object)
at DeadlockDemo$$Lambda/0x000007fc01040400.run(Unknown Source)
at java.lang.Thread.runWith(java.base@25/Thread.java:1487)
at java.lang.Thread.run(java.base@25/Thread.java:1474)
"DeadlockThread-2":
at DeadlockDemo.lambda$main$1(scratch_2.java:26)
- waiting to lock <0x000000052f817878> (a java.lang.Object)
- locked <0x000000052f817880> (a java.lang.Object)
at DeadlockDemo$$Lambda/0x000007fc01040800.run(Unknown Source)
at java.lang.Thread.runWith(java.base@25/Thread.java:1487)
at java.lang.Thread.run(java.base@25/Thread.java:1474)
Found one Java-level deadlock:
=============================
"DeadlockThread-3":
waiting to lock monitor 0x0000600002938000 (object 0x000000052f817890, a java.lang.Object),
which is held by "DeadlockThread-4"
"DeadlockThread-4":
waiting to lock monitor 0x0000600002934000 (object 0x000000052f817888, a java.lang.Object),
which is held by "DeadlockThread-3"
Java stack information for the threads listed above:
===================================================
"DeadlockThread-3":
at DeadlockDemo.lambda$main$2(scratch_2.java:38)
- waiting to lock <0x000000052f817890> (a java.lang.Object)
- locked <0x000000052f817888> (a java.lang.Object)
at DeadlockDemo$$Lambda/0x000007fc01040c00.run(Unknown Source)
at java.lang.Thread.runWith(java.base@25/Thread.java:1487)
at java.lang.Thread.run(java.base@25/Thread.java:1474)
"DeadlockThread-4":
at DeadlockDemo.lambda$main$3(scratch_2.java:49)
- waiting to lock <0x000000052f817888> (a java.lang.Object)
- locked <0x000000052f817890> (a java.lang.Object)
at DeadlockDemo$$Lambda/0x000007fc01041000.run(Unknown Source)
at java.lang.Thread.runWith(java.base@25/Thread.java:1487)
at java.lang.Thread.run(java.base@25/Thread.java:1474)
Found 2 deadlocks.
=== most-work ===
Top threads by activity (2 dumps):
Combined CPU time: 0.00s, Elapsed time: 5.10s (0.0% overall utilization)
1. Monitor Deflation Thread
CPU time: 0.00s (64.8% of total)
Core utilization: 0.0%
States: RUNNABLE: 100.0%
2. Service Thread
CPU time: 0.00s (14.8% of total)
Core utilization: 0.0%
States: RUNNABLE: 100.0%
3. Attach Listener
CPU time: 0.00s (13.6% of total)
Core utilization: 0.0%
States: RUNNABLE: 100.0%

For this, JStall took two thread dumps and analyzed them, identifying that the top CPU-time-consuming threads are all JVM-internal ones.
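For reference, deadlocks like the ones reported above arise from pairs of threads acquiring two monitors in opposite order. Here's a minimal, hypothetical reconstruction of one such pair (the actual DeadlockDemo source may differ):

public class DeadlockDemo {
    public static void main(String[] args) {
        Object lockA = new Object();
        Object lockB = new Object();

        // Thread 1 takes lockA first, then lockB.
        Thread t1 = new Thread(() -> {
            synchronized (lockA) {
                sleep(100); // give the other thread time to grab lockB
                synchronized (lockB) { /* never reached */ }
            }
        }, "DeadlockThread-1");

        // Thread 2 takes the locks in the opposite order.
        Thread t2 = new Thread(() -> {
            synchronized (lockB) {
                sleep(100);
                synchronized (lockA) { /* never reached */ }
            }
        }, "DeadlockThread-2");

        t1.start();
        t2.start();
    }

    private static void sleep(long millis) {
        try { Thread.sleep(millis); } catch (InterruptedException ignored) { }
    }
}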
Using the threads command, we can take a look at all threads:
> ./jstall threads 5597
Threads (2 dumps):
Combined CPU time: 0.00s, Elapsed time: 5.10s (0.0% overall utilization)
THREAD CPU TIME CPU % STATES TOP STACK FRAME
------------------------ -------- ----- ------------- --------------------------------------------------
Monitor Deflation Thread 0.00s 65.2% RUNNABLE
Service Thread 0.00s 18.5% RUNNABLE
Attach Listener 0.00s 7.6% RUNNABLE
C1 CompilerThread0 0.00s 4.3% RUNNABLE
C2 CompilerThread0 0.00s 4.3% RUNNABLE
Common-Cleaner 0.00s 0.0% TIMED_WAITING java.lang.Object.wait0
DeadlockThread-1 0.00s 0.0% BLOCKED DeadlockDemo.lambda$main$0
DeadlockThread-2 0.00s 0.0% BLOCKED DeadlockDemo.lambda$main$1
DeadlockThread-3 0.00s 0.0% BLOCKED DeadlockDemo.lambda$main$2
DeadlockThread-4 0.00s 0.0% BLOCKED DeadlockDemo.lambda$main$3
Finalizer 0.00s 0.0% WAITING java.lang.Object.wait0
Notification Thread 0.00s 0.0% RUNNABLE
Reference Handler 0.00s 0.0% RUNNABLE java.lang.ref.Reference.waitForReferencePending...
Signal Dispatcher 0.00s 0.0% RUNNABLE
main 0.00s 0.0% WAITING java.lang.Object.wait0

It's also possible to run jstall threads Dead, selecting JVMs by giving parts of their label. The threads command provides a concise overview of the app's progress.
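Incidentally, the per-thread CPU figures are available right in HotSpot's thread dump output: on reasonably recent JDKs, each thread header line carries cpu= and elapsed= fields. Extracting them is a small regex exercise; here's a sketch of the idea (my own illustration, not JStall's actual parser):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CpuFieldSketch {
    // Matches thread header lines such as:
    // "main" #1 prio=5 os_prio=31 cpu=123.45ms elapsed=5.10s tid=0x... nid=0x... waiting on condition
    private static final Pattern HEADER =
            Pattern.compile("\"([^\"]+)\".*?cpu=([0-9.]+)ms elapsed=([0-9.]+)s");

    public static void main(String[] args) {
        String line = "\"main\" #1 prio=5 os_prio=31 cpu=123.45ms elapsed=5.10s "
                + "tid=0x0000000100000000 nid=0x1803 waiting on condition";
        Matcher m = HEADER.matcher(line);
        if (m.find()) {
            double cpuMs = Double.parseDouble(m.group(2));
            double elapsedSeconds = Double.parseDouble(m.group(3));
            // CPU utilization of one core over the thread's lifetime.
            System.out.printf("%s: %.2f ms CPU, %.1f%% of one core%n",
                    m.group(1), cpuMs, cpuMs / (elapsedSeconds * 1000) * 100);
        }
    }
}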
Let's look at process 8136, which is an instance of the Renaissance benchmark suite.
> ./jstall 8136
=== most-work ===
Top threads by activity (2 dumps):
Combined CPU time: 13.34s, Elapsed time: 5.11s (261.4% overall utilization)
1. Executor task launch worker for task 0.0 in stage 53.0 (TID 75)
CPU time: 4.14s (31.1% of total)
Core utilization: 81.2%
States: RUNNABLE: 100.0%
Common stack prefix:
org.apache.spark.util.collection.SizeTrackingAppendOnlyMap.changeValue(SizeTrackingAppendOnlyMap.scala:32)
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
org.apache.spark.scheduler.Task.run(Task.scala:141)
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
org.apache.spark.executor.Executor$TaskRunner$$Lambda/0x0000030001959c00.apply
... (8 more lines)
2. Executor task launch worker for task 1.0 in stage 53.0 (TID 76)
CPU time: 4.12s (30.9% of total)
Core utilization: 80.8%
States: RUNNABLE: 100.0%
Common stack prefix:
org.apache.spark.util.collection.SizeTrackingAppendOnlyMap.changeValue(SizeTrackingAppendOnlyMap.scala:32)
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
org.apache.spark.scheduler.Task.run(Task.scala:141)
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
org.apache.spark.executor.Executor$TaskRunner$$Lambda/0x0000030001959c00.apply
... (8 more lines)
3. Executor task launch worker for task 2.0 in stage 53.0 (TID 77)
CPU time: 4.06s (30.4% of total)
Core utilization: 79.5%
States: RUNNABLE: 100.0%
Common stack prefix:
org.apache.spark.util.collection.SizeTrackingAppendOnlyMap.changeValue(SizeTrackingAppendOnlyMap.scala:32)
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
org.apache.spark.scheduler.Task.run(Task.scala:141)
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
org.apache.spark.executor.Executor$TaskRunner$$Lambda/0x0000030001959c00.apply
... (8 more lines)

Here you see the top three (configurable) CPU-time-consuming threads over the sampling period, along with the stack-trace parts common to both thread dumps. The overall utilization is simply the combined CPU time divided by the elapsed wall-clock time: 13.34s / 5.11s ≈ 261%.
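Conceptually, the "common stack prefix" is the run of frames a thread's stack shares across all dumps, anchored at the bottom of the stack; with frames listed top-first, that's a longest-common-suffix computation. A sketch of the idea (again my own illustration, not JStall's actual code):

import java.util.ArrayList;
import java.util.List;

public class CommonStackSketch {
    // Returns the frames (top to bottom) that all stacks share at their bottom end,
    // i.e. the longest common suffix when frames are listed top-first.
    static List<String> commonBottomFrames(List<List<String>> stacks) {
        List<String> common = new ArrayList<>();
        int minLen = stacks.stream().mapToInt(List::size).min().orElse(0);
        for (int i = 1; i <= minLen; i++) {
            final int offset = i;
            String candidate = stacks.get(0).get(stacks.get(0).size() - offset);
            boolean sharedByAll = stacks.stream()
                    .allMatch(s -> s.get(s.size() - offset).equals(candidate));
            if (!sharedByAll) break;
            common.add(0, candidate); // keep top-to-bottom order
        }
        return common;
    }

    public static void main(String[] args) {
        List<String> dump1 = List.of("Map.changeValue", "Sorter.insertAll", "Task.run", "Thread.run");
        List<String> dump2 = List.of("Spill.write", "Sorter.insertAll", "Task.run", "Thread.run");
        // Prints [Sorter.insertAll, Task.run, Thread.run]
        System.out.println(commonBottomFrames(List.of(dump1, dump2)));
    }
}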
Using the threads command, you can again see all threads:
> ./jstall threads 8136
Threads (2 dumps):
Combined CPU time: 13.12s, Elapsed time: 5.11s (256.8% overall utilization)
THREAD CPU TIME CPU % STATES TOP STACK FRAME
-------------------------------------------------- -------- ----- --------------------------- --------------------------------------------------
Executor task launch worker for task 1.0 in sta... 4.16s 31.7% RUNNABLE: 50%, BLOCKED: 50% java.lang.ClassValue$ClassValueMap.readAccess
Executor task launch worker for task 2.0 in sta... 4.16s 31.7% RUNNABLE scala.runtime.ClassValueCompat.get
Executor task launch worker for task 0.0 in sta... 4.05s 30.9% RUNNABLE org.apache.spark.util.collection.SizeTrackingAp...
main 0.20s 1.5% WAITING jdk.internal.misc.Unsafe.park
C2 CompilerThread0 0.16s 1.2% RUNNABLE
task-result-getter-1 0.07s 0.6% WAITING jdk.internal.misc.Unsafe.park
task-result-getter-0 0.07s 0.5% WAITING jdk.internal.misc.Unsafe.park
...

But thread dumps aren't the only way to peek at a running Java application; you can also take a short profile and generate a flamegraph.
For this purpose, I integrated async-profiler using my ap-loader library:
> ./jstall flame 8136
Starting flamegraph generation for PID 8136...
Event: cpu
Duration: 10s
Interval: 10ms
Output: flame.html
Executing: asprof -d 10 -e cpu -i 10000000 -f flame.html 8136
Profiling for 10 seconds
Done
✓ Flamegraph successfully generated!
Output file: .../flame.html
File size: 349576 bytes

This results in a rather unwieldy flamegraph, courtesy of the Apache Spark benchmark.
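If you want to trigger async-profiler from your own Java code, ap-loader bundles the native libraries and exposes the profiler's Java API. A rough sketch based on my reading of ap-loader's README (treat the exact method names as assumptions and check the project's documentation):

import one.profiler.AsyncProfiler;
import one.profiler.AsyncProfilerLoader;

public class FlameSketch {
    public static void main(String[] args) throws Exception {
        // Extracts and loads the matching native async-profiler library (API assumed from the README).
        AsyncProfiler profiler = AsyncProfilerLoader.load();

        // Profile CPU in the current JVM for ten seconds, then write a flamegraph.
        profiler.execute("start,event=cpu,interval=10ms");
        Thread.sleep(10_000);
        profiler.execute("stop,file=flame.html");
    }
}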
Do you have an idea for another thread dump analysis? Just implement the Analyzer interface, create a CLI command class, and submit a pull request on GitHub. The Analyzer interface is pretty simple:
import java.util.List;
import java.util.Map;
import java.util.Set;

// ThreadDump, ThreadDumpWithRaw, AnalyzerResult, and DumpRequirement are JStall's own types.
public interface Analyzer {
/**
* Returns the name of this analyzer.
*/
String name();
/**
* Returns the set of options this analyzer supports.
*
* Common options: "dumps", "interval", "keep", "top"
*/
Set<String> supportedOptions();
/**
* Returns the dump requirement for this analyzer.
*/
DumpRequirement dumpRequirement();
default AnalyzerResult analyze(List<ThreadDumpWithRaw> dumpsWithRaw,
Map<String, Object> options) {
List<ThreadDump> dumps = dumpsWithRaw.stream().map(ThreadDumpWithRaw::parsed).toList();
return analyzeThreadDumps(dumps, options);
}
default AnalyzerResult analyzeThreadDumps(List<ThreadDump> dumps,
Map<String, Object> options) {
throw new UnsupportedOperationException(
"Analyzer must implement either analyze(List<ThreadDumpWithRaw>, Map) " +
"or analyzeThreadDumps(List<ThreadDump>, Map)");
}
}

Additionally, there are helper classes that make writing new analyses even easier, so feel free to submit your own ideas for new analyses.
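To give you a feel for what an implementation might look like, here is a hypothetical analyzer that summarizes thread states across all dumps. The accessors and factory used here (dump.threads(), thread.state(), AnalyzerResult.text, DumpRequirement.MULTIPLE) are my guesses at the surrounding API, not JStall's actual helpers:

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical example; the methods on ThreadDump, AnalyzerResult, and DumpRequirement are assumed.
public class StateSummaryAnalyzer implements Analyzer {
    @Override
    public String name() {
        return "state-summary";
    }

    @Override
    public Set<String> supportedOptions() {
        return Set.of("dumps", "interval");
    }

    @Override
    public DumpRequirement dumpRequirement() {
        return DumpRequirement.MULTIPLE; // assumed enum constant
    }

    @Override
    public AnalyzerResult analyzeThreadDumps(List<ThreadDump> dumps, Map<String, Object> options) {
        // Count how often each thread state appears across all dumps.
        Map<String, Long> stateCounts = dumps.stream()
                .flatMap(dump -> dump.threads().stream())   // assumed accessor
                .collect(Collectors.groupingBy(t -> t.state().toString(), Collectors.counting()));
        return AnalyzerResult.text(name(), stateCounts.toString()); // assumed factory method
    }
}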
The thread dump analyses use the jthreaddump parsing library, which you can also use in your own projects.
I sometimes get the time to write tools that solve minor annoyances; in this case, checking the status of a Java app. I hope you find JStall as helpful as I do and make this open-source project part of your daily toolbox. Feel free to contribute via GitHub issues or pull requests.
This article is part of my work in the SapMachine team at SAP, making profiling and debugging easier for everyone.
P.S.: Never stop exploring...
P.P.S.: You can, of course, run JStall via jbang too:
jbang jstall@parttimenerd/jstall
P.P.P.S.: Abschalten ("switch off")
| 23 |