
Disclaimer:
BiWhy is available again for free to everyone on biwhy.net.

Use case
Thread and stack analysis helped resolve multiple issues in Java processes, mainly in Tomcat, including performance degradation, high CPU usage, out-of-memory conditions, and thread exhaustion.
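BiWhy works on thread dumps you already have. For context, a standard JVM thread dump can be triggered with jstack <pid> or kill -3 <pid>, or captured programmatically; a minimal sketch using the generic JDK ThreadMXBean API (plain JDK functionality, not BiWhy code):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class CaptureDump {
        public static void main(String[] args) {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            // Dump all live threads, including lock and synchronizer details.
            for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
                // Note: ThreadInfo.toString() truncates deep stacks;
                // jstack or kill -3 produce complete traces.
                System.out.print(info);
            }
        }
    }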

How it helps

BiWhy identifies problematic thread groups by sorting and filtering on metrics such as CPU, disk I/O, network I/O, opened files, and sockets.

Many of these metrics are available only in SAP JVM thread dumps. BiWhy uses that additional runtime information to make thread and stack analysis faster and more effective.
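For comparison, a plain JVM exposes at least one of these metrics live: per-thread CPU time via ThreadMXBean. A rough sketch of ranking threads by CPU on a running JVM (illustrative only, not how BiWhy processes dumps):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;
    import java.util.HashMap;
    import java.util.Map;

    public class TopCpuThreads {
        public static void main(String[] args) {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            Map<Long, Long> cpuNs = new HashMap<>();
            for (long id : mx.getAllThreadIds()) {
                long cpu = mx.getThreadCpuTime(id); // -1 if dead or unsupported
                if (cpu >= 0) cpuNs.put(id, cpu);
            }
            cpuNs.entrySet().stream()
                 .sorted((a, b) -> Long.compare(b.getValue(), a.getValue()))
                 .limit(10)
                 .forEach(e -> System.out.printf("thread %d: %.1f ms CPU%n",
                         e.getKey(), e.getValue() / 1_000_000.0));
        }
    }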

BiWhy then allows drill-down to an individual thread for stack-trace inspection, stack-progression analysis, and correlation with metrics, states, and statuses. This helps determine whether threads remain stuck in the same code path, move in cycles, or change behavior over time.
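To illustrate the kind of question this answers, here is a minimal sketch that flags threads whose top stack frame is identical in every dump of a set (a hypothetical helper, not BiWhy code):

    import java.util.List;
    import java.util.Map;

    final class StuckThreads {
        // Each map is one dump: thread name -> top stack frame ("Class.method").
        static void report(List<Map<String, String>> dumps) {
            for (Map.Entry<String, String> e : dumps.get(0).entrySet()) {
                boolean pinned = dumps.stream()
                        .allMatch(d -> e.getValue().equals(d.get(e.getKey())));
                if (pinned) {
                    System.out.println(e.getKey() + " stuck at " + e.getValue());
                }
            }
        }
    }

A thread that stays pinned at the same frame across dumps is a candidate for deeper inspection; BiWhy additionally correlates this with the runtime metrics above.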

Key capabilities

  • Analysis of individual thread dumps and dump sets
  • Fast identification of problematic thread groups
  • Sorting and filtering by CPU, disk I/O, network I/O, files, and sockets
  • Analysis of thread states, statuses, and stack progression over time
  • Correlation of stack behavior with runtime metrics
  • Tree and table views

Structure and views
The tree is organized from dump sets down to thread groups and individual thread instances.

Available metrics include CPU, elapsed time, memory allocation, class count, priorities, file and network I/O, opened files and sockets, and thread state or status.

All of these can be shown or hidden, used for sorting, and analyzed across multiple views.
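A minimal sketch of the hierarchy the tree reflects (the type names are illustrative, not BiWhy's internal model; Java 16+ records for brevity):

    import java.util.List;

    // Illustrative model: dump sets contain thread groups, which contain
    // per-dump snapshots of individual threads with their metrics.
    record ThreadSample(String name, String state, long cpuNs,
                        long allocatedBytes, int openFiles, int openSockets,
                        List<String> stack) {}

    record ThreadGroupNode(String groupName, List<ThreadSample> threads) {}

    record DumpSet(String id, List<ThreadGroupNode> groups) {}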

Examples

Problematic dump set by CPU in the tree
The chart shows CPU in the top panel and file or network traffic in the bottom panel.

[Screenshot: tree view with the problematic dump set sorted by CPU; CPU chart on top, file/network traffic below]

Responsible thread group
After identifying the problematic dump set, you can select the responsible thread group and choose which metrics to display in the tree.

[Screenshot: responsible thread group selected in the tree with chosen metrics displayed]

Stack progression
By changing the selection in the tree, you can see how parameters and the most relevant stack trace evolve over time: whether the thread is stuck in one function or moving in cycles.

[Screenshot: stack progression across dumps for the selected thread]
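One way to separate "stuck in one function" from "moving in cycles" is to hash the stack trace captured in each dump and inspect the resulting sequence: a constant sequence means stuck, a repeating pattern suggests a cycle. A toy sketch of this idea (an assumed approach, not BiWhy's actual algorithm):

    import java.util.Arrays;
    import java.util.List;

    final class StackProgression {
        // stacksPerDump: one stack trace (list of frames) per dump, oldest first.
        static String classify(List<List<String>> stacksPerDump) {
            int[] hashes = stacksPerDump.stream().mapToInt(List::hashCode).toArray();
            if (Arrays.stream(hashes).distinct().count() == 1) {
                return "stuck in one code path";
            }
            // Simple period-2 check: stacks alternating between two states.
            boolean cycling = hashes.length >= 4;
            for (int i = 2; i < hashes.length && cycling; i++) {
                cycling = hashes[i] == hashes[i - 2];
            }
            return cycling ? "moving in a cycle" : "progressing normally";
        }
    }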

Problematic thread
In this case, the selected thread was transferring about 300 MB for roughly one hour, which works out to only about 85 KB/s of outbound network throughput, likely due to a bad or long-distance connection, while allocating almost 8 GB of memory in the process.

[Screenshot: thread detail with network transfer and memory allocation metrics]

Raylight Watchdog accumulation
This example shows accumulation of Raylight Watchdog threads.

[Screenshot: accumulation of Raylight Watchdog threads in the tree]

Inactive threads
By reviewing stack traces together with CPU usage, it becomes clear that these threads are effectively inactive.

[Screenshot: stack traces and CPU usage of the inactive threads]
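A crude version of this check (illustrative only): a thread that stays in WAITING state and whose CPU time does not advance between two dumps is effectively idle.

    final class InactiveCheck {
        // Illustrative: compare one thread's snapshots from two consecutive dumps.
        static boolean effectivelyInactive(String stateBefore, long cpuNsBefore,
                                           String stateAfter, long cpuNsAfter) {
            boolean waiting = "WAITING".equals(stateBefore)
                    && "WAITING".equals(stateAfter);
            return waiting && cpuNsAfter == cpuNsBefore;
        }
    }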

3,834 Raylight Watchdog threads
Here the issue is large-scale accumulation of Raylight Watchdog threads.

[Screenshot: 3,834 accumulated Raylight Watchdog threads]
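Accumulation like this can be spotted by counting threads per name pattern in each dump of a set and watching the count grow; a minimal sketch (the pattern and helper names are illustrative):

    import java.util.List;
    import java.util.regex.Pattern;

    final class ThreadAccumulation {
        // threadNamesPerDump: the thread names seen in each dump, oldest first.
        static void watch(List<List<String>> threadNamesPerDump, Pattern p) {
            long previous = -1;
            for (List<String> names : threadNamesPerDump) {
                long count = names.stream().filter(n -> p.matcher(n).find()).count();
                String trend = (previous >= 0 && count > previous) ? "  <-- growing" : "";
                System.out.println(p.pattern() + ": " + count + trend);
                previous = count;
            }
        }
    }

For example, watch(dumps, Pattern.compile("Raylight Watchdog")) would print a steadily growing count in a case like this one.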

Tomcat out-of-memory case
This example shows an out-of-memory investigation on Tomcat. The root cause was the accumulation of multiple instances of Geo Repository objects in memory. Ironically, this was meant to be a static object, so a single instance should have been enough. Worse, code originally intended to help garbage collection ended up interfering with it.

[Screenshot: out-of-memory investigation showing accumulated Geo Repository instances]

Allocation before fix

[Screenshot: memory allocation before the fix]

Allocation after fix

[Screenshot: memory allocation after the fix]
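The root cause described above matches a classic anti-pattern: a static registry that records every instance (originally so they can be cleaned up explicitly, "to help GC") and thereby keeps each instance strongly reachable forever. A purely hypothetical illustration, not the actual Geo Repository code:

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical illustration of the described anti-pattern.
    class Repository {
        // Static registry meant to support explicit cleanup "to help GC" ...
        private static final List<Repository> INSTANCES = new ArrayList<>();

        Repository() {
            // ... but every construction pins the new instance: the static list
            // keeps it strongly reachable, so GC can never collect it, and
            // repeated construction accumulates instances until OOM.
            INSTANCES.add(this);
        }
    }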

Table view with filters and analytics
The table view shows the selected thread set in tabular form, with filtering, sorting, and interactive analysis across the available metrics. It is also useful for lock and deadlock analysis.

[Screenshot: table view with filters, sorting, and analytics]
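For live JVMs, the JDK itself offers a starting point for deadlock analysis via ThreadMXBean.findDeadlockedThreads(); a minimal sketch (generic JDK API, independent of BiWhy):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class DeadlockCheck {
        public static void main(String[] args) {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            long[] ids = mx.findDeadlockedThreads(); // null if no deadlock
            if (ids == null) {
                System.out.println("no deadlocks");
                return;
            }
            for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
                System.out.println(info.getThreadName()
                        + " blocked on " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
    }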