Since wave 13, combining expensive statement information (CPU, peak memory) with task log information has been delivered as standard functionality.
The monitoring view TASK_LOGS_V_EXT has additional columns for memory and CPU consumption. They are copied over from the expensive statement table and persisted into the task log table.
Note that you have to re-import the content table to expose the columns in data modelling.
This has the helpful side effect that we now have a history that goes along with the task log table and is independent of the expensive statement view.
Please check out the details in the product documentation here.
Data Warehouse Cloud allows you to configure the thresholds for capturing expensive statements (help page).
These are the delivered default settings:
This is very helpful functionality for every admin – unfortunately, something important is missing:
How do I identify that the statement is related to SAP Data Warehouse Cloud?
How do I map the captured statements to the related activity in SAP Data Warehouse Cloud?
Without this information you can only see that something expensive happened – but you can’t change it, as the initiating process stays unknown. Or the other way round: you have defined a task in DWC and would like to check whether it is “expensive”.
Luckily we can at least find and map the expensive statements for some SAP Data Warehouse Cloud tasks. This blog post describes how to do it for the data integration features view persistence and remote table persistence.
Every HANA session comes with a set of session variables. If the SAP Data Warehouse Cloud task framework is the initiator, the variable '$$DWC_PROCEDURE_RUN$$' is added.
With that information we can immediately select all expensive statements that are related to direct or scheduled execution. The following statement would return all entries that are related to the task framework:
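A minimal sketch of such a query (the selected columns are examples and an assumption on my part; M_EXPENSIVE_STATEMENTS offers many more):

```sql
-- Sketch: all expensive statements initiated by the DWC task framework.
-- Assumes read access to the HANA monitoring view M_EXPENSIVE_STATEMENTS.
SELECT "START_TIME", "DURATION_MICROSEC", "MEMORY_SIZE", "CPU_TIME",
       "STATEMENT_STRING", "SESSION_VARIABLES"
FROM "M_EXPENSIVE_STATEMENTS"
WHERE "SESSION_VARIABLES" LIKE '%$$DWC_PROCEDURE_RUN$$%';
```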
Mapping to Task Information
The value of the variable is a task log id. This identifier is unique per task execution and can therefore be used to look up all attributes of a task, such as the space id, the name of the schedule, the runtime, etc.
The view TASK_LOGS_V_EXT contains this mapping and is accessible from a space tagged as monitoring space. Detailed information about this view can be found on the SAP Help Portal.
Here is a screenshot of the view definition and the main columns:
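As a quick illustration, a single task log id taken from the session variable can be looked up like this (a sketch using the main columns of the view; the placeholder must be replaced by a real value):

```sql
-- Sketch: look up the task attributes for one task log id.
-- '<task log id>' is a placeholder for a value extracted from
-- the session variable $$DWC_PROCEDURE_RUN$$.
SELECT "TASK_LOG_ID", "SPACE_ID", "APPLICATION_ID", "ACTIVITY", "OBJECT_ID"
FROM "TASK_LOGS_V_EXT"
WHERE "TASK_LOG_ID" = '<task log id>';
```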
The remaining task is now to combine the filter on the expensive statement table with the lookup on the task log view. This is described in the next chapter.
SQL View - Filter and Combine
The SQL view exposes a subset of the task attributes together with a subset of the expensive statement view:
SPACE_ID: space name of the task definition
APPLICATION_ID: identifies the task type – e.g. view persistence
ACTIVITY: e.g. persist, remove, …
OBJECT_ID: the id of the schedule
Note: Cleaning the task logs (Configuration -> Task Logs -> Clean Up Older Than) will also delete content of TASK_LOGS_V_EXT and hence might affect the lookup.
Please create an SQL view with the name TASK_EXP_STATM_V_H_01 in your monitoring space (see also setting up a Monitoring Space in SAP Data Warehouse Cloud on the SAP Help Portal).
This view searches for entries in M_EXPENSIVE_STATEMENTS that are related to DWC Data Integration. The classification is based on an internal property of SESSION_VARIABLES, the variable $$DWC_PROCEDURE_RUN$$:

```sql
SELECT
    "TASK_LOGS_V_EXT"."SPACE_ID",
    "TASK_LOGS_V_EXT"."APPLICATION_ID",
    "TASK_LOGS_V_EXT"."ACTIVITY",
    "TASK_LOGS_V_EXT"."OBJECT_ID",
    EXP_STATEMENTS_TASK_RELATED.*
FROM (
    SELECT EXP.*,
        -- JSON_VALUE has issues with $$ in the property name,
        -- so the value is extracted with plain string functions
        SUBSTR_BEFORE(SUBSTR_AFTER("SESSION_VARIABLES", '"$$DWC_PROCEDURE_RUN$$": "'), '"') AS "TASK_LOG_ID"
    FROM "M_EXPENSIVE_STATEMENTS" AS EXP
    -- if the session is initiated by the task framework,
    -- it contains the property $$DWC_PROCEDURE_RUN$$
    WHERE "SESSION_VARIABLES" LIKE '%$$DWC_PROCEDURE_RUN$$%'
) AS EXP_STATEMENTS_TASK_RELATED
LEFT JOIN "TASK_LOGS_V_EXT"
    ON "TASK_LOGS_V_EXT"."TASK_LOG_ID" = EXP_STATEMENTS_TASK_RELATED."TASK_LOG_ID"
```
SQL View - For Reporting
This is a sample view for reporting purposes. It exposes the duration, memory and CPU time both as measures and as dimensions. You might wonder why the figures are added as dimensions in addition – the reason is that this enables filtering (e.g. show all entries with a peak memory greater than 10 GB).
Step 3: Set the following dimensions as measures:
The related dimensions DURATION_SECS, MEMORY_SIZE_GB and CPU_TIME_SEC should be kept as normal dimensions/attributes.
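In plain SQL, deriving these figures in convenient units could be sketched as follows; the source column names and units (microseconds for duration and CPU time, bytes for memory, as usual for M_EXPENSIVE_STATEMENTS) are assumptions:

```sql
-- Sketch: derive reporting figures in convenient units from the
-- combined view created above. Source column names are assumptions.
SELECT
    "SPACE_ID", "APPLICATION_ID", "ACTIVITY", "OBJECT_ID", "TASK_LOG_ID",
    "DURATION_MICROSEC" / 1000000      AS "DURATION_SECS",
    "MEMORY_SIZE" / 1024 / 1024 / 1024 AS "MEMORY_SIZE_GB",
    "CPU_TIME" / 1000000               AS "CPU_TIME_SEC"
FROM "TASK_EXP_STATM_V_H_01";
```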
List all Tasks With Peak Memory Greater than 10 GB
This is pretty straightforward. We have the dimension MEMORY_SIZE_GB that contains the peak memory in GB. Just add a filter “greater than 10” with the following steps:
Navigate to the filter section in SAC (after you have inserted a table using the DWC view you just created) and select the Peak Memory (GB) (Range) from the dropdown:
In the following dialog, select greater than with a value 10.
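The same filter can of course also be expressed directly in SQL against the reporting view; the view name below is hypothetical:

```sql
-- Sketch: all task-related expensive statements with peak memory > 10 GB.
-- "TASK_EXP_STATM_REPORTING_V" is a hypothetical name for the reporting view.
SELECT *
FROM "TASK_EXP_STATM_REPORTING_V"
WHERE "MEMORY_SIZE_GB" > 10
ORDER BY "MEMORY_SIZE_GB" DESC;
```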
Combining the memory and runtime figures from expensive statements with the modeling artefacts of DWC provides great help in monitoring and understanding the impact of tasks on system resources. Especially during the first productive usage, it is highly recommended to evaluate the impact and to take measures to reduce or distribute the load in order to avoid overload situations.
I’d be happy to hear your thoughts, ideas and comments on this monitoring topic. Let me know in the comments!