last updated: 2023-12-15 16:30 CEST
Ok, large objects aka LOBs (no not Line of Business) are nothing special and well known from other database systems.
LOBs are unstructured data like a picture, a PDF or a message XML content. The shared properties of this kind of data is that they can be quite large.
There are different types of LOBs:
Type |
Description |
BLOB |
The BLOB data type is used to store large amounts of binary data. BLOB values can be converted to VARBINARY. |
CLOB |
The CLOB data type is used to store large amounts of 7-bit ASCII character data. CLOB values can be converted to VARCHAR. |
NCLOB |
The NCLOB data type is used to store a large Unicode character object. NCLOB values can be converted to NVARCHAR. |
TEXT |
The TEXT data type enables text search features. This data type can be defined for column tables, but not for row tables. This is not a standalone SQL type. Selecting a TEXT column yields a column of type NCLOB. |
BINTEXT |
The BINTEXT data type is similar to the TEXT data type and supports text search features, but it is possible to insert binary data. This data type can be defined for column tables, but not for row tables. This is not a standalone SQL type. Selecting a BINTEXT column yields a column of type NCLOB. |
There are also certain restriction for them e.g.:
LOB columns cannot appear in ORDER BY or GROUP BY clauses
or
LOB columns cannot appear in WHERE clauses as predicates other than LIKE (meaning that no comparison is allowed).
All this details and more, you can access via the official
documentation.
For a in memory database those data can be a resource killer. Because of this, not all LOB data is loaded into HANA. But there is more to report, because they are handled differently in dependency to the HANA release and the running application. This must be taken into account in order to achieve max. performance.
Content:
- types of LOBs
- VARBINARY vs. LOBs
- garbage collection
- Parametrization
- Mini Checks
- Frequently occurring known issues
- Threads
- SQL checks
- Summary
Sources for details
2400005 - FAQ: SAP HANA Persistence
2220627 - FAQ: SAP HANA LOBs
1994962 - How-To: Activation of Hybrid LOBs in SAP HANA
2169283 - FAQ: SAP HANA Garbage Collection
1. Types of LOBs
LOBs can be stored as CS and RS table. Here we have different types of LOBs:
- hybrid LOBs (RS+CS): Small values (typically below 1 KB), in memory like normal column values
- Packed / midsize LOBs : Midsize values (typically between 1 KB and 4 KB), packed with other LOBs into one container on disk
- File LOBs: Large values (typically above 4 KB): one container on disk per LOB
Be aware that packed and file LOBs are also named as disk LOBs.
You can use the table data collector (attached to note 1969700) to identifiy which table stores the LOBs in which way:
HANA_Tables_DataCollector*
SCHEMA_NAME |
TABLE_NAME |
COLUMN_NAME |
LOB_TYPE |
DISK_GB |
BINARY_GB |
LOB_COUNT |
schema |
table |
CONTENT |
PACKED |
3.00 |
2.90 |
1215200 |
schema |
table |
CONTENT |
FILE |
1.20 |
0.80 |
1215200 |
2. VARBINARY vs. LOBs
VARBINARY can also hold quite large binary data (up to 5000 byte), but it doesn't belong to the LOB data types. As a consequence the hybrid LOB feature can't be used and the column is loaded into memory when it is accessed.
When you upgrade to SAP_BASIS >= 7.53 the previous VARBINARY columns remain in place until an ABAP table conversion happens. News tables are created with a LOB data type.
As a consequence many potentially large tables with large LRAW columns (<= 5000 byte) are completely loaded into memory (e.g. BALDAT, PCL2, SOC3, SWNCMONI, SWWCNTP0). In order to reduce the memory consumption you can consider taking advantage of paged attributes / Native Storage Extension (SAP Note
2799997) so that unused column pages are evicted from memory in an LRU manner. Please also see SAP Notes
2816823 and
2898319 to safeguard NSE usage in table conversion events. Be sure that your revision is at least on Rev. 57 to avoid issues with the LOBs.
LRAW length limit |
SAP_BASIS |
SAP HANA Data Type |
<= 1000 |
all |
VARBINARY |
1001 - 5000 |
<= 7.52 |
VARBINARY |
1001 - 5000 |
>= 7.53 |
BLOB / CLOB |
> 5000 |
all |
BLOB / CLOB |
How to convert affected objects? Have a look into note
2375917.
You can use
SQL: "HANA_Tables_ColumnStore_Columns" (DATA_TYPE = 'VARBINARY')
Typically tables which you can save some memory from: BALDAT, PCL2, SOC3, SWNCMONI, SWWCNTP0.
ALTER TABLE "<schema>"."<table_name>" ALTER ("<column_name>" BLOB MEMORY THRESHOLD <1-5000>)
ALTER TABLE "SAPABAP1"."BALDAT" ALTER ("CLUSTD" BLOB MEMORY THRESHOLD 2000)
It depends on the size of the threshold you defined how high your savings are. If the column is part of any index it is
not possible to convert the attribute to LOB.
3. Garbage Collection
There is also a LOB garbage collection (=LGC) for deleted / fragmented data.
You can also trigger a manually GC for all CS tables:
ALTER SYSTEM RECLAIM COLUMN LOB SPACE
for a defined CS table:
ALTER TABLE "<table_name>" WITH PARAMETERS ('LOB_GARBAGE_COLLECTION' = '1')
For row store tables it is not possible to trigger LOB garbage collection on table level.
ALTER SYSTEM RECLAIM [ROW|COLUMN] LOB SPACE [ALL]
The ALL key word is available as of SAP HANA 2.00 and makes sure that also LOBs of tables are cleaned that are currently not loaded into memory.
Be aware that LOB garbage collection itself doesn't purge the LOB files on disk level. This is handled at a later time by the persistence garbage collection. Starting with SAP HANA 2.0 SPS 06 you can alternatively use the
SAP HANA job scheduler to trigger column store LOB garbage collection at specific times (SAP Note
3147465).
indexserver.ini -> [lobhandling] -> garbage_collect_daily_schedule_s -> 3600
Available starting with SAP HANA 2.0 SPS 00 as alternative to garbage_collect_interval_s (see below), inactive per default with SAP HANA <= 2.0 SPS 03
Specifies the time in seconds since midnight when the
volume-scan LOB garbage collection is executed on a daily basis. A negative value pauses the daily scan.
indexserver.ini -> [lobhandling] -> garbage_collect_interval_s -> 900 (<= 2.00.053) / 43200 (>= 2.00.054)
Defines the time interval of (
table-wise) column store LOB garbage collections, value 0 disables LOB garbage collection
Specifies the time in seconds between table-wise LOB garbage collection runs. If the interval is zero the table-wise LOB garbage collection is paused.
Attention: When garbage_collect_daily_schedule_s is explicitly set on SAP HANA <= 2.0 SPS 03, garbage_collect_interval_s is ignored. |
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_enabled -> true
# Value 'true' activates row store LOB garbage collection
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_interval -> 10
# Defines the time interval of row store LOB garbage collections, value 0 disables LOB garbage collection |
In some patched environments both the table- and the volume-wise LGC is active.
This is not neccessary! Only the
volume-wise LOB garbage collection can remove LOBs belonging to previously removed columns, partitions or tables. Therefore, it is possible to completely disable the table-wise LOB garbage collection to eliminate its impact on system performance, but it is
not recommended to disable the volume-wise LOB garbage collection for productive cases with LOB load.
SPS06
SchedulerJobs can only execute procedures, therefore the following command will create a procedure that triggers the LOB garbage collection:
CREATE PROCEDURE PROC_RECLAIM_COLUMN_LOB_SPACE AS BEGIN
EXEC 'ALTER SYSTEM RECLAIM COLUMN LOB SPACE';
END;
The following example defines the job to be executed once a week on Sunday beginning at 3 am at night:
CREATE SCHEDULER JOB RECLAIM_COLUMN_LOB_SPACE CRON '* * * SUN 3 0 0' ENABLE PROCEDURE PROC_RECLAIM_COLUMN_LOB_SPACE;
SQL Collection (1969700)
HANA_GarbageCollection_LOBs
4. Parametrization
Here are the default values (for SPS05 - some other releases may differ):
indexserver.ini -> [persistence] -> midsizelob_threshold = 0
global.ini -> [sql] -> lob_memory_threshold
global.ini -> [sql] -> default_lob_storage_type = hybrid
indexserver.ini -> [persistence] -> disposition_lob_read = 1
indexserver.ini -> [persistence] -> disposition_lob_read_small = 1
indexserver.ini -> [persistence] -> disposition_lob_write = 1
indexserver.ini -> [persistence] -> disposition_lob_write_small = 1
indexserver.ini -> [lobhandling] -> garbage_collect_interval_s = 43200
# When garbage_collect_daily_schedule_s is explicitly set on SAP HANA <= 2.0 SPS 03, garbage_collect_interval_s is ignored
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_enabled = true
# row Value 'true' activates row store LOB garbage collection
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_interval = 10
# Defines the time interval of row store LOB garbage collections, value 0 disables LOB garbage collectionglobal.ini -> [persistence] -> lob_page_cache_max_size => 10-20% GAL
global.ini -> [persistence] -> lob_page_cache_reclaim_target_size = 5%
global.ini -> [persistence] -> lob_page_cache_reclaim_threshold_size = 10%indexserver.ini -> [lobhandling] -> garbage_collect_daily_schedule_s -> 3600indexserver.ini -> [cs_nse_advisor] -> enable_lob_pageloadable = false (SPS5) / true (SPS06)# This parameter controls if LOB columns are considered by the NSE advisor (SAP Note 2799997). It is
#available starting with SAP HANA 2.00.059 and its default value is "false" for SAP HANA 2.0 SPS 05
#and "true" for SAP HANA >= 2.0 SPS 06 (SAP Note 3125483). |
Attention: The LOB Page Cache only includes the disk LOBs (packed + file)! Also activated LOBs for NSE are not placed into NSE buffer cache. It is still placed into the LOB Page Cache. |
This means you can limit the size of in memory lobs which are placed as resource killer. For sure it is a trade-off but are LOB data really business critical? You can save a lot of resources at the beginning of your sizing. You can also use NSE to page out LOBs for certain tables / columns which is an addition to the Imitation parameter because they are global. With NSE you can also limit the memory LOBs in a different way.
5. Mini Checks
Check ID |
Description |
SAP note |
M0371 |
Unused space in packed LOBs vs. data % |
2220627 |
M0372 |
Unused space in file LOBs vs. data % |
2220627 |
M0373 |
Tables with many small file LOBs |
2220627 |
M0374 |
Tables with many disk LOBs |
2220627 |
M0450 |
Tables with memory LOBs > 2GB |
1994962 |
M0849 |
Orphan disk LOBs |
2220627 |
M0850 |
Time since last CS LOB garb. coll. (days) |
2220627 |
Check out note
1999993 - How-To: Interpreting SAP HANA Mini Check Results how to handle them.
6. Frequently occurring known issues
Memory leak
Starting with JDBC 2.3.37 JDBC uses transactional LOBs. As a consequence database requests returning LOB data are no longer closed at the end of the statement execution. As a consequence statements can remain in SUSPENDED state for a long time, introducing various trouble like blocked garbage collection or growing Pool/Statistics and Pool/RowEngine/QueryExecution allocator sizes. This affects e.g. all Solution Managers Java stacks!
This can be fixed by setting transactionalLobs=false as stated out in
2711824 - High Number of Prepared Statements Causing High Usage of Memory Allocator Pool/Statistics.
I/O issues
The critical phase can also take longer in context of many disk LOBs (SAP Note
2220627) and a large Pool/PersistenceManager/PersistentSpace/DefaultConverter/ConvPage allocator (SAP Note
1999997, issue number 257466). The FlushResourcesThread is in this case busy clearing converter pages in a call stack like:
memset_impl
PageAccess::ConverterIndexPageControlBlock::resetFlushPerSPCounter
DataAccess::PersistentSpaceImpl::resetFlushPerSPCounter
ResourceManager::WaitAndSwitchCounterResource::determineFlushDestination
In this scenario each critical phase has a certain duration, even if no I/O writes are done. Reducing the number of disk LOBs in the system can reduce the runtime of the critical phase. With SAP HANA >= 2.00.048.04 and >= 2.00.054 this scenario is optimized and significantly shorter critical phases can be expected.
Table Consistency Check (CHECK_TABLE_CONSISTENCY) against the table returns the following inconsistency:
ERROR_CODE;ERROR_MESSAGE ;SEVERITY;CHECK_ACTION ;AFFECTED_COUNT;DETAILS
5,406 ;Lob container with CID= x-x-x
has unusually high overhead, binToRawRatio=5604.99 which is above the threshold of 10 with a total size of 317351264256.
Please run REPAIR_HYBRID_LOB_OVERHEAD once on this table.;INFO ;CHECK_HYBRID_LOB_OVERHEAD;? ;?
CALL CHECK_TABLE_CONSISTENCY('REPAIR_HYBRID_LOB_OVERHEAD', '<SCHEMA_NAME>', '<TABLE_NAME>');
2950474 - DB table growth in SAP HANA related to Packed LOB's
7. Threads
Following known thread methods:
GarbageLobFileHandler |
Lob garbage collection (row store) |
RSLobGarbageRemover |
Deletion of orphan row store LOBs |
LobGarbageCollector |
Lob garbage collection (column store) |
LobGarbageCollectorJob |
Column store LOB garbage collection
Thread detail "table wise" indicates a table wise LOB garbage collection triggered by regular runs based on lob_garbage_collect_interval_s |
8. SQL checks
Also for LOBs Martin has created some amazing scripts packed into the SQL Collection (1969700):
HANA_GarbageCollection_LOBs
HANA_LOBs_HybridLOBActivation_CommandGenerator
HANA_LOBs_LOBFiles
HANA_GarbageCollection_LOBs_ColumnStore_Reclaims_1.00.122.17+
HANA_LOBs_LOBSizeHistogram_CommandGenerator |
9. Summary
In the end LOBs should not be placed for a long time into a in-memory database like HANA. But for some circumstances it is not possible for every scenario. Here you have to find a trade-off between performance and resources. Therefor you have to tune the thresholds of the parameters or use NSE. This can save a lot of memory also during the initial sizing of the system and the further operation after going live.