Table - partitioning is always a deep physical work to get the best results of the hardware as memory and CPU. HANA typical does need this only with very large tables. The rule of thumb per partition is some 300.000.000 rows (300 Mio ). The limit is with 2 billion rows per partition and with max 16000 partitions per table. Now these partitions can be either COLUMN LOADABLE or PAGE LOADABLE. A column loadable table/partition is in memory (and stored on disk), a page loadable table/partition in store on disk and fetched into memory if needed. This concept is called HOT and WARM data. In HANA this is call NSE (Native Storage Extension)
As CAP or HDI developer yo are more interested in the tables structures and not so much on the physical aspects of storing or accessing the data - only it should be fast. And what if the HANA could understand itself to build a new partition and you can live with a simple rule on top.
Now this is exactly where we start this blog. There is not the "one size fits all" solution but I think you can see the pattern:
Assuming you are doing financials and you most like look at data that are in the current year, maybe in the same quarter of last year. That is a span of 15 month you want to have in memory. All others data (older) should be on disk = PAGE LOADABLE. And you have a volume that fits in one month a partition size.
So we have
Lets us look at the syntax graph:
RANGE VALUES are:
PARTITION OTHERS are:
Sample:
Table:
DO BEGIN
DECLARE DROP INT;
select count(*) into DROP from tables where table_name = 'PART_1';
If DROP > 0 then
DROP TABLE PART_1;
END IF;
END;
CREATE COLUMN TABLE PART_1
(A INT NOT NULL,
B DATE NOT NULL)
PARTITION BY RANGE(MONTH(B)) (
PARTITION '197001' <= VALUES < '202012' PAGE LOADABLE,
PARTITION OTHERS DYNAMIC INTERVAL 1 MONTH DISTANCE 15 MONTH PAGE LOADABLE
);
Now we have everything before 202012 (Dec. 2020) as PAGE LOADABLE and the latest 15 month as COLUMN (DEFAULT) LOADABLE.
Now you can fill the table with some data:
TRUNCATE TABLE PART_1;
INSERT INTO PART_1
SELECT A.A, B.B FROM
(SELECT ADD_MONTHS('2000-01-01', GENERATED_PERIOD_END) AS B
FROM SERIES_GENERATE_INTEGER(1,0,296)) B,
(SELECT LPAD(GENERATED_PERIOD_END, 10, '0') AS A
FROM SERIES_GENERATE_INTEGER(1,0,100)) A,
(SELECT GENERATED_PERIOD_END FROM SERIES_GENERATE_INTEGER(1,0,1000));
-- force dynamic processing (or wait ~5 minutes)
ALTER TABLE PART_1 ADD PARTITION FROM OTHERS;
ALTER TABLE PART_1 REFRESH DYNAMIC DISTANCE;
This will nearly generate 30 mio rows.
296 months and some simple numbers. On my HANA this is a 6 sec job. Enough to play and see effects of the partitioning.
Look for the table:
SYS.TABLE_PARTITIONS
SYS.M_TABLE_PARTITIONS
To find some more meta data on the partitions.
or
Important advice:
The dynamic instruction is not returned by this function but shows you how the distribution is in this very moment:
Considerations:
DO BEGIN
DECLARE DROP INT;
select count(*) into DROP from tables where table_name = 'PART_2';
If DROP > 0 then
DROP TABLE PART_2;
END IF;
END;
CREATE COLUMN TABLE PART_2
(A INT NOT NULL,
B DATE NOT NULL)
PARTITION BY RANGE (B)
( PARTITION OTHERS DYNAMIC THRESHOLD 100000 DISTANCE 15 MONTH PAGE LOADABLE )
This will give a filling of 100000 row (300000000 = rule of thumb) and for the 15 month distance. Please double check if this is working for you. The first rule should work for the most cases. IOT scenarios might need this one or a daily range. At a certain volume you should consider with so massive data to store them into the SQL on Files in HANA option.
Help:
heterogenious partion HANA Cloud
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
| User | Count |
|---|---|
| 46 | |
| 41 | |
| 38 | |
| 31 | |
| 28 | |
| 28 | |
| 27 | |
| 24 | |
| 24 | |
| 23 |