
Efficiently Inserting Large Data Volumes into Database Tables Using CAP and CQL

alexandrayuyi

Hello SAP Community!

We're currently developing applications using the SAP Cloud Application Programming Model (CAP), SAP Fiori Elements, and SAP HANA on SAP Business Technology Platform (BTP). One of our core requirements is to load and process large CSV files. While we've found a solution for loading the files, we've noticed that inserting large volumes of data can take several minutes. We're wondering whether there's a more efficient way to perform these inserts, or whether there are alternative approaches we haven't considered.

Below is a custom service in Node.js that we wrote to delete all the data from the Modules entity and then insert the new data. The snippet parses the CSV file content with the PapaParse library and runs a transaction that deletes the existing entries in the Modules and TCodes entities before inserting the new data. This is just one example of the many cases in which we have to process CSV files and insert their contents into the database; essentially, we follow the same process in all our custom services.

If you have any advice on optimizing this approach or suggestions for handling bulk data insertion more efficiently, we’d love to hear from you.

Here’s the custom service code:

const cds = require('@sap/cds');
const Papa = require('papaparse'); // Library for parsing CSV

module.exports = async (srv) => {
    const db = await cds.connect.to('db');
    const { Modules, TCodes } = srv.entities;

    srv.on('uploadCSVModules', async (req) => {
        try {
            
            const csvContentBase64 = req.data.fileContent;

            // Using PapaParse to parse the CSV
            const parsedData = Papa.parse(csvContentBase64, {
                header: true,
                skipEmptyLines: true
            });

            // Check if the entity is defined correctly
            if (!Modules) {
                throw new Error("Entity Modules is undefined");
            }

            // Start the transaction
            await db.tx(async (trans) => {

                const entries = parsedData.data.map(row => ({
                    code: row.code.trim().toUpperCase(), // Ensure no white spaces
                    descr: row.descr.trim(),
                    fuesLicense_code: row.fuesLicense_code ? row.fuesLicense_code.trim() : 'none',
                }));

                try {
                    console.log("Fetching TCodes...");
                    let tcodes = await SELECT.from(TCodes);
                    console.log("TCodes fetched:", tcodes);

                    console.log("Deleting existing Modules...");
                    await DELETE.from(Modules);
                    console.log("Existing Modules deleted.");

                    console.log("Deleting existing TCodes...");
                    await DELETE.from(TCodes);
                    console.log("Existing TCodes deleted.");

                    console.log("Inserting new Modules...");
                    await INSERT.into(Modules).entries(entries);
                    console.log("New Modules inserted.");

                    console.log("Re-inserting TCodes...");
                    await INSERT.into(TCodes).entries(tcodes);
                    console.log("TCodes re-inserted.");

                } catch (err) {
                    console.error(`Error during database operations: `, err);
                    throw err; // Throw error to abort transaction
                }

            });

            return { message: "CSV processed successfully" };
        } catch (err) {
            console.error("Error processing CSV:", err);
            return req.reject(500, "Error processing CSV: " + err.message);
        }
    });
};

The reason we also delete and re-insert the data from the TCodes table is that the Modules entity has a composition of many TCodes, and this was the way we found to re-trigger the association between the two entities. Our SAP Fiori Elements app shows a table (the object page list of a module) with the TCodes of each module, and it was no longer being populated after running the custom service, so that is how we solved it. Here are the entities in CDS in case they are useful, followed by a small sketch of the composition handling we mean:

entity Modules: COMMON.Enumeration { 
    key code: COMMON.SapCode;     
        fuesLicense: Association to Fues;
        tcodes : Composition of many TCodes on tcodes.module = $self;
};

entity TCodes  {
    key code            : COMMON.SapCode;
        descr           : String;
        module          : Association to Modules;
        keyFeature      : Association to ExcepKeyFeatures;
        use             : Association to Fues;
};
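
For clarity on the composition: as far as we understand, CAP can also fill such a composition with a single deep insert, i.e. a module together with its TCodes in one .entries() call. The snippet below is only a rough sketch with made-up sample values; we have not tried this variant at our data volumes:

// Sketch only - the sample values ('FI', 'FB01', ...) are made up for illustration.
// A "deep insert" writes a module together with its composed TCodes in one call:
await INSERT.into(Modules).entries([
    {
        code: 'FI',
        descr: 'Financial Accounting',
        fuesLicense_code: 'none',
        tcodes: [
            { code: 'FB01', descr: 'Post Document' },
            { code: 'FB03', descr: 'Display Document' }
        ]
    }
]);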

 

 

In this example, we insert 1,954 entries into the Modules table from a CSV file and re-insert all 118,570 entries of the TCodes table: we read them from the table, keep them in a variable, and then insert them again.

With these data volumes, the whole process takes around five minutes.
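
To make the question more concrete, one variant we could imagine is batching the large re-insert instead of passing all rows in a single call. The sketch below reuses db, entries, Modules and TCodes from the service above; the batch size of 5,000 is an arbitrary value for illustration, not something we have benchmarked:

// Sketch only: a batched variant of the re-insert.
// BATCH_SIZE is an arbitrary illustration value, not something we benchmarked.
const BATCH_SIZE = 5000;

const tcodes = await SELECT.from(TCodes);            // backup, as in the service above

await db.tx(async () => {
    await DELETE.from(Modules);
    await DELETE.from(TCodes);

    // `entries` are the mapped CSV rows from the service above (1,954 rows).
    await INSERT.into(Modules).entries(entries);

    // Re-insert the ~118,570 TCodes rows in batches instead of one single call.
    for (let i = 0; i < tcodes.length; i += BATCH_SIZE) {
        await INSERT.into(TCodes).entries(tcodes.slice(i, i + BATCH_SIZE));
    }
});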


Any tips on enhancing the speed and efficiency of this bulk insert process would be greatly appreciated!

Thank you!

Best regards,

Maria Chang

Developer at SofOs

 

Accepted Solutions (0)

Answers (1)


alexandrayuyi

Hello John, 
In this example, we insert 1,954 entries into the Modules table from a CSV file and re-insert all 118,570 entries of the TCodes table: we read them from the table, keep them in a variable, and then insert them again.

With these data volumes, the whole process takes around five minutes.

Thank you for replying and for your interest in helping.

Best regards,

Maria Chang