Python FOR loop doesn't work properly

Ricobye
Explorer
0 Likes
1,432
Hi Experts,
I'm facing an issue when I try to use a FOR loop in my Datasphere data flow script. My goal is to fill a table with the parent-child hierarchy links from KNVH. I use a DataFrame with 9,000 rows.
For the test, I filter on the parent 0001010970. As you can see, it has 8 children.
 
client pere.png
 
The first step was to make sure that the FOR loop runs fine, and it does.
 
image (14).png
 
 
image (15).png
For each row, the link is made.
 
But the goal is to iterate through a table of parents and find all their children. When we move on to the normal code, we notice that some rows are not being processed...
image (16).png
 
image (17).png
As you can see, some rows are not processed... They don't even go through the FOR loop. I don't know why they are ignored...
I've gone around in circles with the code and tried different instructions, without success.
 
Do you have any ideas, please? I need your help.
 
Best regards
 
Vitaliy-R
Developer Advocate
0 Likes

The logic of the code is slightly confusing to me, for example why you assign

data.loc[i,'HIERCUST4'] = w_cust

to all rows in every iteration of `j`...

...but if I read your requirements right, and `data` and `data_h` are both Pandas dataframes, then how about trying this code?

# Iterate through each customer from data_h
for _, row_h in data_h.iterrows():
    w_cust = row_h['KUNNR']

    # Boolean mask for matching HKUNNR
    mask = data['HKUNNR'] == w_cust

    # Conditional assignments only where the mask is True
    data.loc[mask, 'HIERCUST1'] = data.loc[mask, 'HKUNNR']
    data.loc[mask, 'HIERCUST2'] = data.loc[mask, 'KUNNR']
    data.loc[mask, 'HIERCUST4'] = w_cust
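
If the loop above is only meant to mark the direct children of the customers in `data_h`, the same result can probably be obtained without a loop by using `Series.isin`. The following is a minimal sketch on made-up sample rows (the customer numbers other than 0001010970 are hypothetical, as is the assumption that the three `HIERCUST` columns start out as empty strings):

```python
import pandas as pd

# Hypothetical sample data: one hierarchy head, two direct children,
# one row with an unrelated parent, and one grandchild.
data = pd.DataFrame({
    "KUNNR":  ["0001010970", "C1", "C2", "C3", "G1"],
    "HKUNNR": ["", "0001010970", "0001010970", "OTHER", "C1"],
})
data_h = pd.DataFrame({"KUNNR": ["0001010970"]})

for col in ("HIERCUST1", "HIERCUST2", "HIERCUST4"):
    data[col] = ""

# One mask for all parents at once: rows whose HKUNNR is listed in data_h
mask = data["HKUNNR"].isin(data_h["KUNNR"])

data.loc[mask, "HIERCUST1"] = data.loc[mask, "HKUNNR"]
data.loc[mask, "HIERCUST2"] = data.loc[mask, "KUNNR"]
# Inside the loop, w_cust equals HKUNNR on every masked row
data.loc[mask, "HIERCUST4"] = data.loc[mask, "HKUNNR"]
```

Only the rows whose parent appears in `data_h` are filled; the grandchild `G1` is left untouched because its parent `C1` is not a hierarchy head.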

Regards.

Ricobye
Explorer
0 Likes

Hi Vitaliy-R,

Thank you for your reply. I tried your code, but it doesn't work. As you can see, I got the same result... Only four rows are populated.

Ricobye_0-1746218847402.png

 

Best regards

 

 

 

Vitaliy-R
Developer Advocate
0 Likes
Are there any additional filters outside of the block code that you shared? Or is there any split into packages of the input datasets before the code is executed?
Ricobye
Explorer
0 Likes

Hi Vitaliy-R

Here is my code: 

def transform(data):
    """
    This function body should contain all the desired transformations on incoming DataFrame. Permitted builtin functions
    as well as permitted NumPy and Pandas objects and functions are available inside this function.
    Permitted NumPy and Pandas objects and functions can be used with aliases 'np' and 'pd' respectively.
    This function executes in a sandbox mode. Please refer the documentation for permitted objects and functions. Using
    any restricted functions or objects would cause an internal exception and result in a pipeline failure.
    Any code outside this function body will not be executed and inclusion of such code is discouraged.
    :param data: Pandas DataFrame
    :return: Pandas DataFrame
    """
    #####################################################
    # Provide the function body for data transformation #
    #####################################################
    data["HIERCUST1"] = ""
    data["HIERCUST2"] = ""
    data["HIERCUST3"] = ""
    data["HIERCUST4"] = ""
    data["HIERCUST5"] = "" 

    H = int(0)
    H1 = int(0)
    nb_ind = int(0)
    w_cust = ""

    #---------------------------------------------------------------------
    #   Nodes without parents, level-1 hierarchy heads
    #---------------------------------------------------------------------
    data.sort_values('HKUNNR')
    
    for i in range(data.shape[0]):
        if data['HKUNNR'][i] == "":
            nb_ind = nb_ind + 1
        
    ind = np.arange(nb_ind)
    
    #-----Create a temporary table for the hierarchy heads
    
    data_h = pd.DataFrame(columns=['KUNNR'], index = ind)
    
    #-----Populate the hierarchy-heads table
    
    for i in range(data.shape[0]):
        if data['HKUNNR'][i] == "":
            data.loc[i,'HIERCUST1'] = data['KUNNR'][i]

            new_row = {'KUNNR': data['KUNNR'][i]}
            data_h.loc[len(data_h)] = new_row
			

    #---------------------------------------------------------------------
    #   Hierarchy level 2, children of the level-1 hierarchy heads
    #---------------------------------------------------------------------
  
    data.sort_values('KUNNR')
    data_h.sort_values('KUNNR')
    
    """    
    for i, row in data.iterrows():
        if row ['HKUNNR'] == "0001010970":
            data.loc[i,'HIERCUST1'] = row ['HKUNNR']
            data.loc[i,'HIERCUST2'] = row ['KUNNR']
    

    for j in list(range(0, len(data_h))):
    #for j, row in data_h.iterrows():
    #    w_cust = row ['KUNNR']

        w_cust = "0001010970"
        for i, row in data.iterrows():
            data.loc[i,'HIERCUST4'] = w_cust
            
            if row ['HKUNNR'] == w_cust:
                data.loc[i,'HIERCUST1'] = row ['HKUNNR']
                data.loc[i,'HIERCUST2'] = row ['KUNNR']
    
    """
   
    #----------------------------SAP forum-----------
    
    for _, row_h in data_h.iterrows():
        w_cust = row_h['KUNNR']

        # Boolean mask for matching HKUNNR
        mask = data['HKUNNR'] == w_cust

        # Conditional assignments only where the mask is True
        data.loc[mask, 'HIERCUST1'] = data.loc[mask, 'HKUNNR']
        data.loc[mask, 'HIERCUST2'] = data.loc[mask, 'KUNNR']
        data.loc[mask, 'HIERCUST4'] = w_cust

    #----------------------------SAP forum end-------

    return data

As you can see, there is no filter... 
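
Two details in the script above may be worth checking, independently of the missing rows. First, `DataFrame.sort_values` returns a new DataFrame, so calling it without assigning the result leaves `data` in its original order. Second, creating `data_h` with `index = ind` pre-allocates empty rows, and the later `data_h.loc[len(data_h)] = ...` appends after them. A minimal sketch of a simpler way to collect the hierarchy heads, on made-up sample rows (the customer numbers are hypothetical):

```python
import pandas as pd

# Hypothetical sample data: "A" and "D" have no parent
data = pd.DataFrame({
    "KUNNR":  ["A", "B", "C", "D"],
    "HKUNNR": ["", "A", "A", ""],
})
data["HIERCUST1"] = ""

# sort_values returns a sorted copy; without assignment nothing changes
order_before = data["KUNNR"].tolist()
data.sort_values("KUNNR", ascending=False)
order_after = data["KUNNR"].tolist()

# Rows without a parent are the level-1 hierarchy heads
is_head = data["HKUNNR"] == ""

# Build the heads table directly from the filter, without pre-allocated rows
data_h = data.loc[is_head, ["KUNNR"]].reset_index(drop=True)

# A head is its own level-1 node
data.loc[is_head, "HIERCUST1"] = data.loc[is_head, "KUNNR"]
```

With this approach `data_h` contains exactly one row per hierarchy head and no empty placeholder rows.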

Best regards

 

AndreasForster
Product and Topic Expert

Could this be related to the fact that the data arrives in multiple batches? The code might never see the full dataset.

"In a data flow, the script operator may receive the incoming table in multiple batches of rows, depending on the size of the table. This means that the transform function is called multiple times, for each batch of rows, and that its data parameter contains only the rows for a given batch.

Hence, the operations that require the complete table within the data parameter are not possible. For example, removing duplicates."

https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/f3e2570966ac4036b552ebd998...
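
The effect of batching described in the quote can be simulated locally. The sketch below is not the Datasphere runtime: the simplified `transform`, the sample rows, and the batch size of 3 are all assumptions chosen for illustration. It calls the same function once on the whole table and once per batch, then compares how many children get their link filled.

```python
import pandas as pd

def transform(data):
    """Simplified stand-in for the script: link children to heads found in `data`."""
    data = data.copy()
    data["HIERCUST1"] = ""
    data["HIERCUST2"] = ""

    # Heads are only those present in the rows this call receives
    is_head = data["HKUNNR"] == ""
    heads = data.loc[is_head, "KUNNR"]
    data.loc[is_head, "HIERCUST1"] = data.loc[is_head, "KUNNR"]

    # Children whose parent is one of the heads seen in this call
    mask = data["HKUNNR"].isin(heads)
    data.loc[mask, "HIERCUST1"] = data.loc[mask, "HKUNNR"]
    data.loc[mask, "HIERCUST2"] = data.loc[mask, "KUNNR"]
    return data

# Hypothetical sample: one parent "P" with four children
full = pd.DataFrame({
    "KUNNR":  ["P", "C1", "C2", "C3", "C4"],
    "HKUNNR": ["", "P", "P", "P", "P"],
})

# Case 1: the function sees the whole table in one call
whole = transform(full)

# Case 2: the function is called once per batch (hypothetical batch size)
batch_size = 3
batched = pd.concat(
    [transform(full.iloc[s:s + batch_size]) for s in range(0, len(full), batch_size)]
)

filled_whole = int((whole["HIERCUST2"] != "").sum())
filled_batched = int((batched["HIERCUST2"] != "").sum())
```

In this simulation all four children are linked when the table is processed in one call, but only the two children that share a batch with their parent are linked when it is processed in batches. If the data flow really splits the input this way, a lookup that needs the complete parent table would have to happen outside the script operator.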