
URGENT: Error in Data Flow (Python): 'str' object has no attribute 'as_tuple'

mitko1994
Participant

Hello,

I have a Python script running inside a Data Flow that fails with the following error.

[Screenshot "Python error.png": 'str' object has no attribute 'as_tuple']

Since it's impossible to debug inside Datasphere, I took the source code to another interpreter and loaded my dataset from CSV to test. The code runs fine and produces the expected result there. Any ideas what might be causing the issue? This is a showstopper for what I'm doing right now, so any input would be greatly appreciated.
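
For reference, the kind of standalone repro described above can be sketched like this (the file name, the sample data, and the identity `transform` body are all hypothetical stand-ins for the real script and dataset):

```python
import pandas as pd

def transform(data):
    # paste the Data Flow script body here; identity used as a stand-in
    return data

# hypothetical sample export standing in for the real dataset
pd.DataFrame({"refunds": ["[{id:1,amount:5}]", None]}).to_csv(
    "orders_sample.csv", index=False
)

df = pd.read_csv("orders_sample.csv")
out = transform(df)
print(out.shape)  # (2, 1)
```

Running the same script body locally against the same data is the usual way to separate logic errors from environment differences.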

Thanks,
Dimitar

Accepted Solutions (0)

Answers (1)


mitko1994
Participant

Here is the code

def transform(data):
    """
    This function body should contain all the desired transformations on incoming DataFrame. Permitted builtin functions
    as well as permitted NumPy and Pandas objects and functions are available inside this function.
    Permitted NumPy and Pandas objects and functions can be used with aliases 'np' and 'pd' respectively.
    This function executes in a sandbox mode. Please refer to the documentation for permitted objects and functions. Using
    any restricted functions or objects will cause an internal exception and result in a pipeline failure.
    Any code outside this function body will not be executed and inclusion of such code is discouraged.
    :param data: Pandas DataFrame
    :return: Pandas DataFrame
    """

    # keep only rows that actually have refund data
    data = data[data['refunds'].notnull()]
    data.rename(str.upper, axis='columns', inplace=True)


    result = pd.DataFrame(
        data=data,
        columns=[
            'ORDER_LINE_ID', 'ID', 'AMOUNT', 'COMMISSION_AMOUNT', 'COMMISSION_VAT_AMOUNT', 'QUANTITY', 
            'DATE_WAITING_REFUND', 'DATE_WAITING_REFUND_PAYMENT', 'DATE_REFUNDED', 'STATE', 'TRANSACTION_DATE', 
            'TRANSACTION_NUMBER', 'REASON_CODE', 'REASON_LABEL', 'PRICE_TAXES', 'SHIPPING_PRICE_TAXES', 
            'CTA_AMOUNT', 'CTA_CODE', 'CTA_RATE', 'RFD_COUNT', 'CURRENCY_ISO_CODE', 'DATE_LOADED','REFUNDS']
    )

    all_refunds = []
    for idx in result.index:
        row = result.loc[idx].to_dict()

        tmp = row['REFUNDS'].replace('"', '').split("{id")
        refunds = list(filter(lambda x: len(x) > 1, tmp))
        # get the number of refunds for each order id
        row['RFD_COUNT'] = len(refunds)
        for rec in refunds:
            # remove all double quotes.
            rec = rec.replace('"', "")

            # split off the commission taxes
            tax_split = rec.split("commission_taxes:[")

            # split off the key/value pairs after the commission taxes
            tmp = tax_split[1].split("],")
            com_taxes = tmp[0][1:-1]  # remove the json braces {}
            # rebuild the refund key/val pairs after splitting off the commission taxes
            refund = "id" + tax_split[0] + tmp[1]
            # remove all occurrences of: [ ] { }
            refund = refund.replace("[", "").replace("]", "").replace("{", "").replace("}", "")
            # drop the extra comma on the record if it's not the last refund for this order
            if refund[-1] == ",":
                refund = refund[:-1]

            for key_val in refund.split(","):
                tmp = key_val.split(":")
                key = tmp[0].upper()
                if key in result.columns:
                    val = tmp[1]
                    row[key] = val

            for key_val in com_taxes.split(","):
                tmp = key_val.split(":")
                key = tmp[0].upper()
                val = tmp[1]
                row['CTA_' + key] = val

            # append a snapshot per refund; appending `row` itself would leave
            # every entry pointing at the same (last-updated) dict
            all_refunds.append(dict(row))

    result = result.iloc[0:0]  # keep the columns, drop any leftover rows
    result = pd.concat([result, pd.DataFrame(data=all_refunds)])
    result.drop(['REFUNDS'], axis='columns', inplace=True)
    return result
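
Unrelated to the Datasphere error, one subtlety worth noting in the per-refund loop above: appending a dict to a list stores a reference, not a snapshot, so mutating the dict afterwards changes every appended entry. A minimal illustration:

```python
# appending the dict itself: every list entry is the same object
rows = []
row = {"ID": None}
for rec in ["a", "b"]:
    row["ID"] = rec
    rows.append(row)          # same dict object appended twice
print(rows)                   # [{'ID': 'b'}, {'ID': 'b'}]

# appending a copy: each entry keeps the value it had at append time
rows_copied = []
row = {"ID": None}
for rec in ["a", "b"]:
    row["ID"] = rec
    rows_copied.append(dict(row))  # snapshot per iteration
print(rows_copied)                 # [{'ID': 'a'}, {'ID': 'b'}]
```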

Vitaliy-R
Developer Advocate

It does not seem the code uses any of the non-supported statements from https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/73e8ba1a69cd4eeba722b458a2..., so I would suggest trying it in the external interpreter, but with an environment similar to the one used in DSP: Python==3.9, NumPy==1.21.5, Pandas==1.2.5.
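
One way to build such an environment locally (assuming a `python3.9` interpreter is already installed; the venv name is arbitrary) might be:

```shell
# create and activate a virtual environment matching the DSP runtime
python3.9 -m venv dsp-env
source dsp-env/bin/activate

# pin the library versions DSP reportedly uses
pip install numpy==1.21.5 pandas==1.2.5

# confirm the versions before re-running the transform locally
python -c "import numpy, pandas; print(numpy.__version__, pandas.__version__)"
```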

`as_tuple` usually comes from https://docs.python.org/3/library/decimal.html#decimal.Decimal.as_tuple, so it might be raised by one of the other libraries available in DSP's environment, such as Pandas or NumPy.
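
To illustrate: `as_tuple` is a method of `decimal.Decimal`, so this error typically means some code path expected a `Decimal` and received a plain string instead. A minimal reproduction of the exact message (the `exponent` helper is hypothetical):

```python
from decimal import Decimal

def exponent(x):
    # works when x is a Decimal...
    return x.as_tuple().exponent

print(exponent(Decimal("1.23")))  # -2

try:
    exponent("1.23")  # ...but a str has no such method
except AttributeError as e:
    print(e)  # 'str' object has no attribute 'as_tuple'
```

So if a column that some library-internal code treats as `Decimal` actually contains strings (for example after string-based parsing like the loop in the posted script), this error can surface even though the script itself never calls `as_tuple`.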