URGENT: Error in Data Flow (Python): 'str' object has no attribute 'as_tuple'

mitko1994 · 2024 May 29

Hello,

I have a Python script running inside a Data Flow that results in the following error:

[Screenshot: Python error.png]

Since it's impossible to debug inside Datasphere, I copied the source code to another interpreter and loaded my dataset from a CSV file to test it. There the code runs fine and produces the expected result. Any ideas what might be causing the issue? This is a showstopper for what I'm doing right now, so any input would be greatly appreciated.

Thanks,
Dimitar
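
The offline test described above can be sketched roughly as follows. This is a minimal illustration, not the script from the post: the `transform` stand-in, the `run_local` helper, and the sample CSV text are hypothetical. Note that `pd.read_csv` infers column dtypes from the file, so the frame it builds may not have the same dtypes as the frame a Data Flow passes to the script. Printing the dtypes is one way to make such a difference visible.

```python
import io

import pandas as pd


def transform(data):
    # Hypothetical stand-in for the real transform(); it only upper-cases column names.
    return data.rename(str.upper, axis="columns")


# Hypothetical sample standing in for the exported CSV file.
SAMPLE_CSV = "order_line_id,amount,refunds\n1,10.50,abc\n2,3.20,\n"


def run_local(csv_source):
    """Load a CSV the way an offline test might and run transform() on it."""
    data = pd.read_csv(csv_source)
    result = transform(data)
    # Printing the dtypes makes differences from the Data Flow input easier to spot.
    print(data.dtypes)
    print(result.dtypes)
    return result


result = run_local(io.StringIO(SAMPLE_CSV))
```

In a real test, a file path would replace the in-memory `io.StringIO` buffer.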

mitko1994 · 2024 May 31

Here is the code

def transform(data):
    """
    This function body should contain all the desired transformations on incoming DataFrame. Permitted builtin functions
    as well as permitted NumPy and Pandas objects and functions are available inside this function.
    Permitted NumPy and Pandas objects and functions can be used with aliases 'np' and 'pd' respectively.
    This function executes in a sandbox mode. Please refer the documentation for permitted objects and functions. Using
    any restricted functions or objects would cause an internal exception and result in a pipeline failure.
    Any code outside this function body will not be executed and inclusion of such code is discouraged.
    :param data: Pandas DataFrame
    :return: Pandas DataFrame
    """

    # keep only rows that have refund data, then upper-case the column names
    data = data[data['refunds'].notnull()]
    data = data.rename(str.upper, axis='columns')


    result = pd.DataFrame(
        data=data,
        columns=[
            'ORDER_LINE_ID', 'ID', 'AMOUNT', 'COMMISSION_AMOUNT', 'COMMISSION_VAT_AMOUNT', 'QUANTITY', 
            'DATE_WAITING_REFUND', 'DATE_WAITING_REFUND_PAYMENT', 'DATE_REFUNDED', 'STATE', 'TRANSACTION_DATE', 
            'TRANSACTION_NUMBER', 'REASON_CODE', 'REASON_LABEL', 'PRICE_TAXES', 'SHIPPING_PRICE_TAXES', 
            'CTA_AMOUNT', 'CTA_CODE', 'CTA_RATE', 'RFD_COUNT', 'CURRENCY_ISO_CODE', 'DATE_LOADED','REFUNDS']
    )

    all_refunds=[]
    for idx in result.index:
        row = result.loc[idx].to_dict()
  
        tmp = row['REFUNDS'].replace('"', '').split("{id")
        refunds = list(filter(lambda x: len(x) > 1, tmp))
        #get the number of refunds for each order id
        row['RFD_COUNT'] = len(refunds)
        for rec in refunds:
            # remove all double quotes.
            rec = rec.replace('"', "")

            # split off the commission taxes
            tax_split = rec.split("commission_taxes:[")

            # split off the key/value pairs after the commission taxes
            tmp = tax_split[1].split("],")
            com_taxes = tmp[0][1:-1] #remove the json braces {}
            # rebuild the refund key/val pairs after splitting off the commission taxes
            refund = "id" + tax_split[0] + tmp[1]
            # remove all occurrences of: [,],{,}
            refund = refund.replace("[", "").replace("]", "").replace("{", "").replace("}", "")
            # drop the extra comma on the record if it's not the last refund for this order
            if refund[-1] == ",":
                refund = refund[:-1]
                
            for key_val in refund.split(","):
               tmp = key_val.split(":")
               key = tmp[0].upper()
               if key in result.columns:
                   val = tmp[1]
                   row[key] = val
                   
            for key_val in com_taxes.split(","):
               tmp = key_val.split(":")
               key = tmp[0].upper()
               val = tmp[1]
               row['CTA_'+key] = val
           

            # append a copy so later refunds of the same order do not overwrite this record
            all_refunds.append(row.copy())
            

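    # start from an empty frame with the same columns (iloc is positional; loc[0:0] would keep a row labelled 0)
    result = result.iloc[0:0]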
    result = pd.concat([result,pd.DataFrame(data=all_refunds)])
    result.drop(['REFUNDS'],axis='columns', inplace = True)
    return result
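
One possible reading of the error, offered as an assumption rather than a confirmed diagnosis: `as_tuple` is a method of Python's `decimal.Decimal`, and the loop in the code stores the pieces returned by `split` as plain strings in columns such as AMOUNT. If the Data Flow expects decimal values in those columns, a string would fail with exactly this message. The sketch below reproduces the message locally and shows one way to convert such columns with `pd.to_numeric`. The sample frame and the `numeric_columns` list are illustrative. Whether a numeric dtype is what the target columns need, and whether `pd.to_numeric` is permitted in the sandbox, has not been verified against Datasphere.

```python
from decimal import Decimal

import pandas as pd

# Decimal values expose as_tuple(); plain strings do not.
Decimal("10.50").as_tuple()
try:
    "10.50".as_tuple()
except AttributeError as exc:
    message = str(exc)
    print(message)

# Illustrative frame: values parsed with split() arrive as strings.
frame = pd.DataFrame({"AMOUNT": ["10.50", "3.20"], "STATE": ["REFUNDED", "REFUNDED"]})

# Assumed list of columns that should hold numbers.
numeric_columns = ["AMOUNT"]
for column in numeric_columns:
    # errors="coerce" turns unparseable strings into NaN instead of raising.
    frame[column] = pd.to_numeric(frame[column], errors="coerce")

print(frame.dtypes)
```

The same conversion could be applied to the final `result` frame just before it is returned, restricted to the columns that are numeric in the target table.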
