on 2021 Oct 07 10:43 AM
Hi Everyone,
I am trying to read a pickle file from the SDL data lake, but I am unable to map it to the input port. It works fine in a Jupyter notebook, but I cannot map the file to the input port of the Python3 operator.
Here is the error:
Here is the Python code:
import pickle
import pandas as pd
import numpy as np
import re
import nltk

SVM = None
text = None

def on_input(data):
    with open(data, 'rb') as data1:
        SVM = pickle.load(data1)
    data1.close()
    api.send("output", str(SVM))

api.set_port_callback("input1", on_input)
I have even tried setting the input port type to byte, but it does not map, so I had to use the string type for the input port. Any thoughts on how to map the pickle file from DI_DATA_LAKE (SDL) to the Python operator?
Thanks
Hi mohammad.safiullah.
Here is my working example, although in my case I read a pickled pandas DataFrame.
The output port of "Read File" operator is `message.file`, so I created a port of type `message` in my Python3 operator. The binary content of the file is in the `body` attribute of the `msg`.
And here is the code of the Python operator:
import io
import pandas, pickle

def on_input(msg):
    f = io.BytesIO(msg.body)
    unpickled_df = pickle.load(f)
    #unpickled_df = pandas.read_pickle(f)
    api.send("outData", str(unpickled_df.columns))

api.set_port_callback("fileContent", on_input)
The pickle is properly loaded as a dataframe (using both `pickle.load(f)` and `pandas.read_pickle(f)`) and the output is correct:
To get my example working, I had to make sure that the versions of Python and pandas that produced the pickle are the same as the versions used in the run-time container of the pipeline, i.e.
python=3.6
pandas=1.0.4
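One way to compare the two environments is to run the same small snippet in the notebook and in the pipeline and compare the results. This is only a sketch: `runtime_versions` is a hypothetical helper name of mine, not part of any operator API, and it uses nothing beyond the standard library and pandas itself.

```python
import pickle
import platform


def runtime_versions():
    """Collect the versions that matter for unpickling, as strings."""
    versions = {
        "python": platform.python_version(),
        # Highest pickle protocol this interpreter can read
        "pickle_highest_protocol": str(pickle.HIGHEST_PROTOCOL),
    }
    try:
        import pandas
        versions["pandas"] = pandas.__version__
    except ImportError:
        versions["pandas"] = "not installed"
    return versions


print(runtime_versions())
```

Inside the Python3 operator the same dictionary could be sent to an output port instead of printed, for example with `api.send("outData", str(runtime_versions()))`, assuming a string port named `outData` as in my example above.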
I hope this helps you.
-Vitaliy
Great, thanks a lot, Mr Witalij Rudnicki.
It's a great learning experience for beginners like us.
I had to learn it yesterday myself first, mohammad.safiullah 😄
Hi mohammad.safiullah ,
To load a model saved in a pickle file in the Python3 operator, you can set the input port type to blob.
Make sure you save the pickle file with the same Python version that your Python3 operator uses.
import pickle

SVM_b = pickle.dumps(SVM)
with open(model_path, 'wb') as f:
    f.write(SVM_b)
Then you can just load the pickle file like this.
import pickle

def on_input(data):
    SVM = pickle.loads(data)
    api.send("output", str(SVM))

api.set_port_callback("input", on_input)
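If you want to check the round trip outside the pipeline first, here is a minimal sketch that runs in any plain Python session. The dictionary `model` is just a hypothetical stand-in for the trained SVM. It also shows that `pickle.loads` on the raw bytes and `pickle.load` on an `io.BytesIO` wrapper give the same object, so either style works once the operator receives the file content as bytes.

```python
import io
import pickle

# Hypothetical stand-in for the trained model object
model = {"kernel": "linear", "C": 1.0}

# Same bytes that would be written to the pickle file
blob = pickle.dumps(model)

# Option 1: unpickle directly from the bytes (blob port style)
from_loads = pickle.loads(blob)

# Option 2: wrap the bytes in a file-like object (message body style)
from_stream = pickle.load(io.BytesIO(blob))

print(from_loads == model, from_stream == model)
```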
Hope this helps.
Regards,