Technology Blogs by Members
Explore a vibrant mix of technical expertise, industry insights, and tech buzz in member blogs covering SAP products, technology, and events. Get in the mix!
cancel
Showing results for 
Search instead for 
Did you mean: 
gho6
Participant
10,521

Introduction:-


Converting an XML file to CSV format is a common task in data integration and processing workflows. SAP Cloud Platform Integration (CPI) is a cloud-based integration service that makes it easy to create data processing pipelines using a variety of components and languages.


In this blog post, I will show you how to use Groovy script in CPI to convert XML to CSV with a comma separator. The script will iterate through all elements in the XML file and extract the element names and values to create a CSV file.



Importance:-


In CPI we already have one standard function to convert the data XML to CSV format but that function doesn't work properly all the time.

In my case, I have one XML input file there have some different types of data:- String("Hebrew" language) data, numeric data, String ("English" language) data and some date functions.

when I convert the data with a standard function the field data are sorted. Therefore, I write a groovy script that helps to convert the data without any issues.

 

Objective:-


I will use a simple example XML file and show how to write a Groovy script that converts it to a CSV file. I will also show the  output file after converting the data.
Input file:-

 
<root>
<row>
<UserId>100</UserId>
<Name>Lionel Messi</Name>
<DOB>1987-06-24</DOB>
<Phone>0524389823</Phone>
<Email>lm10@gmail.com</Email>
</row>
<row>
<UserId>105</UserId>
<Name>Cristiano Ronaldo</Name>
<DOB>1985-02-05</DOB>
<Phone/>
<Email>cr7@gmail.com</Email>
</row>
<row>
<UserId>115</UserId>
<Name>Kylian Mbappe</Name>
<DOB>1998-12-20</DOB>
<Phone>062-47788354</Phone>
<Email/>
</row>
</root>

 

Groovy Script:-

 
import com.sap.gateway.ip.core.customdev.util.Message
import java.nio.charset.StandardCharsets
import java.io.OutputStreamWriter
import groovy.xml.*

def Message processData(Message message) {
def payload = message.getBody(java.lang.String.class)
def root = new XmlParser().parseText(payload)
def csv = new StringWriter()

// Write header row
root.children().first().children().each { field ->
csv.write(field.name())
if (field != root.children().first().children().last()) {
csv.write(',')
}
}
csv.write('\n')

// Write data rows
root.children().each { record ->
record.children().each { field ->
csv.write(field.text())
if (field != record.children().last()) {
csv.write(',')
}
}
csv.write('\n')
}

message.setBody(csv.toString())
return message
}

 

In this script, we start by importing the necessary libraries for processing the message payload and parsing XML.

The Message class is used to represent the input and output messages in CPI. The java.nio.charset.StandardCharsets and java.io.OutputStreamWriter classes are used to specify the character encoding and output format for the CSV file. The groovy.xml.* classes are used for parsing the input XML file.

We then define a process Data function that takes a Message object as input and returns a modified Message object.

Inside the function, we get the payload of the message as a string and parse it into an XML document using XmlParser().

The XmlParser class parses the input XML file and returns a groovy.util.Node object, which represents the root node of the XML file.Then initializes a StringWriter object, which is used to store the CSV output.

The script then iterates through the elements in the XML file to extract the element names and values and write them to the CSV output. First, the script writes the header row of the CSV file:
root.children().first().children().each { field ->
csv.write(field.name())
if (field != root.children().first().children().last()) {
csv.write(',')
}
}
csv.write('\n')

This code block uses the children() method to iterate through the child nodes of the root node. The first() method is used to get the first child node, which is assumed to contain the header row of the CSV file. The children() method is called again to iterate through the child nodes of the header row node. For each child node, the element name is extracted using the name() method and written to the CSV output. If the current child node is not the last child node, a comma separator is also written to the CSV output.

The script then writes the data rows of the CSV file:
root.children().each { record ->
record.children().each { field ->
csv.write(field.text())
if (field != record.children().last()) {
csv.write(',')
}
}
csv.write('\n')
}

This code block uses the children() method to iterate through each child node of the root node. For each child node, the script iterates through its child nodes to extract the element values and write them to the CSV output. The text() method is used to extract the element value. If the current child node is not the last child node, a comma separator is also written to the CSV output. After all the data rows are written to the CSV output, the StringWriter object is converted to a String object and set as the output message body.

 

Output:-
UserId,Name,DOB,Phone,Email
100,Lionel Messi,1987-06-24,0524389823,lm10@gmail.com
105,Cristiano Ronaldo,1985-02-05,,cr7@gmail.com
115,Kylian Mbappe,1998-12-20,062-47788354,

 

Conclusion:-


In this blog post, I have shown how to use Groovy script in CPI to convert an XML file to a CSV file with a comma separator. The script uses the XmlParser class to parse the XML input and extract the element names and values to create the CSV output. With this script, you can easily convert XML files to CSV files in CPI for further processing and analysis.

Note:-


I write this script as a comma separator if your requirement with different separator please change it accordingly.

This script assumes that the XML file has a simple structure where each record is a direct child of the root element and each field is a direct child of its corresponding record element. If your XML file has a more complex structure, you may need to modify the script to handle it correctly.

If you have any question please ask me to feel free.

If you like the blog and it's useful for you kindly share this.
7 Comments
0 Kudos

Hi Chandranath

The purpose of the groovy is good. But, can you show the sample XML where you had actually faced the CSV conversion issue, because the one depicted in the blog is easily convertible to CSV.

Regards

Baskaran

gho6
Participant
0 Kudos
Hi Baskaran,

I am apologize for the late reply.

No, all the time convertor doesn't work properly. I already faced this type issue 2 times. Therefore, I made this script. i share with you one example where i faced this issue.

Example:
<root>
<row>
<ChangeEffectiveOn>03/29/2023</ChangeEffectiveOn>
<EmployeeID>9OB8C</EmployeeID>
<PositionID>W0666</PositionID>
<FirstName>Jet</FirstName>
<LastName>Sih</LastName>
<BirthDate/>
<Gender/>
<TaxIDType>SSN</TaxIDType>
<TaxIDNumber>618578088</TaxIDNumber>
<HireDate/>
<IsPaidByWFN/>
<PayFrequencyCode/>
<SUI_SDITaxJurisdictionCode/>
<WorkedStateTaxCode/>
<Address1Line1/>
<Address1Line2/>
<Address1City/>
<Address1StatePostalCode/>
<Address1ZipCode/>
<Address1UseasLegal/>
<Address1County/>
<Address1Country/>
<Address2Line1/>
<Address2Line2/>
<Address2City/>
<Address2StatePostalCode/>
<Address2ZipCode/>
<Address2County/>
<Address2Country/>
<EEO_Ethnic_Code/>
<Race_ID_method/>
<Middle_Name/>
<PreferredName/>
<Works_From_Home/>
<ActualMaritalStatus/>
<BenefitsEligibilityClass/>
<FTE/>
<EmployeeStatus/>
<EmployeeType/>
<Hiring_Source/>
<RehireDate/>
<LeaveofAbsenceStartDate/>
<LeaveofAbsenceReason/>
<LeaveofAbsenceReturnDate/>
<LeaveofAbsenceReturnReason/>
<TerminationDate/>
<TerminationReason/>
<Voluntary_InvoluntaryTermination/>
<LastDayWorked/>
<EligibleForRehire/>
<BusinessUnit/>
<HomeCostNumber/>
<HomeDepartment/>
<LocationCode/>
<ReportsToPositionID/>
<JobTitle/>
<JobChangeReason/>
<PositionStartDate/>
<EEO_Establishment/>
<EEOCJobCode/>
<FLSACode/>
<JobClass/>
<ManagementPosition/>
<NAICS_WorkersComp_Code/>
<UnionCode/>
<HomePhoneNumber/>
<WorkCellNumber/>
<CompensationChangeReason/>
<RateType/>
<Rate1Amount/>
<Rate2Amount/>
<StandardHours/>
<BasisofPay/>
</row>
</root>

this is my one input file data.

when I use the standard Function the output file data sorted. beacuse, of the null data in the field.

output file:-


output file with standard XML to CSV Convertor


you can check this grey highlight line. After conversion, This employee TaxIdType field data(SSN) move at end of the file.

Then, I use this script instead of standard function it's work perfectly.

output file2:-


converted output file through groovy script


 

Now, you can check TaxIdType value(SSN) coming properly in place not end of the file.
Therefore, i use this groovy script.
If you are facing this type issue you can use this script.

I hope it will help you.

Thank you,
Chandranath Ghosh
philippeaddor
Active Participant
0 Kudos
I wouldn't recommend to do this. Don't rewrite functionality that is readily available in Java libraries. I'm using openCSV for writing (and parsing) CSV files. It gives you different configurable options (like automatically add double quotes around values where needed) and you can rest assured that it works because it is more elaborated than code quickly written by ourselves and because it gets tested by thousands of people who use it every day.

Download the opencsv*.jar file and upload it to the resources tab of the IFlow. Then Import it at the top of the Groovy script. That's it! The class documentation e.g. for the writer is here: https://opencsv.sourceforge.net/apidocs/com/opencsv/CSVWriter.html

Best regards,

Philippe
codylpatterson
Explorer
0 Kudos
This is amazing! Thank you. It was much easier to use this than the built-in converter. One question, can it be modified to put quotes around the text?
gho6
Participant
0 Kudos
yes, you can modify it accordingly.
codylpatterson
Explorer
Thank you for the response. I'm new to Groovy, what modifications need to be made to have quotes around the text? Thank you.
jamaic
Explorer
0 Kudos
Hi Chandranath,

Your explanation here is so true. I do relate to it alot. I had the same experience using the standard XML to CSV converter and my output was totally messed up. Columns were moved from one location to another location. I was really mad and frustrated and didn't know what to do.

Based on your example here, I now know how to fix this problem of Columns being moved around. Thanks alot for sharing.

Regards,

Chuo
Labels in this area