Introduction:-
Converting an XML file to CSV format is a common task in data integration and processing workflows. SAP Cloud Platform Integration (CPI) is a cloud-based integration service that makes it easy to create data processing pipelines using a variety of components and languages.
In this blog post, I will show you how to use Groovy script in CPI to convert XML to CSV with a comma separator. The script will iterate through all elements in the XML file and extract the element names and values to create a CSV file.
Importance:-
In CPI we already have one standard function to convert the data XML to CSV format but that function doesn't work properly all the time.
In my case, I have one XML input file there have some different types of data:- String("Hebrew" language) data, numeric data, String ("English" language) data and some date functions.
when I convert the data with a standard function the field data are sorted. Therefore, I write a groovy script that helps to convert the data without any issues.
Objective:-
I will use a simple example XML file and show how to write a Groovy script that converts it to a CSV file. I will also show the output file after converting the data.
Input file:-
<root>
<row>
<UserId>100</UserId>
<Name>Lionel Messi</Name>
<DOB>1987-06-24</DOB>
<Phone>0524389823</Phone>
<Email>lm10@gmail.com</Email>
</row>
<row>
<UserId>105</UserId>
<Name>Cristiano Ronaldo</Name>
<DOB>1985-02-05</DOB>
<Phone/>
<Email>cr7@gmail.com</Email>
</row>
<row>
<UserId>115</UserId>
<Name>Kylian Mbappe</Name>
<DOB>1998-12-20</DOB>
<Phone>062-47788354</Phone>
<Email/>
</row>
</root>
Groovy Script:-
import com.sap.gateway.ip.core.customdev.util.Message
import java.nio.charset.StandardCharsets
import java.io.OutputStreamWriter
import groovy.xml.*
def Message processData(Message message) {
def payload = message.getBody(java.lang.String.class)
def root = new XmlParser().parseText(payload)
def csv = new StringWriter()
// Write header row
root.children().first().children().each { field ->
csv.write(field.name())
if (field != root.children().first().children().last()) {
csv.write(',')
}
}
csv.write('\n')
// Write data rows
root.children().each { record ->
record.children().each { field ->
csv.write(field.text())
if (field != record.children().last()) {
csv.write(',')
}
}
csv.write('\n')
}
message.setBody(csv.toString())
return message
}
In this script, we start by importing the necessary libraries for processing the message payload and parsing XML.
The Message class is used to represent the input and output messages in CPI. The java.nio.charset.StandardCharsets and java.io.OutputStreamWriter classes are used to specify the character encoding and output format for the CSV file. The groovy.xml.* classes are used for parsing the input XML file.
We then define a process Data function that takes a Message object as input and returns a modified Message object.
Inside the function, we get the payload of the message as a string and parse it into an XML document using
XmlParser().
The
XmlParser class parses the input XML file and returns a
groovy.util.Node object, which represents the root node of the XML file.Then initializes a
StringWriter object, which is used to store the CSV output.
The script then iterates through the elements in the XML file to extract the element names and values and write them to the CSV output. First, the script writes the header row of the CSV file:
root.children().first().children().each { field ->
csv.write(field.name())
if (field != root.children().first().children().last()) {
csv.write(',')
}
}
csv.write('\n')
This code block uses the children() method to iterate through the child nodes of the root node. The first() method is used to get the first child node, which is assumed to contain the header row of the CSV file. The children() method is called again to iterate through the child nodes of the header row node. For each child node, the element name is extracted using the name() method and written to the CSV output. If the current child node is not the last child node, a comma separator is also written to the CSV output.
The script then writes the data rows of the CSV file:
root.children().each { record ->
record.children().each { field ->
csv.write(field.text())
if (field != record.children().last()) {
csv.write(',')
}
}
csv.write('\n')
}
This code block uses the
children() method to iterate through each child node of the root node. For each child node, the script iterates through its child nodes to extract the element values and write them to the
CSV output. The text() method is used to extract the element value. If the current child node is not the last child node, a comma separator is also written to the CSV output. After all the data rows are written to the CSV output, the
StringWriter object is converted to a
String object and set as the output message body.
Output:-
UserId,Name,DOB,Phone,Email
100,Lionel Messi,1987-06-24,0524389823,lm10@gmail.com
105,Cristiano Ronaldo,1985-02-05,,cr7@gmail.com
115,Kylian Mbappe,1998-12-20,062-47788354,
Conclusion:-
In this blog post, I have shown how to use Groovy script in CPI to convert an XML file to a CSV file with a comma separator. The script uses the
XmlParser class to parse the XML input and extract the element names and values to create the CSV output. With this script, you can easily convert XML files to CSV files in CPI for further processing and analysis.
Note:-
I write this script as a comma separator if your requirement with different separator please change it accordingly.
This script assumes that the XML file has a simple structure where each record is a direct child of the root element and each field is a direct child of its corresponding record element. If your XML file has a more complex structure, you may need to modify the script to handle it correctly.
If you have any question please ask me to feel free.
If you like the blog and it's useful for you kindly share this.