There has been a lot of background discussion related to Graph Databases. While developers are eagerly anticipating the potential availability of such a feature in SAP HANA, I wanted to prepare a test case to show the limitations of current technologies.
Business Questions
In the SAP ECC system, the Bills Of Materials are maintained in a single-level hierarchy and it is very hard to explore and optimize Materials.
Transaction CS03: Display material BOM - Source: SAP
Some of the questions that are currently challenging to answer:
About Graph Databases
Traditional relational databases store data in tables. For instance, in the case of Material Master and Bills of Materials, the following tables are needed:
In a Graph Database, data is stored as Nodes and relationships are stored as Links or Edges. This is not very appropriate for transaction applications, but much more appropriate for data discovery and analysis.
Graph Databases have their own query language. Cypher is the one we'll be using in this example.
More information about Graph Databases and Cypher can be found O'Reilly's book available for free here: http://neo4j.com/book-graph-databases/
Exporting Nodes and Edges from ECC
The first task is to export the Material Master as Nodes with a simple join on MARA, MARC, and MAKT. You probably want to restrict your dataset by filtering MARC by plant (WERKS). If you have access to an IDES system, Plant "0001" is a good candidate. The result is enclosed in file "graph_node.csv".
Then, for all the materials we have selected, we need to select the corresponding links in the Bill of Materials from tables MAST, STKO, and STPO. The result is enclosed in file "graph_edge.csv".
Importing into Neo4J
Neo4 is one of the leading Graph Databases and offers a free community edition. After installation, a web interface will enable you to interact with the database. By default, the access is http://localhost:7474/
Assuming you are running Neo4J from Windows and have stored the above files under C:/Tmp (update line 3 otherwise), run the following commands:
USING PERIODIC COMMIT 500 LOAD CSV WITH HEADERS FROM 'file:///Tmp/graph_node.csv' AS csvLine FIELDTERMINATOR ',' WITH csvLine MERGE (:Material { plant: csvLine.PLANT, type: csvLine.TYPE, material: csvLine.MATNR, name:csvLine.DESC } )This cypher query creates nodes of type "Material" and fills attributes plant, type, material, and name. It might take a while depending how many nodes you'll be importing and the performance of your machine.
Now, let's import the links with the following command:
USING PERIODIC COMMIT 500 LOAD CSV WITH HEADERS FROM 'file:///Tmp/graph_edge.csv' AS csvLine FIELDTERMINATOR ',' WITH csvLine MATCH (m1:Material { plant: csvLine.PLANT_1, material: csvLine.MATNR_1 } ) MATCH (m2:Material { plant: csvLine.PLANT_2, material: csvLine.MATNR_2 } ) MERGE (m1)-[link:BOM { id: csvLine.BOM } ]->(m2) RETURN m1,m2,linkThe above cypher query finds the nodes using the plant / material keys and creates a link with the BOM Number as attribute.
Business Answers
Show the network of Bill of Materials
MATCH (n) RETURN nThis query finds all nodes and displays them.
Show all Materials from Plant '0001' connected in Bills of Materials
MATCH (n {plant:"0001"})--(m) RETURN n
Bill of Materials as Network - Source: Neo4J
We can easily identify here that the "Apple Cinnamon" and "Cookies, Apple" have all of their ingredients in common. There might be an opportunity here to rationalize production. More interestingly, ingredients "Vitamin C", "Flavoring", and "Sugar Coating" are specific to the "Apple Cinamon" product. From the perspective of product management or purchasing, there might be opportunities for rationalization, recipe change, cost savings or different pricing.
Show all Orphans
MATCH (n) WHERE NOT (n)--() RETURN n;Orphan Materials - Source: Neo4J
The above query identifies candidates of potential useless materials. Transactions CS03 (Display Bill of Materials) and CS15 (BOM Where-Used) confirmed that material 0001 / 2491 doesn't have a Bill of Material. We might need to update our query or confirm with other groups within the organization if this material is indeed unnecessary. For instance, most companies have a way to use status to mark old materials as obsolete. In our case, we could extend the query to filter out MARA where MSTAE = '99' (Obsolete).
Limitations and Potential Solutions
Since this exercise was just a proof of value, I chose to use export files. uwe.fetzer2 wrote a detailed blog on how to connect Neo4J to ABAP:
Neo4a: The Neo4j ABAP Connector
The biggest concern is in the performance of the visualization in the default console. There are multiple alternatives detailed on Ne4J's website, but would require a better architecture: Graph Visualization for Neo4j: Tools, Methods and More
Assuming the performance of the visualization can be solved, what would be a better user experience? First, if the application is used as a data exploration, users would have to learn the cypher language, which is not straightforward for business analysts. In the application is used to solve predefined questions, which ones should they be and how would the application look like?
Conclusion
If HANA gets a Graph Database capability, I believe that it could help solve a number of scenarios that are currently challenging, including the ones described above. However, some of the most critical aspects of this topic might not be limited to a database or a query engine, but to the visualization and interaction.
If you are interested in this topic, please contact me.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
4 | |
4 | |
4 | |
4 | |
3 | |
3 | |
3 | |
3 | |
3 | |
3 |