
This blog post is part of a series that dives into various aspects of SAP’s approach to generative AI and its technical underpinnings. In the previous blog post, we discussed some of the engineering best practices we follow at SAP in developing the next generation of AI-enabled applications. Read the first blog post of the series.
SAP is building the SAP Foundation Model, a table-native AI model that accelerates time-to-value for predictive tasks on tabular data. The vision is to create a generic prediction engine that can make predictions out-of-the-box based on tabular data with little or no additional training data. While large language models (LLMs) are useful for generating text, they have limitations when it comes to classification, regression and prediction problems in the business domain, where data is mostly tabular. Indeed, whereas state-of-the-art LLMs perform 20-50% worse than incumbent narrow AI methods in such scenarios, the SAP Foundation Model can match or outperform these traditional approaches while at the same time offering the versatility of foundation models. SAP is investing in this because of its strong history in understanding business needs and its aim to build relevant, reliable and responsible AI to help solve business problems for our customers. We share details on our approach, initial findings and the way forward in this blog post.
Even in today’s leanest companies, too many people are spending too much time on manual business processes. Reducing manual effort in business applications such as enterprise resource planning (ERP) has been an ongoing effort for decades. Whether it’s through process digitization, custom coding, business rules or the application of “narrow” AI models, businesses have seen a steady evolution of their business processes and enterprise application software to achieve more with less.
ChatGPT showed the world what LLMs are capable of and ushered in the age of generative AI. Business leaders everywhere are imagining the potential that models such as OpenAI’s GPT-4, Google Gemini, Anthropic Claude, or Meta’s Llama can have on business processes and applications, and SAP is partnering very closely with many of the creators of these models to tap into this potential for our customers.
State-of-the-art generative AI technologies are particularly strong when it comes to the processing of “unstructured” text and image inputs and outputs. There is a plethora of business applications where text and images play a decisive role, and software vendors like SAP have made great progress in leveraging LLMs in particular to transform business processes based on such unstructured data. We refer to the 2024 Q1 SAP Business AI Release Highlights and the 2024 SAP Business AI Sapphire announcement blog for an overview of SAP’s recent AI announcements.
At the same time, due to the long-term push for automation, many enterprise business processes are heavily centered around data that exists in a structured, directly machine-processible format. These could be tables of an ERP system as well as other database entries in defined schemas. Even though these processes and data play such a crucial role in enterprise back- and front-office activities, they remain relatively untapped by the generative AI revolution so far.
Given its strong pioneering role in digitization and automation of enterprise business processes, SAP has set out to change this and transfer the technological advancements of the generative AI age to these core business processes. We describe the associated challenges, how we are approaching these, our initial experimentation results, and what we are looking forward to below.
SAP has successfully delivered numerous AI solutions and applications on structured business data over the past years. Narrow AI models (traditional task-specific machine learning as opposed to generalizing methods of generative AI) have been the workhorse behind this success, not only for SAP, but across industry. Despite their success, narrow AI models have certain intrinsic limitations that hinder broader adoption for automation. They require relatively high engineering effort, domain area expertise, and data science skills to prepare data and train a suitable model for a specific scenario. Large amounts of standardized and cleansed training data are required to train a high-quality model, and this model is in many cases only effective for the specific company whose data it was trained on.
LLMs have shown that they can overcome some of these limitations effectively. It would be natural to leverage LLMs for automation tasks on structured data. However, at this time, LLMs face some significant limitations for the processing of structured data. LLMs are known to hallucinate, which is unacceptable in many domains, e.g., when processing financial data or inventory numbers. Their context windows may not be large enough to effectively process large tables, and they show limited understanding when it comes to enterprise data models and the complex dependencies between business objects (which are inherently different from the linear nature of text).
As a consequence, we have observed LLMs performing poorly on tabular data compared to traditional narrow AI approaches (see further below for specific results). Techniques like grounding, retrieval-augmented generation (RAG) or text-to-SQL and agentic approaches attempt to bridge the gap to the realm of structured data. These techniques are however somewhat complementary in nature to the prediction and classification tasks that are currently achieved with narrow AI models. While some successful applications of LLMs to (typically smaller-scale) tabular problems have been reported (cf. Fang et al.), other recently published findings offer support to our observations (cf. Bordt et al.; van Breugel and van der Schaar). A broader overview of relevant academic research can also be found in a recent whitepaper by SAP and research partner Merantix Momentum.
To overcome the limitations inherent to LLMs and narrow AI, SAP is building a foundation model for structured business data (as opposed to plain text like in LLM training). This way, we aim to reshape the way AI is done on structured data, much like LLMs have done for text. SAP is in a unique position to develop such a model given its strong footprint in the business enterprise domain. We believe that such a foundation model can offer a competitive edge for enterprises to truly benefit from the breakthroughs in generative AI, helping to automate business processes beyond the processing of texts and images, and laying the foundation for increasingly complex and powerful decision support on the path to becoming an ever-more intelligent enterprise.
A key ingredient that made generative AI models successful in the text and image domains is pre-training models on large amounts of unlabeled data through self-supervised learning. This enables the models to develop a general understanding of the underlying data structures that can then be leveraged when using them for more specific downstream tasks. In this manner, by predicting the next word of a sentence or the next patch in an image for very large amounts of data, impressive general capabilities have emerged in LLMs and vision foundation models. Our aim is to produce similar general capabilities in the structured data domain.
The architecture underlying most foundation models for text and images is the transformer, introduced in Vaswani et al. almost 7 years ago. The general transformer architecture is flexible enough to be directly applicable to the structured data domain. However, a few tweaks are needed compared to the transformers used in LLMs to account for the unique properties of structured data.
Firstly, unlike the words of a sentence, the cells in a table generally have no natural order – the information contained in a table is generally preserved, even if the order of its columns is changed. Instead of being determined by its left-to-right position in a row, the meaning of a cell in a table is therefore determined by its column header. For this reason, we replace the classical positional embeddings encoding word order in LLMs with tailored embeddings adapted to the tabular setting.
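A toy sketch can illustrate this design idea. The embedding functions below are hypothetical stand-ins for what a real model would learn; the point is only that keying each cell by its column name and pooling with an order-invariant sum makes the row representation independent of column order:

```python
import hashlib
import numpy as np

DIM = 8

def name_embedding(name: str, dim: int = DIM) -> np.ndarray:
    # Deterministic pseudo-embedding derived from a string.
    # (Stands in for a learned embedding table in a real model.)
    seed = int.from_bytes(hashlib.sha256(name.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(dim)

def encode_row(row: dict) -> np.ndarray:
    # Each cell = column-name embedding + value embedding; the row
    # representation is a sum over cells, so column order cannot matter.
    return sum(name_embedding(col) + name_embedding(str(val))
               for col, val in row.items())

row = {"plant": "1010", "quantity": 42, "currency": "EUR"}
shuffled = {"currency": "EUR", "plant": "1010", "quantity": 42}
assert np.allclose(encode_row(row), encode_row(shuffled))
```

In a transformer, the same effect is achieved inside self-attention: with column-header embeddings in place of positional ones, permuting the columns permutes the tokens without changing what the model computes.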
Secondly, tables contain a lot of different data types such as numerical, datetime, categorical and free text. Here, the tokenizers used in most LLMs are not suitable for properly capturing the semantics of these different entities. Instead, we develop specific embeddings for the various data types appearing in tabular data.
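As an illustrative sketch (the function names and feature choices below are ours, not SAP's actual design), a type-aware cell encoder might treat numbers, timestamps and categories differently, rather than splitting everything into subword tokens:

```python
import math
from datetime import datetime

import numpy as np

def embed_cell(value) -> np.ndarray:
    """Toy type-aware cell embedding: numbers keep magnitude and sign,
    datetimes get a cyclical encoding, everything else a categorical slot."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        # Signed log scale preserves order of magnitude.
        return np.array([math.copysign(math.log1p(abs(value)), value), 1.0, 0.0, 0.0])
    if isinstance(value, datetime):
        # sin/cos of the month makes December close to January.
        angle = 2 * math.pi * (value.month - 1) / 12
        return np.array([math.sin(angle), math.cos(angle), 1.0, 0.0])
    # Categorical / free text: hashed bucket (a learned lookup in practice).
    return np.array([hash(str(value)) % 97 / 97.0, 0.0, 0.0, 1.0])

dec = embed_cell(datetime(2024, 12, 1))
jan = embed_cell(datetime(2024, 1, 1))
jul = embed_cell(datetime(2024, 7, 1))
# December lands closer to January than to July in this encoding.
assert np.linalg.norm(dec - jan) < np.linalg.norm(dec - jul)
```

The contrast with LLM tokenizers is the motivation here: a subword tokenizer might split "1010" into arbitrary fragments, losing the fact that it is a number close to 1009 and far from 9999.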
Thirdly, whereas generating text in a sequential manner is naturally done by decoder-only transformers, the central prediction challenges pertaining to structured data are more efficiently addressed by an encoder-only architecture. This enables masking the values of an arbitrary collection of cells and having the model predict these values during pre-training. In this manner, we train our model on large amounts of structured business data and have it develop a general understanding of the underlying business processes. This task can be naturally mapped to most downstream scenarios and further enables general table representation learning, which can serve as the foundation for an even larger set of tasks.
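The data side of this masked-cell objective can be sketched in a few lines (a simplified analogue of BERT-style masked-token prediction; the names and masking rate here are illustrative, not SAP's actual pipeline):

```python
import random

MASK = "[MASK]"

def mask_cells(row: dict, mask_rate: float = 0.3, seed: int = 0):
    """Hide a random subset of cells; the hidden values become the
    self-supervised targets the encoder must reconstruct."""
    rng = random.Random(seed)
    inputs, targets = {}, {}
    for col, val in row.items():
        if rng.random() < mask_rate:
            inputs[col], targets[col] = MASK, val
        else:
            inputs[col] = val
    return inputs, targets

row = {"material": "M-100", "plant": "1010", "quantity": 42, "uom": "EA"}
inputs, targets = mask_cells(row)
# Every original cell is either visible in `inputs` or a target, never both.
assert all(inputs[c] == MASK for c in targets)
```

Because any subset of columns can play the role of the "label", the same pre-trained encoder can later be pointed at whichever field a downstream scenario needs to predict.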
The resulting high-level architecture of the SAP Foundation Model for structured data is shown in Figure 1. Structured business data is encoded with domain-specific semantics from SAP’s knowledge graph (a specific blog post on SAP’s knowledge graph activities will follow soon as part of this blog series), resulting in linked business data. Jointly with relevant context data from business systems and other data sources, context-aware embeddings are created, which are fed into a transformer architecture. The weights of the transformer are fitted to a large variety of tabular data in a self-supervised pre-training step. Initially, we focus on ERP data, but other business domains are planned to follow. Based on these ideas plus a few additional design choices, a base model is trained that can then be used for different downstream use cases.
Figure 1: High-level architecture of the SAP Foundation Model
By fine-tuning the base model in a lightweight fashion for individual scenarios, we obtain a model that benefits from an understanding of general business patterns while being carefully tailored to the specific problem at hand. This enables quickly adapting the base model to new tasks in a simple and unified way while offering predictive capabilities that match and sometimes exceed the best specialized narrow AI approaches. This way, the SAP Foundation Model can scale to a broad range of distinct use cases where the cost of implementing a large number of individual AI solutions was previously prohibitive or slowed down realization.
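One common realization of such lightweight adaptation — our illustrative stand-in, not SAP's actual training code — is to freeze the pre-trained encoder and fit only a small task-specific head on top of its representations:

```python
import numpy as np

def frozen_encoder(X: np.ndarray) -> np.ndarray:
    """Stand-in for the pre-trained base model; its weights stay frozen."""
    W = np.random.default_rng(42).standard_normal((X.shape[1], 16)) / 2.0
    return np.tanh(X @ W)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))          # toy tabular features
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # toy downstream label

H = frozen_encoder(X)                      # representations from the frozen base
w = np.zeros(H.shape[1])                   # only this small head is trained
for _ in range(1000):                      # plain gradient descent, logistic loss
    p = 1.0 / (1.0 + np.exp(-H @ w))
    w -= 0.2 * H.T @ (p - y) / len(y)

accuracy = ((H @ w > 0) == (y == 1)).mean()
```

Only the head's 16 parameters are updated, which is why adaptation stays cheap enough to run per scenario on basic infrastructure while still inheriting whatever general patterns the base model has learned.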
Using the model's inherent representation of different business scenarios further opens up the possibility to use it for certain tasks without the need to undertake any fine-tuning at all. This enables out-of-the-box inference even for customers with very little or no data in a given scenario, e.g., due to a recent greenfield migration to SAP, or due to introduction of a disruptive process change. Whereas traditional AI approaches often require significant amounts of training data to accumulate before providing good predictive performance, a business foundation model can therefore provide immediate value.
The model architecture sketched above is focused on traditional prediction/inference scenarios. Such scenarios are a natural first application area of a business foundation model. Still, once trained, this architecture can be adapted to serve other objectives, too, such as supporting more interactive paradigms, or being used in a truly generative fashion. Over time, we expect the SAP Foundation Model to enable a much broader perspective on how to interact with data-centric enterprise applications like ERP systems, similar to how LLMs have drastically widened our horizon for how we can interact with systems on a textual basis.
The SAP Foundation Model relies on domain-specific business data to learn and understand the basic mechanisms of enterprise business processes. Protecting sensitive and confidential data is our top priority and a main guiding principle for every step of the development life cycle, ensuring that all data confidentiality and privacy requirements are met. SAP drives significant investments in differential privacy methods that are combined natively with learning algorithms, as well as federated learning setups and AI security research, to gradually grow the usable data scope. Most importantly, the SAP Foundation Model is trained in a highly secured training environment that adheres to the same security standards as productive systems.
In contrast to LLMs, the SAP Foundation Model for structured data is accessed through fixed input and output schemas. Embedded directly into SAP products, it ensures predictions are made within a predefined context, and outputs data relevant only to the SAP customer using it. For future scenarios requiring flexible interfaces, robust privacy assessment pipelines and reviews ensure strong safeguards remain intact, further adhering to SAP’s Responsible AI framework.
LLMs, even though trained only for text completion, have shown amazing emergent capabilities, from logical reasoning to passing university-level exams. In line with this, even though they are in no way optimized for it, we have seen state-of-the-art LLMs adding significant value for a range of SAP use cases. Still, in the realm of structured business data, when compared to specialized narrow AI methods, we observed LLMs to yield 20-50% lower model performance (depending on the specific scenario, dataset and chosen metric), even after having been carefully optimized for the specific task at hand through meticulous prompt engineering and making optimal use of the available context window. This echoes observations reported elsewhere (cf., e.g., van Breugel and van der Schaar) as to the unsuitability of using LLMs instead of hand-crafted narrow AI models for problems based on structured data.
With this in mind, let us take a look at how an early version of our own SAP Foundation Model for structured data performs in comparison. Unlike large proprietary LLMs, our model can be easily adapted to the problem at hand by quickly fine-tuning it on basic infrastructure. For the datasets from our experimental setting, we therefore included a quick round of fine-tuning and then compared to standard tabular ML methods (which themselves all need to be trained from scratch). Instead of the performance drop observed in the comparison between LLMs and narrow AI models, an early preview of the SAP Foundation Model for structured data performed up to 15% better than the incumbent narrow AI models in the given setting.
Figure 2: Performance comparison of SAP Foundation Model against other approaches
Our experiments thus indicate that dedicated business foundation models can indeed solve some enterprise AI problems based on structured data much better than state-of-the-art LLMs, and perhaps more crucially, they can also take on and replace tailored narrow AI solutions in a realistic setting. A first application domain where we have seen promising results is in the context of Fiori AI-assisted object creation, where users are aided by AI during the creation of complex business objects. The scenario is covered more extensively in this week’s Sapphire announcements, and a dedicated blog on Fiori AI-assisted object creation will follow shortly. In this context, having one foundation model instead of hundreds of individually crafted models per use case and customer is particularly exciting, as it offers tremendous benefits such as more streamlined and simplified adoption, a significantly lower maintenance effort, and much better scaling when implementing new features.
With increasing amounts of data available for the pre-training stage, we expect to require ever smaller amounts of data downstream as a result of the additional signal coming from cross-tenant learning. This will especially benefit customers who recently completed a greenfield migration to an SAP product, meaning they will see immediate value from the SAP Foundation Model without having to wait for enough of their own data to accumulate and for individual models to be trained just for them.
Figure 3: SAP Foundation Model aims to fundamentally change the way AI use cases are created
By pre-training the model on more data, we also aim to gradually reduce the manual effort required for new downstream use cases, particularly when it comes to data preparation and feature engineering. Looking further ahead, we expect business foundation models, like the SAP Foundation Model for structured data, to start showing a latent “understanding” of business objects and their underlying concepts and relationships when pre-trained on sufficient amounts of data, similar to how LLMs develop a latent understanding of the concepts behind words and sentences. As an example, while initially a “Plant 1010” and a “Plant 1020” might just be abstract identifiers to an AI model, a sophisticated business foundation model would have learned relevant cross-dependencies and commonalities to be able to relate these plants to one another, for example by their geography, size, industry, etc. This could in turn lead to even better predictive results with much less data needed for individual applications and customers. At the same time, it could make it possible to tackle much harder problems in the business domain, such as complex analyses or predicting the effects of certain business decisions.
Iterating on this approach, we aspire for the model to gradually develop general capabilities comprehensive enough to be applicable to new, previously unseen prediction problems on the fly without requiring any additional training. Combining these efforts with other solutions, it is our long-term vision that the predictive capabilities of the SAP Foundation Model can also be accessed through human language interaction in a chat-like fashion via SAP’s AI copilot Joule, truly revolutionizing the way our customers interact with and gain insights from their data.
On this journey, we are not alone. SAP is collaborating with academic institutions including Stanford University, Technical University of Munich and the University of California, Berkeley, as well as research partners from the private sector to jointly push the frontier of business AI technology and ensure a steady exchange of ideas to ultimately build the best foundation model on the market for structured business data.
Until we achieve the long-term vision of a largely autonomous, intelligent enterprise, there are still several challenges ahead of us. But given the enormous potential that lies in dedicated business foundation models, we strongly believe that this push is necessary to tap into the full potential of generative AI tech advancements in the context of highly structured business processes.
We have seen first promising results that already the current generation of business foundation models can make a difference for enterprises across industries and lines of business when it comes to massively scaling up AI usage across business processes, ease of consumption of AI features and reduced data needs. We hence look forward to introducing the SAP Foundation Model in SAP business applications. In addition, we highly recommend you watch this week’s SAP Sapphire sessions, where we will share additional aspects of the SAP Foundation Model for structured data and other SAP Business AI announcements.
99 of the 100 largest companies in the world use SAP software to run their businesses. Do you want to work on some of the largest and most valuable datasets in business and transform the business processes with AI to make the world run better and improve people’s lives? Join our team!
Co-authored by Dr. Janick Zaehringer-Frasch, Dr. Sam Thelin, Dr. Marco Spinaci, Dr. Markus Kohler, Ted Way, PhD, Dr. Johannes Hoffart, Mayank Shrivastava, Walter Sun, PhD, and Dr. Philipp Herzig