After six years working as a data scientist on custom AI use cases, I’ve come to appreciate the power of structured frameworks like CRISP-DM (Cross-Industry Standard Process for Data Mining). Its iterative phases - business understanding, data understanding, data preparation, modeling, evaluation, and deployment - have consistently helped me turn complex AI use cases into real, measurable results. Now, having transitioned into enterprise architecture, I’m discovering how another framework, TOGAF (The Open Group Architecture Framework), brings a new layer of clarity. It offers a strategic lens that complements AI development by aligning it with broader business goals, making the integration and deployment of AI more sustainable and scalable.
To me, TOGAF and CRISP-DM aren’t separate or competing methodologies—they’re complementary. When used together, they create a powerful synergy that bridges strategic business alignment with hands-on AI development. This integrated approach is especially valuable when designing AI architectures that need to stay aligned with business objectives yet remain adaptable to evolving business & technical requirements.
In this blog, I will share some concrete personal insights aimed at helping data scientists and enterprise architects better understand each other - so they can more effectively bring AI solutions into production.
Let’s start with CRISP-DM, shown in the following figure.
CRISP-DM is a structured framework for developing data science projects. It consists of six iterative phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
The process is iterative, allowing teams to refine their approach as they learn more through each phase. It's widely used because it's flexible, industry-agnostic, and focused on delivering actionable results.
TOGAF is a framework for designing, planning, and governing enterprise IT architecture. It helps align technology with business goals through a structured, phased approach called the Architecture Development Method (ADM).
The ADM is the core of TOGAF—a step-by-step process for developing and managing enterprise architecture. It guides projects through phases like vision, business, data, application, and technology architecture, ensuring alignment with business goals and evolving requirements.
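Before we get to the examples, here is a rough sketch in Python of how I personally map the CRISP-DM phases discussed in this post onto ADM phases - my own reading, not an official crosswalk:

```python
# A rough, personal mapping (not an official crosswalk) between CRISP-DM phases
# and the TOGAF ADM phases they most naturally touch, as discussed below.
crisp_dm_to_adm = {
    "Business Understanding": ["B. Business Architecture"],
    "Data Preparation": ["C. Information Systems Architectures (Data & Application)"],
    "Modeling": [
        "D. Technology Architecture",
        "E. Opportunities & Solutions",
        "G. Implementation Governance",
    ],
}

for crisp_phase, adm_phases in crisp_dm_to_adm.items():
    print(f"{crisp_phase}: {', '.join(adm_phases)}")
```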
Let’s look at three concrete examples of how these methodologies relate to each other:
A clear understanding of the business problem a data science project aims to solve is critical for its success - and this begins with a strong business architecture. Defining the scope means asking key questions: Which business units are involved? Which KPIs are we trying to improve? What processes are currently in place to achieve those metrics? These considerations shape essential deliverables like functional and non-functional requirements, a stakeholder map, and process documentation. Tools like SAP LeanIX or SAP Signavio can support this phase by providing visibility into the current business landscape and aligning initiatives with strategic goals.
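To make this tangible, here is a minimal sketch - with purely hypothetical names and values - of how such scope deliverables could be captured in a structured, reviewable form:

```python
# A minimal sketch of business-understanding deliverables as structured data.
# All names, units, and values below are hypothetical, for illustration only.
from dataclasses import dataclass, field

@dataclass
class KPI:
    name: str
    current_value: float
    target_value: float
    owner: str  # business unit accountable for the KPI

@dataclass
class ProjectScope:
    business_units: list[str]
    kpis: list[KPI]
    stakeholders: dict[str, str] = field(default_factory=dict)  # role -> person/team
    processes: list[str] = field(default_factory=list)          # documented as-is processes

scope = ProjectScope(
    business_units=["Sales"],
    kpis=[KPI("Forecast accuracy", current_value=0.72, target_value=0.85, owner="Sales Ops")],
    stakeholders={"Process owner": "Sales Ops lead", "Data owner": "CRM team"},
    processes=["Monthly demand forecasting"],
)
print(scope)
```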
Data preparation goes beyond cleaning and transforming datasets - it’s deeply connected to both data and application architecture. This phase involves evaluating the quality and availability of data across IT systems and deciding on the best approach to make it fit for modeling. For example, should we build ongoing data wrangling pipelines, or address the issues directly at the source? Which stakeholders are responsible for cleaning the data - business or IT? These decisions impact not only the project's technical setup but also its long-term sustainability. Key deliverables include a baseline architecture covering the relevant systems and actors, and a solid requirements management plan. SAP solutions like SAP Business Data Cloud and SAP HANA Cloud play a vital role in supporting these efforts at scale.
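As an illustration, the following sketch (assuming pandas is available; the data and column names are made up) profiles a small extract so that the "fix at the source vs. build a wrangling pipeline" discussion can start from measurable facts:

```python
# A minimal data-quality profile for a source-system extract (illustrative data).
import pandas as pd

def profile_quality(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column quality report: completeness and number of distinct values."""
    return pd.DataFrame({
        "completeness": 1.0 - df.isna().mean(),
        "distinct_values": df.nunique(),
    })

# Illustrative data standing in for an extract from a source system.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 4],
    "amount": [100.0, None, 250.0, 80.0],
    "region": ["EMEA", "EMEA", None, "APAC"],
})

print(profile_quality(orders))
print("Duplicate order_id values:", int(orders.duplicated(subset=["order_id"]).sum()))
```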
The modeling phase sits at the intersection of technology architecture, opportunities and solutions, and implementation governance. A key decision here is whether to use a pre-built, integrated model or to develop a custom AI solution. This involves evaluating model performance, assessing technical feasibility, and identifying infrastructure requirements for development. These considerations shape the overall technology stack and integration strategy. A clear data flow diagram is a crucial deliverable, helping to visualize how data moves through the systems. SAP solutions like Joule, Embedded AI, and the AI Foundation provide robust support for both pre-integrated and custom AI modeling approaches.
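The sketch below (assuming scikit-learn, on synthetic data) illustrates the evaluation side of that decision: whether the candidates are a readily available baseline or a more tailored model, they should be benchmarked on the same data and the same metric before committing to the heavier option:

```python
# A minimal sketch comparing a simple, readily available baseline against a more
# tailored model on the same data and metric (synthetic data, illustrative only).
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

candidates = {
    "baseline (majority class)": DummyClassifier(strategy="most_frequent"),
    "custom (gradient boosting)": GradientBoostingClassifier(random_state=42),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean ROC AUC = {scores.mean():.3f}")
```

In a real project, the "pre-built" candidate would typically be a pre-integrated capability such as Joule or Embedded AI, evaluated against the relevant business KPI rather than a synthetic dataset.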
As we continue to integrate AI into enterprise ecosystems, aligning the business strategy with the AI target architecture is more important than ever. Frameworks like CRISP-DM and TOGAF offer valuable structures to ensure that AI projects are not only technically sound but also strategically aligned. Whether it’s through defining clear problem statements, preparing quality data, or choosing the right modeling approach, each phase plays a crucial role in delivering successful AI solution concepts. By bridging the worlds of data science and enterprise architecture, we can create AI-driven systems that deliver real value, scalability, and long-term impact.
If you are curious to explore more, have a look at the following references:
Enterprise AI Practice — Responsibilities and Deliverables
Data Science and Architecture: Building bridges with CRISP-DM and TOGAF
CRISP-DM: Towards a Standard Process Model for Data Mining
I want to thank Stefan Fassmann and Johannes Euler for their support and insights while crafting this blog post.