In this post, we’ll dive into the steps for creating a refined Acquired Model and explore how it can enhance your data analysis experience with just ask. While these best practices apply specifically to Acquired Models, we will cover best practices for working with Live Models on top of Datasphere in an upcoming post.
Just ask is the new AI-powered natural language query feature. With just ask, any user can query SAC data models by asking questions in everyday English. It supports SAP Analytics Cloud acquired data models and SAP Datasphere models.
Creating an Acquired Model in SAP Analytics Cloud (SAC) is a powerful way to represent and manage business data. Along with the just ask feature, which allows us to explore the model visually using natural language, this enhances our ability to analyze and interpret data effectively.
If you are not yet familiar with just ask and need an introductory guide, the videos below are a helpful starting point:
or the following blog:
For more in-depth information on Acquired Models, please visit our SAP Help Portal by clicking here:
Once you have created an Acquired Model in SAC, it can be consumed by just ask—a powerful AI feature in SAP Analytics Cloud that enables users to ask questions about their data in natural language. This conversational approach allows you to quickly gain insights without needing to navigate complex datasets.
When using an Acquired Model with just ask:
The main idea behind just ask is that, unlike other LLMs or chat assistants, it does not jump to conclusions when the information you are seeking is missing from the data or is not well modeled. This approach lets just ask provide accurate answers grounded in your data rather than in unfounded assumptions.
Here are a few suggestions for improving your model. Keep in mind that these are not mandatory but are highly recommended, and this list is not exhaustive; it can also be adapted based on the user’s discretion and specific use case.
When managing date-related data in your model, it's essential to streamline the information by keeping only one date column.
Often, files may include multiple date columns with varying levels of granularity, such as daily, monthly, or yearly data. While these different perspectives can be useful, having too many date columns can clutter your model and make analysis more complex. To avoid redundancy, remove any extra date columns and retain the one that provides the most meaningful and explanatory information for your analysis.
The good news is that just ask is already equipped to handle different time granularities such as Month, Year, Quarter, Year-to-Date (YTD), and Fiscal Year (FY). This means that, with the right date column in place, you can leverage just ask to easily apply time-based filters and perform date-driven analysis without needing multiple date columns at different granularities.
By following this best practice, you simplify your model and ensure that it is optimized for efficient data queries and analysis.
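For illustration, here is a minimal sketch of this consolidation done in Python with pandas before uploading the file to SAC; the file and column names are assumptions for the example:

```python
import pandas as pd

# Hypothetical file with redundant date columns at several granularities
df = pd.read_csv("sales_data.csv")  # columns: Date, Month, Year, Revenue, ...

# Keep only the most meaningful date column; monthly, quarterly, and yearly
# views can be derived from it by just ask's built-in time handling.
df = df.drop(columns=["Month", "Year"])

df.to_csv("sales_data_clean.csv", index=False)
```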
When working with data extracted from different data sources, it's common to encounter situations where the Year column is misinterpreted by the system. Instead of being recognized as a date, it may be treated as a measure, ending up stored as an integer or even as a value like "2023.0" with a decimal point.
To fix this, it’s important to clean up the data by first removing the ".0" when identified as a decimal value. This can be easily done by replacing the decimal with an empty string, leaving you with a clean and correct year format.
Once the data is cleaned, make sure to save the Year column as a Date dimension and apply the appropriate format, typically YYYY. This ensures that the year is recognized properly in your model and is ready for time-based analysis.
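A minimal sketch of this cleanup, assuming the file is prepared in pandas before upload and the column is named Year:

```python
import pandas as pd

# Hypothetical file in which the Year column was imported as a float
df = pd.read_csv("sales_data.csv")

# "2023.0" -> "2023": replace the trailing decimal with an empty string
df["Year"] = df["Year"].astype(str).str.replace(r"\.0$", "", regex=True)

df.to_csv("sales_data_clean.csv", index=False)
# In the SAC modeler, the Year column can then be saved as a Date
# dimension with the format YYYY.
```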
If this is the only date-related column available in the model, make sure to retain it, as it will serve as the key date reference for your analysis, even if the granularity is broad.
By following this best practice, you’ll avoid data misinterpretation and ensure that your year date is properly structured for efficient analysis.
When working with SAP Analytics Cloud (SAC) and just ask, there's no need to manually include columns for Year-to-Date (YTD), Previous Year, or similar data points.
Both SAC and just ask natively support queries and filters that handle these time-based calculations automatically. If the correct dates and years are set within your model, you can easily ask for YTD or Previous Year insights without needing dedicated columns for them.
This not only simplifies your model but also ensures more streamlined data management, allowing SAC and just ask to do the heavy lifting when it comes to time-based filtering and comparisons.
Here’s a list of time filters handled by just ask:
Additionally, formats like "first quarter of YYYY" are also acceptable.
When working with data models, it's important to ensure that columns containing IDs or similar identifier values are correctly identified as dimensions, not measures.
For example, an ID column might contain a sequence of n-digit integers, but these numbers are not meant to be treated as numerical values for calculations. Instead, they represent unique identifiers, and therefore should be categorized as dimensions within your model.
By properly identifying ID columns as dimensions, you ensure that your data model remains accurate and functional, making it easier to filter, group, and analyze your data in a meaningful way.
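One way to guard against misidentification during data preparation, sketched here in pandas, is to force the ID column to be read as text; "Customer ID" is a hypothetical column name:

```python
import pandas as pd

# Reading the ID column as text up front keeps the importer from treating
# it as a number, preserves any leading zeros, and lets the SAC modeler
# infer a dimension instead of a measure.
df = pd.read_csv("sales_data.csv", dtype={"Customer ID": str})

df.to_csv("sales_data_clean.csv", index=False)
```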
To optimize the performance of your data models, it's crucial to ensure that each dimension contains unique values and avoid duplicating content across different dimensions.
When the same data is repeated in multiple dimensions, it can unnecessarily increase the size of your model, leading to slower response times during analysis. By keeping each dimension unique, you reduce the model size, which improves both efficiency and performance.
It's also worth noting that repeated values across multiple dimensions can lead just ask to misinterpret the query.
Similarly to the previous point, it's important to avoid repeating the name of a dimension within its own entities. Each category or entity within a dimension should have unique values, without any unnecessary duplication.
This practice not only helps keep your model clean and organized but also significantly improves response times by reducing the overall size of the model. A smaller, more efficient model translates into faster query processing and better overall performance.
A key best practice for optimizing your data model is to limit the presence of redundant columns, especially those that simply concatenate existing values.
Focus on keeping only the essential columns and ensure that any information derived from existing values is managed appropriately to avoid duplication.
When your data contains binary values like "Yes/No" or "1/0", it's a good practice to refactor the column and its values with more descriptive terms.
This adjustment helps improve the clarity and usability of your model, particularly when users query the data. For instance, instead of a binary column labeled "Region" with values like 1 for Europe and 0 for the rest of the world, consider replacing the flag with explicit categories such as "Regions in Europe" and "Regions Outside Europe."
This transformation enhances the quality of the results when users access the content through queries like "give me the sales for regions in Europe" or "show me sales for regions outside Europe." Clear, descriptive column names make data queries more intuitive and effective.
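A minimal pandas sketch of this refactoring, with the column name and labels taken from the example above:

```python
import pandas as pd

df = pd.read_csv("sales_data.csv")

# Replace the opaque 1/0 flag with the descriptive labels used above;
# the column name "Region" is taken from the example and is illustrative.
df["Region"] = df["Region"].map({1: "Regions in Europe", 0: "Regions Outside Europe"})

df.to_csv("sales_data_clean.csv", index=False)
```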
When a column contains many-to-many values, it’s best to split those values into multiple columns to improve clarity and analysis.
Just like with binary cases, splitting many-to-many columns into separate, more specific columns enhances the quality of the results when users query the data. For example, instead of having a single column that lists multiple product categories (like "Food, Beverage, Household"), it's better to break this down into individual columns for each category.
This approach makes it easier to run specific queries, such as "give me the sales for food products," leading to more accurate and focused results. Missing values are allowed, so if certain entries don't apply to every column, the model will still handle it efficiently.
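Here is one possible way to perform the split in pandas, assuming a hypothetical Categories column whose values are separated by ", ":

```python
import pandas as pd

df = pd.read_csv("sales_data.csv")

# One 0/1 indicator column per category found in the multi-valued field,
# assuming values are separated by ", " as in "Food, Beverage, Household"
dummies = df["Categories"].str.get_dummies(sep=", ")

# Store descriptive values instead of bare 1/0 flags (see the tip above);
# rows without a given category are simply left empty.
for cat in dummies.columns:
    df[cat] = dummies[cat].map({1: cat, 0: None})

df = df.drop(columns=["Categories"])
df.to_csv("sales_data_clean.csv", index=False)
```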
When structuring your data, it’s important to remove any currency symbols or units of measure from within the content of your measures.
While these symbols are useful for interpretation, they should not be included directly in the data fields. Instead, both currency and units of measure should be set appropriately in the modeler. This ensures that the data remains clean and consistent, allowing for easier manipulation and analysis.
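A hedged sketch of this cleanup in pandas, assuming a hypothetical Revenue column that mixes currency symbols and thousands separators into its values:

```python
import pandas as pd

df = pd.read_csv("sales_data.csv")

# Strip currency symbols and thousands separators so the measure becomes
# a clean numeric column; the currency itself is then set in the modeler
# rather than stored in the data.
df["Revenue"] = (
    df["Revenue"]
    .astype(str)
    .str.replace(r"[€$,]", "", regex=True)
    .astype(float)
)

df.to_csv("sales_data_clean.csv", index=False)
```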
Detailed guidance on Setting Units of Measure (Static Currencies) can be found here.
Additionally, SAP Analytics Cloud (SAC) offers the ability to set dynamic currencies using table conversion features, providing flexibility for users who need to work with different currencies across regions or reports.
For comprehensive instructions on Table Currencies Conversion (Dynamic Currencies), please refer to the Display Currencies in Tables wiki.
However, at present, just ask supports only static currencies, not dynamic ones, so we highly recommend setting up separate columns with the converted values.
One of the simplest yet most impactful best practices for data modeling is to ensure that column names are clear and understandable.
Avoid using names that contain only numbers or special characters, such as "Date2023_02", "--sales!@", or "2023for&$". These can be confusing and make the model harder to navigate and query. Instead, opt for straightforward, descriptive names that make the purpose of each column immediately obvious to anyone working with the data.
If, for some reason, a column cannot be renamed, make sure to set a synonym or ensure that it is queried exactly as it appears.
Adhering to the above best practices is crucial for achieving high-quality, transparent, and efficient results. These guidelines ensure data is clean and well-prepared, which improves the accuracy and reliability of insights. Following consistent processes also helps reduce biases, leading to fairer outcomes.