One of the things that sets SAP Analytics Cloud apart is that you can model your data directly within the application.
Data Preparation
Data preparation, also known as data wrangling, is the first stage of modeling. It's when you clean and transform your data in preparation for analysis.
Models can be created from files imported from your computer or Google Drive, or from data in on-premise and cloud data sources connected through import data connections.
Please note that data preparation and modeling within Analytics Cloud is not possible, or not necessary, for data sources connected via live data connections. This is because live data connections use the existing models in your source systems and are updated with new data in real time.
In this blog post, I am going to take you through the process of preparing your data with the Analytics Cloud Modeler.
I will show you:
- How to check your data quality
- How to define relationships within your data
- How to update your data quickly using quick actions and transformations
- How to create geolocations
Okay! Let's get started with data modeling.
1. In your SAP Analytics Cloud window, expand the Navigation Bar > More > choose Modeler. Before that, take a moment to understand the dataset that you are going to use.
2. Because our dataset is an Excel spreadsheet, we will choose the "import a file from your computer" option.
Choose From a CSV or Excel File
3. Upload the Excel file that you have already downloaded.
4. Make sure to select the first sheet labeled "Data Upload" and click "Import".
Choose Data Upload
5. If your imported file consists of a large number of records, you will see that the data has been sampled.
Data Sampling helps Analytics Cloud run faster during data preparation. The changes you make to this sample will be applied to the entire dataset once you create your model.
Data Sampling was done
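To picture what sampling means here, the following is a minimal Python sketch written outside of Analytics Cloud. It does not show how the product works internally. It only illustrates the idea from step 5: you prepare a small sample, and the same steps are later applied to the full dataset. The column names, the row counts, and the `trim_city` step are all made up for illustration.

```python
import random

# Hypothetical full dataset: 10,000 rows with a text column that needs trimming.
full_dataset = [{"city": f"  City {i % 50} ", "count": i % 7} for i in range(10_000)]

# Work on a small random sample while preparing the data.
random.seed(0)  # fixed seed so the sketch is repeatable
sample = random.sample(full_dataset, k=200)


def trim_city(row):
    """One preparation step: strip stray whitespace from the city name."""
    return {**row, "city": row["city"].strip()}


# Steps are recorded as functions so they can be replayed later.
recorded_steps = [trim_city]


def apply_steps(rows, steps):
    """Apply every recorded step, in order, to every row."""
    result = []
    for row in rows:
        for step in steps:
            row = step(row)
        result.append(row)
    return result


prepared_sample = apply_steps(sample, recorded_steps)      # quick preview
prepared_full = apply_steps(full_dataset, recorded_steps)  # at model creation
```

The preview only touches 200 rows, which is why working on a sample feels faster, while the final result still covers every record.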
6. Once the data sampling is done, click OK. You will then see the data integration workspace of the Modeler.
Data integration workspace of the model
7. You'll notice that there is not an option to save in the toolbar. This is because the process of preparing your data must take place in one session.
8. Once the data has been imported, you will see it organized in a familiar row and column format. This is called "Table View".
9. Data Summary and Model Information: To the right of your data, the details panel displays a data summary and model information. You can use the panel to update your model information and select from some model options. As this model is for analysis only, we will leave the "Planning Enabled" option unchecked.
10. To get a good overview of your dataset, you can switch to Card View (you can find the Card View option in the layout section of the upper navbar).
Card View
Card View: Each card represents a column of data and displays some summary information. When you select a card, detailed information about the column appears in the details panel.
11. Let's start by checking how the columns have been categorized:
11.1 Are the measures and dimensions defined correctly? It's important to check this, otherwise you will face problems in later calculations. For a better understanding, let's see what measures and dimensions are:
Measures: Measures contain quantitative information that can be used for calculations.
Dimensions: Dimensions are qualitative and help in providing context.
Miscategorization is unlikely to happen, since Analytics Cloud recognizes data patterns and is typically able to deduce whether a column of numeric values is a measure or a dimension. Still, it is always good to check once.
For example, in the image above, the information in the Count column represents the number of travelers in each record. This means that the column is a measure, not a dimension, so let's update its type in the details panel.
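If you are wondering why an automatic guess can go wrong, here is a small Python sketch of a very naive rule. It is not the logic Analytics Cloud uses, only an assumption-based illustration. The `guess_column_type` function and the sample columns are hypothetical.

```python
def guess_column_type(values):
    """Guess 'measure' if every non-empty value parses as a number, else 'dimension'."""
    non_empty = [v for v in values if str(v).strip() != ""]
    if not non_empty:
        return "dimension"
    try:
        for v in non_empty:
            float(v)
    except (TypeError, ValueError):
        return "dimension"
    return "measure"


# Hypothetical columns from a travel dataset.
columns = {
    "Country": ["Canada", "France", "Japan"],
    "Count": ["3", "1", "2"],
    "Postal Code": ["10115", "75001", "60601"],
}

guesses = {name: guess_column_type(vals) for name, vals in columns.items()}
```

With this naive rule, "Postal Code" is guessed as a measure because it looks numeric, even though adding up postal codes makes no sense. That is exactly the kind of case where a quick manual check pays off.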
11.2 The details panel is also where you can add dimension attributes such as descriptions, properties, and hierarchies to your existing dimensions. Let's discuss them:
Description: Descriptions give context to dimension columns that are considered IDs.
Properties: Properties are made up of information that is related to dimensions.
Hierarchies: Hierarchies are dimension attributes that create a parent-child relationship. These relationships will allow you to drill down by different levels of details within your charts.
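To see what drilling down through a parent-child relationship means, here is a minimal Python sketch. It is only an illustration under my own assumptions: the Region and Country levels, the records, and the `total_by` helper are hypothetical and are not part of Analytics Cloud.

```python
from collections import defaultdict

# Hypothetical records: Region is the parent level, Country the child level.
records = [
    {"Region": "Europe", "Country": "France", "Count": 4},
    {"Region": "Europe", "Country": "Germany", "Count": 6},
    {"Region": "Asia", "Country": "Japan", "Count": 5},
]


def total_by(rows, levels, measure):
    """Sum a measure grouped by the given hierarchy levels."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] += row[measure]
    return dict(totals)


top_level = total_by(records, ["Region"], "Count")
drilled_down = total_by(records, ["Region", "Country"], "Count")
```

The top level gives one total per region, and drilling down splits each region into its countries while the totals still add up to the same numbers.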
Note: You can also add descriptions to your dimensions later (that is, after the model has been created). But if you are going to do it beforehand, sort the cards from A to Z so that it is easier. Also keep in mind that the number of unique values on the ID card should match the number of unique values on the description card.
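The unique-value check from the note can be sketched in Python like this. This is a check you could run on the source file before importing, not a feature of Analytics Cloud. The airport columns and the `description_mismatches` helper are hypothetical.

```python
def description_mismatches(rows, id_col, desc_col):
    """Return the IDs that map to more than one description."""
    seen = {}
    for row in rows:
        seen.setdefault(row[id_col], set()).add(row[desc_col])
    return {key: sorted(descs) for key, descs in seen.items() if len(descs) > 1}


# Hypothetical ID and description columns.
rows = [
    {"Airport ID": "YVR", "Airport Name": "Vancouver"},
    {"Airport ID": "CDG", "Airport Name": "Paris Charles de Gaulle"},
    {"Airport ID": "YVR", "Airport Name": "Vancouver Intl"},  # inconsistent spelling
]

unique_ids = {r["Airport ID"] for r in rows}
unique_descriptions = {r["Airport Name"] for r in rows}
counts_match = len(unique_ids) == len(unique_descriptions)

problems = description_mismatches(rows, "Airport ID", "Airport Name")
```

Here there are two unique IDs but three unique descriptions, so the counts do not match, and `problems` points to the ID that needs cleaning up.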
11.3 Geolocation: The latitude and longitude coordinates in your dataset will be used to create geolocations.
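Since geolocations depend on clean coordinates, it can help to sanity-check them before importing. The Python sketch below is my own illustration, not something Analytics Cloud provides. It only checks that latitude falls between -90 and 90 and longitude between -180 and 180. The rows and the `is_valid_coordinate` helper are hypothetical.

```python
def is_valid_coordinate(latitude, longitude):
    """True if both values are numbers inside the valid geographic ranges."""
    try:
        lat, lon = float(latitude), float(longitude)
    except (TypeError, ValueError):
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# Hypothetical rows as they might appear in a spreadsheet.
rows = [
    {"City": "Vancouver", "Latitude": "49.28", "Longitude": "-123.12"},
    {"City": "Swapped", "Latitude": "-123.12", "Longitude": "49.28"},
    {"City": "Missing", "Latitude": "", "Longitude": "2.35"},
]

invalid = [
    r["City"] for r in rows
    if not is_valid_coordinate(r["Latitude"], r["Longitude"])
]
```

A common mistake this catches is latitude and longitude being swapped, as in the second row.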
11.4 Another important part of data preparation is cleansing your data and ensuring that only relevant data is included in the model 🧐.
11.5 Selecting a measure card allows you to check the data quality and see the data distribution in the details panel. At this point, make sure all the data makes sense. For example, if you only want to include values greater than $0, you can remove the records with values of $0 or less from your model.
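The same kind of check can be pictured in a few lines of Python. This is only a sketch of the idea with made-up cost values, not what happens inside Analytics Cloud: look at the distribution first, then keep only the values greater than 0.

```python
import statistics

# Hypothetical values of a "Cost" measure.
costs = [120.0, 0.0, 89.5, -15.0, 240.0, 310.0]

# A quick look at the distribution, similar in spirit to the details panel.
summary = {
    "min": min(costs),
    "max": max(costs),
    "median": statistics.median(costs),
}

# Keep only values greater than 0; everything else is removed.
kept = [c for c in costs if c > 0]
removed = [c for c in costs if c <= 0]
```

The minimum of -15.0 in the summary is the hint that something needs cleaning, and the filter then removes the zero and negative records.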
11.6 After all of this, go through the dataset once again and check whether all of the columns are necessary for analysis. If any columns don't contain any values, just delete them using the quick actions.
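If you would like to spot empty columns in the source file before importing, a minimal Python sketch could look like this. The rows and the `empty_columns` helper are hypothetical, and in Analytics Cloud itself you would simply use the quick actions.

```python
def empty_columns(rows):
    """Return the names of columns where every value is None or blank."""
    if not rows:
        return []
    return [
        col for col in rows[0]
        if all(
            row.get(col) is None or str(row.get(col)).strip() == ""
            for row in rows
        )
    ]


# Hypothetical rows: "Notes" never contains a value.
rows = [
    {"City": "Paris", "Notes": "", "Count": 2},
    {"City": "Tokyo", "Notes": None, "Count": 0},
]

to_delete = empty_columns(rows)
cleaned = [{k: v for k, v in row.items() if k not in to_delete} for row in rows]
```

Note that a count of 0 is still a value, so the "Count" column is kept and only "Notes" is dropped.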
11.7 Data Transformation: When it comes to transforming your data, you can choose from the smart transformations suggested by Analytics Cloud or create your own using the transformation bar.
11.8 Now almost everything is ready for your data modeling. Yup! You are excited, right? 😅 Now just click Create Model, available at the bottom right of the details panel.
11.9 Okay! Now your model is created, so just save it. Your model is now ready for analysis.
I hope the above steps will be useful in your data preparation. Please always remember to double-check the measures and dimensions, otherwise you will face problems in the calculation part while building your story. Also spend some time on descriptions, properties, and hierarchies for effective data modeling.
In case you have questions regarding the content you just read, feel free to post your questions in the dedicated tag area for SAP Analytics Cloud:
https://answers.sap.com/questions/ask.html?primaryTagId=67838200100800006884
Also, please follow the tag:
https://answers.sap.com/tags/67838200100800006884 and ensure your communication settings (#communications) are enabled to stay up to date with content in SAP Analytics Cloud.
Check out the topic page for SAP Analytics Cloud, too:
https://community.sap.com/topics/cloud-analytics. Provide your feedback in the comment section. I am looking forward to reading it!
Okay, folks, let's meet in another blog post 😊. Thank you for reading. Bye!