Survey programs such as SAP’s #Unfiltered are a fantastic way to better understand employee views and to create insights for the business from thousands of feedback responses. For example, in April this year more than 80,000 of SAP’s employees provided feedback on a variety of topics, ranging from engagement to health and well-being to learning and development opportunities. Traditionally, a lot of survey data is collected on items phrased as questions or statements that are then rated on so-called Likert scales. The great advantage of Likert items is that they are easily analyzed: the responses are captured on a numerical scale, typically 1-5, and with those numerical indices we can do a lot, ranging from simple descriptive statistics (e.g., averages or percentage favorable) to advanced analytics (e.g., regression models or forecasts). But sometimes this way of approaching experience data isn’t enough!
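To make the two simplest Likert summaries mentioned above concrete, here is a toy illustration with invented numbers (not actual #Unfiltered data): the average rating and the percentage favorable, i.e., the share of responses of 4 or 5 on a 1-5 scale.

```python
# Invented Likert responses on a 1-5 agreement scale.
responses = [5, 4, 4, 3, 2, 5, 4, 1, 3, 5]

# Average rating across all respondents.
average = sum(responses) / len(responses)

# "Percentage favorable": share of respondents answering 4 or 5.
pct_favorable = 100 * sum(1 for r in responses if r >= 4) / len(responses)

print(f"Average: {average:.1f}")             # Average: 3.6
print(f"% favorable: {pct_favorable:.0f}%")  # % favorable: 60%
```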

The Likert scale approach


The need for discovery

Relying solely on Likert-type data gathering can lead to a very narrow and skewed view of employee opinions, ideas, and sentiments, because it assumes we have perfect top-down knowledge of the range of potential topics worth inquiring about, and that all that is left to do is convert agreement on each topic into a number between 1 and 5. But we live in fast and tumultuous times, and sometimes the “how much do you agree” approach of Likert items isn’t enough to understand the topics occupying the minds of the business; sometimes we need to go into discovery mode instead! We can do this the same way we would in a conversation: we ask an open-ended question. This approach, typically called qualitative (in contrast to our quantitative Likert items), can help us discover a lot more about the views of employees, but it brings its own drawbacks. For example, in the April #Unfiltered we received almost 60,000 comments in response to the two open questions we asked, making it impossible for our small team to manually extract insights from each comment.

Asking open questions to shift into discovery mode

Introducing text mining to our continuous listening strategy – a little case study

So, this is where text mining comes into the picture. The world of data science has increasingly embraced the fact that not all data come in tidy rectangular frames of numbers; they are often quite verbose. Fortunately, the field of natural language processing, which sits at the interface of linguistics, computer science, and machine learning, has opened several avenues for processing such data in a manageable fashion. Here we’ll take a quick look at an example of one such analysis we did.

  1. Asking our employees about their hybrid work needs

With the recent SARS-CoV-2 pandemic, the world saw an unprecedented move to remote work. SAP took a particularly proactive approach to this challenge and implemented a holistic framework, the Pledge to Flex program, to enable hybrid work outside pandemic times as well. In the summer of this year, a time when most countries had abandoned social distancing measures and regulatory prescriptions, we ran a Future of Work Pulse to ask our employees how hybrid work is going. An obvious concern in this circumstance is the question of resources: do employees have everything they need to work in the hybrid world? Well, we asked that with the Likert-rated item “I have all of the resources that I need to successfully transition into a hybrid working set-up.” But what do we do about people who say they don’t have the necessary resources? In normal conversation you would simply ask: “What resources are you currently lacking?” And that’s exactly what we did.

How text mining helped us discover the hot topics

When asking that question, we got hundreds of responses from employees across the globe. We used a machine learning method called Latent Dirichlet Allocation (LDA) for the open-comment analysis. LDA is a generative statistical model commonly employed in topic modelling and natural language processing, which aims to discover topics within a collection of documents. It treats each document (in this case, each open comment submitted in response to our question about what resources are lacking) as a mixture of topics, and each topic as a mixture of individual words. Documents may therefore overlap in content, mirroring the typical use of natural language. Our analysis suggested that the answers fall into three broad categories:

  1. Guidance on Pledge to Flex implementation

  2. Equipment and workspaces

  3. Internal travel and bringing teams together

Using text mining to find common topics amongst open comments


Making sense of the discoveries

Obviously, it still takes a good old human brain to make sense of the initial output and to gain real depth of understanding. The model provides the user with the most common terms associated with each topic (though a term may pop up in several topics, e.g., “hybrid”). But one particularly cool feature of LDA is that it estimates the probability that a particular comment was “generated” by a particular topic. This means it is possible to sort the comments by their probability of belonging to a topic, look at the top comments for each topic, and thereby make more sense of what the LDA believes is happening: basically, an easy way to get the comments most closely related to a topic. Alternatively, we can look at which topics appear most frequently, or see how positive or negative the emotional tone of the comments within each topic is.
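The sorting trick just described amounts to a single argsort over the document-topic matrix that a fitted LDA returns. In this sketch, `doc_topic` stands in for that matrix (rows are comments, columns are P(topic | comment)); the comments and probabilities are purely illustrative.

```python
import numpy as np

comments = ["comment A", "comment B", "comment C", "comment D"]

# Illustrative document-topic probabilities from a hypothetical LDA fit.
doc_topic = np.array([
    [0.80, 0.10, 0.10],  # comment A: mostly topic 0
    [0.15, 0.70, 0.15],  # comment B: mostly topic 1
    [0.60, 0.20, 0.20],  # comment C: leaning topic 0
    [0.05, 0.05, 0.90],  # comment D: mostly topic 2
])

topic = 0
# Comment indices sorted by P(topic 0), most representative first.
order = np.argsort(doc_topic[:, topic])[::-1]
top_comments = [comments[i] for i in order[:2]]
print(top_comments)  # ['comment A', 'comment C']
```

Reading just these top-ranked comments per topic is usually enough for a human to give each topic a meaningful label.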

Bringing it together

Text mining and topic modelling methods can be a great way to make sense of open survey comments more quickly. They don’t take away all the work, and ultimately the comments still need to be read carefully to really understand what’s going on, but text mining methods provide a great entrance into understanding employee needs and arriving at a bird’s-eye view more quickly and more objectively than assigning topic categories completely manually. Most importantly, they allow for more open questions in surveys, in the knowledge that even if comments number in the thousands, we can do them justice. We can discover what drives employees, what worries they have, and what actions we can take based on those insights to best support them. We can ask them the same way you would ask a colleague, but in the Future of Work we can listen to the responses of thousands of colleagues at the same time.