Managing cybersecurity risks is challenging in any climate. Doing it in the middle of rapid cloud transformation adds complexity and a need for agility. To better manage the cybersecurity risks associated with the direction set by the company strategy, SAP decided to implement the NIST Cybersecurity Framework (NIST CSF). Recently, Ernst & Young and SAP jointly published Keeping SAP Customers Secure Around the Globe - Taking a Risk-Based Approach to Protect Customer Dat...(EY/SAP), a brochure that describes lessons learnt and initial benefits of SAP's NIST CSF implementation.
What the brochure only hints at, through its references to collaboration and dialogue across the organization, is how this implementation took place against a backdrop of rapid technological and organizational change driven by cloud transformation. In this article, I go deeper into how NIST CSF provides a stable structure to drive continuous improvement in our cloud security posture, while allowing the flexibility and agility that cloud transformation requires, with its ever-changing policies and compliance audit requirements.
Flexible and Globally Applicable
As the EY/SAP brochure says:
The implementation of NIST CSF is a forward-thinking approach helping SAP transition from a compliance and risk assessment mindset into a more adaptive and responsive posture of managing cybersecurity risk to deal with ever-changing cyber threats.
This transition is particularly important as SAP continues to evolve from a software provider to a cloud services provider, experiencing rapid growth in Cloud ERP and customers migrating on-premise SAP systems to the cloud. Therefore, beyond the ever-changing cyber threats we are all presented with at any moment, SAP also assumes new risks now that we increasingly operate the systems that are critical to our customers to conduct business.
SAP implemented NIST CSF version 1.1 as described in Framework for Improving Critical Infrastructure Cybersecurity (NIST CSF v1.1). Version 2.0 is currently in draft form and SAP is actively involved in the feedback round. While version 2.0 is still not explicitly cloud aware, the framework is intended to be flexible enough to accommodate technological changes. As NIST CSF v1.1 states:
The Framework remains effective and supports technical innovation because it is technology neutral, while also referencing a variety of existing standards, guidelines, and practices that evolve with technology. By relying on those global standards, guidelines, and practices developed, managed, and updated by industry, the tools and methods available to achieve the Framework outcomes will scale across borders, acknowledge the global nature of cybersecurity risks, and evolve with technological advances and business requirements.
The flexible nature of the framework and its global applicability are important to SAP. Managing cybersecurity risk during ongoing cloud transformation implies constant change. NIST CSF provides a stable framework that describes the necessary capabilities to manage cybersecurity risk, while the underlying security policies and processes can evolve with technological changes. It also accommodates continuous changes in the global regulatory environment and associated compliance audits, as different jurisdictions around the world adopt similar but varying security regulations, requirements and guidelines. Since SAP offers its cloud solutions everywhere apart from sanctioned countries and territories, and many of our customers are in the public or regulated private sector and are themselves subject to third-party audits, NIST CSF's ability to scale across borders is particularly beneficial.
NIST CSF Overview
NIST CSF v1.1 is a fairly easy read as far as such documents go. In short, NIST CSF is a self-assessment framework that provides a common taxonomy and mechanism for organizations to:
Describe their current cybersecurity posture
Describe their target state for cybersecurity
Identify and prioritize opportunities for improvement within the context of a continuous and repeatable process
Assess progress toward the target state
Communicate among internal and external stakeholders about cybersecurity risk
The communication aspect should not be underestimated in a large, complex organization where it isn't always easy to get everyone on the same page or to agree on who is responsible for what. Such challenges are amplified in the middle of cloud transformation and organizational change. NIST CSF is also very helpful in external communication with customers, partners and auditors globally, many of whom are familiar with the framework or can easily find resources on the NIST website if they aren't.
The framework categorizes cybersecurity functions and capabilities necessary to manage cybersecurity risk appropriate to the business risks the organization is exposed to and at a level of sophistication it is willing and able to bear. Specifically,
[t]he Framework Core consists of five concurrent and continuous Functions—Identify, Protect, Detect, Respond, Recover. When considered together, these Functions provide a high-level, strategic view of the lifecycle of an organization’s management of cybersecurity risk. The Framework Core then identifies underlying key Categories and Subcategories – which are discrete outcomes – for each Function, and matches them with example Informative References such as existing standards, guidelines, and practices for each Subcategory.
The diagram above shows the five Functions and the Category outcomes that are tied to particular activities in each. Each of these breaks down further into Subcategories, which are specific outcomes of technical and/or management activities that support a Category outcome. For instance, the Subcategory PR.DS-2 Data-in-transit is protected is a specific outcome that supports Data Security in the Protect Function (PR.DS).
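The naming scheme can be illustrated with a few lines of Python. This is only a sketch of how an identifier such as PR.DS-2 decomposes into Function, Category and Subcategory number; the two-letter Function codes follow NIST CSF v1.1, while the `Subcategory` class and `parse_subcategory` helper are hypothetical names made up for this example.

```python
from dataclasses import dataclass

# Two-letter Function codes as used in NIST CSF v1.1 identifiers.
FUNCTIONS = {
    "ID": "Identify",
    "PR": "Protect",
    "DE": "Detect",
    "RS": "Respond",
    "RC": "Recover",
}


@dataclass(frozen=True)
class Subcategory:
    function: str  # full Function name, e.g. "Protect"
    category: str  # Category identifier, e.g. "PR.DS"
    number: int    # Subcategory number within the Category, e.g. 2


def parse_subcategory(identifier: str) -> Subcategory:
    """Split an identifier such as 'PR.DS-2' into its parts (hypothetical helper)."""
    category, _, number = identifier.partition("-")
    function_code, _, category_code = category.partition(".")
    if function_code not in FUNCTIONS or not category_code or not number.isdigit():
        raise ValueError(f"not a Subcategory identifier: {identifier!r}")
    return Subcategory(FUNCTIONS[function_code], category, int(number))


print(parse_subcategory("PR.DS-2"))
print(parse_subcategory("DE.CM-8"))
```

Keeping the Category identifier intact (PR.DS rather than just DS) makes it easy to group Subcategories by Category or by Function when reporting on them.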
The ability to assess current and target cybersecurity posture, as well as progress towards the target, comes from the NIST CSF Framework Implementation Tiers. They provide context on how well cybersecurity risk management is integrated into an organization's overall risk management practices. "Tiers describe an increasing degree of rigor and sophistication in cybersecurity risk management practices. [...] Progression to higher Tiers is encouraged when a cost-benefit analysis indicates a feasible and cost-effective reduction of cybersecurity risk." (NIST CSF v1.1) The Tiers describe the level of sophistication in the Risk Management Process, the Integrated Risk Management Program, and External Participation, and thereby allow you to assess where you are currently and select a desired Tier while ensuring that "the selected level meets the organizational goals, is feasible to implement, and reduces cybersecurity risk to critical assets and resources to levels acceptable to the organization." (NIST CSF v1.1)
There are two themes in there I want to highlight.
Investment in cybersecurity functions should be appropriate to the value of the assets that must be protected, the risks the organization is exposed to and the cost of any necessary mitigations to draw the risk down to a level that the business is willing to accept. You wouldn't invest $100 to protect assets worth $100, as it makes more sense to accept an uncertain possibility of a $100 loss instead of a certain $100 cost of investment. Even a $10 investment is hard to justify if there is only a 10% probability of that $100 loss.
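The arithmetic behind that example can be written down as a small expected-loss comparison. This is a deliberately simplified sketch, not a risk model SAP uses: it assumes a single loss event, and that the mitigation removes the risk entirely unless a residual probability is given. The function names are made up for illustration.

```python
def expected_loss(asset_value: float, probability: float) -> float:
    """Expected loss of a single event: value at risk times its probability."""
    return asset_value * probability


def is_worthwhile(cost: float, asset_value: float,
                  p_before: float, p_after: float = 0.0) -> bool:
    """True if the reduction in expected loss exceeds the cost of the mitigation.

    Simplifying assumption: by default the mitigation removes the risk
    completely (p_after = 0.0).
    """
    reduction = expected_loss(asset_value, p_before) - expected_loss(asset_value, p_after)
    return reduction > cost


# $100 certain cost against an uncertain (here: 50%) chance of losing $100.
print(is_worthwhile(cost=100, asset_value=100, p_before=0.5))

# $10 cost against a 10% chance of losing $100: break-even at best.
print(is_worthwhile(cost=10, asset_value=100, p_before=0.1))

# $5 cost against the same 10% chance: the mitigation pays for itself.
print(is_worthwhile(cost=5, asset_value=100, p_before=0.1))
```

The 50% probability in the first call is an arbitrary stand-in for "uncertain"; any probability below 100% gives the same answer for a $100 investment.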
Each organization should set itself a target Tier that is appropriate to the risk it faces and that it can achieve with the resources it has and is willing to invest, including opportunity costs and any necessary organizational change. Setting unachievable targets sets you up for failure and is bound to run into organizational resistance that could also cost you executive support. Avoid chasing rainbows or boiling oceans.
A key component of the Tier criteria is External Participation and the role of information sharing in risk management. In the first "Partial" Tier, the organization neither uses external threat feeds and best practices from others, nor does it share any information externally. Through the remaining Tiers (Risk Informed, Repeatable and Adaptive), the organization increasingly uses external information sources and shares information with others to manage cybersecurity risk.
I see an interesting parallel there with the evolution in cloud security from Shared Responsibility through Shared Fate to Shared Faith, with a similar progression towards allowing customers to manage cybersecurity risks more effectively. In this evolution, we move from a position where each party does their own part, to one where the cloud provider more directly supports the customer to run securely, to one where the cloud provider shares broader information on how its services are operated.
How the Five Functions Support Risk Management
To show how the five NIST CSF Functions work together to reduce cybersecurity risks, let's run through a theoretical critical risk. This could be a ransomware incident, an ignored cryptomining incident or a data breach through a misconfigured storage bucket, for instance. Assuming that we have adequate Asset Management and Risk Management processes (Identify Function) in place, we can plot that risk on a probability/impact matrix. Any protective measures that are in place or planned would (primarily) reduce the probability of security incidents. Such measures at SAP include MFA for cloud administrator accounts and automated remediation of misconfigurations through our guardrails.
Recovery measures, in turn, (primarily) reduce the impact of security incidents. If we have a contemporary CI/CD pipeline in place and can restore backups that haven't been tampered with, then we can quickly recover from even major incidents and restore affected services, thereby reducing any downstream impact the incident has for customers.
The combination of these could already draw down the risk from Critical to Low, depending on the sophistication of the measures in place.
The risk can be further reduced by the combined effect of Detect and Respond Functions through the capability to detect misconfigurations, vulnerabilities or threats and the ability of the organization to respond to such alerts. The effectiveness of both these Functions itself depends on the level of sophistication of Asset Management and the ability to identify the right team responsible for the resources that raised the alert.
It is hopefully clear from this that the five Functions operate together in a mutually reinforcing feedback loop, and that a Swiss Cheese Model approach to Cloud Cyber Resiliency is therefore more effective in managing cybersecurity risk than focusing attention on just one Function while ignoring the others.
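The walk-through above can be sketched as a toy probability/impact matrix. Everything numeric here is an assumption for illustration only: the 1-5 scales, the rating thresholds and the size of each reduction are invented, and how far each Function actually draws a risk down depends on the sophistication of the measures in place.

```python
def clamp(score: int) -> int:
    """Keep a score within the 1-5 scale assumed for this sketch."""
    return max(1, min(5, score))


def rating(probability: int, impact: int) -> str:
    """Map probability x impact to a rating (hypothetical thresholds)."""
    score = probability * impact
    if score >= 20:
        return "Critical"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"


def reduce_risk(risk, d_probability=0, d_impact=0):
    """Return a new (probability, impact) pair after applying a reduction."""
    probability, impact = risk
    return (clamp(probability - d_probability), clamp(impact - d_impact))


inherent = (5, 5)                                   # identified, unmitigated risk
protected = reduce_risk(inherent, d_probability=2)  # Protect: lowers probability
recoverable = reduce_risk(protected, d_impact=2)    # Recover: lowers impact
monitored = reduce_risk(recoverable, d_probability=1, d_impact=1)  # Detect + Respond

for label, risk in [
    ("Inherent", inherent),
    ("+ Protect", protected),
    ("+ Recover", recoverable),
    ("+ Detect/Respond", monitored),
]:
    print(f"{label:<18} {risk} {rating(*risk)}")
```

With these made-up numbers no single Function gets the risk to Low on its own, which is the layered effect the Swiss Cheese Model describes.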
From Control to Compliance Audit
As we have discussed so far, the NIST CSF Categories and Subcategories are outcomes, or capabilities of a certain rigor and sophistication. They don't describe what you must do to achieve them. Moreover, beyond reducing security risks to the company and its customers, SAP also must meet regulatory requirements and pass compliance audits. We need verifiable controls that can be audited and policies the organization can implement.
Helpfully, each Subcategory refers to controls in other standards documents such as NIST SP 800-53 Rev. 4 or ISO/IEC 27001:2013. These references and other relevant compliance regimes determine what you must do. Those controls are often still quite generic and technology neutral (or worse, out-of-date) and need to be turned into more specific policies that define concretely how you are going to do it, that is, what the desired state is within a specific context, so that developer teams understand what is expected of them and compliance with the policy can be verified and tracked. The policy then needs to be communicated to stakeholders and adopted by teams, before we can prove we meet a particular control. Ideally the policy has wide applicability, so that one policy covers as many different compliance requirements globally as possible and teams only have to follow one policy to be compliant with many different applicable standards.
To go back to the previous example, the Subcategory PR.DS-2 Data-in-transit is protected needs to be translated to a control and then to a particular policy. For the public cloud landscape this is defined as requiring TLS 1.2+ (i.e. TLS 1.2 with the weak ciphers removed) as a minimum for cloud services and load balancers. This policy is then implemented by a central services team as an organizational policy in each of the cloud providers. It cascades to all cloud accounts in the global public cloud landscape and can be easily audited.
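To make "easily audited" a little more concrete, here is a minimal sketch of what checking such a policy against an inventory could look like. It is not how the cloud providers' organizational policies work internally; the `Endpoint` record, the endpoint names and the inventory are hypothetical, and the check covers only the minimum protocol version, not the removal of weak ciphers.

```python
from dataclasses import dataclass

# Policy from the text: TLS 1.2 or newer. Cipher suites are not checked here.
MINIMUM_TLS = (1, 2)


@dataclass(frozen=True)
class Endpoint:
    name: str                # hypothetical cloud service or load balancer name
    min_tls: tuple           # lowest TLS version the endpoint accepts, e.g. (1, 2)


def non_compliant(endpoints):
    """Return the names of endpoints that still accept a TLS version below the minimum."""
    return [e.name for e in endpoints if e.min_tls < MINIMUM_TLS]


# Hypothetical inventory; in practice this would come from the providers' APIs.
inventory = [
    Endpoint("lb-eu-1", (1, 2)),
    Endpoint("lb-us-1", (1, 0)),
    Endpoint("api-ap-1", (1, 3)),
]

print(non_compliant(inventory))
```

Representing versions as tuples lets Python's ordinary tuple comparison do the work, so (1, 0) sorts below (1, 2) and only lb-us-1 is reported.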
Another example is (Detect/Security Continuous Monitoring) DE.CM-8 Vulnerability scans are performed. For the desired outcome to be achieved, a tool needs to be selected and mandated, then adopted by the organization to ensure full coverage. The control owner cannot do this alone. It must be decided who selects the tool, who manages adoption and performs the scans, and who pays for it. Budgets and headcount must be allocated and the tool operationalized across the organization before the control can be evidenced and audited.
This requires collaboration across the company: between the security organization and central services teams; platform, infrastructure and application teams; and finance, governance and procurement teams. Therefore the arrows between Policy Definition and Policy Adoption are bi-directional (blue -> right, yellow <- left) to reflect the balancing of cybersecurity risks against operational burden, cost and the ability of the organization to meet a policy requirement. This is manifested for instance in the dialogue that takes place during our weekly Cloud Security Office Hours as described in this previous blog. A policy has to be feasible to be effective. Adoption of a policy is a lot harder when you give teams more work and higher costs than when you find a way to make it easier for them.
Because of changes in the cloud threat landscape and in cloud-native services and technologies, as well as new or expanded central security services, policies have to be agile and can change quite frequently (within weeks or months). Controls, driven by standards bodies and governmental institutions, change much less frequently. The even higher-level NIST CSF has an even more stable cycle: version 1.1 is from 2018, while version 2.0 is scheduled to be released in early 2024. It therefore provides a stable structure to drive continuous improvement in our cloud security posture, while allowing the necessary flexibility and agility for cloud transformation with ever-changing and evolving policies and compliance audit requirements.
None of this is easy. But in these fast-changing times of increasing threats, it is good to have a stable life raft to hold on to, allowing you to assess where you are and what you need to do to get where you want to be in your cybersecurity posture and program. If you would like to explore what NIST CSF can do for your organization, in addition to the EY/SAP brochure, NIST CSF v1.1 and other resources linked throughout this blog, you may find Sounil Yu's Cyber Defense Matrix useful to get started.