Modern web applications often rely on third-party services to provide key functionality to users, turning traditional web applications into
multi-party web applications. Take, for example, an online shop that integrates with PayPal to offer its users a smooth payment experience.

Figure 1 - Multi-party scenarios
The online shop is referred to as the Relying Party (RP) and PayPal as the Trusted Third Party (TTP). By outsourcing payment to a payment service provider like PayPal, the online shop does not have to deal with difficult-to-implement payment operations, including compliance with legal requirements on credit cards (e.g., the Payment Card Industry Data Security Standard, PCI DSS). Another example of a multi-party scenario is authentication. Many web applications (RPs) allow their users to authenticate through the Single Sign-On services offered by identity providers such as Facebook, Google or Twitter (TTPs).
Unfortunately, a rich body of research has shown that the secure integration of third-party services is a non-trivial task. Vulnerabilities may arise from errors in the protocol specification underlying these services, incorrect implementation practices at the RP, and subtle bugs in the integration APIs provided by the TTP. Most of these vulnerabilities belong to the class of business logic vulnerabilities, also known as logic flaws.
A business logic vulnerability “is usually one of the hardest to detect, and one of the most detrimental to the application, if exploited.”
OWASP Web Security Testing Guide
In a logic flaw attack, the attacker legitimately obtains some key information in one protocol session and then misuses it in another session, for example to take over a victim’s account or to shop for free. In the example below, the attacker Malice buys Product1 from MyShop, but she uses the shop identifier PayeeId2 (obtained in an earlier session) to trick PayPal into paying KittyShop rather than MyShop. This attack was possible in an old version of PayPal Express Checkout.

Figure 2 - Example of logic flaw in old version of PayPal Express Checkout
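The flaw in Figure 2 comes down to a missing binding between the shop's order and the merchant that actually receives the money. The Python sketch below illustrates the kind of check that closes the gap. It is only an illustration: the `CheckoutSession` class, its fields, the `confirm_payment` function and the identifier `PayeeId1` for MyShop are hypothetical names introduced here, not part of any PayPal API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckoutSession:
    """What the shop recorded when the order was created (hypothetical model)."""
    order_id: str
    expected_payee_id: str
    amount_cents: int


def confirm_payment(session: CheckoutSession, paid_payee_id: str, paid_amount_cents: int) -> bool:
    """Accept the order only if the payment went to this shop, for the right amount."""
    if paid_payee_id != session.expected_payee_id:
        return False  # the payment was redirected to another merchant
    if paid_amount_cents != session.amount_cents:
        return False  # the amount does not match the order
    return True


# MyShop's own identifier is assumed to be "PayeeId1" for the sake of the example.
session = CheckoutSession(order_id="order-1", expected_payee_id="PayeeId1", amount_cents=1999)
print(confirm_payment(session, "PayeeId1", 1999))  # legitimate run
print(confirm_payment(session, "PayeeId2", 1999))  # Malice swapped in KittyShop's identifier
```

The point of the sketch is that the value to compare against comes from the shop's own record of the session, never from a parameter the attacker can replay.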
The automated detection of logic flaws is extremely hard, and it is normally out of scope for the standard static and dynamic analysis tools on the market.
Many research approaches have been proposed to fight these vulnerabilities, ranging from formal analysis and model-based testing to runtime monitoring. At SAP Security Research, in collaboration with key research institutes, we have thoroughly explored the first two approaches, contributing to the rich literature in this area, for instance:
- [1]: detected a logic flaw in the SAML-based Google Apps via model-checking (2008)
- [2]: identified a logic flaw in the SAML standard protocol specification (2013)
- [3]: devised a dynamic approach to detect logic flaws from HTTP traffic traces (2016)
Recently, in collaboration with
Ca' Foscari University of Venice, we have also turned our attention to runtime monitoring, to defend potentially vulnerable multi-party web applications from logic flaws while they execute.
The runtime soldiers
Trusted third parties play a fundamental role in multi-party scenarios. Very often, they are operated by large companies that can afford strong security practices and thus deliver high-quality solutions. This explains why most of the reported vulnerabilities lie at the RP side.
Now let us put ourselves in the shoes of a company delivering TTP services. Although TTPs normally release artifacts such as software development kits (SDKs) to facilitate the integration steps that must be implemented at the RP side, this process is still far from error-free. Wouldn’t it be great if, as a TTP, we could complement these artifacts with runtime monitors that act as soldiers, protecting any RP that uses the protocol from logic flaws? In this way, even if an error is made at the RP side during the integration, the soldier would prevent attackers from exploiting that defect, keeping the entire scenario secure, in line with the runtime application self-protection (RASP) paradigm. Of course, these soldiers should be easy to deploy and should not impact the user experience.
The Bulwark prototype, developed by SAP Security Research in collaboration with Ca' Foscari University of Venice, proposes a novel approach that generates runtime monitors and meets the aforementioned requirements. This approach is presented in the scientific paper
Bulwark: Holistic and Verified Security Monitoring of Web Protocols, published at the
25th European Symposium on Research in Computer Security (ESORICS 2020).
Bulwark is currently proprietary software at SAP: the prototype could be made available upon request and an open-source license is under consideration.
Bulwark on an example
The Bulwark approach comprises three main phases and is shown in the figure below. We briefly walk through the approach on a concrete example. For the complete technical details, the interested reader can refer to our paper [4].

Figure 3 - Bulwark approach
Our example is illustrated in the protocol message sequence chart below: MyShop is an RP that wants to integrate with the Facebook identity provider (the TTP) to allow Facebook users like Alice to authenticate smoothly at MyShop. The integration is done with the OAuth2 protocol, which MyShop implements using the SDK and guidelines provided by Facebook.

Figure 4 - Example: MyShop integrating the Facebook OAuth 2.0 explicit mode
Unfortunately, MyShop makes a mistake: its implementation does not check that the "state" parameter received at step 7 is the same one that was generated for step 2. MyShop is thus vulnerable to a session swapping attack, which enables an attacker to track any user’s activities at MyShop.
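To make the missing check concrete, here is a minimal Python sketch of what a correct RP would do at steps 2 and 7. It is not MyShop's or Facebook's code: the function names, the `oauth_state` key and the use of a plain dictionary as the per-user session store are assumptions made for illustration.

```python
import hmac
import secrets


def new_state(session: dict) -> str:
    """Step 2: generate an unguessable state value and bind it to the user's session."""
    state = secrets.token_urlsafe(32)
    session["oauth_state"] = state
    return state


def state_is_valid(session: dict, received_state) -> bool:
    """Step 7: accept the callback only if it carries the state issued for this session."""
    expected = session.pop("oauth_state", None)  # each state value is usable once
    if expected is None or received_state is None:
        return False
    # Constant-time comparison on bytes, so non-ASCII input cannot raise an error.
    return hmac.compare_digest(expected.encode(), received_state.encode())


alice_session = {}
issued = new_state(alice_session)
print(state_is_valid(alice_session, issued))             # callback of Alice's own run
print(state_is_valid(alice_session, "attacker-chosen"))  # callback injected by an attacker
```

An RP that skips `state_is_valid` accepts any authorization response, including one that an attacker obtained in their own session and injected into the victim's browser.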
Now let’s see how the TTP Facebook could use Bulwark to provide MyShop with a runtime soldier that prevents this vulnerability and other similar logic flaws.
Phase 1 - Ideal specification: the team at the TTP (Facebook) writes a formal specification of the OAuth2 protocol, specifying what each protocol participant does and which security properties the protocol must satisfy. While writing formal specifications can be hard, this task can be simplified with a graphical notation tool, similar to the approach we devised in [5]. Moreover, the formal specification of this Facebook example is portable without any modification to other TTPs like VK and Google, meaning that different TTPs supporting the OAuth 2.0 explicit protocol can use our approach straightaway. The formal specification is then automatically verified for security violations using
ProVerif, a state-of-the-art protocol verification tool developed at INRIA and integrated into Bulwark. Once the formal specification is successfully verified, it becomes the ‘ideal specification’ and can be used as input for the next phases.
Phase 2 – Monitored Specification: in contrast to the ideal specification, the implementation of the RP may lack some important security checks, as in our MyShop example. From the ideal specification, Bulwark automatically creates a “monitored specification” that still complies with the protocol, but in which the participants are inattentive and forget relevant security checks, and in which abstract monitors are introduced to enforce those checks. In short, these abstract security monitors perform the security checks that inattentive participants may have skipped.
Phase 3 – Monitor Generation: finally, Bulwark translates the abstract runtime monitors into real service workers (written in JavaScript) and real server-side proxies (written in Python, though other languages could be supported). This is a relatively direct one-to-one translation, whose key challenge is mapping the abstract messages in the ideal specification to the real HTTP messages exchanged in the web protocol. This mapping is done with a configuration file, which drives the monitor generation process by defining the concrete values of the symbols and data constructors that are used by the ideal specification. For instance, the abstract parameter “reduri” could correspond to the concrete URI value “https://myshop.com/rpservice/”.
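The idea of mapping abstract messages to concrete HTTP messages can be sketched in a few lines of Python. The actual format of Bulwark's configuration file is not shown in this post, so the `CONFIG` dictionary, its keys and the `to_abstract` helper below are purely hypothetical. Only the “reduri” value is taken from the example above, and `code` and `state` are the standard OAuth 2.0 query parameters.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical mapping; this is NOT the real Bulwark configuration format.
CONFIG = {
    # abstract symbol -> concrete value
    "symbols": {"reduri": "https://myshop.com/rpservice/"},
    # abstract field name -> query parameter name in the real HTTP message
    "fields": {"code": "code", "state": "state"},
}


def to_abstract(url: str, config: dict = CONFIG):
    """Map a concrete callback URL to abstract message fields.

    Returns None when the URL does not target the configured redirect URI,
    i.e. when the request is not a protocol message the monitor cares about.
    """
    parts = urlsplit(url)
    endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
    if endpoint != config["symbols"]["reduri"]:
        return None
    query = parse_qs(parts.query)
    return {name: query.get(param, [None])[0] for name, param in config["fields"].items()}


print(to_abstract("https://myshop.com/rpservice/?code=abc&state=xyz"))
# {'code': 'abc', 'state': 'xyz'}
```

Keeping the concrete values in a separate mapping is what would let the same verified specification be reused across deployments that differ only in URLs and parameter names.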
Figure 5 below shows how the situation described in Figure 4 changes: using Bulwark, Facebook has generated a monitor M(RP) that, among other security checks, ensures that the state parameter is properly saved and then checked, which prevents the session swapping vulnerability.

Figure 5 - Monitor example
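As a rough intuition for what M(RP) does, the Python sketch below plays the role of a proxy sitting in front of an unmodified RP: it observes the redirect sent to the TTP, remembers the state value per browser session, and lets the callback through only if the same value comes back. This is a toy stand-in written for this post, not code generated by Bulwark. The `StateMonitor` class, its method names, the session identifiers and the `ttp.example` URL are all assumptions.

```python
from urllib.parse import parse_qs, urlsplit


def _state_of(url: str):
    """Extract the 'state' query parameter from a URL, or None if absent."""
    return parse_qs(urlsplit(url).query).get("state", [None])[0]


class StateMonitor:
    """Toy stand-in for M(RP): enforces the state check on behalf of the RP."""

    def __init__(self) -> None:
        self._pending = {}  # session id -> state value seen in the outgoing redirect

    def on_redirect_to_ttp(self, session_id: str, location: str) -> None:
        """Observe the RP's redirect towards the TTP and record its state parameter."""
        state = _state_of(location)
        if state is not None:
            self._pending[session_id] = state

    def allow_callback(self, session_id: str, callback_url: str) -> bool:
        """Forward the callback to the RP only if its state matches the recorded one."""
        expected = self._pending.pop(session_id, None)
        return expected is not None and _state_of(callback_url) == expected


monitor = StateMonitor()
monitor.on_redirect_to_ttp("alice", "https://ttp.example/authorize?client_id=myshop&state=s1")
print(monitor.allow_callback("alice", "https://myshop.com/rpservice/?code=c1&state=s1"))
print(monitor.allow_callback("bob", "https://myshop.com/rpservice/?code=c2&state=s1"))
```

Because the check lives in the monitor, the RP's own code does not need to change, which is the deployment property the generated monitors aim for.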
Let's use Bulwark and see what happens
We evaluated Bulwark on 8 case studies (CS) of known-to-be-vulnerable
multi-party web applications, covering identity management (OAuth2 protocol) and e-commerce scenarios (PayPal protocol). These case studies are summarized in the following table:

We started from an entirely artificial case study, where we developed both the RP and the TTP that use an open source OAuth2 protocol library, and we introduced one known vulnerability in each party (CS n.1). We then considered three scenarios with major TTPs, i.e., Facebook, VK and Google, where we developed our own vulnerable RPs that use the respective SDKs (CS n.2, n.3 and n.4).
To complete the study for the identity management scenario, we also considered a case study where we have no control over any party, i.e., the online editor Overleaf, which offers authentication via Google (CS n.5). We specifically chose this case study because we discovered that the Overleaf implementation of OAuth 2.0 lacked the state parameter, which introduces known vulnerabilities. Note that we responsibly disclosed the issue to Overleaf and they fixed it before the publication of our paper.
To assess the e-commerce scenarios, we selected legacy versions of three popular online shop platforms, suffering from known vulnerabilities in their integration with PayPal: osCommerce 2.3.1 (CS n.6), NopCommerce 1.6 (CS n.7) and TomatoCart 1.1.15 (CS n.8).
We evaluated each case study in terms of four key aspects:
- security: we experimentally confirmed that the monitors generated by Bulwark stop the exploitation of the vulnerabilities. We observed that 5 experiments are secured by a service worker alone, 4 experiments are protected by a server-side proxy and only one experiment needed the deployment of two monitors.
- compatibility: we experimentally verified that the monitors do not break legitimate protocol runs. We were able to complete legitimate protocol runs successfully, both with and without the monitors.
- portability: we demonstrated that our ideal specifications can be used without significant changes across different case studies. For instance, the ideal specification written for CS n.1 was used without any change for CS n.2, n.3 and n.4 too, and only minor changes were necessary to port it to CS n.5.
- performance: we showed that the time spent to verify the protocol and generate the monitors is acceptable for practical use. Both steps are performed offline and just once, never exceeding one hour and a half. The network overhead was also estimated to be negligible.
Demo videos
You can watch a few videos about Bulwark on this
channel in SAP media video.
Overleaf case study (CS n.5):
- video 1/2 (03:11): Session-swapping attack identified by us
- video 2/2 (02:53): Bulwark’s generated monitor preventing the attack on Overleaf
Artificial case study (CS n.1, artificial IdP and artificial RP):
- video 1/2 (01:57): Unauthorized Login by Code Redirection attack (embedded by us in the artificial IdP)
- video 2/2 (02:13): Bulwark’s generated monitor preventing the attack
Now what?
In our research we devised Bulwark and demonstrated its effectiveness on various case studies. By using Bulwark, a Trusted Third Party can generate runtime monitors for all its Relying Parties, protecting the entire web protocol even if a Relying Party made some mistakes in its implementation. Altogether, our approach has the potential to eradicate logic flaws from web-based protocol execution.
This goes in a direction very similar to runtime application self-protection (RASP), but rather than using runtime instrumentation to detect and block attacks, Bulwark uses specific server-side proxies and service workers to monitor the protocol-related HTTP traffic and block malicious protocol messages.
While some engineering effort may be necessary to make our approach more viable in industrial settings (e.g., development of a graphical notation to specify protocols and compile them into formal specifications), we strongly believe in the RASP vision of creating self-protecting apps and beyond (e.g., self-protecting protocols).
References
[1]
Formal Analysis of SAML 2.0 Web Browser Single Sign-on: Breaking the SAML-based Single Sign-on for G.... ACM Workshop on Formal Methods in Security Engineering at Computer and Communications Security (FMSE CCS 2008).
[2]
An authentication flaw in browser-based Single Sign-On protocols: Impact and remediations. Journal Computers and Security 2013 (JCS13).
[3]
Attack Patterns for Black-Box Security Testing of Multi-Party Web Applications. 23rd Annual Network and Distributed System Security Symposium (NDSS 2016).
[4]
Bulwark: Holistic and Verified Security Monitoring of Web Protocols. 25th European Symposium on Research in Computer Security (ESORICS 2020).
[5]
Security threat identification and testing. IEEE 8th International Conference on Software Testing, Verification and Validation (ICST 2015).
Contact and credits

Contact for further information:
Dr. Luca Compagna, research expert at SAP Security Research,
luca.compagna
Joint work with:
Lorenzo Veronese and Prof. Stefano Calzavara (Ca' Foscari University of Venice)
