Most hyperscalers offer native features such as 'Auto Recovery' or 'Self Healing' for the virtual machines running in their data centers. In AWS the feature is called 'Auto Recovery' and has to be explicitly enabled for a virtual machine from the AWS console.
What does Auto Recovery do?
If the VM has a problem caused by power, network or the underlying hypervisor, AWS automation restarts the VM automatically on a different hypervisor. The recovered virtual machine is identical to the previous one in all aspects, such as instance ID, instance name, storage and IP address. This is a clear advantage over a classical data center, where such a feature could only be achieved with extremely complex automation or a cluster setup. During 'Auto Recovery' the virtual machine essentially reboots, which gives the end user the possibility to configure business processes or components to start during the system reboot.
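As a rough illustration of enabling this outside the console, one documented route is a CloudWatch alarm on the EC2 system status check with the EC2 recover action attached. The sketch below is an assumption-laden example, not the author's procedure: the helper names, the alarm name, the instance ID, the region and the period/threshold values are all illustrative. The parameter builder is kept separate from the boto3 call so it can be inspected without AWS credentials.

```python
def build_recovery_alarm(instance_id: str, region: str) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm name and the period/evaluation values are illustrative
    assumptions; adjust them to your own monitoring standards.
    """
    return {
        "AlarmName": f"auto-recover-{instance_id}",
        "Namespace": "AWS/EC2",
        # System status check: fails on problems with the underlying host.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # EC2 recover action: restarts the instance on healthy hardware.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


def enable_auto_recovery(instance_id: str, region: str) -> None:
    """Create the alarm in AWS (requires boto3 and valid credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**build_recovery_alarm(instance_id, region))


if __name__ == "__main__":
    # Hypothetical instance ID and region, for illustration only.
    print(build_recovery_alarm("i-0123456789abcdef0", "eu-central-1"))
```

Calling enable_auto_recovery for the ASCS and ERS hosts would register one alarm per virtual machine. Whether this or the console is the better route depends on how the landscape is provisioned.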
In a typical NW-based system the ASCS instance mainly has two processes: the message server and the enqueue server. The enqueue server holds the locks for SAP transactions, and if these locks are lost it could lead to data inconsistencies. The startup behaviour of the SAP instance can be controlled with the Autostart parameter: add "Autostart = 1" (case sensitive!) to the START profiles (or to the instance profiles if your system does not use START profiles). Once the parameter is added, the ASCS instance is started automatically during a VM reboot.
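A minimal sketch of how the Autostart entry could be added to a profile in a repeatable way is shown below. It works on the profile text only; the function name and the sample profile content are hypothetical, and reading or writing the real profile file (and backing it up first) is left to the reader.

```python
import re

# Matches an existing Autostart line; the parameter name is case sensitive.
# Commented-out lines (starting with '#') are deliberately not matched.
_AUTOSTART_LINE = re.compile(r"^[ \t]*Autostart[ \t]*=.*$", re.MULTILINE)


def ensure_autostart(profile_text: str) -> str:
    """Return the profile text with 'Autostart = 1' set exactly once."""
    wanted = "Autostart = 1"
    if _AUTOSTART_LINE.search(profile_text):
        # Overwrite whatever value is currently configured.
        return _AUTOSTART_LINE.sub(wanted, profile_text)
    # Append the parameter, keeping the file newline-terminated.
    separator = "" if profile_text.endswith("\n") or not profile_text else "\n"
    return profile_text + separator + wanted + "\n"


if __name__ == "__main__":
    # Illustrative profile content, not taken from a real system.
    sample = "SAPSYSTEMNAME = ABC\nINSTANCE_NAME = ASCS00\n"
    print(ensure_autostart(sample))
```

The function is idempotent, so running it repeatedly against the same profile leaves a single Autostart line.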
An NW high availability setup requires an Enqueue Replication Server (ERS): the lock table of the enqueue server is replicated, and in the event of a failure of the ASCS instance the locks are still preserved on the ERS instance running on a different server.
With ENQ2 there is an advantage: during startup it can read the replicated lock table from the ERS instance and rebuild its own lock table. This differs from the classical ENQ<->ERS setup, where the ASCS instance needs to be failed over to the host where the ERS instance is running.
For an NW high availability setup in AWS this is an advantage. In the event of an unplanned downtime of the virtual machine where the ASCS instance is running, AWS 'Auto Recovery' takes care of bringing the virtual machine back, and the Autostart parameter takes care of starting the ASCS instance.
The ENQ2 process can rebuild the lock table after reading the replicated lock table from the ERS instance. There is no need to fail over the ASCS instance to any other host.
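The recovery sequence can be pictured with a toy model. The classes below are purely hypothetical and do not reflect SAP's actual replication protocol; they only show the idea that a restarted enqueue server repopulates its lock table from the replica held by the ERS, so locks granted before the outage are still honoured.

```python
class ReplicationServer:
    """Toy stand-in for the ERS: keeps a copy of the lock table."""

    def __init__(self) -> None:
        self.replica: dict[str, str] = {}

    def replicate(self, key: str, owner: str) -> None:
        self.replica[key] = owner

    def release(self, key: str) -> None:
        self.replica.pop(key, None)


class EnqueueServer:
    """Toy stand-in for ENQ2: rebuilds its lock table from the ERS on start."""

    def __init__(self, ers: ReplicationServer) -> None:
        self.ers = ers
        # On startup, read the replicated lock table instead of starting empty.
        self.locks: dict[str, str] = dict(ers.replica)

    def lock(self, key: str, owner: str) -> bool:
        if key in self.locks:
            return False  # already locked, request is rejected
        self.locks[key] = owner
        self.ers.replicate(key, owner)
        return True

    def unlock(self, key: str) -> None:
        self.locks.pop(key, None)
        self.ers.release(key)


if __name__ == "__main__":
    ers = ReplicationServer()
    enq = EnqueueServer(ers)
    enq.lock("ORDER/0001", "user_a")      # illustrative lock key and owner

    enq = EnqueueServer(ers)              # VM recovered, ASCS autostarted
    print(enq.locks)                      # lock from before the outage
    print(enq.lock("ORDER/0001", "user_b"))
```

In this model the second lock request on the same key is rejected after the restart, which is the behaviour that protects against data inconsistencies.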
This setup works without the need for a cluster. The diagram below shows one such setup.
In the above diagram, AZ1 and AZ2 are two availability zones in the AWS Region. The icon shown on the virtual machines indicates that 'Auto Recovery' is enabled for them in the AWS environment.
With simple steps, a high availability setup for an NW-based system (NW release higher than 7.52) can be built in AWS so that all single points of failure are avoided. The ASCS and ERS instances are protected against virtual machine failures by AWS 'Auto Recovery', and ENQ2 ensures that after a restart of the ASCS instance the lock table is rebuilt from the replicated lock table in the ERS instance.
With AWS Auto Recovery and ENQ2, the overhead of maintaining a cluster setup can be avoided, which reduces the TCO (total cost of ownership) from a customer perspective.
Note: If the entire availability zone where the ASCS instance is running fails, no auto-recovery from AWS is possible. In such rare scenarios the ASCS instance needs to be failed over to the virtual machine where the ERS instance is running in the second availability zone, or it can be started on a new virtual machine.