c_baker
Employee
(...continuing from Part 2)

Dropping the DR node


Dropping the DR node can be done in 3 steps from the primary RMA:

  1. Drop database replication to DR:
    1> sap_disable_replication Toronto, Offsite, tpcc
    2> go
    TASKNAME TYPE VALUE
    ------------------- ----------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------
    Disable Replication Start Time Thu Dec 08 21:54:47 UTC 2022
    Disable Replication Elapsed Time 00:00:45
    DisableReplication Task Name Disable Replication
    DisableReplication Task State Completed
    DisableReplication Short Description Disable the flow of Replication
    DisableReplication Long Description Successfully disabled Replication for database 'tpcc'. Please execute 'sap_enable_replication Toronto, Offsite, tpcc' to enable replication for database.
    DisableReplication Task Start Thu Dec 08 21:54:47 UTC 2022
    DisableReplication Task End Thu Dec 08 21:55:32 UTC 2022
    DisableReplication Hostname primarynode.openstack.na-ca-1.cloud.sap

    (9 rows affected)
    1>


  2. Remove DR node from the HADR system:
    1> sap_update_replication remove Offsite
    2> go
    TASKNAME TYPE VALUE
    ------------------ ----------------- ---------------------------------------------------------------------------
    Update Replication Start Time Thu Dec 08 21:57:52 UTC 2022
    Update Replication Elapsed Time 00:01:54
    UpdateReplication Task Name Update Replication
    UpdateReplication Task State Completed
    UpdateReplication Short Description Update configuration for a currently replicating site.
    UpdateReplication Long Description Update replication request to remove host 'Offsite' completed successfully.
    UpdateReplication Task Start Thu Dec 08 21:57:52 UTC 2022
    UpdateReplication Task End Thu Dec 08 21:59:46 UTC 2022
    UpdateReplication Hostname primarynode.openstack.na-ca-1.cloud.sap

    (9 rows affected)


  3. Clean up replication definitions to the DR host:
    1> sap_drop_host Offsite  
    2> go
    TASKNAME TYPE VALUE
    ----------- ----------------- --------------------------------------------------------------------
    Drop Host Start Time Thu Dec 08 22:03:12 UTC 2022
    Drop Host Elapsed Time 00:00:01
    DropHostApi Task Name Drop Host
    DropHostApi Task State Completed
    DropHostApi Short Description Drop the logical host from the environment.
    DropHostApi Long Description Submission of the design change for a model property was successful.
    DropHostApi Task Start Thu Dec 08 22:03:12 UTC 2022
    DropHostApi Task End Thu Dec 08 22:03:13 UTC 2022
    DropHostApi Hostname primarynode.openstack.na-ca-1.cloud.sap

    (9 rows affected)



The DR host is now removed from the HADR environment:
1> sap_status path
2> go
PATH NAME VALUE INFO
--------------------- ------------------------- ----------------------- ------------------------------------------------------------------------------------
Start Time 2022-12-08 22:03:50.498 Time command started executing.
Elapsed Time 00:00:00 Command execution time.
London Hostname companionnode Logical host name.
London HADR Status Standby : Inactive Identify the primary and standby sites.
London Synchronization Mode Synchronous The configured Synchronization Mode value.
London Synchronization State Inactive Synchronization Mode in which replication is currently operating.
London Distribution Mode Remote Configured value for the distribution_mode replication model property.
London Replication Server Status Active The status of Replication Server.
Toronto Hostname primarynode Logical host name.
Toronto HADR Status Primary : Active Identify the primary and standby sites.
Toronto Synchronization Mode Synchronous The configured Synchronization Mode value.
Toronto Synchronization State Synchronous Synchronization Mode in which replication is currently operating.
Toronto Distribution Mode Remote Configured value for the distribution_mode replication model property.
Toronto Replication Server Status Active The status of Replication Server.
London.Toronto.DEM State Suspended Path is suspended (Replication Agent Thread). Transactions are not being replicated.
London.Toronto.DEM Latency Time Unknown No latency information for database 'DEM'.
London.Toronto.DEM Latency Unknown No latency information for database 'DEM'.
London.Toronto.DEM Commit Time Unknown No last commit time for the database 'DEM'.
London.Toronto.DEM Distribution Path Toronto The path of Replication Server through which transactions travel.
London.Toronto.DEM Drain Status Unknown The drain status of the transaction logs of the primary database server.
London.Toronto.master State Suspended Path is suspended (Replication Agent Thread). Transactions are not being replicated.
London.Toronto.master Latency Time Unknown No latency information for database 'master'.
London.Toronto.master Latency Unknown No latency information for database 'master'.
London.Toronto.master Commit Time Unknown No last commit time for the database 'master'.
London.Toronto.master Distribution Path Toronto The path of Replication Server through which transactions travel.
London.Toronto.master Drain Status Unknown The drain status of the transaction logs of the primary database server.
London.Toronto.tpcc State Suspended Path is suspended (Replication Agent Thread). Transactions are not being replicated.
London.Toronto.tpcc Latency Time Unknown No latency information for database 'tpcc'.
London.Toronto.tpcc Latency Unknown No latency information for database 'tpcc'.
London.Toronto.tpcc Commit Time Unknown No last commit time for the database 'tpcc'.
London.Toronto.tpcc Distribution Path Toronto The path of Replication Server through which transactions travel.
London.Toronto.tpcc Drain Status Unknown The drain status of the transaction logs of the primary database server.
Toronto.London.DEM State Active Path is active and replication can occur.
Toronto.London.DEM Latency Time 2022-12-06 18:47:31.278 Time latency last calculated
Toronto.London.DEM Latency 379 Latency (ms)
Toronto.London.DEM Commit Time 2022-12-06 18:47:31.284 Time last commit replicated
Toronto.London.DEM Distribution Path London The path of Replication Server through which transactions travel.
Toronto.London.DEM Drain Status Not Applicable The drain status of the transaction logs of the primary database server.
Toronto.London.master State Active Path is active and replication can occur.
Toronto.London.master Latency Time 2022-12-06 18:47:31.286 Time latency last calculated
Toronto.London.master Latency 383 Latency (ms)
Toronto.London.master Commit Time 2022-12-06 18:47:31.286 Time last commit replicated
Toronto.London.master Distribution Path London The path of Replication Server through which transactions travel.
Toronto.London.master Drain Status Not Applicable The drain status of the transaction logs of the primary database server.
Toronto.London.tpcc State Active Path is active and replication can occur.
Toronto.London.tpcc Latency Time 2022-12-06 18:47:31.286 Time latency last calculated
Toronto.London.tpcc Latency 383 Latency (ms)
Toronto.London.tpcc Commit Time 2022-12-06 19:33:53.846 Time last commit replicated
Toronto.London.tpcc Distribution Path London The path of Replication Server through which transactions travel.
Toronto.London.tpcc Drain Status Not Applicable The drain status of the transaction logs of the primary database server.

(50 rows affected)

which can also be confirmed by connecting to the DR node with isql:
1> sp_configure 'HADR mode'
2> go
Parameter Name Default Memory Used Config Value Run Value Unit Type
---------------- ----------- ------------- -------------- ------------ -------------- -------
HADR mode -1 0 -1 -1 not applicable dynamic

(1 row affected)

'-1' as a run value indicates that this instance is no longer participating in any HADR environment.

There is still an RMA instance running. The SRS instance should already be shut down and removed.

If the database is to be used for other purposes, it can be cleaned up further; the steps are documented at Removing the DR Node from the HADR System. Otherwise, for testing it may be desirable to clean up the node, recreate the DR instance, and add the DR node back into the HADR cluster.

The following steps help clean up the DR node if you are resetting it to add it back again, rather than attempting to reuse the existing ASE instance:

  • Stop the RMA instance (connect to the RMA with isql and issue the 'shutdown' command), as sketched below.

  • Remove the entries in the second interfaces file located in $SYBASE/DM.
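
A minimal sketch of these two steps on the DR host follows. The RMA login, password, and port are placeholders for whatever was used when the DR node was installed (Part 1), not values captured from this environment:

isql -U<rma_admin_user> -P<password> -S <dr_hostname>:<rma_port>
1> shutdown
2> go

After the RMA is down, edit $SYBASE/DM/interfaces and remove the entries as noted above.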


A new DR instance can now be added back using 'Adding the DR node' from Part 1.

Checking the HADR Cluster

Running the test application results in the same number of records in both primary (active) and companion (standby):
1> use tpcc
2> go
1> select count(*) from ORDER_LINE
2> go

-----------
900695

(1 row affected)

Since the DR node is no longer part of the cluster, however, the count there obviously remains the same as previously reported.

Before proceeding, we will perform a shutdown and startup of the HADR system. This is documented at Starting and Stopping the HADR System, but it consists of the following steps and assumes that the active ASE is on the primary node.

Shutdown sequence:

  1. fault manager

  2. primary and companion backup servers

  3. deactivate and shut down the primary ASE (sketched after this list)

  4. primary SRS (by default this is actually on the companion node)

  5. primary and companion RMAs

  6. companion ASE

  7. companion SRS (by default this is actually on the primary node)
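
As a rough sketch, steps 2 and 3 could look like the following isql session on the primary ASE (the companion backup server is stopped the same way from the companion ASE). The sp_hadr_admin deactivate call and its timeout are my assumption of the deactivation step; confirm the exact syntax in the Starting and Stopping the HADR System documentation for your ASE version:

1> shutdown SYB_BACKUP
2> go
1> sp_hadr_admin deactivate, '120' -- deactivation call and timeout are assumed; check the documented syntax
2> go
1> shutdown
2> go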


Startup sequence:

  1. companion/standby ASE

  2. primary SRS (by default this is on the companion node)

  3. primary ASE

  4. primary and companion backup servers

  5. companion SRS (by default this is on the primary node)

  6. primary and companion RMAs

  7. fault manager


Failover now needs only two commands:

  • sap_failover <active>, <standby>, <timeout>

  • sap_host_available <new standby/previous active>


1> sap_failover Toronto, London, 120
2> go
TASKNAME TYPE VALUE
-------------- --------------------- ----------------------------------------------------------------------------------------------------------
Failover Start Time Fri Dec 09 17:24:05 UTC 2022
Failover Elapsed Time 00:00:02
DRExecutorImpl Task Name Failover
DRExecutorImpl Task State Running
DRExecutorImpl Short Description Failover makes the current standby ASE as the primary server.
DRExecutorImpl Long Description Started task 'Failover' asynchronously.
DRExecutorImpl Additional Info Please execute command 'sap_status task' to determine when task 'Failover' is complete.
Failover Task Name Failover
Failover Task State Running
Failover Short Description Failover makes the current standby ASE as the primary server.
Failover Long Description Waiting for markers that verify all in-flight data has been sent from source 'Toronto' to target 'London'.
Failover Current Task Number 6
Failover Total Number of Tasks 18
Failover Task Start Fri Dec 09 17:24:05 UTC 2022
Failover Hostname primarynode.openstack.na-ca-1.cloud.sap

(15 rows affected)
1> sap_status task
2> go
TASKNAME TYPE VALUE
---------- --------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------
Status Start Time Fri Dec 09 17:24:05 UTC 2022
Status Elapsed Time 00:00:04
Failover Task Name Failover
Failover Task State Completed
Failover Short Description Failover makes the current standby ASE as the primary server.
Failover Long Description Failover from source 'Toronto' to target 'London' is complete. The target may be unquiesced.
Failover Additional Info Please run command 'sap_host_available Toronto' to complete disabling replication from the old source, now that the target 'London' is the new primary.
Failover Current Task Number 14
Failover Total Number of Tasks 14
Failover Task Start Fri Dec 09 17:24:05 UTC 2022
Failover Task End Fri Dec 09 17:24:09 UTC 2022
Failover Hostname primarynode.openstack.na-ca-1.cloud.sap

(12 rows affected)
1> sap_host_available Toronto
2> go
TASKNAME TYPE VALUE
------------- --------------------- -------------------------------------------------------------------------------------------------------
HostAvailable Start Time Fri Dec 09 17:24:47 UTC 2022
HostAvailable Elapsed Time 00:01:44
HostAvailable Task Name HostAvailable
HostAvailable Task State Completed
HostAvailable Short Description Resets the original source logical host when it is available after failover.
HostAvailable Long Description Completed the reset process of logical host 'Toronto' receiving replication from logical host 'London'.
HostAvailable Current Task Number 11
HostAvailable Total Number of Tasks 11
HostAvailable Task Start Fri Dec 09 17:24:47 UTC 2022
HostAvailable Task End Fri Dec 09 17:26:31 UTC 2022
HostAvailable Hostname primarynode.openstack.na-ca-1.cloud.sap

(11 rows affected)

At this point, we are back to a primary-and-companion-only HADR cluster.
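
To confirm the role swap, sap_status path can be run again from either RMA. Based on the failover semantics above (an expectation, not output captured from this run), London should now report an HADR Status of 'Primary : Active' and Toronto 'Standby : Active':

1> sap_status path
2> go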

In a future blog, we will examine using the Fault Manager, unplanned failover, and making the application HA-aware.

My next blog addresses how to enable HADR for an existing ASE instance.

Please let me know in the comments if you have any issues or corrections, and I will address them.

Chris Baker
7 Comments
hganga
Explorer
Dear Chris, great blog. I struggled with the HA + DR configuration a few months ago. Now it is working correctly. I have only one doubt that I could not find in the documentation:

Failover to the DR site is not supported, so how do you activate the DR node in case of a disaster? I think it may be possible with "sp_hadr_admin primary, force", but the documentation is not clear about that.

Your help will be appreciated.

 

Thanks and best regards.
c_baker
Employee
In the case of a disaster that takes both the primary and companion nodes offline, the DR node can be used to preserve the data, but it must be activated using a manual or other process.  Automatic client connection failover is only supported between the primary and companion nodes of the cluster.

Recovering the HADR cluster or a database in the HADR cluster from the DR node is documented under Recovering the HADR Cluster from the DR Node.

Chris
hganga
Explorer
Thanks for your answer, Chris. That information (activating the DR node by a manual process) is not clear in the documentation. I think my manual approach can work, but I cannot test it for now.

Your advice will be appreciated.

From your answer, I understand that the DR node is only for data preservation and to reconstruct the primary site? It is not intended for use with applications (even by manually redirecting all the connected apps to this newly activated DR server)?

 

Thanks and best regards.
c_baker
Employee
Only the active node should have its data altered.  When configured, replication from the active (primary) to the companion (standby) is one-way only.  The direction is reversed during failover, when the companion becomes the active node and the primary becomes the standby.

Applications cannot connect to the non-active node of the cluster - they will be redirected to the active node (unless the login has the 'allow hadr login' privilege) - hence an application does not need to be 'HA-aware' if both nodes are still live, as in this blog (my next blog will cover what application changes are needed for an 'unplanned' failover).

The DR node is replicated from the HADR cluster, but does not participate in failover/failback operations.  Connections to the DR are not recommended unless the primary and companion are both offline.  However, once data is altered on the DR, restoring the primary and companion nodes of the cluster from the DR would be a planned operation as documented in the link previously provided.

Chris

 

 
hganga
Explorer

Thanks Chris. I clearly understand what you said. My original question is how to activate the DR node in case of a disaster, that is, if the HA site is lost completely (both nodes offline). That procedure (a manual procedure to activate the DR node, since the "sap_failover" command is not supported for the third node) is not clear in the documentation, and the business needs to be able to keep operating in this scenario.  I know that we will later need to restore the HA site with a backup from the DR site, because the business was operating on the DR node (in the case of lost HA nodes).

So, I think that running "sp_hadr_admin primary, force" on the DR node will be sufficient, because in the documentation you mention there is no mention of activating the DR database, just the backup, the restore on the HA side, and enabling replication as in the original flow.

I really appreciate your help and advice. Sadly, I cannot test this on my DR node, but I need to document the procedure in case of a disaster and loss of the HA site.

 

Thanks and best regards.

hganga
Explorer
Dear Chris, the Azure team has cloned and isolated the DR node to test the manual activation of the DR node with the command "sp_hadr_admin primary, force", but it is not working.

Do you know how the DR node can be activated?

 

Thanks and best regards.

c_baker
Employee

Per the documentation (Overview), the DR node only backs up the databases.  It does not participate in the failover or failback of the active and standby nodes, so cannot be activated using RMA or ASE commands.

It is not an HADR cluster standby node, only a live backup of the primary/companion HADR cluster utilizing the HADR replication capabilities, instead of requiring additional replication licensing.

If the original cluster is not available (isolated, per your last comment), simply start using the DR node as a standalone ASE instance.  You can remove the 'cluster' parts by following the documentation at Manually Removing the Replication in HADR System with DR Node Environment on the DR node to clear the information retained there about the original HADR cluster.

Obviously, once any data in the DR node has been changed, you will need to follow steps in the previously linked documentation (Recovering the HADR Cluster from the DR Node) to re-establish the HADR cluster.

Chris