on 2011 Apr 18 6:39 AM
Hi, we are using right now SQLA 10 for 3 databases, all running on one engine and on one box, each HA component is separated on its own box (primary, arbiter, mirror), works fine but we need to place each DB on its own box. (performance and license reasons)
Desired state:
PRIMARY SITE:
Box1 - dbsrv_1 - db1 -----
Box2 - dbsrv_2 - db2 -----
Box3 - dbsrv_3 - db3 -----
ARBITER SITE:
Box4 - 3x arbiter for each primary
MIRROR SITE:
Box5 - 3x dbsrv for each mirror engine (each per 1 db)
Boxes 1,2,3,5 are planning to be the same hardware, (mirror doesnt need to be superfast).
Question 1> Almost each of 3 primary dbsrv is heavily used, thats the reason of separating them to separate boxes, how will the performance get affected by Box5, where all the DBs will be mirrored ? There is really just a small chance of a failure of primary, so I understand that in case of failure i.e. Box1, the performance of dbsrv_1 after failover to Box5 will be affected by other mirror engines, there's doubt about. But in case where each primary is running correctly on Boxes1-3, will be Box5 fast enough to handle all the transactions/changes of primary DBs ?
Question 2> How are the transactions performed in behind when using HA mode? Is it A: 2phase commit (change writes to both sides at one time) or is it B: tran commits on primary db, and after successfull write on primary it performs the same change and commits on mirror db (something like rep server)
Thanks a lot in advance
Request clarification before answering.
To answer your question 1, you have to understand the change transaction volume on your primary servers, as opposed to the query volume. Are your servers running a mixed workload behind an interactive application, where there is a healthy mix of queries, followed by a few change (INSERT, UPDATE, DELETE) transactions? OR, are your servers primarily change transactions, perhaps fed by an automated process? These are the questions you have to answer to understand if your server5 can keep up.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Well, the system is an ERP, so everything runs on it. Its a mix of OLTP (sales coming from number of branches, deletes from stores, RFID..), some complex processing (performed by accountants,HR,stores) and also some DSS (managers, reports, statistics). Everything is now handled by primary, mirror is just playing a role of "safe twin" and waiting for failure of primary.
I can't comment on the performance considerations, but as to the "commit mode" - there's the "database mirroring mode" with three options:
AFAIK, the default synchronous mode does not work really in 2PC mode (transactions are committed locally before they have been negotiated with the mirror) but it works similar: The primary waits until the according translog pages have been transmitted to the mirror and written to its trangslog.
The other modes offer a looser coupling (and therefore better performance) at the risk of losing committed transaction in case of a failover.
More discussion on this topic can be found in this HA question.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Volker, thanks for your reply and link to other question. So according to this, synchronous is the slowest, but deliveres top recoverability. Asynchronous provides better performance for primary servers, but there is highly potencial possibility of loosing some transactions (+- 5secs in async mode) or the all transactions on last log page if not filled completely up (asyncfullpage mode - or as set up by pagetimeout parameter).
Probably the performance recommendation according to our future server placement of servers would be to go and try with sync mode, and in case performance will suffer, switch to async and we should get better performance for primaries.. sounds like a plan.
Any other ideas ?
Your plan sounds good to me.
The main questions might still be how you value the trade-offs between
E.g. when you use an asynchronous mode and don't use auto-failover, how fast can you transport a backup between primary and mirror, and is there experienced IT staff available at both locations (if they are distinct at all...).
These considerations will usually differ based on your goal: Are you using HA to ensure a reduce/omit downtime in case of planned server maintenance and the like, or are you trying to shelter your business against acts of God?
You know, the customer always refuses to accept any of lost transactions or downtime... but we have to choose for him which is less painfull.. But moreless, their HA is implemented because of reducing downtime, its a store with lots of branches, and if the DB is not accessible, each branch is stucked and customers are not able to buy anything, cashier is not able to print invoices.. From this point of view, lost 5secs of transactions would be acceptable, because it will not reach the number of sales, lost during lets say 30min. downtime.
Just a note - if you do choose to run in an asynchronous mode, you can make specific transactions synchronous using the synchronize_mirror_on_commit option. This way, you can get the asynchronous performance boost most of the time and if there are transactions you absolutely can't afford to lose, you can use this option.
Have a look at the current configuration, is your mirror server under heavy load, or is only your primary under heavy load? My assumption is, that as you are already looking for 3 separate primaries, that the primaries have a higher load than the current mirror. Anyway your server 5 will have the exact load of your current mirror. So if it is ok now it will be also ok in your planned configuration.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Not necessarily. Currently, the single primary may be the bottleneck, and the mirror may not be heavily used. But if the splitting into 3 primaries will improve the overall throughput (which is the main goal in my understanding) then the load on the mirror will increase, too. - If the increase is significant, well, that's hard to tell for us:)
Well, as said, neither you nor me can't comment on the actual workload of Marchello's systems... My point is just the expectation that the mirror's peak load might be higher when it has to act as mirror for three different servers in contrast to only one.
E.g. currently the one primary has to "serialize" nearly parallel transactions from all three databases as it is only one database engine that has to handle these. (At least they will interfere with each other somewhat as the queries will share some resources.) - In the planned setup, these transactions could really run in parallel on the separate boxes without interfering each other, and the "serialization" might appear on the mirror who has to apply translog pages from all three primaries in parallel.
I don't think the current setup cannot handle the daily workload. It's just the assumption that peak workloads will be handled faster in the planned setup, and then the mirror might become a bottleneck in peak situations. But as said, that's all mere speculation:)
well, as I mentioned before, only the primary is under heavy load, not even DSS queries are executed against mirror server - its used just as a hot/online backup waiting for failover. So our current bottleneck is the primary server. Yes, splitting current 1 primary box to 3boxes (1box per 1db) is to reach higher throughput for each of them. Right now, with 1primary & 1mirror box, the mirror is not having the same load as the primary has, because only inserts, updates, deletes are sent to it. None from DSS queries. How it will catch up, when serving 3 mirror dbs, will depend for sure on number of inserts/updates/deletes comming from 3x primary boxes together.. Thanks to your posts its more clearlier to me right now. The possibilities are a bit limited, as we cant change the design of the application, neither use some fancy features of SQLA v11&12, we have to stay on v10 and work with it.
User | Count |
---|---|
82 | |
29 | |
9 | |
8 | |
7 | |
7 | |
7 | |
6 | |
6 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.