on 2011 Jul 06 8:12 AM
This has been reported a few times but no resolutions reported and I wondered if anyone had got to the bottom of it?
We occasionally have database connections where the client application has been disconnected but the connection is still active on the server and cannot be dropped. Attempting to drop the connection either through Sybase Central or using DROP CONNECTION has no effect (and no error is reported). It makes no difference when the client machine is rebooted.
The connection appears in the list of active connection in Sybase Central and is consuming CPU time (slightly less than a full processor's worth - though it varies). The statement reported as the last statement is a fairly straightforward and is being executed by the system for other connections thousands of times per hour without problem. The connection is not shown as being blocked.
The server version is 10.0.1.4239
Related questions:
We have seen customer reports in the past (known to exist 11.0.1) of the bug CR #594568 leaving 'dead connections' around that can no longer be dropped. (e.g. We started a HashFilter/parallelized intra-query task and 'accidentally' left around some worker threads afterwards).
Notably, CR #594568 was only the first fix for a class of 'HashFilter' problems - there were additional issues (in particular, CR #620095 ) fixed on later builds/versions (including 10.0.1/12.0.1). I would recommend seeing if this 'zombie connection' behaviour still occurs after these fixes/builds:
Update (2012/03/30):
We have now diagnosed more instances of a 'hung connection' problem from other customer reports. The updated list for parallelized HashFilter problems (that may leave Exchange worker tasks 'abandoned' and 'undroppable') now includes the following issues/builds:
The list of associated parallelized 'HashFilter' issues (server hang/crashes/incorrect results) include:
Note: CR #702733 was identified after the End-of-Life date for SQL Anywhere 10.0.x. As such, this fix was not backported to this codeline and it remains a possibility for 10.0.x builds to encounter undroppable internal connections in very rare circumstances. If this is a concern for you, please upgrade to a later version of SQL Anywhere (11.0.1 / 12.0.1), which includes this fix. Setting the connection option 'MAX_QUERY_TASKS' to '1' or restarting the database server is a workaround for the issue in 10.0.1.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
It's not directly reproducible but is occurring regularly. Potentially more helpfully: the server in question crashed with this issue on-going and as part of the investigation of that (probably unrelated) issue (case express #116801982) a full core dump and a request level log have been uploaded to support today.
If the server is also crashing, it sounds like the environment is possibly 'unstable'. There is likely another issue involved (that is responsible for the crash specifically) that may be contributing to the hanging connections (e.g. potentially something 'like' CR #674917). It could be the case that getting a connection into this state precipitates the crash, or...
I would be interested to see if this behaviour still happens after we figure out the cause of the crash internally - it looks like we're still currently investigating that issue.
You're quite welcome - I've been trying to keep track of the reports for this issue and I think I've found all of the questions now. Feel free to link back to this thread if you find any more around.
If I hear of anything else regarding this situation, I will try to keep this answer updated. Please feel free to report future instances of any 'undroppable connections' as a new question after these builds have been released. (These builds have also been currently requested in QA).
The 11.0.1 Windows / Linux EBFs have now been released for the CR #702733 issue:
11.0.1.2789 - Windows x86/x64 - http://downloads.sybase.com/swd/detail.do?baseprod=144&client=ianywhere&relid=15647
11.0.1.2794 - Linux x86/x64 - http://downloads.sybase.com/swd/detail.do?baseprod=144&client=ianywhere&relid=15643
The 12.0.1 EBFs are still progressing through QA.
My guess would be that the connection has a huge transaction to rollback. The communications link may be gone but the server must actually rollback all uncommitted operations performed by that connection before the connection will disappear from the list reported by the server.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
As far as the server is concerned, the communication link and the internal representation of a database connection are distinct (though clearly related) entities. The internal representation of a database connection cannot go away until the rollback completes. I don't know for sure what information is returned by the connection info procedures when the communication link has terminated but the internal representation of a connection is still being torn down but I wouldn't be surprised if it was 'active'.
One thing that could be done while in the state described by Volker is to try to execute a checkpoint on another connection. Checkpoints cannot be performed while a rollback is in progress. If you can execute a checkpoint, my guess is wrong and perhaps there is a bug. If you cannot execute a checkpoint, you can't really tell for sure if there's a rollback in progress since other things can prevent checkpoints too but it's a good indicator.
Thank for that tip - it hadn't occurred to me. Checkpoints are happening regularly (about every 10 mins) as normal for the server, taking a few seconds (the connection has been in this state for a few hours now) - so does that suggest that there could be an issue? It's the only odd thing we've been able to establish so far about this server that is crashing regularly for no apparent reason. I'm just wondering if we could be running into some sort of resource leak.
User | Count |
---|---|
66 | |
10 | |
10 | |
10 | |
10 | |
8 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.