Dancer::Plugin::Database Hangs Randomly on check_connection
Hello,

I'm having an issue with Dancer::Plugin::Database that has been hard to reproduce and track down.

I have a new app that I just moved to its production environment. It connects to a database in a remote data center over a VPN tunnel. Everything works as expected about 98% of the time, but occasionally I hit a worker thread (deployed via Starman) that hangs while trying to get a database handle. It hangs in the _check_connection subroutine until it reaches some timeout, then reconnects just fine with a new connection.

I'm using plugin version 1.82 and have traced it down to this line in the plugin:

    if ($dbh->{Active} && (my $result = $dbh->ping)) {

It hangs right at this if statement, I assume while calling the ping function. There seems to be some confusion as to whether the dbh is still active; it then calls ping, which hangs for a number of minutes (I'm not sure exactly how many yet, between 3 and 5). This leaves the worker thread unresponsive until it times out. It eventually falls into the else block and returns false, after which a new connection is established just fine.

I have successfully deployed a number of apps using this same plugin with no problems. The only difference here is the physical location, so there may be something network related; I'm just not sure where.

I have dug into DBI, etc., and the ping function works fine in all my test scripts. Again, this only happens once in a while, so I cannot reproduce it on command, which makes it difficult to debug.

Here is the output of my debug statements, showing the one thread that hangs:

[27037] debug @0.005053> [hit #4]Database connection_check_threshold [30] in /usr/local/share/perl5/Dancer/Plugin/Database.pm l. 69
[27037] debug @0.005189> [hit #4]Database handle last check [1398093938] in /usr/local/share/perl5/Dancer/Plugin/Database.pm l. 70
[27037] debug @0.005345> [hit #4]Database calling check_connection in /usr/local/share/perl5/Dancer/Plugin/Database.pm l. 83
[27037] debug @0.005481> [hit #4]Database.pm in check_connection before if active and ping in /usr/local/share/perl5/Dancer/Plugin/Database.pm l. 265
... other threads working ...
[27037] debug @962.134467> [hit #4]Database in check_connectiion false return in /usr/local/share/perl5/Dancer/Plugin/Database.pm l. 285
[27037] debug @962.134667> [hit #4]Database connection went away, reconnecting in /usr/local/share/perl5/Dancer/Plugin/Database.pm l. 90
[27037] debug @962.134894> [hit #4]Database calling get_connection in /usr/local/share/perl5/Dancer/Plugin/Database.pm l. 97
[27037] debug @962.135059> [hit #4]Adding mysql_enable_utf8 to DBI connection params to enable UTF-8 support in /usr/local/share/perl5/Dance

Any help or suggestions on where to look next would be appreciated.

Thank you.
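For readers following along, the flow suggested by the debug output above (threshold check, then a liveness test, then a reconnect on failure) can be sketched roughly as follows. This is a simplified illustration inferred from the log messages, not the plugin's actual source; the function name, the `$handle` hash layout, and the `$is_alive` callback are all hypothetical.

```perl
use strict;
use warnings;

# Simplified illustration only; NOT the actual Dancer::Plugin::Database code.
# $handle is a hypothetical hash: { dbh => ..., last_check => epoch seconds }.
# $is_alive is a hypothetical callback, e.g. sub { $_[0]{Active} && $_[0]->ping }.
sub handle_needs_reconnect {
    my ($handle, $threshold, $is_alive) = @_;

    # Checked recently enough: reuse the handle without testing it again.
    return 0 if time() - $handle->{last_check} < $threshold;

    # Otherwise run the liveness test. This is the call that can block
    # for a long time if the network path to the server has gone dead.
    if ($is_alive->($handle->{dbh})) {
        $handle->{last_check} = time();
        return 0;
    }

    # Test failed: the caller should discard the handle and connect again.
    return 1;
}
```

With a threshold of 30 seconds, as in the log, a handle idle for longer than that is tested before reuse, which is where the hang described above would occur.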
Hello James,

If your app works well on the local network, the problem comes from your VPN connection. Please check the MTU and packet fragmentation on your VPN (this depends on the VPN type: IPsec, SSL, or other). See http://codeidol.com/telecommunications/vpn/Scaling-and-Optimizing-IPsec-VPNs...

Bye,
Hugues.

In reply to James Baer's message of 21/04/2014 18:55.
_______________________________________________ dancer-users mailing list dancer-users@dancer.pm http://lists.preshweb.co.uk/mailman/listinfo/dancer-users
Whilst the actual problem may well be network-level, it would be far better if D::P::D didn't hang for unacceptable periods waiting to determine whether the connection is still usable. Instead, if $dbh->ping didn't return in a reasonable time, it should give up on that connection, throw it away, and get a new one.

My knee-jerk reaction is that I could change that bit of code to call $dbh->ping in an eval block with a timeout; that ought to do the job, I'd think.

In reply to the message from Hugues <hugues@max4mail.com> of Mon, 21 Apr 2014 22:34:43 +0200.
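The idea of calling the ping in an eval block with a timeout could look roughly like the sketch below. It uses the standard eval/alarm idiom from the Perl documentation. The helper name `check_with_timeout` and its code-ref argument are hypothetical and not part of the plugin. Note that whether the alarm can actually interrupt a driver that is blocked inside C code depends on how Perl delivers signals, so this is a starting point rather than a guaranteed fix.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Dancer::Plugin::Database): run a check
# such as sub { $dbh->{Active} && $dbh->ping } and treat it as failed if it
# does not return within $timeout seconds.
sub check_with_timeout {
    my ($check, $timeout) = @_;
    my $result;
    my $ok = eval {
        # The "\n" stops Perl appending file and line info to the message.
        local $SIG{ALRM} = sub { die "ping timed out\n" };
        alarm $timeout;
        $result = $check->();
        alarm 0;
        1;
    };
    alarm 0;    # cancel any alarm still pending if the check itself died
    return $ok ? $result : 0;
}
```

A caller would use it along the lines of `check_with_timeout(sub { $dbh->{Active} && $dbh->ping }, 5)`, where the 5 seconds is only an example value, and reconnect when it returns false.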
-- David Precious ("bigpresh") <davidp@preshweb.co.uk> http://www.preshweb.co.uk/ www.preshweb.co.uk/twitter www.preshweb.co.uk/linkedin www.preshweb.co.uk/facebook www.preshweb.co.uk/cpan www.preshweb.co.uk/github
Thanks for the responses. I do not believe this to be an MTU issue, as all other traffic is flowing correctly. I'm thinking it's more of a TCP session timeout in one of the firewalls or something. The problem has been replicating it, as I need to have all my monitoring ready to catch it when it happens.

David, that was my first thought as well: setting some sort of timer around the ping code and forcing it to grab a new connection. What do you think an acceptable value would be for that timeout? I'll try to test something out on my end in the meantime, which I would be happy to contribute.

Thanks

In reply to the message from David Precious <davidp@preshweb.co.uk> of Mon, Apr 21, 2014 at 4:55 PM.
Just an update on this. I've tried to eval the ping with a timeout, but it does not appear to be working. I'm not sure whether using alarm is safe within Dancer. I'm guessing either that it is not, and that's my issue now, or that alarm is being overridden somewhere else.

In the meantime, I'm lowering wait_timeout on the MySQL server to force it to close connections sooner. We'll see if this helps.

In reply to the message from James Baer <jamesfbaer@gmail.com> of Mon, Apr 21, 2014 at 5:16 PM.
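One possible explanation, not confirmed anywhere in this thread, is Perl's deferred ("safe") signal handling: since Perl 5.8 a `$SIG{ALRM}` handler only runs between Perl opcodes, so it may not fire while a database driver is blocked inside a C-level network read. The DBI documentation describes a workaround that installs the handler with `POSIX::sigaction`, which delivers the signal immediately. The sketch below follows that pattern; the helper name `check_with_unsafe_alarm` is hypothetical, and whether this resolves the hang described above is an assumption that would need testing.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical helper (not part of Dancer::Plugin::Database or DBI).
# Runs $check, e.g. sub { $dbh->ping }, and gives up after $timeout seconds.
# The handler is installed via POSIX::sigaction so that it is not deferred.
sub check_with_unsafe_alarm {
    my ($check, $timeout) = @_;

    my $mask   = POSIX::SigSet->new(SIGALRM);    # blocked while handler runs
    my $action = POSIX::SigAction->new(sub { die "ping timed out\n" }, $mask);
    my $old    = POSIX::SigAction->new;
    POSIX::sigaction(SIGALRM, $action, $old);

    my ($result, $failed);
    eval {
        eval {
            alarm $timeout;
            $result = $check->();
            1;
        } or $failed = 1;
        alarm 0;    # cancel the alarm if the check returned in time
        1;
    } or $failed = 1;    # alarm fired between the inner eval and alarm 0

    POSIX::sigaction(SIGALRM, $old);    # restore the previous handler
    return $failed ? 0 : $result;
}
```

Interrupting a driver in the middle of a call can leave the handle in an undefined state, so the handle should be discarded, not reused, after a timeout.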
participants (3)
- David Precious
- Hugues
- James Baer