Wednesday, June 12, 2013

The Percona Server Thread Pool Feature

I recently had the opportunity to explore, test, and implement Percona's new thread pool option. This feature has been included as of Percona Server 5.5.29-30.0. Percona ported the feature from MariaDB - more information about the original commit can be found here. To be fair, this feature is an alternative implementation of Oracle's Thread Pool Plugin from Oracle MySQL Enterprise Edition. To be completely fair, thread pool functionality was originally added, but not released to the public, in 2007 - some interesting history. Vadim Tkachenko, co-author of High Performance MySQL, Percona's CTO, and all-around smart guy, has already explained thread pools better than I can, so there is no need to duplicate that effort. :)

I suspect that most MySQL DBAs have reviewed the numerous benchmarks and analyses regarding MySQL performance under high concurrency. If you haven't, refer to Dimitri Kravtchuk's blog for a benchmarking shot in the arm. Simply put: MySQL query performance drops substantially once the number of concurrent queries surpasses the number of CPU threads available. Multitenant servers can see load spikes, and MySQL will come to a grinding halt while the OS kernel scheduler spends almost all its time context-switching between threads rather than executing their instructions. This is the very phenomenon I experience at TrackVia and wish to prevent. I have tuned my database servers to handle the usual workload; it is the bursty traffic that can put them into an unhealthy state that requires automated and, at times, manual mitigation.

I have been testing thread pooling with some of our workload and, thus far, have seen marked improvement overall. As one would expect, CPU load during peak usage now maxes out at about 100% per CPU thread. The observed side effect: ordinarily light and fast queries take longer to return results because they are queued within the thread pool. Percona provides a few knobs for thread pool tuning - the priority policy is interesting, but unhelpful in my case (all application connections come from the same db user). One option that has helped is thread_pool_stall_limit, which sets how long (in milliseconds) a running thread may go before the pool considers it stalled and wakes or creates another thread to serve the next connection in the queue.
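
The queuing side effect described above can be illustrated with a toy model. The sketch below is not Percona's scheduler: it assumes every query arrives at the same instant, that each worker runs one query to completion in FIFO order, and it ignores thread_pool_stall_limit, priorities, and the context-switching cost that the pool is meant to avoid. The function name simulate_pool and the 500 ms / 5 ms service times are made up for illustration.

```python
import heapq


def simulate_pool(service_times_ms, workers):
    """Return per-query latency (queue wait + service time) in a FIFO pool.

    Assumptions (illustrative only): all queries arrive at t=0 in list
    order, and each worker runs one query at a time to completion.
    """
    # Min-heap of the times at which each worker next becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)

    latencies = []
    for service in service_times_ms:
        start = heapq.heappop(free_at)    # earliest available worker
        finish = start + service
        heapq.heappush(free_at, finish)
        latencies.append(finish)          # arrival was t=0, so latency == finish
    return latencies


# A burst: two heavy queries (500 ms) followed by four light ones (5 ms).
burst = [500, 500, 5, 5, 5, 5]

unlimited = simulate_pool(burst, workers=len(burst))  # one thread per query
pooled = simulate_pool(burst, workers=2)              # pool of two workers

print("one thread per query:", unlimited)
print("pool of two workers: ", pooled)
```

In this model the light queries finish in 5 ms when every query gets its own thread, but wait behind the heavy ones when only two workers exist. A real server with one thread per connection would also pay the contention cost that this sketch leaves out, which is the trade-off the pool is making.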

I expect to write a follow-up at some point in the future, once I have a better grasp of how this changes the overall behavior of Percona Server's query processing. For now, I wanted to put it out there for those who may not have read about or considered thread pooling. Bear in mind that, at the time of this post, Percona classifies this feature as beta quality - so do your own testing before considering using it.

I would love to see an implementation of Mark Callaghan's suggestion, a step closer to QoS in MySQL. But then, I'd love to see MySQL Proxy go from alpha to GA... :)

1 comment:

  1. Hi Paul,

    I have been testing thread pooling in our production env. A couple of interesting things.

    1. I see random spikes in the number of aborted connects on the MySQL master that has thread pooling enabled.

    2. I also use pt-tcp-model to collect stats from tcpdump. One thing I observe is that the quantile time jumps to between 750 ms and 1 sec on the master with thread pooling enabled. It doesn't do this on the server without thread pooling.

    3. If possible, can you share stats like QPS and what your thread pooling configuration looks like?

    4. I ran strace on 2 MySQL slaves that are part of the application pool and see that the one with thread pooling enabled makes a lot of system calls. Did you come across such behaviour?

    5. I am using the information_schema.query_response_time stats on these (load-balanced) slaves. On the slave with thread pooling, I see that the query response time has increased by a factor of 10.

    Regards,
    M
