In the last post I talked about how to detect the problem; in this post I'll talk about some solutions. This list is not ordered from easiest to hardest, and I'm not suggesting that doing just one of these will solve your problem.
- Hardware Solutions
- Get more servers.
- Turn on Jumbo Frames.
- Find a Network card that supports MSI-X.
- Software Solutions
- Hack the Network drivers and Epoll.
- Use Keep-Alive and Connection Pooling.
- Compress your content.
Let me go into a little detail about each of the points above, and hopefully give some guidance on whether it will help you.
Get More Servers - So what would this do for you? Well, it's really just buying time until you take one of the other actions. There is also a big expense in power, space, and cooling.
Turn on Jumbo Frames - The problem you are experiencing comes from each packet generating an interrupt. Standard packets use an MTU of ~1500 bytes; if you enable jumbo frames you increase this to ~9000 bytes, so the same data arrives in fewer packets and triggers fewer interrupts. The only issue is that all your switches and servers need to support jumbo frames. The end server where you are seeing this issue would have its load reduced, and your firewall or load balancer would be responsible for changing the packets back to a 1500-byte MTU, or to something else depending on your media type.
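To get a feel for the size of the reduction, here is a rough back-of-the-envelope sketch. It assumes 40 bytes of IPv4 and TCP headers per packet with no options, and a hypothetical 1 MB response; neither number comes from a real measurement, and real traffic will vary.

```python
import math

# Assumption: IPv4 (20 bytes) + TCP (20 bytes) headers, no options.
IP_TCP_HEADERS = 40


def packets_needed(payload_bytes, mtu):
    """Return how many packets it takes to carry payload_bytes at a given MTU."""
    per_packet = mtu - IP_TCP_HEADERS  # usable TCP payload per packet
    return math.ceil(payload_bytes / per_packet)


payload = 1_000_000  # hypothetical 1 MB response
standard = packets_needed(payload, 1500)
jumbo = packets_needed(payload, 9000)

print(f"MTU 1500: {standard} packets")   # 685
print(f"MTU 9000: {jumbo} packets")      # 112
print(f"reduction: {standard / jumbo:.1f}x")  # 6.1x
```

If every packet really does cost one interrupt, that is roughly a sixth of the interrupts for the same payload.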
Network card that supports MSI-X. - This is one of the better solutions available to you. Which network card you want will depend on what kind of CPUs you have installed and how many. The first thing to know is that you *MUST* be using a kernel of 2.6.21 or greater.
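A quick way to check the kernel requirement is to compare the running kernel's release string against 2.6.21. The sketch below is only that check: it says nothing about whether your network card or its driver supports MSI-X. The function names are made up for this example.

```python
import platform
import re

MIN_KERNEL = (2, 6, 21)  # minimum kernel version mentioned above


def parse_kernel_version(release):
    """Extract (major, minor, patch) from a release string such as '2.6.32-431.el6'."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # Missing components (e.g. '3.10') are treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def kernel_meets_minimum(release):
    """Return True if the release string is at least MIN_KERNEL."""
    return parse_kernel_version(release) >= MIN_KERNEL


if __name__ == "__main__":
    release = platform.release()
    try:
        print(release, "OK" if kernel_meets_minimum(release) else "too old")
    except ValueError as err:
        print(err)
```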
Hack the Network drivers and Epoll. - This is always an option if you need something special, but it's not something I would undertake unless you have the staff to support your own kernel build and drivers.
Use Keep-Alive and Connection Pooling. - Using both of these options will reduce the number of opens and closes of TCP sessions, which helps reduce the number of interrupts.
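As a rough illustration of what the opens and closes cost, the sketch below counts only the setup and teardown packets. It assumes the typical case of a 3-packet handshake and a 4-packet teardown per connection, and a hypothetical 10,000 requests with up to 100 requests reused per connection; real stacks can combine some of these packets, so treat the numbers as an estimate.

```python
# Assumed control-packet counts for one TCP connection (typical case).
HANDSHAKE_PACKETS = 3  # SYN, SYN-ACK, ACK
TEARDOWN_PACKETS = 4   # FIN, ACK, FIN, ACK


def setup_teardown_packets(requests, requests_per_connection):
    """Count handshake and teardown packets needed to serve `requests` requests."""
    connections = -(-requests // requests_per_connection)  # ceiling division
    return connections * (HANDSHAKE_PACKETS + TEARDOWN_PACKETS)


requests = 10_000  # hypothetical request count
without_keepalive = setup_teardown_packets(requests, 1)
with_keepalive = setup_teardown_packets(requests, 100)

print(f"one request per connection:  {without_keepalive} packets")  # 70000
print(f"100 requests per connection: {with_keepalive} packets")     # 700
```

None of those packets carry any content, so every one you avoid is pure savings.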
Compress your content. - I hope you are doing this. If not, enable it, code it, or find someone to help with it. This just helps overall. By sending compressed content you can send the same data in fewer packets; content that compresses to half its size takes roughly half the time to send.
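If you want to see how well your own content compresses, a minimal sketch with Python's standard `gzip` module looks like this. The sample markup is invented and highly repetitive, so it compresses far better than a real page would; run it against your own responses to get a realistic number.

```python
import gzip


def compression_report(content):
    """Return (original_size, compressed_size) in bytes for gzip-compressed content."""
    compressed = gzip.compress(content)
    return len(content), len(compressed)


# Hypothetical, deliberately repetitive sample page.
page = b"<li class='item'>Some repeated markup</li>\n" * 500

original, compressed = compression_report(page)
print(f"{original} bytes -> {compressed} bytes "
      f"({compressed / original:.1%} of original)")

# The client gets back exactly what was sent.
assert gzip.decompress(gzip.compress(page)) == page
```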
In my next post I'll detail how we solved the problem with a solution that's not listed above.