We see a consistent transient load spike on the server while the incoming connection count is ramping up. When we hit about 500 incoming connections, the load average on the box reaches about 200, CPU goes to ~100% (all in system; no iowait or user), and network throughput drops out, causing issues for viewers. User memory rises a little during these few minutes, but then returns to normal. After about 10 minutes, everything settles down, and the connection count keeps going up without any problems at all; we can comfortably accommodate ~1200 users, saturating a 1Gb/s connection, and the server is basically coasting, with a load average below 1 and plenty of free memory.
Basically, everything is awesome apart from those few minutes during ramp-up. Has anyone seen this? Any advice on how to avoid it? My guess is that it’s new threads being spun up to accommodate more connections, but I am not a Java expert and am not sure how to test this hypothesis.
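One way to test the thread hypothesis might be to log the JVM's thread count next to the 1-minute load average during ramp-up and see whether the two rise together. Below is a minimal sketch, assuming Python 3.6+ is available on the box (RHEL 6 does not ship it by default) and that the PID of the Wowza JVM is known. The function names are illustrative and are not part of Wowza or any library; the only inputs are the standard Linux files /proc/<pid>/status and /proc/loadavg.

```python
import sys
import time
from pathlib import Path


def parse_thread_count(status_text):
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no 'Threads:' line found")


def parse_load1(loadavg_text):
    """Return the 1-minute load average from /proc/loadavg text."""
    return float(loadavg_text.split()[0])


def sample(pid):
    """Read the current thread count of `pid` and the 1-minute load average."""
    status = Path(f"/proc/{pid}/status").read_text()
    loadavg = Path("/proc/loadavg").read_text()
    return parse_thread_count(status), parse_load1(loadavg)


def monitor(pid, interval=5.0, samples=240):
    """Print one timestamped sample every `interval` seconds."""
    for _ in range(samples):
        threads, load1 = sample(pid)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} threads={threads} load1={load1:.2f}", flush=True)
        time.sleep(interval)


# Hypothetical usage: python3 watch_threads.py <pid-of-wowza-jvm>
if __name__ == "__main__" and len(sys.argv) == 2:
    monitor(int(sys.argv[1]))
```

If the thread count climbs steeply in the same window as the load spike and then flattens out, that would be consistent with the guess above. If it stays flat while load hits ~200, the cause is probably elsewhere.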
Linux; WowzaMediaServer-3.6.3; 64-bit RHEL 6; java version “1.7.0_17”; 16GB RAM; 8 physical cores; hyperthreading disabled; Wowza is tuned according to the best practices outlined here. Aside from this one issue, everything is rock solid and pretty much completely maintenance free; Wowza just sits there doing its job awesomely.
Thanks in advance for any suggestions.