Wowza leaks threads from WebRTC connections

It seems that Wowza is leaking threads from WebRTC connections. All of the stranded threads have a stack trace like the following:

Name: Thread-484002
State: TIMED_WAITING on java.lang.Object@19cc7d5f
Total blocked: 0  Total waited: 492,193

Stack trace:
java.lang.Object.wait(Native Method)
com.wowza.wms.webrtc.dtls.WebRTCDTLSHandlerTransport.receive(WebRTCDTLSHandlerTransport.java:66)
org.bouncycastle.crypto.tls.DTLSRecordLayer.receiveRecord(Unknown Source)
org.bouncycastle.crypto.tls.DTLSRecordLayer.receive(Unknown Source)
org.bouncycastle.crypto.tls.DTLSReliableHandshake.receiveMessage(Unknown Source)
org.bouncycastle.crypto.tls.DTLSServerProtocol.serverHandshake(Unknown Source)
org.bouncycastle.crypto.tls.DTLSServerProtocol.accept(Unknown Source)
com.wowza.wms.webrtc.dtls.WebRTCDTLSHandlerThread.run(WebRTCDTLSHandlerThread.java:35)

These threads never die off, and whatever objects they create are never GC'd. Reproducing the problem is fairly simple: just start a cycle of adding and removing connections and watch the thread count grow over time.
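For what it's worth, the growth is easy to watch programmatically. The following is a minimal sketch of my own (the class name and the idea of running it inside the Wowza JVM, e.g. from a custom module, are my assumptions; only the stack frame it matches comes from the dump above). It counts live threads parked in WebRTCDTLSHandlerTransport.receive:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Sketch: count live threads currently parked in the DTLS handshake frame
// from the stack trace above. Must run inside the Wowza JVM (for example
// from a custom module) to see Wowza's threads.
public class DtlsLeakCounter {

    public static long countStrandedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long stranded = 0;
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            for (StackTraceElement frame : info.getStackTrace()) {
                if ("com.wowza.wms.webrtc.dtls.WebRTCDTLSHandlerTransport".equals(frame.getClassName())
                        && "receive".equals(frame.getMethodName())) {
                    stranded++;
                    break;
                }
            }
        }
        return stranded;
    }
}

Logging that count once a minute during the churn test is how I track the leak rate described below.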

After running my test for a few hours, I’m now seeing a sharp increase in CPU, memory, and the rate at which threads are leaked. I’m also getting this in my access log:

WebRTCRunnerMina.createSocket: java.lang.IllegalArgumentException: port out of range:68812
    at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
    at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188)
    at com.wowza.wms.webrtc.model.WebRTCRunnerMina.b(WebRTCRunnerMina.java:212)
    at com.wowza.wms.webrtc.model.WebRTCRunnerMina.start(WebRTCRunnerMina.java:303)
    at com.wowza.wms.webrtc.model.WebRTCSession.startRunner(WebRTCSession.java:250)

… and these in the logs as well:

2017-11-27 17:14:11 UTC comment server WARN 200 - UDPPortManager.acquireUDPPortPair: Auto UDP port number is greater than maximum [65536]: 66766 - 1471544.294
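Both errors point the same way: the port allocator has walked past the top of the valid range (0-65535). For reference, java.net.InetSocketAddress rejects anything above 65535, which is exactly the exception shown above. A two-line illustration of my own (not Wowza code):

import java.net.InetSocketAddress;

// Illustration only: ports outside 0-65535 are rejected, so an allocator
// that has counted past the valid range fails with "port out of range".
public class PortRangeDemo {
    public static void main(String[] args) {
        new InetSocketAddress("127.0.0.1", 65535); // fine
        new InetSocketAddress("127.0.0.1", 68812); // throws IllegalArgumentException
    }
}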

So it seems that the thread leak also creates a UDP port leak. Now that all of my UDP ports are used up, Wowza returns only a TCP ICE candidate (which is sort of cool in and of itself, but I'd love to have my UDP ports back).
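My guess at the mechanism, purely as a hypothetical sketch and not a claim about how UDPPortManager is actually written: the allocator hands out the next free pair and relies on session teardown to return it, so if the stranded DTLS thread keeps its session alive, the release never happens and the next-port counter eventually passes 65535:

import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical allocator, for illustration only; Wowza's UDPPortManager may
// work quite differently.
public class NaivePortPairAllocator {
    private final Deque<Integer> freed = new ArrayDeque<>();
    private int next = 6970; // hypothetical starting port

    public synchronized int acquirePair() {
        if (!freed.isEmpty()) {
            return freed.pop();
        }
        int pair = next;
        next += 2; // one port for RTP, one for RTCP
        // Nothing here stops the counter from passing 65535 if releasePair()
        // is never called, which matches the "greater than maximum [65536]"
        // warning above.
        return pair;
    }

    public synchronized void releasePair(int pair) {
        freed.push(pair); // never reached while the DTLS thread is stranded
    }
}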

The test ended with Wowza crashing at 32356 leaked threads. It seems that Wowza ran into a resource limit somewhere as well, as attempts to restart the Wowza service or reboot the server resulted in fork errors until the Java process was forcibly terminated with killall:

sudo service WowzaStreamingEngine restart

/etc/init.d/functions: fork: retry: No child processes

sudo reboot now

sudo: unable to fork: Resource temporarily unavailable

Ulimit output is as follows:

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 122590
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 122590
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
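One thing that puzzles me: the crash happened near 32k threads even though max user processes is 122590. I'm not certain which limit was actually exhausted, but 32356 is suspiciously close to the common kernel.pid_max default of 32768 (on Linux every JVM thread gets its own task ID), so the box may simply have run out of PIDs, which would also explain the fork failures. A quick standalone check of the kernel limits (my own sketch, nothing Wowza-specific):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Linux-only sketch: print the kernel task limits so they can be compared
// with the thread count seen in the Wowza thread dump.
public class KernelTaskLimits {
    private static String read(String path) throws IOException {
        return new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("kernel.pid_max     = " + read("/proc/sys/kernel/pid_max"));
        System.out.println("kernel.threads-max = " + read("/proc/sys/kernel/threads-max"));
    }
}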

The problem still exists in 4.7.3.01.

My test case involves 32 players that load the same live stream from a single edge server, which consumes that stream from an origin in a typical liverepeater setup. The source is OBS x264/AAC with Wowza-transcoded audio, but I don't suspect the source is relevant to this issue.

Each player reloads the stream every 8 seconds, and my test as configured produces an average churn rate of 4 connections per second (the actual rate varies slightly with the time it takes to negotiate the SDP with the socket service). Since all of the players connect to the same stream on the same edge, the repeater connection remains persistent throughout the test.

I'm finding that for every 100-200 connections (or perhaps more accurately, disconnections) at this rate, a thread like the one described above is left hanging; at 4 connections per second that works out to roughly one leaked thread every 25-50 seconds. There's nothing in the logs to indicate the condition, and the problem can only be seen by inspecting the threads directly. If the connection churn rate is increased, the rate of leaked threads increases with it.

When the test is stopped, Wowza reports zero connections as expected, but each of the stranded threads continues to consume memory and prevents UDP ports from being released back into the pool they came from, eventually causing Wowza to return only TCP ICE candidates and, with a high enough churn rate in this condition, to stop responding altogether.
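As a stopgap I've been tempted to interrupt the stranded threads from a scheduled task, using the same stack-matching idea as the counter sketch earlier. This is untested and entirely my own idea; I don't know whether the Wowza/BouncyCastle code reacts to the InterruptedException cleanly, or whether the UDP ports actually come back, so treat it as an experiment rather than a fix:

import java.util.Map;

// Untested workaround sketch: interrupt threads parked in the DTLS handshake.
// Whether this tears the session down cleanly and returns the UDP ports is
// unknown; run at your own risk, inside the Wowza JVM.
public class DtlsThreadReaper {
    public static int interruptStrandedThreads() {
        int interrupted = 0;
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if ("com.wowza.wms.webrtc.dtls.WebRTCDTLSHandlerTransport".equals(frame.getClassName())
                        && "receive".equals(frame.getMethodName())) {
                    entry.getKey().interrupt();
                    interrupted++;
                    break;
                }
            }
        }
        return interrupted;
    }
}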

The problem seems to be slightly less pronounced in 4.7.3.01 as compared to my tests on 4.7.1, but still represents a serious concern for deployment. A response would be greatly appreciated.

I ran the same test with Firefox as the client. In this case Wowza leaks a thread for each and every connection.

Please, can someone fix this issue? I need to restart the server every week, and the number of threads is reaching into the thousands. This bug has been present since the first WebRTC preview version.

Seeing this on 4.7.5 too


Has the issue been resolved?

Yes, these posts are referring to Engine version 4.7.5, which is old. We’ve released 12 versions since then.

We are now on Engine version 4.8.14, so make sure you are using at least 4.8.10 to benefit from all the WebRTC fixes.