BUG: Wowza 4.7.7 hanging thread takes up an entire CPU core.

A quick preface: I submitted a lengthy bug report ticket about this a week before Christmas, and had a back-and-forth with a few support staff. The support staff then very unhelpfully closed the ticket due to me not responding within 72 hours during the Christmas break (Merry Christmas, guys!). So, I’m posting this here now (Happy New Year).

Ever since we updated from 4.7.5 to 4.7.7, we occasionally see extremely high CPU usage for no apparent reason. For example, with fewer than 4 connections, CPU usage shoots up from 2% to over 50%.

I was peering inside the VM, and found one thread taking up an entire core.

That one particular thread is suddenly doing a whole lot of work for no apparent reason.
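
For anyone wanting to confirm the same symptom from inside a JVM, here is a minimal sketch of how a single busy thread can be spotted with the standard `java.lang.management.ThreadMXBean` API. It samples per-thread CPU time twice and reports the thread whose CPU time grew the most. The class name `HotThreadFinder` and its methods are hypothetical, and the code only sees threads of the JVM it runs in; running it inside the Wowza process (for example from a custom module) is an assumption, not something tested in this thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Wowza: finds the thread that burned
// the most CPU between two samples taken inside the current JVM.
public class HotThreadFinder {

    /** Records CPU time in nanoseconds for every live thread id. */
    public static Map<Long, Long> snapshot(ThreadMXBean mx) {
        Map<Long, Long> cpu = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long nanos = mx.getThreadCpuTime(id); // -1 if the thread has died
            if (nanos >= 0) {
                cpu.put(id, nanos);
            }
        }
        return cpu;
    }

    /** Returns the id of the thread with the largest CPU increase, or -1 if none. */
    public static long hottest(Map<Long, Long> before, Map<Long, Long> after) {
        long bestId = -1;
        long bestDelta = -1;
        for (Map.Entry<Long, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            if (delta > bestDelta) {
                bestDelta = delta;
                bestId = e.getKey();
            }
        }
        return bestId;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, Long> before = snapshot(mx);
        Thread.sleep(1000); // sampling window; one second is an arbitrary choice
        Map<Long, Long> after = snapshot(mx);

        long id = hottest(before, after);
        if (id > 0) {
            ThreadInfo info = mx.getThreadInfo(id, 20); // up to 20 stack frames
            if (info != null) {
                System.out.println("Hottest thread: " + info.getThreadName());
                for (StackTraceElement frame : info.getStackTrace()) {
                    System.out.println("  at " + frame);
                }
            }
        }
    }
}
```

The stack frames printed for the hottest thread are the useful part: they show what the runaway thread is actually executing at the moment of the sample.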

  • This was my 5th or 6th time seeing this behaviour since we updated.
  • It happens at low loads (1-4ish connections), it happens at high loads (200-400 connections). I can’t find a pattern.
  • It happens with Java 1.8.77 that the AMI ships with, and also with 1.8.191 that I upgraded to.
  • There seems to be no way to fix it except restarting the WSE; something we’d rather not do when we’re live.

We’re currently using an Amazon m5.large instance (2 cores, 8 GB RAM), which is covering our needs quite nicely. We’re seeing something like 40% usage during peak times.

We’re ingesting RTSP and have an old-fashioned RTMP stream, and a WebRTC stream for modern browsers. We’re transcoding audio from AAC to Opus for WebRTC purposes.

We’ve been testing the WebRTC implementation right from the beginning of the beta, and this is a new issue we’ve been seeing.

The symptoms are a lot like this guy’s from September: http://community.wowza.com/community/questions/48969/is-it-normal-to-have-high-cpu-usage-with-no-connec.html

I don’t have a thread dump, and I can’t reproduce the issue with any kind of consistency.
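
For the next time it happens, a rough sketch of how a hot thread can be tied to a thread dump: on Linux, `top -H -p <pid>` lists per-thread CPU usage with decimal thread ids, and a HotSpot `jstack <pid>` dump labels each thread with the same id in hexadecimal as `nid=0x...`. The helper below does that conversion and lookup. The class `NidMatcher` and its methods are hypothetical names, and the dump lines in the usage note are made up for illustration, not taken from a real Wowza dump.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper: maps a decimal thread id from `top -H`
// to the matching thread header line in jstack output.
public class NidMatcher {

    /** Converts a decimal OS thread id to the "0x..." form used after "nid=". */
    public static String toNid(long osThreadId) {
        return "0x" + Long.toHexString(osThreadId);
    }

    /** Returns the first dump line whose nid equals the given OS thread id. */
    public static Optional<String> findThreadHeader(String dump, long osThreadId) {
        // Word boundaries stop nid=0x1a2 from also matching nid=0x1a2b.
        Pattern p = Pattern.compile("\\bnid=" + Pattern.quote(toNid(osThreadId)) + "\\b");
        for (String line : dump.split("\\R")) {
            if (p.matcher(line).find()) {
                return Optional.of(line);
            }
        }
        return Optional.empty();
    }
}
```

Usage would look something like `NidMatcher.findThreadHeader(dumpText, 6699)`, which searches for `nid=0x1a2b`. The stack trace printed under the matching header in the dump is the one worth attaching to a support ticket.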

Hello @Kristoffer Cobley,

Feel free to reply to the last email you got from the support technician to continue working on the issue.
It looks like we wanted to review a thread dump taken while the issue is occurring so that we can investigate it.

Regards,

Alex

Hi @Alex C,

Good to know I can write them back. It could be months before the issue is provoked again (We only do a lot of streaming once per quarter, and I’m moving on to other tasks).

I just thought I’d let you guys know there’s currently a pretty serious performance-related issue, which is only mitigated by throwing cores at it.

Sorry you felt upset by this, but your ticket was submitted Dec 17th, 8 days before Xmas, and we tried to reach you for 6 days straight. For efficiency’s sake, and given the sheer volume of tickets we receive, we need to close tickets that are not responded to after several attempts. We hope you can understand this must be our process due to volume.

OK Rose, we’ll try to gather all the information, as soon as we can make it happen again …

Hi Kristoffer and all, did you get any solution on this? We are experiencing a very similar “hard to reproduce” issue since 4.7.8, but it might be that in 4.7.7 we did not run into the issue just by chance…

Silvia

You would need to submit your own support ticket, since the person who originally posted this had his own unique workflow, modules and transcoding needs. We’d like to accurately diagnose your issue by reviewing your specific config and logs.

There were some very big changes in 4.7.8 and we’d like to properly assist you by taking a closer look. Thanks.

https://www.wowza.com/support/open-ticket