WOWZ™ protocol triggered by upgrade to all v3.5

We’ve been running an origin/edge config successfully for two years, re-streaming IP network cameras with millions of views per year.

In the last two months, we moved our edge servers to new DCs and upgraded them to v3.5. No problems. The origin stayed on v3.1.0 build1410.

Yesterday we moved our origin server to a different DC and installed v3.5.

In old and new setups, origin also runs a local edge config.

The new all-3.5 cluster would not reliably stream RTMP to Flash clients. We use Flowplayer v3.2.14 with their RTMP plug-in v3.2.11 … nothing changed in that config.

RTMP streaming was equally unreliable whether we connected via the remote edge servers or via the edge local to the origin, which tends to rule out the new paths between the edge and origin DCs as the cause. HLS streaming to Apple devices (from the edge on the origin box) was working fine… in fact, better than before!

Unreliable = maybe 1 in 4 connection attempts would yield video, the others would just give us a black screen, even though Flowplayer thought the stream started.

Noticed that an all v3.5 config triggers your new WOWZ protocol … from your 3.5 manual:

The WOWZ™ protocol is a new TCP-based messaging protocol in Wowza Media Server 3.5 and is used for server-to-server communication. It’s enabled by default. If one of the Wowza Media Servers in the origin/edge configuration isn’t running Wowza Media Server 3.5, an RTMP connection will be established between the servers instead.

The only error message of note is the one we reported two years ago but with no solution…

NetConnectionConnection.connect: Failed to connect[localhost:1935]: org.apache.mina.common.RuntimeIOException: java.net.SocketException: Invalid argument: no further information

Java+Windows socket issue…

We still see this on our new edge servers (64-bit Windows Server 2008 + 64-bit Java 1.7.0_09) and the above on our new origin (32-bit Windows Server 2008 + 32-bit Java 1.7.0_10). First time we have seen it on a 32-bit WinOS and first time on a loopback local connection.

Random connect failures could explain it, especially if your new WOWZ protocol is using different code paths, timing, call parameters or whatever else triggers this issue with Java.
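One way to test the "random connect failures" theory outside of Wowza is to hammer the same address with plain TCP connects and count how many fail. The following is only a minimal sketch: `probe_connects` is a hypothetical helper name, and it exercises the operating system's socket stack from Python rather than the Java/MINA code path, so a clean result here would not rule out a JVM-specific problem.

```python
import socket
import time


def probe_connects(host, port, attempts=20, timeout=4.0, pause=0.0):
    """Open and close a TCP connection `attempts` times.

    Returns (ok_count, errors), where errors holds one repr() per failure.
    """
    ok = 0
    errors = []
    for _ in range(attempts):
        try:
            # create_connection resolves the host and completes the TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                ok += 1
        except OSError as exc:
            # OSError covers refused connections, timeouts and
            # "invalid argument" style socket failures
            errors.append(repr(exc))
        if pause:
            time.sleep(pause)
    return ok, errors
```

Calling `probe_connects("localhost", 1935)` on the origin box, and again from an edge against the origin's address, would show whether connects fail intermittently at the OS level or only inside the JVM.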

I’m contemplating adding a v3.1 install to the new origin box to try and reduce the number of variables in this move.

The only way we got around the SocketException errors for the last two years was to run all 32-bit WinOS + 32-bit Java … never saw the problem again. Now we need bigger servers and more than 4GB of memory space… hence 64-bit. We hoped that sticking with 32-bit on the new origin would dodge that bullet, but apparently not.

Any other suggestions or insights given the above symptoms?

Thanks!

According to your report, two things changed: the origin was updated to 3.5, and it was moved.

An origin should be the same version as each edge. Maybe the move is a factor? The log message you show is Wowza reporting a network problem.

NetConnectionConnection.connect: Failed to connect[localhost:1935]: org.apache.mina.common.RuntimeIOException: java.net.SocketException: Invalid argument: no further information

Richard

Okay, that would be helpful.

What is the two year old problem?

Richard

Zip up and send /conf and /logs folders from the new origin and an active edge to support@wowza.com. I may follow with some GC suggestions after I take a look at what you are doing now.

Also send live links for the tests in your previous post. What player are you testing with?

Are your edge and origin servers located far from each other? Describe the cluster distribution in the email above.

Include a link to this thread for reference.

Suggestions:

  1. Make sure all drivers are up to date, especially the network drivers.

  2. In conf/MediaCaster.xml try changing the ConnectionTimeout for the liverepeater MediaCaster type from 4000 to 6000

Richard

Tim,

You can do that if both servers are running Wowza version 3.5+

And in that case even if you use rtmp:// the stream between edge and origin will be over wowz://

Richard

Origin and edge with different versions is working fine for us. All v3.5 is not but that’s probably not the core issue.

I’m looking at this thread:

https://issues.apache.org/jira/browse/DIRMINA-379

… which suggests the error message is more than just a “network problem”.

A problem with the network stack… yes. An incompatibility / bug exposure between Java and Windows TCP/IP sockets … yes. An issue that can be resolved at the Java application level… maybe. An issue that can be resolved by tweaking TCP/IP behavior parameters… maybe.

Please don’t dismiss this issue as someone else’s problem. It’s a problem for one of your customers and has been for two years.

I will try and create a reproducible environment on our new origin server which is out of production right now.

What is the two year old problem?

Win64 Server 2008 and SocketException Invalid argument

Same SocketException error as above. May or may not be related to our new origin issue. Just saying that it shows in the logs and in our new edge servers which were recently installed with Win Server 2008 64-bit.

We use StartupStreams.xml and most times, all our streams from origin to edge start okay on the edge. Sometimes they don’t, and it then takes a random number of retries, walking up the free socket list. Occasionally one or two never start and we have to reboot. We don’t reset the edge servers too often so it’s manageable. We run the origin to edge streams 24/7 and once started successfully… don’t have a problem.

When we reported two years ago, we took the discussion offline and both you and Charlie punted on a solution. So we switched DCs and went all 32-bit until last month.

Our new origin should really be a 64-bit environment too but we were scared about running into this issue in a more critical place. Seems we may have hit it anyway.

It should not be happening on any OS-Wowza combination.

I will try and create a reproducible environment on our new origin server which is out of production right now.

Very reproducible as follows using your Flash RTMP Player example (to rule out our Flowplayer config):

Always works

============

Server: rtmp://(oldOriginServer)/liveorigin

Stream: camorigin.stream

Server: rtmp://(newOriginServer)/liveorigin

Stream: camorigin.stream

Server: rtmp://(oldOriginServer)/liveedge

Stream: cam.stream

Unreliable (black screen more than 75% of the time)

=======================================

Server: rtmp://(newOriginServer)/liveedge

Stream: cam.stream

*** Connecting to the edge app on the new origin server is the only case that fails in these tests.

cam.stream contains :

rtmp://localhost:1935/liveorigin/_definst_/camorigin.stream

camorigin.stream contains :

rtsp://username:password@(ourCamIP):554/axis-media/media.amp?videocodec=h264&streamprofile=TEST

Both streams are started in StartupStreams.xml …

		<!-- Cam Origin Stream -->
		<StartupStream>
			<Application>liveorigin/_definst_</Application>
			<MediaCasterType>rtp</MediaCasterType>
			<StreamName>camorigin.stream</StreamName>
		</StartupStream>
		<!-- Cam Edge Stream -->
		<StartupStream>
			<Application>liveedge/_definst_</Application>
			<MediaCasterType>liverepeater</MediaCasterType>
			<StreamName>cam.stream</StreamName>
		</StartupStream>
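
To double-check which application and MediaCaster type each startup stream is bound to, the entries above can be listed with a short script. This is a sketch only: `list_startup_streams` is a hypothetical helper, and the `<Root><StartupStreams>` wrapper in the sample is an assumption about the rest of the file, which is not shown in this thread. The function itself only looks for `StartupStream` elements, so it does not depend on that wrapper.

```python
import xml.etree.ElementTree as ET

# Sample mirroring the two entries quoted above.
# The <Root><StartupStreams> wrapper is assumed, not taken from the thread.
SAMPLE = """
<Root>
  <StartupStreams>
    <!-- Cam Origin Stream -->
    <StartupStream>
      <Application>liveorigin/_definst_</Application>
      <MediaCasterType>rtp</MediaCasterType>
      <StreamName>camorigin.stream</StreamName>
    </StartupStream>
    <!-- Cam Edge Stream -->
    <StartupStream>
      <Application>liveedge/_definst_</Application>
      <MediaCasterType>liverepeater</MediaCasterType>
      <StreamName>cam.stream</StreamName>
    </StartupStream>
  </StartupStreams>
</Root>
"""


def list_startup_streams(xml_text):
    """Return (application, mediacaster_type, stream_name) for each StartupStream."""
    root = ET.fromstring(xml_text)
    entries = []
    # iter() walks the whole tree, so the nesting depth of the entries does not matter
    for node in root.iter("StartupStream"):
        entries.append((
            (node.findtext("Application") or "").strip(),
            (node.findtext("MediaCasterType") or "").strip(),
            (node.findtext("StreamName") or "").strip(),
        ))
    return entries
```

For the sample, `list_startup_streams(SAMPLE)` returns the origin entry (type `rtp`) followed by the edge entry (type `liverepeater`), which matches the chain described above: cam.stream on liveedge pulls camorigin.stream from liveorigin.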

“INFO stream play cam.stream” is always logged to the console when attempting a client connect… whether video appears or not.

We use ModuleHotlinkDenial which always reports first.

No errors logged on server.

Only one warning on startup as follows:

LiveStreamPacketizerSmoothStreaming.handlePacket[liveorigin/_definst_/camorigin.stream]: Fragment duration greater than suggested range of 1-4 seconds. Adjust keyframe interval accordingly: Fragment durations: [5.0,5.0,5.0]

Any ideas?

Thank you Richard. I have just emailed everything you requested to support@wowza.com.

Let me know if you need anything else.

David.

Suggestions:

  1. Make sure all drivers are up to date, especially the network drivers.

  2. In conf/MediaCaster.xml try changing the ConnectionTimeout for the liverepeater MediaCaster type from 4000 to 6000

Richard

  1. Network driver is old(ish) but latest on MSFT update.

  2. Tried ConnectionTimeout of 6000 instead of 4000 … no difference.

Hi There -

Not sure that my issue is the same, but I too am experiencing a very similar issue when upgrading systems to 3.5. All of my systems are running either CentOS 6.2 or CentOS 6.3, each with more than enough RAM.

I have four transcoding/origin servers: 3 are running version 3.1.2 and the 4th is running 3.5. I currently have 8 edge servers delivering streams: 4 are running V3.1.2 and the remaining 4 are running V3.5. Each of the 8 edge servers loads the same streams from all 4 transcoding/origin servers via RTMP URLs configured in the aliasmap.stream.txt file and started via StartupStreams.xml.

HLS streaming, as David reports, works great. All IP camera streams ingested into transcoder/origin servers 1, 2 and 3 running V3.1.2 seem fine and stable. The RTMP streams being pulled from transcoder/origin 4 running V3.5 appear to start half the time but show black. If I try to reload the stream with the player a few times, it works… I imagine that each time I try to restart the stream, it pulls the stream from either a V3.5 edge server or a V3.1.2 edge server. If it is a V3.5 edge server, I get black… ??

I only began seeing these issues after introducing V3.5. Is WOWZ being used between the V3.5 edge server and the V3.5 origin automatically? Should the config be different?

Thank you in advance,

Tim

Hi There -

I notice in article: https://www.wowza.com/docs/how-to-configure-a-live-stream-repeater

It says to use the following as the originURL of the edge application:

<originURL>wowz://[wowza-origin-address]:1935/liveorigin</originURL>

Because I have multiple origin servers, I don’t enter an OriginURL in my edge applications. Instead, in the aliasmap.stream.txt file I map each stream name to its RTMP URL on the appropriate origin/transcoding server, and I start each stream on each edge server via StartupStreams.xml.

Instead of entering each stream URL as follows:

stream1_360p=rtmp://origin01.domain.com:1935/transratelive/stream1_360p
stream1_576p=rtmp://origin01.domain.com:1935/transratelive/stream1_576p

Should I use something like the following?

stream1_360p=wowz://origin01.domain.com:1935/transratelive/stream1_360p
stream1_576p=wowz://origin01.domain.com:1935/transratelive/stream1_576p
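
If the entries were to be switched over, the change is a plain scheme swap on each `alias=URL` line. A minimal sketch, assuming the `alias=rtmp://...` layout shown in the entries above (`to_wowz` is a hypothetical helper name, and lines that do not match are returned unchanged):

```python
def to_wowz(line):
    """Rewrite one 'alias=rtmp://...' entry to use wowz://; leave other lines untouched."""
    alias, sep, url = line.partition("=")
    url = url.strip()
    if sep and url.startswith("rtmp://"):
        # Keep everything after the scheme exactly as it was
        return f"{alias}=wowz://{url[len('rtmp://'):]}"
    return line
```

Applied line by line to a copy of aliasmap.stream.txt, this would leave comments and non-RTMP entries as they are.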

Thanks again,

Tim

OK… So Wowza knows when both origin and edge are running V3.5, and if the URL is rtmp:// it will automatically use wowz:// instead. If that is the case, it is probably better that I leave it as rtmp:// so that everything is consistent… until I get the rest of the servers up and running on V3.5. I am a little hesitant to upgrade the others until the issues I noted are solved or determined not to be a V3.5 issue.

Thanks Richard!

Tim