DirectRandomAccessReader.read() and seek() cause a performance bottleneck

Hi,

I’m currently tuning the performance of Wowza 4.0.3. With the following hardware and test setup, we can reach about 8-9 Gbps.

  • 2 × Intel® Xeon® CPU E5-2690 0 @ 2.90GHz (32 logical processors in total)

  • 32 GB RAM

  • Only one VOD file: H.264 encoded, 5 Mbps, 1.57 GB, located in {WowzaInstallDir}/content, which is a tmpfs directory mounted with the command “mount -t tmpfs -o size=2048m tmpfs {WowzaInstallDir}/content”

  • Clients use HLS to access the VOD video

    We start 1800 clients and Wowza works well. The machine’s performance is fine; the output of “top” and VisualVM is as follows:

    top - 08:50:47 up 10 days, 16:45, 9 users, load average: 8.21, 10.79, 10.98
    Tasks: 596 total, 1 running, 595 sleeping, 0 stopped, 0 zombie
    Cpu(s): 19.1%us, 7.5%sy, 0.0%ni, 71.8%id, 0.1%wa, 0.0%hi, 1.6%si, 0.0%st
    Mem: 32918272k total, 26107320k used, 6810952k free, 187908k buffers
    Swap: 68517872k total, 3006824k used, 65511048k free, 12386628k cached

      PID USER PR NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
     4108 root 18  0 11.3g 9.7g  12m S 594.0 30.9   6032:28 java
     4262 root 16  0 2443m 102m 2740 S  45.5  0.3 358:35.09 python
     7597 root 16  0 2438m  87m 2740 S  43.8  0.3 354:20.58 python

    But if we start 2000 clients, Wowza can’t provide normal service and most of the clients time out. The output of top and VisualVM is as follows. In VisualVM we can also see that all ServerTransportThreads are busy.

    top - 08:57:34 up 10 days, 16:52, 9 users, load average: 36.05, 25.62, 17.19
    Tasks: 600 total, 1 running, 599 sleeping, 0 stopped, 0 zombie
    Cpu(s): 23.1%us, 9.1%sy, 0.0%ni, 66.0%id, 0.1%wa, 0.0%hi, 1.8%si, 0.0%st
    Mem: 32918272k total, 26271264k used, 6647008k free, 188248k buffers
    Swap: 68517872k total, 3006824k used, 65511048k free, 12395196k cached

      PID USER PR NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
     4108 root 18  0 11.3g 9.7g  12m S 724.9 30.9   6079:49 java
    19783 root 16  0 2406m  62m 2740 S  48.4  0.2   1:29.96 python
     7379 root 15  0 2368m 101m 2740 S  44.1  0.3 342:09.06 python
     7073 root 16  0 2362m  87m 2740 S  40.8  0.3 356:30.99 python

    It looks like the performance bottleneck is DirectRandomAccessReader.read() and seek(), called in ServerTransportThread, even though we already use tmpfs to store the VOD file. Do you have any suggestions for overcoming this bottleneck?

    Thanks

Hi,

How are your test clients connecting to the stream? Are they connecting via the network or via localhost? What protocol are they using?

A 5 Mbps file × 2000 connections is 10 Gbps, so is it possible you are reaching the limits of the network connection (assuming a 10 Gbps network)?
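
As a rough check of that estimate, here is a minimal Python 3 sketch of the arithmetic. The function name is made up for illustration, and it counts only the media payload, ignoring HTTP and TCP overhead.

```python
def aggregate_gbps(bitrate_mbps, clients):
    """Total payload bandwidth in Gbps for `clients` streams of `bitrate_mbps` each."""
    return bitrate_mbps * clients / 1000.0


if __name__ == "__main__":
    # 5 Mbps per client, for the two client counts discussed in this thread
    for clients in (1800, 2000):
        print(clients, "clients:", aggregate_gbps(5, clients), "Gbps")
```

At 1800 clients this gives 9 Gbps, which lines up with the 8-9 Gbps reported in the first post.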

Roger.

Hi,

Thanks for the info.

Can you please let us know how you are generating the connections. It would be interesting to replicate your configuration in order to perform some tests.

Roger.

Hi,

The read() & seek() methods basically wrap the same methods in the Java RandomAccessFile.

I’ll run some tests tomorrow to see if we can see anything.

Roger.

Hi,

I’m not sure if there is an issue with Wowza or with your script. I can start around 2500 connections with no issues, but when I start more, something goes wrong and thousands of connections are started at the same time.

Looking at the script, when the thread hits the end of the while loop, either due to an error or a stream end, it restarts immediately without sleeping. At this stage, an error is thrown again immediately and it loops again quickly. I’m not sure if the script is then making a bad request or is just requesting the URLs too quickly, but when it happens, Wowza uses up all of its maximum allowed open files very quickly. I think you may need to put the thread to sleep for a bit before retrying, or shut down that thread and create a new one.
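
One way to pace such a loop, sketched in Python 3 under some assumptions: `play_once` is a hypothetical callable standing in for one pass through the playlist (it raises on error), and the delay values are arbitrary defaults, not tuned recommendations.

```python
import time


def paced_loop(play_once, iterations, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Run play_once() `iterations` times, always pausing before the next pass.

    After a successful pass the pause is base_delay. After each consecutive
    failure the pause doubles, capped at max_delay.
    Returns the list of pauses taken, which is handy for testing.
    """
    pauses = []
    delay = base_delay
    for _ in range(iterations):
        try:
            play_once()
            delay = base_delay                 # success: back to the short pause
        except Exception:
            delay = min(delay * 2, max_delay)  # failure: back off before retrying
        pauses.append(delay)
        sleep(delay)
    return pauses
```

With this shape, a client that keeps failing waits longer and longer instead of hammering the server in a tight loop.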

Can you see if you can get more logging for when the stream errors occur? It may give us a clue as to what the server has sent that the script thinks is an error.

Roger.

I had a look at the errors being thrown and it may be that the server is running out of TCP sockets. This is the error: Thread 39 the exception is: <urlopen error [Errno 99] Cannot assign requested address>

Doing a netstat, I see nearly 30k open sockets in TIME_WAIT.
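
The same count can be taken without netstat by reading /proc/net/tcp on Linux. This is a minimal Python 3 sketch: the function names are made up for illustration, only a few state codes are mapped, and /proc/net/tcp covers IPv4 sockets only (IPv6 sockets are in /proc/net/tcp6).

```python
from collections import Counter

# Kernel TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "08": "CLOSE_WAIT", "0A": "LISTEN"}


def count_tcp_states(lines):
    """Count sockets per state, given the lines of /proc/net/tcp (header first)."""
    counts = Counter()
    for line in lines[1:]:            # skip the column header
        fields = line.split()
        if len(fields) < 4:
            continue                  # ignore blank or truncated lines
        code = fields[3]              # fourth column is the state code
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def count_tcp_states_from_proc(path="/proc/net/tcp"):
    """Read the kernel table at `path` and count sockets per state."""
    with open(path) as table:
        return count_tcp_states(table.readlines())
```

Calling `count_tcp_states_from_proc()["TIME_WAIT"]` on the test machine would give the figure that the netstat check reports.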

Normally, HTTP streaming will reuse the same socket for a number of requests. It looks like your script is creating a new socket for each request which could cause issues.
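
For comparison, a client that keeps one connection open across requests could look like the following Python 3 sketch using the standard `http.client` module. The function names are made up for illustration, and whether the socket is actually kept open also depends on the server honouring HTTP/1.1 keep-alive.

```python
import http.client


def fetch_all(conn, paths):
    """Request each path in turn on the same connection object."""
    results = []
    for path in paths:
        conn.request("GET", path)
        response = conn.getresponse()
        body = response.read()        # drain the body so the socket can be reused
        results.append((response.status, len(body)))
    return results


def fetch_with_keepalive(host, port, paths, timeout=10):
    """Open one HTTP/1.1 connection, fetch all paths over it, then close it."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        return fetch_all(conn, paths)
    finally:
        conn.close()
```

A test client would call `fetch_with_keepalive` once per playlist, passing the segment paths, rather than opening a new URL per segment.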

Update: Setting net.ipv4.tcp_tw_recycle = 1 allows the sockets to be reused quickly. With this set, I can get around 5000 connections. Over that, I am running out of CPU. Please try that setting with your tests. I also added a sleep to the end of the thread loop.

Roger.

Hi,

I was just using a single file and it was around 1 Mbps. I was hitting CPU limits with only 8 cores available.

Also look at the thread pool sizes. Increasing these from the default settings may help, especially as you have 32 cores.

Roger.

Hi,

I will try with a larger file.

I’m not sure what you would gain by implementing your own reader, as the read & seek methods in the DirectRandomAccessReader class are about as close as you can get to the Java RandomAccessFile methods that they call.

That said, you can create your own reader class. You would have to implement IRandomAccessReader & ITrackRandomAccessReaderPerformance and handle all their methods. See https://www.wowza.com/docs/media-cache-implementation-that-will-first-try-to-access-content-locally-before-getting-it-from-a-remote-source-mediacachelocalcontent as an example of how to create a custom reader class. The source code is available in the Module Collection download.

Roger.

The OS parameters can be set with the sysctl command and made persistent with the /etc/sysctl.conf file.
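
To check which values are currently in effect, the same parameters can be read from /proc/sys, where each dot in the sysctl name becomes a directory separator. This is a minimal Python 3 sketch with made-up function names; it assumes plain key names like the ones in this thread.

```python
def sysctl_path(name, root="/proc/sys"):
    """Map a sysctl name such as net.ipv4.tcp_tw_reuse to its file under /proc/sys."""
    return root + "/" + name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if the key is not present."""
    try:
        with open(sysctl_path(name, root)) as handle:
            return handle.read().strip()
    except (IOError, OSError):
        return None


if __name__ == "__main__":
    for key in ("net.ipv4.tcp_tw_reuse", "net.core.rmem_max"):
        print(key, "=", read_sysctl(key))
```

A key that the running kernel does not provide simply shows up as None.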

The Wowza parameters you mention can be set via the Wowza Streaming Engine Manager UI in the Server > Performance Tuning section.

Roger.

Hi,

The only setting I have changed from the defaults is net.ipv4.tcp_tw_reuse, which I enabled. The Wowza settings are the default production settings.

Please note, I have really only needed to change the above setting to allow your python script to work properly. You shouldn’t need to do so normally as the real connections wouldn’t be coming from localhost or all from the same machine.

Roger.

Thanks. The clients connect via localhost, so the network should not be the limit. We can also see that with 2000 clients the Bytes Out rate drops to about 6-7 Gbps.

Thanks for the rapid response.

We use a Python script as a simple client; its content is below. To start 200 client connections, run the command “python wowzatest2.py 200 &”, where ‘wowzatest2.py’ is the name of the Python script file. To start 2000 connections, run the command 10 times, which avoids the performance limits of Python itself.

Could you provide some details about DirectRandomAccessReader.read() and seek()? It looks like they are native methods?

# wowzatest2.py: simple HLS test client (Python 2)
import urllib2
import random
import time
import sys
import logging
import thread

urlString = "http://10.0.63.180:1935/vod/"
id_range_start = 1
id_range_end = 1
ERRORNUM = 0
# Console log handler; stays None while the handler setup in initLogging() is commented out.
CONSOLE = None

def initLogging(logFilename = None):
    """Init for logging
    """
    if logFilename is None:
        logFilename = "run.log"
    logging.basicConfig(
                    level    = logging.DEBUG,
                    format   = 'LINE %(lineno)-4d  %(levelname)-8s %(message)s',
                    datefmt  = '%m-%d %H:%M',
                    filename = logFilename,
                    filemode = 'w')
    # define a Handler which writes INFO messages or higher to the sys.stderr
    #CONSOLE = logging.StreamHandler();
    #CONSOLE.setLevel(logging.INFO);
    # set a format which is simpler for console use
    #formatter = logging.Formatter('LINE %(lineno)-4d : %(levelname)-8s %(message)s');
    # tell the handler to use this format
    #CONSOLE.setFormatter(formatter);
    #logging.getLogger().addHandler(CONSOLE);
#-------------------------------------------------------------------------------
def destoryLogging():
    # Remove the console handler only if one was actually installed.
    if CONSOLE is not None:
        logging.getLogger().removeHandler(CONSOLE)

def get_url_from_wowza_m3u8(url,vod_id,threadNo):
    """Fetch the playlist at url and return the URL of the .m3u8 it references (None on error)."""
    try:
        playm3u8 = None
        rawData = urllib2.urlopen(url).readlines()
        for data in rawData:
            if '.m3u8' in data:
                # strip() drops the trailing newline so it does not end up in the URL
                playm3u8 = urlString + "mp4:POI_" + vod_id + ".mp4/" + data.strip()
        logging.info("Thread "+str(threadNo)+" get_url_from_wowza_m3u8() return: " + str(playm3u8))
        return playm3u8
    except Exception, e:
        logging.error("Thread "+str(threadNo)+" Error! get_url_from_wowza_m3u8() via url: " + str(url))
        logging.error("Thread "+str(threadNo)+" the exception is: " + str(e))
        return None

def get_ts(playm3u8,vod_id,threadNo):
    """Download every .ts segment listed in playm3u8, pacing requests by segment duration."""
    if playm3u8 is None:
        logging.error("Thread "+str(threadNo)+" get_ts() skipped: no playlist URL")
        return
    try:
        tsDuration = 0
        rawData = urllib2.urlopen(playm3u8).readlines()
        #logging.info("----------m3u8 content: " + str(rawData))
        for data in rawData:
            if 'EXTINF' in data:
                # "#EXTINF:<duration>,<title>": take the text between ':' and ','
                tsDuration = int(round(float(data.split(':', 1)[1].split(',')[0])))
                logging.info("duration: " + str(tsDuration))
            if '.ts' in data:
                tsurl = urlString + "mp4:POI_" + vod_id + ".mp4/" + data.strip()
                beginTime = time.time()
                #logging.info("----------start time: " + str(beginTime))
                get_ts_from_url(tsurl, threadNo)
                endTime = time.time()
                #logging.info("----------end time: " + str(endTime))
                logging.info("Thread "+str(threadNo)+" get_ts:ok")
                # Sleep for whatever is left of the segment duration after the download.
                sleep_period = tsDuration - int(endTime - beginTime)
                logging.info("sleep " + str(sleep_period))
                if sleep_period >= 0:
                    time.sleep(sleep_period)
                else:
                    logging.error("Thread "+str(threadNo)+" get ts spend "+str(endTime - beginTime)+",that's too long")
    except Exception, e:
        logging.error("Thread "+str(threadNo)+" get_ts() via url: " + str(playm3u8))
        logging.error("Thread "+str(threadNo)+" the exception is: " + str(e))

def url_of_wowaz(urlString,vod_id):
    url = urlString + "mp4:POI_" + vod_id + ".mp4/playlist.m3u8"
    return url

def get_ts_from_url(ts_url, threadNo):
    """Download one segment and discard it; threadNo is only used for logging."""
    try:
        tsData = urllib2.urlopen(ts_url).read()
#        ts_file_name = str(time.time())
#        f_ts = file(ts_file_name + '.ts','wb')
#        f_ts.write(tsData)
#        f_ts.close()
        return "ok"
    except Exception, e:
        logging.error("Thread "+str(threadNo)+" get_ts_from_url() via url: " + str(ts_url))
        logging.error("Thread "+str(threadNo)+" the exception is: " + str(e))
        return "error"

def test(no):
    # Stagger the start of each thread by a random 0-13 seconds.
    randomtime=random.randint(0,13)
    time.sleep(randomtime)
    # Note: the loop restarts immediately, with no pause, after an error or at the end of the stream.
    while True:
        logging.info(str(no)+" start:")
        vod_id = str(random.randint(id_range_start, id_range_end))
        playlist_m3u8 = url_of_wowaz(urlString,vod_id)
        logging.info(str(no)+" the playlist_m3u8 is " + str(playlist_m3u8))
        ts_m3u8 = get_url_from_wowza_m3u8(playlist_m3u8,vod_id,no)
        logging.info(str(no)+"---------- the ts_m3u8 is " + str(ts_m3u8))
        get_ts(ts_m3u8,vod_id,no)
        logging.info(str(no)+"---------- get_all_ts:ok")
        logging.info(str(no)+"---- end:")

if __name__=='__main__':
    logname = str(time.time())+ ".log"
    initLogging(logname)
    # Start N-1 worker threads; the main thread runs as client 0.
    for i in range(1,int(sys.argv[1])):
        print "start thread ",i
        thread.start_new_thread(test,(i,))
    test(0)
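
For reference, the playlist-parsing step of a client like this can be sketched as a standalone Python 3 function. It assumes the usual HLS layout, where each `#EXTINF:<duration>,<title>` line is followed by the segment URI; the function name is made up for illustration.

```python
def parse_media_playlist(text):
    """Return a list of (duration_seconds, uri) pairs from an HLS media playlist."""
    segments = []
    duration = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<optional title>": keep the part before the comma
            duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#"):
            # Any non-empty, non-tag line is a segment URI.
            segments.append((duration, line))
            duration = None
    return segments
```

Parsing the duration up to the comma avoids depending on a fixed number of characters, so values such as 9.5 and 10.01 are both read correctly.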

Hi Roger, could you give me some suggestions? Thanks.

Or could you give some more detailed information about DirectRandomAccessReader.read() and seek(), for example the Java code? I’m a software engineer, I like source code :). Thank you very much.

Hi,

Today I copied the VOD file to a second VOD file and had the clients access the two files at random. The maximum number of client connections then increased to 2400. It looks like there is some limit on file access, maybe a file lock?

DaLi

Thanks, I will set net.ipv4.tcp_tw_recycle = 1, modify the client script, and then run a test.

How many VOD files do you use? And what are their bitrate, size, and frame resolution?

What do you suggest for the thread pool sizes? At first I used the default 600/400, and the maximum number of client connections was 1400. Then I changed it to 64/32, and the maximum increased to 1800.

Wait a moment please, I will post the test result with net.ipv4.tcp_tw_reuse = 1.

net.ipv4.tcp_tw_reuse = 1 took effect: there are almost no TIME_WAIT connections, but the maximum number of supported client connections did not change; it remained at about 1800.

The netstat output follows (12455 is the Wowza engine’s PID):

[root@boss ~]# netstat -nap |grep 12455 | grep TIME_WAIT | wc -l
0
[root@boss ~]# netstat -nap |grep 12455 | grep ESTABLISHED | wc -l
1721
[root@boss ~]# netstat -nap |grep 12455 | wc -l
1781

No error logs appeared in Wowza’s logs.

The output of VisualVM’s “thread” tab is as follows; green means the thread’s state is running.

It looks like the performance bottleneck in my test case is different from yours. Could you test a VOD file at 5 Mbps with 1280*720 frame resolution? Thanks.

I also transcoded my test VOD file to 1 Mbps and tested again. That reached a maximum of 2400 client connections, and then the picture above appeared again.

Could you give me your parameters, including the OS tuning parameters and Wowza’s parameters? Thanks.

Hello Roger, are you tracking this issue? Thanks.

Another question: can I use an extended class instead of DirectRandomAccessReader? I would like to override its read and seek methods.

Hello, Roger

Could you tell me how you set the server’s parameters, including the OS parameters (net.ipv4.tcp_tw_reuse, net.core.rmem_max, etc.) and Wowza’s parameters (GC method, number of ServerTransportThreads, etc.)?

Thanks.

I did not say that clearly.

I mean: which system parameters have you changed, and to what values?

Could you run the commands “sysctl -p” and “cat /proc/cpuinfo” on your test server and paste the output here?

Thanks.