EC2 Capacity Planning

I have experience running 3 Wowza instances in EC2 (Singapore) and would like to add more servers in EC2.

The current setup:

1 Origin

2 Edge + AWS Load Balancing

I’m using the Ubuntu Server m2.xlarge instance type, but I have no idea what maximum bandwidth the instance can handle.

I assume:

250 Mb/s max throughput per small instance

500 Mb/s max throughput per large instance

1000 Mb/s max throughput per xlarge instance

So that means my 2 load-balanced boxes are capable of 2 Gb/s max throughput,

and if I stream 500 kbps video, the theoretical maximum will be 4,000 concurrent users.

In order to support 20k users concurrently, I’ll need 5 xlarge instances, am I right?
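A quick sanity check of that arithmetic (just a sketch, using the per-instance throughput figures I assumed above):

```python
import math

def edges_needed(viewers, stream_kbps, per_edge_mbps):
    """Back-of-the-envelope: edge instances required for a given audience."""
    aggregate_mbps = viewers * stream_kbps / 1000.0  # total egress required
    return math.ceil(aggregate_mbps / per_edge_mbps)

# Assumptions from above: 500 kbps streams, ~1000 Mb/s per xlarge edge.
print(edges_needed(4000, 500, 1000))    # -> 2 (matches the two current edges)
print(edges_needed(20000, 500, 1000))   # -> 10
```

By that math, 20k viewers at 500 kbps is 10 Gb/s aggregate, which would actually take 10 xlarge edges rather than 5, if those throughput assumptions hold.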

Amazon AWS also has a feature to fire up instances when an alarm is triggered. I’m going to limit each server to 800 Mb/s throughput and launch more instances beyond that.
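For illustration only (the alarm name, Auto Scaling group, and policy ARN below are hypothetical placeholders, and this assumes a scale-out policy already exists), the 800 Mb/s trigger could be expressed as a CloudWatch alarm on NetworkOut, e.g. with boto3:

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="ap-southeast-1")  # Singapore

# NetworkOut is reported in bytes per period: 800 Mb/s sustained over a
# 5-minute period is roughly 800e6 / 8 * 300 = 3e10 bytes.
cloudwatch.put_metric_alarm(
    AlarmName="wowza-edge-800mbps",                    # hypothetical name
    Namespace="AWS/EC2",
    MetricName="NetworkOut",
    Dimensions=[{"Name": "AutoScalingGroupName",
                 "Value": "wowza-edge-asg"}],          # hypothetical group
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=3.0e10,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["<scale-out-policy-arn>"],           # your scaling policy
)
```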

This is because, on a normal basis, our client only has 400 concurrent users max, but on some occasions (a live football match, say), up to 10,000 users will watch the event.

Any thoughts?

@AzrilNazliAlias

This is our understanding:

150 Mb/s max throughput per m1.small instance

250 Mb/s max throughput per m1.large instance

350 Mb/s max throughput per m1.xlarge instance

Richard

I don’t know about the m1.medium. The m1.small, large, and xlarge are supposed to get about 150, 250, and 350 Mb/s, respectively. Those are the only ones we have some idea about; however, that info is not recent.

Richard

Amazon is a bit obtuse about this. Except for the quadruple cluster types, which have specific info, every other instance type is rated “I/O Performance: Low | Moderate | High” with no qualification of what Low, Moderate, or High mean.

http://aws.amazon.com/ec2/instance-types/

It is also thought that the m1.small has more flexible resource allocation than the larger types. There is a very early post from Dave on this (more than 4 years old) to that point. I do not know how current that info is, but customers do occasionally complain about m1.small having less throughput or memory than expected.

Richard

I’ve just recently run tests on most of the EC2 instance sizes, and these are my results.

SIZE                        MAX Mb/s   MAX CONN
Micro                         100        260
Small                         100        320
Medium                        140        360
Large                         210        440
Extra Large                   410        710
2G Extra Large                520        980
2G Double Extra Large         650       1400
Quadruple Extra Large         710       2900
Eight Extra Large             710       3300
HIO Quadruple Extra Large    1100       4820

This test is NOT scientific at all, and if you base your production setup on this data and it fails, you only have yourself to blame. This was a quick test, done as follows:

1.- Launch a Wowza instance at the appropriate size

2.- From 25 servers outside Amazon, hit the sample.mp4 VOD file until errors start popping up.

3.- Record Mb/s at time of first error.

4.- From 25 servers outside Amazon, hit dummy.mp4 (160x120 @ 50 kbps) until errors start popping up.

5.- Record connected users at time of first error.

The results have been rounded down to the tens.
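For anyone wanting to try something similar, here is a rough sketch of the client side of steps 2-5 (the URL is a hypothetical placeholder, it assumes the test file is reachable over plain HTTP, and a real run would spread the load across many machines as described above):

```python
# Rough load-generator sketch: ramp up concurrent fetches of a test file
# and note the level at which errors first appear.
import concurrent.futures
import urllib.request

URL = "http://your-wowza-host:1935/vod/sample.mp4"  # hypothetical URL

def fetch(_):
    try:
        with urllib.request.urlopen(URL, timeout=30) as resp:
            return len(resp.read())  # bytes actually delivered
    except Exception:
        return -1                    # treat any failure as an error

for workers in (50, 100, 200, 400, 800):
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetch, range(workers)))
    errors = sum(1 for r in results if r < 0)
    print(f"{workers} concurrent: {errors} errors")
    if errors:
        break  # record Mb/s and connection count at first failure
```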

David, the reason I did two tests was to differentiate between hitting a bandwidth limit vs. a connection limit.

The sample.mp4 file is a good example of a standard video; at 525 kbps it fills up the pipes nicely. This tested max bandwidth.

The dummy file is ten times smaller so that I can ramp up connections without worrying about hitting the bandwidth cap. This tested max connections without considering bandwidth (much).

If you’d like to send me your numbers or post them, we can invite others to do the same and maybe get some real, hard data about this and make a nice list of the numbers.

Good reading on this: http://blog.ianbeyer.com/tag/superquad/

Thanks

Does anyone at Wowza have insight into what bandwidth you can expect from a “medium” instance – and also from a “2xlarge” instance?

I have a live event coming up that already has over 10,000 people registered to view the event online. I’m curious if there is a ballpark number floating out there that might give me an idea of what the 2xlarge can handle over a standard xlarge AMI, if anything.

David L Good

Richard

Thanks for the very fast reply. In Ian Beyer’s blog post (linked a few posts up) he seems to indicate that an xlarge might see about 450 Mb/s. Obviously, if you’re sharing this resource with other “neighbors” on a shared server it’s difficult to know for sure… and I like the assumed 350 number, as it seems like a more conservative guesstimate.

So, while you may not have a ballpark for a 2xlarge… do you at least know if it gets “more” bandwidth allocated to it than an “xlarge” instance? I’m not asking for numbers, just wondering if it’s “more” than an xlarge, “similar” to an xlarge, or you honestly don’t know. No one on the Amazon forums seems to be talking about it.

I’ll be running another large event this weekend… I’m segregating iOS traffic and Flash traffic to different EDGE locations, so I’ll also keep an eye out and see what kind of numbers I can pull from that. If I discover anything useful I’ll post it; maybe it will be helpful to other users.

On that note: seeing as how this question seems to pop up like a bad disease on the forums every few weeks, is there any “sticky” post on this topic? Maybe a thread where users can report their experiences with usage/bandwidth/CPU performance? If not, it might be a helpful thread to have… and it might save the 1, 2, 3, 5, and 0 keys on your keyboard from wearing out early by always having to type the same numbers over and over each time you reply to this recurring question :).

David L Good

Richard

Thanks! I would agree with other users about the m1.small. I was both surprised by how well some things worked and disappointed by other aspects of the m1.small.

I have NEVER been able to see 150 Mb/s out of a small instance. For example, for some time now I’ve been running a live event each Wednesday evening that requires one ORIGIN and several EDGE servers (the number of EDGE servers has been anywhere from 3 to 20, depending on event demand).

Each EDGE for this event receives two streams: a 350 kbps stream and a 700 kbps stream. These are all Flash-based (all iOS traffic, for example, is pushed to a different EDGE setup).

The limitation I’ve seen over and over is not the bandwidth but the number of users connected. Right around the 200-connection point, the EDGE servers will start to get bottlenecked and video will start to suffer for clients connected to that EDGE. The bandwidth going OUT is usually around the 70 Mb/s mark… maybe a tad higher. Of the 200 connections, most are connecting to the 350 kbps stream, with just 5-10 connecting to the 700 kbps stream.
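For what it’s worth, those numbers hang together. A quick check of the aggregate, assuming roughly 190 viewers on the low stream and 10 on the high one (that split is my guess):

```python
# Rough egress at the ~200-connection bottleneck (viewer split assumed).
low  = 190 * 350 / 1000   # ~66.5 Mb/s from the 350 kbps stream
high =  10 * 700 / 1000   #  ~7.0 Mb/s from the 700 kbps stream
print(low + high)         # ~73.5 Mb/s -- right around the observed 70 Mb/s
```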

That said, I’m guessing the limitation with the m1.small is some sort of processing resource, as I can see the “processes in the run queue” in CACTI (these are Wowza 2.X AMIs) start to exceed the 1.0 mark. CACTI’s “processes in the run queue” is usually my favorite graph to watch for these setups, as that data is usually right on the mark in letting me know whether a Wowza instance is going to have issues handling all the video requests.

If I launch an m1.small instance and just let it sit (doing absolutely nothing), the “processes in the run queue” in CACTI will show a fluctuating number… jumping from 0.18 to 0.30 over the course of a minute or two, and then back down again… over and over. I’m not sure exactly what’s happening… but that’s what I’ve seen from the graphs.

So I’ve learned to keep my connection counts on those particular EDGE instances to around 150 max before launching additional EDGE servers. And even though all the EDGE servers are load balanced, some will have moderately low “processes in the run queue” during a live event (usually around 0.25-0.35) while other EDGE servers seem to jump to 0.85 every couple of minutes. Odd.
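If you don’t have CACTI handy, the “processes in the run queue” graph is essentially the system load average, so a tiny watchdog (a sketch only; the threshold is just the level where my edges started to struggle) can give the same early warning:

```python
# Minimal sketch: poll the 1-minute load average and warn near the point
# where a single-core m1.small edge starts to bottleneck.
import os
import time

THRESHOLD = 0.85  # assumed trouble point, per the observations above

while True:
    load1, _, _ = os.getloadavg()  # 1-minute load average (Unix only)
    if load1 >= THRESHOLD:
        print(f"load {load1:.2f} -- time to launch another EDGE?")
    time.sleep(60)
```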

In another setup, also weekly, I’ve had an ORIGIN m1.small instance running 24/7. This one is for a church, and each week they’ll have 10 congregations (at once) presenting live with recording (live-record), which is also load-balanced to additional m1.small EDGE instances that run for one day. In addition to this, there are two congregations that push rtplive streams (not recorded and not load-balanced). All of these congregations have individual “text” chat modules (based on the simple text chat module Wowza provided), which are handled by the ORIGIN. All VOD is also handled by the ORIGIN.

So… the m1.small instance has been able to take 10 incoming live-record streams, record them and push them to the EDGE locations… take 2 incoming rtplive streams and push them to connected users (about 25 users connected to each one)… as well as handle all the text chat traffic and VOD playback (usually about a dozen VOD files) all at the same time. Not too shabby, if you ask me.

Until a month ago, this setup seemed to work just fine. Then they started to experience a few issues when another congregation started to live-record stream… so I think this was the straw that broke the camel’s back.

I upgraded them to a c1.medium instance, thinking that I could keep the “processes in the run queue” lower with this “High-CPU” instance… and WOW, the extra compute units made a huge difference. With the same load as the previous m1.small instance, this new c1.medium instance seems to be taking a nap when it comes to “processes in the run queue” in CACTI… reporting extremely low numbers during the most aggressive loads. This is a good thing.

With that information, I’ll continue to push the c1.medium instance to see where the next bottleneck is… be it bandwidth, CPU, processes, etc.

Sorry for the long post – just thought someone out there might find the information helpful. Wish I had it when I started.

David L Good

hdezela – THANK YOU VERY MUCH for taking the time and putting the effort into doing this. Even if this isn’t “scientific” – it’s a LOT better than anything I have. And your numbers seem to support some of my experiences as well… so I think they’re pretty good.

Quick question: So… you did two different tests, correct? Once while trying to overload the “sample.mp4” file… and another time trying to overload the “dummy.mp4” file. Did you then just take the average of the two for your final totals?

Thanks again for all the effort you put into this – it really is very much appreciated!!

David L Good

hdezela – ahhhhh!! Brilliant! That makes it even better.

Were these all Wowza 3.5 servers?

What about the server “type”? For example, there is an m1.medium (standard medium) and there is a c1.medium (high-CPU medium). In my own experience, the high-CPU machines really seem to perform well… especially when you start getting a lot of connections. I’m guessing the bandwidth would be the same between the two… but then again, I’m just guessing, as I haven’t performed any kind of tests even close to what you’ve done.

So, it would be nice to know what instance size/type each one was listed at. Was your ‘Extra Large’ an m1.xlarge? m2.xlarge? c1.xlarge? Was your ‘Medium’ an m1.medium? c1.medium? etc.

I’m also curious about the Micro instance. Does Wowza actually have a Micro-instance AMI? If not, did you just install Wowza and tune it yourself… and if so, did you do that with the others, or were those all the pre-built, pre-tuned AMIs made by Wowza?

Sorry for all the questions… I’m just excited to be getting some answers. :)

David L Good

Hi David,

I think a good point to add is that we don’t know how Amazon allocates resources between accounts and across actual hardware. If, for instance, you have 10 small instances, do all 10 get put on the same piece of hardware, or do they get spread across multiple machines? If it is the former, then you will be severely limited on resources per instance and will probably only get around 80 Mb/s per instance. If it is the latter, then it will depend on what other customers are doing at the same time.

I think it is best to go with the largest instances you can if you have to use multiple instances; this way, you can be pretty sure that they will be spread across multiple host systems. Also, starting them as required, rather than all at once, will give a better chance of them being spread out if the hosts are busy.
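As a sketch of that last point (the AMI ID and sleep interval are hypothetical placeholders; this just launches edges one request at a time instead of in a single bulk request):

```python
import time
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-1")  # Singapore

def launch_edge():
    # One instance per request; separate, spaced-out requests improve the
    # odds of landing on different host hardware than one bulk request.
    resp = ec2.run_instances(
        ImageId="ami-xxxxxxxx",    # hypothetical Wowza edge AMI
        InstanceType="m1.xlarge",
        MinCount=1,
        MaxCount=1,
    )
    return resp["Instances"][0]["InstanceId"]

for _ in range(3):
    print("launched", launch_edge())
    time.sleep(120)  # stagger launches rather than starting all at once
```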

Roger.