Improving CloudFront cache hit ratio for HTTP live streams

We do weekly live streams, HLS-only, with several thousand viewers around the world at 7 different bitrates, from 64 kbps (audio only) up to 2500 kbps (HD video). The origin is a single Wowza Streaming Engine 4.1 on an EC2 c3.2xlarge, using the out-of-the-box HTTP Origin Mode settings with one exception: our HLS sliding window is 5 segments instead of the default 3 (a bigger buffer at the cost of added delay). We have a simple CloudFront distribution in front of the origin so that we can scale infinitely (really?).

Last week our origin peaked at 1,000 Mbps of egress (outgoing) bandwidth. In theory, that should be impossible: the ceiling should be 50 edge locations x 8 Mbps (all bitrates added together) = 400 Mbps, plus a little extra for the ongoing requests for the m3u8 chunk lists. Even at the most popular edges (FRA6 and FRA50), our cache hit ratio was below 70%.
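For reference, here is that back-of-the-envelope calculation spelled out (the 50-edge figure is our own rough estimate of the active locations, not an exact count):

    # Rough ceiling on origin egress, assuming every CloudFront edge caches
    # perfectly and therefore pulls each of the 7 renditions exactly once.
    EDGE_LOCATIONS = 50        # rough count of active edge locations
    ALL_BITRATES_MBPS = 8      # all 7 renditions added together (64 kbps .. 2500 kbps)

    ceiling_mbps = EDGE_LOCATIONS * ALL_BITRATES_MBPS   # 400 Mbps, plus a bit for chunk lists
    observed_mbps = 1000                                # what we measured last week

    print(f"expected ceiling ~{ceiling_mbps} Mbps, observed {observed_mbps} Mbps "
          f"({observed_mbps / ceiling_mbps:.1f}x over)")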

Why is this happening, and has anybody else experienced the same problem?

Does anyone know of any workarounds to reduce the egress (outgoing) bandwidth from the origin?

Does anyone know how to improve the cache hit ratio? Fewer bitrates? HLS segments shorter than 10 seconds? Non-default max-age values in the Cache-Control header? Fewer CloudFront edge locations?
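In case it helps, a minimal sketch of how one could spot-check what an edge returns for the chunk list versus a media segment; the hostname and stream paths below are placeholders, not our real URLs:

    import requests

    # Placeholder hostname and paths -- substitute your own distribution and stream.
    BASE = "https://dxxxxxxxxxxxx.cloudfront.net/live/mystream"
    PATHS = [
        "/playlist.m3u8",              # master playlist
        "/chunklist_w123456789.m3u8",  # sliding-window chunk list (placeholder name)
        "/media_w123456789_42.ts",     # one media segment (placeholder name)
    ]

    # X-Cache tells you hit/miss at the edge, Age how long the object has been
    # cached, and Cache-Control what the origin asked the edge to do with it.
    for path in PATHS:
        response = requests.get(BASE + path, timeout=10)
        print(path)
        for header in ("X-Cache", "Age", "Cache-Control"):
            print(f"  {header}: {response.headers.get(header, '-')}")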

Hi,

What I would do, if your configuration is as close to default as it can be, is increase the cupertinoCacheControlPlaylist value from 1 to, say, 10 or even 15. As you have 5 chunks in the playlist and the default chunk length is 10 seconds, even a 15-second timeout on the playlist is well below the point at which a player needs fresh information. This may help: if you have that many users connecting to CloudFront and the playlist is expiring so quickly, the edges may be making many, many requests back to the origin. It may not reduce the bandwidth by a huge amount, but it should help.
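To spell out the timing reasoning, using only the numbers from this thread (nothing measured):

    # Timing argument for a longer playlist Cache-Control, using the numbers
    # from this thread (5 chunks of 10 seconds, a proposed 15 second max-age).
    CHUNK_SECONDS = 10          # default Wowza chunk duration
    CHUNKS_IN_PLAYLIST = 5      # the sliding window used on this origin
    PLAYLIST_MAX_AGE = 15       # proposed cupertinoCacheControlPlaylist value

    window_seconds = CHUNK_SECONDS * CHUNKS_IN_PLAYLIST   # the playlist advertises 50 s of media
    headroom = window_seconds - PLAYLIST_MAX_AGE          # media still listed even in a stale copy

    # A playlist served from the edge cache is at most 15 s old, so it still
    # advertises ~35 s of segments -- players always have something fresh to fetch.
    print(f"playlist window {window_seconds} s, worst-case staleness {PLAYLIST_MAX_AGE} s, "
          f"headroom {headroom} s")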

I would also check that you only have Cupertino enabled and that no other HTTP streaming protocols are being requested. This may also be impacting your bandwidth, as each protocol pulls its own chunks (I suspect you know this, but it is worth double-checking the logs).
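Something like this rough sketch could count requests per streaming format in an access log; the log path and URL patterns are only assumptions about a typical Wowza/CloudFront log, so adjust them to what your logs actually contain:

    import re
    from collections import Counter

    # Placeholder path -- point this at a Wowza access log or a CloudFront log export.
    LOG_FILE = "access.log"

    # Very rough URL patterns for the different Wowza HTTP streaming formats.
    FORMATS = {
        "cupertino (HLS)": re.compile(r"\.m3u8|\.ts\b"),
        "sanjose (HDS)": re.compile(r"manifest\.f4m|Frag\d"),
        "smooth (Smooth Streaming)": re.compile(r"/Manifest|QualityLevels"),
        "mpegdash (DASH)": re.compile(r"\.mpd|\.m4s\b"),
    }

    counts = Counter()
    with open(LOG_FILE) as log:
        for line in log:
            for name, pattern in FORMATS.items():
                if pattern.search(line):
                    counts[name] += 1
                    break

    for name, count in counts.most_common():
        print(f"{name}: {count} requests")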

Andrew.

Thanks, Andrew, for trying to help.

Each client fetches an update of the chunk list only every 10 seconds anyway, so increasing the cache TTL for it won't help and might even introduce new problems.

I have only Cupertino/HLS enabled.