Yes, of course. I had a dialogue with AWS Support and they confirmed the 1.73 Gbps limit. There seems to be no way around it, so when it comes to actual throughput the c3.8xlarge (and other 10 GbE instances) offer terrible value compared to instances with “High” network capabilities, which perform very close to 1 Gbps.
Below are the most relevant snippets from our dialogue:
ME - When I test my own ‘10 Gigabit’ instances (c3.8xlarge) with iperf I never see transfer rates exceeding 1.73 Gbps. I have tested at various times of day over the span of seven days, running the tests multiple times, and each time I hit the very same limit of 1.73 Gbps. I have tried setting the window size to 64 KB, 128 KB and 512 KB, and setting the number of parallel client streams to 2 and to 10; these settings offered no real improvement in measured throughput. I have also tested with the Wowza Load Testing Tool: I simulated 2000 connections to a 5160 Kbps stream, but at 345 connections outbound traffic maxes out. 345 x 5160 Kbps equals ~1.7 Gbps, so I basically hit the same ceiling. This is at least four times worse than what a blogger at scalablelogic reports, where tests show results of 7 Gbps and 9.5 Gbps. I am testing between two c3.8xlarge instances located in the same zone and region, so these should be optimal benchmarking conditions: one c3.8xlarge acts as the iperf server and the other as the iperf client. I have tried instances launched from Amazon Linux AMI 2013.09.2 (64-bit) as well as Ubuntu Server 13.10 (64-bit), and I have also tried launching instances in the same placement group, but I still hit the same limit.
Why am I seeing such poor results? What should I look at if I want to improve throughput? Is this a limit that can be dealt with? There are plenty of forum members who would be interested in knowing this. Also, it would be great to know what causes this limit. If it is indeed the case that this limit is known to Amazon and maybe even artificially imposed, I think you should specify that a 10 GbE instance is only capable of 1.7 Gbps (just 1.8x more than a “High” instance) when not using internal IPs.
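For reference, the arithmetic behind the Wowza test ceiling is trivial to check; the stream bitrate and connection count below are the figures from my tests above:

```python
# Back-of-the-envelope check of the Wowza load test ceiling:
# outbound traffic maxed out at 345 concurrent connections
# to a 5160 Kbps stream.
stream_kbps = 5160   # per-connection stream bitrate (Kbps)
connections = 345    # connections at which traffic out plateaued

total_gbps = stream_kbps * connections / 1_000_000  # Kbps -> Gbps
print(f"{total_gbps:.2f} Gbps")  # -> 1.78 Gbps, matching the iperf ceiling
```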
SUPPORT - I appreciate your patience on this issue. I reached out to our operations folks and asked them about what you are seeing. They advised me that there are no specific caps on the amount of bandwidth that are enforced by AWS, and that speed is based on a number of variables including network equipment, regional considerations, etc. Additionally, they advised me that the 10 gig network as sold is based on transfer rates within the data center/local network to the instance. They did similar testing, and their tests showed varying transfer rates depending on the datacenter and availability zone they used.
One suggestion: in order to sustain higher rates of transfer to the public Internet, scaling horizontally with more instances might perform better than scaling vertically with a larger instance. As this is not a “bandwidth as a service” offering, speeds to the public Internet cannot be guaranteed.
I understand that this might not be the solution you are looking for, and I apologize that this product doesn’t meet your needs for this particular case. Please let us know if you have further questions.
ME - Of those variables, what exactly is thought to be the most prominent bottleneck? Since we both see the 1.73 Gbps limit consistently, it should be possible to find out. Do you have any plans for improving external throughput from these 10 GbE instances? The 10 GbE may refer to internal transfer rates, but it is worth noting that network performance for “Low” to “High” instances has a 1:1 relationship between internal and external throughput, whereas external throughput on 10 GbE instances appears to be about 5.5 times slower than internal throughput. Very noticeable.
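That 5.5x figure comes from comparing the best internal result reported in the scalablelogic tests with my measured external ceiling:

```python
internal_gbps = 9.5   # best internal iperf result reported by scalablelogic
external_gbps = 1.73  # external ceiling measured in my own iperf tests

ratio = internal_gbps / external_gbps
print(f"internal is {ratio:.1f}x faster than external")  # -> 5.5x
```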
SUPPORT - Placement group bandwidth applies to the private subnet network. What you are doing there causes the connection to leave the subnet, exit our network, and re-enter it again, which is suboptimal at best. The problem is that since traffic is routed outside of our network and then back in, it traverses different network equipment than it would on a private local subnet.
Unfortunately there is nothing we can do here other than advise you to use the private IP addresses to get the benefit of the full 10G link; everything else will simply not work. Furthermore, we keep information about our internal network confidential, and we cannot possibly troubleshoot further or explain why you consistently see 1.73G. It actually makes a lot of sense that a local dedicated subnet would achieve higher throughput than shared pipes that are routed outside the placement group, out of our network, and back in again. You are basically asking why you get lower bandwidth over a public pipe on a router than within the same subnet under optimal conditions with no shared resources. Also, please keep in mind that different network equipment uses different medium characteristics as well as configurations.
While I understand this answer may not be satisfactory and does not explain much in detail, we certainly cannot give you any information about the internals of our network. In case you are interested, enterprise customers have the benefit of Non-Disclosure Agreements under which we can share more information about our internal systems.
If this seriously impacts your business logic we can certainly work with you towards a solution (e.g. Direct Connect) and point you in the right direction.
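For anyone following support’s horizontal-scaling suggestion: here is a rough sizing sketch for my original Wowza scenario (2000 viewers of a 5160 Kbps stream), assuming each instance tops out at the 1.73 Gbps external ceiling we both observed:

```python
import math

stream_kbps = 5160        # per-viewer stream bitrate (Kbps)
viewers = 2000            # target concurrent connections
per_instance_gbps = 1.73  # observed external throughput ceiling per instance

total_gbps = stream_kbps * viewers / 1_000_000  # Kbps -> Gbps
instances = math.ceil(total_gbps / per_instance_gbps)
print(f"{total_gbps:.2f} Gbps total -> {instances} instances")
# -> 10.32 Gbps total -> 6 instances
```

In other words, serving that load externally would take roughly six c3.8xlarge instances behind a load balancer, rather than the single 10 GbE instance the headline numbers suggest.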