In this work we examine the implications of building a single logical link out of multiple physical links. We use MultiEdge to examine the tradeoffs between throughput and CPU utilization and to examine how overheads and performance scale with the number and speed of links. We use low-level instrumentation to understand the associated overheads, we experiment with setups of one to eight 1-GBit/s links, and we contrast our results with a single 10-GBit/s link. We find that: (a) our base protocol achieves up to 65% of the nominal aggregate throughput; (b) replacing interrupts with polling significantly impacts only the multiple-link configurations, which reach 80% of nominal throughput; (c) the impact of copying on CPU overhead is significant, and removing copying results in up to 66% improvement in maximum throughput, reaching almost 100% of the nominal throughput; (d) scheduling packets over heterogeneous links requires simple but dynamic scheduling to account for different link speeds and varying load.
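To make the last finding concrete, the following is a minimal sketch of one possible "simple but dynamic" policy for spreading packets over links of different speeds. It is an illustration only, not the scheduler used in MultiEdge: the `Link` class, the `pick_link`, `send`, and `drain` helpers, and the earliest-finish-time rule are all assumptions introduced here. Each packet is assigned to the link that would finish transmitting it soonest, given the link's nominal speed and the bytes already queued on it.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical model of one physical link (not a MultiEdge structure)."""
    name: str
    speed_bps: float        # nominal link speed in bits per second
    queued_bytes: int = 0   # bytes assigned to this link and not yet sent

    def finish_time(self, packet_bytes: int) -> float:
        # Estimated seconds until a new packet of this size would leave the
        # link, counting everything already queued ahead of it.
        return (self.queued_bytes + packet_bytes) * 8 / self.speed_bps


def pick_link(links, packet_bytes):
    # Dynamic choice: accounts for both link speed and current load.
    return min(links, key=lambda link: link.finish_time(packet_bytes))


def send(links, packet_bytes):
    # Assign the packet to the chosen link and record the added load.
    link = pick_link(links, packet_bytes)
    link.queued_bytes += packet_bytes
    return link


def drain(links, elapsed_s):
    # Model the passage of time: each link transmits at its nominal speed.
    for link in links:
        sent = int(link.speed_bps * elapsed_s / 8)
        link.queued_bytes = max(0, link.queued_bytes - sent)


if __name__ == "__main__":
    links = [Link("1G", 1e9), Link("10G", 10e9)]
    for _ in range(5):
        print(send(links, 1500).name)
```

With an idle 1-GBit/s link and an idle 10-GBit/s link, the first few packets all go to the faster link. Once the faster link's queue grows large enough, the slower link becomes the better choice, which is the load-dependent behavior a static round-robin split would miss.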
|Title of host publication||Proceedings of Workshop on Communication Architecture for Clusters, CAC 2008|
|Publisher||IEEE Computer Society Press|
|Publication status||Published - 2008|
|Event||22nd IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2008, Duration: 1 Jan 2008 → …|
|Conference||22nd IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2008|
|Period||01/01/2008 → …|