Huawei CloudFabric 3.0 Hyper-Converged DCN Solution Empowers Lossless Ethernet, Unleashing 100% of Computing Power - Total Telecom

[Paris, France, April 7, 2022] Today, Zheng Xiaolong, Chief Researcher of Data Center Network (DCN), Huawei Canada Research Center, delivered a keynote speech titled "Zero-Packet-loss Ethernet Helps Release 100% Computing Power" at the MPLS, SD & AI Net World Congress. In the keynote, Mr. Zheng delved into how Huawei’s CloudFabric 3.0 Hyper-Converged DCN Solution offers an innovative solution to the packet loss problem on DCNs and builds Ethernets with low latency, high throughput, and large scale to unleash 100% of computing power.
Viewpoint of Huawei’s Chief Researcher for data center networks
Efficient Improvement of Computing Power Is Crucial in the Data-centric Computing Power Era
"Insufficient computing power is the biggest challenge in the data-centric computing power era," said Zheng Xiaolong. "To implement real-time data processing and value monetization, robust computing power is required."
Today, big data is used everywhere, from the metaverse and AI-powered drug research to intelligent advertisement recommendation based on user habits. Robust computing power is key to such applications, yet the scale of AI computing models is growing exponentially. For example, Megatron-Turing NLG, the industry’s latest language model, has 530 billion parameters. In comparison, the most complex model in 2017 had a mere 61 million parameters. In other words, computing pressure has increased almost 10,000-fold in the past five years. Finding a way to improve computing power efficiently, and to unleash 100% of it, has therefore become the top priority in the computing power era.
DCNs Become the Core Bottleneck for Improving Cluster Computing Power
Completing the E-level (exascale) floating-point operations required to train an AI model such as the GPT-3 language model requires a large number of computing servers working as a cluster. However, every AI training cluster has a performance threshold. Once the threshold is reached, adding more server nodes no longer improves performance and may even degrade it. This is because the computing nodes in a cluster collaborate with each other: if packet loss occurs on the network, nodes wait longer for one another, and that waiting time adds overhead. Even a 0.1% packet loss rate can cut computing power in half, which makes a lossless DCN vital to improving computing power.
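The reasoning above can be illustrated with a deliberately simple model. This is a sketch under stated assumptions, not Huawei's measurement method: every synchronized training step exchanges a fixed number of packets, the step stalls for one retransmission timeout if any of those packets is lost, and all nodes wait for the stalled step to finish. The function name and the default values (`packets_per_step`, `base_step_ms`, `timeout_ms`) are hypothetical and are not taken from the keynote, so the output is not expected to reproduce the 50% figure quoted above.

```python
def relative_throughput(loss_rate: float,
                        packets_per_step: int = 1000,
                        base_step_ms: float = 1.0,
                        timeout_ms: float = 1.0) -> float:
    """Toy estimate of cluster throughput relative to a lossless network.

    Assumptions (all hypothetical): packet losses are independent, a step
    with at least one lost packet pays exactly one retransmission timeout,
    and every node waits for the slowest exchange before the next step.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    # Probability that at least one packet in the step is lost.
    p_stall = 1.0 - (1.0 - loss_rate) ** packets_per_step
    # Expected step time: the lossless time plus the expected stall.
    expected_step_ms = base_step_ms + p_stall * timeout_ms
    return base_step_ms / expected_step_ms


if __name__ == "__main__":
    for loss in (0.0, 0.0001, 0.001, 0.01):
        print(f"loss={loss:.4%}  relative throughput="
              f"{relative_throughput(loss):.2f}")
```

Even in this simplified model, a per-packet loss rate that looks negligible translates into a large share of stalled steps, because a single lost packet holds up every node that is waiting on the exchange.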
Lossless Ethernet Built on Huawei’s CloudFabric 3.0 Hyper-Converged DCN Solution, Unleashing 100% of Computing Power
Huawei’s CloudFabric 3.0 Hyper-Converged DCN Solution uses iLossless, a Huawei-unique intelligent and lossless algorithm, to eliminate the packet loss that has hampered Ethernet for more than four decades. The solution features high throughput, low latency, and zero packet loss, unleashing 100% of computing power in all scenarios.
High throughput: Traditional traffic scheduling is manually configured, and as such cannot adapt to dynamic network changes. Huawei’s Automatic ECN (ACC) is an intelligent and lossless technology that accurately predicts network congestion status and achieves nearly 100% throughput while eliminating packet loss on any congested link. As verified by Tolly Group, a global provider of testing and third-party validation and certification services, Huawei’s CloudFabric 3.0 Hyper-Converged DCN Solution can improve all-flash IOPS performance by 93%. In August 2021, the paper "ACC: Automatic ECN Tuning for High-Speed Datacenter Networks", which explores Huawei’s intelligent and lossless hyper-converged DCN innovations, was accepted by the flagship annual event of the Association for Computing Machinery (ACM): the Special Interest Group on Data Communication (SIGCOMM) 2021. This demonstrates industry experts’ high regard for Huawei’s innovations and their far-reaching impact around the world.
Low latency: In high-performance computing (HPC) scenarios, application latency is the product of the number of calculation steps and the latency of each step. For latency-sensitive applications, reducing the number of steps can effectively reduce the overall application latency. Powered by in-network computing and topology-aware computing, Huawei’s Integrated Network and Computing (INC) technology implements network and computing collaboration. With these technologies, the network participates in aggregation and synchronization of computing information, reducing the number of times computing information is synchronized. Meanwhile, computing tasks are assigned to the same TOR switch, reducing the number of communication hops, which in turn reduces the application delay. Take MPI_allreduce as an example. Compared with traditional networks that only forward data without participating in computing, the CloudFabric 3.0 Hyper-Converged DCN Solution can drastically reduce the latency and improve computing efficiency by 27%.
Large scale: The traditional three-layer Clos network architecture of a data center supports a maximum of 65,000 nodes, far short of what large-scale data centers require. Huawei’s CloudFabric 3.0 Hyper-Converged DCN Solution adopts a next-generation direct connection topology architecture and innovative distributed adaptive routing protocols. It not only builds a lossless computing network, but also supports large-scale networking of up to 270,000 nodes, roughly four times the scale of the traditional architecture. This makes it ideal for E-level and 10E-level large and ultra-large computing hubs.
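The latency and scale points above come down to simple arithmetic, sketched below. Several things here are assumptions rather than statements from the keynote: the baseline is taken to be a ring all-reduce, which needs 2 × (N − 1) communication steps for N nodes; the per-step latency and the in-network step count are hypothetical; and the 65,000-node limit is read as a three-layer fat-tree (a common Clos design) built from 64-port switches, which supports k³/4 hosts. The source does not state the switch radix or the all-reduce algorithm.

```python
def application_latency_us(steps: int, per_step_latency_us: float) -> float:
    """Application latency as steps multiplied by per-step latency."""
    return steps * per_step_latency_us


def ring_allreduce_steps(nodes: int) -> int:
    """Steps in a ring all-reduce: (n - 1) reduce-scatter + (n - 1) all-gather."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return 2 * (nodes - 1)


def fat_tree_hosts(ports_per_switch: int) -> int:
    """Maximum hosts in a three-layer fat-tree of k-port switches: k^3 / 4."""
    k = ports_per_switch
    if k <= 0 or k % 2:
        raise ValueError("ports_per_switch must be a positive even number")
    return k ** 3 // 4


if __name__ == "__main__":
    per_step_us = 5.0        # hypothetical per-step latency
    in_network_steps = 2     # hypothetical: send up to the switch, receive result
    baseline = application_latency_us(ring_allreduce_steps(16), per_step_us)
    offloaded = application_latency_us(in_network_steps, per_step_us)
    print(f"ring all-reduce on 16 nodes: {baseline} us")
    print(f"in-network aggregation (assumed): {offloaded} us")

    clos_limit = fat_tree_hosts(64)
    print(f"64-port fat-tree hosts: {clos_limit}")
    print(f"270,000 / {clos_limit} = {270_000 / clos_limit:.2f}")
```

Under these assumptions, fewer synchronization steps shorten total latency in direct proportion, and a 64-port fat-tree tops out near the 65,000-node figure, which puts 270,000 nodes at a little over four times that limit.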
Zero packet loss and continuous performance evolution are of great significance to the data-centric computing power era. Huawei has carried out full-scale joint tests with customers in the finance, manufacturing, and HPC sectors. The test results show that Huawei’s CloudFabric 3.0 Hyper-Converged DCN Solution has significant performance advantages in scenarios such as all-flash storage, distributed storage, HPC, and AI computing. In the future, Huawei will continue to invest in intelligent and lossless technology research to further improve lossless network capabilities, fully unleash computing power, and enable the intelligent upgrade of enterprises.
About Huawei
Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. With integrated solutions across four key domains – telecom networks, IT, smart devices, and cloud services – we are committed to bringing digital to every person, home and organization for a fully connected, intelligent world. Huawei’s end-to-end portfolio of products, solutions and services is both competitive and secure. Through open collaboration with ecosystem partners, we create lasting value for our customers, working to empower people, enrich home life, and inspire innovation in organizations of all shapes and sizes. At Huawei, innovation focuses on customer needs. We invest heavily in basic research, concentrating on technological breakthroughs that drive the world forward. We have more than 197,000 employees, and we operate in more than 170 countries and regions. Founded in 1987, Huawei is a private company wholly owned by its employees. For more information, please visit Huawei online at www.huawei.com or follow us on:
http://www.linkedin.com/company/Huawei
http://www.twitter.com/Huawei
http://www.facebook.com/Huawei
http://www.youtube.com/Huawei
 
 
 