Dec 02, 2024

The latest trends in backbone network optical communication

1. 400G is really here

Not long ago, in March 2024, China Mobile opened the world's first 400G all-optical inter-provincial trunk line (Beijing to Inner Mongolia), which is widely regarded as an important milestone.

The reason for upgrading the backbone network to 400G is obvious.

On the one hand, consumer Internet traffic driven by residents' digital life (high-definition video, teleconferencing, live streaming, online games, etc.) continues to grow.

On the other hand, the entire industry is promoting digital transformation, and the surge in traffic from industry digital systems has intensified the pressure on backbone networks.

There is one more key reason for the sudden increase in pressure on the backbone network: the explosion of AI.

The rise of AIGC large models has triggered a wave of AI. Meeting the needs of AI services requires building a large number of intelligent computing centers. Models have grown from billions of parameters to trillions of parameters, and GPU computing clusters have grown from a thousand cards to ten thousand or even a hundred thousand cards.

A GPU computing power cluster is essentially a massive array of GPU cards (GPU servers) connected through high-performance networks such as InfiniBand and RoCEv2. It places extremely high demands on network performance and reliability, which directly affect training efficiency and cost.

In terms of network port speed alone, GPU servers already start at 400G per port, and some require 800G or higher.

 

Figure: Network ports of GPU servers

Previously, GPU computing power clusters stayed within the scope of the DCN (Data Center Network). Now, as cluster sizes keep expanding, the industry has begun to consider using distributed intelligent computing centers for model training.

That is to say, several intelligent computing centers in different locations will be used together for training.

This places higher demands on DCI (Data Center Interconnect) networks, and the optical backbone network must be able to meet them in terms of technical performance.

China's computing power strategy still follows the idea of "national coordination and overall layout". In February 2022, China launched the "East Data, West Computing" project to build a nationwide integrated computing power system.

Simply put, on the one hand, we need to build a large number of data centers (equivalent to power plants), and on the other hand, we also need to build a robust backbone transmission network (equivalent to a power grid) to distribute this computing power and meet the needs of various industries.

 

How was 400G achieved?

As the foundation of the entire digital society, today's optical backbone network must combine many characteristics: ultra-large bandwidth (400G now, 800G or even 1.6T in the future), ultra-low latency (tiered latency circles), ultra-large-scale networking (serving the distributed computing and AI clusters mentioned earlier), ultra-high stability, reliability, and security, highly flexible deployment, and intelligent operation, maintenance, and control.

Today, we will mainly talk about the most important one: speed, that is, bandwidth.

Throughout the development of optical communication, raising the speed has come down to working on the following aspects:

Firstly, there is the baud rate.

Transmission rate, also known as bit rate, is the number of bits transmitted per unit time, measured in bits per second.

Bit rate = baud rate × the number of bits carried by a single modulation state (symbol).

The baud rate is the number of symbols transmitted per unit of time. The higher the baud rate, the more symbols are transmitted per second, and of course, the greater the amount of information, leading to an increase in speed.

The baud rate is determined by the capability of the optoelectronic devices. The more advanced the chip process, the higher the baud rate, and the higher the bit rate.

So far, the CMOS process has advanced from 16nm to 7nm and 5nm, and the baud rate has gradually increased from 30+GBaud to 64+GBaud, 90+GBaud, and 128+GBaud.

400G backbone transmission is commercially viable today because the baud rate has reached 128GBaud.

Secondly, there is the modulation method.

The "number of bits carried by a single modulation state" in the formula above is determined by the modulation method.

The modulation schemes currently used for 400G are mainly 16QAM, 16QAM-PCS (PCS is probabilistic constellation shaping, which will be introduced in detail next time), and QPSK. They suit different application scenarios.
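
The bit rate formula is easy to try out. Below is a minimal sketch, not a vendor calculation. It assumes dual-polarization coherent transmission (a factor of 2 that this article does not spell out), and the function names are made up for illustration. The result is the raw line rate; the gap between it and the 400G payload is taken up by FEC and framing overhead. Pairing 64GBaud with 16QAM here is just an illustrative choice from the baud rates listed earlier.

```python
import math

def bits_per_symbol(constellation_points: int) -> float:
    """Bits carried by one symbol of an M-point constellation: log2(M)."""
    return math.log2(constellation_points)

def raw_line_rate_gbps(baud_gbd: float, constellation_points: int,
                       polarizations: int = 2) -> float:
    """Bit rate = baud rate x bits per symbol.

    The polarizations factor is an assumption (dual-polarization coherent
    optics); set it to 1 to get the plain formula from the text.
    """
    return baud_gbd * bits_per_symbol(constellation_points) * polarizations

qpsk_raw = raw_line_rate_gbps(128, 4)    # QPSK has 4 states -> 2 bits/symbol
qam16_raw = raw_line_rate_gbps(64, 16)   # 16QAM has 16 states -> 4 bits/symbol

print(f"QPSK  at 128 GBaud: {qpsk_raw:.0f} Gbit/s raw")
print(f"16QAM at  64 GBaud: {qam16_raw:.0f} Gbit/s raw")
```

Under these assumptions, halving the bits per symbol (16QAM to QPSK) has to be paid for by doubling the baud rate to keep the same raw rate, which is why QPSK 400G had to wait for 128GBaud devices.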

Optical communication is different from wireless communication in that it does not blindly pursue high-order modulation.

The lower the modulation order, the lower the requirements for the line, and the lower the cost of network construction. So, in the early design stage of long-distance backbone networks, the focus was basically on 16QAM and QPSK. Later, 16QAM-PCS also joined the competition.

Before "East Data, West Computing" was proposed, operators believed that 400G would not need to travel very long distances. The mainstream opinion in the industry was therefore to use low baud rate devices, which were more mature and cheaper, together with the higher-order 16QAM modulation.

Later, two things changed. On the one hand, the required transmission distance grew from just over 1000 km to several thousand km. On the other hand, 128GBaud devices matured rapidly (the quick rise of 800G in the DCN scenario stimulated and pushed the industry chain), creating the conditions for QPSK to stand out.

First, QPSK has a higher tolerance for nonlinearity than 16QAM-PCS, so the launch power can be raised appropriately. Second, the back-to-back OSNR threshold of QPSK is better than that of 16QAM-PCS. Third, setting the channel spacing of QPSK to 150GHz means there is almost no filtering penalty during transmission.

These advantages have gradually made QPSK the industry's preferred choice for backbone networks and DCI.

 

Option            Channel spacing   Baud rate   Transmission distance
16QAM 400G        75GHz             64GBd       ~600km
16QAM-PCS 400G    100GHz            90GBd       ~1000km
QPSK 400G         150GHz            128GBd      ~1500km

A rough comparison of the three options
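
To see the trade-off in the comparison above more clearly, we can compute the spectral efficiency of each option, that is, the bit rate divided by the channel spacing. This is only a rough sketch using the approximate figures from the table; the dictionary and function names are illustrative.

```python
# (net rate in Gbit/s, channel spacing in GHz, approximate reach in km),
# taken from the rough comparison above
OPTIONS = {
    "16QAM 400G": (400, 75, 600),
    "16QAM-PCS 400G": (400, 100, 1000),
    "QPSK 400G": (400, 150, 1500),
}

def spectral_efficiency(rate_gbps: float, spacing_ghz: float) -> float:
    """Spectral efficiency in bit/s/Hz (Gbit/s divided by GHz)."""
    return rate_gbps / spacing_ghz

for name, (rate, spacing, reach_km) in OPTIONS.items():
    se = spectral_efficiency(rate, spacing)
    print(f"{name:<15} {se:.2f} bit/s/Hz, reach ~{reach_km} km")
```

The option that uses the spectrum most sparingly per bit (QPSK) is also the one that reaches furthest, which is the basic exchange between capacity and distance.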

Today, the first two options are mostly considered for metro or intra-provincial applications.

Thirdly, there is expanding the frequency band.

The baud rate and the modulation method mainly affect the single-wave rate. One fiber can carry many waves (wavelengths), as long as the spectral range is wide enough.

Single-wave rate × number of waves per fiber = single-fiber capacity.

As stated in the previous table, the channel spacing of QPSK 400G reaches 150GHz. Both traditional C-band and extended C-band are insufficient to meet the demand for spectrum bandwidth.

 

So the C6T+L6T approach is gradually being adopted, with a total spectrum bandwidth of 12THz. Do the math: 12THz at 150GHz spacing gives 80 waves, and at 400G per wave the total capacity of a single fiber is 32T. If we sacrifice some distance to save cost and deploy 16QAM-PCS at 100GHz spacing instead, the same spectrum holds 120 waves and the capacity rises to 48T.
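
The capacity arithmetic can be written out as a short sketch. It assumes the whole 12THz of C6T+L6T spectrum is usable with no guard bands, and it takes the channel spacings from the comparison table earlier; the function name is illustrative.

```python
def fiber_capacity(spectrum_ghz: int, spacing_ghz: int, rate_gbps: int):
    """Return (number of waves, single-fiber capacity in Tbit/s).

    Single-wave rate x number of waves per fiber = single-fiber capacity.
    Guard bands are ignored, so this is an upper bound.
    """
    waves = spectrum_ghz // spacing_ghz
    return waves, waves * rate_gbps / 1000

C6T_L6T_GHZ = 12_000  # 6 THz in the C-band plus 6 THz in the L-band

for name, spacing in [("QPSK", 150), ("16QAM-PCS", 100), ("16QAM", 75)]:
    waves, capacity = fiber_capacity(C6T_L6T_GHZ, spacing, 400)
    print(f"{name:<10} {spacing:>3} GHz spacing: {waves} waves, {capacity:.0f}T")
```

With 150GHz spacing this gives the 80 waves and 32T quoted above, and with 100GHz spacing it gives 48T.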

 

For a detailed introduction to frequency bands, see "What are the frequency bands for optical communication?"

The biggest questions in extending the frequency band are whether the devices can support it and whether the cost can be controlled. The devices in question include the ITLA, CDM, ICR, EDFA, and WSS, which cover transmitting and receiving light as well as switching and amplifying optical paths.

When it comes to band expansion, there is also an issue involved, which is integration.

 

Today's band extension is really more like a simple bundling of two systems (C and L). The two systems operate independently, are multiplexed together for transmission, and are split again at the far end, where each continues its own processing.

 

With two systems, the equipment is bulkier, the power consumption is higher, and the design is more complex. So the industry needs to study how to integrate the devices and build a single system that truly supports the different extended bands at the same time. In other words, it needs to achieve true integration.

 

Besides optical modules and equipment, optical communication also depends on the fiber itself.

 

The current mainstream fiber is G.652D. With EDFA amplification, 400G QPSK can already reach 1500km on G.652D.

After years of verification, the industry has identified G.654E fiber as the new successor. If using the better performing G.654E, under the same conditions, the transmission distance of 400G QPSK can be increased by more than 30%.

 

G.654E fiber is now ready for mass production and will be deployed at scale on long-distance trunk lines. Some low-loss fibers in the G.654 series have also become the preferred choice for transoceanic long-distance transmission in submarine cable systems.

Apart from traditional fiber, the industry also believes that multi-core fiber and hollow-core fiber have broad application prospects.

Multi-core fiber is a form of space-division multiplexing: more cores are placed inside one fiber (and few-mode transmission may be used as well) to significantly increase the capacity of the fiber.

Hollow-core fiber is even more striking. Put simply, the fiber is made hollow, and air replaces the glass core.

Hollow-core fiber has been shown to offer greater capacity, lower latency, lower transmission loss, and ultra-low nonlinearity, and the industry widely regards it as one of the most promising technologies in optical communication.
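
The latency advantage comes straight from the speed of light in the medium. Here is a rough sketch of the propagation delay over a 1500km span. The group index values (about 1.468 for a solid silica core, about 1.0003 for air) are typical textbook figures assumed for illustration, not numbers from this article, and equipment delay is ignored.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum

def one_way_latency_ms(distance_km: float, group_index: float) -> float:
    """Propagation delay only: distance / (c / n), in milliseconds."""
    return distance_km * group_index / C_VACUUM_KM_PER_S * 1000

SOLID_CORE_INDEX = 1.468    # assumed typical value for silica fiber
HOLLOW_CORE_INDEX = 1.0003  # assumed: light travels mostly in air

solid_ms = one_way_latency_ms(1500, SOLID_CORE_INDEX)
hollow_ms = one_way_latency_ms(1500, HOLLOW_CORE_INDEX)

print(f"solid core : {solid_ms:.2f} ms")
print(f"hollow core: {hollow_ms:.2f} ms")
print(f"saving     : {100 * (1 - hollow_ms / solid_ms):.0f}%")
```

Under these assumptions the hollow core saves roughly a third of the propagation delay over the same distance.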

 

What comes after 400G: 800G or 1.6T?

Now that 400G has officially entered large-scale commercial use, the whole industry will turn its attention to the technical standards beyond 400G.

The industry is still intensively debating whether to proceed with 800G, 1.2T, or 1.6T.

 

To achieve higher speeds, we must keep working on "modulation method + baud rate". 130GBd, or even 260GBd, is the inevitable direction. A higher baud rate means that the related devices must keep up and a mature industry chain must form.

 

Beyond 400G, we can no longer rely on QPSK. 16QAM modulation is currently a widely recognized option in the industry.
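
Why 16QAM and a higher baud rate point to 800G and 1.6T can be seen with a back-of-the-envelope sketch. It rests on two assumptions that are not stated in this article: dual-polarization transmission, and a ratio of raw line rate to net rate of 512/400 = 1.28, inferred from the 128GBaud QPSK 400G case and simply carried over. Real systems will differ.

```python
ASSUMED_OVERHEAD_RATIO = 512 / 400  # raw/net, inferred from 128 GBaud QPSK -> 400G

def net_rate_gbps(baud_gbd: float, bits_per_symbol: int,
                  polarizations: int = 2,
                  overhead_ratio: float = ASSUMED_OVERHEAD_RATIO) -> float:
    """Approximate net rate after removing the assumed FEC/framing overhead."""
    raw = baud_gbd * bits_per_symbol * polarizations
    return raw / overhead_ratio

for label, baud, bits in [("QPSK  128 GBd", 128, 2),
                          ("16QAM 130 GBd", 130, 4),
                          ("16QAM 260 GBd", 260, 4)]:
    print(f"{label}: ~{net_rate_gbps(baud, bits):.0f} Gbit/s net")
```

Under these assumptions, 16QAM at about 130GBd lands near 800G and 16QAM at about 260GBd lands near 1.6T.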

The frequency band also needs to be expanded further. Beyond the extended C and L bands, expansion into the S-band, U-band, E-band, and others is under consideration. With C+L+S, for example, the spectrum is 12THz + 5THz, for a total of 17THz.

 

With these factors combined, a single-fiber, single-direction transmission rate of more than 100Tbps is just around the corner.
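
How far away is 100Tbps? One rough way to look at it is the average spectral efficiency a system would need for a given amount of spectrum. The sketch below is arithmetic only; it ignores guard bands and the differing performance of each band, and the function name is illustrative.

```python
def required_spectral_efficiency(target_tbps: float, spectrum_thz: float) -> float:
    """Average bit/s/Hz needed to carry target_tbps in spectrum_thz of spectrum."""
    return target_tbps / spectrum_thz

for label, spectrum_thz in [("C6T+L6T (12 THz)", 12), ("C+L+S   (17 THz)", 17)]:
    se = required_spectral_efficiency(100, spectrum_thz)
    print(f"{label}: {se:.2f} bit/s/Hz needed for 100 Tbit/s")
```

For comparison, 400G in a 150GHz channel works out to about 2.7 bit/s/Hz and 400G in a 75GHz channel to about 5.3 bit/s/Hz, so both wider spectrum and higher-order modulation are needed.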

Within the data center, 800G (based on baud rates above 100GBd, single channel 100G) is already commercially available. Single-channel 200G, 400G, and 800G are just a matter of time. In this regard, progress is faster abroad.

 

As capacity keeps increasing, so do the technical challenges. In the final analysis, the development of optical communication relies on devices, chips, processes, and materials.

 

Meeting the requirements for power consumption, security, and operation and maintenance mentioned earlier also relies on a series of innovations in technology, architecture, packaging, artificial intelligence, and digital twins. There is still a lot of work to be done upstream and downstream in the industry chain. The road ahead is still long.