Routing multiple serial signals can be a daunting task during the development of fibre channel systems. Fortunately, digital repeaters and retimer ICs can clean up signals and counteract jitter in order to ensure spec compliance.
Storage area networking (SAN) and network attached storage (NAS) have garnered a great deal of attention in the
communication market as the demand for sophisticated e-commerce and mission-critical systems has exploded over
the past year. Estimates of future growth optimistically point to widespread deployment of these systems to distribute
databases, Web pages, streaming video, and movies-on-demand.
A chief concern of designers of SAN and NAS equipment is, of course, fast access to files. Merely locating audio clips,
video clips, and large databases is not enough; fast access and loading of the data is critical. To this end, fibre
channel has emerged as a fast and reliable method for making SAN and NAS equipment a reality.
Fibre channel uses high-speed serial links at either 1.0625- or 2.125-Gbps data rates to link servers, switches, routers,
disk arrays, tape systems, and RAID controllers into complex, redundantly interconnected networks which allow user
access to data in the presence of component failures. Fibre channel-derived SAN architectures are now being deployed
in increasingly complex and creative ways to solve serious data availability issues which, to date, have been
unsolvable. Designers implementing these systems must ensure physical layer interoperability in order to reliably
deliver future scalability to users.
Unfortunately, designing robust gigabit speed serial links is typically viewed as a black art beyond the skill of most
system architects. The most challenging area is jitter, which can adversely affect the reliability of data transfer
through the link. As you'll see, repeaters and retimers can be used to mitigate these difficulties and, with care,
provide signals that comply with all jitter-related industry specifications. To better illustrate the impact of repeaters
and retimers, let's start with a more detailed look at the physical layer of fibre channel links.
Speeding the connection
At the transmit source of a fibre channel architecture, 8 b of raw data are encoded into a 10-b character using the
8B/10B encoding scheme. This encoding scheme provides the encoded data with several critical properties of value in
serial data communications. 10-b data is serialized and transmitted through a high-speed, differential output buffer,
onto copper cable or into an optical transceiver for conversion to light. This is done with an output buffer that uses
either positive emitter coupled logic (PECL) or current mode logic (CML).
At the other end of the fibre channel link, high-speed differential data from the copper cable or output of an optical
transceiver is input to a high-speed, differential input buffer that drives a clock and data recovery (CDR) unit. This CDR
extracts a bit-rate clock from the serial data, samples the data, then deserializes and decodes it into 8-b raw data.
The deserializer aligns the recovered serial data to the 8-b parallel bus and uses a distinct 7-b comma character
(0011111) in the serial stream to identify alignment boundaries. This parallel data is then processed by higher-level
functions.
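The alignment step can be illustrated with a minimal sketch. It assumes the serial stream is available as a string of '0' and '1' characters and searches only for the 7-b comma pattern quoted above; the function names are hypothetical, and a real deserializer does this in hardware and also watches for the complemented pattern (1100000) used at the opposite running disparity.

```python
from typing import List, Optional

COMMA = "0011111"  # 7-b comma pattern quoted in the text


def find_alignment(bits: str, word_len: int = 10) -> Optional[int]:
    """Return the bit offset (0..word_len-1) of character boundaries.

    The comma marks the start of a 10-b character, so its position
    modulo the word length gives the alignment. Returns None when no
    comma is present in the stream.
    """
    idx = bits.find(COMMA)
    if idx < 0:
        return None
    return idx % word_len


def split_characters(bits: str, word_len: int = 10) -> List[str]:
    """Cut the stream into complete 10-b characters once aligned."""
    offset = find_alignment(bits, word_len)
    if offset is None:
        return []
    usable = bits[offset:]
    last_start = len(usable) - word_len
    return [usable[i:i + word_len] for i in range(0, last_start + 1, word_len)]
```

For example, a stream with three stray leading bits followed by a comma-bearing character yields an alignment offset of 3, and everything after that offset is cut on 10-b boundaries.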
The 8B/10B encoding aids serial communications by conditioning the transmitted signal in order to ease the receiver
implementation. 8B/10B encodes raw data into a 10-b character, which has guaranteed edge density (30% on average)
and limited run length (5 b maximum) making the CDR easier to design. Furthermore, 8B/10B also ensures that DC
balance is maintained on the copper cable so that the receiver does not see a DC shift at its input. Due to the 8B/10B
encoding, links may be AC coupled in order to provide implementation flexibility between vendors at each end of the
link.
In most serial protocols, there is an assumption that the transmitting system and receiving system are asynchronous
to each other. In fibre channel, each side of the link is normally implemented using oscillators with plus/minus 100
parts per million (PPM) frequency tolerance so that the receiver sees potentially up to plus/minus 200 PPM between
the baud rate of the incoming data and that of its local reference clock. All circuit functions that recover data, either
serially or in parallel, must accommodate this frequency offset.
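As a rough worked example of what this offset means, the sketch below adds the two oscillator tolerances and converts the result into bits gained or lost per second. The function names are illustrative only, not part of any fibre channel specification.

```python
def worst_case_offset_ppm(tx_tol_ppm: float = 100.0, rx_tol_ppm: float = 100.0) -> float:
    """Worst case: one oscillator is fast by its full tolerance, the other slow."""
    return tx_tol_ppm + rx_tol_ppm


def bit_slip_per_second(bit_rate_bps: float, offset_ppm: float) -> float:
    """Bits per second by which the incoming stream leads or lags the local clock."""
    return bit_rate_bps * offset_ppm * 1e-6


offset = worst_case_offset_ppm()               # 200 PPM
slip = bit_slip_per_second(1.0625e9, offset)   # roughly 212,500 bits per second
```

At 1.0625 Gbps, a 200-PPM mismatch therefore amounts to a few hundred thousand bit times every second, which is why any circuit that retransmits on a local clock needs a way to absorb the difference.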
All that jitter
By the time the serial signal reaches a receiver, various forms of jitter may have been added to the signal, making
reliable, error-free data recovery difficult. Jitter is the displacement of a signal's edge from its ideal location and may
be caused by numerous factors. Random jitter is unbounded Gaussian jitter (caused primarily by thermal noise) which
is generated in clock multiplier units (CMUs) in transmitters and in optical transceivers.
Due to the presence of random jitter, links are specified for a given bit error rate (BER), which is usually better than
10^-12. At the 1.0625-Gbps fibre channel rate, this allows a single bit error approximately every 15 minutes. Actual
systems perform much better than industry specifications but reliable data communication products assume a
worst-case BER and implement reliable recovery mechanisms that result in error-free transmission.
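The 15-minute figure follows directly from the BER and the bit rate. The short calculation below is a sketch of that arithmetic; the function name is an illustrative choice.

```python
def mean_seconds_between_errors(ber: float, bit_rate_bps: float) -> float:
    """Average time between bit errors for a link running at its worst-case BER."""
    errors_per_second = ber * bit_rate_bps
    return 1.0 / errors_per_second


t_1g = mean_seconds_between_errors(1e-12, 1.0625e9)  # about 941 s, just under 16 minutes
t_2g = mean_seconds_between_errors(1e-12, 2.125e9)   # about 471 s, just under 8 minutes
```

Doubling the data rate to 2.125 Gbps at the same BER halves the mean time between errors.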
Another consideration is deterministic jitter, timing distortions and irregularities caused by circuit effects in the
transmission system. The disturbance is generated by numerous mechanisms in links but is bounded so that a receiver
can reliably accommodate deterministic jitter, if under a certain threshold. Most deterministic jitter is generated by the
bandwidth limitations of the transmission system and results in data-dependent variations in signal.
In the real world, some pretty awful signals are input to systems and must be reliably recovered by the receiver. In
general, receivers are implemented with a CDR that is phase-locked loop (PLL) based. The PLLs in these systems
generally feature loop bandwidths in the MHz range, providing decent jitter tolerance. If jitter is considered as a
sinusoidal variation in the location of signal edges, then the frequency of jitter refers to the rate at which the edge
moves relative to its ideal position. The amplitude of jitter refers to the amount of absolute movement from ideal.
The CDR's PLL can reliably recover sinusoidally jittered serial data that has jitter below a certain amplitude threshold.
This threshold amplitude varies with the frequency of jitter. Jitter below the loop bandwidth of the PLL is trackable,
while jitter above the loop bandwidth is not. As a rule, the higher the receiver's loop bandwidth, the higher the jitter
tolerance.
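The relationship between loop bandwidth and jitter tolerance can be sketched with a simplified first-order loop model. This is an assumption made for illustration, not a description of any particular CDR: the loop is taken to leave a fraction (f/fc)/sqrt(1 + (f/fc)^2) of the input jitter untracked, and the receiver is assumed to fail once the untracked jitter exceeds a fixed limit (0.5 UI here, a hypothetical value).

```python
import math


def untracked_fraction(f_hz: float, loop_bw_hz: float) -> float:
    """Fraction of sinusoidal input jitter a first-order loop does not track."""
    x = f_hz / loop_bw_hz
    return x / math.sqrt(1.0 + x * x)


def tolerance_ui(f_hz: float, loop_bw_hz: float, untracked_limit_ui: float = 0.5) -> float:
    """Largest input jitter amplitude (UI) that keeps untracked jitter within the limit.

    Requires f_hz > 0. The 0.5-UI default is an assumed, illustrative limit.
    """
    return untracked_limit_ui / untracked_fraction(f_hz, loop_bw_hz)
```

In this model, tolerance at a given jitter frequency rises with loop bandwidth: for 100-kHz jitter, a 4-MHz loop tolerates roughly four times the amplitude that a 1-MHz loop does, while well above the loop bandwidth the tolerance flattens out at the untracked limit.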
Complex serial systems
While a typical point-to-point connection addresses the needs of some system integrators, much more useful systems
can be built using hubs, switches, and other serial routing functions to connect, steer, and replicate serial signals.
Duplex links between systems would then route through more circuitous paths to connect end-user nodes but would
provide more flexibility and robustness than point-to-point connections.
One of the more complex subsystems in fibre channel is the just-a-bunch-of-disks (JBOD) disk array. In this array,
incoming serial data is routed to all functional drives, isolated from all non-functional drives, and then routed back to
the source of data.
Figure 1 shows a typical fibre channel arbitrated loop (FC-AL) JBOD. Each drive has two fibre channel links for
redundant access in high-availability systems. Two independent sets of circuits are used to connect the incoming serial
data (on loop A or loop B) to the disk drives. Repeaters and retimers are used at the incoming and outgoing ports of
the system to clean up signals in each direction.
Port bypass circuits (PBCs) are also included in the fibre channel architecture highlighted in Figure 1. These circuits
serve as high-speed serial multiplexers that steer gigabit signals to the drives.
When combined together, the repeaters, retimers, and PBCs form a virtual FC-AL that includes all functional drives and
isolates all non-functional drives. Intermediate repeaters may be required, depending on the size and complexity of
the system.
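The steering performed by the PBCs can be modelled in a few lines. The sketch below is a behavioural illustration only: each drive slot is a hypothetical (name, functional) pair, and a bypassed slot simply passes the signal on to the next PBC.

```python
from typing import List, Tuple


def build_loop(slots: List[Tuple[str, bool]]) -> List[str]:
    """Return the path a serial signal takes through a chain of PBCs.

    slots lists the drive slots in physical order as (name, functional).
    Functional drives are inserted into the loop; non-functional drives
    are bypassed, so the loop stays closed around them.
    """
    path = ["loop_in"]
    for name, functional in slots:
        if functional:
            path.append(name)
    path.append("loop_out")
    return path


loop_a = build_loop([("drive0", True), ("drive1", False), ("drive2", True)])
# loop_a == ["loop_in", "drive0", "drive2", "loop_out"]
```

In a dual-loop JBOD such as the one in Figure 1, the same logic would apply independently to loop A and loop B.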
A repeat performance
In this article, the term "repeater" will be used for a CDR circuit where the recovered serial data is retransmitted
synchronously to the recovered clock. Repeaters are used to receive and retransmit serial data and, in the process,
clean up the signal. Repeaters are not protocol-addressable entities but are considered part of the interconnect.
(Please note that in alternate technologies, such as Ethernet, other terms may be used for this function.)
In a typical repeater block diagram, serial input data is sent to the repeater PLL and CDR sampling flip-flop (see
Figure 2). The PLL lines up a 1.0625- or 2.125-GHz clock synchronously to the rising and falling edges of input data as
well as a clock 180° out of phase with the recovered clock used to resample the data in the flip-flop.
Resampled data is then retransmitted through the output buffer. Some implementations require an external control
input, LOCK2REF, which forces the repeater to lock to a local reference clock when not locking on data. Other repeaters
do not need a LOCK2REF input or a reference clock.
The repeater is simple, requires fairly low power, and features low latency through the device (usually several bit
times). However, jitter below the loop bandwidth of the CDR PLL will be retransmitted without any jitter attenuation.
Any jitter above the loop bandwidth of the PLL is attenuated aggressively.
Traditionally, analog repeaters have been employed in fibre channel architectures. However, modern,
application-specific, digital repeaters are gaining popularity due to their superior performance. Digital repeaters offer
outstanding jitter tolerance, which is the ability of the receiver to reliably recover the data error-free, as well as low
jitter transfer, which measures how much of the jitter at the input signal is transferred to the output.
In general, the higher the loop bandwidth of the receiver, the better the jitter tolerance and the worse the jitter
transfer. Typically, one-stage analog repeaters are forced to choose between high loop bandwidth (resulting in
good jitter tolerance but poor jitter transfer) and low loop bandwidth (poor jitter tolerance but good jitter transfer).
Analog vs. digital
Figure 3 shows the jitter transfer of an analog and a digital repeater. The X-axis is the frequency of jitter. The Y-axis
is the ratio of jitter output versus jitter input measured in dB. The jitter-transfer curve shown in Figure 3 for an analog
repeater shows high loop bandwidth (~15 MHz) where the loop bandwidth is measured at the -3-dB transfer point.
Any jitter below the loop bandwidth is transferred from the input to the output unattenuated. Furthermore, many
analog repeaters exhibit jitter peaking where the input jitter actually excites a higher jitter at the output. In general,
this is less than 1 dB, but it does limit the number of repeaters that can be cascaded in real-world systems.
A digital repeater offers several benefits. First, it operates identically over process, voltage, and temperature. A
two-stage digital repeater implements a first-stage CDR with high loop bandwidth (approximately 4 MHz) in order to
have good jitter tolerance. The second stage features very low loop bandwidth (approximately 75 kHz) in order to
minimize the amount of jitter transferred to downstream devices. Also, since digital repeaters do not peak, they can be cascaded
without worry. When these features are combined, this repeater provides good jitter tolerance and good jitter transfer
in a non-peaking design.
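To get a feel for these numbers, the sketch below models each CDR stage as a first-order low-pass filter on jitter, with its -3-dB point at the loop bandwidth. That is a simplifying assumption (real loops are higher order, and an ideal first-order response has no peaking), and the function names are illustrative. The bandwidths are the approximate values quoted above.

```python
import math
from typing import Iterable


def stage_gain_db(f_hz: float, loop_bw_hz: float) -> float:
    """Jitter transfer (dB) of one stage, modelled as a first-order low-pass."""
    return -10.0 * math.log10(1.0 + (f_hz / loop_bw_hz) ** 2)


def cascade_gain_db(f_hz: float, loop_bws_hz: Iterable[float]) -> float:
    """Jitter transfer (dB) of stages in series; gains in dB add."""
    return sum(stage_gain_db(f_hz, bw) for bw in loop_bws_hz)


def cascaded_peaking_db(peak_db_per_repeater: float, n_repeaters: int) -> float:
    """Worst-case accumulated peaking if every repeater peaks at the same frequency."""
    return peak_db_per_repeater * n_repeaters


ANALOG_ONE_STAGE = [15e6]        # ~15-MHz loop bandwidth
DIGITAL_TWO_STAGE = [4e6, 75e3]  # ~4-MHz first stage, ~75-kHz second stage

analog_at_1mhz = cascade_gain_db(1e6, ANALOG_ONE_STAGE)
digital_at_1mhz = cascade_gain_db(1e6, DIGITAL_TWO_STAGE)
```

Under this model, 1-MHz jitter passes through the analog repeater almost untouched, while the two-stage digital repeater attenuates it by more than 20 dB. The peaking helper shows why even a fraction of a dB per device matters: the contributions add in dB along a chain of repeaters.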
Retimers
The term "retimer" is used for a CDR that retransmits the recovered serial data synchronously to a local reference
clock. This complex retimer function eliminates jitter transfer at the expense of latency.
Due to the potential mismatch between the baud rate of the incoming data and the local reference clock
(approximately plus/minus 200 PPM), an add/drop first-in, first-out (FIFO) buffer is required to insert or delete 40-b ordered
sets known as "fill words" to accommodate this rate difference. By eliminating jitter transfer, signal-quality standards
compliance is ensured. A block diagram of a typical retimer includes a digital receiver, the add/drop FIFO scheduler,
and an output flip-flop for retransmission (see Figure 4).
Data that is added or dropped must be anticipated by the protocol level, in this case fibre channel, which must provide
for addable and droppable protocol entities. Since fibre channel was conceived with the idea of complex data paths,
retimers were considered at the early stages of protocol development. For example, a transmitting source for fibre
channel frames must place at least six idle ordered sets between each frame. At the receiving end, at least two idle
ordered sets must be present. This allows intermediate devices to drop a maximum of four idle ordered sets.
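A back-of-the-envelope sketch shows how often a retimer has to act on a fill word and how much slack the idle rule leaves. It assumes the worst-case 200-PPM offset and 40-b fill words mentioned above; the function names are illustrative.

```python
def bits_between_fill_word_ops(fill_word_bits: int = 40, offset_ppm: float = 200.0) -> float:
    """Bits that pass before the rate mismatch accumulates to one full fill word."""
    return fill_word_bits * 1_000_000 / offset_ppm


def seconds_between_fill_word_ops(bit_rate_bps: float,
                                  fill_word_bits: int = 40,
                                  offset_ppm: float = 200.0) -> float:
    """The same interval expressed in seconds at a given line rate."""
    return bits_between_fill_word_ops(fill_word_bits, offset_ppm) / bit_rate_bps


def droppable_idles(idles_at_source: int = 6, min_idles_at_receiver: int = 2) -> int:
    """Idle ordered sets that intermediate devices may remove between frames."""
    return idles_at_source - min_idles_at_receiver


interval_bits = bits_between_fill_word_ops()           # 200,000 bits
interval_s = seconds_between_fill_word_ops(1.0625e9)   # roughly 188 microseconds
budget = droppable_idles()                             # 4
```

At the worst-case offset, a retimer needs to add or drop one fill word about every 200,000 bits, which is 5,000 fill-word times. The budget of four droppable idles is shared by all intermediate devices along the path.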
Since data in the add/drop FIFO is taken out of the FIFO with a locally-generated reference clock, the jitter at the
output will depend on the output stage of the retimer and the internally generated bit-rate clock, but not on the
input serial data.
A price is paid for the additional complexity involved in retimers. Obviously, power increases, but more importantly,
there is a latency penalty on the order of several hundred bit times due to the FIFO. The efficiency of a network
falls as the latency of each path rises. Therefore, the increased latency in the retimer reduces network efficiency.
Which to use?
As a general rule, designers of complex systems should use retimers at the output of their
systems or at any node where customers might view a data eye. This ensures that the customer only sees
specification-compliant data eyes at the output of retimers. By using repeaters elsewhere, the latency penalty
associated with retimers is avoided and network efficiency is optimized. Although repeaters may transfer some low
frequency jitter from the input to the outputs, this frequency is low enough that downstream receivers will reliably
recover the data without error.
The basic functions of a fibre channel repeater or retimer can be used in other protocols such as Gigabit Ethernet and
Infiniband. In Ethernet, a repeater is equivalent to a fibre channel retimer - an unfortunate naming snafu. Unlike fibre
channel, Gigabit Ethernet is dominated by point-to-point links between network interface cards (NICs) and switches.
Ethernet repeaters are used between two subnets in the same collision domain and are most commonly called upon to
translate from one media type to another. Consequently, an Ethernet repeater is more complicated than a fibre
channel retimer since multiple physical layer protocols must be supported and management is almost always
mandatory.
In Infiniband, these functions are addressed by a "retiming repeater," and each link is allowed to contain no more than
two retiming repeaters. This function is equivalent to a fibre channel retimer. In order to accommodate frequency
offsets between the incoming data and the local transmit clock, retiming repeaters are allowed to add or drop skip
symbols (a special 10-b word) within the skip ordered set (initially a 40-b entity) which starts with a K28.5 character.
Since Infiniband consists of 1x, 4x, or 12x wide links, the retiming repeater may be designed to perform lane-to-lane
deskew as well.
The equivalents of fibre channel repeaters have not been used in either Ethernet or Infiniband, although nothing in the
specifications precludes their use.
About the Author
Bob Rumer is the vice president of Storage Communications at Vitesse Semiconductor Corp. Prior to Vitesse, Rumer
held engineering positions at Tandon Corp., Philips Ultrasound, and Beckman Instruments. Rumer received his MSEE
from the University of California at Berkeley and can be reached at rumer@vitesse.com.