18 March 2010

Intelligent Data Recovery

Customers will continue to clamor for more bandwidth, and so the opportunities for data to become lost amidst all of the added noise and distortion will continue to multiply. Because data recovery is critical, it takes a “smart” CDR to pass the test.

By Gary McCormack

The clock and data recovery (CDR) function is a key element in any high-performance fiber-optic link. The ability to recover data amidst the noise and distortions present in a transmission path is a primary determinant of that link’s reliability and span. The need for higher bandwidth, longer spans, and more wavelengths continues to increase, making the CDR function even more critical.

The CDR is the point in a system where the analog signal is reduced to a digital result. In that conversion, much information (such as signal strength and jitter) is left behind. If the conversion goes well, the look of the eye pattern for the incoming data is not as important. However, when the link starts to fail, the errors can grow rapidly, leaving no room to recover. In a communication system where statistical effects are significant, the binary decision that a CDR makes when it captures the data seems to be an oversimplification.

With the push for higher performance comes the demand for higher link integrity, which has fueled such error-recovery systems as forward error correction (FEC). But even FEC has its limits. At some point, lost data bits from an overtaxed CDR can rise well above FEC’s ability to compensate. When a link fails, there may be little to no telemetry as to why it failed. FEC can provide an early warning that signal integrity is low, but it provides little information about the cause of the errors, nor does it do anything to eliminate the cause of the errors in the first place.

Of the two basic functions of a CDR, clock extraction and data retiming, clock extraction tends to get the most attention. Specific standards define the behavior of the clock recovery loop, and those specifications become key discussion points in the evaluation of a CDR chip. After all is said and done regarding jitter transfer, for example, what about the setup-and-hold time of the retiming latch? A CDR chip could meet all of the SONET jitter specs and still do a poor job of capturing the data eye because of a slow or poorly timed retiming latch.

A good CDR can usually extract the clock frequency from the data without too much difficulty. The SONET standards for the loop dynamics are so well-defined that little room is left for improvement in the design of the clock recovery loop. One task remains — to recover the data eye using the extracted clock. In data recovery, there is no such thing as “good enough” — there can never be too much voltage or phase sensitivity.

Ideally, the CDR circuitry should position the voltage- and phase-slicing levels at the optimal point within the eye pattern of the incoming data signal. In real life, however, the characteristics of the eye patterns are far from ideal: their shapes are often distorted and vary over time. In worst-case conditions, the actual eye can be so small, and the holes formed by the edge transitions so large, that the CDR circuitry misreads the data stream or even falsely locks on the eye transitions (see Figure 1). The point at which the operation of the CDR becomes unreliable is a key performance factor in the design of a fiber link. In some cases, a small improvement in margin can mean a big change in channel capacity or span.

The usual practice for conventional CDRs is to set the voltage- and phase-slicing levels halfway between their maximums and minimums, which is the geometric center of the eye. More sophisticated recovery systems will shift the sampling point away from the geometric center to compensate for asymmetrical error gradients and/or the decision latch’s setup time. Although these offsets can significantly improve the acquisition margin, the amount of adjustment is frequently static and assumes a constant error gradient over time.

To find and track the center of the data eye, most CDRs rely on an analog or digital control loop to position the sampling point in the data eye. The feedback that drives the loop is taken from the transitions in the data stream. Unfortunately, those control loops are susceptible to offset errors that can shift the sampling point away from the optimum position. Additionally, most loops are unaware of the setup time of the retiming latch, which can be a significant fraction of the eye opening. The usual recourse for dealing with these offsets is a combination of thorough (and time-consuming) design and production iterations. Even after aggressive tweaking and fine-tuning, the best CDRs can still find themselves on the brink of failure. As the eye quality degrades, the CDR may run the risk of falsely locking on the wrong part of the eye pattern. As the CDR drifts into the noisy region of the eye or false-locks, it will happily pass errored data and render the link useless.

If I only had a brain

The demands placed on modern high-performance fiber links don’t tolerate error in the data recovery process. In response to those demands, methods have been developed to deal with the offsets in the CDR by wrapping the equivalent of a technician with a screwdriver around it. Using a high-level control loop, the system is able to tweak the chip. In doing so, the system introduces an important aspect to the recovery process by recognizing the statistical nature of the data. Now, the sampling point can be moved from the geometric center to the position with the highest margin for errors. While these systems that correct poor behavior can be effective, they tend to be complex and expensive. One key reason for this complexity is that the solution is implemented in layers on top of a conventional CDR, composed of many (and sometimes disparate) pieces, such as digital-to-analog converters (DACs), op-amps, microprocessors, and SONET performance monitors.

Rather than design systems to correct a conventional CDR’s behavior, why not design the CDR with the intent to work in a high-level control loop? The task of fine-tuning the CDR then becomes more efficient and effective. Also, many of the constraints that burden the task of designing a good CDR disappear, allowing the designer to focus on the more important attributes. The CDR becomes part of a larger system that is designed to extract the maximum performance from the CDR.

In its simplest form, the recovery system would involve two functions: a clock/data recovery function and a brain to control it (as shown in Figure 2). The two functions would interact through two processes: control and telemetry. The control process allows the brain to change the operating point in the CDR, and the telemetry process provides data to the brain so it can figure out what to do.

Examples of the control function are commands that might set the slicing level of the input comparator using a DAC or move the phase of the clock used to strobe the retiming latch. Data regarding the current sampling point’s proximity to the edge of the data eye opening would be an example of telemetry. Through the exchange of telemetry and control commands, the overall system would be able to fine-tune the operation of the CDR to optimize margin. Such a system would be able to find the true center of the eye.
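The exchange of control and telemetry can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `FakeCDR`, `set_operating_point`, `read_margin`, and `tune_once` are hypothetical names, the code ranges are arbitrary, and the "margin" is a made-up function of distance from a hidden optimum rather than anything a real chip reports.

```python
from dataclasses import dataclass


@dataclass
class FakeCDR:
    """Toy stand-in for a digitally controlled CDR (all names hypothetical)."""
    slice_code: int = 32        # DAC code setting the voltage-slicing level
    phase_code: int = 16        # code selecting the retiming clock phase
    _best_slice: int = 40       # hidden optimum, used only by this toy model
    _best_phase: int = 12

    # Control: the brain changes the operating point.
    def set_operating_point(self, slice_code: int, phase_code: int) -> None:
        self.slice_code = slice_code
        self.phase_code = phase_code

    # Telemetry: the CDR reports how far the sampling point is from the eye edge.
    def read_margin(self) -> int:
        distance = (abs(self.slice_code - self._best_slice)
                    + abs(self.phase_code - self._best_phase))
        return max(0, 20 - distance)


def tune_once(cdr: FakeCDR) -> int:
    """One control/telemetry exchange: probe four neighbours, keep the best."""
    s0, p0 = cdr.slice_code, cdr.phase_code
    best = (cdr.read_margin(), s0, p0)
    for ds, dp in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        cdr.set_operating_point(s0 + ds, p0 + dp)
        margin = cdr.read_margin()
        if margin > best[0]:
            best = (margin, s0 + ds, p0 + dp)
    cdr.set_operating_point(best[1], best[2])
    return best[0]
```

Calling `tune_once` repeatedly walks the operating point toward the position with the most margin. The controller never needs to know why the optimum sits where it does; it only needs commands it can issue and a number it can read back.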

It should be the goal of the CDR designer to maximize simplicity (minimize chip count) without sacrificing flexibility. At first glance, integrating both the CDR and processor into a single chip would seem desirable. However, doing so involves certain compromises. In many cases, a processor is already present and available in the system. Also, the semiconductor technology that is best suited to building a high-performance CDR may not be the optimum technology for implementing a reasonably powerful processor. Single-chip controllers have become cost-effective and come in form factors that require about as much board area as the bypass capacitors for the CDR. A good compromise would be to split the system into a processor and a digitally controlled CDR.

The CDR function in this type of system has several requirements above and beyond a conventional CDR. At the core would still be a clock extraction circuit and a retimer. Layered on top would be functions needed to steer the sampling point of the retimer. Many of these functions were in the recovery systems described earlier, but now they would be integrated into a single chip (see Figure 3).

It’s important to note that if the variable voltage-slicing capability is to be of any use, a fully linear signal path must lead to the retimer function. Although many system designs use a limiting amplifier, doing so will move the voltage-slicing decision to the input of the limiting amplifier (which is highly sensitive, making control difficult). To avoid these issues, an automatic gain control (AGC) buffer should be used to provide a leveled, linear signal path to the voltage-slicing function.

Provisions have to be made to adjust both the voltage and the phase of the retiming latch under digital control. The most straightforward way to adjust the voltage-slicing point would be to use a DAC to set the slicing level of a comparator. By stepping through the codes in the DAC, the system controller should be able to adjust its slicing point from the extreme low to the extreme high level of the data signal. Fortunately, the DAC doesn’t need to carry a lot of precision or linearity. By moving in incremental steps, a centering algorithm could work around the nonlinearities. The phase of the sampling clock also needs to be controlled digitally. One simple way to achieve this digital control is to use an analog multiplexer to mix specific phases from the clock source to create the desired clock phase.
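To illustrate why linearity matters so little, consider one possible form of phase mixing. The sketch below assumes two quadrature clock phases (0 and 90 degrees) blended with linear weights set by a digital code; the function name `mixed_phase_deg` and the 16-step resolution are illustrative choices, not details from any particular chip. The weighted sum (1 - a)·sin(ωt) + a·cos(ωt) is a sinusoid whose phase is atan2(a, 1 - a).

```python
import math


def mixed_phase_deg(code: int, steps: int = 16) -> float:
    """Phase of (1 - a)*sin(wt) + a*cos(wt), where a = code / steps.

    Assumes two quadrature phases mixed with linear weights (hypothetical).
    """
    a = code / steps
    return math.degrees(math.atan2(a, 1.0 - a))


# Phase produced by each code, and the size of each incremental step.
phases = [mixed_phase_deg(code) for code in range(17)]
step_sizes = [later - earlier for earlier, later in zip(phases, phases[1:])]
```

In this model the steps are not uniform (they are larger near mid-scale than at the ends), yet every step moves the phase in the same direction. A centering algorithm that only asks "did the last increment help or hurt?" depends on that monotonic behavior, not on equal step sizes.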

For the telemetry, the CDR must include a method to measure the margin of its operating point and produce a digital result. A simple approach is to use two acquisition channels and compare the data captured by each. By positioning one channel near the center of the eye, the other channel can be moved relative to the first, and any differences between the two channels can be logged for each placement of the scanning channel. Using counters, the number of errors per unit of time can be determined, so the results of a multidirectional scan could be used to construct an iso-BER contour (see Figure 4).
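The two-channel comparison can be simulated in software to show the kind of data the counters would produce. The following is a rough sketch that scans the voltage axis only and assumes signal levels normalized to 0 and 1 with Gaussian noise; `scan_slice_levels`, the 20-code full scale, and the noise figure are all invented for illustration.

```python
import random


def scan_slice_levels(codes, full_scale=20, n_bits=20000, sigma=0.08, seed=1):
    """Count disagreements between a fixed reference channel and a scanning channel.

    Signal levels are normalized to 0 and 1 with Gaussian noise; every
    numeric value here is illustrative only.
    """
    rng = random.Random(seed)
    ref_threshold = 0.5                      # reference channel sits mid-eye
    mismatches = {code: 0 for code in codes}
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        v = bit + rng.gauss(0.0, sigma)      # sampled voltage at the decision instant
        ref_decision = v > ref_threshold
        for code in codes:
            scan_decision = v > code / full_scale
            if scan_decision != ref_decision:
                mismatches[code] += 1        # what the on-chip counter would log
    return mismatches
```

A call such as `scan_slice_levels(range(0, 21, 2))` returns a mismatch count per slicing code. Counts stay near zero while the scanning channel is inside the eye and climb as it approaches either signal rail. Repeating the scan at several clock phases would give the two-dimensional data from which an error contour could be drawn.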

Once an error contour is determined, the microcontroller must calculate the optimum position and re-center. The algorithm can be as elaborate as a Q-test, or as simple as a ping-pong routine that incrementally moves away from the closest error. An advantage to putting the algorithm in software is that it can be changed to adapt to the characteristics of the data stream. The software can also be designed to keep track of the size of the eye opening over time, so that signal fading can be detected and an alarm activated before the signal is lost.
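A ping-pong routine of the kind mentioned above might look like the following in one dimension. This is a minimal sketch, assuming the error counts for each code have already been gathered by a scan and that the starting position lies inside the eye; in a live system the counts would be re-measured as the sampling point moves. The function names are hypothetical.

```python
def distance_to_error(errors, pos, step):
    """Number of clean codes between pos and the nearest errored code.

    `step` is +1 or -1 and selects the search direction.
    """
    distance = 0
    p = pos + step
    while 0 <= p < len(errors) and errors[p] == 0:
        distance += 1
        p += step
    return distance


def ping_pong_center(errors, pos, max_iters=100):
    """Step away from whichever eye edge is closer until both are equidistant.

    Returns the final position and the width of the clean region in codes.
    """
    for _ in range(max_iters):
        left = distance_to_error(errors, pos, -1)
        right = distance_to_error(errors, pos, +1)
        if abs(left - right) <= 1:
            break
        pos += 1 if right > left else -1
    return pos, left + right + 1
```

With `errors = [9, 7, 3, 0, 0, 0, 0, 0, 0, 0, 1, 5]` and a starting position of 4, the routine settles on position 6 and reports a clean width of 7 codes. The width value is the quantity software could log over time to detect signal fading.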

Show me the margin

So what’s gained by all this optimization? Thanks to the statistical behavior of the data, the answer is: It depends. When the data eye is healthy, little is gained. But when the eye becomes distorted and noisy, the improvement in error rate by moving just a few millivolts or picoseconds can be dramatic. Distortion, in particular, can wreak havoc on a conventional CDR that aims for the center of the eye. Therefore, the ability to shift away from center can be crucial. Improvements of 3 dB in optical loss are quite possible, which can translate to significant improvements in total span or channel capacity.
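The effect of moving the slicing level can be estimated with a standard textbook model. The sketch below assumes Gaussian noise on each logic level, equally likely ones and zeros, and noisier ones than zeros; the levels and noise values are invented for illustration, and the "equal-Q" threshold used here is a common approximation to the optimum rather than the exact optimum.

```python
import math


def ber(threshold, mu0, sigma0, mu1, sigma1):
    """Bit error rate for equiprobable bits with Gaussian noise on each level."""
    # Probability that a transmitted zero is read as a one.
    p_false_one = 0.5 * math.erfc((threshold - mu0) / (sigma0 * math.sqrt(2)))
    # Probability that a transmitted one is read as a zero.
    p_false_zero = 0.5 * math.erfc((mu1 - threshold) / (sigma1 * math.sqrt(2)))
    return 0.5 * (p_false_one + p_false_zero)


def equal_q_threshold(mu0, sigma0, mu1, sigma1):
    """Threshold that puts both levels the same number of sigmas away."""
    return (sigma0 * mu1 + sigma1 * mu0) / (sigma0 + sigma1)


# Illustrative eye: zeros at 0.0, ones at 1.0, ones three times noisier.
eye = dict(mu0=0.0, sigma0=0.04, mu1=1.0, sigma1=0.12)
ber_center = ber(0.5, **eye)                        # geometric center
ber_shifted = ber(equal_q_threshold(**eye), **eye)  # shifted toward the quiet level
```

In this made-up case the shifted threshold sits a quarter of the way up the eye, and the modeled error rate falls by several orders of magnitude relative to the geometric center. With symmetric noise the two thresholds coincide and nothing is gained, which matches the observation that a healthy eye benefits little.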

Besides improved margin in the acquisition, another significant benefit to a digitally controlled CDR is remote monitoring. The telemetry gathered to support the centering algorithms could also provide data to the system controller on the general health of the link. By monitoring the acquisition margin, the system can track the signal strength of the link and forecast failures before they occur. Predicting failures allows the protection switch to preempt the failure or could initiate a service call before a crisis occurs. Remote monitoring is the equivalent of sending a technician with a scope to manually check the signal.
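One simple way a system controller might forecast a failure is to fit a straight line to recent margin readings and extrapolate to the minimum acceptable margin. The sketch below assumes a roughly linear fade and at least two readings taken at different times; `time_to_threshold` and its units are hypothetical, and a real link may well degrade in a less orderly way.

```python
def time_to_threshold(times, margins, min_margin):
    """Estimate the time remaining until margin falls to min_margin.

    Uses an ordinary least-squares line through (time, margin) readings.
    Returns None when the fitted margin is not decreasing.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_m = sum(margins) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in zip(times, margins))
    slope = sxy / sxx
    if slope >= 0:
        return None                      # margin steady or improving
    intercept = mean_m - slope * mean_t
    t_cross = (min_margin - intercept) / slope
    return max(0.0, t_cross - times[-1])
```

For readings of 20, 18, 16, and 14 codes taken at hours 0 through 3, with a minimum acceptable margin of 10 codes, the estimate is 2 hours. That figure is the sort of lead time a protection switch or service call would need.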

With the advent of inexpensive microcontrollers, more and more systems are becoming smart. While at first it may seem like overkill to computerize a function that has historically been a pure analog problem, the benefits of increased system intelligence outweigh the incremental cost. Complexity at the chip level may be higher than before, but complexity at the system level is reduced because it is easier to predict errors and to correct them quickly with code revisions.

Gary McCormack is the director of the Portland product development group for Vitesse Semiconductor Corp. He holds a BSEE from Oregon State University and can be reached at garym@vitesse.com.

Illustrations

Figure 1
Figure 2
Figure 3
Figure 4

    All materials on this site Copyright © 2010 EE Times Group, a Division of United Business Media LLC All rights reserved.