Springing Forward to High Performance
State-of-the-art performance in today's digital communication
systems can be achieved if error correction is employed.
By Robert Howald
In our first two "Building Blocks"
columns, we pointed out the importance of top-down design of a communication system by emphasizing the importance of the link budget, followed by a discussion of a couple of relevant contemporary channels. While the link budget is the holy grail of an accurate system design, it is important to recognize that link and channel analysis often require trade-offs. In the column on link budget, the modulation type was implicitly assumed in order to outline how to apply impairments and ultimately determine
degradations. However, choice of modulation obviously depends heavily on the channel type, as well as the application and its performance requirements. The choice is driven heavily by whether the channel is power-limited or bandwidth-limited, two self-explanatory and commonly used terms to roughly describe the fundamental constraints of the channel. Both channels described last month (twisted pair, which are used for digital communications, and the return-path HFC [hybrid fiber coax] channel) represent
bandwidth-constrained situations. Of course, each also has power limitations to be wary of. An obvious power-limited situation is a satellite-to-ground link, where limited downlink power is available relative to the link loss. For many known channels, the choice of modulation family is clear.
The digital modulation choices available are well covered in my two-part series published in this magazine last summer.1,2 For those who spend their extra summer hours at softball games and barbecues, now would be a good time
to pull these issues off the bottom of that pile of Communication Systems Design magazines in the corner of your office (if, of course, a review of the techniques is of interest). Link analysis, channel impairments, and modulation type ý these are a communication systems engineer's version of the "Big Three." Understanding these to be the priority-one issues, we will undertake a perusal of a topic that might be among those considered priority "1a:" error correction.
Errors can cost you the game
If the shortstop on a major league baseball team had a fielding percentage of .999, he'd be a sure hall-of-famer. But if I send a message from the opposite side of Mars, I am a failure if the symbols are correct with a probability of .999 (granted, baseball statisticians misuse the word percentage; they are the same people who write six and two-thirds innings as 6.2). Many talented individuals have brought about a fundamental change in what we consider reliable performance in today's communication
systems. I'd list them, but I'd be sure to unjustly miss someone important. The work in the error correction arena has been continual since Shannon pointed out just how far any of our existing systems were from what was theoretically achievable. There once was a time, not all that long ago, when debates raged over whether or not it made sense to "squander" bandwidth and complexity on coding circuits instead of just applying that effort towards better modems. Now, however, there likely will be disagreement from
some that I have classified error correction as only 1a. There is justification for this complaint, as best performance is achieved by considering modulation and forward error correction together, instead of as distinct pieces. Additionally, it is a significant element in link budget trade-offs to consider various coding schemes and the anticipated link gains they can provide.
For this discussion, we will consider, as the title suggests, only forward error correction (FEC). This is in contrast to
automatic repeat-request (ARQ), where a corrupted transmission is requested by the receiver to be retransmitted. There are several applications where this is not a viable option due to latency issues.
FEC instills structure to the message on the transmit side of the link ý generally a linear algebraic structure. In doing so, it adds redundancy, resulting in overhead. The required throughput must be accommodated through increased bandwidth or a change in symbol definition. The accompanying signal-to-noise-ratio
(SNR) loss must be more than overcome by the redundancy to be useful. The structure imposed on the message is known at the receiver, which allows the message to be decoded, and the decoding operation can correct errors that occur in transmission. The number of errors that a particular code is capable of correcting is a function of the code design (how the redundancy is incorporated) and the code rate (a measure of efficiency with which the structure was added). The rate is some fraction less than one, and
is the ratio of the number of message symbols to the number of total transmitted symbols of the code, or the ratio of averages of each, depending on whether the code is a block or trellis implementation.
The fundamental purpose of FEC is to improve error rate by providing coding gain, a term that refers to the dB difference in SNR required to meet a fixed error rate for the coded and uncoded systems. For example, to achieve a bit error ratio (BER) of 1e-5 for QPSK requires ideally about 9.5 dB of Eb/No.
By introducing coding, we might try to achieve this same 1e-5 at 8.5 dB Eb/No in order to ease requirements on a transmit power amplifier or a receiver noise figure. It may not sound like much, but 1 dB of noise figure on a microwave satellite receiver front end could easily be multiple times more costly then some cheap digital encoding hardware.
Care must be taken when studying this topic, because gain exists as both raw SNR gain at fixed BER as described earlier, or as net gain after bandwidth
expansion. Care must also be taken when comparing error rate curves so as not to mix up bit-error probabilities, symbol-error probabilities, and word-error probabilities; the latter being a straightforward measure of choice for coded systems.
Block codes
The most straightforward code description is that of the block codes, where, not surprisingly, a block of symbols is taken as a message word, operated on by a linear transformation (a matrix multiplication), resulting in a new, longer word to be
transmitted. It is easiest for comprehension purposes to consider a binary message and codeword, and a binary matrix. The word is longer than the original by an amount related to the code rate, where, generally, more redundancy can correct more errors. "Optimum" (in some sense) tables of codes are available for given amounts of redundancy and error correction. For example, the famous Golay code (this one has particular significance to theorists), with rate 12/23, written as a (23, 12) code, has twelve message
symbols, twenty-three total symbols, and can correct up to three transmission errors. The error-correction capability, often denoted by t, and a code associated with it, can be chosen from the tables alluded to earlier.
To give a sense of the linear algebra aspect in a way that can be understood, consider the "sphere-packing" problem. For example, envision a clear three-dimensional cube, filled with marbles of all one color. These marbles represent all of the possible words of a block of symbols. For binary
symbols, this is all the possible arrangements of 0s and 1s (all arrangements of 23 for Golay code = 223). The message word can be any possible combination of the k binary symbols (k = 12 for Golay). A block code uses a linear transformation to expand all of these possible k arrangements (2k) into a subset of the 2n possibilities. That is, valid codewords are represented by just some of the marbles in the box. If the valid codewords are a different color, a good code would scatter them throughout the box
so that they are as far apart as possible, making it difficult to confuse them. The closest two would represent the minimum distance (which we want to maximize). Then, if a few errors in the channel make the received word not a valid codeword, the idea would be that it is just a marble or two away from a valid one. Thus, it would be easy to classify to the closest valid codeword. Naturally, there is linear algebra and combinatorial analysis that mathematically describe all of this, although the search for
codes that do the best sphere packing has been, and still is, a tough battle steeped in Cray computer searches.
Trellis codes, concatenation, and interleaving
Trellis, or convolutional codes, have essentially the same mathematical structure, though they are more complex to describe than the straightforward matrix multiplication. This type of a binary code implements redundancy using shift registers and adding (binary case). Output sequences are a shift-and-add function of input and past digits.
This operation is inverted at the receiver, and code rates, error-correction capability, and distance properties can similarly be defined for these codes. The term trellis is used because the feedback structure can be considered a state machine, whose sequence of states can be modeled in a trellis diagram, through which error performance can be evaluated. Simple trellis codes have performance very similar to that of simple block codes, with asymptotic behavior (unbounded code length) sometimes resulting in
minor performance differences under some criteria.
While most design literature favors block codes because of its ease in analysis, trellis codes are easy to implement, and represent cheap, simple, moderate-performing link "gain blocks." They also make a nice piece of a concatenated structure, where two codes are placed in series. Trellis codes fit quite nicely in concatenation schemes when complemented with the powerful, and now-popular, Reed-Solomon class of block codes. One of the important reasons for
this is that the Reed-Solomon class is efficient and handles burst errors well. While most early coding work was based on correcting randomly distributed errors in additive white Gaussian noise (AWGN), developments eventually led to structures designed or suitable for the burst environment, a characteristic of many contemporary channels, where transmission errors occur in bunches. The burst and random-correction capabilities in concatenated codes complement each other well. The use of concatenation of codes
represents designs aimed for very high performance.
Another effective way to handle burst errors is through simple interleaving. Interleaving is the organized scrambling of the order of the transmission, such that initially adjacent symbols when transmitted come out many symbols apart. When de-interleaved at the receiver, errors that occur in bursts are thereby spread out randomly, and are suitable for a random error correcting decoder. All that is necessary to know of the symbol separation required is
knowledge of burst-duration characteristics. Naturally, interleaving does not provide coding gain the way Reed-Solomon code would. It simply allows the normal block or trellis decoder to achieve close to its theoretical gain, rather than be degraded while trying to correct in a bursty channel. Some of the highest performance systems in use today implement concatenated codes and interleavers.
Signal space codes and TCM
A drawback of the block and trellis approaches described above is the
additional bandwidth required to maintain a fixed information rate. The code rate results in only a fraction of the encoded word being message information ý the rest is FEC overhead. If the throughput rate must remain fixed, then an increase in signaling rate must occur. Assuming no change in the shaping filters, which could increase susceptibility to intersymbol interference (ISI), the result is a bandwidth increase. To overcome this, the redundancy can instead be implemented via signal space codes. In this
approach, the original constellation size is increased, generating a larger set of symbols than needed by the data format alone. For example, a QPSK constellation can be increased to an 8-PSK constellation by using a rate 2/3 signal space code, since 3 bits make up an 8-PSK symbol. Of course, while there is no bandwidth expansion, an accompanying SNR loss still occurs. In this case, if it is assumed the average transmit power is fixed, the performance loss manifests itself as a decrease in distance between
constellation points, which increases raw symbol error rate. When tinkering with constellations, it is important to be aware of other relationships. In the case described above, another important effect to consider is that we have created a constellation that is also more sensitive to untracked carrier phase jitter.
Trellis-coded modulation (TCM) represents the first methodical attempt to consider the modulation and encoding process as a single entity in order to optimize performance of the whole.2,3 It is
the most recent major advance in FEC development, and was invented by G. Ungerboeck in the early 1980s.4 Traditionally, the modulation scheme and the FEC were largely independently developed. In TCM, the points of the signal space constellation are carefully arranged to maximize the distance associated with the most likely errors, so the correct symbol is easier still to distinguish. The TCM concept uses aspects of both trellis and signal space structures, and is remarkably straightforward. Encoding via TCM
is a two-step process. A set of symbols is delivered to a trellis encoder, which adds redundancy. The outputs are used to select one member of a group of constellation subsets. For example, in the 8-PSK situation above, a trellis encoder may output a single bit. As a state machine, the bit is a function of several previous bits. This digit selects one of the two "QPSK" subset constellations that together form an 8-PSK structure. Given that a constellation has been chosen, the remaining 2 bits (in this case
with QPSK subsets) are then used to choose a symbol within the QPSK subset.
The optimum arrangement of the bits within subsets is obtained by decomposing the complete constellation (ultimately down to the two-symbol level), with each additional subset resulting in an increase in minimum distance in the symbols within it compared to the previous subset. This procedure, called set partitioning, ensures that for a given desired throughput and code, properties that minimize detection errors are optimized.
The TCM design process involves optimally determining, for a given application, what number of symbols from what trellis encoder select what size constellation from what size of original signal space. What this sentence implies is that TCM is a developing art.
In the latter part of 1995, while in the midst of a design of a telecommunications system over cable, I was in a technical meeting with one of the brightest modem engineers I have had the privilege of working with. Our FEC approach was, in part,
inherited from some previous work, where the typical modus operandi (unfortunately) was to optimize time to market at the expense of a nonoptimized product. While reviewing some of the modem simulation work done in conjunction with the FEC, the fellow commented that the modem work was not suited for our application, and it was done by "an FEC guy, who doesn't really know modems that well." Not too long after, I was in another technical meeting, this one discussing telecommunications on a switched digital video
network, where another of the sharpest FEC individuals I had ever met was in attendance. During the discussion, he mentioned that a particular FEC structure may not be quite right for the system, and needed to be checked, since it was done "by a modem guy, who doesn't really know FEC that well." The moral? At the next big joint meeting, I'll be ready with helmets and pads to hand out. And besides that? The "us-versus-them" attitude will have to change. The highest performing systems in the future will need
to incorporate methods that draw heavily on both modulation and coding theory, and in thinking about them as a single entity. We may never see a shortstop with a perfect fielding percentage, but at least our work in error correction will keep Shannon smiling.
Robert Howald is a staff engineer in the transmission network systems group at General Instrument's communications division in Hatboro, PA. He has a BSEE and an MSEE from Villanova University, and is currently a PhD
candidate at Drexel University.
References
1. Howald, R., "Modulation Basics for Digital Communications, Part 1," Communication Systems Design, July 1996.
2. Howald, R., "Modulation Basics for Digital Communications, Part 2," Communication Systems Design, August 1996.
3. Biglieri, E.; Divsalar, D.; McLane P.; and Simon, M., Introduction to Trellis-Coded Modulation with Applications, Macmillan Publishing, NY, 1991.
4. Ungerboeck, G., "Channel Coding with Multilevel/Phase Signals", IEEE
Trans. on Information Theory, IT-28, No. 1, January 1982.