Trellis-coded modulations using an "industry standard" 64-state,
rate 1/2 convolutional code on M-ary PSK constellations have been
proposed and patented to improve the bandwidth utilization of
satellite transponders. This article describes implementation
details of this technique, which provides simpler branch-metric
computations for a DSP-based 8-PSK trellis decoder and a
phase-ambiguity-resolution method using the R-S outer code.
Introduction
Intelsat has adopted a new standard
, which employs trellis-coded modulation (TCM),
using 8-PSK, with mandatory Reed-Solomon (R-S) (219, 201) outer
coding over GF(28). This standard has twice the
bandwidth efficiency (at almost 2 bps/ Hz) than the IESS-308
standard it replaces (at almost 1 bps/Hz). The standard states,
"Since 8-PSK TCM uses practically the same satellite power and is
twice as bandwidth efficient, its usage will permit more efficient
use of orbital spectrum". This standard, reviewed in the Background
section, does not use the optimum 64-state Ungerboeck TCM
for the chosen 8-PSK modulation. Instead the
standard chooses a "pragmatic" TCM (PTCM) scheme, based on the
methodology of Viterbi
and two patents

. The use of PTCM is justified in Viterbi
as follows:
- There exists a widely used "industry standard"
constraint-length 7 (64-state), rate ½ convolutional code that
is optimum for BPSK and QPSK.
- While the use of this convolutional code in PTCM results in a
2dB clear sky loss relative to the optimum 64-state, rate 2/3
Ungerboeck TCM at very low BERs, there is only a 0.4dB loss at a
BER of 10-5. The mandatory R-S outer code further
reduces this BER to an acceptable level.
One patent title
reflects the chief benefit of PTCM for 8-PSK:
reduction of the trace-back memory and computation complexity (per
decoded bit) associated with the PTCM decoder relative to a decoder
for the optimum 64-state Ungerboeck TCM decoder. The patent also
describes a metric-setting method, reviewed in the Branch-Metric
Computation section, that requires a conversion from in-phase and
quadrature data to phase (this requires a divide, table look-up,
and other four-quadrant logic to be provided external to the
"industry-standard" Viterbi decoder). The Efficient Branch-Metric
Calculations section describes a simple metric-setting procedure,
suitable for DSP software implementation, yielding the desired
performance using only multiplies and saturation logic.
The second patent
describes the phase-ambiguity-resolution circuit
you need if the PTCM scheme is used by itself (in other words,
without Reed-Solomon outer coding). The use of this circuit effects
the branch-metric computation at low-to-moderate values of
Eb/N0. You can minimize the multiplication of
errors caused by the ambiguity-resolution circuit by erasing (in
other words, setting to 0) some branch metrics when the received
signal is close to intermediate significant-bit (ISB)
transitions.
This article continues with a review of two proposals for PTCM
schemes at 2.5 bps/Hz using 8-PSK. Also in this section is the
performance of simple branch-metric computations for the more
promising scheme. We conclude with a description of a procedure
that uses the R-S outer code to resolve phase ambiguity.
Background
Figure 1 shows the PTCM phase-ambiguity-resolving encoder
proposed in the two patents.
Figure 1: Phase Ambiguity Encoder for rate 2/3 PTCM using 8-PSK
The 8-ary symbol formed by ENCC[2:0] is mapped to the 8-PSK
constellation (Figure 2).
Figure 2: Symbol mapping to 8-PSK constellation
Figure 3 shows a block diagram of the PTCM decoder. We
are principally concerned with a description and simplification of
the first module in Figure 3 (the computation of the branch
metrics to the Viterbi Decoder). The decoder's performance
enhancement with respect to uncoded QPSK may be intuitively
understood using the following description. The distance between
closest signal pairs for uncoded BPSK is a factor of √2 times
that of uncoded QPSK with equal signal power. However, the latter's
signal power may be increased by 3dB while maintaining the same
energy per bit as the former; thus, uncoded BPSK and QPSK have
identical performance. Assume the Viterbi decoding/ re-encoding in
Figure 3 produces 2 least-significant bits (of the 3 bits)
that are error-free. Therefore, the distance between antipodal
signals is greater (by a factor of √2) than the minimum
distance for uncoded QPSK; thus, the asymptotic coding gain (ACG)
is 3dB (when compared to uncoded QPSK, also 2bps/Hz).
Figure 3: PTCM decoder block diagram
For more exact estimates of coding gain at various
Eb/N0's (neglecting-error-multiplication by
the phase ambiguity resolution decoder), it is typical to compare
the performance of a rate (N-1)/N PTCM scheme using M=2N
signals with the performance of an equivalent bandwidth uncoded
system using M/2 signals. For uncoded operation, the BER,
Pbu is bounded by:

while the coded BER, Pbc, with an M-signal
constellation, is lower bounded by:

where K≤1. For rate 2/3 PTCM using 8-PSK, using a standard
64-state convolutional code, this reduces to:

The following description will help to explain these results.
The minimum distance path is only one branch long, due to the two
parallel transitions from a state, X, at stage n to stage n+1
(Figure 4). The single-branch error probability is merely
the BPSK bit-error probability, with energy doubled since two bits
are sent per symbol, and multiplied by a factor of 0.5 because only
one out of two input bits is involved in such single-branch
decision errors. This is a lower bound, since errors from
multi-branch paths must also be considered. At higher
Eb/N0 for the standard 64-state code, the
multi-branch errors may be neglected (in comparison to
single-branch errors).
Figure 4: Trellis for PTCM
To achieve this performance, the branch metrics must be set
according to the Euclidean distance between the received signal and
the four closest transmitted points in the signal
constellation.
Branch-Metric Computation
For gray-coded QPSK used in IESS-308, with the constellation
points being ejkπ/2, k=0,1,3 and 2, four
(soft-decision) branch metrics computed from the incoming in-phase
and quadrature matched filter outputs, I and Q, are simply I, Q,
-I, and -Q corresponding to symbols 00, 01, 11, and 10. These
matched filter values, when negated, may be considered relative
squared Euclidean distances.
Figure 5: Branch metrics calculated using the Euclidean distances to four nearest neighbors
For IESS-310, without loss of generality, we consider received
in-phase and quadrature components as shown in Figure 5 and
the squared Euclidean distances between the received signal and the
four closest transmitted points in the 8-PSK constellation. The
squared distances with respect to constellation points on a circle
of radius R are:

R (which, assuming no fading, is constant) may be estimated
using an automatic gain control circuit (AGC). If the effects of
varying P (=I2+Q2+R2) are ignored,
the branch metrics may be taken to be the last terms in the
right-hand side of the equations, the correlation of the received
signal with the closest four transmitted signals. In order to avoid
determining which four signals (of the eight) are closest to the
received signal, you can use the absolute values of the correlation
of the received signal with only those vectors depicted in
Figure 4. You can compute the four correlations using two
additions, two multiplications by a constant, and four
comparisons.
There are two difficulties with directly using the previously
computed correlations in a Viterbi decoder:
- When P>>2R2 or P<<2R2 (due to noise), the
correlations (the relative distances) are given undue weight in the
Viterbi decoder
- ISB decoding (ISB hard-decision boundaries are shown by the
dashed lines in Figure 2) errors cause error multiplication
in the phase-ambiguity resolution-circuit.
To resolve the first difficulty, there are upper and lower
limits on the correlations. To resolve the second difficulty,
erasures (0s) take the place of correlations with respect to the
furthest two (of the four closest) transmit signals when the
received signal is close to these ISB decision boundaries.
In current practice, the I and Q matched-filter outputs are
first converted to an angle (using a division and a four-quadrant
arctangent table look-up) and then the four metrics are set
according to a table. Note that these metrics have two periods in
(0,2π), due to taking the absolute values of the
correlations).
Efficient Branch-Metric Calculations
Motivated by the periodicity of the correlations described
above, more efficient metric calculations are:

These surrogate squared distances may be computed using three
multiplies. These are then limited symmetrically with respect to 0
(this involves an additional eight comparisons and, in the worst
case, four substitutions). You can also choose the limit value such
that the relative distances are, for all practical purposes, the
same as the relative distances previously shown (after limiting).
For simplicity, modification of metrics for received signals close
to ISB transitions are omitted.
The performance of this metric setting procedure, using 11-bit
quantized I and Q values, a trace-back memory of 38 states, and
empirically optimized metric saturation, are shown in Figure
6. The performance approaches the theoretical lower bound at
high Eb/N0 and is comparable to a
commercially available PTCM decoder. Details of the Viterbi decoder
used for this implementation are provided in Singh and Jayasimha.
Figure 6: Rate 2/3, 8-PSK PTCM performance using simplified branch metrics (red line) compared to uncoded QPSK (blue line). The green line is the bound given in the Background section.
2.5 bps/Hz PTCM Using 8-PSK
A rate 5/6 code for 8-PSK, using a "industry-standard" rate 1/2,
64-state convolutional encoder punctured to rate 3/4, is described
in Wolf and Zehavi
and is shown in Figure 7. The normalized
square Euclidean distance for this code is 1.465 (the punctured
code has a free Hamming distance of 5). Thus, though this PTCM
provides 2.5 bps/Hz as compared to 2 bps/Hz for uncoded QPSK, it
still provides an ACG of 1.66dB.
Figure 7: Rate 5/6 PTCM using a standard rate 1/2 code punctured to rate 3/4. The odd and even sets of tri-bits are mapped to 8-PSK constellation according to Figure 2
You may use a phase-ambiguity-resolution circuit, based on
similar ideas to those shown in Figure 2. However, McCallister et al
point out that, when mandatory R-S outer
coding is used, it may be used to resolve phase ambiguity. This
avoids error multiplication due to the phase-ambiguity-resolution
circuit. The PTCM scheme in McCallister et al
does not use a punctured 64-state code; instead,
the pair of bits produced by the convolutional code are time
interleaved on odd and even baud rates (Figure 8).
Figure 8: Rate 5/6 PTCM using an unpunctured rate code. The odd and even sets of tri-bits are mapped to 8-PSK constellation in lexicographic order (rather than the gray-coded order of Figure 2)
For this rate, 5/6 PTCM using 8-PSK, using a 64-state
convolutional code, the BER at high SNRs is:

The minimum-distance path is only one branch long, because of
the two parallel transitions from a state, X, at stage n to stage
n+1 (Figure 4). The single-branch error probability is
merely the QPSK bit-error probability, with energy multiplied by
1.25 since 2.5 bits are sent per symbol as compared to 2 bits in
QPSK, and multiplied by a factor of 0.8 because four of five input
bits are involved in such single-branch decision errors. At higher
Eb/N0, for the 64-state code, the
multi-branch errors may be neglected (in comparison to
single-branch errors). Thus, the ACG is
10log10(1.25)=0.97dB, which is worse than the TCM
obtained using a rate 1/2 code punctured to rate 3/4 by 0.69dB.
However, McCallister et al
states that ACG is not the sole criterion used in
selecting a TCM scheme; instead, the coding gain at the operating
range of BERs should be considered. As the BER of the scheme in
Wolf and Zehavi
is ultimately limited by the rate 3/4 punctured
code, it exhibits a sharper "knee" than the scheme of McCallister
et al
. Thus, in a range of BERs (typically between
10-3 and 10-5 where R-S outer coding further
reduces BERs to make them acceptable), the scheme in McCallister et
al
performs better than the scheme of Wolf and
Zehavi
. Figure 9 shows the performance of the
scheme of McCallister et al
.
Figure 9: Rate 5/6, 8-PSK PTCM performance compared to uncoded QPSK. The approximate performance derived in this section is shown as a green line. The circles show the performance with a (225,205) Reed-Solomon outer code over GF(28)
Phase-Ambiguity Resolution Using R-S Outer
Code
As suggested in McCallister et al
, phase ambiguity (as well as symbol-pair
synchronization in the PTCM of Figure 8) may be resolved
using the R-S outer codes with periodically inserted unique words
(UWs) (avoiding error multiplication in phase ambiguity resolution
circuits). For example, IESS
prescribed mandatory R-S (219, 201) outer coding
over GF(28) with periodically inserted unique words but,
curiously, includes a testing requirement as follows:
"Due to the steepness of the BER versus
Eb/N0 response curve when using the
Reed-Solomon outer coding, an inordinately long period of time is
necessary to detect a sufficient number of errors to determine the
BER performance with a reasonable degree of confidence at even
moderate Eb/N0 values. Assuming that the
Reed-Solomon outer codec is functioning, determining the BER
performance of the TCM codec without Reed-Solomon outer-coding
would enable users to quickly determine whether or not the modem is
functioning correctly".
Evidently, any scheme that uses the R-S synchronization pattern
to resolve phase ambiguities cannot cater to testing without R-S
outer coding. Furthermore, the scheme of McCallister et al
avoids the error multiplication caused by the
ambiguity-resolution circuit in Wolf
. As seen in the Efficient Branch-Metric
Calculations section, setting of branch metrics in the scheme of
Viterbi
due to an ambiguity-resolution circuit is also
made more complex. However, the absence of a
phase-ambiguity-resolution circuit may allow the inner code, for
some repetitive data patterns, to indicate node synchronization,
but the outer code to fail to synchronize. The following procedure
ensures synchronization with all data patterns, assuming that inner
and concatenated codes are not synchronized and timer=0
initially:
if (inner code synchronized)
if (outer code errors in s-bit UW< r)
concatenated code is synchronized
else {
set inner code is not synchronized;
increment inner code phase reference by 2π/M (mod 2π)
if (phase==0) change symbol pair alignment;
}
else
if (timer++==timeout) {
timer=0;
increment inner code phase reference by 2π/M (mod 2π)
if (phase==0) change symbol pair alignment;
}
The inner code usually correlates the re-encoded decoded
sequence and the (suitably delayed) hard-decision decoded received
symbols in order to determine phase synchronization. The expected
outer-code synchronization time using this method, calculated using
the methods described in Jayasimha and Kumar
, is not significantly different than the
outer-code synchronization time when the inner code incorporates a
phase-ambiguity-resolution circuit such as the one in Wolf
.
Conclusions
The performance of a simplified metric-setting procedure for a 2
and 2.5 bps/Hz PTCM decoder, suitable for the trade-off criteria
used in selecting a TCM scheme, are:
- Eb/N0 at operating BER versus decoder
memory/complexity
- Performance loss associated with phase-ambiguity-resolution
methods versus synchronization time
- bps/Hz versus sensitivity to phase error/spectral re-growth DSP
implementation, is described and shown to be comparable to that
provided by a commercially available PTCM decoder chip. We also
extend this procedure to one 2.5 bps/Hz PTCM proposal. A method for
phase ambiguity resolution and/or symbol set alignment using the
R-S outer-code unique words, that is data pattern insensitive, is
also described.
Acknowledgements
We are grateful to Mr. V. Harikishan and Mr.
P. Jyothender who were instrumental in implementing various PTCM
schemes on a low-cost 16-bit DSP chip.