Dr. Casimer M. DeCusatis is a senior engineer for IBM Corporation,
eServer Network Hardware Development Laboratory, Poughkeepsie, N.Y.
He received the M.S. and Ph.D. degrees from Rensselaer Polytechnic
Institute (Troy, NY) in 1988 and 1990, respectively, and the B.S.
degree magna cum laude in the Engineering Science Honors Program
from the Pennsylvania State University (University Park, PA) in
1986.
He is co-inventor of 35 patents and co-author of over 70 technical
papers. He also serves on the editorial board of the journal
Optical Engineering, and was recently guest editor for a special
issue on optical data communication. He has been involved with the
design of new highly reliable, scalable, continuously available
computer architectures, which rely on optical fiber technology,
including Parallel Sysplex, SANs for disaster recovery at
multi-terabyte rates, and metropolitan area networks which enable
electronic commerce over the Internet.
Many large data processing applications
require petabyte storage systems, interconnected over metropolitan
area networks with terabytes of aggregate bandwidth. These storage
area networks (SANs) require high availability (99.999% or better),
fault tolerance, and guaranteed quality of service for all
communication protocols. These requirements, coupled with fiber
exhaust in metropolitan areas, are driving the widespread use of
dense optical wavelength division multiplexing (DWDM).
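To put the availability target in concrete terms, the short sketch below converts an availability percentage into a yearly downtime budget. It is only an illustration of the arithmetic; the 365-day year and the function name are assumptions made for this example.

```python
# Downtime budget implied by an availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(availability: float) -> float:
    """Return the maximum downtime per year, in minutes, for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(0.99999)
print(f"99.999% availability allows about {five_nines:.2f} minutes of downtime per year")
```

In other words, a "five nines" SAN may be unavailable for only a little over five minutes per year, which is why protection switching and redundant hardware matter so much in the designs discussed below.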
The requirements of voice, video, data, and IP traffic were
previously addressed by separate overlay networks; however, the
rapid growth of Internet traffic and e-commerce has created
interest in a service-transparent DWDM backbone capable of
allocating bandwidth on demand. This offers the advantages of a
highly scalable, low-cost, protocol-independent infrastructure, and
may be the first step towards switchable, all-optical networks.
This paper describes the recent results of DWDM testing in SANs
using the IBM 2029 Fiber Saver. In particular, we will examine two
system testbeds that demonstrate a combination of time and
wavelength multiplexing between mainframe servers and large storage
devices in a multivendor environment.
The IBM 2029 Fiber Saver Platform
The 2029 uses passive thin-film
interference filters to perform optical multiplexing; you can
multiplex up to 40 Gbit/second over one pair of optical fibers. This
corresponds to 32 wavelengths (duplex channels) at 1.25 Gbit/second
each; you can manage the available bandwidth and increase the
number of channels using time division multiplexing (TDM). For
example, using features first made available late last year, you
can allocate up to four channels of OC-3 traffic per wavelength,
for a total capacity of 128 channels per fiber pair. The 2029 is
also compatible with external TDM applications, such as the FICON
Bridge.
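The capacity figures in this paragraph follow from simple multiplication, sketched below. The wavelength count, line rate, and OC-3 channels per wavelength come from the text; the OC-3 rate of 155.52 Mbit/second is the standard SONET value, added here only as a sanity check that four OC-3 signals fit within one wavelength.

```python
# Capacity arithmetic for one fiber pair, using the figures quoted in the text.
WAVELENGTHS = 32          # duplex channels per fiber pair
RATE_GBIT_S = 1.25        # line rate per wavelength
OC3_PER_WAVELENGTH = 4    # TDM channels of OC-3 traffic per wavelength
OC3_RATE_MBIT_S = 155.52  # standard SONET OC-3 rate (not from the article)

aggregate_gbit_s = WAVELENGTHS * RATE_GBIT_S
tdm_channels = WAVELENGTHS * OC3_PER_WAVELENGTH
oc3_load_mbit_s = OC3_PER_WAVELENGTH * OC3_RATE_MBIT_S

print(f"Aggregate bandwidth: {aggregate_gbit_s:g} Gbit/second")
print(f"OC-3 channels per fiber pair: {tdm_channels}")
print(f"OC-3 load per wavelength: {oc3_load_mbit_s:.2f} Mbit/second")
```

The result is 40 Gbit/second of aggregate bandwidth and 128 OC-3 channels per fiber pair, with the four OC-3 signals occupying roughly half of each 1.25 Gbit/second wavelength.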
Unidirectional protection switching at the physical layer is
configurable on a per-channel basis (with a maximum switching time
of 50 ms) to restore service in the event of either a fiber break
or hardware failure. This ensures that there are no single points
of failure in a protected or high-availability channel. The
multiplexer is also a complete 3R repeater (retimes, reshapes, and
regenerates the signals); it supports native attachment of all
industry standard protocols including ESCON, FICON, Fibre Channel,
ATM OC-3 and OC-12, FDDI, Gigabit Ethernet, and others.
The 2029 architecture configures its filter banks as dual
self-healing counter-rotating fiber rings. This supports new
topologies including fully
protected point-to-point and protocol-independent rings with up to
nine physical locations. Each data channel includes a low-speed,
in-band service channel, which carries network management
information. Thus, you can monitor the entire network from a single
point over an IP connection; Java-enabled software provides a
graphical user interface that supports various network management
options, including remote IP management via an SNMP connection.
DWDM Testbeds
The first DWDM testbed (Figure 1)
consisted of a point-to-point 2029 network used in a SAN between an
IBM G5 mainframe and various types of ESCON-capable direct access
storage devices (DASD), consisting of magnetic disk or tape
drives.
Figure 1: SAN Testbed 1, which included a switch
fabric with four ESCON 9032-5 Directors (for clarity, not all
physical connections are shown).
The G5 and ESCON Directors were equipped with the FICON Bridge
feature, which performs up to an 8-to-1 TDM of ESCON
channels into a single FICON channel (at 50% channel utilization).
This allows up to eight independent ESCON channels to occupy a
single wavelength in the DWDM network, effectively increasing the
total channel capacity to 256 duplex links over a single fiber
pair. In this testbed, a total of 64 ESCON channels ran over eight
FICON channels.
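The channel counts above can be checked with a few lines of arithmetic. The sketch below assumes one FICON channel per wavelength, as the text describes, and uses the maximum 8-to-1 bridge ratio; the helper name wavelengths_needed is hypothetical.

```python
import math

ESCON_PER_FICON = 8  # maximum FICON Bridge TDM ratio (from the text)
WAVELENGTHS = 32     # wavelengths per fiber pair (from the text)


def wavelengths_needed(escon_channels: int, ratio: int = ESCON_PER_FICON) -> int:
    """Wavelengths required when each wavelength carries one FICON channel."""
    return math.ceil(escon_channels / ratio)


max_escon_links = WAVELENGTHS * ESCON_PER_FICON
testbed_wavelengths = wavelengths_needed(64)

print(f"Maximum ESCON links per fiber pair: {max_escon_links}")
print(f"Wavelengths used by 64 ESCON channels: {testbed_wavelengths}")
```

With the bridge, the 64 ESCON channels of this testbed occupy eight wavelengths, leaving the remaining wavelengths free for other traffic.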
By using DASD from many sources, we verified that the DWDM
provides native ESCON channel extension without any device-specific
performance limitations. Applications tested include data
mirroring, extended remote copy, and peer-to-peer remote copy
functions. Performance measurements were made to compare a DWDM
solution with a hybrid TDM/DWDM approach, and to characterize SAN
behavior under stressful workloads (Linux, MVS, and Unix operating
systems, Lotus Domino databases, and transaction processing using
NIST Level 4 Certified cryptography). We found no measurable
performance difference at ESCON data rates between the hybrid
TDM/DWDM approach and a direct use of 64 DWDM wavelengths; in both
cases, the network bit-error rate (BER) remained less than 10^-12
over a three-week test run.
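To give a sense of scale for this BER bound, the sketch below estimates how many bits one wavelength carries during a three-week run and the error count that a BER of 10^-12 would allow. It assumes the channel runs continuously at the full 1.25 Gbit/second line rate and reads the reported bound as 10^-12; actual traffic levels in the testbed are not stated in the text.

```python
# Scale of a three-week BER test on a single wavelength.
LINE_RATE_BIT_S = 1.25e9  # assumption: channel runs at full line rate
TEST_DAYS = 21            # three-week test run (from the text)
BER_BOUND = 1e-12         # reported upper bound, read as 10^-12

seconds = TEST_DAYS * 24 * 3600
bits_transferred = LINE_RATE_BIT_S * seconds
max_expected_errors = bits_transferred * BER_BOUND

print(f"Bits transferred: {bits_transferred:.3e}")
print(f"Errors allowed at the BER bound: {max_expected_errors:.0f}")
```

Under these assumptions a single wavelength moves on the order of 10^15 bits in three weeks, so the bound corresponds to at most a few thousand bit errors over the entire run.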
Remote IP management using Tivoli Netview with SNMP was also
demonstrated by controlling the testbed in Washington, D.C. from a
remote center in Toronto, Canada. Note that for large SANs the
protocols of choice are either ESCON or FICON; this is because of
their large data block size (20-30 MByte or more). By contrast, the
largest block size for Gigabit Ethernet (using jumbo frames) is only
about 9 KByte.
In a second DWDM testbed (Figure 2), multiple 2029s were
cascaded in series to reach a distance of 75 km; this is possible
because the 2029 functions as a full 3R repeater.
Figure 2: SAN Testbed 2 included an IBM 3990
asynchronous tape drive, 3590 tape drive, and 3494 Virtual Tape
Server Library connected through multiple ESCON Directors (not
shown for clarity).
As before, the BER remained less than 10^-12 over a four-week
test run. Network reliability was also tested by removing and
re-plugging optical fibers and electrical cards within the 2029.
These tests confirmed that a high-availability configuration has no
single points of failure, owing to 1+1 protection switching and
redundant support systems.
Many disaster recovery and data backup situations require
distances on the order of 50-100 km across the SAN; conventional
channel extenders can achieve this by encapsulating data in a SONET
frame, but this typically comes at a higher cost
per channel (especially in areas that have high leasing costs for
dark fiber) and with performance degradation associated with the
SONET encapsulation process. In principle, DWDM offers a lower-cost
solution by sharing many channels across a single pair of leased
fibers, along with improved performance because no encapsulation is
required. To evaluate extended distance performance, we increased
the total fiber distance to 100 km and performed measurements of
performance degradation.
The ESCON protocol requires many duplex acknowledgments to be
exchanged whenever a block of data is transferred; combined with
the limited memory buffer size on the mainframe channels, this
translates into a degradation in performance as link distance
increases (Figure 3). Performance
droop begins at around 9 km and the peak data rate of 17
MByte/second degrades significantly by 60 km. In contrast, a native
FICON link not only has higher bandwidth (around 70 MByte/s peak),
but also does not experience significant droop at 100-km distances.
This is due to the streamlined protocol (fewer acknowledgments per
data-block transfer) and the larger buffer sizes on more recent
model mainframes. Note that FICON performance can be obtained when
using the FICON Bridge solution to transport multiple ESCON
channels as well.
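The link between buffering, acknowledgments, and distance can be illustrated with the bandwidth-delay product. The sketch below is a simplified window model, not a description of the actual ESCON or FICON protocols: it assumes a one-way fiber delay of about 5 microseconds per km, treats 1 MByte as 10^6 bytes, and assumes the sender can keep only a fixed amount of data outstanding per round trip. All function names are hypothetical.

```python
FIBER_DELAY_S_PER_KM = 5e-6  # assumption: ~5 microseconds per km, one way


def round_trip_s(distance_km: float) -> float:
    """Round-trip propagation time over the fiber, in seconds."""
    return 2 * distance_km * FIBER_DELAY_S_PER_KM


def bytes_in_flight_needed(peak_mbyte_s: float, distance_km: float) -> float:
    """Bandwidth-delay product: data that must be outstanding to keep the link busy."""
    return peak_mbyte_s * 1e6 * round_trip_s(distance_km)


def window_limited_rate(peak_mbyte_s: float, window_bytes: float,
                        distance_km: float) -> float:
    """Throughput in MByte/s when at most window_bytes are sent per round trip."""
    rtt = round_trip_s(distance_km)
    if rtt == 0:
        return peak_mbyte_s
    return min(peak_mbyte_s, window_bytes / rtt / 1e6)


# Window that would just sustain 17 MByte/s up to 9 km (model value, not measured).
escon_window = bytes_in_flight_needed(17.0, 9.0)
# Window needed to sustain 70 MByte/s at 100 km (model value, not measured).
ficon_window = bytes_in_flight_needed(70.0, 100.0)
escon_at_60km = window_limited_rate(17.0, escon_window, 60.0)

print(f"Window sustaining 17 MByte/s to 9 km: {escon_window:.0f} bytes")
print(f"Window sustaining 70 MByte/s to 100 km: {ficon_window:.0f} bytes")
print(f"Modelled 17 MByte/s link at 60 km: {escon_at_60km:.2f} MByte/s")
```

In this model, a link whose droop begins near 9 km behaves as if only about 1.5 KByte can be outstanding per round trip, and its rate falls to a small fraction of peak by 60 km, while sustaining 70 MByte/second at 100 km requires roughly 70 KByte in flight. These are outputs of the simplified model, not measurements from the testbed, but they show why fewer acknowledgments and larger buffers reduce droop.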
Figure 3: An illustration of the relative
performance of ESCON and FICON links over extended distances.
Testbed Conclusions
The continued growth of large servers is
driving a corresponding trend in long-distance disaster recovery
and data mirroring. The preferred protocols for these applications
appear to be ESCON and FICON, and there is a need for
high-performance native-attached fiber-optic networks to share
bandwidth and lower cost.
One solution is the IBM 2029 Fiber Saver DWDM platform
demonstrated for application to large SANs in the two testbeds. The
platform exhibits high reliability and low BER over extended test
runs, as well as the ability to cascade 2029s and remotely manage
the network. The results indicate no performance penalty for using
a hybrid TDM/DWDM solution such as the FICON Bridge; furthermore,
this offers the advantage of improved channel capacity at extended
distances.
The next step in the evolution of these platforms will be some
form of intelligent bandwidth management and allocation, which may
require a closer integration of the network management facilities
and the multiplexer hardware. This has the potential to provide a
scalable platform for next-generation enterprise servers.