18 March 2010



RLDRAMs vs CAMs/SRAMs: Part 2

Part 2 of this two-part set details the RLDRAM-II architecture and then shows how RLDRAM stacks up in real-life packet buffer and table look-up applications.

By Eugene Chang, Bill Lu, and Felix Markhovsky, Infineon Technologies
CommsDesign
Jun 09, 2003
With memory bandwidth becoming a major concern for today's networking designers, a group of companies has developed the reduced-latency DRAM (RLDRAM) architecture to displace existing SRAM and content-addressable memory (CAM) architectures. RLDRAMs give designers higher memory density, a smaller form factor, and lower overall system cost.

In Part 1 of this series, we took a look at the RLDRAM architecture and showed how it improved on current SRAM and CAM approaches in networking designs. We also examined how RLDRAM stacked up against emerging FCRAM approaches.

In Part 2, we'll continue this discussion by examining how RLDRAM performs in packet-buffer and look-up applications. We'll also take a look at the RLDRAM-II architecture, which is currently under development.

Moving RLDRAM Forward
Before the discussion of packet-buffering and table look-up applications can continue, a brief introduction to the next generation of RLDRAM, called RLDRAM-II, is needed. RLDRAM-II will be a slight departure from the RLDRAM described in Part 1 of this article series.

Like the original RLDRAM, RLDRAM-II will have eight memory banks to allow for fast access, but will be organized in a way that allows the use of parity. Thus, RLDRAM-II architectures will offer densities of 288 or 576 Mbits and data organizations of x9, x18, or x36.

Although the RLDRAM-II will have the same 144-pin T-FBGA package as the first-generation RLDRAM, the two will not be pin-compatible due to the RLDRAM-II parity features. However, the manufacturers of RLDRAM and RLDRAM-II have made a pledge to keep subsequent process generations pin-compatible with the original designs. RLDRAM and RLDRAM-II will continue to be offered side-by-side in subsequent semiconductor process generations.

In addition to the above features, RLDRAM-II will offer reduced row-cycle time (down to 20 ns), while increasing the clock rates to 400 MHz. The 20-ns row-cycle time will be enforced for all speed grades. Moreover, to allow for more data flexibility and better performance, a burst length of 8 will be supported in addition to the burst lengths of 2 and 4.
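As a rough check on these figures, the sketch below converts the 20-ns row-cycle time into clock cycles at 400 MHz and estimates a peak per-device data rate. It assumes double-data-rate transfers (two data words per clock), which this article does not state explicitly, and the function names are purely illustrative.

```python
def trc_in_clocks(trc_ns: float, clock_mhz: float) -> float:
    """Express a row-cycle time (ns) as a number of clock cycles."""
    return trc_ns * clock_mhz / 1000.0


def peak_gbits_per_s(clock_mhz: float, width_bits: int,
                     transfers_per_clock: int = 2) -> float:
    """Peak raw data rate of one device in Gbit/s.

    transfers_per_clock=2 is an assumption (double data rate).
    """
    return clock_mhz * 1e6 * transfers_per_clock * width_bits / 1e9


# 20-ns tRC at 400 MHz corresponds to 8 clock cycles.
print(trc_in_clocks(20, 400))
# Raw peak rate of a x36 device, parity bits included, under the DDR assumption.
print(peak_gbits_per_s(400, 36))
```

These are raw interface numbers only; sustained bandwidth in a real write-read-write pattern will be lower because of bus turnaround and bank timing.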

To allow for better signal integrity at higher speeds, the new architecture will provide on-die termination as well as adjustable output driver impedance ranging from 25 to 60 ohms. This enhanced I/O interface allows more flexible board routing while still supporting high-speed signals. For further flexibility, designers will be able to choose an I/O voltage of 1.5 or 1.8 V to best fit the interface power. RLDRAM-II will also feature a write clock for every 18 bit lines.

Parallel Versions
Several parallel versions of RLDRAM-II will be available to ensure maximum utilization of the part in many kinds of applications. The first generation of RLDRAM supported an SRAM-like non-multiplexed addressing scheme. In RLDRAM-II, both SRAM-like non-multiplexed addressing schemes and the typical DRAM multiplexed addressing mode will be supported. Users will select which mode to use, with the multiplexed mode being available for users who favor more DRAM-like operations and reduced signal counts. Furthermore, a separated I/O version of RLDRAM-II with x18 and x9 organization will also be available for increased system performance.

RLDRAM-II performance will be further enhanced by the implementation of differential input clocks, as well as differential read and write strobes. An on-chip delay lock loop (DLL) will ensure that the clock edges are aligned with the data and with the read strobe lines.

It has been widely promoted that a new version of FCRAM will also have 288-Mbit densities [3]. However, when RLDRAM-II is compared with the proposed features of 288-Mbit FCRAM, RLDRAM-II still comes out ahead in many areas. These include performance enhancements such as increasing the operating frequency range from 200 to 400 MHz, adding high-performance I/O features (on-die termination, adjustable output driver strength, and output impedance matching), and adding a data valid signal. These features will allow RLDRAM-II to achieve quick memory access and higher effective bandwidth.

Designs using RLDRAM-II will also be much more flexible than those using 288-Mbit FCRAM in that 1) a separate I/O version can replace or complement QDR SRAM, and 2) users can choose between the SRAM-like non-multiplexed addressing scheme and the DRAM-like multiplexed mode. All FCRAM versions offer only the multiplexed mode, with only four memory banks and no JTAG support. RLDRAM-II, by contrast, will have both the non-multiplexed and multiplexed modes, eight banks, and JTAG support.

RLDRAM-II is currently under development by both Infineon Technologies and Micron Technology. It is projected that the 288-Mbit RLDRAM-II will be available for sampling by the fourth quarter of 2003.

Packet Buffer Example
OK, up to this point, we've discussed how RLDRAM will outperform FCRAM in networking applications. Now let's provide some real-life design examples. To start, we'll compare how the two architectures handle a packet-buffering application.

Note: In the example to follow, two scenarios will be discussed: one in which perfectly-sized packets are processed, and one in which imperfectly-sized packets are stored. Perfectly-sized packets are packets whose size is a multiple of eight bytes, such as 32, 64, or 128 bytes. Imperfectly-sized packets are those whose size is not a multiple of eight bytes.

In the case of perfectly-sized packets, using RLDRAM devices reduces the number of components compared with FCRAM. Table 3 compares RLDRAM, FCRAM, and DDR-II in a write-read-write arrangement with a burst length of 4. RLDRAM requires fewer memory components than FCRAM and DDR-II in every network application shown. As a rule of thumb, there is at least a 2-to-1 advantage in using RLDRAM over FCRAM, and even more of an advantage when compared to DDR-II.

Moreover, as can be seen in Table 3, RLDRAM-II is predicted to have an even greater advantage. A single RLDRAM-II device will be able to support a random-access write-read-write bandwidth of up to 24.09 Gbit/s in a Gigabit Ethernet port.

Table 3: Comparison of the number of RLDRAM-I, RLDRAM-II, FCRAM and DDR-II

Since fewer memory components are required when using RLDRAM in a given system, significant power savings result as well. As can be seen in Figure 4, switch/router systems using RLDRAM consume less overall system power than those implementing FCRAM or DDR-II. An even greater power savings can be seen when RLDRAM is compared to an SRAM implementation, since more than 60 16-Mbit SRAMs would be required to obtain the same memory depth that RLDRAM can provide in a particular switch/router system.


Figure 4: Power consumption comparison among DRAM types in switch/router applications.

Imperfectly-Sized Packet Buffering
In applications where imperfectly-sized packets are processed, the efficiency at which the bandwidth of the memory storage is used should be considered. One way for bandwidth loss to occur is to have a packet size that is not an integer multiple of the minimum transfer size used in the memory buffering architecture.

In OC-192 packet-over-SONET (PoS) applications, the minimum transfer size is 40 bytes. By contrast, the minimum packet size is 54 bytes for ATM and 64 bytes for 10 Gigabit Ethernet. In some applications, the size could even be as small as 20 bytes. If the memory data transfer size does not match the minimum packet size, bandwidth losses will occur.

The lowest efficiency occurs when the packet is just one byte larger than the transfer size. For example, in a system with a minimum transfer size of 32 bytes, a 33-byte packet will take two transfer cycles to complete. Thus, in this case, the efficiency would be 33/64, or about 52 percent. Typically, a smaller transfer size is better for small random packets. In the case of RLDRAM, the burst length can be as small as two, while the burst length for FCRAM running at its fastest cycle is four. Table 4 compares the minimum allowable transfer size of the two DRAM architectures for various I/O sizes.
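The efficiency arithmetic above can be sketched in a few lines. The function name and the choice of packet sizes are ours, for illustration only; the model simply rounds each packet up to a whole number of minimum-size transfers.

```python
import math


def bus_efficiency(packet_bytes: int, transfer_bytes: int) -> float:
    """Fraction of transferred bytes that carry packet data.

    A packet occupies a whole number of minimum-size transfers,
    so any partial final transfer is wasted bandwidth.
    """
    transfers = math.ceil(packet_bytes / transfer_bytes)
    return packet_bytes / (transfers * transfer_bytes)


# With a 32-byte minimum transfer, a 33-byte packet needs two transfers.
for size in (32, 33, 64):
    print(size, round(bus_efficiency(size, 32) * 100), "percent")
```

For the 33-byte case this gives 33/64, which rounds to the 52 percent quoted above.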

Table 4: Minimum Transfer Packet Size Comparison Between RLDRAM and FCRAM for Various I/O Sizes.

Data from Table 4 is graphed in Figures 5 and 6 to further show how RLDRAM can store and read packets more efficiently because of its finer transfer granularity. FCRAM reaches full efficiency at only half as many packet sizes as RLDRAM. Again, the smaller burst length allows RLDRAM to store and fetch the largest number of minimum-sized packets. In other words, RLDRAM can achieve 100-percent bus efficiency at more byte intervals than FCRAM. This shows that RLDRAM is better suited than FCRAM to handling differing packet sizes.
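To illustrate why a smaller burst length gives more 100-percent points, the sketch below derives a minimum transfer size from burst length and bus width and lists the packet sizes that fill their transfers exactly. It assumes the minimum transfer is burst length times bus width, which is our reading of the comparison rather than values taken from Table 4; the helper names are hypothetical.

```python
def min_transfer_bytes(burst_length: int, bus_width_bits: int) -> int:
    """Smallest transfer in bytes, assuming burst_length bus-wide words."""
    return burst_length * bus_width_bits // 8


def full_efficiency_sizes(transfer_bytes: int, max_packet_bytes: int) -> list:
    """Packet sizes up to max_packet_bytes that waste no bus bandwidth."""
    return list(range(transfer_bytes, max_packet_bytes + 1, transfer_bytes))


# 64-bit data bus: burst length 2 (RLDRAM) versus burst length 4 (FCRAM).
rldram = min_transfer_bytes(2, 64)
fcram = min_transfer_bytes(4, 64)
print(rldram, full_efficiency_sizes(rldram, 128))
print(fcram, full_efficiency_sizes(fcram, 128))
```

Under this assumption the burst-length-2 device hits full efficiency at twice as many packet sizes over any given range, which matches the trend described for Figures 5 and 6.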


Figure 5: RLDRAM and FCRAM bandwidth efficiency comparison with a 64-bit-wide data bus.


Figure 6: Comparison of RLDRAM and FCRAM bandwidth efficiency with a 128-bit-wide data bus.

Table Look-Up Application
As discussed previously, high-performance DRAMs are edging into an area that is normally reserved for SRAMs and CAMs. Advances in DRAM technology and design are allowing this trend to occur. Table look-ups are one area in which the RLDRAM can replace CAM devices.

Currently, RLDRAM requires that the look-up database be replicated a few times so that fully random reads can be guaranteed 100 percent of the time. For example, using four 300-MHz, x32 RLDRAMs operating at a burst length of four with a tRC of 8 clock cycles, four database replications would be needed to achieve a look-up performance of 150 million accesses per second for 128-bit key entries with a database size of 64 Mbit.

Although database replication is required in order to guarantee 100 percent random reads when using RLDRAM, the number of board components is significantly reduced when compared with such traditional look-up memories as ternary CAMs.

In a presentation from an independent network processor vendor at a recent Network Processor Conference [4], designs using RLDRAM were shown to significantly reduce component count while increasing look-up performance. The proposed solution reduced the conventional 15 CAM and two SRAM components to only six RLDRAMs, while increasing the look-up performance from 50 million look-ups per second to 187 million look-ups per second (Table 5).

Table 5: Comparison of CAM/SRAM vs. RLDRAM in Look-Up App

The network processing vendor in this example claimed that a system using its chip and six RLDRAM devices can support 1 million flows/sessions, 2 million subscribers (MAC, VLAN), 2 million IPv4 routes, 1 million IPv6 routes, or 1 million MPLS labels.

In comparison, a typical CAM with the highest memory density available today (9 Mbits) can achieve a maximum search speed of 100 to 133 million searches per second. In this case, the performance of RLDRAM compares favorably with that of CAM while providing seven times more storage. Using RLDRAM as part of a table look-up system ties in with the previous discussion of how new high-performance DRAM devices can be applied in networking applications that were once dominated by SRAMs and CAMs.

Table Look-Up with RLDRAM-II
Using RLDRAM-II in table look-up applications will increase performance even over the original RLDRAM architecture. Figures 7 and 8 compare the use of RLDRAM and RLDRAM-II in look-up applications. As before, we assume a look-up application that has a database of 128-bit key entries and a guarantee of 100 percent fully random reads.


Figure 7: RLDRAM used in a table look-up memory application.


Figure 8: RLDRAM-II used in a table look-up memory application.

Due to the enhanced speed of RLDRAM-II, fewer database replications will be required to obtain even faster searches than with the original RLDRAM. For example, four RLDRAM-II devices will be able to perform 200 million accesses per second using four 64-Mbit database replications, while the same number of RLDRAM devices with the same four 64-Mbit database replications can perform up to 150 million accesses per second.
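One simple way to arrive at these access rates is sketched below. It is our own model, not one given by the vendors: it assumes double-data-rate transfers, so a burst of four occupies the data bus for two clocks, that one look-up completes per burst, and that one database copy per burst slot within tRC is enough to keep a free bank available for any random read. RLDRAM-II is assumed to run at 400 MHz with its 20-ns tRC (8 clocks) and a burst length of four.

```python
import math


def lookup_model(clock_mhz: float, burst_length: int, trc_clocks: int,
                 transfers_per_clock: int = 2):
    """Return (million accesses per second, database replications).

    Assumptions (ours): one look-up per burst, DDR data transfers,
    and one replica per burst slot inside a row-cycle time.
    """
    burst_clocks = burst_length / transfers_per_clock
    rate_millions = clock_mhz / burst_clocks
    replications = math.ceil(trc_clocks / burst_clocks)
    return rate_millions, replications


print("RLDRAM:   ", lookup_model(300, 4, 8))
print("RLDRAM-II:", lookup_model(400, 4, 8))
```

Under these assumptions the model reproduces the 150 and 200 million accesses per second quoted above, each with four replications.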

Again, in comparison, a table look-up CAM can perform a maximum of only 100 to 133 million accesses per second while offering a much smaller memory capacity of 9 Mbits. Although the database needs to be replicated, the speed at which searches can be performed with RLDRAMs rivals that of CAM-based systems, while allowing larger database sizes and a smaller board area.

Wrap Up
The RLDRAM family will replace SRAMs and CAMs in networking equipment due to a combination of performance, higher density, lower power, small form factor, and reduced cost. RLDRAM was designed with networking applications in mind, whereas FCRAM and other DRAM offerings are mere modifications of DRAM memories that were originally targeted at handheld mobile devices or PCs. As a result, RLDRAM is naturally better suited than FCRAM and other standard DRAM offerings to network applications.

Author's Note: More information on RLDRAM can be found at www.rldram.com or at www.infineon.com/memory/rldram/.

Editor's Note: Part 1 of this article was published earlier in this series.

References

  1. C. Bernard Shung, Network Processing ICs, ISSCC 2001 Tutorial, San Francisco, February 4, 2001.
  2. Toshiba FCRAM datasheet, November 2001.
  3. Samsung Network DRAM-II Specification, Rev 0.0.
  4. Fast Chip Presentation, NPC EAST 2002.

About the Authors
Eugene L. Chang is a senior manager at Infineon Technologies, and is responsible for specialty memory devices, including Embedded DRAM, Flash, RLDRAM and others. Eugene received his Ph.D. and M.S.E.E. from Southern Methodist University, and a B.S. in Electrical Engineering from Columbia University, New York. He can be reached at eugene.chang@infineon.com.

Bill Lu is a senior manager in the Specialty Memory Group at Infineon Technologies. He is currently responsible for specialty memory technology roadmaps, product definition, and customer acquisition and support. Bill holds a Ph.D. in physics from Lehigh University, PA, and an M.S. degree from the University of Science and Technology of China (USTC). Bill can be reached at bill.lu@infineon.com.

Felix Markhovsky is a product manager with Infineon Technologies. Felix has an MSEE degree in control theory and data communications from Polytechnic University, Moscow, Russia. He can be reached at felix.markhovsky@infineon.com.



