Commsdesign Home Register About Commsdesign Feedback Online Opportunities SpecSearch GlobalSpec


















Audio Designline



eLibrary

EE TIMES NETWORK
 Online Editions
 EE TIMES
 EE TIMES ASIA
 EE TIMES CHINA
 EE TIMES FRANCE
 EE TIMES GERMANY
 EE TIMES INDIA
 EE TIMES JAPAN
 EE TIMES KOREA
 EE TIMES TAIWAN
 EE TIMES UK

 EE TIMES EUROPE
 ANALOG EUROPE
 INDUSTRIAL EUROPE

 Web Sites
 • Audio DesignLine
 • Automotive DesignLine
 • Career Center
 • CommsDesign
 • Microwave
    Engineering
 • Deepchip.com
 • Design & Reuse
 • DSP DesignLine
 • EDA DesignLine
 • Embedded.com
 • Elektronik i Norden
 • Industrial Control
    DesignLine
 • Planet Analog
 • Mobile Handset
    DesignLine
 • Power Management
    DesignLine
 • Programmable Logic
    DesignLine
 • Video | Imaging
    DesignLine
 • Wireless Net
    DesignLine
 • RF DesignLine

ELECTRONICS GROUP SITES
 • NEW! SpecSearch
 • eeProductCenter
 • Electronics Supply &
    Manufacturing
 • Conferences
    and Events
 • Electronics Supply &
    Manufacturing--China
 • Electronics Express
 • Webinars


11 May 2008

Feature

The Case for a Classification Language


By Eric J. Rothfus

Slow classification languages have forced system designs to lag behind in providing faster and more reliable networks. By turning to a functional programming language, engineers can streamline code development and speed the classification process.

In the last year, many new players have emerged in the network processor industry. Network processors offer OEMs the speed of an ASIC with the programming flexibility of a microprocessor. With programmability, service providers and carriers can add new features through software, instead of waiting years for the next ASIC spin.

Programmability brings with it a new set of issues and design considerations. Which programming language is best suited for communication tasks? While general-purpose languages like C have many merits, a high-level language specifically targeted at communication applications can help engineers tackle the complex classification requirements of modern communication systems.

This article will provide an overview of classification, the necessary attributes of classification, and the drawbacks to traditional classification approaches. Additionally, the article will explore a new classification method that employs a functional programming language (FPL).

Defining classification


For the purposes of this article, the term classification refers to the general concept of examining data, in whole or in part, and determining which of several possible actions should be taken. For example, if your task is to sort colored blocks into matching boxes, classification refers to examining each incoming block, determining its color, and placing it in the right box (see Figure 1 ).

This simple example is only the beginning. Classification is often performed while taking into account numerous attributes of the incoming items or information. For example, sorting all blue blocks with gold trim into box 145 and all red balls into box 122.

In addition, classification does not have to be active. That is, a simple form of classification may be to count all of the blue balls. The action being performed is simple counting.

Classification and communications


Classification is a major issue within communication systems with many different applications, especially in network routers and switches. However, this article will concentrate on the data path, the most time-critical portion of communication systems.

Figure 2 shows a simplified diagram of the data flow within a communication system. Data flowing at high speed, or wire speed, flows into network processing elements from a data transceiver (not shown). Almost all of the data (99%) is fully processed in the data path at wire speed, and is passed directly to the backplane after processing. Some data requires more extensive processing and is passed to the slower-speed control path, which normally includes a microprocessor. For example, routing protocols are processed in the control path. The processed protocols are then used to configure the data path in order to implement the adjusted routing or forwarding tables.

In the data path, classification may be performed numerous times using several, if not all, portions of the input data. For example, when processing a packet of information within a protocol such as IP, the IP destination address (such as 209.109.83.100) is used as a key for determining which port the packet should flow to next. While this classification task may sound easy, there may be hundreds of thousands of different possibilities in today’s network systems.

Within the data path, another form of classification is used to gather statistics. In this case, the task is to count the number of packets flowing through a device that conform to a given set of characteristics. If engineers are trying to count the number of telnet packets that are flowing between the IP addresses 209.109.83.100 and 192.207.128.193, they will use classification to identify packets that have these two addresses in the source and destination fields, as well as the telnet port number, 23, in the port field.

There are numerous other places where classification plays a key role in the data path:

  • Layer 2 switching

  • Layer 3/4 switching

  • Layer 3+ routing

  • Priority processing

  • Error detection

  • Filtering

  • Flow detection

  • Access control list (ACL) processing

  • Load balancing

  • Statistics gathering

  • Segmentation and reassembly (SAR)

The last item in this list, SAR, is quite distinct. It refers to what can be called “deep classification,” or the ability to classify incoming information so that further classification can be done. In the case of SAR, classification is normally performed on reassembly and consists of reconstructing a larger data packet from smaller data cells. The first classification task involves examining incoming cells and identifying the appropriate packet to which it should be attached. In most cases, SAR is followed by further classification such as deciding which port to send an assembled packet to. Deep classification can be described as a recursive application of multiple classification tasks to a growing set of attributes and data.

Harder than it looks


Classification, even deep classification, sounds easy. Simply recognize what’s coming in, put it in an appropriate class or category, and execute an action associated with that class. It should be easy, but it’s not, especially within communication systems.

The first problem is in the number of possibilities that must be taken into account during even simple classification tasks. For example, when performing simple IP packet routing, classification must take into account 50,000 or more possible routes. Future systems may require 500,000 or more routes.

Classification can also be complicated by the use of many fields within the packet. For IP routing, one to three fields in a packet may be used. For flow identification, two to five fields or more may be used. For ACL processing, six fields can be used. When performing higher-layer switching, such as load balancing, even more fields must be taken into account.

But why do the number of possibilities and their complexity raise difficulties with classification? Imagine that you were performing the simple block classification task yourself, and that it was your job to look at the block’s color and place it into a box with the same color. It would be easy with five colors and five collection boxes. What if there were 50 boxes?

Classification starts to become difficult in two ways: distinguishing the colors of the blocks and their boxes, and gathering all of the boxes around you, so that you can easily grab a block and toss it into its box. Maybe an engineer could classify 50 boxes. But what about 1,000, 10,000, or 100,000 boxes? Soon, the engineer can’t tell the difference between distinct colors. It also becomes increasingly difficult to find the right box to throw the block into, even if the engineer could reach it from where he or she sat.

As if that weren’t hard enough, the engineer could have some blocks with spots and some without. This is like having a classification task with multiple fields or attributes; it effectively multiplies the task’s difficulty by the number of fields.

Fortunately, communication systems are accustomed to dealing with large numbers of routes and complex classification. However, as network speeds increase, it becomes more difficult to keep up with the larger number of routes and the requirements for more complex classification methods. If the blocks needed to be sorted into 100 boxes, with the incoming blocks arriving once a minute, there should be ample time to complete the task. If the blocks arrived once every couple seconds, there may be just enough time. Imagine, though, that the blocks are arriving every second, or maybe 1,000 are arriving every second, or, given modern network speeds, one million blocks are arriving every second.

The traditional approach


Classification is not a new concept within communication systems. The very first cell switch contained a form of classification engine that examined each incoming cell and moved it to the appropriate output port based on what was inside it. That first cell switch used a microprocessor that ran a form of assembly language to examine each cell and decide what to do with it. The code that drove that microprocessor may have looked something like the code in Code Listing 1 .

Later devices used a higher-level language like C. The code in Code Listing 1 may look like the code in Code Listing 2 .

Both of these snippets of code perform roughly the same tasks. They look at a data packet, first checking to see if the packet qualifies to be routed, then looking at data fields and calling subfunctions to determine what route the packet should take as a class B subnet route.

The code is an example using a procedural language to perform a classification task. The code instructs a microprocessor to perform a classification. Each step in the process is spelled out in great detail through microprocessor instructions such as move, test, and add.

This traditional approach to programming classification tasks has a few important drawbacks. The first is that the procedural nature of the code makes it difficult to write and later to read and debug. Since the programmer expresses the classification task by describing how it is done, the meaning of the classification is obscured within the code and is unnecessarily complex.

Suppose that someone tried to tell you how to sort your blocks by saying: “Focus on the incoming block; determine its hue, saturation, and luminance; look down at the first box; determine its hue, saturation, and luminance and compare this to the same for the incoming block. If the two are the same, move your left arm.” It would be far easier for you to follow a higher-level description of the problem such as: “Put each block in the box of the same color.”

Another drawback to the traditional approach is that the code’s complexity makes getting the system to work as expected a difficult task. Furthermore, as the number of lines of code increases, so does the number of bugs. More importantly, as the number of lines and their complexity increase, it becomes far more difficult to make changes, which are necessary not only to fix bugs, but also to add new capabilities. In essence, this approach is not scaleable to solve current and future classification problems.

Since the traditional approach is procedural, describing each task in great detail, it is inherently slow. This approach uses a general-purpose language on a general-purpose processor and is not optimized for its task, so it will always be slower than an approach constructed and optimized for classification.

The procedural approach also suffers from a more obscure but important problem for modern communication systems. Using the traditional approach, it is difficult to accelerate pieces of the task through hardware. Ideally, the classification approach or language would allow important pieces to be implemented transparently in hardware. This would allow the same classification program to be used for many different pieces of equipment or generations of the same equipment.

A new classification language is needed to express the complex classification tasks used by communication systems in a way that is scaleable, can be optimized, and allows acceleration. This new language, called the functional programming language (FPL), enables engineers to obtain a scaleable, optimized solution that accelerates the classification process.

Making it work


A classification language is a specification language that allows a programmer to express classification tasks easily and efficiently. It should provide for specification of most types of classification used by communication systems. It should also be easily maintained and conducive to hardware acceleration methods. A useful classification language is probably not procedural. It will not specify how a classification task is to be performed. Instead, it will clearly specify what classification must be performed.

The obvious result of using such a classification language is that the number of lines of code will be dramatically reduced, leading to fewer bugs and easier maintenance. Another expected outcome is that changes would be far easier to incorporate than with a procedural language. This language would also allow scaleable solutions to be implemented, as well as those that can be transparently accelerated.

A classification language for communication systems should support the following features at a minimum. These features could be considered requirements for a Level-1 classification language:

  • Definition of incoming data cells or packets so that important fields can be identified.

  • Definition of rules that allow the programmer to link classification outcomes to appropriate actions. Multiple, simultaneous rules (and therefore actions) should be possible.

  • An extensible set of actions as outcomes for classification. These actions should include “move” and “count” operations at a minimum.

  • Definition of sets of classification possibilities with associated actions. These sets should not have a preset limit on their sizes.

  • Ability to combine many different fields for complex classification.

A classification language with this minimal feature set would be extremely powerful. An even more powerful language would result from the addition of the following features of a Level-2 classification language:

  • Wild-card and priority capability for subnets and ACLs.

  • Ability to express different rule sets for processing multiple, simultaneous protocols.

  • Support for deep classification and resulting reassembly.

    Finally, the following Level-3 features would complete the language for most modern communication systems:

  • State maintenance for cross-packet classification tasks.

  • Regular expression matching.

  • Non-anchored searching.

However, even a Level-3 classification language would be useless if it weren’t easy to use, modify, and incorporate into communication systems. FPL has been designed to meet these requirements.

FPL Overview


The FPL language has been specifically created to address classification problems within communication systems. FPL, which is processed directly by a network processor, is a Level-3 classification language that allows an engineer to program a variety of classification tasks.

FPL can be best described by the words functional and descriptive:

  • Functional. As its name suggests, FPL is a functional language. In sharp contrast to procedural languages, functional languages specify mappings or transformations, as opposed to procedures for solving problems. Through these mappings, a functional language promotes bug-free implementations of certain tasks. Classification is an ideal task to be performed through functional languages.

  • Descriptive. Instead of specifying how a task is to be accomplished, FPL simply describes the task to be accomplished. With FPL, an engineer describes the protocols to be processed. Then, the engineer attaches actions to these descriptions. The description of the protocol itself is 95% of programming a classification task.

Consider the previously mentioned assembly and C code examples. Each of these approaches attempts to process data found in a TCP/IP packet of the format highlighted in Figure 3 .

The FPL code used for processing the data highlighted in Figure 3 is quite simple:

ClassB: 10b net host action($1,$2)

As evident from this example, a line of FPL code describes the data it is meant to process. In the code line above, the given action ($1, $2) is executed whenever the target data is seen.

Each line of FPL is called a rule. Engineers can think of each rule as a pattern with an associated action. In other words, an FPL rule specifies a classification pattern along with a classification action that is executed when the given pattern is seen.

In the example, “ClassB” is the name of the rule. After the colon, which separates the name from the body of the rule, is a constant: a binary 10. When this rule begins to execute, the binary 10 is compared against the incoming data. If it matches, then the rule continues to execute. In this case, the next piece of the rule, “net,” begins to execute.

The word “net” refers to another rule within FPL. Much like within the C code example, where a subfunction is called “net,” FPL can use sub-rules to build up a larger rule. This rule may look like:

net: 100010b 128:8 return(1)

net: 100010b 207:8 return(2)

net: 100010b 192:8 return(3)

These three lines define three distinct possibilities for the net. Each line is considered a separate rule, though together they make up the net rule. When running FPL, the rule that matches the incoming data will be executed, causing the appropriate value to be returned by the rule.

These are the basic constructs within FPL. By combining these basic constructs with other features of FPL, an engineer can use it to express numerous classification tasks. More importantly, FPL can express all classification tasks necessary for a Level-3 classification language

FPL vs. C


FPL was created to express classification tasks. Reviewing the code segments above, an engineer can see how the descriptive specification in FPL is far simpler than that in C.

When using FPL, a communication systems designer will write 20 times fewer lines of code for many different networking devices. In other words, with FPL the engineer concentrates on the details of the protocol, not the details of the language itself, which leads to fewer lines of code (see Figure 4 ).

Writing fewer lines of code provides distinct benefits to the communication systems designer. In particular, writing fewer lines of code leads to fewer bugs, faster time-to-market, lower maintenance costs, and greater reuse capabilities.

Eric Rothfus is executive vice president and co-founder of Agere, Inc. He was a cofounder of Dazel Corp., where he was the vice president of development and Internet initiatives. He was also an early Tivoli employee and a co-founder of PhaseII Corp. Rothfus received a BSEE/CS from MIT, and has done graduate work in electrical engineering at the University of Texas at Austin. He can be reached at eric@agere.com. .


Illustrations Code Listings
Figure 1
Figure 2
Figure 3
Figure 4
Code Listing 1
Code Listing 2



Return to the Table of Contents





Virtualab

  • Newcomer delivers innovative FETs for less-visible, fragmented market
  • GPON startup expands in Asia, secures $7 million
  • Conexant sells home networking line to CopperGate
  • Cisco quarterly results top expectations
  • MORE
    Waveguides bridge 'terahertz gap'
    Researchers at the University of Utah have demonstrated a method of building wires that act as terahertz-caliber waveguides atop stainless-steel foil by using lines of micron-scale perforations. EE Times' R Colin Johnson Takes a closer look.Click here to learn more.

    Tech Article Library
    Check out CommsDesign's Design corner to find a detail technical articles on a host of communication design issues. To access the design corner, click here.

    Air Force secretary wants more routers in space
    Air Force Secretary Michael Wynne called for an increase in small satellite clusters and Internet Protocol routing in space as one means of improving U.S. satellite capabilities. EE Times' Loring Wirbel delves into the plan. Click here for details.

    Puzzled by a network processing design issue?

    Join former NPF CEO Colin Mick in discussing net processing design issues by clicking here!


    EE Times TechCareers
    Search Jobs

    Enter Keyword(s):


    Function:


    State:
      

    Post Your Resume
    -----------------
    Employers Area
    Most Recent Posts More career-related news, resources and job postings for technology professionals




    Home  |  Register  |  About  |  Feedback  |  Contact