|
The Case for a Classification Language
By Eric J. Rothfus
Slow classification languages have forced system designs to lag behind in providing faster and more reliable networks.
By turning to a functional programming language, engineers can streamline code development and speed the classification process.
In the last year, many new players have emerged in the network processor industry. Network processors offer OEMs the speed of an ASIC with the programming flexibility of a microprocessor. With programmability, service providers and carriers can add new features through software, instead of waiting years for the next ASIC spin.
Programmability brings with it a
new set of issues and design considerations. Which programming language is best suited for communication tasks? While general-purpose languages like C have many merits, a high-level language specifically targeted at communication applications can help engineers tackle the complex classification requirements of modern communication systems.
This article will provide an overview of classification, the necessary attributes of classification, and the drawbacks to traditional classification approaches.
Additionally, the article will explore a new classification method that employs a functional programming language (FPL).
Defining classification
For the purposes of this article, the term classification refers to the general concept of examining data, in whole or in part, and determining which of several possible actions should be taken. For example, if your task is to sort colored
blocks into matching boxes, classification refers to examining each incoming block, determining its color, and placing it in the right box (see
Figure 1
).
This simple example is only the beginning. Classification is often performed while taking into account numerous attributes of the incoming items or information. For example, sorting all blue blocks with gold trim into box 145 and all red balls into box 122.
In addition, classification does not
have to be active. That is, a simple form of classification may be to count all of the blue balls. The action being performed is simple counting.
Classification and communications
Classification is a major issue within communication systems with many different applications, especially in network routers and switches. However, this article will concentrate on the data
path, the most time-critical portion of communication systems.
Figure 2
shows a simplified diagram of the data flow within a communication system. Data flowing at high speed, or wire speed, flows into network processing elements from a data transceiver (not shown). Almost all of the data (99%) is fully processed in the data path at wire speed, and is passed directly to the backplane after processing. Some data requires more extensive processing and is passed to
the slower-speed control path, which normally includes a microprocessor. For example, routing protocols are processed in the control path. The processed protocols are then used to configure the data path in order to implement the adjusted routing or forwarding tables.
In the data path, classification may be performed numerous times using several, if not all, portions of the input data. For example, when processing a packet of information within a protocol such as IP, the IP destination address
(such as 209.109.83.100) is used as a key for determining which port the packet should flow to next. While this classification task may sound easy, there may be hundreds of thousands of different possibilities in todays network systems.
Within the data path, another form of classification is used to gather statistics. In this case, the task is to count the number of packets flowing through a device that conform to a given set of characteristics. If engineers are trying to count the number of
telnet packets that are flowing between the IP addresses 209.109.83.100 and 192.207.128.193, they will use classification to identify packets that have these two addresses in the source and destination fields, as well as the telnet port number, 23, in the port field.
There are numerous other places where classification plays a key role in the data path:
Layer 2 switching
Layer 3/4 switching
Layer 3+ routing
Priority processing
Error detection
Filtering
Flow detection
Access control list (ACL) processing
Load balancing
Statistics gathering
Segmentation and reassembly (SAR)
The last item in this list, SAR, is
quite distinct. It refers to what can be called deep classification, or the ability to classify incoming information so that further classification can be done. In the case of SAR, classification is normally performed on reassembly and consists of reconstructing a larger data packet from smaller data cells. The first classification task involves examining incoming cells and identifying the appropriate packet to which it should be attached. In most cases, SAR is followed by further classification
such as deciding which port to send an assembled packet to. Deep classification can be described as a recursive application of multiple classification tasks to a growing set of attributes and data.
Harder than it looks
Classification, even deep classification, sounds easy. Simply recognize whats coming in, put it in an appropriate class or category, and execute an action
associated with that class. It should be easy, but its not, especially within communication systems.
The first problem is in the number of possibilities that must be taken into account during even simple classification tasks. For example, when performing simple IP packet routing, classification must take into account 50,000 or more possible routes. Future systems may require 500,000 or more routes.
Classification can also be complicated by the use of many fields within the packet. For IP
routing, one to three fields in a packet may be used. For flow identification, two to five fields or more may be used. For ACL processing, six fields can be used. When performing higher-layer switching, such as load balancing, even more fields must be taken into account.
But why do the number of possibilities and their complexity raise difficulties with classification? Imagine that you were performing the simple block classification task yourself, and that it was your job to look at the
blocks color and place it into a box with the same color. It would be easy with five colors and five collection boxes. What if there were 50 boxes?
Classification starts to become difficult in two ways: distinguishing the colors of the blocks and their boxes, and gathering all of the boxes around you, so that you can easily grab a block and toss it into its box. Maybe an engineer could classify 50 boxes. But what about 1,000, 10,000, or 100,000 boxes? Soon, the engineer cant tell the difference
between distinct colors. It also becomes increasingly difficult to find the right box to throw the block into, even if the engineer could reach it from where he or she sat.
As if that werent hard enough, the engineer could have some blocks with spots and some without. This is like having a classification task with multiple fields or attributes; it effectively multiplies the tasks difficulty by the number of fields.
Fortunately, communication systems are accustomed to dealing with large
numbers of routes and complex classification. However, as network speeds increase, it becomes more difficult to keep up with the larger number of routes and the requirements for more complex classification methods. If the blocks needed to be sorted into 100 boxes, with the incoming blocks arriving once a minute, there should be ample time to complete the task. If the blocks arrived once every couple seconds, there may be just enough time. Imagine, though, that the blocks are arriving every second, or maybe
1,000 are arriving every second, or, given modern network speeds, one million blocks are arriving every second.
The traditional approach
Classification is not a new concept within communication systems. The very first cell switch contained a form of classification engine that examined each incoming cell and moved it to the appropriate output port based on what was inside it.
That first cell switch used a microprocessor that ran a form of assembly language to examine each cell and decide what to do with it. The code that drove that microprocessor may have looked something like the code in
Code Listing 1
.
Later devices used a higher-level language like C. The code in
Code Listing 1
may look like the code in
Code Listing 2
.
Both of these
snippets of code perform roughly the same tasks. They look at a data packet, first checking to see if the packet qualifies to be routed, then looking at data fields and calling subfunctions to determine what route the packet should take as a class B subnet route.
The code is an example using a procedural language to perform a classification task. The code instructs a microprocessor to perform a classification. Each step in the process is spelled out in great detail through microprocessor
instructions such as move, test, and add.
This traditional approach to programming classification tasks has a few important drawbacks. The first is that the procedural nature of the code makes it difficult to write and later to read and debug. Since the programmer expresses the classification task by describing how it is done, the meaning of the classification is obscured within the code and is unnecessarily complex.
Suppose that someone tried to tell you how to sort your blocks by saying: Focus
on the incoming block; determine its hue, saturation, and luminance; look down at the first box; determine its hue, saturation, and luminance and compare this to the same for the incoming block. If the two are the same, move your left arm. It would be far easier for you to follow a higher-level description of the problem such as: Put each block in the box of the same color.
Another drawback to the traditional approach is that the codes complexity makes getting the system to
work as expected a difficult task. Furthermore, as the number of lines of code increases, so does the number of bugs. More importantly, as the number of lines and their complexity increase, it becomes far more difficult to make changes, which are necessary not only to fix bugs, but also to add new capabilities. In essence, this approach is not scaleable to solve current and future classification problems.
Since the traditional approach is procedural, describing each task in great detail, it is
inherently slow. This approach uses a general-purpose language on a general-purpose processor and is not optimized for its task, so it will always be slower than an approach constructed and optimized for classification.
The procedural approach also suffers from a more obscure but important problem for modern communication systems. Using the traditional approach, it is difficult to accelerate pieces of the task through hardware. Ideally, the classification approach or language would allow important
pieces to be implemented transparently in hardware. This would allow the same classification program to be used for many different pieces of equipment or generations of the same equipment.
A new classification language is needed to express the complex classification tasks used by communication systems in a way that is scaleable, can be optimized, and allows acceleration. This new language, called the functional programming language (FPL), enables engineers to obtain a scaleable, optimized solution that
accelerates the classification process.
Making it work
A classification language is a specification language that allows a programmer to express classification tasks easily and efficiently. It should provide for specification of most types of classification used by communication systems. It should also be easily maintained and conducive to hardware acceleration methods. A useful
classification language is probably not procedural. It will not specify how a classification task is to be performed. Instead, it will clearly specify what classification must be performed.
The obvious result of using such a classification language is that the number of lines of code will be dramatically reduced, leading to fewer bugs and easier maintenance. Another expected outcome is that changes would be far easier to incorporate than with a procedural language. This language would also allow
scaleable solutions to be implemented, as well as those that can be transparently accelerated.
A classification language for communication systems should support the following features at a minimum. These features could be considered requirements for a Level-1 classification language:
Definition of incoming data cells or packets so that important fields can be identified.
Definition of rules that allow the programmer
to link classification outcomes to appropriate actions. Multiple, simultaneous rules (and therefore actions) should be possible.
An extensible set of actions as outcomes for classification. These actions should include move and count operations at a minimum.
Definition of sets of classification possibilities with associated actions. These sets should not have a preset limit on their sizes.
Ability to combine many different fields for complex classification.
A classification language with this minimal feature set would be extremely powerful. An even more powerful language would result from the addition of the following features of a Level-2 classification language:
Wild-card and priority capability for subnets and ACLs.
Ability to express different rule sets for processing multiple,
simultaneous protocols.
Support for deep classification and resulting reassembly.
Finally, the following Level-3 features would complete the language for most modern communication systems:
State maintenance for cross-packet classification tasks.
Regular expression matching.
Non-anchored searching.
However, even a Level-3 classification language would be
useless if it werent easy to use, modify, and incorporate into communication systems. FPL has been designed to meet these requirements.
FPL Overview
The FPL language has been specifically created to address classification problems within communication systems. FPL, which is processed directly by a network processor, is a Level-3 classification language that allows an
engineer to program a variety of classification tasks.
FPL can be best described by the words functional and descriptive:
Functional.
As its name suggests, FPL is a functional language. In sharp contrast to procedural languages, functional languages specify mappings or transformations, as opposed to procedures for solving problems. Through these mappings, a functional language promotes bug-free implementations of certain tasks. Classification is an
ideal task to be performed through functional languages.
Descriptive.
Instead of specifying how a task is to be accomplished, FPL simply describes the task to be accomplished. With FPL, an engineer describes the protocols to be processed. Then, the engineer attaches actions to these descriptions. The description of the protocol itself is 95% of programming a classification task.
Consider the previously mentioned assembly and C code examples.
Each of these approaches attempts to process data found in a TCP/IP packet of the format highlighted in
Figure 3
.
The FPL code used for processing the data highlighted in
Figure 3
is quite simple:
ClassB: 10b net host action($1,$2)
As evident from this example, a line of FPL code describes the data it is meant to process. In the code line above, the given action ($1, $2) is
executed whenever the target data is seen.
Each line of FPL is called a rule. Engineers can think of each rule as a pattern with an associated action. In other words, an FPL rule specifies a classification pattern along with a classification action that is executed when the given pattern is seen.
In the example, ClassB is the name of the rule. After the colon, which separates the name from the body of the rule, is a constant: a binary 10. When this rule begins to execute, the
binary 10 is compared against the incoming data. If it matches, then the rule continues to execute. In this case, the next piece of the rule, net, begins to execute.
The word net refers to another rule within FPL. Much like within the C code example, where a subfunction is called net, FPL can use sub-rules to build up a larger rule. This rule may look like:
net: 100010b 128:8 return(1)
net: 100010b 207:8 return(2)
net: 100010b 192:8
return(3)
These three lines define three distinct possibilities for the net. Each line is considered a separate rule, though together they make up the net rule. When running FPL, the rule that matches the incoming data will be executed, causing the appropriate value to be returned by the rule.
These are the basic constructs within FPL. By combining these basic constructs with other features of FPL, an engineer can use it to express numerous classification tasks. More importantly,
FPL can express all classification tasks necessary for a Level-3 classification language
FPL vs. C
FPL was created to express classification tasks. Reviewing the code segments above, an engineer can see how the descriptive specification in FPL is far simpler than that in C.
When using FPL, a communication systems designer will write 20 times fewer lines of code for many
different networking devices. In other words, with FPL the engineer concentrates on the details of the protocol, not the details of the language itself, which leads to fewer lines of code (see
Figure 4
).
Writing fewer lines of code provides distinct benefits to the communication systems designer. In particular, writing fewer lines of code leads to fewer bugs, faster time-to-market, lower maintenance costs, and greater reuse capabilities.
Eric Rothfus
is executive vice president and co-founder of Agere, Inc. He was a cofounder of Dazel Corp., where he was the vice president of development and Internet initiatives. He was also an early Tivoli employee and a co-founder of PhaseII Corp. Rothfus received a BSEE/CS from MIT, and has done graduate work in electrical engineering at the
University of Texas at Austin. He can be reached at
eric@agere.com.
.
|