Standards & Protocols
Sparse Mode Multicast
A new class of multicast protocols has been under investigation in the engineering community. These so-called sparse mode multicast algorithms are built with the understanding that multicast transmissions can be made across the Internet at large.
By Mike Rodbell
Over the past few months, we've been reviewing many of the well-understood techniques for Internet-based multicast. Most of those techniques are based on the assumption that the multicast participants are logically close to one another in the network. As one considers routing multicast packets over larger networks, the problems become more significant. Expecting all routers in a huge network to maintain comprehensive routing tables is unrealistic, and predicting the size of these tables would rapidly become an impossible job. Additionally, the amount of control information required to manage the multicast memberships could add so much congestion to the network that many of the gains achieved through the use of multicast protocols would be canceled out.
Given that the basic multicast protocols don't scale to larger networks, a new class of multicast protocols has been under investigation in the engineering community. These so-called sparse mode multicast algorithms are built with the understanding that multicast transmissions can be made across the Internet at large. Predicting both transmission points and receipt points will eventually be impossible. If one considers using the Internet as a generally available broadcast (actually multicast) medium in the model of television, the need for sparse multicast services becomes far more evident.
Two areas of protocol research are of particular interest in the sparse mode multicast world. Core-based trees (CBT) are based on the concept of shared delivery trees. Protocol independent multicast, sparse mode (PIM-SM), is a multicast mechanism that can operate independently of the underlying routing mechanisms. Where protocols such as the distance vector multicast routing protocol (DVMRP) and the multicast extensions to open shortest path first (MOSPF) are tightly coupled with their respective underlying unicast routing protocols, PIM-SM is being designed not to care about the supporting unicast routing protocols. PIM-SM is only concerned with being able to reach the participating nodes in the network through whatever unicast routing protocols are available.
This month, I'll review the operation of the CBT approach, and I'll cover PIM-SM next month. Additional technical details on CBT can be found in request for comments (RFC) 2201 and RFC 2189. At this point, CBT is an experimental protocol that is intended to address many of the problems that we've noted in scaling multicast into larger network environments.
CBT
As indicated (somewhat) by their name, core-based trees are based on the concept of a centralized router that is used to coordinate all transmissions for a particular group. Where the dense-mode multicast routing protocols rely on distributed knowledge of the multicast routes, this information is now maintained in a centralized location. While this provides some distinct advantages over source-based trees (the routers need not all hold huge lists), there is still significant potential to create hot spots around the core routers.
While the CBT approach alters the general methodology for multicast group membership, these memberships still begin with the Internet group-management protocol (IGMP) primitives. When a node first wishes to join a new multicast group, it issues an IGMP host membership report. This is where the behavior becomes specific to the CBT algorithm. When a multicast router receives a request to join a new multicast group, the CBT algorithm employs a mechanism in which all memberships are directed towards a single, centralized core router. One idiosyncrasy of the core routing algorithm is that, while the name implies the existence of a single, core-based routing tree, all CBT-capable routers actually maintain a cache of multicast routing information. The core aspects of the routing algorithm help to ensure that routing loops are avoided by focusing all multicast transmissions through a centralized core routing facility.
The general concept of operation of the CBT routing algorithms is best understood by example. There is a single core router being used to coordinate multicast traffic through a particular region of a network. As each multicast-capable router learns of a request to participate in a multicast group (through IGMP host membership reports), it will direct a CBT join request towards the core router. If the join request reaches another multicast router (MRX) that is already participating in that group, the membership is granted by the intermediate router. As these routes are established, each multicast router maintains a list of group participation along with the interfaces over which the multicast traffic is to be forwarded. Where the reverse path multicast (RPM) types of protocols need to continuously observe and manage the actual flow of multicast transmissions, CBT routers need only maintain forwarding caches for the actual multicast traffic. When a CBT router receives a multicast packet, it need only search its local routing cache to identify all of the interfaces for which the particular multicast group is registered. The router will then retransmit the multicast packet on all of those interfaces other than the one over which the packet was received.
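The join behavior in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure specified in the RFCs: the Router class, the join function, the group name "G", and the topology (a core above MR 3, MR 4, MR 6, and MR 7) are all hypothetical, and the join and its acknowledgement are collapsed into a single walk along the path.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    name: str
    next_hop_to_core: "Router | None" = None  # None marks the core itself
    # group -> names of the neighbors this router exchanges group traffic with
    cache: dict = field(default_factory=dict)

    def on_tree(self, group):
        # The core is always on the tree; others are once they hold a cache entry.
        return self.next_hop_to_core is None or group in self.cache


def join(leaf, group):
    """Walk a join hop by hop toward the core; return the names of routers visited."""
    path = [leaf]
    # The join stops at the first router that is already on the tree.
    while not path[-1].on_tree(group):
        path.append(path[-1].next_hop_to_core)
    # The acknowledgement retraces the path, installing state at each hop.
    for child, parent in zip(path, path[1:]):
        child.cache.setdefault(group, set()).add(parent.name)
        parent.cache.setdefault(group, set()).add(child.name)
    return [router.name for router in path]


# Hypothetical topology: core <- MR 3 <- MR 4 <- {MR 6, MR 7}
core = Router("core")
mr3 = Router("MR 3", core)
mr4 = Router("MR 4", mr3)
mr6 = Router("MR 6", mr4)
mr7 = Router("MR 7", mr4)

first = join(mr6, "G")   # no one is on the tree yet, so this reaches the core
second = join(mr7, "G")  # MR 4 is already on the tree, so the join stops there
```

In this sketch the first join visits MR 6, MR 4, MR 3, and the core, while the second visits only MR 7 and MR 4, which is the point of letting intermediate routers grant memberships.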
Some general terms
Not to be left behind in the art of defining terms to identify the equipment's roles, the CBT RFC employs some new terms. These include:
The parent router is a router that supports the transfer of multicast traffic towards the core. MR 4 is the parent to MR 7, MR 9, and MR 6.
The child router is a router that depends on a parent router. For example, MR 6 is a child to MR 4.
An upstream interface is the interface towards the core router. The link to MR 3 from MR 4 is MR 4's upstream interface.
The downstream interface is the interface that points away from the core router. The link between MR 3 and MR 4 is a downstream interface from the perspective of MR 3.
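These four terms all fall out of a single child-to-parent relationship, which the short Python sketch below tries to make concrete. The parent_of table simply encodes the MR 3, MR 4, MR 6, MR 7, and MR 9 examples given above; the function names are hypothetical, and links are represented as (router, neighbor) pairs purely for illustration.

```python
# Child -> parent (toward the core), taken from the examples in the definitions.
parent_of = {"MR 4": "MR 3", "MR 6": "MR 4", "MR 7": "MR 4", "MR 9": "MR 4"}


def children_of(router):
    """Routers that depend on `router` as their parent."""
    return sorted(child for child, parent in parent_of.items() if parent == router)


def upstream_link(router):
    """The link toward the core, or None if the parent is not in the table."""
    parent = parent_of.get(router)
    return (router, parent) if parent else None


def downstream_links(router):
    """The links that point away from the core, one per child."""
    return [(router, child) for child in children_of(router)]


mr4_children = children_of("MR 4")
mr4_upstream = upstream_link("MR 4")
mr3_downstream = downstream_links("MR 3")
```

Note that the same physical link shows up twice: it is MR 4's upstream link and one of MR 3's downstream links, which matches the perspective-dependent wording of the definitions.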
Given these basics, we have a frame of reference for understanding many of the information exchanges that occur within CBT-based multicast routing systems.
Protocol primitives
As with any data protocol, CBT can be partially understood as a collection of its various parts. In this case, we can start with the message primitives. Reviewing the actions taken when these messages are sent helps in seeing how the actual routes are established and, once no longer necessary, disabled. The protocol primitives employed by CBT include:
JOIN_REQUEST is the message sent by a leaf router towards the core router when it determines that a new group membership is warranted. Note that this transmission isn't necessarily a unicast to the core router. The actual intent is that the message is forwarded through the intermediate multicast routers (MRX) towards the core. When an intermediate router detects a JOIN_REQUEST for a group for which it already has a cached membership, it will process the request itself and not forward it to the core. As the intermediate routers process these memberships, they retain the multicast routes in their local caches.
JOIN_ACK completes the successful processing of the actions requested by the JOIN_REQUEST. Once a router receives the JOIN_ACK message, the multicast route is installed. From that time forward, multicast transmissions can follow the path dictated by the route.
QUIT_NOTIFICATION supports the inverse of the JOIN_REQUEST/JOIN_ACK operations. The QUIT_NOTIFICATION message is sent when a router determines that it no longer needs to process messages from a particular multicast group. Unlike the JOIN_REQUEST, it has no explicit acknowledgement message, so the QUIT_NOTIFICATION is normally sent a number of times. The QUIT_NOTIFICATION message contains the information needed to identify the cache entry that represents the group being canceled.
ECHO_REQUEST recognizes that one of the more typical characteristics of widely distributed data networks is variability. Child routers periodically send ECHO_REQUESTs to monitor the availability of parent routers. This message is actually broadcast to all-cbt-routers. When an upstream router receives this message, a timer is started. Upon the expiration of the timer, the router will issue an ECHO_REPLY message. If no ECHO_REPLY is received, the corresponding cached route is no longer dependable, and the related routes in the cache need to be re-evaluated.
ECHO_REPLY is the companion to the ECHO_REQUEST. When the time has arrived to respond to the ECHO_REQUEST messages received over its child interface(s), the router will transmit an ECHO_REPLY to one or more requesting routers. On receipt of the ECHO_REPLY, the child routers maintain the active status of the routing cache entries related to the leaf/branch router pairing.
FLUSH_TREE is sent when a router loses connectivity to its parent for a particular group (that is, it doesn't receive the ECHO_REPLY message in the appropriate time). The router will issue a FLUSH_TREE message to all downstream routers. This message will contain references to the groups that must be disabled due to lost connectivity.
This collection of messages makes up the set of information that CBT uses to coordinate the establishment, health monitoring, and removal of the multicast routes.
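The health-monitoring half of that list (ECHO_REQUEST, ECHO_REPLY, and FLUSH_TREE) lends itself to a small Python sketch. Everything here is an assumption made for illustration: the CbtRouter and GroupEntry classes, the 30-second ECHO_TIMEOUT value, and the use of caller-supplied timestamps instead of real timers are not taken from the RFCs, and messages are recorded in a list rather than transmitted.

```python
from dataclasses import dataclass, field

ECHO_TIMEOUT = 30.0  # seconds; hypothetical value, not taken from the RFC


@dataclass
class GroupEntry:
    parent: str
    children: set = field(default_factory=set)
    last_reply: float = 0.0  # when the last ECHO_REPLY arrived from the parent


class CbtRouter:
    def __init__(self, name):
        self.name = name
        self.entries = {}  # group -> GroupEntry
        self.sent = []     # (message, destination, group), recorded instead of sent

    def on_echo_reply(self, group, now):
        # A reply from the parent keeps the cached route alive.
        if group in self.entries:
            self.entries[group].last_reply = now

    def check_parent(self, group, now):
        """Flush the subtree if the parent has been silent for too long."""
        entry = self.entries.get(group)
        if entry is None or now - entry.last_reply <= ECHO_TIMEOUT:
            return False
        # Parent presumed unreachable: tell every child and drop the entry.
        for child in sorted(entry.children):
            self.sent.append(("FLUSH_TREE", child, group))
        del self.entries[group]
        return True


mr4 = CbtRouter("MR 4")
mr4.entries["G"] = GroupEntry(parent="MR 3", children={"MR 6", "MR 7"})
mr4.on_echo_reply("G", now=10.0)
still_alive = not mr4.check_parent("G", now=25.0)  # 15 s of silence: keep the route
flushed = mr4.check_parent("G", now=50.0)          # 40 s of silence: flush downstream
```

A real router would presumably try to rejoin the tree after flushing; the sketch stops at the point where the downstream routers have been told that their routes for the group are no longer valid.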
Sending multicast traffic on CBT systems
With a multicast routing tree that has no loops, the transmission of multicast traffic becomes a fairly simple matter. The traffic enters the multicast network for a particular group once a participating multicast router processes it. On receipt, the router will forward the message over all of the group's interfaces with the exception of the incoming interface. Given the tree-based architecture of the multicast network, all traffic is heading either towards or away from the core router. This is a logical tree network, not a mesh. The system architects need to concern themselves with the number and characteristics of the groups that are processed by the core router. There is a potential for a significant hot spot on all of the core router's interfaces.
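The forwarding rule described here is small enough to write out directly. The following Python sketch assumes a cache that maps each group to the set of interfaces registered for it; the forward function and the interface names if0 through if2 are hypothetical.

```python
def forward(cache, group, arrival_interface):
    """Return the interfaces on which a packet for `group` should be retransmitted."""
    registered = cache.get(group, set())
    # Send on every registered interface except the one the packet arrived on.
    return sorted(registered - {arrival_interface})


# Hypothetical router with three on-tree interfaces for group "G".
cache = {"G": {"if0", "if1", "if2"}}

out_for_known_group = forward(cache, "G", "if1")
out_for_unknown_group = forward(cache, "H", "if1")
```

Because the tree has no loops, this rule needs no duplicate detection: a packet that arrives on if1 goes out on if0 and if2 only, and a packet for a group with no cache entry goes nowhere.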
There are some additional details on the management and configuration of CBT networks that are addressed in the CBT RFCs. In particular, locating the core router can be done through a variety of mechanisms, including automatic bootstrapping and static configuration. Given a likely need for widely distributed multicast, the CBT protocols and services offer some significant advantages.
Mike Rodbell is director of embedded software development for CIENA Communications Inc. He has developed voice and data communication systems for a wide range of commercial and military systems. He holds a BSCS from Trinity College of Hartford, CT, and an MSEE from Loyola College of Baltimore, MD. He can be reached at mrodbell@ciena.com or http://www.ciena.com.