draft-ietf-idr-rfc2796bis-00.txt   draft-ietf-idr-rfc2796bis-01.txt 
Network Working Group T. Bates Network Working Group T. Bates
Internet Draft Cisco Systems Internet Draft Cisco Systems
Expiration Date: September 2004 R. Chandra Expiration Date: November 2004 R. Chandra
E. Chen E. Chen
Redback Networks Redback Networks
BGP Route Reflection - BGP Route Reflection -
An Alternative to Full Mesh IBGP An Alternative to Full Mesh IBGP
draft-ietf-idr-rfc2796bis-00.txt draft-ietf-idr-rfc2796bis-01.txt
1. Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.'' material or to cite them other than as ``work in progress.''
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
2. Abstract Abstract
The Border Gateway Protocol [1] is an inter-autonomous system routing The Border Gateway Protocol (BGP) is an inter-autonomous system
protocol designed for TCP/IP internets. Currently in the Internet BGP routing protocol designed for TCP/IP internets. Typically all BGP
deployments are configured such that that all BGP speakers within a speakers within a single AS must be fully meshed so that any external
single AS must be fully meshed so that any external routing routing information must be re-distributed to all other routers
information must be re-distributed to all other routers within that within that AS. This represents a serious scaling problem that has
AS. This represents a serious scaling problem that has been well been well documented with several alternatives proposed.
documented with several alternatives proposed [2,3].
This document describes the use and design of a method known as This document describes the use and design of a method known as
"Route Reflection" to alleviate the need for "full mesh" IBGP. "Route Reflection" to alleviate the need for "full mesh" IBGP.
3. Specification of Requirements This document obsoletes RFC 2796 and RFC 1966.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [7].
4. Introduction 1. Introduction
Currently in the Internet, BGP deployments are configured such that Typically all BGP speakers within a single AS must be fully meshed
that all BGP speakers within a single AS must be fully meshed and any and any external routing information must be re-distributed to all
external routing information must be re-distributed to all other other routers within that AS. For n BGP speakers within an AS that
routers within that AS. For n BGP speakers within an AS that
requires maintaining n*(n-1)/2 unique IBGP sessions. This "full requires maintaining n*(n-1)/2 unique IBGP sessions. This "full
mesh" requirement clearly does not scale when there are a large mesh" requirement clearly does not scale when there are a large
number of IBGP speakers each exchanging a large volume of routing number of IBGP speakers each exchanging a large volume of routing
information, as is common in many of todays internet networks. information, as is common in many of today's networks.
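To make the scaling concern concrete, the following minimal sketch computes the n*(n-1)/2 session count for a full mesh and, for comparison, the count when a single route reflector serves every other speaker as a client. The function names and the single-cluster layout are illustrative assumptions and appear in neither draft.

```python
def full_mesh_sessions(n: int) -> int:
    """Unique IBGP sessions needed to fully mesh n speakers: n*(n-1)/2."""
    return n * (n - 1) // 2


def single_rr_sessions(n: int) -> int:
    """Sessions when one speaker is the RR and the other n-1 are its clients.

    Assumption for illustration only: one cluster, no client-to-client sessions.
    """
    return max(n - 1, 0)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, full_mesh_sessions(n), single_rr_sessions(n))
```

The full-mesh count grows quadratically with n, while the single-reflector count grows linearly. A real deployment would normally use more than one reflector for redundancy.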
This scaling problem has been well documented and a number of This scaling problem has been well documented and a number of
proposals have been made to alleviate this [2,3]. This document proposals have been made to alleviate this [2,3]. This document
represents another alternative in alleviating the need for a "full represents another alternative in alleviating the need for a "full
mesh" and is known as "Route Reflection". This approach allows a BGP mesh" and is known as "Route Reflection". This approach allows a BGP
speaker (known as "Route Reflector") to advertise IBGP learned routes speaker (known as "Route Reflector") to advertise IBGP learned routes
to certain IBGP peers. It represents a change in the commonly to certain IBGP peers. It represents a change in the commonly
understood concept of IBGP, and the addition of two new optional non- understood concept of IBGP, and the addition of two new optional non-
transitive BGP attributes to prevent loops in routing updates. transitive BGP attributes to prevent loops in routing updates.
5. Design Criteria This document obsoletes RFC 2796 [6] and RFC 1966 [4].
2. Specification of Requirements
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [7].
3. Design Criteria
Route Reflection was designed to satisfy the following criteria. Route Reflection was designed to satisfy the following criteria.
o Simplicity o Simplicity
Any alternative must be both simple to configure as well as Any alternative must be both simple to configure as well as
understand. understand.
o Easy Transition o Easy Transition
skipping to change at page 3, line 9 skipping to change at page 3, line 14
o Compatibility o Compatibility
It must be possible for non compliant IBGP peers to continue to be It must be possible for non compliant IBGP peers to continue to be
part of the original AS or domain without any loss of BGP part of the original AS or domain without any loss of BGP
routing information. routing information.
These criteria were motivated by operational experiences of a very These criteria were motivated by operational experiences of a very
large and topology rich network with many external connections. large and topology rich network with many external connections.
6. Route Reflection 4. Route Reflection
The basic idea of Route Reflection is very simple. Let us consider The basic idea of Route Reflection is very simple. Let us consider
the simple example depicted in Figure 1 below. the simple example depicted in Figure 1 below.
+-------+ +-------+ +-------+ +-------+
| | IBGP | | | | IBGP | |
| RTR-A |--------| RTR-B | | RTR-A |--------| RTR-B |
| | | | | | | |
+-------+ +-------+ +-------+ +-------+
\ / \ /
skipping to change at page 4, line 23 skipping to change at page 4, line 23
+-------+ +-------+
| | | |
| RTR-C | | RTR-C |
| | | |
+-------+ +-------+
Figure 2: Route Reflection IBGP Figure 2: Route Reflection IBGP
The Route Reflection scheme is based upon this basic principle. The Route Reflection scheme is based upon this basic principle.
7. Terminology and Concepts 5. Terminology and Concepts
We use the term "Route Reflection" to describe the operation of a BGP We use the term "Route Reflection" to describe the operation of a BGP
speaker advertising an IBGP learned route to another IBGP peer. Such speaker advertising an IBGP learned route to another IBGP peer. Such
a BGP speaker is said to be a "Route Reflector" (RR), and such a a BGP speaker is said to be a "Route Reflector" (RR), and such a
route is said to be a reflected route. route is said to be a reflected route.
The internal peers of a RR are divided into two groups: The internal peers of a RR are divided into two groups:
1) Client Peers 1) Client Peers
skipping to change at page 5, line 31 skipping to change at page 5, line 31
- - - - - /- - -\- - - - - - / - - - - - /- - -\- - - - - - /
IBGP / \ IBGP IBGP / \ IBGP
+-------+ +-------+ +-------+ +-------+
| RTR-D | IBGP | RTR-E | | RTR-D | IBGP | RTR-E |
| Non- |---------| Non- | | Non- |---------| Non- |
|Client | |Client | |Client | |Client |
+-------+ +-------+ +-------+ +-------+
Figure 3: RR Components Figure 3: RR Components
8. Operation 6. Operation
When a RR receives a route from an IBGP peer, it selects the best When a RR receives a route from an IBGP peer, it selects the best
path based on its path selection rule. After the best path is path based on its path selection rule. After the best path is
selected, it must do the following depending on the type of the peer selected, it must do the following depending on the type of the peer
it is receiving the best path from: it is receiving the best path from:
1) A Route from a Non-Client IBGP peer 1) A Route from a Non-Client IBGP peer
Reflect to all the Clients. Reflect to all the Clients.
skipping to change at page 6, line 23 skipping to change at page 6, line 23
not understand the concept of Route-Reflectors (let us call them not understand the concept of Route-Reflectors (let us call them
conventional BGP speakers). The Route-Reflector Scheme allows such conventional BGP speakers). The Route-Reflector Scheme allows such
conventional BGP speakers to co-exist. Conventional BGP speakers conventional BGP speakers to co-exist. Conventional BGP speakers
could be either members of a Non-Client group or a Client group. This could be either members of a Non-Client group or a Client group. This
allows for an easy and gradual migration from the current IBGP model allows for an easy and gradual migration from the current IBGP model
to the Route Reflection model. One could start creating clusters by to the Route Reflection model. One could start creating clusters by
configuring a single router as the designated RR and configuring configuring a single router as the designated RR and configuring
other RRs and their clients as normal IBGP peers. Additional clusters other RRs and their clients as normal IBGP peers. Additional clusters
can be created gradually. can be created gradually.
9. Redundant RRs 7. Redundant RRs
Usually a cluster of clients will have a single RR. In that case, the Usually a cluster of clients will have a single RR. In that case, the
cluster will be identified by the ROUTER_ID of the RR. However, this cluster will be identified by the BGP Identifier of the RR. However,
represents a single point of failure so to make it possible to have this represents a single point of failure so to make it possible to
multiple RRs in the same cluster, all RRs in the same cluster can be have multiple RRs in the same cluster, all RRs in the same cluster
configured with a 4-byte CLUSTER_ID so that an RR can discard routes can be configured with a 4-byte CLUSTER_ID so that an RR can discard
from other RRs in the same cluster. routes from other RRs in the same cluster.
10. Avoiding Routing Information Loops 8. Avoiding Routing Information Loops
When a route is reflected, it is possible through mis-configuration When a route is reflected, it is possible through mis-configuration
to form route re-distribution loops. The Route Reflection method to form route re-distribution loops. The Route Reflection method
defines the following attributes to detect and avoid routing defines the following attributes to detect and avoid routing
information loops: information loops:
ORIGINATOR_ID ORIGINATOR_ID
ORIGINATOR_ID is a new optional, non-transitive BGP attribute of Type ORIGINATOR_ID is a new optional, non-transitive BGP attribute of Type
code 9. This attribute is 4 bytes long and it will be created by a RR code 9. This attribute is 4 bytes long and it will be created by a RR
in reflecting a route. This attribute will carry the ROUTER_ID of in reflecting a route. This attribute will carry the BGP Identifier
the originator of the route in the local AS. A BGP speaker SHOULD NOT of the originator of the route in the local AS. A BGP speaker SHOULD
create an ORIGINATOR_ID attribute if one already exists. A router NOT create an ORIGINATOR_ID attribute if one already exists. A
which recognizes the ORIGINATOR_ID attribute SHOULD ignore a route router which recognizes the ORIGINATOR_ID attribute SHOULD ignore a
received with its ROUTER_ID as the ORIGINATOR_ID. route received with its BGP Identifier as the ORIGINATOR_ID.
CLUSTER_LIST CLUSTER_LIST
CLUSTER_LIST is a new optional, non-transitive BGP attribute of Type CLUSTER_LIST is a new optional, non-transitive BGP attribute of Type
code 10. It is a sequence of CLUSTER_ID values representing the code 10. It is a sequence of CLUSTER_ID values representing the
reflection path that the route has passed. reflection path that the route has passed.
When a RR reflects a route, it MUST prepend the local CLUSTER_ID to When a RR reflects a route, it MUST prepend the local CLUSTER_ID to
the CLUSTER_LIST. If the CLUSTER_LIST is empty, it MUST create a new the CLUSTER_LIST. If the CLUSTER_LIST is empty, it MUST create a new
one. Using this attribute an RR can identify if the routing one. Using this attribute an RR can identify if the routing
information is looped back to the same cluster due to mis- information has looped back to the same cluster due to mis-
configuration. If the local CLUSTER_ID is found in the CLUSTER_LIST, configuration. If the local CLUSTER_ID is found in the CLUSTER_LIST,
the advertisement received SHOULD be ignored. the advertisement received SHOULD be ignored.
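The handling of the two attributes described above can be sketched as follows. This is a simplified model under stated assumptions, not an implementation of either draft: the `Route` type, the function names, and the use of dotted-quad strings for 4-byte values are all hypothetical. The sketch also assumes that when no ORIGINATOR_ID is present, the reflector fills in the BGP Identifier of the IBGP peer it learned the route from.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    originator_id: Optional[str] = None   # ORIGINATOR_ID (4-byte value)
    cluster_list: Tuple[str, ...] = ()    # CLUSTER_LIST, most recent first


def reflect(route: Route, local_cluster_id: str, sender_bgp_id: str) -> Route:
    """Return the route as an RR would reflect it.

    An existing ORIGINATOR_ID is kept; otherwise one is created (assumed
    here to be the sender's BGP Identifier). The local CLUSTER_ID is
    prepended, which also creates the list when it was empty.
    """
    originator = route.originator_id
    if originator is None:
        originator = sender_bgp_id
    return replace(
        route,
        originator_id=originator,
        cluster_list=(local_cluster_id,) + route.cluster_list,
    )


def should_ignore(route: Route, local_bgp_id: str,
                  local_cluster_id: Optional[str]) -> bool:
    """Loop check on receipt; local_cluster_id is None for a non-RR."""
    if route.originator_id == local_bgp_id:
        return True  # our own route came back to us
    if local_cluster_id is not None and local_cluster_id in route.cluster_list:
        return True  # route already passed through our cluster
    return False
```

Because `reflect` prepends, the first element of `cluster_list` is always the cluster that reflected the route most recently.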
11. Impact on Path Selection 9. Impact on Route Selection
The ORIGINATOR_ID (when present) of a path SHOULD be treated as the The BGP Decision Process Tie Breaking rules (Sect. 9.1.2.2, [1]) are
BGP Identifier of the path in the route selection as described in modified as follows:
[1].
If the BGP Identifiers of two paths are equal when compared in the If a route carries the ORIGINATOR_ID attribute, then in Step f)
route selection, then the path with the shorter CLUSTER_LIST length the ORIGINATOR_ID SHOULD be treated as the BGP Identifier of
SHOULD be preferred. The CLUSTER_LIST length SHOULD be considered as the BGP speaker that has advertised the route.
zero for a path that has no CLUSTER_LIST attribute.
12. Implementation Considerations In addition, the following rule SHOULD be inserted between Steps
f) and g): a BGP Speaker SHOULD prefer a route with the shorter
CLUSTER_LIST length. The CLUSTER_LIST length is zero if a route
does not carry the CLUSTER_LIST attribute.
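The modified tie-breaking described in the newer text can be sketched as a sort key. This sketch makes several assumptions: all earlier decision steps are taken to be tied, Step f) is taken to prefer the lowest BGP Identifier, and Step g) is taken to prefer the lowest peer address, as in the BGP-4 specification. The `Candidate` type and function names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    peer_bgp_id: str                      # BGP Identifier of the advertising peer
    peer_address: str                     # address of the advertising peer
    originator_id: Optional[str] = None   # ORIGINATOR_ID, if carried
    cluster_list: Tuple[str, ...] = ()    # CLUSTER_LIST, empty if not carried


def tie_break_key(c: Candidate) -> Tuple[int, int, int]:
    # Step f): use ORIGINATOR_ID in place of the peer's BGP Identifier
    # when the route carries one.
    effective_id = c.originator_id if c.originator_id is not None else c.peer_bgp_id
    return (
        int(ipaddress.IPv4Address(effective_id)),    # Step f), lowest wins
        len(c.cluster_list),                         # inserted rule, shorter wins
        int(ipaddress.IPv4Address(c.peer_address)),  # Step g), lowest wins
    )


def pick_best(candidates: Iterable[Candidate]) -> Candidate:
    """Pick the preferred route among candidates tied on all earlier steps."""
    return min(candidates, key=tie_break_key)
```

Taking the minimum of the tuple applies the three comparisons in order, so the CLUSTER_LIST length only matters when the effective BGP Identifiers are equal.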
10. Implementation Considerations
Care should be taken to make sure that none of the BGP path Care should be taken to make sure that none of the BGP path
attributes defined above can be modified through configuration when attributes defined above can be modified through configuration when
exchanging internal routing information between RRs and Clients and exchanging internal routing information between RRs and Clients and
Non-Clients. Their modification could potential result in routing Non-Clients. Their modification could potentially result in routing
loops. loops.
In addition, when a RR reflects a route, it SHOULD NOT modify the In addition, when a RR reflects a route, it SHOULD NOT modify the
following path attributes: NEXT_HOP, AS_PATH, LOCAL_PREF, and MED. following path attributes: NEXT_HOP, AS_PATH, LOCAL_PREF, and MED.
Their modification could potentially result in routing loops. Their modification could potentially result in routing loops.
13. Configuration and Deployment Considerations 11. Configuration and Deployment Considerations
The BGP protocol provides no way for a Client to identify itself The BGP protocol provides no way for a Client to identify itself
dynamically as a Client of an RR. The simplest way to achieve this dynamically as a Client of an RR. The simplest way to achieve this
is by manual configuration. is by manual configuration.
One of the key components of the route reflection approach in One of the key components of the route reflection approach in
addressing the scaling issue is that the RR summarizes routing addressing the scaling issue is that the RR summarizes routing
information and only reflects its best path. information and only reflects its best path.
Both MEDs and IGP metrics may impact the BGP route selection. Both MEDs and IGP metrics may impact the BGP route selection.
skipping to change at page 9, line 7 skipping to change at page 9, line 7
designing a route reflection topology. In general, the route designing a route reflection topology. In general, the route
reflection topology should be congruent with the network topology when reflection topology should be congruent with the network topology when
there exist multiple paths for a prefix. One commonly used approach there exist multiple paths for a prefix. One commonly used approach
is the POP-based reflection, in which each POP maintains its own is the POP-based reflection, in which each POP maintains its own
route reflectors serving clients in the POP, and all route reflectors route reflectors serving clients in the POP, and all route reflectors
are fully meshed. In addition, clients of the reflectors in each POP are fully meshed. In addition, clients of the reflectors in each POP
are often fully meshed for the purpose of optimal intra-POP routing, are often fully meshed for the purpose of optimal intra-POP routing,
and the intra-POP IGP metrics are configured to be better than the and the intra-POP IGP metrics are configured to be better than the
inter-POP IGP metrics. inter-POP IGP metrics.
14. Security Considerations 12. Security Considerations
This extension to BGP does not change the underlying security issues This extension to BGP does not change the underlying security issues
inherent in the existing IBGP [5]. inherent in the existing IBGP [5].
15. Acknowledgments 13. Acknowledgments
The authors would like to thank Dennis Ferguson, John Scudder, Paul The authors would like to thank Dennis Ferguson, John Scudder, Paul
Traina and Tony Li for the many discussions resulting in this work. Traina and Tony Li for the many discussions resulting in this work.
This idea was developed from an earlier discussion between Tony Li This idea was developed from an earlier discussion between Tony Li
and Dimitri Haskin. and Dimitri Haskin.
In addition, the authors would like to acknowledge valuable review In addition, the authors would like to acknowledge valuable review
and suggestions from Yakov Rekhter on this document, and helpful and suggestions from Yakov Rekhter on this document, and helpful
comments from Tony Li, Rohit Dube, John Scudder and Bruce Cole. comments from Tony Li, Rohit Dube, John Scudder and Bruce Cole.
16. Normative References 14. References
[1] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP-4)", 14.1. Normative References
RFC 1771, March 1995.
[1] Rekhter, Y., T. Li and S. Hares, "A Border Gateway Protocol 4
(BGP-4)", draft-ietf-idr-bgp4-23.txt, November 2003.
14.2. Informative References
[2] Haskin, D., "A BGP/IDRP Route Server alternative to a full mesh [2] Haskin, D., "A BGP/IDRP Route Server alternative to a full mesh
routing", RFC 1863, October 1995. routing", RFC 1863, October 1995.
[3] Traina, P., "Limited Autonomous System Confederations for BGP", [3] Traina, P., "Limited Autonomous System Confederations for BGP",
RFC 1965, June 1996. RFC 1965, June 1996.
[4] Bates, T. and R. Chandra, "BGP Route Reflection An alternative [4] Bates, T. and R. Chandra, "BGP Route Reflection An alternative
to full mesh IBGP", RFC 1966, June 1996. to full mesh IBGP", RFC 1966, June 1996.
[5] Heffernan, A., "Protection of BGP Sessions via the TCP MD5 [5] Heffernan, A., "Protection of BGP Sessions via the TCP MD5
Signature Option", RFC 2385, August 1998. Signature Option", RFC 2385, August 1998.
[6] Bates, T., R. Chandra and E. Chen "BGP Route Reflection - An [6] Bates, T., R. Chandra and E. Chen "BGP Route Reflection - An
Alternative to Full Mesh IBGP", RFC 2796, April 2000. Alternative to Full Mesh IBGP", RFC 2796, April 2000.
[7] Bradner, S., "Key words for use in RFCs to Indicate Requirement [7] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997. Levels", BCP 14, RFC 2119, March 1997.
17. Authors' Addresses 15. Authors' Addresses
Tony Bates Tony Bates
Cisco Systems, Inc. Cisco Systems, Inc.
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
EMail: tbates@cisco.com EMail: tbates@cisco.com
Ravi Chandra Ravi Chandra
Redback Networks Inc. Redback Networks Inc.
skipping to change at page 10, line 28 skipping to change at page 10, line 31
EMail: rchandra@redback.com EMail: rchandra@redback.com
Enke Chen Enke Chen
Redback Networks Inc. Redback Networks Inc.
300 Holger Way. 300 Holger Way.
San Jose, CA 95134 San Jose, CA 95134
EMail: enke@redback.com EMail: enke@redback.com
18. Appendix A Comparison with RFC 2796 16. Appendix A Comparison with RFC 2796
The impact on route selection is added. The impact on route selection is added.
19. Appendix B Comparison with RFC 1966 17. Appendix B Comparison with RFC 1966
Several terminologies related to route reflection are clarified, and Several terminologies related to route reflection are clarified, and
the references to EBGP routes/peers are removed. the references to EBGP routes/peers are removed.
The handling of a routing information loop (due to route reflection) The handling of a routing information loop (due to route reflection)
by a receiver is clarified and made more consistent. by a receiver is clarified and made more consistent.
The addition of a CLUSTER_ID to the CLUSTER_LIST has been changed The addition of a CLUSTER_ID to the CLUSTER_LIST has been changed
from "append" to "prepend" to reflect the deployed code. from "append" to "prepend" to reflect the deployed code.
The section on "Configuration and Deployment Considerations" has been The section on "Configuration and Deployment Considerations" has been
expanded to address several operational issues. expanded to address several operational issues.
20. Full Copyright Statement 18. Intellectual Property Notice
Copyright (C) The Internet Society (2000). All Rights Reserved. The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; neither does it represent that it
has made any effort to identify any such rights. Information on the
IETF's procedures with respect to rights in standards-track and
standards-related documentation can be found in BCP-11. Copies of
claims of rights made available for publication and any assurances of
licenses to be made available, or the result of an attempt made to
obtain a general license or permission for the use of such
proprietary rights by implementors or users of this specification can
be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights which may cover technology that may be required to practice
this standard. Please address the information to the IETF Executive
Director.
19. Full Copyright Statement
Copyright (C) The Internet Society (2004). All Rights Reserved.
This document and translations of it may be copied and furnished to This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of Internet organizations, except as needed for the purpose of
skipping to change at page 11, line 32 skipping to change at line 484
The limited permissions granted above are perpetual and will not be The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns. revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
21. Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.
