--- 1/draft-ietf-idr-bgp4-15.txt 2006-02-04 23:30:12.000000000 +0100 +++ 2/draft-ietf-idr-bgp4-16.txt 2006-02-04 23:30:12.000000000 +0100 @@ -1,19 +1,19 @@ Network Working Group Y. Rekhter INTERNET DRAFT Juniper Networks T. Li Procket Networks, Inc. Editors A Border Gateway Protocol 4 (BGP-4) - + Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. @@ -52,21 +52,22 @@ This updated version of the document is the product of the IETF IDR Working Group with Yakov Rekhter and Tony Li as editors. Certain sections of the document borrowed heavily from IDRP [7], which is the OSI counterpart of BGP. For this credit should be given to the ANSI X3S3.3 group chaired by Lyman Chapin and to Charles Kunzinger who was the IDRP editor within that group. We would also like to thank Enke Chen, Edward Crabbe, Mike Craren, Vincent Gillet, Eric Gray, Jeffrey Haas, Dimitry Haskin, John Krawczyk, David LeRoy, Dan Massey, Dan Pei, Mathew Richardson, John Scudder, John Stewart III, Dave Thaler, - Paul Traina, Curtis Villamizar, and Alex Zinin for their comments. + Paul Traina, Russ White, Curtis Villamizar, and Alex Zinin for their + comments. We would like to specially acknowledge numerous contributions by Dennis Ferguson. 2. Introduction The Border Gateway Protocol (BGP) is an inter-Autonomous System routing protocol. It is built on experience gained with EGP as defined in RFC 904 [1] and EGP usage in the NSFNET Backbone as described in RFC 1092 [2] and RFC 1093 [3]. @@ -142,30 +143,30 @@ planned to explore various aspects of BGP application. 3. Summary of Operation Two systems form a transport protocol connection between one another. They exchange messages to open and confirm the connection parameters. 
The initial data flow is the portion of the BGP routing table that is allowed by the export policy, called the Adj-Ribs-Out (see 3.2). Incremental updates are sent as the routing tables change. BGP does - not require periodic refresh of the routing table. Therefore, A BGP + not require periodic refresh of the routing table. Therefore, a BGP speaker must retain the current version of the routes advertised by all of its peers for the duration of the connection. If the implementation decides to not store the routes that have been received from a peer, but have been filtered out according to configured local policy, the BGP Route Refresh extension [12] may be used to request the full set of routes from a peer without resetting the BGP session when the local policy configuration changes. - KEEPALIVE messages are sent periodically to ensure the liveness of + KEEPALIVE messages may be sent periodically to ensure the liveness of the connection. NOTIFICATION messages are sent in response to errors or special conditions. If a connection encounters an error condition, a NOTIFICATION message is sent and the connection is closed. The hosts executing the Border Gateway Protocol need not be routers. A non-routing host could exchange routing information with routers via EGP or even an interior routing protocol. That non-routing host could then use BGP to exchange routing information with a border router in another Autonomous System. The implications and applications of this architecture are for further study. @@ -188,35 +189,35 @@ care not to lose BGP attributes that will be needed by EBGP speakers if transit connectivity is being provided. For the purpose of discussion, it is assumed that BGP information is passed within an AS using IBGP. Care must be taken to ensure that the interior routers have all been updated with transit information before the EBGP speakers announce to other ASs that transit service is being provided. 
3.1 Routes: Advertisement and Storage - For purposes of this protocol a route is defined as a unit of - information that pairs a destination with the attributes of a path to - that destination, where the destination is the systems whose IP - addresses are reported in the Network Layer Reachability Information - (NLRI) field, and the path is the information reported in the path - attributes fields of the same UPDATE message. + For the purpose of this protocol, a route is defined as a unit of + information that pairs a set of destinations with the attributes of a + path to those destinations. The set of destinations are the systems + whose IP addresses are reported in the Network Layer Reachability + Information (NLRI) field and the path is the information reported in + the path attributes field of the same UPDATE message. Routes are advertised between BGP speakers in UPDATE messages. Routes are stored in the Routing Information Bases (RIBs): namely, the Adj-RIBs-In, the Loc-RIB, and the Adj-RIBs-Out. Routes that will be advertised to other BGP speakers must be present in the Adj-RIB- Out. Routes that will be used by the local BGP speaker must be present in the Loc-RIB, and the next hop for each of these routes - must be present in the local BGP speaker's Routing Table. Routes + must be resolvable via the local BGP speaker's Routing Table. Routes that are received from other BGP speakers are present in the Adj- RIBs-In. If a BGP speaker chooses to advertise the route, it may add to or modify the path attributes of the route before advertising it to a peer. BGP provides mechanisms by which a BGP speaker can inform its peer that a previously advertised route is no longer available for use. 
There are three methods by which a given BGP speaker can indicate
@@ -362,23 +363,24 @@
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     My Autonomous System      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |           Hold Time           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         BGP Identifier                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Opt Parm Len  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
-      |                       Optional Parameters                     |
+      |             Optional Parameters (variable)                    |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
 Version:
 This 1-octet unsigned integer indicates the protocol version
 number of the message. The current BGP version number is 4.
 My Autonomous System:
 This 2-octet unsigned integer indicates the Autonomous System
 number of the sender.
@@ -473,27 +475,27 @@
 4.3 UPDATE Message Format
 UPDATE messages are used to transfer routing information between BGP
 peers. The information in the UPDATE packet can be used to construct
 a graph describing the relationships of the various Autonomous
 Systems. By applying rules to be discussed, routing information loops
 and some other anomalies may be detected and removed from inter-AS
 routing.
- An UPDATE message is used to advertise a single feasible route to a
- peer, or to withdraw multiple unfeasible routes from service (see
- 3.1). An UPDATE message may simultaneously advertise a feasible route
- and withdraw multiple unfeasible routes from service. The UPDATE
- message always includes the fixed-size BGP header, and also includes
- the other fields as shown below (note, some of the shown fields may
- not be present in every UPDATE message):
+ An UPDATE message is used to advertise feasible routes sharing common
+ path attribute to a peer, or to withdraw multiple unfeasible routes
+ from service (see 3.1). An UPDATE message may simultaneously
+ advertise a feasible route and withdraw multiple unfeasible routes
+ from service.
The UPDATE message always includes the fixed-size BGP
+ header, and also includes the other fields as shown below (note, some
+ of the shown fields may not be present in every UPDATE message):
       +-----------------------------------------------------+
       |   Withdrawn Routes Length (2 octets)                |
       +-----------------------------------------------------+
       |   Withdrawn Routes (variable)                       |
       +-----------------------------------------------------+
       |   Total Path Attribute Length (2 octets)            |
       +-----------------------------------------------------+
       |   Path Attributes (variable)                        |
       +-----------------------------------------------------+
@@ -667,25 +669,25 @@
 This is an optional non-transitive attribute that is a four
 octet non-negative integer. The value of this attribute may
 be used by a BGP speaker's decision process to discriminate
 among multiple entry points to a neighboring autonomous
 system. Its usage is defined in 5.1.4.
 e) LOCAL_PREF (Type Code 5):
- LOCAL_PREF is a well-known mandatory attribute that is a
- four octet non-negative integer. A BGP speaker uses it to
- inform other internal peers of the advertising speaker's
- degree of preference for an advertised route. Usage of this
- attribute is described in 5.1.5.
+ LOCAL_PREF is a well-known attribute that is a four octet
+ non-negative integer. A BGP speaker uses it to inform other
+ internal peers of the advertising speaker's degree of
+ preference for an advertised route. Usage of this attribute
+ is described in 5.1.5.
 f) ATOMIC_AGGREGATE (Type Code 6)
 ATOMIC_AGGREGATE is a well-known discretionary attribute of
 length 0. Usage of this attribute is described in 5.1.6.
 g) AGGREGATOR (Type Code 7)
 AGGREGATOR is an optional transitive attribute of length 6.
 The attribute contains the last AS number that formed the
@@ -1105,28 +1105,28 @@
 1 and 2). An implementation MAY also (based on local configuration)
 alter the value of the MULTI_EXIT_DISC attribute received over an
 external link.
If it does so, it shall do so prior to determining the degree of preference of the route and performing route selection (decision process phases 1 and 2). 5.1.5 LOCAL_PREF - LOCAL_PREF is a well-known mandatory attribute that SHALL be included - in all UPDATE messages that a given BGP speaker sends to the other - internal peers. A BGP speaker SHALL calculate the degree of - preference for each external route based on the locally configured - policy, and include the degree of preference when advertising a route - to its internal peers. The higher degree of preference MUST be - preferred. A BGP speaker shall use the degree of preference learned - via LOCAL_PREF in its decision process (see section 9.1.1). + LOCAL_PREF is a well-known attribute that SHALL be included in all + UPDATE messages that a given BGP speaker sends to the other internal + peers. A BGP speaker SHALL calculate the degree of preference for + each external route based on the locally configured policy, and + include the degree of preference when advertising a route to its + internal peers. The higher degree of preference MUST be preferred. A + BGP speaker shall use the degree of preference learned via LOCAL_PREF + in its decision process (see section 9.1.1). A BGP speaker MUST NOT include this attribute in UPDATE messages that it sends to external peers, except for the case of BGP Confederations [13]. If it is contained in an UPDATE message that is received from an external peer, then this attribute MUST be ignored by the receiving speaker, except for the case of BGP Confederations [13]. 5.1.6 ATOMIC_AGGREGATE ATOMIC_AGGREGATE is a well-known discretionary attribute. @@ -1147,21 +1147,21 @@ A BGP speaker that receives a route with the ATOMIC_AGGREGATE attribute needs to be cognizant of the fact that the actual path to destinations, as specified in the NLRI of the route, while having the loop-free property, may not be the path specified in the AS_PATH attribute of the route. 
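Editorial illustration (not part of either draft revision): the LOCAL_PREF rules of 5.1.5 shown in this hunk can be sketched as two small filters, one applied when sending and one when receiving. The function names, the dictionary representation of path attributes, and the `confederation_peer` flag are assumptions made for this sketch; the confederation behavior itself is defined in [13] and is not modeled here beyond leaving the attribute untouched.

```python
from typing import Dict


def outbound_attributes(attrs: Dict[str, object],
                        peer_is_internal: bool,
                        degree_of_preference: int,
                        confederation_peer: bool = False) -> Dict[str, object]:
    """Return a copy of attrs adjusted for the peer the UPDATE is sent to."""
    out = dict(attrs)
    if peer_is_internal:
        # SHALL be included in UPDATE messages sent to internal peers.
        out["LOCAL_PREF"] = degree_of_preference
    elif confederation_peer:
        # Exception for BGP Confederations [13]: details not modeled,
        # the attribute is simply left as it is.
        pass
    else:
        # MUST NOT be included in UPDATE messages sent to external peers.
        out.pop("LOCAL_PREF", None)
    return out


def inbound_attributes(attrs: Dict[str, object],
                       peer_is_internal: bool,
                       confederation_peer: bool = False) -> Dict[str, object]:
    """Return a copy of attrs as the receiving speaker should treat them."""
    received = dict(attrs)
    if not peer_is_internal and not confederation_peer:
        # LOCAL_PREF received from an external peer MUST be ignored.
        received.pop("LOCAL_PREF", None)
    return received
```

Both helpers copy the attribute dictionary rather than mutating it, so the same stored route can be prepared for several peers.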
5.1.7 AGGREGATOR AGGREGATOR is an optional transitive attribute which may be included - in updates which are formed by aggregation (see Section 9.2.4.2). A + in updates which are formed by aggregation (see Section 9.2.2.2). A BGP speaker which performs route aggregation may add the AGGREGATOR attribute which shall contain its own AS number and IP address. The IP address should be the same as the BGP Identifier of the speaker. 6. BGP Error Handling. This section describes actions to be taken when errors are detected while processing BGP messages. When any of the conditions described here are detected, a @@ -1217,21 +1217,23 @@ All errors detected while processing the OPEN message are indicated by sending the NOTIFICATION message with Error Code OPEN Message Error. The Error Subcode elaborates on the specific nature of the error. If the version number contained in the Version field of the received OPEN message is not supported, then the Error Subcode is set to Unsupported Version Number. The Data field is a 2-octets unsigned integer, which indicates the largest locally supported version number less than the version the remote BGP peer bid (as indicated in the - received OPEN message). + received OPEN message), or if the smallest locally supported version + number is greater than the version the remote BGP peer bid, then the + smallest locally supported version number. If the Autonomous System field of the OPEN message is unacceptable, then the Error Subcode is set to Bad Peer AS. The determination of acceptable Autonomous System numbers is outside the scope of this protocol. If the Hold Time field of the OPEN message is unacceptable, then the Error Subcode MUST be set to Unacceptable Hold Time. An implementation MUST reject Hold Time values of one or two seconds. An implementation MAY reject any proposed Hold Time. 
An @@ -1380,40 +1382,46 @@ reached, the speaker (under control of local configuration) may either (a) discard new address prefixes from the neighbor, or (b) terminate the BGP peering with the neighbor. If the BGP speaker decides to terminate its peering with a neighbor because the number of address prefixes received from the neighbor exceeds the locally configured upper bound, then the speaker must send to the neighbor a NOTIFICATION message with the Error Code Cease. 6.8 Connection collision detection. - If a pair of BGP speakers try simultaneously to establish a TCP + If a pair of BGP speakers try simultaneously to establish a BGP connection to each other, then two parallel connections between this - pair of speakers might well be formed. We refer to this situation as - connection collision. Clearly, one of these connections must be + pair of speakers might well be formed. If the source IP address used + by one of these connections is the same as the destination IP address + used by the other, and the destination IP address used by the first + connection is the same as the source IP address used by the other, we + refer to this situation as connection collision. Clearly in the + presence of connection collision, one of these connections must be closed. Based on the value of the BGP Identifier a convention is established for detecting which BGP connection is to be preserved when a collision does occur. The convention is to compare the BGP Identifiers of the peers involved in the collision and to retain only the connection initiated by the BGP speaker with the higher-valued BGP Identifier. Upon receipt of an OPEN message, the local system must examine all of its connections that are in the OpenConfirm state. A BGP speaker may also examine connections in an OpenSent state if it knows the BGP Identifier of the peer by means outside of the protocol. 
If among these connections there is a connection to a remote BGP speaker whose - BGP Identifier equals the one in the OPEN message, then the local - system performs the following collision resolution procedure: + BGP Identifier equals the one in the OPEN message, and this + connection collides with the connection over which the OPEN message + is received then the local system performs the following collision + resolution procedure: 1. The BGP Identifier of the local system is compared to the BGP Identifier of the remote system (as specified in the OPEN message). 2. If the value of the local BGP Identifier is less than the remote one, the local system closes BGP connection that already exists (the one that is already in the OpenConfirm state), and accepts BGP connection initiated by the remote system. @@ -1693,137 +1701,121 @@ depending on the type of the optional attribute, it is processed locally, retained, and updated, if necessary, for possible propagation to other BGP speakers. If the UPDATE message contains a non-empty WITHDRAWN ROUTES field, the previously advertised routes whose destinations (expressed as IP prefixes) contained in this field shall be removed from the Adj-RIB- In. This BGP speaker shall run its Decision Process since the previously advertised route is no longer available for use. - If the UPDATE message contains a feasible route, it shall be placed - in the appropriate Adj-RIB-In, and the following additional actions - shall be taken: - - i) If its Network Layer Reachability Information (NLRI) is identical - to the one of a route currently stored in the Adj-RIB-In, then the - new route shall replace the older route in the Adj-RIB-In, thus - implicitly withdrawing the older route from service. The BGP speaker - shall run its Decision Process since the older route is no longer - available for use. 
- - ii) If the new route is an overlapping route that is included (see - 9.1.4) in an earlier route contained in the Adj-RIB-In, the BGP - speaker shall run its Decision Process since the more specific route - has implicitly made a portion of the less specific route unavailable - for use. - - iii) If the new route has identical path attributes to an earlier - route contained in the Adj-RIB-In, and is more specific (see 9.1.4) - than the earlier route, no further actions are necessary. - - iv) If the new route has NLRI that is not present in any of the - routes currently stored in the Adj-RIB-In, then the new route shall - be placed in the Adj-RIB-In. The BGP speaker shall run its Decision - Process. + If the UPDATE message contains a feasible route, the Adj-RIB-In will + be updated with this route as follows: if the NLRI of the new route + is identical to the one of the route currently stored in the Adj-RIB- + In, then the new route shall replace the older route in the Adj-RIB- + In, thus implicitly withdrawing the older route from service. + Otherwise, if the Adj-RIB-In has no route with NLRI identical to the + new route, the new route shall be placed in the Adj-RIB-In. - v) If the new route is an overlapping route that is less specific - (see 9.1.4) than an earlier route contained in the Adj-RIB-In, the - BGP speaker shall run its Decision Process on the set of destinations - described only by the less specific route. + Once the BGP speaker updates the Adj-RIB-In, the speaker shall run + its Decision Process. 9.1 Decision Process The Decision Process selects routes for subsequent advertisement by applying the policies in the local Policy Information Base (PIB) to the routes stored in its Adj-RIBs-In. The output of the Decision Process is the set of routes that will be advertised to all peers; the selected routes will be stored in the local speaker's Adj-RIB- Out. 
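Editorial illustration (not part of either draft revision): the simplified Adj-RIB-In handling that replaces the removed cases i) through v) in this hunk can be sketched as follows. The dictionary keyed by prefix, the function name `apply_update`, and the callback are assumptions made for this sketch. Skipping the Decision Process when a withdrawal matches no stored route is also an assumption, not something the draft states.

```python
from typing import Callable, Dict, Iterable

# Hypothetical representation: one Adj-RIB-In per peer, mapping an NLRI
# prefix (as text) to the path attributes of the stored route.
AdjRibIn = Dict[str, Dict[str, object]]


def apply_update(adj_rib_in: AdjRibIn,
                 withdrawn: Iterable[str],
                 nlri: Iterable[str],
                 path_attrs: Dict[str, object],
                 run_decision_process: Callable[[], None]) -> bool:
    """Apply one UPDATE to an Adj-RIB-In; return True if it was modified."""
    changed = False
    for prefix in withdrawn:
        # Previously advertised routes named in WITHDRAWN ROUTES are removed.
        if adj_rib_in.pop(prefix, None) is not None:
            changed = True
    for prefix in nlri:
        # Identical NLRI: the new route replaces the older one (implicit
        # withdrawal). No route with this NLRI: the new route is added.
        # A dictionary assignment covers both cases.
        adj_rib_in[prefix] = dict(path_attrs)
        changed = True
    if changed:
        # Once the Adj-RIB-In has been updated, run the Decision Process.
        run_decision_process()
    return changed
```

Because replacement and insertion collapse into the same assignment, the overlap cases that the older revision listed separately need no special handling at this stage.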
The selection process is formalized by defining a function that takes - the attribute of a given route as an argument and returns a non- - negative integer denoting the degree of preference for the route. + the attribute of a given route as an argument and returns either (a) + a non-negative integer denoting the degree of preference for the + route, or (b) a value denoting that this route is ineligible to be + installed in LocRib and will be excluded from the next phase of route + selection. + The function that calculates the degree of preference for a given route shall not use as its inputs any of the following: the existence of other routes, the non-existence of other routes, or the path attributes of other routes. Route selection then consists of individual application of the degree of preference function to each feasible route, followed by the choice of the one with the highest degree of preference. The Decision Process operates on routes contained in the Adj-RIB-In, and is responsible for: - selection of routes to be used locally by the speaker - - selection of routes to be advertised to internal peers - - - selection of routes to be advertised to external peers + - selection of routes to be advertised to other BGP peers - route aggregation and route information reduction The Decision Process takes place in three distinct phases, each triggered by a different event: a) Phase 1 is responsible for calculating the degree of preference - for each route received from a peer, and MAY also advertise to all - the internal peers the routes from external peers that have the - highest degree of preference for each distinct destination. + for each route received from a peer. b) Phase 2 is invoked on completion of phase 1. It is responsible for choosing the best route out of all those available for each distinct destination, and for installing each chosen route into the Loc-RIB. c) Phase 3 is invoked after the Loc-RIB has been modified. 
It is - responsible for disseminating routes in the Loc-RIB to each - external peer, according to the policies contained in the PIB. - Route aggregation and information reduction can optionally be - performed within this phase. + responsible for disseminating routes in the Loc-RIB to each peer, + according to the policies contained in the PIB. Route aggregation + and information reduction can optionally be performed within this + phase. 9.1.1 Phase 1: Calculation of Degree of Preference The Phase 1 decision function shall be invoked whenever the local BGP speaker receives from a peer an UPDATE message that advertises a new route, a replacement route, or withdrawn routes. The Phase 1 decision function is a separate process which completes when it has no further work to do. The Phase 1 decision function shall lock an Adj-RIB-In prior to operating on any route contained within it, and shall unlock it after operating on all new or unfeasible routes contained within it. For each newly received or replacement feasible route, the local BGP - speaker shall determine a degree of preference. If the route is - learned from an internal peer, either the value of the LOCAL_PREF - attribute shall be taken as the degree of preference, or the local - system may compute the degree of preference of the route based on - preconfigured policy information. Note that the latter (computing the - degree of preference based on preconfigured policy information) may - result in formation of persistent routing loops. If the route is - learned from an external peer, then the local BGP speaker computes - the degree of preference based on preconfigured policy information - and uses it as the LOCAL_PREF value in any IBGP readvertisement. The - exact nature of this policy information and the computation involved - is a local matter. For a route learned from an external peer, the - local speaker shall then run the internal update process of 9.2.1 to - select and advertise the most preferable route. 
+ speaker shall determine a degree of preference as follows: + + If the route is learned from an internal peer, either the value of + the LOCAL_PREF attribute shall be taken as the degree of + preference, or the local system may compute the degree of + preference of the route based on preconfigured policy information. + Note that the latter (computing the degree of preference based on + preconfigured policy information) may result in formation of + persistent routing loops. + + If the route is learned from an external peer, then the local BGP + speaker computes the degree of preference based on preconfigured + policy information. If the return value indicates that the route + is ineligible, the route may not serve as an input to the next + phase of route selection; otherwise the return value is used as + the LOCAL_PREF value in any IBGP readvertisement. + + The exact nature of this policy information and the computation + involved is a local matter. 9.1.2 Phase 2: Route Selection The Phase 2 decision function shall be invoked on completion of Phase - 1. The Phase 2 function is a separate process which completes when - it has no further work to do. The Phase 2 process shall consider all - routes that are present in the Adj-RIBs-In, including those received - from both internal and external peers. + 1. The Phase 2 function is a separate process which completes when it + has no further work to do. The Phase 2 process shall consider all + routes that are eligible in the Adj-RIBs-In. The Phase 2 decision function shall be blocked from running while the Phase 3 decision function is in process. The Phase 2 function shall lock all Adj-RIBs-In prior to commencing its function, and shall unlock them on completion. If the NEXT_HOP attribute of a BGP route depicts an address that is not resolvable, or it would become unresolvable if the route was installed in the routing table the BGP route should be excluded from the Phase 2 decision function. 
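Editorial illustration (not part of either draft revision): the hunk above changes Phase 1 so that the preference function may return either a degree of preference or an "ineligible" result, and changes Phase 2 to consider only eligible routes. A minimal sketch of that interaction follows. The `Route` class, the use of `None` as the ineligible marker, and the `policy` callback are assumptions made for this sketch. The tie-breaking rules of Phase 2 are not shown in this diff and are omitted here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple

INELIGIBLE = None  # hypothetical marker for "excluded from route selection"


@dataclass(frozen=True)
class Route:
    prefix: str
    peer: str
    internal: bool                     # learned from an internal peer?
    local_pref: Optional[int] = None   # LOCAL_PREF attribute, if present
    next_hop_resolvable: bool = True


def degree_of_preference(route: Route,
                         policy: Callable[[Route], Optional[int]]) -> Optional[int]:
    """Phase 1: depends only on the route itself, never on other routes."""
    if route.internal and route.local_pref is not None:
        # For internal routes the LOCAL_PREF value may be taken directly.
        return route.local_pref
    # Otherwise compute from local policy; the result may be INELIGIBLE.
    return policy(route)


def select_best(routes: Iterable[Route],
                policy: Callable[[Route], Optional[int]]) -> Dict[str, Route]:
    """Phase 2: keep the eligible route with the highest degree per prefix."""
    best: Dict[str, Tuple[int, Route]] = {}
    for route in routes:
        if not route.next_hop_resolvable:
            continue  # unresolvable NEXT_HOP: excluded from Phase 2
        pref = degree_of_preference(route, policy)
        if pref is INELIGIBLE:
            continue  # may not serve as input to the next phase
        current = best.get(route.prefix)
        if current is None or pref > current[0]:
            best[route.prefix] = (pref, route)
    return {prefix: chosen for prefix, (_, chosen) in best.items()}
```

Keeping the preference function free of any reference to other routes mirrors the constraint stated in 9.1, which is why `degree_of_preference` takes a single route and a policy and nothing else.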
@@ -2008,34 +2000,41 @@ The Phase 3 decision function shall be invoked on completion of Phase 2, or when any of the following events occur: a) when routes in the Loc-RIB to local destinations have changed b) when locally generated routes learned by means outside of BGP have changed c) when a new BGP speaker - BGP speaker connection has been established - The Phase 3 function is a separate process which completes when it has no further work to do. The Phase 3 Routing Decision function shall be blocked from running while the Phase 2 decision function is in process. - All routes in the Loc-RIB shall be processed into a corresponding - entry in the associated Adj-RIBs-Out. Route aggregation and - information reduction techniques (see 9.2.4.1) may optionally be - applied. + All routes in the Loc-RIB shall be processed into Adj-RIBs-Out + according to configured policy. This policy may exclude a route in + the Loc-RIB from being installed in a particular Adj-RIB-Out. A + route shall not be installed in the Adj-Rib-Out unless the + destination and NEXT_HOP described by this route may be forwarded + appropriately by the Routing Table. If a route in Loc-RIB is excluded + from a particular Adj-RIB-Out the previously advertised route in that + Adj-RIB-Out must be withdrawn from service by means of an UPDATE + message (see 9.2). + + Route aggregation and information reduction techniques (see 9.2.2.1) + may optionally be applied. When the updating of the Adj-RIBs-Out and the Routing Table is - complete, the local BGP speaker shall run the external update process - of 9.2.2. + complete, the local BGP speaker shall run the Update-Send process of + 9.2. 9.1.4 Overlapping Routes A BGP speaker may transmit routes with overlapping Network Layer Reachability Information (NLRI) to another BGP speaker. NLRI overlap occurs when a set of destinations are identified in non-matching multiple routes. Since BGP encodes NLRI using IP prefixes, overlap will always exhibit subset relationships. 
A route describing a smaller set of destinations (a longer prefix) is said to be more specific than a route describing a larger set of destinations (a @@ -2076,93 +2075,49 @@ NLRI of this route can not be made more specific. Forwarding along such a route does not guarantee that IP packets will actually traverse only ASs listed in the AS_PATH attribute of the route. 9.2 Update-Send Process The Update-Send process is responsible for advertising UPDATE messages to all peers. For example, it distributes the routes chosen by the Decision Process to other BGP speakers which may be located in either the same autonomous system or a neighboring autonomous system. - Rules for information exchange between BGP speakers located in - different autonomous systems are given in 9.2.2; rules for - information exchange between BGP speakers located in the same - autonomous system are given in 9.2.1. - - Distribution of routing information between a set of BGP speakers, - all of which are located in the same autonomous system, is referred - to as internal distribution. - -9.2.1 Internal Updates - - The Internal update process is concerned with the distribution of - routing information to internal peers. When a BGP speaker receives an UPDATE message from an internal peer, the receiving BGP speaker shall not re-distribute the routing information contained in that UPDATE message to other internal peers, unless the speaker acts as a BGP Route Reflector [11]. - When a BGP speaker receives a new route from an external peer, it - MUST advertise that route to all other internal peers by means of an - UPDATE message if this route will be installed in its Loc-RIB - according to the route selection rules in 9.1.2. - - When a BGP speaker receives an UPDATE message with a non-empty - WITHDRAWN ROUTES field, it shall remove from its Adj-RIB-In all - routes whose destinations were carried in this field (as IP - prefixes). 
The speaker shall take the following additional steps: - - 1) if the corresponding feasible route had not been previously - advertised, then no further action is necessary - - 2) if the corresponding feasible route had been previously - advertised, then: - - i) If a new route for the same NLRI is selected for - advertisement, then the BGP speaker shall advertise the - replacement route - - ii) if a replacement route is not available for advertisement, - then the BGP speaker shall include the destinations of the - unfeasible route (in form of IP prefixes) in the WITHDRAWN - ROUTES field of an UPDATE message, and shall send this message - to each peer to whom it had previously advertised the - corresponding feasible route. - - All feasible routes which are advertised shall be placed in the - appropriate Adj-RIBs-Out, and all unfeasible routes which are - advertised shall be removed from the Adj-RIBs-Out after the - corresponding update messages have been sent. - -9.2.2 External Updates + As part of Phase 3 of the route selection process, the BGP speaker + has updated its Adj-RIBs-Out. All newly installed routes and all + newly unfeasible routes for which there is no replacement route shall + be advertised to its peers by means of an UPDATE message. - The external update process is concerned with the distribution of - routing information to external peers. As part of Phase 3 route - selection process, the BGP speaker has updated its Adj-RIBs-Out and - its Routing Table. All newly installed routes and all newly - unfeasible routes for which there is no replacement route shall be - advertised to external peers by means of UPDATE message. + A BGP speaker should not advertise a given feasible BGP route from + its Adj-RIB-Out if it would produce an UPDATE message containing the + same BGP route as was previously advertised. Any routes in the Loc-RIB marked as unfeasible shall be removed. 
+ Changes to the reachable destinations within its own autonomous system shall also be advertised in an UPDATE message. -9.2.3 Controlling Routing Traffic Overhead +9.2.1 Controlling Routing Traffic Overhead The BGP protocol constrains the amount of routing traffic (that is, UPDATE messages) in order to limit both the link bandwidth needed to advertise UPDATE messages and the processing power needed by the Decision Process to digest the information contained in the UPDATE messages. -9.2.3.1 Frequency of Route Advertisement +9.2.1.1 Frequency of Route Advertisement The parameter MinRouteAdvertisementInterval determines the minimum amount of time that must elapse between advertisement of routes to a particular destination from a single BGP speaker. This rate limiting procedure applies on a per-destination basis, although the value of MinRouteAdvertisementInterval is set on a per BGP peer basis. Two UPDATE messages sent from a single BGP speaker that advertise feasible routes to some common set of destinations received from external peers must be separated by at least @@ -2181,48 +2136,48 @@ to the explicit withdrawal of unfeasible routes (that is, routes whose destinations (expressed as IP prefixes) are listed in the WITHDRAWN ROUTES field of an UPDATE message). This procedure does not limit the rate of route selection, but only the rate of route advertisement. If new routes are selected multiple times while awaiting the expiration of MinRouteAdvertisementInterval, the last route selected shall be advertised at the end of MinRouteAdvertisementInterval. -9.2.3.2 Frequency of Route Origination +9.2.1.2 Frequency of Route Origination The parameter MinASOriginationInterval determines the minimum amount of time that must elapse between successive advertisements of UPDATE messages that report changes within the advertising BGP speaker's own autonomous systems. 
-9.2.3.3 Jitter +9.2.1.3 Jitter To minimize the likelihood that the distribution of BGP messages by a given BGP speaker will contain peaks, jitter should be applied to the timers associated with MinASOriginationInterval, Keepalive, and MinRouteAdvertisementInterval. A given BGP speaker shall apply the same jitter to each of these quantities regardless of the destinations to which the updates are being sent; that is, jitter will not be applied on a "per peer" basis. The amount of jitter to be introduced shall be determined by multiplying the base value of the appropriate timer by a random factor which is uniformly distributed in the range from 0.75 to 1.0. -9.2.4 Efficient Organization of Routing Information +9.2.2 Efficient Organization of Routing Information Having selected the routing information which it will advertise, a BGP speaker may avail itself of several methods to organize this information in an efficient manner. -9.2.4.1 Information Reduction +9.2.2.1 Information Reduction Information reduction may imply a reduction in granularity of policy control - after information is collapsed, the same policies will apply to all destinations and paths in the equivalence class. The Decision Process may optionally reduce the amount of information that it will place in the Adj-RIBs-Out by any of the following methods: a) Network Layer Reachability Information (NLRI): @@ -2230,37 +2185,37 @@ Destination IP addresses can be represented as IP address prefixes. In cases where there is a correspondence between the address structure and the systems under control of an autonomous system administrator, it will be possible to reduce the size of the NLRI carried in the UPDATE messages. b) AS_PATHs: AS path information can be represented as ordered AS_SEQUENCEs or unordered AS_SETs. AS_SETs are used in the route aggregation - algorithm described in 9.2.4.2. They reduce the size of the + algorithm described in 9.2.2.2. 
They reduce the size of the AS_PATH information by listing each AS number only once, regardless of how many times it may have appeared in multiple AS_PATHs that were aggregated. An AS_SET implies that the destinations listed in the NLRI can be reached through paths that traverse at least some of the constituent autonomous systems. AS_SETs provide sufficient information to avoid routing information looping; however their use may prune potentially feasible paths, since such paths are no longer listed individually as in the form of AS_SEQUENCEs. In practice this is not likely to be a problem, since once an IP packet arrives at the edge of a group of autonomous systems, the BGP speaker at that point is likely to have more detailed path information and can distinguish individual paths to destinations. -9.2.4.2 Aggregating Routing Information +9.2.2.2 Aggregating Routing Information Aggregation is the process of combining the characteristics of several different routes in such a way that a single route can be advertised. Aggregation can occur as part of the decision process to reduce the amount of routing information that will be placed in the Adj-RIBs-Out. Aggregation reduces the amount of information that a BGP speaker must store and exchange with other BGP speakers. Routes can be aggregated by applying the following procedure separately to path attributes of @@ -2274,22 +2229,22 @@ different NEXT_HOP values can be aggregated when announced through an external BGP session. Path attributes that have different type codes can not be aggregated together. Path attributes of the same type code may be aggregated, according to the following rules: ORIGIN attribute: If at least one route among routes that are aggregated has ORIGIN with the value INCOMPLETE, then the aggregated route must have the ORIGIN attribute with the value - INCOMPLETE. Otherwise, if at least one route among routes that are - aggregated has ORIGIN with the value EGP, then the aggregated + INCOMPLETE. 
Otherwise, if at least one route among routes that + are aggregated has ORIGIN with the value EGP, then the aggregated route must have the origin attribute with the value EGP. In all other case the value of the ORIGIN attribute of the aggregated route is IGP. AS_PATH attribute: If routes to be aggregated have identical AS_PATH attributes, then the aggregated route has the same AS_PATH attribute as each individual route. For the purpose of aggregating AS_PATH attributes we model each AS within the AS_PATH attribute as a tuple , where @@ -2372,23 +2327,23 @@ Care must be taken to ensure that BGP speakers in the same AS do not make inconsistent decisions. 9.4 Originating BGP routes A BGP speaker may originate BGP routes by injecting routing information acquired by some other means (e.g. via an IGP) into BGP. A BGP speaker that originates BGP routes shall assign the degree of preference to these routes by passing them through the Decision Process (see Section 9.1). These routes may also be distributed to - other BGP speakers within the local AS as part of the Internal update - process (see Section 9.2.1). The decision whether to distribute non- - BGP acquired routes within an AS via BGP or not depends on the + other BGP speakers within the local AS as part of the update process + (see Section 9.2). The decision whether to distribute non-BGP + acquired routes within an AS via BGP or not depends on the environment within the AS (e.g. type of IGP) and should be controlled via configuration. Appendix 1. Comparison with RFC1771 There are numerous editorial changes (too many to list here). The following list the technical changes: Changes to reflect the usages of such features as TCP MD5 [10],