draft-ietf-idr-bgp4-13.txt | draft-ietf-idr-bgp4-14.txt | |||
---|---|---|---|---|
Network Working Group Y. Rekhter | Network Working Group Y. Rekhter | |||
INTERNET DRAFT Juniper Networks | INTERNET DRAFT Juniper Networks | |||
T. Li | T. Li | |||
Procket Networks, Inc. | Procket Networks, Inc. | |||
Editors | Editors | |||
A Border Gateway Protocol 4 (BGP-4) | A Border Gateway Protocol 4 (BGP-4) | |||
<draft-ietf-idr-bgp4-13.txt> | <draft-ietf-idr-bgp4-14.txt> | |||
Status of this Memo | Status of this Memo | |||
This document is an Internet-Draft and is in full conformance with | This document is an Internet-Draft and is in full conformance with | |||
all provisions of Section 10 of RFC2026. | all provisions of Section 10 of RFC2026. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
Drafts. | Drafts. | |||
skipping to change at page 2, line 20 | skipping to change at page 2, line 20 | |||
with a strong combination of toughness, professionalism, and | with a strong combination of toughness, professionalism, and | |||
courtesy. | courtesy. | |||
This updated version of the document is the product of the IETF IDR | This updated version of the document is the product of the IETF IDR | |||
Working Group with Yakov Rekhter and Tony Li as editors. Certain | Working Group with Yakov Rekhter and Tony Li as editors. Certain | |||
sections of the document borrowed heavily from IDRP [7], which is the | sections of the document borrowed heavily from IDRP [7], which is the | |||
OSI counterpart of BGP. For this credit should be given to the ANSI | OSI counterpart of BGP. For this credit should be given to the ANSI | |||
X3S3.3 group chaired by Lyman Chapin and to Charles Kunzinger who was | X3S3.3 group chaired by Lyman Chapin and to Charles Kunzinger who was | |||
the IDRP editor within that group. We would also like to thank Enke | the IDRP editor within that group. We would also like to thank Enke | |||
Chen, Edward Crabbe, Mike Craren, Vincent Gillet, Eric Gray, Jeffrey | Chen, Edward Crabbe, Mike Craren, Vincent Gillet, Eric Gray, Jeffrey | |||
Haas, Dimitry Haskin, John Krawczyk, David LeRoy, John Scudder, John | Haas, Dimitry Haskin, John Krawczyk, David LeRoy, Dan Massey, Dan | |||
Stewart III, Dave Thaler, Paul Traina, Curtis Villamizar, and Alex | Pei, Mathew Richardson, John Scudder, John Stewart III, Dave Thaler, | |||
Zinin for their comments. | Paul Traina, Curtis Villamizar, and Alex Zinin for their comments. | |||
We would like to specially acknowledge numerous contributions by | We would like to specially acknowledge numerous contributions by | |||
Dennis Ferguson. | Dennis Ferguson. | |||
2. Introduction | 2. Introduction | |||
The Border Gateway Protocol (BGP) is an inter-Autonomous System | The Border Gateway Protocol (BGP) is an inter-Autonomous System | |||
routing protocol. It is built on experience gained with EGP as | routing protocol. It is built on experience gained with EGP as | |||
defined in RFC 904 [1] and EGP usage in the NSFNET Backbone as | defined in RFC 904 [1] and EGP usage in the NSFNET Backbone as | |||
described in RFC 1092 [2] and RFC 1093 [3]. | described in RFC 1092 [2] and RFC 1093 [3]. | |||
skipping to change at page 3, line 39 | skipping to change at page 3, line 39 | |||
BGP uses TCP [4] as its transport protocol. TCP meets BGP's transport | BGP uses TCP [4] as its transport protocol. TCP meets BGP's transport | |||
requirements and is present in virtually all commercial routers and | requirements and is present in virtually all commercial routers and | |||
hosts. In the following descriptions the phrase "transport protocol | hosts. In the following descriptions the phrase "transport protocol | |||
connection" can be understood to refer to a TCP connection. BGP uses | connection" can be understood to refer to a TCP connection. BGP uses | |||
TCP port 179 for establishing its connections. | TCP port 179 for establishing its connections. | |||
This document uses the term `Autonomous System' (AS) throughout. The | This document uses the term `Autonomous System' (AS) throughout. The | |||
classic definition of an Autonomous System is a set of routers under | classic definition of an Autonomous System is a set of routers under | |||
a single technical administration, using an interior gateway protocol | a single technical administration, using an interior gateway protocol | |||
and common metrics to route packets within the AS, and using an | and common metrics to determine how to route packets within the AS, | |||
exterior gateway protocol to route packets to other ASs. Since this | and using an exterior gateway protocol to determine how to route | |||
classic definition was developed, it has become common for a single | packets to other ASs. Since this classic definition was developed, it | |||
AS to use several interior gateway protocols and sometimes several | has become common for a single AS to use several interior gateway | |||
sets of metrics within an AS. The use of the term Autonomous System | protocols and sometimes several sets of metrics within an AS. The use | |||
here stresses the fact that, even when multiple IGPs and metrics are | of the term Autonomous System here stresses the fact that, even when | |||
used, the administration of an AS appears to other ASs to have a | multiple IGPs and metrics are used, the administration of an AS | |||
single coherent interior routing plan and presents a consistent | appears to other ASs to have a single coherent interior routing plan | |||
picture of what destinations are reachable through it. | and presents a consistent picture of what destinations are reachable | |||
through it. | ||||
The planned use of BGP in the Internet environment, including such | The planned use of BGP in the Internet environment, including such | |||
issues as topology, the interaction between BGP and IGPs, and the | issues as topology, the interaction between BGP and IGPs, and the | |||
enforcement of routing policy rules is presented in a companion | enforcement of routing policy rules is presented in a companion | |||
document [5]. This document is the first of a series of documents | document [5]. This document is the first of a series of documents | |||
planned to explore various aspects of BGP application. | planned to explore various aspects of BGP application. | |||
3. Summary of Operation | 3. Summary of Operation | |||
Two systems form a transport protocol connection between one another. | Two systems form a transport protocol connection between one another. | |||
They exchange messages to open and confirm the connection parameters. | They exchange messages to open and confirm the connection parameters. | |||
The initial data flow is the entire BGP routing table. Incremental | The initial data flow is the portion of the BGP routing table that is | |||
updates are sent as the routing tables change. BGP does not require | allowed by the export policy, called the Adj-Ribs-Out (see 3.2). | |||
periodic refresh of the entire BGP routing table. Therefore, a BGP | Incremental updates are sent as the routing tables change. BGP does | |||
speaker must retain the current version of the entire BGP routing | not require periodic refresh of the routing table. Therefore, a BGP | |||
tables of all of its peers for the duration of the connection. If | speaker must retain the current version of the routes advertised by | |||
the implementation decides to not store the routes that have been | all of its peers for the duration of the connection. If the | |||
implementation decides to not store the routes that have been | ||||
received from a peer, but have been filtered out according to | received from a peer, but have been filtered out according to | |||
configured local policy, the BGP Route Refresh option [12] may be | configured local policy, the BGP Route Refresh option [12] may be | |||
used to request the full set of routes from a peer without resetting | used to request the full set of routes from a peer without resetting | |||
the BGP session when the local policy configuration changes. | the BGP session when the local policy configuration changes. | |||
KEEPALIVE messages are sent periodically to ensure the liveness of | KEEPALIVE messages are sent periodically to ensure the liveness of | |||
the connection. NOTIFICATION messages are sent in response to errors | the connection. NOTIFICATION messages are sent in response to errors | |||
or special conditions. If a connection encounters an error condition, | or special conditions. If a connection encounters an error condition, | |||
a NOTIFICATION message is sent and the connection is closed. | a NOTIFICATION message is sent and the connection is closed. | |||
skipping to change at page 11, line 27 | skipping to change at page 11, line 30 | |||
peers. The information in the UPDATE packet can be used to construct | peers. The information in the UPDATE packet can be used to construct | |||
a graph describing the relationships of the various Autonomous | a graph describing the relationships of the various Autonomous | |||
Systems. By applying rules to be discussed, routing information loops | Systems. By applying rules to be discussed, routing information loops | |||
and some other anomalies may be detected and removed from inter-AS | and some other anomalies may be detected and removed from inter-AS | |||
routing. | routing. | |||
An UPDATE message is used to advertise a single feasible route to a | An UPDATE message is used to advertise a single feasible route to a | |||
peer, or to withdraw multiple unfeasible routes from service (see | peer, or to withdraw multiple unfeasible routes from service (see | |||
3.1). An UPDATE message may simultaneously advertise a feasible route | 3.1). An UPDATE message may simultaneously advertise a feasible route | |||
and withdraw multiple unfeasible routes from service. The UPDATE | and withdraw multiple unfeasible routes from service. The UPDATE | |||
message always includes the fixed-size BGP header, and can optionally | message always includes the fixed-size BGP header, and also includes | |||
include the other fields as shown below: | the other fields as shown below (note, some of the shown fields may | |||
not be present in every UPDATE message): | ||||
+-----------------------------------------------------+ | +-----------------------------------------------------+ | |||
| Withdrawn Routes Length (2 octets) | | | Withdrawn Routes Length (2 octets) | | |||
+-----------------------------------------------------+ | +-----------------------------------------------------+ | |||
| Withdrawn Routes (variable) | | | Withdrawn Routes (variable) | | |||
+-----------------------------------------------------+ | +-----------------------------------------------------+ | |||
| Total Path Attribute Length (2 octets) | | | Total Path Attribute Length (2 octets) | | |||
+-----------------------------------------------------+ | +-----------------------------------------------------+ | |||
| Path Attributes (variable) | | | Path Attributes (variable) | | |||
+-----------------------------------------------------+ | +-----------------------------------------------------+ | |||
skipping to change at page 15, line 42 | skipping to change at page 15, line 47 | |||
LOCAL_PREF is a well-known mandatory attribute that is a | LOCAL_PREF is a well-known mandatory attribute that is a | |||
four octet non-negative integer. A BGP speaker uses it to | four octet non-negative integer. A BGP speaker uses it to | |||
inform other internal peers of the advertising speaker's | inform other internal peers of the advertising speaker's | |||
degree of preference for an advertised route. Usage of this | degree of preference for an advertised route. Usage of this | |||
attribute is described in 5.1.5. | attribute is described in 5.1.5. | |||
f) ATOMIC_AGGREGATE (Type Code 6) | f) ATOMIC_AGGREGATE (Type Code 6) | |||
ATOMIC_AGGREGATE is a well-known discretionary attribute of | ATOMIC_AGGREGATE is a well-known discretionary attribute of | |||
length 0. A BGP speaker uses it to inform other BGP speakers | length 0. Usage of this attribute is described in 5.1.6. | |||
that the local system selected a less specific route without | ||||
selecting a more specific route which is included in it. | ||||
Usage of this attribute is described in 5.1.6. | ||||
g) AGGREGATOR (Type Code 7) | g) AGGREGATOR (Type Code 7) | |||
AGGREGATOR is an optional transitive attribute of length 6. | AGGREGATOR is an optional transitive attribute of length 6. | |||
The attribute contains the last AS number that formed the | The attribute contains the last AS number that formed the | |||
aggregate route (encoded as 2 octets), followed by the IP | aggregate route (encoded as 2 octets), followed by the IP | |||
address of the BGP speaker that formed the aggregate route | address of the BGP speaker that formed the aggregate route | |||
(encoded as 4 octets). This should be the same address as | (encoded as 4 octets). This should be the same address as | |||
the one used for the BGP Identifier of the speaker. Usage | the one used for the BGP Identifier of the speaker. Usage | |||
of this attribute is described in 5.1.7. | of this attribute is described in 5.1.7. | |||
Network Layer Reachability Information: | Network Layer Reachability Information: | |||
skipping to change at page 17, line 24 | skipping to change at page 17, line 26 | |||
attributes. All path attributes contained in a given UPDATE message | attributes. All path attributes contained in a given UPDATE message | |||
apply to all destinations carried in the NLRI field of the UPDATE | apply to all destinations carried in the NLRI field of the UPDATE | |||
message. | message. | |||
An UPDATE message can list multiple routes to be withdrawn from | An UPDATE message can list multiple routes to be withdrawn from | |||
service. Each such route is identified by its destination (expressed | service. Each such route is identified by its destination (expressed | |||
as an IP prefix), which unambiguously identifies the route in the | as an IP prefix), which unambiguously identifies the route in the | |||
context of the BGP speaker - BGP speaker connection to which it has | context of the BGP speaker - BGP speaker connection to which it has | |||
been previously advertised. | been previously advertised. | |||
An UPDATE message may advertise only routes to be withdrawn from | An UPDATE message might advertise only routes to be withdrawn from | |||
service, in which case it will not include path attributes or Network | service, in which case it will not include path attributes or Network | |||
Layer Reachability Information. Conversely, it may advertise only a | Layer Reachability Information. Conversely, it may advertise only a | |||
feasible route, in which case the WITHDRAWN ROUTES field need not be | feasible route, in which case the WITHDRAWN ROUTES field need not be | |||
present. | present. | |||
An UPDATE message should not include the same address prefix in the | ||||
WITHDRAWN ROUTES and Network Layer Reachability Information fields; | ||||
however, a BGP speaker MUST be able to process UPDATE messages in this | ||||
form. A BGP speaker should treat an UPDATE message of this form as if | ||||
the WITHDRAWN ROUTES doesn't contain the address prefix. | ||||
4.4 KEEPALIVE Message Format | 4.4 KEEPALIVE Message Format | |||
BGP does not use any transport protocol-based keep-alive mechanism to | BGP does not use any transport protocol-based keep-alive mechanism to | |||
determine if peers are reachable. Instead, KEEPALIVE messages are | determine if peers are reachable. Instead, KEEPALIVE messages are | |||
exchanged between peers often enough as not to cause the Hold Timer | exchanged between peers often enough as not to cause the Hold Timer | |||
to expire. A reasonable maximum time between KEEPALIVE messages would | to expire. A reasonable maximum time between KEEPALIVE messages would | |||
be one third of the Hold Time interval. KEEPALIVE messages MUST NOT | be one third of the Hold Time interval. KEEPALIVE messages MUST NOT | |||
be sent more frequently than one per second. An implementation MAY | be sent more frequently than one per second. An implementation MAY | |||
adjust the rate at which it sends KEEPALIVE messages as a function of | adjust the rate at which it sends KEEPALIVE messages as a function of | |||
the Hold Time interval. | the Hold Time interval. | |||
skipping to change at page 18, line 18 | skipping to change at page 18, line 22 | |||
The BGP connection is closed immediately after sending it. | The BGP connection is closed immediately after sending it. | |||
In addition to the fixed-size BGP header, the NOTIFICATION message | In addition to the fixed-size BGP header, the NOTIFICATION message | |||
contains the following fields: | contains the following fields: | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Error code | Error subcode | Data | | | Error code | Error subcode | Data | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | |||
| | | | (variable) | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
Error Code: | Error Code: | |||
This 1-octet unsigned integer indicates the type of | This 1-octet unsigned integer indicates the type of | |||
NOTIFICATION. The following Error Codes have been defined: | NOTIFICATION. The following Error Codes have been defined: | |||
Error Code Symbolic Name Reference | Error Code Symbolic Name Reference | |||
1 Message Header Error Section 6.1 | 1 Message Header Error Section 6.1 | |||
skipping to change at page 22, line 33 | skipping to change at page 22, line 37 | |||
as the last element of the sequence (put it in the leftmost | as the last element of the sequence (put it in the leftmost | |||
position) | position) | |||
2) if the first path segment of the AS_PATH is of type AS_SET, | 2) if the first path segment of the AS_PATH is of type AS_SET, | |||
the local system shall prepend a new path segment of type | the local system shall prepend a new path segment of type | |||
AS_SEQUENCE to the AS_PATH, including its own AS number in that | AS_SEQUENCE to the AS_PATH, including its own AS number in that | |||
segment. | segment. | |||
When a BGP speaker originates a route then: | When a BGP speaker originates a route then: | |||
a) the originating speaker shall include its own AS number in the | a) the originating speaker shall include its own AS number in a | |||
AS_PATH attribute of all UPDATE messages sent to an external peer. | path segment of type AS_SEQUENCE in the AS_PATH attribute of all | |||
(In this case, the AS number of the originating speaker's | UPDATE messages sent to an external peer. (In this case, the AS | |||
autonomous system will be the only entry in the AS_PATH | number of the originating speaker's autonomous system will be the | |||
attribute). | only entry in the path segment, and this path segment will be the | |||
only segment in the AS_PATH attribute). | ||||
b) the originating speaker shall include an empty AS_PATH | b) the originating speaker shall include an empty AS_PATH | |||
attribute in all UPDATE messages sent to internal peers. (An | attribute in all UPDATE messages sent to internal peers. (An | |||
empty AS_PATH attribute is one whose length field contains the | empty AS_PATH attribute is one whose length field contains the | |||
value zero). | value zero). | |||
For the purpose of inter-AS traffic engineering, a BGP speaker may | Whenever the modification of the AS_PATH attribute calls for | |||
include more than one instance of its own AS number in the AS_PATH | including or prepending the AS number of the local system, the local | |||
attribute. This is controlled via local configuration. | system may include/prepend more than one instance of its own AS | |||
number in the AS_PATH attribute. This is controlled via local | ||||
configuration. | ||||
5.1.3 NEXT_HOP | 5.1.3 NEXT_HOP | |||
The NEXT_HOP path attribute defines the IP address of the border | The NEXT_HOP path attribute defines the IP address of the border | |||
router that should be used as the next hop to the destinations listed | router that should be used as the next hop to the destinations listed | |||
in the UPDATE message. The NEXT_HOP attribute is calculated as | in the UPDATE message. The NEXT_HOP attribute is calculated as | |||
follows. | follows. | |||
1) When sending a message to an internal peer, the BGP speaker | 1) When sending a message to an internal peer, the BGP speaker | |||
should not modify the NEXT_HOP attribute, unless it has been | should not modify the NEXT_HOP attribute, unless it has been | |||
explicitly configured to announce its own IP address as the | explicitly configured to announce its own IP address as the | |||
NEXT_HOP. | NEXT_HOP. | |||
2) When sending a message to an external peer X: | 2) When sending a message to an external peer X, and the peer is | |||
one IP hop away from the speaker: | ||||
- If the route being announced was learned from an internal | - If the route being announced was learned from an internal | |||
peer or is locally originated, the BGP speaker can use for the | peer or is locally originated, the BGP speaker can use for the | |||
NEXT_HOP attribute an interface address of the internal peer | NEXT_HOP attribute an interface address of the internal peer | |||
router through which the announced network is reachable for the | router through which the announced network is reachable for the | |||
speaker, provided that peer X shares a common subnet with this | speaker, provided that peer X shares a common subnet with this | |||
address. This is a form of "third party" NEXT_HOP attribute. | address. This is a form of "third party" NEXT_HOP attribute. | |||
- If the route being announced was learned from an external | - If the route being announced was learned from an external | |||
peer, the speaker can use in the NEXT_HOP attribute an IP | peer, the speaker can use in the NEXT_HOP attribute an IP | |||
skipping to change at page 23, line 41 | skipping to change at page 23, line 44 | |||
with this address. This is a second form of "third party" | with this address. This is a second form of "third party" | |||
NEXT_HOP attribute. | NEXT_HOP attribute. | |||
- If the external peer to which the route is being advertised | - If the external peer to which the route is being advertised | |||
shares a common subnet with one of the announcing router's own | shares a common subnet with one of the announcing router's own | |||
interfaces, the router may use the IP address associated with | interfaces, the router may use the IP address associated with | |||
such an interface in the NEXT_HOP attribute. This is known as a | such an interface in the NEXT_HOP attribute. This is known as a | |||
"first party" NEXT_HOP attribute. | "first party" NEXT_HOP attribute. | |||
- By default (if none of the above conditions apply), the BGP | - By default (if none of the above conditions apply), the BGP | |||
speaker should use in the NEXT_HOP attribute the IP address | speaker should use in the NEXT_HOP attribute the IP address of | |||
that is used to establish the BGP session. | the interface that the speaker uses to establish the BGP | |||
session to peer X. | ||||
3) When sending a message to an external peer X, and the peer is | ||||
multiple IP hops away from the speaker (aka "multihop EBGP"): | ||||
- The speaker may be configured to propagate the NEXT_HOP | ||||
attribute. In this case when advertising a route that the | ||||
speaker learned from one of its peers, the NEXT_HOP attribute | ||||
of the advertised route is exactly the same as the NEXT_HOP | ||||
attribute of the learned route (the speaker just doesn't modify | ||||
the NEXT_HOP attribute). | ||||
- By default, the BGP speaker should use in the NEXT_HOP | ||||
attribute the IP address of the interface that the speaker uses | ||||
to establish the BGP session to peer X. | ||||
Normally the NEXT_HOP attribute is chosen such that the shortest | Normally the NEXT_HOP attribute is chosen such that the shortest | |||
available path will be taken. A BGP speaker must be able to support | available path will be taken. A BGP speaker must be able to support | |||
disabling advertisement of third party NEXT_HOP attributes to handle | disabling advertisement of third party NEXT_HOP attributes to handle | |||
imperfectly bridged media. | imperfectly bridged media. | |||
A BGP speaker must never advertise an address of a peer to that peer | A BGP speaker must never advertise an address of a peer to that peer | |||
as a NEXT_HOP, for a route that the speaker is originating. A BGP | as a NEXT_HOP, for a route that the speaker is originating. A BGP | |||
speaker must never install a route with itself as the next hop. | speaker must never install a route with itself as the next hop. | |||
skipping to change at page 25, line 20 | skipping to change at page 25, line 38 | |||
via LOCAL_PREF in its decision process (see section 9.1.1). | via LOCAL_PREF in its decision process (see section 9.1.1). | |||
A BGP speaker MUST NOT include this attribute in UPDATE messages that | A BGP speaker MUST NOT include this attribute in UPDATE messages that | |||
it sends to external peers, except for the case of BGP Confederations | it sends to external peers, except for the case of BGP Confederations | |||
[13]. If it is contained in an UPDATE message that is received from | [13]. If it is contained in an UPDATE message that is received from | |||
an external peer, then this attribute MUST be ignored by the | an external peer, then this attribute MUST be ignored by the | |||
receiving speaker, except for the case of BGP Confederations [13]. | receiving speaker, except for the case of BGP Confederations [13]. | |||
5.1.6 ATOMIC_AGGREGATE | 5.1.6 ATOMIC_AGGREGATE | |||
ATOMIC_AGGREGATE is a well-known discretionary attribute. If a BGP | ATOMIC_AGGREGATE is a well-known discretionary attribute. There are | |||
speaker, when presented with a set of overlapping routes from one of | two cases where the ATOMIC_AGGREGATE attribute is used: | |||
its peers (see 9.1.4), selects the less specific route without | ||||
selecting the more specific one, then the local system MUST attach | - a speaker receives both more and less specific routes, these | |||
the ATOMIC_AGGREGATE attribute to the route when propagating it to | routes have the same NEXT_HOP, the AS_PATH attribute of the more | |||
other BGP speakers (if that attribute is not already present in the | specific route is different from the AS_PATH attribute of the less | |||
received less specific route). A BGP speaker that receives a route | specific route, and the speaker installs in its Loc-RIB only the | |||
with the ATOMIC_AGGREGATE attribute MUST NOT remove the attribute | less specific route. In this case the speaker should advertise | |||
from the route when propagating it to other speakers. A BGP speaker | this route with the ATOMIC_AGGREGATE attribute to all neighbors | |||
that receives a route with the ATOMIC_AGGREGATE attribute MUST NOT | (subject to the outbound route filtering). | |||
make any NLRI of that route more specific (as defined in 9.1.4) when | ||||
advertising this route to other BGP speakers. A BGP speaker that | - a speaker receives both more and less specific routes, the | |||
receives a route with the ATOMIC_AGGREGATE attribute needs to be | AS_PATH attribute of the more specific route is different from the | |||
cognizant of the fact that the actual path to destinations, as | AS_PATH attribute of the less specific route, the speaker installs | |||
specified in the NLRI of the route, while having the loop-free | in its Loc-RIB both routes, but the speaker advertises to a | |||
property, may traverse ASs that are not listed in the AS_PATH | particular neighbor only the less specific route. In this case the | |||
attribute. | advertisement MUST carry the ATOMIC_AGGREGATE attribute. | |||
A BGP speaker that receives a route with the ATOMIC_AGGREGATE | ||||
attribute MUST NOT remove the attribute from the route when | ||||
propagating it to other speakers. | ||||
A BGP speaker that receives a route with the ATOMIC_AGGREGATE | ||||
attribute MUST NOT make any NLRI of that route more specific (as | ||||
defined in 9.1.4) when advertising this route to other BGP speakers. | ||||
A BGP speaker that receives a route with the ATOMIC_AGGREGATE | ||||
attribute needs to be cognizant of the fact that the actual path to | ||||
destinations, as specified in the NLRI of the route, while having the | ||||
loop-free property, may not be the path specified in the AS_PATH | ||||
attribute of the route. | ||||
5.1.7 AGGREGATOR | 5.1.7 AGGREGATOR | |||
AGGREGATOR is an optional transitive attribute which may be included | AGGREGATOR is an optional transitive attribute which may be included | |||
in updates which are formed by aggregation (see Section 9.2.4.2). A | in updates which are formed by aggregation (see Section 9.2.4.2). A | |||
BGP speaker which performs route aggregation may add the AGGREGATOR | BGP speaker which performs route aggregation may add the AGGREGATOR | |||
attribute which shall contain its own AS number and IP address. The | attribute which shall contain its own AS number and IP address. The | |||
IP address should be the same as the BGP Identifier of the speaker. | IP address should be the same as the BGP Identifier of the speaker. | |||
6. BGP Error Handling. | 6. BGP Error Handling. | |||
skipping to change at page 29, line 4 | skipping to change at page 29, line 36 | |||
If the ORIGIN attribute has an undefined value, then the Error | If the ORIGIN attribute has an undefined value, then the Error | |||
Subcode is set to Invalid Origin Attribute. The Data field contains | Subcode is set to Invalid Origin Attribute. The Data field contains | |||
the unrecognized attribute (type, length and value). | the unrecognized attribute (type, length and value). | |||
If the NEXT_HOP attribute field is syntactically incorrect, then the | If the NEXT_HOP attribute field is syntactically incorrect, then the | |||
Error Subcode is set to Invalid NEXT_HOP Attribute. The Data field | Error Subcode is set to Invalid NEXT_HOP Attribute. The Data field | |||
contains the incorrect attribute (type, length and value). Syntactic | contains the incorrect attribute (type, length and value). Syntactic | |||
correctness means that the NEXT_HOP attribute represents a valid IP | correctness means that the NEXT_HOP attribute represents a valid IP | |||
host address. Semantic correctness applies only to the external BGP | host address. Semantic correctness applies only to the external BGP | |||
links. It means that the interface associated with the IP address, as | links, and only when the sender and the receiving speaker are one IP | |||
specified in the NEXT_HOP attribute, shares a common subnet with the | hop away from each other. To be semantically correct, the IP address | |||
receiving BGP speaker (unless the speaker has been configured to run | in the NEXT_HOP must not be the IP address of the receiving speaker, | |||
the external BGP session over multiple IP hops), and is not the IP | and the NEXT_HOP IP address must either be the sender's IP address | |||
address of the receiving BGP speaker. If the NEXT_HOP attribute is | (used to establish the BGP session), or the interface associated with | |||
semantically incorrect, the error should be logged, and the route | the NEXT_HOP IP address must share a common subnet with the receiving | |||
should be ignored. In this case, no NOTIFICATION message should be | BGP speaker. If the NEXT_HOP attribute is semantically incorrect, the | |||
sent. | error should be logged, and the route should be ignored. In this | |||
case, no NOTIFICATION message should be sent. | ||||
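
The semantic check in the -14 wording (right-hand column) can be sketched as follows. This is only an illustration under stated assumptions: the function name and its parameters are hypothetical, and "the interface associated with the NEXT_HOP IP address shares a common subnet with the receiving BGP speaker" is approximated by testing whether the NEXT_HOP address falls inside one of the receiver's directly connected subnets. The check is meant to be applied only on external BGP links whose peers are one IP hop apart.

```python
import ipaddress


def next_hop_semantically_correct(next_hop, receiver_addr, sender_addr,
                                  receiver_subnets):
    """Hypothetical helper: True if NEXT_HOP passes the -14 semantic check.

    receiver_subnets is an assumed list of the receiver's directly
    connected subnets in CIDR notation.
    """
    nh = ipaddress.ip_address(next_hop)
    # NEXT_HOP must not be the receiving speaker's own address.
    if nh == ipaddress.ip_address(receiver_addr):
        return False
    # NEXT_HOP may be the address the sender used to establish the session.
    if nh == ipaddress.ip_address(sender_addr):
        return True
    # Otherwise it must lie on a subnet shared with the receiver
    # (approximation of the "common subnet" condition).
    return any(nh in ipaddress.ip_network(s) for s in receiver_subnets)
```

A route that fails this check would be logged and ignored; no NOTIFICATION message is sent.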
The AS_PATH attribute is checked for syntactic correctness. If the | The AS_PATH attribute is checked for syntactic correctness. If the | |||
path is syntactically incorrect, then the Error Subcode is set to | path is syntactically incorrect, then the Error Subcode is set to | |||
Malformed AS_PATH. | Malformed AS_PATH. | |||
The information carried by the AS_PATH attribute is checked for AS | The information carried by the AS_PATH attribute is checked for AS | |||
loops. AS loop detection is done by scanning the full AS path (as | loops. AS loop detection is done by scanning the full AS path (as | |||
specified in the AS_PATH attribute), and checking that the autonomous | specified in the AS_PATH attribute), and checking that the autonomous | |||
system number of the local system does not appear in the AS path. If | system number of the local system does not appear in the AS path. If | |||
the autonomous system number appears in the AS path the route may be | the autonomous system number appears in the AS path the route may be | |||
skipping to change at page 39, line 41 | skipping to change at page 40, line 21 | |||
speaker receives from a peer an UPDATE message that advertises a new | speaker receives from a peer an UPDATE message that advertises a new | |||
route, a replacement route, or withdrawn routes. | route, a replacement route, or withdrawn routes. | |||
The Phase 1 decision function is a separate process which completes | The Phase 1 decision function is a separate process which completes | |||
when it has no further work to do. | when it has no further work to do. | |||
The Phase 1 decision function shall lock an Adj-RIB-In prior to | The Phase 1 decision function shall lock an Adj-RIB-In prior to | |||
operating on any route contained within it, and shall unlock it after | operating on any route contained within it, and shall unlock it after | |||
operating on all new or unfeasible routes contained within it. | operating on all new or unfeasible routes contained within it. | |||
For the newly received or replacement feasible route, the local BGP | For each newly received or replacement feasible route, the local BGP | |||
speaker shall determine a degree of preference. If the route is | speaker shall determine a degree of preference. If the route is | |||
learned from an internal peer, the value of the LOCAL_PREF attribute | learned from an internal peer, either the value of the LOCAL_PREF | |||
shall be taken as the degree of preference. If the route is learned | attribute shall be taken as the degree of preference, or the local | |||
from an external peer, then the degree of preference shall be | system may compute the degree of preference of the route based on | |||
computed based on preconfigured policy information and used as the | preconfigured policy information. Note that the latter (computing the | |||
LOCAL_PREF value in any IBGP readvertisement. The exact nature of | degree of preference based on preconfigured policy information) may | |||
this policy information and the computation involved is a local | result in formation of persistent routing loops. If the route is | |||
matter. For a route learned from an external peer, the local speaker | learned from an external peer, then the local BGP speaker computes | |||
shall then run the internal update process of 9.2.1 to select and | the degree of preference based on preconfigured policy information | |||
advertise the most preferable route. | and uses it as the LOCAL_PREF value in any IBGP readvertisement. The | |||
exact nature of this policy information and the computation involved | ||||
is a local matter. For a route learned from an external peer, the | ||||
local speaker shall then run the internal update process of 9.2.1 to | ||||
select and advertise the most preferable route. | ||||
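
A minimal sketch of the Phase 1 preference rule as worded in -14 (right-hand column). All names here are hypothetical: a route is modelled as a dictionary with a "local_pref" key, and policy stands for whatever locally configured function computes a preference. The draft leaves the nature of that policy a local matter.

```python
def degree_of_preference(route, from_internal_peer, policy,
                         use_policy_for_internal=False):
    """Hypothetical helper returning the degree of preference of a route.

    route  -- dict with a "local_pref" entry (assumed representation)
    policy -- callable mapping a route to a preference (local matter)
    """
    if from_internal_peer and not use_policy_for_internal:
        # Internal peer: take LOCAL_PREF as the degree of preference.
        return route["local_pref"]
    # External peer, or an internal route when the local system opts to
    # compute the preference itself. The draft notes that the latter may
    # result in persistent routing loops.
    return policy(route)
```

For a route learned from an external peer, the computed value would also be used as the LOCAL_PREF in any IBGP readvertisement.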
9.1.2 Phase 2: Route Selection | 9.1.2 Phase 2: Route Selection | |||
The Phase 2 decision function shall be invoked on completion of Phase | The Phase 2 decision function shall be invoked on completion of Phase | |||
1. The Phase 2 function is a separate process which completes when | 1. The Phase 2 function is a separate process which completes when | |||
it has no further work to do. The Phase 2 process shall consider all | it has no further work to do. The Phase 2 process shall consider all | |||
routes that are present in the Adj-RIBs-In, including those received | routes that are present in the Adj-RIBs-In, including those received | |||
from both internal and external peers. | from both internal and external peers. | |||
The Phase 2 decision function shall be blocked from running while the | The Phase 2 decision function shall be blocked from running while the | |||
skipping to change at page 41, line 18 | skipping to change at page 41, line 48 | |||
selecting one of the possible paths (if multiple best paths to the | selecting one of the possible paths (if multiple best paths to the | |||
same prefix are available). If the route to the address depicted by | same prefix are available). If the route to the address depicted by | |||
the NEXT_HOP attribute changes such that the immediate next hop or | the NEXT_HOP attribute changes such that the immediate next hop or | |||
the IGP cost to the NEXT_HOP (if the NEXT_HOP is resolved through an | the IGP cost to the NEXT_HOP (if the NEXT_HOP is resolved through an | |||
IGP route) changes, route selection should be recalculated as | IGP route) changes, route selection should be recalculated as | |||
specified above. | specified above. | |||
Notice that even though BGP routes do not have to be installed in the | Notice that even though BGP routes do not have to be installed in the | |||
Routing Table with the immediate next hop(s), implementations must | Routing Table with the immediate next hop(s), implementations must | |||
take care that before any packets are forwarded along a BGP route, | take care that before any packets are forwarded along a BGP route, | |||
it's associated NEXT_HOP address is resolved to the immediate | its associated NEXT_HOP address is resolved to the immediate | |||
(directly connected) next-hop address and this address (or multiple | (directly connected) next-hop address and this address (or multiple | |||
addresses) is finally used for actual packet forwarding. | addresses) is finally used for actual packet forwarding. | |||
Unfeasible routes SHALL be removed from the Loc-RIB and the routing | Unresolvable routes SHALL be removed from the Loc-RIB and the routing | |||
table. However, corresponding unfeasible routes SHOULD be kept in the | table. However, corresponding unresolvable routes SHOULD be kept in | |||
Adj-RIBs-In. | the Adj-RIBs-In. | |||
9.1.2.1 Route Resolvability Condition | 9.1.2.1 Route Resolvability Condition | |||
As indicated in Section 9.1.2, BGP routers should exclude | As indicated in Section 9.1.2, BGP routers should exclude | |||
unresolvable routes from the Phase 2 decision. This ensures that only | unresolvable routes from the Phase 2 decision. This ensures that only | |||
valid routes are installed in Loc-RIB and the Routing Table. | valid routes are installed in Loc-RIB and the Routing Table. | |||
The route resolvability condition is defined as follows. | The route resolvability condition is defined as follows. | |||
1. A route Rte1, referencing only the intermediate network | 1. A route Rte1, referencing only the intermediate network | |||
skipping to change at page 43, line 5 | skipping to change at page 43, line 33 | |||
be removed from consideration. The algorithm terminates as soon as | be removed from consideration. The algorithm terminates as soon as | |||
only one route remains in consideration. The criteria must be | only one route remains in consideration. The criteria must be | |||
applied in the order specified. | applied in the order specified. | |||
Several of the criteria are described using pseudo-code. Note that | Several of the criteria are described using pseudo-code. Note that | |||
the pseudo-code shown was chosen for clarity, not efficiency. It is | the pseudo-code shown was chosen for clarity, not efficiency. It is | |||
not intended to specify any particular implementation. BGP | not intended to specify any particular implementation. BGP | |||
implementations MAY use any algorithm which produces the same results | implementations MAY use any algorithm which produces the same results | |||
as those described here. | as those described here. | |||
a) Remove from consideration routes with less-preferred | a) Remove from consideration all routes which are not tied for | |||
having the smallest number of AS numbers present in their AS_PATH | ||||
attributes. Note that when counting this number, an AS_SET counts | ||||
as 1, no matter how many ASs are in the set, and that, if the | ||||
implementation supports [13], then AS numbers present in segments | ||||
of type AS_CONFED_SEQUENCE or AS_CONFED_SET are not included in | ||||
the count of AS numbers present in the AS_PATH. | ||||
b) Remove from consideration all routes which are not tied for | ||||
having the lowest Origin number in their Origin attribute. | ||||
c) Remove from consideration routes with less-preferred | ||||
MULTI_EXIT_DISC attributes. MULTI_EXIT_DISC is only comparable | MULTI_EXIT_DISC attributes. MULTI_EXIT_DISC is only comparable | |||
between routes learned from the same neighboring AS. Routes which | between routes learned from the same neighboring AS. Routes which | |||
do not have the MULTI_EXIT_DISC attribute are considered to have | do not have the MULTI_EXIT_DISC attribute are considered to have | |||
the highest possible MULTI_EXIT_DISC value. | the lowest possible MULTI_EXIT_DISC value. | |||
This is also described in the following procedure: | This is also described in the following procedure: | |||
for m = all routes still under consideration | for m = all routes still under consideration | |||
for n = all routes still under consideration | for n = all routes still under consideration | |||
if (neighborAS(m) == neighborAS(n)) and (MED(n) < MED(m)) | if (neighborAS(m) == neighborAS(n)) and (MED(n) < MED(m)) | |||
remove route m from consideration | remove route m from consideration | |||
In the pseudo-code above, MED(n) is a function which returns the | In the pseudo-code above, MED(n) is a function which returns the | |||
value of route n's MULTI_EXIT_DISC attribute. If route n has no | value of route n's MULTI_EXIT_DISC attribute. If route n has no | |||
MULTI_EXIT_DISC attribute, the function returns the highest | MULTI_EXIT_DISC attribute, the function returns the lowest | |||
possible MULTI_EXIT_DISC value, i.e. 2^32-1. | possible MULTI_EXIT_DISC value, i.e. 0. | |||
Similarly, neighborAS(n) is a function which returns the neighbor | Similarly, neighborAS(n) is a function which returns the neighbor | |||
AS from which the route was received. | AS from which the route was received. | |||
b) Remove from consideration any routes with less-preferred | d) If at least one of the candidate routes was received from an | |||
external peer in a neighboring autonomous system, remove from | ||||
consideration all routes which were received from internal peers. | ||||
e) Remove from consideration any routes with less-preferred | ||||
interior cost. The interior cost of a route is determined by | interior cost. The interior cost of a route is determined by | |||
calculating the metric to the next hop for the route using the | calculating the metric to the next hop for the route using the | |||
Routing Table. If the next hop for a route is reachable, but no | Routing Table. If the next hop for a route is reachable, but no | |||
cost can be determined, then this step should be skipped | cost can be determined, then this step should be skipped | |||
(equivalently, consider all routes to have equal costs). | (equivalently, consider all routes to have equal costs). | |||
This is also described in the following procedure. | This is also described in the following procedure. | |||
for m = all routes still under consideration | for m = all routes still under consideration | |||
for n = all routes still under consideration | for n = all routes still under consideration | |||
if (cost(n) is better than cost(m)) | if (cost(n) is better than cost(m)) | |||
remove m from consideration | remove m from consideration | |||
In the pseudo-code above, cost(n) is a function which returns the | In the pseudo-code above, cost(n) is a function which returns the | |||
cost of the path (interior distance) to the address given in the | cost of the path (interior distance) to the address given in the | |||
NEXT_HOP attribute of the route. | NEXT_HOP attribute of the route. | |||
c) If at least one of the candidate routes was received from an | f) Remove from consideration all routes other than the route that | |||
external peer in a neighboring autonomous system, remove from | ||||
consideration all routes which were received from internal peers. | ||||
d) Remove from consideration all routes other than the route that | ||||
was advertised by the BGP speaker whose BGP Identifier has the | was advertised by the BGP speaker whose BGP Identifier has the | |||
lowest value. | lowest value. | |||
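
The tie-breaking steps a) through f) of the -14 ordering (right-hand column) can be sketched as one filtering pipeline. This is an illustration, not a prescribed implementation. The Route fields are hypothetical; as_path_len is assumed to be precomputed by the caller (an AS_SET counting as 1, confederation segments excluded); a missing MULTI_EXIT_DISC is treated as 0, as in -14; and the interior cost is assumed to be known for every route, so the "skip this step" case of e) is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    name: str
    as_path_len: int      # precomputed AS count (assumption)
    origin: int           # lower Origin number is preferred
    neighbor_as: int
    med: Optional[int]    # None when MULTI_EXIT_DISC is absent
    external: bool        # received from an external peer
    igp_cost: int         # interior cost to NEXT_HOP (assumed known)
    bgp_id: int           # BGP Identifier of the advertising speaker


def med(r: Route) -> int:
    # -14: a missing MULTI_EXIT_DISC counts as the lowest value, 0.
    return 0 if r.med is None else r.med


def keep_min(routes, key):
    best = min(key(r) for r in routes)
    return [r for r in routes if key(r) == best]


def break_ties(routes: List[Route]) -> Route:
    if not routes:
        raise ValueError("no candidate routes")
    steps = [
        lambda rs: keep_min(rs, lambda r: r.as_path_len),        # a)
        lambda rs: keep_min(rs, lambda r: r.origin),             # b)
        lambda rs: [m for m in rs if not any(                    # c)
            n.neighbor_as == m.neighbor_as and med(n) < med(m)
            for n in rs)],
        lambda rs: [r for r in rs if r.external] or rs,          # d)
        lambda rs: keep_min(rs, lambda r: r.igp_cost),           # e)
        lambda rs: keep_min(rs, lambda r: r.bgp_id),             # f)
    ]
    candidates = list(routes)
    for step in steps:
        if len(candidates) == 1:   # terminate once one route remains
            break
        candidates = step(candidates)
    return candidates[0]
```

Step c) compares MULTI_EXIT_DISC only between routes from the same neighboring AS, which is why it is written as a pairwise filter rather than a global minimum.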
9.1.3 Phase 3: Route Dissemination | 9.1.3 Phase 3: Route Dissemination | |||
The Phase 3 decision function shall be invoked on completion of Phase | The Phase 3 decision function shall be invoked on completion of Phase | |||
2, or when any of the following events occur: | 2, or when any of the following events occur: | |||
a) when routes in the Loc-RIB to local destinations have changed | a) when routes in the Loc-RIB to local destinations have changed | |||
skipping to change at page 47, line 8 | skipping to change at page 47, line 45 | |||
corresponding feasible route. | corresponding feasible route. | |||
All feasible routes which are advertised shall be placed in the | All feasible routes which are advertised shall be placed in the | |||
appropriate Adj-RIBs-Out, and all unfeasible routes which are | appropriate Adj-RIBs-Out, and all unfeasible routes which are | |||
advertised shall be removed from the Adj-RIBs-Out after the | advertised shall be removed from the Adj-RIBs-Out after the | |||
corresponding update messages have been sent. | corresponding update messages have been sent. | |||
9.2.1.1 Breaking Ties (Internal Updates) | 9.2.1.1 Breaking Ties (Internal Updates) | |||
If a local BGP speaker has connections to several external peers, | If a local BGP speaker has connections to several external peers, | |||
there will be multiple Adj-RIBs-In associated with these peers. These | there will be multiple Adj-RIBs-In associated with these peers. | |||
Adj-RIBs-In might contain several equally preferable routes to the | These Adj-RIBs-In might contain several equally preferable routes to | |||
same destination, all of which were advertised by external peers. | the same destination, all of which were advertised by external peers. | |||
The local BGP speaker shall select one of these routes according to | The local BGP speaker shall select one of these routes according to | |||
the following rules: | the following rules: | |||
a) If the candidate routes differ only in their NEXT_HOP and | a) If the candidate routes differ only in their NEXT_HOP and | |||
MULTI_EXIT_DISC attributes, and the local system is configured to | MULTI_EXIT_DISC attributes, and the local system is configured to | |||
take into account the MULTI_EXIT_DISC attribute, select the route | take into account the MULTI_EXIT_DISC attribute, select the route | |||
that has the lowest value of the MULTI_EXIT_DISC attribute. A | that has the lowest value of the MULTI_EXIT_DISC attribute. A | |||
route with the MULTI_EXIT_DISC attribute shall be preferred to a | route with the MULTI_EXIT_DISC attribute shall be preferred to a | |||
route without the MULTI_EXIT_DISC attribute. | route without the MULTI_EXIT_DISC attribute. | |||
skipping to change at page 50, line 41 | skipping to change at page 51, line 28 | |||
Routes that have the following attributes shall not be aggregated | Routes that have the following attributes shall not be aggregated | |||
unless the corresponding attributes of each route are identical: | unless the corresponding attributes of each route are identical: | |||
MULTI_EXIT_DISC, NEXT_HOP. | MULTI_EXIT_DISC, NEXT_HOP. | |||
If the aggregation occurs as part of the update process, routes with | If the aggregation occurs as part of the update process, routes with | |||
different NEXT_HOP values can be aggregated when announced through an | different NEXT_HOP values can be aggregated when announced through an | |||
external BGP session. | external BGP session. | |||
Path attributes that have different type codes cannot be aggregated | Path attributes that have different type codes cannot be aggregated | |||
together. Path of the same type code may be aggregated, according to | together. Path attributes of the same type code may be aggregated, | |||
the following rules: | according to the following rules: | |||
ORIGIN attribute: If at least one route among routes that are | ORIGIN attribute: If at least one route among routes that are | |||
aggregated has ORIGIN with the value INCOMPLETE, then the | aggregated has ORIGIN with the value INCOMPLETE, then the | |||
aggregated route must have the ORIGIN attribute with the value | aggregated route must have the ORIGIN attribute with the value | |||
INCOMPLETE. Otherwise, if at least one route among routes that are | INCOMPLETE. Otherwise, if at least one route among routes that are | |||
aggregated has ORIGIN with the value EGP, then the aggregated | aggregated has ORIGIN with the value EGP, then the aggregated | |||
route must have the ORIGIN attribute with the value EGP. In all | route must have the ORIGIN attribute with the value EGP. In all | |||
other cases the value of the ORIGIN attribute of the aggregated | other cases the value of the ORIGIN attribute of the aggregated | |||
route is INTERNAL. | route is INTERNAL. | |||
skipping to change at page 52, line 14 | skipping to change at page 52, line 49 | |||
aggregated AS_PATH attribute. | aggregated AS_PATH attribute. | |||
Appendix 6, section 6.8 presents another algorithm that satisfies | Appendix 6, section 6.8 presents another algorithm that satisfies | |||
the conditions and allows for more complex policy configurations. | the conditions and allows for more complex policy configurations. | |||
ATOMIC_AGGREGATE: If at least one of the routes to be aggregated | ATOMIC_AGGREGATE: If at least one of the routes to be aggregated | |||
has ATOMIC_AGGREGATE path attribute, then the aggregated route | has ATOMIC_AGGREGATE path attribute, then the aggregated route | |||
shall have this attribute as well. | shall have this attribute as well. | |||
AGGREGATOR: All AGGREGATOR attributes of all routes to be | AGGREGATOR: All AGGREGATOR attributes of all routes to be | |||
aggregated should be ignored. | aggregated should be ignored. The BGP speaker performing the route | |||
aggregation may attach a new AGGREGATOR attribute (see Section | ||||
5.1.7). | ||||
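
The ORIGIN and ATOMIC_AGGREGATE aggregation rules above can be sketched as follows. The function names are hypothetical. The numeric codes are the ORIGIN values defined for BGP-4 (0 for IGP, which the text above calls INTERNAL, 1 for EGP, 2 for INCOMPLETE).

```python
# ORIGIN attribute codes; IGP is the value the text refers to as INTERNAL.
IGP, EGP, INCOMPLETE = 0, 1, 2


def aggregate_origin(origins):
    """ORIGIN of the aggregate, given the ORIGINs of the routes aggregated."""
    if INCOMPLETE in origins:
        return INCOMPLETE
    if EGP in origins:
        return EGP
    return IGP


def aggregate_atomic(flags):
    """True if any contributing route carries ATOMIC_AGGREGATE."""
    return any(flags)
```

Existing AGGREGATOR attributes are not carried over; the aggregating speaker may attach a new one of its own.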
9.3 Route Selection Criteria | 9.3 Route Selection Criteria | |||
Generally speaking, additional rules for comparing routes among | Generally speaking, additional rules for comparing routes among | |||
several alternatives are outside the scope of this document. There | several alternatives are outside the scope of this document. There | |||
are two exceptions: | are two exceptions: | |||
- If the local AS appears in the AS path of the new route being | - If the local AS appears in the AS path of the new route being | |||
considered, then that new route cannot be viewed as better than | considered, then that new route cannot be viewed as better than | |||
any other route (provided that the speaker is configured to accept | any other route (provided that the speaker is configured to accept | |||
skipping to change at page 53, line 27 | skipping to change at page 54, line 20 | |||
attribute. | attribute. | |||
Procedures for imposing an upper bound on the number of prefixes | Procedures for imposing an upper bound on the number of prefixes | |||
that a BGP speaker would accept from a peer. | that a BGP speaker would accept from a peer. | |||
The ability of a BGP speaker to include more than one instance of | The ability of a BGP speaker to include more than one instance of | |||
its own AS in the AS_PATH attribute for the purpose of inter-AS | its own AS in the AS_PATH attribute for the purpose of inter-AS | |||
traffic engineering. | traffic engineering. | |||
Clarifications on the various types of NEXT_HOPs. | Clarifications on the various types of NEXT_HOPs. | |||
Clarifications to the use of the ATOMIC_AGGREGATE attribute. | ||||
The relationship between the immediate next hop, and the next hop | The relationship between the immediate next hop, and the next hop | |||
as specified in the NEXT_HOP path attribute. | as specified in the NEXT_HOP path attribute. | |||
Clarifications on the tie-breaking procedures. | Clarifications on the tie-breaking procedures. | |||
Appendix 2. Comparison with RFC1267 | Appendix 2. Comparison with RFC1267 | |||
All the changes listed in Appendix 1, plus the following. | All the changes listed in Appendix 1, plus the following. | |||
End of changes. | ||||
This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/ |