draft-ietf-idr-route-damp-00.txt   draft-ietf-idr-route-damp-01.txt 
Internet Engineering Task Force Curtis Villamizar Internet Engineering Task Force Curtis Villamizar
INTERNET-DRAFT ANS INTERNET-DRAFT ANS
draft-ietf-idr-route-damp-00 Ravi Chandra draft-ietf-idr-route-damp-01 Ravi Chandra
Cisco Cisco
Ramesh Govindan Ramesh Govindan
ISI ISI
October 30, 1997 January 8, 1998
BGP Route Flap Damping BGP Route Flap Damping
Status of this Memo Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
skipping to change at page 2, line ? skipping to change at page 2, line ?
well behaved routes. well behaved routes.
This must be accomplished keeping other goals of BGP in mind: This must be accomplished keeping other goals of BGP in mind:
o pack changes into a small number of updates o pack changes into a small number of updates
o preserve consistent routing o preserve consistent routing
o minimal addition space and computational overhead o minimal addition space and computational overhead
Excessive updates to reachability state has been widespread in the An excessive rate of update to the advertised reachability of a subset
Internet. This observation was made in the early 1990s by many people of Internet prefixes has been widespread in the Internet. This
involved in Internet operations and remains to case to date. These observation was made in the early 1990s by many people involved in
excessive updates are not necessarily periodic so route oscillation Internet operations and remains the case. These excessive updates are
would be a misleading term. The informal term used to describe this not necessarily periodic so route oscillation would be a misleading
effect is ``route flap''. The techniques described here are now term. The informal term used to describe this effect is ``route
widely deployed and are commonly referred to as ``route flap flap''. The techniques described here are now widely deployed and are
damping''. commonly referred to as ``route flap damping''.
1 Overview 1 Overview
It is necessary to reduce the amount of routing traffic (the number of To maintain scalability of a routed internet, it is necessary to
update message) generated by BGP in order to limit processing reduce the amount of change in routing state propagated by BGP in
requirements. The primary contributors of processing load resulting order to limit processing requirements. The primary contributors of
from BGP updates are the BGP decision process and adding and removing processing load resulting from BGP updates are the BGP decision
forwarding entries. process and adding and removing forwarding entries.
Consider the following example. A widely deployed BGP implementation Consider the following example. A widely deployed BGP implementation
may tend to fail due to high routing update volume. For example, it may tend to fail due to high routing update volume. For example, it
may be unable to maintain it's BGP or IGP sessions if sufficiently may be unable to maintain it's BGP or IGP sessions if sufficiently
loaded. The failure of one router can further contribute to the load loaded. The failure of one router can further contribute to the load
on other routers. This additional load may cause failures in other on other routers. This additional load may cause failures in other
instances of the same implementation or other implementations with a instances of the same implementation or other implementations with a
similar weakness. In the worst case, a stable oscillation could similar weakness. In the worst case, a stable oscillation could
result. Such worse cases have already been observed in practice. result. Such worse cases have already been observed in practice.
A BGP implementation must be prepared for a large volume of routing A BGP implementation must be prepared for a large volume of routing
traffic. A BGP implementation cannot rely upon the sender to traffic. A BGP implementation cannot rely upon the sender to
sufficiently shield it from route instabilities. The guidelines here sufficiently shield it from route instabilities. The guidelines here
are designed to prevent sustained oscillations, but do not eliminate are designed to prevent sustained oscillations, but do not eliminate
the need for robust and efficient implementations. The mechanisms the need for robust and efficient implementations. The mechanisms
described here allow routing instability to be contained at an AS described here allow routing instability to be contained at an AS
border router bordering the instability. border router bordering the instability.
Even where BGP implementations are highly robust, the performance of
the routing process is limited. Limiting the propagation of
unnecessary change then becomes an issue of maintaining reasonable
route change convergence time as a routing topology grows.
2 Methods of Limiting Route Advertisement 2 Methods of Limiting Route Advertisement
Two methods of controlling the frequency of route advertisement are Two methods of controlling the frequency of route advertisement are
described here. The first involves fixed timers. The fixed timer described here. The first involves fixed timers. The fixed timer
technique has no space overhead per route but has the disadvantage of technique has no space overhead per route but has the disadvantage of
slowing route convergence for the normal case where a route does not slowing route convergence for the normal case where a route does not
have a history of instability. The second method overcomes this have a history of instability. The second method overcomes this
limitation at the expense of maintaining some additional space limitation at the expense of maintaining some additional space
overhead. The additional overhead includes a small amount of state overhead. The additional overhead includes a small amount of state
per route and a very small processing overhead. per route and a very small processing overhead.
skipping to change at page 4, line 46 skipping to change at page 4, line 52
to advertise a route that is different from the route that is being to advertise a route that is different from the route that is being
used, except for a very minimal time. It is more desirable to used, except for a very minimal time. It is more desirable to
suppress the acceptance of a route (and therefore the use of that suppress the acceptance of a route (and therefore the use of that
route in the IGP) rather than suppress only the redistribution. route in the IGP) rather than suppress only the redistribution.
It is clearly not possible to accurately predict the future stability It is clearly not possible to accurately predict the future stability
of a route. The recent history of stability is generally regarded as of a route. The recent history of stability is generally regarded as
a good basis for estimating the likelihood of future stability. The a good basis for estimating the likelihood of future stability. The
criteria that is used to distinguish well behaved from poorly behaved criteria that is used to distinguish well behaved from poorly behaved
routes is therefore based on the recent history of stability of the routes is therefore based on the recent history of stability of the
route. There is no simple direct quantitative expression of recent route. There is no simple quantitative expression of recent stability
stability so a figure of merit must be defined. Some desirable so a figure of merit must be defined. Some desirable characteristics
characteristics of this figure of merit would be that the farther in of this figure of merit would be that the farther in the past that
the past that instability occurred, the less it's affect on the figure instability occurred, the less it's affect on the figure of merit and
of merit and that the instability measure would be cumulative rather that the instability measure would be cumulative rather than
than reflecting only the most recent event. reflecting only the most recent event.
The algorithms should behave such that for routes which have a history The algorithms should behave such that for routes which have a history
of stability but make a few transitions, those transitions should be of stability but make a few transitions, those transitions should be
made quickly. If transitions continue, advertisement of the route made quickly. If transitions continue, advertisement of the route
should be suppressed. There should be some memory of prior instabil- should be suppressed. There should be some memory of prior instabil-
ity. The degree to which prior instability is considered should be ity. The degree to which prior instability is considered should be
gradually reduced as long as the route remains announced and stable. gradually reduced as long as the route remains announced and stable.
2.3 Design Choices 2.3 Design Choices
After routes have been accepted their readvertisement will be briefly After routes have been accepted their readvertisement will be briefly
suppressed to improve packing of updates. There may be a lengthy suppressed to improve packing of updates. There may be a lengthy
suppression of the acceptance of an external route. How long a route suppression of the acceptance of an external route. How long a route
will be suppressed is based on a figure of merit that is expected to will be suppressed is based on a figure of merit that is expected to
be loosely correlated to the probability of future instability of a be correlated to the probability of future instability of a route.
route. Routes with high figure of merit values will be suppressed. Routes with high figure of merit values will be suppressed. An
An exponential decay algorithm was chosen as the basis for reducing exponential decay algorithm was chosen as the basis for reducing the
the figure of merit over time. These choices should be viewed as figure of merit over time. These choices should be viewed as
suggestions for implementation. suggestions for implementation.
An exponential decay function has the property that previous An exponential decay function has the property that previous
instability can be remembered for a fairly long time. The rate at instability can be remembered for a fairly long time. The rate at
which the instability figure of merit decays slows as time goes on. which the instability figure of merit decays slows as time goes on.
Exponential decay is a transitive function. Exponential decay has the following property.
f(f(figure-of-merit, t1), t2) = f(figure-of-merit, t1+t2) f(f(figure-of-merit, t1), t2) = f(figure-of-merit, t1+t2)
This transitive property allows the decay for a long period to be This property allows the decay for a long period to be computed in a
computed in a single operation regardless of the current value single operation regardless of the current value (figure-of-merit).
(figure-of-merit). As a performance optimization, the decay can be As a performance optimization, the decay can be applied in fixed time
applied in fixed time increments. Given a desired decay half life, increments. Given a desired decay half life, the decay for a single
the decay for a single time increment can be computed ahead of time. time increment can be computed ahead of time. The decay for multiple
The decay for multiple time increments is expressed below. time increments is expressed below.
f(figure-of-merit, n * t0) = f(figure-of-merit, t0) ** n = K f(figure-of-merit, n*t0) = f(figure-of-merit, t0)**n = K**n
** n
The values of K ** n can be precomputed for a reasonable number of The values of K ** n can be precomputed for a reasonable number of
``n'' and stored in an array. The value of ``K'' is always less than ``n'' and stored in an array. The value of ``K'' is always less than
one. The array size can be bounded since the value quickly approaches one. The array size can be bounded since the value quickly approaches
zero. This makes the decay easy to compute using an array bound zero. This makes the decay easy to compute using an array bound
check, an array lookup and a single multiply regardless as to how much check, an array lookup and a single multiply regardless as to how much
time has elapsed. time has elapsed.
3 Limiting Route Advertisements using Fixed Timers 3 Limiting Route Advertisements using Fixed Timers
skipping to change at page 6, line 23 skipping to change at page 6, line 27
equal to the bound for a reachable advertisement. equal to the bound for a reachable advertisement.
Routes that need to be readvertised can be marked in the RIB or an Routes that need to be readvertised can be marked in the RIB or an
external set of structures maintained, which references the RIB. external set of structures maintained, which references the RIB.
Periodically, a subset of the marked routes can be flushed. This is Periodically, a subset of the marked routes can be flushed. This is
fairly straightforward and accomplishes the objectives. Computation fairly straightforward and accomplishes the objectives. Computation
for too simple an implementation may be order N squared. To avoid N for too simple an implementation may be order N squared. To avoid N
squared performance, some form of data structure is needed to group squared performance, some form of data structure is needed to group
routes with common attributes. routes with common attributes.
Any implementation should packs updates efficiently, provide a minimum An implementation should pack updates efficiently, provide a minimum
readvertisement delay, provide a bounds on the maximum readvertisement readvertisement delay, provide a bounds on the maximum readvertisement
delay that would be experienced solely as a result of the algorithm delay that would be experienced solely as a result of the algorithm
used to provide a minimum delay, and must be computationally efficient used to provide a minimum delay, and must be computationally efficient
in the presence of a very large number of candidates for in the presence of a very large number of candidates for
readvertisement. readvertisement.
4 Stability Sensitive Suppression of Route Advertisement 4 Stability Sensitive Suppression of Route Advertisement
This method of limiting route advertisements uses a measure of route This method of limiting route advertisements uses a measure of route
stability applied on a per route basis. This technique is applied stability applied on a per route basis. This technique is applied
skipping to change at page 6, line 51 skipping to change at page 7, line 9
suppressed. Each time a route is withdrawn, the figure of merit is suppressed. Each time a route is withdrawn, the figure of merit is
incremented. While the route is not changing the figure of merit incremented. While the route is not changing the figure of merit
value is decayed exponentially with separate decay rates depending on value is decayed exponentially with separate decay rates depending on
whether the route is stable and reachable or has been stable and whether the route is stable and reachable or has been stable and
unreachable. The decay rate may be slower when the route is unreach- unreachable. The decay rate may be slower when the route is unreach-
able, or the stability figure of merit could remain fixed (not decay able, or the stability figure of merit could remain fixed (not decay
at all) while the route remains unreachable. Whether to decay un- at all) while the route remains unreachable. Whether to decay un-
reachable routes at the same rate, a slower rate, or not at all is an im- reachable routes at the same rate, a slower rate, or not at all is an im-
plementation choice. Decaying at a slower rate is recommended. plementation choice. Decaying at a slower rate is recommended.
An very efficient implementation is suggested in the following A very efficient implementation is suggested in the following
sections. The implementation only requires computation for the routes sections. The implementation only requires computation for the routes
contained in an update, when an update is received or withdrawn (as contained in an update, when an update is received or withdrawn (as
opposed to the simplistic approach of periodically decaying each opposed to the simplistic approach of periodically decaying each
route). The suggested implementation involves only a small number of route). The suggested implementation involves only a small number of
simple operations, and can be implemented using scaled integers. simple operations, and can be implemented using scaled integers.
The behavior of unstable routes is believed to be fairly predictable. The behavior of unstable routes is fairly predictable. Severely
Severely flapping routes will often be advertised and withdrawn at flapping routes will often be advertised and withdrawn at regular time
regular time intervals corresponding to the timers of a particular intervals corresponding to the timers of a particular protocol (the
protocol (the IGP or exterior protocol in use where the problem IGP or exterior protocol in use where the problem exists). Marginal
exists). Marginal circuits or mild congestion can result in a long circuits or mild congestion can result in a long term pattern of
term pattern of occasional brief route withdrawal or occasional brief occasional brief route withdrawal or occasional brief connectivity.
connectivity.
4.1 Single vs. Multiple Configuration Parameter Sets 4.1 Single vs. Multiple Configuration Parameter Sets
The behavior of the algorithm is modified by a number of configurable The behavior of the algorithm is modified by a number of configurable
parameters. It is possible to configure separate sets of parameters parameters. It is possible to configure separate sets of parameters
designed to handle short term severe route flap and chronic milder designed to handle short term severe route flap and chronic milder
route flap (a pattern of occasional drops over a long time period). route flap (a pattern of occasional drops over a long time period).
The former would require a fast decay and low threshold (allowing a The former would require a fast decay and low threshold (allowing a
small number of consecutive flaps to cause a route to be suppressed, small number of consecutive flaps to cause a route to be suppressed,
but allowing it to be reused after a relatively short period of but allowing it to be reused after a relatively short period of
skipping to change at page 9, line 49 skipping to change at page 10, line 5
This is the time value corresponding to the last reuse list. This This is the time value corresponding to the last reuse list. This
may be the maximum value of T-hold for all parameter sets of may be may be the maximum value of T-hold for all parameter sets of may be
configured. configured.
number of reuse lists (reuse-list-size) number of reuse lists (reuse-list-size)
This is the number of reuse lists. It may be determined from This is the number of reuse lists. It may be determined from
reuse-list-max or set explicitly. reuse-list-max or set explicitly.
A necessary optimization is described in Section 4.8.6 that involves FIGURES ARE FOUND IN THE POSTSCRIPT AND HTML VERSIONS ONLY
an array referred to as the ``reuse index array''. A reuse index
Figure 1: Instability figure of merit for flap at a constant rate Figure 1: Instability figure of merit for flap at a constant rate
A necessary optimization is described in Section 4.8.6 that involves
an array referred to as the ``reuse index array''. A reuse index
array is needed for each decay rate in use. The reuse index array is array is needed for each decay rate in use. The reuse index array is
used to estimate which reuse list to place a route when it is used to estimate which reuse list to place a route when it is
suppressed. Proper placement avoids the need to periodically evaluate suppressed. Proper placement avoids the need to periodically evaluate
decay to determine if a route can be reused. Using the reuse index decay to determine if a route can be reused. Using the reuse index
array avoids the need to compute a logarithm to determine placement. array avoids the need to compute a logarithm to determine placement.
One additional system wide parameter can be introduced. One additional system wide parameter can be introduced.
reuse index array size (reuse-index-array-size) reuse index array size (reuse-index-array-size)
This is the size of reuse index arrays. This size determines the This is the size of reuse index arrays. This size determines the
accuracy with which suppressed routes can be placed within the set accuracy with which suppressed routes can be placed within the set
of reuse lists when suppressed for a long time. of reuse lists when suppressed for a long time.
4.3 Guidelines for Setting Parameters 4.3 Guidelines for Setting Parameters
The decay half life should be set to a time considerably longer than The decay half life should be set to a time considerably longer than
the period of the route flap it is intended to address. If for the period of the route flap it is intended to address. For example,
example, the decay is set to ten minutes and a route is withdrawn and if the decay is set to ten minutes and a route is withdrawn and
readvertised exactly every ten minutes, the route would continue to readvertised exactly every ten minutes, the route would continue to
flap if the cutoff was set to a value of 2 or above. flap if the cutoff was set to a value of 2 or above.
The stability figure of merit itself is an accumulated time decayed The stability figure of merit itself is an accumulated time decayed
total. This must be kept in mind in setting the decay time, cutoff total. This must be kept in mind in setting the decay time, cutoff
values and reuse values. For example, if a route flaps at four times values and reuse values. For example, if a route flaps at four times
the decay rate, it will reach 3 in 4 cycles, 4 in 6 cycles, 5 in 10 the decay rate, it will reach 3 in 4 cycles, 4 in 6 cycles, 5 in 10
cycles, and will converge at about 6.3. At twice the decay time, it cycles, and will converge at about 6.3. At twice the decay time, it
will reach 3 in 7 cycles, and converge at a value of less than 3.5. will reach 3 in 7 cycles, and converge at a value of less than 3.5.
skipping to change at page 10, line 45 skipping to change at page 11, line 5
constant rate. The time axis is labeled in multiples of the decay constant rate. The time axis is labeled in multiples of the decay
half life. The plots represent route flap with a period of 1/2, 1/3, half life. The plots represent route flap with a period of 1/2, 1/3,
1/4, and 1/8 times the decay half life. A ceiling of 4.5 was set, 1/4, and 1/8 times the decay half life. A ceiling of 4.5 was set,
which can be seen to affect three of the plots, effectively limiting which can be seen to affect three of the plots, effectively limiting
the time it takes to readvertise the route regardless of the prior the time it takes to readvertise the route regardless of the prior
history. With the cutoff and reuse thresholds suggested by the dotted history. With the cutoff and reuse thresholds suggested by the dotted
lines, routes would be suppressed after being declared unreachable 2-3 lines, routes would be suppressed after being declared unreachable 2-3
times and be used again after approximately 2 decay half life periods times and be used again after approximately 2 decay half life periods
of stability. of stability.
FIGURES ARE FOUND IN THE POSTSCRIPT AND HTML VERSIONS ONLY
Figure 2: Separate decay constants when unreachable
From either maximum hold time value (Tmax-ok or Tmax-ng), a ratio of From either maximum hold time value (Tmax-ok or Tmax-ng), a ratio of
the cutoff to a ceiling can be determined. An integer value for the the cutoff to a ceiling can be determined. An integer value for the
ceiling can then be chosen such that overflow will not be a problem ceiling can then be chosen such that overflow will not be a problem
and all other values can be scaled accordingly. If both cutoffs are and all other values can be scaled accordingly. If both cutoffs are
specified or if multiple parameter sets are used the highest ceiling specified or if multiple parameter sets are used the highest ceiling
Figure 2: Separate decay constants when unreachable
will be used. will be used.
Figure 2 show the effect of configuring separate decay rates to be Figure 2 show the effect of configuring separate decay rates to be
used when the route is reachable or unreachable. The decay rate is used when the route is reachable or unreachable. The decay rate is
5 times slower when the route is unreachable. In the three case 5 times slower when the route is unreachable. In the three case
shown, the period of the route flap is equal to the decay half life shown, the period of the route flap is equal to the decay half life
but the route is reachable 1/8 of the time in one, reachable 1/2 the but the route is reachable 1/8 of the time in one, reachable 1/2 the
time in one, and reachable 7/8 of the time in the other. In the last time in one, and reachable 7/8 of the time in the other. In the last
case the route is not suppressed until after the third unreachable case the route is not suppressed until after the third unreachable
(when it is above the top threshold after becoming reachable again). (when it is above the top threshold after becoming reachable again).
skipping to change at page 13, line 38 skipping to change at page 14, line 5
Implementations may chose to make the array size shorter and multiply Implementations may chose to make the array size shorter and multiply
more than once when decaying a long time interval to reduce storage. more than once when decaying a long time interval to reduce storage.
The reuse index arrays serve a similar purpose to the decay arrays. The reuse index arrays serve a similar purpose to the decay arrays.
The amount of time until a route can be reused can be determined using The amount of time until a route can be reused can be determined using
a array lookup. The array can be built given the decay rate. The a array lookup. The array can be built given the decay rate. The
array is indexed using a scaled integer proportional to the ratio array is indexed using a scaled integer proportional to the ratio
between a current stability figure of merit value and the value needed between a current stability figure of merit value and the value needed
for the route to be reused. for the route to be reused.
4.4.3 Data Structures per Route 4.4.3 Per Route State
Information must be maintained per some tuple representing a route.
At the very minimum, the NLRI (BGP prefix and length) must be
contained in the tuple. Different BGP attributes may be included or
excluded depending on the specific situation. The AS path should also
be contained in the tuple be default. The tuple may also optionally
contain other BGP attributes such as MULTI_EXIT_DISCRIMINATOR (MED).
The tuple representing a route for the purpose of route flap damping
is:
tuple entry default options
-------------------------------------------
NLRI
prefix required
length required
AS path included option to exclude
last AS set in path excluded option to include
next hop excluded option to include
MED excluded option to include
in comparisons only
The AS path is generally included in order to identify downstream
instability which is not being damped or not being sufficiently damped
and is alternating between a stable and an unstable path. Under rare
circumstances it may be desirable to exclude AS path for all or a
subset of prefixes. If an AS path ends in an AS set, in practice the
path is always for an aggregate. Changes to the trailing AS set
should be ignored. Ideally the AS path comparison should insure that
at least one AS has remained constant in the old and new AS set, but
completely ignoring the contents of a trailing AS set is also
acceptable.
Including next hop and MED changes can help suppress the use of an AS
which is internally unstable or avoid a next hop which is closer to an
unstable IGP path in the adjacent AS. If a large number of MED values
are used, the increase in the amount of state may become a problem.
For this reason MED is disabled by default and enabled only as part of
the tuple comparison, using a single state entry regardless of MED
value. Including MED will suppress the use of the adjacent AS even
though the change need not be propagated further. Using MED is only a
safe practice if a path is known to exist through another AS or where
there are enough peering sites with the adjacent AS such that routes
heard at only a subset of the peering sites will be suppressed.
4.4.4 Data Structures per Route
The following information must be maintained per route. A route here The following information must be maintained per route. A route here
is considered to be a tuple containing at least NLRI prefix, next hop, is considered to be a tuple usually containing NLRI, next hop, and AS
and AS path. The tuple may also contain other BGP attributes such as path as defined in Section 4.4.3.
MULTI_EXIT_DISCRIMINATOR (MED).
stability figure of merit (figure-of-merit) stability figure of merit (figure-of-merit)
Each route must have a stability figure of merit per applicable Each route must have a stability figure of merit per applicable
parameter set. parameter set.
last time updated (time-update) last time updated (time-update)
The exact last time updated must be maintained to allow exponential The exact last time updated must be maintained to allow exponential
decay of the accumulated figure of merit to be deferred until the decay of the accumulated figure of merit to be deferred until the
route might reasonable be considered eligible for a change in route might reasonable be considered eligible for a change in
status (having gone from unreachable to reachable or advancing status (having gone from unreachable to reachable or advancing
within the reuse lists). within the reuse lists).
config block pointer config block pointer
Any implementation that supports multiple parameter sets must Any implementation that supports multiple parameter sets must
provide a means of quickly identifying which set of parameters provide a means of quickly identifying which set of parameters
skipping to change at page 17, line 22 skipping to change at page 18, line 45
This example is used in later sections. The use of multiple parameter This example is used in later sections. The use of multiple parameter
sets complicates the examples somewhat. Where multiple parameter sets sets complicates the examples somewhat. Where multiple parameter sets
are allowed for a single route, the decay portion of the algorithm is are allowed for a single route, the decay portion of the algorithm is
repeated for each parameter set. If different routes are allowed to repeated for each parameter set. If different routes are allowed to
have different parameter sets, the routes must have pointers to the have different parameter sets, the routes must have pointers to the
parameter sets to keep the time to locate to a minimum, but the parameter sets to keep the time to locate to a minimum, but the
algorithms are otherwise unchanged. algorithms are otherwise unchanged.
A sample set of configuration parameters and a sample set of A sample set of configuration parameters and a sample set of
implementation parameters are provided in in the two following list. implementation parameters are provided in in the two following lists.
1. Configuration Parameters 1. Configuration Parameters
o cut = 1.25 o cut = 1.25
o reuse = 0.5 o reuse = 0.5
o T-hold = 15 mins o T-hold = 15 mins
o decay-ok = 5 min o decay-ok = 5 min
skipping to change at page 18, line 4 skipping to change at page 19, line 31
o delta-t = 1 sec o delta-t = 1 sec
o delta-reuse o delta-reuse
o reuse-list-size = 256 o reuse-list-size = 256
o reuse-index-array-size = 1,024 o reuse-index-array-size = 1,024
Using these configuration and implementation parameters and the Using these configuration and implementation parameters and the
equations in Section 4.5, the space overhead can be computed. There equations in Section 4.5, the space overhead can be computed. There
Figure 3: Some fairly long route flap cycles, repeated for 12
minutes, followed by a period of stability.
is a fixed space overhead that is independent of the number of routes. is a fixed space overhead that is independent of the number of routes.
There is an space requirement associated with a stable route. There There is a space requirement associated with a stable route. There is
is a larger space requirement associated with an unstable route. The a larger space requirement associated with an unstable route. The
space requirements for the parameters above are provide in the lists space requirements for the parameters above are provide in the lists
below. below.
1. fixed overhead (using parameters from previous example) 1. fixed overhead (using parameters from previous example)
o 900 * integer - decay array o 900 * integer - decay array
o 1,800 * integer - decay array o 1,800 * integer - decay array
o 120 * pointer - reuse list-heads o 120 * pointer - reuse list-heads
o 2,048 * integer - reuse index arrays o 2,048 * integer - reuse index arrays
2. overhead per stable route 2. overhead per stable route
o pointer - containing null entry o pointer - containing null entry
FIGURES ARE FOUND IN THE POSTSCRIPT AND HTML VERSIONS ONLY
Figure 3: Some fairly long route flap cycles, repeated for 12
minutes, followed by a period of stability.
3. overhead per unstable route 3. overhead per unstable route
o pointer - to a damping structure containing the following o pointer - to a damping structure containing the following
o integer - figure of merit + bit for state o integer - figure of merit + bit for state
o integer - last time updated o integer - last time updated
o pointer (optional) to configuration parameter block o pointer (optional) to configuration parameter block
skipping to change at page 19, line 32 skipping to change at page 21, line 14
2. A route becomes unreachable (Section 4.8.2) 2. A route becomes unreachable (Section 4.8.2)
3. A route becomes reachable again (Section 4.8.3) 3. A route becomes reachable again (Section 4.8.3)
4. A route changes (Section 4.8.4) 4. A route changes (Section 4.8.4)
5. A peer goes down (Section 4.8.5) 5. A peer goes down (Section 4.8.5)
The reuse list is used to provide a means of fast evaluation of route The reuse list is used to provide a means of fast evaluation of route
that had been suppressed, but had been stable long enough to be reused that had been suppressed, but had been stable long enough to be reused
again. The following two operations are described. again or had been suppressed long enough that it can be treated as a
new route. The following two operations are described.
1. Inserting into a reuse list (Section 4.8.6) 1. Inserting into a reuse list (Section 4.8.6)
2. Reuse list processing every delta-t seconds (Section 4.8.7) 2. Reuse list processing every delta-t seconds (Section 4.8.7)
4.8.1 Processing a New Peer or New Routes 4.8.1 Processing a New Peer or New Routes
When a peer comes up, no action is required if the routes had no When a peer comes up, no action is required if the routes had no
previous history of instability, for example if this is the first time previous history of instability, for example if this is the first time
the peer is coming up and announcing these routes. For each route, the peer is coming up and announcing these routes. For each route,
skipping to change at page 22, line 20 skipping to change at page 23, line 47
} }
3. if ( not suppressed and figure-of-merit < cut ) { 3. if ( not suppressed and figure-of-merit < cut ) {
usethe route usethe route
}else if( suppressedand figure-of-merit< reuse) { }else if( suppressedand figure-of-merit< reuse) {
setstate tonot suppressed setstate tonot suppressed
removethe routefrom areuse listif itis on one remove the route from a reuse list
usethe route usethe route
}else { }else {
setstate tosuppressed setstate tosuppressed
don'tuse theroute don'tuse theroute
insertinto areuse list(see Section4.8.6) insertinto areuse list(see Section4.8.6)
skipping to change at page 22, line 51 skipping to change at page 24, line 33
zeropointer todamping struct zeropointer todamping struct
} }
If the route is deemed usable, a search for the current best route If the route is deemed usable, a search for the current best route
must be made. The newly reachable route is then evaluated according must be made. The newly reachable route is then evaluated according
to the BGP protocol rules for route selection. to the BGP protocol rules for route selection.
If the new route is usable, the previous best route is examined. If the new route is usable, the previous best route is examined.
Prior to coute comparisons, the current best route may have to be Prior to route comparisons, the current best route may have to be
reevaluated if separate parameter sets are used depending on the reevaluated if separate parameter sets are used depending on the
presence or absence of an alternate route. If there had been no presence or absence of an alternate route. If there had been no
alternate the previous best route may be suppressed. alternate the previous best route may be suppressed.
If the new route is to be suppressed it is placed on a reuse list only If the new route is to be suppressed it is placed on a reuse list only
if it would have been preferred to the current best route had the new if it would have been preferred to the current best route had the new
route been accepted as stable. There is no reason to queue a route on route been accepted as stable. There is no reason to queue a route on
a reuse list if after the route becomes usable it would not be used a reuse list if after the route becomes usable it would not be used
anyway due to the existence of a more preferred route. Such a route anyway due to the existence of a more preferred route. Such a route
would not have to be reevaluated unless the preferred route became would not have to be reevaluated unless the preferred route became
skipping to change at page 23, line 27 skipping to change at page 25, line 11
If a route is replaced by a peer router by supplying a new path, the If a route is replaced by a peer router by supplying a new path, the
route that is being replaced should be treated as if an unreachable route that is being replaced should be treated as if an unreachable
were received (see Section 4.8.2). This will occur when a peer were received (see Section 4.8.2). This will occur when a peer
somewhere back in the AS path is continuously switching between two AS somewhere back in the AS path is continuously switching between two AS
paths and that peer is not damping route flap (or applying less paths and that peer is not damping route flap (or applying less
damping). There is no way to determine if one AS path is stable and damping). There is no way to determine if one AS path is stable and
the other is flapping, or if they are both flapping. If the cycle is the other is flapping, or if they are both flapping. If the cycle is
sufficiently short compared to convergence times neither route through sufficiently short compared to convergence times neither route through
that peer will deliver packets very reliably. Since there is no way that peer will deliver packets very reliably. Since there is no way
to affect the peer such that it choses the stable of the two AS paths, to affect the peer such that it chooses the stable of the two AS
the only viable option is to penalize both routes by considering each change paths, the only viable option is to penalize both routes by considering
as an unreachable followed by a route advertisement. each change as an unreachable followed by a route advertisement.
4.8.5 Processing A Peer Router Loss 4.8.5 Processing A Peer Router Loss
When a peer routing session is broken, either all individual routes When a peer routing session is broken, either all individual routes
advertised by that peer may be marked as unstable, or the peering advertised by that peer may be marked as unstable, or the peering
session itself may be marked as unstable. Marking the peer will save session itself may be marked as unstable. Marking the peer will save
considerable memory. Since the individual routes are advertised as considerable memory. Since the individual routes are advertised as
unreachable to routers beyond the immediate problem, per route state unreachable to routers beyond the immediate problem, per route state
will be incurred beyond the peer immediately adjacent to the BGP will be incurred beyond the peer immediately adjacent to the BGP
session that went down. If the instability continues, the immediately session that went down. If the instability continues, the immediately
skipping to change at page 25, line 14 skipping to change at page 26, line 40
4.8.7 Handling Reuse Timer Events 4.8.7 Handling Reuse Timer Events
The granularity of the reuse timer should be more course that that of The granularity of the reuse timer should be more course that that of
the decay timer. As a result, when the reuse timer fires, suppressed the decay timer. As a result, when the reuse timer fires, suppressed
routes should be decayed by multiple increments of decay time. Some routes should be decayed by multiple increments of decay time. Some
computation can be avoided by always inserting into the reuse list computation can be avoided by always inserting into the reuse list
corresponding to one time increment past reuse eligibility. In cases corresponding to one time increment past reuse eligibility. In cases
where the reuse lists have a longer ``memory'' than the ``decay where the reuse lists have a longer ``memory'' than the ``decay
memory'' (described above), all of the routes in the first queue will memory'' (described above), all of the routes in the first queue will
be available for immediate reuse. be available for immediate reuse if reachable or the history entry
could be disposed of if unreachable.
When it is time to advance the lists, the first queue on the reuse When it is time to advance the lists, the first queue on the reuse
list must be processed and the circular queue must be rotated. Using list must be processed and the circular queue must be rotated. Using
an array and an offset as a circular array (as described in an array and an offset as a circular array (as described in
Section 4.8.6), the algorithm below is repeated every t-reuse seconds. Section 4.8.6), the algorithm below is repeated every t-reuse seconds.
1. save a pointer to the current zeroth queue head and zero the list 1. save a pointer to the current zeroth queue head and zero the list
head entry head entry
2. set offset = modulo reuse-list-size ( offset + 1 ), thereby 2. set offset = modulo reuse-list-size ( offset + 1 ), thereby
skipping to change at page 25, line 50 skipping to change at page 27, line 32
else else
re-insertinto anotherlist (seeSection 4.8.6) re-insertinto anotherlist (seeSection 4.8.6)
} }
The value of the zeroth list head would be saved and the array entry The value of the zeroth list head would be saved and the array entry
itself zeroed. The list heads would then be advanced by incrementing itself zeroed. The list heads would then be advanced by incrementing
the offset. Starting with the saved head of the old zeroth list, each the offset. Starting with the saved head of the old zeroth list, each
route would be reevaluated and used or requeued if it were not ready route would be reevaluated and used, disposed of entirely or requeued
for reuse. If a route is used, it must be treated as if it were a new if it were not ready for reuse. If a route is used, it must be
route advertisement as described in Section 4.8.3. treated as if it were a new route advertisement as described in
Section 4.8.3.
5 Implementation Experience 5 Implementation Experience
The first implementations of ``route flap damping'' were the route The first implementations of ``route flap damping'' were the route
server daemon (rsd) coding by Ramesh Govindan (ISI) and the Cisco IOS server daemon (rsd) coding by Ramesh Govindan (ISI) and the Cisco IOS
implementation by Ravi Chandra. Both implementations first became implementation by Ravi Chandra. Both implementations first became
available in 1995 and have been used extensively. The rsd available in 1995 and have been used extensively. The rsd
implementation has been in use in route servers at the NSF funded implementation has been in use in route servers at the NSF funded
Network Access Points (NAPs) and at other major Internet Network Access Points (NAPs) and at other major Internet
interconnects. The Cisco IOS version has been in use by Internet interconnects. The Cisco IOS version has been in use by Internet
Service Providers worldwide. The rsd implementation has been Service Providers worldwide. The rsd implementation has been
integrated in releases of gated (see http://www.gated.org) and is integrated in releases of gated (see http://www.gated.org) and is
available in commercial routers using gated. available in commercial routers using gated.
There are now more than 2 years of BGP route damping deployment
experience. Some problems have occurred in deployment. So far these
are solvable by careful implementation of the algorithm and by careful
deployment. In some topologies coordinated deployment can be helpful
and in all cases disclosure of the use of route damping and the param-
eters used is highly beneficial in debugging connectivity problems.
Some of the problems have occurred due to subtle implementation
errors. Route damping should never be applied on IBGP learned routes.
To do so can open the possibility for persistent route loops.
Implementations should disallow this configuration. Penalties for
flapping should only be applied when a route is removed or replaced
and not when a route is added. If damping parameters are applied
consistently, this implementation constraint will result in a stable
secondary path being preferred over an unstable primary path due to
damping of the primary path near the source.
In topologies where multiple AS paths to a given destination exist
flapping of the primary path can result in suppression of the
secondary path. This can occur if no damping is being done near the
cause of the route flap or if damping is being applied more
aggressively by a distant AS. This problem can be solved in one of two
ways. Damping can be done near the source of the route flap and the
damping parameters can be made consistent. Alternately, a distant AS
which insists on more aggressive damping parameters can disable
penalizing routes on AS path change, penalizing routes only if they
are withdrawn completely. In order to do so, the implementation must
support this option (as described in Section 4.4.3).
Route flap should be damped near the source. Single homed
destinations can be covered by static routes. Aggregation provides
another means of damping. Providers should damp their own internal
problems, however damping on IGP link state origination is not yet
implemented by router vendors. Providers which use multiple AS within
their own topology should damp between their own AS. Providers should
damp adjacent providers AS.
Damping provides a means to limit propagation excessive route change
when connectivity is highly intermittent. Once a problem is
corrected, select damping state can be manually cleared. In order to
determine where damping may have occurred after connectivity problems,
providers should publish their damping parameters. Providers should
be willing to manually clear damping on specific prefixes or AS paths
at the request of other providers when the request is accompanied by
assurance that the problem has truly been addressed.
By damping their own routing information, providers can reduce their
own need to make requests of other providers to clear damping state
after correcting a problem. Providers should be pro-active and
monitor what prefixes and paths are suppressed in addition to
monitoring link states and BGP session state.
Acknowledgements Acknowledgements
This work and this document may not have been completed without the This work and this document may not have been completed without the
advise, comments and encouragement of Yakov Rekhter (Cisco). Dennis advise, comments and encouragement of Yakov Rekhter (Cisco). Dennis
Ferguson (MCI) provided a description of the algorithms in the gated Ferguson (MCI) provided a description of the algorithms in the gated
BGP implementation and many valuable comments and insights. David BGP implementation and many valuable comments and insights. David
Bolen (ANS) and Jordan Becker (ANS) provided valuable comments, Bolen (ANS) and Jordan Becker (ANS) provided valuable comments,
particularly regarding early simulations. At the time of this writing particularly regarding early simulations. Over four years elapsed
two implementations exists. One was led by Ramesh Govindan (ISI) for between the initial draft presented to the BGP WG (October 1993) and
the NSF Routing Arbiter project. The second was led by Ravi Chandra this iteration. At the time of this writing there is significant
(Cisco). Sean Doran (Sprintlink) and Serpil Bayraktar (ANS) were experience with two implementations, each having been deployed since
among the early independent testers of the Cisco pre-beta 1995. One was led by Ramesh Govindan (ISI) for the NSF Routing Ar-
implementation. biter project. The second was led by Ravi Chandra (Cisco). Sean Doran
(Sprintlink) and Serpil Bayraktar (ANS) were among the early independent
testers of the Cisco pre-beta implementation. Valuable comments and im-
plementation feedback were shared by many individuals on the IETF IDR WG
and the RIPE Routing Work Group and in NANOG and IEPG.
References References
[1] P. Gross and Y. Rekhter. Application of the border gateway proto- [1] P. Gross and Y. Rekhter. Application of the border gateway proto-
col in col in
the internet. Request for Comments (Draft Standard) RFC 1268, In- the internet. Request for Comments (Draft Standard) RFC 1268, In-
ternet Engineering Task Force, October 1991. (Obsoletes RFC1164); ternet Engineering Task Force, October 1991. (Obsoletes RFC1164);
(Obsoleted by RFC1655). ftp://ds.internic.net/rfc/rfc1268.txt. (Obsoleted by RFC1655). ftp://ds.internic.net/rfc/rfc1268.txt.
[2] ISO/IEC. Iso/iec 10747 - information technology - telecommunica- [2] ISO/IEC. Iso/iec 10747 - information technology - telecommunica-
tions and information exchange between systems - protocol for tions and information exchange between systems - protocol for
exchange of inter-domain routeing information among intermediate exchange of inter-domain routeing information among intermediate
systems to support forwarding of iso systems to support forwarding of iso
8473 pdus. Technical report, International Organization for Stan- 8473 pdus. Technical report, International Organization for Stan-
dardization, August 1994. ftp://merit.edu/pub/iso/idrp.ps.gz. dardization, August 1994. ftp://merit.edu/pub/iso/idrp.ps.gz.
[3] K. Lougheed and Y. Rekhter. A border gateway protocol 3 (BGP-3). [3] K. Lougheed and Y. Rekhter. A border gateway protocol 3 (BGP-3).
Request for Comments (Draft Standard) RFC 1267, In- Request for Comments (Draft Standard) RFC 1267, In-
ternet Engineering Task Force, October 1991. (Obsoletes RFC1163). ternet Engineering Task Force, October 1991. (Obsoletes RFC1163).
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/