Thanks to Kent England for taking notes!
examples are given that are under discussion in ANS
coding for this is quite easy
impact of configuration generation to be determined
Tony Bates [TB]: MCI registry has community specifics as well.
[TB]: We don't know the topology, so that last example needs a cautionary label.
[CV]: We don't want to set these attributes arbitrarily. Major providers should understand their topology well enough to properly set proxy aggregation boundaries.
[TB]: We need to unify the semantics.
Sean Doran [SD] w/o slides: This talk sounds like IRR versus non-IRR folks. :-)
SprintLink has four levels of aggregation:
Item 4) has helped the European routers, but there are operational difficulties. Our ICM customers need full routes, but require proxy aggregation.
Route Servers can't deal with the exponential growth of prefixes.
The key issue is MAKE GROWTH LINEAR, however it's done.
Route Servers might buy us some time but they don't solve the problem. Route damping is also a short term solution and Route Servers may help with flap damping.
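As an illustration of the flap damping idea mentioned here (not something shown in the session), a minimal sketch: each flap adds a penalty, the penalty decays exponentially over time, and a route is suppressed while the penalty stays high. The class name `DampedRoute` and every numeric parameter below are hypothetical values chosen for the example, not any vendor's or route server's settings.

```python
from dataclasses import dataclass


@dataclass
class DampedRoute:
    # All thresholds are illustrative assumptions, not real defaults.
    half_life: float = 900.0          # seconds for the penalty to halve
    penalty_per_flap: float = 1000.0  # added on every flap
    suppress_at: float = 2000.0       # suppress once penalty reaches this
    reuse_at: float = 750.0           # re-advertise once penalty drops below this
    penalty: float = 0.0
    last_update: float = 0.0
    suppressed: bool = False

    def _decay(self, now: float) -> None:
        """Apply exponential decay for the time elapsed since the last update."""
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now: float) -> None:
        """Record one flap (withdraw/re-announce) at time `now`, in seconds."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_at:
            self.suppressed = True

    def usable(self, now: float) -> bool:
        """Return True if the route may be advertised at time `now`."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_at:
            self.suppressed = False
        return not self.suppressed


route = DampedRoute()
for t in (0.0, 10.0, 20.0):   # three flaps in quick succession
    route.flap(t)
print(route.usable(30.0))      # suppressed shortly after the flaps
print(route.usable(3630.0))    # usable again after an hour of quiet
```

A single flap leaves the route usable; only repeated flaps within a short window push the penalty past the suppress threshold, which is why damping buys time without addressing table growth.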
Curtis and Tony understand my routing policy syntax, but I'd rather solve problems than spend time standardizing syntax.
[TB]: This is a very good area for the IRR. Good documentation would help. Documenting SprintLink's aggregation would make things easier for others.
[SD]: Well, I published it on cidrd. Some level of documentation is worthwhile, but there are more important things to do.
[TB]: Operations is important, but so is documentation.
[SB]: Let's move to the issue of the increasing number of paths. There are now 30,000 paths in the global routing table. What is the upper bound to the number of prefixes we can handle?
Andrew Partan [AP]: The AGS+ defines the problem. A 16M AGS+ can handle 28k routes. So we've passed the upper limit.
[SB]: How many providers in the audience use AGS+?
SD and AP raised their hands.
[SB]: How many of you use 7000s with SSP?
[Audience]: Many hands.
[SB]: How many of you use 7000s without SSP?
[Audience]: Many hands.
[TB]: SSPs have nothing to do with this.
[SB]: So what is the upper bound for cisco 7000s?
[TB]: Our 7000s consume 17M of memory to carry full routes.
[SD]: 60k routes is the limit of the CPU. Memory can support 100k routes.
[CV]: A route server would help on the CPU limit.
[AP]: ATM would help! :-)
[CV]: Since the cisco 7000s have a CPU limit, do the flap damping in the route servers.
[AP]: The RA reported only one operational NAP and there are many other places where route servers won't help.
[CV]: But the high flap rates are seen at the IXPs.
[AP]: No, the flaps come from customers. We see it at the customer. ... more back and forth ...
[SD]: No one has used the route servers. We don't know their failure modes, operational problems, etc., so I don't want to switch over to using route servers. We have experimented with Netcom and route servers, though.
Elise Gerich [EG]: Let me step in here and clarify what I said about the operational status of the route servers at the NAPs. The route servers are operational, the code is out there. We don't have all our dial-up connections, so technically the RSs are not fully operational, but there is nothing more to be done with the route server hardware and software except for folks to begin using them. ESnet compares real router tables to the route server tables, so people are beginning to use them. The Routing Arbiter provides a service and you use route servers like you use routers. You install cisco software and work with the problems. It's the same with route server code. But folks need to start using them, gain experience with them and help us improve them. It's no different than with routers.
[CV]: I see two issues with the new route servers: new code and the hardware. But there is no problem with the gated code. I think the key question is: Is the RADB accurate? Currently we trust other sources more than the RADB. But we need route servers to scale up the Internet.
[SB]: Who in the audience is using route servers, even just for comparison?
[Audience]: about four hands.
[SB]: Who refuses to use route servers?
[Audience]: No one admitted they wouldn't.
[SB]: Are there ways to reduce paths at specific routers?
[CV]: The number of paths is reduced by reducing the number of peering sessions.
[SD]: I agree, but the route server still needs to see multiple views.
[CV]: The route server has 500M of memory.
[SD]: Multiple paths don't overload ciscos. The route server looks attractive for N**2 peering but SprintLink doesn't expect large numbers of peering sessions.
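The N**2 remark can be made concrete with a small calculation (an illustration, not from the session): with full bilateral peering among n providers at an exchange, the number of sessions grows as n(n-1)/2, while peering only with a route server needs one session per provider. The function names are hypothetical.

```python
def full_mesh_sessions(n: int) -> int:
    """Bilateral sessions when each of n providers peers with every other."""
    return n * (n - 1) // 2


def route_server_sessions(n: int) -> int:
    """Sessions when each of n providers peers only with one route server."""
    return n


for n in (4, 10, 30):
    print(f"{n:>3} providers: full mesh {full_mesh_sessions(n):>4}, "
          f"route server {route_server_sessions(n):>3}")
```

For a handful of peers the difference is small, which is consistent with the point that a provider expecting few peering sessions gains little from this.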
[SB]: So, in summary, implementations differ in the efficiency of handling multiple paths.
[CV]: We put 128M in our router at MAE East to handle lots of peers.
[AP]: I have more operational and management problems for large numbers of peering sessions than I do with router memory and implementation specifics.
[SB]: Let's move to the issue of CIDR. Are we reaching the point of diminishing return on CIDR? How are we to find suitable targets for aggregation? Is it time to invent something new, as discussed on big-internet?
Bill Manning [BM]: CIDR is well defined and it's clear how to use it, but getting people to use it is a social issue. It's way past time to invent something new. We don't want to be dealing with /96 prefixes out of 128. :-)
[CV]: Being able to record aggregation in one place and generating configs from that database is very helpful.
[SB]: How many in the audience are doing CIDR?
[Audience]: Most raised hands.
[SB]: How many are NOT doing CIDR?
[Audience]: BM admitted he isn't. :-)
[TB]: There is only so much we can do with tools and pointers. We need to address the documentation and training issues. There are many vendors who don't understand classless routing.
Sue Hares [SH]: I'm married to an X.25 man who is converting a large network to TCP/IP. He tells the customer, who is sophisticated but doesn't know the Internet, that aggregation is good, but the customer doesn't understand the rules. Our customers may be very smart about networking, but since our rules aren't written down, they can't be understood by newcomers from different cultures like X.25.
[SB]: Let's discuss proxy aggregation. Are we using it? Is unauthorized proxy aggregation bad?
Matt Mathis [MM]: If you do unauthorized aggregation you usually find the party aggregated is in the wrong and will aggregate for themselves. The nanog list is the appropriate forum to discuss proxy aggregation.
[CV]: You need to talk to your customers about their back doors. You can't just pick a large number [prefix length]. The issue really isn't about four Class Cs -- it's about large providers who don't aggregate. You need to examine the other party's routers and policy. You need a database to examine and understand policy and choose an aggregation point and prefix length.
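To illustrate the "four Class Cs" case and why a prefix length can't just be picked (a sketch, not from the session): four contiguous, aligned /24s collapse into one /22, but if one of them is actually reached through a back door elsewhere, announcing the /22 by proxy would cover space that is not behind the aggregator. The prefixes below are hypothetical private addresses and `covers_exactly` is a helper invented for the example; only Python's standard `ipaddress` module is used.

```python
import ipaddress

# Four hypothetical contiguous "Class C" networks behind one provider.
class_cs = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(4, 8)]

# collapse_addresses merges adjacent networks into the shortest covering set.
merged = list(ipaddress.collapse_addresses(class_cs))
print(merged)


def covers_exactly(prefixes, aggregate) -> bool:
    """True only if `prefixes` together fill `aggregate` with no gaps or extras."""
    return list(ipaddress.collapse_addresses(prefixes)) == [aggregate]


aggregate = ipaddress.ip_network("192.168.4.0/22")

# If 192.168.6.0/24 is really reached via a back door elsewhere, the
# remaining three networks no longer justify announcing the /22.
without_back_door = [n for n in class_cs
                     if n != ipaddress.ip_network("192.168.6.0/24")]

print(covers_exactly(class_cs, aggregate))
print(covers_exactly(without_back_door, aggregate))
print(list(ipaddress.collapse_addresses(without_back_door)))
```

A real check would need the kind of policy database discussed here; this only shows the address arithmetic.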
[MM]: We need tools to find the side doors.
[SB]: Will the Merit tools we heard about help?
[EG]: Yes.
[MM]: Only if the database is accurate.
[SB]: Merit tools use real routes too.
[??]: We recently got a customer who was part of a proxy aggregation and it took us a week to figure it out.
[SB]: So, should we use the nanog list to discuss folks' proxy aggregations?
[Audience]: Yes.
[MM]: We need a procedure to announce the intent to proxy aggregate and solicit feedback prior to implementation.
[TB]: We still have a lot to gain from encouraging aggregation by customers and providers. Proxy aggregation is dangerous. We need to encourage self-aggregation.
[SB]: Do you think the CIDR FAQ is useful?
[Audience]: Not much response.