Core network done! Last update before summer.

by Fredrik "Hugge" Korsbäck on 27 Jun 2016

Before the vacation period kicks in, I wanted to share the latest news from the SUNET-C project.

Our first major milestone has finally happened!

A few days ago our core network became fully built out with the arrival of the last two core routers, one in Stockholm and one in Narvik (Norway). We still have a few open incidents on bad fiber and on sites not living up to our standards, but we can at least get bits to flow through the whole network, so all paths are alive and kicking, even if some are not performing fully. We don't expect to be fully satisfied with all "layer 1" deliveries until next year.

To visualize the core, we built an MPLS tunnel through the whole network that touches every single core interface. It might look a bit silly, but it is actually a great way to find faulty links (which we did on one of the links in Halmstad, as you can see in the MTR output below). With one ping you can test your whole network for functionality, which can come in handy.

Screen Shot 2016-06-27 at 14.27.09
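
The idea of one tunnel exercising every link can be sketched in a few lines of Python. This is only an illustration, not how the SUNET tunnel was built: the router names and topology are invented, and it assumes at most one link between any pair of routers. Given the list of core links and the ordered hops of a tunnel, it reports which links the tunnel never crosses.

```python
def uncovered_links(links, path):
    """Return the core links that the tunnel path never traverses."""
    # Each consecutive pair of hops is one traversed link (direction ignored).
    traversed = {frozenset(pair) for pair in zip(path, path[1:])}
    return {frozenset(link) for link in links} - traversed


# Hypothetical four-router ring with one cross link; all names are invented.
links = [("r1", "r2"), ("r2", "r3"), ("r3", "r4"), ("r4", "r1"), ("r1", "r3")]
path = ["r1", "r2", "r3", "r4", "r1", "r3"]

missing = uncovered_links(links, path)
# An empty list means the tunnel crosses every link in this toy topology.
print(sorted(tuple(sorted(link)) for link in missing))
```

If the result is empty, a single ping through the tunnel exercises every link in the toy topology. Any link that shows up in the result would need the tunnel path extended to cover it.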

In this same tunnel we have also connected a 100G traffic generator and run every single interface in the network at 100% utilization. This is also a good way of finding faults, since it shows if traffic is lost somewhere. The graph below shows the total utilization of tug-r1, which carries the tunnel three times through the router and therefore sees 600G in total utilization.

Screen Shot 2016-06-27 at 14.38.29

Speaking of tug-r1 (one of the Juniper MX2020 core routers in Stockholm), it took quite a logistical effort just to get the router installed. To avoid breaking any fragile network engineers' backs, we contacted a firm that usually moves pianos to haul it up to the datacenter at Tulegatan. A stripped MX2020 weighs a few hundred kilos, but with the right tools it's not a problem getting it up to the second floor without an elevator or big enough doors.


The provisioning of the configuration in the whole backbone is almost done. There will be a more detailed post later on how we achieve this, but essentially the only configuration we enter by hand in the network is 7 lines that make the router internally reachable so we can provision it centrally. After that we generate most of the other configuration. We use the tool "NSO" from Cisco (which was NCS by Tail-f before the Cisco acquisition). Inside it we either write text-based templates with variables or build services using YANG models, depending on whether the configuration is just plain text or something dynamic (such as BGP peering rules or bandwidth-on-demand type services).

Below is an example of a template that writes generic routing-options settings (inside a logical system). Nothing exotic, but handy and effective. On top of that you build template groups so you can bulk-push all the templates you need if certain criteria are met, and then nested templates... and then... yeah, it can get quite advanced. It's also handy to be able to validate the whole network against the template with one command.

Screen Shot 2016-06-27 at 19.03.33
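
The general idea of a text template with variables, and of validating devices against it, can be sketched with Python's standard string.Template. This is not NSO's template format. The template text, variable names, device names and values below are all assumptions made up for the example.

```python
from string import Template

# Hypothetical template: Junos-style "set" lines with $variables.
ROUTING_OPTIONS = Template(
    "set logical-systems $ls routing-options router-id $router_id\n"
    "set logical-systems $ls routing-options autonomous-system $asn"
)


def render(template, variables):
    """Fill in the variables and return the config as a list of lines."""
    return template.substitute(variables).splitlines()


def validate(template, devices):
    """Map each device name to the template lines missing from its config.

    `devices` maps a name to (variables, running_config_lines).
    Devices that fully match the template are left out of the result.
    """
    report = {}
    for name, (variables, running) in devices.items():
        present = set(running)
        missing = [
            line for line in render(template, variables) if line not in present
        ]
        if missing:
            report[name] = missing
    return report


# Invented example data: one compliant device, one missing its AS number.
devices = {
    "r1": (
        {"ls": "core", "router_id": "192.0.2.1", "asn": 64496},
        [
            "set logical-systems core routing-options router-id 192.0.2.1",
            "set logical-systems core routing-options autonomous-system 64496",
        ],
    ),
    "r2": (
        {"ls": "core", "router_id": "192.0.2.2", "asn": 64496},
        ["set logical-systems core routing-options router-id 192.0.2.2"],
    ),
}

print(validate(ROUTING_OPTIONS, devices))
```

In this toy run only r2 is reported, because its running config lacks the autonomous-system line that the template would generate.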

 

A "service" is slightly more exotic. This one is a BGP peering service for the customer HIG (Högskolan i Gävle), which runs in the old OptoSunet network and is already provisioned. It has slightly more intelligence than the plain configuration templates. Below we have predefined models for import and export rules and use input such as prefix lists and RIPE IRR data (read via the variable "as", which queries the IRR database for information using that ASN). Behind this it generates the config and pushes it out through NETCONF to the corresponding routers (m1fre and m1tug). There is also script chaining behind the services to be able to produce the configuration dynamically and on demand (for example, it triggers when an IRR data change is noticed on our monitored objects).

Also, in reality it does not care what vendor the router is from. That is converted internally using published YANG models (if those exist), but in this case it's Juniper straight through, so it only needs to translate into one set of models.

Screen Shot 2016-06-27 at 16.15.13
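
To give a feel for what such a service does with its inputs, here is a minimal Python sketch that turns a customer name, an ASN and a list of prefixes into Junos-style import filter configuration. It is not the YANG service from the post. The IRR lookup and the NETCONF push are left out, so the prefixes are simply passed in. The naming scheme, the term names, the ASN (taken from the documentation range) and the prefixes are assumptions for illustration only.

```python
import ipaddress


def build_peering_config(customer, asn, prefixes):
    """Build Junos-style set commands for a customer import filter."""
    name = f"{customer}-AS{asn}-IN"  # hypothetical naming scheme

    # Parse, de-duplicate and sort so identical input gives identical config.
    # ip_network() raises ValueError on malformed prefixes or stray host bits.
    networks = sorted(
        {ipaddress.ip_network(p) for p in prefixes},
        key=lambda net: (net.version, net),
    )

    lines = [f"set policy-options prefix-list {name} {net}" for net in networks]
    lines += [
        f"set policy-options policy-statement {name} term allowed from prefix-list {name}",
        f"set policy-options policy-statement {name} term allowed then accept",
        f"set policy-options policy-statement {name} term reject-rest then reject",
    ]
    return lines


# Invented input; in the real service the prefixes would come from IRR data.
config = build_peering_config(
    "HIG", 64496, ["198.51.100.0/24", "192.0.2.0/24", "192.0.2.0/24"]
)
print("\n".join(config))
```

Because the output is deterministic, regenerating it after an IRR change and comparing with the previous result is enough to tell whether anything needs to be pushed.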

This is also constantly sanitized and validated centrally to avoid having "I'm just gonna fix this little thing" hack configurations lying around everywhere in the network, which act as security holes and could also potentially break things completely in the future if not hacked together carefully (which they never are).
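
Spotting that kind of hand-made change boils down to comparing the intended configuration with what is actually running. A minimal sketch using Python's difflib is shown here. It says nothing about how NSO does this internally, and the config lines are invented.

```python
import difflib


def config_drift(intended, running):
    """Return unified-diff lines showing how running differs from intended."""
    return list(
        difflib.unified_diff(
            intended, running, fromfile="intended", tofile="running", lineterm=""
        )
    )


# Invented example: someone enabled telnet by hand on the device.
intended = ["set system host-name r1", "set system services ssh"]
running = intended + ["set system services telnet"]

for line in config_drift(intended, running):
    print(line)
```

Lines starting with "+" exist on the device but not in the intended configuration, which is exactly where the quick fixes show up. An empty result means the device matches.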

We will continue during the summer to build the last templates and services for customers, to make sure that whenever CPE devices are installed out at a university we can achieve almost zero-touch provisioning behaviour. The idea is that the local field engineer consoles in to the router (since we don't have 4G OOB to CPEs as we have in the core), enters a hostname, sets an IP address and enables SSH. The rest of the configuration is then pushed out centrally as soon as the router "calls home", so the field engineer can focus on what he does best: installing things correctly. We use RFC 1918 address space for commissioning, so all access links are already configured with private address space that is routed only internally and used only for commissioning configuration.
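
A central system receiving a "call home" could, for instance, check that the router is coming from commissioning address space before pushing anything. The sketch below shows only that address check, using Python's ipaddress module. The function name and the sample addresses are made up, and nothing here reflects the actual SUNET call-home logic.

```python
import ipaddress

# The three private IPv4 blocks defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address lies in one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


# Invented sample addresses.
for addr in ("10.20.30.40", "172.31.255.254", "172.32.0.1", "192.0.2.1"):
    print(addr, is_rfc1918(addr))
```

The blocks are listed explicitly rather than using the module's is_private attribute, since that attribute also covers ranges such as loopback and link-local that are not part of RFC 1918.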

 


Fredrik "Hugge" Korsbäck

Network architect and chaosmonkey for AS1653 and AS2603. Fluent in BGP. hugge@nordu.net