Yesterday's Facebook outage showed yet again the fragility of the Internet's routing infrastructure. A lot has been written about various deficiencies of BGP, the Border Gateway Protocol. But all too often, the problem isn't the protocol but the people (or scripts) administering the routers. Our ISC website suffered a couple of outages last year due to Verizon misconfiguring BGP (sadly... several times within a few days). Facebook's outage appears to be a misconfiguration as well, according to some early statements from Facebook.
So how do you debug these routing issues, in particular if they are beyond your control? And what do you do next if, for a change, DNS isn't the problem?
One useful tool is the "Looking Glass." These are websites that various ISPs, and in some cases universities and others, have created. They allow you to query the routing table of various routers. Before you read any further: These tools are meant for occasional manual debugging (and most try to enforce this via captchas and rate limits). They are not meant to be used by automated scripts. If you want automated alerting about routing issues, check commercial services like BGPMon, ThousandEyes, and Kentik.
The routing table isn't the same for every router on the Internet. It is always good to query routing issues from different locations, which is why these "Looking Glass" sites are so useful.
First of all: Where do you find them? There is a nice web page, http://www.bgplookingglass.com, that lists public looking glasses. Personally, I like the CenturyLink one (https://lookingglass.centurylink.com). It provides a wide range of locations. Also, it reminds me of Don Smith, who worked for CenturyLink. I will use the CenturyLink site for my examples here.
Let's use "DShield.org" as an example. The current IP address for DShield.org is 126.96.36.199. A quick "whois" shows that the IP address is owned by DigitalOcean and part of AS14061.
Note that the AS information in whois is not always current. But it is a good start to tell you where you *should* find that IP address.
Let us start with that information and see what we get from BGP via CenturyLink:
The output you will get back is essentially what you would have gotten from the router's command line:
No Matching Entries Found? Is DShield.org down? ... No. And this is one of the issues: DigitalOcean owns 188.8.131.52/16, but they choose not to advertise the entire block as a single prefix. They may use different parts of that /16 in different datacenters. One quick way to figure out which prefix our IP is part of is to use Team Cymru's DNS service (they also operate a whois service with the same information, but I prefer the DNS version).
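For reference, the Team Cymru DNS lookup works by reversing the octets of the IPv4 address and asking for the TXT record under `origin.asn.cymru.com`. The sketch below only builds that query name and splits an answer into its fields. The address `192.0.2.10`, the ASN `64500`, and the sample answer string are placeholders from documentation ranges, not real data for DShield.org, and the helper names are my own.

```python
import ipaddress


def cymru_origin_qname(ip: str) -> str:
    """Build the TXT query name for Team Cymru's IPv4 origin lookup."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.origin.asn.cymru.com"


def parse_cymru_txt(txt: str) -> dict:
    """Split an 'ASN | prefix | CC | registry | date' answer into fields."""
    fields = [field.strip() for field in txt.strip('"').split("|")]
    keys = ["asn", "prefix", "country", "registry", "allocated"]
    return dict(zip(keys, fields))


# Placeholder values for illustration only (documentation address and ASN).
qname = cymru_origin_qname("192.0.2.10")
print(f"dig +short {qname} TXT")

sample = '"64500 | 192.0.2.0/24 | ZZ | exampleregistry | 2000-01-01"'
print(parse_cymru_txt(sample)["prefix"])
```

The second field of the answer is the BGP prefix that covers the address, which is the value to feed back into the Looking Glass query.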
So it looks like DigitalOcean uses a /20. Let's redo our query using this /20.
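The relationship between an allocated /16 and a more specific advertised /20 can be checked offline with Python's `ipaddress` module. This is only a sketch with placeholder values: `10.20.0.0/16` and the host `10.20.37.5` are private addresses chosen for illustration, not DigitalOcean's, and `covering_prefix` is a hypothetical helper.

```python
import ipaddress


def covering_prefix(ip: str, prefix_len: int):
    """Return the network of the given length that contains ip."""
    # strict=False accepts a host address and masks off the host bits.
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


allocation = ipaddress.ip_network("10.20.0.0/16")  # placeholder allocation
host = "10.20.37.5"                                 # placeholder host

advertised = covering_prefix(host, 20)
print(advertised)                                    # 10.20.32.0/20
print(advertised.subnet_of(allocation))              # True
print(len(list(allocation.subnets(new_prefix=20))))  # 16
```

A router that only holds the /20 routes has no entry matching the /16 exactly, which is why an exact query for the /16 can come back empty even though the host is reachable.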
We now receive a lengthy response:
The router we selected has multiple "peers." Each peer exchanges routing information with this router, resulting in multiple "RIB-in" entries (RIB = Routing Information Base, the internal database routers use to store routing information). I am only displaying one of the entries above. Discrepancies between these entries could indicate a problem with information received from a particular router. But they do not have to be identical. Sometimes, there may be a good reason for one router to advertise slightly different information.
The important part for us is the "AS-Path" line. I highlighted it above for visibility. It lists the networks that the packet will pass through to reach the destination, starting with the particular router we used to issue this query. In our case, the result is pretty simple. DigitalOcean peers directly with CenturyLink. The AS "Path" in this case is just DigitalOcean's AS, which will receive the packet next.
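What you should be looking for are loops (the same ASN showing up multiple times in an AS-Path with other ASNs in between; the same ASN repeated back-to-back is usually just deliberate path prepending) or packets passing through ASNs you did not expect (for example, in geographic locations that do not make sense).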
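If you have copied an AS-Path out of a Looking Glass page by hand, the two checks can be sketched in a few lines of Python. This assumes the path is a plain space-separated list of AS numbers (no AS-sets in braces, no trailing origin code); the function names and the 645xx documentation ASNs are made up for illustration. Back-to-back repeats are collapsed first, since those are normally intentional prepending rather than a loop.

```python
from typing import List, Set


def parse_as_path(path: str) -> List[int]:
    """Turn a space-separated AS-Path string into a list of ASNs."""
    return [int(token) for token in path.split()]


def collapse_prepends(asns: List[int]) -> List[int]:
    """Drop back-to-back repeats of the same ASN (path prepending)."""
    collapsed: List[int] = []
    for asn in asns:
        if not collapsed or collapsed[-1] != asn:
            collapsed.append(asn)
    return collapsed


def find_loops(asns: List[int]) -> Set[int]:
    """Return ASNs that reappear after a different ASN in between."""
    seen: Set[int] = set()
    loops: Set[int] = set()
    for asn in collapse_prepends(asns):
        if asn in seen:
            loops.add(asn)
        seen.add(asn)
    return loops


def unexpected_asns(asns: List[int], expected: Set[int]) -> List[int]:
    """Return ASNs on the path that are not in the expected set."""
    return [asn for asn in collapse_prepends(asns) if asn not in expected]


print(find_loops(parse_as_path("64500 64501 64501 64502")))  # set()
print(find_loops(parse_as_path("64500 64501 64500 64502")))  # {64500}
print(unexpected_asns(parse_as_path("14061"), {14061}))      # []
```

The last line uses the one-hop path from the example above: the only ASN on it is DigitalOcean's AS14061, so nothing unexpected is reported.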
Are you able to get the same information via "traceroute"? Yes and no. The route displayed by traceroute should follow the route communicated via BGP. But not all routers will send ICMP errors back. Many Looking Glass sites include traceroute as an option, so you may run a traceroute from the router to confirm what you are seeing in BGP. A packet may pass through multiple routers within a single AS, so you will see more "hops" with traceroute, and traceroute may identify issues within an AS that are not necessarily visible in BGP.
Oct 5th 2021