caida.parse

This FIREWHEEL model component is designed to import and process Center for Applied Internet Data Analysis (CAIDA) Autonomous System (AS) trace data into a FIREWHEEL graph. The plugin reads AS links and “prefix-to-AS” mappings, generates AS links, assigns BGP networks, and removes OSPF information.

Attribute Provides:
  • internet_ases

  • internet_as_annotation

Attribute Depends:
  • graph

Model Component Dependencies:

Getting Required Data

The data used is from two CAIDA datasets:

  1. ARK IPv4 AS Links - We combine data from all three teams. - URL: https://publicdata.caida.org/datasets/topology/ark/ipv4/as-links/

  2. Routeviews Prefix-to-AS Mappings for IPv4 and IPv6 - URL: https://publicdata.caida.org/datasets/routing/routeviews-prefix2as/

We typically use a one-month sample from each dataset and then process the data into a single file. This process is currently automated and uses data from August 2018. To replicate it with a different set of data, please review the INSTALL file.

Seg Fault Issue

This model component was previously built using pysubnettree, but that package occasionally caused segmentation faults, so it now uses pytricia. If pytricia is ever deprecated, moving back to pysubnettree is a potential option. If the model component fails silently, adding the following lines to the top of plugin.py could help debug:

import faulthandler
faulthandler.enable()

Plugin

class caida.parse_plugin.ParseCAIDA(graph, log)[source]

Bases: AbstractPlugin

This imports a CAIDA AS trace into the graph.

This will walk through the CAIDA trace files and create a BGP router for each AS number. Any links in the traces will be placed into the graph appropriately.

The attributes that it sets in the graph vertices are as follows:

Router:
  • type - Always set to "router"

  • as - The router’s AS number

  • interfaces - The active interfaces for the router. Format: { <NUM>: { 'netmask', 'name', 'address' } }

  • bgp - The BGP neighbor information for this router. Format: { <NUM>: { 'as', 'address' } }

  • bgp_networks - The networks to be explicitly advertised by BGP. Format: { <NUM>: { 'netmask', 'address' } }

  • new - Indicates that this is a new device and needs to be processed by the other plugins. (It is set to True)
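As an illustration of the attribute formats above, a router vertex's attributes might look like the following dictionary. This is only a sketch: every value is a hypothetical placeholder (documentation-reserved addresses and AS numbers), and the key and value types (integer indices, string AS numbers, dotted netmasks) are assumptions rather than something this page specifies.

```python
# Hypothetical example of the attributes set on a router vertex.
# All concrete values and types below are illustrative assumptions.
router_attrs = {
    "type": "router",
    "as": "64500",
    "interfaces": {
        0: {"name": "eth0", "address": "192.0.2.1", "netmask": "255.255.255.0"},
    },
    "bgp": {
        0: {"as": "64501", "address": "192.0.2.2"},
    },
    "bgp_networks": {
        0: {"address": "198.51.100.0", "netmask": "255.255.255.0"},
    },
    "new": True,
}
```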

Link:
  • new - Indicates that this is a new device and needs to be processed by the other plugins. (It is set to True)

_get_AS_list(as_str)[source]

Return a list of the ASes referenced, including multi-origin AS (MOAS) and sets.

The CAIDA syntax for MOAS is X_Y_Z (AS numbers joined by underscores), while the members of an AS set are separated by commas. This method returns each AS number as if it were independent (effectively ignoring MOAS).

Parameters:

as_str (str) – The string representing AS numbers.

Returns:

A list of AS numbers.

Return type:

list
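The splitting behavior described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: split_as_string is a hypothetical helper name, and the AS numbers are documentation-reserved placeholders.

```python
import re


def split_as_string(as_str):
    """Split a CAIDA AS field into individual AS numbers (hypothetical helper)."""
    # "_" joins multi-origin ASes and "," separates members of an AS set.
    # Both are treated as plain delimiters, so MOAS grouping is discarded.
    return [part for part in re.split(r"[_,]", as_str.strip()) if part]


print(split_as_string("64500_64501,64502"))  # ['64500', '64501', '64502']
```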

_get_AS_name(as_number)[source]

Return the AS name in the graph for the given AS number.

Parameters:

as_number (str) – The AS number.

Returns:

The AS name in the graph.

Return type:

str

_get_bgp_net_switch(net)[source]

Return the canonical switch name for the given BGP network.

Parameters:

net (IPNetwork) – The BGP network.

Returns:

The canonical switch name for the BGP network.

Return type:

str

_get_switch_name(from_as, to_as)[source]

Return the switch name in the graph for the given link.

Parameters:
  • from_as (str) – The AS number of the source.

  • to_as (str) – The AS number of the destination.

Returns:

The switch name in the graph.

Return type:

str

assign_bgp_networks(bgp_table)[source]

Assign the given BGP networks to each AS.

Parameters:

bgp_table (str) – The file mapping AS numbers to IP networks.

Generate the appropriate links between ASes in the graph.

Parameters:

aslinks (str) – The AS links file to parse and import.

process_bgp_table_line(line)[source]

Process the given line in the BGP table by adding the BGP networks to the appropriate ASes in the graph.

This line takes the form:

network     cidr        AS

Note

Multi-origin AS (MOAS) is not currently supported. When a line lists more than one AS, the first entry is used.

Parameters:

line (str) – The line from the BGP table representing a network, CIDR, and AS.

Raises:

ValueError – If the line in the BGP table is malformed.
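A line of this form could be parsed roughly as shown below. This is a sketch under stated assumptions: parse_bgp_table_line is a hypothetical name, the standard-library ipaddress module stands in for whatever network type the plugin uses, and the sketch only returns the parsed values rather than updating a graph.

```python
import ipaddress
import re


def parse_bgp_table_line(line):
    """Parse 'network cidr AS' into (network, first AS); hypothetical helper."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"Expected 'network cidr AS', got: {line!r}")
    network, cidr, as_field = fields
    # ip_network raises ValueError for a bad address or prefix length.
    net = ipaddress.ip_network(f"{network}/{cidr}")
    # MOAS is not supported, so only the first listed AS is kept.
    first_as = re.split(r"[_,]", as_field)[0]
    return net, first_as


net, asn = parse_bgp_table_line("192.0.2.0\t24\t64500_64501")
print(net, asn)  # 192.0.2.0/24 64500
```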

Process a direct link input line.

This line takes the form:

D    from_AS    to_AS   monitor_key{i}   ...

We are only interested in from_AS and to_AS at this point. This function will take this line and make sure that both from_AS and to_AS exist as vertices, and then create a link between them if such a link doesn’t exist.

This method also connects all nodes to a giant “control network” via the SWITCH_BGP_CONTROL Switch. The switch is later pruned in caida.prune_routers, but it allows a slightly larger subset of the BGP topology to survive pruning. For example, using the July 2018 data we ran firewheel experiment caida.test_topology caida.prune_routers both with and without SWITCH_BGP_CONTROL. The resulting graph without the switch had 53 nodes and 165 edges, while the graph with the switch contained 64 nodes and 396 edges.

This implementation currently does not handle the case where an AS may have multiple networks associated with it. The linking logic assumes a single BGP connection between two routers, which may not accurately represent the topology in scenarios involving multi-origin AS (MOAS) or multiple networks.

Parameters:

line (str) – The line from the AS links file representing a direct link.
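The core of the direct-link handling (ensure both ASes exist, then add a link only if it is missing) can be sketched with a plain adjacency map. This is an illustration only: add_direct_link and the dictionary-of-sets graph are hypothetical stand-ins for the FIREWHEEL graph, and the control-network switch is not modeled.

```python
from collections import defaultdict


def add_direct_link(adjacency, line):
    """Record a 'D from_AS to_AS ...' line; return True if a new link was added."""
    fields = line.split()
    # Lines of any other type are skipped in this sketch.
    if len(fields) < 3 or fields[0] != "D":
        return False
    from_as, to_as = fields[1], fields[2]
    # Indexing the defaultdict creates the vertex if it does not exist yet.
    is_new = to_as not in adjacency[from_as]
    adjacency[from_as].add(to_as)
    adjacency[to_as].add(from_as)
    return is_new


adjacency = defaultdict(set)
print(add_direct_link(adjacency, "D\t64500\t64501\tkey1"))  # True
print(add_direct_link(adjacency, "D\t64501\t64500\tkey2"))  # False
```

Storing neighbors in sets makes the "create a link only if it doesn't exist" check a simple membership test.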

remove_ospf_info()[source]

Set all OSPF parameters to None.

run(aslinks='/home/runner/work/firewheel/firewheel/.tox/docs/lib/python3.14/site-packages/firewheel_repo_caida/parse/cycle-aslinks.txt.gz', bgp_table='/home/runner/work/firewheel/firewheel/.tox/docs/lib/python3.14/site-packages/firewheel_repo_caida/parse/routeviews.gz')[source]

Parse the given CAIDA information into a BGP network used for a FIREWHEEL experiment. The method generates AS links, assigns BGP networks, and removes OSPF information.

Parameters:
  • aslinks (str) – The AS links file to parse and import. This should be a gzipped file in the format provided by CAIDA. An example filename would be cycle-aslinks.l7.t3.c006830.20180719.txt.gz.

  • bgp_table (str) – The file mapping AS numbers to IP networks. This should be a gzipped file in the format provided by CAIDA. An example filename would be routeviews-rv2-20180731-1200.pfx2as.gz.
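Because both inputs are gzipped text files, a reader along the following lines could be used to inspect them before handing them to the plugin. This is a sketch, not the plugin's code: iter_gzip_lines is a hypothetical helper, and skipping lines that begin with "#" assumes that such lines are comments.

```python
import gzip


def iter_gzip_lines(path):
    """Yield non-empty, non-comment lines from a gzipped text file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Assumption: lines starting with "#" are comments.
            if line and not line.startswith("#"):
                yield line
```

For example, `for line in iter_gzip_lines("cycle-aslinks.l7.t3.c006830.20180719.txt.gz"): ...` would iterate over the records of the example AS links file named above.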

Available Objects

class caida.parse.ASAnnotation(name)[source]

Bases: object

The ASAnnotation class is used to manage and annotate Autonomous System (AS) subnets within a network graph. It leverages the pytricia library to store and retrieve subnet information efficiently.

__init__(name)[source]

Initializes the ASAnnotation instance.

Parameters:

name (str) – The name of the annotation.

add_subnet(new_subnet, as_name, switch)[source]

Adds a new subnet to the annotation tree.

Parameters:
  • new_subnet (netaddr.IPNetwork) – The subnet to be added.

  • as_name (str) – The name of the Autonomous System (AS) associated with the subnet.

  • switch (base_objects.Switch) – The Switch associated with the subnet.

get_as_for_subnet(subnet)[source]

Retrieves the AS name for a given subnet.

Parameters:

subnet (netaddr.IPNetwork) – The subnet for which to retrieve the AS name.

Returns:

The AS name associated with the subnet.

Return type:

str

get_switch_for_subnet(subnet)[source]

Retrieves the switch for a given subnet.

Parameters:

subnet (netaddr.IPNetwork) – The subnet for which to retrieve the switch.

Returns:

The Switch associated with the subnet.

Return type:

base_objects.Switch

is_network_in_tree(subnet)[source]

Checks if a given subnet is present in the annotation tree.

Parameters:

subnet (netaddr.IPNetwork) – The subnet to check.

Returns:

True if the subnet is present in the tree, False otherwise.

Return type:

bool
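The kind of subnet lookup that ASAnnotation provides can be illustrated with a small standard-library stand-in. This sketch deliberately does not use pytricia: PrefixAnnotation is a hypothetical class built on ipaddress with a linear scan, and it assumes longest-prefix matching, which this page does not confirm for the real class.

```python
import ipaddress


class PrefixAnnotation:
    """Hypothetical stand-in mapping subnets to (AS name, switch) pairs."""

    def __init__(self):
        self._entries = {}

    def add_subnet(self, subnet, as_name, switch):
        self._entries[ipaddress.ip_network(subnet)] = (as_name, switch)

    def _lookup(self, subnet):
        target = ipaddress.ip_network(subnet)
        # Keep only stored networks of the same IP version that contain the target.
        matches = [
            net for net in self._entries
            if net.version == target.version and target.subnet_of(net)
        ]
        if not matches:
            return None
        # Assumption: the most specific (longest) prefix wins.
        return self._entries[max(matches, key=lambda net: net.prefixlen)]

    def is_network_in_tree(self, subnet):
        return self._lookup(subnet) is not None

    def get_as_for_subnet(self, subnet):
        entry = self._lookup(subnet)
        return entry[0] if entry else None

    def get_switch_for_subnet(self, subnet):
        entry = self._lookup(subnet)
        return entry[1] if entry else None


annotation = PrefixAnnotation()
annotation.add_subnet("198.51.100.0/24", "as64500", "switch-a")
annotation.add_subnet("198.51.100.128/25", "as64501", "switch-b")
print(annotation.get_as_for_subnet("198.51.100.192/26"))  # as64501
```

A radix tree such as the one pytricia provides answers the same query without scanning every stored prefix, which matters for a full routing table.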