BGP Component
Overview
Enabling the BGP Component
The BGP configuration is held under the top-level bgp object in /etc/flockd/flockd.json. If the bgp object exists, BGP is enabled and the BGP master thread is started.
With this configuration file:
- The BGP master thread will be started.
- The router is in the Autonomous System identified by Autonomous System Number 65016.
- The router has a BGP router identifier of 172.16.10.1.
"bgp": {
"asn": 65016,
"id": "172.16.10.1"
}
Show the status of the BGP component
Check BGP is listed in the enabled_protocols field.
flock@r70:~$ flockc sys overview
{"host_info":{"hostname":"r70", ...},"system_info":{"name":"flockd","description":"The Flock Networks Ltd Routing Suite Daemon", ...},"pid":1234,"log_level":"info","uptime":"days: 0, hours: 0, mins: 0, secs: 19","enabled_protocols":["BGPv4"],"software_errors":0, ...}
Show BGP Overview
flock@r70:~$ flockc bgp overview
{"overview":{"id":"70.0.100.70","asn":70,"cluster_id":null,"route_server":false,"route_reflector":false, ...,"routes":{"unicast_routes":{"default":{"ipv4_unicast":{"route_count":3, ...},"ipv6_unicast":{"route_count":0, ...}}},"vpn_routes":{}},"neighbor_summary":{"default":{"count":4,"established":4,"send_converged":4,"recv_converged":4}}}, ...}
send_converged means all updates have been sent to this neighbor: the neighbor's send update queue is empty. The neighbor may not have received all the updates yet, because they may still be in the local TCP send buffer (or the neighbor's TCP receive buffer).
recv_converged means all available updates from this neighbor have been processed: the neighbor's receive TCP buffer is empty. However, the neighbor may not have managed to send all of its updates yet.
In the output above:
- There are 3 IPv4 Unicast routes in the BGP RIB
- There are 4 neighbors, all of which have reached established state
- All 4 neighbors are send_converged and recv_converged
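The convergence check described above can be scripted against the JSON output. The following is a minimal sketch, assuming the overview output is captured as a string; the function name vrf_converged and the trimmed sample are illustrative and not part of flockc.

```python
import json

# Trimmed sample of `flockc bgp overview` output. Only fields that appear in
# the example above are kept; everything else is omitted.
OVERVIEW_SAMPLE = """
{"overview": {"id": "70.0.100.70", "asn": 70,
  "neighbor_summary": {"default": {"count": 4, "established": 4,
                                   "send_converged": 4, "recv_converged": 4}}}}
"""


def vrf_converged(overview_json: str, vrf: str = "default") -> bool:
    """Return True when every neighbor in the VRF is established and converged.

    Hypothetical helper: it compares each counter in neighbor_summary with
    the total neighbor count.
    """
    summary = json.loads(overview_json)["overview"]["neighbor_summary"][vrf]
    total = summary["count"]
    return all(
        summary[key] == total
        for key in ("established", "send_converged", "recv_converged")
    )


if __name__ == "__main__":
    print(vrf_converged(OVERVIEW_SAMPLE))
```

Note that a True result only reflects the local view: as explained above, updates may still be in flight in TCP buffers.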
Originating Networks
With this configuration file:
- The router originates the 172.16.0.0/16 and fc00:46::/32 networks.
"bgp": {
"asn": 65016,
"id": "172.16.10.1",
"vrfs": {
"default": {
"networks": {
"172.16.0.0/16": {},
"fc00:46::/32": {}
}
}
}
}
By default, a network is only originated when a matching prefix exists in the RIB. To originate a network unconditionally, set originate_always to true:
"networks": {
"172.16.0.0/16": {
"originate_always": true
}
}
flock@r01:~$ flockc bgp vrf default rib lookup 172.16.0.0/16
{"best":[{"neigh_api_key":null, ...,"path_type":"ViaOriginate", ...,"reason":"SelfOriginated", ...}], ...}
rib operates on the IPv4/IPv6 unicast BGP RIB. Use afi-rib <afi> to select a different address family.
Address Families
The following address families can be configured per-neighbor via the afis object:
- ipv4-unicast: IPv4 unicast routes
- ipv6-unicast: IPv6 unicast routes
- ipv4-mplsvpn: IPv4 MPLS VPN routes (L3VPN)
- ipv6-mplsvpn: IPv6 MPLS VPN routes (L3VPN)
- l2vpn-evpn: L2VPN EVPN routes
- ipv4-mpls: IPv4 labeled unicast routes
- ipv6-mpls: IPv6 labeled unicast routes
Configuring Neighbors
With this configuration file:
- The router has a single iBGP neighbor 172.16.20.2.
  - The iBGP connection source is 172.16.20.1 (specified via the neighbor key "172.16.20.2 172.16.20.1").
  - The iBGP connection will advertise IPv4 unicast routes.
  - Routes are advertised over iBGP with a next hop of 172.16.20.1 (next_hop_self).
  - BGP add-path [RFC7911] is enabled for both send and receive.
- The router has a single eBGP neighbor 172.17.20.1 in remote AS 65017.
"bgp": {
"asn": 65016,
"id": "172.16.10.1",
"vrfs": {
"default": {
"neighs": {
"172.16.20.2 172.16.20.1": {
"asn": 65016,
"next_hop_self": true,
"add_path": {
"receive": true,
"transmit": true
},
"afis": {
"ipv4-unicast": {}
}
},
"172.17.20.1": {
"asn": 65017,
"afis": {
"ipv4-unicast": {}
}
}
}
}
}
}
The neighs object is keyed by the neighbor IP address. When a local source IP is needed, the key takes the form "<remote_ip> <local_ip>".
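When generating or validating configuration with a script, the two key forms can be split as follows. This is a minimal sketch; parse_neigh_key and NeighKey are hypothetical names, and the only assumption about the format is the single-space separator shown above.

```python
import ipaddress
from typing import NamedTuple, Optional, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


class NeighKey(NamedTuple):
    remote: IPAddress
    local: Optional[IPAddress]  # None when no source address is given


def parse_neigh_key(key: str) -> NeighKey:
    """Split a neighs key of the form "<remote_ip>" or "<remote_ip> <local_ip>"."""
    parts = key.split()
    if len(parts) not in (1, 2):
        raise ValueError(f"unexpected neighbor key: {key!r}")
    remote = ipaddress.ip_address(parts[0])  # raises ValueError on bad input
    local = ipaddress.ip_address(parts[1]) if len(parts) == 2 else None
    return NeighKey(remote, local)


if __name__ == "__main__":
    print(parse_neigh_key("172.16.20.2 172.16.20.1"))
    print(parse_neigh_key("172.17.20.1"))
```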
Enable Dynamic BGP Neighbors
When the "Dynamic BGP Neighbor" feature is enabled, BGP neighbors are allowed to connect without any explicit configuration.
With this configuration file:
- Only neighbors with a source address in the 192.168.0.0/16 range will be accepted as dynamic neighbors.
- Connecting neighbors will be expected to be part of autonomous system number (ASN) 65073. If the BGP Open message from the neighbor is not from ASN 65073, the connection will be terminated.
"bgp": {
"asn": 65016,
"id": "172.16.10.1",
"vrfs": {
"default": {
"dynamic_neighs": {
"192.168.0.0/16": {
"asn": 65073,
"connect_mode": "Passive",
"afis": {
"ipv4-unicast": {}
}
}
}
}
}
}
The dynamic_neighs object is keyed by the listen range prefix. Each entry specifies the expected ASN, connect mode, and address families.
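The acceptance rule for dynamic neighbors can be illustrated with a short sketch. It is not flockd's implementation: accept_dynamic_neighbor is a hypothetical function, and the choice of taking the first matching listen range (rather than, say, the longest match) is an assumption.

```python
import ipaddress

# Mirrors the dynamic_neighs entry from the configuration above.
DYNAMIC_NEIGHS = {"192.168.0.0/16": {"asn": 65073}}


def accept_dynamic_neighbor(source_ip: str, open_asn: int,
                            dynamic_neighs: dict = DYNAMIC_NEIGHS) -> bool:
    """Return True if a connecting peer matches a listen range and its ASN."""
    addr = ipaddress.ip_address(source_ip)
    for prefix, entry in dynamic_neighs.items():
        net = ipaddress.ip_network(prefix)
        if addr.version == net.version and addr in net:
            # The ASN in the peer's Open message must match the configured ASN.
            return open_asn == entry["asn"]
    # No listen range covers the source address.
    return False


if __name__ == "__main__":
    print(accept_dynamic_neighbor("192.168.5.9", 65073))
    print(accept_dynamic_neighbor("192.168.5.9", 65000))
    print(accept_dynamic_neighbor("10.0.0.1", 65073))
```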
Show all neighbors
BGP runs two Finite State Machines (FSMs) per neighbor. One FSM handles the outgoing TCP connection and the other handles the incoming TCP connection. The Flock Routing Suite does not hide this from the operator. In the final working state, each neighbor should have one FSM in the Established state and one FSM in the Idle state.
The last error that caused a BGP Notify Message is held in each FSM's last_notify_send / last_notify_recv fields. These fields are never cleared; they are only overwritten by the most recent error. A value of null therefore means that no error has caused a notify message since flockd was started.
flock@r70:~$ flockc bgp vrf default show
[{"common":{"neigh_key":"70.0.100.72 70.0.100.70","hostname":"...","asn":70,"bgp_id":"70.0.100.72","neigh_type":"Internal","connect_mode":"Both", ...},"outgoing":{"state":"Idle","last_notify_send":"Cease","last_notify_recv":null, ...},"incoming":{"state":"Established","last_notify_send":null,"last_notify_recv":null, ...},"stats":{...}}, ...]
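The "one Established, one Idle" rule and the notify fields can be checked from a script. The sketch below assumes the output has the shape shown above; neighbor_report is a hypothetical helper and the sample is trimmed to the fields it reads.

```python
import json

# Trimmed sample of `flockc bgp vrf default show` output (fields taken from
# the example above, all others omitted).
SHOW_SAMPLE = """
[{"common": {"neigh_key": "70.0.100.72 70.0.100.70", "neigh_type": "Internal"},
  "outgoing": {"state": "Idle", "last_notify_send": "Cease", "last_notify_recv": null},
  "incoming": {"state": "Established", "last_notify_send": null, "last_notify_recv": null}}]
"""


def neighbor_report(show_json: str) -> list:
    """Return (neigh_key, healthy, notifies) for each neighbor.

    healthy is True when exactly one FSM is Established and the other Idle.
    notifies holds every non-null last_notify_* field, keyed by FSM and field.
    """
    report = []
    for neigh in json.loads(show_json):
        states = sorted(neigh[fsm]["state"] for fsm in ("outgoing", "incoming"))
        healthy = states == ["Established", "Idle"]
        notifies = {
            f"{fsm}.{field}": neigh[fsm][field]
            for fsm in ("outgoing", "incoming")
            for field in ("last_notify_send", "last_notify_recv")
            if neigh[fsm][field] is not None
        }
        report.append((neigh["common"]["neigh_key"], healthy, notifies))
    return report


if __name__ == "__main__":
    for row in neighbor_report(SHOW_SAMPLE):
        print(row)
```

A non-empty notifies entry on a healthy neighbor is not necessarily a problem, since the fields keep the last error even after the session recovers.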
Show BGP RIB prefixes
Note that this is not the RIB held in the RIB component; this is the BGP RIB. The BGP RIB records routes from all neighbors and sends the 'best entry' route to the RIB component. By default, BGP will show the IPv6 routes if the af parameter is not specified.
Walk the IPv4 unicast BGP RIB. Only the 'best entry' for each prefix is shown, along with the reason why it was the best.
flock@r70:~$ flockc bgp vrf default rib walk 0.0.0.0/0
[{"prefix":"50.0.0.0/8","entry":{"best":[{...,"attrs":{"origin":"Igp","as_path":{"segments":[{"segment_type":"AsSequence","segment_value":[70,60,60,60,50,50]}]},"next_hop":"90.0.93.61", ...},"reason":"OnlyValidPeer", ...}], ...}},{"prefix":"60.0.0.0/8","entry":{...}},{"prefix":"70.0.0.0/8","entry":{"best":[{...,"reason":"SelfOriginated", ...}], ...}}]
Look up a specific prefix. The 'best entry' and all the candidate paths are shown.
flock@r70:~$ flockc bgp vrf default rib lookup 70.0.0.0/8
{"best":[{...,"path_type":"ViaOriginate", ...,"reason":"SelfOriginated", ...}],"paths":[{...,"route_type":"Originated","attrs":{"origin":"Igp","as_path":{"segments":[]}, ...}, ...}], ...}
Configuring BGP Active / Passive Neighbors
By default, BGP will try to create two TCP transport connections to each neighbor: one outgoing to the neighbor's remote BGP TCP port 179, and one allowing incoming connections from the neighbor to the local BGP TCP port 179. A tie break is used to ensure only one connection remains when the BGP neighbor moves to the 'Established' state.
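The tie break itself is not detailed here. RFC 4271 (section 6.8) resolves such connection collisions by comparing BGP identifiers as unsigned 32-bit integers and keeping the connection initiated by the speaker with the higher identifier. The sketch below illustrates that rule under the assumption that flockd follows it; surviving_connection is a hypothetical name.

```python
import ipaddress


def surviving_connection(local_bgp_id: str, remote_bgp_id: str) -> str:
    """Return which local FSM keeps its connection: "outgoing" or "incoming".

    Sketch of the RFC 4271 section 6.8 rule; assumed, not confirmed, to be
    the tie break flockd applies.
    """
    local = int(ipaddress.IPv4Address(local_bgp_id))
    remote = int(ipaddress.IPv4Address(remote_bgp_id))
    if local == remote:
        raise ValueError("BGP identifiers must differ between peers")
    # The connection opened by the higher identifier wins, so the local
    # outgoing connection survives only when the local identifier is higher.
    return "outgoing" if local > remote else "incoming"


if __name__ == "__main__":
    print(surviving_connection("70.0.100.70", "70.0.100.72"))
```

With the identifiers from the earlier examples (local 70.0.100.70, neighbor 70.0.100.72) the sketch selects the incoming connection, which is consistent with the sample neighbor output where the incoming FSM is Established and the outgoing FSM is Idle.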
The router can be configured to form a single TCP transport connection to each neighbor using the connect_mode neighbor configuration parameter.
"bgp": {
"vrfs": {
"default": {
"neighs": {
"172.17.20.1": {
"asn": 65017,
# Only create the outgoing connection to this neighbor.
# Refuse any incoming connection.
"connect_mode": "Active",
"afis": {
"ipv4-unicast": {}
}
}
}
}
}
}
or
# Only allow the incoming connection from this neighbor.
# Do not create any outgoing connection.
"connect_mode": "Passive"
Configuring BGP Route Reflectors
To configure a router as a BGP Route Reflector, specify which neighbors are Route Reflector clients using the route_reflector_client configuration boolean.
"bgp": {
"vrfs": {
"default": {
"neighs": {
"172.16.20.2 172.16.20.1": {
"asn": 65016,
# Reflect iBGP routes to and from this neighbor
"route_reflector_client": true,
"afis": {
"ipv4-unicast": {}
}
}
}
}
}
}
To deploy redundant Route Reflectors, a Route Reflector Cluster Id can optionally be configured.
"bgp": {
"cluster_id": "1.2.3.4"
}
Configuring BGP to act as a Route Server
BGP Route Server functionality is defined in RFC7947. To configure a router as a BGP Route Server use the route_server configuration boolean.
"bgp": {
"asn": 65056,
"id": "192.168.0.14",
"route_server": true,
"vrfs": {}
}
To check that BGP is running as a route server:
flock@r01:~$ flockc -f json-pretty bgp overview
...
"route_server": true,
...
Configuring Multihop BGP
Multihop BGP is configured by changing the Time to Live (TTL) of the BGP packets that are sent.
The default BGP packet TTLs are iBGP = 64 and eBGP = 1.
Use the neighbor ttl configuration keyword to override the defaults.
"bgp": {
"vrfs": {
"default": {
"neighs": {
"60.0.20.61": {
"asn": 60,
"ttl": {
"send": 2
},
"afis": {
"ipv4-unicast": {}
}
}
}
}
}
}
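The TTL selection described above can be summarized in a few lines. This is a sketch only: send_ttl is a hypothetical helper, and it classifies a neighbor as iBGP purely by comparing the neighbor asn with the router's asn, ignoring options such as local_as.

```python
IBGP_DEFAULT_TTL = 64
EBGP_DEFAULT_TTL = 1


def send_ttl(local_asn: int, neigh_cfg: dict) -> int:
    """Return the TTL used for packets sent to a neighbor.

    neigh_cfg is one entry of the neighs object, e.g.
    {"asn": 60, "ttl": {"send": 2}}.
    """
    override = neigh_cfg.get("ttl", {}).get("send")
    if override is not None:
        return override  # an explicit ttl.send always wins
    is_ibgp = neigh_cfg["asn"] == local_asn
    return IBGP_DEFAULT_TTL if is_ibgp else EBGP_DEFAULT_TTL


if __name__ == "__main__":
    print(send_ttl(70, {"asn": 60, "ttl": {"send": 2}}))
    print(send_ttl(70, {"asn": 60}))
    print(send_ttl(70, {"asn": 70}))
```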
Additional Neighbor Options
Local AS
Override the AS number used in the BGP OPEN message for a specific neighbor. This is useful during AS migration.
"neighs": {
"172.16.20.2 172.16.20.1": {
"asn": 65016,
"local_as": 65099,
"afis": {
"ipv4-unicast": {}
}
}
}
MD5 Authentication
Enable TCP-MD5 authentication for a BGP neighbor. The key is read from a file on the flockd host; it does not appear inline in the configuration.
"neighs": {
"172.16.20.2": {
"asn": 65017,
"auth_password_file": "/etc/flockd/bgp-md5/peer-172.16.20.2.key",
"afis": {
"ipv4-unicast": {}
}
}
}
The file's UTF-8 contents form the key; flockd strips trailing line
endings (LF or CRLF), so a key file saved with Windows line endings
matches the peer's key instead of silently differing by a stray
carriage return. Permissions are the operator's responsibility --
0600 owned by the flockd user is the expected baseline.
Relative paths in flockd.json resolve against the directory of the
JSON file. Over gRPC (flockc bgp config vrf <vrf> neigh <addr> set --auth-password-file <PATH>) the path must be absolute; flockc never
reads the file or transmits the key bytes, so the file must already
exist on the flockd host before the command runs.
An inline "auth_password": "..." field is rejected at config-load
with an error pointing at this auth_password_file form.
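The key-file handling described above can be approximated as follows when preparing or checking a key file outside flockd. The helper name load_md5_key is hypothetical, and stripping every trailing CR/LF character (rather than exactly one line ending) is an assumption.

```python
from pathlib import Path


def load_md5_key(path: str) -> str:
    """Read a key file as UTF-8 and drop trailing line endings (LF or CRLF)."""
    # Read raw bytes so no newline translation happens before decoding.
    text = Path(path).read_bytes().decode("utf-8")
    return text.rstrip("\r\n")


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        key_path = os.path.join(tmp, "peer.key")
        Path(key_path).write_bytes(b"example-key\r\n")
        os.chmod(key_path, 0o600)  # the expected baseline permissions
        print(repr(load_md5_key(key_path)))
```

Comparing the result on both peers is a quick way to rule out a stray carriage return as the cause of a failed session.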
Disabled Neighbor
A neighbor can be administratively disabled. The neighbor configuration is retained but no TCP connections will be established.
"neighs": {
"172.16.20.2": {
"asn": 65017,
"disabled": true,
"afis": {
"ipv4-unicast": {}
}
}
}
Allow AS In
Allow the local AS number to appear in the AS path received from a neighbor. The value specifies the maximum number of times the local AS is allowed. Default is 0 (not allowed).
"neighs": {
"172.16.20.2": {
"asn": 65017,
"allow_as_in": 2,
"afis": {
"ipv4-unicast": {}
}
}
}
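The effect of allow_as_in on a received AS path can be sketched as a simple count. The function accept_as_path is illustrative and treats the AS path as a flat list of AS numbers, which is a simplification.

```python
def accept_as_path(as_path: list, local_asn: int, allow_as_in: int = 0) -> bool:
    """Return True if the local AS appears at most allow_as_in times.

    With the default of 0, any occurrence of the local AS rejects the path.
    """
    return as_path.count(local_asn) <= allow_as_in


if __name__ == "__main__":
    path = [65017, 65016, 65020]
    print(accept_as_path(path, local_asn=65016))                 # default: 0
    print(accept_as_path(path, local_asn=65016, allow_as_in=2))  # as configured
```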
Remove Private AS
Remove private AS numbers from the AS path when advertising to a neighbor. Options are RemoveAll (remove all private AS numbers) or ReplaceAll (replace private AS numbers with the local AS).
"neighs": {
"172.16.20.2": {
"asn": 65017,
"remove_private_as": "RemoveAll",
"afis": {
"ipv4-unicast": {}
}
}
}
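The two modes can be compared with a small sketch. It uses the private AS ranges reserved by RFC 6996 (64512 to 65534 and 4200000000 to 4294967294) and a flat list for the AS path; apply_remove_private_as is a hypothetical name, and details such as how the local AS is prepended on export are left out.

```python
# Private-use AS number ranges from RFC 6996 (16-bit and 32-bit).
PRIVATE_AS_RANGES = ((64512, 65534), (4200000000, 4294967294))


def is_private_as(asn: int) -> bool:
    return any(low <= asn <= high for low, high in PRIVATE_AS_RANGES)


def apply_remove_private_as(as_path: list, mode: str, local_asn: int) -> list:
    """Apply the RemoveAll or ReplaceAll behaviour to an AS path."""
    if mode == "RemoveAll":
        return [asn for asn in as_path if not is_private_as(asn)]
    if mode == "ReplaceAll":
        return [local_asn if is_private_as(asn) else asn for asn in as_path]
    raise ValueError(f"unknown remove_private_as mode: {mode!r}")


if __name__ == "__main__":
    path = [64512, 100, 65001]
    print(apply_remove_private_as(path, "RemoveAll", local_asn=200))
    print(apply_remove_private_as(path, "ReplaceAll", local_asn=200))
```

ReplaceAll keeps the AS path length unchanged, which matters when path length influences best-path selection downstream.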
AS Override on Export
Replace the neighbor's AS number in the AS path with the local AS when advertising routes.
"neighs": {
"172.16.20.2": {
"asn": 65017,
"as_override_export": true,
"afis": {
"ipv4-unicast": {}
}
}
}
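As an illustration of the rewrite, the sketch below replaces every occurrence of the neighbor's AS with the local AS in a flat AS path. The function name as_override is hypothetical and the sketch does not model prepending of the local AS on export.

```python
def as_override(as_path: list, neighbor_asn: int, local_asn: int) -> list:
    """Replace the neighbor's AS number with the local AS in an AS path."""
    return [local_asn if asn == neighbor_asn else asn for asn in as_path]


if __name__ == "__main__":
    # Two sites share AS 65017; without the override, the receiving site
    # would see its own AS in the path and reject the route.
    print(as_override([65017, 65020, 65017], neighbor_asn=65017, local_asn=65016))
```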
Configuring L3VPN (BGP/MPLS VPN)
L3VPN configuration uses the l3vpn object within a VRF to define the Route Distinguisher (RD) and Route Targets (RTs). BGP neighbors in the default VRF carry VPN routes using the ipv4-mplsvpn or ipv6-mplsvpn address families.
"bgp": {
"asn": 200,
"id": "200.0.100.200",
"vrfs": {
"default": {
"neighs": {
"200.0.100.251 200.0.100.200": {
"asn": 200,
"local_as": 200,
"afis": {
"ipv4-mplsvpn": {}
}
}
}
},
"green": {
"l3vpn": {
"rd": "0:100:100",
"rts_v4": {
"import": [
"route-target:100:100"
],
"export": [
"route-target:100:100"
]
}
},
"neighs": {
"100.200.1.100": {
"asn": 100,
"local_as": 200,
"afis": {
"ipv4-unicast": {}
}
}
}
}
}
}
The l3vpn object contains:
- rd: Route Distinguisher in type:value:value format
- rts_v4: IPv4 route targets with import and export lists
- rts_v6: IPv6 route targets with import and export lists
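How the import and export lists are typically used can be sketched as follows. In BGP/MPLS VPNs a VPN route is imported into a VRF when at least one of its route targets is in that VRF's import list, and routes exported from the VRF carry the export list. The helper names are hypothetical and the sketch assumes flockd applies this standard rule.

```python
# The l3vpn object of the "green" VRF from the configuration above.
GREEN_L3VPN = {
    "rd": "0:100:100",
    "rts_v4": {
        "import": ["route-target:100:100"],
        "export": ["route-target:100:100"],
    },
}


def imports_route(l3vpn: dict, route_rts: list, family: str = "rts_v4") -> bool:
    """Return True if any route target on the route is in the import list."""
    import_rts = set(l3vpn.get(family, {}).get("import", []))
    return bool(import_rts & set(route_rts))


def exported_rts(l3vpn: dict, family: str = "rts_v4") -> list:
    """Return the route targets attached to routes exported from the VRF."""
    return list(l3vpn.get(family, {}).get("export", []))


if __name__ == "__main__":
    print(imports_route(GREEN_L3VPN, ["route-target:100:100", "route-target:200:1"]))
    print(imports_route(GREEN_L3VPN, ["route-target:300:1"]))
    print(exported_rts(GREEN_L3VPN))
```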
Configuring EVPN
EVPN (Ethernet VPN) uses the l2vpn-evpn address family. Ethernet Segment Identifiers (ESIs) for multihoming are configured under the system intfs object.
"system": {
"intfs": {
"bond-d80": { "esi": "03:44:38:39:ff:ff:01:00:00:01" }
}
},
"bgp": {
"asn": 80,
"id": "10.80.100.80",
"vrfs": {
"default": {
"neighs": {
"10.80.100.88 10.80.100.80": {
"asn": 80,
"afis": {
"ipv4-unicast": {},
"l2vpn-evpn": {}
}
}
}
}
}
}
As with the bridge and VXLAN interfaces, flockd does not create the
Ethernet Segment — it learns the esi association from the system and drives
the matching Type-1 (Ethernet A-D) and Type-4 (Ethernet Segment) signalling
and Designated-Forwarder election for it.
EVPN L3VNI (Type-5 / symmetric IRB)
For inter-subnet routing across the fabric, set evpn_l3vni on the tenant
VRF. This turns on EVPN Type-5 (IP Prefix) route origination for that VRF,
so a leaf advertises the tenant's IP prefixes — not just host MAC/IP — and
remote leaves route to them over the L3 VNI (symmetric IRB, RFC 9136 /
RFC 9135).
"bgp": {
"asn": 80,
"id": "10.80.100.83",
"vrfs": {
"default": {
"neighs": {
"10.80.100.88 10.80.100.83": {
"asn": 80,
"afis": {
"ipv4-unicast": {},
"l2vpn-evpn": {}
}
}
}
},
"cust3": {
"evpn_l3vni": 90030
}
}
}
The l2vpn-evpn AFI is enabled on the EVPN session in the default VRF, as
above; the tenant VRF (cust3) only needs evpn_l3vni. Unlike the MPLS
l3vpn block, no rd or rts are configured for an EVPN L3VNI — the Route
Distinguisher (from the router id) and the import/export Route Target (from
the local ASN and the VNI) are derived automatically. As with L2VNIs,
flockd learns the tenant VRF and its L3 VXLAN device from the system rather
than creating them.
Route Aggregation
Route aggregation allows summarizing multiple BGP prefixes into a single aggregate prefix. Configure aggregates under the aggregate_addresses object within a VRF.
"bgp": {
"asn": 65016,
"id": "172.16.10.1",
"vrfs": {
"default": {
"aggregate_addresses": {
"10.0.0.0/8": {
"summary_only": true,
"as_set": false
}
}
}
}
}
The aggregate address options are:
- summary_only: when true, suppress the more-specific prefixes and only advertise the aggregate
- as_set: when true, include an AS_SET in the aggregate to preserve path information from the contributing routes
- export_to_vpn: when true, export the aggregate to VPN address families
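The effect of summary_only on the advertised prefix set can be illustrated with the standard ipaddress module. This is a sketch with a hypothetical function name; it assumes the aggregate is only advertised when at least one more-specific contributing prefix exists, which is common BGP behaviour but is not stated above.

```python
import ipaddress


def advertised_prefixes(rib_prefixes: list, aggregate: str, summary_only: bool) -> list:
    """Return the prefixes advertised for one aggregate_addresses entry."""
    agg = ipaddress.ip_network(aggregate)
    nets = [ipaddress.ip_network(p) for p in rib_prefixes]
    # Contributors are the more-specific prefixes covered by the aggregate.
    contributors = [
        n for n in nets
        if n.version == agg.version and n != agg and n.subnet_of(agg)
    ]
    if not contributors:
        return [str(n) for n in nets]  # nothing to summarize (assumption)
    kept = [n for n in nets if not (summary_only and n in contributors)]
    return [str(agg)] + [str(n) for n in kept if n != agg]


if __name__ == "__main__":
    rib = ["10.1.0.0/16", "10.2.0.0/16", "192.168.0.0/24"]
    print(advertised_prefixes(rib, "10.0.0.0/8", summary_only=True))
    print(advertised_prefixes(rib, "10.0.0.0/8", summary_only=False))
```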
Configuring prefix-limit
The prefix-limit feature can be configured per-AFI at the neighbor level or at the VRF level. By default prefix-limit is disabled.
Inheritance determines which configuration takes effect for a neighbor: neighbor-level configuration takes precedence if present; otherwise the VRF-level configuration applies, if present.
Configuration consists of two items:
- The maximum number of prefixes allowed ('max-prefixes')
- The action to take when max-prefixes has been reached. The default action is 'discard'; 'reset' is also supported but not recommended.
When the action is 'discard', any extra prefixes are discarded. When the number of prefixes eventually drops below 'max-prefixes', a Route-Refresh message is sent to the peer asking it to resend its Update messages. The Route-Refresh message is sent at most once every minute (hard-coded).
When the action is 'reset', the BGP connection to the neighbor is reset when the allowed 'max-prefixes' is exceeded. This can lead to network convergence issues and is therefore not recommended.
An alarm is also raised when the soft limit, which is set to 75% of max-prefixes, is reached. Another alarm is raised if 'max-prefixes' is exceeded.
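The inheritance and threshold rules above can be sketched as follows. The dictionary keys max_prefixes and action, the function names, and the returned labels are all hypothetical; the actual configuration syntax is not shown in this section.

```python
from typing import Optional


def effective_limit(neigh_cfg: Optional[dict], vrf_cfg: Optional[dict]) -> Optional[dict]:
    """Neighbor-level configuration wins; otherwise fall back to the VRF level."""
    return neigh_cfg if neigh_cfg is not None else vrf_cfg


def evaluate(prefix_count: int, limit: Optional[dict]) -> str:
    """Classify a neighbor's prefix count against its prefix-limit."""
    if limit is None:
        return "disabled"  # prefix-limit is off by default
    max_prefixes = limit["max_prefixes"]
    if prefix_count > max_prefixes:
        # Limit exceeded: the configured action applies ("discard" by default).
        return limit.get("action", "discard")
    # Soft limit at 75% of max_prefixes, computed with integers only.
    if prefix_count * 4 >= max_prefixes * 3:
        return "soft-limit-alarm"
    return "ok"


if __name__ == "__main__":
    limit = effective_limit(None, {"max_prefixes": 100})
    for count in (74, 75, 100, 101):
        print(count, evaluate(count, limit))
```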
BGP Operation Commands Reference
Help
flockc bgp -h
Overview
flockc bgp overview
List VRFs
flockc bgp list-vrfs
VRF show (all neighbors in the VRF)
flockc bgp vrf <vrf-name> show
Redistribution policy
flockc bgp vrf <vrf-name> redist-policy
Neighbor detail
flockc bgp vrf <vrf-name> neigh <ip-addr> show [--stats]
Neighbor adjacency RIB (unicast)
flockc bgp vrf <vrf-name> neigh <ip-addr> adj-rib-unicast {in | out} walk <root>
Neighbor adjacency RIB (Route Target Constraint)
flockc bgp vrf <vrf-name> neigh <ip-addr> adj-rib-rtc {in | out} walk
Reset a neighbor
flockc bgp vrf <vrf-name> neigh <ip-addr> reset {soft-in | soft-out | hard | refresh-in | refresh-out}
Unicast RIB lookup / walk
flockc bgp vrf <vrf-name> rib lookup <ip-network>
flockc bgp vrf <vrf-name> rib walk <root> [--start-from <prefix>] [--max-entries <N>]
Unicast RIB statistics
flockc bgp vrf <vrf-name> rib stats [--memory]
Per-AFI RIB lookup / walk
flockc bgp vrf <vrf-name> afi-rib <afi> lookup <ip-network>
flockc bgp vrf <vrf-name> afi-rib <afi> walk <root>
VPN RIB lookup by RD
flockc bgp vpn-rib lookup <rd> <ip-network>
VPN RIB walk / statistics
flockc bgp vpn-rib walk
flockc bgp vpn-rib stats
Route Target Constraint RIB
flockc bgp rtc-rib lookup <rt>
flockc bgp rtc-rib walk
Per-VRF VPN policy statistics
flockc bgp vrf <vrf-name> vpn policy-stats
Event buffer
flockc bgp event-log
Source deaggregation labels
flockc bgp vrf <vrf-name> rib deagg-labels
EVPN RIB (all route types, keyed by Route Distinguisher)
flockc bgp evpn rib walk [<rd>]
flockc bgp evpn rib lookup <rd> <ip-addr>
EVPN NVE (local VTEP / L2 VNI state, with locally-learned MACs)
flockc bgp evpn nve walk [--vni <vni>]
EVPN Ethernet Segments (multihoming, with per-VNI Designated-Forwarder status)
flockc bgp evpn es walk
flockc bgp evpn es lookup <esi>
Configuration inspection and edit
flockc bgp config show
flockc bgp config show-pending
flockc bgp config init
flockc bgp config set ...
flockc bgp config apply
flockc bgp config vrf <name> set ...
flockc bgp config vrf <name> delete
flockc bgp config vrf <name> neigh <addr> set ...
flockc bgp config vrf <name> neigh <addr> delete