GLink Architecture - Aurora

GLink architecture is a spine-and-leaf topology designed to provide determinism and Denial of Service (DoS) protections within the CME Globex order routing network.

CME GLink - Aurora connectivity provides access to:

  • iLink order entry on the CME Globex platform for futures and options markets

  • Market data for futures and options markets on CME Globex disseminated over the CME Market Data Platform (MDP) 

  • CME Clearing House systems for CME Group markets.

Topology





Physical (L1)

From bottom of the diagram to top:

  • GLink access switches (customer access)

    • Arista 7060s using 10 Gbps Ethernet

    • Each customer connects to an 'A' switch and a 'B' switch

      • All GLink switches are deployed in pairs

    • Each GLink switch is connected to three spines at 100 Gbps. 'A' feed GLink access switches do not connect to the 'B' feed spine, and 'B' feed GLink access switches do not connect to the 'A' feed spine

    • Count:

      • 24 pairs of GLink switches

  • Spine switches

    • Arista 7060s for the multicast spines and 7260s for the non-multicast order entry spines, using 100 Gbps Ethernet

    • Spine 'A' and Spine 'B' pass market data multicast traffic

    • Non-multicast spines pass order entry (MSGW/CGW) unicast traffic. They also pass non-order-entry unicast traffic to services outside of GLink.

    • Count: 4 switches

  • MSGW access switches

    • Arista 7060s using 10 Gbps Ethernet for gateway connectivity

    • Each MSGW access switch is connected to the 2 non-multicast spine switches at 100 Gbps

    • Each MSGW connects to only one gateway access switch at 10 Gbps. Fault-tolerant pairs should be on separate switches

    • Count: 4 switches

  • CGW access switches

    • Arista 7060s using 10 Gbps Ethernet for gateway connectivity 

    • Each CGW access switch is connected to the 2 non-multicast Spine switches at 100 Gbps

    • Count: 2 switches

  • WAN distributions (to the left and right of the spines)

    • Switches using 100 Gbps Ethernet connectivity

    • Each WAN distribution is connected to all four spines

    • Market data routes through this distribution layer into the 'A' and 'B' spines

    • Count: 2 switches

Data Link (L2)

  • 10 GbE interfaces are supported

  • All customer connected interfaces have policing applied which reduces available bandwidth to 1 Gbps (covered in more detail below) 

  • We do not use VLANs within the GLink front end network

    • All nodes, including servers, have routable L3 addresses

    • We do not run Spanning Tree Protocol (STP) nor any variants of it

  • All switching layers (customer, spine and gateway) operate in 'store-and-forward' mode

    • Store-and-forward mode means that, for any given switch, it must completely receive a datagram before it will transmit that datagram to another interface

      • Implication: If a switch starts receiving two separate datagrams at exactly the same time and both datagrams are destined to leave the same port, then the smaller of the two datagrams will leave the switch first
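The store-and-forward implication above can be checked with simple arithmetic. The following is a minimal sketch, assuming a 10 Gbps ingress link and two hypothetical frame sizes; it ignores preamble, inter-frame gap, and any internal switch processing, so it illustrates ordering only and is not a latency model of the actual switches.

```python
LINK_BPS = 10e9  # 10 Gbps ingress link, as described for the access layer


def receive_time_ns(frame_bytes: int, link_bps: float = LINK_BPS) -> float:
    """Time to fully receive a frame on the wire, in nanoseconds."""
    return frame_bytes * 8 / link_bps * 1e9


def egress_order(frames: dict[str, int]) -> list[str]:
    """Order in which frames that start arriving at the same instant
    become eligible for transmission on a store-and-forward switch."""
    return sorted(frames, key=lambda name: receive_time_ns(frames[name]))


# Hypothetical frame sizes in bytes, chosen only for illustration.
frames = {"large": 1500, "small": 100}

for name in egress_order(frames):
    print(f"{name}: fully received after {receive_time_ns(frames[name]):.0f} ns")
```

Because the smaller frame finishes arriving first, it is the first one eligible to leave the shared egress port.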

Network (L3)

Active-Standby Routing and Paths

Each MSGW server is 'available' via 2 paths from the 2 non-multicast spine switches, in an active-standby manner.

When a session arrives at a customer access switch, the outbound path is selected based on which of the 2 non-multicast spines is active for the given MSGW access switch. The return path traffic (MSGW to customer) follows the same path in a symmetric fashion.

  • Ordering/re-ordering of packets

    • Packets can be reordered within the spine layer provided that

      • two different sessions are used

      • the load from customer access to spine, or within the individual spines, is different

    • The ordering of packets from all sessions to MSGW will be 'final' at the server access switching layer

    • Latency difference between spines will vary depending upon load but is expected to stay within the hundreds of nanoseconds range

  • Traffic can be routed over 'A' and 'B' simultaneously:

    • On 'A', advertise via BGP the summary route for the CME supplied address range.

    • On 'B', advertise via BGP a more specific route for the traffic you want returned on the 'B' interface.

    • If a more specific route is not advertised, all traffic will return on the summary interface (asymmetric routing), which will break Network Address Translation if it is in use.

  • The path for a particular GLink connection to a specific market segment is the same for all messages
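The 'A'/'B' advertisement scheme described in the list above relies on longest-prefix matching: return traffic follows the most specific advertised route that covers the destination. The sketch below illustrates that selection rule only. The prefixes are hypothetical placeholders (real ranges are supplied by CME), and the `advertised` table and `return_interface` helper are names invented for this example, not part of any BGP implementation.

```python
import ipaddress

# Hypothetical prefixes for illustration only.
advertised = {
    ipaddress.ip_network("10.10.0.0/24"): "A",    # summary route advertised on 'A'
    ipaddress.ip_network("10.10.0.128/25"): "B",  # more specific route advertised on 'B'
}


def return_interface(dest: str) -> str:
    """Choose the return interface by longest-prefix match."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in advertised if addr in net]
    if not matches:
        raise LookupError(f"no advertised route covers {dest}")
    best = max(matches, key=lambda net: net.prefixlen)
    return advertised[best]


print(return_interface("10.10.0.10"))   # covered only by the summary route
print(return_interface("10.10.0.200"))  # covered by both routes; the /25 is preferred
```

If the /25 entry is removed from the table, both addresses resolve to 'A', which corresponds to the asymmetric return path the text warns about.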

Performance

  • Basic Behavior

    • Reduce network jitter by dispersing microbursts through the customer-side fabric

    • Queuing delays will be associated with the order routing gateways

  • Pertinent Info

    • Nominal 1-way latency of 3 microseconds for spine-and-leaf switch performance only (slightly less than 1 us forwarding latency per switch)

    • Oversubscription - GLink customers per switch

      • 0.48:1 – All customers to single Spine (worst case)

      • 0.24:1 – All customers to all Spines (best case)

    • There will be differences in performance between any two switches based upon standard networking principles.

    • The Arista 7060 is built around the Broadcom Tomahawk chip family and is a "Switch on Chip" (SOC), shared memory design. Queue depth monitoring is generally more prevalent with multi-stage switching designs where buffers are separate physical entities on a per port basis.
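The latency and oversubscription figures above can be reproduced with back-of-the-envelope arithmetic. This is a sketch under stated assumptions rather than the published derivation: the count of 48 policed customer ports per switch is not given in the source and is chosen here only because it reproduces the 0.48:1 and 0.24:1 ratios, and "all Spines" is assumed to mean the 2 non-multicast spines. The 1 Gbps policed ceiling, the 100 Gbps uplinks, and the per-switch latency bound come from the text.

```python
POLICED_GBPS = 1.0      # per-customer ceiling after ingress policing (from the text)
UPLINK_GBPS = 100.0     # access-to-spine link speed (from the text)
CUSTOMER_PORTS = 48     # ASSUMPTION: not stated in the source
ORDER_ENTRY_SPINES = 2  # ASSUMPTION: "all Spines" means the 2 non-multicast spines


def oversubscription(ports: int, spines_used: int) -> float:
    """Ratio of maximum policed customer load to available uplink capacity."""
    return ports * POLICED_GBPS / (spines_used * UPLINK_GBPS)


worst = oversubscription(CUSTOMER_PORTS, 1)
best = oversubscription(CUSTOMER_PORTS, ORDER_ENTRY_SPINES)
print(f"{worst:.2f}:1 worst case, {best:.2f}:1 best case")

# One-way path: customer access switch, spine, gateway access switch.
HOPS = 3
PER_SWITCH_US = 1.0  # upper bound; the text says slightly less than 1 us per switch
print(f"nominal one-way switch latency: about {HOPS * PER_SWITCH_US:.0f} us")
```

A ratio below 1:1 means the uplinks cannot be saturated by policed customer traffic alone, even when every customer sends at the 1 Gbps ceiling.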

Policing Overview

  • At Ingress - per GLink:

    • Green: < 750 Mbps is allowed and marked ‘normal’ (AF11)

    • Yellow: 750 Mbps – 1 Gbps is allowed and marked ‘discard eligible’ (AF12)

    • Red: > 1 Gbps is silently dropped
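The three bands above map onto standard DiffServ code points: AF11 is DSCP 10 and AF12 is DSCP 12 (RFC 2597). The sketch below classifies a steady offered rate into a color and the resulting marking. It is a simplification that ignores burst allowances, and the function names are hypothetical.

```python
from typing import Optional

AF11_DSCP = 10  # binary 001010
AF12_DSCP = 12  # binary 001100


def color_for_rate(mbps: float) -> str:
    """Classify a steady offered rate using the thresholds in the overview."""
    if mbps < 750:
        return "green"
    if mbps <= 1000:
        return "yellow"
    return "red"


def dscp_for_color(color: str) -> Optional[int]:
    """DSCP applied to forwarded traffic; red traffic is dropped, so no marking."""
    return {"green": AF11_DSCP, "yellow": AF12_DSCP}.get(color)


def tos_byte(dscp: int) -> int:
    """DSCP occupies the upper six bits of the IPv4 ToS / IPv6 Traffic Class byte."""
    return dscp << 2


for rate in (500, 900, 1200):  # hypothetical offered rates in Mbps
    color = color_for_rate(rate)
    print(rate, color, dscp_for_color(color))
```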

Ingress Policing

  • General Mechanics

    • ‘Credit’ value similar to token bucket

    • Two-rate three-color marker (RFC 2698)

  • Metering Calculation

    • Obtain current credits: C_CURR = C_PREV + (T_DELTA * R_CREDIT)

    • Check current against the limit: if C_CURR > C_LIMIT, then set C_CURR = C_LIMIT

    • Calculate the eventual credits: C_POST = C_CURR - P_SIZE

    • Check whether packet will be policed

  • Marking and Action

    • Committed Information Rate (CIR) = 750 Mbps

      • Action when exceeded: Mark (AF12)

    • Committed Burst Size (CBS) = 500 K

    • Peak Information Rate (PIR) = 1 Gbps

      • Action when exceeded: Drop

    • Peak Burst Size (PBS) = 625 K

    • Conforming traffic under CIR is marked as AF11
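The metering steps and parameters above can be sketched as a color-blind two-rate three-color marker in the style of RFC 2698. This is an illustrative model, not the switch implementation. It assumes that the burst sizes are expressed in bytes, that both credit buckets start full, and that a packet is policed when its post-send credits would go negative; none of these details are stated in the source. The class and method names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CreditBucket:
    rate: float     # credit rate, bytes per second
    limit: float    # credit limit (burst size), bytes
    credits: float  # credits currently available, bytes

    def refill(self, t_delta: float) -> None:
        # current = previous + elapsed time * credit rate, capped at the limit
        self.credits = min(self.limit, self.credits + t_delta * self.rate)

    def covers(self, p_size: int) -> bool:
        # ASSUMPTION: a packet passes when post-send credits stay non-negative
        return self.credits - p_size >= 0


class TwoRateThreeColorMarker:
    def __init__(self, cir_bps=750e6, cbs=500_000, pir_bps=1e9, pbs=625_000):
        # ASSUMPTION: burst sizes are in bytes and both buckets start full.
        self.committed = CreditBucket(cir_bps / 8, cbs, cbs)
        self.peak = CreditBucket(pir_bps / 8, pbs, pbs)
        self.last_seen = 0.0

    def meter(self, now: float, p_size: int) -> str:
        t_delta = now - self.last_seen
        self.last_seen = now
        self.committed.refill(t_delta)
        self.peak.refill(t_delta)
        if not self.peak.covers(p_size):
            return "red"     # exceeds the peak rate: dropped
        self.peak.credits -= p_size
        if not self.committed.covers(p_size):
            return "yellow"  # exceeds the committed rate: marked AF12
        self.committed.credits -= p_size
        return "green"       # within the committed rate: marked AF11


marker = TwoRateThreeColorMarker()
# 500 hypothetical 1500-byte packets arriving in the same instant
colors = Counter(marker.meter(0.0, 1500) for _ in range(500))
print(dict(colors))
```

In this model a burst first consumes the committed bucket (green), then the remaining headroom in the peak bucket (yellow), and anything beyond that is dropped (red). Credits recover at the configured rates once the sender pauses.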






Copyright © 2024 CME Group Inc. All rights reserved.