Rev | Date | Author | Change Description |
---|---|---|---|
0.1 | 02/01/2022 | Prince Sunny | Initial version |
0.2 | 03/09/2022 | Prince Sunny | Packet Flows/DB Objects |
0.3 | 05/24/2022 | Oleksandr | Memory Footprints |
0.4 | 06/01/2022 | Prince Sunny | Design Considerations |
0.5 | 06/13/2022 | Chris Sommers | Schema Relationships |
0.6 | 08/05/2022 | Mukesh M Velayudhan | Outbound VNI derivation in pipeline |
0.7 | 08/09/2022 | Prince Sunny | Add Inbound Routing rules |
0.6 | 04/20/2022 | Marian Pritsak | APP_DB to SAI mapping |
0.8 | 09/30/2022 | Prabhat Aravind | Update APP_DB table names |
1.0 | 10/10/2022 | Prince Sunny | ST and PL scenarios |
1.1 | 01/09/2023 | Prince Sunny | Underlay Routing and ST/PL clarifications |
1.2 | 02/12/2023 | Vijay Srinivasan | Metering schema and description |
1.3 | 04/12/2023 | Ze Gan | AppDB protobuf design |
1.4 | 05/03/2023 | Prince Sunny | ACL Tagging, ACL Requirements |
1.5 | 05/22/2023 | Oleksandr Ivantsiv | Update configuration examples |
1.6 | 06/01/2023 | Prince Sunny | Added FastPath |
This document provides more detailed design of DASH APIs, DASH orchestration agent, Config and APP DB Schemas and other SONiC buildimage changes required to bring up SONiC image on an appliance card. General DASH HLD can be found at dash_hld.
DASH | Disaggregated APIs for SONiC Hosts |
VNI | Vxlan Network Identifier |
VTEP | Vxlan Tunnel End Point |
VNET | Virtual Network |
ENI | Elastic Network Interface |
gNMI | gRPC Network Management Interface |
vPORT | VM's NIC. Eni, Vnic, VPort are used interchangeably |
ST | Service Tunnel |
PL | Private Link |
At a high level the following should be supported:
-
Bringup SONiC image for DEVICE_METADATA subtype -
Appliance
-
Bringup Swss/Syncd containers for switch_type -
dpu
-
Able to program DASH objects configured via gRPC client to appliance card via SAI DASH API
Phase 1
- Vnet-Vnet scenario
- Route/LPM support
- Underlay IPv4 and IPv6
- Stateful ACL support
- TCP state tracking on flows
- Telemetry and Monitoring
Phase 2
- Private Link
- Service Tunnel
- Overlay IPv6
Initial support is only for show
and clear
commands
- User shall be able to show the DASH configured objects
- User shall be able to clear state of ENI, VNET or all
Warm-restart support is not considered in Phase 1. TBD
Following are the minimal scaling requirements
Item | Expected value |
---|---|
VNETs | 1024* |
ENI | 32 Per Card |
Outbound Routes per ENI | 100k |
Inbound Routes per ENI | 10k** |
NSGs per ENI | 10*** |
ACL rules per NSG | 1000 |
ACL prefixes per ENI | 10x100k |
Max prefixes per rule | 8k |
ACL ports per ENI | 10x10k SRC/DST ports |
Total tags per ENI | 4k |
Max prefixes per tag | 24k |
Max tags one prefix belong to | 512 |
Max tags in an ACL rule | 4k |
CA-PA Mappings | 8M Per Card |
Active Connections/ENI | 1M (Bidirectional TCP or UDP) |
Total active connections | 32M (Bidirectional) |
Metering Buckets per ENI | 4000 |
CPS | 3M |
* Number of VNET is a software limit as VNET by itself does not take hardware resources. This shall be limited to number of VNI hardware can support
** Support 10K peering in-region/cross-region
*** 10 stages per ENI, prefix list can be optimized by tagging or other similar approach and reuse it for multiple ACLs/stages within the card thereby optimizing memory usage
Detailed scaling requirements can be found here
Metering is essential for billing the customers and below are the high-level requirements. Metering/Bucket in this context is related to byte counting for billing purposes and not related to traffic policer or shaping.
- Billing shall be at per ENI level and shall be able to query metering packet bytes per ENI
- All metering buckets must be UINT64 size and start from value 0 and shall be counting number of bytes. A bucket contains 2 counters; 1 inbound (Rx) and 1 outbound (Tx) from an ENI perspective.
- Implementation (a.k.a H/W pipeline implementation) must support metering at the following levels:
- Policy based metering. - E.g. For specific destinations (prefix) that must be billed separately, say action_type 'direct'
- Route table based metering - E.g. For Vnet peering cases.
- Mapping table based metering - E.g For specific destinations within the mapping table that must be billed separately
- Policy in the metering context refers to metering policy associated to Route tables. This is not related to ACL policy or any ACL packet counters.
- If packet flow hits multiple metering buckets, order of priority shall be Policy->Route->Mapping
- User shall be able to override the precedence between Routing/Policy and Mapping buckets by setting an override flag. When policy is enabled for a route, it takes higher precedence than routing and mapping metering bucket unless override flag is set in which case Mapping takes precedence
- Implementation shall aggregate the counters on an "ENI+Metering Bucket" combination for billing:
- All traffic from an ENI to a Peered VNET
- All traffic from an ENI to a Private Link destination
- All traffic from an ENI to an internet destination
- All traffic from an ENI to cross-AZ destination (within the same VNET)
- All traffic from an ENI to cross-region destination (towards the peer VNET)
- All outbound metered traffic from an ENI
- All inbound metered traffic towards an ENI
- Customer is billed based on number of bytes sent/received separately. A distinct counter must be supported for outbound vs inbound traffic of each category.
- Outbound and Inbound bytes are from ENI perspective and not based on where the traffic is initiated. Any traffic from ENI to outbound is treated as TX bytes and towards ENI inbound is RX bytes.
- For outbound flow and associated metering bucket, created as part of VM initiated traffic, the metering bucket shall account for outbound (Tx) bytes. Based on this outbound flow, pipeline shall also create a unified inbound flow. The same metering bucket shall account for the inbound (Rx) bytes for the return traffic to VM that matches this flow.
- Application shall utilize the metering hardware resource in an optimized manner by allocating meter id and deallocating when not-in-use
- Application shall bind all associated metering buckets to an ENI. During ENI deletion, all associated metering bucket binding should be auto-removed.
- Inbound metering: It is similar to outbound pipeline. A route rule table can have a metering bucket or a meter policy association for explicitly accounting the inbound traffic for an ENI. If inbound route rule points to a vnet, and mapping has a bucket id, it should be used for metering while creating the unified flow.
Open Items
- Can we avoid explicit dependency between ENI's and mappings?
- Bucket id integer to be associated to an optional metadata string?
DASH Sonic implementation is targeted for appliance scenarios and must handles millions of updates. As such, the implementation is performance and scale focussed as compared to traditional switching and routing based solutions. The following are the design considerations.
- Implementation must support single and bulk update of LPM and CA-PA Mapping tables.
- During startup, it is possible to have scaled configurations applied successively. Implementation must support such bulk updates
- In normal operation, mapping updates can occur as much as 100 mappings/sec
- In normal operation, route updates can occur every 30 sec.
- An add or delete of VM translates to updating ENI and routes in bulk context
- ACL operations (rules adding/deleting) per group for a stage must be handled atomically and should not have transient drop/forward cases within the group. When a rule is modified, controller shall send entire rules in the group and Sonic implementation replaces the old set with new set of rules.
- Implementation must support ability to get all ACL rules/groups based on guid.
- In normal operation, mappings churn often followed by routes and least for ACLs.
- ENIs shall have an admin-state that enables normal connections and forwarding only after all configurations for an ENI is applied during initial creation. When the ENI is admin-state down, the packets destined to this ENI shall be dropped. Order of operation/configuration shall be enforced by the controller. Sonic implementation shall honor the state set by controller and ENI shall accept and forward traffic only if the admin-state is set to 'up'.
- ENI must support 1M active bi-directional TCP connections or UDP flows however the connection pool can be oversubscribed. An oversubscription of 2:1 would be expected, so the connection pool can be more optimal if executed as one large table where ENI can be a part of the key.
- During VNET or ENI delete, implementation must support ability to delete all mappings or routes in a single API call.
- Add and Delete APIs are idempotent. As an example, deleting an object that doesn't exists shall not return an error.
- During a delete operation, if there is a dependency (E.g. mappings still present when a VNET is deleted), implementation shall return error and shall not perform any force-deletions or delete dependencies implicitly.
- During a bulk operation, if any part/subset of API fails, implementation shall return error for the entire API. Sonic implementation shall validate the entire API as pre-checks before applying and return accordingly.
- Implementation must have flexible memory allocation for ENI and not reserve max scale during initial create (e.g 100k routes). This is to allow oversubscription.
- Implementation must not have silent failures for APIs. E.g accepting an API from controller, returning success and failing in the backend. This is orthogonal to the idempotency of APIs described above for ADD and Delete operations. Intent is to ensure SDN controller and Sonic implementation is in-sync
ACL is essential for NSGs and have different stages. In the current model, there are two stages for Azure and VNET level and another three stages for customer. The following are some of the key requirements and considerations:
- ACL binding to ENI must be atomic.
- User can add/remove rules in a group as long as it is not currently bound to an ENI
- It is not permitted to modify rules within a group that is currently bound to an ENI. For such scenario, a new group shall be created by application with the modified set of rules which could then be bind to ENI. An exception is if tags are expanded to individual prefixes in an ACL rule. In such cases, if tags are modified, application shall update the corresponding rule by adding/removing a prefix.
- No requirement to modify an existing rule except for the case above. For e.g change action of a rule from permit to deny or vice-versa
- ACL Tagging
- Mapping a prefix to a tag can reduce the repetition of prefixes across different ACL rules and optimize memory usage
- Tagging is implemented as list in Orchagent and SAI layers
- A prefix can belong to multiple tags
- Prefixes can be added or removed from a tag at any time
- SAI implementation shall return a capability for the list size. A '0' return value shall be treated as no-tag support.
- An example tag list is as below.
- Tag1 - 10.1.1.0/24, 20.1.1.0/24
- Tag2 - 10.1.0.0/8, 20.1.1.0/24, 50.1.1.1/32
- Tag8 - Empty
- SAI_SRC_TAG_ENTRY_ATTR_TAG_MAP: 10.1.1.0/24 -> "Tag1, Tag2", 20.1.1.0/24 -> "Tag1, Tag2", 10.1.0.0/8 -> "Tag2". Note that orchagent shall extend the tag map to include all subnet to allow an LPM based lookup. In this case, tags for 10.1.1.0/24 shall also include the tag for 10.1.0.0/8.
- SAI_DASH_ACL_RULE_ATTR_SRC_TAG: This shall be a list of tags to match.
- If a packet arrives with src ip of 10.1.1.1, the corresponding derived src tag list shall be "Tag1, Tag2". It shall be then matched against the ACL_RULE Tag list
- If the tag field is empty, ACL rule must match ANY tag or NO tag.
- The tag list size depends on the SAI implementation capability. It is fixed during initialization based on the capability value returned by SAI implementation.
- Deleting ACL group is permitted as long as it is not bind to an ENI. It is not expected for application to delete individual rules prior to deleting a group. Implementation is expected to delete/free all resources when application triggers an ACL group delete.
- ACL rules are not expected to have both tags and prefixes of same type configured in the same rule. For e.g, same rule shall not have both src tag and src prefix configured, but it is possible to have src tag and dst prefix or vice-versa
- Counters can be attached to ACL rules optionally for retrieving the number of connections/flows. It is not required to get the packet/byte counter as in the traditional model. A new SAI counter type shall be required for this.
The following section captures at a high-level on the VNET packet flow. Detailed lookup and pipeline behavior can be referenced here.
Based on the incoming packet's VNI matched against the reserved VNI assigned for VM->Appliance, the pipeline shall set the direction as TX(Outbound) and using the inner src-mac, maps to the corresponding ENI.The incoming packet will always be VXLAN encapsulated and outer dst-ip is the appliance VIP. The pipeline shall parse the VNI, and for VM traffic, the VNI shall be a special reserved VNI. Everything else shall be treated as as network traffic(RX). Pipeline shall use VNI to differentiate the traffic to be VM (Inbound) or Network (Outbound).
In the outbound flow, the appliance shall assume it is the first appliance to apply policy. It applies the outbound ACLs in three stages (VNIC, Subnet and VNET), processed in order, with the outcome being the most restrictive of the three ACLs combined.
After the ACL stage, it does LPM routing based on the inner dst-ip and applies the respective action (encap, subsequent CA-PA mapping). Finally, update the connection tracking table for both inbound and outbound.
After the connection tracking update, the packet proceeds to the "underlay routing" stage. Based on the destination IP address (outer dst-ip), the underlay routing block decides how the packet should be forwarded towards its destination.
The figure below shows how the VNI to be encapsulated in the outgoing packet is derived based on the ENI, outbound routing and outbound CA-PA mapping table lookup in the Dash outbound packet processing pipeline.
Based on the incoming packet's VNI, if it does not match against any reserved VNI, the pipeline shall set the direction as RX(Inbound) and using the inner dst-mac, maps to the corresponding ENI. In the inbound flow, Priority based "Routing Rule" lookup happens based on VNI and optionally SRC PA prefix and maps to a VNET. In other words, the VNET is derived from a VNI key or a combination of VNI key and SRC PA based on the routing rule entry. It is possible that in some cases, two VNETs in different region can have the same VNI key and hence Inbound routing rule shall have both SRC PA prefix and VNI key to uniquely determine the VNET. Using the derived VNET's mapping tables, source PA address is validated against the list of mappings. If the check passes, decap action is performed, else dropped. Note that, PA validation is conditional and routing rule shall specify (say, by a flag) whether to perform PA validation or not as there are some cases like SLB traffic for which PA validation is not required. After route lookup is the three stage ACL, processed in order. ACLs can have multiple src/dst IP ranges or port ranges as match criteria. After the connection tracking update, the packet proceeds to the "underlay routing" stage. Based on the destination IP address (outer dst-ip), the underlay routing block decides how the packet should be forwarded towards its destination.
It is worth noting that CA-PA mapping table shall be used for both encap and decap process
ST/PL is employed for scenarios like multiple different customers want to access a common shared resource (e.g storage). This shall not fall into the regular Vnet packet path or Vnet peering path and hence a Private Endpoint is assigned for such accesses, as part of ENI routing or VNET's mapping tables. The lookup happens as described in the above sections, but actions are different. For ST/PL, actions include IPv4 to IPv6 transpositions and special routing/mapping lookups for encapsulation. By having packet transpositions, Service Tunnel feature provides the capability of encoding “region id”, “vnet id”, “subnet id” etc via packet transformation. IPv6 transformation includes last 32 bits of the IPv6 packet as IPv4 address, while the remaining 96 bits of the IPv6 packet is used for encoding. Private Link feature is an extension to Service Tunnel feature and enables customers to access public facing shared services via their private IP addresses within their vnet. More details on traffic flow is captured in the example section. ST/PL Inbound flow: Using the outbound unified flow, the reverse transposition (inbound unified flow) is created. If no inbound flow is created, the packet shall be dropped if it does not match any existing inbound routing rule. There is no inbound policy based lookup expected for ST/PL scenarios. When FastPath kicks in, the respective outbound and inbound unified flows shall be modified accordingly.
ST encap shall be either nvgre or vxlan. The same encapsulation is expected on the inbound flow.
- Network traffic is metered through configuration of route table entries, mapping entries, and meter policy rules with metering class IDs
- An (ENI, meter-class-id) entry is created to associate a specific ENI with a meter class ID
- The (ENI, meter-class-id) entry must be deleted when traffic is no longer metered OR the ENI is deleted
- Meter statistics for each ENI can be obtained through GET request on (ENI, meter-class-id) or (ENI, *)
- Each (ENI, meter-class-id) is mapped to a meter bucket counter in the pipeline
- The meter bucket counter tracks the number of transmitted and received bytes (number of packets is optional)
- A meter policy consists of a set of metering rules
- Each meter rule consists of an IP prefix mapped to meter-class-id
- Each meter rule can have a priority configured and the priority will be used to break the tie in case of prefix overlap across the rules
- Meter policy is applied to an ENI and can be shared across ENIs
It is possible that a given packet can get a hit in route table and/or mapping table and/or meter policy. Described below is the scheme to pick the final metering class:
-
If the route table entry has the metering policy lookup enabled and if an entry in meter policy is a hit (based on priority of the rule), then the corresponding policy metering class is picked over the route table entry's. If subsequent mapping lookup also gives a metering class then the mapping metering class takes precedence ONLY if the override flag is set on the mapping entry.
-
If the route table entry has the metering policy lookup disabled then the metering policy lookup is skipped and the route table entry metering class is picked. If subsequent mapping lookup also gives a metering class then the mapping metering class takes precedence ONLY if the override flag is set on the mapping entry.
This section captures the Sonic-Dash specifics of FastPath use-case. Detailed document on FastPath is captured here (FastPath)
The following are the salient points and requirements. Detailed design for FastPath feature shall come as a separate PR. FastPath redirect packets shall be handled by a standalone application and use SAI APIs to update the appliance/dpu flows.
-
FastPath kicks in when appliance receives an ICMP redirect that matches an existing unified flow
-
Each ICMP redirect shall only update one side of the flow.
-
ICMP packet is generated only for TCP. It is not generated for UDP traffic.
-
FastPath example for Service Tunnel. Notice that this is an example scenario and can be extended later for other packet formats
- After the first SYN pkt, appliance shall create two flows (one Outbound and another Inbound)
- Original Outbound packet shall have an inner IPv6 header and outer IPv4 (Src VIP-A and Dst VIP-B)
- After an ICMP redirect is received from VIP-B hosting MUX, the Outbound flow shall be fixed-up to have outer IPV4 dst address to use the Redirect IP of VIP-B. Same fixup for Inbound flow to change VIP-B to Redirect IP
- After an ICMP redirect is received from VIP-A hosting MUX, the Outbound flow shall be fixed-up to have outer IPV4 src address to use the Redirect IP of VIP-A. Same fixup for Inbound flow to change VIP-A to Redirect IP
- ICMP redirect shall have the original inner IPv6 address as the IP header's src and dst address.
- Redirect info shall contain the transposed IPv6 address, src and dst ports, sequence number and the encap type (NVGRE in this case) in addition to redirect address.
struct { uint32 Reserved; in6_addr Target; in6_addr Destination; uint8 Type; uint8 Length; uint8 Reserved2[6]; IPV6_HEADER Ipv6Header; uint16 SourcePort; uint16 DestinationPort; uint32 SequenceNumber; } Redirect; struct { uint32 Version; uint16 AddrFamily; uint16 EncapType; uint32 EncapId; union { struct { in_addr DipPAv4; char VMMac[MAC_ADDR_SIZE]; } Info4; struct { in6_addr DipPAv6; char VMMac[MAC_ADDR_SIZE]; } Info6; } Redirect_Info;
-
Implementation can use the above struct as a type-cast reference (packed as metadata in the redirect packet) and map it to a flow. Full packet capture is available in this (doc)
-
Redirect packet format is as below:
SLB IP APPL IP GRE SLB MAC VM MAC IP Inner Src IP Inner Dst IP ICMP Target Addr Dst Addr Redirect Header Custom format -
The following shall be used for translations
Field | Mapping |
---|---|
VM Mac | Source ENI |
Inner Src IP | Original Src IP |
Inner Dst IP | Original Dst IP |
Target Address | Original Dst IP |
Redirect Header | Original IPv6 Header + TCP ports (5 tuple) |
Addr Family | AF_INET/AF_INET6 |
Encap Type | NVGRE 1/VXLAN 2 |
Encap Id | Redirect GRE Key/ VXLAN Id |
Custom Redirect Info | Redirect DIP and Dst Mac |
The following are the schema changes. The NorthBound APIs shall be defined as sonic-yang in compliance to yang-guideline.
For DASH objects, the proposal is to use the existing APP_DB instance and objects are prefixed with "DASH". DASH APP_DB objects are preserved only during warmboots and isolated from regular configurations that are persistent in the appliance across reboots. All the DASH objects are programmed by SDN and hence treated differently from the existing Sonic L2/L3 'switch' DB objects. Status of the configured objects shall be reflected in the corresponding STATE_DB entries.
Reference Yang model for DASH Vnet is here.
"DEVICE_METADATA": {
"localhost": {
"subtype": "Appliance",
"type": "SonicHost",
"switch_type": "dpu",
"sub_role": "None"
}
}
Following diagram captures the object reference model.
DASH_VNET_TABLE:{{vnet_name}}
"vni": {{vni}}
"guid": {{"string"}}
"address_spaces": {{[list of addresses]}} (OPTIONAL)
"peer_list": {{vnet_name_list}} (OPTIONAL)
DASH_QOS_TABLE:{{qos_name}}
"qos_id": {{string}}
"bw": {{bw}}
"cps": {{cps}}
"flows": {{flows}}
key = DASH_QOS_TABLE:qos_name ; Qos name as key
; field = value
bw = bandwidth in kbps
cps = Number of connection per second
flows = Number of flows
DASH_ENI_TABLE:{{eni}}
"eni_id": {{string}}
"mac_address": {{mac_address}}
"qos": {{qos_name}}
"underlay_ip": {{ip_addr}}
"admin_state": {{enabled/disabled}}
"vnet": {{vnet_name}}
"pl_sip_encoding": {{string}} (OPTIONAL)
"pl_underlay_sip": {{ip_addr}} (OPTIONAL)
"v4_meter_policy_id": {{string}} (OPTIONAL)
"v6_meter_policy_id": {{string}} (OPTIONAL)
key = DASH_ENI_TABLE:eni ; ENI MAC as key
; field = value
mac_address = MAC address as string
qos = Associated Qos profile
underlay_ip = PA address for Inbound encapsulation to VM
admin_state = Enabled after all configurations are applied.
vnet = Vnet that ENI belongs to
pl_sip_encoding = Private Link encoding for IPv6 SIP transpositions; Format "0xfield_value/0xfull_mask". field_value must be used as a replacement to the
first len(full_mask) bits of pl_sip. Last 32 bits are reserved for the IPv4 CA. Logic: ((pl_sip & !full_mask) | field_value).
pl_underlay_sip = Underlay SIP (ST GW VIP) to be used for all private link transformation for this ENI
v4_meter_policy_id = IPv4 meter policy ID
v6_meter_policy_id = IPv6 meter policy ID
DASH_PREFIX_TAG_TABLE:{{tag_name}}
"ip_version": {{ipv4/ipv6}}
"prefix_list": {{list of prefix}}
tag_name = STRING; unique tag name
addresses = list of ip prefixes ',' separated. valid to have empty list of prefixes.
If the prefix is empty, no packet will be assigned to this TAG.
DASH_ACL_IN_TABLE:{{eni}}:{{stage}}
"v4_acl_group_id": {{group_id}} (OPTIONAL)
"v6_acl_group_id": {{group_id}} (OPTIONAL)
DASH_ACL_OUT_TABLE:{{eni}}:{{stage}}
"v4_acl_group_id": {{group_id}} (OPTIONAL)
"v6_acl_group_id": {{group_id}} (OPTIONAL)
key = DASH_ACL_IN_TABLE:eni:stage ; ENI name and stage as key; ACL stage can be {1, 2, 3 ..}
; field = value
v4_acl_group_id = IPv4 ACL group ID
v6_acl_group_id = IPv6 ACL group ID
DASH_ACL_GROUP_TABLE:{{group_id}}
"ip_version": {{ipv4/ipv6}}
"guid": {{string}}
DASH_ACL_RULE_TABLE:{{group_id}}:{{rule_num}}
"priority": {{priority}}
"action": {{action}}
"terminating": {{bool}}
"protocol": {{list of protocols}} (OPTIONAL)
"src_tag": {{list of tag name}} (OPTIONAL)
"dst_tag": {{list of tag name}} (OPTIONAL)
"src_addr": {{list of prefix}} (OPTIONAL)
"dst_addr": {{list of prefix}} (OPTIONAL)
"src_port": {{list of range of ports}} (OPTIONAL)
"dst_port": {{list of range of ports}} (OPTIONAL)
key = DASH_ACL_RULE_TABLE:group_id:rule_num ; unique rule num within the group.
; field = value
priority = INT32 value ; priority of the rule, lower the value, higher the priority
action = allow/deny
terminating = true/false ; if true, stop processing further rules
protocols = list of INT ',' separated; E.g. 6-tcp, 17-udp; if not provided, match on all protocols
src_tag = list of source tag name ',' separated; if not provided, match on ANY tag or NO tag.
dst_tag = list of destination tag name ',' separated; if not provided, match on ANY tag or NO tag.
src_addr = list of source ip prefixes ',' separated; if not provided, match on all source IPs.
dst_addr = list of destination ip prefixes ',' separated; if not provided, match on all destination IPs.
src_port = list of range of source ports ',' separated; if not provided, match on all source ports.
dst_port = list of range of destination ports ',' separated; if not provided, match on all destination ports.
DASH_ROUTING_TYPE_TABLE:{{routing_type}}: [
"action_name":{{string}}
"action_type": {{action_type}}
"encap_type": {{encap type}} (OPTIONAL)
"vni": {{vni}} (OPTIONAL)
]
key = DASH_ROUTING_TYPE_TABLE:routing_type; routing type can be {direct, vnet, vnet_direct, vnet_encap, appliance, privatelink, privatelinknsg, servicetunnel, drop}; actions can be a list of action_types
; field = value
action_name = action name as string
action_type = action_type can be {maprouting, direct, staticencap, appliance, 4to6, mapdecap, decap, drop}
encap_type = encap type depends on the action_type - {vxlan, nvgre}
vni = vni value associated with the corresponding action. Applicable if encap_type is specified.
DASH_ROUTING_APPLIANCE_TABLE:{{appliance_id}}:
"appliance_guid":{{string}}
"addresses": {{list of addresses}}
"encap_type": {{encap type}}
"vni": {{vni}}
key = DASH_ROUTING_APPLIANCE_TABLE:appliance_id; Used for PL NSG
; field = value
addresses = list of addresses used for ECMP across appliances
encap_type = encap type depends on the action_type - {vxlan, nvgre}
vni = vni value associated with the corresponding action.
DASH_APPLIANCE_TABLE:{{appliance_id}}
"sip": {{ip_address}}
"vm_vni": {{vni}}
key = DASH_APPLIANCE_TABLE:id ; attributes specific for the appliance
; field = value
sip = source ip address, to be used in encap
vm_vni = VM VNI that is used for setting direction. Also used for inbound encap to VM
DASH_ROUTE_TABLE:{{eni}}:{{prefix}}
"action_type": {{routing_type}}
"vnet":{{vnet_name}} (OPTIONAL)
"appliance":{{appliance_id}} (OPTIONAL)
"overlay_ip":{{ip_address}} (OPTIONAL)
"overlay_sip":{{ip_address}} (OPTIONAL)
"overlay_dip":{{ip_address}} (OPTIONAL)
"underlay_sip":{{ip_address}} (OPTIONAL)
"underlay_dip":{{ip_address}} (OPTIONAL)
"metering_policy_en": {{bool}} (OPTIONAL)
"metering_class": {{class_id}} (OPTIONAL)
key = DASH_ROUTE_TABLE:eni:prefix ; ENI route table with CA prefix for packet Outbound
; field = value
action_type = routing_type ; reference to routing type
vnet = vnet name ; destination vnet name if routing_type is {vnet, vnet_direct}, a vnet other than eni's vnet means vnet peering
appliance = appliance id ; appliance id if routing_type is {appliance}
overlay_ip = ip_address ; overly_ip to lookup if routing_type is {vnet_direct}, use dst ip from packet if not specified
overlay_sip = ip_address ; overlay ipv6 src ip if routing_type is {servicetunnel}, transform last 32 bits from packet (src ip)
overlay_dip = ip_address ; overlay ipv6 dst ip if routing_type is {servicetunnel}, transform last 32 bits from packet (dst ip)
underlay_sip = ip_address ; underlay ipv4 src ip if routing_type is {servicetunnel}; this is the ST GW VIP (for ST traffic) or custom VIP
underlay_dip = ip_address ; underlay ipv4 dst ip to override if routing_type is {servicetunnel}, use dst ip from packet if not specified
metering_policy_en = bool ; Metering policy lookup enable (optional), default = false
metering_class = class_id ; Metering class-id, used if metering policy lookup is not enabled
DASH_ROUTE_RULE_TABLE:{{eni}}:{{vni}}:{{prefix}}
"action_type": {{routing_type}}
"priority": {{priority}}
"protocol": {{protocol_value}} (OPTIONAL)
"vnet":{{vnet_name}} (OPTIONAL)
"pa_validation": {{bool}} (OPTIONAL)
"metering_class": {{class_id}} (OPTIONAL)
"region": {{region_id}} (OPTIONAL)
key = DASH_ROUTE_RULE_TABLE:eni:vni:prefix ; ENI Inbound route table with VNI and optional SRC PA prefix
; field = value
action_type = routing_type ; reference to routing type, action can be decap or drop
priority = INT32 value ; priority of the rule, lower the value, higher the priority
protocol = INT32 value ; protocol value of incoming packet to match; 0 (any)
vnet = vnet name ; mapped VNET for the key vni/pa
pa_validation = true/false ; perform PA validation in the mapping table belonging to vnet_name. Default is set to true
metering_class = class_id ; Metering class-id
region = region_id ; optional region_id which the vni/prefix belongs to as a string for any vendor optimizations
DASH_VNET_MAPPING_TABLE:{{vnet}}:{{ip_address}}
"routing_type": {{routing_type}}
"underlay_ip":{{ip_address}}
"mac_address":{{mac_address}} (OPTIONAL)
"metering_class": {{class_id}} (OPTIONAL)
"override_meter": {{bool}} (OPTIONAL)
"use_dst_vni": {{bool}} (OPTIONAL)
"use_pl_sip_eni": {{bool}} (OPTIONAL)
"overlay_sip":{{ip_address}} (OPTIONAL)
"overlay_dip":{{ip_address}} (OPTIONAL)
key = DASH_VNET_MAPPING_TABLE:vnet:ip_address ; CA-PA mapping table for Vnet
; field = value
action_type = routing_type ; reference to routing type
underlay_ip = ip_address ; PA address for the CA
mac_address = MAC address as string ; Inner dst mac
metering_class = class_id ; metering class-id
override_meter = bool ; override the metering class-id coming from the route table
use_dst_vni = bool ; if true, use the destination VNET VNI for encap. If false or not specified, use source VNET's VNI
overlay_sip = ip_address ; overlay src ip if routing_type is {privatelink}, transform last 32 bits from packet
overlay_dip = ip_address ; overlay dst ip if routing_type is {privatelink}
DASH_METER_POLICY:{{meter_policy_id}}
"ip_version": {{ipv4/ipv6}}
key = DASH_METER_POLICY:meter_policy_id ; policy_id as key
; field = value
ip_version = IP version (IPv4/IPv6)
DASH_METER_RULE:{{meter_policy_id}}:{{rule_num}}
"priority": {{priority}}
"ip_prefix":{{ip_prefix}}
"metering_class": {{class_id}}
key = DASH_METER_RULE:meter_policy_id:rule_num ; unique rule num within the policy.
; field = value
priority = INT32 value ; priority of the rule: lower the value, higher the priority
ip_prefix = ip_prefix ; ip prefix for matching
metering_class = class_id ; metering class-id
DASH_METER:{{eni}}:{{metering_class_id}}
“metadata”: {{string}} (OPTIONAL)
"tx_counter": {{bytes}}
"rx_counter": {{bytes}}
key = DASH_METER:eni:metering_class_id ; metering class id table per (ENI)
; field = value
metadata = string ; Optional metadata string
tx_counter = bytes ; Number of transmitted bytes (read-only)
rx_counter = bytes ; Number of received bytes (read-only)
APP_DB Table | Key | Field | SAI Attributes/objects | Comment |
---|---|---|---|---|
DASH_APPLIANCE_TABLE | ||||
appliance_id | ||||
sip | sai_vip_entry_t.vip | |||
vm_vni | sai_direction_lookup_entry_t.VNI | |||
DASH_VNET_TABLE | SAI_OBJECT_TYPE_VNET | |||
vnet_name | ||||
vxlan_tunnel | VxLAN tunnel won't be used | |||
vni | SAI_VNET_ATTR_VNI | |||
guid | Not relevant | |||
address_spaces | ||||
peer_list | ||||
DASH_QOS_TABLE | ||||
qos_name | ||||
qos_id | ||||
bw | SAI_ENI_ATTR_PPS | |||
cps | SAI_ENI_ATTR_CPS | |||
flows | SAI_ENI_ATTR_FLOWS | |||
DASH_ENI_TABLE | SAI_OBJECT_TYPE_ENI | |||
eni | ||||
eni_id* | SAI_ENI_ETHER_ADDRESS_MAP_ENTRY_ATTR_ENI_ID | |||
mac_address* | sai_eni_ether_address_map_entry_t.address | |||
eni_id** | sai_outbound_eni_to_vni_entry_t.ENI | |||
qos | ||||
vnet** | SAI_ENI_ATTR_VNET_ID | VNET object ID | ||
DASH_ACL_V4_IN_TABLE | Same for V6 | |||
eni | ||||
stage | SAI_ENI_ATTR_INBOUND_V4_stage_DASH_ACL_GROUP_ID | STAGE1..STAGE5 | ||
acl_group_id | SAI_ENI_ATTR_INBOUND_V4_stage_DASH_ACL_GROUP_ID | |||
DASH_ACL_GROUP_TABLE | SAI_OBJECT_TYPE_DASH_ACL_GROUP | |||
group_id | ||||
ip_version | SAI_DASH_ACL_GROUP_ATTR_IP_ADDR_FAMILY | |||
DASH_ACL_RULE_TABLE | SAI_OBJECT_TYPE_DASH_ACL_RULE | |||
group_id | SAI_DASH_ACL_RULE_ATTR_GROUP_ID | |||
rule_num | ||||
priority | SAI_DASH_ACL_RULE_ATTR_PRIORITY | |||
action | SAI_DASH_ACL_RULE_ATTR_ACTION | |||
terminating | SAI_DASH_ACL_RULE_ATTR_ACTION | AND_CONTINUE if not terminating | ||
protocol | SAI_DASH_ACL_RULE_ATTR_PROTOCOL | |||
src_addr | SAI_DASH_ACL_RULE_ATTR_SIP | |||
dst_addr | SAI_DASH_ACL_RULE_ATTR_DIP | |||
dst_port | SAI_DASH_ACL_RULE_ATTR_DST_PORT | |||
src_port | SAI_DASH_ACL_RULE_ATTR_SRC_PORT | |||
DASH_ROUTE_TABLE | ||||
eni | sai_outbound_routing_entry_t.ENI | |||
prefix | sai_outbound_routing_entry_t.destination | |||
action_type | Need action type for future cases | |||
vnet | SAI_OUTBOUND_ROUTING_ENTRY_ATTR_DEST_VNET_VNI | VNI value taken from DASH_VNET table | ||
appliance | Not supported yet | |||
overlay_ip | SAI_OUTBOUND_ROUTING_ENTRY_ATTR_OVERLAY_IP | |||
underlay_ip | Not supported yet | |||
overlay_sip | Not supported yet | |||
underlay_dip | Not supported yet | |||
customer_addr | Not supported yet | |||
metering_bucket | SAI_OUTBOUND_ROUTING_ENTRY_ATTR_COUNTER_ID | |||
DASH_MAPPING_TABLE | ||||
vnet | sai_outbound_ca_to_pa_entry_t.dest_vni | VNET's VNI | ||
ip_address | sai_outbound_ca_to_pa_entry_t.dip | |||
routing_type | ||||
underlay_ip | SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_UNDERLAY_DIP | |||
mac_address | SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_OVERLAY_DMAC | |||
metering_bucket | SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_COUNTER_ID | |||
vnet* | sai_pa_validation_entry_t.vnet_id | VNET's VNI | ||
underlay_ip* | sai_pa_validation_entry_t.sip | SAI_PA_VALIDATION_ENTRY_ATTR_ACTION is permit | ||
DASH_ROUTE_RULE_TABLE | ||||
eni | sai_inbound_routing_entry_t.eni_id | |||
vni | sai_inbound_routing_entry_t.vni | |||
prefix | sai_inbound_routing_entry_t.prefix | |||
action_type | ||||
priority | sai_inbound_routing_entry_t.priority | |||
protocol | ||||
vnet | SAI_INBOUND_ROUTING_ENTRY_ATTR_SRC_VNET_ID | |||
pa_validation | SAI_INBOUND_ROUTING_ENTRY_ATTR_ACTION | use PA_VALIDATE if true | ||
metering_bucket |
For saving memory consumption(AppDBMemoryEstimation.xlsx), the DASH table of APP_DB could be encoded as protobuf.
key: text, Same as the original design
field_name: fixed text "pb"
field_value: binary array of protobuf message
E.G.
127.0.0.1:6379> hgetall "DASH_VNET_TABLE:vnet1"
1) "pb"
2) "\n\x010\x12$b6d54023-5d24-47de-ae94-8afe693dd1fc\x1a\x17\n\x12\x12\x10\r\xc0-\xdd\x82\xa3\x88;\x0fP\x84<\xaakc\x16\x10\x80\x01\x1a\x17\n\x12\x12\x10-\x0e\xf2\x7f\n~c_\xd8\xb7\x10\x84\x81\xd6'|\x10\x80\x01\x1a\x17\n\x12\x12\x10\x1bV\x89\xc8JW\x06\xfb\xad\b*fN\x9e(\x17\x10\x80\x01\x1a\x17\n\x12\x12\x107\xf9\xbc\xc0\x8d!s\xccVT\x88\x00\xf8\x9c\xce\x90\x10\x80\x01\x1a\x17\n\x12\x12\x10\tEb\x11Mf]\x12\x17x\x99\x80\xea\xd1u\xb4\x10\x80\x01\x1a\x17\n\x12\x12\x10\x1f\xd3\x1c\x89\x99\x16\xe7\x18\x91^0\x81\xb1\x04\x8c\x1e\x10\x80\x01\x1a\x17\n\x12\x12\x10\x06\x9e55\xdb\xb5&\x93\x99\xfaC\x81\x16P\xdc\x1d\x10\x80\x01\x1a\x17\n\x12\x12\x10&]U\x96e4\xf4\xd2'&\x04i\xdf\x8dA\x9f\x10\x80\x01\x1a\x17\n\x12\x12\x108\xd5\xa3*\xe7\x80\xdc\x1e\x80f\x94\xb7\xb6\x86~\xcd\x10\x80\x01\x1a\x17\n\x12\x12\x101\xf0@F\nu+}\x1e\"\\\\\xdb\x01\xe3\x82\x10\x80\x01\"\x05vnet1\"\x05vnet2\"\x05vnet1\"\x05vnet2\"\x05vnet2\"\x05vnet1\"\x05vnet2\"\x05vnet2\"\x05vnet1\"\x05vnet1"
- Pre-defined Type
message IpAddress {
oneof ip {
fixed32 ipv4 = 1; // Network byte order (big-endian)
bytes ipv6 = 2; // Network byte order (big-endian)
}
}
message IpPrefix {
IpAddress ip = 1;
IpAddress mask = 2;
}
message Range {
uint32 min = 1;
uint32 max = 2;
}
message ValueOrRange {
oneof value_or_range {
uint32 value = 1;
Range range = 2;
}
}
enum IpVersion {
IP_VERSION_IPV4 = 0;
IP_VERSION_IPV6 = 1;
}
- Type mapping
Table type | Protobuf type |
---|---|
bool | bool |
string | string |
int | uint32 |
object id | string |
mac address | bytes |
ip address | IpAddress |
ip prefix | IpPrefix |
vni | uint32 |
l4 protocol | RangeOrValue |
l4 port | RangeOrValue |
Enumerations | enum |
list * | repeated * |
- Here is an example for AclRule
message AclRule {
uint32 priority = 1;
acl.Action action = 2;
bool terminating = 3;
repeated uint32 protocol = 4;
repeated string src_tag = 9;
repeated string dst_tag = 10;
repeated types.IpPrefix src_addr = 5;
repeated types.IpPrefix dst_addr = 6;
repeated types.ValueOrRange src_port = 7;
repeated types.ValueOrRange dst_port = 8;
}
A high-level module interaction is captured in the following diagram.
The figure below illustrates the various schema and their transformations into the various SONiC layers, including:
- gNMI northbound API, which uses YANG to specify schema
- Redis APP_DB, which uses ABNF schema definition language. Redis objects can be directly manipulated using SAI-redis clients.
- JSON import/export formats
- SAI table and attribute objects
For testing purposes, it is convenient to express test configurations in a single canonical format, and use this to drive the different API layers to verify correct behavior. A tentative JSON format for representing DASH service configurations is described in Reference configuration example (JSON). Test drivers can accept this input, transform it into different schemas and drive the associated interfaces. For example, a JSON representation of an ACL rule can be transformed into gNMI API calls, SAI-redis calls, SAI-thrift calls, etc.
The following containers shall be enabled for SONiC host and part of the image. Switch specific containers shall be disabled for the image built for the appliance card.
Container/Feature Name | Is Enabled? |
---|---|
SNMP | Yes |
Telemetry | Yes |
LLDP | Yes |
Syncd | Yes |
Swss | Yes |
Database | Yes |
BGP | Yes |
Teamd | No |
Pmon | Yes |
Nat | No |
Sflow | No |
DHCP Relay | No |
Radv | No |
Macsec | No |
Resttapi | No |
gNMI | Yes |
A new orchestration agent "dashorch" shall be implemented that subscribes to DASH APP DB objects and programs the ASIC_DB via the SAI DASH API. DASHOrch shall have sub-orchestrations to handle ACLs, Routes, CA-PA mappings. DASH orchestration agent shall write the state of each tables to STATEDB that applications shall utilize to fetch the programmed status of configured objects.
DASH APIs shall be exposed as gNMI interface and part of the SONiC gNMI container. Clients shall configure the SONiC via gRPC get/set calls. gNMI container has the config backend to translate/write DASH objects to CONFIG_DB and/or DASH APP_DB.
SONiC for DASH shall have a lite swss initialization without the heavy-lift of existing switch based orchestration agents that SONiC currently have. The initialization shall be based on switch_type "dpu". For the underlay support, the following SAI APIs are expected to be supported:
Component | SAI attribute |
---|---|
Host Interface | SAI_HOSTIF_ATTR_NAME |
SAI_HOSTIF_ATTR_OBJ_ID | |
SAI_HOSTIF_ATTR_TYPE | |
SAI_HOSTIF_ATTR_OPER_STATUS | |
SAI_HOSTIF_TABLE_ENTRY_ATTR_CHANNEL_TYPE | |
SAI_HOSTIF_TABLE_ENTRY_ATTR_HOST_IF | |
SAI_HOSTIF_TABLE_ENTRY_ATTR_TRAP_ID | |
SAI_HOSTIF_TABLE_ENTRY_ATTR_TYPE | |
SAI_HOSTIF_TRAP_ATTR_PACKET_ACTION | |
SAI_HOSTIF_TRAP_ATTR_TRAP_GROUP | |
SAI_HOSTIF_TRAP_ATTR_TRAP_PRIORITY | |
SAI_HOSTIF_TRAP_ATTR_TRAP_TYPE | |
SAI_HOSTIF_TRAP_GROUP_ATTR_POLICER | |
SAI_HOSTIF_TRAP_GROUP_ATTR_QUEUE | |
Neighbor | SAI_NEIGHBOR_ENTRY_ATTR_DST_MAC_ADDRESS |
Nexthop | SAI_NEXT_HOP_ATTR_IP |
SAI_NEXT_HOP_ATTR_ROUTER_INTERFACE_ID | |
SAI_NEXT_HOP_ATTR_TYPE | |
Nexthop Group | SAI_NEXT_HOP_GROUP_TYPE_ECMP |
SAI_NEXT_HOP_GROUP_MEMBER_ATTR_NEXT_HOP_ID | |
SAI_NEXT_HOP_GROUP_MEMBER_ATTR_NEXT_HOP_GROUP_ID | |
Packet | SAI_PACKET_ACTION_FORWARD |
SAI_PACKET_ACTION_TRAP | |
SAI_PACKET_ACTION_DROP | |
Policer | SAI_POLICER_ATTR_CBS |
SAI_POLICER_ATTR_CIR | |
SAI_POLICER_ATTR_COLOR_SOURCE | |
SAI_POLICER_ATTR_GREEN_PACKET_ACTION | |
SAI_POLICER_ATTR_METER_TYPE | |
SAI_POLICER_ATTR_MODE | |
SAI_POLICER_ATTR_PBS | |
SAI_POLICER_ATTR_PIR | |
SAI_POLICER_ATTR_RED_PACKET_ACTION | |
SAI_POLICER_ATTR_YELLOW_PACKET_ACTION | |
Port | SAI_PORT_ATTR_ADMIN_STATE |
SAI_PORT_ATTR_ADVERTISED_AUTO_NEG_MODE | |
SAI_PORT_ATTR_ADVERTISED_FEC_MODE | |
SAI_PORT_ATTR_ADVERTISED_INTERFACE_TYPE | |
SAI_PORT_ATTR_ADVERTISED_MEDIA_TYPE | |
SAI_PORT_ATTR_ADVERTISED_SPEED | |
SAI_PORT_ATTR_AUTO_NEG_MODE | |
SAI_PORT_ATTR_FEC_MODE | |
SAI_PORT_ATTR_HW_LANE_LIST | |
SAI_PORT_ATTR_INTERFACE_TYPE | |
SAI_PORT_ATTR_MTU | |
SAI_PORT_ATTR_OPER_SPEED | |
SAI_PORT_ATTR_OPER_STATUS | |
SAI_PORT_ATTR_SPEED | |
SAI_PORT_ATTR_SUPPORTED_SPEED | |
RIF | SAI_ROUTER_INTERFACE_ATTR_ADMIN_V4_STATE |
SAI_ROUTER_INTERFACE_ATTR_ADMIN_V6_STATE | |
SAI_ROUTER_INTERFACE_ATTR_MTU | |
SAI_ROUTER_INTERFACE_ATTR_PORT_ID | |
SAI_ROUTER_INTERFACE_ATTR_SRC_MAC_ADDRESS | |
SAI_ROUTER_INTERFACE_ATTR_TYPE | |
SAI_ROUTER_INTERFACE_ATTR_VIRTUAL_ROUTER_ID | |
Route | SAI_ROUTE_ENTRY_ATTR_NEXT_HOP_ID |
SAI_ROUTE_ENTRY_ATTR_PACKET_ACTION | |
Switch | SAI_SWITCH_ATTR_CPU_PORT |
SAI_SWITCH_ATTR_DEFAULT_TRAP_GROUP | |
SAI_SWITCH_ATTR_DEFAULT_VIRTUAL_ROUTER_ID | |
SAI_SWITCH_ATTR_DEFAULT_VLAN_ID | |
SAI_SWITCH_ATTR_ECMP_DEFAULT_HASH_SEED | |
SAI_SWITCH_ATTR_INIT_SWITCH | |
SAI_SWITCH_ATTR_PORT_LIST | |
SAI_SWITCH_ATTR_PORT_NUMBER | |
SAI_SWITCH_ATTR_PORT_STATE_CHANGE_NOTIFY | |
SAI_SWITCH_ATTR_SHUTDOWN_REQUEST_NOTIFY | |
SAI_SWITCH_ATTR_SRC_MAC_ADDRESS | |
SAI_SWITCH_ATTR_SWITCH_ID | |
SAI_SWITCH_ATTR_TYPE | |
SAI_SWITCH_ATTR_VXLAN_DEFAULT_PORT | |
SAI_SWITCH_ATTR_VXLAN_DEFAULT_ROUTER_MAC |
DASH Appliance shall establish BGP session with the connected Peer and advertise the prefixes (VIP PA). In turn, the Peer (e.g, Network device or SmartSwitches) shall advertise default route to appliance. With two Peers connected, the appliance shall have route with gateway towards both Peers and does ECMP routing. Orchagent install the route and resolves the neighbor (GW) mac and programs the underlay route/nexthop and neighbor. Underlay attributes on a DASH appliance shall be programmed similar to Sonic switch. RIF entries shall be created first using SAI_ROUTER_INTERFACE APIs with IP2ME routes installed using SAI_ROUTE_ENTRY APIs. Based on neighbor learned from peer(e.g, Network device or SmartSwitches), neighbor and next-hop entries shall be programmed using SAI_NEIGHBOR_ENTRY and SAI_NEXT_HOP APIs. Finally underlay routes learned via BGP shall be programmed with regular or ECMP next-hops via SAI underlay APIs as mentioned above.
Note that only default route is expected from the peer BGP and appliance is not expected to allocate an LPM resource for underlay. Implementation can choose whether to forward the packet on the same port it is received or do forwarding based on route and next-hop entry. Same is applicable for ECMP where the implementation can perform 5-tuple hashing or forward the "return" traffic on the same port it has received the original packet.
Default DSCP behavior for vxlan or nvgre encap shall be "uniform" model (similar to SAI_TUNNEL_DSCP_MODE_UNIFORM_MODEL) and TTL behavior for encap shall be "pipe" model (similar to SAI_TUNNEL_TTL_MODE_PIPE_MODEL). However for DSCP, pipeline must copy the DSCP value to the new encap outer header from the original incoming packet before decap was done and not from the customer packet header. Appliance shall not modify the DSCP or TTL values in the inner packet (customer packet) during an encap and shall ensure outer header values are independent of the customer generated headers. Similarly, for routing action type "direct", for e.g internet traffic, the DSCP value shall be reset to original header's DSCP value even-if original packet arrives with a non-zero value effectively ensuring all outbound traffic from the appliance is transmitted with same value that is received in the original outer header. Outer TTL value shall be default set to 64.
Running components | Memory usage |
---|---|
Base Debian OS | 159MB |
Base Debian OS + docker containers | 1.3GB |
Container | Memory usage |
---|---|
snmp | 52.5MB |
telemetry | 88.38MB |
lldp | 57.07MB |
syncd* | 36.36MB |
swss | 53.5MB |
bgp | 74.66MB |
pmon* | 108.1MB |
database | 83.56MB |
* These containers have vendor-specific components. Their memory usage will vary from vendor to vendor.
The following commands shall be added :
- show dash eni <eni>
- show dash eni <eni> routes all
- show dash eni <eni> acls stage <ingress/egress/all>
- show dash vnet <vnet>
- show dash vnet <vnet> mappings
- show dash route-types
- show dash qos
- show dash vnet brief
- sonic-clear dash all
- sonic-clear dash eni <eni>
- sonic-clear dash vnet <vnet>
Refer DASH documentation for the test plan.
[
{
"DASH_VNET_TABLE:Vnet1": {
"vni": "45654",
"guid": "559c6ce8-26ab-4193-b946-ccc6e8f930b2"
},
"OP": "SET"
},
{
"DASH_ENI_TABLE:F4939FEFC47E": {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"mac_address": "F4-93-9F-EF-C4-7E",
"underlay_ip": "25.1.1.1",
"admin_state": "enabled",
"vnet": "Vnet1",
"v4_meter_policy_id": "245bea34-1000-0000-0000-0000082764ac"
},
"OP": "SET"
},
{
"DASH_ROUTING_TYPE_TABLE:vnet": {
"name": "action1",
"action_type": "maprouting"
},
"OP": "SET"
},
{
"DASH_ROUTING_TYPE_TABLE:vnet_direct": {
"name": "action1",
"action_type": "maprouting"
},
"OP": "SET"
},
{
"DASH_ROUTING_TYPE_TABLE:vnet_encap": {
"name": "action1",
"action_type": "staticencap",
"encap_type": "vxlan"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:10.1.0.0/16": {
"prefix":"10.1.0.0/16",
"action_type":"vnet",
"vnet":"Vnet1"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:10.1.0.0/24": {
"action_type":"vnet_direct",
"vnet":"Vnet1",
"overlay_ip":"10.0.0.6"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:30.0.0.0/16": {
"action_type":"direct",
"metering_policy_en":"false",
"metering_class":"1000"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:40.0.0.0/16": {
"action_type":"direct",
"metering_policy_en":"true",
"metering_class":"1000"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:10.2.5.0/24": {
"action_type":"drop"
},
"OP": "SET"
},
{
"DASH_VNET_MAPPING_TABLE:Vnet1:10.0.0.6": {
"routing_type":"vnet_encap",
"underlay_ip":"2601:12:7a:1::1234",
"mac_address":"F9-22-83-99-22-A2"
"metering_class":"1002"
},
"OP": "SET"
},
{
"DASH_VNET_MAPPING_TABLE:Vnet1:10.0.0.5": {
"routing_type":"vnet_encap",
"underlay_ip":"100.1.2.3",
"mac_address":"F9-22-83-99-22-A2"
},
"OP": "SET"
},
{
"DASH_VNET_MAPPING_TABLE:Vnet1:10.1.1.1": {
"routing_type":"vnet_encap",
"underlay_ip":"101.1.2.3",
"mac_address":"F9-22-83-99-22-A2",
"metering_class":"1001"
},
"OP": "SET"
},
{
"DASH_ROUTE_RULE_TABLE:F4939FEFC47E:45654:101.1.2.3/32: {
"action_type":"decap",
"priority": 1,
"protocol": "0",
"vnet":"Vnet1",
"pa_validation": true
}
},
{
"DASH_ROUTE_RULE_TABLE:F4939FEFC47E:45654: {
"action_type":"decap",
"priority": 2,
"protocol": "0",
"vnet":"Vnet2",
"pa_validation": true
}
},
{
"DASH_METER_POLICY: {
"meter_policy_id": "245bea34-1000-0000-0000-0000082764ac",
"ip_version": "ipv4"
}
},
{
"DASH_METER_RULE: {
"meter_policy_id": "245bea34-1000-0000-0000-0000082764ac",
"rule_num": "1",
"prioirty": "0",
"ip_prefix": "40.0.0.1/32",
"metering_class":"20000"
}
"OP": "SET"
},
{
"DASH_METER: {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"metadata":"ROUTE_DIRECT_VNET1",
"metering_class": "1000"
}
"OP": "SET"
},
{
"DASH_METER: {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"metadata":"MAPPING_VNET1_10010101",
"metering_class": "1001"
}
"OP": "SET"
},
{
"DASH_METER: {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"metadata":"MAPPING_VNET1_10000006",
"metering_class": "1002"
}
"OP": "SET"
},
{
"DASH_METER: {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"metadata":"ROUTE_DIRECT_POLICY_40000001",
"metering_class": "20000"
}
"OP": "SET"
},
]
For the example configuration above, the following is a brief explanation of lookup behavior in the outbound direction:
1. Packet destined to 10.1.1.1:
a. LPM lookup hits for entry 10.1.0.0/16
b. The action in this case is "vnet" and the routing type for "vnet" is "maprouting"
c. Next lookup shall happen on the "mapping" table for Vnet "Vnet1"
d. Mapping table for 10.1.1.1 shall be hit and it takes the action "vnet_encap".
e. Encap action shall be performed and use PA address as specified by "underlay_ip
f. Metering:
- LPM route does not have a metering bucket configured
- Metering policy lookup is not enabled
- Metering class 1001 from mapping entry 10.1.1.1 is used for accounting the traffic
2. Packet destined to 10.1.0.1:
a. LPM lookup hits for entry 10.1.0.0/24
b. The action in this case is "vnet_direct" and the routing type for "vnet" is "maprouting", with overlay_ip specified
c. Next lookup shall happen on the "mapping" table for Vnet "Vnet1", but for overlay_ip 10.0.0.6
d. Mapping table for 10.0.0.6 shall be hit and it takes the action "vnet_encap".
e. Encap action shall be performed and use PA address as specified by "underlay_ip"
f. Metering:
- LPM route does not have a metering bucket configured
- Metering policy lookup is not enabled
- Metering class 1002 from mapping entry 10.0.0.6 is used for accounting the traffic
3. Packet destined to 30.0.0.1
a. LPM lookup hits for entry 30.0.0.0/16
b. The action in this case is "direct".
c. Direct routing happens without any further encapsulation
d. Metering:
- LPM route has metering class 1000 configured
- Metering policy lookup is not enabled
- Metering class 1000 from routing entry 30.0.0.0/16 is used for accounting the traffic
4. Packet destined to 10.2.5.1
a. LPM lookup hits for entry 10.2.5.0/24
b. The action in this case is "drop".
c. Packets gets dropped
5. Packet destined to 40.0.0.1
a. LPM lookup hits for entry 40.0.0.0/16
b. The action in this case is "direct"
c. Direct routing happens without any further encapsulation
d. Metering:
- LPM route has metering class 1000 configured
- Metering policy lookup is enabled
- Metering policy 245bea34-1000-0000-0000-0000082764ac is looked up. Metering rule_num 1 is hit
- Metering class 20000 from metering rule_num 1 is used for accounting the traffic
For the inbound direction, after Route/ACL lookup, pipeline shall use the "underlay_ip" as specified in the ENI table to VXLAN encapsulate the packet and VNI shall be the vm_vni
specified in the APPLIANCE table
5. Inbound packet destined to 10.1.2.5 with source PA 101.1.2.3 and VNI 45654
a. After setting direction to inbound, the Route Rule table is looked up based on priority
b. First Inbound rule gets hit as PR prefix and VNI key match
c. PA validation is set to true and Vnet is given as Vnet1.
d. PA 101.1.2.3 is matched against Vnet1's mapping table and it is a hit. (Validation succeeds)
e. Action is to decap the packet
[
{
"DASH_ROUTING_TYPE_TABLE:servicetunnel": [
{
"name": "action1",
"action_type": "4to6",
},
{
"name": "action2",
"action_type": "staticencap",
"encap_type": "nvgre"
"key":"100"
} ],
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:50.1.2.0/24": {
"action_type":"servicetunnel",
"overlay_sip":"fd00:108:0:d204:0:200::0",
"overlay_dip":"2603:10e1:100:2::0",
"underlay_sip":"40.1.2.1"
"metering_policy_en":"false",
"metering_class":"50000"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:60.1.2.1/32": {
"action_type":"servicetunnel",
"overlay_sip":"fd00:108:0:d204:0:200::0",
"overlay_dip":"2603:10e1:100:2::0",
"underlay_sip":"30.1.2.1",
"underlay_dip":"25.1.2.1"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:70.1.2.0/24": {
"action_type":"servicetunnel",
"overlay_sip":"fd00:108:0:d204:0:200::0",
"overlay_dip":"2603:10e1:100:2::4601:203",
"underlay_sip":"34.1.2.1"
},
"OP": "SET"
},
{
"DASH_METER: {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"metadata":"SERVICE_TUNNEL_ROUTE_50010200",
"metering_class": "50000"
}
"OP": "SET"
},
]
For the example configuration above, the following is a brief explanation of lookup behavior in the outbound direction:
1. Packet destined to 50.1.2.1 from 10.1.1.1:
a. LPM lookup hits for entry 50.1.2.0/24
b. The action in this case is "servicetunnel"
c. First Action for "servicetunnel" is 4to6 transposition
d. Packet gets transformed as: Overlay SIP fd00:108:0:d204:0:200::a01:101, Overlay DIP 2603:10e1:100:2::3201:201
e. Second Action is Static NVGRE encap.
f. Since underlay dip is not specified in the LPM table, It shall use Dst IP (overlay) from packet, i.e 50.1.2.1 and underlay Src IP as 40.1.2.1
g. Metering:
- LPM route has metering bucket 50000 configured
- Metering policy lookup is not enabled
- Metering class 50000 from the route entry is used for accounting the traffic
2. Packet destined to 60.1.2.1 from 10.1.1.1:
a. LPM lookup hits for entry 60.1.2.1/32
b. The action in this case is "servicetunnel"
c. First Action for "servicetunnel" is 4to6 transposition
d. Packet gets transformed as: Overlay SIP fd00:108:0:d204:0:200::0a01:101, Overlay DIP 2603:10e1:100:2::3c01:201
e. Second Action is Static NVGRE encap.
f. Since underlay sip/dip is specified in the LPM table, It shall use Dst IP (25.1.2.1), Src IP (30.1.2.1)
g. Thus the packet is sent with two headers - One Inner IPv6 and another Outer IPv4
h. Inbound packet for this flow:
h.1 The return packet for the above flow shall land on an SLB MUX (30.1.2.1) as NVGRE packet
h.2 SLB MUX (30.1.2.1) shall encapsulate this to another header with standard GRE key, say 100 and forwards to Appliance PA
h.3 Appliance PA is the unique IP advertised by the Appliance card, say via BGP for reachability.
h.4 Appliance shall first decapsulate the outer header and map it to a flow
h.5 Second header's dst mac shall correspond to ENI MAC, as overwritten by SLB MUX
h.6 Third header shall be the transpositioned IPv6 header
i. Note: This flow fixup shall be done when FastPath kicks in with ICMP Redirect, and packets ingress with two headers.
3. Packet destined to 70.1.2.1 from 10.1.1.1:
a. LPM lookup hits for entry 70.1.2.0/24
b. The action in this case is "servicetunnel"
c. First Action for "servicetunnel" is 4to6 transposition
d. Packet gets transformed as: Overlay SIP fd00:108:0:d204:0:200::0a01:101, since overlay DIP is fully provided, it shall be 2603:10e1:100:2::4601:203
e. Second Action is Static NVGRE encap.
f. Since underlay dip is not specified in the LPM table, It shall use Dst IP from packet, i.e 70.1.2.1 and underlay Src IP as 34.1.2.1
[
{
"DASH_ROUTING_TYPE_TABLE:privatelink": [
{
"name": "action1",
"action_type": "4to6",
},
{
"name": "action2",
"action_type": "staticencap",
"encap_type": "nvgre"
"key":"100"
} ],
"OP": "SET",
"DASH_ROUTING_TYPE_TABLE:privatelinknsg": [
{
"name": "action1",
"action_type": "4to6",
},
{
"name": "action2",
"action_type": "staticencap",
"encap_type": "nvgre"
"key":"100"
},
{
"name": "action3",
"action_type": "appliance",
} ],
"OP": "SET"
},
{
DASH_ROUTING_APPLIANCE_TABLE:22: {
"appliance_guid":"497f23d7-f0ac-4c99",
"addresses": "100.8.1.2",
"encap_type": "vxlan",
"vni": 101
},
"OP": "SET"
},
{
"DASH_ENI_TABLE:F4939FEFC47E": {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"mac_address": "F4-93-9F-EF-C4-7E",
"underlay_ip": "25.1.1.1",
"admin_state": "enabled",
"vnet": "Vnet1",
"pl_sip_encoding": "0x0020000000000a0b0c0d0a0b/0x002000000000ffffffffffff",
"pl_underlay_sip": "55.1.2.3"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:10.1.0.8/32": {
"action_type":"vnet",
"vnet":"Vnet1"
"metering_policy_en":"false",
"metering_class":"60000"
},
"OP": "SET"
},
{
"DASH_VNET_MAPPING_TABLE:Vnet1:10.1.0.8": {
"routing_type":"privatelink",
"mac_address":"F9-22-83-99-22-A2",
"underlay_ip":"50.1.2.3",
"overlay_sip":"fd40:108:0:d204:0:200::0",
"overlay_dip":"2603:10e1:100:2::3401:203",
"metering_class":"60001",
"override_meter":"true"
},
"OP": "SET"
},
{
"DASH_ROUTE_TABLE:F4939FEFC47E:10.2.0.6/24": {
"action_type":"vnet",
"vnet":"Vnet1"
},
"OP": "SET"
},
{
"DASH_VNET_MAPPING_TABLE:Vnet1:10.2.0.6": {
"routing_type":"privatelink",
"mac_address":"F9-22-83-99-22-A2",
"underlay_ip":"50.2.2.6",
"overlay_sip":"fd40:108:0:d204:0:200::0",
"overlay_dip":"2603:10e1:100:2::3402:206",
},
"OP": "SET"
},
{
"DASH_VNET_MAPPING_TABLE:Vnet1:10.2.0.9": {
"routing_type":"privatelinknsg",
"mac_address":"F9-22-83-99-22-A2",
"underlay_ip":"50.2.2.6",
"overlay_sip":"fd40:108:0:d204:0:200::0",
"overlay_dip":"2603:10e1:100:2::3402:206",
"routing_appliance_id":22
},
"OP": "SET"
},
"DASH_METER: {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"metadata":"ROUTE_VNET1_10010008",
"metering_class": "60000"
}
"OP": "SET"
},
{
"DASH_METER: {
"eni_id": "497f23d7-f0ac-4c99-a98f-59b470e8c7bd",
"metadata":"PRIVATE_LINK_VNET1_10010008",
"metering_class": "60001"
}
"OP": "SET"
},
]
For the example configuration above, the following is a brief explanation of lookup behavior in the outbound direction:
1. Packet destined to 10.1.0.8 from 10.1.1.1:
a. LPM lookup hits for entry 10.1.0.8/32
b. The action in this case is "vnet"
c. Next lookup is in the mapping table and mapping table action here is "privatelink"
d. First Action for "privatelink" is 4to6 transposition
e. Packet gets transformed as:
For Overlay SIP, using ENI's "pl_sip_encoding": "0x0020000000000a0b0c0d0a0b/0x002000000000ffffffffffff" -> Overlay SIP fd30:108:0:0a0b:0c0d:0a0b:a01:101 using the following logic:
1. fv = (fd40:108:0:d204:0:200::0 & !0x002000000000ffffffffffff) (first 96 bits based on provided mask length)
2. result = fv | 0x0020000000000a0b0c0d0a0b (first 96 bits based on the provided mask length)
3. result = result | source CA (last 32 bits if its set to 0 in mapping, implicit conversion)
Overlay DIP 2603:10e1:100:2::3401:203 (No transformation, provided as part of mapping)
f. Second Action is Static NVGRE encap with GRE key '100'.
g. Underlay DIP shall be 50.1.2.3 (from mapping), Underlay SIP shall be 55.1.2.3 (from ENI)
h. Metering:
- LPM route has metering bucket 60000 configured
- Metering policy lookup is not enabled
- Override meter is enabled for the mapping entry 10.1.0.8
- Metering class 60001 from mapping entry 10.1.0.8 is used for accounting the traffic
2. Packet destined to 10.2.0.6 from 10.1.1.2:
a. LPM lookup hits for entry 10.2.0.6/24
b. The action in this case is "vnet"
c. Next lookup is in the mapping table and mapping table action here is "privatelink"
d. First Action for "privatelink" is 4to6 transposition
e. Packet gets transformed as:
For Overlay SIP, using ENI's "pl_sip_encoding": "0x0020000000000a0b0c0d0a0b/0x002000000000ffffffffffff" -> Overlay SIP fd30:108:0:0a0b:0c0d:0a0b:a01:102;
Overlay DIP 2603:10e1:100:2::3402:206 (No transformation, provided as part of mapping)
f. Second Action is Static NVGRE encap with GRE key '100'.
g. Underlay DIP shall be 50.2.2.6 (from mapping), Underlay SIP shall be 55.1.2.3 (from ENI)
3. Packet destined to 10.2.0.9 from 10.1.1.2:
a. LPM lookup hits for entry 10.2.0.6/24
b. The action in this case is "vnet"
c. Next lookup is in the mapping table and mapping table action here is "privatelinknsg"
d. First Action for "privatelink" is 4to6 transposition
e. Packet gets transformed as:
For Overlay SIP, using ENI's "pl_sip_encoding": "0x0020000000000a0b0c0d0a0b/0x002000000000ffffffffffff" -> Overlay SIP fd30:108:0:0a0b:0c0d:0a0b:a01:102;
Overlay DIP 2603:10e1:100:2::3402:206 (No transformation, provided as part of mapping)
f. Second Action is Static NVGRE encap with GRE key '100'.
g. Underlay DIP shall be 50.2.2.6 (from mapping), Underlay SIP shall be 55.1.2.3 (from ENI)
h. Third Action is Appliance Encap for id 22
i. Packet shall be encapsulated with Outer DIP as 100.8.1.2 and SIP as VIP of this originating appliance card with VNI of 101.
j. Inbound flow shall be similar to PL and outer encap shall be of the SLB MUX and not of the NSG appliance.