introduce kni isolate rx queue support #695

Open
wants to merge 2 commits into
base: devel


@wanlebing wanlebing commented Jan 3, 2021

introduce kni isolate rx queue support:

  1. the local ip filter fixes the input set on ixgbe/i40e;
  2. with the local ip filter in place, a dip filter is not supported on ixgbe and i40e, but it is supported on Mellanox;
  3. on ixgbe/i40e, use dip + dport + dst_port_mask filters that cover the port range [0-65535] in place of the dip filter;
  4. recommend using rte_eth_dev_filter_ctrl to configure the kni fdir for kni ip, since rte_flow is not well supported on ixgbe;
  5. recommend using rte_flow to configure the rss queue region so that it excludes the kni rx queue;
  6. supports Mellanox (cx4/5), ixgbe (82599ES) and i40e (X710/XL710);
  7. supports ipv4/ipv6 kni fdir;
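
To illustrate item 3, here is a minimal sketch of how a set of dip + dport filters sharing one dst_port_mask can cover the whole port range. It assumes the match rule is `(dst_port_mask & pkt.dport) == base`, as described in the patch's code comment. The helper names `kni_port_bases` and `kni_port_matches` are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): list every value that
 * (dport & mask) can take. Installing one filter per value, all with
 * the same dst_port_mask, then covers the range [0-65535].
 * Returns the number of bases written, or 0 if `cap` is too small.
 */
static size_t kni_port_bases(uint16_t mask, uint16_t *bases, size_t cap)
{
    size_t n = 0;
    uint32_t sub = 0;

    assert(bases != NULL);
    /* Submask enumeration: visits each subset of the mask bits once. */
    do {
        if (n == cap)
            return 0;
        bases[n++] = (uint16_t)sub;
        sub = (sub - (uint32_t)mask) & mask;
    } while (sub != 0);
    return n;
}

/* Return 1 if dport hits one of the filters described by (mask, bases). */
static int kni_port_matches(uint16_t dport, uint16_t mask,
                            const uint16_t *bases, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if ((dport & mask) == bases[i])
            return 1;
    return 0;
}
```

With a mask of 0x0003 this yields the four bases 0, 1, 2 and 3, so four filters per kni ip are enough for every destination port to match one of them.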

@wanlebing wanlebing force-pushed the kni_isolate_rx branch 3 times, most recently from 5f36834 to a9fcd98 Compare January 3, 2021 11:09
@wanlebing wanlebing closed this Jan 3, 2021
@wanlebing wanlebing reopened this Jan 3, 2021
@wanlebing wanlebing force-pushed the kni_isolate_rx branch 3 times, most recently from a2c49fe to e5b6d6d Compare January 3, 2021 11:53
@ywc689 ywc689 requested review from ywc689 and anonymous-zx and removed request for ywc689 January 5, 2021 06:29
@ywc689 ywc689 self-assigned this Jan 5, 2021
@ywc689 ywc689 self-requested a review January 5, 2021 06:29
@anonymous-zx anonymous-zx added the pr/to-review-codes review codes line by line and check if problem exists. label Jan 7, 2021
@anonymous-zx
Collaborator

anonymous-zx commented Jan 14, 2021

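A couple of questions:

  1. How much does an isolated kni rx queue improve performance? Is the main goal to separate the data plane from the control plane?
  2. As I read it, the current code steers all traffic for the IPs in struct kni_addr ip[NETIF_KNI_ADDR_MAX_NUM] to the kni queue. Is my understanding correct? If that is the rule, how is traffic handled on a kni device that has a VIP configured?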

@wanlebing
Author

wanlebing commented Jan 14, 2021

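> A couple of questions:
>
>   1. How much does an isolated kni rx queue improve performance? Is the main goal to separate the data plane from the control plane?
>   2. As I read it, the current code steers all traffic for the IPs in struct kni_addr ip[NETIF_KNI_ADDR_MAX_NUM] to the kni queue. Is my understanding correct? If that is the rule, how is traffic handled on a kni device that has a VIP configured?

  1. Yes, the goal is to separate the control plane from the data plane. kni traffic no longer detours through the workers, so BGP, health check, ssh and similar traffic gets high priority. When the workers are busy, BGP sessions and health check traffic are not affected. We have run into BGP neighbor flapping before.
  2. Correct. Intel NICs have a limitation: once the local ip filter (ip + port granularity) is installed, the input set is fixed and ip-granularity filters no longer take effect. For backward compatibility we add extra rules that cover the port range [0-65535] for each kni ip, which has the effect of steering by kni ip. One thing has not been updated yet: if the number of forwarding cores is not 2^n, the ports cannot be fully covered. I will update this.
  3. What do you mean by traffic on a kni with a VIP configured? Which mode is that? I may not have used it.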
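
The coverage gap mentioned in answer 2 can be sketched as follows. This is an illustration under two assumptions that are not confirmed by the thread: the port mask is the smallest all-ones mask able to index the forwarding cores, and filters are installed only for the bases 0 .. nworkers-1. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed derivation: smallest all-ones mask that can index nworkers cores. */
static uint16_t port_mask_for_workers(unsigned nworkers)
{
    uint16_t mask = 0;

    assert(nworkers >= 1 && nworkers <= 256);
    while ((unsigned)mask + 1 < nworkers)
        mask = (uint16_t)((mask << 1) | 1);
    return mask;
}

/*
 * Count destination ports that match no filter when only the bases
 * 0 .. nworkers-1 are installed (assumption, see the text above).
 */
static uint32_t uncovered_ports(unsigned nworkers)
{
    uint16_t mask = port_mask_for_workers(nworkers);
    uint32_t missed = 0;

    for (uint32_t port = 0; port <= 65535; port++)
        if ((port & mask) >= nworkers)
            missed++;
    return missed;
}
```

Under these assumptions, 4 or 8 cores leave no port uncovered, while 3 cores leave every port with `(port & 0x3) == 3` unmatched, which is a quarter of the range.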

@wanlebing
Author

wanlebing commented Jan 14, 2021

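> A couple of questions:
>
>   1. How much does an isolated kni rx queue improve performance? Is the main goal to separate the data plane from the control plane?
>   2. As I read it, the current code steers all traffic for the IPs in struct kni_addr ip[NETIF_KNI_ADDR_MAX_NUM] to the kni queue. Is my understanding correct? If that is the rule, how is traffic handled on a kni device that has a VIP configured?

Do you mean the case where the VIP is also a kni ip? kni traffic is steered first, and service traffic goes through the rss flow.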

@wanlebing wanlebing closed this Jan 14, 2021
@wanlebing wanlebing reopened this Jan 14, 2021
@sjaliang

Does this patch require a particular DPDK version? What is the minimum version needed? rte_flow support in DPDK 17.11 is not very good.

@anonymous-zx anonymous-zx added pr/needs-confirmed the feature in the pr is what we need,and list what cases should be checked in later stages pr/to-confirm-needs consider whether the feature of pr is needed and removed pr/to-review-codes review codes line by line and check if problem exists. pr/needs-confirmed the feature in the pr is what we need,and list what cases should be checked in later stages labels Jan 15, 2021
@wanlebing
Author

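> Does this patch require a particular DPDK version? What is the minimum version needed? rte_flow support in DPDK 17.11 is not very good.

Any version whose rte_flow supports configuring an rss queue region will do. We are all on 18.11.2 here; 17.11 indeed has not been tested.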

@ywc689 ywc689 added pr/needs-confirmed the feature in the pr is what we need,and list what cases should be checked in later stages and removed pr/to-confirm-needs consider whether the feature of pr is needed labels Jan 19, 2021
@wanlebing wanlebing force-pushed the kni_isolate_rx branch 2 times, most recently from 05d61dc to 4c8dc61 Compare March 21, 2021 12:30
@wanlebing
Author

wanlebing commented Mar 21, 2021

1. Add a lock to protect the netdev flow API.
2. Add a comment for the ixgbe IPv6 filter: signature mode is required for the ipv6 filter, and it matches the first 4 bytes of the ipv6 address, which can cause problems. For example, after adding a kni filter for fdbd:dc02:9:135::13, a packet with destination fdbd:dc02:9:135:0:13:0:1 will also match that kni filter.
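
The false match in item 2 can be reproduced with a simplified model. The sketch below only compares address prefixes, following the description above. It is not the actual ixgbe signature-mode logic, and the helper name `ipv6_prefix_equal` is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Return 1 if both textual IPv6 addresses share their first `nbytes`
 * bytes, 0 if they differ, -1 if either address fails to parse.
 * Simplified stand-in for the matching described in the comment.
 */
static int ipv6_prefix_equal(const char *a, const char *b, size_t nbytes)
{
    struct in6_addr x, y;

    assert(nbytes <= sizeof(x.s6_addr));
    if (inet_pton(AF_INET6, a, &x) != 1 || inet_pton(AF_INET6, b, &y) != 1)
        return -1;
    return memcmp(x.s6_addr, y.s6_addr, nbytes) == 0;
}
```

Calling it with the two addresses from the example and `nbytes = 4` reports a match, while a full 16-byte comparison tells them apart.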

! kni_ipaddress {
! ipv4 <wan link ipv4 address>
! ipv6 <wan link ipv6 address>
! }
}

<init> device bond0 {
Collaborator

Does rx_queue_id need to be configured for the kni worker type?

Comment on lines +34 to +42
* 1. local ip filter will make input set fixed on ixgbe/i40e.
* 2. dip filter is not supported by ixgbe and i40e under the
* premise of local ip filter.
* 3. use dip + dport + dst_port_mask filters to cover port range
* [0-65535] to replace dip filter on ixgbe/i40e.
* 4. kni fdir filter support tcp and udp, icmp not supported.
* 5. if (fdir_conf.mask.dst_port_mask & pkt.dport) equal to an
* element in the port_base_array, pkt will match kni fdir
* filter and redirected to kni rx queue.
Collaborator

Is there a solution to support both tcp/udp and IP protocols? Somebody may prefer OSPF (IP protocol 89) to BGP(TCP) for ECMP routes.

Author

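A flow can match at ip granularity or at tcp/udp granularity. The comment here describes ixgbe/i40e: Intel NICs have an input set, one kind of filter/flow fixes the input set, and filters/flows that use a different input set may then not take effect.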

Comment on lines +90 to +91
&& (dev->kni.rx_queue_id != NETIF_QUEUE_ID_INVALID))
{
Collaborator

&& (dev->kni.rx_queue_id != NETIF_QUEUE_ID_INVALID)) {

Comment on lines +522 to +526
addr.in.s_addr = kni_ip->in.s_addr;
RTE_LOG(INFO, Kni, "[%s] success to add kni fdir ipv4 filter "
"on port: %s for kni_ip: %s\n",
__func__, dev->name,
inet_ntop(AF_INET, &addr, dst, sizeof(dst)) ? dst: "");
Collaborator

addr.in.s_addr = kni_ip->in.s_addr isn't needed.
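
A minimal sketch of what this suggestion amounts to: pass the address to inet_ntop directly instead of copying it into a temporary first. The union `example_kni_addr` is a hypothetical stand-in, since the real layout of struct kni_addr is not shown in this thread, and `kni_ipv4_str` is likewise an illustrative name.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the patch's address type (layout assumed). */
union example_kni_addr {
    struct in_addr  in;
    struct in6_addr in6;
};

/*
 * Format an IPv4 kni address straight from the caller's storage,
 * without a temporary copy. Returns dst, or "" if formatting fails.
 */
static const char *kni_ipv4_str(const union example_kni_addr *kni_ip,
                                char *dst, socklen_t len)
{
    assert(kni_ip != NULL && dst != NULL);
    return inet_ntop(AF_INET, &kni_ip->in, dst, len) ? dst : "";
}
```

The RTE_LOG call could then take `kni_ipv4_str(kni_ip, dst, sizeof(dst))` as its last argument, assuming `dst` is a buffer of at least INET_ADDRSTRLEN bytes.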

Comment on lines +106 to +108
int kni_addr_cnt;
struct kni_addr kni_ip[NETIF_KNI_ADDR_MAX_NUM]; /* ipv4 or ipv6 */

Collaborator

Could it be better to add flows with the command tool dpip? For example dpip flow add ... dev ....

Comment on lines +186 to +187
&& dev->type == PORT_TYPE_GENERAL) {
return true;
Collaborator

Doesn't it support bonding devices?

Author

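It can be supported, but the flow has to be installed on every slave. We do not use bonding internally, so this check was added. To support bonding, this code would probably need some adaptation.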

Comment on lines +2922 to +2928
if (fwd_mode == KNI_FWD_MODE_ISOLATE_RX) {
nb_rb = rte_eth_rx_burst(dev->id, dev->kni.rx_queue_id,
mbufs, NETIF_MAX_PKT_BURST);
} else {
nb_rb = rte_ring_dequeue_burst(dev->kni.rx_ring, (void**)mbufs,
NETIF_MAX_PKT_BURST, NULL);
}
Collaborator

Some packets are sent to kni from dpvs's protocol stack. I don't think the flow based kni rx can handle all these packets.

Comment on lines +532 to +537
case NETDEV_FLOW_TYPE_RSS:
/* setup rss queues info */
netdev_flow_add_ingress_attribute(netdev_flow, &attr);
netdev_flow_add_rss_patterns(netdev_flow, patts);
netdev_flow_add_rss_actions(port_id, netdev_flow, acts);
break;
Collaborator

Do ixgbe devices support the rss flow type? How does its performance compare to the global RSS?

Author

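ixgbe supports rss flow, and in our tests there was no noticeable performance difference. On ixgbe an ip-granularity rte flow may not take effect, because the local ip filter at ip + port granularity fixes the ixgbe input set. To get an isolated kni rx queue on ixgbe, covering all ports with a port range is probably the more suitable approach.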

Labels
pr/needs-confirmed the feature in the pr is what we need,and list what cases should be checked in later stages
4 participants