KernelScan.io

HIGH

net/mlx5e DmaFifo Desync

CVE-2026-43466

CVSS 8.2 / 10.0 NVD

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:L/A:H

KernelScan AI5.3MEDIUM

01

In the Linux kernel, the following vulnerability has been resolved: net/mlx5e: Fix DMA FIFO desync on error CQE SQ recovery In case of a TX error CQE, a recovery flow is triggered, mlx5e_reset_txqsq_cc_pc() resets dma_fifo_cc to 0 but not dma_fifo_pc, desyncing the DMA FIFO producer and consumer. After recovery, the producer pushes new DMA entries at the old dma_fifo_pc, while the consumer reads from position 0. This causes us to unmap stale DMA addresses from before the recovery. The DMA FIFO is a purely software construct with no HW counterpart. At the point of reset, all WQEs have been flushed so dma_fifo_cc is already equal to dma_fifo_pc. There is no need to reset either counter, similar to how skb_fifo pc/cc are untouched. Remove the 'dma_fifo_cc = 0' reset. This fixes the following WARNING: WARNING: CPU: 0 PID: 0 at drivers/iommu/dma-iommu.c:1240 iommu_dma_unmap_page+0x79/0x90 Modules linked in: mlx5_vdpa vringh vdpa bonding mlx5_ib mlx5_vfio_pci ipip mlx5_fwctl tunnel4 mlx5_core ib_ipoib geneve ip6_gre ip_gre gre nf_tables ip6_tunnel rdma_ucm ib_uverbs ib_umad vfio_pci vfio_pci_core act_mirred act_skbedit act_vlan vhost_net vhost tap ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress vhost_iotlb iptable_raw tunnel6 vfio_iommu_type1 vfio openvswitch nsh rpcsec_gss_krb5 auth_rpcgss oid_registry xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter overlay zram zsmalloc rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core fuse [last unloaded: nf_tables] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.13.0-rc5_for_upstream_min_debug_2024_12_30_21_33 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:iommu_dma_unmap_page+0x79/0x90 Code: 2b 4d 3b 21 72 26 4d 3b 61 08 73 20 49 89 d8 44 89 f9 5b 4c 89 f2 4c 89 e6 48 89 ef 5d 41 5c 41 5d 41 5e 41 5f e9 c7 ae 9e ff <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 Call Trace: <IRQ> ? __warn+0x7d/0x110 ? iommu_dma_unmap_page+0x79/0x90 ? report_bug+0x16d/0x180 ? handle_bug+0x4f/0x90 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 ? iommu_dma_unmap_page+0x79/0x90 ? iommu_dma_unmap_page+0x2e/0x90 dma_unmap_page_attrs+0x10d/0x1b0 mlx5e_tx_wi_dma_unmap+0xbe/0x120 [mlx5_core] mlx5e_poll_tx_cq+0x16d/0x690 [mlx5_core] mlx5e_napi_poll+0x8b/0xac0 [mlx5_core] __napi_poll+0x24/0x190 net_rx_action+0x32a/0x3b0 ? mlx5_eq_comp_int+0x7e/0x270 [mlx5_core] ? notifier_call_chain+0x35/0xa0 handle_softirqs+0xc9/0x270 irq_exit_rcu+0x71/0xd0 common_interrupt+0x7f/0xa0 </IRQ> <TASK> asm_common_interrupt+0x22/0x40

02

Engine v0.2.0

Risk summary

Systems with MLX5 Ethernet hardware are at risk of kernel warnings and potential system instability when TX error recovery is triggered. The bug causes stale DMA address unmapping during error recovery due to desynchronized FIFO indices. Because any local user can generate network traffic that exercises the TX path, unprivileged local attackers can trigger the vulnerable recovery flow.

Affecteddrivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c (MLX5 Ethernet driver)

Vulnerability analysis

The root cause is in mlx5e_reset_txqsq_cc_pc() which incorrectly resets only the DMA FIFO consumer counter (dma_fifo_cc) to 0 while leaving the producer counter (dma_fifo_pc) unchanged during TX error CQE recovery. This desynchronizes the producer and consumer, causing the driver to operate on stale DMA FIFO entries and unmap invalid or stale DMA addresses. The fix removes the erroneous counter reset since all WQEs are already flushed and the counters are naturally synchronized. The exploitable primitive is operation on stale DMA FIFO entries (CWE-119), not a locking issue. Attack surface is local: any user with the ability to send traffic through the MLX5 interface can trigger the TX path, and a subsequent hardware TX error CQE causes the buggy recovery to run.

03

BranchFixed inPatch commit
5.105.10.253821f85d619f7
5.155.15.2036eb68ecc5acc
6.16.1.1676f41f7812bfa
6.126.12.789c5ee9b981ee
6.186.18.19ce1b19dd0684
6.196.19.9829efcccfa8f
6.66.6.130383b37c04a48
mainline7.01633111d6905