HIGH
fork VMA Race
CVE-2024-27022
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
KernelScan AI: 7.8 HIGH
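
As a cross-check on the 7.8 rating, the base score can be recomputed from the CVSS vector above. The sketch below applies the CVSS v3.1 base-score formula for the Scope: Unchanged case only. The function names `roundup` and `base_score` are illustrative choices, not part of any scanner API.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as the specification defines Roundup."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score of a CVSS:3.1 vector whose scope is S:U."""
    # Drop the "CVSS:3.1" prefix and turn "AV:L/AC:L/..." into a dict.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 1.83, so the sum of roughly 7.71 rounds up to 7.8.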
01 Description
In the Linux kernel, the following vulnerability has been resolved:

fork: defer linking file vma until vma is fully initialized

Thorvald reported a WARNING [1]. The root cause is the race below (time runs downward):

| CPU 1 | CPU 2 |
|---|---|
| fork | hugetlbfs_fallocate |
| dup_mmap | hugetlbfs_punch_hole |
| i_mmap_lock_write(mapping); | |
| vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree. | |
| i_mmap_unlock_write(mapping); | |
| hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem! | |
| | i_mmap_lock_write(mapping); |
| | hugetlb_vmdelete_list |
| | vma_interval_tree_foreach |
| | hugetlb_vma_trylock_write -- Vma_lock is cleared. |
| tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem! | |
| | hugetlb_vma_unlock_write -- Vma_lock is assigned!!! |
| | i_mmap_unlock_write(mapping); |

hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside the i_mmap_rwsem lock while the vma lock can be used at the same time. Fix this by deferring linking the file vma until the vma is fully initialized. Those vmas should be initialized first before they can be used.
02 KernelScan AI Analysis
Risk summary
A race condition during process forking can make a hugetlb VMA visible through the i_mmap tree before it is fully initialized, so a concurrent hugetlbfs operation may use its vma_lock in an inconsistent state. The reported symptom is a kernel WARNING, and kernel crashes are possible. This affects systems using hugetlb memory and could be exploited for privilege escalation through memory corruption, though exploitation would be challenging due to the narrow timing window.
Vulnerability analysis
Root Cause: During process forking in dup_mmap(), child VMAs are made visible in the i_mmap interval tree before they are fully initialized. Specifically, vma_interval_tree_insert_after() is called while holding i_mmap_rwsem, but hugetlb_dup_vma_private() and vm_ops->open() are called after releasing the lock. This creates a window where another CPU can find the partially initialized VMA through the i_mmap tree and attempt to use its uninitialized vma_lock.
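
The interleaving described above can be replayed deterministically in user space. The following is a toy model, not kernel code: `Vma`, `trylock_write`, `unlock_write`, and `replay_reported_race` are hypothetical names, Python's `threading.Lock` stands in for the hugetlb vma_lock, and the rule that locking is a no-op when no lock object is attached is an assumed simplification of the kernel helpers. Events force the two threads into the order shown in the description.

```python
import threading


class Vma:
    """Toy stand-in for a hugetlb VMA; vma_lock models the per-VMA lock."""

    def __init__(self, vma_lock=None):
        self.vma_lock = vma_lock


def trylock_write(vma):
    # Assumed simplification: with no lock attached there is nothing to take.
    if vma.vma_lock is None:
        return True
    return vma.vma_lock.acquire(blocking=False)


def unlock_write(vma):
    # Releasing a threading.Lock that is not held raises RuntimeError.
    if vma.vma_lock is not None:
        vma.vma_lock.release()


def replay_reported_race():
    i_mmap = []                      # stands in for the i_mmap interval tree
    i_mmap_rwsem = threading.Lock()  # stands in for mapping->i_mmap_rwsem
    cleared = threading.Event()
    walker_locked = threading.Event()
    opened = threading.Event()
    warnings = []
    child = Vma(vma_lock=threading.Lock())  # stale copy from the parent

    def cpu1_fork():
        with i_mmap_rwsem:
            i_mmap.append(child)          # child is now visible to walkers
        child.vma_lock = None             # models hugetlb_dup_vma_private
        cleared.set()
        walker_locked.wait()
        child.vma_lock = threading.Lock()  # models vm_ops->open
        opened.set()

    def cpu2_punch_hole():
        cleared.wait()
        with i_mmap_rwsem:
            for vma in i_mmap:
                if not trylock_write(vma):
                    continue
                walker_locked.set()
                opened.wait()
                try:
                    unlock_write(vma)
                except RuntimeError:
                    warnings.append("unlock of a vma_lock that was never taken")

    threads = [threading.Thread(target=cpu1_fork),
               threading.Thread(target=cpu2_punch_hole)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return warnings


if __name__ == "__main__":
    print(replay_reported_race())
```

In this model the walker "locks" while the child has no lock object and then unlocks a freshly allocated lock it never acquired, which is the unbalanced pair the description annotates with "Vma_lock is cleared" and "Vma_lock is assigned".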
Attack Surface: This is a local vulnerability affecting process creation (fork/clone system calls). It requires the ability to create processes with hugetlb VMAs while concurrent hugetlbfs operations (like fallocate/punch_hole) are occurring. The race window is narrow but can lead to kernel crashes or potential privilege escalation through memory corruption.
Fix Mechanism: The patch reorders operations in dup_mmap() to defer linking file VMAs into the i_mmap tree until after they are fully initialized. The sequence is changed to: 1) Initialize hugetlb private data, 2) Store VMA in maple tree, 3) Call vm_ops->open(), and only then 4) Link to i_mmap tree under lock. This ensures VMAs are never visible to other CPUs in a partially initialized state.
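
The effect of the reordering can be expressed as a simple invariant: no step that touches the vma_lock may run after the VMA has been linked into the i_mmap tree. The sketch below checks that invariant on two step lists. The step names are illustrative labels, `POST_FIX` follows the four-step sequence given above, and `PRE_FIX` is an assumed reconstruction of the old order based on the root-cause description (the position of the maple tree store does not affect the result).

```python
# Steps that clear or allocate the hugetlb vma_lock of the child VMA.
INIT_STEPS = {"hugetlb_dup_vma_private", "vm_ops_open"}
LINK_STEP = "link_into_i_mmap"

# Assumed pre-fix order: the child is linked first, then initialized.
PRE_FIX = [
    LINK_STEP,
    "hugetlb_dup_vma_private",
    "store_in_maple_tree",
    "vm_ops_open",
]

# Post-fix order as described above: linking happens last.
POST_FIX = [
    "hugetlb_dup_vma_private",
    "store_in_maple_tree",
    "vm_ops_open",
    LINK_STEP,
]


def init_steps_after_link(order):
    """Return the vma_lock-touching steps that run while the VMA is visible."""
    linked_at = order.index(LINK_STEP)
    return [step for step in order[linked_at + 1:] if step in INIT_STEPS]


if __name__ == "__main__":
    print("pre-fix exposure: ", init_steps_after_link(PRE_FIX))
    print("post-fix exposure:", init_steps_after_link(POST_FIX))
```

An empty result means a walker holding i_mmap_rwsem can only ever find VMAs whose vma_lock is already in its final state.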
03 Fix Versions
| Branch | Fixed in | Patch commit |
|---|---|---|
| 6.6 | 6.6.134 | 2e5cbab8ccbf |
| 6.8 | 6.8.8 | abdb88dd272b |
| mainline | 6.9 | 35e351780fa9 |
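
A release string can be compared against the table above with a few lines of code. This is a minimal sketch under stated assumptions: `parse_release` and `has_fix` are hypothetical helpers, the check looks only at the upstream version number and ignores suffixes such as `-rc` tags, and it returns `None` for branches the table does not list. Distribution kernels often backport fixes without changing the base version, so a `False` result is a prompt to check the vendor advisory rather than proof of exposure.

```python
import re

# Fixed releases copied from the table above.
FIXED_IN_BRANCH = {(6, 6): (6, 6, 134), (6, 8): (6, 8, 8)}
MAINLINE_FIX = (6, 9, 0)


def parse_release(release):
    """Turn a release string such as '6.8.7-generic' into (6, 8, 7)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def has_fix(release):
    """Return True/False for listed branches, None when the table is silent."""
    version = parse_release(release)
    fixed = FIXED_IN_BRANCH.get(version[:2])
    if fixed is not None:
        return version >= fixed
    if version >= MAINLINE_FIX:
        return True
    return None  # branch not covered by the table


if __name__ == "__main__":
    import platform
    print(platform.release(), has_fix(platform.release()))
```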