  Apr 22, 2022
    • XArray: Disallow sibling entries of nodes · 63b1898f
      Matthew Wilcox (Oracle) authored
      
      There is a race between xas_split() and xas_load() which can result in
      the wrong page being returned, and thus data corruption.  Fortunately,
      it's hard to hit (syzbot took three months to find it) and often guarded
      with VM_BUG_ON().
      
      The anatomy of this race is:
      
      thread A			thread B
      order-9 page is stored at index 0x200
      				lookup of page at index 0x274
      page split starts
      				load of sibling entry at offset 9
      stores nodes at offsets 8-15
      				load of entry at offset 8
      
      The entry at offset 8 turns out to be a node, and so we descend into it,
      and load the page at index 0x234 instead of 0x274.  This is hard to fix
      on the split side; we could replace the entire node that contains the
      order-9 page instead of replacing the eight entries.  Fixing it on
      the lookup side is easier; just disallow sibling entries that point
      to nodes.  This cannot ever be a useful thing as the descent would not
      know the correct offset to use within the new node.
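      
      A minimal sketch of that lookup-side check, assuming it lands in
      xas_descend(), where sibling entries are followed during the walk (the
      exact hunk may differ):
      
        static void *xas_descend(struct xa_state *xas, struct xa_node *node)
        {
        	unsigned int offset = get_offset(xas->xa_index, node);
        	void *entry = xa_entry(xas->xa, node, offset);
        
        	xas->xa_node = node;
        	if (xa_is_sibling(entry)) {
        		offset = xa_to_sibling(entry);
        		entry = xa_entry(xas->xa, node, offset);
        		/* A sibling entry must never point at an internal
        		 * node; if it does, a split is in flight, so make
        		 * the caller retry the walk instead of descending. */
        		if (node->shift && xa_is_node(entry))
        			entry = XA_RETRY_ENTRY;
        	}
        
        	xas->xa_offset = offset;
        	return entry;
        }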
      
      The test suite continues to pass, but I have not added a new test for
      this bug.
      
      Reported-by: <syzbot+cf4cf13056f85dec2c40@syzkaller.appspotmail.com>
      Tested-by: <syzbot+cf4cf13056f85dec2c40@syzkaller.appspotmail.com>
      Fixes: 6b24ca4a ("mm: Use multi-index entries in the page cache")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    • tools: Add kmem_cache_alloc_lru() · b9663a6f
      Matthew Wilcox (Oracle) authored
      Turn kmem_cache_alloc() into a wrapper around kmem_cache_alloc_lru().
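      
      A sketch of what that wrapper can look like in the tools headers
      (placement and exact types are assumptions, not the literal hunk):
      
        struct list_lru;
        
        /* New entry point used by the test suite's xarray code. */
        void *kmem_cache_alloc_lru(struct kmem_cache *cachep,
        			   struct list_lru *lru, int gfp);
        
        /* Existing callers keep compiling: a NULL lru behaves exactly
         * like the old kmem_cache_alloc(). */
        static inline void *kmem_cache_alloc(struct kmem_cache *cachep, int gfp)
        {
        	return kmem_cache_alloc_lru(cachep, NULL, gfp);
        }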
      
      Fixes: 9bbdc0f3 ("xarray: use kmem_cache_alloc_lru to allocate xa_node")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reported-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Reported-by: Li Wang <liwang@redhat.com>
    • Merge branch 'akpm' (patches from Andrew) · 281b9d9a
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "13 patches.
      
        Subsystems affected by this patch series: mm (memory-failure, memcg,
        userfaultfd, hugetlbfs, mremap, oom-kill, kasan, hmm), and kcov"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()
        kcov: don't generate a warning on vm_insert_page()'s failure
        MAINTAINERS: add Vincenzo Frascino to KASAN reviewers
        oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup
        selftest/vm: add skip support to mremap_test
        selftest/vm: support xfail in mremap_test
        selftest/vm: verify remap destination address in mremap_test
        selftest/vm: verify mmap addr in mremap_test
        mm, hugetlb: allow for "high" userspace addresses
        userfaultfd: mark uffd_wp regardless of VM_WRITE flag
        memcg: sync flush only if periodic flush is delayed
        mm/memory-failure.c: skip huge_zero_page in memory_failure()
        mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()
    • mm/vmalloc: huge vmalloc backing pages should be split rather than compound · 3b8000ae
      Nicholas Piggin authored
      Huge vmalloc higher-order backing pages were allocated with __GFP_COMP
      in order to allow the sub-pages to be refcounted by callers such as
      "remap_vmalloc_page [sic]" (remap_vmalloc_range).
      
      However, a similar problem exists for the other struct page fields
      callers use; for example, fb_deferred_io_fault() takes a vmalloc'ed
      page and not only refcounts it but also uses ->lru, ->mapping and ->index.
      
      This is not compatible with compound sub-pages, and can cause bad page
      state issues like
      
        BUG: Bad page state in process swapper/0  pfn:00743
        page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x743
        flags: 0x7ffff000000000(node=0|zone=0|lastcpupid=0x7ffff)
        raw: 007ffff000000000 c00c00000001d0c8 c00c00000001d0c8 0000000000000000
        raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
        page dumped because: corrupted mapping in tail page
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-00082-gfc6fff4a7ce1-dirty #2810
        Call Trace:
          dump_stack_lvl+0x74/0xa8 (unreliable)
          bad_page+0x12c/0x170
          free_tail_pages_check+0xe8/0x190
          free_pcp_prepare+0x31c/0x4e0
          free_unref_page+0x40/0x1b0
          __vunmap+0x1d8/0x420
          ...
      
      The correct approach is to use split high-order pages for the huge
      vmalloc backing. These allow callers to treat them in exactly the same
      way as individually-allocated order-0 pages.
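      
      A sketch of the allocation-side change this implies (illustrative only;
      the real hunk is in mm/vmalloc.c's backing-page allocation path and may
      differ in detail):
      
        /* Allocate the high-order backing page *without* __GFP_COMP and
         * split it, so each sub-page becomes an independent order-0 page
         * with its own refcount, ->mapping and ->index. */
        struct page *page = alloc_pages_node(nid, gfp_mask, order);
        
        if (page && order)
        	split_page(page, order);	/* 1 << order order-0 pages */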
      
      Link: https://lore.kernel.org/all/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg.de/
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Cc: Paul Menzel <pmenzel@molgen.mpg.de>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>