  1. Aug 01, 2024
  2. Jul 28, 2024
  3. Jul 20, 2024
  4. Jul 18, 2024
  5. Jul 15, 2024
  6. Jul 10, 2024
  7. Jul 04, 2024
  8. Jun 15, 2024
    • gcc: disable '-Warray-bounds' for gcc-9 · 8e5bd4ea
      Yury Norov authored
      '-Warray-bounds' is already disabled for gcc-10+.  Now that we've merged
      bitmap_{read,write}, I see the following error when building the kernel
      with gcc-9.4 (Ubuntu 20.04.4 LTS) for x86_64 allmodconfig:
      
      drivers/pinctrl/pinctrl-cy8c95x0.c: In function `cy8c95x0_read_regs_mask.isra.0':
      include/linux/bitmap.h:756:18: error: array subscript [1, 288230376151711744] is outside array bounds of `long unsigned int[1]' [-Werror=array-bounds]
        756 |  value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
            |               ~~~^~~~~~~~~~~
      
      The immediate reason is that the commit b4475970 ("bitmap: make
      bitmap_{get,set}_value8() use bitmap_{read,write}()") switched the
      bitmap_get_value8() to an alias of bitmap_read(); the same for 'set'.
      
      Now, the code that triggers -Warray-bounds calls the function like this:
      
        #define MAX_BANK 8
        #define BANK_SZ 8
        #define MAX_LINE        (MAX_BANK * BANK_SZ)
        DECLARE_BITMAP(tval, MAX_LINE); // 64-bit map: unsigned l...
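
To illustrate where an access like `map[index + 1]` comes from, here is a minimal userspace sketch of a bitmap read that may straddle two words. It is not the kernel's bitmap_read(): the name `sketch_bitmap_read`, the `WORD_BITS` macro, and the simplified masking are assumptions made for illustration. For a single-word map read in aligned 8-bit chunks, the straddling branch is never taken at run time, but a compiler that cannot prove this may still flag the second-word access.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: read nbits (1..WORD_BITS) starting at bit 'start'. */
static unsigned long sketch_bitmap_read(const unsigned long *map,
                                        size_t start, size_t nbits)
{
    size_t index = start / WORD_BITS;
    size_t offset = start % WORD_BITS;
    size_t space = WORD_BITS - offset;   /* bits left in the first word */
    unsigned long mask, low;

    assert(nbits >= 1 && nbits <= WORD_BITS);
    mask = nbits == WORD_BITS ? ~0UL : (1UL << nbits) - 1;
    low = map[index] >> offset;

    if (nbits <= space)
        return low & mask;               /* value fits in one word */

    /* Straddling read: the only path that touches map[index + 1]. */
    return (low | (map[index + 1] << space)) & mask;
}
```

With a one-word map, calls such as `sketch_bitmap_read(map, 8, 8)` always stay in the first branch, because an 8-bit read at an 8-bit-aligned offset cannot cross a word boundary.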
  9. May 14, 2024
    • Makefile: remove redundant tool coverage variables · 7f7f6f7a
      Masahiro Yamada authored
      
      Now Kbuild provides reasonable defaults for objtool, sanitizers, and
      profilers.
      
      Remove redundant variables.
      
      Note:
      
      This commit changes the coverage for some objects:
      
        - include arch/mips/vdso/vdso-image.o into UBSAN, GCOV, KCOV
        - include arch/sparc/vdso/vdso-image-*.o into UBSAN
        - include arch/sparc/vdso/vma.o into UBSAN
        - include arch/x86/entry/vdso/extable.o into KASAN, KCSAN, UBSAN, GCOV, KCOV
        - include arch/x86/entry/vdso/vdso-image-*.o into KASAN, KCSAN, UBSAN, GCOV, KCOV
        - include arch/x86/entry/vdso/vdso32-setup.o into KASAN, KCSAN, UBSAN, GCOV, KCOV
        - include arch/x86/entry/vdso/vma.o into GCOV, KCOV
        - include arch/x86/um/vdso/vma.o into KASAN, GCOV, KCOV
      
      I believe these are positive effects because all of them are kernel
      space objects.
      
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Tested-by: Roberto Sassu <roberto.sassu@huawei.com>
  10. May 09, 2024
  11. May 06, 2024
  12. Apr 30, 2024
  13. Apr 26, 2024
  14. Apr 25, 2024
  15. Apr 24, 2024
  16. Apr 16, 2024
  17. Apr 12, 2024
  18. Apr 11, 2024
  19. Apr 09, 2024
  20. Apr 05, 2024
  21. Mar 31, 2024
  22. Mar 26, 2024
  23. Mar 25, 2024
    • sched/fair: Check if a task has a fitting CPU when updating misfit · 22d56074
      Qais Yousef authored
      
      If a misfit task is affined to a subset of the possible CPUs, we need to
      verify that one of these CPUs can fit it. Otherwise the load balancer
      will keep triggering needlessly, which makes balance_interval grow.
      Eventually real imbalances take a long time to address because of this
      impossible imbalance situation.
      
      This can happen in the Android world, where it's common for background
      tasks to be restricted to little cores.
      
      Similarly, if the task can't fit the biggest core, triggering misfit is
      pointless as that core is the best we can ever get on this system.
      
      To detect that, we use asym_cap_list to iterate through the
      capacities in the system to see if the task is able to run at a higher
      capacity level based on its p->cpus_ptr. We do that when the affinity
      changes, a fair task is forked, or a task is switched to the fair policy.
      We store the max_allowed_capacity in task_struct to allow for cheap
      comparison in the fast path.
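
The lookup described above can be sketched in userspace C. This is a simplified model, not the kernel code: `struct cap_level`, `max_allowed_capacity()`, `can_be_misfit()`, and the use of plain bitmasks in place of asym_cap_list and p->cpus_ptr are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the capacity list. */
struct cap_level {
    unsigned long capacity;   /* capacity shared by the CPUs at this level */
    unsigned long cpus;       /* bitmask of the CPUs at this level */
};

/* Highest capacity among the levels that intersect the allowed CPUs. */
static unsigned long max_allowed_capacity(const struct cap_level *levels,
                                          size_t nr_levels,
                                          unsigned long allowed)
{
    unsigned long max = 0;

    assert(levels != NULL || nr_levels == 0);
    for (size_t i = 0; i < nr_levels; i++)
        if ((levels[i].cpus & allowed) && levels[i].capacity > max)
            max = levels[i].capacity;
    return max;
}

/* Fast-path check: misfit only makes sense if a bigger allowed CPU exists. */
static int can_be_misfit(unsigned long cpu_capacity, unsigned long max_allowed)
{
    return cpu_capacity < max_allowed;
}
```

In this model the slow loop runs only when the allowed mask changes, and the cached result turns the per-update check into a single integer comparison.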
      
      Improve the check_misfit_status() function by removing redundant
      checks. misfit_task_load will be 0 if the task can't move to a bigger
      CPU, and nohz_balancer_kick() already calls check_cpu_capacity() before
      calling check_misfit_status().
      
      Test:
      =====
      
      Add
      
      	trace_printk("balance_interval = %lu\n", interval)
      
      in get_sd_balance_interval().
      
      Run
      
      	if [ "$MASK" != "0" ]; then
      		adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
      	fi
      	sleep 10
      	# parse the ftrace buffer, counting the occurrences of each value
      
      Where MASK is either:
      
      	* 0: no busy task running
      	* 1: busy task is pinned to 1 cpu; handled today to not cause
      	  misfit
      	* f: busy task pinned to little cores, simulates busy background
      	  task, demonstrates the problem to be fixed
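
The parsing step in the recipe above is only described in a comment. A minimal sketch of it in C follows, assuming each trace line contains the text `balance_interval = <n>` as emitted by the trace_printk() call; the names `struct tally`, `tally_line()`, `tally_count()`, and the fixed-size table are assumptions for illustration, not the tooling used to produce the results.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_VALUES 64   /* assumed upper bound on distinct interval values */

struct tally {
    unsigned long value[MAX_VALUES];
    unsigned long count[MAX_VALUES];
    size_t nr;
};

/* Count one trace line; lines without the marker text are ignored. */
static void tally_line(struct tally *t, const char *line)
{
    const char *p = strstr(line, "balance_interval = ");
    unsigned long v;

    assert(t != NULL);
    if (!p || sscanf(p, "balance_interval = %lu", &v) != 1)
        return;
    for (size_t i = 0; i < t->nr; i++) {
        if (t->value[i] == v) {
            t->count[i]++;
            return;
        }
    }
    if (t->nr < MAX_VALUES) {   /* first time this value is seen */
        t->value[t->nr] = v;
        t->count[t->nr++] = 1;
    }
}

/* Number of times 'v' was seen, or 0 if it never appeared. */
static unsigned long tally_count(const struct tally *t, unsigned long v)
{
    for (size_t i = 0; i < t->nr; i++)
        if (t->value[i] == v)
            return t->count[i];
    return 0;
}
```

A caller would zero-initialize a `struct tally`, pass each line read with fgets() to `tally_line()`, and then print the value/count pairs.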
      
      Results:
      ========
      
      Note how the occurrence count of balance_interval = 128 overshoots for MASK = f.
      
      BEFORE
      ------
      
      	MASK=0
      
      		   1 balance_interval = 175
      		 120 balance_interval = 128
      		 846 balance_interval = 64
      		  55 balance_interval = 63
      		 215 balance_interval = 32
      		   2 balance_interval = 31
      		   2 balance_interval = 16
      		   4 balance_interval = 8
      		1870 balance_interval = 4
      		  65 balance_interval = 2
      
      	MASK=1
      
      		  27 balance_interval = 175
      		  37 balance_interval = 127
      		 840 balance_interval = 64
      		 167 balance_interval = 63
      		 449 balance_interval = 32
      		  84 balance_interval = 31
      		 304 balance_interval = 16
      		1156 balance_interval = 8
      		2781 balance_interval = 4
      		 428 balance_interval = 2
      
      	MASK=f
      
      		   1 balance_interval = 175
      		1328 balance_interval = 128
      		  44 balance_interval = 64
      		 101 balance_interval = 63
      		  25 balance_interval = 32
      		   5 balance_interval = 31
      		  23 balance_interval = 16
      		  23 balance_interval = 8
      		4306 balance_interval = 4
      		 177 balance_interval = 2
      
      AFTER
      -----
      
      Note how the high values almost disappear for all MASK values. The
      system has background tasks that could trigger the problem without
      simulating it, even with MASK=0.
      
      	MASK=0
      
      		 103 balance_interval = 63
      		  19 balance_interval = 31
      		 194 balance_interval = 8
      		4827 balance_interval = 4
      		 179 balance_interval = 2
      
      	MASK=1
      
      		 131 balance_interval = 63
      		   1 balance_interval = 31
      		  87 balance_interval = 8
      		3600 balance_interval = 4
      		   7 balance_interval = 2
      
      	MASK=f
      
      		   8 balance_interval = 127
      		 182 balance_interval = 63
      		   3 balance_interval = 31
      		   9 balance_interval = 16
      		 415 balance_interval = 8
      		3415 balance_interval = 4
      		  21 balance_interval = 2
      
      Signed-off-by: Qais Yousef <qyousef@layalina.io>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
      Link: https://lore.kernel.org/r/20240324004552.999936-3-qyousef@layalina.io
  24. Mar 22, 2024
  25. Mar 04, 2024
  26. Mar 01, 2024
    • pidfd: add pidfs · cb12fd8e
      Christian Brauner authored
      This moves pidfds from the anonymous inode infrastructure to a tiny
      pseudo filesystem. This has been on my todo for quite a while as it will
      unblock further work that we weren't able to do simply because of the
      very justified limitations of anonymous inodes. Moving pidfds to a tiny
      pseudo filesystem allows:
      
      * statx() on pidfds becomes useful for the first time.
      * pidfds can be compared simply via statx() and then comparing inode
        numbers.
      * pidfds have unique inode numbers for the system lifetime.
      * struct pid is now stashed in inode->i_private instead of
        file->private_data. This means it is now possible to introduce
        concepts that operate on a process once all file descriptors have been
        closed. A concrete example is kill-on-last-close.
      * file->private_data is freed up for per-file options for pidfds.
      * Each struct pid will refer to a different inode but the same struct
        pid will refer to the same inode if it's opened multiple times. This
        is in contrast to now, where each struct pid refers to the same inode.
        Even if we were to move to anon_inode_create_getfile(), which creates
        new inodes, we'd still be associating the same struct pid with
        multiple different inodes.
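
The statx()-based comparison from the list above can be sketched as follows. This is an illustration under stated assumptions, not code from the patch: the helper name `same_inode()` is hypothetical, it requires a libc that exposes statx() (glibc 2.28 or later), and it works on any pair of descriptors. On a kernel with pidfs, the descriptors would typically be pidfds, for example ones obtained from pidfd_open().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if both descriptors refer to the same
 * inode on the same filesystem, 0 if they do not, and -1 if statx() fails.
 */
static int same_inode(int fd_a, int fd_b)
{
    struct statx a, b;

    assert(fd_a >= 0 && fd_b >= 0);
    /* AT_EMPTY_PATH with an empty path queries the descriptor itself. */
    if (statx(fd_a, "", AT_EMPTY_PATH, STATX_INO, &a) < 0 ||
        statx(fd_b, "", AT_EMPTY_PATH, STATX_INO, &b) < 0)
        return -1;

    return a.stx_dev_major == b.stx_dev_major &&
           a.stx_dev_minor == b.stx_dev_minor &&
           a.stx_ino == b.stx_ino;
}
```

Comparing the device numbers as well as the inode number is a defensive choice for arbitrary descriptors; for two pidfds on the same pidfs instance, the inode number alone is what the commit message refers to.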
      
      The tiny pseudo filesystem is not visible anywhere in userspace exactly
      like e.g., pipefs and sockfs. There's no lookup, there's no complex
      inode operations, nothing. Dentries and inodes are always deleted when
      the last pidfd is closed.
      
      We allocate a new inode for each struct pid and we reuse that inode for
      all pidfds. We use iget_locked() to find that inode again based on the
      inode number which isn't recycled. We allocate a new dentry for each
      pidfd that uses the same inode. That is similar to anonymous inodes
      which reuse the same inode for thousands of dentries. For pidfds we're
      talking way less than that. There usually won't be a lot of concurrent
      openers of the same struct pid. They can probably often be counted on
      two hands. I know that systemd does use separate pidfds for the same
      struct pid for various complex process tracking issues. So I think with
      that things actually become way simpler. Especially because we don't
      have to care about lookup. Dentries and inodes continue to be always
      deleted.
      
      The code is entirely optional and fairly small. If it's not selected we
      fall back to anonymous inodes. Heavily inspired by nsfs which uses a
      similar stashing mechanism just for namespaces.
      
      Link: https://lore.kernel.org/r/20240213-vfs-pidfd_fs-v1-2-f863f58cfce1@kernel.org
      
      Signed-off-by: Christian Brauner <brauner@kernel.org>
  27. Feb 27, 2024
  28. Feb 25, 2024