  1. Jul 29, 2024
    • sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime · 77baa5ba
      Zheng Zucheng authored
      In extreme test scenarios, the 14th field (utime) in /proc/xx/stat is
      greater than sum_exec_runtime:
      utime = 18446744073709518790 ns, rtime = 135989749728000 ns

      In cputime_adjust(), stime ends up greater than rtime because of a
      precision problem in mul_u64_u64_div_u64().
      Before calling mul_u64_u64_div_u64():
      stime = 175136586720000, rtime = 135989749728000, utime = 1416780000.
      After calling mul_u64_u64_div_u64():
      stime = 135989949653530

      Because rtime is less than stime, the unsigned subtraction wraps around:
      utime = rtime - stime = 135989749728000 - 135989949653530
      		      = -199925530
      		      = (u64)18446744073709518790
      
      Trigger conditions:
        1). The user task runs in kernel mode most of the time
        2). ARM64 architecture
        3). TICK_CPU_ACCOUNTING=y
            CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set

      Fix the mul_u64_u64_div_u64() conversion precision problem by resetting
      stime to rtime when stime exceeds rtime.
      
      Fixes: 3dc167ba ("sched/cputime: Improve cputime_adjust()")
      Signed-off-by: Zheng Zucheng <zhengzucheng@h...>
      77baa5ba
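The wraparound and the clamp described in this commit can be reproduced in userspace. The following is a minimal sketch, not the kernel code: `struct cputime_split`, `split_unclamped()` and `split_clamped()` are hypothetical names, and `scaled_stime` stands for whatever value mul_u64_u64_div_u64() returned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the final split in cputime_adjust(). */
struct cputime_split {
	uint64_t utime;
	uint64_t stime;
};

/* Pre-fix behaviour: trust the scaled stime and subtract blindly. */
static struct cputime_split split_unclamped(uint64_t rtime, uint64_t scaled_stime)
{
	struct cputime_split s;

	s.stime = scaled_stime;
	/* Unsigned arithmetic: wraps modulo 2^64 if scaled_stime > rtime. */
	s.utime = rtime - scaled_stime;
	return s;
}

/* Post-fix behaviour: clamp stime to rtime before subtracting. */
static struct cputime_split split_clamped(uint64_t rtime, uint64_t scaled_stime)
{
	struct cputime_split s;

	if (scaled_stime > rtime)
		scaled_stime = rtime;

	s.stime = scaled_stime;
	s.utime = rtime - scaled_stime;

	/* With the clamp in place the two parts always add up to rtime. */
	assert(s.utime + s.stime == rtime);
	return s;
}
```

With the values quoted in the commit message (rtime = 135989749728000, scaled stime = 135989949653530), the unclamped variant yields a utime far larger than rtime, while the clamped variant yields utime = 0 and stime = rtime.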
  2. May 27, 2024
  3. Apr 17, 2024
  4. Dec 27, 2022
  5. Jul 04, 2022
  6. Feb 23, 2022
  7. Dec 02, 2021
  8. Nov 23, 2021
  9. Mar 21, 2021
    • sched: Fix various typos · 3b03706f
      Ingo Molnar authored
      
      Fix ~42 single-word typos in scheduler code comments.
      
      We have accumulated a few fun ones over the years. :-)
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: linux-kernel@vger.kernel.org
      3b03706f
  10. Mar 17, 2021
  11. Dec 02, 2020
  12. Jun 15, 2020
    • sched/cputime: Improve cputime_adjust() · 3dc167ba
      Oleg Nesterov authored
      
      People report that utime and stime from /proc/<pid>/stat become very
      wrong when the numbers are big enough, especially if you watch these
      counters incrementally.
      
      Specifically, the current implementation of: stime*rtime/total,
      results in a saw-tooth function on top of the desired line, where the
      teeth grow in size the larger the values become. IOW, it has a
      relative error.
      
      The result is that, when watching incrementally as time progresses
      (for large values), we'll see periods of pure stime or utime increase,
      irrespective of the actual ratio we're striving for.
      
      Replace scale_stime() with a math64.h helper: mul_u64_u64_div_u64()
      that is far more accurate. This also allows architectures to override
      the implementation -- for instance they can opt for the old algorithm
      if this new one turns out to be too expensive for them.
      
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200519172506.GA317395@hirez.programming.kicks-ass.net
      3dc167ba
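The scaling step this commit talks about computes stime * rtime / (stime + utime), and the intermediate product can exceed 64 bits. Below is a minimal userspace sketch of a full-width multiply-then-divide. It assumes a compiler that provides `unsigned __int128` (GCC or Clang on 64-bit targets); `scale_wide()` is a hypothetical name, and this is not the kernel's implementation of mul_u64_u64_div_u64().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute a * b / div using a 128-bit intermediate product, so the
 * multiplication itself cannot overflow.
 *
 * Assumptions of this sketch: div is non-zero and the quotient fits
 * in 64 bits (true whenever a <= div or b <= div).
 */
static uint64_t scale_wide(uint64_t a, uint64_t b, uint64_t div)
{
	unsigned __int128 product;

	assert(div != 0);
	product = (unsigned __int128)a * b;
	return (uint64_t)(product / div);
}
```

A caller modelling cputime_adjust() would use it as `stime = scale_wide(stime, rtime, stime + utime)` and then derive `utime = rtime - stime`. With exact integer arithmetic the scaled stime never exceeds rtime, because stime / (stime + utime) is at most 1.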
  13. Apr 15, 2020
  14. Mar 06, 2020
  15. Jan 17, 2020
  16. Nov 21, 2019
  17. Oct 29, 2019
  18. Oct 09, 2019
    • sched/cputime: Spare a seqcount lock/unlock cycle on context switch · 8d495477
      Frederic Weisbecker authored
      
      On context switch we are locking the vtime seqcount of the scheduling-out
      task twice:
      
       * On vtime_task_switch_common(), when we flush the pending vtime through
         vtime_account_system()
      
       * On arch_vtime_task_switch() to reset the vtime state.
      
      This is pointless as these actions can be performed without the need
      to unlock/lock in the middle. The reason these steps are separated is to
      consolidate a very small amount of common code between
      CONFIG_VIRT_CPU_ACCOUNTING_GEN and CONFIG_VIRT_CPU_ACCOUNTING_NATIVE.
      
      Performance in this fast path is definitely a priority over artificial
      code factorization so split the task switch code between GEN and
      NATIVE and mutualize the parts that can run under a single seqcount
      locked block.
      
      As a side effect, vtime_account_idle() becomes included in the seqcount
      protection. This happens to be a welcome preparation in order to
      properly support kcpustat under vtime in the future and fetch
      CPUTIME_IDLE without race.
      
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
      Link: https://lkml.kernel.org/r/20191003161745.28464-3-frederic@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8d495477
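The saving described in this commit can be illustrated with a toy sequence counter. This is a single-threaded model only, with no memory barriers and no reader side; `struct vtime_model` and every function name below are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: seq is even when the data is stable, odd during a write. */
struct vtime_model {
	unsigned int seq;
	uint64_t stime;
	int state;
};

static void model_write_begin(struct vtime_model *v)
{
	v->seq++;
	assert(v->seq & 1);	/* now odd: write in progress */
}

static void model_write_end(struct vtime_model *v)
{
	assert(v->seq & 1);
	v->seq++;		/* back to even: data stable again */
}

/* Before: the flush and the state reset each take their own write section. */
static void task_switch_two_sections(struct vtime_model *v, uint64_t delta)
{
	model_write_begin(v);
	v->stime += delta;
	model_write_end(v);

	model_write_begin(v);
	v->state = 0;
	model_write_end(v);
}

/* After: both updates run under a single write section. */
static void task_switch_one_section(struct vtime_model *v, uint64_t delta)
{
	model_write_begin(v);
	v->stime += delta;
	v->state = 0;
	model_write_end(v);
}
```

Both variants leave the same data behind, but the second bumps the counter twice instead of four times, which models the unlock/lock cycle the commit removes from the context-switch fast path.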
    • sched/cputime: Rename vtime_account_system() to vtime_account_kernel() · f83eeb1a
      Frederic Weisbecker authored
      
      vtime_account_system() decides if we need to account the time to the
      system (__vtime_account_system()) or to the guest (vtime_account_guest()).
      
      So this function is a misnomer as we are on a higher level than
      "system". All we know when we call that function is that we are
      accounting kernel cputime. Whether it belongs to guest or system time
      is a lower level detail.
      
      Rename this function to vtime_account_kernel(). This will clarify things
      and avoid too many underscored vtime_account_system() versions.
      
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
      Link: https://lkml.kernel.org/r/20191003161745.28464-2-frederic@ker...
      f83eeb1a
    • sched/vtime: Fix guest/system mis-accounting on task switch · 68e7a4d6
      Frederic Weisbecker authored
      
      vtime_account_system() assumes that the target task to account cputime
      to is always the current task. This is most often true indeed except on
      task switch where we call:
      
      	vtime_common_task_switch(prev)
      		vtime_account_system(prev)
      
      Here prev is the scheduling-out task where we account the cputime to. It
      doesn't match current that is already the scheduling-in task at this
      stage of the context switch.
      
      So we end up checking the wrong task flags to determine if we are
      accounting guest or system time to the previous task.
      
      As a result the wrong task is used to check if the target is running in
      guest mode. We may then spuriously account or leak either system or
      guest time on task switch.
      
      Fix this assumption and also turn vtime_guest_enter/exit() to use the
      task passed in parameter as well to avoid future similar issues.
      
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Fixes: 2a42eb95 ("sched/cputime: Accumulate vtime on top of nsec clocksource")
      Link: https://lkml.kernel.org/r/20190925214242.21873-1-frederic@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      68e7a4d6
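The mis-accounting described in this commit comes down to which task's flags are consulted. The sketch below is a userspace model under stated assumptions: `struct task_model`, `MODEL_PF_VCPU`, `current_task` and both functions are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PF_VCPU 0x1u	/* hypothetical flag: task is running a guest */

struct task_model {
	unsigned int flags;
	uint64_t stime;		/* accumulated system time */
	uint64_t gtime;		/* accumulated guest time */
};

/* Stand-in for the kernel's `current`: at switch time, the incoming task. */
static struct task_model *current_task;

/* Buggy variant: charges tsk, but decides guest vs system from current_task. */
static void account_kernel_buggy(struct task_model *tsk, uint64_t delta)
{
	assert(current_task != NULL);
	if (current_task->flags & MODEL_PF_VCPU)
		tsk->gtime += delta;
	else
		tsk->stime += delta;
}

/* Fixed variant: the decision uses the flags of the task being charged. */
static void account_kernel_fixed(struct task_model *tsk, uint64_t delta)
{
	if (tsk->flags & MODEL_PF_VCPU)
		tsk->gtime += delta;
	else
		tsk->stime += delta;
}
```

If the scheduling-out task was running a guest and the scheduling-in task was not, the buggy variant books the delta as system time on the outgoing task, whereas the fixed variant books it as guest time.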
  19. May 21, 2019
  20. Dec 03, 2018
    • sched: Fix various typos in comments · dfcb245e
      Ingo Molnar authored
      
      Go over the scheduler source code and fix common typos
      in comments - and a typo in an actual variable name.
      
      No change in functionality intended.
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      dfcb245e
  21. Mar 04, 2018
    • sched/headers: Simplify and clean up header usage in the scheduler · 325ea10c
      Ingo Molnar authored
      
      Do the following cleanups and simplifications:
      
       - sched/sched.h already includes <asm/paravirt.h>, so no need to
         include it in sched/core.c again.
      
       - order the <linux/sched/*.h> headers alphabetically
      
       - add all <linux/sched/*.h> headers to kernel/sched/sched.h
      
       - remove all unnecessary includes from the .c files that
         are already included in kernel/sched/sched.h.
      
      Finally, make all scheduler .c files use a single common header:
      
        #include "sched.h"
      
      ... which now contains a union of the relied upon headers.
      
      This makes the various .c files easier to read and easier to handle.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      325ea10c
  22. Mar 03, 2018
    • sched: Clean up and harmonize the coding style of the scheduler code base · 97fb7a0a
      Ingo Molnar authored
      
      A good number of small style inconsistencies have accumulated
      in the scheduler core, so do a pass over them to harmonize
      all these details:
      
       - fix speling in comments,
      
       - use curly braces for multi-line statements,
      
       - remove unnecessary parentheses from integer literals,
      
       - capitalize consistently,
      
       - remove stray newlines,
      
       - add comments where necessary,
      
       - remove invalid/unnecessary comments,
      
       - align structure definitions and other data types vertically,
      
       - add missing newlines for increased readability,
      
       - fix vertical tabulation where it's misaligned,
      
       - harmonize preprocessor conditional block labeling
         and vertical alignment,
      
       - remove line-breaks where they uglify the code,
      
       - add newline after local variable definitions,
      
      No change in functionality:
      
        md5:
           1191fa0a890cfa8132156d2959d7e9e2  built-in.o.before.asm
           1191fa0a890cfa8132156d2959d7e9e2  built-in.o.after.asm
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      97fb7a0a
  23. Nov 08, 2017
  24. Sep 25, 2017
  25. Jul 14, 2017
    • sched/cputime: Don't use smp_processor_id() in preemptible context · 0e4097c3
      Wanpeng Li authored
      
      Recent kernels trigger this warning:
      
       BUG: using smp_processor_id() in preemptible [00000000] code: 99-trinity/181
       caller is debug_smp_processor_id+0x17/0x19
       CPU: 0 PID: 181 Comm: 99-trinity Not tainted 4.12.0-01059-g2a42eb9 #1
       Call Trace:
        dump_stack+0x82/0xb8
        check_preemption_disabled()
        debug_smp_processor_id()
        vtime_delta()
        task_cputime()
        thread_group_cputime()
        thread_group_cputime_adjusted()
        wait_consider_task()
        do_wait()
        SYSC_wait4()
        do_syscall_64()
        entry_SYSCALL64_slow_path()
      
      As Frederic pointed out:
      
      | Although those sched_clock_cpu() things seem to only matter when the
      | sched_clock() is unstable. And that stability is a condition for nohz_full
      | to work anyway. So probably sched_clock() alone would be enough.
      
      This patch fixes it by replacing sched_clock_cpu() with sched_clock() to
      avoid calling smp_processor_id() in a preemptible context.
      
      Reported-by: Xiaolong Ye <xiaolong.ye@intel.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1499586028-7402-1-git-send-email-wanpeng.li@hotmail.com
      [ Prettified the changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0e4097c3
  26. Jul 05, 2017