  1. Feb 09, 2024
  2. Jan 03, 2024
  3. Dec 20, 2023
  4. Feb 03, 2022
• Revert "module, async: async_synchronize_full() on module init iff async is used" · 67d6212a
      Igor Pylypiv authored
      This reverts commit 774a1221.
      
      We need to finish all async code before the module init sequence is
      done.  In the reverted commit the PF_USED_ASYNC flag was added to mark a
      thread that called async_schedule().  Then the PF_USED_ASYNC flag was
      used to determine whether or not async_synchronize_full() needs to be
invoked.  This works when the modprobe thread itself calls
async_schedule(), but it does not work if the module dispatches its init
code to a worker thread which then calls async_schedule().
      
For example, PCI driver probing is invoked from a worker thread based on
the node where the device is attached:
      
      	if (cpu < nr_cpu_ids)
      		error = work_on_cpu(cpu, local_pci_probe, &ddi);
      	else
      		error = local_pci_probe(&ddi);
      
      We end up in a situation where a worker thread gets the PF_USED_ASYNC
      flag set instead of the modprobe thread.  As a result,
      async_synchronize_full() is not invoked and modprobe completes without
      waiting for the async code to finish.
      
      The issue was discovered while loading the pm80xx driver:
      (scsi_mod.scan=async)
      
      modprobe pm80xx                      worker
      ...
        do_init_module()
        ...
          pci_call_probe()
            work_on_cpu(local_pci_probe)
                                           local_pci_probe()
                                             pm8001_pci_probe()
                                               scsi_scan_host()
                                                 async_schedule()
                                                 worker->flags |= PF_USED_ASYNC;
                                           ...
            < return from worker >
        ...
        if (current->flags & PF_USED_ASYNC) <--- false
        	async_synchronize_full();
      
      Commit 21c3c5d2 ("block: don't request module during elevator init")
      fixed the deadlock issue which the reverted commit 774a1221
      ("module, async: async_synchronize_full() on module init iff async is
      used") tried to fix.
      
Since commit 0fdff3ec ("async, kmod: warn on synchronous
request_module() from async workers") synchronous module loading from
async is not allowed.
      
Given that the original deadlock issue is fixed and it is no longer
allowed to call synchronous request_module() from async, we can remove
the PF_USED_ASYNC flag to make module init consistently invoke
async_synchronize_full() unless async module probe is requested.
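
After the revert, module init no longer depends on which thread happened
to call async_schedule().  A minimal sketch of the resulting check at the
end of do_init_module() (not the verbatim diff; async_probe_requested is
the module loader's existing opt-out for async probing):

	/* end of do_init_module(), post-revert -- sketch */
	if (!mod->async_probe_requested)
		/* consistently drain all async work queued during init */
		async_synchronize_full();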
      
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Reviewed-by: Changyuan Lyu <changyuanl@google.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. May 07, 2021
  6. May 06, 2021
  7. Jul 16, 2020
• treewide: Remove uninitialized_var() usage · 3f649ab7
      Kees Cook authored
      Using uninitialized_var() is dangerous as it papers over real bugs[1]
      (or can in the future), and suppresses unrelated compiler warnings
      (e.g. "unused variable"). If the compiler thinks it is uninitialized,
      either simply initialize the variable or make compiler changes.
      
      In preparation for removing[2] the[3] macro[4], remove all remaining
      needless uses with the following script:
      
      git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
      	xargs perl -pi -e \
      		's/\buninitialized_var\(([^\)]+)\)/\1/g;
      		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'
      
      drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
      pathological white-space.
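
For reference, the transformation on a hypothetical call site looks like
this (the macro expanded to "x = x", a self-initialization that silenced
the warning without actually initializing anything):

	/* before */
	unsigned long uninitialized_var(flags);	/* GCC be quiet */

	/* after: the macro and its pacifier comment are gone */
	unsigned long flags;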
      
      No outstanding warnings were found building allmodconfig with GCC 9.3.0
      for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
      alpha, and m68k.
      
      [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
      [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6ee...
  8. Jun 05, 2019
  9. Apr 09, 2019
• treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively · d75f773c
      Sakari Ailus authored
      %pF and %pf are functionally equivalent to %pS and %ps conversion
      specifiers. The former are deprecated, therefore switch the current users
      to use the preferred variant.
      
      The changes have been produced by the following command:
      
      	git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | \
      	while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done
      
      And verifying the result.
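
A hypothetical call site before and after the conversion (both variants
print the symbol name for the function pointer; %ps omits the offset that
%pS includes, just as %pf did relative to %pF):

	/* before */
	pr_debug("calling %pF\n", fn);

	/* after */
	pr_debug("calling %pS\n", fn);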
      
      Link: http://lkml.kernel.org/r/20190325193229.23390-1-sakari.ailus@linux.intel.com
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: sparclinux@vger.kernel.org
      Cc: linux-um@lists.infradead.org
      Cc: xen-devel@lists.xenproject.org
      Cc: linux-acpi@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: drbd-dev@lists.linbit.com
      Cc: linux-block@vger.kernel.org
      Cc: linux-mmc@vger.kernel.org
      Cc: linux-nvdimm@lists.01.org
      Cc: linux-pci@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: linux-btrfs@vger.kernel.org
      C...
  10. Jan 31, 2019
• async: Add support for queueing on specific NUMA node · 6be9238e
      Alexander Duyck authored
      
      Introduce four new variants of the async_schedule_ functions that allow
      scheduling on a specific NUMA node.
      
The first two functions, async_schedule_near and
async_schedule_near_domain, end up mapping to async_schedule and
async_schedule_domain but provide NUMA node specific functionality.  They
replace the original functions, which were moved to inline function
definitions that call the new functions while passing NUMA_NO_NODE.

The second two functions, async_schedule_dev and
async_schedule_dev_domain, provide NUMA specific functionality when the
data member passed is a device and that device has a NUMA node other
than NUMA_NO_NODE.
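
A sketch of the wrapper pattern described above, using the names from
this log (dev_to_node() supplies the device's home node):

	static inline async_cookie_t async_schedule(async_func_t func, void *data)
	{
		/* old entry point: no node preference */
		return async_schedule_near(func, data, NUMA_NO_NODE);
	}

	static inline async_cookie_t
	async_schedule_dev(async_func_t func, struct device *dev)
	{
		/* queue the work near the device's NUMA node */
		return async_schedule_near(func, dev, dev_to_node(dev));
	}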
      
      The main motivation behind this is to address the need to be able to
      schedule device specific init work on specific NUMA nodes in order to
      improve performance of memory initialization.
      
I have seen a significant improvement in initialization time for persistent
memory as a result of this approach.  In the case of 3TB of memory on a
single node, the worst-case initialization time went from 36s down to
about 26s, a 10s improvement.  As such, the data shows a general benefit
from affinitizing the async work to the node local to the device.
      
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  11. Feb 06, 2018
• kernel/async.c: revert "async: simplify lowest_in_progress()" · 4f7e988e
      Rasmus Villemoes authored
      This reverts commit 92266d6e ("async: simplify lowest_in_progress()")
      which was simply wrong: In the case where domain is NULL, we now use the
      wrong offsetof() in the list_first_entry macro, so we don't actually
      fetch the ->cookie value, but rather the eight bytes located
      sizeof(struct list_head) further into the struct async_entry.
      
      On 64 bit, that's the data member, while on 32 bit, that's a u64 built
      from func and data in some order.
      
      I think the bug happens to be harmless in practice: It obviously only
      affects callers which pass a NULL domain, and AFAICT the only such
      caller is
      
        async_synchronize_full() ->
        async_synchronize_full_domain(NULL) ->
        async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, NULL)
      
      and the ASYNC_COOKIE_MAX means that in practice we end up waiting for
      the async_global_pending list to be empty - but it would break if
      somebody happened to pass (void*)-1 as the data element to
      async_schedule, and of course also if s...
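
To see where the wrong offset comes from, here is a sketch of the layout
involved (field order as in kernel/async.c at the time):

	struct async_entry {
		struct list_head	domain_list;	/* link on domain->pending */
		struct list_head	global_list;	/* link on async_global_pending */
		struct work_struct	work;
		async_cookie_t		cookie;
		/* ... */
	};

	/* entries on async_global_pending are linked via ->global_list, but
	 * the simplified code computed the container as if linked via
	 * ->domain_list, landing sizeof(struct list_head) past the real
	 * struct start before reading ->cookie: */
	entry = list_first_entry(&async_global_pending,
				 struct async_entry, domain_list);	/* wrong member */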
  12. May 23, 2017
  13. Nov 19, 2015
  14. Oct 09, 2014
  15. Mar 12, 2013
  16. Jan 25, 2013
  17. Jan 23, 2013
• async: replace list of active domains with global list of pending items · 9fdb04cd
      Tejun Heo authored
      
      Global synchronization - async_synchronize_full() - is currently
      implemented by keeping a list of all active registered domains and
      syncing them one by one until no domain is active.
      
While this isn't necessarily a complex scheme, it can easily be
simplified by keeping a global list of the pending items of all
registered active domains instead of a list of domains, and syncing
the global pending list the same way a domain is synced.
      
This patch replaces async_domains with async_global_pending and updates
lowest_in_progress() to use the global pending list if @domain is
%NULL.  async_synchronize_full_domain(NULL) is now allowed and
equivalent to async_synchronize_full().  As no one calls it with a
NULL domain, this doesn't affect any existing users.
      
      async_register_mutex is no longer necessary and dropped.
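
A sketch of the resulting lookup, assuming the list and field names used
elsewhere in this series:

	static async_cookie_t lowest_in_progress(struct async_domain *domain)
	{
		struct async_entry *first = NULL;
		async_cookie_t ret = ASYNC_COOKIE_MAX;
		unsigned long flags;

		spin_lock_irqsave(&async_lock, flags);
		if (domain) {
			if (!list_empty(&domain->pending))
				first = list_first_entry(&domain->pending,
						struct async_entry, domain_list);
		} else {
			if (!list_empty(&async_global_pending))
				first = list_first_entry(&async_global_pending,
						struct async_entry, global_list);
		}
		if (first)
			ret = first->cookie;
		spin_unlock_irqrestore(&async_lock, flags);
		return ret;
	}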
      
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <djbw@fb.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
• async: keep pending tasks on async_domain and remove async_pending · 52722794
      Tejun Heo authored
      
      Async kept single global pending list and per-domain running lists.
      When an async item is queued, it's put on the global pending list.
      The item is moved to the per-domain running list when its execution
      starts.
      
      At this point, this design complicates execution and synchronization
      without bringing any benefit.  The list only matters for
      synchronization which doesn't care whether a given async item is
      pending or executing.  Also, global synchronization is done by
      iterating through all active registered async_domains, so the global
      async_pending list doesn't help anything either.
      
      Rename async_domain->running to async_domain->pending and put async
      items directly there and remove when execution completes.  This
      simplifies lowest_in_progress() a lot - the first item on the pending
      list is the one with the lowest cookie, and async_run_entry_fn()
      doesn't have to mess with moving the item from pending to running.
      
      After the change, whether a domain is empty or not can be trivially
      determined by looking at async_domain->pending.  Remove
      async_domain->count and use list_empty() on pending instead.
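
A sketch of the type after the change (the counter is gone; emptiness is
derived from the list itself):

	struct async_domain {
		struct list_head node;		/* on the global domain list */
		struct list_head pending;	/* holds items until they complete */
		unsigned registered:1;
	};

	/* "is the domain drained?" becomes a trivial check: */
	bool drained = list_empty(&domain->pending);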
      
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <djbw@fb.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
• async: use ULLONG_MAX for infinity cookie value · c68eee14
      Tejun Heo authored
      
      Currently, next_cookie is used as the infinity value.  In most cases,
      this should work fine but it theoretically could bring subtle behavior
      difference between async_synchronize_full() and
      async_synchronize_full_domain().
      
      async_synchronize_full() keeps waiting until there's no registered
      async_entry left regardless of what next_cookie was when the function
      was called.  It guarantees that the queue is completely drained at
      least once before returning.
      
However, async_synchronize_full_domain() doesn't.  It synchronizes up
to next_cookie, and if further async jobs are queued after the
next_cookie value to synchronize is decided, they won't be waited for.

For unrelated async jobs, the behavior difference doesn't matter;
however, if async jobs which are related (nested or otherwise) to the
executing ones are queued while synchronization is in progress, the
resulting behavior difference could be problematic.
      
      This can be easily fixed by using ULLONG_MAX as the infinity value
      instead.  Define ASYNC_COOKIE_MAX as ULLONG_MAX and use it as the
      infinity value for synchronization.  This makes
      async_synchronize_full_domain() fully drain the domain at least once
      before returning, making its behavior match async_synchronize_full().
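
A sketch of the change, assuming the entry points named above:

	#define ASYNC_COOKIE_MAX	ULLONG_MAX

	void async_synchronize_full_domain(struct async_domain *domain)
	{
		/* waiting on the infinity cookie drains the whole domain,
		 * including jobs queued after the wait began */
		async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, domain);
	}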
      
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <djbw@fb.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
• async: bring sanity to the use of words domain and running · 8723d503
      Tejun Heo authored
      
      In the beginning, running lists were literal struct list_heads.  Later
      on, struct async_domain was added.  For some reason, while the
      conversion substituted list_heads with async_domains, the variable
      names weren't fully converted.  In more places, "running" was used for
      struct async_domain while other places adopted new "domain" name.
      
      The situation is made much worse by having async_domain's running list
      named "domain" and async_entry's field pointing to async_domain named
      "running".
      
      So, we end up with mix of "running" and "domain" for variable names
      for async_domain, with the field names of async_domain and async_entry
      swapped between "running" and "domain".
      
      It feels almost intentionally made to be as confusing as possible.
      Bring some sanity by
      
      * Renaming all async_domain variables "domain".
      
      * s/async_running/async_dfl_domain/
      
      * s/async_domain->domain/async_domain->running/
      
      * s/async_entry->running/async_entry->domain/
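
The net result, as a sketch of the field names after the renames:

	struct async_domain {
		struct list_head node;
		struct list_head running;	/* was confusingly named "domain" */
		int count;
		unsigned registered:1;
	};

	struct async_entry {
		/* ... */
		struct async_domain *domain;	/* was confusingly named "running" */
	};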
      
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <djbw@fb.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
  18. Jan 22, 2013
• async: fix __lowest_in_progress() · f56c3196
      Tejun Heo authored
      Commit 083b804c ("async: use workqueue for worker pool") made it
      possible that async jobs are moved from pending to running out-of-order.
      While pending async jobs will be queued and dispatched for execution in
      the same order, nothing guarantees they'll enter "1) move self to the
      running queue" of async_run_entry_fn() in the same order.
      
      Before the conversion, async implemented its own worker pool.  An async
      worker, upon being woken up, fetches the first item from the pending
      list, which kept the executing lists sorted.  The conversion to
      workqueue was done by adding work_struct to each async_entry and async
      just schedules the work item.  The queueing and dispatching of such work
      items are still in order but now each worker thread is associated with a
      specific async_entry and moves that specific async_entry to the
      executing list.  So, depending on which worker reaches that point
      earlier, which is non-deterministic, we may end up moving an async_entry
      with larger cookie...
  19. Jan 18, 2013
  20. Jan 16, 2013
• module, async: async_synchronize_full() on module init iff async is used · 774a1221
      Tejun Heo authored
      If the default iosched is built as module, the kernel may deadlock
      while trying to load the iosched module on device probe if the probing
      was running off async.  This is because async_synchronize_full() at
      the end of module init ends up waiting for the async job which
      initiated the module loading.
      
       async A				modprobe
      
       1. finds a device
       2. registers the block device
       3. request_module(default iosched)
      					4. modprobe in userland
      					5. load and init module
      					6. async_synchronize_full()
      
      Async A waits for modprobe to finish in request_module() and modprobe
      waits for async A to finish in async_synchronize_full().
      
Because there's no easy way to track the dependency once control goes
out to userland, implementing properly nested flushing is difficult.  For
now, make module init perform async_synchronize_full() iff module init
has queued async jobs, as suggested by Linus.
      
      This avoids the described deadlock because iosched module doesn't use
      async and thus wouldn't invoke async_synchronize_full().  This is
      hacky and incomplete.  It will deadlock if async module loading nests;
      however, this works around the known problem case and seems to be the
      best of bad options.
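
A sketch of the workaround's two halves, per the description above:

	/* kernel/async.c, in async_schedule() -- mark the queuing task */
	current->flags |= PF_USED_ASYNC;

	/* kernel/module.c, at the end of do_init_module() */
	if (current->flags & PF_USED_ASYNC)
		/* only wait if this module's init queued async work */
		async_synchronize_full();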
      
      For more details, please refer to the following thread.
      
        http://thread.gmane.org/gmane.linux.kernel/1420814
      
      
      
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Alex Riesen <raa.lkml@gmail.com>
Tested-by: Ming Lei <ming.lei@canonical.com>
Tested-by: Alex Riesen <raa.lkml@gmail.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. Jul 20, 2012
• [SCSI] async: make async_synchronize_full() flush all work regardless of domain · a4683487
      Dan Williams authored
      In response to an async related regression James noted:
      
        "My theory is that this is an init problem: The assumption in a lot of
         our code is that async_synchronize_full() waits for everything ... even
         the domain specific async schedules, which isn't true."
      
      ...so make this assumption true.
      
      Each domain, including the default one, registers itself on a global domain
      list when work is scheduled.  Once all entries complete it exits that
      list.  Waiting for the list to be empty syncs all in-flight work across
      all domains.
      
Domains can opt out of global syncing if they are declared exclusive with
ASYNC_DOMAIN_EXCLUSIVE().  All stack-based domains have been declared
      exclusive since the domain may go out of scope as soon as the last work
      item completes.
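
A usage sketch with the declarators introduced here (probe_fn and dev are
hypothetical placeholders):

	static ASYNC_DOMAIN(visible);		/* flushed by async_synchronize_full() */
	static ASYNC_DOMAIN_EXCLUSIVE(private);	/* opted out of global sync */

	async_schedule_domain(probe_fn, dev, &private);
	/* waits only for entries in "private": */
	async_synchronize_full_domain(&private);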
      
      Statically declared domains are mostly ok, but async_unregister_domain()
      is there to close any theoretical races with pending
      async_synchronize_full waiters at module removal tim...
• [SCSI] async: introduce 'async_domain' type · 2955b47d
      Dan Williams authored
      
      This is in preparation for teaching async_synchronize_full() to sync all
      pending async work, and not just on the async_running domain.  This
      conversion is functionally equivalent, just embedding the existing list
      in a new async_domain type.
      
      The .registered attribute is used in a later patch to distinguish
      between domains that want to be flushed by async_synchronize_full()
      versus those that only expect async_synchronize_{full|cookie}_domain to
      be used for flushing.
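
A sketch of the type as described (the existing running list becomes the
embedded "domain" member):

	struct async_domain {
		struct list_head node;		/* member of the global domain list */
		struct list_head domain;	/* the embedded running list */
		int count;			/* entries still in flight */
		unsigned registered:1;		/* flush from async_synchronize_full()? */
	};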
      
      [jejb: add async.h to scsi_priv.h for struct async_domain]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Tested-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
  22. Jan 12, 2012
  23. Oct 31, 2011
  24. Sep 15, 2011
  25. Jun 14, 2011
  26. Jul 14, 2010
  27. Mar 30, 2010
• include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
Tejun Heo authored
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      
      
The script does the following.
      
* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there, i.e. gfp.h if only gfp is
  used, slab.h if slab is used (see the example after this list).
      
* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surroundings.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
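
For instance, a hypothetical file that uses the slab API only through the
implicit chain gets the dependency spelled out:

	/* before: kzalloc() compiled only because module.h pulled in
	 * percpu.h -> slab.h */
	#include <linux/module.h>

	/* after */
	#include <linux/module.h>
	#include <linux/slab.h>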
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as a bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be discoverable easily on most builds of the
specific arch.
      
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  28. Jun 08, 2009
• async: Fix lack of boot-time console due to insufficient synchronization · 3af968e0
      Linus Torvalds authored
      Our async work synchronization was broken by "async: make sure
      independent async domains can't accidentally entangle" (commit
      d5a877e8), because it would report
      the wrong lowest active async ID when there was both running and
      pending async work.
      
This caused things like not being able to read the root filesystem,
resulting in missing console devices and the inability to run 'init',
causing a boot-time panic.
      
      This fixes it by properly returning the lowest pending async ID: if
      there is any running async work, that will have a lower ID than any
      pending work, and we should _not_ look at the pending work list.
      
There were alternative patches from Jaswinder and James, but this one
also cleans up the code by removing the pointless 'ret' variable and
the unnecessary testing for an empty list around 'for_each_entry()' (if
the list is empty, the for_each_entry() thing just won't execute).
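
A sketch of the fixed lookup, following the reasoning above (names as in
kernel/async.c at the time):

	static async_cookie_t __lowest_in_progress(struct list_head *running)
	{
		struct async_entry *entry;

		/* anything running has a lower cookie than anything pending */
		if (!list_empty(running)) {
			entry = list_first_entry(running,
						 struct async_entry, list);
			return entry->cookie;
		}

		list_for_each_entry(entry, &async_pending, list)
			if (entry->running == running)
				return entry->cookie;

		return next_cookie;	/* nothing in flight; "infinity" */
	}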
      
      Fixes-bug: http://bugzilla....
  29. May 24, 2009
  30. Mar 28, 2009
  31. Feb 08, 2009