- Jun 07, 2024
-
-
Aseda Aboagye authored
HUTRR94 added support for a new usage titled "System Do Not Disturb" which toggles a system-wide Do Not Disturb setting. This commit simply adds a new event code for the usage. Signed-off-by:
Aseda Aboagye <aaboagye@chromium.org> Acked-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-gUHE70s7wCAoB@google.com Signed-off-by:
Benjamin Tissoires <bentiss@kernel.org>
-
Aseda Aboagye authored
HUTRR116 added support for a new usage titled "System Accessibility Binding" which toggles a system-wide bound accessibility UI or command. This commit simply adds a new event code for the usage. Signed-off-by:
Aseda Aboagye <aaboagye@chromium.org> Acked-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-e97O9nvudco5z@google.com Signed-off-by:
Benjamin Tissoires <bentiss@kernel.org>
-
- Jun 05, 2024
-
-
Chengming Zhou authored
We normally ksm_zero_pages++ in ksmd when page is merged with zero page, but ksm_zero_pages-- is done from page tables side, where there is no any accessing protection of ksm_zero_pages. So we can read very exceptional value of ksm_zero_pages in rare cases, such as -1, which is very confusing to users. Fix it by changing to use atomic_long_t, and the same case with the mm->ksm_zero_pages. Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-2-34bb358fdc13@linux.dev Fixes: e2942062 ("ksm: count all zero pages placed by KSM") Fixes: 6080d19f ("ksm: add ksm zero pages for each process") Signed-off-by:
Chengming Zhou <chengming.zhou@linux.dev> Acked-by:
David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Cc: Stefan Roesch <shr@devkernel.io> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Barry Song authored
if CONFIG_SYSFS is not enabled in config, we get the below error, All errors (new ones prefixed by >>): s390-linux-ld: mm/memory.o: in function `count_mthp_stat': >> include/linux/huge_mm.h:285:(.text+0x191c): undefined reference to `mthp_stats' s390-linux-ld: mm/huge_memory.o:(.rodata+0x10): undefined reference to `mthp_stats' vim +285 include/linux/huge_mm.h 279 280 static inline void count_mthp_stat(int order, enum mthp_stat_item item) 281 { 282 if (order <= 0 || order > PMD_ORDER) 283 return; 284 > 285 this_cpu_inc(mthp_stats.stats[order][item]); 286 } 287 Link: https://lkml.kernel.org/r/20240523210045.40444-1-21cnbao@gmail.com Fixes: ec33687c ("mm: add per-order mTHP anon_fault_alloc and anon_fault_fallback counters") Reported-by:
kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405231728.tCAogiSI-lkp@intel.com/ Signed-off-by:
Barry Song <v-songbaohua@oppo.com> Tested-by:
Yujie Liu <yujie.liu@intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Baolin Wang authored
The mTHP swap related counters: 'anon_swpout' and 'anon_swpout_fallback' are confusing with an 'anon_' prefix, since the shmem can swap out non-anonymous pages. So drop the 'anon_' prefix to keep consistent with the old swap counter names. This is needed in 6.10-rcX to avoid having an inconsistent ABI out in the field. Link: https://lkml.kernel.org/r/7a8989c13299920d7589007a30065c3e2c19f0e0.1716431702.git.baolin.wang@linux.alibaba.com Fixes: d0f048ac ("mm: add per-order mTHP anon_swpout and anon_swpout_fallback counters") Fixes: 42248b9d ("mm: add docs for per-order mTHP counters and transhuge_page ABI") Signed-off-by:
Baolin Wang <baolin.wang@linux.alibaba.com> Suggested-by:
"Huang, Ying" <ying.huang@intel.com> Acked-by:
Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Carlos Llamas authored
For ${atomic}_sub_and_test() the @i parameter is the value to subtract, not add. Fix the typo in the kerneldoc template and generate the headers with this update. Fixes: ad811070 ("locking/atomic: scripts: generate kerneldoc comments") Suggested-by:
Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Carlos Llamas <cmllamas@google.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Mark Rutland <mark.rutland@arm.com> Reviewed-by:
Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20240515133844.3502360-1-cmllamas@google.com
-
Jakub Kicinski authored
Jaroslav reports Dell's OMSA Systems Management Data Engine expects NLM_DONE in a separate recvmsg(), both for rtnl_dump_ifinfo() and inet_dump_ifaddr(). We already added a similar fix previously in commit 460b0d33 ("inet: bring NLM_DONE out to a separate recv() again") Instead of modifying all the dump handlers, and making them look different than modern for_each_netdev_dump()-based dump handlers - put the workaround in rtnetlink code. This will also help us move the custom rtnl-locking from af_netlink in the future (in net-next). Note that this change is not touching rtnl_dump_all(). rtnl_dump_all() is different kettle of fish and a potential problem. We now mix families in a single recvmsg(), but NLM_DONE is not coalesced. Tested: ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_addr.yaml \ --dump getaddr --json '{"ifa-family": 2}' ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_route.yaml \ --dump get...
-
- Jun 04, 2024
-
-
Dan Williams authored
While the experiment did reveal that there are additional places that are missing the lock during secondary bus reset, one of the places that needs to take cfg_access_lock (pci_bus_lock()) is not prepared for lockdep annotation. Specifically, pci_bus_lock() takes pci_dev_lock() recursively and is currently dependent on the fact that the device_lock() is marked lockdep_set_novalidate_class(&dev->mutex). Otherwise, without that annotation, pci_bus_lock() would need to use something like a new pci_dev_lock_nested() helper, a scheme to track a PCI device's depth in the topology, and a hope that the depth of a PCI tree never exceeds the max value for a lockdep subclass. The alternative to ripping out the lockdep coverage would be to deploy a dynamic lock key for every PCI device. Unfortunately, there is evidence that increasing the number of keys that lockdep needs to track to be per-PCI-device is prohibitively expensive for something like the cfg_access_lock. The main motivation for adding the annotation in the first place was to catch unlocked secondary bus resets, not necessarily catch lock ordering problems between cfg_access_lock and other locks. Solve that narrower problem with follow-on patches, and just due to targeted revert for now. Link: https://lore.kernel.org/r/171711746402.1628941.14575335981264103013.stgit@dwillia2-xfh.jf.intel.com Fixes: 7e89efc6 ("PCI: Lock upstream bridge for pci_reset_function()") Reported-by:
Imre Deak <imre.deak@intel.com> Closes: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html Signed-off-by:
Dan Williams <dan.j.williams@intel.com> Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Tested-by:
Hans de Goede <hdegoede@redhat.com> Tested-by:
Kalle Valo <kvalo@kernel.org> Reviewed-by:
Dave Jiang <dave.jiang@intel.com> Cc: Jani Saarinen <jani.saarinen@intel.com>
-
Lu Baolu authored
iommu_sva_bind_device() should return either a sva bond handle or an ERR_PTR value in error cases. Existing drivers (idxd and uacce) only check the return value with IS_ERR(). This could potentially lead to a kernel NULL pointer dereference issue if the function returns NULL instead of an error pointer. In reality, this doesn't cause any problems because iommu_sva_bind_device() only returns NULL when the kernel is not configured with CONFIG_IOMMU_SVA. In this case, iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) will return an error, and the device drivers won't call iommu_sva_bind_device() at all. Fixes: 26b25a2b ("iommu: Bind process address spaces to devices") Signed-off-by:
Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by:
Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by:
Kevin Tian <kevin.tian@intel.com> Reviewed-by:
Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240528042528.71396-1-baolu.lu@linux.intel.com Signed-off-by:
Joerg Roedel <jroedel@suse.de>
-
- Jun 01, 2024
-
-
Dmitry Safonov authored
TCP_CLOSE may or may not have current/rnext keys and should not be considered "established". The fast-path for TCP_CLOSE is SKB_DROP_REASON_TCP_CLOSE. This is what tcp_rcv_state_process() does anyways. Add an early drop path to not spend any time verifying segment signatures for sockets in TCP_CLOSE state. Cc: stable@vger.kernel.org # v6.7 Fixes: 0a3a8090 ("net/tcp: Verify inbound TCP-AO signed segments") Signed-off-by:
Dmitry Safonov <0x7f454c46@gmail.com> Link: https://lore.kernel.org/r/20240529-tcp_ao-sk_state-v1-1-d69b5d323c52@gmail.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Greg Kroah-Hartman authored
This reverts commit 8c467f33 . Turns out this breaks many architectures as the vt ioctls do not all match up everywhere due to historical reasons, so the original commit is invalid for many values. Reported-by:
Nick Bowler <nbowler@draconx.ca> Reported-by:
Arnd Bergmann <arnd@kernel.org> Reported-by:
Jiri Slaby <jirislaby@kernel.org> Reported-by:
Christian Zigotzky <chzigotzky@xenosoft.de> Reported-by:
Michael Ellerman <mpe@ellerman.id.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Gladkov <legion@kernel.org> Link: https://lore.kernel.org/r/ad4e561c-1d49-4f25-882c-7a36c6b1b5c0@draconx.ca Link: https://lore.kernel.org/r/0da9785e-ba44-4718-9d08-4e96c1ba7ab2@kernel.org Link: https://lore.kernel.org/all/34d848f4-670b-4493-bf21-130ef862521b@xenosoft.de/ Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- May 30, 2024
-
-
Damien Le Moal authored
A zoned device may have a last sequential write required zone that is smaller than other zones. However, all tests to check if a zone write plug write offset exceeds the zone capacity use the same capacity value stored in the gendisk zone_capacity field. This is incorrect for a zoned device with a last runt (smaller) zone. Add the new field last_zone_capacity to struct gendisk to store the capacity of the last zone of the device. blk_revalidate_seq_zone() and blk_revalidate_conv_zone() are both modified to get this value when disk_zone_is_last() returns true. Similarly to zone_capacity, the value is first stored using the last_zone_capacity field of struct blk_revalidate_zone_args. Once zone revalidation of all zones is done, this is used to set the gendisk last_zone_capacity field. The checks to determine if a zone is full or if a sector offset in a zone exceeds the zone capacity in disk_should_remove_zone_wplug(), disk_zone_wplug_abort_unaligned(), blk_zone_write_plug_init_request(), and blk_zone_wplug_prepare_bio() are modified to use the new helper functions disk_zone_is_full() and disk_zone_wplug_is_full(). disk_zone_is_full() uses the zone index to determine if the zone being tested is the last one of the disk and uses the either the disk zone_capacity or last_zone_capacity accordingly. Fixes: dd291d77 ("block: Introduce zone write plugging") Signed-off-by:
Damien Le Moal <dlemoal@kernel.org> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240530054035.491497-4-dlemoal@kernel.org Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jakub Kicinski authored
Recent commit 0cfe71f4 ("netdev: add queue stats") added a lot of useful stats, but only those immediately needed by virtio. Presumably virtio does not support CHECKSUM_COMPLETE, so statistic for that form of checksumming wasn't included. Other drivers will definitely need it, in fact we expect it to be needed in net-next soon (mlx5). So let's add the definition of the counter for CHECKSUM_COMPLETE to uAPI in net already, so that the counters are in a more natural order (all subsequent counters have not been present in any released kernel, yet). Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Reviewed-by:
Joe Damato <jdamato@fastly.com> Fixes: 0cfe71f4 ("netdev: add queue stats") Link: https://lore.kernel.org/r/20240529163547.3693194-1-kuba@kernel.org Signed-off-by:
Paolo Abeni <pabeni@redhat.com>
-
- May 29, 2024
-
-
Eric Dumazet authored
__dst_negative_advice() does not enforce proper RCU rules when sk->dst_cache must be cleared, leading to possible UAF. RCU rules are that we must first clear sk->sk_dst_cache, then call dst_release(old_dst). Note that sk_dst_reset(sk) is implementing this protocol correctly, while __dst_negative_advice() uses the wrong order. Given that ip6_negative_advice() has special logic against RTF_CACHE, this means each of the three ->negative_advice() existing methods must perform the sk_dst_reset() themselves. Note the check against NULL dst is centralized in __dst_negative_advice(), there is no need to duplicate it in various callbacks. Many thanks to Clement Lecigne for tracking this issue. This old bug became visible after the blamed commit, using UDP sockets. Fixes: a87cb3e4 ("net: Facility to report route quality of connected sockets") Reported-by:
Clement Lecigne <clecigne@google.com> Diagnosed-by:
Clement Lecigne <clecigne@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Reviewed-by:
David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240528114353.1794151-1-edumazet@google.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Alexandre Belloni authored
Fix the typo in the comment for SNDRV_PCM_RATE_KNOT Signed-off-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20240528191850.63314-1-alexandre.belloni@bootlin.com Signed-off-by:
Takashi Iwai <tiwai@suse.de>
-
- May 28, 2024
-
-
Arnd Bergmann authored
When extra warnings are enabled, gcc points out a global variable definition in a header: In file included from drivers/cpufreq/amd-pstate-ut.c:29: include/linux/amd-pstate.h:123:27: error: 'amd_pstate_mode_string' defined but not used [-Werror=unused-const-variable=] 123 | static const char * const amd_pstate_mode_string[] = { | ^~~~~~~~~~~~~~~~~~~~~~ This header is only included from two files in the same directory, and one of them uses only a single definition from it, so clean it up by moving most of the contents into the driver that uses them, and making shared bits a local header file. Fixes: 36c5014e ("cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()") Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Andy Shevchenko authored
The pnp_bus_type is defined only when CONFIG_PNP=y, while being not guarded by ifdeffery in the header. Moreover, it's not used outside of the PNP code. Move it to the internal header to make sure no-one will try to (ab)use it. Signed-off-by:
Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Andy Shevchenko authored
Since we have a dev_is_pnp() macro that utilises the address of the pnp_bus_type variable, the users, which can be compiled as modules, will fail to build. Convert the macro to be a function and export it to the modules to prevent build breakage. Reported-by:
Woody Suwalski <terraluna977@gmail.com> Closes: https://lore.kernel.org/r/cc8a93b2-2504-9754-e26c-5d5c3bd1265c@gmail.com Fixes: 2a49b45c ("PNP: Add dev_is_pnp() macro") Signed-off-by:
Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Jarkko Sakkinen authored
Rename and document TPM2_OA_TMPL, as originally requested in the patch set review, but left unaddressed without any appropriate reasoning. The new name is TPM2_OA_NULL_KEY, has a documentation and is local only to tpm2-sessions.c. Link: https://lore.kernel.org/linux-integrity/ddbeb8111f48a8ddb0b8fca248dff6cc9d7079b2.camel@HansenPartnership.com/ Link: https://lore.kernel.org/linux-integrity/CZCKTWU6ZCC9.2UTEQPEVICYHL@suppilovahvero/ Signed-off-by:
Jarkko Sakkinen <jarkko@kernel.org>
-
Jarkko Sakkinen authored
With only single call site, this makes no sense (slipped out of the radar during the review). Open code and document the action directly to the site, to make it more readable. Fixes: 1b6d7f9e ("tpm: add session encryption protection to tpm2_get_random()") Signed-off-by:
Jarkko Sakkinen <jarkko@kernel.org>
-
- May 27, 2024
-
-
Alexander Lobakin authored
After the tagged commit, @netdev got documented twice and the kdoc script didn't notice that. Remove the second description added later and move the initial one according to the field position. After merging commit 5f8e4007 ("kernel-doc: fix struct_group_tagged() parsing"), kdoc requires to describe struct groups as well. &page_pool_params has 2 struct groups which generated new warnings, describe them to resolve this. Fixes: 403f11ac ("page_pool: don't use driver-set flags field directly") Signed-off-by:
Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20240524112859.2757403-1-aleksander.lobakin@intel.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
Jason commit made checks against ACK sequence less strict and can be exploited by attackers to establish spoofed flows with less probes. Innocent users might use tcp_rmem[1] == 1,000,000,000, or something more reasonable. An attacker can use a regular TCP connection to learn the server initial tp->rcv_wnd, and use it to optimize the attack. If we make sure that only the announced window (smaller than 65535) is used for ACK validation, we force an attacker to use 65537 packets to complete the 3WHS (assuming server ISN is unknown) Fixes: 378979e9 ("tcp: remove 64 KByte limit for initial tp->rcv_wnd value") Link: https://datatracker.ietf.org/meeting/119/materials/slides-119-tcpm-ghost-acks-00 Signed-off-by:
Eric Dumazet <edumazet@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Reviewed-by:
Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240523130528.60376-1-edumazet@google.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Christoph Hellwig authored
This is unused now that all the atomic queue limit conversions are merged. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240521221606.393040-1-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Howells authored
There's a problem in 9p's interaction with netfslib whereby a crash occurs because the 9p_fid structs get forcibly destroyed during client teardown (without paying attention to their refcounts) before netfslib has finished with them. However, it's not a simple case of deferring the clunking that p9_fid_put() does as that requires the p9_client record to still be present. The problem is that netfslib has to unlock pages and clear the IN_PROGRESS flag before destroying the objects involved - including the fid - and, in any case, nothing checks to see if writeback completed barring looking at the page flags. Fix this by keeping a count of outstanding I/O requests (of any type) and waiting for it to quiesce during inode eviction. Reported-by:
<syzbot+df038d463cca332e8414@syzkaller.appspotmail.com> Link: https://lore.kernel.org/all/0000000000005be0aa061846f8d6@google.com/ Reported-by:
<syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com> Link: h...
-
- May 26, 2024
-
-
Kent Overstreet authored
percpu.h depends on smp.h, but doesn't include it directly because of circular header dependency issues; percpu.h is needed in a bunch of low level headers. This fixes a randconfig build error on mips: include/linux/alloc_tag.h: In function '__alloc_tag_ref_set': include/asm-generic/percpu.h:31:40: error: implicit declaration of function 'raw_smp_processor_id' [-Werror=implicit-function-declaration] Reported-by:
kernel test robot <lkp@intel.com> Fixes: 24e44cc2 ("mm: percpu: enable per-cpu allocation tagging") Closes: https://lore.kernel.org/oe-kbuild-all/202405210052.DIrMXJNz-lkp@intel.com/ Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- May 25, 2024
-
-
Daniel Borkmann authored
When running Cilium connectivity test suite with netkit in L2 mode, we found that compared to tcx a few tests were failing which pushed traffic into an L7 proxy sitting in host namespace. The problem in particular is around the invocation of eth_type_trans() in netkit. In case of tcx, this is run before the tcx ingress is triggered inside host namespace and thus if the BPF program uses the bpf_skb_change_type() helper the newly set type is retained. However, in case of netkit, the late eth_type_trans() invocation overrides the earlier decision from the BPF program which eventually leads to the test failure. Instead of eth_type_trans(), split out the relevant parts, meaning, reset of mac header and call to eth_skb_pkt_type() before the BPF program is run in order to have the same behavior as with tcx, and refactor a small helper called eth_skb_pull_mac() which is run in case it's passed up the stack where the mac header must be pulled. With this all ...
-
- May 24, 2024
-
-
Andrey Konovalov authored
After commit 69d4c0d3 ("entry, kasan, x86: Disallow overriding mem*() functions") and the follow-up fixes, with CONFIG_FORTIFY_SOURCE enabled, even though the compiler instruments meminstrinsics by generating calls to __asan/__hwasan_ prefixed functions, FORTIFY_SOURCE still uses uninstrumented memset/memmove/memcpy as the underlying functions. As a result, KASAN cannot detect bad accesses in memset/memmove/memcpy. This also makes KASAN tests corrupt kernel memory and cause crashes. To fix this, use __asan_/__hwasan_memset/memmove/memcpy as the underlying functions whenever appropriate. Do this only for the instrumented code (as indicated by __SANITIZE_ADDRESS__). Link: https://lkml.kernel.org/r/20240517130118.759301-1-andrey.konovalov@linux.dev Fixes: 69d4c0d3 ("entry, kasan, x86: Disallow overriding mem*() functions") Fixes: 51287dcb ("kasan: emit different calls for instrumentable memintrinsics") Fixes: 36be5cba ("kasan: treat meminstrinsic as builtins in uninstrumented files") Signed-off-by:
Andrey Konovalov <andreyknvl@gmail.com> Reported-by:
Erhard Furtner <erhard_f@mailbox.org> Reported-by:
Nico Pache <npache@redhat.com> Closes: https://lore.kernel.org/all/20240501144156.17e65021@outsider.home/ Reviewed-by:
Marco Elver <elver@google.com> Tested-by:
Nico Pache <npache@redhat.com> Acked-by:
Nico Pache <npache@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Daniel Axtens <dja@axtens.net> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Gal Pressman authored
The MTMP register (0x900a) capability offset is off-by-one, move it to the right place. Fixes: 1f507e80 ("net/mlx5: Expose NIC temperature via hardware monitoring kernel API") Signed-off-by:
Gal Pressman <gal@nvidia.com> Reviewed-by:
Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by:
Tariq Toukan <tariqt@nvidia.com> Reviewed-by:
Simon Horman <horms@kernel.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Xu Yang authored
Add mapping_max_folio_size() to get the maximum folio size for this pagecache mapping. Fixes: 5d8edfb9 ("iomap: Copy larger chunks from userspace") Cc: stable@vger.kernel.org Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240521114939.2541461-1-xu.yang_2@nxp.com Reviewed-by:
Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by:
Christian Brauner <brauner@kernel.org>
-
Matt Jan authored
The implicit conversion from unsigned int to enum proc_cn_event is invalid, so explicitly cast it for compilation in a C++ compiler. /usr/include/linux/cn_proc.h: In function 'proc_cn_event valid_event(proc_cn_event)': /usr/include/linux/cn_proc.h:72:17: error: invalid conversion from 'unsigned int' to 'proc_cn_event' [-fpermissive] 72 | ev_type &= PROC_EVENT_ALL; | ^ | | | unsigned int Signed-off-by:
Matt Jan <zoo868e@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 23, 2024
-
-
Jeff Xu authored
The new mseal() is an syscall on 64 bit CPU, and with following signature: int mseal(void addr, size_t len, unsigned long flags) addr/len: memory range. flags: reserved. mseal() blocks following operations for the given memory range. 1> Unmapping, moving to another location, and shrinking the size, via munmap() and mremap(), can leave an empty space, therefore can be replaced with a VMA with a new set of attributes. 2> Moving or expanding a different VMA into the current location, via mremap(). 3> Modifying a VMA via mmap(MAP_FIXED). 4> Size expansion, via mremap(), does not appear to pose any specific risks to sealed VMAs. It is included anyway because the use case is unclear. In any case, users can rely on merging to expand a sealed VMA. 5> mprotect() and pkey_mprotect(). 6> Some destructive madvice() behaviors (e.g. MADV_DONTNEED) for anonymous memory, when users don't have write permission to the memory. Those behaviors can alter region contents by...
-
Jeff Xu authored
Patch series "Introduce mseal", v10. This patchset proposes a new mseal() syscall for the Linux kernel. In a nutshell, mseal() protects the VMAs of a given virtual memory range against modifications, such as changes to their permission bits. Modern CPUs support memory permissions, such as the read/write (RW) and no-execute (NX) bits. Linux has supported NX since the release of kernel version 2.6.8 in August 2004 [1]. The memory permission feature improves the security stance on memory corruption bugs, as an attacker cannot simply write to arbitrary memory and point the code to it. The memory must be marked with the X bit, or else an exception will occur. Internally, the kernel maintains the memory permissions in a data structure called VMA (vm_area_struct). mseal() additionally protects the VMA itself against modifications of the selected seal type. Memory sealing is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management...
-
Heiner Kallweit authored
Remove this class after all users have been gone. Signed-off-by:
Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by:
Andi Shyti <andi.shyti@kernel.org>
-
- May 22, 2024
-
-
Steven Rostedt (Google) authored
With the rework of how the __string() handles dynamic strings where it saves off the source string in field in the helper structure[1], the assignment of that value to the trace event field is stored in the helper value and does not need to be passed in again. This means that with: __string(field, mystring) Which use to be assigned with __assign_str(field, mystring), no longer needs the second parameter and it is unused. With this, __assign_str() will now only get a single parameter. There's over 700 users of __assign_str() and because coccinelle does not handle the TRACE_EVENT() macro I ended up using the following sed script: git grep -l __assign_str | while read a ; do sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file; mv /tmp/test-file $a; done I then searched for __assign_str() that did not end with ';' as those were multi line assignments that the sed script above would fail to catch. Note, the same updates will need to be done for: __assign_str_len() __assign_rel_str() __assign_rel_str_len() I tested this with both an allmodconfig and an allyesconfig (build only for both). [1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/ Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by:
Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by:
Jani Nikula <jani.nikula@intel.com> Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts. Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal Acked-by:
Takashi Iwai <tiwai@suse.de> Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs Tested-by:
Guenter Roeck <linux@roeck-us.net>
-
Puranjay Mohan authored
This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by:
Puranjay Mohan <puranjay@kernel.org> Tested-by:
Björn Töpel <bjorn@rivosinc.com> Reviewed-by:
Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Linus Torvalds authored
Work around clang problems with asm constraints that have multiple possibilities, particularly "g" and "rm". Clang seems to turn inputs like that into the most generic form, which is the memory input - but to make matters worse, clang won't even use a possible original memory location, but will spill the value to stack, and use the stack for the asm input. See https://github.com/llvm/llvm-project/issues/20571#issuecomment-980933442 for some explanation of why clang has this strange behavior, but the end result is that "g" and "rm" really end up generating horrid code. Link: https://github.com/llvm/llvm-project/issues/20571 Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
David Hildenbrand authored
With virtio-mem, primarily hibernation is problematic: as the machine shuts down, the virtio-mem device loses its state. Powering the machine back up is like losing a bunch of DIMMs. While there would be ways to add limited support, suspend+resume is more commonly used for VMs and "easier" to support cleanly. s2idle can be supported without any device dependencies. Similarly, one would expect suspend-to-ram (i.e., S3) to work out of the box. However, QEMU currently unplugs all device memory when resuming the VM, using a cold reset on the "wakeup" path. In order to support S3, we need a feature flag for the device to tell us if memory remains plugged when waking up. In the future, QEMU will implement this feature. So let's always support s2idle and support S3 with plugged memory only if the device indicates support. Block hibernation early using the PM notifier. Trying to hibernate now fails early: # echo disk > /sys/power/state [ 26.455369] PM: hibernation: hibernation entry [ 26.458271] virtio_mem virtio0: hibernation is not supported. [ 26.462498] PM: hibernation: hibernation exit -bash: echo: write error: Operation not permitted s2idle works even without the new feature bit: # echo s2idle > /sys/power/mem_sleep # echo mem > /sys/power/state [ 52.083725] PM: suspend entry (s2idle) [ 52.095950] Filesystems sync: 0.010 seconds [ 52.101493] Freezing user space processes [ 52.104213] Freezing user space processes completed (elapsed 0.001 seconds) [ 52.106520] OOM killer disabled. [ 52.107655] Freezing remaining freezable tasks [ 52.110880] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 52.113296] printk: Suspending console(s) (use no_console_suspend to debug) S3 does not work without the feature bit when memory is plugged: # echo deep > /sys/power/mem_sleep # echo mem > /sys/power/state [ 32.788281] PM: suspend entry (deep) [ 32.816630] Filesystems sync: 0.027 seconds [ 32.820029] Freezing user space processes [ 32.823870] Freezing user space processes completed (elapsed 0.001 seconds) [ 32.827756] OOM killer disabled. [ 32.829608] Freezing remaining freezable tasks [ 32.833842] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 32.837953] printk: Suspending console(s) (use no_console_suspend to debug) [ 32.916172] virtio_mem virtio0: suspend+resume with plugged memory is not supported [ 32.916181] virtio-pci 0000:00:02.0: PM: pci_pm_suspend(): virtio_pci_freeze+0x0/0x50 returns -1 [ 32.916197] virtio-pci 0000:00:02.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x170 returns -1 [ 32.916210] virtio-pci 0000:00:02.0: PM: failed to suspend async: error -1 But S3 works with the new feature bit when memory is plugged (patched QEMU): # echo deep > /sys/power/mem_sleep # echo mem > /sys/power/state [ 33.983694] PM: suspend entry (deep) [ 34.009828] Filesystems sync: 0.024 seconds [ 34.013589] Freezing user space processes [ 34.016722] Freezing user space processes completed (elapsed 0.001 seconds) [ 34.019092] OOM killer disabled. [ 34.020291] Freezing remaining freezable tasks [ 34.023549] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 34.026090] printk: Suspending console(s) (use no_console_suspend to debug) Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by:
David Hildenbrand <david@redhat.com> Message-Id: <20240318120645.105664-1-david@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
Mike Christie authored
Instead of lingering until the device is closed, this has us handle SIGKILL by: 1. marking the worker as killed so we no longer try to use it with new virtqueues and new flush operations. 2. setting the virtqueue to worker mapping so no new works are queued. 3. running all the exiting works. Suggested-by:
Edward Adam Davis <eadavis@qq.com> Reported-and-tested-by:
<syzbot+98edc2df894917b3431f@syzkaller.appspotmail.com> Message-Id: <tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com> Signed-off-by:
Mike Christie <michael.christie@oracle.com> Message-Id: <20240316004707.45557-9-michael.christie@oracle.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
Tony Luck authored
Code in v6.9 arch/x86/kernel/smpboot.c was changed by commit 4db64279 ("x86/cpu: Switch to new Intel CPU model defines") from: static const struct x86_cpu_id intel_cod_cpu[] = { X86_MATCH_INTEL_FAM6_MODEL(HASWELL_X, 0), /* COD */ X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X, 0), /* COD */ X86_MATCH_INTEL_FAM6_MODEL(ANY, 1), /* SNC */ <--- 443 {} }; static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) { const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu); to: static const struct x86_cpu_id intel_cod_cpu[] = { X86_MATCH_VFM(INTEL_HASWELL_X, 0), /* COD */ X86_MATCH_VFM(INTEL_BROADWELL_X, 0), /* COD */ X86_MATCH_VFM(INTEL_ANY, 1), /* SNC */ {} }; static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) { const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu); On an Intel CPU with SNC enabled this code previously matched the rule on line 443 to avoid printing messages about insane cache configuration. The new code did not match any rules. Expanding the macros for the intel_cod_cpu[] array shows that the old is equivalent to: static const struct x86_cpu_id intel_cod_cpu[] = { [0] = { .vendor = 0, .family = 6, .model = 0x3F, .steppings = 0, .feature = 0, .driver_data = 0 }, [1] = { .vendor = 0, .family = 6, .model = 0x4F, .steppings = 0, .feature = 0, .driver_data = 0 }, [2] = { .vendor = 0, .family = 6, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 1 }, [3] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 0 } } while the new code expands to: static const struct x86_cpu_id intel_cod_cpu[] = { [0] = { .vendor = 0, .family = 6, .model = 0x3F, .steppings = 0, .feature = 0, .driver_data = 0 }, [1] = { .vendor = 0, .family = 6, .model = 0x4F, .steppings = 0, .feature = 0, .driver_data = 0 }, [2] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 1 }, [3] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 0 } } Looking at the code for x86_match_cpu(): const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match) { const struct x86_cpu_id *m; struct cpuinfo_x86 *c = &boot_cpu_data; for (m = match; m->vendor | m->family | m->model | m->steppings | m->feature; m++) { ... } return NULL; it is clear that there was no match because the ANY entry in the table (array index 2) is now the loop termination condition (all of vendor, family, model, steppings, and feature are zero). So this code was working before because the "ANY" check was looking for any Intel CPU in family 6. But fails now because the family is a wild card. So the root cause is that x86_match_cpu() has never been able to match on a rule with just X86_VENDOR_INTEL and all other fields set to wildcards. Add a new flags field to struct x86_cpu_id that has a bit set to indicate that this entry in the array is valid. Update X86_MATCH*() macros to set that bit. Change the end-marker check in x86_match_cpu() to just check the flags field for this bit. Backporter notes: The commit in Fixes is really the one that is broken: you can't have m->vendor as part of the loop termination conditional in x86_match_cpu() because it can happen - as it has happened above - that that whole conditional is 0 albeit vendor == 0 is a valid case - X86_VENDOR_INTEL is 0. However, the only case where the above happens is the SNC check added by 4db64279 so you only need this fix if you have backported that other commit 4db64279 ("x86/cpu: Switch to new Intel CPU model defines") Fixes: 644e9cbb ("Add driver auto probing for x86 features v4") Suggested-by:
Thomas Gleixner <tglx@linutronix.de> Suggested-by:
Borislav Petkov <bp@alien8.de> Signed-off-by:
Tony Luck <tony.luck@intel.com> Signed-off-by:
Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable+noautosel@kernel.org> # see above Link: https://lore.kernel.org/r/20240517144312.GBZkdtAOuJZCvxhFbJ@fat_crate.local
-
- May 21, 2024
-
-
Wayne Lin authored
[Why] Commit: - commit 5aa1dfcd ("drm/mst: Refactor the flow for payload allocation/removement") accidently overwrite the commit - commit 54d21740 ("drm: use mgr->dev in drm_dbg_kms in drm_dp_add_payload_part2") which cause regression. [How] Recover the original NULL fix and remove the unnecessary input parameter 'state' for drm_dp_add_payload_part2(). Fixes: 5aa1dfcd ("drm/mst: Refactor the flow for payload allocation/removement") Reported-by:
Leon Weiß <leon.weiss@ruhr-uni-bochum.de> Link: https://lore.kernel.org/r/38c253ea42072cc825dc969ac4e6b9b600371cc8.camel@ruhr-uni-bochum.de/ Cc: lyude@redhat.com Cc: imre.deak@intel.com Cc: stable@vger.kernel.org Cc: regressions@lists.linux.dev Reviewed-by:
Harry Wentland <harry.wentland@amd.com> Acked-by:
Jani Nikula <jani.nikula@intel.com> Signed-off-by:
Wayne Lin <Wayne.Lin@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240307062957.2323620-1-Wayne.Lin@amd.com (cherry picked from commit 4545614c1d8da603e57b60dd66224d81b6ffc305)
-