Commits · 0db4618e8fabfcc404af4dda23799bba726785a5 · BeagleBoard.org / Linux

Jul 19, 2024

btrfs: change BTRFS_MOUNT_* flags to 64bit type · c3ece6b7

Qu Wenruo authored 8 months ago

The BTRFS_MOUNT_* flags already extend beyond 32 bits. This is going to
cause compilation errors on some 32-bit systems, where unsigned long is
only 32 bits wide, so the flag BTRFS_MOUNT_IGNORESUPERFLAGS overflows
and can lead to errors.

Fix the problem by:

- Migrating all existing BTRFS_MOUNT_* flags to unsigned long long
- Migrating all mount option related variables to unsigned long long
  * btrfs_fs_info::mount_opt
  * btrfs_fs_context::mount_opt
  * mount_opt parameter of btrfs_check_options()
  * old_opts parameter of btrfs_remount_begin()
  * old_opts parameter of btrfs_remount_cleanup()
  * mount_opt parameter of btrfs_check_mountopts_zoned()
  * mount_opt and opt parameters of check_ro_option()

Fixes: 32e62165 ("btrfs: introduce new "rescue=ignoresuperflags" mount option")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

c3ece6b7
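
Note on c3ece6b7: the following is a minimal userspace sketch, not btrfs code, of why a flag past bit 31 needs a 64-bit type. The names example_flag_64() and example_store_32() are hypothetical, and uint32_t is assumed here as a stand-in for unsigned long on a 32-bit ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the flag for bit number `bit` in a 64-bit type. */
static inline unsigned long long example_flag_64(unsigned int bit)
{
	assert(bit < 64);	/* shifting by >= the type width is undefined */
	return 1ULL << bit;
}

/*
 * Hypothetical helper: store an option mask in a 32-bit variable, as an
 * unsigned long would on a 32-bit ABI. Bits 32 and above are dropped.
 */
static inline uint32_t example_store_32(unsigned long long opts)
{
	return (uint32_t)opts;
}
```

With these helpers, example_store_32(example_flag_64(32)) evaluates to 0, i.e. a flag at bit 32 is lost as soon as it is kept in a 32-bit variable, which is why both the flag definitions and every variable holding them have to be widened together.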

Jul 18, 2024

bcachefs: Fix integer overflow on trans->nr_updates · 6f719cbe

Kent Overstreet authored 8 months ago


We can't have more updates than paths, so btree_path_idx_t is the
correct type to use.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

6f719cbe

bcachefs: silence silly kdoc warning · f05a0b9c

Kent Overstreet authored 8 months ago

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f05a0b9c

bcachefs: Fix fsck warning about btree_trans not passed to fsck error · 2c4c17fe

Kent Overstreet authored 8 months ago

If a btree_trans is in use it's supposed to be passed to fsck_err so
that it can be unlocked if we're waiting on userspace input; but the
btree IO paths do call fsck errors where a btree_trans exists on the
stack but it's not passed through.

But it's ok, because it's unlocked while doing IO.

Fixes: a850bde6 ("bcachefs: fsck_err() may now take a btree_trans")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2c4c17fe

bcachefs: Add an error message for insufficient rw journal devs · f12410bb

Kent Overstreet authored 8 months ago


This causes us to go read-only - need an error message saying why.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f12410bb

bcachefs: varint: Avoid left-shift of a negative value · ee1b8dc1

Tavian Barnes authored 9 months ago


Shifting a negative value left is undefined.

Signed-off-by: Tavian Barnes <tavianator@tavianator.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ee1b8dc1
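
Note on ee1b8dc1: a minimal sketch of one common way to avoid left-shifting a negative value, namely doing the shift in an unsigned type. This is not the actual patch; example_shl() is a hypothetical name, and the conversion back to int64_t is assumed to wrap as two's complement (as it does with GCC and Clang).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: shift a signed value left without invoking
 * undefined behaviour. The shift itself happens on uint64_t, where
 * it is fully defined for shift counts below 64.
 */
static inline int64_t example_shl(int64_t v, unsigned int shift)
{
	assert(shift < 64);
	return (int64_t)((uint64_t)v << shift);
}
```

For instance, example_shl(-1, 3) yields -8, whereas writing -1 << 3 directly is undefined in C.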

nsfs: use cleanup guard · 280e36f0

Christian Brauner authored 8 months ago

Ensure that rcu read lock is given up before returning.

Link: https://lore.kernel.org/r/20240716-elixier-fliesen-1ab342151a61@brauner
Fixes: ca567df7 ("nsfs: add pid translation ioctls")
Reported-by: <syzbot+a3e82ae343b26b4d2335@syzkaller.appspotmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

280e36f0
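
Note on 280e36f0: a userspace sketch of the scope-based cleanup idea behind a guard, using the GCC/Clang cleanup attribute. It only illustrates the pattern; a plain counter is assumed in place of the RCU read lock, and EXAMPLE_GUARD(), example_unlock() and example_lookup() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the lock state: number of times the "lock" is held. */
static int example_lock_depth;

/* Called automatically when the guard variable goes out of scope. */
static inline void example_unlock(int *unused)
{
	(void)unused;
	assert(example_lock_depth > 0);
	example_lock_depth--;
}

/* Take the lock now and release it on every exit from the scope. */
#define EXAMPLE_GUARD() \
	int example_guard_token __attribute__((cleanup(example_unlock))) = \
		(example_lock_depth++, 0)

static int example_lookup(bool fail_early)
{
	EXAMPLE_GUARD();

	if (fail_early)
		return -1;	/* the lock is released on this path too */
	return 0;
}
```

Because the release is tied to scope exit, an early return cannot leave the lock held, which is the class of bug the commit message describes.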

fs/adfs: add MODULE_DESCRIPTION · 400e4064

Jeff Johnson authored 10 months ago


Fix the 'make W=1' issue:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/adfs/adfs.o

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240523-md-adfs-v1-1-364268e38370@quicinc.com


Signed-off-by: Christian Brauner <brauner@kernel.org>

400e4064

Jul 17, 2024

nfs: split nfs_read_folio · a308996e

Christoph Hellwig authored 8 months ago


nfs_read_folio is a bit hard to follow because it mixes high-level logic
with the actual data read.  Split the latter into a helper and update
the comments to be more accurate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

a308996e

nfs: pass explicit offset/count to trace events · fada32ed

Christoph Hellwig authored 8 months ago

nfs_folio_length is unsafe to use without having the folio locked and
without a check for a NULL ->f_mapping that protects against truncations;
using it that way can lead to kernel crashes, e.g. when running xfstests
generic/065 with all nfs trace points enabled.

Follow the model of the XFS trace points and pass in an explicit offset
and length.  This has the additional benefit that these values can
be more accurate as some of the users touch partial folio ranges.

Fixes: eb5654b3 ("NFS: Enable tracing of nfs_invalidate_folio() and nfs_launder_folio()")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

fada32ed

virtio: rename virtio_find_vqs_info() to virtio_find_vqs() · 6c85d6b6

Jiri Pirko authored 8 months ago


Since the original virtio_find_vqs() is no longer present, rename
virtio_find_vqs_info() back to virtio_find_vqs().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Message-Id: <20240708074814.1739223-20-jiri@resnulli.us>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

6c85d6b6

virtiofs: convert to use virtio_find_vqs_info() · fc496dcd

Jiri Pirko authored 8 months ago


Instead of passing separate names and callbacks arrays
to virtio_find_vqs(), allocate a single array of virtual_queue_info
structs and pass it to virtio_find_vqs_info().

Suggested-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Message-Id: <20240708074814.1739223-16-jiri@resnulli.us>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

fc496dcd

Jul 15, 2024

exfat: fix potential deadlock on __exfat_get_dentry_set · 89fc5487

Sungjong Seo authored 9 months ago

When accessing a file with more entries than ES_MAX_ENTRY_NUM, the bh-array
is allocated in __exfat_get_dentry_set. The problem is that the bh-array is
allocated with GFP_KERNEL, which does not make sense here. In the following
case, a deadlock on sbi->s_lock between the two processes may occur.

       CPU0                CPU1
       ----                ----
  kswapd
   balance_pgdat
    lock(fs_reclaim)
                      exfat_iterate
                       lock(&sbi->s_lock)
                       exfat_readdir
                        exfat_get_uniname_from_ext_entry
                         exfat_get_dentry_set
                          __exfat_get_dentry_set
                           kmalloc_array
                            ...
                            lock(fs_reclaim)
    ...
    evict
     exfat_evict_inode
      lock(&sbi->s_lock)

To fix this, let's allocate bh-array with GFP_NOFS.

Fixes: a3ff29a9 ("exfat: support dynamic allocate bh for exfat_entry_set_cache")
Cc: stable@vger.kernel.org # v6.2+
Reported-by: <syzbot+412a392a2cd4a65e71db@syzkaller.appspotmail.com>
Closes: https://lore.kernel.org/lkml/000000000000fef47e0618c0327f@google.com


Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>

89fc5487

exfat: handle idmapped mounts · 22482176

Michael Jeanson authored 10 months ago

Pass the idmapped mount information to the different helper
functions. Adapt the uid/gid checks in exfat_setattr to use the
vfsuid/vfsgid helpers.

Based on the fat implementation in commit 4b789936
("fat: handle idmapped mounts") by Christian Brauner.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>

22482176

Jul 14, 2024

bcachefs: darray: Don't pass NULL to memcpy() · 2e118ba3

Tavian Barnes authored 9 months ago


memcpy's second parameter must not be NULL, even if size is zero.

Signed-off-by: Tavian Barnes <tavianator@tavianator.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2e118ba3
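
Note on 2e118ba3: a minimal sketch of guarding a copy so that memcpy() is never called with a NULL source when the length is zero. This is an illustration of the rule stated in the commit message, not the darray code itself; example_copy() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy n bytes, skipping memcpy() entirely for
 * empty copies so that a NULL src (e.g. an empty, unallocated array)
 * is never passed to it.
 */
static inline void *example_copy(void *dst, const void *src, size_t n)
{
	assert(n == 0 || (dst != NULL && src != NULL));
	if (n)
		memcpy(dst, src, n);
	return dst;
}
```

An empty source array can then be represented by a NULL pointer and copied with example_copy(dst, NULL, 0) without tripping sanitizers or relying on undefined behaviour.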

bcachefs: Kill bch2_assert_btree_nodes_not_locked() · efb2018e

Kent Overstreet authored 8 months ago


We no longer track individual btree node locks with lockdep, so this
will never be enabled.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

efb2018e

bcachefs: Rename BCH_WRITE_DONE -> BCH_WRITE_SUBMITTED · ae469056

Kent Overstreet authored 1 year ago

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ae469056

bcachefs: __bch2_read(): call trans_begin() on every loop iter · 1d18b5ca

Kent Overstreet authored 8 months ago


Perusal of /sys/kernel/debug/bcachefs/*/btree_transaction_stats shows
that the read path has been accumulating unneeded paths on the reflink
btree, which we don't want.

The solution is to call bch2_trans_begin(), which drops paths not used
on the previous loop iteration.

bch2_readahead:
  Max mem used: 0
  Transaction duration:
    count:      194235
                           since mount        recent
    duration of events
      min:                      150 ns
      max:                        9 ms
      total:                    838 ms
      mean:                       4 us          6 us
      stddev:                    34 us          7 us
    time between events
      min:                       10 ns
      max:                       15 h
      mean:                       2 s          12 s
      stddev:                     2 s           3 ms
  Maximum allocated btree paths (193):
    path: idx  2 ref 0:0 P   btree=extents l=0 pos 270943112:392:U32_MAX locks 0
    path: idx  3 ref 1:0   S btree=extents l=0 pos 270943112:24578:U32_MAX locks 1
    path: idx  4 ref 0:0 P   btree=reflink l=0 pos 0:24773509:0 locks 0
    path: idx  5 ref 0:0 P S btree=reflink l=0 pos 0:24773631:0 locks 1
    path: idx  6 ref 0:0 P S btree=reflink l=0 pos 0:24773759:0 locks 1
    path: idx  7 ref 0:0 P S btree=reflink l=0 pos 0:24773887:0 locks 1
    path: idx  8 ref 0:0 P S btree=reflink l=0 pos 0:24774015:0 locks 1
    path: idx  9 ref 0:0 P S btree=reflink l=0 pos 0:24774143:0 locks 1
    path: idx 10 ref 0:0 P S btree=reflink l=0 pos 0:24774271:0 locks 1
<many more reflink paths>

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1d18b5ca

bcachefs: show none if label is not set · 114f530e

Hongbo Li authored 8 months ago


If the label is not set, the Label tag in the superblock info shows '(none)'.

```
[Before]
Device index:                               0
Label:
Version:                                    1.4: member_seq

[After]
Device index:                               0
Label:                                      (none)
Version:                                    1.4: member_seq
```

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

114f530e

bcachefs: drop packed, aligned from bkey_inode_buf · 7b6dda72

Kent Overstreet authored 8 months ago


Unnecessary here, and this broke the rust bindings:

error[E0588]: packed type cannot transitively contain a `#[repr(align)]` type
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:29025:1
      |
29025 | pub struct bkey_i_inode_v3 {
      | ^^^^^^^^^^^^^^^^^^^^^^^^^^
      |
note: `bch_inode_v3` has a `#[repr(align)]` attribute
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:8949:1
      |
8949  | pub struct bch_inode_v3 {
      | ^^^^^^^^^^^^^^^^^^^^^^^

error[E0588]: packed type cannot transitively contain a `#[repr(align)]` type
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:32826:1
      |
32826 | pub struct bkey_inode_buf {
      | ^^^^^^^^^^^^^^^^^^^^^^^^^
      |
note: `bch_inode_v3` has a `#[repr(align)]` attribute
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:8949:1
      |
8949  | pub struct bch_inode_v3 {
      | ^^^^^^^^^^^^^^^^^^^^^^^
note: `bkey_inode_buf` contains a field of type `bkey_i_inode_v3`
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:32827:9
      |
32827 |     pub inode: bkey_i_inode_v3,
      |         ^^^^^
note: ...which contains a field of type `bch_inode_v3`
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:29027:9
      |
29027 |     pub v: bch_inode_v3,
      |         ^

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

7b6dda72

bcachefs: btree node scan: fall back to comparing by journal seq · 6ec8623f

Kent Overstreet authored 8 months ago


Highly damaged filesystems, or filesystems that have been damaged,
repaired, and damaged again, may have sequence numbers we can't fully
trust - which in itself is something we need to debug.

Add a journal_seq fallback so that repair doesn't get stuck.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

6ec8623f

bcachefs: Add lockdep support for btree node locks · 375476c4

Kent Overstreet authored 1 year ago


This adds lockdep tracking for held btree locks with a single dep_map in
btree_trans, i.e. tracking all held btree locks as one object.

This is more practical and more useful than having lockdep track held
btree locks individually, because
 - we can take more locks than lockdep can track (unbounded, now that we
   have dynamically resizable btree paths)
 - there's no lock ordering between btree locks for lockdep to track (we
   do cycle detection)
 - and this makes it easy to teach lockdep that btree locks are not safe
   to hold while invoking memory reclaim.

The last rule is one that lockdep would never learn, because we only do
trylock() from within shrinkers - but we very much do not want to be
invoking memory reclaim while holding btree node locks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

375476c4

lockdep: lockdep_set_notrack_class() · 1a616c2f

Kent Overstreet authored 1 year ago

Add a new helper to disable lockdep tracking entirely for a given class.

This is needed for bcachefs, which takes too many btree node locks for
lockdep to track. Instead, we have a single lockdep_map for "btree_trans
has any btree nodes locked", which makes more sense given that we have
centralized lock management and a cycle detector.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1a616c2f

bcachefs: Improve copygc_wait_to_text() · 8f523d42

Kent Overstreet authored 8 months ago


Printing the raw values can occasionally be very useful.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

8f523d42

bcachefs: Convert clock code to u64s · 27d033df

Kent Overstreet authored 8 months ago


Eliminate possible integer truncation bugs on 32-bit systems.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

27d033df

bcachefs: Improve startup message · ec8bf491

Kent Overstreet authored 8 months ago


We're not always mounting when we start the filesystem

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ec8bf491

bcachefs: Self healing on read IO error · a2cb8a62

Kent Overstreet authored 8 months ago


This repurposes the promote path, which already knows how to call
data_update() after a read: we now automatically rewrite bad data when
we get a read error and then successfully retry from a different
replica.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a2cb8a62

bcachefs: Make read_only a mount option again, but hidden · b1d63b06

Kent Overstreet authored 8 months ago


fsck passes read_only as a mount option, and it's required for
nochanges, which it also uses.

Usually read_only is handled by the VFS, but we need to be able to
handle it too; we just don't want to print it out twice, so mark it as a
hidden option.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b1d63b06

bcachefs: bch2_extent_crc_unpacked_to_text() · 9d9d212e

Kent Overstreet authored 8 months ago

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9d9d212e

bcachefs: Ratelimit checksum error messages · 5e3c2083

Kent Overstreet authored 8 months ago

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

5e3c2083

bcachefs: spelling fix · 0f3372dc

Kent Overstreet authored 8 months ago

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0f3372dc

bcachefs: Simplify btree key cache fill path · d2cb6b21

Kent Overstreet authored 9 months ago


Don't allocate the new bkey_cached until after we've done the btree
lookup; this means we can kill bkey_cached.valid.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

d2cb6b21

bcachefs: Improve "unable to allocate journal write" message · 39d5d829

Kent Overstreet authored 9 months ago

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

39d5d829

bcachefs: Fix missing BTREE_TRIGGER_bucket_invalidate flag · e0d5bc6a

Kent Overstreet authored 9 months ago


This fixes an accounting mismatch for cached data.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e0d5bc6a

bcachefs: Ensure buffered writes write as much as they can · 7554a8bb

Kent Overstreet authored 1 year ago


This adds a new helper, bch2_folio_reservation_get_partial(), which
reserves as many blocks as possible and may return partial success.

__bch2_buffered_write() is switched to the new helper - this fixes
fstests generic/275, the write until -ENOSPC test.

generic/230 now fails: this appears to be a test bug, where xfs_io isn't
looping after a partial write to get the error code.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

7554a8bb

bcachefs: support STATX_DIOALIGN for statx file · 95924420

Hongbo Li authored 9 months ago


Add support for STATX_DIOALIGN to bcachefs, so that direct I/O alignment
restrictions are exposed to userspace in a generic way.

[Before]
```
./statx_test /mnt/bcachefs/test
statx(/mnt/bcachefs/test) = 0
dio mem align:0
dio offset align:0
```

[After]
```
./statx_test /mnt/bcachefs/test
statx(/mnt/bcachefs/test) = 0
dio mem align:1
dio offset align:512
```

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

95924420

bcachefs: split out lru_format.h · 7aa7183e

Kent Overstreet authored 9 months ago

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

7aa7183e

bcachefs: bch2_btree_key_cache_drop() now evicts · 789566da

Kent Overstreet authored 9 months ago


As part of improving btree key cache coherency, the bkey_cached.valid
flag is going away.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

789566da

bcachefs: set fgf order hint before starting a buffered write · febc33cb

Pankaj Raghav authored 9 months ago


Set the preferred folio order in the fgp_flags by calling
fgf_set_order(). The page cache will try to allocate a large folio of
the preferred order whenever possible instead of allocating multiple
order-0 folios.

This improves the buffered write performance up to 1.25x with default
mount options and up to 1.57x when mounted with no_data_io option with
the following fio workload:

fio --name=bcachefs --filename=/mnt/test  --size=100G \
     --ioengine=io_uring --iodepth=16 --rw=write --bs=128k

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

febc33cb

bcachefs: use FGP_WRITEBEGIN instead of combining individual flags · 2b02b955

Pankaj Raghav authored 9 months ago


Use FGP_WRITEBEGIN to avoid repeating the individual FGP flags before
starting a buffered write.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2b02b955
