Commits · 4a04b4f0de59dd5c621e78f15803ee0b0544eeb8 · BeagleBoard.org / Linux

Jul 12, 2024

bpf: fix overflow check in adjust_jmp_off() · 4a04b4f0

Shung-Hsi Yu authored 8 months ago

adjust_jmp_off() incorrectly used the insn->imm field for all overflow check,
which is incorrect as that should only be done or the BPF_JMP32 | BPF_JA case,
not the general jump instruction case. Fix it by using insn->off for overflow
check in the general case.

Fixes: 5337ac4c

 ("bpf: Fix the corner case with may_goto and jump to the 1st insn.")
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20240712080127.136608-2-shung-hsi.yu@suse.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

4a04b4f0

bpf: Eliminate remaining "make W=1" warnings in kernel/bpf/btf.o · 2454075f

Alan Maguire authored 8 months ago

As reported by Mirsad [1] we still see format warnings in kernel/bpf/btf.o
at W=1 warning level:

  CC      kernel/bpf/btf.o
./kernel/bpf/btf.c: In function ‘btf_type_seq_show_flags’:
./kernel/bpf/btf.c:7553:21: warning: assignment left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
 7553 |         sseq.showfn = btf_seq_show;
      |                     ^
./kernel/bpf/btf.c: In function ‘btf_type_snprintf_show’:
./kernel/bpf/btf.c:7604:31: warning: assignment left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
 7604 |         ssnprintf.show.showfn = btf_snprintf_show;
      |                               ^

Combined with CONFIG_WERROR=y these can halt the build.

The fix (annotating the structure field with __printf())
suggested by Mirsad resolves these. Apologies I missed this last time.
No other W=1 warnings were observed in kernel/bpf after this fix.

[1] https://lore.kernel.org/bpf/92c9d047-f058-400c-9c7d-81d4dc1ef71b@gmail.com/

Fixes: b3470da3

 ("bpf: annotate BTF show functions with __printf")
Reported-by: Mirsad Todorovac <mtodorovac69@gmail.com>
Suggested-by: Mirsad Todorovac <mtodorovac69@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240712092859.1390960-1-alan.maguire@oracle.com

2454075f

Jul 11, 2024

bpf: annotate BTF show functions with __printf · b3470da3

Alan Maguire authored 8 months ago

-Werror=suggest-attribute=format warns about two functions
in kernel/bpf/btf.c [1]; add __printf() annotations to silence
these warnings since for CONFIG_WERROR=y they will trigger
build failures.

[1] https://lore.kernel.org/bpf/a8b20c72-6631-4404-9e1f-0410642d7d20@gmail.com/

Fixes: 31d0bc81

 ("bpf: Move to generic BTF show support, apply it to seq files/strings")
Reported-by: Mirsad Todorovac <mtodorovac69@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Mirsad Todorovac <mtodorovac69@yahoo.com>
Link: https://lore.kernel.org/r/20240711182321.963667-1-alan.maguire@oracle.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

b3470da3

bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG · 19d3c179

Puranjay Mohan authored 8 months ago

When BPF_TRAMP_F_CALL_ORIG is set, the trampoline calls
__bpf_tramp_enter() and __bpf_tramp_exit() functions, passing them
the struct bpf_tramp_image *im pointer as an argument in R0.

The trampoline generation code uses emit_addr_mov_i64() to emit
instructions for moving the bpf_tramp_image address into R0, but
emit_addr_mov_i64() assumes the address to be in the vmalloc() space
and uses only 48 bits. Because bpf_tramp_image is allocated using
kzalloc(), its address can use more than 48-bits, in this case the
trampoline will pass an invalid address to __bpf_tramp_enter/exit()
causing a kernel crash.

Fix this by using emit_a64_mov_i64() in place of emit_addr_mov_i64()
as it can work with addresses that are greater than 48-bits.

Fixes: efc9909f

 ("bpf, arm64: Add bpf trampoline for arm64")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Closes: https://lore.kernel.org/all/SJ0PR15MB461564D3F7E7A763498CA6A8CBDB2@SJ0PR15MB4615.namprd15.prod.outlook.com/
Link: https://lore.kernel.org/bpf/20240711151838.43469-1-puranjay@kernel.org

19d3c179

Jul 10, 2024

Merge branch 'BPF selftests misc fixes' · 18a8a4c8

Martin KaFai Lau authored 8 months ago


Geliang Tang says:

====================
v2:
 - only check the first "link" (link_nl) in test_mixed_links().
 - Drop patch 2 in v1.

This patchset fixes a segfault and a bpf object leak in test_progs.

It is a resend patch 1 out of "skip ENOTSUPP BPF selftests" set as Eduard
suggested. Together with another fix for xdp_adjust_tail.
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

18a8a4c8

selftests/bpf: Close obj in error path in xdp_adjust_tail · 52b49ec1

Geliang Tang authored 8 months ago

If bpf_object__load() fails in test_xdp_adjust_frags_tail_grow(), "obj"
opened before this should be closed. So use "goto out" to close it instead
of using "return" here.

Fixes: 11022108

 ("bpf: selftests: update xdp_adjust_tail selftest to include xdp frags")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/f282a1ed2d0e3fb38cceefec8e81cabb69cab260.1720615848.git.tanggeliang@kylinos.cn


Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

52b49ec1

selftests/bpf: Null checks for links in bpf_tcp_ca · eef0532e

Geliang Tang authored 8 months ago

Run bpf_tcp_ca selftests (./test_progs -t bpf_tcp_ca) on a Loongarch
platform, some "Segmentation fault" errors occur:

'''
 test_dctcp:PASS:bpf_dctcp__open_and_load 0 nsec
 test_dctcp:FAIL:bpf_map__attach_struct_ops unexpected error: -524
 #29/1    bpf_tcp_ca/dctcp:FAIL
 test_cubic:PASS:bpf_cubic__open_and_load 0 nsec
 test_cubic:FAIL:bpf_map__attach_struct_ops unexpected error: -524
 #29/2    bpf_tcp_ca/cubic:FAIL
 test_dctcp_fallback:PASS:dctcp_skel 0 nsec
 test_dctcp_fallback:PASS:bpf_dctcp__load 0 nsec
 test_dctcp_fallback:FAIL:dctcp link unexpected error: -524
 #29/4    bpf_tcp_ca/dctcp_fallback:FAIL
 test_write_sk_pacing:PASS:open_and_load 0 nsec
 test_write_sk_pacing:FAIL:attach_struct_ops unexpected error: -524
 #29/6    bpf_tcp_ca/write_sk_pacing:FAIL
 test_update_ca:PASS:open 0 nsec
 test_update_ca:FAIL:attach_struct_ops unexpected error: -524
 settcpca:FAIL:setsockopt unexpected setsockopt: \
					actual -1 == expected -1
 (network_helpers.c:99: errno: No such file or directory) \
					Failed to call post_socket_cb
 start_test:FAIL:start_server_str unexpected start_server_str: \
					actual -1 == expected -1
 test_update_ca:FAIL:ca1_ca1_cnt unexpected ca1_ca1_cnt: \
					actual 0 <= expected 0
 #29/9    bpf_tcp_ca/update_ca:FAIL
 #29      bpf_tcp_ca:FAIL
 Caught signal #11!
 Stack trace:
 ./test_progs(crash_handler+0x28)[0x5555567ed91c]
 linux-vdso.so.1(__vdso_rt_sigreturn+0x0)[0x7ffffee408b0]
 ./test_progs(bpf_link__update_map+0x80)[0x555556824a78]
 ./test_progs(+0x94d68)[0x5555564c4d68]
 ./test_progs(test_bpf_tcp_ca+0xe8)[0x5555564c6a88]
 ./test_progs(+0x3bde54)[0x5555567ede54]
 ./test_progs(main+0x61c)[0x5555567efd54]
 /usr/lib64/libc.so.6(+0x22208)[0x7ffff2aaa208]
 /usr/lib64/libc.so.6(__libc_start_main+0xac)[0x7ffff2aaa30c]
 ./test_progs(_start+0x48)[0x55555646bca8]
 Segmentation fault
'''

This is because BPF trampoline is not implemented on Loongarch yet,
"link" returned by bpf_map__attach_struct_ops() is NULL. test_progs
crashs when this NULL link passes to bpf_link__update_map(). This
patch adds NULL checks for all links in bpf_tcp_ca to fix these errors.
If "link" is NULL, goto the newly added label "out" to destroy the skel.

v2:
 - use "goto out" instead of "return" as Eduard suggested.

Fixes: 06da9f3b

 ("selftests/bpf: Test switching TCP Congestion Control algorithms.")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/b4c841492bd4ed97964e4e61e92827ce51bf1dc9.1720615848.git.tanggeliang@kylinos.cn


Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

eef0532e

Merge branch 'use network helpers, part 8' · ec5b8c76

Martin KaFai Lau authored 8 months ago


Geliang Tang says:

====================
v11:
 - new patches 2, 4, 6.
 - drop expect_errno from network_helper_opts as Eduard and Martin
   suggested.
 - drop sockmap_ktls patches from this set.

v10:
 - a new patch 10 is added.
 - patches 1-6, 8-9 unchanged, only commit logs updated.
 - "err = -errno" is used in patches 7, 11, 12 to get the real error
   number before checking value of "err".

v9:
 - new patches 5-7, new struct member expect_errno for network_helper_opts.
 - patches 1-4, 8-9 unchanged.
 - update patches 10-11 to make sure all tests pass.

v8:
 - only patch 8 updated, to fix errors reported by CI.

v7:
 - address Martin's comments in v6. (thanks)
 - use MAX(opts->backlog, 0) instead of opts->backlog.
 - use connect_to_fd_opts instead connect_to_fd.
 - more ASSERT_* to check errors.

v6:
 - update patch 6 as Daniel suggested. (thanks)

v5:
 - keep make_server and make_client as Eduard suggested.

v4:
 - a new patch to use make_sockaddr in sockmap_ktls.
 - a new patch to close fd in error path in drop_on_reuseport.
 - drop make_server() in patch 7.
 - drop make_client() too in patch 9.

v3:
 - a new patch to add backlog for network_helper_opts.
 - use start_server_str in sockmap_ktls now, not start_server.

v2:
 - address Eduard's comments in v1. (thanks)
 - fix errors reported by CI.

This patch set uses network helpers in sk_lookup, and drop the local
helpers inetaddr_len() and make_socket().
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

ec5b8c76

selftests/bpf: Use connect_fd_to_fd in sk_lookup · 9004054b

Geliang Tang authored 8 months ago


This patch uses public helper connect_fd_to_fd() exported in
network_helpers.h instead of using getsockname() + connect() in
run_lookup_prog() in prog_tests/sk_lookup.c. This can simplify
the code.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/7077c277cde5a1864cdc244727162fb75c8bb9c5.1720515893.git.tanggeliang@kylinos.cn


Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

9004054b

selftests/bpf: Use start_server_addr in sk_lookup · d9810c43

Geliang Tang authored 8 months ago

This patch uses public helper start_server_addr() in udp_recv_send()
in prog_tests/sk_lookup.c to simplify the code.

And use ASSERT_OK_FD() to check fd returned by start_server_addr().

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/f11cabfef4a2170ecb66a1e8e2e72116d8f621b3.1720515893.git.tanggeliang@kylinos.cn

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

d9810c43

selftests/bpf: Use start_server_str in sk_lookup · 14fc6fcd

Geliang Tang authored 8 months ago

This patch uses public helper start_server_str() to simplify make_server()
in prog_tests/sk_lookup.c.

Add a callback setsockopts() to do all sockopts, set it to post_socket_cb
pointer of struct network_helper_opts. And add a new struct cb_opts to save
the data needed to pass to the callback. Then pass this network_helper_opts
to start_server_str().

Also use ASSERT_OK_FD() to check fd returned by start_server_str().

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/5981539f5591d2c4998c962ef2bf45f34c940548.1720515893.git.tanggeliang@kylinos.cn

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

14fc6fcd

selftests/bpf: Close fd in error path in drop_on_reuseport · adae187e

Geliang Tang authored 8 months ago

In the error path when update_lookup_map() fails in drop_on_reuseport in
prog_tests/sk_lookup.c, "server1", the fd of server 1, should be closed.
This patch fixes this by using "goto close_srv1" lable instead of "detach"
to close "server1" in this case.

Fixes: 0ab5539f

 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point")
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/86aed33b4b0ea3f04497c757845cff7e8e621a2d.1720515893.git.tanggeliang@kylinos.cn


Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

adae187e

selftests/bpf: Add ASSERT_OK_FD macro · 7046345d

Geliang Tang authored 8 months ago

Add a new dedicated ASSERT macro ASSERT_OK_FD to test whether a socket
FD is valid or not. It can be used to replace macros ASSERT_GT(fd, 0, ""),
ASSERT_NEQ(fd, -1, "") or statements (fd < 0), (fd != -1).

Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/ded75be86ac630a3a5099739431854c1ec33f0ea.1720515893.git.tanggeliang@kylinos.cn

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

7046345d

selftests/bpf: Add backlog for network_helper_opts · a3016a27

Geliang Tang authored 8 months ago

Some callers expect __start_server() helper to pass their own "backlog"
value to listen() instead of the default of 1. So this patch adds struct
member "backlog" for network_helper_opts to allow callers to set "backlog"
value via start_server_str() helper.

listen(fd, 0 /* backlog */) can be used to enforce syncookie. Meaning
backlog 0 is a legit value.

Using 0 as a default and changing it to 1 here is fine. It makes the test
program easier to write for the common case. Enforcing syncookie mode by
using backlog 0 is a niche use case but it should at least have a way for
the caller to do that. Thus, -ve backlog value is used here for the
syncookie use case. Please see the comment in network_helpers.h for
the details.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/1660229659b66eaad07aa2126e9c9fe217eba0dd.1720515893.git.tanggeliang@kylinos.cn

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

a3016a27

selftests/bpf: fix compilation failure when CONFIG_NF_FLOW_TABLE=m · eeb23b54

Alan Maguire authored 8 months ago

In many cases, kernel netfilter functionality is built as modules.
If CONFIG_NF_FLOW_TABLE=m in particular, progs/xdp_flowtable.c
(and hence selftests) will fail to compile, so add a ___local
version of "struct flow_ports".

Fixes: c77e572d

 ("selftests/bpf: Add selftest for bpf_xdp_flow_lookup kfunc")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20240710150051.192598-1-alan.maguire@oracle.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

eeb23b54

bpf: Remove tst_run from lwt_seg6local_prog_ops. · c13fda93

Sebastian Andrzej Siewior authored 8 months ago


The syzbot reported that the lwt_seg6 related BPF ops can be invoked
via bpf_test_run() without without entering input_action_end_bpf()
first.

Martin KaFai Lau said that self test for BPF_PROG_TYPE_LWT_SEG6LOCAL
probably didn't work since it was introduced in commit 04d4b274e2a
("ipv6: sr: Add seg6local action End.BPF"). The reason is that the
per-CPU variable seg6_bpf_srh_states::srh is never assigned in the self
test case but each BPF function expects it.

Remove test_run for BPF_PROG_TYPE_LWT_SEG6LOCAL.

Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Reported-by:  <syzbot+608a2acde8c5a101d07d@syzkaller.appspotmail.com>
Fixes: d1542d4a ("seg6: Use nested-BH locking for seg6_bpf_srh_states.")
Fixes: 004d4b27

 ("ipv6: sr: Add seg6local action End.BPF")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240710141631.FbmHcQaX@linutronix.de


Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

c13fda93

Jul 09, 2024

bpf: relax zero fixed offset constraint on KF_TRUSTED_ARGS/KF_RCU · 605c9699

Matt Bobrowski authored 8 months ago

Currently, BPF kfuncs which accept trusted pointer arguments
i.e. those flagged as KF_TRUSTED_ARGS, KF_RCU, or KF_RELEASE, all
require an original/unmodified trusted pointer argument to be supplied
to them. By original/unmodified, it means that the backing register
holding the trusted pointer argument that is to be supplied to the BPF
kfunc must have its fixed offset set to zero, or else the BPF verifier
will outright reject the BPF program load. However, this zero fixed
offset constraint that is currently enforced by the BPF verifier onto
BPF kfuncs specifically flagged to accept KF_TRUSTED_ARGS or KF_RCU
trusted pointer arguments is rather unnecessary, and can limit their
usability in practice. Specifically, it completely eliminates the
possibility of constructing a derived trusted pointer from an original
trusted pointer. To put it simply, a derived pointer is a pointer
which points to one of the nested member fields of the object being
pointed to by the original trusted pointer.

This patch relaxes the zero fixed offset constraint that is enforced
upon BPF kfuncs which specifically accept KF_TRUSTED_ARGS, or KF_RCU
arguments. Although, the zero fixed offset constraint technically also
applies to BPF kfuncs accepting KF_RELEASE arguments, relaxing this
constraint for such BPF kfuncs has subtle and unwanted
side-effects. This was discovered by experimenting a little further
with an initial version of this patch series [0]. The primary issue
with relaxing the zero fixed offset constraint on BPF kfuncs accepting
KF_RELEASE arguments is that it'd would open up the opportunity for
BPF programs to supply both trusted pointers and derived trusted
pointers to them. For KF_RELEASE BPF kfuncs specifically, this could
be problematic as resources associated with the backing pointer could
be released by the backing BPF kfunc and cause instabilities for the
rest of the kernel.

With this new fixed offset semantic in-place for BPF kfuncs accepting
KF_TRUSTED_ARGS and KF_RCU arguments, we now have more flexibility
when it comes to the BPF kfuncs that we're able to introduce moving
forward.

Early discussions covering the possibility of relaxing the zero fixed
offset constraint can be found using the link below. This will provide
more context on where all this has stemmed from [1].

Notably, pre-existing tests have been updated such that they provide
coverage for the updated zero fixed offset
functionality. Specifically, the nested offset test was converted from
a negative to positive test as it was already designed to assert zero
fixed offset semantics of a KF_TRUSTED_ARGS BPF kfunc.

[0] https://lore.kernel.org/bpf/ZnA9ndnXKtHOuYMe@google.com/
[1] https://lore.kernel.org/bpf/ZhkbrM55MKQ0KeIV@google.com/

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240709210939.1544011-1-mattbobrowski@google.com

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

605c9699

Merge branch 'fix-libbpf-bpf-skeleton-forward-backward-compat' · 02779af2

Alexei Starovoitov authored 8 months ago

Andrii Nakryiko says:

====================
Fix libbpf BPF skeleton forward/backward compat

Fix recently identified (but long standing) bug with handling BPF skeleton
forward and backward compatibility. On libbpf side, even though BPF skeleton
was always designed to be forward and backwards compatible through recording
actual size of constrituents of BPF skeleton itself (map/prog/var skeleton
definitions), libbpf implementation did implicitly hard-code those sizes by
virtue of using a trivial array access syntax.

This issue will only affect libbpf used as a shared library. Statically
compiled libbpfs will always be in sync with BPF skeleton, bypassing this
problem altogether.

This patch set fixes libbpf, but also mitigates the problem for old libbpf
versions by teaching bpftool to generate more conservative BPF skeleton,
if possible (i.e., if there are no struct_ops maps defined).

v1->v2:
  - fix SOB, add acks, typo fixes (Quentin, Eduard);
  - improve reporting of skipped map auto-attachment (Alan, Eduard).
====================

Link: https://lore.kernel.org/r/20240708204540.4188946-1-andrii@kernel.org


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

02779af2

libbpf: improve old BPF skeleton handling for map auto-attach · a459f4bb

Andrii Nakryiko authored 8 months ago


Improve how we handle old BPF skeletons when it comes to BPF map
auto-attachment. Emit one warn-level message per each struct_ops map
that could have been auto-attached, if user provided recent enough BPF
skeleton version. Don't spam log if there are no relevant struct_ops
maps, though.

This should help users realize that they probably need to regenerate BPF
skeleton header with more recent bpftool/libbpf-cargo (or whatever other
means of BPF skeleton generation).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240708204540.4188946-4-andrii@kernel.org


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

a459f4bb

libbpf: fix BPF skeleton forward/backward compat handling · 99fb9531

Andrii Nakryiko authored 8 months ago


BPF skeleton was designed from day one to be extensible. Generated BPF
skeleton code specifies actual sizes of map/prog/variable skeletons for
that reason and libbpf is supposed to work with newer/older versions
correctly.

Unfortunately, it was missed that we implicitly embed hard-coded most
up-to-date (according to libbpf's version of libbpf.h header used to
compile BPF skeleton header) sizes of those structs, which can differ
from the actual sizes at runtime when libbpf is used as a shared
library.

We have a few places were we just index array of maps/progs/vars, which
implicitly uses these potentially invalid sizes of structs.

This patch aims to fix this problem going forward. Once this lands,
we'll backport these changes in Github repo to create patched releases
for older libbpfs.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Fixes: d66562fb ("libbpf: Add BPF object skeleton support")
Fixes: 430025e5 ("libbpf: Add subskeleton scaffolding")
Fixes: 08ac454e

 ("libbpf: Auto-attach struct_ops BPF maps in BPF skeleton")
Co-developed-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240708204540.4188946-3-andrii@kernel.org


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

99fb9531

bpftool: improve skeleton backwards compat with old buggy libbpfs · 06e71ad5

Andrii Nakryiko authored 8 months ago


Old versions of libbpf don't handle varying sizes of bpf_map_skeleton
struct correctly. As such, BPF skeleton generated by newest bpftool
might not be compatible with older libbpf (though only when libbpf is
used as a shared library), even though it, by design, should.

Going forward libbpf will be fixed, plus we'll release bug fixed
versions of relevant old libbpfs, but meanwhile try to mitigate from
bpftool side by conservatively assuming older and smaller definition of
bpf_map_skeleton, if possible. Meaning, if there are no struct_ops maps.

If there are struct_ops, then presumably user would like to have
auto-attaching logic and struct_ops map link placeholders, so use the
full bpf_map_skeleton definition in that case.

Acked-by: Quentin Monnet <qmo@kernel.org>
Co-developed-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240708204540.4188946-2-andrii@kernel.org


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

06e71ad5

Merge branch 'selftests-drv-net-rss_ctx-more-tests' · 746d684e

Jakub Kicinski authored 8 months ago

Jakub Kicinski says:

====================
selftests: drv-net: rss_ctx: more tests

Add a few more tests for RSS.

v1: https://lore.kernel.org/all/20240705015725.680275-1-kuba@kernel.org/
====================

Link: https://patch.msgid.link/20240708213627.226025-1-kuba@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

746d684e

selftests: drv-net: rss_ctx: test flow rehashing without impacting traffic · 933048fe

Jakub Kicinski authored 8 months ago


Some workloads may want to rehash the flows in response to an imbalance.
Most effective way to do that is changing the RSS key. Check that changing
the key does not cause link flaps or traffic disruption.

Disrupting traffic for key update is not incorrect, but makes the key
update unusable for rehashing under load.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-6-kuba@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

933048fe

selftests: drv-net: rss_ctx: check behavior of indirection table resizing · 7e3e5b0b

Jakub Kicinski authored 8 months ago


Some devices dynamically increase and decrease the size of the RSS
indirection table based on the number of enabled queues.
When that happens driver must maintain the balance of entries
(preferably duplicating the smaller table).

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-5-kuba@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

7e3e5b0b

selftests: drv-net: rss_ctx: test queue changes vs user RSS config · e2c9703d

Jakub Kicinski authored 8 months ago


By default main RSS table should change to include all queues.
When user sets a specific RSS config the driver should preserve it,
even when queue count changes. Driver should refuse to deactivate
queues used in the user-set RSS config.

For additional contexts driver should still refuse to deactivate
queues in use. Whether the contexts should get resized like
context 0 when queue count increases is a bit unclear. I anticipate
most drivers today don't do that. Since main use case for additional
contexts is to set the indir table - it doesn't seem worthwhile to
care about behavior of the default table too much. Don't test that.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-4-kuba@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e2c9703d

selftests: drv-net: rss_ctx: factor out send traffic and check · 847aa551

Jakub Kicinski authored 8 months ago


Wrap up sending traffic and checking in which queues it landed
in a helper.

The method used for testing is to send a lot of iperf traffic
and check which queues received the most packets. Those should
be the queues where we expect iperf to land - either because we
installed a filter for the port iperf uses, or we didn't and
expect it to use context 0.

Contexts get disjoint queue sets, but the main context (AKA context 0)
may receive some background traffic (noise).

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-3-kuba@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

847aa551

selftests: drv-net: rss_ctx: fix cleanup in the basic test · a0aab7d7

Jakub Kicinski authored 8 months ago


The basic test may fail without resetting the RSS indir table.
Use the .exec() method to run cleanup early since we re-test
with traffic that returning to default state works.
While at it reformat the doc a tiny bit.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-2-kuba@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

a0aab7d7

net: ti: icssg-prueth: add missing deps · d6947113

Guillaume La Roque authored 8 months ago

Add missing dependency on NET_SWITCHDEV.

Fixes: abd5576b

 ("net: ti: icssg-prueth: Add support for ICSSG switch firmware")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Guillaume La Roque <glaroque@baylibre.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Link: https://patch.msgid.link/20240708-net-deps-v2-1-b22fb74da2a3@baylibre.com


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

d6947113

dt-bindings: net: fsl,fman: add ptimer-handle property · 5618ced0

Frank Li authored 8 months ago

Add ptimer-handle property to link to ptp-timer node handle.
Fix below warning:
arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dtb: fman@1a00000: 'ptimer-handle' do not match any of the regexes: '^ethernet@[a-f0-9]+$', '^mdio@[a-f0-9]+$', '^muram@[a-f0-9]+$', '^phc@[a-f0-9]+$', '^port@[a-f0-9]+$', 'pinctrl-[0-9]+'

Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20240708180949.1898495-2-Frank.Li@nxp.com

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5618ced0

dt-bindings: net: fsl,fman: allow dma-coherent property · dd84d831

Frank Li authored 8 months ago

Add dma-coherent property to fix below warning.
arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dtb: fman@1a00000: 'dma-coherent', 'ptimer-handle' do not match any of the regexes: '^ethernet@[a-f0-9]+$', '^mdio@[a-f0-9]+$', '^muram@[a-f0-9]+$', '^phc@[a-f0-9]+$', '^port@[a-f0-9]+$', 'pinctrl-[0-9]+'
from schema $id: http://devicetree.org/schemas/net/fsl,fman.yaml#

Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20240708180949.1898495-1-Frank.Li@nxp.com

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dd84d831

net: tls: Pass union tls_crypto_context pointer to memzero_explicit · 0d9e699d

Simon Horman authored 8 months ago


Pass union tls_crypto_context pointer, rather than struct
tls_crypto_info pointer, to memzero_explicit().

The address of the pointer is the same before and after.
But the new construct means that the size of the dereferenced pointer type
matches the size being zeroed. Which aids static analysis.

As reported by Smatch:

  .../tls_main.c:842 do_tls_setsockopt_conf() error: memzero_explicit() 'crypto_info' too small (4 vs 56)

No functional change intended.
Compile tested only.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240708-tls-memzero-v2-1-9694eaf31b79@kernel.org


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

0d9e699d

selftests: forwarding: Make vxlan-bridge-1d pass on debug kernels · 3699e57a

Ido Schimmel authored 8 months ago

The ageing time used by the test is too short for debug kernels and
results in entries being aged out prematurely [1].

Fix by increasing the ageing time.

The same change was done for the VLAN-aware version of the test in
commit dfbab740

 ("selftests: forwarding: Make vxlan-bridge-1q pass
on debug kernels").

[1]
 # ./vxlan_bridge_1d.sh
 [...]
 # TEST: VXLAN: flood before learning                              [ OK ]
 # TEST: VXLAN: show learned FDB entry                             [ OK ]
 # TEST: VXLAN: learned FDB entry                                  [FAIL]
 # veth3: Expected to capture 0 packets, got 4.
 # RTNETLINK answers: No such file or directory
 # TEST: VXLAN: deletion of learned FDB entry                      [ OK ]
 # TEST: VXLAN: Ageing of learned FDB entry                        [FAIL]
 # veth3: Expected to capture 0 packets, got 2.
 [...]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240707095458.2870260-1-idosch@nvidia.com


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

3699e57a

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 7b769adc

Paolo Abeni authored 8 months ago

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-07-08

The following pull-request contains BPF updates for your *net-next* tree.

We've added 102 non-merge commits during the last 28 day(s) which contain
a total of 127 files changed, 4606 insertions(+), 980 deletions(-).

The main changes are:

1) Support resilient split BTF which cuts down on duplication and makes BTF
   as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman.

2) Add support for dumping kfunc prototypes from BTF which enables both detecting
   as well as dumping compilable prototypes for kfuncs, from Daniel Xu.

3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement
   support for BPF exceptions, from Ilya Leoshkevich.

4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support
   for BPF trampolines and to utilize bpf_prog_pack for the latter, from ...

7b769adc

net: phy: microchip: lan937x: add support for 100BaseTX PHY · 870a1dbc

Oleksij Rempel authored 8 months ago


Add support of 100BaseTX PHY build in to LAN9371 and LAN9372 switches.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Link: https://patch.msgid.link/20240706154201.1456098-1-o.rempel@pengutronix.de


Signed-off-by: Paolo Abeni <pabeni@redhat.com>

870a1dbc

udp: Remove duplicate included header file trace/events/udp.h · 0787ab20

Thorsten Blum authored 8 months ago


Remove duplicate included header file trace/events/udp.h and the
following warning reported by make includecheck:

  trace/events/udp.h is included more than once

Compile-tested only.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240706071132.274352-2-thorsten.blum@toblux.com


Signed-off-by: Paolo Abeni <pabeni@redhat.com>

0787ab20

net: tn40xx: add per queue netdev-genl stats support · 6c2a4c2f

FUJITA Tomonori authored 8 months ago


Add support for the netdev-genl per queue stats API.

./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump qstats-get --json '{"scope":"queue"}'
[{'ifindex': 4,
  'queue-id': 0,
  'queue-type': 'rx',
  'rx-alloc-fail': 0,
  'rx-bytes': 266613,
  'rx-packets': 3325},
 {'ifindex': 4,
  'queue-id': 0,
  'queue-type': 'tx',
  'tx-bytes': 142823367,
  'tx-packets': 2387}]

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20240706064324.137574-1-fujita.tomonori@gmail.com


Signed-off-by: Paolo Abeni <pabeni@redhat.com>

6c2a4c2f

sctp: Fix typos and improve comments · 417d8818

Thorsten Blum authored 8 months ago


Fix typos s/steam/stream/ and spell out Schedule/Unschedule in the
comments.

Compile-tested only.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240704202558.62704-2-thorsten.blum@toblux.com


Signed-off-by: Paolo Abeni <pabeni@redhat.com>

417d8818

l2tp: fix possible UAF when cleaning up tunnels · f8ad00f3

James Chapman authored 8 months ago


syzbot reported a UAF caused by a race when the L2TP work queue closes a
tunnel at the same time as a userspace thread closes a session in that
tunnel.

Tunnel cleanup is handled by a work queue which iterates through the
sessions contained within a tunnel, and closes them in turn.

Meanwhile, a userspace thread may arbitrarily close a session via
either netlink command or by closing the pppox socket in the case of
l2tp_ppp.

The race condition may occur when l2tp_tunnel_closeall walks the list
of sessions in the tunnel and deletes each one.  Currently this is
implemented using list_for_each_safe, but because the list spinlock is
dropped in the loop body it's possible for other threads to manipulate
the list during list_for_each_safe's list walk.  This can lead to the
list iterator being corrupted, leading to list_for_each_safe spinning.
One sequence of events which may lead to this is as follows:

 * A tunnel is created, containing two sessions A and B.
 * A thread closes the tunnel, triggering tunnel cleanup via the work
   queue.
 * l2tp_tunnel_closeall runs in the context of the work queue.  It
   removes session A from the tunnel session list, then drops the list
   lock.  At this point the list_for_each_safe temporary variable is
   pointing to the other session on the list, which is session B, and
   the list can be manipulated by other threads since the list lock has
   been released.
 * Userspace closes session B, which removes the session from its parent
   tunnel via l2tp_session_delete.  Since l2tp_tunnel_closeall has
   released the tunnel list lock, l2tp_session_delete is able to call
   list_del_init on the session B list node.
 * Back on the work queue, l2tp_tunnel_closeall resumes execution and
   will now spin forever on the same list entry until the underlying
   session structure is freed, at which point UAF occurs.

The solution is to iterate over the tunnel's session list using
list_first_entry_not_null to avoid the possibility of the list
iterator pointing at a list item which may be removed during the walk.

Also, have l2tp_tunnel_closeall ref each session while it processes it
to prevent another thread from freeing it.

	cpu1				cpu2
	---				---
					pppol2tp_release()

	spin_lock_bh(&tunnel->list_lock);
	for (;;) {
		session = list_first_entry_or_null(&tunnel->session_list,
						   struct l2tp_session, list);
		if (!session)
			break;
		list_del_init(&session->list);
		spin_unlock_bh(&tunnel->list_lock);

 					l2tp_session_delete(session);

		l2tp_session_delete(session);
		spin_lock_bh(&tunnel->list_lock);
	}
	spin_unlock_bh(&tunnel->list_lock);

Calling l2tp_session_delete on the same session twice isn't a problem
per-se, but if cpu2 manages to destruct the socket and unref the
session to zero before cpu1 progresses then it would lead to UAF.

Reported-by:  <syzbot+b471b7c936301a59745b@syzkaller.appspotmail.com>
Reported-by:  <syzbot+c041b4ce3a6dfd1e63e2@syzkaller.appspotmail.com>
Fixes: d18d3f0a

 ("l2tp: replace hlist with simple list for per-tunnel session list")
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Link: https://patch.msgid.link/20240704152508.1923908-1-jchapman@katalix.com


Signed-off-by: Paolo Abeni <pabeni@redhat.com>

f8ad00f3

Jul 08, 2024

Merge branch 'net-stmmac-qcom-ethqos-enable-2-5g-ethernet-on-sa8775p-ride' · 06cd3d4b

Jakub Kicinski authored 8 months ago

Bartosz Golaszewski says:

====================
net: stmmac: qcom-ethqos: enable 2.5G ethernet on sa8775p-ride

Here are the changes required to enable 2.5G ethernet on sa8775p-ride.
As advised by Andrew Lunn and Russell King, I am reusing the existing
stmmac infrastructure to enable the SGMII loopback and so I dropped the
patches adding new callbacks to the driver core. I also added more
details to the commit message and made sure the workaround is only
enabled on Rev 3 of the board (with AQR115C PHY). Also: dropped any
mentions of the OCSGMII mode.

v2: https://lore.kernel.org/20240627113948.25358-1-brgl@bgdev.pl
v1: https://lore.kernel.org/20240619184550.34524-1-brgl@bgdev.pl
====================

Link: https://patch.msgid.link/20240703181500.28491-1-brgl@bgdev.pl


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

06cd3d4b

net: stmmac: qcom-ethqos: enable SGMII loopback during DMA reset on sa8775p-ride-r3 · 3c466d65

Bartosz Golaszewski authored 8 months ago


On sa8775p-ride-r3 the RX clocks from the AQR115C PHY are not available at
the time of the DMA reset. We can however extract the RX clock from the
internal SERDES block. Once the link is up, we can revert to the
previous state.

The AQR115C PHY doesn't support in-band signalling so we can count on
getting the link up notification and safely reuse existing callbacks
which are already used by another HW quirk workaround which enables the
functional clock to avoid a DMA reset due to timeout.

Only enable loopback on revision 3 of the board - check the phy_mode to
make sure.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240703181500.28491-3-brgl@bgdev.pl


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

3c466d65

Admin message