Q: SCMP_FLTATR_API_TSKIP does not seem to be used by tracer programs #368
Comments
@pcmoore @drakenclimber I'd appreciate it if you could answer at your convenience.
Well, the use case is exactly as you described in your posting above; it is intended to support process tracers :) It has been several years since we made this change, so this reasoning may be wrong, but my recollection is that without a "syscall == -1" allow filter rule, the seccomp filter would reject the syscall skip before the kernel got to the skip line you mentioned. The "syscall == -1" rule in the BPF filter isn't to force the syscall to be skipped; it is to allow the kernel processing to get to the point where the syscall can be skipped. Of course, if you have a reproducer which shows that this doesn't work this way anymore, I think we would like to see it :)
@pcmoore Thank you for your comment.
I attached a reproducer which shows that a tracer program can skip a system call without a "syscall == -1" rule.
ptrace_test.c
// Copyright 2022 Sony Group Corporation
//
#include <seccomp.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <sys/user.h>
#include <sys/prctl.h>
#include <syscall.h>

// Print the error for the current errno and exit
int die (const char *msg) {
    perror(msg);
    exit(errno);
}

// Tracee: load a filter that traces getuid and allows everything else
int child() {
    int rc = -1;
    scmp_filter_ctx ctx;

    prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
    ctx = seccomp_init(SCMP_ACT_ALLOW);
    if (ctx == NULL)
        goto out;
    rc = seccomp_rule_add_exact(ctx, SCMP_ACT_TRACE(getpid()), SCMP_SYS(getuid), 0);
    if (rc < 0)
        goto out;
    rc = seccomp_load(ctx);
    if (rc < 0)
        goto out;
    // This should output -ENOSYS (-38) as syscall-enter-stop on x86
    printf("uid: %d\n", getuid());
out:
    seccomp_release(ctx);
    // Negative on failure, matching the "rc < 0" check in main()
    return rc;
}

int main() {
    int pid;
    int rc;
    int status;
    struct user_regs_struct regs;

    pid = fork();
    switch(pid) {
    case -1:
        die("failed to fork");
    case 0:
        // Tracee: ask to be traced, then stop until the tracer is ready
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        kill(getpid(), SIGSTOP);
        rc = child();
        if (rc < 0) {
            die("failed to execute child");
        }
        return 0;
    }

    // Tracer: wait for the initial SIGSTOP, then request seccomp events
    waitpid(pid, &status, __WALL);
    ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACESECCOMP);
    ptrace(PTRACE_CONT, pid, NULL, NULL);
    while(1) {
        waitpid(pid, &status, __WALL);
        if (status >> 8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8))) {
            ptrace(PTRACE_GETREGS, pid, NULL, &regs);
            if (regs.orig_rax == SYS_getuid) {
                printf("caught getuid syscall\n");
                // Change the syscall number to -1 in order to skip the syscall
                regs.orig_rax = -1;
                ptrace(PTRACE_SETREGS, pid, NULL, &regs);
            }
        }
        if (WIFEXITED(status) || WIFSIGNALED(status)) {
            break;
        }
        ptrace(PTRACE_CONT, pid, NULL, NULL);
    }
    return 0;
}
I can observe that the kernel reaches the skip line I mentioned earlier by setting a probe point at https://elixir.bootlin.com/linux/v5.10/source/kernel/seccomp.c#L989 .
$ uname -a
Linux xxxx 5.10.0-1057-oem #61-Ubuntu SMP Thu Jan 13 15:06:11 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ sudo perf probe --source=/usr/src/linux-oem-5.10-5.10.0 --add "__seccomp_filter:68 this_syscall"
Added new event:
probe:__seccomp_filter_L68 (on __seccomp_filter:68 with this_syscall)
You can now use it in all perf tools, such as:
perf record -e probe:__seccomp_filter_L68 -aR sleep 1
$ sudo perf record -e probe:__seccomp_filter_L68 -aR ./ptrace_test
caught getuid syscall
uid: -38
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.898 MB perf.data (1 samples) ]
$ sudo perf script
ptrace_test 12337 [020] 2739.243594: probe:__seccomp_filter_L68: (ffffffffb639720e) this_syscall=-1
The tracer program outputs
If you don't mind, could you look into the reproducer? Thank you.
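The stop check in the reproducer's wait loop can be isolated into a small helper. This is only a sketch: `is_seccomp_stop` is a hypothetical name, not part of ptrace or libseccomp, and the encoding it tests (the event code in the bits above the stop signal) is the one the reproducer itself relies on.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/ptrace.h>

/*
 * Hypothetical helper: returns true when a waitpid() status describes a
 * PTRACE_EVENT_SECCOMP stop, i.e. SIGTRAP as the stop signal with the
 * event code stored in the next byte up.
 */
static bool is_seccomp_stop(int status)
{
    return (status >> 8) == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8));
}
```

With this, the loop condition would read `if (is_seccomp_stop(status))`, which keeps a plain SIGTRAP stop from being mistaken for a seccomp event.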
Thanks for sending the reproducer and the additional information. We'll add this to the list of things to investigate further, but it might take me some time to get back to this. As a reminder,
Interesting. I'm swamped at the moment as well, but I am definitely intrigued.
Thank you for considering reviewing it. It would be helpful.
Yes, so I didn't enable the
Hello, I have a question about the SCMP_FLTATR_API_TSKIP attribute.
SCMP_FLTATR_API_TSKIP has been supported since dc87999 in order to address #80, and the man page explains it as follows:
However, I think tracer programs do not use SCMP_FLTATR_API_TSKIP to skip a syscall, because a tracer skips a syscall by directly changing the syscall number register, as explained in seccomp(2), not by using a seccomp filter.
Excerpt from the SECCOMP_RET_TRACE section in seccomp(2):
Actually, the kernel will skip a syscall if the syscall number is set to -1 by a ptracer at the following point.
https://elixir.bootlin.com/linux/v5.16/source/kernel/seccomp.c#L1229
The ptracer can set the syscall value to -1 without SCMP_FLTATR_API_TSKIP because it just changes the register.
Hence, it does not seem to make sense to create a filter rule using a syscall value of -1. I'm sorry if I'm wrong, but I'm not sure why SCMP_FLTATR_API_TSKIP was added.
Would you mind if I asked about the use case of SCMP_FLTATR_API_TSKIP?
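To make the ordering argument concrete, here is a small user-space model of the SECCOMP_RET_TRACE path as I read it in the kernel source linked above. It is a simplified sketch, not kernel code: `toy_filter`, `model_trace_path`, and the enums are hypothetical names invented for illustration, and the toy filter mirrors a "trace one syscall, allow the rest" policy.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in the kernel or libseccomp. */
enum action  { ACT_ALLOW, ACT_TRACE };
enum verdict { VERDICT_RUN, VERDICT_SKIP };

/* Toy filter: trace exactly one syscall number, allow everything else. */
static enum action toy_filter(int nr, int traced_nr)
{
    return nr == traced_nr ? ACT_TRACE : ACT_ALLOW;
}

/*
 * Model of the trace path. nr_after_trace is the syscall number the tracer
 * leaves in the register during the seccomp stop. filter_runs reports how
 * many times the filter was consulted.
 */
static enum verdict model_trace_path(int nr, int traced_nr,
                                     int nr_after_trace, int *filter_runs)
{
    *filter_runs = 1;
    if (toy_filter(nr, traced_nr) != ACT_TRACE)
        return VERDICT_RUN;

    /* The tracer is notified here and may rewrite the syscall number. */

    /* A negative number is checked first: skip without re-running the filter. */
    if (nr_after_trace < 0)
        return VERDICT_SKIP;

    /* Otherwise the filter is re-run on the (possibly changed) number;
     * in this model a second TRACE result is treated as allowed. */
    *filter_runs = 2;
    (void)toy_filter(nr_after_trace, traced_nr);
    return VERDICT_RUN;
}
```

In this model, a tracer that writes -1 yields `VERDICT_SKIP` with the filter consulted only once, so no rule for syscall -1 is ever evaluated on that path.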