ptrace notes

From Noah.org

See Also

LD_PRELOAD_notes is the classic way to manipulate the reality of a program. It is less powerful than ptrace, but it is faster. The ptrace API gives you much finer control than what you can get with LD_PRELOAD. Also, LD_PRELOAD only lets you hook calls into shared object libraries. If an application is totally statically linked, then LD_PRELOAD never gives you any visibility into it. ptrace, by contrast, gives single-step process control at the machine instruction level.

Slowing down a subprocess

This adds a usleep after each system call made by a subprocess. Build the small C program given below and test it with the following examples. Note that my environment has export TIMEFORMAT=$'\npcpu\t%P\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS'. This environment variable causes the Bash built-in version of time to also display the percentage of CPU used by a process. If you are using a different shell then you may not see the pcpu row in the output.

Running a normal baseline `time find /tmp` gave this:

pcpu    4.93
real    0m0.081s
user    0m0.004s
sys     0m0.000s

Running the same command under slow, `time ./slow find /tmp`, gave this:

pcpu    0.48
real    0m31.929s
user    0m0.040s
sys     0m0.116s
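
As a rough sanity check on these numbers, the extra wall-clock time can be divided by the delay to estimate how many times the child was stopped. The sketch below is not part of slow.c. `estimate_stops` is a hypothetical helper, and it assumes that every stop costs exactly the requested delay and nothing else. That assumption overstates the count, because usleep may sleep longer than asked and ptrace adds its own overhead.

```c
#include <assert.h>

/* Hypothetical helper (not part of slow.c): estimate how many times the
 * traced child was stopped, assuming each stop costs exactly delay_s seconds.
 * All arguments are wall-clock times in seconds. */
double estimate_stops(double slowed_real_s, double baseline_real_s,
                      double delay_s) {
    assert(delay_s > 0.0); /* a zero delay would divide by zero */
    return (slowed_real_s - baseline_real_s) / delay_s;
}
```

With the timings above, `estimate_stops(31.929, 0.081, 0.010)` comes to roughly 3185 stops. Because PTRACE_SYSCALL stops the child at both syscall entry and syscall exit, that suggests on the order of 1600 system calls for this particular `find /tmp` run.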

source code to 'slow'

/*
slow.c

This slows down a given program. It forks and executes the given program in
a child process with a delay added after each syscall the child process
makes.  This is done using the ptrace syscall to control the child process.

Because the delay is added only after a syscall (PTRACE_SYSCALL) it is
impossible to produce a uniform delay in the child process because each
syscall introduces its own delay. This may produce a paradoxical behavior
where a fast section of the child process that calls lots of fast syscalls
will experience a greater delay effect than a slow section that calls fewer
slow syscalls.  And if a section of the child process calls no syscalls then
it will experience no delay. This makes the effect unpredictable. But this
may be an advantage in many cases because blocking or slow syscalls don't
need to be made slower.  It is possible to use ptrace to take control of the
child process after every single instruction in the child process
(PTRACE_SINGLESTEP), but this can make the program almost unusable even if
the delay after each step is set to zero. The overhead of the hook that
controls the child process is so huge compared to the time it takes to execute
a single instruction in the child process that there is rarely any practical
use for this.

To build:
    gcc slow.c -o slow

Examples running the 'find' command on /tmp:
Without an integer as the first argument the delay defaults to
10000 microseconds (10 milliseconds, or 1/100th of a second):
    ./slow find /tmp
When run with an integer as the first argument the delay is set to the
given number of microseconds. This means you cannot run a program with a
number for a name. This example sets the delay to 1/10th of a second:
    ./slow 100000 find /tmp

LICENSE
    This license is approved by the OSI and FSF as GPL-compatible.
        http://opensource.org/licenses/isc-license.txt

    Copyright (c) 2020, Noah Spurrier <noah@noah.org>
    PERMISSION TO USE, COPY, MODIFY, AND/OR DISTRIBUTE THIS SOFTWARE FOR ANY
    PURPOSE WITH OR WITHOUT FEE IS HEREBY GRANTED, PROVIDED THAT THE ABOVE
    COPYRIGHT NOTICE AND THIS PERMISSION NOTICE APPEAR IN ALL COPIES.
    THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
    WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
    MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
    ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
    WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
    ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
    OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/

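#include <signal.h>     // raise, SIGSTOP
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>     // fork, execvp, usleep
#include <sys/types.h>  // pid_t
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <sys/user.h>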

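int main(int argc, char ** argv) {
    int delay_us; // Time in microseconds to pause after each step.
    int child_argv_start;
    int status;
    pid_t child_pid;

    // Use the first argument as the delay if it parses as an integer.
    if (argc > 1 && sscanf(argv[1], "%i", &delay_us) == 1) {
        child_argv_start = 2;
    }
    else {
        delay_us = 10000; // default to 10000 microseconds
        child_argv_start = 1;
    }
    // Without a command to run there is nothing to do.
    if (child_argv_start >= argc) {
        fprintf(stderr, "usage: %s [delay_us] command [args...]\n", argv[0]);
        return 1;
    }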

    // This is crudely done, but I'm just trying to slow down the child, not
    // build a debugger. For homework, use ptrace to attach to an existing pid
    // and control it. You can use ptrace to become the parent of another process.

    child_pid = fork();
    if (child_pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        // Stop. When the parent is ready it will call ptrace to restart the child.
        raise(SIGSTOP);
        execvp(argv[child_argv_start], &argv[child_argv_start]);
        // execvp returns only if it failed to start the program.
        perror("execvp");
        _exit(127);
    }
    else {
        // This is a sync step to prevent a race condition.
        // This waits until the child changes state.
        // The child does this by stopping itself.
        if (waitpid(child_pid, &status, 0) < 0) {
            exit(1);
        }
        // Give up if the child exited before we even got started.
        if (WIFEXITED(status) || WIFSIGNALED(status)) {
            exit(1);
        }
        // This sets a ptrace option so that if the parent exits a SIGKILL
        // is sent to the child process to ensure the child does not escape.
        ptrace(PTRACE_SETOPTIONS, child_pid, NULL, PTRACE_O_EXITKILL);
        while (1) {
            // This restarts the child and lets it run until its next system
            // call stop. PTRACE_SYSCALL stops the child at both syscall entry
            // and syscall exit, so the delay is applied twice per syscall.
            // For fun, use PTRACE_SINGLESTEP to stop after every instruction.
            if (ptrace(PTRACE_SYSCALL, child_pid, NULL, NULL) == -1) {
                exit(1);
            }
            if (waitpid(child_pid, &status, 0) < 0) {
                exit(1);
            }
            // Once the child is gone, stop tracing and pass on its exit
            // status. Death by signal is reported shell-style as 128 + signal.
            if (WIFEXITED(status)) {
                return WEXITSTATUS(status);
            }
            if (WIFSIGNALED(status)) {
                return 128 + WTERMSIG(status);
            }
            // This is what it's all about.
            // To build a debugger, you would use ptrace calls to inspect
            // the state of the child. But we don't do that here. We just sleep.
            usleep(delay_us);
//            // This is a little hack that slows down only on write system calls.
//            // This can be used to slow output. This includes output to stdout.
//            // With a little work you could check the arguments for fd=1 (stdout).
//            // Note that system call numbers vary between architectures.
//            // This is a very crude hack.
//            struct user_regs_struct regs;
//            ptrace(PTRACE_GETREGS, child_pid, NULL, &regs);
//            // on x86_64 orig_rax==1 is the write system call.
//            if ((long)regs.orig_rax == 1) {
//                usleep(delay_us);
//            }
        }
    }
}
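
The commented-out hack in slow.c suggests that, with a little work, you could check the syscall arguments for fd=1 (stdout). A minimal sketch of that check follows. It is an assumption-laden illustration, not code from slow.c: `is_stdout_write` is a hypothetical name, and the register layout is specific to x86_64 Linux, where the syscall number is in orig_rax and the first argument is in rdi. On other architectures the sketch simply reports no match.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/user.h>

/* Hypothetical helper (not part of slow.c): returns 1 if the registers
 * captured at a syscall stop describe a write(2) to file descriptor 1.
 * Only x86_64 Linux is handled; other architectures always return 0. */
int is_stdout_write(const struct user_regs_struct *regs) {
    assert(regs != NULL);
#if defined(__x86_64__)
    /* orig_rax holds the syscall number, rdi holds the first argument. */
    return regs->orig_rax == SYS_write && regs->rdi == 1;
#else
    (void)regs;
    return 0;
#endif
}
```

In the tracing loop this would be called on the result of `ptrace(PTRACE_GETREGS, child_pid, NULL, &regs)`, sleeping only when it returns 1. Since orig_rax and rdi normally keep their values at the syscall exit stop as well, the check would be expected to match at both entry and exit of the same write.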

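The source comments leave attaching to an existing pid as homework. One possible starting point is sketched here, under the assumption that the caller is allowed to trace the target. `attach_and_wait` is a hypothetical helper name, not something from slow.c. It uses PTRACE_ATTACH, which stops the target, and then waits for that stop to be reported.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper (not part of slow.c): attach to an already running
 * process and wait until it has stopped. Returns 0 on success, or -1 if the
 * attach or the wait failed (errno is set by the failing call). */
int attach_and_wait(pid_t pid) {
    int status;

    /* PTRACE_ATTACH makes the caller the tracer and sends the target a stop. */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        return -1;
    }
    /* The tracer can wait on the tracee even though it is not a real child. */
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSTOPPED(status) ? 0 : -1;
}
```

After a successful attach, the same PTRACE_SYSCALL, waitpid, usleep loop used in slow.c should apply, and PTRACE_DETACH releases the process again. On many Linux systems, attaching to a process that is not your own descendant needs extra privileges, depending for example on the kernel.yama.ptrace_scope setting.
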
Previous version

/**
 * This adds a trace around the given program and adds a usleep after each syscall.
 * Could also set to sleep after every instruction in single-step mode, but that
 * would make the program very slow. In fact, you usually wouldn't need to even
 * add sleep. Just having this hook do nothing would be enough to make the child
 * program painfully slow. In single-step mode you should note that system calls
 * themselves are not traced, so you will still see line output flash one line at
 * a time with a pause in between.
 *
 * To build:
 *     gcc slow.c -o slow
 *
 * Example run:
 *     ./slow find /tmp
 *
 * Copyright (c) 2010, Noah Spurrier <noah@noah.org>
 */

#include <stdio.h>
#include <sys/ptrace.h>
// #include <asm/ptrace-abi.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>
#include <sys/syscall.h>

int main(int argc, char ** argv) {
    int child_argv_start;
    int status;
    pid_t child;

    // printf ("argc: %d, argv length: %d\n", argc, argv_length(argv));

    child = fork();
    if(child == 0) {
        // Find end of parent option list to get start of child args.
        // For the moment we assume parent has no options, so we
        // hard code this to 1.
        child_argv_start = 1;
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execvp(argv[child_argv_start], &argv[child_argv_start]);
    }
    else {
        wait(&status);
        // Check if child called exit before we even got started.
        if(WIFEXITED(status))
            _exit (WEXITSTATUS(status));
        ptrace(PTRACE_SYSCALL, child, NULL, NULL);
        wait(&status);
        while (1) {
            // I should probably check if the child got terminated by a signal.
            if(WIFEXITED(status))
                break;
            usleep (10000); // in microseconds (millionths of a second)
            ptrace(PTRACE_SYSCALL, child, NULL, NULL);
            //ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
            wait(&status);
        }
    }
    _exit (WEXITSTATUS(status));
}

/*
 * This returns the length of a NULL-terminated argv array.
 */
int argv_length(char **argv) {
    int count;
    char **p;

    if (argv == NULL)
        return 0;

    for (count = 0, p = argv; *p; count++, p++)
        ;

    return count;
}
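
The previous version notes in a comment that it should probably check whether the child was terminated by a signal. A sketch of how that check might look is given here. Both function names are hypothetical, and the 128 + signal convention is borrowed from the shell rather than from the original code. The demo function runs a command without tracing it, only to show the status handling.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: translate a wait status into a shell-style exit code.
 * Returns -1 if the status does not describe a terminated child. */
int exit_code_from_status(int status) {
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);      /* normal exit */
    }
    if (WIFSIGNALED(status)) {
        return 128 + WTERMSIG(status);   /* killed by a signal */
    }
    return -1;                           /* stopped or continued, still alive */
}

/* Hypothetical demo: run a command (untraced) and report how it ended.
 * argv must be a NULL-terminated argument vector. */
int run_and_get_exit_code(char *const argv[]) {
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                      /* only reached if execvp failed */
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return exit_code_from_status(status);
}
```

In a tracing loop, a return value of -1 from `exit_code_from_status` is the normal case, since it means the child is merely stopped. The loop would break, and exit with the returned code, only when the value is 0 or greater.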