Skip to content

Conversation

@brenns10
Copy link
Contributor

@brenns10 brenns10 commented Oct 22, 2025

This implements almost every option for crash's rd command. With the identify_address_all() API it's now trivial to make the symbolized output identical to crash's. There are a lot of formatting options, which I tried to cover with a generic unit test suite that is independent from the kernel tests. And there are also linux kernel crash command tests.

The _print_memory() API will get used by the bt command to print the formatted memory between stack frames. I didn't think it merited being an exported helper API because it's so crash-specific. If necessary that can always be changed.

The only options missing are:

  • -f to print dumpfile offsets. This is just too low-level for drgn, and I don't see an easy way to include it. I also have never used it myself.
  • -m to print Xen host machine memory. In 2025 I'm not sure how relevant this is, and I don't think we actually support the Xen dumps necessary for this to even make sense.

Userspace memory is supported too, using the trick we discussed on matrix.

@brenns10
Copy link
Contributor Author

The error on Python 3.8 is rather interesting. The test case intermixes positional arguments and optional arguments.

%crash> rd ffff981c2ac99bc0 -32 5
usage: rd [-e END] [-o OFFSET] [-p] [-d | -D] [-s] [-S] [-x] [-N] [-R] [-r RAW]
          [-8 | -16 | -32 | -64 | -a] [--drgn]
          address|symbol [count]
drgn: crash: rd: error: unrecognized arguments: 5

I guess I assumed that would work, but argparse doesn't allow it unless you explicitly use parse_intermixed_args(). (Though strangely it starts working in Python 3.12. I don't see anything specific in the changelog about it.)

Normally I don't use optional positionals (nargs='?') which I guess is why this is news to me. The crash rd command allows this invocation (rd ADDRESS -32 5) and it mentally makes sense to me, as far as the order of my thoughts when I run the command (read from ADDRESS in 32-bit units, and read 5 of them).

We could just enable parse_intermixed_args() for all commands, but I'm not 100% confident about it, because the docs indicate that some features won't work with it:

These parsers do not support all the argparse features, and will raise exceptions if unsupported features are used. In particular, subparsers, and mutually exclusive groups that include both optionals and positionals are not supported.

We could also allow commands to opt-in to intermixed args via a command line argument, or we could just accept that this is the way it is (and change the test). I honestly don't care too much about intermixed args. Maybe we can chat about it in today's meeting.

@brenns10
Copy link
Contributor Author

For now, I've modified the test to not intermix positional & optional arguments. I also updated the text alignment to match crash's more accurately.

@brenns10 brenns10 force-pushed the crash_rd branch 4 times, most recently from 9bc1710 to 381bfd7 Compare October 24, 2025 20:47
@brenns10
Copy link
Contributor Author

Ok, this PR should be ready. It has every rd option except -m and -f. It supports userspace memory in the way we discussed on Matrix. And while they're not completed, the all-architecture tests are looking pretty good.

I'll probably look into adding the -fF[F] options for bt based on this next. So I think rd would make more sense to look at before bt, if you had to choose. It's also a more straightforward implementation, I think.

Copy link
Owner

@osandov osandov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you. Just a few nitpicks.

The rd command allows reading and formatting kernel memory in a variety
of ways. In addition to being its own command, it also provides
functionality that other commands (like bt) will use, so it's important
to provide an API for its functionality as well. But since it is just
implementing crash-compatible behavior, it's not exposed as a helper.

Thanks to the recent refactor to identify_address_all(), it is possible
to implement the specific behavior that Crash's rd command implements
for -sS[S]. Unfortunately, the addresses crash identifies and annotates
are a subset of what drgn is capable of annotating. It might be worth
including the entire set of drgn annotations with a dedicated option,
such as when specifying a third "-S" option. That is not implemneted
here.

Userspace memory reading is also supported, using the context task to
determine the address space. Since it's definitely non-trivial to
determine whether an address is user or kernel, we employ a trick: first
we try to read memory using prog.read, and if that fails, we try to read
via access_process_vm(). Based on that, we can "magically" generate the
correct drgn code. Usually the user knows whether they're trying to read
user/kernel memory, so it makes sense to pretend we magically knew it as
well.

The only crash options which are not supported are -m (for Xen host
machine addresses) and -f (for dumpfile offsets). Neither of these are
likely to be supported by drgn.

Signed-off-by: Stephen Brennan <[email protected]>
@brenns10
Copy link
Contributor Author

Thanks for the review. I think everything is addressed here. I noticed a funny error on my local tests, but it's reproducing on main so I think it's unrelated. I'm trying to avoid rebasing to make it easy to re-review, so crossing my fingers that it doesn't happen here. I'll file something separately for the test failure I got, if it's a real isuse.

Copy link
Owner

@osandov osandov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!

The new hrtimer command test has been flaky, but I haven't gotten to the bottom of that yet. Is that the one you've seen?

@osandov osandov merged commit d6d9feb into osandov:main Nov 12, 2025
35 checks passed
@brenns10 brenns10 deleted the crash_rd branch November 12, 2025 21:31
@brenns10
Copy link
Contributor Author

No, this is in the kmodify helpers:

======================================================================
ERROR: test_string_args (tests.linux_kernel.helpers.test_kmodify.TestCallFunction.test_string_args) [array pointer]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/stepbren/repos/drgn/tests/linux_kernel/helpers/test_kmodify.py", line 183, in test_string_args
    self.assert_called(
  File "/home/stepbren/repos/drgn/tests/linux_kernel/helpers/test_kmodify.py", line 34, in assert_called
    return_value = call_function(self.prog["drgn_kmodify_test_" + name], *args)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stepbren/repos/drgn/drgn/helpers/common/prog.py", line 115, in wrapper
    return f(args[0].prog_, *args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stepbren/repos/drgn/drgn/helpers/experimental/kmodify.py", line 1501, in call_function
    implicit_convert(
TypeError: cannot convert 'char *' to incompatible type 'char *'

----------------------------------------------------------------------

It is only reproducing on my Oracle Linux laptop, not my Arch desktop or the Debian rootfs. I'm tracking it down a bit further but will file an issue if it is actually a bug.

@brenns10
Copy link
Contributor Author

#569 :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants