-
Notifications
You must be signed in to change notification settings - Fork 2.2k
pkg/proc: process spawned event #4171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
c877a70 to
9d5daaf
Compare
|
This is groundwork for implementing the StartDebugging DAP reverse request, to allow multiprocess DAP debugging, once #4078 is merged. But also, I want to be able to intercept the Process DAP event in an extension, detect that the process is NodeJS, and create a NodeJS debugger; one of my projects involves a Go program that spawns a NodeJS child. But I'm running into an issue. A node process obviously doesn't have |
|
What's a nodejs debugger? How does it work? |
|
By "nodejs debugger" I mean "a debugger capable of debugging a JavaScript program executed with NodeJS". As in, what delve does for Go programs, but for a NodeJS program. This is the process flow I'm envisioning:
It is possible to do this manually - to wait for the NodeJS process to be spawned and manually launch a NodeJS debug session - but it is clumsy and error prone and the process I describe above should be a much better UX. To be crystal clear, delve plays no role in setting up the other debugger besides sending the process spawned event. All I want is for delve to send a notification without keeping the child stopped. |
Yes, that I understood. I was more wondering how those work. If they work like delve, by using ptrace (and ptrace-like things), or if it's more like JVM debuggers that talk to the VM. If it is the former there is no way to hand off a debugged process to another debugger, I investigated this possibility when I was implementing the multiprocess debugging functions of delve (being able to have a delve process per debugged process would have made things much simpler). Also you can't have two debuggers attached to the same process and there is also no way to detach from a process while leaving it stopped. |
aarzilli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of test failures, many due to the addTarget bug, others, I think, due to the fact that the DAP tests are now receiving two process events for the first process. If you search for newEvent("process") in service/dap/server.go you'll see that we were already sending one.
Oops, I should have started with that. I see that's in
The NodeJS debugger specifically is like the JVM debugger. I have to launch node with
Is it possible to detach and leave it running? That's the behavior I'm looking for. I launch my Go program with the debugger; it spawns |
|
Ok, I think you can probably do that. |
|
@aarzilli The real issue with non-Go processes is that I've resolved the issue with events by only sending a process spawned event for child processes. I am as certain as I can be that this will allow me to do what I want with heterogenous multiprocess debugging, so I'm marking it ready, though I can't properly test it until DAP-mode supports follow-exec (aka #4078). |
| willFollow = false | ||
| } | ||
|
|
||
| // Notify listeners that a child process was spawned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are doing this regardless just do it before the two previous ifs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to include willFollow in the process spawned event. When I add the DAP start debugging request (which I plan to do in a separate PR), that probably needs to be conditional. If delve did not follow/attach to a child process, sending a start debugging request for that child probably won't work.
aarzilli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, see test failures.
| if errors.As(err, new(*proc.ErrBadBinaryInfo)) { | ||
| // If we are unable to load the binary info, log the error | ||
| // but don't return it. This allows the client to continue | ||
| // debugging the parent process. | ||
| logflags.DebuggerLogger().Warnf("Child target %d %q cannot be debugged: %v", pid, cmdline, err) | ||
| } else { | ||
| return nil, err | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing from #4171 (comment)
Ok, you are right but then the problem is that you are restarting the process.
I still don't see how this is different from the follow-exec-disabled case. The calls to Process.EntryPoint or BinaryInfo.LoadBinaryInfo are literally the only difference between the follow-exec-disabled and unable-to-load-binary-info paths (excluding read-only operations) so unless those function calls somehow cause the process to be restarted I don't see how that's possible. Unless the follow-exec-disabled case is also restarting the process, but if that's the case then it's not caused by my changes.
I was not able to reproduce the TestBadAttachRequest failure locally. Based on the stack traces of the exceptions, I am guessing the failure comes down to TargetGroup.addTarget being called from other places, so I moved the "log and ignore" code here. Which means it's platform specific. I found what I believe to be the equivalent code for Windows so I added a TODO, but I was unable to identify an equivalent block of code for darwin or freebsd. And I don't want to make changes that I can't test myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see from TestFollowExec that follow-exec appears not to be implemented on darwin or freebsd, which explains why I couldn't find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see how this is different from the follow-exec-disabled case
It is the same but that path detaches from the child process, which restarts. Which means that by the time you see the process event that child will have made some progress executing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, are you saying "the process resumes execution and thus cannot be debugged from the entry point"? I misunderstood, I thought you were saying "the process is stopped and relaunched" or something along those lines. Though I suppose if the process hasn't executed any code there's not much of a difference.
Which means that by the time you see the process event that child will have made some progress executing.
I think this is just something I/we will have to live with. Having process events is better than not having process events, so I think this is worth implementing as-is. Maybe someone thinks up a better solution down the road. For my purposes, the child resuming is acceptable, because I need to use a non-ptrace debugger anyways, and if I use node's --inspect-brk flag then the part of the child I care about (the javascript, not the runtime) will be paused at entry anyways.
As far as handoff to another debugger, I found PTRACE_SEIZE which might be usable, but apparently it is "fragile and kernel-version dependent". Another thought that occurs to me is having delve act as a gdb rpc server and letting the caller connect that way. But that seems like something that needs a lot more planning before it would be worth considering.
|
Do you have any guesses why |
Flake, probably. Where was it failing? TestGeneratedDoc is failing too. |
|
It was on the aggregator/team city build for the previous commit. |
That's not a TestGeneratedDoc failure. |
|
Is this ready for review now that #4078 is merged? |
|
Yes, this is ready for review. As far as I can tell the CI failure is a flaky test. |
_fixtures/nongochild/child.bat
Outdated
| @@ -0,0 +1,2 @@ | |||
| @echo off | |||
| echo Hello World No newline at end of file | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add newline at end of script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
_fixtures/nongochild/child.sh
Outdated
| @@ -0,0 +1,2 @@ | |||
| #!/bin/sh | |||
| echo "Hello World" No newline at end of file | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add newline at end of script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
pkg/proc/target_group.go
Outdated
| func (grp *TargetGroup) addTarget(p ProcessInternal, pid int, currentThread Thread, path string, stopReason StopReason, cmdline string) (*Target, error) { | ||
| logger := logflags.DebuggerLogger() | ||
| if len(grp.targets) > 0 { | ||
| willFollow := true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not initialize to grp.followExecEnabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply setting willFollow := grp.followExecEnabled didn't read right to me. Also, if there was some way for followExecEnabled to be false but still have followExecRegex nil, it would have printed out two "Detaching from child target" messages. I reworked it; hopefully it's more clear now.
firelizzard18
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Updated)
_fixtures/nongochild/child.bat
Outdated
| @@ -0,0 +1,2 @@ | |||
| @echo off | |||
| echo Hello World No newline at end of file | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
_fixtures/nongochild/child.sh
Outdated
| @@ -0,0 +1,2 @@ | |||
| #!/bin/sh | |||
| echo "Hello World" No newline at end of file | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
pkg/proc/target_group.go
Outdated
| func (grp *TargetGroup) addTarget(p ProcessInternal, pid int, currentThread Thread, path string, stopReason StopReason, cmdline string) (*Target, error) { | ||
| logger := logflags.DebuggerLogger() | ||
| if len(grp.targets) > 0 { | ||
| willFollow := true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply setting willFollow := grp.followExecEnabled didn't read right to me. Also, if there was some way for followExecEnabled to be false but still have followExecRegex nil, it would have printed out two "Detaching from child target" messages. I reworked it; hopefully it's more clear now.
|
Looks like there are still some failures around windows/arm64 and some capslock errors. You can fix the capslock errors by running |
e607841 to
a2ffadb
Compare
firelizzard18
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestLaunchWithFollowExec was failing (also on Linux, not sure why that didn't show up in CI). That test didn't exist when I started this PR; I assume that's why I didn't see the failure earlier. I rebased and force-pushed so I could test it. Fixing it on Linux was easy, I just added client.ExpectProcessEvent(t) to account for the new event. But on Windows something weird is happening (more details in a code comment).
| client.ExpectBreakpointEvent(t) | ||
| if runtime.GOOS != "windows" { | ||
| client.ExpectBreakpointEvent(t) | ||
| } | ||
| client.ExpectProcessEvent(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding ExpectProcessEvent is necessary because the spawned child process still generates a process event. But for some reason on Windows, the breakpoint doesn't happen and the process is terminated instead - I get a TerminatedEvent. Which is strange because I explicitly did not implement follow-exec process events on Windows, and this still happens even when I revert the changes to target_group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is also failing in Windows on master:
$ go test -fullpath -run=^TestLaunchWithFollowExec$ -bench=- -json -count=1 -benchmem
=== RUN TestLaunchWithFollowExec
DAP server listening at: [::]:58675
2025-11-26T14:41:25-08:00 warn layer=rpc Listening for remote connections (connections are not authenticated nor encrypted)
Connecting to server at: [::]:58675
hello world
c:/Users/firelizzard/Documents/delve/service/dap/server_test.go:733: got &dap.TerminatedEvent{Event:dap.Event{ProtocolMessage:dap.ProtocolMessage{Seq:0, Type:"event"}, Event:"terminated"}, Body:dap.TerminatedEventBody{Restart:json.RawMessage(nil)}}, want *dap.BreakpointEvent
--- FAIL: TestLaunchWithFollowExec (1.59s)
FAIL
panic: Log in goroutine after TestLaunchWithFollowExec has completed: server stop triggered internally
goroutine 12 [running]:
testing.(*common).log(0xc000003a40, {0xc00136e1e0?, 0x21?})
C:/Program Files/Go/src/testing/testing.go:1030 +0x1ac
testing.(*common).Log(0xc000003a40, {0xc000083fb0?, 0x0?, 0x0?})
C:/Program Files/Go/src/testing/testing.go:1180 +0x45
github.com/go-delve/delve/service/dap.startDAPServer.func1()
c:/Users/firelizzard/Documents/delve/service/dap/server_test.go:127 +0x116
created by github.com/go-delve/delve/service/dap.startDAPServer in goroutine 8
c:/Users/firelizzard/Documents/delve/service/dap/server_test.go:119 +0x285
exit status 2
FAIL github.com/go-delve/delve/service/dap 2.887s
|
|
|
Hey, looks like there are still failures on the Windows builders: |
|
TestLaunchWithFollowExec is failing for me on the |
Adds a ProcessSpawned event to lay the groundwork for multi-process DAP debugging. Updates #3947.