uhd_find_devices / libusb1_base seg faults #615
Comments
In the original post, I noted the issue on 4.1.0.5. I'm just confirming that this issue is still present in UHD 4.3.0.0. However, the potential fix I posted above doesn't seem to help. I'm still getting occasional seg faults. Using UHD 4.3.0.0 on Ubuntu 20.04, the output of uhd_find_devices when it seg faults looks like:
Thanks,
The fix will be in
FYI, I changed some minor formatting and added some info to the commit message.
Issue Description
Using UHD 4.1.0.5 on either Ubuntu 18.04 or Ubuntu 20.04 machines, we occasionally see seg faults when executing uhd_find_devices. We traced this back to global session management in 'host/lib/transport/libusb1_base.cpp', specifically 'libusb::session::sptr libusb::session::get_global_session(void)'. The function appears to first check whether a global_session exists and then, as a separate step, return a pointer to that session. On occasion, the session seems to expire between the check and the return, so get_global_session returns an empty shared pointer. This has been tested on many different host machines.
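To illustrate the suspected check-then-return window, here is a minimal sketch that replays the interleaving in a single thread so the outcome is deterministic. `Session` and `check_then_lock_can_return_null` are hypothetical names made up for this example; this is not the UHD code, only the general `std::weak_ptr` behaviour that the description above relies on.

```cpp
#include <memory>

// Hypothetical stand-in for the session type; not the UHD class.
struct Session {};

// Replays the suspected interleaving step by step in one thread.
// Returns true if the existence check passed but lock() still came back empty.
bool check_then_lock_can_return_null()
{
    auto owner = std::make_shared<Session>(); // the last strong reference
    std::weak_ptr<Session> global_session = owner;

    // Step 1: the existence check succeeds because the session is still alive.
    const bool alive_at_check = !global_session.expired();

    // Step 2: before lock() runs, the last owner releases the session.
    // In a real program this would happen on another thread or code path.
    owner.reset();

    // Step 3: lock() now yields an empty shared_ptr despite the earlier check.
    std::shared_ptr<Session> result = global_session.lock();

    return alive_at_check && result == nullptr;
}
```

Calling `check_then_lock_can_return_null()` returns `true`, which matches the "0x0" pointer seen when the seg fault occurs: a caller that dereferences the result without checking it would crash.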
Setup Details
UHD 4.1.0.5 / Ubuntu 18.04 or 20.04.
Run uhd_find_devices.
Expected Behavior
No Seg Fault
Actual Behaviour
Occasional seg faults.
Steps to reproduce the problem
To reproduce the issue, I would run the following:
while true; do date; uhd_find_devices; sleep 6; done
Leaving this running, the problem might occur anywhere from 1 to perhaps 100 times over 24 hours.
Additional Information
When the seg fault occurs, this would be displayed in the terminal:
Checking "dmesg -T" would result in:
We were able to capture some coredumps. A backtrace in gdb showed:
Adding a debug message like the following prints the session pointer. When a seg fault occurs, the pointer would print as "0x0". Normally, when not seg faulting, it would show a larger, "proper" looking pointer value.
To make the problem occur much more frequently, you can add something that takes time after line 102 in libusb1_base.cpp. If I print a log message as follows, the seg faults occur almost every time uhd_find_devices is run:
Potential Fix
I modified lines 102 and 103 and changed them from:
to
After rebuilding with this change, we no longer see any seg faults (multiple hosts have been running the uhd_find_devices loop for several days). I believe the fix works because it creates a shared pointer in the same step that checks for session expiration. The shared pointer holds ownership, so the session cannot expire before "get_global_session" returns (assuming it had not already expired before the call to global_session.lock()). I don't know if this is the most appropriate fix, as I'm not even close to an expert in this kind of thing.
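As a rough sketch of the pattern described here (lock the weak pointer once and test the result, rather than checking and locking separately), something like the following could work. The `Session` type, the free function `get_global_session`, and the mutex are assumptions made for this self-contained example; it is not the actual patch to libusb1_base.cpp.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-in for the session type; not the UHD class.
struct Session {};

// Returns the live global session, creating a new one if none exists.
std::shared_ptr<Session> get_global_session()
{
    static std::weak_ptr<Session> global_session;
    static std::mutex mutex; // assumption: serializes creation of a new session

    std::lock_guard<std::mutex> guard(mutex);

    // lock() tests for expiry and takes ownership in a single atomic step,
    // so the session cannot expire between the check and the return.
    if (auto existing = global_session.lock()) {
        return existing;
    }

    // No live session: create one and remember it through the weak pointer.
    auto created = std::make_shared<Session>();
    global_session = created;
    return created;
}
```

With this shape the function never hands back an empty pointer: two calls made while a session is alive return the same object, and a call made after every owner has released it simply creates a fresh session.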
Thanks,
Jim