Torchaudio + tensorflow + CUDA 11.0 = segfault #1595

zpapakipos · 2021-06-21T08:59:16Z

🐛 Bug

Importing torchaudio after tensorflow-gpu while using CUDA 11.0 causes a segfault. This issue was originally reported in the AugLy repo: facebookresearch/AugLy#28.

To Reproduce

Steps to reproduce the behavior:

This only happens with CUDA 11.0, so we (the AugLy maintainers) haven't been able to reproduce the error ourselves.

import tensorflow
import augly.audio as audaugs

Output:

2021-06-18 18:42:04.241048: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Segmentation fault (core dumped)
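Because the crash kills the interpreter, it can help to run the reproduction in a child process and inspect its exit status. The following is a minimal sketch, not something used in this thread; the helper name `crashed_with_signal` is hypothetical. It relies on the POSIX behavior of `subprocess`, where a negative return code means the child was terminated by that signal number.

```python
import signal
import subprocess
import sys


def crashed_with_signal(code):
    """Run `code` in a fresh interpreter.

    Returns the name of the signal that killed the child (e.g. "SIGSEGV"),
    or None if the child exited on its own (successfully or with an error).
    POSIX only: a negative return code encodes the terminating signal.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return None
```

For the case reported here, one would call `crashed_with_signal("import tensorflow; import augly.audio as audaugs")` inside the affected virtual environment.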

Expected behavior

No segfault.

Environment

What commands did you use to install torchaudio (conda/pip/build from source)?
pip install torch==1.8.1 torchaudio==0.8.1 (versions pinned in AugLy's requirements.txt)
PyTorch Version (e.g., 1.0):
OS (e.g., Linux):
How you installed PyTorch (conda, pip, source):
Build command you used (if compiling from source):
Python version:
CUDA/cuDNN version: 11.0
GPU models and configuration:
Any other relevant information:

Additional context

This problem doesn't happen on our (AugLy's maintainers') environments, only on one user's.

mthrok · 2021-06-21T09:18:34Z

Hi @zpapakipos

What PyTorch version is this? torchaudio==0.8 has no CUDA-related code, so the issue is more likely caused by importing torch, for example if two different CUDA versions end up loaded in the same process.

Can you verify that it's not PyTorch?

mcanan · 2021-06-21T12:51:41Z

The error is happening to me as I reported in this issue: facebookresearch/AugLy#28
In my environment it can be reproduced with these commands:

python3 -m venv venv
source venv/bin/activate
pip3 install wheel
pip3 install tensorflow-gpu==2.4.1
pip3 install augly
python3 -c "import tensorflow; import augly.audio as audaugs"

To answer your question, the PyTorch version is the one installed by AugLy: torch==1.8.1

Doing a:

strace python3 -c "import tensorflow; import augly.audio as audaugs"

The last file opened before the segfault is: python3.8/site-packages/torchaudio/_internal/fft.py
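strace shows the last file opened, but not the Python frame that was executing. As a complementary approach (not tried in this thread), the standard-library `faulthandler` module can dump the Python traceback when the process receives SIGSEGV. Here is a minimal sketch; the helper name `run_with_faulthandler` is hypothetical, and the only interpreter feature assumed is the documented `-X faulthandler` option.

```python
import subprocess
import sys


def run_with_faulthandler(code):
    """Run `code` in a child interpreter started with -X faulthandler.

    Returns (returncode, stderr). If the child segfaults, stderr should
    contain a "Fatal Python error" header followed by the Python stack
    of the crashing thread.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stderr
```

Calling `run_with_faulthandler("import tensorflow; import augly.audio as audaugs")` in the failing environment and printing the second element would show which Python-level import was in progress at the time of the crash.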

More information about my environment:

OS: Linux Ubuntu 20.04
Python version: 3.8.5
tensorflow version: 2.4.1
CUDA version: 11.0
Nvidia Driver Version: 450.119.03.
GPU: Quadro P5200

Please let me know if you need additional information.
Thank you.

mthrok · 2021-06-21T13:59:49Z

@mcanan Thanks. Can you replace import augly.audio as audaugs with import torch and see what happens?

mcanan · 2021-06-21T14:23:19Z

It doesn't fail:

python3 -c "import tensorflow; import torch"
2021-06-21 10:18:58.630042: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

In the other case the output is this:

python3 -c "import tensorflow; import augly.audio as audaugs"
2021-06-21 10:18:45.234129: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Segmentation fault (core dumped)
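The comparison above (import torch survives, import augly.audio crashes) is a manual bisection of the import chain. A sketch of how that could be automated follows, with each candidate tried in its own fresh interpreter so one crash does not affect the next attempt. The function name `first_crashing_statement` is hypothetical, and including `import torchaudio` as a candidate is a suggestion that was not tested in this thread.

```python
import subprocess
import sys


def first_crashing_statement(setup, statements):
    """Return the first statement that kills a fresh interpreter with a
    signal when executed after `setup`, or None if none of them do.

    POSIX only: a negative return code means death by signal.
    """
    for statement in statements:
        proc = subprocess.run(
            [sys.executable, "-c", f"{setup}\n{statement}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if proc.returncode < 0:
            return statement
    return None


# Hypothetical usage for this issue (run inside the affected venv):
# first_crashing_statement(
#     "import tensorflow",
#     ["import torch", "import torchaudio", "import augly.audio"],
# )
```

Ordering the candidates from the lowest-level package to the highest narrows the crash to the first layer that triggers it.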

mthrok · 2021-06-21T15:01:32Z

Oh that's interesting. Can you run python -m 'torch.utils.collect_env' and report the output?

mcanan · 2021-06-21T15:07:15Z

Collecting environment information...
PyTorch version: 1.8.1+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.1 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.16.3

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Quadro P5200
Nvidia driver version: 450.119.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] torch==1.8.1
[pip3] torchaudio==0.8.1
[conda] Could not collect
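Note that the report above shows torch as 1.8.1+cu102 at runtime while pip lists it as torch==1.8.1, so the build variant is not always visible in the package metadata. When importing the packages is itself risky, the installed versions can still be read without importing them. This is a minimal sketch using only the standard-library `importlib.metadata` (Python 3.8+); the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version string,
    or to None if the distribution is not installed.

    Reads package metadata only; none of the packages are imported.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Hypothetical usage for this issue:
# print(installed_versions(["tensorflow-gpu", "torch", "torchaudio", "augly"]))
```

A local label such as +cpu or +cu102, when present in the metadata, indicates which build is installed; when it is absent, `python -m torch.utils.collect_env` remains the more reliable source.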

mthrok · 2021-06-21T15:22:21Z

Do you use CUDA from both TF and PyTorch?

The original issue description says "Only happens on CUDA 11.0" (I am interpreting this as "the issue does not happen on the CPU version of PyTorch").
If you do not use PyTorch for DL-related work, then as a workaround it might help to replace the CUDA-enabled PyTorch with a CPU-only build.

mcanan · 2021-06-21T15:35:46Z

We don't use PyTorch. We use only TF for DL.
We wanted to test the new AugLy library to test the audio augmentations and we installed it. PyTorch is installed during the AugLy installation.
How can we replace the CUDA enabled PyTorch for the other one?
Shouldn't it be done during AugLy installation?

mthrok · 2021-06-21T15:42:50Z

First uninstall PyTorch and torchaudio with pip uninstall torch torchaudio, then install the right versions of torch and torchaudio.

To install torch, something like pip3 install torch==1.8.1+cpu torchaudio==0.8.1 should work, but the best way to find the correct command is to go to https://pytorch.org/get-started/locally/#start-locally and choose the right configuration for you.

mcanan · 2021-06-21T18:07:16Z

I installed the cpu version and I still have the same segfault.
I can reproduce it from scratch following these steps:

python3 -m venv venv
source venv/bin/activate
pip3 install tensorflow-gpu==2.4.1
pip3 install augly
pip3 uninstall torch torchaudio
pip3 install torch==1.8.1+cpu torchaudio==0.8.1 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
python3 -c "import tensorflow; import augly.audio as audaugs"

The output is:

python3 -c "import tensorflow; import augly.audio as audaugs"
2021-06-21 14:05:53.719005: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Segmentation fault (core dumped)

The output of python -m 'torch.utils.collect_env' is:

Collecting environment information...
PyTorch version: 1.8.1+cpu
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.1 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.16.3

Python version: 3.8 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Quadro P5200
Nvidia driver version: 450.119.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] torch==1.8.1+cpu
[pip3] torchaudio==0.8.1
[conda] Could not collect

Thank you

mthrok · 2021-06-21T19:25:03Z

Thanks for the report. One more thing, can you try torch==1.9.0 and torchaudio==0.9.0?
It's unlikely that we can change a past release (1.8/0.8), but if we can fix it on master, then we can do a minor release, 0.9.1.

mcanan · 2021-06-21T19:48:13Z

It doesn't fail now. I followed these steps from scratch:

python3 -m venv venv
source venv/bin/activate
pip3 install tensorflow-gpu==2.4.1
pip3 install augly
pip3 uninstall torch torchaudio
pip3 install torch==1.9.0+cpu torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
python3 -c "import tensorflow; import augly.audio as audaugs"

The only error messages I had were these during torch and torchaudio installation:

ERROR: augly 0.1.1 has requirement torch==1.8.1, but you'll have torch 1.9.0+cpu which is incompatible.
ERROR: augly 0.1.1 has requirement torchaudio==0.8.1, but you'll have torchaudio 0.9.0 which is incompatible.

mthrok · 2021-06-21T19:58:36Z

@mcanan Thanks. Glad that it works.

@zpapakipos @jbitton Not sure if the version requirements should be bumped up in AugLy's requirements.txt, but at least this seems to work.

jbitton · 2021-06-21T21:38:29Z

@mthrok thank you for helping debug this issue! I'll check the unit tests for AugLy audio with torch 1.9.0 and torchaudio 0.9.0 and see if we're still getting the same expected results :)

Summary: As called out in facebookresearch#28, there are some conflicting dependencies between `torchaudio`/`torch` 0.8.1/1.8.1 and `tensorflow-gpu`. However, as discovered in pytorch/audio#1595, upgrading to v0.9 etc. actually resolves this issue. Thus, I updated the torchaudio/torch versions in our `requirements.txt`, and I also updated our `numpy` requirement so there are no conflicting dependencies between `tf-gpu` and `augly` :) I verified on my side that all unit tests still pass and that `setup.py` finishes as expected with no errors. I also updated `setup.py` to add our README to our PyPI page. Differential Revision: D29292956 fbshipit-source-id: e07f8b3d6d2d8bc9b21af166307f2ae00dbca663

mthrok · 2021-06-22T13:42:57Z

Closing this issue as it does not happen in recent release (0.9) and master branch.
The reason why it does not work with 0.8 is still unknown, but we do not update past releases either, so we recommend that users move to 0.9.

zpapakipos mentioned this issue Jun 21, 2021

Problem using audio augmentations with tensorflow facebookresearch/AugLy#28

Closed

jbitton mentioned this issue Jun 22, 2021

Update torchaudio to 0.9 for tensorflow-gpu compatibility facebookresearch/AugLy#43

Merged

mthrok closed this as completed Jun 22, 2021
