Uses MPS (Mac acceleration) by default when available #382
Conversation
@dwarkeshsp have you measured any speedups compared to using the CPU? |
Doesn't this also require switching FP16 off? |
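A minimal sketch of what that looks like in practice (assuming the Python API and a placeholder audio.mp3; this is illustrative, not part of the PR): pass fp16=False to transcribe() so decoding runs in FP32 on MPS.

```python
import torch
import whisper

# Prefer MPS when the installed PyTorch build exposes it, otherwise fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"
model = whisper.load_model("base", device=device)

# fp16 defaults to True; several MPS kernels have had FP16 issues, so force FP32 here.
result = model.transcribe("audio.mp3", fp16=False)
print(result["text"])
```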
I'm getting this error when trying to use MPS: /Users/diego/.pyenv/versions/3.10.6/lib/python3.10/site-packages/whisper-1.0-py3.10.egg/whisper/decoding.py:629: UserWarning: The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/diego/Projects/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.) Any clues? |
@DiegoGiovany Not an expert on this, but it looks like PyTorch itself is still missing some operators for MPS. See for example |
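One commonly suggested workaround for missing MPS operators is PyTorch's CPU-fallback switch; a hedged sketch (the variable must be set before torch is first imported, and audio.mp3 is a placeholder):

```python
import os

# Let PyTorch run unsupported ops (e.g. aten::repeat_interleave) on the CPU
# instead of raising; this must be set before torch is imported.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import whisper

model = whisper.load_model("base", device="mps")
print(model.transcribe("audio.mp3", fp16=False)["text"])
```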
Thanks for your work. I just tried this. Unfortunately, it didn't work for me on my M1 Max with 32GB. There were no errors on install, and it works fine when run without MPS: whisper audiofile_name --model medium. When I run whisper audiofile_name --model medium --device mps, here is the error I get: When I run whisper audiofile_name --model medium --device mps --fp16 False, here is the error I get: Basically the same error as @DiegoGiovany. Any ideas on how to fix it? |
+1 for me! I'm actually using an Intel Mac with Radeon Pro 560X 4 GB... |
Related |
Doesn't work for me: MBP 2015, PyTorch 1.3 stable, eGPU RX 580, macOS 12.3. I changed the code the same way as yours and switched to --device mps, but it shows an error; maybe there is still something else to change or modify. With --device cpu it works, and MPS works with other PyTorch-Metal projects. |
What's the status on this? |
I also see the same errors as others mentioned above, on an M1 Mac running arm64 Python. |
On an M1 16" MBP with 16GB running macOS 13.0.1, using this command: I'm encountering the following errors:
|
Is there any update on this, or did anyone figure out how to get it to work? |
Same problem with macOS 13.2 on a MacBook Pro M2 Max:
|
I'm getting the same error as @renderpci using the M1 Base Model:

loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/9e200cfa-7d96-11ed-886f-a23c4f261b56/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x512x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
[1] 3746 abort python3 test.py

test.py:

import whisper

model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(result["text"]) |
FWIW I switched to the C++ port https://github.com/ggerganov/whisper.cpp/ and got a ~15x speedup compared to CPU pytorch on my M1 Pro. (But note that it doesn't have all the features/flags from the official whisper repo.) |
For us whisper.cpp is not an option:
|
The same error as @renderpci using the M2: whisper interview.mp4 --language en --model large --device mps
|
Hey @devpacdd - this should be fixed in the latest PyTorch nightly (pip3 install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu). Let me know if you still see any issues. Thanks |
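A quick way to confirm that an installed build actually exposes MPS before retrying (just a sanity-check snippet, not from the PR):

```python
import torch

print(torch.__version__)
print("MPS built:", torch.backends.mps.is_built())          # compiled with MPS support
print("MPS available:", torch.backends.mps.is_available())  # macOS version + hardware support present
```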
Still have the same error after updating. Edit: After adding
|
I was able to get it to kinda work: davabase/whisper_real_time#5 (comment) |
@manuthebyte could you please make sure you are on a recent nightly? |
Wow! When running: with the following packages in my pipenv's requirements.txt
it gets every word! While I was singing! In realtime, with maybe ~50% GPU usage on the Apple M2 Pro Max. |
Did some performance testing of MPS vs CPU on an Apple M2 Pro. I tested a 30-second clip for performance and accuracy on every version of the model, CPU vs MPS. (collapsed details tables)
CPU performs better on smaller models, and MPS performs better on larger models. A value of 1 means the transcription time equals the audio duration; a value of 2 means it takes 2 seconds to transcribe 1 second of audio. |
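A sketch of how such a real-time-factor comparison can be scripted (the model list, clip name, and 30-second duration are placeholders, not the commenter's exact harness):

```python
import time
import whisper

AUDIO = "clip_30s.wav"   # placeholder 30-second clip
DURATION = 30.0          # seconds of audio in the clip

for device in ("cpu", "mps"):
    for name in ("tiny", "base", "small", "medium"):
        model = whisper.load_model(name, device=device)
        start = time.time()
        model.transcribe(AUDIO, fp16=False)
        elapsed = time.time() - start
        # 1.0 = real time; 2.0 = two seconds of compute per second of audio
        print(f"{device}/{name}: {elapsed / DURATION:.2f}")
```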
Any progress? Or does whisper have any other means of accelerating inference? |
Great, it worked for me |
I got it working too, but on an Intel machine (5600M, i9-9980HK) and it does not seem to be doing anything. |
@KnechtNoobrecht |
https://developer.apple.com/metal/pytorch/ |
True, can also run on AMD GPUs. |
Hi, PyTorch was broken again! I have the same error message as in #382 (comment)
|
I have
To watch the transcription output live as it's inferred, I added a
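The exact change is cut off above; one built-in way to watch segments as they are decoded is the verbose flag on transcribe(), sketched here with a placeholder file:

```python
import whisper

model = whisper.load_model("base")
# verbose=True prints each segment with timestamps as soon as it is decoded
result = model.transcribe("audio.mp3", verbose=True, fp16=False)
print(result["text"])
```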
I tried to set up the env but still got errors. M1 Pro MPS, macOS 14.1.
|
Any progress?
$ whisper pie-ep91.mp3 --model small --output_format txt --device mps
Traceback (most recent call last):
File "/Users/kingname/.local/share/virtualenvs/smart_podcast-NYiabyPE/bin/whisper", line 8, in <module>
sys.exit(cli())
^^^^^
File "/Users/kingname/.local/share/virtualenvs/smart_podcast-NYiabyPE/lib/python3.11/site-packages/whisper/transcribe.py", line 458, in cli
model = load_model(model_name, device=device, download_root=model_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kingname/.local/share/virtualenvs/smart_podcast-NYiabyPE/lib/python3.11/site-packages/whisper/__init__.py", line 156, in load_model
return model.to(device)
^^^^^^^^^^^^^^^^
File "/Users/kingname/.local/share/virtualenvs/smart_podcast-NYiabyPE/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1152, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/Users/kingname/.local/share/virtualenvs/smart_podcast-NYiabyPE/lib/python3.11/site-packages/torch/nn/modules/module.py", line 849, in _apply
self._buffers[key] = fn(buf)
^^^^^^^
File "/Users/kingname/.local/share/virtualenvs/smart_podcast-NYiabyPE/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1150, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Could not run 'aten::empty.memory_format' with arguments from the 'SparseMPS' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty.memory_format' is only available for these backends: [CPU, MPS, Meta, QuantizedCPU, QuantizedMeta, MkldnnCPU, SparseCPU, SparseMeta, SparseCsrCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
CPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterCPU.cpp:31357 [kernel]
MPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMPS.cpp:27248 [kernel]
Meta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMeta.cpp:26984 [kernel]
QuantizedCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedCPU.cpp:944 [kernel]
QuantizedMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedMeta.cpp:105 [kernel]
MkldnnCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMkldnnCPU.cpp:515 [kernel]
SparseCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCPU.cpp:1387 [kernel]
SparseMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseMeta.cpp:249 [kernel]
SparseCsrCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCsrCPU.cpp:1135 [kernel]
BackendSelect: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:807 [kernel]
Python: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:154 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:498 [backend fallback]
Functionalize: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:324 [backend fallback]
Named: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ConjugateFallback.cpp:21 [kernel]
Negative: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/NegateFallback.cpp:23 [kernel]
ZeroTensor: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:90 [kernel]
ADInplaceOrView: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:86 [backend fallback]
AutogradOther: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradCUDA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradHIP: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradXLA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradIPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradXPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradHPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradVE: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradLazy: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMTIA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse1: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse2: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse3: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
Tracer: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:17346 [kernel]
AutocastCPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:378 [backend fallback]
AutocastCUDA: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:244 [backend fallback]
FuncTorchBatched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:720 [backend fallback]
BatchedNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:746 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:203 [backend fallback]
PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:162 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:494 [backend fallback]
PreDispatch: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:166 [backend fallback]
PythonDispatcher: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:158 [backend fallback] |
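If the failure really comes from the model's sparse alignment_heads buffer (an assumption, not confirmed in this thread), one hedged workaround for the Python API is to densify that buffer before moving the model to MPS:

```python
import whisper

# Load on CPU first; moving to MPS directly trips over the sparse buffer.
model = whisper.load_model("small", device="cpu")

# Replace the sparse alignment_heads mask with a dense copy (assumes it is the
# only sparse tensor in the checkpoint), then move everything to MPS.
model.register_buffer(
    "alignment_heads", model.alignment_heads.to_dense(), persistent=False
)
model = model.to("mps")

print(model.transcribe("audio.mp3", fp16=False)["text"])
```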
I am using Whisper via Hugging Face pipelines, where you can specify MPS as the device. My guess is that not all PyTorch operations are compatible with MPS yet, as can be seen in this issue: pytorch/pytorch#77764. For an 11-second audio clip it takes 0.81 s on CPU and 1.23 s on GPU. This is how I compare both approaches:

import gradio as gr
from transformers import pipeline
import numpy as np
import time

transcriber_gpu = pipeline("automatic-speech-recognition", model="openai/whisper-base", device="mps")
transcriber_cpu = pipeline("automatic-speech-recognition", model="openai/whisper-base", device="cpu")

def track_time(func, *args, **kwargs):
    start = time.time()
    output = func(*args, **kwargs)
    end = time.time()
    return output, end - start

def transcribe(audio):
    sr, y = audio
    y = y.astype(np.float32)
    if y.ndim == 2:  # Check if there are two channels
        y = np.mean(y, axis=1)  # Convert to mono by taking the mean of the two channels
    y /= np.max(np.abs(y))

    out_gpu = track_time(transcriber_gpu, {"sampling_rate": sr, "raw": y})
    out_cpu = track_time(transcriber_cpu, {"sampling_rate": sr, "raw": y})
    print(out_gpu)
    print(out_cpu)

    text_gpu = out_gpu[0]["text"]
    text_cpu = out_cpu[0]["text"]
    time_gpu = out_gpu[1]
    time_cpu = out_cpu[1]

    combined_output = f"""
    OUTPUT_GPU t={time_gpu}
    {text_gpu}

    OUTPUT_CPU t={time_cpu}
    {text_cpu}
    """
    return combined_output

demo = gr.Interface(
    transcribe,
    gr.Audio(),
    "text",
)

demo.launch() |
Any progress?

int8")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 145, in __init__
self.model = ctranslate2.models.Whisper(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: unsupported device mps |
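ctranslate2 (the engine behind faster-whisper) has no MPS backend, so "mps" is rejected outright; a hedged fallback sketch for Apple Silicon is a quantized CPU model (model size and audio file are placeholders):

```python
from faster_whisper import WhisperModel

# faster-whisper only accepts devices ctranslate2 knows about ("cpu", "cuda"),
# so on a Mac the usual choice is int8 CPU inference.
model = WhisperModel("small", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3")
for segment in segments:
    print(segment.text)
```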
Any news? The error is still present. |
Hi, just for your information, you can run whisper in an almost identical way by replacing it with transformers. |
@sagatake, would you mind pasting a small example? I'd like to verify mps is working. |
Here is the minimum example.

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
from datasets import load_dataset

def main():
    test_audio_path = r"test.wav"

    # device = "cuda:0" if torch.cuda.is_available() else "cpu"
    device = "mps" if torch.backends.mps.is_available() else "cpu"
    torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

    model_id = "openai/whisper-large-v3"

    model = AutoModelForSpeechSeq2Seq.from_pretrained(
        model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
    )
    model.to(device)

    processor = AutoProcessor.from_pretrained(model_id)

    pipe = pipeline(
        "automatic-speech-recognition",
        model=model,
        tokenizer=processor.tokenizer,
        feature_extractor=processor.feature_extractor,
        torch_dtype=torch_dtype,
        device=device,
    )

    result = pipe(test_audio_path)
    print(result["text"])

if __name__ == '__main__':
    main() |
It would be really cool if something like this would work:
|
Apple M3 Max, successfully worked. https://rewa-insights.com/t/translating-multilingual-audio-into-simplified-chinese-and-saving-to-a-text-file-with-python/432?u=rewa-evija |
Currently, Whisper defaults to using the CPU on macOS devices, even though PyTorch has introduced the Metal Performance Shaders (MPS) backend for Apple devices in its nightly releases (more info).
With my changes to __init__.py, torch checks if MPS is available when torch.device has not been specified. If it is, and CUDA is not available, then Whisper defaults to MPS.
This way, Mac users can experience speedups from their GPU by default.
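A rough sketch of that selection logic (illustrative only, not the literal diff to __init__.py; the function name is made up):

```python
import torch

def pick_device(device=None):
    # Respect an explicitly requested device.
    if device is not None:
        return device
    # Otherwise prefer CUDA, then MPS, then CPU.
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"
```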