Golang bindings are 45 times slower than the original C++ binary #421
Comments
Sounds unrealistic to me. I only did one run, as the runtimes are so similar that repeating to average the timings seemed unnecessary.
This is what I get if I run the C++ and Go examples against each other:
C++ example
Go bindings. I built the example beforehand so that Go build time is not included.
Here are the results for the tiny model, for completeness:
C++
Go:
It would be great to add system_info to the Go bindings, so that the OP can
confirm that their Go version is actually using the same instruction
set as their C++ version.
I suspect this is where the difference lies.
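Until the Go bindings print system_info themselves, two such lines can be compared mechanically. The Go sketch below parses the `system_info` line format that appears in the C++ log quoted in this thread and lists the flags that differ. `parseSystemInfo` and `diffFlags` are hypothetical helper names, not part of whisper.cpp or its bindings, and the second line in `main` is invented for illustration, since the Go run in this thread did not print one.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseSystemInfo splits a whisper.cpp system_info line into NAME -> value pairs.
// Hypothetical helper: it assumes the "NAME = value | NAME = value |" layout
// seen in the C++ log in this thread.
func parseSystemInfo(line string) map[string]string {
	flags := make(map[string]string)
	line = strings.TrimPrefix(strings.TrimSpace(line), "system_info:")
	for _, field := range strings.Split(line, "|") {
		name, value, ok := strings.Cut(field, "=")
		if !ok {
			continue // skip empty trailing fields
		}
		flags[strings.TrimSpace(name)] = strings.TrimSpace(value)
	}
	return flags
}

// diffFlags returns the sorted names whose values differ between a and b,
// including names present in only one of the two maps.
func diffFlags(a, b map[string]string) []string {
	seen := make(map[string]bool)
	var diff []string
	for _, m := range []map[string]string{a, b} {
		for name := range m {
			if seen[name] {
				continue
			}
			seen[name] = true
			va, okA := a[name]
			vb, okB := b[name]
			if okA != okB || va != vb {
				diff = append(diff, name)
			}
		}
	}
	sort.Strings(diff)
	return diff
}

func main() {
	// Line copied from the C++ run in this thread.
	cppLine := "system_info: n_threads = 4 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |"
	// Hypothetical line for a Go build compiled without AVX2/FMA (not from the thread).
	goLine := "system_info: n_threads = 4 / 16 | AVX = 1 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |"

	fmt.Println("differing flags:", diffFlags(parseSystemInfo(cppLine), parseSystemInfo(goLine)))
}
```

An empty result would suggest both builds report the same instruction-set flags, in which case the slowdown would have to come from somewhere else.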
On Tue, 24 Jan 2023 at 05:50, Lukas Rist wrote:
Sounds unrealistic to me. I only did one run as the runtimes are so
similar that repeating to average the timings seemed unnecessary.
This is what I get if I run the C++ vs Go example against each other:
C++ example
whisper.cpp$ ./main -m ./bindings/go/models/ggml-small.en.bin samples/jfk.wav
whisper_init_from_file: loading model from './bindings/go/models/ggml-small.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 3
whisper_model_load: adding 1607 extra tokens
whisper_model_load: mem_required = 1044.00 MB
whisper_model_load: ggml ctx size = 464.56 MB
whisper_model_load: memory size = 68.48 MB
whisper_model_load: model size = 464.44 MB
system_info: n_threads = 4 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...
[00:00:00.000 --> 00:00:08.000] And so, my fellow Americans, ask not what your country can do for you.
[00:00:08.000 --> 00:00:11.000] Ask what you can do for your country.
whisper_print_timings: load time = 186.77 ms
whisper_print_timings: mel time = 41.27 ms
whisper_print_timings: sample time = 1.99 ms
whisper_print_timings: encode time = 2142.41 ms / 178.53 ms per layer
whisper_print_timings: decode time = 308.09 ms / 25.67 ms per layer
whisper_print_timings: total time = 2681.03 ms
Go bindings. I built the example beforehand so that Go build time is not included.
whisper.cpp/bindings/go$ time ./examples/go-whisper/main -model=./bindings/go/models/ggml-small.en.bin samples/jfk.wav
whisper_init_from_file: loading model from './bindings/go/models/ggml-small.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 3
whisper_model_load: adding 1607 extra tokens
whisper_model_load: mem_required = 1044.00 MB
whisper_model_load: ggml ctx size = 464.56 MB
whisper_model_load: memory size = 68.48 MB
whisper_model_load: model size = 464.44 MB
Loading "samples/jfk.wav"
...processing "samples/jfk.wav"
[ 0s-> 8s] And so, my fellow Americans, ask not what your country can do for you.
[ 8s-> 11s] Ask what you can do for your country.
whisper_print_timings: load time = 0.00 ms
whisper_print_timings: mel time = 139838291968.00 ms
whisper_print_timings: sample time = 2.13 ms
whisper_print_timings: encode time = 1649.40 ms / 137.45 ms per layer
whisper_print_timings: decode time = 463.02 ms / 38.58 ms per layer
whisper_print_timings: total time = 2345.97 ms
real 0m2.356s
user 0m26.836s
sys 0m0.240s
--
Sincerely
Jay
Reopen if the issue persists.
Golang bindings are 45 times slower than the C++ binary when transcribing samples/jfk.wav using the ggml-tiny.en.bin model.
The C++ and Golang examples were compiled following the README. I haven't profiled the Golang bindings yet.
Raw results
C++ binary
Golang bindings