Closed
Description
Golang bindings are 45 times slower than the C++ binary when transcoding samples/jfk.wav
using the ggml-tiny.en.bin
model.
C++ binary | Golang bindings |
---|---|
1.428s | 63.919s |
C++ and Golang examples are compiled following the readme. I haven't profiled the Golang bindings yet.
Raw results
C++ binary
time ./main -m ./models/ggml-tiny.en.bin -f ./samples/jfk.wav
whisper_init_from_file: loading model from './models/ggml-tiny.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 384
whisper_model_load: n_text_head = 6
whisper_model_load: n_text_layer = 4
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 1
whisper_model_load: mem required = 387.00 MB (+ 3.00 MB per decoder)
whisper_model_load: kv self size = 2.62 MB
whisper_model_load: kv cross size = 8.79 MB
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx = 73.58 MB
whisper_model_load: model size = 73.54 MB
system_info: n_threads = 4 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
main: processing './samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...
[00:00:00.000 --> 00:00:07.740] And so my fellow Americans ask not what your country can do for you
[00:00:07.740 --> 00:00:10.740] ask what you can do for your country
whisper_print_timings: load time = 186.77 ms
whisper_print_timings: mel time = 83.82 ms
whisper_print_timings: sample time = 16.98 ms
whisper_print_timings: encode time = 791.23 ms / 197.81 ms per layer
whisper_print_timings: decode time = 289.46 ms / 72.36 ms per layer
whisper_print_timings: total time = 1374.19 ms
real 0m1.428s
user 0m4.208s
sys 0m0.280s
Golang bindings
time ./build/go-whisper -model ./models/ggml-tiny.en.bin samples/jfk.wav
whisper_init_from_file: loading model from './models/ggml-tiny.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 384
whisper_model_load: n_text_head = 6
whisper_model_load: n_text_layer = 4
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 1
whisper_model_load: mem required = 387.00 MB (+ 3.00 MB per decoder)
whisper_model_load: kv self size = 2.62 MB
whisper_model_load: kv cross size = 8.79 MB
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx = 73.58 MB
whisper_model_load: model size = 73.54 MB
Loading "./samples/jfk.wav"
...processing "./samples/jfk.wav"
[ 0s-> 7.74s] And so my fellow Americans ask not what your country can do for you
[ 7.74s->10.74s] ask what you can do for your country
real 1m3.919s
user 4m7.851s
sys 0m6.771s
Metadata
Metadata
Assignees
Labels
No labels