Golang bindings are 45 times slower than the original C++ binary #421

Closed
@ilyazub

The Golang bindings are about 45 times slower (wall-clock time) than the C++ binary when transcribing samples/jfk.wav with the ggml-tiny.en.bin model.

| C++ binary | Golang bindings |
| ---------- | --------------- |
| 1.428s     | 63.919s         |

Both the C++ and Golang examples were compiled following the README. I haven't profiled the Golang bindings yet.

Raw results

C++ binary

time ./main -m ./models/ggml-tiny.en.bin -f ./samples/jfk.wav
whisper_init_from_file: loading model from './models/ggml-tiny.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head  = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 384
whisper_model_load: n_text_head   = 6
whisper_model_load: n_text_layer  = 4
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 1
whisper_model_load: mem required  =  387.00 MB (+    3.00 MB per decoder)
whisper_model_load: kv self size  =    2.62 MB
whisper_model_load: kv cross size =    8.79 MB
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx     =   73.58 MB
whisper_model_load: model size    =   73.54 MB

system_info: n_threads = 4 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 

main: processing './samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...


[00:00:00.000 --> 00:00:07.740]   And so my fellow Americans ask not what your country can do for you
[00:00:07.740 --> 00:00:10.740]   ask what you can do for your country


whisper_print_timings:     load time =   186.77 ms
whisper_print_timings:      mel time =    83.82 ms
whisper_print_timings:   sample time =    16.98 ms
whisper_print_timings:   encode time =   791.23 ms / 197.81 ms per layer
whisper_print_timings:   decode time =   289.46 ms / 72.36 ms per layer
whisper_print_timings:    total time =  1374.19 ms

real    0m1.428s
user    0m4.208s
sys     0m0.280s

Golang bindings

time ./build/go-whisper -model ./models/ggml-tiny.en.bin samples/jfk.wav
whisper_init_from_file: loading model from './models/ggml-tiny.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head  = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 384
whisper_model_load: n_text_head   = 6
whisper_model_load: n_text_layer  = 4
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 1
whisper_model_load: mem required  =  387.00 MB (+    3.00 MB per decoder)
whisper_model_load: kv self size  =    2.62 MB
whisper_model_load: kv cross size =    8.79 MB
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx     =   73.58 MB
whisper_model_load: model size    =   73.54 MB
Loading "./samples/jfk.wav"
  ...processing "./samples/jfk.wav"
[    0s-> 7.74s]  And so my fellow Americans ask not what your country can do for you
[ 7.74s->10.74s]  ask what you can do for your country

real    1m3.919s
user    4m7.851s
sys     0m6.771s
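The `time` figures from the two runs can be compared with a few lines of Go. This is only a sketch of the arithmetic: the `timing` type and helper names are made up for illustration, and it relies on the `0m1.428s` style printed by the shell being accepted by `time.ParseDuration`.

```go
package main

import (
	"fmt"
	"time"
)

// timing holds the "real" (wall-clock) and "user" (CPU) values reported by time.
type timing struct {
	wall, cpu time.Duration
}

// mustParse converts a value such as "1m3.919s" into a time.Duration.
func mustParse(s string) time.Duration {
	d, err := time.ParseDuration(s)
	if err != nil {
		panic(err)
	}
	return d
}

// ratio returns a divided by b as a plain float.
func ratio(a, b time.Duration) float64 {
	return a.Seconds() / b.Seconds()
}

func main() {
	// Values copied from the raw results in this issue.
	cpp := timing{wall: mustParse("0m1.428s"), cpu: mustParse("0m4.208s")}
	golang := timing{wall: mustParse("1m3.919s"), cpu: mustParse("4m7.851s")}

	fmt.Printf("wall-clock slowdown: %.1fx\n", ratio(golang.wall, cpp.wall))
	fmt.Printf("CPU-time slowdown:   %.1fx\n", ratio(golang.cpu, cpp.cpu))

	// user/real approximates how many cores were busy on average.
	fmt.Printf("C++ busy cores: %.1f\n", ratio(cpp.cpu, cpp.wall))
	fmt.Printf("Go busy cores:  %.1f\n", ratio(golang.cpu, golang.wall))
}
```

With these inputs the wall-clock ratio comes out near 44.8 and the CPU-time ratio near 58.9, while both runs kept roughly 3 to 4 cores busy. That suggests, without proving it, that the Go run was not simply single-threaded.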
