Commit a7aef5c

Merge pull request collabora#5 from jpc/jpc/executable-docs
Docker images and a Quarto README with synchronized docs
2 parents 0733173 + ad3d1f7 commit a7aef5c

16 files changed: +508 -58 lines changed

README.md

Lines changed: 179 additions & 57 deletions

# WhisperBot

Welcome to WhisperBot. WhisperBot builds upon the capabilities of
[WhisperLive](https://github.com/collabora/WhisperLive) and
[WhisperSpeech](https://github.com/collabora/WhisperSpeech) by
integrating Mistral, a Large Language Model (LLM), on top of the
real-time speech-to-text pipeline. WhisperLive relies on OpenAI Whisper,
a powerful automatic speech recognition (ASR) system. Both Mistral and
Whisper are optimized to run efficiently as TensorRT engines, maximizing
performance and real-time processing capabilities.

## Features

- **Real-Time Speech-to-Text**: Utilizes OpenAI WhisperLive to convert
  spoken language into text in real-time.

- **Large Language Model Integration**: Adds Mistral, a Large Language
  Model, to enhance the understanding and context of the transcribed
  text.

- **TensorRT Optimization**: Both Mistral and Whisper are optimized to
  run as TensorRT engines, ensuring high-performance and low-latency
  processing.

## Prerequisites

Install
[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/installation.md)
to build Whisper and Mistral TensorRT engines. The TensorRT-LLM README
builds a Docker image for TensorRT-LLM. Instead of building a Docker
image, we can also follow that README and the
[Dockerfile.multi](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docker/Dockerfile.multi)
to install the required packages in the base PyTorch Docker image. Just
make sure to use the correct base image as mentioned in the Dockerfile,
and everything should go smoothly.
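
For reference, building and entering the TensorRT-LLM container looks roughly
like the sketch below. The exact Makefile targets live in the TensorRT-LLM
repository and may change between releases, so treat this as an assumption and
double-check the linked installation guide.

``` bash
# Rough sketch based on the TensorRT-LLM installation docs; verify the targets
# against the version you check out before running.
git clone https://github.com/NVIDIA/TensorRT-LLM.git
cd TensorRT-LLM
git submodule update --init --recursive
make -C docker release_build   # build the TensorRT-LLM Docker image
make -C docker release_run     # start a container from that image
```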

### Build Whisper TensorRT Engine

> [!NOTE]
>
> These steps are included in `docker/scripts/build-whisper.sh`

Change working dir to the [whisper example
dir](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/whisper)
in TensorRT-LLM.

``` bash
cd /root/TensorRT-LLM-examples/whisper
```

Currently, by default TensorRT-LLM only supports `large-v2` and
`large-v3`. In this repo, we use `small.en`.

Download the required assets:

``` bash
# the sound filter definitions
wget --directory-prefix=assets https://raw.githubusercontent.com/openai/whisper/main/whisper/assets/mel_filters.npz
# the small.en model weights
wget --directory-prefix=assets https://openaipublic.azureedge.net/main/whisper/models/f953ad0fd29cacd07d5a9eda5624af0f6bcf2258be67c92b79389873d91e0872/small.en.pt
```

We have to patch the script to add support for our model size
(`small.en`):

``` bash
patch <<EOF
--- build.py.old 2024-01-17 17:47:47.508545842 +0100
+++ build.py 2024-01-17 17:47:41.404941926 +0100
@@ -58,6 +58,7 @@
                         choices=[
                             "large-v3",
                             "large-v2",
+                            "small.en",
                         ])
     parser.add_argument('--quantize_dir', type=str, default="quantize/1-gpu")
     parser.add_argument('--dtype',
EOF
```

Finally we can build the TensorRT engine for the `small.en` Whisper
model:

``` bash
pip install -r requirements.txt
python3 build.py --output_dir whisper_small_en --use_gpt_attention_plugin --use_gemm_plugin --use_layernorm_plugin --use_bert_attention_plugin --model_name small.en
mkdir -p /root/scratch-space/models
cp -r whisper_small_en /root/scratch-space/models
```

### Build Mistral TensorRT Engine

> [!NOTE]
>
> These steps are included in `docker/scripts/build-mistral.sh`

``` bash
cd /root/TensorRT-LLM-examples/llama
```

Build TensorRT for Mistral with `fp16`:

``` bash
python build.py --model_dir teknium/OpenHermes-2.5-Mistral-7B \
                --dtype float16 \
                --remove_input_padding \
                --use_gpt_attention_plugin float16 \
                --enable_context_fmha \
                --use_gemm_plugin float16 \
                --output_dir ./tmp/mistral/7B/trt_engines/fp16/1-gpu/ \
                --max_input_len 5000 \
                --max_batch_size 1
mkdir -p /root/scratch-space/models
cp -r tmp/mistral/7B/trt_engines/fp16/1-gpu /root/scratch-space/models/mistral
```

### Build Phi TensorRT Engine

> [!NOTE]
>
> These steps are included in `docker/scripts/build-phi-2.sh`

Note: Phi is only available in the main branch and hasn't been released
yet, so make sure to build TensorRT-LLM from the main branch.

``` bash
cd /root/TensorRT-LLM-examples/phi
```

Build TensorRT for Phi-2 with `fp16`:

``` bash
git lfs install
phi_path=$(huggingface-cli download --repo-type model --revision 834565c23f9b28b96ccbeabe614dd906b6db551a microsoft/phi-2)
python3 build.py --dtype=float16 \
                 --log_level=verbose \
                 --use_gpt_attention_plugin float16 \
                 --use_gemm_plugin float16 \
                 --max_batch_size=16 \
                 --max_input_len=1024 \
                 --max_output_len=1024 \
                 --output_dir=phi-2 \
                 --model_dir="$phi_path" 2>&1 | tee build.log
dest=/root/scratch-space/models
mkdir -p "$dest/phi-2/tokenizer"
cp -r phi-2 "$dest"
(cd "$phi_path" && cp config.json tokenizer_config.json vocab.json merges.txt "$dest/phi-2/tokenizer")
cp -r "$phi_path" "$dest/phi-orig-model"
```

## Build WhisperBot

> [!NOTE]
>
> These steps are included in `docker/scripts/setup-whisperbot.sh`

Clone this repo and install requirements:

``` bash
[ -d "WhisperBot" ] || git clone https://github.com/collabora/WhisperBot.git
cd WhisperBot
apt update
apt install ffmpeg portaudio19-dev -y
```

Install torchaudio matching the PyTorch from the base image:

``` bash
pip install --extra-index-url https://download.pytorch.org/whl/cu121 torchaudio
```

Install all the other dependencies normally:

``` bash
pip install -r requirements.txt
pip install openai-whisper whisperspeech soundfile
```

Force-update huggingface_hub (tokenizers 0.14.1 spuriously requires an
ancient \<=0.18 version) and pre-fetch the WhisperSpeech, Vocos, EnCodec,
and Silero VAD assets:

``` bash
pip install -U huggingface_hub
huggingface-cli download collabora/whisperspeech t2s-small-en+pl.model s2a-q4-tiny-en+pl.model
huggingface-cli download charactr/vocos-encodec-24khz
mkdir -p /root/.cache/torch/hub/checkpoints/
curl -L -o /root/.cache/torch/hub/checkpoints/encodec_24khz-d7cc33bc.th https://dl.fbaipublicfiles.com/encodec/v0/encodec_24khz-d7cc33bc.th
mkdir -p /root/.cache/whisper-live/
curl -L -o /root/.cache/whisper-live/silero_vad.onnx https://github.com/snakers4/silero-vad/raw/master/files/silero_vad.onnx
python -c 'from transformers.utils.hub import move_cache; move_cache()'
```

### Run WhisperBot with Whisper and Mistral/Phi-2

Take the folder path for the Whisper TensorRT model, and the folder_path
and tokenizer_path for the Mistral/Phi-2 TensorRT model from the build
phase. If a Hugging Face model was used to build Mistral/Phi-2, then
just use the Hugging Face repo name as the tokenizer path.

> [!NOTE]
>
> These steps are included in `docker/scripts/run-whisperbot.sh`

``` bash
test -f /etc/shinit_v2 && source /etc/shinit_v2
cd WhisperBot
if [ "$1" != "mistral" ]; then
    exec python3 main.py --phi \
        --whisper_tensorrt_path /root/whisper_small_en \
        --phi_tensorrt_path /root/phi-2 \
        --phi_tokenizer_path /root/phi-2
else
    exec python3 main.py --mistral \
        --whisper_tensorrt_path /root/models/whisper_small_en \
        --mistral_tensorrt_path /root/models/mistral \
        --mistral_tokenizer_path teknium/OpenHermes-2.5-Mistral-7B
fi
```
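
The script picks the LLM from its first argument. A usage sketch; both
invocations assume the script and the `WhisperBot` checkout sit in the current
directory, as they do under `/root` in the Docker image:

``` bash
./run-whisperbot.sh           # default: Whisper + Phi-2
./run-whisperbot.sh mistral   # Whisper + Mistral instead
```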

- On the client side clone the repo, install the requirements and
  execute `run_client.py`

``` bash
cd WhisperBot
pip install -r requirements.txt
python3 run_client.py
```

## Contact Us

For questions or issues, please open an issue. Contact us at:

README.qmd

Lines changed: 86 additions & 0 deletions
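
README.qmd is the executable Quarto source from which the README.md above is
rendered; it pulls the shell scripts under `docker/scripts/` into the generated
Markdown. Assuming the Quarto CLI (with Python support) is installed, the two
are kept in sync by re-rendering:

``` bash
quarto render README.qmd
```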

---
format: gfm
execute:
  echo: false
  output: asis
---

```{python}
#| include: false
def include_file(fname):
    with open(fname) as f:
        print(f'''
:::{{.callout-note}}
These steps are included in `{fname}`
:::
''')
        code = False
        for l in f:
            if l.startswith('#!'):
                continue
            if l.startswith('## '):
                if code: print("```"); code=False
                print(l[3:])
            elif l.strip():
                if not code: print("```bash"); code=True
                print(l.rstrip())
        if code: print("```")
```

# WhisperBot

Welcome to WhisperBot. WhisperBot builds upon the capabilities of [WhisperLive](https://github.com/collabora/WhisperLive) and [WhisperSpeech](https://github.com/collabora/WhisperSpeech) by integrating Mistral, a Large Language Model (LLM), on top of the real-time speech-to-text pipeline. WhisperLive relies on OpenAI Whisper, a powerful automatic speech recognition (ASR) system. Both Mistral and Whisper are optimized to run efficiently as TensorRT engines, maximizing performance and real-time processing capabilities.

## Features
- **Real-Time Speech-to-Text**: Utilizes OpenAI WhisperLive to convert spoken language into text in real-time.

- **Large Language Model Integration**: Adds Mistral, a Large Language Model, to enhance the understanding and context of the transcribed text.

- **TensorRT Optimization**: Both Mistral and Whisper are optimized to run as TensorRT engines, ensuring high-performance and low-latency processing.

## Prerequisites
Install [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/installation.md) to build Whisper and Mistral TensorRT engines. The TensorRT-LLM README builds a Docker image for TensorRT-LLM.
Instead of building a Docker image, we can also follow that README and the [Dockerfile.multi](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docker/Dockerfile.multi) to install the required packages in the base PyTorch Docker image. Just make sure to use the correct base image as mentioned in the Dockerfile, and everything should go smoothly.

### Build Whisper TensorRT Engine

```{python}
include_file('docker/scripts/build-whisper.sh')
```

### Build Mistral TensorRT Engine

```{python}
include_file('docker/scripts/build-mistral.sh')
```

### Build Phi TensorRT Engine

```{python}
include_file('docker/scripts/build-phi-2.sh')
```

## Build WhisperBot

```{python}
include_file('docker/scripts/setup-whisperbot.sh')
```

### Run WhisperBot with Whisper and Mistral/Phi-2

Take the folder path for the Whisper TensorRT model, and the folder_path and tokenizer_path for the Mistral/Phi-2 TensorRT model from the build phase. If a Hugging Face model was used to build Mistral/Phi-2, then just use the Hugging Face repo name as the tokenizer path.

```{python}
include_file('docker/scripts/run-whisperbot.sh')
```

- On the client side clone the repo, install the requirements and execute `run_client.py`
```bash
cd WhisperBot
pip install -r requirements.txt
python3 run_client.py
```

## Contact Us
For questions or issues, please open an issue.

docker/Dockerfile

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
FROM ghcr.io/collabora/whisperbot-base:latest as base
2+
3+
WORKDIR /root
4+
COPY scripts/setup-whisperbot.sh scripts/run-whisperbot.sh scratch-space/models /root/
5+
RUN ./setup-whisperbot.sh
6+
7+
CMD ./run-whisperbot.sh
8+
