Commit ca43f4c

Merge pull request alexrudall#200 from Clemalfroy/whisper
Add Whisper endpoints
2 parents 73cc980 + 8210cf3 commit ca43f4c

File tree

9 files changed: +1305 −2 lines changed

- CHANGELOG.md
- Gemfile.lock
- README.md
- lib/openai/client.rb
- lib/openai/version.rb
- spec/fixtures/cassettes/whisper-1_transcribe.yml
- spec/fixtures/cassettes/whisper-1_translate.yml
- spec/fixtures/files/audio_sample.mp3
- spec/openai/client/audio_spec.rb

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
```diff
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [3.5.0] - 2023-03-02
+
+### Added
+
+- Add Client#transcribe and Client#translate endpoints - Whisper over the wire! Thanks to [@Clemalfroy](https://github.com/Clemalfroy)
+
 ## [3.4.0] - 2023-03-01
 
 ### Added
```

Gemfile.lock

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    ruby-openai (3.4.0)
+    ruby-openai (3.5.0)
       httparty (>= 0.18.1)
 
 GEM
```

README.md

Lines changed: 32 additions & 0 deletions
````diff
@@ -267,6 +267,38 @@ Pass a string to check if it violates OpenAI's Content Policy:
 => 5.505014632944949e-05
 ```
 
+### Whisper
+
+Whisper is a speech-to-text model that can be used to generate text based on an audio file:
+
+#### Translate
+
+The translations API takes as input the audio file in any of the supported languages and transcribes the audio into English.
+
+```ruby
+response = client.translate(
+    parameters: {
+        model: "whisper-1",
+        file: File.open('path_to_file'),
+    })
+puts response.parsed_response['text']
+=> "Translation of the text"
+```
+
+#### Transcribe
+
+The transcriptions API takes as input the audio file you want to transcribe and returns the text in the desired output file format.
+
+```ruby
+response = client.transcribe(
+    parameters: {
+        model: "whisper-1",
+        file: File.open('path_to_file'),
+    })
+puts response.parsed_response['text']
+=> "Transcription of the text"
+```
+
 ## Development
 
 After checking out the repo, run `bin/setup` to install dependencies. You can run `bin/console` for an interactive prompt that will allow you to experiment.
````
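Not part of this diff, but worth noting alongside the README additions: OpenAI's transcription endpoint also accepts a `response_format` parameter (per OpenAI's audio API documentation), which is how "the desired output file format" mentioned above is selected. A minimal sketch, assuming a configured client and a placeholder file path:

```ruby
# Sketch only: `response_format` comes from OpenAI's audio API documentation,
# not from this commit; the file path below is a placeholder.
client = OpenAI::Client.new

response = client.transcribe(
    parameters: {
        model: "whisper-1",
        file: File.open("path_to_file"),
        response_format: "srt" # other documented options: "json", "text", "vtt", "verbose_json"
    })
puts response.parsed_response # with "srt", the body is subtitle text rather than JSON
```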

lib/openai/client.rb

Lines changed: 8 additions & 0 deletions
```diff
@@ -43,6 +43,14 @@ def moderations(parameters: {})
       OpenAI::Client.json_post(path: "/moderations", parameters: parameters)
     end
 
+    def transcribe(parameters: {})
+      OpenAI::Client.multipart_post(path: "/audio/transcriptions", parameters: parameters)
+    end
+
+    def translate(parameters: {})
+      OpenAI::Client.multipart_post(path: "/audio/translations", parameters: parameters)
+    end
+
     def self.get(path:)
       HTTParty.get(
         uri(path: path),
```
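Both new methods delegate to `OpenAI::Client.multipart_post`, which already exists in the gem and is not shown in this diff. A minimal sketch of what such a helper could look like, assuming the gem's existing `uri` and `headers` class helpers and the HTTParty calls used elsewhere in the client:

```ruby
# Illustrative sketch, not code added by this commit: audio uploads need
# multipart/form-data so the File object in `parameters` is sent as a file part.
def self.multipart_post(path:, parameters: nil)
  HTTParty.post(
    uri(path: path),
    headers: headers.merge({ "Content-Type" => "multipart/form-data" }),
    body: parameters
  )
end
```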

lib/openai/version.rb

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,3 +1,3 @@
 module OpenAI
-  VERSION = "3.4.0".freeze
+  VERSION = "3.5.0".freeze
 end
```

spec/fixtures/cassettes/whisper-1_transcribe.yml

Lines changed: 599 additions & 0 deletions
Some generated files are not rendered by default.

spec/fixtures/cassettes/whisper-1_translate.yml

Lines changed: 599 additions & 0 deletions
Some generated files are not rendered by default.

spec/fixtures/files/audio_sample.mp3

119 KB
Binary file not shown.

spec/openai/client/audio_spec.rb

Lines changed: 59 additions & 0 deletions
```diff
@@ -0,0 +1,59 @@
+RSpec.describe OpenAI::Client do
+  describe "#transcribe" do
+    context "with audio", :vcr do
+      let(:filename) { "audio_sample.mp3" }
+      let(:audio) { File.join(RSPEC_ROOT, "fixtures/files", filename) }
+
+      let(:response) do
+        OpenAI::Client.new.transcribe(
+          parameters: {
+            model: model,
+            file: File.open(audio, 'r:iso-8859-1')
+          }
+        )
+      end
+      let(:content) { response.parsed_response["text"] }
+      let(:cassette) { "#{model} transcribe".downcase }
+
+      context "with model: whisper-1" do
+        let(:model) { "whisper-1" }
+
+        it "succeeds" do
+
+          VCR.use_cassette(cassette) do
+            expect(content.empty?).to eq(false)
+          end
+        end
+      end
+    end
+  end
+
+  describe "#translate" do
+    context "with audio", :vcr do
+      let(:filename) { "audio_sample.mp3" }
+      let(:audio) { File.join(RSPEC_ROOT, "fixtures/files", filename) }
+
+      let(:response) do
+        OpenAI::Client.new.translate(
+          parameters: {
+            model: model,
+            file: File.open(audio, 'r:iso-8859-1')
+          }
+        )
+      end
+      let(:content) { response.parsed_response["text"] }
+      let(:cassette) { "#{model} translate".downcase }
+
+      context "with model: whisper-1" do
+        let(:model) { "whisper-1" }
+
+        it "succeeds" do
+
+          VCR.use_cassette(cassette) do
+            expect(content.empty?).to eq(false)
+          end
+        end
+      end
+    end
+  end
+end
```
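The specs replay HTTP traffic from the two new cassette files instead of hitting the live API on every run. The VCR configuration itself lives in the spec helper rather than in this diff; a plausible setup matching the cassette directory used above, assuming WebMock and an `OPENAI_ACCESS_TOKEN` environment variable, might look like:

```ruby
# Illustrative only: the repo's actual spec helper is not part of this commit.
require "vcr"
require "webmock/rspec"

VCR.configure do |c|
  c.cassette_library_dir = "spec/fixtures/cassettes"      # where whisper-1_transcribe.yml is stored
  c.hook_into :webmock                                    # intercept the HTTP requests HTTParty makes
  c.default_cassette_options = { record: :new_episodes }  # record on the first run, replay afterwards
  c.filter_sensitive_data("<OPENAI_ACCESS_TOKEN>") { ENV["OPENAI_ACCESS_TOKEN"] } # keep the key out of the YAML
end
```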
