
Commit 4ca1f04

Add Assistant Vision example to README
1 parent 580346e commit 4ca1f04

2 files changed: +68 -1 lines

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -65,5 +65,7 @@ build-iPhoneSimulator/
 .vscode
 .vs/

+/path/to/*
+
 # Mac
 .DS_Store

README.md

Lines changed: 66 additions & 1 deletion
@@ -1072,6 +1072,71 @@ run_id = response['id']
 thread_id = response['thread_id']
 ```

+#### Vision in a thread
+
+You can include images in a thread, and the LLM will describe and read them. In this example I'm using [this file](https://upload.wikimedia.org/wikipedia/commons/7/70/Example.png):
+
+```
+require "openai"
+
+# Make a client
+client = OpenAI::Client.new(
+  access_token: "access_token_goes_here",
+  log_errors: true # Don't log errors in production.
+)
+
+# Upload the image as a file
+file_id = client.files.upload(
+  parameters: {
+    file: "path/to/example.png",
+    purpose: "assistants"
+  }
+)["id"]
+
+# Create an assistant
+assistant_id = client.assistants.create(
+  parameters: {
+    model: "gpt-4o",
+    name: "Image reader",
+    instructions: "You are an image describer. You describe the contents of images."
+  }
+)["id"]
+
+# Create a thread
+thread_id = client.threads.create["id"]
+
+# Add the image in a message
+client.messages.create(
+  thread_id: thread_id,
+  parameters: {
+    role: "user", # Required for manually created messages
+    content: [
+      {
+        "type": "text",
+        "text": "What's in this image?"
+      },
+      {
+        "type": "image_file",
+        "image_file": { "file_id": file_id }
+      }
+    ]
+  }
+)
+
+# Run the thread
+run_id = client.runs.create(thread_id: thread_id, parameters: { assistant_id: assistant_id })["id"]
+
+# Wait until the run is complete
+status = nil
+until status == "completed"
+  sleep(0.1)
+  status = client.runs.retrieve(id: run_id, thread_id: thread_id)["status"]
+end
+
+# Get the response
+messages = client.messages.list(thread_id: thread_id, parameters: { order: "asc" })
+messages.dig("data", -1, "content", 0, "text", "value")
+=> "The image contains a placeholder graphic with a tilted, stylized representation of a postage stamp in the top part, which includes an abstract landscape with hills and a sun. Below the stamp, in the middle of the image, there is italicized text in a light golden color that reads, \"This is just an example.\" The background is a light pastel shade, and a yellow border frames the entire image."
+```
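A note on the message step above: if the image is already hosted somewhere, you may be able to skip the upload entirely and reference it by URL, since the Assistants API also accepts an `image_url` content part. A minimal sketch, assuming `image_url` support for your chosen model:

```
# Sketch only: reference a hosted image by URL instead of uploading a file.
# Assumes the model accepts "image_url" content parts.
client.messages.create(
  thread_id: thread_id,
  parameters: {
    role: "user",
    content: [
      { "type": "text", "text": "What's in this image?" },
      {
        "type": "image_url",
        "image_url": { "url": "https://upload.wikimedia.org/wikipedia/commons/7/70/Example.png" }
      }
    ]
  }
)
```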
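One caveat with the polling loop: a run can also land in a terminal state such as `failed`, `cancelled`, or `expired` (or pause at `requires_action`, covered next), in which case `until status == "completed"` would spin forever. A minimal defensive sketch, using status names from the Assistants API:

```
# Sketch only: stop polling on any terminal status, not just "completed".
terminal_statuses = %w[completed failed cancelled expired]
status = nil
until terminal_statuses.include?(status)
  sleep(0.1)
  status = client.runs.retrieve(id: run_id, thread_id: thread_id)["status"]
end
raise "Run ended with status #{status}" unless status == "completed"
```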
 #### Runs involving function tools

 In case you are allowing the assistant to access `function` tools (they are defined in the same way as functions during chat completion), you might get a status code of `requires_action` when the assistant wants you to evaluate one or more function tools:
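The README's full function-tool example is not shown in this diff; for orientation, handling `requires_action` amounts to reading the requested tool calls off the run and posting the results back with `submit_tool_outputs`. A minimal sketch, assuming a single function tool and a hypothetical local `get_weather` helper:

```
require "json"

# Sketch only: respond to a run paused in the "requires_action" state.
# `get_weather` is a hypothetical application function used for illustration.
run = client.runs.retrieve(id: run_id, thread_id: thread_id)

if run["status"] == "requires_action"
  tool_calls = run.dig("required_action", "submit_tool_outputs", "tool_calls")
  tool_outputs = tool_calls.map do |tool_call|
    args = JSON.parse(tool_call.dig("function", "arguments"))
    {
      tool_call_id: tool_call["id"],
      output: get_weather(city: args["city"]).to_json
    }
  end

  # Hand the results back so the run can continue
  client.runs.submit_tool_outputs(
    thread_id: thread_id,
    run_id: run_id,
    parameters: { tool_outputs: tool_outputs }
  )
end
```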
@@ -1128,7 +1193,7 @@ require "openai"
 # Make a client
 client = OpenAI::Client.new(
   access_token: "access_token_goes_here",
-  log_errors: true # Don't do this in production.
+  log_errors: true # Don't log errors in production.
 )

 # Upload your file(s)