A Go CLI tool for generating, editing, and analyzing images using OpenAI's API. Supports image generation from text prompts, image inpainting (completion), and image vision (analysis) tasks.
- Image Generation: Create images from text prompts using OpenAI's image generation API.
- Image Completion (Inpainting): Edit or complete images by providing an original and a masked image.
- Image Vision: Analyze and answer questions about images using OpenAI's vision models.
multimodality/
├── .env # Environment variables (API keys, etc.)
├── go.mod, go.sum # Go module files and dependencies
├── main.go # Main CLI entry point
├── openai/ # OpenAI API client and image operation logic
└── test-files/ # Example images and test assets
- Clone the repository and navigate to the `multimodality` directory.
- Install dependencies (requires Go 1.24+): `go mod tidy`
- Set up environment variables:
  - Copy `.env` and set your OpenAI API key: `export OPEN_API_KEY=your-openai-api-key`
Run the CLI with the desired action and flags:
Generate an image from a text description:
go run main.go -action=image-gen -image-desc="An astronaut riding a bicycle on the moon"
Edit or complete an image using a mask:
go run main.go -action=image-complete -image-desc="Ancient Konark temple dedicated for Lord Surya (Sun) before it was destroyed." -image-path=./test-files/image.png -masked-image-path=./test-files/masked.png
Ask a question about an image (local file or URL):
go run main.go -action=image-vision -query="What did the Ancient Konark temple dedicated for Lord Surya (Sun) look like before it was destroyed." -image-path=./test-files/image.png
or
go run main.go -action=image-vision -query="What did the Ancient Konark temple dedicated for Lord Surya (Sun) look like before it was destroyed." -image-url=https://ik.imagekit.io/1hhs6vx06v/konark.png
| Flag | Description |
|---|---|
| `-action` | One of: `image-gen`, `image-complete`, `image-vision` |
| `-image-desc` | Description for image generation or completion |
| `-image-path` | Path to the input image file |
| `-masked-image-path` | Path to the masked image file (for inpainting) |
| `-image-url` | URL of the image to analyze (for vision, as an alternative to `-image-path`) |
| `-query` | Question to ask about the image (for vision) |
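The flags above could be declared and validated with Go's standard `flag` package roughly as follows. This is a minimal sketch, not the contents of `main.go`: the `options` type, the `parseFlags` function, and the per-action requirements are assumptions inferred from the usage examples in this README.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// options mirrors the CLI flags documented in this README.
// The type itself is hypothetical.
type options struct {
	action          string
	imageDesc       string
	imagePath       string
	maskedImagePath string
	imageURL        string
	query           string
}

// parseFlags parses args (without the program name) and validates them.
func parseFlags(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("multimodality", flag.ContinueOnError)
	fs.StringVar(&o.action, "action", "", "one of: image-gen, image-complete, image-vision")
	fs.StringVar(&o.imageDesc, "image-desc", "", "description for image generation or completion")
	fs.StringVar(&o.imagePath, "image-path", "", "path to the input image file")
	fs.StringVar(&o.maskedImagePath, "masked-image-path", "", "path to the masked image file")
	fs.StringVar(&o.imageURL, "image-url", "", "URL of the image to analyze")
	fs.StringVar(&o.query, "query", "", "question to ask about the image")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	return o, o.validate()
}

// validate checks that the flags each action needs are present.
// The requirements are assumed from the usage examples.
func (o options) validate() error {
	switch o.action {
	case "image-gen":
		if o.imageDesc == "" {
			return errors.New("image-gen requires -image-desc")
		}
	case "image-complete":
		if o.imageDesc == "" || o.imagePath == "" || o.maskedImagePath == "" {
			return errors.New("image-complete requires -image-desc, -image-path and -masked-image-path")
		}
	case "image-vision":
		if o.query == "" || (o.imagePath == "" && o.imageURL == "") {
			return errors.New("image-vision requires -query and one of -image-path or -image-url")
		}
	default:
		return fmt.Errorf("unknown -action %q", o.action)
	}
	return nil
}

func main() {
	opts, err := parseFlags(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Println("action:", opts.action)
}
```

Using a dedicated `flag.FlagSet` rather than the package-level flags keeps parsing testable, since the function can be called with any argument slice.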
MIT License