Merged
8 changes: 7 additions & 1 deletion docs/source/en/api/video_processor.md
@@ -12,4 +12,10 @@ specific language governing permissions and limitations under the License.

# Video Processor

-The `VideoProcessor` provides a unified API for video pipelines to prepare inputs for VAE encoding and post-processing outputs once they're decoded. The class inherits [`VaeImageProcessor`] so it includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and NumPy arrays.
+The [`VideoProcessor`] provides a unified API for video pipelines to prepare inputs for VAE encoding and post-processing outputs once they're decoded. The class inherits [`VaeImageProcessor`] so it includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and NumPy arrays.
+
+## VideoProcessor
+
+[[autodoc]] video_processor.VideoProcessor.preprocess_video
+
+[[autodoc]] video_processor.VideoProcessor.postprocess_video
20 changes: 11 additions & 9 deletions src/diffusers/video_processor.py
@@ -30,17 +30,19 @@ def preprocess_video(self, video, height: Optional[int] = None, width: Optional[
Preprocesses input video(s).
Args:
-video: The input video. It can be one of the following:
+video (`List[PIL.Image]`, `List[List[PIL.Image]]`, `torch.Tensor`, `np.array`, `List[torch.Tensor]`, `List[np.array]`):
+The input video. It can be one of the following:
* List of the PIL images.
* List of list of PIL images.
-* 4D Torch tensors (expected shape for each tensor: (num_frames, num_channels, height, width)).
-* 4D NumPy arrays (expected shape for each array: (num_frames, height, width, num_channels)).
-* List of 4D Torch tensors (expected shape for each tensor: (num_frames, num_channels, height, width)).
-* List of 4D NumPy arrays (expected shape for each array: (num_frames, height, width, num_channels)).
-* 5D NumPy arrays: expected shape for each array: (batch_size, num_frames, height, width,
-num_channels).
-* 5D Torch tensors: expected shape for each array: (batch_size, num_frames, num_channels, height,
-width).
+* 4D Torch tensors (expected shape for each tensor `(num_frames, num_channels, height, width)`).
+* 4D NumPy arrays (expected shape for each array `(num_frames, height, width, num_channels)`).
+* List of 4D Torch tensors (expected shape for each tensor `(num_frames, num_channels, height,
+width)`).
+* List of 4D NumPy arrays (expected shape for each array `(num_frames, height, width, num_channels)`).
+* 5D NumPy arrays: expected shape for each array `(batch_size, num_frames, height, width,
+num_channels)`.
+* 5D Torch tensors: expected shape for each array `(batch_size, num_frames, num_channels, height,
+width)`.
height (`int`, *optional*, defaults to `None`):
The height in preprocessed frames of the video. If `None`, will use the `get_default_height_width()` to
get default height.
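
The shape conventions listed in the docstring differ by backend: Torch inputs are channels-first, NumPy inputs are channels-last, and 5D inputs add a leading batch axis. The sketch below only illustrates that bookkeeping in plain Python. `name_video_axes` and `AXES_4D` are hypothetical names introduced here for illustration; they are not part of `diffusers` and do not reflect how `preprocess_video` is implemented.

```python
from typing import Dict, Tuple

# Axis order for 4D inputs, copied from the shapes listed in the docstring.
# Torch inputs are channels-first; NumPy inputs are channels-last.
AXES_4D = {
    "torch": ("num_frames", "num_channels", "height", "width"),
    "numpy": ("num_frames", "height", "width", "num_channels"),
}


def name_video_axes(backend: str, shape: Tuple[int, ...]) -> Dict[str, int]:
    """Hypothetical helper: label each axis of a 4D or 5D video shape."""
    if backend not in AXES_4D:
        raise ValueError(f"unknown backend: {backend!r}")
    axes = AXES_4D[backend]
    if len(shape) == 5:
        # 5D inputs carry a leading batch axis in both layouts.
        axes = ("batch_size",) + axes
    elif len(shape) != 4:
        raise ValueError(f"expected a 4D or 5D shape, got {len(shape)}D")
    return dict(zip(axes, shape))


# A single 16-frame RGB clip in the Torch layout.
torch_clip = name_video_axes("torch", (16, 3, 480, 720))
# A batch of two such clips in the NumPy layout.
numpy_batch = name_video_axes("numpy", (2, 16, 480, 720, 3))
print(torch_clip)
print(numpy_batch)
```

Labeling the axes this way makes it easy to see that the same clip puts `num_channels` at index 1 in the Torch layout and at the last index in the NumPy layout.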