OpenCL Error
I built the xpu-flex image from xpu-master/docker, and it fails the sample code from README.md (Inference on GPU) with "RuntimeError: An OpenCL error occurred: -6". I fixed this by adding a layer on top of the image that upgrades the system packages:
FROM intel-extension-for-pytorch:xpu-flex
RUN apt update && apt -y upgrade
Torchvision Warning
It would be nice if the instructions (docker/README.md) mentioned that this torchvision warning is not a problem. It took me far too long to find the issue where this was mentioned.
root@1f3f1787e64f:/# python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {ipex.xpu.get_device_properties(i)}') for i in range(ipex.xpu.device_count())];"
/usr/local/lib/python3.10/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
1.13.0a0+git6c9b55e
1.13.120+xpu
[0]: _DeviceProperties(name='Intel(R) Arc(TM) A770 Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu, support_fp64=0, total_memory=15473MB, max_compute_units=512)
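If the warning is only noise, one way to hide it in your own scripts is Python's standard `warnings` filter. This is a minimal sketch, not something taken from the IPEX docs: the helper name `silence_torchvision_image_warning` is hypothetical, and the filter matches on the message prefix shown in the output above.

```python
import warnings


def silence_torchvision_image_warning() -> None:
    """Ignore the 'Failed to load image Python extension' UserWarning.

    Hypothetical helper: the message prefix is copied from the output above.
    The pattern is matched against the start of the warning text.
    """
    warnings.filterwarnings(
        "ignore",
        message="Failed to load image Python extension",
        category=UserWarning,
    )


# Install the filter before anything imports torchvision.
silence_torchvision_image_warning()
# import torch
# import intel_extension_for_pytorch as ipex
```

The filter has to be installed before the first torchvision import, since the warning is raised at import time. Other warnings are unaffected.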
Performance Reference
After finally getting Stable Diffusion to work using https://github.com/vladmandic/automatic (dev), performance was good.
Prompt: cute dog riding an apple
Steps: 25 | Sampler: DPM++ 2M Karras | CFG scale: 7 | Seed: 1 | Size: 512x512 | Model hash: 6ce0161689 | Model: v1-5-pruned-emaonly | Clip skip: 1 | Version: cc685a8 | Parser: Full parser
Time taken: 2m 28.33s |
GPU active 5424 MB reserved 5908 MB | System peak 5280 MB total 15474 MB
(Variation Seed: 1, Batch Count: 8, Batch Size 8)
An RTX 3060 12GB takes 3m 11s, so the A770 (2m 28s) is substantially faster: it finishes in roughly 22% less time.
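To make the comparison concrete, here is a small sketch that turns the two timings into a speedup figure. The helper `to_seconds` is hypothetical, and the image count assumes Batch Count 8 x Batch Size 8 = 64 images per run on both cards.

```python
import re


def to_seconds(text: str) -> float:
    """Convert a duration such as '2m 28.33s' or '3m 11s' to seconds."""
    match = re.fullmatch(r"\s*(?:(\d+)m)?\s*(?:(\d+(?:\.\d+)?)s)?\s*", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2) or 0.0)
    return minutes * 60 + seconds


a770 = to_seconds("2m 28.33s")   # Arc A770 timing from this report
rtx3060 = to_seconds("3m 11s")   # RTX 3060 12GB timing from this report
images = 8 * 8                   # assumption: batch count x batch size

speedup = rtx3060 / a770         # how many times faster the A770 is
time_saved = 1 - a770 / rtx3060  # fraction of wall-clock time saved

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.1%}")
print(f"A770: {a770 / images:.2f} s/image, RTX 3060: {rtx3060 / images:.2f} s/image")
```

With these numbers the A770 comes out at about 1.29x the speed of the RTX 3060, or roughly 2.3 s per image versus 3.0 s.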