This issue tracks full cuDNN4 support with rust-cudnn.
cuDNN 4 does not introduce many new things, and rust-cudnn already works pretty well with the cuDNN 4 library (see the notable update below). Mostly it improves the normalization API and, internally, convolution performance. From the cuDNN 4 release notes:
_New Features_
- Batch Normalization routines have been added.
- Convolution forward and backward now supports NHWC tensor format.
- FFT Tiling algorithm has been added for cudnnConvolutionForward and cudnnConvolutionBackwardData routines
- cudnnConvolutionForward now supports computation in FP16 when run on GPU with a compute capability >= 5.3
- cudnnConvolutionForward has been optimized for batch size = 1
- Pooling and activation routines have a descriptor option to propagate NaN numbers.
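
To illustrate the FP16 condition from the list above, here is a minimal sketch of how a caller might gate the FP16 path on the device's compute capability. The `ComputeCapability` type and the `supports_fp16_convolution_forward` function are hypothetical names made up for this example; they are not part of rust-cudnn or cuDNN.

```rust
/// Hypothetical representation of a CUDA compute capability, e.g. 5.3.
/// The derived ordering compares `major` first, then `minor`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct ComputeCapability {
    major: u32,
    minor: u32,
}

/// Hypothetical check: per the cuDNN 4 release notes, FP16 computation in
/// cudnnConvolutionForward needs a compute capability of at least 5.3.
fn supports_fp16_convolution_forward(cc: ComputeCapability) -> bool {
    cc >= ComputeCapability { major: 5, minor: 3 }
}
```

How the compute capability is obtained from the device is left out here; a caller would fall back to FP32 whenever the check returns `false`.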
One notable update of cuDNN 4 is:

> Performance of cudnnConvolutionBackwardFilter when using Algo 1 has been improved for some cases. This code path now also requires a workspace.

This affects collenchyma-nn, as it makes the convolution algorithm inconsistent when switching from cuDNN 3 to cuDNN 4.
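
One possible way to stay consistent across both versions is to always ask for the required workspace size and treat zero as "no workspace needed". The sketch below only illustrates that idea. `Workspace` and `ensure_workspace` are hypothetical names, the buffer is plain host memory standing in for device memory, and `required` is assumed to come from a size query such as `cudnnGetConvolutionBackwardFilterWorkspaceSize` in the C API.

```rust
/// Hypothetical stand-in for a block of device memory (host memory here).
#[derive(Debug)]
struct Workspace {
    bytes: Vec<u8>,
}

impl Workspace {
    /// Allocates a zeroed workspace of `size` bytes.
    fn with_size(size: usize) -> Workspace {
        Workspace { bytes: vec![0u8; size] }
    }

    /// Size of the workspace in bytes.
    fn size(&self) -> usize {
        self.bytes.len()
    }
}

/// Returns a workspace of at least `required` bytes, or `None` when the
/// selected algorithm reports that it needs no workspace (`required == 0`).
/// An existing workspace is reused when it is already large enough.
fn ensure_workspace(current: Option<Workspace>, required: usize) -> Option<Workspace> {
    if required == 0 {
        return None;
    }
    match current {
        Some(ws) if ws.size() >= required => Some(ws),
        _ => Some(Workspace::with_size(required)),
    }
}
```

With this shape, the same call site would work whether or not the backward-filter algorithm requires a workspace, since the size query decides whether one is allocated.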