Disable cudnn when doing grid_sample to reduce memory cost.

ruotianluo · ruotianluo · commit 2cd362baa6c8 · 2017-11-07T18:12:51.000-06:00
diff --git a/README.md b/README.md
@@ -58,7 +58,7 @@ Additional features not mentioned in the [report](https://arxiv.org/pdf/1702.021
   - **Support for visualization**. The current implementation will summarize ground truth boxes, statistics of losses, activations and variables during training, and dump it to a separate folder for tensorboard visualization. The computing graph is also saved for debugging.
 
 ### Prerequisites
-  - A basic pytorch installation. The code follows **0.2**. If you are using old **0.1.12**, you can checkout 0.1.12 branch.
+  - A basic pytorch installation. The code follows **0.3**. If you are using the older **0.1.12** or **0.2**, you can check out the corresponding branch.
   - Python packages you might not have: `cffi`, `opencv-python`, `easydict` (similar to [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn)). For `easydict` make sure you have the right version. Xinlei uses 1.6.
   - [tensorboard-pytorch](https://github.com/lanpa/tensorboard-pytorch) to visualize the training and validation curve. Please build from source to use the latest tensorflow-tensorboard.
   - ~~Docker users: Since the recent upgrade, the docker image on docker hub (https://hub.docker.com/r/mbuckler/tf-faster-rcnn-deps/) is no longer valid. However, you can still build your own image by using dockerfile located at `docker` folder (cuda 8 version, as it is required by Tensorflow r1.0.) And make sure following Tensorflow installation to install and use nvidia-docker[https://github.com/NVIDIA/nvidia-docker]. Last, after launching the container, you have to build the Cython modules within the running container.~~
diff --git a/lib/nets/network.py b/lib/nets/network.py
@@ -119,15 +119,17 @@ def _crop_pool_layer(self, bottom, rois, max_pool=True):
     theta[:, 1, 1] = (y2 - y1) / (height - 1)
     theta[:, 1, 2] = (y1 + y2 - height + 1) / (height - 1)
 
+    pre_pool_size = cfg.POOLING_SIZE * 2 if max_pool else cfg.POOLING_SIZE
+    grid = F.affine_grid(theta, torch.Size((rois.size(0), 1, pre_pool_size, pre_pool_size)))
+    # Run grid_sample without cudnn to reduce memory cost, then restore the previous setting.
+    cudnn_enabled = torch.backends.cudnn.enabled
+    torch.backends.cudnn.enabled = False
+    try:
+      crops = F.grid_sample(bottom.expand(rois.size(0), bottom.size(1), bottom.size(2), bottom.size(3)), grid)
+    finally:
+      torch.backends.cudnn.enabled = cudnn_enabled
     if max_pool:
-      pre_pool_size = cfg.POOLING_SIZE * 2
-      grid = F.affine_grid(theta, torch.Size((rois.size(0), 1, pre_pool_size, pre_pool_size)))
-      crops = F.grid_sample(bottom.expand(rois.size(0), bottom.size(1), bottom.size(2), bottom.size(3)), grid)
       crops = F.max_pool2d(crops, 2, 2)
-    else:
-      grid = F.affine_grid(theta, torch.Size((rois.size(0), 1, cfg.POOLING_SIZE, cfg.POOLING_SIZE)))
-      crops = F.grid_sample(bottom.expand(rois.size(0), bottom.size(1), bottom.size(2), bottom.size(3)), grid)
-    
     return crops
 
   def _anchor_target_layer(self, rpn_cls_score):
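
The cudnn toggle around `grid_sample` in `_crop_pool_layer` is a set-and-restore pattern on a global flag. A minimal sketch of packaging that pattern as a context manager is given below. The helper name `temporarily_set` and the stand-in object `fake_cudnn` are hypothetical and not part of this repository; the stand-in is used only so the sketch runs without PyTorch installed.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporarily_set(obj, attr, value):
    """Set obj.attr to value inside the with-block, then restore the old value."""
    previous = getattr(obj, attr)
    setattr(obj, attr, value)
    try:
        yield
    finally:
        # Runs even if the body raises, so the flag never stays flipped.
        setattr(obj, attr, previous)


# Hypothetical stand-in for torch.backends.cudnn (assumption for illustration).
fake_cudnn = SimpleNamespace(enabled=True)

with temporarily_set(fake_cudnn, "enabled", False):
    inside = fake_cudnn.enabled  # value seen while the block runs

after = fake_cudnn.enabled  # value seen once the block has exited
```

With PyTorch available, the same helper could presumably wrap the call as `with temporarily_set(torch.backends.cudnn, "enabled", False): crops = F.grid_sample(...)`. Recent PyTorch releases also provide a `torch.backends.cudnn.flags(enabled=False)` context manager; whether it is available in the 0.3 release targeted by this commit has not been checked here.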