Pruning during training versus post-training pruning
A key decision in applying pruning is whether to prune the model during training or after training is complete:
- Pruning during training: This approach allows the model to adjust to the pruned structure over time by iteratively removing weights as it learns. The remaining weights can compensate for those that were pruned, potentially resulting in better final performance. However, it requires more computational resources and training time than pruning once after training.
Here’s an example of this approach. The following is a minimal sketch: it assumes unstructured L1-magnitude pruning applied to every linear layer after each epoch, and num_epochs and the pruning amount are illustrative values:
import torch
import torch.nn.utils.prune as prune

# Assuming model is a pre-trained LLM
model = ...  # Load or define your LLM
train_loader = ...  # Your training DataLoader
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = torch.nn.CrossEntropyLoss()

def train(model, train_loader, optimizer):
    model.train()
    for batch in train_loader:
        inputs, targets = batch
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()

# Iterative pruning: after each epoch, prune a further 20% of the
# remaining weights in every linear layer so the model can adapt to
# the sparser structure during subsequent training.
num_epochs = 3  # illustrative value
for epoch in range(num_epochs):
    train(model, train_loader, optimizer)
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=0.2)
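Note that PyTorch’s pruning utilities apply masks through a reparametrization (storing weight_orig and weight_mask on each module) rather than zeroing weights in place. Once iterative pruning is finished, you would typically make the sparsity permanent, as in this short sketch:

# Make the pruning permanent: fold each mask into the weight tensor
# and drop the weight_orig/weight_mask reparametrization.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.remove(module, "weight")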