Transformers documentation

Custom Layers and Utilities

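This page lists all the custom layers used by the library, as well as the utility functions it provides for models.

Most of these are only useful if you are studying the code of the models in the library.

PyTorch custom modules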

class transformers.Conv1D

( nf nx )

Parameters

  • nf (int) — The number of output features.
  • nx (int) — The number of input features.

1D-convolutional layer as defined by Radford et al. for OpenAI GPT (and also used in GPT-2).

It works like a linear layer, except that the weight matrix is stored transposed, with shape (nx, nf).
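The relationship to a linear layer can be checked directly. The following is a minimal sketch, assuming that `Conv1D` is importable from `transformers.pytorch_utils` and that it exposes `weight` and `bias` attributes; the sizes and tensor names are illustrative only.

```python
import torch
from torch import nn
from transformers.pytorch_utils import Conv1D

nf, nx = 4, 3  # output features, input features (illustrative sizes)

conv = Conv1D(nf, nx)
linear = nn.Linear(nx, nf)

# Copy the Conv1D parameters into the linear layer, transposing the weight,
# so that both layers should compute the same function.
with torch.no_grad():
    linear.weight.copy_(conv.weight.T)
    linear.bias.copy_(conv.bias)

x = torch.randn(2, 5, nx)  # (batch, sequence, features)
with torch.no_grad():
    outputs_match = torch.allclose(conv(x), linear(x), atol=1e-6)

print(tuple(conv.weight.shape), tuple(linear.weight.shape), outputs_match)
```

If the assumption about the stored layout holds, the two weight shapes printed are mirror images of each other and the outputs agree up to floating-point tolerance.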

PyTorch helper functions

transformers.apply_chunking_to_forward

( forward_fn: Callable[..., torch.Tensor] chunk_size: int chunk_dim: int *input_tensors ) torch.Tensor

Parameters

  • forward_fn (Callable[..., torch.Tensor]) — The forward function of the model.
  • chunk_size (int) — The size of each chunk along chunk_dim: num_chunks = input_tensors[0].shape[chunk_dim] / chunk_size.
  • chunk_dim (int) — The dimension over which the input_tensors should be chunked.
  • input_tensors (tuple[torch.Tensor]) — The input tensors of forward_fn which will be chunked.

Returns

torch.Tensor

A tensor with the same shape that forward_fn would have produced if applied directly to input_tensors.

This function chunks the input_tensors into smaller input tensor parts of size chunk_size over the dimension chunk_dim. It then applies a layer forward_fn to each chunk independently to save memory.

If forward_fn is independent across chunk_dim, this function yields the same result as applying forward_fn directly to input_tensors.

Examples:

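from transformers import apply_chunking_to_forward


# Both functions below are methods of the same module.
# Rename the usual forward() fn to forward_chunk()
def forward_chunk(self, hidden_states):
    hidden_states = self.decoder(hidden_states)
    return hidden_states


# Implement a chunked forward function that splits hidden_states along
# self.seq_len_dim into chunks of size self.chunk_size_lm_head
def forward(self, hidden_states):
    return apply_chunking_to_forward(self.forward_chunk, self.chunk_size_lm_head, self.seq_len_dim, hidden_states)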
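To see the equivalence claim in action outside of a model class, here is a small standalone sketch. It assumes `apply_chunking_to_forward` is importable from `transformers.pytorch_utils`; the feed-forward block, tensor sizes, and the names `feed_forward`, `chunked`, and `direct` are hypothetical and chosen only for illustration. The sequence length is picked to be a multiple of the chunk size.

```python
import torch
from torch import nn
from transformers.pytorch_utils import apply_chunking_to_forward

torch.manual_seed(0)

# A position-wise feed-forward block: each sequence position is processed
# independently, so chunking over the sequence dimension should not change the result.
feed_forward = nn.Sequential(nn.Linear(8, 32), nn.GELU(), nn.Linear(32, 8))


def forward_chunk(hidden_states):
    return feed_forward(hidden_states)


hidden_states = torch.randn(2, 6, 8)  # (batch, seq_len, hidden)

with torch.no_grad():
    # chunk_size=2 over chunk_dim=1 splits the 6 positions into 3 chunks.
    chunked = apply_chunking_to_forward(forward_chunk, 2, 1, hidden_states)
    direct = forward_chunk(hidden_states)

print(chunked.shape == direct.shape, torch.allclose(chunked, direct, atol=1e-6))
```

A layer that mixes information across the chunked dimension, such as self-attention over the sequence, would not satisfy the independence condition, so the chunked and direct results could differ.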

transformers.pytorch_utils.prune_linear_layer

( layer: nn.Linear index: torch.LongTensor dim: int = 0 ) torch.nn.Linear

Parameters

  • layer (torch.nn.Linear) — The layer to prune.
  • index (torch.LongTensor) — The indices to keep in the layer.
  • dim (int, optional, defaults to 0) — The dimension on which to keep the indices.

Returns

torch.nn.Linear

The pruned layer as a new layer with requires_grad=True.

Prune a linear layer to keep only the entries in index.

Used to remove attention heads.
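As a rough illustration of the pruning behavior, the sketch below keeps three of six output features of a small linear layer. The layer sizes and the names `layer`, `index`, and `pruned` are hypothetical; the comparison against `layer.weight[index]` rests on the assumption that, with dim=0, the kept rows are copied unchanged.

```python
import torch
from torch import nn
from transformers.pytorch_utils import prune_linear_layer

torch.manual_seed(0)

layer = nn.Linear(4, 6)            # 4 input features, 6 output features
index = torch.tensor([0, 2, 5])    # output features to keep

# dim=0 (the default) selects rows of the weight matrix, i.e. output features.
pruned = prune_linear_layer(layer, index, dim=0)

print(pruned.in_features, pruned.out_features)
print(torch.equal(pruned.weight, layer.weight[index]))
print(pruned.weight.requires_grad)
```

Passing dim=1 would instead select input features, which is the direction typically needed for the layer that consumes the concatenated head outputs.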
