torch.nn.functional
Convolution functions
conv1d |
Applies a 1D convolution over an input signal composed of several input planes. |
conv2d |
Applies a 2D convolution over an input image composed of several input planes. |
conv3d |
Applies a 3D convolution over an input image composed of several input planes. |
conv_transpose1d |
Applies a 1D transposed convolution operator over an input signal composed of several input planes, sometimes also called "deconvolution". |
conv_transpose2d |
Applies a 2D transposed convolution operator over an input image composed of several input planes, sometimes also called "deconvolution". |
conv_transpose3d |
Applies a 3D transposed convolution operator over an input image composed of several input planes, sometimes also called "deconvolution". |
unfold |
Extracts sliding local blocks from a batched input tensor. |
fold |
Combines an array of sliding local blocks into a large containing tensor. |
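The relationship between conv2d and unfold can be sketched as follows. This is a minimal illustration, assuming PyTorch is installed; the tensor shapes and the variable names (`patches`, `out_unf`) are chosen for the example and are not part of the reference.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 3, 8, 8)       # (batch, in_channels, height, width)
weight = torch.randn(4, 3, 3, 3)  # (out_channels, in_channels, kH, kW)

# Direct 2D convolution with no padding and stride 1: 8 - 3 + 1 = 6 per side.
out = F.conv2d(x, weight)

# The same result from sliding local blocks: unfold extracts every 3x3 patch
# as a column, so the convolution becomes a matrix multiplication.
patches = F.unfold(x, kernel_size=3)         # (2, 3*3*3, 6*6)
out_unf = weight.view(4, -1) @ patches       # (2, 4, 36)
out_unf = out_unf.view(2, 4, 6, 6)           # restore the spatial layout
```

Both paths should agree up to floating-point rounding, which is a convenient sanity check when writing custom patch-based operators.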
Pooling functions
avg_pool1d |
Applies a 1D average pooling over an input signal composed of several input planes. |
avg_pool2d |
Applies a 2D average-pooling operation in $kH \times kW$ regions by step size $sH \times sW$ steps. |
avg_pool3d |
Applies a 3D average-pooling operation in $kT \times kH \times kW$ regions by step size $sT \times sH \times sW$ steps. |
max_pool1d |
Applies a 1D max pooling over an input signal composed of several input planes. |
max_pool2d |
Applies a 2D max pooling over an input signal composed of several input planes. |
max_pool3d |
Applies a 3D max pooling over an input signal composed of several input planes. |
max_unpool1d |
Computes a partial inverse of MaxPool1d. |
max_unpool2d |
Computes a partial inverse of MaxPool2d. |
max_unpool3d |
Computes a partial inverse of MaxPool3d. |
lp_pool1d |
Applies a 1D power-average pooling over an input signal composed of several input planes. |
lp_pool2d |
Applies a 2D power-average pooling over an input signal composed of several input planes. |
adaptive_max_pool1d |
Applies a 1D adaptive max pooling over an input signal composed of several input planes. |
adaptive_max_pool2d |
Applies a 2D adaptive max pooling over an input signal composed of several input planes. |
adaptive_max_pool3d |
Applies a 3D adaptive max pooling over an input signal composed of several input planes. |
adaptive_avg_pool1d |
Applies a 1D adaptive average pooling over an input signal composed of several input planes. |
adaptive_avg_pool2d |
Applies a 2D adaptive average pooling over an input signal composed of several input planes. |
adaptive_avg_pool3d |
Applies a 3D adaptive average pooling over an input signal composed of several input planes. |
fractional_max_pool2d |
Applies 2D fractional max pooling over an input signal composed of several input planes. |
fractional_max_pool3d |
Applies 3D fractional max pooling over an input signal composed of several input planes. |
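A small worked example may help show why max unpooling is only a partial inverse of max pooling. It is a sketch with illustrative input values, assuming PyTorch is installed.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[[1.0, 3.0, 2.0, 5.0, 4.0, 0.0]]])  # (batch=1, channels=1, length=6)

# Max pooling over windows of 2 (stride defaults to the kernel size).
# return_indices=True also returns where each maximum came from.
pooled, indices = F.max_pool1d(x, kernel_size=2, return_indices=True)

# Unpooling puts each maximum back at its recorded index; every
# non-maximal position is set to zero, so the inverse is only partial.
restored = F.max_unpool1d(pooled, indices, kernel_size=2)

# Adaptive pooling takes the desired output size instead of a kernel size.
adaptive = F.adaptive_avg_pool1d(x, output_size=2)
```

Here `pooled` holds the window maxima 3, 5, 4 and `restored` has the same length as `x` with zeros in the positions that lost the comparison.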
Attention mechanisms
scaled_dot_product_attention |
Computes scaled dot product attention on query, key and value tensors, using an optional attention mask if passed, and applying dropout if a probability greater than 0.0 is specified. |
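The computation can be compared against an explicit softmax formulation. The sketch below assumes PyTorch 2.0 or later, no attention mask, and the default dropout probability of 0.0; shapes and the names `scores` and `reference` are illustrative.

```python
import math

import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = torch.randn(2, 4, 5, 8)  # (batch, heads, target_len, head_dim)
k = torch.randn(2, 4, 6, 8)  # (batch, heads, source_len, head_dim)
v = torch.randn(2, 4, 6, 8)

# Fused call: no mask, dropout_p left at its default of 0.0.
out = F.scaled_dot_product_attention(q, k, v)

# Explicit reference: softmax(Q K^T / sqrt(head_dim)) V.
scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
reference = torch.softmax(scores, dim=-1) @ v
```

The two results should match up to floating-point rounding; the fused call may dispatch to an optimized kernel depending on the device and inputs.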
Non-linear activation functions
threshold |
Thresholds each element of the input Tensor. |
threshold_ |
In-place version of threshold(). |
relu |
Applies the rectified linear unit function element-wise. |
relu_ |
In-place version of relu(). |
hardtanh |
Applies the HardTanh function element-wise. |
hardtanh_ |
In-place version of hardtanh(). |
hardswish |
Applies the hardswish function, element-wise, as described in the paper: Searching for MobileNetV3. |
relu6 |
Applies the element-wise function $\text{ReLU6}(x) = \min(\max(0, x), 6)$. |
elu |
Applies the Exponential Linear Unit (ELU) function element-wise. |
elu_ |
In-place version of elu(). |
selu |
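Applies element-wise, $\text{SELU}(x) = \text{scale} \cdot (\max(0, x) + \min(0, \alpha \cdot (\exp(x) - 1)))$, with $\alpha = 1.6732632423543772848170429916717$ and $\text{scale} = 1.0507009873554804934193349852946$. |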
celu |
Applies element-wise, $\text{CELU}(x) = \max(0, x) + \min(0, \alpha \cdot (\exp(x / \alpha) - 1))$. |
leaky_relu |
Applies element-wise, $\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} \cdot \min(0, x)$. |
leaky_relu_ |
In-place version of leaky_relu(). |
prelu |
Applies element-wise the function $\text{PReLU}(x) = \max(0, x) + \text{weight} \cdot \min(0, x)$, where weight is a learnable parameter. |
rrelu |
Randomized leaky ReLU. |
rrelu_ |
In-place version of rrelu(). |
glu |
The gated linear unit. |
gelu |
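When the approximate argument is 'none', it applies element-wise the function $\text{GELU}(x) = x \cdot \Phi(x)$, where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution. |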
logsigmoid |
Applies element-wise $\text{LogSigmoid}(x_i) = \log\left(\frac{1}{1 + \exp(-x_i)}\right)$. |
hardshrink |
Applies the hard shrinkage function element-wise. |
tanhshrink |
Applies element-wise, $\text{Tanhshrink}(x) = x - \tanh(x)$. |
softsign |
Applies element-wise, the function $\text{SoftSign}(x) = \frac{x}{1 + |x|}$. |
softplus |
Applies element-wise, the function $\text{Softplus}(x) = \frac{1}{\beta} \log(1 + \exp(\beta \cdot x))$. |
softmin |
Applies a softmin function. |
softmax |
Applies a softmax function. |
softshrink |
Applies the soft shrinkage function element-wise. |
gumbel_softmax |
Samples from the Gumbel-Softmax distribution and optionally discretizes. |
log_softmax |
Applies a softmax followed by a logarithm. |
tanh |
Applies element-wise, $\text{Tanh}(x) = \tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}$. |
sigmoid |
Applies the element-wise function $\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}$. |
hardsigmoid |
Applies the element-wise function $\text{Hardsigmoid}(x) = \frac{\text{ReLU6}(x + 3)}{6}$. |
silu |
Applies the Sigmoid Linear Unit (SiLU) function, element-wise. |
mish |
Applies the Mish function, element-wise. |
batch_norm |
Applies Batch Normalization for each channel across a batch of data. |
group_norm |
Applies Group Normalization over groups of channels in each data sample in a batch. |
instance_norm |
Applies Instance Normalization for each channel in each data sample in a batch. |
layer_norm |
Applies Layer Normalization over the last certain number of dimensions. |
local_response_norm |
Applies local response normalization over an input signal composed of several input planes, where channels occupy the second dimension. |
normalize |
Performs $L_p$ normalization of inputs over the specified dimension. |
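Several of the functions above can be checked against their closed-form definitions. The following is a minimal sketch, assuming PyTorch is installed and default arguments (for example `beta=1` for softplus and `eps=1e-5` for layer_norm); the `*_ref` names are illustrative.

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-4.0, 8.0, steps=7)  # -4, -2, 0, 2, 4, 6, 8

# ReLU6(x) = min(max(0, x), 6)
relu6_ref = torch.clamp(x, min=0.0, max=6.0)

# Hardsigmoid(x) = ReLU6(x + 3) / 6
hardsigmoid_ref = F.relu6(x + 3.0) / 6.0

# Softplus(x) = log(1 + exp(x)) for the default beta = 1
softplus_ref = torch.log1p(torch.exp(x))

# log_softmax is a softmax followed by a logarithm, computed more stably
log_softmax_ref = torch.log(F.softmax(x, dim=0))

# layer_norm over the last dimension: zero mean, unit (biased) variance
torch.manual_seed(0)
h = torch.randn(3, 4)
mean = h.mean(dim=-1, keepdim=True)
var = h.var(dim=-1, unbiased=False, keepdim=True)
layer_norm_ref = (h - mean) / torch.sqrt(var + 1e-5)
```

Each reference tensor should match the corresponding functional call (`F.relu6(x)`, `F.hardsigmoid(x)`, `F.softplus(x)`, `F.log_softmax(x, dim=0)`, `F.layer_norm(h, (4,))`) up to rounding.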
Linear functions
linear |
Applies a linear transformation to the incoming data: $y = xA^T + b$. |
bilinear |
Applies a bilinear transformation to the incoming data: $y = x_1^T A x_2 + b$. |
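Both transformations can be written out with plain tensor operations. This is a sketch under the assumption that the weight layouts are (out_features, in_features) for linear and (out_features, in1_features, in2_features) for bilinear; the variable names are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# linear: y = x A^T + b, with A of shape (out_features, in_features)
x = torch.randn(5, 3)
A = torch.randn(2, 3)
b = torch.randn(2)
y = F.linear(x, A, b)
y_ref = x @ A.T + b

# bilinear (no bias here): z[n, o] = sum_ij x1[n, i] * W[o, i, j] * x2[n, j]
x1 = torch.randn(5, 3)
x2 = torch.randn(5, 4)
W = torch.randn(2, 3, 4)
z = F.bilinear(x1, x2, W)
z_ref = torch.einsum("ni,oij,nj->no", x1, W, x2)
```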
Dropout functions
dropout |
During training, randomly zeroes some of the elements of the input tensor with probability p. |
alpha_dropout |
Applies alpha dropout to the input. |
feature_alpha_dropout |
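Randomly masks out entire channels (a channel is a feature map, e.g., the j-th channel of the i-th sample in the batched input is a tensor input[i, j]) of the input tensor. |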
dropout1d |
Randomly zeroes out entire channels of the input tensor (a channel is a 1D feature map, e.g., the j-th channel of the i-th sample in the batched input is a 1D tensor input[i, j]). |
dropout2d |
Randomly zeroes out entire channels of the input tensor (a channel is a 2D feature map, e.g., the j-th channel of the i-th sample in the batched input is a 2D tensor input[i, j]). |
dropout3d |
Randomly zeroes out entire channels of the input tensor (a channel is a 3D feature map, e.g., the j-th channel of the i-th sample in the batched input is a 3D tensor input[i, j]). |
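The difference between element-wise and channel-wise dropout is easiest to see on a tensor of ones. The sketch below assumes PyTorch is installed; the shapes and `p=0.5` are illustrative choices.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.ones(4, 6, 5, 5)  # (batch, channels, height, width)

# Element-wise dropout: each element is zeroed independently and the
# survivors are scaled by 1 / (1 - p), so values are either 0 or 2 here.
y = F.dropout(x, p=0.5, training=True)

# Channel-wise dropout: each input[i, j] feature map is kept or zeroed
# as a whole, so every channel is constant (all 0 or all 2).
y2d = F.dropout2d(x, p=0.5, training=True)
per_channel = y2d.flatten(2)  # (4, 6, 25)
channel_is_constant = per_channel.min(dim=-1).values == per_channel.max(dim=-1).values

# With training=False the functional form returns the input unchanged.
y_eval = F.dropout(x, p=0.5, training=False)
```

Unlike the module forms, the functional forms do not track train/eval mode, so the `training` flag has to be passed explicitly.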
Sparse functions
embedding |
A simple lookup table that looks up embeddings in a fixed dictionary and size. |
embedding_bag |
Computes sums, means or maxes of bags of embeddings. |
one_hot |
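Takes LongTensor with index values of shape $(*)$ and returns a tensor of shape $(*, \text{num\_classes})$ that has zeros everywhere except where the index of the last dimension matches the corresponding value of the input tensor, in which case it will be 1. |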
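The three sparse functions are closely related, as the following sketch suggests. It assumes PyTorch is installed; the lookup table values and index tensors are made up for the illustration.

```python
import torch
import torch.nn.functional as F

# A 3-entry lookup table with 2-dimensional embeddings.
table = torch.arange(6, dtype=torch.float32).view(3, 2)  # rows: [0,1], [2,3], [4,5]
idx = torch.tensor([0, 2, 1])

# embedding selects rows of the table by index.
emb = F.embedding(idx, table)

# one_hot followed by a matrix product performs the same lookup.
onehot = F.one_hot(idx, num_classes=3)
emb_via_onehot = onehot.float() @ table

# embedding_bag reduces groups ("bags") of looked-up rows. With a 1D input,
# offsets mark where each bag starts: bag 0 = indices [0, 2], bag 1 = [1, 2].
flat = torch.tensor([0, 2, 1, 2])
offsets = torch.tensor([0, 2])
bags = F.embedding_bag(flat, table, offsets, mode="sum")
```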
Distance functions
pairwise_distance |
See torch.nn.PairwiseDistance for details. |
cosine_similarity |
Returns cosine similarity between x1 and x2, computed along dim. |
pdist |
Computes the p-norm distance between every pair of row vectors in the input. |
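A short example with hand-checkable values, assuming PyTorch is installed and the default arguments (p=2 and a small eps added for numerical stability in pairwise_distance):

```python
import torch
import torch.nn.functional as F

a = torch.tensor([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[0.0, 1.0], [0.0, 1.0], [3.0, 4.0]])

# Row-wise cosine similarity: orthogonal, parallel, identical rows.
cos = F.cosine_similarity(a, b, dim=1)

# Row-wise Euclidean distance between a[i] and b[i].
dist = F.pairwise_distance(a, b)

# Distances between every pair of rows within a single tensor,
# in the order (0, 1), (0, 2), (1, 2).
pd = F.pdist(a)
```

Note that pairwise_distance compares matching rows of two tensors, while pdist compares all row pairs of one tensor and returns a flat vector.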
Loss functions
binary_cross_entropy |
Function that measures the Binary Cross Entropy between the target and input probabilities. |
binary_cross_entropy_with_logits |
Function that measures Binary Cross Entropy between target and input logits. |
poisson_nll_loss |
Poisson negative log likelihood loss. |
cosine_embedding_loss |
See torch.nn.CosineEmbeddingLoss for details. |
cross_entropy |
This criterion computes the cross entropy loss between input logits and target. |
ctc_loss |
The Connectionist Temporal Classification loss. |
gaussian_nll_loss |
Gaussian negative log likelihood loss. |
hinge_embedding_loss |
See torch.nn.HingeEmbeddingLoss for details. |
kl_div |
The Kullback-Leibler divergence loss. |
l1_loss |
Function that takes the mean element-wise absolute value difference. |
mse_loss |
Measures the element-wise mean squared error. |
margin_ranking_loss |
See torch.nn.MarginRankingLoss for details. |
multilabel_margin_loss |
See torch.nn.MultiLabelMarginLoss for details. |
multilabel_soft_margin_loss |
See torch.nn.MultiLabelSoftMarginLoss for details. |
multi_margin_loss |
See torch.nn.MultiMarginLoss for details. |
nll_loss |
The negative log likelihood loss. |
huber_loss |
Function that uses a squared term if the absolute element-wise error falls below delta and a delta-scaled L1 term otherwise. |
smooth_l1_loss |
Function that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise. |
soft_margin_loss |
See torch.nn.SoftMarginLoss for details. |
triplet_margin_loss |
See torch.nn.TripletMarginLoss for details. |
triplet_margin_with_distance_loss |
See torch.nn.TripletMarginWithDistanceLoss for details. |
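Some of the losses above are compositions of others, which the following sketch makes explicit. It assumes PyTorch is installed and the default mean reduction; the inputs are random and purely illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# cross_entropy takes raw logits; it equals log_softmax followed by nll_loss.
logits = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 2])
ce = F.cross_entropy(logits, target)
ce_ref = F.nll_loss(F.log_softmax(logits, dim=1), target)

# binary_cross_entropy_with_logits folds the sigmoid into the loss,
# which is numerically more stable than applying it separately.
z = torch.randn(4)
t = torch.tensor([0.0, 1.0, 1.0, 0.0])
bce_logits = F.binary_cross_entropy_with_logits(z, t)
bce = F.binary_cross_entropy(torch.sigmoid(z), t)

# With delta = beta = 1, huber_loss and smooth_l1_loss coincide.
pred = torch.randn(6) * 3
truth = torch.zeros(6)
huber = F.huber_loss(pred, truth, delta=1.0)
smooth = F.smooth_l1_loss(pred, truth, beta=1.0)
```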
Vision functions
pixel_shuffle |
Rearranges elements in a tensor of shape $(*, C \times r^2, H, W)$ to a tensor of shape $(*, C, H \times r, W \times r)$, where r is the upscale_factor. |
pixel_unshuffle |
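Reverses the PixelShuffle operation by rearranging elements in a tensor of shape $(*, C, H \times r, W \times r)$ to a tensor of shape $(*, C \times r^2, H, W)$, where r is the downscale_factor. |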
pad |
Pads tensor. |
interpolate |
Down/up samples the input to either the given size or the given scale_factor. |
upsample |
Upsamples the input to either the given size or the given scale_factor. |
upsample_nearest |
Upsamples the input, using nearest neighbours' pixel values. |
upsample_bilinear |
Upsamples the input, using bilinear upsampling. |
grid_sample |
Given an input and a flow-field grid, computes the output using input values and pixel locations from grid. |
affine_grid |
Generates a 2D or 3D flow field (sampling grid), given a batch of affine matrices theta. |
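The shape conventions of the vision functions can be illustrated on a tiny tensor. This is a sketch assuming PyTorch is installed; the input values, the factor of 2, and the identity affine matrix are chosen for the example.

```python
import torch
import torch.nn.functional as F

x = torch.arange(16.0).view(1, 4, 2, 2)  # (N, C * r^2, H, W) with C = 1, r = 2

# pixel_shuffle trades channels for resolution; pixel_unshuffle reverses it.
up = F.pixel_shuffle(x, upscale_factor=2)            # (1, 1, 4, 4)
back = F.pixel_unshuffle(up, downscale_factor=2)     # (1, 4, 2, 2)

# pad takes (left, right, top, bottom, ...) starting from the last dimension.
padded = F.pad(x, (1, 1, 0, 0))                      # width 2 -> 4

# interpolate resizes by a target size or a scale factor.
resized = F.interpolate(x, scale_factor=2, mode="nearest")  # (1, 4, 4, 4)

# affine_grid builds a sampling grid from a batch of 2x3 affine matrices;
# with the identity matrix, grid_sample reproduces the input.
theta = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
grid = F.affine_grid(theta, size=x.shape, align_corners=False)
sampled = F.grid_sample(x, grid, align_corners=False)
```

Passing the same `align_corners` value to both affine_grid and grid_sample keeps the two coordinate conventions consistent.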
DataParallel functions (multi-GPU, distributed)
data_parallel |
Evaluates module(input) in parallel across the GPUs given in device_ids. |
© 2024, PyTorch Contributors
PyTorch has a BSD-style license, as found in the LICENSE file.
https://pytorch.org/docs/2.1/nn.functional.html