linear

torch.ao.nn.quantized.functional.linear(input, weight, bias=None, scale=None, zero_point=None)

Applies a linear transformation to the incoming quantized data: $y = xA^T + b$. See Linear.
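To make the formula concrete, here is a minimal pure-Python sketch of the reference semantics, assuming per-tensor affine quantization where a real value equals `scale * (q - zero_point)`. The helper names (`dequantize`, `quantize`, `reference_qlinear`) are hypothetical and only illustrate the arithmetic. The actual kernels work on integers and may differ in rounding details.

```python
def dequantize(q, scale, zero_point):
    # Affine mapping from integer codes back to real values.
    return [scale * (v - zero_point) for v in q]


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    # Round to the nearest integer code and clamp to the quint8 range.
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in x]


def reference_qlinear(qx, x_scale, x_zp, qw, w_scale, w_zp, bias, out_scale, out_zp):
    # y = x A^T + b, computed in floating point on dequantized values,
    # then requantized with the requested output scale and zero point.
    x = dequantize(qx, x_scale, x_zp)
    y = []
    for row, b in zip(qw, bias):
        w = dequantize(row, w_scale, w_zp)
        y.append(sum(xi * wi for xi, wi in zip(x, w)) + b)
    return quantize(y, out_scale, out_zp)


# One input row with in_features=2 and a weight with out_features=2.
qy = reference_qlinear(
    qx=[130, 126], x_scale=0.5, x_zp=128,        # x = [1.0, -1.0]
    qw=[[2, 4], [-2, 0]], w_scale=0.25, w_zp=0,  # A = [[0.5, 1.0], [-0.5, 0.0]]
    bias=[0.5, 0.0],
    out_scale=0.25, out_zp=128,
)
print(qy)  # real-valued result is [0.0, -0.5]
```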

Note

The current implementation packs the weights on every call, which incurs a performance penalty. To avoid this overhead, use Linear.
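The trade-off in the note can be sketched as follows: the functional form is handed the raw quantized weight on each call, while the `torch.ao.nn.quantized.Linear` module packs the weight once when it is set. This is an illustrative sketch that assumes a quantized backend (for example fbgemm or qnnpack) is available in the local build. The tensor sizes and quantization parameters are arbitrary choices.

```python
import torch
import torch.ao.nn.quantized as nnq
import torch.ao.nn.quantized.functional as qF

torch.manual_seed(0)

# Arbitrary example data: batch of 4, in_features=8, out_features=3.
qx = torch.quantize_per_tensor(torch.randn(4, 8), scale=0.05, zero_point=128, dtype=torch.quint8)
qw = torch.quantize_per_tensor(torch.randn(3, 8), scale=0.02, zero_point=0, dtype=torch.qint8)
bias = torch.randn(3)

# Functional form: the weight is packed inside every call.
y_fn = qF.linear(qx, qw, bias, scale=0.1, zero_point=128)

# Module form: the weight is packed once in set_weight_bias and reused.
layer = nnq.Linear(8, 3)
layer.set_weight_bias(qw, bias)
layer.scale = 0.1       # output scale
layer.zero_point = 128  # output zero point
y_mod = layer(qx)
```

With the same weight, bias, and output quantization parameters, both paths are expected to produce the same quantized result. The module form is the better fit when the layer is called repeatedly.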

Parameters

input (Tensor) – Quantized input of type torch.quint8
weight (Tensor) – Quantized weight of type torch.qint8
bias (Tensor) – None or fp32 bias of type torch.float
scale (double) – output scale. If None, derived from the input scale
zero_point (python:long) – output zero point. If None, derived from the input zero_point

Return type

Tensor

Shape:

Input: $(N, *, in\_features)$ where * means any number of additional dimensions
Weight: $(out\_features, in\_features)$
Bias: $(out\_features)$
Output: $(N, *, out\_features)$
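A short usage sketch covering the parameters and shapes listed above, assuming a quantized backend (for example fbgemm or qnnpack) is available in the local build. The sizes, scales, and zero points are arbitrary values chosen for illustration.

```python
import torch
import torch.ao.nn.quantized.functional as qF

torch.manual_seed(0)

in_features, out_features = 8, 3

# Input of shape (N, *, in_features) with one additional dimension.
x = torch.randn(2, 5, in_features)
w = torch.randn(out_features, in_features)
b = torch.randn(out_features)  # fp32 bias, not quantized

# Input is quantized as quint8, weight as qint8.
qx = torch.quantize_per_tensor(x, scale=0.05, zero_point=128, dtype=torch.quint8)
qw = torch.quantize_per_tensor(w, scale=0.02, zero_point=0, dtype=torch.qint8)

# Explicit output quantization parameters.
qy = qF.linear(qx, qw, b, scale=0.1, zero_point=128)

print(qy.shape)              # (2, 5, out_features)
print(qy.dtype)              # quantized output dtype
print(qy.dequantize().shape) # float view of the result, same shape
```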
