pub fn quantize(
    w: impl AsRef<Array>,
    group_size: impl Into<Option<i32>>,
    bits: impl Into<Option<i32>>,
) -> Result<(Array, Array, Array)>
Quantize the matrix w using bits bits per element.
Note: every group_size elements in a row of w are quantized together, so the number of
columns of w must be divisible by group_size. In other words, each row of w is divided
into groups of size group_size, and each group is quantized as a unit.
quantize currently only supports 2D inputs whose dimensions are multiples of 32.
For details, please see this documentation
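To make the grouping concrete, here is a minimal, self-contained sketch of group-wise quantization on a single row. It does not call this crate and is not the MLX kernel. It assumes a simple min/max affine scheme in which each group stores one scale and one bias, and it keeps one unpacked code per element, whereas the real function returns packed arrays. The names quantize_row and dequantize_row are hypothetical and exist only for this illustration.

```rust
/// Quantize one row group by group (illustrative sketch, not the MLX kernel).
/// Assumption: min/max affine quantization, one scale and one bias per group.
/// Returns (codes, scales, biases) with one unpacked code per input element.
fn quantize_row(row: &[f32], group_size: usize, bits: u32) -> (Vec<u32>, Vec<f32>, Vec<f32>) {
    assert!(
        group_size > 0 && row.len() % group_size == 0,
        "row length must be divisible by group_size"
    );
    assert!((1..=8).contains(&bits), "this sketch handles 1 to 8 bits");

    // Largest representable code, e.g. 15 for 4 bits.
    let levels = ((1u32 << bits) - 1) as f32;
    let mut codes = Vec::with_capacity(row.len());
    let mut scales = Vec::new();
    let mut biases = Vec::new();

    for group in row.chunks(group_size) {
        let min = group.iter().copied().fold(f32::INFINITY, f32::min);
        let max = group.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        // A constant group has zero range; fall back to scale 1 to avoid dividing by zero.
        let scale = if max > min { (max - min) / levels } else { 1.0 };
        for &x in group {
            // Map x into [0, levels] and round to the nearest integer code.
            let q = ((x - min) / scale).round().min(levels);
            codes.push(q as u32);
        }
        scales.push(scale);
        biases.push(min);
    }
    (codes, scales, biases)
}

/// Reconstruct approximate values: x ≈ code * scale + bias, using each element's group.
fn dequantize_row(codes: &[u32], scales: &[f32], biases: &[f32], group_size: usize) -> Vec<f32> {
    codes
        .iter()
        .enumerate()
        .map(|(i, &q)| q as f32 * scales[i / group_size] + biases[i / group_size])
        .collect()
}
```

With a 128-element row, group_size 64, and bits 4, this sketch produces 128 codes in the range 0 to 15 together with 2 scales and 2 biases. Under the stated scheme, the reconstruction error of each element is at most half of its group's scale.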
Params
w: The input matrix
group_size: The size of the group in w that shares a scale and bias. (default: 64)
bits: The number of bits occupied by each element of w in the returned quantized matrix. (default: 4)