Module fast

Fast implementations of commonly used multi-op functions.

Functions

layer_norm
Layer normalization.
layer_norm_device
Layer normalization.
rms_norm
Root Mean Square normalization (RMS norm).
rms_norm_device
Root Mean Square normalization (RMS norm).
rope
Optimized implementation of NN.RoPE.
rope_device
Optimized implementation of NN.RoPE.
scaled_dot_product_attention
A fast implementation of multi-head attention: O = softmax(scale * (Q @ K.T), dim=-1) @ V
scaled_dot_product_attention_device
A fast implementation of multi-head attention: O = softmax(scale * (Q @ K.T), dim=-1) @ V
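
For orientation, the math behind the normalization and attention entries above can be sketched in plain Rust on slices. This is not the API of this module: the function names (rms_norm_ref, layer_norm_ref, attention_row_ref), their parameters, and the handling of eps and scale are illustrative assumptions. The scale factor in particular is inferred from the name scaled_dot_product_attention. The fused functions listed above operate on arrays and should be preferred in practice.

```rust
// Plain-Rust reference sketches of the underlying math.
// These are NOT this module's functions; all names and parameters are hypothetical.

/// RMS norm over one row: x_i / sqrt(mean(x^2) + eps) * weight_i.
fn rms_norm_ref(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(weight).map(|(v, w)| v * inv * w).collect()
}

/// Layer norm over one row: (x_i - mean) / sqrt(var + eps) * weight_i + bias_i.
fn layer_norm_ref(x: &[f32], weight: &[f32], bias: &[f32], eps: f32) -> Vec<f32> {
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    // Population variance (divide by n, not n - 1).
    let var = x.iter().map(|v| (v - mean) * (v - mean)).sum::<f32>() / n;
    let inv = 1.0 / (var + eps).sqrt();
    x.iter()
        .zip(weight.iter().zip(bias))
        .map(|(v, (w, b))| (v - mean) * inv * w + b)
        .collect()
}

/// Single-head attention for one query row: softmax(scale * q . K^T) . V.
/// `keys` and `values` hold one row per position; `values` must be non-empty.
fn attention_row_ref(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>], scale: f32) -> Vec<f32> {
    // Scaled dot product of the query with every key.
    let scores: Vec<f32> = keys
        .iter()
        .map(|k| scale * q.iter().zip(k).map(|(a, b)| a * b).sum::<f32>())
        .collect();
    // Numerically stable softmax: subtract the maximum before exponentiating.
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let denom: f32 = exps.iter().sum();
    // Weighted sum of the value rows.
    let mut out = vec![0.0; values[0].len()];
    for (e, v) in exps.iter().zip(values) {
        for (o, x) in out.iter_mut().zip(v) {
            *o += e / denom * x;
        }
    }
    out
}
```

A conventional choice for scale is 1 / sqrt(d), where d is the length of each query and key row. The sketch leaves it to the caller, so no particular default is implied.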