llmcompressor.modifiers.utils.pytorch_helpers
PyTorch-specific helper functions for model compression.
Provides utility functions for PyTorch model operations, including batch processing, padding mask application, and model architecture detection. Supports MoE (Mixture of Experts) models and specialized tensor operations for compression workflows.
Functions:

- apply_pad_mask_to_batch – Apply a mask to the input ids of a batch. This is used to zero out padding tokens.
- is_moe_model – Check if the model is a mixture-of-experts model.
apply_pad_mask_to_batch

Apply a mask to the input ids of a batch. This is used to zero out padding tokens so they do not contribute to the Hessian calculation in the GPTQ and SparseGPT algorithms. Assumes that attention_mask only contains zeros and ones.
Parameters:

- batch (Dict[str, Tensor]) – batch to apply padding to, if it exists
Returns:

- Dict[str, Tensor] – batch with padding zeroed out in the input_ids
Source code in llmcompressor/modifiers/utils/pytorch_helpers.py
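As a rough illustration of the behavior documented above, here is a minimal sketch, assuming the mask is applied by elementwise multiplication of input_ids with attention_mask; the actual implementation in pytorch_helpers.py may differ:

```python
from typing import Dict

import torch
from torch import Tensor


def apply_pad_mask_to_batch(batch: Dict[str, Tensor]) -> Dict[str, Tensor]:
    # Elementwise multiply: positions where attention_mask == 0 are zeroed,
    # so padding tokens contribute nothing to downstream Hessian estimates.
    batch["input_ids"] = batch["input_ids"] * batch["attention_mask"]
    return batch


# Toy usage: the second sequence ends with one padding token.
batch = {
    "input_ids": torch.tensor([[101, 7592, 2088], [101, 7592, 999]]),
    "attention_mask": torch.tensor([[1, 1, 1], [1, 1, 0]]),
}
print(apply_pad_mask_to_batch(batch)["input_ids"])
# tensor([[ 101, 7592, 2088],
#         [ 101, 7592,    0]])
```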
is_moe_model

Check if the model is a mixture-of-experts model.
Parameters:

- model (Module) – the model to check
Returns:

- bool – True if the model is a mixture-of-experts model
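The detection logic itself is not shown on this page; one plausible heuristic, sketched under the assumption that MoE architectures expose expert submodules whose class names carry markers such as "Moe" or "Expert" (e.g. transformers' MixtralSparseMoeBlock), is:

```python
from torch import nn


def is_moe_model(model: nn.Module) -> bool:
    # Heuristic: scan submodule class names for MoE markers such as
    # "MixtralSparseMoeBlock" or "Qwen2MoeSparseMoeBlock". The library's
    # actual check may use different criteria.
    for module in model.modules():
        name = type(module).__name__.lower()
        if "expert" in name or "moe" in name:
            return True
    return False
```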