
llmcompressor.pytorch.utils.sparsification

Helper functions for retrieving information related to model sparsification

Classes:

ModuleSparsificationInfo

ModuleSparsificationInfo(
    module: Module,
    state_dict: Optional[Dict[str, Tensor]] = None,
)

Helper class for providing information related to a torch Module's parameters and the amount of sparsification applied. Includes information for both pruning and quantization.

Parameters:

  • module

    (Module) –

    torch Module to analyze

  • state_dict

    (Optional[Dict[str, Tensor]], default: None ) –

    optional state_dict to analyze in place of the torch model. This is used when analyzing an FSDP model, where the full weights may not be accessible

Attributes:

  • params_quantized (int) –

    number of parameters across quantized layers

  • params_quantized_percent (float) –

    percentage of parameters that have been quantized

  • params_sparse (int) –

    total number of sparse (zero-valued) trainable parameters in the model

  • params_sparse_percent (float) –

    percent of sparsified parameters in the entire model

  • params_total (int) –

    total number of trainable parameters in the model

Source code in llmcompressor/pytorch/utils/sparsification.py
def __init__(
    self, module: Module, state_dict: Optional[Dict[str, torch.Tensor]] = None
):
    self.module = module

    if state_dict is not None:
        # when analyzing an FSDP model, the state_dict does not differentiate
        # between trainable and non-trainable parameters
        # (e.g. it can contain buffers). This means that
        # self.trainable_params may be overestimated
        self.trainable_params = state_dict
    else:
        if hasattr(module, "_hf_hook"):
            self.trainable_params = get_state_dict_offloaded_model(module)
        else:
            self.trainable_params = {
                k: v for k, v in self.module.named_parameters() if v.requires_grad
            }
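
The final branch above keeps only parameters with `requires_grad=True`. A minimal sketch of that filtering on a toy model (the `TinyModel` class here is hypothetical, used only for illustration):

```python
import torch
from torch.nn import Linear, Module


class TinyModel(Module):
    def __init__(self):
        super().__init__()
        self.fc1 = Linear(4, 8)
        self.fc2 = Linear(8, 2)
        # freeze one parameter; it is excluded from the collection below
        self.fc2.weight.requires_grad = False


model = TinyModel()

# same dict comprehension as the non-FSDP, non-offloaded branch above
trainable = {k: v for k, v in model.named_parameters() if v.requires_grad}

print(sorted(trainable))  # fc2.weight is filtered out
```

Note that when a `state_dict` is passed instead (the FSDP path), no such filtering is possible, which is why the counts may be overestimated.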

params_quantized property

params_quantized: int

Returns:

  • int

    number of parameters across quantized layers

params_quantized_percent property

params_quantized_percent: float

Returns:

  • float

    percentage of parameters that have been quantized

params_sparse property

params_sparse: int

Returns:

  • int

    total number of sparse (zero-valued) trainable parameters in the model

params_sparse_percent property

params_sparse_percent: float

Returns:

  • float

    percent of sparsified parameters in the entire model

params_total property

params_total: int

Returns:

  • int

    total number of trainable parameters in the model
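
To illustrate what these properties measure, here is a hedged sketch of the underlying arithmetic: `params_sparse` counts zero-valued entries among trainable parameters, and `params_sparse_percent` is that count over `params_total`. This is a hand-rolled illustration, not the class's actual property implementations, which may differ in detail:

```python
import torch
from torch.nn import Linear

layer = Linear(10, 10, bias=False)  # 100 weight parameters
with torch.no_grad():
    layer.weight[:5].zero_()  # "prune" half the rows -> 50 zeros

# count zeros among trainable parameters, mirroring the property semantics
params = {k: v for k, v in layer.named_parameters() if v.requires_grad}
params_total = sum(p.numel() for p in params.values())
params_sparse = sum(int((p == 0).sum()) for p in params.values())
params_sparse_percent = 100.0 * params_sparse / params_total

print(params_total, params_sparse, params_sparse_percent)  # 100 50 50.0
```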