llmcompressor.pytorch.utils.sparsification_info.configs

Classes:

SparsificationInfo –

Abstract base model for the sparsification information of a torch module.
SparsificationPruning –

A model that contains the pruning information for a torch module.
SparsificationQuantization –

A model that contains the quantization information for a torch module.
SparsificationSummaries –

A model that contains the sparsification summaries for a torch module.

SparsificationInfo

Bases: BaseModel, ABC

Methods:

filter_loggable_items_non_zero_only –

Filter the loggable items to only yield the non-zero items
filter_loggable_items_percentages_only –

Filter the loggable items to only yield the percentages of the loggable items
from_module –

Factory method to create SparsificationInfo object from a module.
loggable_items –

Yield the loggable items for SparsificationInfo object.

filter_loggable_items_non_zero_only `staticmethod`

filter_loggable_items_non_zero_only(
    items_to_log, non_zero_only
)

Filter the loggable items to only yield the non-zero items

Parameters:

items_to_log
–

A generator that yields the loggable items for this object.
non_zero_only
–

If True, only yield information for non-zero items.

Returns:

–

A generator that yields the loggable items for this object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

@staticmethod
def filter_loggable_items_non_zero_only(items_to_log, non_zero_only):
    """
    Filter the loggable items to only yield the non-zero items

    :param items_to_log: A generator that yields the loggable items for this object.
    :param non_zero_only: If True, only yield information for non-zero items.
    :return: A generator that yields the loggable items for this object.
    """

    def filter_non_zero_values(log):
        # log value must be non-zero
        return log[1] != 0

    yield from SparsificationInfo._filter_items_to_log(
        items_to_log,
        filter_function=filter_non_zero_values,
        to_filter=non_zero_only,
    )
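
The private helper `_filter_items_to_log` is not shown on this page. The following is a minimal sketch of the filtering behavior, assuming the helper passes every item through when `to_filter` is False and applies the predicate otherwise. `filter_items` and the sample tags are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple

LogItem = Tuple[str, Any]


def filter_items(
    items: Iterable[LogItem],
    keep: Callable[[LogItem], bool],
    enabled: bool,
) -> Iterator[LogItem]:
    """Hypothetical stand-in for the private _filter_items_to_log helper."""
    for item in items:
        # Assumption: when the switch is off, every item passes through.
        if not enabled or keep(item):
            yield item


def non_zero(item: LogItem) -> bool:
    # Same predicate as filter_non_zero_values: the logged value must not be 0.
    return item[1] != 0


items = [
    ("Pruning/w/count", 0),
    ("Pruning/w/percent", 0.0),
    ("Pruning/b/count", 3),
]
kept = list(filter_items(items, non_zero, enabled=True))
unfiltered = list(filter_items(items, non_zero, enabled=False))
```

Note that the predicate compares the value with `0`, so both the integer count `0` and the float `0.0` are dropped.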

filter_loggable_items_percentages_only `staticmethod`

filter_loggable_items_percentages_only(
    items_to_log: Generator[Tuple[str, Any], None, None],
    percentage_only: bool = False,
)

Filter the loggable items to only yield the percentages of the loggable items

Parameters:

items_to_log
(Generator[Tuple[str, Any], None, None]) –

A generator that yields the loggable items for this object.
percentage_only
(bool, default: False ) –

If True, only yield the percentages of the loggable items. If False, yield both the counts and percentages. Defaults to False.

Returns:

–

A generator that yields the loggable items for this object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

@staticmethod
def filter_loggable_items_percentages_only(
    items_to_log: Generator[Tuple[str, Any], None, None],
    percentage_only: bool = False,
):
    """
    Filter the loggable items to only yield the percentages of the loggable items

    :param items_to_log: A generator that yields the loggable items for this object.
    :param percentage_only: If True, only yield the percentages of the loggable
        items. If False, yield both the counts and percentages. Defaults to False
    :return: A generator that yields the loggable items for this object.
    """

    def filter_percentage(log):
        # log tag ends with percent
        return log[0].endswith("percent")

    yield from SparsificationInfo._filter_items_to_log(
        items_to_log,
        filter_function=filter_percentage,
        to_filter=percentage_only,
    )
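
The percentage filter matches on the tag suffix rather than on the value, so it relies on the tag naming convention. A small self-contained sketch of that idea, assuming items pass through unchanged when the flag is False (`percentages_only_sketch` is a hypothetical name, not part of the library):

```python
from typing import Any, Iterable, Iterator, Tuple


def percentages_only_sketch(
    items: Iterable[Tuple[str, Any]], percentage_only: bool = False
) -> Iterator[Tuple[str, Any]]:
    """Illustrative re-implementation of the suffix-based filter."""
    for tag, value in items:
        # Keep an item when filtering is off or its tag ends with "percent".
        if not percentage_only or tag.endswith("percent"):
            yield tag, value


items = [
    ("Summary/PrunedParameters/count", 4),
    ("Summary/PrunedParameters/percent", 0.25),
]
only_percent = list(percentages_only_sketch(items, percentage_only=True))
both = list(percentages_only_sketch(items))
```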

from_module `abstractmethod` `classmethod`

from_module(module: Module, **kwargs) -> SparsificationInfo

Factory method to create SparsificationInfo object from a module.

Parameters:

module
(Module) –

The module to create the SparsificationInfo object from.
kwargs
–

Additional arguments to pass to the SparsificationInfo object.

Returns:

SparsificationInfo –

A SparsificationInfo object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

@classmethod
@abstractmethod
def from_module(
    cls,
    module: torch.nn.Module,
    **kwargs,
) -> "SparsificationInfo":
    """
    Factory method to create SparsificationInfo object from a module.

    :param module: The module to create the SparsificationInfo object from.
    :param kwargs: Additional arguments to pass to the SparsificationInfo object.
    :return: A SparsificationInfo object.
    """
    raise NotImplementedError()

loggable_items `abstractmethod`

loggable_items(
    **kwargs,
) -> Generator[
    Tuple[str, Union[Dict[str, int], float, int]],
    None,
    None,
]

Yield the loggable items for SparsificationInfo object.

Returns:

Generator[Tuple[str, Union[Dict[str, int], float, int]], None, None] –

A generator that yields the loggable items for this object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

@abstractmethod
def loggable_items(
    self,
    **kwargs,
) -> Generator[Tuple[str, Union[Dict[str, int], float, int]], None, None]:
    """
    Yield the loggable items for SparsificationInfo object.

    :return: A generator that yields the loggable items for this object.
    """
    raise NotImplementedError()

SparsificationPruning

Bases: SparsificationInfo

A model that contains the pruning information for a torch module.

Methods:

from_module –

Factory method to create a SparsificationPruning object from a module.
loggable_items –

Yield the loggable items for SparsificationPruning object.

from_module `classmethod`

from_module(module: Module) -> SparsificationPruning

Factory method to create a SparsificationPruning object from a module.

Parameters:

module
(Module) –

The module to create the SparsificationPruning object from.

Returns:

SparsificationPruning –

A SparsificationPruning object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

@classmethod
def from_module(cls, module: torch.nn.Module) -> "SparsificationPruning":
    """
    Factory method to create a SparsificationPruning object from a module.

    :param module: The module to create the SparsificationPruning object from.
    :return: A SparsificationPruning object.
    """
    sparse_parameters_count = defaultdict(CountAndPercent)
    for param_name, param in module.named_parameters():
        num_parameters = param.numel()
        num_zero_parameters = param.numel() - param.count_nonzero().item()
        num_parameters = max(1, num_parameters)  # avoid FSDP divide by 0

        zero_count = num_zero_parameters
        zero_count_percent = num_zero_parameters / num_parameters

        sparse_parameters_count[param_name] = CountAndPercent(
            count=zero_count, percent=zero_count_percent
        )

    return cls(sparse_parameters=sparse_parameters_count)
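
The per-parameter arithmetic above can be illustrated without torch. This sketch uses plain Python lists as a stand-in for parameter tensors; `zero_stats` and the sample parameter names are hypothetical. As in the source, the "percent" value is a fraction between 0 and 1, not a value scaled to 100.

```python
from typing import Dict, List, Tuple


def zero_stats(values: List[float]) -> Tuple[int, float]:
    """Return (zero count, zero fraction) for one parameter's values."""
    total = len(values)
    zeros = total - sum(1 for v in values if v != 0)
    # Mirror the source: clamp the denominator so empty parameters do not
    # trigger a division by zero.
    total = max(1, total)
    return zeros, zeros / total


params: Dict[str, List[float]] = {
    "fc.weight": [0.0, 1.5, 0.0, -2.0],
    "fc.bias": [],  # an empty parameter
}
stats = {name: zero_stats(vals) for name, vals in params.items()}
```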

loggable_items

loggable_items(
    percentages_only: bool = False,
    non_zero_only: bool = False,
    **kwargs,
) -> Generator[
    Tuple[str, Union[Dict[str, int], float, int]],
    None,
    None,
]

Yield the loggable items for SparsificationPruning object.

Parameters:

percentages_only
(bool, default: False ) –

If True, only yield the percentages of the loggable items. If False, yield both the counts and percentages. Default is False.
non_zero_only
(bool, default: False ) –

If True, only yield information for non-zero counts/percentages. Default is False.

Returns:

Generator[Tuple[str, Union[Dict[str, int], float, int]], None, None] –

A generator that yields the loggable items for this object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

def loggable_items(
    self,
    percentages_only: bool = False,
    non_zero_only: bool = False,
    **kwargs,
) -> Generator[Tuple[str, Union[Dict[str, int], float, int]], None, None]:
    """
    Yield the loggable items for SparsificationPruning object.

    :param percentages_only: If True, only yield the percentages of the loggable
        items. If False, yield both the counts and percentages. Default is False.
    :param non_zero_only: If True, only yield information for non-zero
        counts/percentages. Default is False.
    :return: A generator that yields the loggable items for this object.
    """
    main_tag = self.__class__.__name__
    items_to_log = []
    for param_name, count_and_percent in self.sparse_parameters.items():
        items_to_log.append(
            (
                f"{main_tag}/SparseParameters/{param_name}/count",
                count_and_percent.count,
            )
        )  # noqa: E501
        items_to_log.append(
            (
                f"{main_tag}/SparseParameters/{param_name}/percent",
                count_and_percent.percent,
            )
        )  # noqa: E501

    items_to_log = SparsificationInfo.filter_loggable_items_percentages_only(
        items_to_log, percentages_only
    )
    items_to_log = SparsificationInfo.filter_loggable_items_non_zero_only(
        items_to_log, non_zero_only
    )

    yield from items_to_log
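
To make the tag layout and the effect of combining both flags concrete, here is a minimal sketch that works on a plain dict of `(count, percent)` pairs instead of `CountAndPercent` objects. `pruning_items` is a hypothetical function, and it assumes each filter passes items through when its flag is False.

```python
from typing import Dict, Iterator, Tuple, Union


def pruning_items(
    sparse_parameters: Dict[str, Tuple[int, float]],
    percentages_only: bool = False,
    non_zero_only: bool = False,
    main_tag: str = "SparsificationPruning",
) -> Iterator[Tuple[str, Union[int, float]]]:
    """Illustrative version of the tag construction and filtering."""
    for name, (count, percent) in sparse_parameters.items():
        entries = [
            (f"{main_tag}/SparseParameters/{name}/count", count),
            (f"{main_tag}/SparseParameters/{name}/percent", percent),
        ]
        for tag, value in entries:
            if percentages_only and not tag.endswith("percent"):
                continue  # drop counts
            if non_zero_only and value == 0:
                continue  # drop zero-valued entries
            yield tag, value


stats = {"fc.weight": (2, 0.5), "fc.bias": (0, 0.0)}
everything = list(pruning_items(stats))
compact = list(pruning_items(stats, percentages_only=True, non_zero_only=True))
```

With both flags set, only the non-zero percent entries remain, which keeps log output small for models with many dense parameters.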

SparsificationQuantization

Bases: SparsificationInfo

A model that contains the quantization information for a torch module.

Methods:

from_module –

Factory method to create a SparsificationQuantization object from a module.
loggable_items –

Yield the loggable items for SparsificationQuantization object.

from_module `classmethod`

from_module(module: Module) -> SparsificationQuantization

Factory method to create a SparsificationQuantization object from a module.

Parameters:

module
(Module) –

The module to create the SparsificationQuantization object from.

Returns:

SparsificationQuantization –

A SparsificationQuantization object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

@classmethod
def from_module(
    cls,
    module: torch.nn.Module,
) -> "SparsificationQuantization":
    """
    Factory method to create a SparsificationQuantization object from a module.

    :param module: The module to create the SparsificationQuantization object from.
    :return: A SparsificationQuantization object.
    """
    operations = get_leaf_operations(module)
    enabled = defaultdict(bool)
    precision = defaultdict(str)
    for op in operations:
        operation_name = op.__class__.__name__
        operation_counter = 0
        # make sure that the operation name is unique
        while enabled.get(operation_name) is not None:
            operation_counter += 1
            operation_name = f"{op.__class__.__name__}_{operation_counter}"

        enabled[operation_name] = is_quantized(op)
        precision[operation_name] = get_precision_information(op)

    return cls(enabled=enabled, precision=precision)
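
The name de-duplication loop above can be tried in isolation. This sketch takes a list of class names rather than real operations; `unique_operation_names` is a hypothetical helper written for illustration.

```python
from typing import Dict, List


def unique_operation_names(class_names: List[str]) -> List[str]:
    """Append _1, _2, ... to repeated class names, as the loop above does."""
    seen: Dict[str, bool] = {}
    result: List[str] = []
    for base in class_names:
        name, counter = base, 0
        # Keep incrementing the suffix until the name is unused.
        while seen.get(name) is not None:
            counter += 1
            name = f"{base}_{counter}"
        seen[name] = True
        result.append(name)
    return result


names = unique_operation_names(["Linear", "ReLU", "Linear", "Linear"])
```

The first occurrence keeps the bare class name; later occurrences receive a numeric suffix starting at 1.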

loggable_items

loggable_items(
    enabled_only: bool = False, **kwargs
) -> Generator[
    Tuple[str, Union[Dict[str, int], float, int]],
    None,
    None,
]

Yield the loggable items for SparsificationQuantization object.

Parameters:

enabled_only
(bool, default: False ) –

If True, only yield loggable items for operations where quantization is enabled. If False, yield irrespective of whether quantization is enabled or not. Defaults to False.

Returns:

Generator[Tuple[str, Union[Dict[str, int], float, int]], None, None] –

A generator that yields the loggable items for this object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

def loggable_items(
    self,
    enabled_only: bool = False,
    **kwargs,
) -> Generator[Tuple[str, Union[Dict[str, int], float, int]], None, None]:
    """
    Yield the loggable items for SparsificationQuantization object.

    :param enabled_only: If True, only yield loggable items for
        operations where quantization is enabled. If False, yield irrespective
        of whether quantization is enabled or not. Defaults to False.
    :return: A generator that yields the loggable items for this object.
    """
    main_tag = self.__class__.__name__
    for operation in self.enabled.keys():
        if enabled_only and not self.enabled[operation]:
            continue

        yield f"{main_tag}/{operation}/enabled", self.enabled[operation]

        precision = self.precision[operation]
        if precision is None:
            yield f"{main_tag}/{operation}/precision", precision
        elif isinstance(precision, int):
            yield f"{main_tag}/{operation}/precision.weights/num_bits", precision
        elif isinstance(precision, BaseModel):
            yield (
                f"{main_tag}/{operation}/precision/weights/num_bits",
                precision.weights.num_bits,
            )  # noqa: E501
            yield (
                f"{main_tag}/{operation}/precision/input_activations/num_bits",
                precision.input_activations.num_bits,
            )  # noqa: E501
        else:
            raise ValueError(
                f"The precision is not a valid type {type(precision)}."
            )
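
The precision branches produce differently shaped tags depending on the precision type. The sketch below reproduces the branching with dataclasses standing in for the pydantic model; `Bits`, `Scheme`, and `precision_items` are hypothetical names. The tag strings are copied from the source listing, including the `precision.weights` spelling used in the integer branch.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Tuple


@dataclass
class Bits:
    num_bits: int


@dataclass
class Scheme:
    """Stand-in for a precision model with weight and activation settings."""
    weights: Bits
    input_activations: Bits


def precision_items(tag: str, precision: Any) -> Iterator[Tuple[str, Any]]:
    if precision is None:
        yield f"{tag}/precision", precision
    elif isinstance(precision, int):
        yield f"{tag}/precision.weights/num_bits", precision
    elif isinstance(precision, Scheme):
        yield f"{tag}/precision/weights/num_bits", precision.weights.num_bits
        yield (
            f"{tag}/precision/input_activations/num_bits",
            precision.input_activations.num_bits,
        )
    else:
        raise ValueError(f"The precision is not a valid type {type(precision)}.")


from_none = list(precision_items("Q/Linear", None))
from_int = list(precision_items("Q/Linear", 8))
from_scheme = list(precision_items("Q/Linear", Scheme(Bits(4), Bits(8))))
```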

SparsificationSummaries

Bases: SparsificationInfo

A model that contains the sparsification summaries for a torch module.

Methods:

from_module –

Factory method to create a SparsificationSummaries object from a module.
loggable_items –

Yield the loggable items for SparsificationSummaries object.

from_module `classmethod`

from_module(
    module=torch.nn.Module,
    pruning_thresholds: Tuple[float, float] = (
        0.05,
        1 - 1e-09,
    ),
) -> SparsificationSummaries

Factory method to create a SparsificationSummaries object from a module.

Parameters:

module
–

The module to create the SparsificationSummaries object from.
pruning_thresholds
(Tuple[float, float], default: (0.05, 1 - 1e-09) ) –

The lower and upper thresholds used to determine whether a parameter is pruned. If its percentage of zero weights is between the lower and upper thresholds, it is considered pruned.

Returns:

SparsificationSummaries –

A SparsificationSummaries object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

@classmethod
def from_module(
    cls,
    module=torch.nn.Module,
    pruning_thresholds: Tuple[float, float] = (0.05, 1 - 1e-9),
) -> "SparsificationSummaries":
    """
    Factory method to create a SparsificationSummaries object from a module.

    :param module: The module to create the SparsificationSummaries object from.
    :param pruning_thresholds: The lower and upper thresholds used to determine
        whether a parameter is pruned. If its percentage of zero weights is between
        the lower and upper thresholds, it is considered pruned.
    :return: A SparsificationSummaries object.
    """
    operations = get_leaf_operations(module)
    num_quantized_ops = sum([is_quantized(op) for op in operations])
    total_num_params = len(list(module.parameters()))

    lower_threshold_pruning = min(pruning_thresholds)
    upper_threshold_pruning = max(pruning_thresholds)
    total_num_params_pruned = 0
    count_parameters = defaultdict(int)

    for param_name, param in module.named_parameters():
        num_parameters = param.numel()
        num_zero_parameters = param.numel() - param.count_nonzero().item()
        num_parameters = max(1, num_parameters)  # avoid FSDP divide by 0

        if (
            lower_threshold_pruning
            <= num_zero_parameters / num_parameters
            <= upper_threshold_pruning
        ):
            total_num_params_pruned += 1

        count_parameters[param_name] = num_parameters

    return cls(
        pruned=CountAndPercent(
            count=total_num_params_pruned,
            percent=total_num_params_pruned / total_num_params,
        ),
        quantized=CountAndPercent(
            count=num_quantized_ops, percent=num_quantized_ops / len(operations)
        ),
        parameter_counts=count_parameters,
        operation_counts=Counter([op.__class__.__name__ for op in operations]),
    )
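
Two details of the pruned count are easy to miss: it counts parameter tensors rather than individual weights, and with the default thresholds a tensor that is entirely zero falls above the upper bound and is not counted. A minimal sketch of the threshold test, using precomputed zero fractions (`count_pruned` and the sample names are hypothetical):

```python
from typing import Dict, Tuple


def count_pruned(
    zero_fractions: Dict[str, float],
    thresholds: Tuple[float, float] = (0.05, 1 - 1e-9),
) -> int:
    """Count parameters whose zero fraction lies within the thresholds."""
    # The order of the two thresholds does not matter, as in the source.
    lower, upper = min(thresholds), max(thresholds)
    return sum(1 for f in zero_fractions.values() if lower <= f <= upper)


fractions = {
    "a.weight": 0.5,   # counted
    "a.bias": 0.0,     # below the lower threshold
    "b.weight": 1.0,   # all zeros, above the upper threshold
    "b.bias": 0.05,    # exactly on the lower threshold, counted
}
pruned = count_pruned(fractions)
pruned_fraction = pruned / len(fractions)
swapped = count_pruned(fractions, thresholds=(1 - 1e-9, 0.05))
```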

loggable_items

loggable_items(
    non_zero_only: bool = False,
    percentages_only: bool = True,
    **kwargs,
) -> Generator[
    Tuple[str, Union[Dict[str, int], float, int]],
    None,
    None,
]

Yield the loggable items for SparsificationSummaries object.

Parameters:

non_zero_only
(bool, default: False ) –

If True, only yield information for non-zero items.
percentages_only
(bool, default: True ) –

If True, only yield the percentages of the loggable items. If False, yield both the counts and percentages. Defaults to True.

Returns:

Generator[Tuple[str, Union[Dict[str, int], float, int]], None, None] –

A generator that yields the loggable items for this object.

Source code in llmcompressor/pytorch/utils/sparsification_info/configs.py

def loggable_items(
    self,
    non_zero_only: bool = False,
    percentages_only: bool = True,
    **kwargs,
) -> Generator[Tuple[str, Union[Dict[str, int], float, int]], None, None]:
    """
    Yield the loggable items for SparsificationSummaries object.

    :param non_zero_only: If True, only yield information for non-zero items.
    :param percentages_only: If True, only yield the percentages of the loggable
        items. If False, yield both the counts and percentages. Defaults to True
    :return: A generator that yields the loggable items for this object.
    """
    main_tag = self.__class__.__name__
    yield f"{main_tag}/OperationCounts", self.operation_counts
    yield f"{main_tag}/ParameterCounts", self.parameter_counts

    items_to_log = (
        (f"{main_tag}/QuantizedOperations/count", self.quantized.count),
        (f"{main_tag}/QuantizedOperations/percent", self.quantized.percent),
        (f"{main_tag}/PrunedParameters/count", self.pruned.count),
        (f"{main_tag}/PrunedParameters/percent", self.pruned.percent),
    )

    items_to_log = SparsificationInfo.filter_loggable_items_percentages_only(
        items_to_log, percentages_only
    )
    items_to_log = SparsificationInfo.filter_loggable_items_non_zero_only(
        items_to_log, non_zero_only
    )

    yield from items_to_log
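
Unlike `SparsificationPruning.loggable_items`, this method defaults `percentages_only` to True, and the two dictionary-valued items are yielded before any filtering. The sketch below illustrates the resulting output with plain tuples in place of `CountAndPercent`; `summary_items` is a hypothetical function, and it assumes each filter passes items through when its flag is False.

```python
from collections import Counter
from typing import Any, Dict, Iterator, Tuple


def summary_items(
    operation_counts: Dict[str, int],
    parameter_counts: Dict[str, int],
    quantized: Tuple[int, float],
    pruned: Tuple[int, float],
    non_zero_only: bool = False,
    percentages_only: bool = True,
    main_tag: str = "SparsificationSummaries",
) -> Iterator[Tuple[str, Any]]:
    # These two items are always yielded; the filters do not apply to them.
    yield f"{main_tag}/OperationCounts", operation_counts
    yield f"{main_tag}/ParameterCounts", parameter_counts

    scalars = (
        (f"{main_tag}/QuantizedOperations/count", quantized[0]),
        (f"{main_tag}/QuantizedOperations/percent", quantized[1]),
        (f"{main_tag}/PrunedParameters/count", pruned[0]),
        (f"{main_tag}/PrunedParameters/percent", pruned[1]),
    )
    for tag, value in scalars:
        if percentages_only and not tag.endswith("percent"):
            continue
        if non_zero_only and value == 0:
            continue
        yield tag, value


ops = Counter(["Linear", "Linear", "ReLU"])
params = {"fc.weight": 4, "fc.bias": 1}
default_tags = [
    tag for tag, _ in summary_items(ops, params, quantized=(0, 0.0), pruned=(1, 0.5))
]
```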
