llmcompressor.pytorch.utils.sparsification_info.helpers
Functions:
- get_leaf_operations – Get the leaf operations in the model
- get_precision_information – Get the information about the precision of the operation.
- is_quantized – Check whether the operation is quantized (contains a quantization scheme)
get_leaf_operations
get_leaf_operations(
model: Module,
operations_to_skip: Optional[List[Module]] = None,
operations_to_unwrap: Optional[List[Module]] = None,
) -> List[torch.nn.Module]
Get the leaf operations in the model (those that do not have operations as children)
Parameters:
- model (Module) – the model to get the leaf operations from
- operations_to_skip (Optional[List[Module]], default: None) – a list of leaf operations to omit from the result. If None is passed, the Identity operation is skipped by default
- operations_to_unwrap (Optional[List[Module]], default: None) – a list of operations to unwrap when collecting the leaf operations. Unwrapping means that the module(s) wrapped by the operation (i.e. the operation's module attribute) are added directly to the list of leaf operations. If None is passed, the QuantWrapper operation is unwrapped by default
Returns:
- List[Module] – a list of the leaf operations
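The skip and unwrap behaviour described above can be illustrated with a minimal sketch. It is not the library's implementation: `leaf_operations_sketch` and the classes `Op`, `Linear`, `Identity` and `Wrapper` are hypothetical stand-ins, used so that the example runs without PyTorch. The function is duck-typed and only relies on a `children()` method and, for wrappers, a `module` attribute.

```python
from typing import Any, Iterator, List, Tuple


class Op:
    """Hypothetical stand-in for torch.nn.Module: it only exposes children()."""

    def __init__(self, *children: "Op") -> None:
        self._children = list(children)

    def children(self) -> Iterator["Op"]:
        return iter(self._children)


class Linear(Op):
    """Stand-in for an ordinary leaf operation."""


class Identity(Op):
    """Stand-in for an operation that is skipped by default."""


class Wrapper(Op):
    """Stand-in for a wrapper that keeps the wrapped operation in `.module`."""

    def __init__(self, module: Op) -> None:
        super().__init__(module)
        self.module = module


def leaf_operations_sketch(
    model: Any,
    skip_types: Tuple[type, ...] = (Identity,),
    unwrap_types: Tuple[type, ...] = (Wrapper,),
) -> List[Any]:
    """Collect operations that have no children, honouring skip and unwrap."""
    leaves: List[Any] = []
    for child in model.children():
        if isinstance(child, unwrap_types):
            # Unwrapping: add the wrapped module directly, do not descend further.
            leaves.append(child.module)
        elif not list(child.children()):
            # No children of its own, so this is a leaf operation.
            leaves.append(child)
        else:
            leaves.extend(leaf_operations_sketch(child, skip_types, unwrap_types))
    # Drop the operation types the caller asked to skip.
    return [op for op in leaves if not isinstance(op, skip_types)]


fc1, fc2, act = Linear(), Linear(), Identity()
model = Op(Op(fc1, act), Wrapper(fc2))
leaves = leaf_operations_sketch(model)  # contains fc1 and fc2; act is skipped
```

In this sketch a model without any children yields an empty list; whether the real function behaves the same way in that case is not stated in this reference.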
Source code in llmcompressor/pytorch/utils/sparsification_info/helpers.py
get_precision_information
Get the information about the precision of the operation.
1) If the operation is quantized, returns the quantization scheme of the operation.
2) If the operation is not quantized, returns the number of bits of the operation's weights.
3) If the operation is not quantized and does not have a weight, returns None.
Parameters:
- operation (Module) – the operation to get the quantization scheme from
Returns:
- Union[None, int, QuantizationScheme] – the quantization scheme of the operation, the number of bits of the operation's weights, or None if the operation is not quantized and does not have a weight
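The three cases can be read as a short decision chain. The sketch below is an illustration under stated assumptions, not the library's code: it assumes the scheme lives on an attribute named `quantization_scheme` (this reference does not say where it is stored), and it uses a standard-library `array` as a stand-in weight so that the bit width can be derived without PyTorch. `FakeOp` and both `*_sketch` functions are hypothetical names.

```python
from array import array
from typing import Any, Union


def is_quantized_sketch(operation: Any) -> bool:
    # Assumption: a quantized operation carries an attribute named
    # `quantization_scheme`; the attribute name is not confirmed by this page.
    return hasattr(operation, "quantization_scheme")


def precision_information_sketch(operation: Any) -> Union[None, int, Any]:
    # Case 1: quantized, so return the scheme itself.
    if is_quantized_sketch(operation):
        return operation.quantization_scheme
    # Case 2: not quantized but has a weight, so return its width in bits.
    weight = getattr(operation, "weight", None)
    if weight is not None:
        # Stand-in logic: itemsize is in bytes. With PyTorch the width would
        # be derived from the weight's dtype instead.
        return weight.itemsize * 8
    # Case 3: neither a scheme nor a weight.
    return None


class FakeOp:
    """Hypothetical stand-in for an operation; attributes are set per instance."""


quantized, dense, weightless = FakeOp(), FakeOp(), FakeOp()
quantized.quantization_scheme = {"weights": "int8"}  # placeholder scheme object
dense.weight = array("f", [0.1, 0.2])  # C float storage, 4 bytes per item

results = [precision_information_sketch(op) for op in (quantized, dense, weightless)]
```

Checking for the scheme first matters: a quantized operation usually still has a weight, so testing for the weight first would report a bit width where the scheme is expected.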
Source code in llmcompressor/pytorch/utils/sparsification_info/helpers.py
is_quantized
Check whether the operation is quantized (contains a quantization scheme)