llmcompressor.pytorch.utils
Generic code used as utilities and helpers for PyTorch
Modules:

- helpers – Utility / helper functions
- sparsification – Helper functions for retrieving information related to model sparsification
- sparsification_info
Classes:

- ModuleSparsificationInfo – Helper class for providing information related to torch Module parameters
Functions:

- get_linear_layers – Get all of the linear layers within a module
- get_quantized_layers – Get the quantized layers from a module
- set_deterministic_seeds – Manually seeds the numpy, random, and torch packages.
- tensor_sparsity – Calculate the sparsity of a tensor
- tensors_module_forward – Default function for calling into a model with data for a forward execution.
- tensors_to_device – Default function for putting a tensor or collection of tensors onto the proper device.
- tensors_to_precision – Change the precision of a tensor or collection of tensors
ModuleSparsificationInfo

Helper class for providing information related to torch Module parameters and the amount of sparsification applied. Includes information for pruning and quantization.

Parameters:

- module (Module) – torch Module to analyze
- state_dict (Optional[Dict[str, Tensor]], default: None) – optional state_dict to analyze in place of the torch model. This is used when analyzing an FSDP model, where the full weights may not be accessible.
Attributes:

- params_quantized (int) – number of parameters across quantized layers
- params_quantized_percent (float) – percentage of parameters that have been quantized
- params_sparse (int) – total number of sparse (zero) trainable parameters in the model
- params_sparse_percent (float) – percent of sparsified parameters in the entire model
- params_total (int) – total number of trainable parameters in the model
Source code in llmcompressor/pytorch/utils/sparsification.py
params_quantized property

Returns:

- int – number of parameters across quantized layers

params_quantized_percent property

Returns:

- float – percentage of parameters that have been quantized

params_sparse property

Returns:

- int – total number of sparse (zero) trainable parameters in the model

params_sparse_percent property

Returns:

- float – percent of sparsified parameters in the entire model
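The sparsity properties above amount to counting zeroed trainable parameters. A minimal sketch of that computation with plain torch, not the class's actual implementation:

```python
import torch

# toy model: one 4x4 linear layer, no bias -> 16 trainable parameters
model = torch.nn.Linear(4, 4, bias=False)
with torch.no_grad():
    model.weight.fill_(1.0)
    model.weight[:2].zero_()  # "prune" half of the rows to exact zeros

# mirrors params_total / params_sparse / params_sparse_percent
params_total = sum(p.numel() for p in model.parameters() if p.requires_grad)
params_sparse = sum(int((p == 0).sum()) for p in model.parameters() if p.requires_grad)
params_sparse_percent = 100.0 * params_sparse / params_total
```

Here 8 of the 16 weights are zero, so the sparsity percent comes out to 50.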
get_linear_layers

Parameters:

- module (Module) – the module to grab all linear layers for

Returns:

- Dict[str, Module] – a dictionary of all linear layers in the module, keyed by layer name
Source code in llmcompressor/pytorch/utils/helpers.py
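Collecting linear layers by name can be sketched with `Module.named_modules`; this is a plausible equivalent, not necessarily the helper's exact implementation:

```python
import torch

def get_linear_layers(module: torch.nn.Module) -> dict:
    # map fully-qualified submodule names to every nn.Linear in the module
    return {
        name: mod
        for name, mod in module.named_modules()
        if isinstance(mod, torch.nn.Linear)
    }

model = torch.nn.Sequential(
    torch.nn.Linear(4, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 2),
)
layers = get_linear_layers(model)  # keys are the submodule names "0" and "2"
```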
get_quantized_layers

Parameters:

- module (Module) – the module to get the quantized layers from

Returns:

- List[Tuple[str, Module]] – a list containing the names and modules of the quantized layers (Embedding, Linear, Conv2d, Conv3d)
Source code in llmcompressor/pytorch/utils/helpers.py
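A rough sketch of the shape of this helper. The `quantization_scheme` marker attribute used below is a hypothetical stand-in for however the library actually tags quantized layers:

```python
import torch

# the layer types the docstring above names as quantizable
QUANTIZABLE = (torch.nn.Embedding, torch.nn.Linear, torch.nn.Conv2d, torch.nn.Conv3d)

def get_quantized_layers(module: torch.nn.Module) -> list:
    # hypothetical check: treat a layer as quantized when a
    # `quantization_scheme` attribute has been attached to it
    return [
        (name, mod)
        for name, mod in module.named_modules()
        if isinstance(mod, QUANTIZABLE) and hasattr(mod, "quantization_scheme")
    ]

model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 2))
model[0].quantization_scheme = "w8a8"  # pretend layer "0" was quantized
names = [name for name, _ in get_quantized_layers(model)]
```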
set_deterministic_seeds

Manually seeds the numpy, random, and torch packages. Also sets torch.backends.cudnn.deterministic to True.

Parameters:

- seed (int, default: 0) – the manual seed to use
Source code in llmcompressor/pytorch/utils/helpers.py
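The seeding described above can be sketched as follows; calling it twice with the same seed makes random draws repeatable:

```python
import random

import numpy
import torch

def set_deterministic_seeds(seed: int = 0) -> None:
    # seed every RNG the stack uses so runs are repeatable
    random.seed(seed)
    numpy.random.seed(seed)
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True

set_deterministic_seeds(0)
a = torch.rand(3)
set_deterministic_seeds(0)
b = torch.rand(3)  # identical to a, since the RNG state was reset
```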
tensor_sparsity

    tensor_sparsity(
        tens: Tensor,
        dim: Union[None, int, List[int], Tuple[int, ...]] = None,
    ) -> Tensor

Parameters:

- tens (Tensor) – the tensor to calculate the sparsity for
- dim (Union[None, int, List[int], Tuple[int, ...]], default: None) – the dimension(s) to split the calculations over; e.g., can split over batch, channels, or combinations of dimensions

Returns:

- Tensor – the sparsity of the input tens, i.e. the fraction of numbers that are zero
Source code in llmcompressor/pytorch/utils/helpers.py
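The computation is the fraction of exact zeros, optionally kept separate along the requested dimension(s). A minimal sketch (the real helper may order result dimensions differently for multi-dim splits):

```python
import torch

def tensor_sparsity(tens, dim=None):
    # fraction of elements that are exactly zero
    if dim is None:
        return (tens == 0).float().mean()
    # keep the requested dim(s); average over all the others
    dims = (dim,) if isinstance(dim, int) else tuple(dim)
    reduce_dims = [d for d in range(tens.dim()) if d not in dims]
    return (tens == 0).float().mean(dim=reduce_dims)

t = torch.zeros(2, 4)
t[0, 0] = 1.0
overall = tensor_sparsity(t)          # 7 of 8 elements are zero
per_row = tensor_sparsity(t, dim=0)   # sparsity of each row separately
```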
tensors_module_forward

    tensors_module_forward(
        tensors: Union[Tensor, Iterable[Tensor], Mapping[Any, Tensor]],
        module: Module,
        check_feat_lab_inp: bool = True,
    ) -> Any

Default function for calling into a model with data for a forward execution. Returns the model result. Note: if an iterable is given, the features to pass into the model are expected at index 0 and the remaining indices are treated as labels.

Supported use cases: single tensor; iterable with the first tensor taken as the features to pass into the model.

Parameters:

- tensors (Union[Tensor, Iterable[Tensor], Mapping[Any, Tensor]]) – the data to be passed into the model; if an iterable, the features are expected at index 0 and the remaining indices are treated as labels
- module (Module) – the module to pass the data into
- check_feat_lab_inp (bool, default: True) – True to check whether the incoming tensors look like a (features, labels) pair, i.e. a tuple or list with 2 items (the typical output from a data loader), and call into the model with just the first element, assuming it holds the features; False to skip the check

Returns:

- Any – the result of calling into the model for a forward pass
Source code in llmcompressor/pytorch/utils/helpers.py
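The dispatch described above can be sketched as a small type switch; this is an illustrative equivalent, not the library's exact implementation:

```python
import torch
from torch.nn import Module

def tensors_module_forward(tensors, module: Module, check_feat_lab_inp: bool = True):
    # a (features, labels) pair from a DataLoader: forward only the features
    if check_feat_lab_inp and isinstance(tensors, (tuple, list)) and len(tensors) == 2:
        return tensors_module_forward(tensors[0], module, check_feat_lab_inp=False)
    if isinstance(tensors, torch.Tensor):
        return module(tensors)
    if isinstance(tensors, dict):
        return module(**tensors)   # mapping: unpack as keyword arguments
    if isinstance(tensors, (tuple, list)):
        return module(*tensors)    # iterable: unpack as positional arguments
    raise ValueError(f"unrecognized type for tensors: {type(tensors)}")

model = torch.nn.Identity()
batch = (torch.ones(2), torch.zeros(2))     # (features, labels)
out = tensors_module_forward(batch, model)  # only the features are forwarded
```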
tensors_to_device

    tensors_to_device(
        tensors: Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]],
        device: str,
    ) -> Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]

Default function for putting a tensor or collection of tensors onto the proper device. Returns the tensor references after being placed on the proper device.

Supported use cases:

- single tensor
- Dictionary of single tensors
- Dictionary of iterable of tensors
- Dictionary of dictionary of tensors
- Iterable of single tensors
- Iterable of iterable of tensors
- Iterable of dictionary of tensors

Parameters:

- tensors (Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]) – the tensors or collection of tensors to put onto a device
- device (str) – the string representing the device to put the tensors on, e.g. 'cpu', 'cuda', 'cuda:1'

Returns:

- Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]] – the tensors or collection of tensors after being placed on the device
Source code in llmcompressor/pytorch/utils/helpers.py
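The nested use cases listed above suggest a simple recursion over dicts and iterables; a minimal sketch, not necessarily the helper's exact implementation:

```python
import torch

def tensors_to_device(tensors, device: str):
    # recurse through dicts and iterables, moving each tensor to `device`
    if isinstance(tensors, torch.Tensor):
        return tensors.to(device)
    if isinstance(tensors, dict):
        return {k: tensors_to_device(v, device) for k, v in tensors.items()}
    if isinstance(tensors, tuple):
        return tuple(tensors_to_device(t, device) for t in tensors)
    if isinstance(tensors, list):
        return [tensors_to_device(t, device) for t in tensors]
    return tensors  # non-tensor leaves pass through unchanged

batch = {"x": torch.ones(2), "y": [torch.zeros(1)]}
moved = tensors_to_device(batch, "cpu")
```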
tensors_to_precision

    tensors_to_precision(
        tensors: Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]],
        full_precision: bool,
    ) -> Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]

Parameters:

- tensors (Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]) – the tensors to change the precision of
- full_precision (bool) – True for full precision (float32), False for half precision (float16)

Returns:

- Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]] – the tensors converted to the desired precision
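This follows the same recursive pattern as the device helper, with the float32/float16 choice applied at each tensor; an illustrative sketch, not the library's exact implementation:

```python
import torch

def tensors_to_precision(tensors, full_precision: bool):
    # float32 when full_precision is True, float16 otherwise
    if isinstance(tensors, torch.Tensor):
        return tensors.float() if full_precision else tensors.half()
    if isinstance(tensors, dict):
        return {k: tensors_to_precision(v, full_precision) for k, v in tensors.items()}
    if isinstance(tensors, (tuple, list)):
        vals = [tensors_to_precision(t, full_precision) for t in tensors]
        return tuple(vals) if isinstance(tensors, tuple) else vals
    return tensors

halved = tensors_to_precision([torch.ones(2), {"a": torch.zeros(3)}], full_precision=False)
```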