llmcompressor.pipelines.layer_sequential.helpers
Functions:

- capture_first_layer_intermediates – Captures the intermediate activations directly before the first model layer.
- match_modules – Find all submodules which match the target_names and sort them by name.
- maybe_inject_pos_embeddings – As of https://github.com/huggingface/transformers/pull/34858, positional embeddings must be passed into each decoder call as kwargs.
- to_next_layer_kwargs – Convert a list of arguments to a dictionary of keyword arguments which match the next layer's function signature.
EarlyStopException dataclass

Bases: Exception

Dataclass for storing model activations.

Note: Attribute names args and kwargs are reserved for dataclass.GenericAlias.
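For intuition, here is a minimal sketch of what an exception-dataclass of this kind can look like; the private field names _args and _kwargs are assumptions chosen to sidestep the reserved args and kwargs attributes, not necessarily the library's actual field names.

```python
# Minimal sketch, not the library's implementation. Field names _args/_kwargs
# are assumptions that avoid the reserved `args`/`kwargs` attributes.
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class EarlyStopSketch(Exception):
    _args: Tuple[Any, ...]     # captured positional inputs to the first layer
    _kwargs: Dict[str, Any]    # captured keyword inputs to the first layer
```

A forward pre-hook on the first layer could raise such an exception to halt execution once the layer's inputs have been captured.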
capture_first_layer_intermediates

capture_first_layer_intermediates(
    model: Module,
    first_layer: Module,
    dataloader: DataLoader,
    model_device: device = torch.device("cpu"),
    mask_padding: bool = True,
) -> IntermediatesCache
Captures the intermediate activations directly before the first model layer. This is meant to capture any model preprocessing that runs before the model layers are executed.

Note that if any modules are compressed prior to the execution of the first layer, the compression error induced by compressing those modules will not be propagated to subsequent activations, as it would be for modules which are compressed within a layer.
Parameters:

- model (Module) – model containing layers
- first_layer (Module) – the first layer of the model
- dataloader (DataLoader) – dataloader of calibration inputs
- mask_padding (bool, default: True) – zero out padding tokens if True. This affects modifiers such as GPTQ and SparseGPT
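A hedged usage sketch follows; model and dataloader are assumed to already exist (for example, a Hugging Face causal LM and a calibration DataLoader), and the "LlamaDecoderLayer" target name is a placeholder, not a value prescribed by the library.

```python
# Usage sketch under the assumptions above; `model` and `dataloader` are
# placeholders assumed to be defined elsewhere.
import torch
from llmcompressor.pipelines.layer_sequential.helpers import (
    capture_first_layer_intermediates,
    match_modules,
)

layers = match_modules(model, ["LlamaDecoderLayer"])  # decoder layers, sorted by name
intermediates = capture_first_layer_intermediates(
    model,
    layers[0],                        # the first decoder layer
    dataloader,                       # calibration inputs
    model_device=torch.device("cpu"),
    mask_padding=True,                # zero out padding tokens (GPTQ/SparseGPT)
)
```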
match_modules

match_modules(
    model: Module,
    target_names: List[str],
) -> List[Module]

Find all submodules which match the target_names and sort them by name.
Parameters:

- model (Module) – model to search for submodules in
- target_names (List[str]) – patterns of submodule names to match

Returns:

- List[Module] – list of submodules
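For intuition only, an illustrative re-implementation sketch, not the library's code; it assumes a target may be either a class name or a regex over the submodule's dotted name.

```python
# Illustrative sketch, not the library's implementation.
import re
from typing import List

from torch.nn import Module


def match_modules_sketch(model: Module, target_names: List[str]) -> List[Module]:
    matched = [
        (name, module)
        for name, module in model.named_modules()
        if any(
            module.__class__.__name__ == target or re.search(target, name)
            for target in target_names
        )
    ]
    # sort by dotted submodule name for a stable, deterministic ordering
    return [module for name, module in sorted(matched, key=lambda pair: pair[0])]
```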
maybe_inject_pos_embeddings

maybe_inject_pos_embeddings(
    output: Dict[str, Any],
    next_layer: Module,
    inputs: Dict[str, Any],
) -> Dict[str, Any]

As of https://github.com/huggingface/transformers/pull/34858, positional embeddings must be passed into each decoder call as kwargs.
Parameters:

- output (Dict[str, Any]) – output of the previous layer
- next_layer (Module) – next layer to call
- inputs (Dict[str, Any]) – inputs to next layer
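An illustrative sketch of the idea, not the library's code: if the next layer's forward signature accepts position_embeddings and the current inputs carry them, copy them into the outputs that will be passed forward.

```python
# Illustrative sketch, not the library's implementation.
import inspect
from typing import Any, Dict

from torch.nn import Module


def maybe_inject_pos_embeddings_sketch(
    output: Dict[str, Any], next_layer: Module, inputs: Dict[str, Any]
) -> Dict[str, Any]:
    accepts = inspect.signature(next_layer.forward).parameters
    if "position_embeddings" in accepts and "position_embeddings" in inputs:
        # rotary embeddings are computed once per model forward and must be
        # re-passed to each decoder layer (see transformers PR #34858)
        output["position_embeddings"] = inputs["position_embeddings"]
    return output
```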
to_next_layer_kwargs

to_next_layer_kwargs(
    args: Tuple[Any, ...],
    next_layer: Module,
) -> Dict[str, Any]

Convert a list of arguments to a dictionary of keyword arguments which match the next layer's function signature.
Parameters:

- args (Tuple[Any, ...]) – list of argument values
- next_layer (Module) – the next layer whose function signature must be matched

Returns:

- Dict[str, Any] – dictionary mapping function signature keywords to argument values
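An illustrative sketch of the argument-to-keyword conversion, not the library's code; it assumes the positional values line up one-to-one with the leading parameters of the next layer's forward signature.

```python
# Illustrative sketch, not the library's implementation.
import inspect
from typing import Any, Dict, Tuple

from torch.nn import Module


def to_next_layer_kwargs_sketch(
    args: Tuple[Any, ...], next_layer: Module
) -> Dict[str, Any]:
    # pair each positional value with the corresponding parameter name from
    # the next layer's forward signature
    param_names = list(inspect.signature(next_layer.forward).parameters)
    return dict(zip(param_names, args))
```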