llmcompressor.pipelines.layer_sequential.helpers
Functions:

- capture_first_layer_intermediates – Captures the intermediate activations directly before the first model layer.
- match_modules – Find all submodules which match the target_names and sort them by name.
- maybe_inject_pos_embeddings – As of https://github.com/huggingface/transformers/pull/34858, positional embeddings must be passed into each decoder call as kwargs.
- to_next_layer_kwargs – Convert a list of arguments to a dictionary of keyword arguments which match the next layer's function signature.
EarlyStopException dataclass

Bases: Exception

Dataclass for storing model activations.

Note: the attribute names args and kwargs are reserved for dataclass.GenericAlias.
capture_first_layer_intermediates
capture_first_layer_intermediates(
model: Module,
first_layer: Module,
dataloader: DataLoader,
model_device: device = torch.device("cpu"),
mask_padding: bool = True,
) -> IntermediatesCache
Captures the intermediate activations directly before the first model layer. This is meant to capture any model preprocessing that runs before the model layers are executed.
Note that if any modules are compressed prior to the execution of the first layer, the compression error induced by compressing those modules will not be propagated to subsequent activations, as it would be for modules which are compressed within a layer.
Parameters:

- model (Module) – model containing layers
- first_layer (Module) – the first layer of the model
- dataloader (DataLoader) – dataloader of calibration inputs
- mask_padding (bool, default: True) – zero out padding tokens if True. This affects modifiers such as GPTQ and SparseGPT
Source code in llmcompressor/pipelines/layer_sequential/helpers.py
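The capture pattern behind this helper can be illustrated without torch or llmcompressor: temporarily replace the first layer with a callable that raises an exception carrying its arguments, so the forward pass stops immediately after the model's preprocessing. The sketch below is hypothetical (TinyModel, EarlyStop, and capture_inputs are illustrative names, not llmcompressor APIs):

```python
class EarlyStop(Exception):
    """Carries the captured first-layer inputs out of the forward pass."""
    def __init__(self, args, kwargs):
        self.args = args
        self.kwargs = kwargs

class Layer:
    def __call__(self, x, scale=1):
        return [v * scale for v in x]

class TinyModel:
    def __init__(self):
        self.embed = lambda tokens: [t + 0.5 for t in tokens]  # preprocessing
        self.layer0 = Layer()
        self.layer1 = Layer()

    def forward(self, tokens):
        hidden = self.embed(tokens)
        hidden = self.layer0(hidden, scale=2)
        return self.layer1(hidden)

def capture_inputs(model, batches):
    """Run each batch, trapping the inputs to layer0 before it executes."""
    captured = []
    original = model.layer0

    def hooked(*args, **kwargs):
        # Stop the forward pass before the first layer actually runs
        raise EarlyStop(args, kwargs)

    model.layer0 = hooked
    try:
        for batch in batches:
            try:
                model.forward(batch)
            except EarlyStop as exc:
                captured.append((exc.args, exc.kwargs))
    finally:
        model.layer0 = original  # always restore the real layer
    return captured

cache = capture_inputs(TinyModel(), [[1, 2], [3]])
# cache → [(([1.5, 2.5],), {'scale': 2}), (([3.5],), {'scale': 2})]
```

Only the embedding/preprocessing step runs for each batch, so the captured activations are exactly what the first layer would have received.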
match_modules
Find all submodules which match the target_names and sort them by name
Parameters:

- model (Module) – model to search for submodules in
- target_names (List[str]) – patterns of submodule names to match

Returns:

- List[Module] – list of submodules
Source code in llmcompressor/pipelines/layer_sequential/helpers.py
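A minimal sketch of name-based matching over a model's named submodules, assuming the patterns are regular expressions tested against dotted module names (the exact pattern semantics in llmcompressor may differ, and the real helper operates on a torch Module rather than a dict):

```python
import re

def match_modules(named_modules, target_names):
    """Return (name, module) pairs whose name matches any pattern, sorted by name."""
    matched = [
        (name, module)
        for name, module in named_modules.items()
        if any(re.search(pattern, name) for pattern in target_names)
    ]
    return sorted(matched)

# Stand-in for model.named_modules(): dotted names mapped to submodules
modules = {
    "model.layers.1.mlp": "mlp1",
    "model.layers.0.attn": "attn0",
    "model.layers.0.mlp": "mlp0",
    "model.embed_tokens": "embed",
}
hits = match_modules(modules, [r"\.mlp$"])
# hits → [("model.layers.0.mlp", "mlp0"), ("model.layers.1.mlp", "mlp1")]
```

Sorting by name gives a deterministic layer order, which matters when layers are compressed sequentially.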
maybe_inject_pos_embeddings
maybe_inject_pos_embeddings(
output: Dict[str, Any],
next_layer: Module,
inputs: Dict[str, Any],
) -> Dict[str, Any]
As of https://github.com/huggingface/transformers/pull/34858, positional embeddings must be passed into each decoder call as kwargs
Parameters:

- output (Dict[str, Any]) – output of the previous layer
- next_layer (Module) – next layer to call
- inputs (Dict[str, Any]) – inputs to next layer
Source code in llmcompressor/pipelines/layer_sequential/helpers.py
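The injection logic can be sketched as: if the next layer declares a position_embeddings keyword and the previous layer's output lacks it, copy it forward from the inputs. This is an assumption-laden sketch (the key name and the signature check mirror the transformers convention; llmcompressor's real checks may be stricter):

```python
import inspect

def maybe_inject_pos_embeddings(output, next_layer, inputs):
    """Copy position_embeddings from inputs into output when the next layer needs it."""
    params = inspect.signature(next_layer).parameters
    if (
        "position_embeddings" in params
        and "position_embeddings" not in output
        and "position_embeddings" in inputs
    ):
        output["position_embeddings"] = inputs["position_embeddings"]
    return output

# Hypothetical decoder layer that takes position embeddings as a kwarg
def decoder_layer(hidden_states, position_embeddings=None):
    return hidden_states

out = maybe_inject_pos_embeddings(
    {"hidden_states": [0.1]},                                        # previous layer's output
    decoder_layer,
    {"hidden_states": [0.0], "position_embeddings": ("cos", "sin")}, # original inputs
)
# out → {"hidden_states": [0.1], "position_embeddings": ("cos", "sin")}
```

Without this step, decoder layers that receive rotary embeddings as kwargs would be called with them missing, since layer outputs do not include them.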
to_next_layer_kwargs
Convert a list of arguments to a dictionary of keyword arguments which match the next layer's function signature
Parameters:

- args (Tuple[Any, ...]) – list of argument values
- next_layer (Module) – the next layer whose function signature must be matched

Returns:

- Dict[str, Any] – dictionary mapping function signature keywords to argument values
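The conversion amounts to zipping positional values onto the callee's parameter names, in order. A standalone sketch using inspect (the helper name mirrors the documented function, but this version inspects any callable rather than a torch Module's forward):

```python
import inspect

def to_next_layer_kwargs(args, next_layer):
    """Map positional argument values onto the next layer's parameter names."""
    names = list(inspect.signature(next_layer).parameters)
    return dict(zip(names, args))

# Hypothetical decoder layer signature
def decoder_layer(hidden_states, attention_mask=None, position_ids=None):
    return hidden_states

kwargs = to_next_layer_kwargs(([1.0, 2.0], None), decoder_layer)
# kwargs → {"hidden_states": [1.0, 2.0], "attention_mask": None}
```

Passing everything by keyword makes the pipeline robust to layers whose forward signatures order their parameters differently.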