llmcompressor.pipelines.sequential.helpers

Classes:

- Subgraph – Dataclass specifying an executable subgraph of a model graph

Functions:

- dispatch_for_sequential – Dispatch a model for sequential calibration using a sequential pipeline
- get_sequential_targets – Infer sequential targets from modifiers list and dataset args
- trace_subgraphs – Trace a model to produce subgraphs, where each sequential target belongs to exactly one subgraph
SequentialTracer

Bases: HFTracer

Get a tracer specialized for the given model. The resulting tracer will not trace inside of sequential targets, nor inside any modules which are not call graph ancestors of sequential targets.

Tracing within sequential targets is unnecessary, and tracing within offloaded modules may result in meta tensors being added to the model graph.

Parameters:

- ancestors (Set[Module]) – modules which are ancestors of sequential targets
- offloaded (Set[Module]) – modules which have offloaded parameters and should not be traced

Source code in llmcompressor/pipelines/sequential/helpers.py
Subgraph dataclass

Subgraph(
    graph: Graph,
    input_names: Set[str],
    consumed_names: Set[str],
    _code: Optional[PythonCode] = None,
)

Dataclass specifying an executable subgraph of a model graph

Parameters:

- graph (Graph) – subgraph of the model graph
- input_names (Set[str]) – argument names of the compiled forward function
- consumed_names (Set[str]) – argument names which are not used by any subsequent subgraphs and can therefore be deleted from the intermediates cache

Methods:

- forward – Execute the operations within the subgraph

forward

Execute the operations within the subgraph

Parameters:

- *args – argument inputs to the subgraph forward function
- **kwargs – keyword inputs to the subgraph forward function

Returns:

- Dict[str, Any] – dictionary mapping output node names to their computed values

Source code in llmcompressor/pipelines/sequential/helpers.py
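The execution model of Subgraph.forward can be illustrated with a toy stand-in. ToySubgraph and its string code attribute are hypothetical simplifications; the real Subgraph compiles a torch.fx Graph into a PythonCode object rather than executing a source string:

```python
from dataclasses import dataclass
from typing import Any, Dict, Set

@dataclass
class ToySubgraph:
    # hypothetical stand-in: the real class holds a torch.fx Graph + PythonCode
    code: str                 # source of the compiled forward function
    input_names: Set[str]     # names this subgraph reads from the cache
    consumed_names: Set[str]  # names no later subgraph will read

    def forward(self, **kwargs: Any) -> Dict[str, Any]:
        # execute the compiled source with the inputs bound as local names,
        # returning a dict of output names to computed values
        namespace: Dict[str, Any] = dict(kwargs)
        exec(self.code, {}, namespace)
        return namespace["outputs"]

sub = ToySubgraph(
    code="outputs = {'hidden': x + 1}",
    input_names={"x"},
    consumed_names={"x"},
)
result = sub.forward(x=41)  # {'hidden': 42}
```

Returning a name-to-value dictionary (rather than a positional tuple) is what lets later subgraphs pull their inputs from a shared intermediates cache by name.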
dispatch_for_sequential

Dispatch a model for sequential calibration using a sequential pipeline. The model will be offloaded to the CPU and dispatched to a CUDA/XPU device if available. Removes any existing hooks.

Parameters:

- model (PreTrainedModel) – model to dispatch

Returns:

- PreTrainedModel – dispatched model

Source code in llmcompressor/pipelines/sequential/helpers.py
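The device-selection policy described above can be sketched as follows. pick_execution_device is a hypothetical helper, not part of the library; the real function performs accelerate-style CPU offloading on top of this choice:

```python
def pick_execution_device(cuda_available: bool, xpu_available: bool) -> str:
    """Prefer CUDA, then XPU, falling back to CPU (toy policy sketch)."""
    if cuda_available:
        return "cuda"
    if xpu_available:
        return "xpu"
    return "cpu"

device = pick_execution_device(cuda_available=False, xpu_available=True)  # "xpu"
```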
find_target_nodes

Find all nodes whose execution is equivalent to executing the target modules. Note that these nodes are guaranteed to be treated as leaf nodes by SequentialTracer.

Parameters:

- graph (GraphModule) – graph containing target nodes
- targets (Set[Module]) – modules whose nodes are being searched for

Returns:

- Set[Node] – set of all nodes which call the target modules

Source code in llmcompressor/pipelines/sequential/helpers.py
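A minimal sketch of this lookup on a toy graph. ToyNode is hypothetical; the real function inspects the call_module nodes of a torch.fx GraphModule and compares the modules they resolve to:

```python
from typing import Iterable, NamedTuple, Set

class ToyNode(NamedTuple):
    op: str       # e.g. "placeholder", "call_module", "output"
    target: str   # for call_module nodes, the submodule's dotted name

def find_target_nodes(nodes: Iterable[ToyNode], target_names: Set[str]) -> Set[ToyNode]:
    # keep only the nodes that invoke one of the target modules
    return {n for n in nodes if n.op == "call_module" and n.target in target_names}

nodes = [
    ToyNode("placeholder", "x"),
    ToyNode("call_module", "model.layers.0"),
    ToyNode("call_module", "lm_head"),
]
found = find_target_nodes(nodes, {"model.layers.0"})
```

Because SequentialTracer treats targets as leaves, each target module corresponds to exactly one call_module node, which is what makes this equivalence exact.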
get_sequential_ancestors

Find modules which are call graph ancestors of the given sequential targets

Parameters:

- model (Module) – model containing sequential targets
- targets (Set[Module]) – sequential targets to find ancestors of

Returns:

- Set[Module] – call graph ancestors of sequential targets

Source code in llmcompressor/pipelines/sequential/helpers.py
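A toy, name-based version of the ancestor search. Here a module is an ancestor of a target when the target's dotted name extends the module's name; the real function walks the call graph of nn.Module objects instead, so this is only an illustration of the shape of the result:

```python
from typing import Set

def get_sequential_ancestors(module_names: Set[str], targets: Set[str]) -> Set[str]:
    # collect every dotted-name prefix of each target ("" denotes the root model)
    ancestors: Set[str] = set()
    for target in targets:
        parts = target.split(".")
        for i in range(len(parts)):
            prefix = ".".join(parts[:i])
            if prefix == "" or prefix in module_names:
                ancestors.add(prefix)
    return ancestors

names = {"", "model", "model.layers", "model.layers.0", "lm_head"}
anc = get_sequential_ancestors(names, {"model.layers.0"})
# {"", "model", "model.layers"} -- the target itself is not its own ancestor
```

This set is exactly what SequentialTracer needs: tracing recurses only into ancestors, so everything else (including the targets) is kept as a leaf.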
get_sequential_targets

get_sequential_targets(
    modifiers: List[Modifier],
    model: PreTrainedModel,
    args: DatasetArguments,
) -> List[str]

Infer sequential targets from modifiers list and dataset args

Parameters:

- modifiers (List[Modifier]) – list of modifiers being applied during calibration
- model (PreTrainedModel) – model being calibrated
- args (DatasetArguments) – dataset arguments passed by the user

Returns:

- List[str] – list of sequential targets

Source code in llmcompressor/pipelines/sequential/helpers.py
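A heavily simplified sketch of the inference. The precedence rule here (first modifier that specifies targets wins, otherwise fall back to a model default) is an assumption for illustration only; consult the source for the actual merging and validation logic:

```python
from typing import List, Optional

def infer_sequential_targets(
    modifier_targets: List[Optional[List[str]]],
    model_default: List[str],
) -> List[str]:
    # hypothetical precedence: take the first modifier-provided target list,
    # else fall back to the model's default decoder-layer class names
    for targets in modifier_targets:
        if targets:
            return targets
    return model_default

targets = infer_sequential_targets([None, ["LlamaDecoderLayer"]], ["FallbackLayer"])
# ["LlamaDecoderLayer"]
```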
graph_is_well_formed

A graph is well formed if and only if nodeA in nodeB.users <=> nodeB in nodeA.all_input_nodes

Parameters:

- graph (Graph) – graph being checked

Returns:

- bool – True if the graph is well formed, False otherwise

Source code in llmcompressor/pipelines/sequential/helpers.py
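The biconditional above can be checked directly on a toy node structure. ToyNode mimics the users / all_input_nodes bookkeeping of torch.fx nodes; the real function iterates a torch.fx Graph:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(eq=False)  # identity-based equality, like real graph nodes
class ToyNode:
    name: str
    users: List["ToyNode"] = field(default_factory=list)
    all_input_nodes: List["ToyNode"] = field(default_factory=list)

def graph_is_well_formed(nodes: List[ToyNode]) -> bool:
    # nodeA in nodeB.users must hold exactly when nodeB in nodeA.all_input_nodes
    for a in nodes:
        for b in nodes:
            if (a in b.users) != (b in a.all_input_nodes):
                return False
    return True

x = ToyNode("x")
y = ToyNode("y")
x.users = [y]
y.all_input_nodes = [x]
ok = graph_is_well_formed([x, y])      # True: the edge is recorded on both ends

z = ToyNode("z")
x.users.append(z)                      # z never records x as an input
bad = graph_is_well_formed([x, y, z])  # False: one-sided edge
```

This invariant is what graph surgery (partitioning, node deletion) must preserve, which is why the pipeline checks it after rewriting.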
match_modules

Find modules whose names match the patterns given by target_names

Parameters:

- model (Module) – model containing submodules to find
- target_names (List[str]) – target patterns to find

Returns:

- Set[Module] – all submodules matching target_names

Source code in llmcompressor/pipelines/sequential/helpers.py
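A sketch of the matching over a name-to-module mapping. Glob-style fnmatch patterns are used here for illustration; the real function walks model.named_modules() and its pattern syntax may differ (e.g. class names or regex), so treat this as an assumption:

```python
import fnmatch
from typing import Dict, List, Set

def match_modules(named_modules: Dict[str, object], target_names: List[str]) -> Set[str]:
    # return the names of all submodules matching any target pattern
    return {
        name
        for name in named_modules
        if any(fnmatch.fnmatch(name, pattern) for pattern in target_names)
    }

modules = {"model.layers.0": object(), "model.layers.1": object(), "lm_head": object()}
matched = match_modules(modules, ["model.layers.*"])
# {"model.layers.0", "model.layers.1"}
```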
partition_graph

Convert each partition into a Subgraph. Each Subgraph returns a dictionary mapping output node names to their computed values. Note that the consumed_names attribute of each Subgraph remains empty, to be later populated by trace_consumed_names.

Parameters:

- model (Module) – model which owns the produced Subgraphs
- partitions (List[List[Node]]) – list of partitions, where each partition is a list of nodes belonging to that partition

Returns:

- List[Subgraph] – list of subgraphs in order of execution

Source code in llmcompressor/pipelines/sequential/helpers.py
populate_concrete_args

Creates concrete args which, unlike the equivalent function provided by transformers.utils.fx, creates default values for variadic arguments, which are needed by some models.

Parameters:

- model (Module) – model being traced
- sample_input (Dict) – values used to symbolically trace the model. All arguments to the model.forward function which are not in the sample_input are considered concrete args

Returns:

- Dict – dictionary mapping concrete argument names to their default values

Source code in llmcompressor/pipelines/sequential/helpers.py
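The rule above can be sketched with inspect.signature. This toy version (including the empty-tuple/empty-dict defaults for variadic parameters) is an approximation of the described behavior, not the library's implementation:

```python
import inspect
from typing import Any, Callable, Dict

def populate_concrete_args(forward: Callable, sample_input: Dict[str, Any]) -> Dict[str, Any]:
    # every forward parameter absent from sample_input becomes a concrete arg;
    # variadic parameters get empty defaults, which some models require
    concrete: Dict[str, Any] = {}
    for name, param in inspect.signature(forward).parameters.items():
        if name in sample_input:
            continue
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            concrete[name] = ()
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            concrete[name] = {}
        else:
            concrete[name] = param.default
    return concrete

# hypothetical forward signature, for illustration only
def forward(input_ids, attention_mask=None, *args, use_cache=False, **kwargs):
    ...

concrete = populate_concrete_args(forward, {"input_ids": [1, 2, 3]})
# {"attention_mask": None, "args": (), "use_cache": False, "kwargs": {}}
```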
topological_partition

Partition the graph into partitions such that each target belongs to exactly one partition and executing each partition depends only on intermediate values produced by executing the partitions before it.

Parameters:

- graph (GraphModule) – graph being partitioned
- targets (Set[Module]) – target modules which will be assigned to disjoint partitions

Returns:

- List[List[Node]] – list of partitions, where each partition is a list of nodes belonging to that partition

Source code in llmcompressor/pipelines/sequential/helpers.py
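On an already topologically sorted node list, the partitioning invariant reduces to cutting the sequence after each target. This toy version captures only that idea; the real function runs a traversal over a torch.fx graph with genuine data dependencies:

```python
from typing import List, Set

def topological_partition(ordered_nodes: List[str], targets: Set[str]) -> List[List[str]]:
    # start a new partition after each target node, so each partition contains
    # at most one target and depends only on earlier partitions' outputs
    partitions: List[List[str]] = [[]]
    for node in ordered_nodes:
        partitions[-1].append(node)
        if node in targets:
            partitions.append([])
    return partitions

nodes = ["embed", "layer0", "layer1", "head", "output"]
parts = topological_partition(nodes, {"layer0", "layer1"})
# [["embed", "layer0"], ["layer1"], ["head", "output"]]
```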
trace_consumed_names

Populate the consumed_names attribute of each Subgraph according to when inputs are last used, in order to vacate the intermediates cache and save memory

Parameters:

- subgraphs (List[Subgraph]) – list of subgraphs with empty consumed_names attributes

Source code in llmcompressor/pipelines/sequential/helpers.py
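The last-use computation can be sketched over just the input-name sets of each subgraph (a simplification of the real function, which mutates Subgraph objects in place):

```python
from typing import List, Set

def trace_consumed_names(subgraph_inputs: List[Set[str]]) -> List[Set[str]]:
    # a name is consumed by the last subgraph that reads it, so the cache
    # entry can be evicted immediately after that subgraph runs
    consumed: List[Set[str]] = [set() for _ in subgraph_inputs]
    all_names = set().union(*subgraph_inputs) if subgraph_inputs else set()
    for name in all_names:
        last = max(i for i, inputs in enumerate(subgraph_inputs) if name in inputs)
        consumed[last].add(name)
    return consumed

inputs = [{"x"}, {"x", "h0"}, {"h1"}]
consumed = trace_consumed_names(inputs)
# [set(), {"x", "h0"}, {"h1"}]: "x" survives subgraph 0 because subgraph 1 reads it
```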
trace_subgraphs

trace_subgraphs(
    model: PreTrainedModel,
    sample_input: Dict[str, Any],
    sequential_targets: List[str],
    ignore: List[str],
) -> List[Subgraph]

Trace a model to produce subgraphs, where each sequential target belongs to exactly one subgraph and where executing each subgraph in order is equivalent to executing the original model

Parameters:

- model (PreTrainedModel) – model being traced
- sample_input (Dict[str, Any]) – inputs whose values will change during execution, but whose len, bool, and contains values are assumed constant across batches
- sequential_targets (List[str]) – list of patterns matching sequential targets
- ignore (List[str]) – function and method names to skip during tracing

Returns:

- List[Subgraph] – a list of Subgraphs in order of execution
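Putting the pieces together, a sequential pipeline executes the traced subgraphs against an intermediates cache, evicting consumed entries as it goes. run_sequential and the callable toy subgraphs below are hypothetical simplifications of that loop:

```python
from typing import Any, Callable, Dict, List, Set, Tuple

# each toy subgraph: (forward over the cache, names consumed after it runs)
ToySubgraph = Tuple[Callable[[Dict[str, Any]], Dict[str, Any]], Set[str]]

def run_sequential(subgraphs: List[ToySubgraph], sample_input: Dict[str, Any]) -> Dict[str, Any]:
    cache: Dict[str, Any] = dict(sample_input)
    for forward, consumed_names in subgraphs:
        cache.update(forward(cache))   # outputs become new intermediates
        for name in consumed_names:    # vacate entries no later subgraph reads
            del cache[name]
    return cache

subgraphs: List[ToySubgraph] = [
    (lambda c: {"h0": c["x"] * 2}, {"x"}),
    (lambda c: {"h1": c["h0"] + 1}, {"h0"}),
]
result = run_sequential(subgraphs, {"x": 3})
# result == {"h1": 7}, and at no point were x, h0, and h1 all cached at once
```

Running one subgraph at a time is what allows calibration of models larger than device memory: only the live intermediates and the current partition's parameters need to be resident.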