llmcompressor.pipelines.sequential.helpers
Classes:
- Subgraph – Dataclass specifying an executable subgraph of a model graph
Functions:
- dispatch_for_sequential – Dispatch a model for sequential calibration using a sequential pipeline.
- get_sequential_targets – Infer sequential targets from modifiers list and dataset args
- trace_subgraphs – Trace a model to produce subgraphs, where each sequential target belongs to exactly one subgraph
SequentialTracer
Bases: HFTracer
A tracer specialized for the given model. The resulting tracer will not trace inside of sequential targets, nor any modules which are not call graph ancestors of sequential targets.
Tracing within sequential targets is unnecessary, and tracing within offloaded modules may result in meta tensors being added to the model graph.
Parameters:
- ancestors (Set[Module]) – modules which are ancestors of sequential targets
- offloaded (Set[Module]) – modules which have offloaded params and should not be traced
Source code in llmcompressor/pipelines/sequential/helpers.py
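The core policy can be sketched as an HFTracer subclass that overrides is_leaf_module. This is an illustrative sketch under the assumptions above, not the library's actual implementation:

```python
# Illustrative sketch (not the library's actual code): only descend into
# call-graph ancestors of sequential targets, and never into offloaded modules.
from typing import Set

from torch.nn import Module
from transformers.utils.fx import HFTracer


class SketchSequentialTracer(HFTracer):
    def __init__(self, ancestors: Set[Module], offloaded: Set[Module]):
        super().__init__()
        self.ancestors = ancestors
        self.offloaded = offloaded

    def is_leaf_module(self, module: Module, module_qualified_name: str) -> bool:
        # leaf modules become single call_module nodes and are never traced into
        if module in self.offloaded:
            return True
        return module not in self.ancestors
```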
Subgraph dataclass
Subgraph(
graph: Graph,
input_names: Set[str],
consumed_names: Set[str],
_code: Optional[PythonCode] = None,
)
Dataclass specifying an executable subgraph of a model graph
Parameters:
- graph (Graph) – subgraph of model graph
- input_names (Set[str]) – argument names of the compiled forward function
- consumed_names (Set[str]) – argument names which are not used by any subsequent subgraphs and can therefore be deleted from the intermediates cache
Methods:
- forward – Execute the operations within the subgraph
forward
Execute the operations within the subgraph
Parameters:
- *args – argument inputs to subgraph forward function
- **kwargs – keyword inputs to subgraph forward function
Returns:
- Dict[str, Any] – dictionary mapping output node names to their computed values
Source code in llmcompressor/pipelines/sequential/helpers.py
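A hedged sketch of how a pipeline might drive forward with an intermediates cache; the variable names (sample_input, subgraphs, intermediates) are illustrative assumptions:

```python
# Illustrative driver loop, not library code
intermediates = dict(sample_input)  # seed the cache with the model's inputs
for subgraph in subgraphs:
    inputs = {name: intermediates[name] for name in subgraph.input_names}
    outputs = subgraph.forward(**inputs)  # dict mapping output names to values
    intermediates.update(outputs)
    for name in subgraph.consumed_names:  # free values no later subgraph needs
        del intermediates[name]
```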
dispatch_for_sequential
Dispatch a model for sequential calibration using a sequential pipeline. The model will be offloaded to the CPU and dispatched to a CUDA/XPU device if one is available. Removes any existing hooks.
Parameters:
- model (PreTrainedModel) – model to dispatch
Returns:
- PreTrainedModel – dispatched model
Source code in llmcompressor/pipelines/sequential/helpers.py
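A hedged usage sketch; the checkpoint name is an arbitrary example:

```python
from transformers import AutoModelForCausalLM

from llmcompressor.pipelines.sequential.helpers import dispatch_for_sequential

# load the model, then let the helper place it for sequential calibration
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = dispatch_for_sequential(model)  # uses CUDA/XPU if available, removes stale hooks
```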
find_target_nodes
Find all nodes whose execution is equivalent to executing the target modules. Note that these nodes are guaranteed to be treated as leaf nodes by SequentialTracer
Parameters:
- graph (GraphModule) – graph containing target nodes
- targets (Set[Module]) – modules whose nodes are being searched for
Returns:
- Set[Node] – set of all nodes which call the target modules
Source code in llmcompressor/pipelines/sequential/helpers.py
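Conceptually (a sketch, not the library's exact code), the search keeps call_module nodes whose target resolves to one of the given module instances:

```python
from typing import Set

from torch.fx import GraphModule, Node
from torch.nn import Module


def sketch_find_target_nodes(graph: GraphModule, targets: Set[Module]) -> Set[Node]:
    # call_module nodes store the submodule's qualified name in node.target
    return {
        node
        for node in graph.graph.nodes
        if node.op == "call_module" and graph.get_submodule(node.target) in targets
    }
```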
get_sequential_ancestors
Find modules which are call graph ancestors of the given sequential targets
Parameters:
- model (Module) – model containing sequential targets
- targets (Set[Module]) – sequential targets to find ancestors of
Returns:
- Set[Module] – call graph ancestors of sequential targets
Source code in llmcompressor/pipelines/sequential/helpers.py
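A sketch of the ancestor search (an assumption about the approach, not the library's exact code): a module is an ancestor if any module in its subtree is a sequential target.

```python
from typing import Set

from torch.nn import Module


def sketch_get_sequential_ancestors(model: Module, targets: Set[Module]) -> Set[Module]:
    ancestors: Set[Module] = set()

    def visit(module: Module) -> bool:
        found = False
        for child in module.children():
            # recurse first so every branch is explored, then check the child itself
            if visit(child) or child in targets:
                found = True
        if found:
            ancestors.add(module)
        return found

    visit(model)
    return ancestors
```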
get_sequential_targets
get_sequential_targets(
modifiers: List[Modifier],
model: PreTrainedModel,
args: DatasetArguments,
) -> List[str]
Infer sequential targets from modifiers list and dataset args
Parameters:
- model (PreTrainedModel) – model being calibrated
- modifiers (List[Modifier]) – list of modifiers being applied during calibration
- dataset_args (DatasetArguments) – dataset arguments passed by user
Returns:
- List[str] – list of sequential targets
Source code in llmcompressor/pipelines/sequential/helpers.py
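A hedged usage sketch; the modifier, its arguments, and dataset_args are illustrative assumptions:

```python
from llmcompressor.modifiers.quantization import GPTQModifier

modifiers = [GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])]
sequential_targets = get_sequential_targets(modifiers, model, dataset_args)
# e.g. ["LlamaDecoderLayer"] when calibrating a Llama-style model
```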
graph_is_well_formed
A graph is well formed if and only if nodeA in nodeB.users <=> nodeB in nodeA.all_input_nodes
Parameters:
- graph (Graph) – graph being checked
Returns:
- bool – True if the graph is well formed, False otherwise
Source code in llmcompressor/pipelines/sequential/helpers.py
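The check can be sketched directly from that definition (an illustrative sketch, not the library's exact code):

```python
from torch.fx import Graph


def sketch_graph_is_well_formed(graph: Graph) -> bool:
    for node in graph.nodes:
        # every user must list this node among its inputs, and vice versa
        if any(node not in user.all_input_nodes for user in node.users):
            return False
        if any(node not in inp.users for inp in node.all_input_nodes):
            return False
    return True
```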
match_modules
Find modules whose names match the patterns given by target_names
Parameters:
- model (Module) – model containing submodules to find
- target_names (List[str]) – target patterns to find
Returns:
- Set[Module] – all submodules matching target_names
Source code in llmcompressor/pipelines/sequential/helpers.py
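One way to read "patterns" (an assumption; the library may match names differently) is as regular expressions applied to a module's qualified name or class name:

```python
import re
from typing import List, Set

from torch.nn import Module


def sketch_match_modules(model: Module, target_names: List[str]) -> Set[Module]:
    return {
        module
        for name, module in model.named_modules()
        if any(
            re.fullmatch(pattern, name) or re.fullmatch(pattern, type(module).__name__)
            for pattern in target_names
        )
    }
```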
partition_graph
Convert each partition into a Subgraph. Each Subgraph returns a dictionary mapping of output node names to their computed values. Note that the consumed_names attribute of each Subgraph remains empty, to be later populated by trace_consumed_names.
Parameters:
- model (Module) – model which owns the produced Subgraphs
- partitions (List[List[Node]]) – list of partitions, where each partition is a list of nodes belonging to that partition
Returns:
- List[Subgraph] – list of subgraphs in order of execution
Source code in llmcompressor/pipelines/sequential/helpers.py
populate_concrete_args
Creates concrete args which, unlike the equivalent function provided by transformers.utils.fx, include default values for variadic arguments, which are needed by some models.
Parameters:
- model (Module) – model being traced
- sample_input (Dict) – values used to symbolically trace the model. All arguments to the model.forward function which are not in the sample_input are considered concrete args
Returns:
- Dict – dictionary mapping concrete argument names to their default values
Source code in llmcompressor/pipelines/sequential/helpers.py
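A sketch of the idea (an assumption, not the library's exact code): any forward parameter missing from the sample input becomes a concrete argument, and variadic parameters receive empty defaults.

```python
import inspect
from typing import Any, Dict

from torch.nn import Module


def sketch_populate_concrete_args(model: Module, sample_input: Dict) -> Dict[str, Any]:
    concrete_args: Dict[str, Any] = {}
    for name, param in inspect.signature(model.forward).parameters.items():
        if name in sample_input:
            continue  # provided symbolically by the sample input
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            concrete_args[name] = tuple()  # default for *args
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            concrete_args[name] = {}  # default for **kwargs
        else:
            concrete_args[name] = param.default
    return concrete_args
```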
topological_partition
Partition the graph into partitions such that each target belongs to exactly one partition and executing each partition depends only on intermediate values produced by executing the partitions before it.
Parameters:
- graph (GraphModule) – graph being partitioned
- targets (Set[Module]) – target modules which will be assigned to disjoint partitions
Returns:
- List[List[Node]] – list of partitions, where each partition is a list of nodes belonging to that partition
Source code in llmcompressor/pipelines/sequential/helpers.py
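A deliberately simplified sketch of the partitioning idea (the library's real algorithm does more dependency bookkeeping): walk the fx nodes, which already come in topological order, and start a new partition right after each node that calls a target module.

```python
from typing import List, Set

from torch.fx import GraphModule, Node
from torch.nn import Module


def sketch_topological_partition(graph: GraphModule, targets: Set[Module]) -> List[List[Node]]:
    partitions: List[List[Node]] = [[]]
    for node in graph.graph.nodes:  # fx graphs iterate nodes in topological order
        partitions[-1].append(node)
        if node.op == "call_module" and graph.get_submodule(node.target) in targets:
            partitions.append([])  # the next target starts a fresh partition
    return partitions
```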
trace_consumed_names
Populate the consumed_names attribute of each Subgraph according to when inputs are last used in order to vacate the intermediates cache and save memory
Parameters:
- subgraphs (List[Subgraph]) – list of subgraphs with empty consumed_names attributes
Source code in llmcompressor/pipelines/sequential/helpers.py
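A sketch of the reverse scan (an assumption about the approach): the last subgraph, in execution order, to use a name is the one that consumes it.

```python
from typing import List, Set

from llmcompressor.pipelines.sequential.helpers import Subgraph


def sketch_trace_consumed_names(subgraphs: List[Subgraph]) -> None:
    all_names: Set[str] = set().union(*(s.input_names for s in subgraphs))
    for name in all_names:
        for subgraph in reversed(subgraphs):
            if name in subgraph.input_names:
                subgraph.consumed_names.add(name)  # last user frees the value
                break
```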
trace_subgraphs
trace_subgraphs(
model: PreTrainedModel,
sample_input: Dict[str, Any],
sequential_targets: List[str],
ignore: List[str],
) -> List[Subgraph]
Trace a model to produce subgraphs, where each sequential target belongs to exactly one subgraph and where executing each subgraph in order is equivalent to executing the original model
Parameters:
- model (PreTrainedModel) – model being traced
- sample_input (Dict[str, Any]) – inputs whose values will change during execution but whose len, bool, and contains values are assumed constant across batches
- sequential_targets (List[str]) – list of patterns matching sequential targets
- ignore (List[str]) – function and method names to skip during tracing
Returns:
- List[Subgraph] – a list of Subgraphs in order of execution
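Putting the pieces together, a hedged end-to-end sketch; the sequential target name and batch construction are assumptions for illustration:

```python
# trace once, then calibrate one subgraph at a time using the driver loop
# shown earlier for Subgraph.forward
sample_input = next(iter(dataloader))  # e.g. {"input_ids": ..., "attention_mask": ...}
subgraphs = trace_subgraphs(
    model=model,
    sample_input=sample_input,
    sequential_targets=["LlamaDecoderLayer"],  # assumed target pattern
    ignore=[],
)
print(f"traced {len(subgraphs)} subgraphs")
```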