llmcompressor.core

Provides the core compression framework for LLM Compressor. The core API manages compression sessions, tracks state changes, handles events during compression, and provides lifecycle hooks for the compression process.

Modules:
- events – LLM Compressor Core Events Package
- helpers – Helper functions for core compression operations.
- lifecycle – Module for managing the compression lifecycle in the LLM Compressor.
- model_layer – Model layer utility classes for LLM compression workflows.
- session – Compression session management for LLM compression workflows.
- session_functions – Session management functions for LLM compression workflows.
- state – Module for managing LLM Compressor state.
Classes:
- CompressionLifecycle – A class for managing the lifecycle of compression events in the LLM Compressor.
- CompressionSession – A session for compression that holds the lifecycle and state for the current compression session.
- Data – A dataclass to hold different data sets for training, validation, testing, and/or calibration.
- Event – A class for defining an event that can be triggered during sparsification.
- EventType – An Enum for defining the different types of events that can be triggered during model compression lifecycles.
- Hardware – A dataclass to hold information about the hardware being used.
- LifecycleCallbacks – A class for invoking lifecycle events for the active session.
- ModelParameterizedLayer – A dataclass for holding a parameter and its layer.
- ModifiedState – A dataclass to represent a modified model, optimizer, and loss.
- State – State class holds information about the current compression state.

Functions:
- active_session – Returns the active session for sparsification.
- create_session – Context manager to create and yield a new session for sparsification.
- reset_session – Reset the currently active session to its initial state.
CompressionLifecycle dataclass

CompressionLifecycle(
    state: State = State(),
    recipe: Recipe = Recipe(),
    initialized_: bool = False,
    finalized: bool = False,
    _last_event_type: Optional[EventType] = EventType.BATCH_END,
    _event_order: List[EventType] = (
        lambda: [
            EventType.BATCH_START,
            EventType.LOSS_CALCULATED,
            EventType.OPTIM_PRE_STEP,
            EventType.OPTIM_POST_STEP,
            EventType.BATCH_END,
        ]
    )(),
    global_step: int = 0,
)

A class for managing the lifecycle of compression events in the LLM Compressor.

Parameters:
- state (State, default: State()) – The current state of the compression process
- recipe (Recipe, default: Recipe()) – The compression recipe
- modifiers (List[StageModifiers]) – The list of stage modifiers

Methods:
- event – Handle a compression event.
- finalize – Finalize the compression lifecycle.
- initialize – Initialize the compression lifecycle.
- reset – Reset the compression lifecycle, finalizing any active modifiers and resetting all attributes.
event

Handle a compression event.

Parameters:
- event_type (EventType) – The type of event to handle
- kwargs – Additional arguments to pass to the event handlers

Returns:
- List[Any] – List of data returned from handling the event by modifiers

Raises:
- ValueError – If called before initialization, after finalization, or for an invalid event type

Source code in llmcompressor/core/lifecycle.py
finalize

Finalize the compression lifecycle.

Parameters:
- kwargs – Additional arguments to update the state with

Returns:
- List[Any] – List of data returned from finalizing modifiers

Raises:
- ValueError – If called before initialization or more than once

Source code in llmcompressor/core/lifecycle.py
initialize

initialize(
    recipe: Optional[RecipeInput] = None,
    recipe_stage: Optional[RecipeStageInput] = None,
    recipe_args: Optional[RecipeArgsInput] = None,
    **kwargs
) -> List[Any]

Initialize the compression lifecycle.

Parameters:
- kwargs – Additional arguments to update the state with

Returns:
- List[Any] – List of data returned from initialization of modifiers

Source code in llmcompressor/core/lifecycle.py
reset
Reset the compression lifecycle, finalizing any active modifiers and resetting all attributes.
Source code in llmcompressor/core/lifecycle.py
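The default _event_order shown in the CompressionLifecycle signature defines the per-batch sequence that event() validates against. As an illustration only, the self-contained sketch below (a hypothetical stand-in, not the actual llmcompressor implementation) models how the next expected event follows from the last one, wrapping from BATCH_END back to BATCH_START:

```python
from enum import Enum


class EventType(Enum):
    """Stand-in mirroring the per-batch event types documented below."""
    BATCH_START = "batch_start"
    LOSS_CALCULATED = "loss_calculated"
    OPTIM_PRE_STEP = "optim_pre_step"
    OPTIM_POST_STEP = "optim_post_step"
    BATCH_END = "batch_end"


# The documented default event order for one training batch.
EVENT_ORDER = [
    EventType.BATCH_START,
    EventType.LOSS_CALCULATED,
    EventType.OPTIM_PRE_STEP,
    EventType.OPTIM_POST_STEP,
    EventType.BATCH_END,
]


def next_expected(last: EventType) -> EventType:
    """Return the event type expected after `last`, wrapping at BATCH_END."""
    idx = EVENT_ORDER.index(last)
    return EVENT_ORDER[(idx + 1) % len(EVENT_ORDER)]
```

The real lifecycle raises ValueError when an event arrives out of order; this sketch only shows the expected ordering, not the full validation (which also handles calibration-only pipelines without optimizer events).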
CompressionSession

A session for compression that holds the lifecycle and state for the current compression session.

Methods:
- event – Invoke an event for the current CompressionSession.
- finalize – Finalize the session for compression. This will run the finalize method for each modifier in the session's lifecycle.
- get_serialized_recipe – Returns the serialized string of the current compiled recipe.
- initialize – Initialize the session for compression. This will run the initialize method for each modifier in the session's lifecycle.
- log – Log model and loss information for the current event type.
- reset – Reset the session to its initial state.
- reset_stage – Reset the session for starting a new stage; the recipe and model stay intact.

Attributes:
- lifecycle (CompressionLifecycle) – Lifecycle is used to keep track of where we are in the compression process and what modifiers are active.
- state (State) – State of the current compression session.

Source code in llmcompressor/core/session.py
lifecycle property

Lifecycle is used to keep track of where we are in the compression process and what modifiers are active. It also provides the ability to invoke events on the lifecycle.

Returns:
- CompressionLifecycle – the lifecycle for the session

state property

State of the current compression session. The State instance is used to store all information such as the recipe, model, optimizer, data, etc. that is needed for compression.

Returns:
- State – the current state of the session
event

event(
    event_type: EventType,
    batch_data: Optional[Any] = None,
    loss: Optional[Any] = None,
    **kwargs
) -> ModifiedState

Invoke an event for the current CompressionSession.

Parameters:
- event_type (EventType) – the event type to invoke
- batch_data (Optional[Any], default: None) – the batch data to use for the event
- loss (Optional[Any], default: None) – the loss to use for the event, if any
- kwargs – additional kwargs to pass to the lifecycle's event method

Returns:
- ModifiedState – the modified state of the session after invoking the event

Source code in llmcompressor/core/session.py
finalize

Finalize the session for compression. This will run the finalize method for each modifier in the session's lifecycle. This will also set the session's state to the finalized state.

Parameters:
- kwargs – additional kwargs to pass to the lifecycle's finalize method

Returns:
- ModifiedState – the modified state of the session after finalizing

Source code in llmcompressor/core/session.py
get_serialized_recipe

Returns:
- Optional[str] – serialized string of the current compiled recipe

Source code in llmcompressor/core/session.py
initialize

initialize(
    recipe: Union[str, List[str], Recipe, List[Recipe], None] = None,
    recipe_stage: Union[str, List[str], None] = None,
    recipe_args: Union[Dict[str, Any], None] = None,
    model: Optional[Any] = None,
    teacher_model: Optional[Any] = None,
    optimizer: Optional[Any] = None,
    attach_optim_callbacks: bool = True,
    train_data: Optional[Any] = None,
    val_data: Optional[Any] = None,
    test_data: Optional[Any] = None,
    calib_data: Optional[Any] = None,
    copy_data: bool = True,
    start: Optional[float] = None,
    steps_per_epoch: Optional[int] = None,
    batches_per_step: Optional[int] = None,
    loggers: Union[None, LoggerManager, List[BaseLogger]] = None,
    **kwargs
) -> ModifiedState

Initialize the session for compression. This will run the initialize method for each modifier in the session's lifecycle. This will also set the session's state to the initialized state.

Parameters:
- recipe (Union[str, List[str], Recipe, List[Recipe], None], default: None) – the recipe to use for the compression; can be a path to a recipe file, a raw recipe string, a recipe object, or a list of recipe objects
- recipe_stage (Union[str, List[str], None], default: None) – the stage to target for the compression
- recipe_args (Union[Dict[str, Any], None], default: None) – the args to use for overriding the recipe defaults
- model (Optional[Any], default: None) – the model to compress
- teacher_model (Optional[Any], default: None) – the teacher model to use for knowledge distillation
- optimizer (Optional[Any], default: None) – the optimizer to use for the compression
- attach_optim_callbacks (bool, default: True) – True to attach the optimizer callbacks to the compression lifecycle, False otherwise
- train_data (Optional[Any], default: None) – the training data to use for the compression
- val_data (Optional[Any], default: None) – the validation data to use for the compression
- test_data (Optional[Any], default: None) – the testing data to use for the compression
- calib_data (Optional[Any], default: None) – the calibration data to use for the compression
- copy_data (bool, default: True) – True to copy the data, False otherwise
- start (Optional[float], default: None) – the start epoch to use for the compression
- steps_per_epoch (Optional[int], default: None) – the number of steps per epoch to use for the compression
- batches_per_step (Optional[int], default: None) – the number of batches per step to use for the compression
- loggers (Union[None, LoggerManager, List[BaseLogger]], default: None) – the metrics manager to set up logging of important info and milestones; also accepts a list of BaseLogger(s)
- kwargs – additional kwargs to pass to the lifecycle's initialize method

Returns:
- ModifiedState – the modified state of the session after initializing

Source code in llmcompressor/core/session.py
log

Log model and loss information for the current event type.

Parameters:
- event_type (EventType) – the event type to log for
- loss (Optional[Any], default: None) – the loss to log, if any

Source code in llmcompressor/core/session.py

reset

Reset the session to its initial state.

reset_stage

Reset the session for starting a new stage; the recipe and model stay intact.
Data dataclass

Data(
    train: Optional[Any] = None,
    val: Optional[Any] = None,
    test: Optional[Any] = None,
    calib: Optional[Any] = None,
)

A dataclass to hold different data sets for training, validation, testing, and/or calibration. Each data set is a ModifiableData instance.

Parameters:
- train (Optional[Any], default: None) – The training data set
- val (Optional[Any], default: None) – The validation data set
- test (Optional[Any], default: None) – The testing data set
- calib (Optional[Any], default: None) – The calibration data set
Event dataclass

Event(
    type_: Optional[EventType] = None,
    steps_per_epoch: Optional[int] = None,
    batches_per_step: Optional[int] = None,
    invocations_per_step: int = 1,
    global_step: int = 0,
    global_batch: int = 0,
)

A class for defining an event that can be triggered during sparsification.

Parameters:
- type_ (Optional[EventType], default: None) – The type of event.
- steps_per_epoch (Optional[int], default: None) – The number of steps per epoch.
- batches_per_step (Optional[int], default: None) – The number of batches per step, where a step is an optimizer step invocation. For most pathways, these are the same. See the invocations_per_step parameter for more details on when they are not.
- invocations_per_step (int, default: 1) – The number of invocations of the step wrapper before optimizer.step is called. Generally can be left as 1 (the default). For older amp pathways, this is the number of times the scaler wrapper was invoked before the wrapped optimizer step function was called to handle accumulation in fp16.
- global_step (int, default: 0) – The current global step.
- global_batch (int, default: 0) – The current global batch.

Methods:
- new_instance – Creates a new instance of the event with the provided keyword arguments.
- should_update – Determines if the event should trigger an update.

Attributes:
- current_index (float) – Calculates the current index of the event.
- epoch (int) – Calculates the current epoch.
- epoch_based (bool) – Determines if the event is based on epochs.
- epoch_batch (int) – Calculates the current batch within the current epoch.
- epoch_full (float) – Calculates the current epoch with the fraction of the current step.
- epoch_step (int) – Calculates the current step within the current epoch.
current_index property writable

Calculates the current index of the event.

Returns:
- float – The current index of the event, which is either the global step or the epoch with the fraction of the current step.

Raises:
- ValueError – if the event is not epoch based or if the steps per epoch are too many.

epoch property

Calculates the current epoch.

Returns:
- int – The current epoch.

Raises:
- ValueError – if the event is not epoch based.

epoch_based property

Determines if the event is based on epochs.

Returns:
- bool – True if the event is based on epochs, False otherwise.

epoch_batch property

Calculates the current batch within the current epoch.

Returns:
- int – The current batch within the current epoch.

Raises:
- ValueError – if the event is not epoch based.

epoch_full property

Calculates the current epoch with the fraction of the current step.

Returns:
- float – The current epoch with the fraction of the current step.

Raises:
- ValueError – if the event is not epoch based.

epoch_step property

Calculates the current step within the current epoch.

Returns:
- int – The current step within the current epoch.

Raises:
- ValueError – if the event is not epoch based.
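The epoch-derived properties above all follow from global_step and steps_per_epoch. The reduced stand-in below (MiniEvent is hypothetical, not part of llmcompressor) sketches plausible formulas consistent with the property descriptions; the real implementation may differ in details:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniEvent:
    """Hypothetical reduced stand-in for Event, for illustration only."""
    steps_per_epoch: Optional[int] = None
    global_step: int = 0

    @property
    def epoch_based(self) -> bool:
        # An event is epoch based when it knows how many steps make an epoch.
        return self.steps_per_epoch is not None

    @property
    def epoch(self) -> int:
        # Whole epochs completed so far.
        if not self.epoch_based:
            raise ValueError("event is not epoch based")
        return self.global_step // self.steps_per_epoch

    @property
    def epoch_step(self) -> int:
        # Step index within the current epoch.
        if not self.epoch_based:
            raise ValueError("event is not epoch based")
        return self.global_step % self.steps_per_epoch

    @property
    def epoch_full(self) -> float:
        # Current epoch plus the fraction of the current step.
        if not self.epoch_based:
            raise ValueError("event is not epoch based")
        return self.global_step / self.steps_per_epoch
```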
new_instance

Creates a new instance of the event with the provided keyword arguments.

Parameters:
- kwargs – Keyword arguments to set in the new instance.

Returns:
- Event – A new instance of the event with the provided kwargs.

Source code in llmcompressor/core/events/event.py

should_update

Determines if the event should trigger an update.

Parameters:
- start (Optional[float]) – The start index to check against; set to None to ignore start.
- end (Optional[float]) – The end index to check against; set to None to ignore end.
- update (Optional[float]) – The update interval; set to None or 0.0 to always update, otherwise must be greater than 0.0. Defaults to None.

Returns:
- bool – True if the event should trigger an update, False otherwise.

Source code in llmcompressor/core/events/event.py
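As a sketch of the documented semantics (window bounds that may be None, and an optional update interval), the following hypothetical reimplementation shows one way should_update could behave; the real boundary and tolerance handling in llmcompressor may differ:

```python
from typing import Optional


def should_update(current_index: float,
                  start: Optional[float],
                  end: Optional[float],
                  update: Optional[float]) -> bool:
    """Illustrative sketch of the documented should_update semantics."""
    # Outside the [start, end] window -> never update.
    if start is not None and current_index < start:
        return False
    if end is not None and current_index > end:
        return False
    # update of None or 0.0 means "always update while inside the window".
    if update is None or update == 0.0:
        return True
    # Otherwise trigger only on (approximate) multiples of the interval.
    base = start if start is not None else 0.0
    return (current_index - base) % update < 1e-9
```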
EventType

Bases: Enum

An Enum for defining the different types of events that can be triggered during model compression lifecycles. The purpose of each EventType is to trigger the corresponding modifier callback during training or post-training pipelines.

Parameters:
- INITIALIZE – Event type for initialization.
- FINALIZE – Event type for finalization.
- BATCH_START – Event type for the start of a batch.
- LOSS_CALCULATED – Event type for when loss is calculated.
- BATCH_END – Event type for the end of a batch.
- CALIBRATION_EPOCH_START – Event type for the start of a calibration epoch.
- SEQUENTIAL_EPOCH_END – Event type for the end of a layer calibration epoch, specifically used by src/llmcompressor/pipelines/sequential/pipeline.py.
- CALIBRATION_EPOCH_END – Event type for the end of a calibration epoch.
- OPTIM_PRE_STEP – Event type for the pre-optimization step.
- OPTIM_POST_STEP – Event type for the post-optimization step.
Hardware dataclass

Hardware(
    device: Optional[str] = None,
    devices: Optional[List[str]] = None,
    rank: Optional[int] = None,
    world_size: Optional[int] = None,
    local_rank: Optional[int] = None,
    local_world_size: Optional[int] = None,
    distributed: Optional[bool] = None,
    distributed_strategy: Optional[str] = None,
)

A dataclass to hold information about the hardware being used.

Parameters:
- device (Optional[str], default: None) – The current device being used for training
- devices (Optional[List[str]], default: None) – List of all devices to be used for training
- rank (Optional[int], default: None) – The rank of the current device
- world_size (Optional[int], default: None) – The total number of devices being used
- local_rank (Optional[int], default: None) – The local rank of the current device
- local_world_size (Optional[int], default: None) – The total number of devices being used on the local machine
- distributed (Optional[bool], default: None) – Whether or not distributed training is being used
- distributed_strategy (Optional[str], default: None) – The distributed strategy being used
LifecycleCallbacks

A class for invoking lifecycle events for the active session.

Methods:
- batch_end – Invoke a batch end event for the active session
- batch_start – Invoke a batch start event for the active session
- calibration_epoch_end – Invoke an epoch end event for the active session during calibration. This event should be called after the model has been calibrated for one epoch.
- calibration_epoch_start – Invoke an epoch start event for the active session during calibration. This event should be called before calibration starts for one epoch.
- event – Invoke an event for the active session
- loss_calculated – Invoke a loss calculated event for the active session
- optim_post_step – Invoke an optimizer post-step event for the active session
- optim_pre_step – Invoke an optimizer pre-step event for the active session
- sequential_epoch_end – Invoke a sequential epoch end event for the active session. This event should be called after one sequential layer has been calibrated/trained for one epoch.
batch_end classmethod

Invoke a batch end event for the active session.

Parameters:
- kwargs – additional kwargs to pass to the current session's event method

Returns:
- ModifiedState – the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py

batch_start classmethod

Invoke a batch start event for the active session.

Parameters:
- batch_data (Optional[Any], default: None) – the batch data to use for the event
- kwargs – additional kwargs to pass to the current session's event method

Returns:
- ModifiedState – the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py

calibration_epoch_end classmethod

Invoke an epoch end event for the active session during calibration. This event should be called after the model has been calibrated for one epoch; see src/llmcompressor/pipelines/basic/pipeline.py for a usage example.

Source code in llmcompressor/core/session_functions.py

calibration_epoch_start classmethod

Invoke an epoch start event for the active session during calibration. This event should be called before calibration starts for one epoch; see src/llmcompressor/pipelines/basic/pipeline.py for a usage example.

Source code in llmcompressor/core/session_functions.py

event classmethod

Invoke an event for the active session.

Parameters:
- event_type (EventType) – the event type to invoke
- kwargs – additional kwargs to pass to the current session's event method

Returns:
- ModifiedState – the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py

loss_calculated classmethod

Invoke a loss calculated event for the active session.

Parameters:
- loss (Optional[Any], default: None) – the loss to use for the event
- kwargs – additional kwargs to pass to the current session's event method

Returns:
- ModifiedState – the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py

optim_post_step classmethod

Invoke an optimizer post-step event for the active session.

Parameters:
- kwargs – additional kwargs to pass to the current session's event method

Returns:
- ModifiedState – the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py

optim_pre_step classmethod

Invoke an optimizer pre-step event for the active session.

Parameters:
- kwargs – additional kwargs to pass to the current session's event method

Returns:
- ModifiedState – the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py

sequential_epoch_end classmethod

Invoke a sequential epoch end event for the active session. This event should be called after one sequential layer has been calibrated/trained for one epoch. This is called after a sequential layer has been calibrated with one batch; see src/llmcompressor/pipelines/sequential/pipeline.py for a usage example.

Source code in llmcompressor/core/session_functions.py
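A training loop typically invokes these callbacks in the per-batch order listed in the CompressionLifecycle event order. The sketch below uses a hypothetical CallbackRecorder stand-in (not the real LifecycleCallbacks class, whose methods are classmethods on the active session) to demonstrate only the calling pattern:

```python
class CallbackRecorder:
    """Hypothetical stand-in for LifecycleCallbacks that records call order."""

    def __init__(self):
        self.calls = []

    def batch_start(self, batch_data=None, **kwargs):
        self.calls.append("batch_start")

    def loss_calculated(self, loss=None, **kwargs):
        self.calls.append("loss_calculated")

    def optim_pre_step(self, **kwargs):
        self.calls.append("optim_pre_step")

    def optim_post_step(self, **kwargs):
        self.calls.append("optim_post_step")

    def batch_end(self, **kwargs):
        self.calls.append("batch_end")


def train_one_batch(callbacks, batch, compute_loss, optimizer_step):
    """Calling pattern mirroring the documented per-batch event order."""
    callbacks.batch_start(batch_data=batch)
    loss = compute_loss(batch)
    callbacks.loss_calculated(loss=loss)
    callbacks.optim_pre_step()
    optimizer_step()
    callbacks.optim_post_step()
    callbacks.batch_end()
    return loss
```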
ModelParameterizedLayer dataclass

A dataclass for holding a parameter and its layer.

Parameters:
- layer_name (str) – the name of the layer
- layer (Any) – the layer object
- param_name (str) – the name of the parameter
- param (Any) – the parameter object

ModifiedState dataclass

A dataclass to represent a modified model, optimizer, and loss.

Parameters:
- model (Optional[Any]) – The modified model
- optimizer (Optional[Any]) – The modified optimizer
- loss (Optional[Any]) – The modified loss
- modifier_data (Optional[List[Dict[str, Any]]]) – The modifier data used to modify the model, optimizer, and loss

Initialize the ModifiedState with the given parameters.

Parameters:
- model (Any) – The modified model
- optimizer (Any) – The modified optimizer
- loss (Any) – The modified loss
- modifier_data (List[Dict[str, Any]]) – The modifier data used to modify the model, optimizer, and loss

Source code in llmcompressor/core/state.py
State dataclass

State(
    model: Any = None,
    teacher_model: Any = None,
    optimizer: Any = None,
    optim_wrapped: bool = None,
    loss: Any = None,
    batch_data: Any = None,
    data: Data = Data(),
    hardware: Hardware = Hardware(),
    loggers: Optional[LoggerManager] = None,
    model_log_cadence: Optional[float] = None,
    _last_log_step: Union[float, int, None] = None,
)

State class holds information about the current compression state.

Parameters:
- model (Any, default: None) – The model being used for compression
- teacher_model (Any, default: None) – The teacher model being used for compression
- optimizer (Any, default: None) – The optimizer being used for training
- optim_wrapped (bool, default: None) – Whether or not the optimizer has been wrapped
- loss (Any, default: None) – The loss function being used for training
- batch_data (Any, default: None) – The current batch of data being used for compression
- data (Data, default: Data()) – The data sets being used for training, validation, testing, and/or calibration, wrapped in a Data instance
- hardware (Hardware, default: Hardware()) – Hardware instance holding info about the target hardware being used
- loggers (Optional[LoggerManager], default: None) – LoggerManager instance holding all the loggers to log
- model_log_cadence (Optional[float], default: None) – The cadence to log model information with respect to epochs. If 1, logs every epoch; if 2, logs every other epoch; etc. Default is 1.

Methods:
- update – Update the state with the given parameters.

Attributes:
- compression_ready (bool) – Check if the model and optimizer are set for compression.

compression_ready property

Check if the model and optimizer are set for compression.

Returns:
- bool – True if model and optimizer are set, False otherwise
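A reduced stand-in (MiniState, hypothetical and not part of llmcompressor) sketching the check this property performs, assuming it simply requires both model and optimizer to be set:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class MiniState:
    """Hypothetical reduced stand-in for State, for illustration only."""
    model: Any = None
    optimizer: Any = None

    @property
    def compression_ready(self) -> bool:
        # Ready only when both a model and an optimizer are present.
        return self.model is not None and self.optimizer is not None
```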
update

update(
    model: Any = None,
    teacher_model: Any = None,
    optimizer: Any = None,
    attach_optim_callbacks: bool = True,
    train_data: Any = None,
    val_data: Any = None,
    test_data: Any = None,
    calib_data: Any = None,
    copy_data: bool = True,
    start: float = None,
    steps_per_epoch: int = None,
    batches_per_step: int = None,
    loggers: Union[None, LoggerManager, List[BaseLogger]] = None,
    model_log_cadence: Optional[float] = None,
    **kwargs
) -> Dict

Update the state with the given parameters.

Parameters:
- model (Any, default: None) – The model to update the state with
- teacher_model (Any, default: None) – The teacher model to update the state with
- optimizer (Any, default: None) – The optimizer to update the state with
- attach_optim_callbacks (bool, default: True) – Whether or not to attach optimizer callbacks
- train_data (Any, default: None) – The training data to update the state with
- val_data (Any, default: None) – The validation data to update the state with
- test_data (Any, default: None) – The testing data to update the state with
- calib_data (Any, default: None) – The calibration data to update the state with
- copy_data (bool, default: True) – Whether or not to copy the data
- start (float, default: None) – The start index to update the state with
- steps_per_epoch (int, default: None) – The steps per epoch to update the state with
- batches_per_step (int, default: None) – The batches per step to update the state with
- loggers (Union[None, LoggerManager, List[BaseLogger]], default: None) – The metrics manager to set up logging of important info and milestones; also accepts a list of BaseLogger(s)
- model_log_cadence (Optional[float], default: None) – The cadence to log model information with respect to epochs. If 1, logs every epoch; if 2, logs every other epoch; etc. Default is 1.
- kwargs – Additional keyword arguments to update the state with

Returns:
- Dict – The updated state as a dictionary

Source code in llmcompressor/core/state.py
active_session

Returns:
- CompressionSession – the active session for sparsification

create_session

Context manager to create and yield a new session for sparsification. This will set the active session to the new session for the duration of the context.

Returns:
- Generator[CompressionSession, None, None] – the new session
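The sketch below models the documented create_session semantics with hypothetical stand-ins (not the real llmcompressor implementation, which lives in llmcompressor/core/session_functions.py): a context manager that makes a new session active for the duration of the context and restores the previous one on exit:

```python
import contextlib

# Module-level holder for the active session (stand-in for the real registry).
_active = {"session": None}


def active_session():
    """Return the currently active session, or None if no session is active."""
    return _active["session"]


@contextlib.contextmanager
def create_session():
    """Create a new session and make it active for the duration of the context."""
    previous = _active["session"]
    session = object()  # stand-in for a CompressionSession instance
    _active["session"] = session
    try:
        yield session
    finally:
        # Restore whatever was active before the context was entered.
        _active["session"] = previous
```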