
llmcompressor.core

Provides the core compression framework for LLM Compressor.

The core API manages compression sessions, tracks state changes, handles events during compression, and provides lifecycle hooks for the compression process.
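
For orientation, the sketch below shows the typical flow end to end. It is a minimal, hedged example: the torch model and optimizer are throwaway placeholders, no recipe is supplied (so no modifiers actually run), and the import path assumes these names are re-exported from llmcompressor.core as documented on this page.

import torch

from llmcompressor.core import EventType, create_session

model = torch.nn.Linear(8, 8)                            # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # placeholder optimizer

with create_session() as session:
    # An empty recipe is valid; a recipe path, string, or object could be passed
    session.initialize(model=model, optimizer=optimizer)

    # One batch's worth of lifecycle events, in the required order
    session.event(EventType.BATCH_START, batch_data=None)
    session.event(EventType.LOSS_CALCULATED, loss=torch.tensor(1.0))
    session.event(EventType.OPTIM_PRE_STEP)
    session.event(EventType.OPTIM_POST_STEP)
    session.event(EventType.BATCH_END)

    session.finalize()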

Modules:

  • events

    LLM Compressor Core Events Package

  • helpers

    Helper functions for core compression operations.

  • lifecycle

    Module for managing the compression lifecycle in the LLM Compressor.

  • model_layer

    Model layer utility classes for LLM compression workflows.

  • session

    Compression session management for LLM compression workflows.

  • session_functions

    Session management functions for LLM compression workflows.

  • state

    Module for managing LLM Compressor state.

Classes:

  • CompressionLifecycle

    A class for managing the lifecycle of compression events in the LLM Compressor.

  • CompressionSession

    A session for compression that holds the lifecycle and state for the current compression session

  • Data

    A dataclass to hold different data sets for training, validation, testing, and/or calibration

  • Event

    A class for defining an event that can be triggered during sparsification.

  • EventType

    An Enum for defining the different types of events that can be triggered during model compression lifecycles

  • Hardware

    A dataclass to hold information about the hardware being used.

  • LifecycleCallbacks

    A class for invoking lifecycle events for the active session

  • ModelParameterizedLayer

    A dataclass for holding a parameter and its layer

  • ModifiedState

    A dataclass to represent a modified model, optimizer, and loss.

  • State

    State class holds information about the current compression state.

Functions:

  • active_session

    Return the active session for sparsification.

  • create_session

    Context manager to create and yield a new session for sparsification.

  • reset_session

    Reset the currently active session to its initial state

CompressionLifecycle dataclass

CompressionLifecycle(
    state: State = State(),
    recipe: Recipe = Recipe(),
    initialized_: bool = False,
    finalized: bool = False,
    _last_event_type: Optional[
        EventType
    ] = EventType.BATCH_END,
    _event_order: List[EventType] = (
        lambda: [
            EventType.BATCH_START,
            EventType.LOSS_CALCULATED,
            EventType.OPTIM_PRE_STEP,
            EventType.OPTIM_POST_STEP,
            EventType.BATCH_END,
        ]
    )(),
    global_step: int = 0,
)

A class for managing the lifecycle of compression events in the LLM Compressor.

Parameters:

  • state

    (State, default: State() ) –

    The current state of the compression process

  • recipe

    (Recipe, default: Recipe() ) –

    The compression recipe

  • initialized_

    (bool, default: False ) –

    Whether the lifecycle has been initialized

  • finalized

    (bool, default: False ) –

    Whether the lifecycle has been finalized

  • global_step

    (int, default: 0 ) –

    The current global step of the compression process

Methods:

  • event

    Handle a compression event.

  • finalize

    Finalize the compression lifecycle.

  • initialize

    Initialize the compression lifecycle.

  • reset

    Reset the compression lifecycle, finalizing any active modifiers and resetting all attributes.

event

event(
    event_type: EventType,
    global_step: Optional[int] = 0,
    **kwargs
) -> List[Any]

Handle a compression event.

Parameters:

  • event_type

    (EventType) –

    The type of event to handle

  • global_step

    (Optional[int], default: 0 ) –

    The current global step to update the lifecycle with

  • kwargs

    Additional arguments to pass to the event handlers

Returns:

  • List[Any]

    List of data returned from handling the event by modifiers

Raises:

  • ValueError

    If called before initialization, after finalization, or for an invalid event type

Source code in llmcompressor/core/lifecycle.py
def event(
    self, event_type: EventType, global_step: Optional[int] = 0, **kwargs
) -> List[Any]:
    """
    Handle a compression event.

    :param event_type: The type of event to handle
    :type event_type: EventType
    :param global_step: The current global step to update the lifecycle with
    :param kwargs: Additional arguments to pass to the event handlers
    :return: List of data returned from handling the event by modifiers
    :rtype: List[Any]
    :raises ValueError: If called before initialization, after finalization,
        or for an invalid event type
    """
    if not self.initialized_:
        logger.error("Cannot invoke event before initializing")
        raise ValueError("Cannot invoke event before initializing")

    if self.finalized:
        logger.error("Cannot invoke event after finalizing")
        raise ValueError("Cannot invoke event after finalizing")

    if event_type in [EventType.INITIALIZE, EventType.FINALIZE]:
        logger.error(
            "Cannot invoke {} event. Use the corresponding method instead.",
            event_type,
        )
        raise ValueError(
            f"Cannot invoke {event_type} event. "
            f"Use the corresponding method instead."
        )

    if not self._validate_event_order(event_type):
        raise ValueError(
            f"Lifecycle events must appear in the following order: "
            f"{self._event_order}. "
            f"Instead, {self._last_event_type} was called before {event_type}"
        )

    if event_type == EventType.LOSS_CALCULATED and (
        "loss" not in kwargs or kwargs["loss"] is None
    ):
        logger.error("Loss must be provided for loss calculated event")
        raise ValueError("Loss must be provided for loss calculated event")

    logger.debug("Handling event: {}", event_type)

    # update global step
    if global_step is not None:
        self.global_step = global_step

    event = Event(type_=event_type)
    mod_data = []
    for mod in self.recipe.modifiers:
        data = mod.update_event(state=self.state, event=event, **kwargs)
        logger.debug("Updated event with modifier: {}", mod)
        if data is not None:
            mod_data.append(data)

    assert (
        event is not None
    ), f"Event lifecycle did not return an event for {event_type}"

    return mod_data
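
As a concrete illustration, the hedged sketch below drives a bare lifecycle (empty recipe, so no modifiers run) through two batches of events in the default _event_order shown in the dataclass signature above:

from llmcompressor.core import CompressionLifecycle, EventType

lifecycle = CompressionLifecycle()
lifecycle.initialize()  # empty recipe is enough to exercise event handling

# Two batches' worth of events, following the default _event_order
for step in range(2):
    lifecycle.event(EventType.BATCH_START, global_step=step)
    lifecycle.event(EventType.LOSS_CALCULATED, loss=0.5)  # loss must be non-None
    lifecycle.event(EventType.OPTIM_PRE_STEP)
    lifecycle.event(EventType.OPTIM_POST_STEP)
    lifecycle.event(EventType.BATCH_END)

lifecycle.finalize()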

finalize

finalize(**kwargs) -> List[Any]

Finalize the compression lifecycle.

Parameters:

  • kwargs

    Additional arguments to update the state with

Returns:

  • List[Any]

    List of data returned from finalizing modifiers

Raises:

  • ValueError

    If called before initialization or more than once

Source code in llmcompressor/core/lifecycle.py
def finalize(self, **kwargs) -> List[Any]:
    """
    Finalize the compression lifecycle.

    :param kwargs: Additional arguments to update the state with
    :return: List of data returned from finalizing modifiers
    :rtype: List[Any]
    :raises ValueError: If called before initialization or more than once
    """
    if not self.initialized_:
        logger.error("Cannot finalize before initializing")
        raise ValueError("Cannot finalize before initializing")

    if self.finalized:
        logger.error("Cannot finalize more than once")
        raise ValueError("Cannot finalize more than once")

    logger.debug("Finalizing compression lifecycle")
    mod_data = []
    for mod in self.recipe.modifiers:
        data = mod.finalize(state=self.state, **kwargs)
        logger.debug("Finalized modifier: {}", mod)
        if data is not None:
            mod_data.append(data)

    self.finalized = True

    logger.info(
        "Compression lifecycle finalized for {} modifiers",
        len(self.recipe.modifiers),
    )

    return mod_data

initialize

initialize(
    recipe: Optional[RecipeInput] = None,
    recipe_stage: Optional[RecipeStageInput] = None,
    recipe_args: Optional[RecipeArgsInput] = None,
    **kwargs
) -> List[Any]

Initialize the compression lifecycle.

Parameters:

  • recipe

    (Optional[RecipeInput], default: None ) –

    The recipe to initialize the lifecycle with

  • recipe_stage

    (Optional[RecipeStageInput], default: None ) –

    The stage of the recipe to target

  • recipe_args

    (Optional[RecipeArgsInput], default: None ) –

    Arguments used to override recipe defaults

  • kwargs

    Additional arguments to update the state with

Returns:

  • List[Any]

    List of data returned from initialization of modifiers

Source code in llmcompressor/core/lifecycle.py
def initialize(
    self,
    recipe: Optional[RecipeInput] = None,
    recipe_stage: Optional[RecipeStageInput] = None,
    recipe_args: Optional[RecipeArgsInput] = None,
    **kwargs,
) -> List[Any]:
    """
    Initialize the compression lifecycle.

    :param kwargs: Additional arguments to update the state with
    :return: List of data returned from initialization of modifiers
    :rtype: List[Any]
    """

    self.state.update(**kwargs)
    if self.initialized_:  # TODO: do not initialize twice
        return []  # already initialized; nothing to (re)initialize

    logger.debug("Initializing compression lifecycle")
    if not recipe:
        self.recipe = Recipe()
    else:
        self.recipe = Recipe.create_instance(
            path_or_modifiers=recipe, target_stage=recipe_stage
        )
        if recipe_args:
            self.recipe.args = {**recipe_args}

    mod_data = []
    for mod in self.recipe.modifiers:
        data = mod.initialize(state=self.state, **kwargs)
        logger.debug("Initialized modifier: {}", mod)
        if data is not None:
            mod_data.append(data)

    self.initialized_ = True
    logger.info(
        "Compression lifecycle initialized for {} modifiers",
        len(self.recipe.modifiers),
    )

    return mod_data
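
The early return above makes initialization idempotent once initialized_ is set; a minimal sketch:

from llmcompressor.core import CompressionLifecycle

lifecycle = CompressionLifecycle()

# First call installs an empty Recipe() and flips initialized_
mods = lifecycle.initialize()
assert lifecycle.initialized_ and mods == []

# A second call is a no-op once initialized_ is set (see the TODO above)
lifecycle.initialize()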

reset

reset()

Reset the compression lifecycle, finalizing any active modifiers and resetting all attributes.

Source code in llmcompressor/core/lifecycle.py
def reset(self):
    """
    Reset the compression lifecycle, finalizing any active modifiers
    and resetting all attributes.
    """
    logger.debug("Resetting compression lifecycle")

    for mod in self.recipe.modifiers:
        if not mod.initialized or mod.finalized:
            continue
        try:
            mod.finalize(self.state)
            logger.debug("Finalized modifier: {}", mod)
        except Exception as e:
            logger.warning(f"Exception during finalizing modifier: {e}")

    self.__init__()
    logger.info("Compression lifecycle reset")

CompressionSession

CompressionSession()

A session for compression that holds the lifecycle and state for the current compression session

Methods:

  • event

    Invoke an event for the current CompressionSession.

  • finalize

    Finalize the session for compression. This will run the finalize method for each modifier in the session's lifecycle.

  • get_serialized_recipe

    Return the serialized string of the current compiled recipe, if available.

  • initialize

    Initialize the session for compression. This will run the initialize method for each modifier in the session's lifecycle.

  • log

    Log model and loss information for the current event type

  • reset

    Reset the session to its initial state

  • reset_stage

    Reset the session to start a new stage; the recipe and model stay intact

Attributes:

  • lifecycle (CompressionLifecycle) –

    The lifecycle of the current compression session.

  • state (State) –

    The state of the current compression session.

Source code in llmcompressor/core/session.py
def __init__(self):
    self._lifecycle = CompressionLifecycle()

lifecycle property

lifecycle: CompressionLifecycle

Lifecycle is used to keep track of where we are in the compression process and which modifiers are active. It also provides the ability to invoke events on the lifecycle.

Returns:

  • CompressionLifecycle

    the lifecycle of the current session

state property

state: State

State of the current compression session. The State instance stores all information needed for compression, such as the recipe, model, optimizer, and data.

Returns:

  • State

    the current state of the session

event

event(
    event_type: EventType,
    batch_data: Optional[Any] = None,
    loss: Optional[Any] = None,
    **kwargs
) -> ModifiedState

Invoke an event for the current CompressionSession.

Parameters:

  • event_type

    (EventType) –

    the event type to invoke

  • batch_data

    (Optional[Any], default: None ) –

    the batch data to use for the event

  • loss

    (Optional[Any], default: None ) –

    the loss to use for the event if any

  • kwargs

    additional kwargs to pass to the lifecycle's event method

Returns:

  • ModifiedState

    the modified state of the session after invoking the event

Source code in llmcompressor/core/session.py
def event(
    self,
    event_type: EventType,
    batch_data: Optional[Any] = None,
    loss: Optional[Any] = None,
    **kwargs,
) -> ModifiedState:
    """
    Invoke an event for the current CompressionSession.

    :param event_type: the event type to invoke
    :param batch_data: the batch data to use for the event
    :param loss: the loss to use for the event if any
    :param kwargs: additional kwargs to pass to the lifecycle's event method
    :return: the modified state of the session after invoking the event
    """
    mod_data = self._lifecycle.event(
        event_type=event_type, batch_data=batch_data, loss=loss, **kwargs
    )
    return ModifiedState(
        model=self.state.model,
        optimizer=self.state.optimizer,
        loss=self.state.loss,  # TODO: is this supposed to be a different type?
        modifier_data=mod_data,
    )

finalize

finalize(**kwargs) -> ModifiedState

Finalize the session for compression. This will run the finalize method for each modifier in the session's lifecycle. This will also set the session's state to the finalized state.

Parameters:

  • kwargs

    additional kwargs to pass to the lifecycle's finalize method

Returns:

  • ModifiedState

    the modified state of the session after finalizing

Source code in llmcompressor/core/session.py
def finalize(self, **kwargs) -> ModifiedState:
    """
    Finalize the session for compression. This will run the finalize method
    for each modifier in the session's lifecycle. This will also set the session's
    state to the finalized state.

    :param kwargs: additional kwargs to pass to the lifecycle's finalize method
    :return: the modified state of the session after finalizing
    """
    mod_data = self._lifecycle.finalize(**kwargs)

    return ModifiedState(
        model=self.state.model,
        optimizer=self.state.optimizer,
        loss=self.state.loss,
        modifier_data=mod_data,
    )

get_serialized_recipe

get_serialized_recipe() -> Optional[str]

Return the serialized string of the current compiled recipe, if available.

Returns:

  • Optional[str]

    serialized string of the current compiled recipe

Source code in llmcompressor/core/session.py
def get_serialized_recipe(self) -> Optional[str]:
    """
    :return: serialized string of the current compiled recipe
    """
    recipe = self.lifecycle.recipe

    if recipe is not None and hasattr(recipe, "yaml"):
        return recipe.yaml()

    logger.warning("Recipe not found in session - it may have been reset")
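
A small usage sketch for persisting the compiled recipe; the output filename is illustrative:

from llmcompressor.core import active_session

serialized = active_session().get_serialized_recipe()
if serialized is not None:
    # persist the compiled recipe (YAML) alongside the saved model
    with open("recipe.yaml", "w") as handle:
        handle.write(serialized)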

initialize

initialize(
    recipe: Union[
        str, List[str], Recipe, List[Recipe], None
    ] = None,
    recipe_stage: Union[str, List[str], None] = None,
    recipe_args: Union[Dict[str, Any], None] = None,
    model: Optional[Any] = None,
    teacher_model: Optional[Any] = None,
    optimizer: Optional[Any] = None,
    attach_optim_callbacks: bool = True,
    train_data: Optional[Any] = None,
    val_data: Optional[Any] = None,
    test_data: Optional[Any] = None,
    calib_data: Optional[Any] = None,
    copy_data: bool = True,
    start: Optional[float] = None,
    steps_per_epoch: Optional[int] = None,
    batches_per_step: Optional[int] = None,
    loggers: Union[
        None, LoggerManager, List[BaseLogger]
    ] = None,
    **kwargs
) -> ModifiedState

Initialize the session for compression. This will run the initialize method for each modifier in the session's lifecycle. This will also set the session's state to the initialized state.

Parameters:

  • recipe

    (Union[str, List[str], Recipe, List[Recipe], None], default: None ) –

    the recipe to use for the compression, can be a path to a recipe file, a raw recipe string, a recipe object, or a list of recipe objects.

  • recipe_stage

    (Union[str, List[str], None], default: None ) –

    the stage to target for the compression

  • recipe_args

    (Union[Dict[str, Any], None], default: None ) –

    the args to use for overriding the recipe defaults

  • model

    (Optional[Any], default: None ) –

    the model to compress

  • teacher_model

    (Optional[Any], default: None ) –

    the teacher model to use for knowledge distillation

  • optimizer

    (Optional[Any], default: None ) –

    the optimizer to use for the compression

  • attach_optim_callbacks

    (bool, default: True ) –

    True to attach the optimizer callbacks to the compression lifecycle, False otherwise

  • train_data

    (Optional[Any], default: None ) –

    the training data to use for the compression

  • val_data

    (Optional[Any], default: None ) –

    the validation data to use for the compression

  • test_data

    (Optional[Any], default: None ) –

    the testing data to use for the compression

  • calib_data

    (Optional[Any], default: None ) –

    the calibration data to use for the compression

  • copy_data

    (bool, default: True ) –

    True to copy the data, False otherwise

  • start

    (Optional[float], default: None ) –

    the start epoch to use for the compression

  • steps_per_epoch

    (Optional[int], default: None ) –

    the number of steps per epoch to use for the compression

  • batches_per_step

    (Optional[int], default: None ) –

    the number of batches per step to use for compression

  • loggers

    (Union[None, LoggerManager, List[BaseLogger]], default: None ) –

    the metrics manager used to log important info and milestones; also accepts a list of BaseLogger instances

  • kwargs

    additional kwargs to pass to the lifecycle's initialize method

Returns:

  • ModifiedState

    the modified state of the session after initializing

Source code in llmcompressor/core/session.py
def initialize(
    self,
    recipe: Union[str, List[str], "Recipe", List["Recipe"], None] = None,
    recipe_stage: Union[str, List[str], None] = None,
    recipe_args: Union[Dict[str, Any], None] = None,
    model: Optional[Any] = None,
    teacher_model: Optional[Any] = None,
    optimizer: Optional[Any] = None,
    attach_optim_callbacks: bool = True,
    train_data: Optional[Any] = None,
    val_data: Optional[Any] = None,
    test_data: Optional[Any] = None,
    calib_data: Optional[Any] = None,
    copy_data: bool = True,
    start: Optional[float] = None,
    steps_per_epoch: Optional[int] = None,
    batches_per_step: Optional[int] = None,
    loggers: Union[None, LoggerManager, List[BaseLogger]] = None,
    **kwargs,
) -> ModifiedState:
    """
    Initialize the session for compression. This will run the initialize method
    for each modifier in the session's lifecycle. This will also set the session's
    state to the initialized state.

    :param recipe: the recipe to use for the compression, can be a path to a
        recipe file, a raw recipe string, a recipe object, or a list
        of recipe objects.
    :param recipe_stage: the stage to target for the compression
    :param recipe_args: the args to use for overriding the recipe defaults
    :param model: the model to compress
    :param teacher_model: the teacher model to use for knowledge distillation
    :param optimizer: the optimizer to use for the compression
    :param attach_optim_callbacks: True to attach the optimizer callbacks to the
        compression lifecycle, False otherwise
    :param train_data: the training data to use for the compression
    :param val_data: the validation data to use for the compression
    :param test_data: the testing data to use for the compression
    :param calib_data: the calibration data to use for the compression
    :param copy_data: True to copy the data, False otherwise
    :param start: the start epoch to use for the compression
    :param steps_per_epoch: the number of steps per epoch to use for the
        compression
    :param batches_per_step: the number of batches per step to use for
        compression
    :param loggers: the metrics manager to setup logging important info
        and milestones to, also accepts a list of BaseLogger(s)
    :param kwargs: additional kwargs to pass to the lifecycle's initialize method
    :return: the modified state of the session after initializing
    """
    mod_data = self._lifecycle.initialize(
        recipe=recipe,
        recipe_stage=recipe_stage,
        recipe_args=recipe_args,
        model=model,
        teacher_model=teacher_model,
        optimizer=optimizer,
        attach_optim_callbacks=attach_optim_callbacks,
        train_data=train_data,
        val_data=val_data,
        test_data=test_data,
        calib_data=calib_data,
        copy_data=copy_data,
        start=start,
        steps_per_epoch=steps_per_epoch,
        batches_per_step=batches_per_step,
        loggers=loggers,
        **kwargs,
    )

    return ModifiedState(
        model=self.state.model,
        optimizer=self.state.optimizer,
        loss=self.state.loss,
        modifier_data=mod_data,
    )
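
A minimal sketch with placeholder torch objects and toy calibration data (no recipe is supplied, so no modifiers actually run):

import torch

from llmcompressor.core import active_session

model = torch.nn.Linear(16, 16)                    # placeholder model
optimizer = torch.optim.AdamW(model.parameters())  # placeholder optimizer

modified = active_session().initialize(
    model=model,
    optimizer=optimizer,
    calib_data=[torch.randn(2, 16)],  # toy calibration batches
    copy_data=False,                  # skip deep-copying the data
)
assert modified.model is model  # ModifiedState echoes the session state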

log

log(event_type: EventType, loss: Optional[Any] = None)

Log model and loss information for the current event type

Parameters:

  • event_type

    (EventType) –

    the event type to log for

  • loss

    (Optional[Any], default: None ) –

    the loss to log if any

Source code in llmcompressor/core/session.py
def log(self, event_type: EventType, loss: Optional[Any] = None):
    """
    Log model and loss information for the current event type

    :param event_type: the event type to log for
    :param loss: the loss to log if any
    """
    self._log_model_info()
    self._log_loss(event_type=event_type, loss=loss)

reset

reset()

Reset the session to its initial state

Source code in llmcompressor/core/session.py
def reset(self):
    """
    Reset the session to its initial state
    """
    self._lifecycle.reset()

reset_stage

reset_stage()

Reset the session to start a new stage; the recipe and model stay intact

Source code in llmcompressor/core/session.py
def reset_stage(self):
    """
    Reset the session to start a new stage; the recipe and model stay intact
    """
    self.lifecycle.initialized_ = False
    self.lifecycle.finalized = False

Data dataclass

Data(
    train: Optional[Any] = None,
    val: Optional[Any] = None,
    test: Optional[Any] = None,
    calib: Optional[Any] = None,
)

A dataclass to hold different data sets for training, validation, testing, and/or calibration. Each data set is a ModifiableData instance.

Parameters:

  • train

    (Optional[Any], default: None ) –

    The training data set

  • val

    (Optional[Any], default: None ) –

    The validation data set

  • test

    (Optional[Any], default: None ) –

    The testing data set

  • calib

    (Optional[Any], default: None ) –

    The calibration data set

Event dataclass

Event(
    type_: Optional[EventType] = None,
    steps_per_epoch: Optional[int] = None,
    batches_per_step: Optional[int] = None,
    invocations_per_step: int = 1,
    global_step: int = 0,
    global_batch: int = 0,
)

A class for defining an event that can be triggered during sparsification.

Parameters:

  • type_

    (Optional[EventType], default: None ) –

    The type of event.

  • steps_per_epoch

    (Optional[int], default: None ) –

    The number of steps per epoch.

  • batches_per_step

    (Optional[int], default: None ) –

    The number of batches per step where step is an optimizer step invocation. For most pathways, these are the same. See the invocations_per_step parameter for more details when they are not.

  • invocations_per_step

    (int, default: 1 ) –

    The number of invocations of the step wrapper before optimizer.step was called. Generally can be left as 1 (default). For older amp pathways, this is the number of times the scaler wrapper was invoked before the wrapped optimizer step function was called to handle accumulation in fp16.

  • global_step

    (int, default: 0 ) –

    The current global step.

  • global_batch

    (int, default: 0 ) –

    The current global batch.

Methods:

  • new_instance

    Creates a new instance of the event with the provided keyword arguments.

  • should_update

    Determines if the event should trigger an update.

Attributes:

  • current_index (float) –

    Calculates the current index of the event.

  • epoch (int) –

    Calculates the current epoch.

  • epoch_based (bool) –

    Determines if the event is based on epochs.

  • epoch_batch (int) –

    Calculates the current batch within the current epoch.

  • epoch_full (float) –

    Calculates the current epoch with the fraction of the current step.

  • epoch_step (int) –

    Calculates the current step within the current epoch.

current_index property writable

current_index: float

Calculates the current index of the event.

Returns:

  • float

    The current index of the event, which is either the global step or the epoch with the fraction of the current step.

Raises:

  • ValueError

    if the event is not epoch based or if the steps per epoch are too many.

epoch property

epoch: int

Calculates the current epoch.

Returns:

  • int

    The current epoch.

Raises:

  • ValueError

    if the event is not epoch based.

epoch_based property

epoch_based: bool

Determines if the event is based on epochs.

Returns:

  • bool

    True if the event is based on epochs, False otherwise.

epoch_batch property

epoch_batch: int

Calculates the current batch within the current epoch.

Returns:

  • int

    The current batch within the current epoch.

Raises:

  • ValueError

    if the event is not epoch based.

epoch_full property

epoch_full: float

Calculates the current epoch with the fraction of the current step.

Returns:

  • float

    The current epoch with the fraction of the current step.

Raises:

  • ValueError

    if the event is not epoch based.

epoch_step property

epoch_step: int

Calculates the current step within the current epoch.

Returns:

  • int

    The current step within the current epoch.

Raises:

  • ValueError

    if the event is not epoch based.

new_instance

new_instance(**kwargs) -> Event

Creates a new instance of the event with the provided keyword arguments.

Parameters:

  • kwargs

    Keyword arguments to set in the new instance.

Returns:

  • Event

    A new instance of the event with the provided kwargs.

Source code in llmcompressor/core/events/event.py
def new_instance(self, **kwargs) -> "Event":
    """
    Creates a new instance of the event with the provided keyword arguments.

    :param kwargs: Keyword arguments to set in the new instance.
    :return: A new instance of the event with the provided kwargs.
    :rtype: Event
    """
    logger.debug("Creating new instance of event with kwargs: {}", kwargs)
    instance = deepcopy(self)
    for key, value in kwargs.items():
        setattr(instance, key, value)
    return instance
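
A small sketch of the copy-and-override behavior:

from llmcompressor.core import Event, EventType

base = Event(type_=EventType.BATCH_START, global_step=10)
copy = base.new_instance(global_step=11)

assert copy.global_step == 11 and base.global_step == 10  # original untouched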

should_update

should_update(
    start: Optional[float],
    end: Optional[float],
    update: Optional[float],
) -> bool

Determines if the event should trigger an update.

Parameters:

  • start

    (Optional[float]) –

    The start index to check against, set to None to ignore start.

  • end

    (Optional[float]) –

    The end index to check against, set to None to ignore end.

  • update

    (Optional[float]) –

    The update interval, set to None or 0.0 to always update, otherwise must be greater than 0.0, defaults to None.

Returns:

  • bool

    True if the event should trigger an update, False otherwise.

Source code in llmcompressor/core/events/event.py
def should_update(
    self, start: Optional[float], end: Optional[float], update: Optional[float]
) -> bool:
    """
    Determines if the event should trigger an update.

    :param start: The start index to check against, set to None to ignore start.
    :type start: Optional[float]
    :param end: The end index to check against, set to None to ignore end.
    :type end: Optional[float]
    :param update: The update interval, set to None or 0.0 to always update,
        otherwise must be greater than 0.0, defaults to None.
    :type update: Optional[float]
    :return: True if the event should trigger an update, False otherwise.
    :rtype: bool
    """
    current = self.current_index
    logger.debug(
        "Checking if event should update: "
        "current_index={}, start={}, end={}, update={}",
        current,
        start,
        end,
        update,
    )
    if start is not None and current < start:
        return False
    if end is not None and current > end:
        return False
    return update is None or update <= 0.0 or current % update < 1e-10
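
A worked sketch of the gating logic; it assumes an epoch-based event whose current_index resolves to 25 / 10 = 2.5 (see current_index above), so the expected booleans are noted per call:

from llmcompressor.core import Event, EventType

# steps_per_epoch makes the event epoch-based; current_index ~= 25 / 10 = 2.5
event = Event(type_=EventType.BATCH_END, steps_per_epoch=10, global_step=25)

print(event.should_update(start=1.0, end=5.0, update=0.5))    # True: 2.5 % 0.5 == 0
print(event.should_update(start=3.0, end=None, update=None))  # False: before start
print(event.should_update(start=None, end=2.0, update=None))  # False: past end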

EventType

Bases: Enum

An Enum for defining the different types of events that can be triggered during model compression lifecycles. The purpose of each EventType is to trigger the corresponding modifier callback during training or post training pipelines.

Attributes:

  • INITIALIZE

    Event type for initialization.

  • FINALIZE

    Event type for finalization.

  • BATCH_START

    Event type for the start of a batch.

  • LOSS_CALCULATED

    Event type for when loss is calculated.

  • BATCH_END

    Event type for the end of a batch.

  • CALIBRATION_EPOCH_START

    Event type for the start of a calibration epoch.

  • SEQUENTIAL_EPOCH_END

    Event type for the end of a layer calibration epoch, specifically used by src/llmcompressor/pipelines/sequential/pipeline.py

  • CALIBRATION_EPOCH_END

    Event type for the end of a calibration epoch.

  • OPTIM_PRE_STEP

    Event type for pre-optimization step.

  • OPTIM_POST_STEP

    Event type for post-optimization step.

Hardware dataclass

Hardware(
    device: Optional[str] = None,
    devices: Optional[List[str]] = None,
    rank: Optional[int] = None,
    world_size: Optional[int] = None,
    local_rank: Optional[int] = None,
    local_world_size: Optional[int] = None,
    distributed: Optional[bool] = None,
    distributed_strategy: Optional[str] = None,
)

A dataclass to hold information about the hardware being used.

Parameters:

  • device

    (Optional[str], default: None ) –

    The current device being used for training

  • devices

    (Optional[List[str]], default: None ) –

    List of all devices to be used for training

  • rank

    (Optional[int], default: None ) –

    The rank of the current device

  • world_size

    (Optional[int], default: None ) –

    The total number of devices being used

  • local_rank

    (Optional[int], default: None ) –

    The local rank of the current device

  • local_world_size

    (Optional[int], default: None ) –

    The total number of devices being used on the local machine

  • distributed

    (Optional[bool], default: None ) –

    Whether or not distributed training is being used

  • distributed_strategy

    (Optional[str], default: None ) –

    The distributed strategy being used

LifecycleCallbacks

A class for invoking lifecycle events for the active session

Methods:

  • batch_end

    Invoke a batch end event for the active session

  • batch_start

    Invoke a batch start event for the active session

  • calibration_epoch_end

    Invoke an epoch end event for the active session during calibration. This event should be called after the model has been calibrated for one epoch.

  • calibration_epoch_start

    Invoke an epoch start event for the active session during calibration. This event should be called before calibration starts for one epoch.

  • event

    Invoke an event for the active session

  • loss_calculated

    Invoke a loss calculated event for the active session

  • optim_post_step

    Invoke an optimizer post-step event for the active session

  • optim_pre_step

    Invoke an optimizer pre-step event for the active session

  • sequential_epoch_end

    Invoke a sequential epoch end event for the active session. This event should be called after one sequential layer has been calibrated/trained for one epoch.

batch_end classmethod

batch_end(**kwargs) -> ModifiedState

Invoke a batch end event for the active session

Parameters:

  • kwargs

    additional kwargs to pass to the current session's event method

Returns:

  • ModifiedState

    the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py
@classmethod
def batch_end(cls, **kwargs) -> ModifiedState:
    """
    Invoke a batch end event for the active session

    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    active_session()._log_model_info()
    return cls.event(EventType.BATCH_END, **kwargs)

batch_start classmethod

batch_start(
    batch_data: Optional[Any] = None, **kwargs
) -> ModifiedState

Invoke a batch start event for the active session

Parameters:

  • batch_data

    (Optional[Any], default: None ) –

    the batch data to use for the event

  • kwargs

    additional kwargs to pass to the current session's event method

Returns:

  • ModifiedState

    the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py
@classmethod
def batch_start(cls, batch_data: Optional[Any] = None, **kwargs) -> ModifiedState:
    """
    Invoke a batch start event for the active session

    :param batch_data: the batch data to use for the event
    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    return cls.event(EventType.BATCH_START, batch_data=batch_data, **kwargs)

calibration_epoch_end classmethod

calibration_epoch_end(**kwargs) -> ModifiedState

Invoke an epoch end event for the active session during calibration. This event should be called after the model has been calibrated for one epoch

see src/llmcompressor/pipelines/basic/pipeline.py for usage example

Source code in llmcompressor/core/session_functions.py
@classmethod
def calibration_epoch_end(cls, **kwargs) -> ModifiedState:
    """
    Invoke an epoch end event for the active session during calibration. This
    event should be called after the model has been calibrated for one epoch

    see `src/llmcompressor/pipelines/basic/pipeline.py` for usage example
    """
    return cls.event(EventType.CALIBRATION_EPOCH_END, **kwargs)

calibration_epoch_start classmethod

calibration_epoch_start(**kwargs) -> ModifiedState

Invoke an epoch start event for the active session during calibration. This event should be called before calibration starts for one epoch

see src/llmcompressor/pipelines/basic/pipeline.py for usage example

Source code in llmcompressor/core/session_functions.py
@classmethod
def calibration_epoch_start(cls, **kwargs) -> ModifiedState:
    """
    Invoke an epoch start event for the active session during calibration. This
    event should be called before calibration starts for one epoch

    see `src/llmcompressor/pipelines/basic/pipeline.py` for usage example
    """
    return cls.event(EventType.CALIBRATION_EPOCH_START, **kwargs)

event classmethod

event(event_type: EventType, **kwargs) -> ModifiedState

Invoke an event for the active session

Parameters:

  • event_type

    (EventType) –

    the event type to invoke

  • kwargs

    additional kwargs to pass to the current session's event method

Returns:

  • ModifiedState

    the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py
@classmethod
def event(cls, event_type: EventType, **kwargs) -> ModifiedState:
    """
    Invoke an event for the active session

    :param event_type: the event type to invoke
    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    if event_type in [EventType.INITIALIZE, EventType.FINALIZE]:
        raise ValueError(
            f"Cannot invoke {event_type} event. "
            f"Use the corresponding method instead."
        )

    return active_session().event(event_type, **kwargs)

loss_calculated classmethod

loss_calculated(
    loss: Optional[Any] = None, **kwargs
) -> ModifiedState

Invoke a loss calculated event for the active session

Parameters:

  • loss

    (Optional[Any], default: None ) –

    the loss to use for the event

  • kwargs

    additional kwargs to pass to the current session's event method

Returns:

  • ModifiedState

    the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py
@classmethod
def loss_calculated(cls, loss: Optional[Any] = None, **kwargs) -> ModifiedState:
    """
    Invoke a loss calculated event for the active session

    :param loss: the loss to use for the event
    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    # log loss if loss calculated
    active_session()._log_loss(event_type=EventType.LOSS_CALCULATED, loss=loss)
    return cls.event(EventType.LOSS_CALCULATED, loss=loss, **kwargs)
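
Taken together, these classmethods map onto an ordinary training loop. The hedged sketch below uses placeholder torch objects and no recipe, so modifiers are not actually applied:

import torch

from llmcompressor.core import LifecycleCallbacks, create_session

model = torch.nn.Linear(4, 1)                             # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # placeholder optimizer

with create_session() as session:
    session.initialize(model=model, optimizer=optimizer)

    for batch in [torch.randn(8, 4) for _ in range(2)]:  # toy batches
        LifecycleCallbacks.batch_start(batch_data=batch)

        loss = model(batch).mean()
        LifecycleCallbacks.loss_calculated(loss=loss)

        loss.backward()
        LifecycleCallbacks.optim_pre_step()
        optimizer.step()
        LifecycleCallbacks.optim_post_step()

        optimizer.zero_grad()
        LifecycleCallbacks.batch_end()

    session.finalize()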

optim_post_step classmethod

optim_post_step(**kwargs) -> ModifiedState

Invoke an optimizer post-step event for the active session

Parameters:

  • kwargs

    additional kwargs to pass to the current session's event method

Returns:

  • ModifiedState

    the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py
@classmethod
def optim_post_step(cls, **kwargs) -> ModifiedState:
    """
    Invoke an optimizer post-step event for the active session

    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    return cls.event(EventType.OPTIM_POST_STEP, **kwargs)

optim_pre_step classmethod

optim_pre_step(**kwargs) -> ModifiedState

Invoke an optimizer pre-step event for the active session

Parameters:

  • kwargs

    additional kwargs to pass to the current session's event method

Returns:

  • ModifiedState

    the modified state of the active session after invoking the event

Source code in llmcompressor/core/session_functions.py
@classmethod
def optim_pre_step(cls, **kwargs) -> ModifiedState:
    """
    Invoke an optimizer pre-step event for the active session

    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    return cls.event(EventType.OPTIM_PRE_STEP, **kwargs)

sequential_epoch_end classmethod

sequential_epoch_end(**kwargs) -> ModifiedState

Invoke a sequential epoch end event for the active session. This event should be called after one sequential layer has been calibrated/trained for one epoch

This is called after a sequential layer has been calibrated with one batch, see src/llmcompressor/pipelines/sequential/pipeline.py for usage example

Source code in llmcompressor/core/session_functions.py
@classmethod
def sequential_epoch_end(cls, **kwargs) -> ModifiedState:
    """
    Invoke a sequential epoch end event for the active session. This event should be
    called after one sequential layer has been calibrated/trained for one epoch

    This is called after a sequential layer has been calibrated with one batch, see
    `src/llmcompressor/pipelines/sequential/pipeline.py` for usage example
    """
    return cls.event(EventType.SEQUENTIAL_EPOCH_END, **kwargs)

ModelParameterizedLayer dataclass

ModelParameterizedLayer(
    layer_name: str, layer: Any, param_name: str, param: Any
)

A dataclass for holding a parameter and its layer

Parameters:

  • layer_name

    (str) –

    the name of the layer

  • layer

    (Any) –

    the layer object

  • param_name

    (str) –

    the name of the parameter

  • param

    (Any) –

    the parameter object

ModifiedState dataclass

ModifiedState(model, optimizer, loss, modifier_data)

A dataclass to represent a modified model, optimizer, and loss.

Parameters:

  • model

    (Optional[Any]) –

    The modified model

  • optimizer

    (Optional[Any]) –

    The modified optimizer

  • loss

    (Optional[Any]) –

    The modified loss

  • modifier_data

    (Optional[List[Dict[str, Any]]]) –

    The modifier data used to modify the model, optimizer, and loss

Initialize the ModifiedState with the given parameters.

Parameters:

  • model

    (Any) –

    The modified model

  • optimizer

    (Any) –

    The modified optimizer

  • loss

    (Any) –

    The modified loss

  • modifier_data

    (List[Dict[str, Any]]) –

    The modifier data used to modify the model, optimizer, and loss

Source code in llmcompressor/core/state.py
def __init__(self, model, optimizer, loss, modifier_data):
    """
    Initialize the ModifiedState with the given parameters.

    :param model: The modified model
    :type model: Any
    :param optimizer: The modified optimizer
    :type optimizer: Any
    :param loss: The modified loss
    :type loss: Any
    :param modifier_data: The modifier data used to modify the model, optimizer,
        and loss
    :type modifier_data: List[Dict[str, Any]]
    """
    self.model = model
    self.optimizer = optimizer
    self.loss = loss
    self.modifier_data = modifier_data

State dataclass

State(
    model: Any = None,
    teacher_model: Any = None,
    optimizer: Any = None,
    optim_wrapped: bool = None,
    loss: Any = None,
    batch_data: Any = None,
    data: Data = Data(),
    hardware: Hardware = Hardware(),
    loggers: Optional[LoggerManager] = None,
    model_log_cadence: Optional[float] = None,
    _last_log_step: Union[float, int, None] = None,
)

State class holds information about the current compression state.

Parameters:

  • model

    (Any, default: None ) –

    The model being used for compression

  • teacher_model

    (Any, default: None ) –

    The teacher model being used for compression

  • optimizer

    (Any, default: None ) –

    The optimizer being used for training

  • optim_wrapped

    (bool, default: None ) –

    Whether or not the optimizer has been wrapped

  • loss

    (Any, default: None ) –

    The loss function being used for training

  • batch_data

    (Any, default: None ) –

    The current batch of data being used for compression

  • data

    (Data, default: Data() ) –

    The data sets being used for training, validation, testing, and/or calibration, wrapped in a Data instance

  • hardware

    (Hardware, default: Hardware() ) –

    Hardware instance holding info about the target hardware being used

  • loggers

    (Optional[LoggerManager], default: None ) –

    LoggerManager instance holding all the loggers to log

  • model_log_cadence

    (Optional[float], default: None ) –

    The cadence to log model information w.r.t. epochs. If 1, logs every epoch. If 2, logs every other epoch, etc. Default is 1.

Methods:

  • update

    Update the state with the given parameters.

Attributes:

  • compression_ready (bool) –

    Check if the model and optimizer are set for compression.

compression_ready property

compression_ready: bool

Check if the model and optimizer are set for compression.

Returns:

  • bool

    True if model and optimizer are set, False otherwise

update

update(
    model: Any = None,
    teacher_model: Any = None,
    optimizer: Any = None,
    attach_optim_callbacks: bool = True,
    train_data: Any = None,
    val_data: Any = None,
    test_data: Any = None,
    calib_data: Any = None,
    copy_data: bool = True,
    start: float = None,
    steps_per_epoch: int = None,
    batches_per_step: int = None,
    loggers: Union[
        None, LoggerManager, List[BaseLogger]
    ] = None,
    model_log_cadence: Optional[float] = None,
    **kwargs
) -> Dict

Update the state with the given parameters.

Parameters:

  • model

    (Any, default: None ) –

    The model to update the state with

  • teacher_model

    (Any, default: None ) –

    The teacher model to update the state with

  • optimizer

    (Any, default: None ) –

    The optimizer to update the state with

  • attach_optim_callbacks

    (bool, default: True ) –

    Whether or not to attach optimizer callbacks

  • train_data

    (Any, default: None ) –

    The training data to update the state with

  • val_data

    (Any, default: None ) –

    The validation data to update the state with

  • test_data

    (Any, default: None ) –

    The testing data to update the state with

  • calib_data

    (Any, default: None ) –

    The calibration data to update the state with

  • copy_data

    (bool, default: True ) –

    Whether or not to copy the data

  • start

    (float, default: None ) –

    The start index to update the state with

  • steps_per_epoch

    (int, default: None ) –

    The steps per epoch to update the state with

  • batches_per_step

    (int, default: None ) –

    The batches per step to update the state with

  • loggers

    (Union[None, LoggerManager, List[BaseLogger]], default: None ) –

    The metrics manager used to log important info and milestones; also accepts a list of BaseLogger instances

  • model_log_cadence

    (Optional[float], default: None ) –

    The cadence to log model information w.r.t. epochs. If 1, logs every epoch. If 2, logs every other epoch, etc. Default is 1.

  • kwargs

    Additional keyword arguments to update the state with

Returns:

  • Dict

    The updated state as a dictionary

Source code in llmcompressor/core/state.py
def update(
    self,
    model: Any = None,
    teacher_model: Any = None,
    optimizer: Any = None,
    attach_optim_callbacks: bool = True,
    train_data: Any = None,
    val_data: Any = None,
    test_data: Any = None,
    calib_data: Any = None,
    copy_data: bool = True,
    start: float = None,
    steps_per_epoch: int = None,
    batches_per_step: int = None,
    loggers: Union[None, LoggerManager, List[BaseLogger]] = None,
    model_log_cadence: Optional[float] = None,
    **kwargs,
) -> Dict:
    """
    Update the state with the given parameters.

    :param model: The model to update the state with
    :type model: Any
    :param teacher_model: The teacher model to update the state with
    :type teacher_model: Any
    :param optimizer: The optimizer to update the state with
    :type optimizer: Any
    :param attach_optim_callbacks: Whether or not to attach optimizer callbacks
    :type attach_optim_callbacks: bool
    :param train_data: The training data to update the state with
    :type train_data: Any
    :param val_data: The validation data to update the state with
    :type val_data: Any
    :param test_data: The testing data to update the state with
    :type test_data: Any
    :param calib_data: The calibration data to update the state with
    :type calib_data: Any
    :param copy_data: Whether or not to copy the data
    :type copy_data: bool
    :param start: The start index to update the state with
    :type start: float
    :param steps_per_epoch: The steps per epoch to update the state with
    :type steps_per_epoch: int
    :param batches_per_step: The batches per step to update the state with
    :type batches_per_step: int
    :param loggers: The metrics manager to setup logging important info and
        milestones to, also accepts a list of BaseLogger(s)
    :type loggers: Union[None, LoggerManager, List[BaseLogger]]
    :param model_log_cadence: The cadence to log model information w.r.t epochs.
        If 1, logs every epoch. If 2, logs every other epoch, etc. Default is 1.
    :type model_log_cadence: Optional[float]
    :param kwargs: Additional keyword arguments to update the state with
    :return: The updated state as a dictionary
    :rtype: Dict
    """
    logger.debug(
        "Updating state with provided parameters: {}",
        {
            "model": model,
            "teacher_model": teacher_model,
            "optimizer": optimizer,
            "attach_optim_callbacks": attach_optim_callbacks,
            "train_data": train_data,
            "val_data": val_data,
            "test_data": test_data,
            "calib_data": calib_data,
            "copy_data": copy_data,
            "start": start,
            "steps_per_epoch": steps_per_epoch,
            "batches_per_step": batches_per_step,
            "loggers": loggers,
            "model_log_cadence": model_log_cadence,
            "kwargs": kwargs,
        },
    )

    if model is not None:
        self.model = model
    if teacher_model is not None:
        self.teacher_model = teacher_model
    if optimizer is not None:
        self.optim_wrapped = attach_optim_callbacks
        self.optimizer = optimizer
    if train_data is not None:
        self.data.train = train_data if not copy_data else deepcopy(train_data)
    if val_data is not None:
        self.data.val = val_data if not copy_data else deepcopy(val_data)
    if test_data is not None:
        self.data.test = test_data if not copy_data else deepcopy(test_data)
    if calib_data is not None:
        self.data.calib = calib_data if not copy_data else deepcopy(calib_data)

    if "device" in kwargs:
        self.hardware.device = kwargs["device"]

    loggers = loggers or []
    if isinstance(loggers, list):
        loggers = LoggerManager(loggers)
    self.loggers = loggers

    if model_log_cadence is not None:
        self.model_log_cadence = model_log_cadence

    return kwargs
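
A small sketch of the update semantics; note that "device" and any unrecognized keyword arguments flow through kwargs and are returned rather than raising:

import torch

from llmcompressor.core import State

state = State()
leftover = state.update(
    model=torch.nn.Linear(2, 2),  # placeholder model
    device="cuda:0",              # picked up from kwargs into state.hardware
    custom_key=123,               # unknown keys are returned, not raised on
)

assert state.hardware.device == "cuda:0"
assert leftover["custom_key"] == 123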

active_session

active_session() -> CompressionSession

Return the active session for sparsification.

Returns:

  • CompressionSession

    the active session for sparsification

Source code in llmcompressor/core/session_functions.py
def active_session() -> CompressionSession:
    """
    :return: the active session for sparsification
    """
    global _local_storage
    return getattr(_local_storage, "session", _global_session)

create_session

create_session() -> (
    Generator[CompressionSession, None, None]
)

Context manager to create and yield a new session for sparsification. This will set the active session to the new session for the duration of the context.

Returns:

  • Generator[CompressionSession, None, None]

    the new session, yielded for the duration of the context

Source code in llmcompressor/core/session_functions.py
@contextmanager
def create_session() -> Generator[CompressionSession, None, None]:
    """
    Context manager to create and yield a new session for sparsification.
    This will set the active session to the new session for the duration
    of the context.

    :return: the new session
    """
    global _local_storage
    orig_session = getattr(_local_storage, "session", None)
    new_session = CompressionSession()
    _local_storage.session = new_session
    try:
        yield new_session
    finally:
        _local_storage.session = orig_session
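
A usage sketch; the temporary session is only active inside the with-block:

from llmcompressor.core import active_session, create_session

with create_session() as session:
    # within the context, the new session is the active one
    assert active_session() is session
    session.initialize()  # empty recipe; see CompressionSession.initialize
    session.finalize()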

reset_session

reset_session()

Reset the currently active session to its initial state

Source code in llmcompressor/core/session_functions.py
def reset_session():
    """
    Reset the currently active session to its initial state
    """
    session = active_session()
    session._lifecycle.reset()