llmcompressor.transformers.finetune.session_mixin
Classes:
- SessionManagerMixIn – Mix-In class to extend the Hugging Face Trainer class to support LLM Compressor recipes for one-shot and finetuning flows
SessionManagerMixIn
SessionManagerMixIn(
    recipe: str,
    model_args: ModelArguments,
    dataset_args: Optional[DatasetArguments] = None,
    teacher: Optional[Union[Module, str]] = None,
    recipe_args: Optional[Union[Dict[str, Any], str]] = None,
    **kwargs
)
Mix-In class to extend the Hugging Face Trainer class to support LLM Compressor recipes for one-shot and finetuning flows.
Parameters:
- recipe (str) – path to the recipe file to apply during training
- recipe_args (Optional[Union[Dict[str, Any], str]], default: None) – additional kwargs to use for evaluating the recipe
- dataset_args (Optional[DatasetArguments], default: None) – kwargs for configuring dataset loading
- teacher (Optional[Union[Module, str]], default: None) – optional teacher model to use for distillation
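A minimal usage sketch follows. The mix-in is combined with a Hugging Face Trainer subclass; CompressionTrainer and the model, model_args, training_args, and train_dataset objects are illustrative placeholders, and forwarding the remaining Trainer arguments through **kwargs is an assumption based on the signature above:

from transformers import Trainer

from llmcompressor.transformers.finetune.session_mixin import SessionManagerMixIn

class CompressionTrainer(SessionManagerMixIn, Trainer):
    """Hypothetical Trainer with LLM Compressor recipe support mixed in."""

trainer = CompressionTrainer(
    recipe="recipe.yaml",        # path to the recipe file to apply during training
    model_args=model_args,       # placeholder ModelArguments instance
    model=model,                 # standard Trainer kwargs, assumed forwarded via **kwargs
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()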
Methods:
- compute_loss – Override of compute_loss that triggers modifier callbacks and filters input columns
- create_optimizer – Override the optimizer to apply and update the recipe while training
- create_scheduler – Create an LR scheduler to work with the applied recipes
- finalize_session – Wrap up training by finalizing all modifiers initialized in the current session
- initialize_session – Initialize the CompressionSession from the specified epoch and evaluate the recipe
- log_model_sparsification – Log the current model sparsification info including pruned and quantized states
- maybe_log_model_sparsification – Log info on model sparsity and quantization if possible
- save_model – Override of the save_model function, which is expected to exist in the parent
- train – Run a sparsification training cycle
- training_step – Overrides the Trainer's training step to trigger the batch_start callback to the modifiers
compute_loss
compute_loss(
    model: Module,
    inputs: Dict[str, Any],
    return_outputs: bool = False,
    num_items_in_batch: Optional[Tensor] = None,
) -> Union[torch.Tensor, Tuple[torch.Tensor, Any]]
Override of compute_loss that triggers modifier callbacks and filters input columns before computing the loss
Parameters:
- model (Module) – the model to compute the loss for
- inputs (Dict[str, Any]) – the inputs to pass through the model for calculating the loss
- return_outputs (bool, default: False) – True to return the outputs with the loss, False otherwise
- num_items_in_batch (Optional[Tensor], default: None) – the number of items which contribute to the loss
Returns:
- Union[Tensor, Tuple[Tensor, Any]] – the resulting loss if not return_outputs, otherwise a tuple containing the loss and the model's outputs
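A conceptual sketch of the override's shape, not the library's actual implementation; the _signature_columns filter and the callbacks.loss_calculated call are assumptions based on the description above:

from llmcompressor.core import callbacks

class ComputeLossSketchMixin:
    """Hypothetical illustration only; mix in before a Trainer."""

    def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None):
        # Drop input columns the model's forward() does not accept (assumed filter).
        inputs = {k: v for k, v in inputs.items() if k in self._signature_columns}
        loss = super().compute_loss(model, inputs, return_outputs=return_outputs)
        # Report the computed loss to the active session's modifiers (assumed callback).
        callbacks.loss_calculated(loss=loss)
        return loss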
create_optimizer
Override the optimizer to apply and update the recipe while training. create_optimizer must exist in the parent class and should set self.optimizer to the optimizer state and optionally set self.scaler if using amp.
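A sketch of the contract the parent class is expected to satisfy; the optimizer choice, learning rate, and gradient scaler here are illustrative, not prescribed by the mix-in:

import torch

class ParentTrainerSketch:
    """Hypothetical parent illustrating the expected create_optimizer contract."""

    def create_optimizer(self):
        # The mix-in expects self.optimizer to be set here.
        self.optimizer = torch.optim.AdamW(self.model.parameters(), lr=5e-5)
        # Optionally set self.scaler when training with AMP.
        self.scaler = torch.cuda.amp.GradScaler()
        return self.optimizer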
create_scheduler
Create an LR scheduler to work with the applied recipes. This is a placeholder that just calls the super method, but would be expanded upon if we ever implement a LearningRateModifier.
Parameters:
- num_training_steps (int) – the total number of training steps
- optimizer (Optimizer, default: None) – pre-initialized optimizer
finalize_session
Wrap up training by finalizing all modifiers initialized in the current session
initialize_session
Initialize the CompressionSession from the specified epoch, evaluate the recipe, and initialize the modifiers for the training session
Parameters:
- epoch (float) – epoch to initialize the session from, usually 0 unless loading from a checkpoint
- checkpoint (Optional[str], default: None) – optional checkpoint to initialize from to continue training
- stage (Optional[str], default: None) – optional stage of the recipe to run, or None to run all stages
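For example, resuming a single recipe stage from a saved checkpoint might look like the following; the checkpoint path and stage name are placeholders. Note that train (documented below) performs this initialization itself before calling super().train():

trainer.initialize_session(
    epoch=2.0,                       # epoch the checkpoint was saved at
    checkpoint="output/checkpoint",  # placeholder checkpoint path
    stage="sparsity_stage",          # placeholder recipe stage; None runs all stages
)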
log_model_sparsification
Log the current model sparsification info including pruned and quantized states
maybe_log_model_sparsification
Log info on model sparsity and quantization if possible. Only print logs on the main process, and avoid logging for quantized FSDP models
save_model
save_model(
    output_dir: str,
    _internal_call: bool = False,
    skip_sparsity_compression_stats: Optional[bool] = True,
)
Override of the save_model function, which is expected to exist in the parent. Calls into super() to save the model and additionally saves any recipes that were used with the model into the model folder.
Parameters:
- output_dir (str) – the path to save the recipes into
- _internal_call (bool, default: False) – True if this is an internal call from the trainer in super(); called as self.save_model(output_dir, _internal_call=True) from transformers.Trainer._save_checkpoint
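A typical call with a placeholder output directory; passing skip_sparsity_compression_stats=False to request sparsity statistics is an assumption based on the default of True in the signature above:

trainer.save_model(
    "output/compressed-model",              # placeholder path; used recipes are saved alongside the model
    skip_sparsity_compression_stats=False,  # also compute sparsity/compression stats
)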
train
Run a sparsification training cycle. Runs initialization for the sparse session before calling super().train() and finalization of the session after.
Logs sparsification details for the trained model.
Parameters:
- args – positional args to pass to super().train()
- stage (Optional[str], default: None) – optional stage of the recipe to run, or None to run all stages
- kwargs – keyword args to pass to super().train()
Returns:
- the output from super().train()
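Running the stages of a multi-stage recipe one at a time might look like this; the stage names are placeholders for names defined in a hypothetical recipe:

for stage_name in ["sparsity_stage", "finetune_stage"]:
    train_result = trainer.train(stage=stage_name)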
training_step
training_step(
    model: Module,
    inputs: Dict[str, Union[Tensor, Any]],
    num_items_in_batch: Optional[int] = None,
) -> torch.Tensor
Overrides the Trainer's training step to trigger the batch_start callback to the modifiers, then calls the parent function.
Parameters:
- model (Module) – the model to compute the loss for
- inputs (Dict[str, Union[Tensor, Any]]) – the inputs to pass through the model for calculating the loss
- num_items_in_batch (Optional[int], default: None) – the number of items which contribute to the loss
Returns:
- Tensor – output of the model
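A conceptual sketch of the override's shape, not the library's actual implementation; the callbacks.batch_start call and its batch_data keyword are assumptions based on the description above:

from llmcompressor.core import callbacks

class TrainingStepSketchMixin:
    """Hypothetical illustration only; mix in before a Trainer."""

    def training_step(self, model, inputs, num_items_in_batch=None):
        # Let the session's modifiers observe the batch before the step runs.
        callbacks.batch_start(batch_data=inputs)
        return super().training_step(model, inputs, num_items_in_batch)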