llmcompressor.utils.dev
Functions:

- dispatch_for_generation – Dispatch a model for autoregressive generation. This means that modules are dispatched evenly across available devices and kept onloaded if possible.
- patch_transformers_logger_level – Context under which the transformers logger's level is modified.
- skip_weights_download – Context manager under which models are initialized without having to download the model weight files.
- skip_weights_initialize – Context manager which skips weight initialization, similar to transformers.model_utils.no_init_weights.
dispatch_for_generation
Dispatch a model for autoregressive generation. This means that modules are dispatched evenly across available devices and kept onloaded if possible. Removes any HF hooks that may have existed previously.
Parameters:

- model (PreTrainedModel) – model to dispatch

Returns:

- PreTrainedModel – model which is dispatched
Source code in llmcompressor/utils/dev.py
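A minimal usage sketch is shown below. The checkpoint name is illustrative, and the call pattern assumes dispatch_for_generation returns the dispatched model as documented above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor.utils.dev import dispatch_for_generation

# Illustrative checkpoint; any HF causal LM would be handled the same way
model_id = "meta-llama/Llama-3.1-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Spread modules evenly across the available devices before generating
model = dispatch_for_generation(model)

inputs = tokenizer("Compression lets models", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```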
patch_transformers_logger_level
Context under which the transformers logger's level is modified.

This can be used with skip_weights_download to squelch warnings related to missing parameters in the checkpoint.
Parameters:

- level (int, default: ERROR) – new logging level for the transformers logger. Logs whose level is below this level will not be logged
Source code in llmcompressor/utils/dev.py
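A short sketch of combining this with skip_weights_download, as the description suggests. The model name is illustrative, and logging.ERROR is assumed to be a valid value for the int-typed level parameter.

```python
import logging

from transformers import AutoModelForCausalLM

from llmcompressor.utils.dev import patch_transformers_logger_level, skip_weights_download

# Suppress the missing-parameter warnings emitted while the checkpoint is
# initialized without its weight files (model name is illustrative)
with skip_weights_download(), patch_transformers_logger_level(logging.ERROR):
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
```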
skip_weights_download
Context manager under which models are initialized without having to download the model weight files. This differs from init_empty_weights in that weights are allocated on their assigned devices with random values, as opposed to being on the meta device
Parameters:

- model_class (Type[PreTrainedModel], default: AutoModelForCausalLM) – class to patch, defaults to AutoModelForCausalLM
Source code in llmcompressor/utils/dev.py
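A minimal sketch under the assumption that the class can be passed positionally (it matches the documented default, so it could also be omitted). The checkpoint name is illustrative; configuration and tokenizer files may still be fetched.

```python
from transformers import AutoModelForCausalLM

from llmcompressor.utils.dev import skip_weights_download

# Build the model architecture with randomly-valued weights on real devices
# instead of downloading the checkpoint's weight files
with skip_weights_download(AutoModelForCausalLM):
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

print(sum(p.numel() for p in model.parameters()))  # architecture is intact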
skip_weights_initialize
Very similar to transformers.model_utils.no_init_weights, except that torch.Tensor initialization functions are also patched to account for tensors which are initialized on devices other than the meta device
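A sketch of the intended use, assuming the context manager takes no required arguments. It pairs naturally with building a model from a config, so that no initialization routines run; the config name is illustrative.

```python
from transformers import AutoConfig, AutoModelForCausalLM

from llmcompressor.utils.dev import skip_weights_initialize

# Illustrative config; under the context manager, weights keep whatever
# values allocation produced rather than being initialized, even when
# they live on real (non-meta) devices
config = AutoConfig.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

with skip_weights_initialize():
    model = AutoModelForCausalLM.from_config(config)
```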