llmcompressor.utils
General utility functions used throughout LLM Compressor.
Modules:

- dev
- fsdp
- helpers – General utility helper functions.
- metric_logging – Utility functions for metrics logging and GPU memory monitoring.
- pytorch
Classes:

- NumpyArrayBatcher – Batcher instance that takes in dictionaries of numpy arrays, appends items to grow the batch, and stacks them into batched arrays.
Functions:

- DisableQuantization – Disable quantization during forward passes after applying a quantization config.
- bucket_iterable – Bucket an iterable into a subarray consisting of the first top percentage followed by equally sized groups.
- calibration_forward_context – Context in which all calibration forward passes should occur.
- clean_path – Clean a directory or file path, expanding the user path and creating an absolute path.
- convert_to_bool – Convert a value to a bool, supporting logical values given as strings.
- create_dirs – Try to create the given directory path.
- create_parent_dirs – Try to create the parent directories for the given file path.
- create_unique_dir – Create a unique version of the given file path.
- disable_cache – Temporarily disable the key-value cache for transformer models.
- disable_hf_kernels – Disable hf hub kernels that may replace module forward methods in transformers>=4.50.0 and bypass hooks.
- dispatch_for_generation – Dispatch a model for autoregressive generation across available devices.
- eval_context – Disable pytorch training mode for the given module.
- flatten_iterable – Flatten a possibly nested iterable of items into a single list, depth first.
- getattr_chain – Chain multiple getattr calls, separated by ".".
- import_from_path – Import a module and the name of a function/class separated by ":".
- interpolate – Interpolate between two points; caps values at the min of x0 and max of x1.
- interpolate_list_linear – Linearly interpolate for input values within a list of measurements.
- interpolated_integral – Calculate the interpolated integral for a group of measurements.
- is_package_available – A helper function to check if a package is available.
- is_url – Check whether a value is a url or not.
- json_to_jsonl – Convert a json list file to jsonl file format (used for sharding efficiency).
- load_labeled_data – Load labels and data from disk or from memory and group them together.
- load_numpy – Load a numpy file into either an ndarray or an OrderedDict representing an npz file.
- patch_attr – Patch the value of an object attribute; the original value is restored upon exit.
- patch_transformers_logger_level – Context under which the transformers logger's level is modified.
- path_file_count – Return the number of files that match the given pattern under the given path.
- path_file_size – Return the total size, in bytes, for a path on the file system.
- save_numpy – Save a numpy array or collection of numpy arrays to disk.
- skip_weights_download – Context manager under which models are initialized without having to download weights.
- tensor_export – Export a tensor to a saved numpy array file.
- tensors_export – Export tensors to saved numpy array files.
- validate_str_iterable – Validate that a value is a list (flattening it) or an ALL / ALL_PRUNABLE string.
NumpyArrayBatcher
Bases: object
Batcher instance to handle taking in dictionaries of numpy arrays, appending multiple items to them to increase their batch size, and then stacking them into a single batched numpy array for all keys in the dicts.
Methods:

- append – Append a new item into the current batch.
- stack – Stack the current items into a batch along a new, zeroed dimension.
Source code in llmcompressor/utils/helpers.py
append
Append a new item into the current batch. All keys and shapes must match the current state.
Parameters:

- item (Union[ndarray, Dict[str, ndarray]]) – the item to add for batching
Source code in llmcompressor/utils/helpers.py
stack
Stack the current items into a batch along a new, zeroed dimension
Returns:

- Dict[str, ndarray] – the stacked items
Source code in llmcompressor/utils/helpers.py
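A minimal usage sketch; the no-argument constructor and the import path are assumptions, since this page only documents append and stack:

import numpy as np
from llmcompressor.utils.helpers import NumpyArrayBatcher

batcher = NumpyArrayBatcher()  # assumed no-arg constructor
# every appended item must have matching keys and shapes
batcher.append({"input": np.zeros((3, 8)), "mask": np.ones(8)})
batcher.append({"input": np.ones((3, 8)), "mask": np.zeros(8)})
batch = batcher.stack()  # stacks along a new leading batch dimension
print(batch["input"].shape)  # expected: (2, 3, 8)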
DisableQuantization
Disable quantization during forward passes after applying a quantization config
Source code in llmcompressor/utils/helpers.py
bucket_iterable
bucket_iterable(
val: Iterable[Any],
num_buckets: int = 3,
edge_percent: float = 0.05,
sort_highest: bool = True,
sort_key: Callable[[Any], Any] = None,
) -> List[Tuple[int, Any]]
Bucket an iterable into a subarray consisting of the first top percentage followed by the rest of the iterable sliced into equally sized groups.
Parameters:

- val (Iterable[Any]) – The iterable to bucket
- num_buckets (int, default: 3) – The number of buckets to group the iterable into, does not include the top bucket
- edge_percent (float, default: 0.05) – Group the first percent into its own bucket. If sort_highest, then this is the top percent, else bottom percent. If <= 0, then will not create an edge bucket
- sort_highest (bool, default: True) – True to sort such that the highest percent is first and will create buckets in descending order. False to sort so lowest is first and create buckets in ascending order.
- sort_key (Callable[[Any], Any], default: None) – The sort_key, if any, to use for sorting the iterable after converting it to a list

Returns:

- List[Tuple[int, Any]] – a list of each value mapped to the bucket it was sorted into
Source code in llmcompressor/utils/helpers.py
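An illustrative call, as a sketch against the signature above; the exact bucket indices returned are not specified on this page:

from llmcompressor.utils.helpers import bucket_iterable

layer_scores = [0.9, 0.1, 0.5, 0.7, 0.3, 0.8, 0.2, 0.6]
# sort descending, carve off the top 5% as an edge bucket,
# then slice the remainder into 3 equally sized buckets
for bucket, value in bucket_iterable(layer_scores, num_buckets=3, edge_percent=0.05):
    print(bucket, value)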
calibration_forward_context
Context in which all calibration forward passes should occur.
- Remove gradient calculations
- Disable the KV cache
- Disable train mode and enable eval mode
- Disable hf kernels which could bypass hooks
Source code in llmcompressor/utils/helpers.py
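A usage sketch, assuming the context manager takes the model being calibrated (the argument is not spelled out on this page) and that the model name is a placeholder:

from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor.utils.helpers import calibration_forward_context

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
inputs = tokenizer("calibration sample", return_tensors="pt")

# gradients, the KV cache, train mode, and hf kernels are all disabled here
with calibration_forward_context(model):
    model(**inputs)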
clean_path
Parameters:

- path (str) – the directory or file path to clean

Returns:

- str – a cleaned version that expands the user path and creates an absolute path
convert_to_bool
Parameters:

- val (Any) – the value to be converted to a bool, supports logical values as strings, e.g. True, t, false, 0

Returns:

- the boolean representation of the value; if it can't be determined, falls back on returning True
Source code in llmcompressor/utils/helpers.py
create_dirs
Parameters:

- path (str) – the directory path to try and create
Source code in llmcompressor/utils/helpers.py
create_parent_dirs
Parameters:

- path (str) – the file path to try to create the parent directories for
create_unique_dir
Parameters:

- path (str) – the file path to create a unique version of (append numbers until one doesn't exist)
- check_number (int, default: 0) – the number to begin checking for unique versions at

Returns:

- str – the unique directory path
Source code in llmcompressor/utils/helpers.py
disable_cache
Temporarily disable the key-value cache for transformer models. Used to prevent excess memory use in one-shot cases where the model only performs the prefill phase and not the generation phase.
Example:

model = AutoModel.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
input = torch.randint(0, 32, size=(1, 32))
with disable_cache(model):
    output = model(input)
Source code in llmcompressor/utils/helpers.py
disable_hf_kernels
In transformers>=4.50.0, some module forward methods may be replaced by calls to hf hub kernels. This has the potential to bypass hooks added by LLM Compressor
Source code in llmcompressor/utils/helpers.py
dispatch_for_generation
Dispatch a model for autoregressive generation. This means that modules are dispatched evenly across available devices and kept onloaded if possible. Removes any HF hooks that may have existed previously.

Parameters:

- model (PreTrainedModel) – model to dispatch

Returns:

- PreTrainedModel – the dispatched model
Source code in llmcompressor/utils/dev.py
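A usage sketch; the model name is a placeholder:

from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor.utils.dev import dispatch_for_generation

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

model = dispatch_for_generation(model)  # modules spread evenly across devices
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))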
eval_context
Disable pytorch training mode for the given module
Source code in llmcompressor/utils/helpers.py
flatten_iterable
Parameters:

- li (Iterable) – a possibly nested iterable of items to be flattened

Returns:

- a flattened version of the list where all elements are in a single list, flattened in a depth-first pattern
Source code in llmcompressor/utils/helpers.py
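For example (a sketch, assuming the import path):

from llmcompressor.utils.helpers import flatten_iterable

print(flatten_iterable([1, [2, 3, [4, 5]], 6]))  # expected: [1, 2, 3, 4, 5, 6]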
getattr_chain
Chain multiple getattr calls, separated by "."

Parameters:

- obj (Any) – base object whose attributes are being retrieved
- chain_str (str) – attribute names separated by "."
- default – default value to return when an attribute is missing; if not provided, an error is thrown instead
Source code in llmcompressor/utils/helpers.py
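A sketch of typical usage; passing the default positionally is an assumption, since the full signature is not shown on this page:

from types import SimpleNamespace
from llmcompressor.utils.helpers import getattr_chain

obj = SimpleNamespace(config=SimpleNamespace(hidden_size=4096))
print(getattr_chain(obj, "config.hidden_size"))  # equivalent to obj.config.hidden_size
print(getattr_chain(obj, "config.missing", None))  # assumed: default suppresses the error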
import_from_path
Import the module and the name of the function/class separated by :

Examples:

path = "/path/to/file.py:func_or_class_name"
path = "/path/to/file:focn"
path = "path.to.file:focn"
Parameters:

- path (str) – path including the file path and object name
Source code in llmcompressor/utils/helpers.py
interpolate
interpolate(
x_cur: float,
x0: float,
x1: float,
y0: Any,
y1: Any,
inter_func: str = "linear",
) -> Any
Note: caps values at the min of x0 and max of x1; by design it does not work outside of that range, for implementation reasons.
Parameters:

- x_cur (float) – the current value for x, should be between x0 and x1
- x0 (float) – the minimum for x to interpolate between
- x1 (float) – the maximum for x to interpolate between
- y0 (Any) – the minimum for y to interpolate between
- y1 (Any) – the maximum for y to interpolate between
- inter_func (str, default: 'linear') – the type of function to interpolate with: linear, cubic, inverse_cubic

Returns:

- Any – the interpolated value projecting x into y for the given interpolation function
Source code in llmcompressor/utils/helpers.py
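For example, with the default linear interpolation:

from llmcompressor.utils.helpers import interpolate

print(interpolate(5.0, 0.0, 10.0, 0.0, 1.0))   # expected: 0.5, halfway between y0 and y1
print(interpolate(15.0, 0.0, 10.0, 0.0, 1.0))  # expected: 1.0, since x is capped at x1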
interpolate_list_linear
interpolate_list_linear(
measurements: List[Tuple[float, float]],
x_val: Union[float, List[float]],
) -> List[Tuple[float, float]]
Linearly interpolate output values for the given inputs within a list of measurements.
Parameters:

- measurements (List[Tuple[float, float]]) – the measurements to interpolate the output value between
- x_val (Union[float, List[float]]) – the target values to interpolate to the second dimension

Returns:

- List[Tuple[float, float]] – a list of tuples containing the target values and interpolated values
Source code in llmcompressor/utils/helpers.py
interpolated_integral
Calculate the interpolated integral for a group of measurements of the form [(x0, y0), (x1, y1), ...]
Parameters:

- measurements (List[Tuple[float, float]]) – the measurements to calculate the integral for

Returns:

- the integral or area under the curve for the measurements given
Source code in llmcompressor/utils/helpers.py
is_package_available
is_package_available(
package_name: str, return_version: bool = False
) -> Union[Tuple[bool, str], bool]
A helper function to check if a package is available and optionally return its version. This function enforces a check that the package is available and is not just a directory/file with the same name as the package.
inspired from: https://github.com/huggingface/transformers/blob/965cf677695dd363285831afca8cf479cf0c600c/src/transformers/utils/import_utils.py#L41
Parameters:

- package_name (str) – The package name to check for
- return_version (bool, default: False) – True to return the version of the package if available

Returns:

- Union[Tuple[bool, str], bool] – True if the package is available, False otherwise; or a tuple of (bool, version) if return_version is True
Source code in llmcompressor/utils/helpers.py
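For example (a sketch; the version string shown is illustrative):

from llmcompressor.utils.helpers import is_package_available

print(is_package_available("torch"))                       # True or False
print(is_package_available("torch", return_version=True))  # e.g. (True, "2.4.0")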
is_url
Parameters:

- val (str) – value to check if it is a url or not

Returns:

- True if value is a URL, False otherwise
Source code in llmcompressor/utils/helpers.py
json_to_jsonl
Converts a json list file to jsonl file format (used for sharding efficiency), e.g. [{"a": 1}, {"a": 1}] would convert to:

{"a": 1}
{"a": 1}

Parameters:

- json_file_path (str) – file path to a json file containing a json list of objects
- overwrite (bool, default: True) – If True, the existing json file will be overwritten; if False, the file will have the same name but with a .jsonl extension
Source code in llmcompressor/utils/helpers.py
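A round-trip sketch, assuming the import path; the file name is a placeholder:

import json
from llmcompressor.utils.helpers import json_to_jsonl

# write a json list file, then convert it to jsonl in place (overwrite=True)
with open("data.json", "w") as file:
    json.dump([{"a": 1}, {"a": 2}], file)
json_to_jsonl("data.json")  # data.json now holds one json object per line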
load_labeled_data
load_labeled_data(
data: Union[
str,
Iterable[Union[str, ndarray, Dict[str, ndarray]]],
],
labels: Union[
None,
str,
Iterable[Union[str, ndarray, Dict[str, ndarray]]],
],
raise_on_error: bool = True,
) -> List[
Tuple[
Union[numpy.ndarray, Dict[str, numpy.ndarray]],
Union[
None, numpy.ndarray, Dict[str, numpy.ndarray]
],
]
]
Load labels and data from disk or from memory and group them together. Assumes sorted ordering for files on disk. Will match data to labels when a file glob is passed for either data and/or labels.
Parameters:

- data (Union[str, Iterable[Union[str, ndarray, Dict[str, ndarray]]]]) – the file glob, file path to numpy data tar ball, or list of arrays to use for data
- labels (Union[None, str, Iterable[Union[str, ndarray, Dict[str, ndarray]]]]) – the file glob, file path to numpy data tar ball, or list of arrays to use for labels, if any
- raise_on_error (bool, default: True) – True to raise on any error that occurs; False to log a warning, ignore, and continue

Returns:

- List[Tuple[Union[ndarray, Dict[str, ndarray]], Union[None, ndarray, Dict[str, ndarray]]]] – a list containing tuples of (data, labels). If labels was passed in as None, the second index in each tuple will be None
Source code in llmcompressor/utils/helpers.py
load_numpy
Load a numpy file into either an ndarray or an OrderedDict representing what was in the npz file
Parameters:

- file_path (str) – the file_path to load

Returns:

- Union[ndarray, Dict[str, ndarray]] – the loaded values from the file
Source code in llmcompressor/utils/helpers.py
patch_attr
Patch the value of an object attribute. Original value is restored upon exit
Parameters:

- base (object) – object which has the attribute to patch
- attr (str) – name of the attribute to patch
- value (Any) – used to replace the original value

Usage:

>>> from types import SimpleNamespace
>>> obj = SimpleNamespace()
>>> with patch_attr(obj, "attribute", "value"):
...     assert obj.attribute == "value"
>>> assert not hasattr(obj, "attribute")
Source code in llmcompressor/utils/helpers.py
patch_transformers_logger_level
Context under which the transformers logger's level is modified
This can be used with skip_weights_download to squelch warnings related to missing parameters in the checkpoint
Parameters:

- level (int, default: ERROR) – new logging level for the transformers logger. Logs whose level is below this level will not be logged
Source code in llmcompressor/utils/dev.py
path_file_count
Return the number of files that match the given pattern under the given path
Parameters:

- path (str) – the path to the directory to look for files under
- pattern (str, default: '*') – the pattern the files must match to be counted

Returns:

- int – the number of files matching the pattern under the directory
Source code in llmcompressor/utils/helpers.py
path_file_size
Return the total size, in bytes, for a path on the file system
Parameters:

- path (str) – the path (directory or file) to get the size for

Returns:

- int – the size of the path, in bytes, as stored on disk
Source code in llmcompressor/utils/helpers.py
save_numpy
save_numpy(
array: Union[
ndarray, Dict[str, ndarray], Iterable[ndarray]
],
export_dir: str,
name: str,
npz: bool = True,
)
Save a numpy array or collection of numpy arrays to disk
Parameters:

- array (Union[ndarray, Dict[str, ndarray], Iterable[ndarray]]) – the array or collection of arrays to save
- export_dir (str) – the directory to export the numpy file into
- name (str) – the name of the file to export to (without extension)
- npz (bool, default: True) – True to save as an npz compressed file, False for standard npy. Note, npy can only be used for single numpy arrays

Returns:

- the saved path
Source code in llmcompressor/utils/helpers.py
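A round-trip sketch pairing save_numpy with load_numpy; that the returned path can be fed directly to load_numpy is an assumption:

import numpy as np
from llmcompressor.utils.helpers import load_numpy, save_numpy

arrays = {"weights": np.random.rand(4, 4), "bias": np.zeros(4)}
path = save_numpy(arrays, export_dir="exports", name="params")  # npz by default
loaded = load_numpy(path)  # an OrderedDict mirroring the saved dict
print(loaded["weights"].shape)  # (4, 4)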
skip_weights_download
Context manager under which models are initialized without having to download the model weight files. This differs from init_empty_weights in that weights are allocated on their assigned devices with random values, as opposed to being on the meta device.
Parameters:

- model_class (Type[PreTrainedModel], default: AutoModelForCausalLM) – class to patch, defaults to AutoModelForCausalLM
Source code in llmcompressor/utils/dev.py
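A sketch combining this with patch_transformers_logger_level, as suggested in that function's docs above; calling both with these exact arguments is an assumption:

import logging
from transformers import AutoModelForCausalLM
from llmcompressor.utils.dev import (
    patch_transformers_logger_level,
    skip_weights_download,
)

# initialize with random weights instead of downloading the checkpoint,
# and squelch the resulting missing-parameter warnings
with skip_weights_download(), patch_transformers_logger_level(logging.ERROR):
    model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")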
tensor_export
tensor_export(
tensor: Union[
ndarray, Dict[str, ndarray], Iterable[ndarray]
],
export_dir: str,
name: str,
npz: bool = True,
) -> str
Parameters:

- tensor (Union[ndarray, Dict[str, ndarray], Iterable[ndarray]]) – tensor to export to a saved numpy array file
- export_dir (str) – the directory to export the file in
- name (str) – the name of the file, .npy will be appended to it
- npz (bool, default: True) – True to export as an npz file, False otherwise

Returns:

- str – the path of the numpy file the tensor was exported to
Source code in llmcompressor/utils/helpers.py
tensors_export
tensors_export(
tensors: Union[
ndarray, Dict[str, ndarray], Iterable[ndarray]
],
export_dir: str,
name_prefix: str,
counter: int = 0,
break_batch: bool = False,
) -> List[str]
Parameters:

- tensors (Union[ndarray, Dict[str, ndarray], Iterable[ndarray]]) – the tensors to export to a saved numpy array file
- export_dir (str) – the directory to export the files in
- name_prefix (str) – the prefix name for the tensors to save as; info about the position of the tensor in a list or dict will be appended, in addition to the .npy file format
- counter (int, default: 0) – the current counter to save the tensor at
- break_batch (bool, default: False) – treat the tensor as a batch and break it apart into multiple tensors

Returns:

- List[str] – the exported paths
Source code in llmcompressor/utils/helpers.py
validate_str_iterable
validate_str_iterable(
val: Union[str, Iterable[str]], error_desc: str = ""
) -> Union[str, Iterable[str]]
Parameters:

- val (Union[str, Iterable[str]]) – the value to validate; checks that it is a list (and flattens it), otherwise checks that it's an ALL or ALL_PRUNABLE string, otherwise raises a ValueError
- error_desc (str, default: '') – the description to raise an error with in the event that the val wasn't valid

Returns:

- Union[str, Iterable[str]] – the validated version of the param