LLM Compressor Docs
llmcompressor.pytorch.model_load
Modules:
helpers