quantization-aware-training
Here are 110 public repositories matching this topic...
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
Updated Jun 26, 2026 - Python
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); 2. pruning: normal, reg…
Updated May 6, 2025 - Python
Neural Network Compression Framework for enhanced OpenVINO™ inference
Updated Jun 24, 2026 - Python
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
Updated Mar 3, 2026 - Python
YOLO ModelCompression MultidatasetTraining
Updated Jun 21, 2022 - Python
Tutorial notebooks for hls4ml
Updated Jun 22, 2026 - Jupyter Notebook
A model compression and acceleration toolbox based on PyTorch.
Updated Jan 12, 2024 - Python
0️⃣1️⃣🤗 BitNet-Transformers: Hugging Face Transformers implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch with the Llama(2) architecture
Updated Mar 17, 2024 - Python
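An automated toolkit for analyzing and modifying the structure of PyTorch models, including a library of model compression algorithms that analyze model structure automatically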
Updated Apr 19, 2023 - Python
Quantized LLM training in pure CUDA/C++.
Updated Jun 3, 2026 - C++
Enhancing LLMs with LoRA
Updated Oct 20, 2025 - Jupyter Notebook
QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
Updated Nov 11, 2025 - C++
This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.
Updated Jan 23, 2023 - Jupyter Notebook
Notes on quantization in neural networks
Updated Dec 14, 2023 - Jupyter Notebook
FrostNet: Towards Quantization-Aware Network Architecture Search
Updated May 3, 2024 - Python
Quantization Aware Training
Updated Jan 13, 2024 - Python
OpenVINO Training Extensions Object Detection
Updated Mar 8, 2023 - Python
Quantization-aware training with spiking neural networks
Updated Feb 18, 2022 - Python
Train neural networks with joint quantization and pruning of both weights and activations, using any PyTorch modules
Updated Sep 19, 2022 - Python
FakeQuantize with Learned Step Size(LSQ+) as Observer in PyTorch
Updated Dec 18, 2021 - C++