RedHatAI/quantization
kernel
License: apache-2.0
refs/pr/3
5.61 GB
2 contributors
History: 58 commits
Latest commit: Build (ba53cf7) by danieldk (HF Staff), 2 days ago
Name                     Size       Last commit                                                        Updated
attention                -          Sync to vLLM 20250627                                              7 months ago
build                    -          Build                                                              2 days ago
compressed_tensors       -          Sync updates for CUDA 13 compat                                    3 days ago
core                     -          Sync to vLLM 20250627                                              7 months ago
cutlass_extensions       -          Sync to vLLM 20250627                                              7 months ago
cutlass_w8a8             -          Sync to vLLM 20250627                                              7 months ago
fp8                      -          Sync updates for CUDA 13 compat                                    3 days ago
gptq_marlin              -          Sync to vLLM 20250627                                              7 months ago
marlin                   -          Sync to vLLM 20250627                                              7 months ago
tests                    -          Sync to vLLM 20250627                                              7 months ago
torch-ext                -          Fix absolute imports                                               6 months ago
.gitattributes           1.56 kB    Build                                                              about 1 year ago
LICENSE                  11.4 kB    Add cutlass_w8a8                                                   about 1 year ago
README.md                195 Bytes  Update README.md (#1)                                              11 months ago
build.toml               6.1 kB     Use `-static-global-template-stub=false` nvcc option for Marlin    3 days ago
cub_helpers.h            416 Bytes  Sync updates for CUDA 13 compat                                    3 days ago
cuda_utils.h             1.41 kB    Sync on vLLM 20240402                                              9 months ago
dispatch_utils.h         3.9 kB     Sync to vLLM 20250627                                              7 months ago
flake.lock               2.48 kB    Use `-static-global-template-stub=false` nvcc option for Marlin    3 days ago
flake.nix                281 Bytes  Use `-static-global-template-stub=false` nvcc option for Marlin    3 days ago
utils.cuh                1.84 kB    Sync on vLLM 20240402                                              9 months ago
vectorization.cuh        878 Bytes  Sync to vLLM 20250627                                              7 months ago
vectorization_utils.cuh  2.61 kB    Sync to vLLM 20250627                                              7 months ago