
RedHatAI/quantization


5.61 GB

2 contributors

History: 58 commits

Latest commit: Build (ba53cf7) by danieldk (HF Staff), 2 days ago

Name                     Size       Updated           Last commit
attention                -          7 months ago      Sync to vLLM 20250627
build                    -          2 days ago        Build
compressed_tensors       -          3 days ago        Sync updates for CUDA 13 compat
core                     -          7 months ago      Sync to vLLM 20250627
cutlass_extensions       -          7 months ago      Sync to vLLM 20250627
cutlass_w8a8             -          7 months ago      Sync to vLLM 20250627
fp8                      -          3 days ago        Sync updates for CUDA 13 compat
gptq_marlin              -          7 months ago      Sync to vLLM 20250627
marlin                   -          7 months ago      Sync to vLLM 20250627
tests                    -          7 months ago      Sync to vLLM 20250627
torch-ext                -          6 months ago      Fix absolute imports
.gitattributes           1.56 kB    about 1 year ago  Build
LICENSE                  11.4 kB    about 1 year ago  Add cutlass_w8a8
README.md                195 Bytes  11 months ago     Update README.md (#1)
build.toml               6.1 kB     3 days ago        Use `-static-global-template-stub=false` nvcc option for Marlin
cub_helpers.h            416 Bytes  3 days ago        Sync updates for CUDA 13 compat
cuda_utils.h             1.41 kB    9 months ago      Sync on vLLM 20240402
dispatch_utils.h         3.9 kB     7 months ago      Sync to vLLM 20250627
flake.lock               2.48 kB    3 days ago        Use `-static-global-template-stub=false` nvcc option for Marlin
flake.nix                281 Bytes  3 days ago        Use `-static-global-template-stub=false` nvcc option for Marlin
utils.cuh                1.84 kB    9 months ago      Sync on vLLM 20240402
vectorization.cuh        878 Bytes  7 months ago      Sync to vLLM 20250627
vectorization_utils.cuh  2.61 kB    7 months ago      Sync to vLLM 20250627