@SqueezeBits

SqueezeBits Inc.

We are squeezing bits.

Popular repositories

  1. QUICK Public

    QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference

    Python · 118 stars · 5 forks

  2. Torch-TRTLLM Public

    Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.

    Python · 53 stars · 3 forks

  3. owlite Public

    OwLite is a low-code compression toolkit for AI models.

    Python · 51 stars · 5 forks

  4. GraLoRA Public

    Jupyter Notebook · 28 stars · 2 forks

  5. owlite-examples Public

    The OwLite Examples repository offers example code that helps users compress PyTorch deep learning models and convert them into TensorRT engines.

    Python · 9 stars · 1 fork

  6. Gaudi-Hands-on-Workshop Public

    Jupyter Notebook · 5 stars · 2 forks

Repositories

Showing 10 of 28 repositories
