An open API service providing repository metadata for many open source software ecosystems.

Topic: "fast-inference"

foolwood/pytorch-slimming

Learning Efficient Convolutional Networks through Network Slimming, in ICCV 2017.

Language: Python - Size: 12.7 KB - Last synced at: 19 days ago - Pushed at: almost 6 years ago - Stars: 570 - Forks: 96

aredden/flux-fp8-api

Flux diffusion model implementation that uses quantized fp8 matmul, with the remaining layers using faster half-precision accumulation; ~2x faster on consumer devices.

Language: Python - Size: 157 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 227 - Forks: 28

kssteven418/BigLittleDecoder

[NeurIPS'23] Speculative Decoding with Big Little Decoder

Language: Python - Size: 100 MB - Last synced at: 11 days ago - Pushed at: about 1 year ago - Stars: 90 - Forks: 10

dvlab-research/Q-LLM

This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"

Language: Python - Size: 6.84 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 29 - Forks: 0

szemenyeim/RoboDNN

Fast Forward-Only Deep Neural Network Library for the Nao Robots

Language: C++ - Size: 538 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 5 - Forks: 1

Academich/translation-transformer

An implementation of the encoder-decoder transformer for SMILES-to-SMILES translation tasks with inference accelerated by speculative decoding

Language: Python - Size: 1.58 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

u-hyszk/japanese-speculative-decoding

Verification of the effect of speculative decoding in Japanese.

Language: Python - Size: 245 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

lim142857/Sparsifiner

Official codebase for the CVPR 2023 paper "Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers"

Language: Python - Size: 46.9 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

PopoDev/CSE481N_Project

Reproducibility Project for [NeurIPS'23] Speculative Decoding with Big Little Decoder

Language: Python - Size: 6.33 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0