GitHub / menloresearch / cortex.tensorrt-llm
Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It includes NVIDIA's TensorRT-LLM as a git submodule for GPU-accelerated inference on NVIDIA GPUs.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/menloresearch%2Fcortex.tensorrt-llm
Fork of NVIDIA/TensorRT-LLM
Stars: 43
Forks: 2
Open issues: 3
License: apache-2.0
Language: C++
Size: 273 MB
Dependencies parsed at: Pending
Created at: about 1 year ago
Updated at: about 2 months ago
Pushed at: 7 months ago
Last synced at: about 1 month ago
Topics: jan, llm, nvidia, tensorrt, tensorrt-llm