An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: audio-understanding

AudioLLMs/Awesome-Audio-LLM

Audio Large Language Models

Language: Python - Size: 17.8 MB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 701 - Forks: 36

LJungang/Awesome-Omni-Large-Models-and-Datasets

🔥 Omni large models and datasets for understanding and generation across multiple modalities.

Size: 53.7 KB - Last synced at: 2 days ago - Pushed at: 11 months ago - Stars: 17 - Forks: 0

HKUDS/VideoAgent

VideoAgent: Transform any video with a single prompt. Seamlessly generate engaging commentary, cross-lingual adaptations, viral memes, and stunning music remixes, all in one go!

Language: Python - Size: 136 MB - Last synced at: 13 days ago - Pushed at: 17 days ago - Stars: 172 - Forks: 29

khalooei/Voxtral-AI-Demo-Local-Interface

Voxtral is a state-of-the-art model developed to handle both speech transcription and audio understanding with remarkable accuracy and efficiency. This demo interface lets you run the Voxtral model on powerful GPUs to evaluate its performance and see how it can be used for transcription and deeper analysis.

Language: Python - Size: 396 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 1