Topic: "music-information-retrieval"
MTG/essentia
C++ library for audio and music analysis, description and synthesis, including Python bindings
Language: C++ - Size: 299 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3,137 - Forks: 572

libAudioFlux/audioFlux
A library for audio and music analysis, feature extraction.
Language: C - Size: 7.11 MB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 3,102 - Forks: 138

ybayle/awesome-deep-learning-music
List of articles related to deep learning applied to music
Language: TeX - Size: 5.87 MB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 2,893 - Forks: 341

mdeff/fma
FMA: A Dataset For Music Analysis
Language: Jupyter Notebook - Size: 3.99 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 2,397 - Forks: 448

Music-and-Culture-Technology-Lab/omnizart
Omniscient Mozart, being able to transcribe everything in the music, including vocal, drum, chord, beat, instruments, and more.
Language: Python - Size: 72.6 MB - Last synced at: about 22 hours ago - Pushed at: about 1 year ago - Stars: 1,709 - Forks: 120

meyda/meyda
Audio feature extraction for JavaScript.
Language: TypeScript - Size: 16 MB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 1,556 - Forks: 108

CPJKU/madmom
Python audio and music signal processing library
Language: Python - Size: 6.27 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 1,449 - Forks: 243

iranroman/musicinformationretrieval.com
Instructional notebooks on music information retrieval.
Language: Jupyter Notebook - Size: 178 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 1,243 - Forks: 412

marl/crepe
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
Language: Python - Size: 183 MB - Last synced at: 23 days ago - Pushed at: 11 months ago - Stars: 1,242 - Forks: 167

Natooz/MidiTok
MIDI / symbolic music tokenizers for Deep Learning models 🎶
Language: Python - Size: 5.73 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 793 - Forks: 95

MTG/essentia.js
JavaScript library for music/audio analysis and processing powered by Essentia WebAssembly
Language: TypeScript - Size: 169 MB - Last synced at: about 4 hours ago - Pushed at: about 6 hours ago - Stars: 728 - Forks: 50

EmulationAI/awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Size: 6.56 MB - Last synced at: 7 days ago - Pushed at: 12 months ago - Stars: 682 - Forks: 42

openvpi/SOME
SOME: Singing-Oriented MIDI Extractor.
Language: Python - Size: 159 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 543 - Forks: 47

urinieto/msaf
Music Structure Analysis Framework
Language: Python - Size: 25.6 MB - Last synced at: 27 days ago - Pushed at: 4 months ago - Stars: 525 - Forks: 86

salu133445/muspy
A toolkit for symbolic music generation
Language: Python - Size: 16.9 MB - Last synced at: 8 days ago - Pushed at: 5 months ago - Stars: 486 - Forks: 55

SuperKogito/spafe
🔉 spafe: Simplified Python Audio Features Extraction
Language: Python - Size: 20.7 MB - Last synced at: about 21 hours ago - Pushed at: 4 months ago - Stars: 475 - Forks: 79

RetroCirce/HTS-Audio-Transformer
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
Language: Python - Size: 896 KB - Last synced at: 9 days ago - Pushed at: 11 months ago - Stars: 418 - Forks: 68

adamstark/Gist
A C++ Library for Audio Analysis
Language: C++ - Size: 938 KB - Last synced at: 4 months ago - Pushed at: almost 4 years ago - Stars: 378 - Forks: 76

charlie86/spotifyr
R wrapper for Spotify's Web API
Language: R - Size: 4.34 MB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 376 - Forks: 69

source-separation/tutorial
Tutorial covering Open Source tools for Source Separation.
Language: Jupyter Notebook - Size: 409 MB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 371 - Forks: 40

JorenSix/Olaf
Olaf: Overly Lightweight Acoustic Fingerprinting is a portable acoustic fingerprinting system.
Language: C - Size: 5.77 MB - Last synced at: about 23 hours ago - Pushed at: 13 days ago - Stars: 361 - Forks: 38

gabolsgabs/DALI
DALI: a large Dataset of synchronised Audio, LyrIcs and vocal notes.
Language: Python - Size: 31 MB - Last synced at: 4 months ago - Pushed at: about 5 years ago - Stars: 356 - Forks: 34

apacha/OMR-Datasets
Collection of datasets used for Optical Music Recognition
Language: Python - Size: 6.88 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 350 - Forks: 43

spotify-research/llark
Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.
Language: Jupyter Notebook - Size: 422 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 339 - Forks: 26

Spijkervet/CLMR
Official PyTorch implementation of Contrastive Learning of Musical Representations
Language: Python - Size: 77.4 MB - Last synced at: about 2 months ago - Pushed at: 12 months ago - Stars: 326 - Forks: 51

MTG/mtg-jamendo-dataset
Metadata, scripts and baselines for the MTG-Jamendo dataset
Language: Python - Size: 41.4 MB - Last synced at: 13 days ago - Pushed at: 22 days ago - Stars: 319 - Forks: 44

ilaria-manco/multimodal-ml-music
List of academic resources on Multimodal ML for Music
Language: TeX - Size: 268 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 296 - Forks: 11

dodiku/AudioOwl
Fast and simple music and audio analysis using RNN in Python 🕵️‍♀️ 🥁
Language: Python - Size: 89.8 KB - Last synced at: 11 days ago - Pushed at: about 3 years ago - Stars: 290 - Forks: 21

CPJKU/partitura
A Python package for handling modern staff notation of music
Language: Python - Size: 6.5 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 286 - Forks: 27

JorenSix/Panako
The Panako acoustic fingerprinting system.
Language: Java - Size: 56 MB - Last synced at: about 23 hours ago - Pushed at: about 1 year ago - Stars: 228 - Forks: 39

danyalimran93/Music-Emotion-Recognition
A Machine Learning Approach of Emotional Model
Language: Python - Size: 36.1 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 208 - Forks: 62

RetroCirce/Zero_Shot_Audio_Source_Separation
The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022
Language: Python - Size: 684 KB - Last synced at: 9 days ago - Pushed at: about 3 years ago - Stars: 202 - Forks: 33

mimbres/neural-audio-fp
Official implementation of Neural Audio Fingerprint (ICASSP 2021)
Language: Python - Size: 11.9 MB - Last synced at: 5 days ago - Pushed at: 12 months ago - Stars: 194 - Forks: 26

carlosholivan/musicaiz
A Python framework for symbolic music generation, evaluation and analysis
Language: Python - Size: 5.36 MB - Last synced at: 9 days ago - Pushed at: about 2 years ago - Stars: 184 - Forks: 18

amanteur/BandSplitRNN-PyTorch
Unofficial PyTorch implementation of Music Source Separation with Band-split RNN
Language: Python - Size: 25.9 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 178 - Forks: 24

alexanderlerch/pyACA
Python scripts accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
Language: Python - Size: 3.74 MB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 169 - Forks: 40

salu133445/mmt
Official Implementation of "Multitrack Music Transformer" (ICASSP 2023)
Language: Python - Size: 410 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 146 - Forks: 27

salu133445/pypianoroll
A toolkit for working with piano rolls
Language: Python - Size: 9.63 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 146 - Forks: 18

cjbayron/autochord
Automatic Chord Recognition tools - ISMIR2021 Late-Breaking Demo presentation
Language: Jupyter Notebook - Size: 1.27 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 135 - Forks: 15

wayne391/symbolic-music-datasets
🎹 symbolic musical datasets
Language: Jupyter Notebook - Size: 12.8 MB - Last synced at: 4 days ago - Pushed at: about 5 years ago - Stars: 133 - Forks: 28

CPJKU/beat_this
Accurate and general beat tracker
Language: Python - Size: 10.2 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 122 - Forks: 25

JosiahParry/genius
Easily access song lyrics from Genius in a tibble.
Language: HTML - Size: 428 KB - Last synced at: 5 days ago - Pushed at: over 3 years ago - Stars: 120 - Forks: 18

wayne391/lead-sheet-dataset
🎧 lead sheet datasets in various formats
Language: Jupyter Notebook - Size: 6.11 MB - Last synced at: 4 days ago - Pushed at: almost 4 years ago - Stars: 119 - Forks: 19

dodiku/MixingBear
Package for automatic beat-mixing of music files in Python 🐻🎚
Language: Python - Size: 31.3 KB - Last synced at: 9 days ago - Pushed at: about 7 years ago - Stars: 119 - Forks: 10

a43992899/MARBLE
State-of-the-art pretrained music models for training, evaluation, inference
Language: Python - Size: 2.57 MB - Last synced at: 16 days ago - Pushed at: 21 days ago - Stars: 116 - Forks: 10

alexanderlerch/ACA-Slides
Slides and Code for "An Introduction to Audio Content Analysis," also taught at Georgia Tech as MUSI-6201. This introductory course on Music Information Retrieval is based on the text book "An Introduction to Audio Content Analysis", Wiley 2012/2022
Language: TeX - Size: 614 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 113 - Forks: 18

dr-costas/mad-twinnet
The code for the MaD TwinNet. Demo page:
Language: Python - Size: 308 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 111 - Forks: 26

Polochon-street/bliss-rs
A song analysis library for making playlists
Language: Rust - Size: 15.8 MB - Last synced at: 1 day ago - Pushed at: 6 days ago - Stars: 110 - Forks: 6

ilaria-manco/muscall
Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022)
Language: Python - Size: 193 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 106 - Forks: 11

ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning
List of articles related to deep learning applied to music
Language: TeX - Size: 5.2 MB - Last synced at: 27 days ago - Pushed at: almost 6 years ago - Stars: 94 - Forks: 11

belovm96/chord-detection
App for Chord Sequence Detection
Language: Jupyter Notebook - Size: 67.4 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 87 - Forks: 7

alexanderlerch/ACA-Code
Matlab scripts accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
Language: MATLAB - Size: 541 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 86 - Forks: 22

salu133445/lakh-pianoroll-dataset
A collection of 174,154 multi-track piano-rolls
Language: Python - Size: 1.52 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 85 - Forks: 11

sergree/whatbpm
💓 Today's Trending Values for EDM Production
Language: Rust - Size: 13.2 MB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 85 - Forks: 10

KinWaiCheuk/demucs_lightning
Demucs Lightning: A PyTorch lightning version of Demucs with Hydra and Tensorboard features
Language: Python - Size: 201 KB - Last synced at: 9 days ago - Pushed at: about 2 years ago - Stars: 85 - Forks: 10

MTG/SymbTr
Turkish Makam Music Symbolic Data Collection
Language: Python - Size: 98.1 MB - Last synced at: 28 days ago - Pushed at: 3 months ago - Stars: 83 - Forks: 12

CPJKU/msmd
A Multimodal Audio Sheet Music Dataset
Language: Jupyter Notebook - Size: 3.08 MB - Last synced at: 3 months ago - Pushed at: about 6 years ago - Stars: 83 - Forks: 18

RetroCirce/Music-SketchNet
ISMIR 2020 Paper repo: Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm
Language: Jupyter Notebook - Size: 4.01 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 82 - Forks: 12

ChenDelong1999/VirtualConductor
🎶 Music-Driven Conducting Motion Generation (IEEE ICME'21 Best Demo)
Language: Python - Size: 12.8 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 81 - Forks: 11

snejus/beetcamp
Bandcamp autotagger source for beets (https://beets.io)
Language: Python - Size: 4.13 MB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 80 - Forks: 12

ilaria-manco/muscaps
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)
Language: Jupyter Notebook - Size: 91.9 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 77 - Forks: 7

smashub/choco
ChoCo: the Chord Corpus
Language: Jupyter Notebook - Size: 1.06 GB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 76 - Forks: 6

alexanderlerch/libACA
C++ code accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
Language: C++ - Size: 191 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 74 - Forks: 9

bill317996/Melody-extraction-with-melodic-segnet
The source code of "A Streamlined Encoder/Decoder Architecture for Melody Extraction"
Language: Python - Size: 6.97 MB - Last synced at: about 12 hours ago - Pushed at: over 5 years ago - Stars: 73 - Forks: 13

ankrypht/AudioScape
An Android Application For Streaming Music From YouTube Music Built With React-Native Using Expo.
Language: TypeScript - Size: 47.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 70 - Forks: 5

cemfi/meico
A converter framework with support for MEI, MSM, MPM, MIDI, WAV, MP3, chroma, and XSLT
Language: Java - Size: 108 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 70 - Forks: 14

kristijanbartol/Deep-Music-Tagger
Music genre classification model using CRNN
Language: Python - Size: 51.4 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 67 - Forks: 7

dodiku/music-synthesis-with-python
Music Synthesis with Python talk, originally given at PyGotham 2017.
Language: Jupyter Notebook - Size: 128 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 67 - Forks: 12

s603122001/Vocal-Melody-Extraction
Source code for "Vocal melody extraction with semantic segmentation and audio-symbolic domain transfer learning".
Language: Python - Size: 1.9 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 65 - Forks: 11

kyungyunlee/ismir2018-revisiting-svd
Revisiting Singing Voice Detection: A Quantitative Review and the Future Outlook
Language: Python - Size: 10.9 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 62 - Forks: 9

MTG/da-tacos
A Dataset for Cover Song Identification and Understanding
Language: Python - Size: 89.8 KB - Last synced at: 28 days ago - Pushed at: over 2 years ago - Stars: 61 - Forks: 3

chordify/CASD
Chordify Annotator Subjectivity Dataset - A chord-label harmony dataset with multiple reference annotations per song
Language: Python - Size: 1.28 MB - Last synced at: 5 days ago - Pushed at: about 6 years ago - Stars: 61 - Forks: 6

salu133445/arranger
Official Implementation of "Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music" (ISMIR 2021)
Language: Python - Size: 193 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 59 - Forks: 8

theadamsabra/LearningfromAudio
Understand the fundamentals of digital signal processing for Machine Learning/Deep Learning applications.
Language: Jupyter Notebook - Size: 19.5 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 59 - Forks: 10

marcobn/musicntwrk
Network Analysis of Generalized Musical Spaces
Language: Jupyter Notebook - Size: 1.06 GB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 58 - Forks: 3

asigalov61/Los-Angeles-MIDI-Dataset
SOTA kilo-scale MIDI dataset for MIR and Music AI purposes
Language: Python - Size: 3.01 GB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 58 - Forks: 3

igorbrigadir/ishkurs-guide-dataset
Structured Data from Ishkur's Guide to Electronic Music. Working Mirror for v2.5 here: https://igorbrigadir.github.io/ishkurs-guide-dataset/
Language: Jupyter Notebook - Size: 152 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 58 - Forks: 2

babycat-io/babycat
An audio manipulation library for Rust, Python, WebAssembly, and C.
Language: Rust - Size: 187 MB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 56 - Forks: 6

georgid/AlignmentDuration
Lyrics-to-audio alignment system based on machine learning algorithms: Hidden Markov Models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
Language: Python - Size: 342 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 55 - Forks: 6

KID-22/Awesome-Music-Recommendation-Datasets
Awesome Datasets for Music Recommendation
Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 53 - Forks: 5

dansuh17/jdcnet-pytorch
PyTorch implementation of JDCNet, singing voice detection and classification network
Language: Python - Size: 19 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 50 - Forks: 5

YuriyGuts/dechorder
Automatic chord recognition application powered by machine learning
Language: Python - Size: 6 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 49 - Forks: 6

kyungyunlee/sampleCNN-pytorch
PyTorch implementation of "Sample-level Deep Convolutional Neural Networks for Music Auto-tagging Using Raw Waveforms"
Language: Python - Size: 28.3 KB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 48 - Forks: 11

aubio/vamp-aubio-plugins
aubio plugins for Vamp
Language: C++ - Size: 440 KB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 48 - Forks: 12

tachi-hi/HPSS
Harmonic/Percussive Sound Separation
Language: C++ - Size: 39.1 KB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 46 - Forks: 4

alisonbma/aiSFX
Representation Learning for the Automatic Indexing of Sound Effects Libraries (ISMIR 2022): Deep audio embeddings pre-trained on UCS & Non-UCS-compliant datasets.
Language: Python - Size: 59.6 KB - Last synced at: 30 days ago - Pushed at: about 2 years ago - Stars: 45 - Forks: 4

gabolsgabs/cunet
Control mechanisms for the U-Net architecture for multi-instrument source separation
Language: Python - Size: 5.43 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 44 - Forks: 9

tosiron/jazznet
jazznet dataset of piano patterns for music audio machine learning research
Language: Python - Size: 4.24 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 43 - Forks: 0

salu133445/deepperformer
Deep Performer: Score-to-audio music performance synthesis
Language: SCSS - Size: 16.8 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 43 - Forks: 4

RetroCirce/TONet
The official implementation of "TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music"
Language: Python - Size: 938 KB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 41 - Forks: 5

MusicBucket/musicbucket-bot
A Telegram bot that helps chat users share and keep track of music
Language: Python - Size: 3.27 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 37 - Forks: 12

MusicMoveArr/Datasets
Datasets of MusicBrainz, Tidal, Spotify
Size: 243 KB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 36 - Forks: 3

ilya16/ScorePerformer
ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)
Language: Python - Size: 2.42 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 36 - Forks: 2

pianosnake/ireal-reader
A Node JS module to read music files from iRealPro.
Language: JavaScript - Size: 112 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 36 - Forks: 10

ffont/source
A Freesound Community Sampler
Language: HTML - Size: 29 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 34 - Forks: 6

jundsp/VAE-BSS
Unsupervised blind source separation of mixed images and sounds with variational auto-encoders.
Language: Python - Size: 101 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 34 - Forks: 4

sertansenturk/tomato
Turkish-Ottoman Makam (M)usic Analysis TOolbox
Language: Python - Size: 33.3 MB - Last synced at: 8 months ago - Pushed at: about 3 years ago - Stars: 34 - Forks: 6

tabahi/WebSpeechAnalyzer
JS speech analyzer for fast speech analysis and labeling
Language: JavaScript - Size: 13.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 33 - Forks: 3

amanteur/SCNet-PyTorch
Unofficial PyTorch implementation of "SCNet: Sparse Compression Network for Music Source Separation"
Language: Python - Size: 309 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 33 - Forks: 1

mjhydri/Singing-Vocal-Beat-Tracking
This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. Then, it uses an HMM decoder to infer singing beats and tempo.
Language: Python - Size: 122 KB - Last synced at: 1 day ago - Pushed at: almost 3 years ago - Stars: 33 - Forks: 4
