GitHub topics: forced-alignment
readbeyond/aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Language: Python - Size: 29.1 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 2,732 - Forks: 263

echogarden-project/echogarden
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.
Language: TypeScript - Size: 2.55 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 392 - Forks: 41

feldberlin/timething
Timething is a library for aligning text transcripts with their audio recordings.
Language: Jupyter Notebook - Size: 29.8 MB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 122 - Forks: 13

mozilla/DSAlign
DeepSpeech based forced alignment tool
Language: Python - Size: 229 KB - Last synced at: 13 days ago - Pushed at: over 4 years ago - Stars: 239 - Forks: 33

shaharpickman555/LyricsProj
Web app that automatically makes karaoke tracks from songs using faster-whisper
Language: Python - Size: 125 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

xulihang/Silhouette
An open-source computer-aided translation tool for audio and video
Language: B4X - Size: 830 KB - Last synced at: 14 days ago - Pushed at: 28 days ago - Stars: 12 - Forks: 0

MahmoudAshraf97/ctc-forced-aligner
Text to speech alignment using CTC forced alignment
Language: Python - Size: 76.2 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 336 - Forks: 62

r4victor/afaligner
📈 A forced aligner intended for synchronization of narrated text
Language: Python - Size: 21.7 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 95 - Forks: 15

samuelbradshaw/text-to-timestamps
Python and command-line utility for aligning audio to a transcript.
Language: Python - Size: 101 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 3

michel-meneses/keyword-miner
A framework for generating labeled audio recordings of single-spoken keywords via automatic forced alignment.
Language: Python - Size: 3.45 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 1

MontrealCorpusTools/Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
Language: Python - Size: 85.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,553 - Forks: 260

avinashvarna/audio_alignment
Align various Sanskrit texts and audio
Language: Python - Size: 82.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 15 - Forks: 9

MahtaFetrat/ManaTTS-Persian-Speech-Dataset
ManaTTS is the largest open Persian speech dataset with 114+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.
Language: Jupyter Notebook - Size: 16.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 31 - Forks: 1

tsukumijima/pydomino-prebuilt Fork of DwangoMediaVillage/pydomino
A tool for aligning phoneme labels to Japanese speech audio
Language: C++ - Size: 74 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

proger/uk
Phonograms and syntagmas: processing tools
Language: Python - Size: 8.11 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 21 - Forks: 0

saurabhshri/CCAligner
🔮 Word by word audio subtitle synchronisation tool and API. Developed under GSoC 2017 with CCExtractor.
Language: C++ - Size: 127 MB - Last synced at: 19 days ago - Pushed at: almost 6 years ago - Stars: 172 - Forks: 34

ihavenoidea76786/Mana-Speech-Dataset-Generator
The Mana Speech Dataset Generator offers a straightforward way to create high-quality speech datasets from raw audio and text pairs. This modular pipeline ensures flexibility in noisy environments, making it a valuable tool for developers and researchers alike. 🛠️🎤
Language: Jupyter Notebook - Size: 207 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

MahtaFetrat/Mana-Forced-Aligner
A robust forced alignment tool for low-resource languages using multiple ASR models and CER-based matching. Built for noisy data and imperfect transcripts.
Language: Jupyter Notebook - Size: 4.86 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 5 - Forks: 0

MahtaFetrat/GPTInformal-Persian-Speech-Dataset
A free licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject
Size: 4.88 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

r4victor/syncabook
📖🎧 A tool for creating ebooks with synchronized text and audio (EPUB3 with Media Overlays)
Language: HTML - Size: 132 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 301 - Forks: 29

bunyaminergen/Callytics
Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers.
Language: Python - Size: 23.9 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 65 - Forks: 10

MahtaFetrat/VirgoolInformal-Speech-Dataset
A dataset of informal Persian audio and text chunks, along with a fully open processing pipeline, suitable for ASR and TTS tasks. Created from crawled content on virgool.io.
Language: Jupyter Notebook - Size: 508 KB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

seanghay/kfa
A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus
Language: Python - Size: 10.1 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

dangrebenkin/wav2vec2_speech_markuper
Automatic generation of speech dataset markup using Wav2Vec2 ASR models
Language: Python - Size: 396 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

KiyotadaMori/jaeeadjuster
jaeeadjuster: Japanese-accented English & English adjuster
Language: Jupyter Notebook - Size: 696 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

zelaki/DisfluentFA
A Weakly Supervised Forced Alignment for disfluent speech
Language: Python - Size: 1.07 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 1

bookbot-hive/OpenBible-TTS
Building Text-to-Speech Systems using OpenBible!
Language: Jupyter Notebook - Size: 2.04 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

zelaki/KaldiLongAligner
Speech to Text Alignment tool implemented with Python and Kaldi
Language: Python - Size: 137 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 0

BayesForDays/gently
Gentle and praatio scripts for easy forced alignment
Size: 590 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 17 - Forks: 2

jhdeov/interlingual-MFA
Workflow for forced alignment between languages
Language: Python - Size: 260 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

jasmeanfernando/BatchP2FA
Penn Phonetics Lab Forced Aligner (P2FA)
Language: Python - Size: 31 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

amirharati/kaldi-alligner
Scripts to align a given WAV file to its transcription using models trained with Kaldi
Language: Shell - Size: 4.15 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 30 - Forks: 6

joshchen984/WriteMyVideo-Backend
WriteMyVideo's purpose is to help people create videos quickly and easily by simply typing out the video’s script and a description of images to include in the video.
Language: Python - Size: 21.1 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 21 - Forks: 7

Telegram-Zalo/zac2022-lyric-alignment
Solution for Zalo AI Challenge 2022 - Lyrics Alignment
Language: Python - Size: 949 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 61 - Forks: 18

dcavar/ELAN2split
Split ELAN Annotation Files and corresponding speech files into a corpus format for common ASR and Forced Aligners
Language: C++ - Size: 16.6 KB - Last synced at: 5 months ago - Pushed at: almost 7 years ago - Stars: 10 - Forks: 3

2017fandrei/ForcedAlignment
Graphical utility for forced alignment using aeneas, an interactive audio player
Language: Python - Size: 300 KB - Last synced at: 9 months ago - Pushed at: about 8 years ago - Stars: 6 - Forks: 1

ronggong/interspeech2018_submission01
Supplementary information and code for INTERSPEECH 2018 paper: Singing voice phoneme segmentation by hierarchically inferring syllable and phoneme onset positions
Language: Python - Size: 28.7 MB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 43 - Forks: 4

tiefenauer/ip9
Code for my master's thesis at FHNW
Language: Python - Size: 62.5 MB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 6 - Forks: 1

hrishikeshrt/audio_alignment Fork of avinashvarna/audio_alignment
Align various Sanskrit texts and audio
Language: Python - Size: 46.8 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

dbklim/WebRTCVAD_Wrapper
A simple Python wrapper to simplify working with WebRTC VAD and its rougher analogue based on RMS and ZCR (useful for processing audio recordings before using them with neural networks).
Language: Python - Size: 625 KB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 9 - Forks: 3

itsupera/audiobook_alignment
Align a Japanese audiobook with its text and create Anki sentence cards with audio.
Language: Python - Size: 1.02 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 0

wxjiao/BERT-Text-Features
BERT-Text-Features for Tokenized Transcripts from P2FA.
Language: Python - Size: 657 KB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 0

auromitamitra/Mongolian_Acoustic_Model
Acoustic model for Khalkha Mongolian
Size: 23.5 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

aiera-inc/gentle Fork of lowerquality/gentle
gentle forced aligner
Size: 1.53 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

achen4290/ForcedAligner
Flask implementation of Montreal Forced Aligner
Language: Python - Size: 5.29 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

YinPing-Cho/Mandarin-Forced-Aligner-with-Aeaneas
A dirty but working snippet of code for Mandarin sentence-level audio-text aligning with Aeneas.
Language: Python - Size: 3.91 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

scarletcho/evaluateFA
Evaluation script for a forced aligned TextGrid
Language: Python - Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 1
