GitHub topics: forced-alignment

Repositories

readbeyond/aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Language: Python - Size: 29.1 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 2,732 - Forks: 263

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.

Language: TypeScript - Size: 2.55 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 392 - Forks: 41

feldberlin/timething

Timething is a library for aligning text transcripts with their audio recordings.

Language: Jupyter Notebook - Size: 29.8 MB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 122 - Forks: 13

mozilla/DSAlign

DeepSpeech based forced alignment tool

Language: Python - Size: 229 KB - Last synced at: 13 days ago - Pushed at: over 4 years ago - Stars: 239 - Forks: 33

shaharpickman555/LyricsProj

Web app that automatically makes karaoke tracks from songs using faster-whisper

Language: Python - Size: 125 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

xulihang/Silhouette

An open source computer-aided translation tool for audios and videos

Language: B4X - Size: 830 KB - Last synced at: 14 days ago - Pushed at: 28 days ago - Stars: 12 - Forks: 0

MahmoudAshraf97/ctc-forced-aligner

Text to speech alignment using CTC forced alignment

Language: Python - Size: 76.2 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 336 - Forks: 62

r4victor/afaligner

📈 A forced aligner intended for synchronization of narrated text

Language: Python - Size: 21.7 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 95 - Forks: 15

samuelbradshaw/text-to-timestamps

Python and command-line utility for aligning audio to a transcript.

Language: Python - Size: 101 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 3

michel-meneses/keyword-miner

A framework for generating labeled audio recordings of single-spoken keywords via automatic forced alignment.

Language: Python - Size: 3.45 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 1

MontrealCorpusTools/Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi

Language: Python - Size: 85.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,553 - Forks: 260

avinashvarna/audio_alignment

Align various Sanskrit texts and audio

Language: Python - Size: 82.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 15 - Forks: 9

MahtaFetrat/ManaTTS-Persian-Speech-Dataset

ManaTTS is the largest open Persian speech dataset with 114+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.

Language: Jupyter Notebook - Size: 16.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 31 - Forks: 1

tsukumijima/pydomino-prebuilt (fork of DwangoMediaVillage/pydomino)

A tool for aligning phoneme labels to Japanese speech.

Language: C++ - Size: 74 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

proger/uk

Phonograms and syntagms: processing tools

Language: Python - Size: 8.11 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 21 - Forks: 0

saurabhshri/CCAligner

🔮 Word by word audio subtitle synchronisation tool and API. Developed under GSoC 2017 with CCExtractor.

Language: C++ - Size: 127 MB - Last synced at: 19 days ago - Pushed at: almost 6 years ago - Stars: 172 - Forks: 34

ihavenoidea76786/Mana-Speech-Dataset-Generator

The Mana Speech Dataset Generator offers a straightforward way to create high-quality speech datasets from raw audio and text pairs. This modular pipeline ensures flexibility in noisy environments, making it a valuable tool for developers and researchers alike. 🛠️🎤

Language: Jupyter Notebook - Size: 207 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

MahtaFetrat/Mana-Forced-Aligner

A robust forced alignment tool for low-resource languages using multiple ASR models and CER-based matching. Built for noisy data and imperfect transcripts.

Language: Jupyter Notebook - Size: 4.86 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 5 - Forks: 0

MahtaFetrat/GPTInformal-Persian-Speech-Dataset

A freely licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject

Size: 4.88 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

r4victor/syncabook

📖🎧 A tool for creating ebooks with synchronized text and audio (EPUB3 with Media Overlays)

Language: HTML - Size: 132 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 301 - Forks: 29

bunyaminergen/Callytics

Callytics is an advanced call analytics solution that leverages speech recognition and large language model (LLM) technologies to analyze phone conversations from customer service and call centers.

Language: Python - Size: 23.9 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 65 - Forks: 10

MahtaFetrat/VirgoolInformal-Speech-Dataset

A dataset of informal Persian audio and text chunks, along with a fully open processing pipeline, suitable for ASR and TTS tasks. Created from crawled content on virgool.io.

Language: Jupyter Notebook - Size: 508 KB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

seanghay/kfa

A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus

Language: Python - Size: 10.1 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

dangrebenkin/wav2vec2_speech_markuper

Automatic generation of speech dataset markup using Wav2Vec2 ASR models

Language: Python - Size: 396 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

KiyotadaMori/jaeeadjuster

jaeeadjuster: Japanese-accented English & English adjuster

Language: Jupyter Notebook - Size: 696 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

zelaki/DisfluentFA

Weakly supervised forced alignment for disfluent speech

Language: Python - Size: 1.07 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 1

bookbot-hive/OpenBible-TTS

Building Text-to-Speech Systems using OpenBible!

Language: Jupyter Notebook - Size: 2.04 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

zelaki/KaldiLongAligner

Speech to Text Alignment tool implemented with Python and Kaldi

Language: Python - Size: 137 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 0

BayesForDays/gently

Gentle and praatio scripts for easy forced alignment

Size: 590 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 17 - Forks: 2

jhdeov/interlingual-MFA

Workflow for forced alignment between languages

Language: Python - Size: 260 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

jasmeanfernando/BatchP2FA

Penn Phonetics Lab Forced Aligner (P2FA)

Language: Python - Size: 31 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

amirharati/kaldi-alligner

Scripts to align a given waveform to its transcription using models trained with Kaldi

Language: Shell - Size: 4.15 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 30 - Forks: 6

joshchen984/WriteMyVideo-Backend

WriteMyVideo's purpose is to help people create videos quickly and easily by simply typing out the video’s script and a description of images to include in the video.

Language: Python - Size: 21.1 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 21 - Forks: 7

Telegram-Zalo/zac2022-lyric-alignment

Solution for Zalo AI Challenge 2022 - Lyrics Alignment

Language: Python - Size: 949 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 61 - Forks: 18

dcavar/ELAN2split

Split ELAN Annotation Files and corresponding speech files into a corpus format for common ASR and Forced Aligners

Language: C++ - Size: 16.6 KB - Last synced at: 5 months ago - Pushed at: almost 7 years ago - Stars: 10 - Forks: 3

2017fandrei/ForcedAlignment

Graphical utility for forced alignment using aeneas, an interactive audio player

Language: Python - Size: 300 KB - Last synced at: 9 months ago - Pushed at: about 8 years ago - Stars: 6 - Forks: 1

ronggong/interspeech2018_submission01

Supplementary information and code for INTERSPEECH 2018 paper: Singing voice phoneme segmentation by hierarchically inferring syllable and phoneme onset positions

Language: Python - Size: 28.7 MB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 43 - Forks: 4

tiefenauer/ip9

Code for my master's thesis at FHNW

Language: Python - Size: 62.5 MB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 6 - Forks: 1

hrishikeshrt/audio_alignment (fork of avinashvarna/audio_alignment)

Align various Sanskrit texts and audio

Language: Python - Size: 46.8 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

dbklim/WebRTCVAD_Wrapper

A simple Python wrapper to simplify working with WebRTC VAD and its rougher analogue based on RMS and ZCR (useful for processing audio recordings before using them with neural networks).

Language: Python - Size: 625 KB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 9 - Forks: 3

itsupera/audiobook_alignment

Align a Japanese audiobook with its text and create Anki sentence cards with audio.

Language: Python - Size: 1.02 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 0

wxjiao/BERT-Text-Features

BERT-Text-Features for Tokenized Transcripts from P2FA.

Language: Python - Size: 657 KB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 0

auromitamitra/Mongolian_Acoustic_Model

Acoustic model for Khalkha Mongolian

Size: 23.5 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

aiera-inc/gentle (fork of lowerquality/gentle)

gentle forced aligner

Size: 1.53 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

Related Keywords

forced-alignment 47 speech-recognition 13 python 10 tts 7 nlp 7 speech-processing 6 alignment 5 speech-dataset 5 speech-to-text 5 text-to-speech 5 speech-synthesis 4 speech 4 mana-tts 4 audio 4 manatts 4 persian 4 speech-corpus 4 speech-data-collection 4 asr 4 dataset-preparation 3 kaldi 3 wav2vec2 3 transcription 3 audio-processing 3 cli 3 windows 2 low-resource-languages 2 montreal-forced-aligner 2 command-line 2 subtitles 2 ffmpeg 2 text-to-speech-dataset 2 tts-dataset 2 karaoke 2 open-source 2 batch-processing 2 whisper 2 vad 2 japanese 2 dataset-generation 2 hmm 2 voice-activity-detection 2 persian-speech 2 text 2 linux 2 read-along 2 espeak 2 sanskrit 2 data-collection 2 data-preprocessing 2 macos 2 speech-alignment 2 mac 1 gentle 1 rq 1 video 1 video-editing 1 evaluation 1 youtube 1 deep-learning 1 dynamic-programming 1 music-alignment 1 pytorch 1 julius 1 disfluency-detection 1 interspeech2023 1 bible 1 mms 1 openbible 1 swahili 1 text-to-spech 1 phonetics 1 phonology 1 praat 1 psycholinguistics 1 textgrid 1 textgridtools 1 cross-language 1 cross-language-alignment 1 multilingual-alignment 1 automated-deployment 1 openai-whisper 1 pandas 1 python3 1 kaldi-asr 1 keras 1 singing-voice 1 sequence-alignment 1 audio-alignment 1 frontend 1 dsp 1 silence-suppression 1 vad-detection 1 webrtc 1 webrtc-tools 1 webrtc-vad 1 webrtcvad-wrapper 1 anki 1 anki-flashcards 1 japanese-language 1