GitHub topics: speaker-diarization

Repositories

transcriptionstream/transcriptionstream

Turnkey self-hosted offline transcription and diarization service with LLM summary

Language: Python - Size: 1.23 MB - Last synced at: about 7 hours ago - Pushed at: 8 months ago - Stars: 850 - Forks: 50

modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python - Size: 100 MB - Last synced at: 1 day ago - Pushed at: 8 days ago - Stars: 10,476 - Forks: 1,049

bunyaminergen/awesome-speech-dataset

Awesome Speech Dataset, including download links and a brief explanation for each resource. These datasets provide diverse and high-quality speech data covering various domains such as conversational, academic, political, and more.

Size: 116 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 7 - Forks: 0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Language: Jupyter Notebook - Size: 435 KB - Last synced at: 2 days ago - Pushed at: 25 days ago - Stars: 4,511 - Forks: 410

wenet-e2e/wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Language: Python - Size: 6.22 MB - Last synced at: about 23 hours ago - Pushed at: 3 months ago - Stars: 896 - Forks: 136

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

Size: 214 KB - Last synced at: 3 days ago - Pushed at: 27 days ago - Stars: 2,051 - Forks: 97

modelscope/3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language: Python - Size: 3.19 MB - Last synced at: 2 days ago - Pushed at: 28 days ago - Stars: 2,005 - Forks: 170

juanmc2005/diart

A python package to build AI-powered real-time audio applications

Language: Python - Size: 34.8 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 1,284 - Forks: 100

nttcslab-sp/mamba-diarization

Official repository for Mamba-based Segmentation Model for Speaker Diarization

Language: Python - Size: 45.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 36 - Forks: 3

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

Language: Python - Size: 97.8 MB - Last synced at: 3 days ago - Pushed at: 8 days ago - Stars: 9,812 - Forks: 1,489

pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook - Size: 252 MB - Last synced at: 3 days ago - Pushed at: 9 days ago - Stars: 7,480 - Forks: 876

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Language: Python - Size: 4.49 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 2,401 - Forks: 183

revdotcom/reverb

Open source inference code for Rev's model

Language: Python - Size: 507 KB - Last synced at: 1 day ago - Pushed at: 24 days ago - Stars: 401 - Forks: 26

espnet/espnet

End-to-End Speech Processing Toolkit

Language: Python - Size: 1.13 GB - Last synced at: 5 days ago - Pushed at: 11 days ago - Stars: 9,083 - Forks: 2,258

google/uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

Language: Python - Size: 107 MB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 1,573 - Forks: 320

rrkas/Speaker-Diarization-Transcription

Language: Jupyter Notebook - Size: 12.9 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

mubingshen/MLC-SLM-Baseline

The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-SLM) to provide participants with baseline systems for speech recognition and speaker diarization in multilingual conversational scenarios.

Language: Python - Size: 2.33 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 34 - Forks: 5

hwk06023/SONATA

SONATA (SOund and Narrative Advanced Transcription Assistant): An advanced ASR system that captures human expressions including emotive sounds and non-verbal cues.

Language: Python - Size: 527 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 0

NavodPeiris/speechlib

speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names

Language: Python - Size: 33.9 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 213 - Forks: 20

Zhima-Mochi/whisper-v3-server

A robust backend server for audio processing, delivering high-accuracy transcription and speaker diarization. Powered by Whisper for speech-to-text and Pyannote for speaker segmentation, wrapped in a clean, maintainable architecture based on Domain-Driven Design (DDD) and Hexagonal Architecture.

Language: Python - Size: 1.76 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

wq2012/awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

Size: 81.1 KB - Last synced at: 12 days ago - Pushed at: 7 months ago - Stars: 1,735 - Forks: 232

Picovoice/falcon

On-device speaker diarization powered by deep learning

Language: Python - Size: 20.2 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 44 - Forks: 5

oddradiocircle/WAScribe

Transcribe WhatsApp audio messages using the Groq and Pyannote (Hugging Face) APIs.

Language: Python - Size: 123 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

IBM-Cloud/chatbot-watson-android 📦

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

Language: Java - Size: 3.42 MB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 197 - Forks: 182

bunyaminergen/WavLMMSDD

This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvidia.

Language: Jupyter Notebook - Size: 1.8 MB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 6 - Forks: 3

manojpamk/pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

Language: Python - Size: 356 KB - Last synced at: 20 days ago - Pushed at: over 4 years ago - Stars: 315 - Forks: 64

mikeesto/gemini-transcribe

Transcribe audio and video files with speaker diarization and logically grouped timestamps using Gemini Flash

Language: TypeScript - Size: 1.74 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 19 - Forks: 2

google/speaker-id

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

Language: Python - Size: 175 MB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 411 - Forks: 39

wq2012/SpectralCluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

Language: Python - Size: 1.81 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 529 - Forks: 72

juanmc2005/rttm-viewer

Application for viewing Rich Transcription Time Marked (RTTM) files in an interactive way

Language: Python - Size: 2.23 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 41 - Forks: 5

yinruiqing/pyannote-whisper

Language: Python - Size: 3.34 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 574 - Forks: 100

alperensumeroglu/ai-clips-maker

AI-powered tool to turn long videos into short, viral-ready clips. Combines transcription, speaker diarization, scene detection & 9:16 resizing — perfect for creators & smart automation.

Language: Python - Size: 2.93 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

clement-pages/gryannote

Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.

Language: Svelte - Size: 2.65 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 61 - Forks: 7

marccasals98/Diarization

Framework for Speaker Diarization

Language: Python - Size: 73.2 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

aidayang/FunASR-OneClick

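Real-time speech recognition edition of FunASR: recognizes audio from the microphone and sound playing on the computer; voice-typing software for PC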

Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

DrJuChunKoO/TransPal-transcriber

WhisperX Slack bot for transcribing audio files

Language: Python - Size: 22.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

bitcointranscripts/tstbtc

This CLI app transcribes audio and video for submission to the bitcointranscripts repo

Language: Python - Size: 6.37 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 6 - Forks: 8

see2023/VoiceMind

Real-time voice assistant with multi-speaker recognition & tactical suggestions. Local AI processing for privacy-sensitive scenarios (debates/meetings/negotiations).

Language: Dart - Size: 1.23 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

Audio-WestlakeU/FS-EEND

The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors" [ICASSP 2024] and "LS-EEND: long-form streaming end-to-end neural diarization with online attractor extraction"

Language: Python - Size: 3.22 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 116 - Forks: 5

RobCaamano/Youtube-English-to-Spanish

📺 End-to-End Solution for Translating YouTube Videos to Spanish

Language: Python - Size: 65.4 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 2

nezhar/speech-condenser

A tool for summarizing dialogues from videos or audio

Language: Python - Size: 241 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 10

DongKeon/Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

Size: 646 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 250 - Forks: 5

JaesungHuh/av-diarization

Audio-visual diarization pipeline used for creating VoxConverse dataset

Language: Python - Size: 12.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

ElmiraGhorbani/gpt-speaker-diarization

Conversational Speaker Diarization using OpenAI language models (GPT-4) and OpenAI Whisper.

Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 0

gorkemkaramolla/whisper-run

Faster Whisper with Speaker Diarization

Language: Python - Size: 8.32 MB - Last synced at: 24 days ago - Pushed at: 7 months ago - Stars: 6 - Forks: 0

itmo-mbss-lab/sr_lectures_book

The project is related to the development of Basics of Voice Biometrics lecture book for the ITMO Speaker Recognition Course.

Language: TeX - Size: 1.28 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

neuralwork/audio2chat

Convert multi-speaker audio files to structured chat data for LLMs

Language: Python - Size: 2.02 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

linto-ai/linto-diarization

Speaker diarization service

Language: Python - Size: 37 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 21 - Forks: 1

Jpzinn654/speaker-diarization-portuguese

This project implements speaker diarization for Portuguese audio using WhisperX for transcription and pyannote.audio's speaker-diarization-3.1 for speaker separation. It includes a Flask UI for easy file upload, transcription, and speaker identification.

Language: Python - Size: 24.4 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

al3xkras/sovits-svc-tools-docker

A unified docker environment combining SoVITS SVC fork, UVR5, audio-separator and pyannote.audio.

Language: Python - Size: 11.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

sohansai/speaker-diarization

A user-friendly interface for identifying and separating speakers in audio files using pyannote.audio and Gradio.

Language: Python - Size: 1000 Bytes - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

shashikg/X-Vector-Based-Speaker-Diarization

Course project for EE698R (2020-21 Sem 2). An X-Vector Based Speaker Diarization System with an AutoEncoder-based clustering method. Also supports spectral and KMeans clustering methods.

Language: Jupyter Notebook - Size: 97 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 14 - Forks: 0

nuaazs/VAF_2

Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.

Language: Python - Size: 32.7 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 403 - Forks: 21

mtwn105/audio-intel

AudioIntel - Audio/Video Intelligence, Transcripts, Summary, and much more

Language: TypeScript - Size: 660 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

yufan-aslp/AliMeeting

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenarios.

Language: Python - Size: 492 KB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 114 - Forks: 17

scionoftech/speaker_diarization

Speaker diarization using spectralcluster and deep learning

Language: Jupyter Notebook - Size: 188 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 0

wq2012/VB_diarization

VB Diarization with Eigenvoice and HMM Priors, refactored

Language: Python - Size: 32.4 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 15 - Forks: 3

NickNaskida/cog-whisper-diarization (fork of thomasmol/cog-whisper-diarization)

Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote

Language: Python - Size: 54.7 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

a-jain24/Diarization

Facilitates purely text-based diarization labeling of transcripts or other written conversational data using LLMs

Language: Jupyter Notebook - Size: 18.6 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

NickNaskida/insanely-fast-whisper (fork of chenxwh/insanely-fast-whisper)

Incredibly fast Whisper-large-v3 with speaker diarization

Language: Jupyter Notebook - Size: 396 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 1

ubclaunchpad/minutes

🔭 Speaker diarization via transfer learning

Language: Python - Size: 22.2 MB - Last synced at: 3 days ago - Pushed at: about 6 years ago - Stars: 27 - Forks: 5

werserk/TechStormHack-1st-place

Solution for the TechStorm competition from the Tatneft corporation on analyzing team members' activity in video conference calls

Language: Python - Size: 5.77 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Rehan-Ahmad/SpeakerDiarization

Audio based speaker diarization

Language: Python - Size: 1.19 MB - Last synced at: 9 months ago - Pushed at: about 6 years ago - Stars: 16 - Forks: 1

Rehan-Ahmad/MultimodalDiarization

Multimodal speaker diarization using pre-trained audio-visual synchronization model

Language: Python - Size: 38.1 KB - Last synced at: 9 months ago - Pushed at: about 5 years ago - Stars: 9 - Forks: 6

GameOfPods/PAT

PodcastProject Analytics Toolkit - Project that creates analytics from various input data. Exported data is intended to be used in a PodcastProject website

Language: Python - Size: 150 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

CiscoDevNet/vo-id

Language: Python - Size: 85.7 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 11 - Forks: 3

aeronjl/transcribe

Python package for accurate audio transcription with speaker diarisation

Language: Python - Size: 25.4 MB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

wq2012/SimpleDER

A lightweight library to compute Diarization Error Rate (DER).

Language: Python - Size: 79.1 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 59 - Forks: 9

FrenchKrab/IS2024-powerset-calibration

Companion repository to the paper "On the calibration of powerset speaker diarization models" published at Interspeech 2024

Language: HTML - Size: 28.4 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

luisst/SpeakerLID_GT_code

Speaker Diarization, Recognition and Language Identification. Scripts to generate GT using our WebApp and Praat software

Language: Python - Size: 41.1 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

cadia-lvl/kaldi-speaker-diarization

This repository creates speaker diarization recipes to be used within the egs folder of kaldi.

Language: Shell - Size: 78.1 KB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 13 - Forks: 3

aeronjl/transcribe-streamlit

Streamlit user interface for transcribing conversations with speaker diarisation

Language: Python - Size: 235 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

flo-bit/youtube-speaker-separation

simple python script that outputs separate audio files for each speaker in a youtube video, using whisper on replicate

Language: Python - Size: 2.93 KB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

katagaki/FiresideSubtitles

Video transcription, speaker diarization, and face detection in Python.

Language: Python - Size: 37.1 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

mathusanm6/Amaze-Voice-Lab

The goal of this research project was to be able to control the movements of characters in a Maze game using real-time voice commands such as saying out loud Up, Down, Left or Right.

Language: Java - Size: 65.8 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

hitachi-speech/EEND

End-to-End Neural Diarization

Language: Python - Size: 50.6 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 354 - Forks: 57

taylorlu/Speaker-Diarization

speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition

Language: Python - Size: 52.6 MB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 453 - Forks: 124

Wenhao-Yang/SpeakerVerifiaction-pytorch

Speaker Verification using PyTorch

Language: Jupyter Notebook - Size: 17.7 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 4

cvqluu/simple_diarizer

Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

Language: Python - Size: 1.27 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 123 - Forks: 26

7egment/3D-Speaker-Diarization-Pipeline

A simplified and faster version of the speaker diarization pipeline in the 3D-Speaker toolkit by Alibaba DAMO Academy

Language: Python - Size: 30.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

mmxgn/smooth-convex-kl-nmf

Repository holding various implementations of specific NMF methods for speaker diarization

Language: Python - Size: 17.6 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 1

dptools/WhisperNote

Subtitle generation w/ Speaker Diarization using Whisper and pyannote.audio

Language: Python - Size: 86.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

nikitalpopov/master

research for master degree

Language: Jupyter Notebook - Size: 213 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

pranshurastogi29/uis_rnn_for_speaker_diarization

Speaker diarization done on a toy dataset and tested on the TIMIT dataset

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 0

Joost385/transcription-ui

Full-stack Transcription-UI: Features OpenAI Whisper and NVIDIA NeMo, with Docker for easy deployment.

Language: TypeScript - Size: 353 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

e6quisitory/pyannote-benchmark

pyannote.audio benchmark for NVIDIA GPUs

Language: Python - Size: 2.93 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

VidyasagarMSC/WatBot

An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.

Language: Java - Size: 4.82 MB - Last synced at: 24 days ago - Pushed at: over 6 years ago - Stars: 72 - Forks: 53

doerlbh/MiniVox

Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".

Language: Cuda - Size: 998 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 25 - Forks: 5

SEERNET/Multi-Speaker-Diarization

Automated Multi Speaker diarization API for meetings, calls, interviews, press-conference etc.

Size: 13.7 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 11 - Forks: 0

team-re-verb/RE-VERB

speaker diarization system using an LSTM

Language: Python - Size: 135 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 48 - Forks: 8

FlorianKrey/DNC

Discriminative Neural Clustering for Speaker Diarisation

Language: Python - Size: 3.62 GB - Last synced at: 12 months ago - Pushed at: about 3 years ago - Stars: 78 - Forks: 14

kamakaya/gcp-speaker-diarization

Language: Jupyter Notebook - Size: 37.2 MB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 1

j-schmied/RealTimeSpeechRecognition

Various approaches for speech recognition and speaker diarization.

Language: Jupyter Notebook - Size: 2.86 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

cvqluu/Factorized-TDNN

PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi

Language: Python - Size: 278 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 140 - Forks: 34

theomariotte/SpeakerLoc

Speaker localization algorithms in the meeting context

Language: Python - Size: 332 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 2

haoming29/ez-transcription

An easy way to make perfect audio transcripts with the Whisper model and speaker diarization

Language: JavaScript - Size: 1.86 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

FrenchKrab/IS2023-powerset-diarization

Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.

Language: Jupyter Notebook - Size: 705 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 1

cvqluu/TDNN

Time delay neural network (TDNN) implementation in PyTorch using the unfold method

Language: Python - Size: 708 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 183 - Forks: 40

ryojiysd/speaker-diarization-sample

Sample code for the Google Cloud Speech API's speaker diarization feature

Language: JavaScript - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 3

Rajeshshashank/Speaker-Diarization

Speaker Diarization using Python, Flask and HTML

Language: HTML - Size: 161 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 2

Related Keywords

speaker-diarization (138), speaker-recognition (34), speech-recognition (32), diarization (22), whisper (22), speaker-verification (20), pytorch (18), speech-processing (18), speech-to-text (17), asr (15), machine-learning (15), deep-learning (14), transcription (14), speech (14), speaker-identification (13), pyannote (13), python (10), voice-activity-detection (8), audio (8), clustering (7), kaldi (7), openai (7), ai (6), speaker-embedding (5), neural-network (5), audio-processing (5), end-to-end (5), lstm (5), python3 (4), ml (4), whisperx (4), text-to-speech (4), docker (4), openai-whisper (4), neural-networks (3), whisper-faster (3), deep-neural-networks (3), speaker-diarization-problem (3), source-separation (3), automatic-speech-recognition (3), speech-enhancement (3), conversation (3), faster-whisper (3), dataset (3), speech-separation (3), huggingface (3), audio-transcription (3), ibm-cloud (2), awesome (2), lena (2), asr-model (2), intent (2), whisper-large (2), java (2), calibration (2), watson (2), d-vectors (2), funasr (2), awesome-list (2), mfcc (2), whisper-ai (2), librosa (2), speech-transcription (2), gradio (2), lium (2), pytorch-lightning (2), interspeech (2), android (2), android-studio (2), stt (2), nmf (2), uis-rnn (2), chatbot (2), supervised-clustering (2), voice-conversion (2), conversation-service (2), cnn (2), voice-cloning (2), ghostvlad (2), chainer (2), dialog (2), entity (2), speech-activity-detection (2), tdnn (2), spectral-clustering (2), unsupervised-clustering (2), unsupervised-learning (2), plda (2), eres2net (2), redis (2), reverb (2), cnceleb (2), campplus (2), vue (2), meeting-summarization (2), speech-analysis (2), vad (2), speechllm (2), speechgpt (2), rnnt (2)