GitHub topics: speaker-diarization
transcriptionstream/transcriptionstream
Turnkey self-hosted offline transcription and diarization service with LLM summary
Language: Python - Size: 1.23 MB - Last synced at: about 7 hours ago - Pushed at: 8 months ago - Stars: 850 - Forks: 50

modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python - Size: 100 MB - Last synced at: 1 day ago - Pushed at: 8 days ago - Stars: 10,476 - Forks: 1,049

bunyaminergen/awesome-speech-dataset
Awesome Speech Dataset, including download links and a brief explanation for each resource. These datasets provide diverse and high-quality speech data covering various domains such as conversational, academic, political, and more.
Size: 116 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 7 - Forks: 0

MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Language: Jupyter Notebook - Size: 435 KB - Last synced at: 2 days ago - Pushed at: 25 days ago - Stars: 4,511 - Forks: 410

wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Language: Python - Size: 6.22 MB - Last synced at: about 23 hours ago - Pushed at: 3 months ago - Stars: 896 - Forks: 136

Purfview/whisper-standalone-win
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
Size: 214 KB - Last synced at: 3 days ago - Pushed at: 27 days ago - Stars: 2,051 - Forks: 97

modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Language: Python - Size: 3.19 MB - Last synced at: 2 days ago - Pushed at: 28 days ago - Stars: 2,005 - Forks: 170

juanmc2005/diart
A Python package to build AI-powered real-time audio applications
Language: Python - Size: 34.8 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 1,284 - Forks: 100

nttcslab-sp/mamba-diarization
Official repository for Mamba-based Segmentation Model for Speaker Diarization
Language: Python - Size: 45.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 36 - Forks: 3

speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Language: Python - Size: 97.8 MB - Last synced at: 3 days ago - Pushed at: 8 days ago - Stars: 9,812 - Forks: 1,489

pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook - Size: 252 MB - Last synced at: 3 days ago - Pushed at: 9 days ago - Stars: 7,480 - Forks: 876

linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Language: Python - Size: 4.49 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 2,401 - Forks: 183

revdotcom/reverb
Open source inference code for Rev's model
Language: Python - Size: 507 KB - Last synced at: 1 day ago - Pushed at: 24 days ago - Stars: 401 - Forks: 26

espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python - Size: 1.13 GB - Last synced at: 5 days ago - Pushed at: 11 days ago - Stars: 9,083 - Forks: 2,258

google/uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
Language: Python - Size: 107 MB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 1,573 - Forks: 320

rrkas/Speaker-Diarization-Transcription
Language: Jupyter Notebook - Size: 12.9 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

mubingshen/MLC-SLM-Baseline
The project is associated with the recently launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-SLM) to provide participants with baseline systems for speech recognition and speaker diarization in multilingual conversational scenarios.
Language: Python - Size: 2.33 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 34 - Forks: 5

hwk06023/SONATA
SONATA (SOund and Narrative Advanced Transcription Assistant): An advanced ASR system that captures human expressions including emotive sounds and non-verbal cues.
Language: Python - Size: 527 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 0

NavodPeiris/speechlib
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
Language: Python - Size: 33.9 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 213 - Forks: 20

Zhima-Mochi/whisper-v3-server
A robust backend server for audio processing, delivering high-accuracy transcription and speaker diarization. Powered by Whisper for speech-to-text and Pyannote for speaker segmentation, wrapped in a clean, maintainable architecture based on Domain-Driven Design (DDD) and Hexagonal Architecture.
Language: Python - Size: 1.76 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Size: 81.1 KB - Last synced at: 12 days ago - Pushed at: 7 months ago - Stars: 1,735 - Forks: 232

Picovoice/falcon
On-device speaker diarization powered by deep learning
Language: Python - Size: 20.2 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 44 - Forks: 5

oddradiocircle/WAScribe
Transcribe WhatsApp audio messages using the Groq and Pyannote (Hugging Face) APIs.
Language: Python - Size: 123 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

IBM-Cloud/chatbot-watson-android 📦
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Language: Java - Size: 3.42 MB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 197 - Forks: 182

bunyaminergen/WavLMMSDD
This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvidia.
Language: Jupyter Notebook - Size: 1.8 MB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 6 - Forks: 3

manojpamk/pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
Language: Python - Size: 356 KB - Last synced at: 20 days ago - Pushed at: over 4 years ago - Stars: 315 - Forks: 64

mikeesto/gemini-transcribe
Transcribe audio and video files with speaker diarization and logically grouped timestamps using Gemini Flash
Language: TypeScript - Size: 1.74 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 19 - Forks: 2

google/speaker-id
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
Language: Python - Size: 175 MB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 411 - Forks: 39

wq2012/SpectralCluster
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
Language: Python - Size: 1.81 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 529 - Forks: 72

juanmc2005/rttm-viewer
Application for viewing Rich Transcription Time Marked (RTTM) files in an interactive way
Language: Python - Size: 2.23 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 41 - Forks: 5

yinruiqing/pyannote-whisper
Language: Python - Size: 3.34 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 574 - Forks: 100

alperensumeroglu/ai-clips-maker
AI-powered tool to turn long videos into short, viral-ready clips. Combines transcription, speaker diarization, scene detection & 9:16 resizing — perfect for creators & smart automation.
Language: Python - Size: 2.93 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

clement-pages/gryannote
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
Language: Svelte - Size: 2.65 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 61 - Forks: 7

marccasals98/Diarization
Framework for Speaker Diarization
Language: Python - Size: 73.2 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

aidayang/FunASR-OneClick
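Real-time speech recognition edition of FunASR: recognizes sound from the microphone and audio played on the computer; voice typing software for PC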
Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

DrJuChunKoO/TransPal-transcriber
WhisperX Slack bot for transcribing audio files
Language: Python - Size: 22.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

bitcointranscripts/tstbtc
This CLI app transcribes audio and video for submission to the bitcointranscripts repo
Language: Python - Size: 6.37 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 6 - Forks: 8

see2023/VoiceMind
Real-time voice assistant with multi-speaker recognition & tactical suggestions. Local AI processing for privacy-sensitive scenarios (debates/meetings/negotiations).
Language: Dart - Size: 1.23 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

Audio-WestlakeU/FS-EEND
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors" (ICASSP 2024) and "LS-EEND: long-form streaming end-to-end neural diarization with online attractor extraction"
Language: Python - Size: 3.22 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 116 - Forks: 5

RobCaamano/Youtube-English-to-Spanish
📺 End-to-End Solution for Translating YouTube Videos to Spanish
Language: Python - Size: 65.4 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 2

nezhar/speech-condenser
A tool for summarizing dialogues from videos or audio
Language: Python - Size: 241 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 10

DongKeon/Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
Size: 646 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 250 - Forks: 5

JaesungHuh/av-diarization
Audio-visual diarization pipeline used for creating VoxConverse dataset
Language: Python - Size: 12.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

ElmiraGhorbani/gpt-speaker-diarization
Conversational Speaker Diarization using OpenAI language models (GPT-4) and OpenAI Whisper.
Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 0

gorkemkaramolla/whisper-run
Faster Whisper with Speaker Diarization
Language: Python - Size: 8.32 MB - Last synced at: 24 days ago - Pushed at: 7 months ago - Stars: 6 - Forks: 0

itmo-mbss-lab/sr_lectures_book
The project is related to the development of Basics of Voice Biometrics lecture book for the ITMO Speaker Recognition Course.
Language: TeX - Size: 1.28 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

neuralwork/audio2chat
Convert multi-speaker audio files to structured chat data for LLMs
Language: Python - Size: 2.02 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

linto-ai/linto-diarization
Speaker diarization service
Language: Python - Size: 37 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 21 - Forks: 1

Jpzinn654/speaker-diarization-portuguese
This project implements speaker diarization for Portuguese audio using WhisperX for transcription and pyannote.audio's Speaker-Diarization 3.1 for speaker separation. It includes a Flask UI for easy file upload, transcription, and speaker identification.
Language: Python - Size: 24.4 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

al3xkras/sovits-svc-tools-docker
A unified docker environment combining SoVITS SVC fork, UVR5, audio-separator and pyannote.audio.
Language: Python - Size: 11.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

sohansai/speaker-diarization
A user-friendly interface for identifying and separating speakers in audio files using pyannote.audio and Gradio.
Language: Python - Size: 1000 Bytes - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

shashikg/X-Vector-Based-Speaker-Diarization
Course project for EE698R (2020-21 Sem 2). An X-Vector Based Speaker Diarization System with an AutoEncoder-based clustering method. Also supports spectral and KMeans clustering methods.
Language: Jupyter Notebook - Size: 97 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 14 - Forks: 0

nuaazs/VAF_2
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
Language: Python - Size: 32.7 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 403 - Forks: 21

mtwn105/audio-intel
AudioIntel - Audio/Video Intelligence, Transcripts, Summary, and much more
Language: TypeScript - Size: 660 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

yufan-aslp/AliMeeting
The project is associated with the recently launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenarios.
Language: Python - Size: 492 KB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 114 - Forks: 17

scionoftech/speaker_diarization
Speaker diarization using spectralcluster and deep learning
Language: Jupyter Notebook - Size: 188 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 0

wq2012/VB_diarization
VB Diarization with Eigenvoice and HMM Priors, refactored
Language: Python - Size: 32.4 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 15 - Forks: 3

NickNaskida/cog-whisper-diarization (fork of thomasmol/cog-whisper-diarization)
Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote
Language: Python - Size: 54.7 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

a-jain24/Diarization
Facilitates purely text-based diarization labeling of transcripts or other written conversational data using LLMs
Language: Jupyter Notebook - Size: 18.6 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

NickNaskida/insanely-fast-whisper (fork of chenxwh/insanely-fast-whisper)
Incredibly fast Whisper-large-v3 with speaker diarization
Language: Jupyter Notebook - Size: 396 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 1

ubclaunchpad/minutes
🔭 Speaker diarization via transfer learning
Language: Python - Size: 22.2 MB - Last synced at: 3 days ago - Pushed at: about 6 years ago - Stars: 27 - Forks: 5

werserk/TechStormHack-1st-place
Solution for the TechStorm competition run by the Tatneft corporation, on analyzing the activity of team members in video conference calls
Language: Python - Size: 5.77 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Rehan-Ahmad/SpeakerDiarization
Audio based speaker diarization
Language: Python - Size: 1.19 MB - Last synced at: 9 months ago - Pushed at: about 6 years ago - Stars: 16 - Forks: 1

Rehan-Ahmad/MultimodalDiarization
Multimodal speaker diarization using pre-trained audio-visual synchronization model
Language: Python - Size: 38.1 KB - Last synced at: 9 months ago - Pushed at: about 5 years ago - Stars: 9 - Forks: 6

GameOfPods/PAT
PodcastProject Analytics Toolkit - Project that creates analytics from various input data. Exported data is intended for use on a PodcastProject website
Language: Python - Size: 150 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

CiscoDevNet/vo-id
Language: Python - Size: 85.7 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 11 - Forks: 3

aeronjl/transcribe
Python package for accurate audio transcription with speaker diarisation
Language: Python - Size: 25.4 MB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

wq2012/SimpleDER
A lightweight library to compute Diarization Error Rate (DER).
Language: Python - Size: 79.1 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 59 - Forks: 9

FrenchKrab/IS2024-powerset-calibration
Companion repository to the paper "On the calibration of powerset speaker diarization models" published at Interspeech 2024
Language: HTML - Size: 28.4 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

luisst/SpeakerLID_GT_code
Speaker Diarization, Recognition and Language Identification. Scripts to generate GT using our WebApp and Praat software
Language: Python - Size: 41.1 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

cadia-lvl/kaldi-speaker-diarization
This repository creates speaker diarization recipes to be used within the egs folder of kaldi.
Language: Shell - Size: 78.1 KB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 13 - Forks: 3

aeronjl/transcribe-streamlit
Streamlit user interface for transcribing conversations with speaker diarisation
Language: Python - Size: 235 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

flo-bit/youtube-speaker-separation
simple python script that outputs separate audio files for each speaker in a youtube video, using whisper on replicate
Language: Python - Size: 2.93 KB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

katagaki/FiresideSubtitles
Video transcription, speaker diarization, and face detection in Python.
Language: Python - Size: 37.1 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

mathusanm6/Amaze-Voice-Lab
The goal of this research project was to control the movements of characters in a Maze game using real-time voice commands such as saying "Up", "Down", "Left", or "Right" out loud.
Language: Java - Size: 65.8 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

hitachi-speech/EEND
End-to-End Neural Diarization
Language: Python - Size: 50.6 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 354 - Forks: 57

taylorlu/Speaker-Diarization
speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
Language: Python - Size: 52.6 MB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 453 - Forks: 124

Wenhao-Yang/SpeakerVerifiaction-pytorch
Speaker Verification using PyTorch
Language: Jupyter Notebook - Size: 17.7 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 4

cvqluu/simple_diarizer
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
Language: Python - Size: 1.27 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 123 - Forks: 26

7egment/3D-Speaker-Diarization-Pipeline
A simplified and faster version of the speaker diarization pipeline in the 3D-Speaker toolkit by Alibaba DAMO Academy
Language: Python - Size: 30.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

mmxgn/smooth-convex-kl-nmf
Repository holding various implementations of specific NMF methods for speaker diarization
Language: Python - Size: 17.6 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 1

dptools/WhisperNote
Subtitle generation w/ Speaker Diarization using Whisper and pyannote.audio
Language: Python - Size: 86.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

nikitalpopov/master
Research for a master's degree
Language: Jupyter Notebook - Size: 213 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

pranshurastogi29/uis_rnn_for_speaker_diarization
Speaker diarization done on a toy dataset and tested on the TIMIT dataset
Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 0

Joost385/transcription-ui
Full-stack Transcription-UI: Features OpenAI Whisper and NVIDIA NeMo, with Docker for easy deployment.
Language: TypeScript - Size: 353 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

e6quisitory/pyannote-benchmark
pyannote.audio benchmark for NVIDIA GPUs
Language: Python - Size: 2.93 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

VidyasagarMSC/WatBot
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Language: Java - Size: 4.82 MB - Last synced at: 24 days ago - Pushed at: over 6 years ago - Stars: 72 - Forks: 53

doerlbh/MiniVox
Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".
Language: Cuda - Size: 998 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 25 - Forks: 5

SEERNET/Multi-Speaker-Diarization
Automated multi-speaker diarization API for meetings, calls, interviews, press conferences, etc.
Size: 13.7 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 11 - Forks: 0

team-re-verb/RE-VERB
speaker diarization system using an LSTM
Language: Python - Size: 135 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 48 - Forks: 8

FlorianKrey/DNC
Discriminative Neural Clustering for Speaker Diarisation
Language: Python - Size: 3.62 GB - Last synced at: 12 months ago - Pushed at: about 3 years ago - Stars: 78 - Forks: 14

kamakaya/gcp-speaker-diarization
Language: Jupyter Notebook - Size: 37.2 MB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 1

j-schmied/RealTimeSpeechRecognition
Various approaches for speech recognition and speaker diarization.
Language: Jupyter Notebook - Size: 2.86 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

cvqluu/Factorized-TDNN
PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi
Language: Python - Size: 278 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 140 - Forks: 34

theomariotte/SpeakerLoc
Speaker localization algorithms in the meeting context
Language: Python - Size: 332 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 2

haoming29/ez-transcription
An easy way to make a perfect audio transcript with the Whisper model and speaker diarization
Language: JavaScript - Size: 1.86 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

FrenchKrab/IS2023-powerset-diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
Language: Jupyter Notebook - Size: 705 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 1

cvqluu/TDNN
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
Language: Python - Size: 708 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 183 - Forks: 40

ryojiysd/speaker-diarization-sample
Sample code for the Google Cloud Speech API's speaker diarization feature
Language: JavaScript - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 3

Rajeshshashank/Speaker-Diarization
Speaker Diarization using Python, Flask and HTML
Language: HTML - Size: 161 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 2
