An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: korean-text-processing

bab2min/Kiwi

Kiwi (intelligent Korean morphological analyzer)

Language: C++ - Size: 401 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 580 - Forks: 52

okikirmui/nkhandic

NK-HanDic: morphological analysis dictionary for North Korean language

Language: Perl - Size: 170 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

okikirmui/handic

HanDic: a morphological analysis dictionary for contemporary Korean

Language: Perl - Size: 239 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

lovit/soynlp

A Python library for Korean natural language processing. It provides word extraction, tokenization, part-of-speech tagging, and preprocessing.

Language: Python - Size: 34.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 962 - Forks: 185

open-korean-text/open-korean-text

Open Korean Text Processor - An Open-source Korean Text Processor

Language: Scala - Size: 32.7 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 625 - Forks: 98

selfcontrol7/Korean_Voice_Phishing_Detection

All code implemented for the Korean voice phishing detection papers

Language: Jupyter Notebook - Size: 135 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 14 - Forks: 8

shineware/PyKOMORAN

(Beta) PyKOMORAN is a Python wrapper for KOMORAN built with Py4J.

Language: Python - Size: 34.8 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 42 - Forks: 6

NLP-kr/tensorflow-ml-nlp-tf2

Hands-on materials for "Natural Language Processing with TensorFlow 2 and Machine Learning (from Logistic Regression to BERT and GPT3)"

Language: Jupyter Notebook - Size: 200 MB - Last synced at: 16 days ago - Pushed at: over 2 years ago - Stars: 274 - Forks: 137

shineware/KOMORAN

Korean Morphological Analyzer by shineware

Language: Java - Size: 111 MB - Last synced at: 5 days ago - Pushed at: about 2 years ago - Stars: 291 - Forks: 65

bytecell/slotminer

Tool for slot extraction from text

Language: Python - Size: 132 KB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 2

inhaKDD/UKTA-web Fork of ttytu/UKTA-web

Language: Python - Size: 88.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 2

ttytu/UKTA-web

Unified Korean Text Analyzer including morpheme analysis, lexical features, and writing evaluation.

Language: Python - Size: 88.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 3

ttop32/coqui_tts_korea

Korean TTS using coqui TTS (glowtts and multiband melgan)

Language: Jupyter Notebook - Size: 2.79 MB - Last synced at: 14 days ago - Pushed at: over 3 years ago - Stars: 57 - Forks: 17

EX3exp/NetKiwi

Multiplatform C# wrapper of Kiwi (intelligent Korean morphological analyzer).

Language: C++ - Size: 56.6 MB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

tenebo/g2pk2 Fork of harmlessman/g2pkk

Updated fork of g2pk

Language: Python - Size: 66.4 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 3

mkpoli/koconv

JS library to convert Korean Hangul text

Language: TypeScript - Size: 35.2 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

33tm/Parrot

Reversing Korean romanization with AI

Language: Python - Size: 157 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

lovit/KR-WordRank

A library that automatically extracts words and keywords from Korean text using unsupervised learning.

Language: Python - Size: 4.55 MB - Last synced at: 4 months ago - Pushed at: about 3 years ago - Stars: 354 - Forks: 57

crizin/korean-utils

A Java library that provides various utility functions for processing and manipulating Korean text

Language: Java - Size: 89.8 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

jeongukjae/korean-spacing-model

A Korean sentence spacing model (removing and adding spaces). It is written so that you can train it yourself after preparing the data.

Language: Python - Size: 2.21 MB - Last synced at: 30 days ago - Pushed at: almost 3 years ago - Stars: 57 - Forks: 4

NLP-kr/tensorflow-ml-nlp

Natural Language Processing with TensorFlow and Machine Learning (from Logistic Regression to a Transformer Chatbot)

Language: Jupyter Notebook - Size: 154 MB - Last synced at: 16 days ago - Pushed at: over 4 years ago - Stars: 200 - Forks: 103

ychoi-kr/ko-prfrdr

Utils for Korean proofreaders

Language: Python - Size: 24.2 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 58 - Forks: 7

llami-team/tetrapod

Improved swear word detection module

Language: JavaScript - Size: 130 KB - Last synced at: 5 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

open-korean-text/elasticsearch-analysis-openkoreantext

Korean analysis plugin that integrates open-korean-text module into elasticsearch.

Language: Java - Size: 13.1 MB - Last synced at: 6 months ago - Pushed at: almost 2 years ago - Stars: 127 - Forks: 22

bab2min/kiwi-gui

C# API for Kiwi

Language: C# - Size: 190 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 10 - Forks: 5

ttop32/KoGPT2novel

Generate novel text - novel finetuned from skt KoGPT2 base v2 - Korean

Language: Jupyter Notebook - Size: 138 KB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 2

lovit/crf_postagger

Korean Part-of-Speech Tagger using Conditional Random Field (CRF)

Language: Python - Size: 68.7 MB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 12 - Forks: 4

Astro36/kotka

Korean Obfuscation ToolKit Advanced

Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 1

rasoio/daon

Korean (Hangul) morphological analyzer

Language: Java - Size: 68.7 MB - Last synced at: 5 months ago - Pushed at: over 6 years ago - Stars: 20 - Forks: 5

hmmhmmhm/hangul-search-js

🇰🇷 Simple Korean text search module

Language: TypeScript - Size: 1.49 MB - Last synced at: 27 days ago - Pushed at: over 3 years ago - Stars: 25 - Forks: 1

lovit/customized_konlpy

Customized KoNLPy - Korean Natural Language Processing Toolkit KoNLPy wrapping code

Language: Python - Size: 929 KB - Last synced at: 15 days ago - Pushed at: over 6 years ago - Stars: 126 - Forks: 24

storidient/KoBookNLP

Natural Language Processing Library for Korean Literary Text. (Will be open in February, 2024)

Language: Python - Size: 727 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

kimcore/Josa.kt

A Kotlin library that automatically corrects Korean particles (josa).

Language: Kotlin - Size: 92.8 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 12 - Forks: 1

kimcore/inko.kt

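🇰🇷 An open-source Kotlin library that converts text typed on an English keyboard layout into Hangul, and text typed on a Korean layout into English (implementation of inko.js)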

Language: Kotlin - Size: 89.8 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 13 - Forks: 0

coarchive/hangul-unicode 📦

A library to process and standardize hangul characters

Language: JavaScript - Size: 2.34 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

oneonlee/KR-Emotional-Analysis

2023 National Institute of Korean Language AI Language Proficiency Evaluation: emotion analysis task

Language: Jupyter Notebook - Size: 13.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

SOMJANG/Mecab-ko-for-Google-Colab

Use the Mecab library (an NLP library) in Google Colab

Language: Shell - Size: 1.68 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 61 - Forks: 29

Keracorn/geulstagram

📷 Building the Geulstagram dataset

Language: Python - Size: 16.4 MB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 9

minseok0809/korean-sentence-segementation

AIHub Korean data preprocessing: Korean sentence segmentation

Language: Jupyter Notebook - Size: 2.61 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

affjljoo3581/langumo-ko

A collection of langumo parsers for Korean corpora

Language: Python - Size: 27.3 KB - Last synced at: 18 days ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 2

steamb23/Naramal

This library is designed for flexible Korean processing in C#.

Language: C# - Size: 187 KB - Last synced at: 8 months ago - Pushed at: almost 5 years ago - Stars: 11 - Forks: 0

SohyeonKim-dev/Textinit

A Korean text generator using GPT-3 and MLKit

Language: Swift - Size: 272 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

fingeredman/teanaps

An open-source Python library for natural language processing and text analysis.

Language: Jupyter Notebook - Size: 62.5 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 92 - Forks: 11

passing2961/KMRE

Korean Movie Review Emotion (KMRE) Dataset

Size: 23.1 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 17 - Forks: 0

yc9701/pansori-tedxkr-corpus

Korean ASR Corpus generated from TEDx talks

Size: 163 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 26 - Forks: 4

yonkmanjl/hangul-convert

Converts English words to the Korean alphabet

Language: Java - Size: 863 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

this-is-my-life/KoreanScript 📦

Coding? Let's start in Korean! "Hangul Script"

Language: JavaScript - Size: 866 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 0

ni-inha/topic-modeling-of-mom-community

Topic modeling analysis of the feeding Q&A board of the Naver Cafe "Momsholic Baby"

Language: Jupyter Notebook - Size: 532 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ni-inha/De-identification-of-Korean-names-in-clinical-notes

Rule-based de-identification of Korean names in EMR clinical notes

Language: Python - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

tgisaturday/CNN-text-classification

Multi-class text classification using text-CNN and KoNLPy

Language: Python - Size: 7.81 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 6 - Forks: 2

usik/usik_nlp

basic framework for NLP tasks.

Language: Python - Size: 45.9 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

shineware/RKOMORAN

RKOMORAN is a KOMORAN wrapper for R users

Language: R - Size: 15 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 0

abdalimran/pykotokenizer

PyKoTokenizer is a Korean text tokenizer for Korean Natural Language Processing tasks.

Language: Python - Size: 10.6 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

mohenjo/Hangul.Net

A .NET Framework class library for Hangul processing

Language: C# - Size: 21.5 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

Kiminjo/Extract-company-preference-factors Fork of UnstructuredDataProject/Unstructured-Data

Based on company review data, company preference factors are derived. This project was conducted as a part of the "Unstructured Data Analysis" class at the Department of Data Science, Seoul National University of Science and Technology

Language: Jupyter Notebook - Size: 4.74 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ossteam8/LDA-TextRank-keyword

Keyword extractor using LDA and TextRank combined

Language: Jupyter Notebook - Size: 44.1 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 0

jaeyung1001/NLP Fork of ahroobe/NLP

Natural Language Processing for Korean.

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 2

codebasic/pyko

Korean Text Processing using Python

Language: Python - Size: 6.49 MB - Last synced at: 1 day ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

mohenjo/PyHangulUtils

A Python module for processing Hangul characters and strings

Language: Python - Size: 15.6 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 2

woodongk/geulstagram Fork of Keracorn/geulstagram

Building the Geulstagram dataset

Language: Python - Size: 6.96 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

alexlaurence/DoReMi

🌊 DoReMi uses WaveNet to synthesise speech from online job listings for blind Korean speakers

Language: Python - Size: 11.9 MB - Last synced at: 2 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0