Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: chinese-dataset
brightmart/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
Size: 3.91 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 9,216 - Forks: 1,528
seanpm2001/AI2001_Category-Linguistics-SC-Chinese-Traditional
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Chinese-Traditional category for AI2001, containing Chinese (Traditional) language linguistic datasets
Language: R - Size: 1.74 MB - Last synced: 27 days ago - Pushed: about 1 year ago - Stars: 3 - Forks: 1
seanpm2001/AI2001_Category-Linguistics-SC-Chinese-Simplified
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Chinese-Simplified category for AI2001, containing Chinese (Simplified) language linguistic datasets
Language: R - Size: 1.77 MB - Last synced: 27 days ago - Pushed: about 1 year ago - Stars: 2 - Forks: 1
chaoswork/sft_datasets
A curated collection of open-source SFT datasets, updated on an ongoing basis
Size: 3.91 KB - Last synced: about 1 month ago - Pushed: 12 months ago - Stars: 340 - Forks: 29
sovaai/sova-dataset
Size: 43 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 110 - Forks: 7
CLUEbenchmark/QBQTC
QBQTC: A large-scale search matching dataset
Language: Python - Size: 10.6 MB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 44 - Forks: 6
zake7749/Gossiping-Chinese-Corpus
Chinese question-and-answer corpus from the PTT Gossiping board
Language: Jupyter Notebook - Size: 116 MB - Last synced: 2 months ago - Pushed: over 3 years ago - Stars: 226 - Forks: 36
ashleyfhh/HanSig
A large-scale offline Chinese handwritten signature dataset
Size: 64.5 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 4 - Forks: 0
Eurus-Holmes/CHABCNet
[CHABCNet] ABCNet on the Chinese dataset, building on Detectron2 (Facebook AI Research)
Language: Python - Size: 3.32 MB - Last synced: 7 months ago - Pushed: 8 months ago - Stars: 11 - Forks: 0
secsilm/zi-dataset
A Chinese character dataset with per-character information such as stroke count, radical, pinyin, and English definitions/synonyms.
Size: 1.57 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 50 - Forks: 8
lvyufeng/SciBERT_CN
Pretrained model for Chinese Scientific Text
Size: 462 KB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 36 - Forks: 2