Topic: "nlp-datasets"
mihail911/nlp-library
curated collection of papers for the nlp practitioner 📖👩🔬
Size: 63.5 KB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 1,075 - Forks: 91

hellohaptik/multi-task-NLP
multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.
Language: Python - Size: 7.46 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 358 - Forks: 54

dkulagin/kartaslov
Open linguistic datasets: the KartaSlovSent sentiment lexicon of the Russian language, a semantics dataset, an association graph, and a dataset of spelling errors and typos.
Size: 20.1 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 346 - Forks: 50

quincyliang/nlp-public-dataset
Chinese and English NER datasets, an English-Chinese machine translation dataset, and a Chinese word segmentation dataset.
Language: Python - Size: 12.9 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 320 - Forks: 75

guhhhhaa/4675-scifi
Chinese NLP corpus of Chinese science fiction: about 4,675 Chinese science fiction novels. A Chinese science fiction text corpus and text database for natural language processing.
Size: 113 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 277 - Forks: 50

irfnrdh/Awesome-Indonesia-NLP
Resource NLP & Bahasa
Size: 52.7 KB - Last synced at: 4 days ago - Pushed at: over 5 years ago - Stars: 269 - Forks: 67

grammarly/ua-gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Language: Macaulay2 - Size: 18 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 261 - Forks: 22

StonyBrookNLP/appworld
🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents", ACL'24 Best Resource Paper.
Language: Python - Size: 5.16 MB - Last synced at: 22 days ago - Pushed at: about 1 month ago - Stars: 201 - Forks: 19

liutiedong/goat
a Fine-tuned LLaMA that is Good at Arithmetic Tasks
Language: Jupyter Notebook - Size: 863 KB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 177 - Forks: 17

cjiang2/VDCNN
Implementation of Very Deep Convolutional Neural Network for Text Classification
Language: Python - Size: 42 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 171 - Forks: 40

INK-USC/TriggerNER
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition (ACL 2020)
Language: Python - Size: 2.22 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 170 - Forks: 19

INK-USC/CommonGen
A Constrained Text Generation Challenge Towards Generative Commonsense Reasoning
Language: Python - Size: 107 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 136 - Forks: 23

xtea/chinese_medical_words
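Manually curated corpus of medical-industry vocabulary and terminology. Can be used to train NLP models of all kinds, such as speech recognition and dialogue systems.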
Size: 1.33 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 85 - Forks: 31

Niger-Volta-LTI/yoruba-text
Yorùbá language training text for NLP, ASR and TTS tasks
Language: Python - Size: 76.2 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 76 - Forks: 26

Pzoom522/HistSumm
Code and data for "Summarising Historical Text in Modern Languages" (EACL 2021)
Language: Jupyter Notebook - Size: 237 KB - Last synced at: 3 months ago - Pushed at: about 4 years ago - Stars: 72 - Forks: 9

kelvin-jiang/FreebaseQA
The release of the FreebaseQA data set (NAACL 2019).
Size: 7.8 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 59 - Forks: 1

fido-ai/ua-datasets
A collection of datasets for Ukrainian language
Language: Python - Size: 2.08 MB - Last synced at: 2 days ago - Pushed at: 11 months ago - Stars: 57 - Forks: 2

gcunhase/AMICorpusXML
Extracts Transcript and Summary (Abstractive and Extractive) from the AMI Meeting Corpus
Language: Python - Size: 9.48 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 52 - Forks: 29

selimfirat/bilkent-turkish-writings-dataset
Compilation of Turkish writings dataset that promotes creativity, content, composition, grammar, spelling and punctuation.
Language: Python - Size: 41.3 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 50 - Forks: 2

secsilm/zi-dataset
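Chinese character dataset, including information about each character such as stroke count, radical, pinyin, and English definitions/synonyms.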
Size: 1.57 MB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 50 - Forks: 8

guhhhhaa/wula-scifi
Chinese NLP corpus of Chinese science fiction: archive of the Ark Plan of the Ula science fiction website (乌拉科幻小说网). A Chinese science fiction text corpus and text database for natural language processing.
Size: 199 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 49 - Forks: 9

AndyTheFactory/romanian-nlp-datasets
A list of Romanian NLP Datasets
Size: 215 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 48 - Forks: 8

matt-seb-ho/WikiWhy
WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000+ "why" question-answer-rationale triplets.
Language: Python - Size: 28.2 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 41 - Forks: 1

afrisenti-semeval/afrisent-semeval-2023
AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages : https://afrisenti-semeval.github.io/
Language: Jupyter Notebook - Size: 33 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 38 - Forks: 38

gkiril/benchie
Comprehensive evaluation framework for Open Information Extraction.
Language: Python - Size: 340 KB - Last synced at: 12 months ago - Pushed at: almost 3 years ago - Stars: 38 - Forks: 8

uma-pi1/OPIEC
Reading the data from OPIEC - an Open Information Extraction corpus
Language: Java - Size: 237 KB - Last synced at: 7 days ago - Pushed at: about 6 years ago - Stars: 37 - Forks: 6

bothub-it/bothub
Bothub is an open platform for predicting, training and sharing NLP datasets in multiple languages
Language: Makefile - Size: 1.17 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 35 - Forks: 5

gpt-tester/ChatGPT-test-dataset-01
a small test dataset for use with OpenAI's ChatGPT
Size: 47.9 KB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 34 - Forks: 11

ElizaLo/Question-Answering-based-on-SQuAD Fork of gauthierdmn/question_answering
Question Answering System using BiDAF Model on SQuAD v2.0
Language: Python - Size: 7.27 MB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 25 - Forks: 27

cybermatt/russian-names
Library for generation of russian names
Language: Python - Size: 628 KB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 24 - Forks: 2

INK-USC/XCSR
Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"
Language: Python - Size: 60.7 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 20 - Forks: 2

utahnlp/infotabs-code
Implementation of the semi-structured inference model in our ACL 2020 paper, INFOTABS: Inference on Tables as Semi-structured Data.
Language: Python - Size: 127 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 7

JadynHax/scpscraper
A Python library designed for scraping data from the SCP wiki.
Language: Python - Size: 216 KB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 15 - Forks: 4

maxent-ai/Datasets 📦
datasets with text data for use in NLP, Text analysis, information extraction, ML research.
Language: Jupyter Notebook - Size: 45.7 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 15 - Forks: 3

aajanki/finnish-nlp-datasets
Open Finnish NLP datasets
Size: 30.3 KB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 14 - Forks: 1

jamesohortle/loanwords_gairaigo
English loanwords in Japanese
Language: Python - Size: 17 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 1

uma-pi1/OPIEC-pipeline
Language: Java - Size: 59.3 MB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 14 - Forks: 2

MiniXC/opensubtitles-dataloader
Loads OpenSubtitles v2018 dataset without having to load everything into memory at once. Works well with pytorch.
Language: Python - Size: 26.4 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 13 - Forks: 2

trisongz/pylines
Simplifying parsing of large jsonline files in NLP Workflows
Language: Python - Size: 244 KB - Last synced at: 2 days ago - Pushed at: over 3 years ago - Stars: 12 - Forks: 1

aryashah2k/SASBitathon-WinningSolution
1st place solution for the SAS | GIM Bitathon, an annual data science hackathon organized by SAS and Goa Institute of Management. The dataset used is a subset of the consumer complaints database provided by www.consumerfinance.gov
Language: Jupyter Notebook - Size: 39.6 MB - Last synced at: 5 days ago - Pushed at: over 3 years ago - Stars: 11 - Forks: 1

SemiringInc/Mueller-Report-Corpus
The Mueller Report Corpus V 0.1
Size: 3.51 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 11 - Forks: 0

divkakwani/webcorpus
Generate large textual corpora for almost any language by crawling the web
Language: Python - Size: 44.9 MB - Last synced at: 18 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 11

mnschmit/SherLIiC
A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference
Language: Python - Size: 20.8 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 8 - Forks: 1

INK-USC/RiddleSense
RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
Language: Python - Size: 16.3 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 1

mtala3t/Identify-the-Sentiments-AV-NLP-Contest
A Python implementation submitted to the Analytics Vidhya contest "Identify the Sentiments". I enjoyed taking part in this competition and its whole process. The submitted solution ranked 118 on the public leaderboard.
Language: Python - Size: 7.61 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 7 - Forks: 2

marco-roberti/pytorch-e2e-dataset
The E2E Dataset, packed as a PyTorch DataSet subclass
Language: Python - Size: 97.7 KB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 0

Dibyakanti/AutoTNLI-code
This repository contains the official code for the paper: Realistic Data Augmentation Framework for Enhancing Tabular Reasoning.
Language: HTML - Size: 3.99 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 6 - Forks: 1

LIAAD/PT-Pump-Up
Hub for the Portuguese language NLP Resources
Language: PHP - Size: 8.37 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

mehrdad-dev/Battle-of-the-Wordsmiths
Official github repository: Battle of the Wordsmiths: Comparing ChatGPT, GPT-4, Claude, and Bard (dataset)
Language: Python - Size: 614 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 1

JasonShao55/Chinese_Metaphor_Explanation
An annotated Chinese metaphor dataset
Language: Python - Size: 71.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

fzehracetin/turkish-question-answering
We extracted 5,000 question-answer pairs from Turkish Wikipedia and fine-tuned Turkish BERT, ALBERT, ELECTRA for the question-answering task.
Language: Jupyter Notebook - Size: 1.52 MB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 6 - Forks: 1

kushalchauhan98/ticket-segmentation
Data for the ACL 2020 paper - Improving Segmentation for Technical Support Problems
Size: 1.22 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 2

gcunhase/ArXivAbsTitleDataset
Extract Abstract and Title Dataset from arXiv articles
Language: Python - Size: 14 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 5 - Forks: 0

griff4692/clin-sum
Analysis of Hospital-Course Summaries
Language: Python - Size: 336 KB - Last synced at: 4 months ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 1

bavard-ai/nlu-meta-dataset
A large dataset for learning to perform few-shot intent classification.
Size: 1.4 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 1

Bohdan-Khomtchouk/NERO-nlp
NERO-nlp is a PyPI package for biomedical Named Entity (Recognition) Ontology
Language: Python - Size: 29.9 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 1

navneetkrc/Flair_SOTA_NLP
Use of State of the Art FLAIR library for the NLP datasets
Language: Jupyter Notebook - Size: 1.12 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 4 - Forks: 0

dellison/NLIDatasets.jl
Julia interface to datasets for natural language inference
Language: Julia - Size: 48.8 KB - Last synced at: 12 days ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

Delta-Sigma/urdu-stopwords
A list containing Urdu stopwords.
Size: 25.4 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 11

StonyBrookNLP/appworld-leaderboard
🌍 Leaderboard Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents", ACL2024
Language: Python - Size: 127 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 3 - Forks: 1

Jpzinn654/qa-portuguese-v1
A 500-thousand-row split of a Portuguese dataset from Hugging Face, for training NLP models for question answering
Language: Python - Size: 4.88 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 3 - Forks: 0

poethan/AlphaMWE
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations
Size: 265 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 2

d0rj/RusLit
📚 A small collection of Russian literature 📚
Size: 20.7 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 2

ArmanBehnam/NLP
Natural language processing, including datasets, Farsi NLP, Automated Essay Scoring, Automatic Speech Recognition, etc.
Language: Jupyter Notebook - Size: 512 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

jrgpulido/js19is2e
Language: TeX - Size: 27.5 MB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 10

U-11-Agar/timeseries-analysis
time series data analysis on real time data and csv files
Language: Jupyter Notebook - Size: 73 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

turkish-nlp-suite/Vitamins-Supplements-NER-dataset
Repo for Turkish Vitamins and Supplements NER dataset.
Size: 558 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 1

MiSaengg/gunhee-RnD-space
R&D for datasets for book genres
Language: Jupyter Notebook - Size: 17.1 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

PranavNV/Nationality-Prejudice-in-Text-Generation
This project focuses on the analysis of text generation models such as GPT-2 to identify and understand populist behaviors or biases against various nationalities.
Size: 20.1 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

mzhukovaucsb/emoji_gestures
Research project “Gesture Emoji Twitter Corpus”. Project description, data collection pipeline (tweepy), data preprocessing functions (regex, nltk), 2 datasets for Russian and English published in open access.
Language: Jupyter Notebook - Size: 125 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

DravidianNLP/Datasets
This repository hosts all the datasets published in Dravidian Languages.
Size: 11.7 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

theQuert/NLP-sentiment-analysis
Sentiment Analysis models with multiple algorithms
Language: Jupyter Notebook - Size: 5.63 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 2

vgupta123/infotabs-code Fork of utahnlp/infotabs-code
Implementation of the semi-structured inference model in our ACL 2020 paper. INFOTABS: Inference on Tables as Semi-structured Data
Language: Python - Size: 138 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 0

jrgpulido/pd18is5d
Language: Roff - Size: 11.1 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 13

BigToothDev/pet-project-nlp
Natural language processing pet project. It includes data web scraping, lemmatizing, stemming, and working with related words (hyponyms, hypernyms, meronyms, holonyms). This specific code gathers all data from chosen pages of the Suspilne (Суспільне) webpage. Next, the data is manipulated and processed for future analysis
Language: Python - Size: 48.5 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

Robert-Morabito/STOP
Repository for the paper STOP! Benchmarking Large Language Models with Sensitivity Testing on Offensive Progressions (EMNLP 2024)
Language: Python - Size: 375 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

christosojan/MSA_in_Indian_Languages
Implementation of Dense Fusion Network with Multimodal Residual (DFMR) for Multi-modal Sentiment Analysis (MSA) in native Indian languages like Malayalam by integrating multi-modal information from multimedia. The model processes the textual, visual, and auditory modalities of the video to classify the sentiment into five categories.
Language: Jupyter Notebook - Size: 31.3 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

hicte/moin
A dataset of Moin Persian 🇮🇷 dictionary 📖 words.
Size: 265 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Yu-billie/NLP-Project-CUAI-1H23
NLP Projects in CUAI 1H23
Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

HuynhXuanLam-IT44/BERT-Covid-Sentiment-Classification
Applying and Understanding an Advanced, Novel Deep Learning Approach
Language: Jupyter Notebook - Size: 2.55 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

aman-17/BERT-Semantic-Similarity-Flask-App
Flask app for Semantic Similarity of sentences using BERT model.
Language: CSS - Size: 6.06 MB - Last synced at: 12 days ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

yarakyrychenko/tg-misinfo-data
Telegram posts from Russian news, misinformation, and propaganda channels made during the first weeks of the 2022 Russian invasion of Ukraine.
Size: 37.1 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

sammitjain/loksabha-questions
Questions asked in the Lok Sabha - collection and analysis of trends. Creating the dataset from scratch.
Language: Jupyter Notebook - Size: 80.7 MB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 1

Utkichaps/AMICorpus-Meeting-Transcript-Extraction
This can be used to convert the AMI corpus meeting transcripts to a speaker-by-speaker dialogue discourse conversation for each meeting.
Language: Python - Size: 243 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

cedspam/text_dataset_streaming
Language: Python - Size: 72.3 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

tagtog/BBC-News-Dataset
🍃BBC-News-Dataset in anndoc (tagtog) format
Language: HTML - Size: 2.81 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

DARK-art108/FinBox-NLP-Exercise
An NLP Exercise
Language: Jupyter Notebook - Size: 91.8 KB - Last synced at: 4 months ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

Text-Mining/Ferdowsi-Annotated-Academic-Linguistic-Corpus
Two linguistic corpora built from the collection of articles of Ferdowsi University of Mashhad
Size: 57.6 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

praveentn/nlpaeg
Natural Language Processing for Artificial Error Generation
Language: Python - Size: 6.35 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

vaibhav0000patel/Topical-Sentiment-Analysis
ML model that recognizes how closely a text relates to the particular topic the model was trained on. The modular structure of the code makes it easier to understand and modify. Here, the model classifies whether a text is crime-related or not.
Language: Python - Size: 483 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

nelson888/seq2seq-data-augmentation
Soft Contextual Data Augmentation, a Data Augmentation method for NLP translation datasets
Language: Java - Size: 41 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

language-resources-nepal/language-resources-nepal.github.io
A curated collection of language resources for Nepal
Language: SCSS - Size: 87.9 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

Selium98/Flix-Master
Advanced movie recommender, with Flask as the framework for the user interface, deployed on Heroku.
Language: HTML - Size: 1.11 MB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

Karan-Malik/WordEmbeddings
Creating Word Embeddings using Keras
Language: Jupyter Notebook - Size: 24.5 MB - Last synced at: 4 months ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

rareloto/beginnerwebscraping-naverdictionary
Scraping Korean-English parallel conversation text pairs from Naver Conversation of the Day
Language: Jupyter Notebook - Size: 682 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

Shayokh144/Bengali-Literature-Data-Collection
Size: 943 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

roshangrewal/natural-language-processing
Natural language processing is a subfield of linguistics, computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human languages, in particular how to program computers to process and analyze large amounts of natural language data.
Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

navanith007/ULMFit-using-pytorch
This repository contains solutions to NLP problems using transfer learning
Language: Jupyter Notebook - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

saxenaprerit/Text_mining_using_NLTK
This code uses NLTK for text mining in Python
Language: Jupyter Notebook - Size: 14.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

clulab/edin-data
Biomolecular events mined by Reach from PubMed Central
Size: 3.29 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1
