GitHub topics: named-entity-recognition
hankcs/HanLP
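Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and Simplified/Traditional Chinese conversion, natural language processing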
Language: Python - Size: 69.5 MB - Last synced at: about 7 hours ago - Pushed at: about 9 hours ago - Stars: 35,861 - Forks: 10,847
spencermountain/compromise
modest natural-language processing
Language: JavaScript - Size: 55.2 MB - Last synced at: about 9 hours ago - Pushed at: 4 months ago - Stars: 11,944 - Forks: 659
khteh/pAIthon
Python AI, ML, DL and NLP exploration playground.
Language: Python - Size: 2.21 GB - Last synced at: about 22 hours ago - Pushed at: about 23 hours ago - Stars: 1 - Forks: 0
The-FinAI/PIXIU
This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).
Language: Jupyter Notebook - Size: 49.5 MB - Last synced at: 1 day ago - Pushed at: 9 months ago - Stars: 804 - Forks: 106
datagodzilla/medical-nlp-lean
Medical Entities Recognition
Language: Python - Size: 1020 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0
Bobbywasher/cc-cli
🔄 Switch and manage Claude Code configurations easily with CC CLI, featuring multi-site support, smart merging, and cloud backups.
Language: JavaScript - Size: 1.66 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0
LALITCHAROLA/genr-kit
🚀 Prototype and deploy generative AI applications with ease using Python, Gradio, and Transformers for text, image, and speech tasks.
Language: Python - Size: 12.7 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0
M4UNC/PDF-Package-Analyzer
🔍 Analyze PDF files effectively with this Python tool, testing compatibility across libraries to guide optimal PDF processing solutions.
Language: Python - Size: 1.35 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0
fmadore/iwac-ai-pipelines
AI pipelines for Omeka S digital collections - OCR correction, entity extraction, and text analysis
Language: Python - Size: 208 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0
ICIJ/datashare
A self‑hosted search engine for documents
Language: Java - Size: 396 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 667 - Forks: 64
Knowledge-Graph-Hub/kg-microbe
Language: Jupyter Notebook - Size: 536 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 20 - Forks: 3
explosion/spacy-llm
🦙 Integrating LLMs into structured NLP pipelines
Language: Python - Size: 1.79 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 1,342 - Forks: 104
explosion/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
Language: Python - Size: 194 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 32,793 - Forks: 4,620
taishan1994/awesome-chinese-ner
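Chinese named entity recognition. Includes the latest Chinese NER papers, Chinese entity recognition tools, and datasets, as well as Chinese pretrained models, word embeddings, and NER surveys.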
Size: 246 KB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 749 - Forks: 57
mejba-alam/spaCy
🧠 Enhance your applications with spaCy, a powerful library for advanced Natural Language Processing in Python and Cython, supporting 70+ languages.
Language: Python - Size: 17.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0
MAbdelhamid2001/POS-NER-Tagger
Language: Jupyter Notebook - Size: 29.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0
ankane/mitie-ruby
Named-entity recognition for Ruby
Language: Ruby - Size: 89.8 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 178 - Forks: 7
microsoft/presidio
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
Language: Python - Size: 254 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 6,057 - Forks: 839
mirpo/fastapi-gen
Build LLM-enabled FastAPI applications without build configuration.
Language: Python - Size: 1.09 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 10 - Forks: 1
zjunlp/DeepKE
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
Language: Python - Size: 121 MB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 4,182 - Forks: 731
stanfordnlp/stanza
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
Language: Python - Size: 82.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 7,646 - Forks: 927
winstxnhdw/llm-api
A fast CPU-based API for Qwen 2.5 using CTranslate2, hosted on Hugging Face Spaces.
Language: Python - Size: 1.4 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 2
SydAirAhd74/Smart_v0.0
AI privacy tool for pure edge computing utilizing translation, transcription across Hebrew, English and Farsi, Summarization, NER, Action Item Recognition, Timeline Extraction, Sentiment Analysis and Recommendations of the uploaded file.
Language: Python - Size: 15.6 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0
JohnSnowLabs/spark-nlp
State of the Art Natural Language Processing
Language: Scala - Size: 3.46 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 4,068 - Forks: 733
ankane/mitie-php
Named-entity recognition for PHP
Language: PHP - Size: 49.8 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 28 - Forks: 5
hitz-zentroa/GoLLIE
Guideline following Large Language Model for Information Extraction
Language: Python - Size: 10.8 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 409 - Forks: 28
flairNLP/flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Language: Python - Size: 377 MB - Last synced at: 6 days ago - Pushed at: 20 days ago - Stars: 14,319 - Forks: 2,130
daviden1013/llm-ie
A comprehensive toolkit that provides building blocks for LLM-based named entity recognition, attribute extraction, and relation extraction pipelines.
Language: Python - Size: 11.6 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 37 - Forks: 4
ankane/informers
Fast transformer inference for Ruby
Language: Ruby - Size: 2.48 MB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 589 - Forks: 17
stanfordnlp/CoreNLP
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
Language: Java - Size: 380 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 9,995 - Forks: 2,719
thoughtbot/top_secret
Filter sensitive information from free text before sending it to external services or APIs, such as chatbots and LLMs.
Language: Ruby - Size: 134 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 288 - Forks: 6
4AI/LS-LLaMA
A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning
Language: Python - Size: 3.54 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 154 - Forks: 24
lonePatient/TorchBlocks
A PyTorch-based toolkit for natural language processing
Language: Python - Size: 481 KB - Last synced at: about 6 hours ago - Pushed at: over 2 years ago - Stars: 160 - Forks: 27
Tongjilibo/bert4torch
An elegant PyTorch implementation of transformers
Language: Python - Size: 11.3 MB - Last synced at: 2 days ago - Pushed at: 12 days ago - Stars: 1,331 - Forks: 168
ukairia777/tensorflow-nlp-tutorial
A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to the downstream tasks of recent models such as Topic Models, BERT, GPT, and LLMs.
Language: Jupyter Notebook - Size: 126 MB - Last synced at: about 23 hours ago - Pushed at: 5 months ago - Stars: 565 - Forks: 291
shiva0824/Jobs
An end-to-end NLP project that extracts skills from job descriptions, builds job–resume matching recommendations, and showcases deployment with FastAPI and AWS.
Language: Python - Size: 15.1 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0
undertheseanlp/underthesea
Underthesea - Vietnamese NLP Toolkit
Language: Python - Size: 166 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1,622 - Forks: 288
urchade/GLiNER
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
Language: Python - Size: 31.1 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 2,469 - Forks: 226
viclang/anonymacy
anonymaCy is a spaCy extension for anonymizing PII using rule-based recognizers, context-aware processing, conflict resolution and customizable anonymization.
Language: Python - Size: 626 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 2 - Forks: 0
quqxui/Awesome-LLM4IE-Papers
Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)
Size: 1.5 MB - Last synced at: 10 days ago - Pushed at: 12 months ago - Stars: 1,023 - Forks: 60
amirivojdan/shekar
Simplifying Persian NLP for Modern Applications
Language: Python - Size: 23 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 47 - Forks: 3
mawiesne/DE-NERmed
DE-NERmed: An OpenNLP named entity recognition tool and model files trained for medical NLP use cases
Language: Java - Size: 345 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0
FuxiaoLiu/VisualNews-Repository
[EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning
Language: Jupyter Notebook - Size: 6.94 MB - Last synced at: about 2 hours ago - Pushed at: over 1 year ago - Stars: 100 - Forks: 9
bogwi/rookeen
spaCy-based CLI for web linguistic analysis with embeddings, sentiment, POS/NER, and Unix pipeline composability. Outputs JSON, Parquet, CoNLL-U for ML workflows.
Language: Python - Size: 318 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0
Nishtha031105/ner-ml-project
Named Entity Recognition tool using NLP and React
Language: Python - Size: 60.5 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0
freinold/GLiNER-API
Easily configurable API & frontend providing simple access to dynamic NER models.
Language: Python - Size: 3.34 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 5 - Forks: 1
Apex-05/REALM
Real-Time Analysis of Linguistic Media
Language: Jupyter Notebook - Size: 4.44 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0
keerthanap8898/Neural-CRF_NER-Tagger
How to build a baby-BERT: I analyze BiLSTMs combined with Conditional Random Fields for Named Entity Recognition and contrast a Neural-CRF tagger against a baseline BiLSTM model, exploring how probabilistic sequence dependencies improve contextual understanding beyond token-level classification.
Language: Jupyter Notebook - Size: 5.46 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 1 - Forks: 0
Aishwaraya-Dharmadhikari/NLP_Programs
All Natural Language Processing Programs
Size: 6.84 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0
yyDing1/GNER
[ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"
Language: Python - Size: 4.69 MB - Last synced at: about 24 hours ago - Pushed at: over 1 year ago - Stars: 59 - Forks: 2
emrecncelik/weighted-bert
Unofficial implementation of the paper "A Text Document Clustering Method Based on Weighted BERT Model".
Language: Python - Size: 44.9 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 4 - Forks: 0
Chenfeng1271/awesome-MNER
awesome-multimodal-named-entity-recognition
Size: 176 KB - Last synced at: 15 days ago - Pushed at: about 2 years ago - Stars: 60 - Forks: 5
mhbashari/awesome-persian-nlp-ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Size: 192 KB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 767 - Forks: 115
smilelight/lightKG
A deep learning framework for knowledge graphs, based on PyTorch and torchtext.
Language: Python - Size: 91.8 KB - Last synced at: 20 days ago - Pushed at: over 5 years ago - Stars: 615 - Forks: 150
mmarouen/ds-gear
data science gear: Advanced machine learning algorithms built on top of keras, tensorflow and sklearn
Language: Python - Size: 86.9 KB - Last synced at: 17 days ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 1
vngrs-ai/vnlp
State-of-the-art, lightweight NLP tools for Turkish language. Developed by VNGRS.
Language: Python - Size: 392 MB - Last synced at: 22 days ago - Pushed at: 2 months ago - Stars: 283 - Forks: 17
microsoft/presidio-research
This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: 27 days ago - Pushed at: 2 months ago - Stars: 239 - Forks: 70
sina-al/pynlp 📦
A pythonic wrapper for Stanford CoreNLP.
Language: Python - Size: 85 KB - Last synced at: 27 days ago - Pushed at: 28 days ago - Stars: 108 - Forks: 11
opensemanticsearch/open-semantic-search-apps
Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations and named entities) and data import (ETL like text extraction, OCR and crawling filesystems or websites)
Language: CSS - Size: 1.37 MB - Last synced at: 20 days ago - Pushed at: about 3 years ago - Stars: 99 - Forks: 38
CAMeL-Lab/camel_tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Language: Python - Size: 11.5 MB - Last synced at: 26 days ago - Pushed at: 2 months ago - Stars: 488 - Forks: 78
UniversalDataTool/universal-data-tool
Collaborate & label any type of data, images, text, or documents, in an easy web interface or desktop app.
Language: JavaScript - Size: 247 MB - Last synced at: 24 days ago - Pushed at: 8 months ago - Stars: 2,031 - Forks: 193
luopeixiang/named_entity_recognition
Chinese named entity recognition (including concrete implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)
Language: Python - Size: 28.2 MB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 2,245 - Forks: 537
indu-explores-data/Automated-Resume-Data-Extraction
Automated resume information extraction using NLP. The project extracts Name, Email, and Phone from TXT, DOCX, and PDF files using spaCy and regex. It converts unstructured data into structured formats, improving recruitment efficiency and enabling scalable candidate profiling.
Language: Jupyter Notebook - Size: 71.3 KB - Last synced at: 30 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
JayYip/m3tl
BERT for Multitask Learning
Language: Jupyter Notebook - Size: 29.1 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 548 - Forks: 125
apache/ctakes
Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.
Language: Java - Size: 128 MB - Last synced at: 18 days ago - Pushed at: about 2 months ago - Stars: 106 - Forks: 21
Determined22/zh-NER-TF
A very simple BiLSTM-CRF model for Chinese Named Entity Recognition (TensorFlow)
Language: Python - Size: 107 MB - Last synced at: 23 days ago - Pushed at: over 3 years ago - Stars: 2,340 - Forks: 934
CogComp/cogcomp-nlp
CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
Language: Java - Size: 85.5 MB - Last synced at: 16 days ago - Pushed at: over 2 years ago - Stars: 479 - Forks: 144
hankcs/pyhanlp
Chinese word segmentation
Language: Python - Size: 280 KB - Last synced at: 17 days ago - Pushed at: 10 months ago - Stars: 3,200 - Forks: 803
Georgetown-IR-Lab/QuickUMLS
System for Medical Concept Extraction and Linking
Language: Python - Size: 89.8 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 418 - Forks: 99
poteminr/instruct-ner
Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER)
Language: Python - Size: 297 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 86 - Forks: 9
explosion/spacy-streamlit
👑 spaCy building blocks and visualizers for Streamlit apps
Language: Python - Size: 61.5 KB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 845 - Forks: 119
zjunlp/Generative_KG_Construction_Papers
[EMNLP 2022] Generative Knowledge Graph Construction: A Review
Size: 15.8 MB - Last synced at: about 21 hours ago - Pushed at: over 2 years ago - Stars: 112 - Forks: 7
rodrigopivi/Chatito
🎯🗯 Dataset generation for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
Language: TypeScript - Size: 6.42 MB - Last synced at: 17 days ago - Pushed at: about 2 years ago - Stars: 884 - Forks: 153
mddunlap924/PII-Detection
Personally Identifiable Information (PII) entity detection and performance enhancement with synthetic data generation
Language: Python - Size: 548 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 31 - Forks: 4
AdirthaBorgohain/NER-RE
A Named Entity Recognition + Entity Linker + Relation Extraction pipeline built using spaCy v3. Given a text, the pipeline extracts the entities it was trained on, disambiguates them to their normalized forms through an Entity Linker connected to a Knowledge Base, and assigns relations between the entities, if any.
Language: Python - Size: 15.1 MB - Last synced at: 19 days ago - Pushed at: over 2 years ago - Stars: 42 - Forks: 9
LingAdeu/ner-with-representation-language-model
This project documents an ML experiment with multilingual and crosslanguage models, namely M-BERT and XLM-R, for bilingual named entity recognition.
Language: JavaScript - Size: 447 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
fbkaragoz/ottoman-ner
Ottoman Language Named Entity Recognition toolkit
Language: Python - Size: 6.97 MB - Last synced at: 27 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 1
raynardj/langhuan
Light weight labeling engine
Language: Jupyter Notebook - Size: 1.06 MB - Last synced at: 20 days ago - Pushed at: about 4 years ago - Stars: 13 - Forks: 0
rmusser01/BloodHound-Investigator
Tool to help researchers and journalists better understand large datasets
Language: Python - Size: 116 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 1
LanguageMachines/frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Language: C++ - Size: 70.2 MB - Last synced at: 29 days ago - Pushed at: 4 months ago - Stars: 80 - Forks: 11
miftahurrrizki/custom-named-entity-recognition
Custom Named Entity Recognition (NER) with BiLSTM-CRF and spaCy
Language: Jupyter Notebook - Size: 284 KB - Last synced at: 16 days ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0
sagorbrur/bnlp
BNLP is a natural language processing toolkit for Bengali Language.
Language: Jupyter Notebook - Size: 22.5 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 304 - Forks: 68
fastdatascience/country_named_entity_recognition
Code to find country names
Language: Python - Size: 151 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 2
RozyShindra/Information-Extractor
Java + Spring Boot REST API for Information Extraction integrating Knowledge Graph and Sentiment Detection from documents using Stanford CoreNLP. Supports entity extraction (Person, Location, Organization, etc.) and can be extended for advanced NLP tasks.
Language: Java - Size: 32.2 KB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
antdragiotis/Personalize-emails-LangChain-
The project leverages OpenAI and LangChain to extract structured data from unstructured text and cluster customers into segments, enabling the creation of personalized email campaigns.
Language: Jupyter Notebook - Size: 971 KB - Last synced at: 22 days ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0
meefs/entseeker
entseeker is a command-line tool for Named Entity Recognition (NER) and web entity searches in text files. It uses spaCy's NLP capabilities for standard named entities and custom rules for web-related entities.
Language: Python - Size: 12.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0
StarlangSoftware/TurkishNamedEntityRecognition-CPP
NER Corpus Processing Library
Language: C++ - Size: 13 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0
shitohana/Alembic
Alembic is a comprehensive platform for fetching and analyzing biological and biomedical metadata. It provides a unified interface to access NCBI databases and extract named entities from biomedical text.
Language: Python - Size: 188 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1
Abhay-Rudatala/Resume-Analyzer
Intelligent Resume Analysis System using Machine Learning and NLP. Features TF-IDF + Naive Bayes/SVM classification (90-95% accuracy), SpaCy NER for information extraction, and interactive Streamlit web app with custom UI. Built with Python, Scikit-learn, and deployed on Streamlit Cloud.
Language: Python - Size: 107 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0
felixfaruix/multi-task-nlp-evaluation
This project evaluates different NLP approaches (rule-based, unsupervised, and supervised machine learning) across three core text mining tasks: sentiment analysis using VADER and SVM, topic classification using LDA, and named entity recognition using BERT and spaCy.
Language: Jupyter Notebook - Size: 1.55 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0
RKirlew/Custom-Resume-NER-Model-Development-with-spaCy
I developed a custom Named Entity Recognition (NER) model using spaCy. The process involved manually annotating data, training the model, and evaluating its performance on unseen text. This project provided hands-on experience in working with NLP models, data annotation, and model training pipelines.
Language: Jupyter Notebook - Size: 61.5 KB - Last synced at: 21 days ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0
microsoft/vert-papers
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).
Language: Python - Size: 22 MB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 279 - Forks: 95
dice-group/gerbil
GERBIL - General Entity annotatoR Benchmark
Language: Java - Size: 120 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 230 - Forks: 57
arjuntanil/NLP-CADL-Activities
CADL Activities of NLP (PMC2421A).
Language: Jupyter Notebook - Size: 11.1 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0
deeppavlov/DeepPavlov
An open source library for deep learning end-to-end dialog systems and chatbots.
Language: Python - Size: 31.4 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 6,934 - Forks: 1,165
monarch-initiative/ontogpt
LLM-based ontological extraction tools, including SPIRES
Language: Jupyter Notebook - Size: 80.9 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 738 - Forks: 100
ckiplab/ckip-transformers
CKIP Transformers
Language: Python - Size: 232 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 750 - Forks: 79
Az-r-ow/TravelNER (fork of lucas066001/TravelOrderResolver)
Travel Named Entity Recognition using probabilistic model vs Deep Learning and Transformers
Language: Jupyter Notebook - Size: 6.07 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0
DerwenAI/strwythura
Construct knowledge graphs from unstructured data sources, use graph algorithms for enhanced GraphRAG with a DSPy-based chat bot locally, and curate semantics for optimizing AI app outcomes within a specific domain.
Language: Jupyter Notebook - Size: 3.74 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 174 - Forks: 21
0xferit/ITU-Turkish-NLP-Pipeline-Caller 📦
A Python3 wrapper tool to help use the ITU Turkish NLP Pipeline API -- UNMAINTAINED --
Language: Python - Size: 131 KB - Last synced at: 8 days ago - Pushed at: over 7 years ago - Stars: 45 - Forks: 9