An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: named-entity-recognition

ICIJ/datashare

A self‑hosted search engine for documents

Language: Java - Size: 396 MB - Last synced at: about 22 hours ago - Pushed at: about 22 hours ago - Stars: 662 - Forks: 64

winstxnhdw/llm-api

A fast CPU-based API for Qwen 2.5 using CTranslate2, hosted on Hugging Face Spaces.

Language: Python - Size: 1.38 MB - Last synced at: about 19 hours ago - Pushed at: about 23 hours ago - Stars: 0 - Forks: 2

Bobbywasher/cc-cli

🔄 Switch and manage Claude Code configurations easily with CC CLI, featuring multi-site support, smart merging, and cloud backups.

Language: JavaScript - Size: 1.66 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

LALITCHAROLA/genr-kit

🚀 Prototype and deploy generative AI applications with ease using Python, Gradio, and Transformers for text, image, and speech tasks.

Language: Python - Size: 12.7 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

M4UNC/PDF-Package-Analyzer

🔍 Analyze PDF files effectively with this Python tool, testing compatibility across libraries to guide optimal PDF processing solutions.

Language: Python - Size: 1.35 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

keerthanap8898/Neural-CRF_NER-Tagger

How to build a baby-BERT: I analyze BiLSTMs combined with Conditional Random Fields for Named Entity Recognition and contrast a Neural-CRF tagger against a baseline BiLSTM model, exploring how probabilistic sequence dependencies improve contextual understanding beyond token-level classification.

Language: Jupyter Notebook - Size: 5.45 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

poteminr/instruct-ner

Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER)

Language: Python - Size: 297 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 86 - Forks: 9

The-FinAI/PIXIU

This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).

Language: Jupyter Notebook - Size: 49.5 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 787 - Forks: 105

Knowledge-Graph-Hub/kg-microbe

Language: Jupyter Notebook - Size: 537 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 20 - Forks: 3

hitz-zentroa/GoLLIE

Guideline following Large Language Model for Information Extraction

Language: Python - Size: 10.8 MB - Last synced at: 3 days ago - Pushed at: 12 months ago - Stars: 404 - Forks: 28

mhbashari/awesome-persian-nlp-ir

Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources

Size: 192 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 765 - Forks: 115

spencermountain/compromise

modest natural-language processing

Language: JavaScript - Size: 55.2 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 11,895 - Forks: 659

explosion/spacy-llm

🦙 Integrating LLMs into structured NLP pipelines

Language: Python - Size: 1.79 MB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 1,321 - Forks: 104

Tongjilibo/bert4torch

An elegant PyTorch implementation of transformers

Language: Python - Size: 11.3 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 1,321 - Forks: 169

zjunlp/Generative_KG_Construction_Papers

[EMNLP 2022] Generative Knowledge Graph Construction: A Review

Size: 15.8 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 112 - Forks: 7

stanfordnlp/stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Language: Python - Size: 82.6 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 7,625 - Forks: 919

stanfordnlp/CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.

Language: Java - Size: 380 MB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 9,981 - Forks: 2,714

amirivojdan/shekar

Simplifying Persian NLP for Modern Applications

Language: Python - Size: 21.9 MB - Last synced at: 3 days ago - Pushed at: 9 days ago - Stars: 39 - Forks: 1

explosion/spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Language: Python - Size: 194 MB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 32,624 - Forks: 4,595

microsoft/presidio

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

Language: Python - Size: 246 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 5,750 - Forks: 816

CAMeL-Lab/camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

Language: Python - Size: 11.5 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 485 - Forks: 78

mddunlap924/PII-Detection

Personally Identifiable Information (PII) entity detection and performance enhancement with synthetic data generation

Language: Python - Size: 548 KB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 31 - Forks: 4

freinold/GLiNER-API

Easily configurable API & frontend providing simple access to dynamic NER models.

Language: Python - Size: 3.28 MB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 5 - Forks: 1

apache/ctakes

Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.

Language: Java - Size: 128 MB - Last synced at: 4 days ago - Pushed at: 21 days ago - Stars: 105 - Forks: 21

LingAdeu/ner-with-representation-language-model

This project documents an ML experiment with multilingual and cross-lingual models, namely M-BERT and XLM-R, for bilingual named entity recognition.

Language: JavaScript - Size: 447 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

fbkaragoz/ottoman-ner

Ottoman Language Named Entity Recognition toolkit

Language: Python - Size: 6.97 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 3 - Forks: 1

JohnSnowLabs/spark-nlp

State of the Art Natural Language Processing

Language: Scala - Size: 3.45 GB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 4,053 - Forks: 732

thoughtbot/top_secret

Filter sensitive information from free text before sending it to external services or APIs, such as chatbots and LLMs.

Language: Ruby - Size: 121 KB - Last synced at: 8 days ago - Pushed at: 16 days ago - Stars: 268 - Forks: 6

ankane/mitie-ruby

Named-entity recognition for Ruby

Language: Ruby - Size: 85.9 KB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 176 - Forks: 7

mawiesne/DE-NERmed

DE-NERmed: An OpenNLP named entity recognition tool and model files trained for medical NLP use cases

Language: Java - Size: 381 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

rmusser01/BloodHound-Investigator

Tool to help researchers and journalists better understand large datasets

Language: Python - Size: 116 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 5 - Forks: 1

daviden1013/llm-ie

A comprehensive toolkit that provides building blocks for LLM-based named entity recognition, attribute extraction, and relation extraction pipelines.

Language: Python - Size: 11.1 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 35 - Forks: 4

ukairia777/tensorflow-nlp-tutorial

A Deep Learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as Topic Models, BERT, GPT, and LLMs.

Language: Jupyter Notebook - Size: 126 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 560 - Forks: 290

LanguageMachines/frog

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

Language: C++ - Size: 70.2 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 80 - Forks: 11

sagorbrur/bnlp

BNLP is a natural language processing toolkit for Bengali Language.

Language: Jupyter Notebook - Size: 22.5 MB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 304 - Forks: 68

fastdatascience/country_named_entity_recognition

Code to find country names

Language: Python - Size: 151 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 5 - Forks: 2

RozyShindra/Information-Extractor

Java + Spring Boot REST API for Information Extraction, integrating Knowledge Graph and Sentiment Detection from documents using Stanford CoreNLP. Supports entity extraction (Person, Location, Organization, etc.) and can be extended for advanced NLP tasks.

Language: Java - Size: 32.2 KB - Last synced at: 4 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

taishan1994/awesome-chinese-ner

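Chinese named entity recognition. Includes the latest Chinese NER papers, Chinese entity recognition tools, and datasets, as well as Chinese pre-trained models, word vectors, and NER surveys.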

Size: 246 KB - Last synced at: 13 days ago - Pushed at: 3 months ago - Stars: 741 - Forks: 56

shiva0824/Jobs

An end-to-end NLP project that extracts skills from job descriptions, builds job–resume matching recommendations, and showcases deployment with FastAPI, Docker, CI/CD, and AWS.

Language: Python - Size: 13.6 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

meefs/entseeker

entseeker is a command-line tool for Named Entity Recognition (NER) and web entity searches in text files. It uses spaCy's NLP capabilities for standard named entities and custom rules for web-related entities.

Language: Python - Size: 12.7 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 0

StarlangSoftware/TurkishNamedEntityRecognition-CPP

NER Corpus Processing Library

Language: C++ - Size: 13 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 2 - Forks: 0

shitohana/Alembic

Alembic is a comprehensive platform for fetching and analyzing biological and biomedical metadata. It provides a unified interface to access NCBI databases and extract named entities from biomedical text.

Language: Python - Size: 188 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 1

Abhay-Rudatala/Resume-Analyzer

Intelligent Resume Analysis System using Machine Learning and NLP. Features TF-IDF + Naive Bayes/SVM classification (90-95% accuracy), SpaCy NER for information extraction, and interactive Streamlit web app with custom UI. Built with Python, Scikit-learn, and deployed on Streamlit Cloud.

Language: Python - Size: 107 MB - Last synced at: 12 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

felixfaruix/multi-task-nlp-evaluation

This project evaluates different NLP approaches (rule-based, unsupervised, and supervised machine learning) across three core text mining tasks: sentiment analysis using VADER and SVM, topic classification using LDA, and named entity recognition using BERT and spaCy.

Language: Jupyter Notebook - Size: 1.55 MB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

RKirlew/Custom-Resume-NER-Model-Development-with-spaCy

I developed a custom Named Entity Recognition (NER) model using spaCy. The process involved manually annotating data, training the model, and evaluating its performance on unseen text. This project provided hands-on experience in working with NLP models, data annotation, and model training pipelines.

Language: Jupyter Notebook - Size: 61.5 KB - Last synced at: about 17 hours ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

quqxui/Awesome-LLM4IE-Papers

Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

Size: 1.5 MB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 1,008 - Forks: 60

microsoft/vert-papers

This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).

Language: Python - Size: 22 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 279 - Forks: 95

dice-group/gerbil

GERBIL - General Entity annotatoR Benchmark

Language: Java - Size: 120 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 230 - Forks: 57

arjuntanil/NLP-CADL-Activities

CADL Activities of NLP (PMC2421A).

Language: Jupyter Notebook - Size: 11.1 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

undertheseanlp/underthesea

Underthesea - Vietnamese NLP Toolkit

Language: Python - Size: 165 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 1,592 - Forks: 289

deeppavlov/DeepPavlov

An open source library for deep learning end-to-end dialog systems and chatbots.

Language: Python - Size: 31.4 MB - Last synced at: 18 days ago - Pushed at: 2 months ago - Stars: 6,934 - Forks: 1,165

microsoft/presidio-research

This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.

Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 238 - Forks: 70

monarch-initiative/ontogpt

LLM-based ontological extraction tools, including SPIRES

Language: Jupyter Notebook - Size: 80.9 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 738 - Forks: 100

ckiplab/ckip-transformers

CKIP Transformers

Language: Python - Size: 232 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 750 - Forks: 79

Az-r-ow/TravelNER (fork of lucas066001/TravelOrderResolver)

Travel Named Entity Recognition using probabilistic model vs Deep Learning and Transformers

Language: Jupyter Notebook - Size: 6.07 MB - Last synced at: 6 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

DerwenAI/strwythura

Construct knowledge graphs from unstructured data sources, use graph algorithms for enhanced GraphRAG with a DSPy-based chat bot locally, and curate semantics for optimizing AI app outcomes within a specific domain.

Language: Jupyter Notebook - Size: 3.74 MB - Last synced at: 18 days ago - Pushed at: 22 days ago - Stars: 174 - Forks: 21

hankcs/HanLP

Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification

Language: Python - Size: 69.5 MB - Last synced at: 22 days ago - Pushed at: about 1 month ago - Stars: 35,676 - Forks: 10,794

0xferit/ITU-Turkish-NLP-Pipeline-Caller 📦

A Python 3 wrapper tool to help use the ITU Turkish NLP Pipeline API -- UNMAINTAINED --

Language: Python - Size: 131 KB - Last synced at: 10 days ago - Pushed at: over 7 years ago - Stars: 45 - Forks: 9

shael-nlp/cc_representation

Repository for my Master's Thesis

Language: Python - Size: 1.14 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

UniversalDataTool/universal-data-tool

Collaborate & label any type of data, images, text, or documents, in an easy web interface or desktop app.

Language: JavaScript - Size: 247 MB - Last synced at: 21 days ago - Pushed at: 7 months ago - Stars: 2,029 - Forks: 193

zjunlp/DeepKE

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

Language: Python - Size: 121 MB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 4,127 - Forks: 724

mirpo/fastapi-gen

Build LLM-enabled FastAPI applications without build configuration.

Language: Python - Size: 524 KB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 9 - Forks: 1

explosion/spacy-streamlit

👑 spaCy building blocks and visualizers for Streamlit apps

Language: Python - Size: 61.5 KB - Last synced at: 22 days ago - Pushed at: about 1 year ago - Stars: 844 - Forks: 118

baidu/lac

Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance

Language: C++ - Size: 63.6 MB - Last synced at: 13 days ago - Pushed at: over 4 years ago - Stars: 3,967 - Forks: 594

ThilinaRajapakse/simpletransformers

Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI

Language: Python - Size: 20.1 MB - Last synced at: 24 days ago - Pushed at: about 2 months ago - Stars: 4,217 - Forks: 728

ArneBinder/pytorch-ie

PyTorch-IE: State-of-the-art Information Extraction in PyTorch

Language: Python - Size: 1.82 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 78 - Forks: 7

capella-marcosfilipe/selecao-dev-fullstack-unicap

Language: TypeScript - Size: 211 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 1 - Forks: 0

omarmhaimdat/quickner

Quickner is a new tool to quickly annotate texts for NER (Named Entity Recognition). It is written in Rust and accessible through a Python API.

Language: Rust - Size: 26.7 MB - Last synced at: 25 days ago - Pushed at: over 1 year ago - Stars: 22 - Forks: 1

BrikerMan/Kashgari

Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.

Language: Python - Size: 14.3 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 2,387 - Forks: 433

oroszgy/awesome-hungarian-nlp

A curated list of NLP resources for Hungarian

Size: 164 KB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 257 - Forks: 19

ankane/informers

Fast transformer inference for Ruby

Language: Ruby - Size: 2.48 MB - Last synced at: about 11 hours ago - Pushed at: 9 months ago - Stars: 585 - Forks: 17

macanv/BERT-BiLSTM-CRF-NER

TensorFlow solution for the NER task using a BiLSTM-CRF model with Google BERT fine-tuning, plus private server services

Language: Python - Size: 3.75 MB - Last synced at: 29 days ago - Pushed at: over 4 years ago - Stars: 4,853 - Forks: 1,250

yongzhuo/Pytorch-NLU

Chinese text classification and sequence labeling toolkit (PyTorch). Supports multi-class and multi-label classification of long and short Chinese texts, as well as sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization. Also supports text similarity.

Language: Python - Size: 379 KB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 350 - Forks: 50

viclang/anonymacy

anonymaCy is a spaCy extension for anonymizing PII using rule-based recognizers, context-aware processing, conflict resolution and customizable anonymization.

Language: Python - Size: 481 KB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 2 - Forks: 0

ankane/mitie-php

Named-entity recognition for PHP

Language: PHP - Size: 48.8 KB - Last synced at: 13 days ago - Pushed at: 4 months ago - Stars: 28 - Forks: 5

rodrigopivi/Chatito

🎯🗯 Dataset generation for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!

Language: TypeScript - Size: 6.42 MB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 885 - Forks: 153

ankitklu/NLP_prep

NLP_Prep is a collection of Natural Language Processing (NLP) concepts, implementations, and projects. Covers various NLP topics, from preprocessing to advanced techniques, along with a few Generative AI projects leveraging Groq AI and Gemini AI.

Language: Python - Size: 29.3 KB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

syuoni/eznlp

Easy Natural Language Processing

Language: Python - Size: 3.53 MB - Last synced at: 9 days ago - Pushed at: 6 months ago - Stars: 143 - Forks: 22

yyDing1/GNER

[ACL-24 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"

Language: Python - Size: 4.69 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 56 - Forks: 2

kaisugi/entity-related-papers

Named Entity Recognition, Entity Linking, and more

Size: 143 KB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 113 - Forks: 10

alinrajpoot/genr-kit

Genr-Kit: The ultimate open-source playground for multi-modal AI. One toolkit to build it all: from text and image generation to speech synthesis and analysis, powered by Gradio and Transformers.

Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Franck-Dernoncourt/NeuroNER

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

Language: Python - Size: 121 MB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 1,717 - Forks: 474

MagedSaeed/farasapy

A Python implementation of Farasa toolkit

Language: Python - Size: 265 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 136 - Forks: 23

chakki-works/seqeval

A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)

Language: Python - Size: 180 KB - Last synced at: 25 days ago - Pushed at: about 1 year ago - Stars: 1,150 - Forks: 134

d-kleine/NER_decoder

Named Entity Recognition with a decoder-only (autoregressive) LLM using HuggingFace

Language: Jupyter Notebook - Size: 888 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 42 - Forks: 1

philipperemy/Stanford-NER-Python

Stanford Named Entity Recognizer (NER) - Python Wrapper

Language: Python - Size: 166 MB - Last synced at: 29 days ago - Pushed at: over 5 years ago - Stars: 79 - Forks: 16

Danzigerrr/ProbNEL

Entity Linking Web App allowing for flexible NER and NED strategies adjustments

Language: Jupyter Notebook - Size: 19.7 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

howl-anderson/seq2annotation

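A general-purpose sequence labeling library based on TensorFlow & PaddlePaddle (currently includes BiLSTM+CRF, Stacked-BiLSTM+CRF, and IDCNN+CRF; more algorithms are being added) for sequence labeling tasks such as Chinese word segmentation (tokenizer / segmentation), part-of-speech (POS) tagging, and named entity recognition (NER).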

Language: Python - Size: 8.81 MB - Last synced at: 22 days ago - Pushed at: almost 3 years ago - Stars: 86 - Forks: 22

flairNLP/flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Language: Python - Size: 377 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 14,282 - Forks: 2,123

zamgi/lingvo--NER--German

Named-entity recognition for German using a combination of deep neural network and rule-based approaches, in C# for .NET

Language: C# - Size: 17.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 1

benbusby/namebuster

A tool for enumerating usernames from text, files, or websites

Language: Go - Size: 45.9 KB - Last synced at: 19 days ago - Pushed at: over 3 years ago - Stars: 82 - Forks: 12

blmoistawinde/HarvestText

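Text mining and preprocessing tools (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods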

Language: Python - Size: 4.27 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2,550 - Forks: 337

AnthonyMRios/pymetamap

Python wrapper for MetaMap

Language: Python - Size: 45.9 KB - Last synced at: 25 days ago - Pushed at: about 5 years ago - Stars: 173 - Forks: 62

VinAIResearch/PhoNLP

PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021)

Language: Python - Size: 588 KB - Last synced at: 27 days ago - Pushed at: 10 months ago - Stars: 148 - Forks: 19

vngrs-ai/vnlp

State-of-the-art, lightweight NLP tools for Turkish language. Developed by VNGRS.

Language: Python - Size: 392 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 267 - Forks: 17

KGCP/MEL-TNNT

Metadata Extractor & Loader (MEL) ■ The NLP-NER Toolkit (TNNT)

Language: Python - Size: 63.7 MB - Last synced at: 1 day ago - Pushed at: over 2 years ago - Stars: 24 - Forks: 1

daviden1013/ie-viz

A visualization tool for NLP information extraction: Named entity recognition, Entity attribute extraction, and Relation extraction.

Language: JavaScript - Size: 5.03 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 1

sebastianruder/NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Language: Python - Size: 1.33 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 22,945 - Forks: 3,618

aymara/lima

The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.

Language: C++ - Size: 276 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 114 - Forks: 20

fmadore/iwac-ai-pipelines

AI pipelines for Omeka S digital collections - OCR correction, entity extraction, and text analysis

Language: Python - Size: 153 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Related Keywords
named-entity-recognition (1,536), nlp (552), natural-language-processing (444), ner (396), machine-learning (198), python (191), spacy (151), deep-learning (140), pytorch (136), sentiment-analysis (128), relation-extraction (124), bert (123), information-extraction (105), transformers (74), text-classification (74), tensorflow (70), nlp-machine-learning (67), crf (59), sequence-labeling (51), knowledge-graph (44), dataset (43), pos-tagging (42), lstm (41), topic-modeling (40), python3 (40), text-mining (39), spacy-nlp (38), keras (38), nltk (36), transformer (36), llm (34), entity-linking (34), question-answering (34), huggingface (31), ai (31), conditional-random-fields (30), entity-extraction (29), artificial-intelligence (29), tokenization (28), large-language-models (27), text-summarization (26), data-science (25), neural-network (25), corpus (25), natural-language-understanding (25), lemmatization (25), huggingface-transformers (24), classification (24), bilstm-crf (23), named-entities (22), bert-model (22), neural-networks (22), java (22), flask (21), jupyter-notebook (21), part-of-speech-tagging (21), language-model (21), word-embeddings (21), roberta (20), docker (20), bilstm (20), part-of-speech-tagger (20), annotation-tool (20), streamlit (19), machine-translation (19), event-extraction (18), transfer-learning (18), fine-tuning (17), tokenizer (17), dependency-parsing (17), text-processing (17), token-classification (17), named-entity-disambiguation (16), chatbot (16), intent-classification (16), conll-2003 (16), flair (16), coreference-resolution (16), dependency-parser (15), text-generation (15), anonymization (14), text-analysis (14), bert-fine-tuning (14), biomedical (14), stemming (14), part-of-speech (14), summarization (14), ocr (14), word2vec (14), lstm-crf (14), nlp-library (14), annotation (13), lstm-neural-networks (13), llama (13), api (13), fastapi (13), bioinformatics (13), embeddings (12), information-retrieval (12), named-entity-linking (12)