GitHub topics: named-entity-recognition

Repositories

Sunnyboivr/ner-ml-project

🧠 Extract named entities from text effortlessly with advanced NLP and ML models, providing insights and analytics for diverse documents.

Language: JavaScript - Size: 12.1 MB - Last synced at: about 16 hours ago - Pushed at: about 19 hours ago - Stars: 0 - Forks: 0

deeppavlov/DeepPavlov

An open source library for deep learning end-to-end dialog systems and chatbots.

Language: Python - Size: 31.4 MB - Last synced at: about 11 hours ago - Pushed at: 4 months ago - Stars: 6,957 - Forks: 1,172

explosion/spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Language: Python - Size: 194 MB - Last synced at: 1 day ago - Pushed at: 12 days ago - Stars: 32,919 - Forks: 4,637

Bobbywasher/cc-cli

🔄 Switch and manage Claude Code configurations easily with CC CLI, featuring multi-site support, smart merging, and cloud backups.

Language: JavaScript - Size: 1.66 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).

Language: Jupyter Notebook - Size: 49.5 MB - Last synced at: 1 day ago - Pushed at: 9 months ago - Stars: 812 - Forks: 107

explosion/spacy-llm

🦙 Integrating LLMs into structured NLP pipelines

Language: Python - Size: 1.79 MB - Last synced at: 2 days ago - Pushed at: 11 months ago - Stars: 1,354 - Forks: 106

winstxnhdw/llm-api

A fast CPU-based API for Qwen 2.5 using CTranslate2, hosted on Hugging Face Spaces.

Language: Python - Size: 1.77 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 2

Tongjilibo/bert4torch

An elegant PyTorch implementation of transformers

Language: Python - Size: 11.3 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 1,332 - Forks: 168

microsoft/presidio

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

Language: Python - Size: 256 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6,294 - Forks: 865

Knowledge-Graph-Hub/kg-microbe

Language: Jupyter Notebook - Size: 522 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 21 - Forks: 3

spencermountain/compromise

modest natural-language processing

Language: JavaScript - Size: 55.2 MB - Last synced at: 3 days ago - Pushed at: 17 days ago - Stars: 11,967 - Forks: 661

flairNLP/flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Language: Python - Size: 377 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 14,332 - Forks: 2,129

LALITCHAROLA/genr-kit

🚀 Prototype and deploy generative AI applications with ease using Python, Gradio, and Transformers for text, image, and speech tasks.

Language: Python - Size: 12.7 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

daviden1013/llm-ie

A comprehensive toolkit that provides building blocks for LLM-based named entity recognition, attribute extraction, and relation extraction pipelines.

Language: Python - Size: 16.1 MB - Last synced at: 4 days ago - Pushed at: 7 days ago - Stars: 44 - Forks: 4

AlexisRellon/GAIA

Real-time environmental hazard detection for the Philippines using AI (Climate-NLI + Geo-NER). Processes news feeds and citizen reports to visualize typhoons, floods, earthquakes, and more on an interactive map. React + FastAPI + Supabase.

Language: TypeScript - Size: 32 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

fairagro/pilot-uc-textmining-metadata

This repository contains the code used to generate the NER corpus for metadata enrichment as part of a FAIRagro use case

Language: Jupyter Notebook - Size: 31.8 MB - Last synced at: 1 day ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

AnaPaula04/pii-redaction-demo

Lightweight PII redaction pipeline using Hugging Face NER + regex (Python), 96.5% accuracy

Language: Python - Size: 29.3 KB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

quqxui/Awesome-LLM4IE-Papers

Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

Size: 1.5 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 1,033 - Forks: 61

M4UNC/PDF-Package-Analyzer

🔍 Analyze PDF files effectively with this Python tool, testing compatibility across libraries to guide optimal PDF processing solutions.

Language: Python - Size: 1.35 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

fmadore/iwac-ai-pipelines

AI pipelines for Omeka S digital collections - OCR correction, entity extraction, and text analysis

Language: Python - Size: 478 KB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

tanaos/artifex

A Python library to use pre-trained, small task-specific LLMs and fine-tune them without training data 🤖🚀

Language: Python - Size: 1.71 MB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 20 - Forks: 4

hitz-zentroa/GoLLIE

Guideline following Large Language Model for Information Extraction

Language: Python - Size: 10.8 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 414 - Forks: 27

apache/ctakes

Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.

Language: Java - Size: 128 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 108 - Forks: 21

undertheseanlp/underthesea

Underthesea - Vietnamese NLP Toolkit

Language: Python - Size: 166 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 1,634 - Forks: 288

mejba-alam/spaCy

🧠 Enhance your applications with spaCy, a powerful library for advanced Natural Language Processing in Python and Cython, supporting 70+ languages.

Language: Python - Size: 17.7 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Jihaad2021/taskflow

Production-scale document processing pipeline using specialized fine-tuned models. 100x cheaper than GPT-4 API at $0.0005/document. Process 100K+ docs/hour.

Language: PowerShell - Size: 70.3 KB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

chaosen315/AIwork4translator

AI-powered translation tool for professionals handling technical documents, long texts, novels, and rulebooks. Optimizes proper noun accuracy with 35% higher recognition rate and 99% token savings vs traditional RAG. Supports CLI/WebUI and multiple formats (PDF/PPT/Word).

Language: Python - Size: 4.26 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 14 - Forks: 2

daviden1013/ie-viz

A visualization tool for NLP information extraction: Named entity recognition, Entity attribute extraction, and Relation extraction.

Language: JavaScript - Size: 5.03 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 1

ukairia777/tensorflow-nlp-tutorial

A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as Topic Models, BERT, GPT, and LLMs.

Language: Jupyter Notebook - Size: 126 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 566 - Forks: 291

ankane/informers

Fast transformer inference for Ruby

Language: Ruby - Size: 2.48 MB - Last synced at: 4 days ago - Pushed at: 10 months ago - Stars: 593 - Forks: 17

mirpo/fastapi-gen

Build LLM-enabled FastAPI applications without build configuration.

Language: Python - Size: 953 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 11 - Forks: 1

khteh/pAIthon

Python AI, ML, DL and NLP exploration playground.

Language: Python - Size: 2.22 GB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

easonanalytica/company_name_matcher

A library for matching and comparing company names using a fine-tuned sentence transformer model

Language: Jupyter Notebook - Size: 511 KB - Last synced at: 1 day ago - Pushed at: 5 days ago - Stars: 7 - Forks: 1

YueranCao2001/legalbert-kd-ner

Compressing LegalBERT via knowledge distillation for legal-domain NER (InLegalNER)

Language: Python - Size: 2.27 MB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

microsoft/vert-papers

This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).

Language: Python - Size: 22 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 280 - Forks: 96

urchade/GLiNER

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024

Language: Python - Size: 35.5 MB - Last synced at: 11 days ago - Pushed at: 13 days ago - Stars: 2,556 - Forks: 230

stanfordnlp/CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.

Language: Java - Size: 381 MB - Last synced at: 11 days ago - Pushed at: 13 days ago - Stars: 10,010 - Forks: 2,717

Franck-Dernoncourt/NeuroNER

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

Language: Python - Size: 121 MB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 1,717 - Forks: 476

vngrs-ai/vnlp

State-of-the-art, lightweight NLP tools for Turkish language. Developed by VNGRS.

Language: Python - Size: 392 MB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 285 - Forks: 17

JayYip/m3tl

BERT for Multitask Learning

Language: Jupyter Notebook - Size: 29.1 MB - Last synced at: 12 days ago - Pushed at: over 2 years ago - Stars: 548 - Forks: 125

taishan1994/awesome-chinese-ner

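Chinese named entity recognition. Includes the latest Chinese NER papers, related tools and datasets, as well as Chinese pre-trained models, word embeddings, and NER surveys.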

Size: 246 KB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 753 - Forks: 57

yoann0723/LocalAI-Assistant

LocalAI-Assistant is a local desktop AI assistant built with C++/Qt. It supports voice and text input to perform local tasks. All models run locally — no user data is uploaded, ensuring full privacy. A plugin-based architecture allows easy expansion of new capabilities.

Language: C++ - Size: 173 KB - Last synced at: 10 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

ICIJ/datashare

A self‑hosted search engine for documents

Language: Java - Size: 396 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 671 - Forks: 65

stanfordnlp/stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Language: Python - Size: 82.8 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 7,676 - Forks: 928

IBM/zshot

Zero and Few shot named entity & relationships recognition

Language: Python - Size: 1.47 MB - Last synced at: 13 days ago - Pushed at: 3 months ago - Stars: 393 - Forks: 25

amirivojdan/shekar

Simplifying Persian NLP for Modern Applications

Language: Python - Size: 23.4 MB - Last synced at: 13 days ago - Pushed at: 14 days ago - Stars: 52 - Forks: 3

microsoft/presidio-research

This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.

Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 247 - Forks: 72

thoughtbot/top_secret

Filter sensitive information from free text before sending it to external services or APIs, such as chatbots and LLMs.

Language: Ruby - Size: 101 KB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 295 - Forks: 6

biomedicalinformaticsgroup/ParallelPyMetaMap

This code is to run MetaMap in parallel using Python.

Language: Python - Size: 139 KB - Last synced at: 12 days ago - Pushed at: 15 days ago - Stars: 8 - Forks: 0

baidu/lac

Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, and word importance

Language: C++ - Size: 63.6 MB - Last synced at: 10 days ago - Pushed at: over 4 years ago - Stars: 3,972 - Forks: 594

oroszgy/awesome-hungarian-nlp

A curated list of NLP resources for Hungarian

Size: 164 KB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 258 - Forks: 19

impresso/newsagency-classification

Recognition of news agency mentions in historical news articles (BERT-based token classification).

Language: Jupyter Notebook - Size: 242 MB - Last synced at: 12 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

mawiesne/DE-NERmed

DE-NERmed: An OpenNLP named entity recognition tool and model files trained for medical NLP use cases

Language: Java - Size: 360 KB - Last synced at: 12 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

hankcs/pyhanlp

Chinese word segmentation

Language: Python - Size: 280 KB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 3,204 - Forks: 804

hankcs/HanLP

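Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing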

Language: Python - Size: 69.5 MB - Last synced at: 17 days ago - Pushed at: 24 days ago - Stars: 35,900 - Forks: 10,864

freinold/GLiNER-API

Easily configurable API & frontend providing simple access to dynamic NER models.

Language: Python - Size: 3.37 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 5 - Forks: 1

JohnSnowLabs/spark-nlp

State of the Art Natural Language Processing

Language: Scala - Size: 3.46 GB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 4,071 - Forks: 736

dmis-lab/ConNER

Bioinformatics'2023: Consistency Enhancement of Model Prediction on Document-level Named Entity Recognition

Language: Python - Size: 9.24 MB - Last synced at: about 24 hours ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 0

poteminr/instruct-ner

Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER)

Language: Python - Size: 297 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 87 - Forks: 9

kaisugi/entity-related-papers

Named Entity Recognition, Entity Linking, and more

Size: 143 KB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 115 - Forks: 10

DerwenAI/strwythura

Construct knowledge graphs from unstructured data sources, use graph algorithms for enhanced GraphRAG with a DSPy-based chat bot locally, and curate semantics for optimizing AI app outcomes within a specific domain.

Language: Jupyter Notebook - Size: 3.2 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 181 - Forks: 20

Nishtha031105/ner-ml-project

Named Entity Recognition tool using NLP and React

Language: Python - Size: 70.3 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 1

datagodzilla/medical-nlp-lean

Medical Entities Recognition

Language: Python - Size: 1020 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

Georgetown-IR-Lab/QuickUMLS

System for Medical Concept Extraction and Linking

Language: Python - Size: 89.8 KB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 423 - Forks: 99

ankane/mitie-php

Named-entity recognition for PHP

Language: PHP - Size: 51.8 KB - Last synced at: 23 days ago - Pushed at: 27 days ago - Stars: 28 - Forks: 5

MAbdelhamid2001/POS-NER-Tagger

Language: Jupyter Notebook - Size: 29.2 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

ankane/mitie-ruby

Named-entity recognition for Ruby

Language: Ruby - Size: 89.8 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 178 - Forks: 7

zjunlp/DeepKE

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

Language: Python - Size: 121 MB - Last synced at: 29 days ago - Pushed at: 5 months ago - Stars: 4,182 - Forks: 731

SydAirAhd74/Smart_v0.0

AI privacy tool for pure edge computing, offering translation and transcription across Hebrew, English, and Farsi, plus summarization, NER, action item recognition, timeline extraction, sentiment analysis, and recommendations for the uploaded file.

Language: Python - Size: 15.6 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

aymara/lima

The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.

Language: C++ - Size: 276 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 115 - Forks: 20

4AI/LS-LLaMA

A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning

Language: Python - Size: 3.54 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 154 - Forks: 24

lonePatient/TorchBlocks

A PyTorch-based toolkit for natural language processing

Language: Python - Size: 481 KB - Last synced at: 24 days ago - Pushed at: almost 3 years ago - Stars: 160 - Forks: 27

shiva0824/Jobs

An end-to-end NLP project that extracts skills from job descriptions, builds job–resume matching recommendations, and showcases deployment with FastAPI and AWS.

Language: Python - Size: 15.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

yuzhimanhua/Multi-BioNER

Cross-type Biomedical Named Entity Recognition with Deep Multi-task Learning (Bioinformatics'19)

Language: Python - Size: 159 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 135 - Forks: 27

viclang/anonymacy

anonymaCy is a spaCy extension for anonymizing PII using rule-based recognizers, context-aware processing, conflict resolution and customizable anonymization.

Language: Python - Size: 626 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

FuxiaoLiu/VisualNews-Repository

[EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning

Language: Jupyter Notebook - Size: 6.94 MB - Last synced at: 24 days ago - Pushed at: over 1 year ago - Stars: 100 - Forks: 9

bogwi/rookeen

spaCy-based CLI for web linguistic analysis with embeddings, sentiment, POS/NER, and Unix pipeline composability. Outputs JSON, Parquet, CoNLL-U for ML workflows.

Language: Python - Size: 318 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

dnanhkhoa/python-vncorenlp

A Python wrapper for VnCoreNLP using a bidirectional communication channel.

Language: Python - Size: 40 KB - Last synced at: 12 days ago - Pushed at: over 7 years ago - Stars: 57 - Forks: 18

Babelscape/wikineural

Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2021).

Language: Python - Size: 225 MB - Last synced at: 10 days ago - Pushed at: almost 3 years ago - Stars: 69 - Forks: 10

Apex-05/REALM

Real-Time Analysis of Linguistic Media

Language: Jupyter Notebook - Size: 4.44 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

keerthanap8898/Neural-CRF_NER-Tagger

How to build a baby-BERT: I analyze BiLSTMs combined with Conditional Random Fields for Named Entity Recognition and contrast a Neural-CRF tagger against a baseline BiLSTM model, exploring how probabilistic sequence dependencies improve contextual understanding beyond token-level classification.

Language: Jupyter Notebook - Size: 5.46 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

Aishwaraya-Dharmadhikari/NLP_Programs

All Natural Language Processing Programs

Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

yyDing1/GNER

[ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"

Language: Python - Size: 4.69 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 59 - Forks: 2

emrecncelik/weighted-bert

Unofficial implementation of the paper "A Text Document Clustering Method Based on Weighted BERT Model".

Language: Python - Size: 44.9 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

LanguageMachines/frog

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

Language: C++ - Size: 70.2 MB - Last synced at: 11 days ago - Pushed at: 20 days ago - Stars: 79 - Forks: 12

Chenfeng1271/awesome-MNER

awesome-multimodal-named-entity-recognition

Size: 176 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 60 - Forks: 5

mhbashari/awesome-persian-nlp-ir

Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources

Size: 192 KB - Last synced at: 29 days ago - Pushed at: about 2 years ago - Stars: 767 - Forks: 115

smilelight/lightKG

A knowledge graph deep learning framework based on PyTorch and torchtext.

Language: Python - Size: 91.8 KB - Last synced at: 12 days ago - Pushed at: over 5 years ago - Stars: 615 - Forks: 150

mmarouen/ds-gear

data science gear: Advanced machine learning algorithms built on top of keras, tensorflow and sklear

Language: Python - Size: 86.9 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 1

nyukat/pathology_extraction

Improving Information Extraction from Pathology Reports using Named Entity Recognition

Language: Python - Size: 1.56 MB - Last synced at: about 17 hours ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 3

sina-al/pynlp 📦

A pythonic wrapper for Stanford CoreNLP.

Language: Python - Size: 85 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 108 - Forks: 11

opensemanticsearch/open-semantic-search-apps

Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations and named entities) and data import (ETL like text extraction, OCR and crawling filesystems or websites)

Language: CSS - Size: 1.37 MB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 99 - Forks: 38

CAMeL-Lab/camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

Language: Python - Size: 11.5 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 488 - Forks: 78

UniversalDataTool/universal-data-tool

Collaborate & label any type of data, images, text, or documents, in an easy web interface or desktop app.

Language: JavaScript - Size: 247 MB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 2,031 - Forks: 193

luopeixiang/named_entity_recognition

Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, and BiLSTM+CRF)

Language: Python - Size: 28.2 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 2,245 - Forks: 537

indu-explores-data/Automated-Resume-Data-Extraction

Automated resume information extraction using NLP. The project extracts Name, Email, and Phone from TXT, DOCX, and PDF files using spaCy and regex. It converts unstructured data into structured formats, improving recruitment efficiency and enabling scalable candidate profiling.

Language: Jupyter Notebook - Size: 71.3 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Related Keywords

nlp (372), natural-language-processing (311), ner (287), machine-learning (135), python (110), pytorch (106), deep-learning (103), bert (98), relation-extraction (94), spacy (90), information-extraction (78), sentiment-analysis (68), tensorflow (51), transformers (50), text-classification (48), crf (45), sequence-labeling (45), nlp-machine-learning (37), pos-tagging (37), lstm (33), dataset (33), knowledge-graph (33), text-mining (31), question-answering (30), entity-linking (30), transformer (25), keras (25), huggingface (24), conditional-random-fields (23), python3 (23), natural-language-understanding (21), tokenization (21), entity-extraction (21), ai (21), nltk (20), llm (20), language-model (19), bilstm-crf (19), large-language-models (19), neural-networks (19), event-extraction (18), neural-network (18), spacy-nlp (17), topic-modeling (17), corpus (17), artificial-intelligence (17), java (17), lemmatization (17), named-entities (17), dependency-parsing (16), dependency-parser (15), intent-classification (15), transfer-learning (15), bilstm (15), classification (14), flask (14), part-of-speech-tagging (14), lstm-crf (14), data-science (14), part-of-speech-tagger (14), annotation-tool (13), coreference-resolution (13), roberta (13), bert-model (13), machine-translation (13), word-embeddings (13), summarization (12), conll-2003 (12), text-summarization (12), part-of-speech (11), llama (11), nlp-library (11), bert-ner (11), text-processing (11), deep-neural-networks (11), anonymization (11), nlu (11), slot-filling (11), word-segmentation (10), cnn (10), vietnamese-nlp (10), chinese (10), text-generation (10), stemming (10), docker (10), biomedical (10), sentiment-classification (10), token-classification (9), tokenizer (9), word2vec (9), flair (9), streamlit (9), fastapi (9), ocr (9), annotation (9), bilstm-crf-model (9), stanford-corenlp (9), named-entity-disambiguation (9), text-annotation (9), chatbot (9)