An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-classification

microsoft/simplechat

Secure AI conversations with documents, video, audio, and more. Personal workspaces for focused context, group spaces for shared insight. Classify docs, reuse prompts, and extend with modular features.

Language: Python - Size: 49 MB - Last synced at: about 20 hours ago - Pushed at: about 21 hours ago - Stars: 97 - Forks: 79

cernis-intelligence/docuglean-ocr

Intelligent document processing. Extract structured data like JSON, Markdown and HTML from documents using AI.

Language: Python - Size: 454 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6 - Forks: 0

DeathFlame7/Smart_Organizer

📁 Streamline file management with Smart_Organizer, an intelligent tool that automates file sorting, renaming, and organization with machine learning.

Language: Python - Size: 1.48 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Kaisanya/NanoTabVLM

📊 Transform images of tables into accurate HTML text with NanoTabVLM, a lightweight model that excels in digitalization and efficiency.

Language: Python - Size: 7.23 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

mangzz12/Paper2Agent

🤖 Transform research papers into interactive AI agents with minimal effort using Paper2Agent's multi-agent system for seamless tutorial execution.

Language: Jupyter Notebook - Size: 1.8 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

PFS-AI/PFS

Precision File Search (PFS) is an AI-powered desktop file search for finding, classifying, and understanding files. Search by keyword and semantics inside files, ask questions, and quickly gain insights from cluttered and scattered documents.

Language: Python - Size: 2.96 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

Renovamen/Text-Classification

PyTorch implementation of some text classification models (HAN, fastText, BiLSTM-Attention, TextCNN, Transformer) | Text classification

Language: Python - Size: 266 KB - Last synced at: 44 minutes ago - Pushed at: over 4 years ago - Stars: 152 - Forks: 30

sergioburdisso/pyss3

A Python library for Interpretable Machine Learning in Text Classification using the SS3 model, with easy-to-use visualization tools for Explainable AI

Language: Python - Size: 102 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 347 - Forks: 44

raviqqe/tensorflow-font2char2word2sent2doc

TensorFlow implementation of Hierarchical Attention Networks for Document Classification and some extensions

Language: Python - Size: 78.1 KB - Last synced at: about 1 month ago - Pushed at: over 8 years ago - Stars: 94 - Forks: 31

Md-Emon-Hasan/InformaTruth

Fine-tuned roberta-base classifier on the LIAR dataset. Accepts multiple input types (text, URLs, and PDFs) and outputs a prediction with a confidence score. It also leverages google/flan-t5-base to generate explanations and uses an Agentic AI with LangGraph to orchestrate agents for planning, retrieval, execution, fallback, and reasoning.

Language: Jupyter Notebook - Size: 9.61 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1

ThomasKaen/nas_file_organizer

Content-aware file sorter with OCR for NAS (Synology/QNAP/Docker-friendly).

Language: Python - Size: 1.14 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

umur957/Custodian

An intelligent, enterprise-grade document management system that automatically sorts, renames, and archives digital documents using state-of-the-art OCR and AI technology.

Language: Python - Size: 65.4 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

paulsamuel-w-e/Multi-Modal-Government-ID-Classification

AI-powered Gov. ID classifier using OCR, BERT, ResNet, and LayoutLMv3 for Aadhar, PAN, Passport, and other scanned IDs.

Language: Python - Size: 53.4 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 1

mdh266/TextClassificationApp

Building and Deploying A Serverless Text Classification Web App

Language: Jupyter Notebook - Size: 12.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 19 - Forks: 10

BABIN-JOE/NeuroDoc

NeuroDoc is a powerful AI-based offline document summarization tool that leverages OCR and NLP to intelligently analyze PDFs and generate structured summaries. Built using Flask, this tool is designed to run completely offline and supports both text-based and scanned/image-based documents.

Language: Python - Size: 13.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

elielsonfrigeri/nas_file_organizer

🗂️ Organize your NAS or local files effortlessly with automated sorting for PDFs, images, and Office documents using OCR and rule-based classification.

Language: Python - Size: 328 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

timothyroch/HackerRank

Language: Python - Size: 60.5 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

nabeelshan78/document-classification-nlp

Automated document classification system using PyTorch & TorchText. Loads and preprocesses news articles, trains a text classification model, visualizes embeddings, and predicts topics such as World, Sports, Business, and Sci/Tech.

Language: Jupyter Notebook - Size: 22.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

jfilter/text-classification-keras 📦

📚 Text classification library with Keras

Language: Python - Size: 11.8 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 53 - Forks: 11

kk7nc/HDLTex

HDLTex: Hierarchical Deep Learning for Text Classification

Language: Python - Size: 32 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 276 - Forks: 66

DataTurks/DataTurks

ML data annotations made super easy for teams. Just upload data, add your team and build training/evaluation dataset in hours.

Language: JavaScript - Size: 3.95 MB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 269 - Forks: 123

NavodPeiris/DoClasiq

A document classification and data extraction web app for company documents. Classification is done using a fine-tuned version of LayoutLMv2ForSequenceClassification from the Hugging Face Transformers library. Rule-based structured data extraction is done using a predefined configuration for each class of documents.

Language: Jupyter Notebook - Size: 5.02 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

jankstar/pydocu

FastAPI server for classification of documents and extraction of data

Language: Python - Size: 156 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

gdmatrix/gdmatrix

GDMatrix is a modular document management system that provides an integrated set of services and applications designed to meet the requirements of public administrations.

Language: Java - Size: 27.5 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 5 - Forks: 5

Willgnner-Santos/DPE-Legal-Doc-Classification-Pipeline

The results are drawn from experiments on the classification of legal documents using LLMs in a real-world institutional setting

Language: Jupyter Notebook - Size: 45.8 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

ankitdipto/ImageInfoExtractor

An end-to-end pipeline to filter scanned documents from arbitrary images with subsequent classification of the extracted documents.

Language: Jupyter Notebook - Size: 12.5 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

anukraticodes/CareerStack

Intelligent file manager for career professionals - Organize resumes, certificates, and cover letters with AI-powered auto-detection. Built with Java Spring Boot, React, and Machine Learning.

Language: Python - Size: 69.8 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Yosef-AlSabbah/Cloud-Based-Document-Analytics-Service-2

Cloud-based service for uploading, scraping, and managing PDF/DOCX documents. Features include title sorting, content search with highlights, rule-based classification, and storage stats. Integrated with cloud platforms for scalable document analytics.

Language: TypeScript - Size: 269 KB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

sgrvinod/a-PyTorch-Tutorial-to-Text-Classification

Hierarchical Attention Networks | a PyTorch Tutorial to Text Classification

Language: Python - Size: 712 KB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 248 - Forks: 55

DocsaidLab/DocClassifier

A zero-shot document classifier.

Language: Python - Size: 46 MB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 4 - Forks: 1

eleonc56/Cloud-Based-Document-Analytics-Service

Cloud-Based Document Analytics Service offers a simple way to manage your documents in the cloud. With features like drag-and-drop upload and powerful web scraping, it streamlines your document analysis. 🗂️💻

Language: TypeScript - Size: 315 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

MohamedMoubarakHussein/Automatic-Document-Classification-Categorization-By-Subject

Machine Learning-powered document classifier using SVM and TF-IDF vectorization. Automatically categorizes BBC news articles into 5 subjects with 98.65% accuracy.

Language: Jupyter Notebook - Size: 2.43 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

vietnh1009/Hierarchical-attention-networks-pytorch

Hierarchical Attention Networks for document classification

Language: Python - Size: 48.5 MB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 396 - Forks: 104

reinelt88/document-classifier

AI-powered document classifier with FastAPI + Streamlit | Supports text & PDF input

Language: Python - Size: 305 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

MiniAiLive/ID-DocumentRecognition-SDK-Docker

MiniAiLive Intelligent ID OCR for Reliable Identity Verification. From document verification to data entry, our MiniAiLive OCR solution can help transform your identity verification process.

Language: Python - Size: 1.66 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 152 - Forks: 4

brightmart/bert_language_understanding

Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN

Language: Python - Size: 16 MB - Last synced at: 6 months ago - Pushed at: almost 7 years ago - Stars: 966 - Forks: 211

Hazoom/bert-han

Hierarchical-Attention-Network

Language: Python - Size: 522 KB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 46 - Forks: 9

Neeraj652/AI-_Document_Extraction

Document Intelligence: Demonstrating AI and OCR capabilities in the system

Language: Python - Size: 101 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

kk7nc/Text_Classification

Text Classification Algorithms: A Survey

Language: Python - Size: 13.8 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 1,811 - Forks: 544

JPLeoRX/detectron2-publaynet

Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset

Language: Python - Size: 7.76 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 49 - Forks: 7

devrahulbanjara/Document-Verification-System

A system designed to assist in verifying the authenticity of Nepali documents (Citizenship, Driving License, Passport) against the government database.

Language: Jupyter Notebook - Size: 4.01 GB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 2 - Forks: 1

hank110/bagofconcepts

Python implementation of bag-of-concepts

Language: Python - Size: 5.34 MB - Last synced at: 27 days ago - Pushed at: over 3 years ago - Stars: 20 - Forks: 1

luopeixiang/textclf

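TextClf: a text classification framework based on PyTorch/Sklearn, including logistic regression, SVM, TextCNN, TextRNN, TextRCNN, DRNN, DPCNN, BERT, and other models. Data processing, model training, and testing can be completed through simple configuration.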

Language: Python - Size: 281 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 241 - Forks: 39

sraashis/diseaseprediction

Undergrad final year project to predict diseases given any text symptoms.

Language: Java - Size: 1.4 MB - Last synced at: 7 months ago - Pushed at: about 4 years ago - Stars: 26 - Forks: 7

vietnh1009/Character-level-cnn-pytorch

Character-level CNN for text classification

Language: Python - Size: 53.3 MB - Last synced at: 7 months ago - Pushed at: almost 4 years ago - Stars: 54 - Forks: 17

vietnh1009/Character-level-cnn-tensorflow

Character-level CNN for text classification

Language: Python - Size: 84.3 MB - Last synced at: 4 months ago - Pushed at: almost 4 years ago - Stars: 28 - Forks: 11

vietnh1009/Very-deep-cnn-pytorch

Very deep CNN for text classification

Language: Python - Size: 120 MB - Last synced at: 7 months ago - Pushed at: almost 4 years ago - Stars: 36 - Forks: 14

vietnh1009/Very-deep-cnn-tensorflow

Very deep CNN for text classification

Language: Python - Size: 17.3 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 20 - Forks: 8

danfmaia/hybrid-legal-doc-classifier

Production-ready zero-shot legal document classifier using Mistral-7B LLM and FAISS validation, built with FastAPI for high-performance document classification.

Language: Python - Size: 301 KB - Last synced at: 8 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

vickshan001/Friends-Character-Classifier-Vector-Semantics-NLP

NLP coursework using vector space semantics to classify Friends character dialogue. Includes TF-IDF, POS, sentiment, and context-aware features.

Language: Jupyter Notebook - Size: 339 KB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

MiniAiLive/ID-DocumentRecognition-Windows

MiniAiLive Intelligent ID OCR for Reliable Identity Verification. From document verification to data entry, our MiniAiLive OCR solution can help transform your identity verification process.

Language: Python - Size: 1.72 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 83 - Forks: 76

MiniAiLive/ID-DocumentRecognition-Linux

MiniAiLive Intelligent ID OCR for Reliable Identity Verification. From document verification to data entry, our MiniAiLive OCR solution can help transform your identity verification process.

Language: Python - Size: 1.73 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 83 - Forks: 72

MiniAiLive/.github

Size: 47.9 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 15 - Forks: 2

castorini/hedwig

PyTorch deep learning models for document classification

Language: Python - Size: 23.3 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 595 - Forks: 125

SDpDas/YOLOv5-DocAnalyser

This tool extracts images from a PDF, annotates them using the YOLOv5 model, and converts the annotated images back into a single PDF. https://github.com/ultralytics/yolov5 https://github.com/HumanSignal/labelImg https://www.kaggle.com/code/sagardeepdas/yolov5-model1

Language: Python - Size: 179 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

justinbt1/Multimodal-Document-Classification

MSc project investigating multi-modal fusion approaches to combining textual and visual features for multi-page classification of documents within the OGA National Data Repository (NDR).

Language: Jupyter Notebook - Size: 11.9 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

ematvey/hierarchical-attention-networks

Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is currently unmaintained, issues will probably not be addressed.

Language: Python - Size: 40 KB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 465 - Forks: 148

olekli/MrDocument

Automatic PDF transcription and classification via OpenAI

Language: Rust - Size: 163 KB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

RituYadav92/Lightweighted-CNN-for-Document-Classification

Optimized Text Document Classification

Language: Python - Size: 4.93 MB - Last synced at: 7 months ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 4

sr-murthy/doc_classifier

Experimental document classification tool based on a domain-dependent, keywords-based document class map and a keyword frequency score

Language: Python - Size: 74.2 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

MiteshPuthran/Document_Classification

Python code for classification of documents into different classes using machine learning

Language: Jupyter Notebook - Size: 89.1 MB - Last synced at: 7 months ago - Pushed at: over 6 years ago - Stars: 28 - Forks: 8

DanjelPiDev/SmartPDFManager

A Python-based tool for organizing PDF files. The tool automatically extracts text from PDF documents to determine their category based on predefined keywords and moves them into categorized folders, making document management easier and more efficient.

Language: Python - Size: 1010 KB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Hellisotherpeople/Active-Explainable-Classification

A set of tools for leveraging pre-trained embeddings, active learning and model explainability for efficient document classification

Language: HTML - Size: 2.78 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 29 - Forks: 1

acsenrafilho/cucaracha

A bureaucratic cockroach (cucaracha) assistant to help in document processing and analysis

Language: Python - Size: 5.93 MB - Last synced at: 23 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 1

GerHobbelt/qiqqa-open-source (fork of jimmejardine/qiqqa-open-source)

The open-sourced version of the award-winning Qiqqa research management tool for Windows (a bleeding edge dev fork). File any issues you find in the main repo issue tracker at https://github.com/jimmejardine/qiqqa-open-source/issues

Language: TeX - Size: 1.43 GB - Last synced at: 8 months ago - Pushed at: 10 months ago - Stars: 47 - Forks: 5

digiparser/digiparser-docs

DigiParser Documentation and API reference

Language: MDX - Size: 149 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

qkrdmsghk/TextSSL

[AAAI 2022] Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification

Language: Python - Size: 142 MB - Last synced at: 7 months ago - Pushed at: 12 months ago - Stars: 32 - Forks: 10

IBM/twitter-customer-care-document-prediction 📦

Twitter dataset for Conversational Document Prediction to Assist Customer Care Agents (Ganhotra et al. 2020, EMNLP)

Size: 6.84 MB - Last synced at: 5 months ago - Pushed at: about 5 years ago - Stars: 15 - Forks: 6

pandeykartikey/Hierarchical-Attention-Network

Implementation of Hierarchical Attention Networks in PyTorch

Language: Jupyter Notebook - Size: 98.2 MB - Last synced at: 7 months ago - Pushed at: about 7 years ago - Stars: 129 - Forks: 27

wri-dssg-omdena/policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Language: Jupyter Notebook - Size: 226 MB - Last synced at: 8 months ago - Pushed at: over 3 years ago - Stars: 35 - Forks: 8

eahlys/EdPaper

Helps you organize your paperwork

Language: PHP - Size: 354 KB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 57 - Forks: 10

qtuantruong/hierarchical-attention-networks

TensorFlow implementation of the paper "Hierarchical Attention Networks for Document Classification"

Language: Python - Size: 1.07 MB - Last synced at: 4 months ago - Pushed at: over 6 years ago - Stars: 87 - Forks: 25

PKPDAI/PKDocClassifier

Binary classifier to identify scientific publications reporting pharmacokinetic parameters estimated in vivo

Language: Python - Size: 8.42 MB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 18 - Forks: 8

SDpDas/Document-layout-generator-and-segmentation-tool

Lists all parts of a PDF document; highly scalable, with robust code.

Language: Python - Size: 58.7 MB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

bitrao/Job-Advertisement-Classification

Classification Models for Job Advertisements

Language: Jupyter Notebook - Size: 2.19 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

lozingaro/multimodal-side-tuning

Classification using deep-learning additive technique and multimodal inputs.

Language: Python - Size: 4.05 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 0

caltechlibrary/documentarist

Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents

Language: Python - Size: 519 KB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 12 - Forks: 4

IgorAugust0/info-org-retrieval

📙 Files and materials used in the course GSI024 - Information Organization and Retrieval (Organização e Recuperação da Informação) at UFU.

Language: Jupyter Notebook - Size: 5.61 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

gaurav104/TextClassification

Repository of state-of-the-art text/documentation classification algorithms in PyTorch.

Language: Jupyter Notebook - Size: 1.82 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 11 - Forks: 2

csebuetnlp/banglabert

This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: NAACL-2022.

Language: Python - Size: 1.14 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 230 - Forks: 31

ExtrieveTechnologies/QuickCapture_IOS

QuickCapture Mobile Scanning SDK Specially designed for native IOS

Language: Objective-C - Size: 90.8 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

prnvpwr2612/MS-AI-900-Practice

Completing the exercises and exploring services offered by Azure AI.

Language: Python - Size: 33.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

MS1034/document-classification-using-KNN

Document classification using the KNN algorithm, a graph-based approach, along with scraped data

Language: Python - Size: 13.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ali7haider/Classification_of_Documents_Using_Graph-Based-Features_and_KNN_GT

Classification of Documents Using Graph-Based Features and KNN. This project offers hands-on experience with graph theory and machine learning, fostering skills in data representation, algorithm implementation, and analytical thinking in the context of document classification.

Language: Python - Size: 1.79 MB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Yahya123-hub/Classification-of-Documents-Using-Graph-Based-Features-and-KNN

An innovative project that integrates graph theory and machine learning techniques to classify documents into predefined topics. By leveraging graph representations of documents and employing the K-Nearest Neighbors (KNN) algorithm, this project aims to provide a robust system for document classification

Language: Python - Size: 49 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 2

hautran7201/skip_gram_for_document_classification

Language: Python - Size: 79.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

gangeshwark/hierarchical_document_modeling

Hierarchical Attention Networks for Document Classification

Language: Python - Size: 63.9 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 3 - Forks: 5

pfalcon/papersman

Minimalist electronic documents/papers/publications manager/indexer/categorizer

Language: Python - Size: 21.5 KB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 14 - Forks: 2

jhj0517/document_classification

finetune text classification model

Language: Jupyter Notebook - Size: 4.9 MB - Last synced at: 8 months ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 1

stipefrkovic/identity-document-classification-trainer

[Software Engineering]

Language: Python - Size: 327 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

11harini04/Feedly

Sentiment analysis of Reuters news articles.

Language: Python - Size: 25.4 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

FranzTscharf/JavaScript-D3-Object-Detection-Example

JavaScript D3 Object Detection Bounding Box Draw Example

Language: HTML - Size: 598 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 1

EdGENetworks/attention-networks-for-classification

Hierarchical Attention Networks for Document Classification in PyTorch

Language: Jupyter Notebook - Size: 233 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 595 - Forks: 135

elahehaghaarabi/language_model_grant_classifier

A language model is fine-tuned using domain data to identify pre-defined groups of documents

Language: Jupyter Notebook - Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sbischoff-ai/basic-document-classifier

A simple CNN for n-class classification of document images

Language: Python - Size: 27.3 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

GuillaumeDD/gowpy

A very simple library for exploiting graph-of-words in NLP

Language: Python - Size: 1.09 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 12 - Forks: 2

aditya00kumar/document-classification

This project is an attempt to provide a generic pipeline for document classification using different machine learning models.

Language: Jupyter Notebook - Size: 12 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 7 - Forks: 3

Emvista/Gnn4DependencyDocumentClassification

Language: Python - Size: 35.2 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

malteos/semantic-document-relations

Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles"

Language: Python - Size: 174 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 31 - Forks: 2

ibtihelgharsalah/Document-Summarization-and-Information-Retrieval

A web app that uses NLP tasks for document manipulation and QA.

Language: HTML - Size: 35.2 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Related Keywords
document-classification (185), nlp (51), machine-learning (45), text-classification (43), deep-learning (33), natural-language-processing (32), python (22), pytorch (22), hierarchical-attention-networks (16), bert (14), nlp-machine-learning (14), ocr (14), classification (13), tensorflow (13), python3 (10), sentiment-analysis (10), ai (8), image-classification (8), fastapi (8), word2vec (7), deep-neural-networks (7), naive-bayes-classifier (7), automation (6), document (6), tesseract (6), java (6), document-management (6), lstm (5), han (5), deeplearning (5), data-extraction (5), document-analysis (5), convolutional-neural-networks (5), text-mining (5), sentence-classification (5), docker (5), data-science (5), transformers (5), authentication (4), huggingface-transformers (4), computer-vision (4), transfer-learning (4), artificial-intelligence (4), mrz-scanner (4), information-retrieval (4), onboarding (4), sentiment-classification (4), attention-mechanism (4), transformer (4), pdf (4), keras (4), cnn-text-classification (4), multinomial-naive-bayes (4), neural-network (4), scikit-learn (4), biometrics (4), ekyc-verification (4), nltk (4), lda (4), id-document-reader (4), ocr-recognition (3), pytorch-implementation (3), tf-idf (3), fine-tuning (3), bert-model (3), sequence-classification (3), text-analysis (3), passport-reader (3), idcard-ocr (3), vector-space-model (3), streamlit (3), naive-bayes (3), huggingface (3), document-ocr (3), active-learning (3), jupyter-notebook (3), on-premise (3), image-processing (3), typescript (3), text-processing (3), knn-classification (3), topic-modeling (3), object-detection (3), open-source (3), data-mining (3), textcnn (3), fasttext (3), file-management (3), cnn (3), clustering (3), instance-segmentation (2), glove (2), cv (2), apache-spark (2), attention (2), neural-networks (2), idverification (2), kyc-api (2), nfc-card-reader (2), cloud-computing (2)