GitHub topics: document-classification
microsoft/simplechat
Secure AI conversations with documents, video, audio, and more. Personal workspaces for focused context, group spaces for shared insight. Classify docs, reuse prompts, and extend with modular features.
Language: Python - Size: 49 MB - Last synced at: about 20 hours ago - Pushed at: about 21 hours ago - Stars: 97 - Forks: 79
cernis-intelligence/docuglean-ocr
Intelligent document processing. Extract structured data like JSON, Markdown and HTML from documents using AI.
Language: Python - Size: 454 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6 - Forks: 0
DeathFlame7/Smart_Organizer
📁 Streamline file management with Smart_Organizer, an intelligent tool that automates file sorting, renaming, and organization with machine learning.
Language: Python - Size: 1.48 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0
Kaisanya/NanoTabVLM
📊 Transform images of tables into accurate HTML text with NanoTabVLM, a lightweight model that excels in digitalization and efficiency.
Language: Python - Size: 7.23 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0
mangzz12/Paper2Agent
🤖 Transform research papers into interactive AI agents with minimal effort using Paper2Agent's multi-agent system for seamless tutorial execution.
Language: Jupyter Notebook - Size: 1.8 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0
PFS-AI/PFS
Precision File Search (PFS) is an AI-powered desktop file search for finding, classifying, and understanding files. Search by keyword and semantics inside files, ask questions, and quickly gain insights from cluttered and scattered documents.
Language: Python - Size: 2.96 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0
Renovamen/Text-Classification
PyTorch implementation of some text classification models (HAN, fastText, BiLSTM-Attention, TextCNN, Transformer) | Text classification
Language: Python - Size: 266 KB - Last synced at: 44 minutes ago - Pushed at: over 4 years ago - Stars: 152 - Forks: 30
sergioburdisso/pyss3
A Python library for Interpretable Machine Learning in Text Classification using the SS3 model, with easy-to-use visualization tools for Explainable AI
Language: Python - Size: 102 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 347 - Forks: 44
raviqqe/tensorflow-font2char2word2sent2doc
TensorFlow implementation of Hierarchical Attention Networks for Document Classification and some extensions
Language: Python - Size: 78.1 KB - Last synced at: about 1 month ago - Pushed at: over 8 years ago - Stars: 94 - Forks: 31
Md-Emon-Hasan/InformaTruth
Fine-tuned roberta-base classifier on the LIAR dataset. Accepts multiple input types (text, URLs, and PDFs) and outputs a prediction with a confidence score. It also leverages google/flan-t5-base to generate explanations and uses an Agentic AI with LangGraph to orchestrate agents for planning, retrieval, execution, fallback, and reasoning.
Language: Jupyter Notebook - Size: 9.61 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1
ThomasKaen/nas_file_organizer
Content-aware file sorter with OCR for NAS (Synology/QNAP/Docker-friendly).
Language: Python - Size: 1.14 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0
umur957/Custodian
An intelligent, enterprise-grade document management system that automatically sorts, renames, and archives digital documents using state-of-the-art OCR and AI technology.
Language: Python - Size: 65.4 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0
paulsamuel-w-e/Multi-Modal-Government-ID-Classification
AI-powered Gov. ID classifier using OCR, BERT, ResNet, and LayoutLMv3 for Aadhar, PAN, Passport, and other scanned IDs.
Language: Python - Size: 53.4 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 1
mdh266/TextClassificationApp
Building and Deploying A Serverless Text Classification Web App
Language: Jupyter Notebook - Size: 12.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 19 - Forks: 10
BABIN-JOE/NeuroDoc
NeuroDoc is a powerful AI-based offline document summarization tool that leverages OCR and NLP to intelligently analyze PDFs and generate structured summaries. Built using Flask, this tool is designed to run completely offline and supports both text-based and scanned/image-based documents.
Language: Python - Size: 13.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0
elielsonfrigeri/nas_file_organizer
🗂️ Organize your NAS or local files effortlessly with automated sorting for PDFs, images, and Office documents using OCR and rule-based classification.
Language: Python - Size: 328 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0
timothyroch/HackerRank
Language: Python - Size: 60.5 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0
nabeelshan78/document-classification-nlp
Automated document classification system using PyTorch & TorchText. Loads and preprocesses news articles, trains a text classification model, visualizes embeddings, and predicts topics such as World, Sports, Business, and Sci/Tech.
Language: Jupyter Notebook - Size: 22.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0
jfilter/text-classification-keras 📦
📚 Text classification library with Keras
Language: Python - Size: 11.8 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 53 - Forks: 11
kk7nc/HDLTex
HDLTex: Hierarchical Deep Learning for Text Classification
Language: Python - Size: 32 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 276 - Forks: 66
DataTurks/DataTurks
ML data annotations made super easy for teams. Just upload data, add your team and build training/evaluation dataset in hours.
Language: JavaScript - Size: 3.95 MB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 269 - Forks: 123
NavodPeiris/DoClasiq
A document classification and data extraction Web App for company documents. Classification is done using a fine-tuned version of LayoutLMv2ForSequenceClassification from the HuggingFace Transformers library. Rule-based structured data extraction is done using a predefined configuration for each class of documents.
Language: Jupyter Notebook - Size: 5.02 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
jankstar/pydocu
fastapi server for classification of documents and extraction of data
Language: Python - Size: 156 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0
gdmatrix/gdmatrix
GDMatrix is a modular document management system that provides an integrated set of services and applications designed to meet the requirements of public administrations.
Language: Java - Size: 27.5 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 5 - Forks: 5
Willgnner-Santos/DPE-Legal-Doc-Classification-Pipeline
The results are drawn from experiments on the classification of legal documents using LLMs in a real-world institutional setting
Language: Jupyter Notebook - Size: 45.8 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
ankitdipto/ImageInfoExtractor
An end-to-end pipeline to filter scanned documents from arbitrary images with subsequent classification of the extracted documents.
Language: Jupyter Notebook - Size: 12.5 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0
anukraticodes/CareerStack
Intelligent file manager for career professionals - Organize resumes, certificates, and cover letters with AI-powered auto-detection. Built with Java Spring Boot, React, and Machine Learning.
Language: Python - Size: 69.8 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
Yosef-AlSabbah/Cloud-Based-Document-Analytics-Service-2
Cloud-based service for uploading, scraping, and managing PDF/DOCX documents. Features include title sorting, content search with highlights, rule-based classification, and storage stats. Integrated with cloud platforms for scalable document analytics.
Language: TypeScript - Size: 269 KB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0
sgrvinod/a-PyTorch-Tutorial-to-Text-Classification
Hierarchical Attention Networks | a PyTorch Tutorial to Text Classification
Language: Python - Size: 712 KB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 248 - Forks: 55
DocsaidLab/DocClassifier
A zero-shot document classifier.
Language: Python - Size: 46 MB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 4 - Forks: 1
eleonc56/Cloud-Based-Document-Analytics-Service
Cloud-Based Document Analytics Service offers a simple way to manage your documents in the cloud. With features like drag-and-drop upload and powerful web scraping, it streamlines your document analysis. 🗂️💻
Language: TypeScript - Size: 315 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
MohamedMoubarakHussein/Automatic-Document-Classification-Categorization-By-Subject
Machine Learning-powered document classifier using SVM and TF-IDF vectorization. Automatically categorizes BBC news articles into 5 subjects with 98.65% accuracy.
Language: Jupyter Notebook - Size: 2.43 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0
vietnh1009/Hierarchical-attention-networks-pytorch
Hierarchical Attention Networks for document classification
Language: Python - Size: 48.5 MB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 396 - Forks: 104
reinelt88/document-classifier
AI-powered document classifier with FastAPI + Streamlit | Supports text & PDF input
Language: Python - Size: 305 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0
MiniAiLive/ID-DocumentRecognition-SDK-Docker
MiniAiLive Intelligent ID OCR for Reliable Identity Verification. From document verification to data entry, our MiniAiLive OCR solution can help transform your identity verification process.
Language: Python - Size: 1.66 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 152 - Forks: 4
brightmart/bert_language_understanding
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Language: Python - Size: 16 MB - Last synced at: 6 months ago - Pushed at: almost 7 years ago - Stars: 966 - Forks: 211
Hazoom/bert-han
Hierarchical-Attention-Network
Language: Python - Size: 522 KB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 46 - Forks: 9
Neeraj652/AI-_Document_Extraction
Document Intelligence: Demonstrating AI and OCR capabilities in the system
Language: Python - Size: 101 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0
kk7nc/Text_Classification
Text Classification Algorithms: A Survey
Language: Python - Size: 13.8 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 1,811 - Forks: 544
JPLeoRX/detectron2-publaynet
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
Language: Python - Size: 7.76 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 49 - Forks: 7
devrahulbanjara/Document-Verification-System
A system designed to assist in verifying the authenticity of Nepali documents (Citizenship, Driving License, Passport) against the government database.
Language: Jupyter Notebook - Size: 4.01 GB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 2 - Forks: 1
hank110/bagofconcepts
Python implementation of bag-of-concepts
Language: Python - Size: 5.34 MB - Last synced at: 27 days ago - Pushed at: over 3 years ago - Stars: 20 - Forks: 1
luopeixiang/textclf
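TextClf: a text classification framework based on PyTorch/Sklearn, including logistic regression, SVM, TextCNN, TextRNN, TextRCNN, DRNN, DPCNN, BERT, and other models. Data processing, model training, and testing can be completed through simple configuration.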
Language: Python - Size: 281 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 241 - Forks: 39
sraashis/diseaseprediction
Undergrad final year project to predict diseases given any text symptoms.
Language: Java - Size: 1.4 MB - Last synced at: 7 months ago - Pushed at: about 4 years ago - Stars: 26 - Forks: 7
vietnh1009/Character-level-cnn-pytorch
Character-level CNN for text classification
Language: Python - Size: 53.3 MB - Last synced at: 7 months ago - Pushed at: almost 4 years ago - Stars: 54 - Forks: 17
vietnh1009/Character-level-cnn-tensorflow
Character-level CNN for text classification
Language: Python - Size: 84.3 MB - Last synced at: 4 months ago - Pushed at: almost 4 years ago - Stars: 28 - Forks: 11
vietnh1009/Very-deep-cnn-pytorch
Very deep CNN for text classification
Language: Python - Size: 120 MB - Last synced at: 7 months ago - Pushed at: almost 4 years ago - Stars: 36 - Forks: 14
vietnh1009/Very-deep-cnn-tensorflow
Very deep CNN for text classification
Language: Python - Size: 17.3 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 20 - Forks: 8
danfmaia/hybrid-legal-doc-classifier
Production-ready zero-shot legal document classifier using Mistral-7B LLM and FAISS validation, built with FastAPI for high-performance document classification.
Language: Python - Size: 301 KB - Last synced at: 8 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0
vickshan001/Friends-Character-Classifier-Vector-Semantics-NLP
NLP coursework using vector space semantics to classify Friends character dialogue. Includes TF-IDF, POS, sentiment, and context-aware features.
Language: Jupyter Notebook - Size: 339 KB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
MiniAiLive/ID-DocumentRecognition-Windows
MiniAiLive Intelligent ID OCR for Reliable Identity Verification. From document verification to data entry, our MiniAiLive OCR solution can help transform your identity verification process.
Language: Python - Size: 1.72 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 83 - Forks: 76
MiniAiLive/ID-DocumentRecognition-Linux
MiniAiLive Intelligent ID OCR for Reliable Identity Verification. From document verification to data entry, our MiniAiLive OCR solution can help transform your identity verification process.
Language: Python - Size: 1.73 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 83 - Forks: 72
MiniAiLive/.github
Size: 47.9 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 15 - Forks: 2
castorini/hedwig
PyTorch deep learning models for document classification
Language: Python - Size: 23.3 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 595 - Forks: 125
SDpDas/YOLOv5-DocAnalyser
This tool extracts images from a PDF, annotates them using the YOLOv5 model, and converts the annotated images back into a single PDF. https://github.com/ultralytics/yolov5 https://github.com/HumanSignal/labelImg https://www.kaggle.com/code/sagardeepdas/yolov5-model1
Language: Python - Size: 179 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0
justinbt1/Multimodal-Document-Classification
MSc project investigating multi-modal fusion approaches to combining textual and visual features for multi-page classification of documents within the OGA National Data Repository (NDR).
Language: Jupyter Notebook - Size: 11.9 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0
ematvey/hierarchical-attention-networks
Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is currently unmaintained, issues will probably not be addressed.
Language: Python - Size: 40 KB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 465 - Forks: 148
olekli/MrDocument
Automatic PDF transcription and classification via OpenAI
Language: Rust - Size: 163 KB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
RituYadav92/Lightweighted-CNN-for-Document-Classification
Optimized Text Document Classification
Language: Python - Size: 4.93 MB - Last synced at: 7 months ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 4
sr-murthy/doc_classifier
Experimental document classification tool based on a domain-dependent, keywords-based document class map and a keyword frequency score
Language: Python - Size: 74.2 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
MiteshPuthran/Document_Classification
Python code for classification of documents into different classes using machine learning
Language: Jupyter Notebook - Size: 89.1 MB - Last synced at: 7 months ago - Pushed at: over 6 years ago - Stars: 28 - Forks: 8
DanjelPiDev/SmartPDFManager
A Python-based tool for organizing PDF files. The tool automatically extracts text from PDF documents to determine their category based on predefined keywords and moves them into categorized folders, making document management easier and more efficient.
Language: Python - Size: 1010 KB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
Hellisotherpeople/Active-Explainable-Classification
A set of tools for leveraging pre-trained embeddings, active learning and model explainability for efficient document classification
Language: HTML - Size: 2.78 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 29 - Forks: 1
acsenrafilho/cucaracha
A bureaucratic cockroach (cucaracha) assistant to help in document processing and analysis
Language: Python - Size: 5.93 MB - Last synced at: 23 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 1
GerHobbelt/qiqqa-open-source (fork of jimmejardine/qiqqa-open-source)
The open-sourced version of the award-winning Qiqqa research management tool for Windows (a bleeding edge dev fork). File any issues you find in the main repo issue tracker at https://github.com/jimmejardine/qiqqa-open-source/issues
Language: TeX - Size: 1.43 GB - Last synced at: 8 months ago - Pushed at: 10 months ago - Stars: 47 - Forks: 5
digiparser/digiparser-docs
DigiParser Documentation and API reference
Language: MDX - Size: 149 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0
qkrdmsghk/TextSSL
[AAAI 2022] Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification
Language: Python - Size: 142 MB - Last synced at: 7 months ago - Pushed at: 12 months ago - Stars: 32 - Forks: 10
IBM/twitter-customer-care-document-prediction 📦
Twitter dataset for Conversational Document Prediction to Assist Customer Care Agents (Ganhotra et al. 2020, EMNLP)
Size: 6.84 MB - Last synced at: 5 months ago - Pushed at: about 5 years ago - Stars: 15 - Forks: 6
pandeykartikey/Hierarchical-Attention-Network
Implementation of Hierarchical Attention Networks in PyTorch
Language: Jupyter Notebook - Size: 98.2 MB - Last synced at: 7 months ago - Pushed at: about 7 years ago - Stars: 129 - Forks: 27
wri-dssg-omdena/policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Language: Jupyter Notebook - Size: 226 MB - Last synced at: 8 months ago - Pushed at: over 3 years ago - Stars: 35 - Forks: 8
eahlys/EdPaper
Helps you organize your paperwork
Language: PHP - Size: 354 KB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 57 - Forks: 10
qtuantruong/hierarchical-attention-networks
TensorFlow implementation of the paper "Hierarchical Attention Networks for Document Classification"
Language: Python - Size: 1.07 MB - Last synced at: 4 months ago - Pushed at: over 6 years ago - Stars: 87 - Forks: 25
PKPDAI/PKDocClassifier
Binary classifier to identify scientific publications reporting pharmacokinetic parameters estimated in vivo
Language: Python - Size: 8.42 MB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 18 - Forks: 8
SDpDas/Document-layout-generator-and-segmentation-tool
Lists all parts of a PDF document; highly scalable, with robust code.
Language: Python - Size: 58.7 MB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0
bitrao/Job-Advertisement-Classification
Classification Models for Job Advertisements
Language: Jupyter Notebook - Size: 2.19 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
lozingaro/multimodal-side-tuning
Classification using deep-learning additive technique and multimodal inputs.
Language: Python - Size: 4.05 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 0
caltechlibrary/documentarist
Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents
Language: Python - Size: 519 KB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 12 - Forks: 4
IgorAugust0/info-org-retrieval
📙 Files and materials used in the course GSI024 - Information Organization and Retrieval at UFU.
Language: Jupyter Notebook - Size: 5.61 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
gaurav104/TextClassification
Repository of state-of-the-art text/document classification algorithms in PyTorch.
Language: Jupyter Notebook - Size: 1.82 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 11 - Forks: 2
csebuetnlp/banglabert
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: NAACL-2022.
Language: Python - Size: 1.14 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 230 - Forks: 31
ExtrieveTechnologies/QuickCapture_IOS
QuickCapture Mobile Scanning SDK, specially designed for native iOS
Language: Objective-C - Size: 90.8 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0
prnvpwr2612/MS-AI-900-Practice
Completing the exercises & exploring services offered by Azure AI.
Language: Python - Size: 33.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
MS1034/document-classification-using-KNN
Document classification using the KNN algorithm, a graph-based approach, along with scraped data
Language: Python - Size: 13.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
ali7haider/Classification_of_Documents_Using_Graph-Based-Features_and_KNN_GT
Classification of Documents Using Graph-Based Features and KNN. This project offers hands-on experience with graph theory and machine learning, fostering skills in data representation, algorithm implementation, and analytical thinking in the context of document classification.
Language: Python - Size: 1.79 MB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0
Yahya123-hub/Classification-of-Documents-Using-Graph-Based-Features-and-KNN
An innovative project that integrates graph theory and machine learning techniques to classify documents into predefined topics. By leveraging graph representations of documents and employing the K-Nearest Neighbors (KNN) algorithm, this project aims to provide a robust system for document classification
Language: Python - Size: 49 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 2
hautran7201/skip_gram_for_document_classification
Language: Python - Size: 79.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
gangeshwark/hierarchical_document_modeling
Hierarchical Attention Networks for Document Classification
Language: Python - Size: 63.9 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 3 - Forks: 5
pfalcon/papersman
Minimalist electronic documents/papers/publications manager/indexer/categorizer
Language: Python - Size: 21.5 KB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 14 - Forks: 2
jhj0517/document_classification
finetune text classification model
Language: Jupyter Notebook - Size: 4.9 MB - Last synced at: 8 months ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 1
stipefrkovic/identity-document-classification-trainer
[Software Engineering]
Language: Python - Size: 327 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0
11harini04/Feedly
Sentiment analysis of Reuters news articles.
Language: Python - Size: 25.4 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1
FranzTscharf/JavaScript-D3-Object-Detection-Example
JavaScript D3 Object Detection Bounding Box Draw Example
Language: HTML - Size: 598 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 1
EdGENetworks/attention-networks-for-classification
Hierarchical Attention Networks for Document Classification in PyTorch
Language: Jupyter Notebook - Size: 233 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 595 - Forks: 135
elahehaghaarabi/language_model_grant_classifier
A language model is fine-tuned using domain data to identify pre-defined groups of documents
Language: Jupyter Notebook - Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
sbischoff-ai/basic-document-classifier
A simple CNN for n-class classification of document images
Language: Python - Size: 27.3 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0
GuillaumeDD/gowpy
A very simple library for exploiting graph-of-words in NLP
Language: Python - Size: 1.09 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 12 - Forks: 2
aditya00kumar/document-classification
This project is an attempt to provide a generic pipeline for document classification using different machine learning models.
Language: Jupyter Notebook - Size: 12 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 7 - Forks: 3
Emvista/Gnn4DependencyDocumentClassification
Language: Python - Size: 35.2 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0
malteos/semantic-document-relations
Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles"
Language: Python - Size: 174 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 31 - Forks: 2
ibtihelgharsalah/Document-Summarization-and-Information-Retrieval
A web app that uses NLP tasks for document manipulation and QA.
Language: HTML - Size: 35.2 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0