An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: text-processing

wenet-e2e/WeTextProcessing

Text Normalization & Inverse Text Normalization

Language: Python - Size: 892 KB - Last synced at: about 3 hours ago - Pushed at: about 5 hours ago - Stars: 611 - Forks: 85

hitesh22rana/sourcecollector

A simple tool to consolidate multiple files into a single .txt file. Perfect for feeding your files to AI tools without any fuss.

Language: Go - Size: 27.7 MB - Last synced at: about 15 hours ago - Pushed at: about 17 hours ago - Stars: 4 - Forks: 0

Lips7/Matcher

A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS problems in word matching, implemented in Rust.

Language: Rust - Size: 36.9 MB - Last synced at: about 16 hours ago - Pushed at: about 18 hours ago - Stars: 17 - Forks: 1

pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python - Size: 331 MB - Last synced at: about 18 hours ago - Pushed at: about 20 hours ago - Stars: 7,559 - Forks: 624

VitinDM/data-science-snippets

🧰 Essential EDA and Data Cleaning Helpers for Any DataFrame. This collection of functions is designed to accelerate exploratory data analysis (EDA), quickly surface data quality issues, and offer high-level insights into the structure and content of your dataset.

Language: Python - Size: 30.3 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Goldziher/html-to-markdown

HTML to markdown converter

Language: Python - Size: 453 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 58 - Forks: 14

ChenghaoMou/text-dedup

All-in-one text de-duplication

Language: Python - Size: 5.77 MB - Last synced at: 1 day ago - Pushed at: about 2 months ago - Stars: 700 - Forks: 74

Taha5125/DocxWriter-JSON

DocxWriter is a Python library for generating professional Word documents from JSON. Automate reports, add tables, lists, images, and apply custom styles — all from clean, structured data.

Language: Python - Size: 23.4 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

thomaszilliox1/Automated-Consumer-Goods-Classification

This project is focused on segmenting e-commerce customers using unsupervised machine learning models, specifically clustering algorithms.

Language: Jupyter Notebook - Size: 8.81 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

olympus-terminal/unix-utilities

General-purpose UNIX/Linux command-line utilities

Language: Shell - Size: 25.4 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

digineo/texd

texd wraps TeX in a web API

Language: Go - Size: 1000 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 12 - Forks: 1

iarri/Shadertoy2GM

This JavaScript web app converts GLSL code from shadertoy.com to GameMaker GLSL ES and outputs the other code needed to run it.

Language: JavaScript - Size: 48.8 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 8 - Forks: 3

helix-editor/nucleo

A fast and convenient fuzzy matcher library for Rust

Language: Rust - Size: 232 KB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 1,138 - Forks: 41

CyberCRI/refinedoc

A Python library for post-extraction refinement of text, such as text derived from PDF extraction.

Language: Python - Size: 23.4 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 7 - Forks: 2

BurntSushi/aho-corasick

A fast implementation of Aho-Corasick in Rust.

Language: Rust - Size: 4.71 MB - Last synced at: 1 day ago - Pushed at: 10 months ago - Stars: 1,122 - Forks: 103

victoria217-bottino/google-news-scraper

📰 Google News Scraper: a Python tool to fetch, decode, and process Google News articles by keyword and time range. Extract clean article text, decode URLs, and perform NLP effortlessly. Perfect for news aggregation, analysis, or building bots. Includes progress tracking with `tqdm` and customizable features for advanced use cases. 🚀

Size: 1000 Bytes - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 3 - Forks: 1

ds-modules/CUNEIF-102A

UC Berkeley CUNEIF 102A (Sumerian Text Analysis) Fall 2017

Language: Jupyter Notebook - Size: 40.8 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 6 - Forks: 0

tamtural/premium-file-parser

This parser extracts key financial transaction info from fixed-width carrier-generated premium files and classifies transaction types based on receipt references and refund/chargeback indicators.

Language: Python - Size: 0 Bytes - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

12345far/metrics-calculation-precision-recall

Laboratory 7 - Information Retrieval

Size: 1.95 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

teenu/gpu-text-search

Ultra-high-performance GPU-accelerated text search using Metal compute shaders

Language: Swift - Size: 554 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 0

IG-onGit/TexeT

TexeT is the tool you need to take your interaction and content control to the next level.

Language: Python - Size: 117 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Darko-Martinovic/MeetingTranscriptProcessor

🤖 Intelligent meeting transcript processor that automatically extracts action items using Azure OpenAI and creates Jira tickets. Supports multiple file formats with fallback to rule-based processing when AI is unavailable.

Language: C# - Size: 188 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

DougLau/booky

A tool to analyze English text

Language: Rust - Size: 1.92 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

David-Langat/Information_Retrieval

An Information Retrieval system that processes and ranks news articles. It parses XML files, applies stop-word removal and stemming, and uses TF-IDF and BM25 algorithms to score documents against user queries, sorting them by relevance.

Language: Python - Size: 69.3 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

omicsNLP/Auto-CORPus

Auto-CORPus pipeline developed by a University of Nottingham and Imperial College London collaboration to standardize text and table data extracted from full text publications. See Open Access publication at: https://doi.org/10.3389/fdgth.2022.788124.

Language: HTML - Size: 57.1 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 21 - Forks: 8

linuxscout/pyarabic

pyarabic

Language: Python - Size: 1.23 MB - Last synced at: about 9 hours ago - Pushed at: over 1 year ago - Stars: 459 - Forks: 88

yuvrajpandiya/Piero-EnDe-Coder

A powerful encryption and decryption tool that combines the Vigenère cipher, XOR encryption, and Base64 encoding to secure messages. This tool allows users to encode and decode messages using a secret key, ensuring an extra layer of security.

Size: 1000 Bytes - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Automattic/go-search-replace

🚀 Search & replace URLs in WordPress SQL files.

Language: Go - Size: 104 KB - Last synced at: 2 days ago - Pushed at: 23 days ago - Stars: 98 - Forks: 19

pyparsing/pyparsing

Python library for creating PEG parsers

Language: Python - Size: 7.8 MB - Last synced at: 3 days ago - Pushed at: 9 days ago - Stars: 2,358 - Forks: 291

open-korean-text/open-korean-text

Open Korean Text Processor - An Open-source Korean Text Processor

Language: Scala - Size: 32.7 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 634 - Forks: 98

KaizoKonpaku/Hush

AI-Powered Screenshot, Audio Transcription, and Text Processing for macOS, Hidden from Screen Sharing, Packed with Features, and Just 2MB

Language: Swift - Size: 12 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 40 - Forks: 9

andalugeeks/andaluh-py

Transliterate español (Spanish) spelling to Andaluz proposals using Python

Language: Python - Size: 802 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 23 - Forks: 3

fossology/atarashi

Atarashi scans for license statements in open source software, focusing on text statistics. Designed to work stand-alone and with FOSSology.

Language: Python - Size: 46.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 29 - Forks: 29

alihoseiny/word_cloud_fa

A wrapper for wordcloud module for creating Persian word clouds.

Language: Python - Size: 1.76 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 145 - Forks: 13

PyThaiNLP/pythainlp

Thai natural language processing in Python

Language: Python - Size: 65.6 MB - Last synced at: 6 days ago - Pushed at: 12 days ago - Stars: 1,045 - Forks: 280

milliorn/cli-password-generators

Simple command-line applications for generating passwords

Language: Go - Size: 6.87 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 0

casics/nostril 📦

Nostril: Nonsense String Evaluator

Language: Python - Size: 143 MB - Last synced at: 4 days ago - Pushed at: about 3 years ago - Stars: 194 - Forks: 35

dataout-org/hate_crimes_2010_2023

Identifying hate crimes against LGBTQIA+ people in Russia in court rulings

Language: Jupyter Notebook - Size: 20.5 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

rhiosutoyo/Teaching-Deep-Learning-and-Its-Applications

This course introduces the building blocks of deep learning and provides an overview of various deep learning architectures. It also demonstrates how to solve real-world problems using a practical approach.

Language: Jupyter Notebook - Size: 30.7 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

fullscreen-triangle/kwasa-kwasa

Semantic computing framework with meta-cognitive orchestration and biomimetic principles

Language: Rust - Size: 9.93 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2 - Forks: 0

hyung-hwan/hawk

An AWK interpreter

Language: C - Size: 4.41 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 8 - Forks: 1

Willgnner-Santos/DPE-Legal-Doc-Classification-Pipeline

The results are drawn from experiments on the classification of legal documents using LLMs in a real-world institutional setting

Language: Jupyter Notebook - Size: 45.8 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

rhaberkorn/sciteco

Advanced TECO dialect and interactive screen editor based on Scintilla

Language: C - Size: 3.61 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 51 - Forks: 6

Sumit-807/newsnow

NewsNow offers a clean and elegant interface for reading real-time trending news. 🌐 Dive into the latest updates and enjoy seamless access with GitHub OAuth integration! 🐙

Language: TypeScript - Size: 4.55 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

homeofhx/Text-Purifier

Simple Mac application that filters out specific characters in given text using regular expression (Regex)

Language: Swift - Size: 1.14 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

alirezatheh/perke

A keyphrase extractor for Persian

Language: Python - Size: 143 KB - Last synced at: 3 days ago - Pushed at: 19 days ago - Stars: 69 - Forks: 8

theveryhim/Frequent-item-sets-And-LSH

An exercise in finding frequent item sets and similar items with the PySpark framework

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

davidavidnitish/CoreTex

Discover the CORTEX Anomaly Detection app with real-time AI and facial recognition. Explore its cyberpunk interface and advanced features. 🌐💻

Size: 1000 Bytes - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

kupolak/textstat

Ruby gem to calculate statistics from text to determine readability, complexity and grade level of a particular corpus.

Language: Ruby - Size: 242 KB - Last synced at: 7 days ago - Pushed at: 12 months ago - Stars: 34 - Forks: 10

maqeel019/ATS

A powerful Python-based ATS that parses and ranks PDF resumes on recruiter-defined filters like skills, education, and experience. Handles scanned and complex resumes with detailed scoring and Excel output.

Language: Python - Size: 6.77 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

sstadick/hck

A sharp cut(1) clone.

Language: Rust - Size: 494 KB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 715 - Forks: 18

derek73/python-nameparser

A simple Python module for parsing human names into their individual components

Language: Python - Size: 778 KB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 675 - Forks: 105

twardoch/split-markdown4gpt

A Python tool for splitting large Markdown files into smaller sections based on a specified token limit. This is particularly useful for processing large Markdown files with GPT models, as it allows the models to handle the data in manageable chunks.

Language: Python - Size: 78.1 KB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 24 - Forks: 2

ProfRandom/Excel-Lambda-Suite

Reusable Excel LAMBDA function library for modeling, simulation, statistics, and advanced spreadsheet design.

Size: 2.02 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

nilskruthoff/pptx-parser

Parses PowerPoint presentations into Markdown syntax

Language: Rust - Size: 145 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

google/diff-match-patch 📦

Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.

Language: Python - Size: 659 KB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 7,804 - Forks: 1,144

notesjor/corpusexplorer2.0

Corpus linguistics has never been so easy...

Language: C# - Size: 32.5 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 23 - Forks: 3

EDeev/chatping_abobot

A multifunctional Telegram bot for group management with activity analytics, smart mentions, and interactive features

Language: Python - Size: 3.24 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

dnafication/llm-textfix

Sanitize LLM output by detecting and replacing 25+ problematic characters

Language: TypeScript - Size: 0 Bytes - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

ZeroX-DG/vi-rs

Vietnamese Input Method library

Language: Rust - Size: 385 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 152 - Forks: 15

bugbundle/texdora

A unique Docker image to build LaTeX documentation.

Language: Dockerfile - Size: 226 KB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 2 - Forks: 0

paul-j-lucas/wrap

Text reformatter better than fmt(1) or fold(1).

Language: C - Size: 2.94 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 16 - Forks: 4

voidful/TFkit

🤖📇 Handling multiple NLP tasks in one pipeline

Language: Python - Size: 15.9 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 56 - Forks: 6

Arash-Mansourpour/MultiAgent-Chain-of-Expert

MultiAgent Chain of Expert: A Python app using Groq API for dual-model text processing. Gemma analyzes, LLaMA responds, with a modern tkinter GUI. Features history tracking, file I/O, and customizable AI settings. Secure API key handling via .env. MIT License.

Language: Python - Size: 0 Bytes - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

bocaletto-luca/TextEditorQt

This program is a simple text editor with an intuitive user interface, created using the PyQt5 framework for developing desktop applications in Python. The text editor provides many basic features expected from an editor, along with advanced functionalities such as text formatting.

Language: Python - Size: 34.2 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 8 - Forks: 2

blueheron786/line-by-line-quran

Scrapes the Qur'an text, from quran.com, and generates one page per file, with one line per line of the mushaf. This is the "15 line mushaf" which is also known as the Uthmani and Madini mushaf.

Language: Python - Size: 3.91 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

Hasnat-Aarif-Aslam/NLP-Foundation-Tokens-Ngrams-BoW-TF-IDF-TFIDF

Comprehensive guide to text preprocessing and vectorization techniques for NLP, covering tokenization, n-grams, Bag-of-Words, TF-IDF, and related feature-engineering methods.

Size: 2.93 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

himkt/konoha

🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.

Language: Python - Size: 1.35 MB - Last synced at: 13 days ago - Pushed at: 2 months ago - Stars: 251 - Forks: 28

Moez-lab/parallel-keyword-scanner

High-performance keyword scanner for text and PDF files with multiprocessing and a modern React UI.

Language: TypeScript - Size: 80.1 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

samwega/obsidian-wordsmith Fork of chrisgrieser/obsidian-proofreader

AI-powered context-aware writing assistant for Obsidian. Instantly improve, translate, or generate new text with context-aware AI inline suggestions, custom prompts, and granular review. Supports ALL remote and local models. Enjoy a seamless, keyboard-first workflow for editing, refining, and creative writing—all within your notes.

Language: TypeScript - Size: 986 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

rlayers/pawpaw

Text Processing & Segmentation Framework

Language: Python - Size: 2.52 MB - Last synced at: 15 days ago - Pushed at: 4 months ago - Stars: 23 - Forks: 4

proycon/pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

Language: Python - Size: 12.8 MB - Last synced at: 10 days ago - Pushed at: almost 2 years ago - Stars: 475 - Forks: 68

alexandersisco/kubun

Python-style slicing for paths and delimiter-separated strings, from your terminal.

Language: Go - Size: 29.3 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

arverma/HindiXlit Fork of AI4Bharat/IndicXlit

Transliteration models for Roman-to-Devanagari conversion

Language: Python - Size: 45.8 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

twardoch/wiktra2 Fork of kbatsuren/wiktra

Wiktra: transliteration tool using Wiktionary transliteration modules. Version 2 (fork)

Language: Lua - Size: 1.29 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 4 - Forks: 0

Mukeshthenraj/date-extraction-project

Extract and normalize dates from unstructured medical notes using Python and regular expressions.

Language: Python - Size: 40 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

loderunner/typelit

A type-safe string templating library for TypeScript

Language: TypeScript - Size: 381 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 1 - Forks: 1

dewanakl/aman

🤬 A simple profanity filter using regex. Check, censor, and remove offensive words with similar-character patterns.

Language: PHP - Size: 85 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 2 - Forks: 2

shama-llama/pdf-epub-converter

PDF to EPUB conversion using ML for layout detection

Language: Python - Size: 140 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

binsarjr/chatbot-indonesia

A dataset for Indonesian-language chatbots, with simple chatbot code written in TypeScript

Language: TypeScript - Size: 559 KB - Last synced at: 10 days ago - Pushed at: almost 2 years ago - Stars: 35 - Forks: 12

sunsided/merge-whitespace-rs

Procedural macros for merging whitespace in const contexts

Language: Rust - Size: 101 KB - Last synced at: 10 days ago - Pushed at: 18 days ago - Stars: 1 - Forks: 0

weiwei/silabacion

Convert Spanish words into syllables

Language: TypeScript - Size: 1.62 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 8 - Forks: 0

hakatashi/japanese.js

Util collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.

Language: JavaScript - Size: 283 KB - Last synced at: 12 days ago - Pushed at: almost 5 years ago - Stars: 168 - Forks: 3

znwang25/fuzzychinese

A small package to fuzzy match Chinese words

Language: Python - Size: 1.81 MB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 88 - Forks: 10

Puchaczov/Musoq

SQL Syntax without any database

Language: C# - Size: 15.7 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 482 - Forks: 21

YULINHEEE/NLP-text-preprocessing-and-classification

Starter code to solve real-world text data problems related to job advertisements. Includes: Word2Vec, phrase embeddings, Text Classification with Logistic Regression, simple text preprocessing, pre-trained embeddings and more.

Language: Jupyter Notebook - Size: 1.21 MB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

shohanur-shoron/bangla_normalizer

A Python library designed to convert various written forms of Bengali text elements (like numbers, dates, times, currency, percentages, distances, etc.) into their corresponding spoken word representations.

Language: Python - Size: 96.7 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

bithead21/parcel

Parser for C++ programs! Parcel is a simple language for parsing text information and retrieving any data.

Language: C++ - Size: 1.2 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 2 - Forks: 0

guillaumeast/mentorai

Turn any YouTube channel into a full Custom GPT (avatar, settings, transcripts)

Language: Shell - Size: 61.5 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

sdleffler/qp-trie-rs

An idiomatic and fast QP-trie implementation in pure Rust.

Language: Rust - Size: 80.1 KB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 101 - Forks: 25

mary-lev/mary-lev.github.io

Just another blog

Language: HTML - Size: 19.9 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

phil65/docler

Abstractions & Tools for OCR / document processing

Language: Python - Size: 2.28 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 2 - Forks: 0

brothersincode/virastar

Cleaning-up Persian Texts!

Language: JavaScript - Size: 1.3 MB - Last synced at: 1 day ago - Pushed at: 2 months ago - Stars: 138 - Forks: 15

hasinhayder/javascript-text-expander

Expands texts as you type, naturally

Language: JavaScript - Size: 12.7 KB - Last synced at: 10 days ago - Pushed at: almost 2 years ago - Stars: 67 - Forks: 19

Lord-Memester/tagger-txt-to-XMP

A Python script to convert the .txt files generated by an automatic tagger plugin for Automatic1111's Stable Diffusion Web UI into XMP sidecar files interpretable by Immich.

Language: Python - Size: 105 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

LunarisApp/text-tools

A collection of text processing tools

Language: TypeScript - Size: 2.58 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

AlanSteinbarth/Audio2Tekst

A professional audio-to-text converter using OpenAI Whisper. Supports batch processing and export to various formats (TXT, DOCX, PDF). GUI with drag-and-drop, progress tracking, and transcription quality configuration options. Ideal for journalists, students, and content creators.

Language: Python - Size: 3.47 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

DineshDhamodharan24/Data_Science_Final_Project

Customer Insights & Recommendation System: Harnessing Decision Tree, Logistic Regression, and Random Forest models for behavior analysis. Utilizing EasyOCR and Python Imaging Library for image information extraction. Employing NLTK for sentiment analysis on textual data

Language: Jupyter Notebook - Size: 21.1 MB - Last synced at: 17 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

MonikaBarget/atr-historical-research

Automated Text Recognition in Historical Research

Language: Jupyter Notebook - Size: 2.92 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 14

Romelium/mpatch

A fuzzy patch tool in Rust for applying AI-generated diffs from markdown, ignoring line numbers.

Language: Rust - Size: 0 Bytes - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0