GitHub topics: chunker
chonkie-inc/chonkie
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense chunking library
Language: Python - Size: 789 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 260 - Forks: 14

drittich/SemanticSlicer
A recursive text chunker that attempts to break the text on meaningful boundaries.
Language: C# - Size: 73.2 KB - Last synced at: 1 day ago - Pushed at: 24 days ago - Stars: 20 - Forks: 1

ATOMIC09/chunker-batch-converter
A GUI for batch conversion using the Chunker CLI
Language: Python - Size: 1.76 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

Dadmatech/DadmaTools
DadmaTools is a Persian NLP toolkit developed by Dadmatech Co.
Language: Python - Size: 92.6 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 191 - Forks: 42

avineshpvs/indic_tagger
Indian Language Tagger and Chunker (Hindi, Telugu, Tamil, Marathi, Punjabi, Kannada, Malayalam, Urdu, Bengali)
Language: Python - Size: 305 MB - Last synced at: 9 months ago - Pushed at: about 2 years ago - Stars: 40 - Forks: 13

macarie/trancio
Lazily split an array into chunks, just like slices of pizza 🍕
Language: TypeScript - Size: 1.16 MB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

mohsenim/persianp
A Processing Toolbox for Persian Texts
Language: Java - Size: 61.3 MB - Last synced at: 27 days ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

madhurimarawat/Natural-Language-Processing-in-Python
This repository contains Natural Language Processing programs in the Python programming language.
Language: Jupyter Notebook - Size: 47.9 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Christoph-Beckmann/youtube-transcript-splitter
The primary purpose of this tool is to make it easier to input long YouTube transcripts into ChatGPT by splitting them into smaller chunks. Additionally, prompt-engineering techniques have been incorporated to improve the quality of the output.
Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1

pythoncircus/splendid
A collection of useful small Python helpers
Language: Python - Size: 33.2 KB - Last synced at: 24 days ago - Pushed at: almost 8 years ago - Stars: 4 - Forks: 0

Gozala/rabin-wasm (fork of hugomrdias/rabin-wasm)
Rabin IPFS chunker in Rust/Wasm
Language: JavaScript - Size: 11.2 MB - Last synced at: 5 days ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

gitlost-murali/POS-Tagging-using-Neural-Network-Models
Contains implementations of models such as BiLSTM-CRF and hierarchical BiLSTM for POS tagging.
Language: Python - Size: 40 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

YuriyBereguliak/QAGenerator
A simple project for building custom training data and models for the Apache OpenNLP library. The main task is to recognize Ukrainian text and generate helpful questions and theses.
Language: Kotlin - Size: 1.85 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

M4t1ss/chunker
A sentence chunker PHP class + visualizer for Berkeley Parser parse trees
Language: PHP - Size: 23.5 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 0
