An open API service providing repository metadata for many open source software ecosystems.

Topic: "text-chunking"

isaacus-dev/semchunk

A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.

Language: Python - Size: 117 KB - Last synced at: 14 days ago - Pushed at: about 1 month ago - Stars: 284 - Forks: 16

lazyFrogLOL/llmdocparser

A package for parsing PDFs and analyzing their content using LLMs.

Language: Python - Size: 1.21 MB - Last synced at: 27 days ago - Pushed at: 9 months ago - Stars: 268 - Forks: 8

drittich/SemanticSlicer

A recursive text chunker that attempts to break the text on meaningful boundaries.

Language: C# - Size: 73.2 KB - Last synced at: 2 days ago - Pushed at: 25 days ago - Stars: 20 - Forks: 1

jparkerweb/semantic-chunking

🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows

Language: JavaScript - Size: 107 KB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 15 - Forks: 0

ChenTaHung/HTML-Text-Parser

This project extracts text from documents and prepares it for processing by Large Language Models (LLMs). It stores and uses text style information, which lets the program identify and segment content based on potential headers and titles.

Language: HTML - Size: 18.7 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 6 - Forks: 1

betcorg/llm-text-splitter (fork of golbin/llm-chunk)

A lightweight TypeScript text splitter for RAG applications

Language: TypeScript - Size: 180 KB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

philnash/chunkers

An exploration of text splitting and chunking in JavaScript

Language: TypeScript - Size: 15.5 MB - Last synced at: 22 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

Besthope-Official/predoc

Document preprocessing service for RAG (Retrieval-Augmented Generation)

Language: Python - Size: 18.6 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

adityapathak-cubastion/cubastion-hr-chatbot

Cubastion's HR chatbot answers queries based on the latest HR documents published by Cubastion's HR team. This saves time by letting a Cubastion employee resolve a query without combing through the documents themselves. Developed with Python, sentence-transformers, Pinecone, llama3.2, and Streamlit.

Language: Python - Size: 33.4 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 1