An open API service providing repository metadata for many open source software ecosystems.

Topic: "text-splitting"

isaacus-dev/semchunk

A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.

Language: Python - Size: 117 KB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 284 - Forks: 16

sentencizer/sentencizer

A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.

Language: Go - Size: 1.83 MB - Last synced at: 6 days ago - Pushed at: 22 days ago - Stars: 31 - Forks: 6

jparkerweb/semantic-chunking

🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows

Language: JavaScript - Size: 107 KB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 15 - Forks: 0

PabloSanchi/jchunk

JChunk is a lightweight and flexible library designed to provide multiple strategies for text chunking within Spring Boot applications

Language: Java - Size: 243 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

HemalDholakiya12/PDFChat

A web app that allows users to upload PDFs and interact with them through a Q&A interface. The application extracts text from PDFs, generates embeddings, stores them in a FAISS database, and retrieves relevant information to provide context-aware answers using a large language model.

Language: JavaScript - Size: 119 KB - Last synced at: 3 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

philnash/chunkers

An exploration of text splitting and chunking in JavaScript

Language: TypeScript - Size: 15.5 MB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

shikhar13012001/research-papers-QA-langchain-pinecone

This is an experiment in learning langchain, pinecone and stuff, don't mind

Language: TypeScript - Size: 23.2 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

pranav-kural/ledaa-text-splitter

Specialized markdown text splitter - part of LEDAA project's data ingestion pipeline for RAG.

Language: Python - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

skitsanos/streamlit-split-text

Text splitting example using Tiktoken

Language: Python - Size: 4.88 KB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

samliebl/word-matching

Matching strings between lists based on length

Language: JavaScript - Size: 1.95 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Shuvob4/LangChain-Tutorial

LangChain is a framework that makes it easy to build applications on top of available large language models.

Language: Jupyter Notebook - Size: 869 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0