Topic: "text-splitter"
chonkie-inc/chonkie
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Language: Jupyter Notebook - Size: 1.27 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 846 - Forks: 44

jupediaz/chatgpt-prompt-splitter
ChatGPT PROMPTs Splitter. Tool for safely processing chunks of up to 15,000 characters per request
Language: Python - Size: 2.69 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 435 - Forks: 74

mirth/chonky
Fully neural approach for text chunking
Language: Python - Size: 34.2 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 331 - Forks: 10

jparkerweb/semantic-chunking
🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows
Language: JavaScript - Size: 8.81 MB - Last synced at: 16 days ago - Pushed at: 3 months ago - Stars: 88 - Forks: 6

sentencizer/sentencizer
A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.
Language: Go - Size: 1.83 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 6

loganliffick/react-spltjs
SpltJS for React
Language: TypeScript - Size: 21.6 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 0

thrivewithai/NinjaSearchWithHumanGPT
An agent with human in the loop that can search the web for information while bypassing bot detection for private sites.
Language: Python - Size: 659 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 2

thrivewithai/langchain-fixie-marvin
We compared LangChain, Fixie, and Marvin
Language: Python - Size: 21.5 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

PabloSanchi/jchunk
JChunk is a lightweight and flexible library designed to provide multiple strategies for text chunking within Spring Boot applications
Language: Java - Size: 243 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

LuisHBeck/genAI-article-research
Generative AI project using LangChain for similarity search. Input three article URLs and ask a question about the topic.
Language: Python - Size: 338 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

betcorg/llm-text-splitter (fork of golbin/llm-chunk)
A lightweight TypeScript text splitter for RAG applications
Language: TypeScript - Size: 180 KB - Last synced at: about 3 hours ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

sushant1827/RAG_with_LangChain
A RAG (Retrieval-Augmented Generation) project built with LangChain. It supports querying across multiple books, combining retrieval with natural language generation to produce context-rich answers.
Language: Python - Size: 2.71 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

philnash/chunkers
An exploration of text splitting and chunking in JavaScript
Language: TypeScript - Size: 15.5 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

mich2k/Text-Splitter
Allows you to upload text files over 100 MB to GitHub
Language: Python - Size: 53.7 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

SayamAlt/Langchain-with-Python-Bootcamp
This repository contains all the code materials from Jose Portilla's LangChain with Python Bootcamp on Udemy.
Language: Jupyter Notebook - Size: 15.4 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

959kissfm/AI-Powered-Research-Assistant-using-Langchain
An LLM application that generates a summary, a list of citations and references, and a response to a user's query based on the research paper's content.
Language: Python - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

SayamAlt/AI-Powered-Research-Assistant-using-Langchain
An LLM application that generates a summary, a list of citations and references, and a response to a user's query based on the research paper's content.
Language: Python - Size: 7.81 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

pranav-kural/ledaa-web-scrapper
Web scraper that scrapes and prepares data for ingestion into the RAG pipeline of the LEDAA project.
Language: Python - Size: 19.5 KB - Last synced at: about 21 hours ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

skitsanos/streamlit-split-text
Text splitting example using Tiktoken
Language: Python - Size: 4.88 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

samliebl/word-matching
Matching strings between lists based on length
Language: JavaScript - Size: 1.95 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

ennanuel/large-text-splitter
A text splitter that helps handle character limits in ChatGPT, Gemini, and other text-based AI tools without shortening the text.
Language: JavaScript - Size: 55.7 KB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

jempe/text_splitter
Text Splitter: a Go library for splitting strings into smaller chunks based on specified lengths and optional delimiters.
Language: Go - Size: 11.7 KB - Last synced at: about 17 hours ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

saadtariq-ds/langchain_chat_with_data
Dive into LangChain, a powerful platform that lets you interact with your data like never before. This guide offers insights on its unique capabilities, helping you tap into your data in conversational ways.
Language: Jupyter Notebook - Size: 2.26 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ZarTek-Creole/tcl-eggdrop-textsplitter
Tcl script for Eggdrop on IRC that splits text into blocks of a specified length. It respects IRC formatting codes and makes IRC messages easier to manage and manipulate.
Language: Tcl - Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0
