An open API service providing repository metadata for many open source software ecosystems.

Topic: "text-splitter"

chonkie-inc/chonkie

🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library

Language: Jupyter Notebook - Size: 1.27 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 846 - Forks: 44

jupediaz/chatgpt-prompt-splitter

ChatGPT PROMPTs Splitter. Tool for safely processing chunks of up to 15,000 characters per request

Language: Python - Size: 2.69 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 435 - Forks: 74

mirth/chonky

Fully neural approach for text chunking

Language: Python - Size: 34.2 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 331 - Forks: 10

jparkerweb/semantic-chunking

🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows

Language: JavaScript - Size: 8.81 MB - Last synced at: 16 days ago - Pushed at: 3 months ago - Stars: 88 - Forks: 6

sentencizer/sentencizer

A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.

Language: Go - Size: 1.83 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 6

loganliffick/react-spltjs

SpltJS for React

Language: TypeScript - Size: 21.6 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 0

thrivewithai/NinjaSearchWithHumanGPT

A human-in-the-loop agent that can search the web for information while bypassing bot detection for private sites.

Language: Python - Size: 659 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 2

thrivewithai/langchain-fixie-marvin

We compared LangChain, Fixie, and Marvin

Language: Python - Size: 21.5 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

PabloSanchi/jchunk

JChunk is a lightweight and flexible library designed to provide multiple strategies for text chunking within Spring Boot applications

Language: Java - Size: 243 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

LuisHBeck/genAI-article-research

Generative AI project using LangChain for similarity search. Input 3 article URLs and ask something about the topic

Language: Python - Size: 338 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

betcorg/llm-text-splitter (fork of golbin/llm-chunk)

A lightweight TypeScript text splitter for RAG applications

Language: TypeScript - Size: 180 KB - Last synced at: about 3 hours ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

sushant1827/RAG_with_LangChain

Leveraging LangChain for a RAG (Retrieval-Augmented Generation) project, this implementation enables efficient querying across multiple books, enhancing data retrieval and natural language generation for context-rich answers.

Language: Python - Size: 2.71 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

philnash/chunkers

An exploration of text splitting and chunking in JavaScript

Language: TypeScript - Size: 15.5 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

mich2k/Text-Splitter

Lets you upload text files over 100 MB to GitHub

Language: Python - Size: 53.7 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

SayamAlt/Langchain-with-Python-Bootcamp

This repository contains all the code materials from Jose Portilla's Langchain with Python Bootcamp on Udemy.

Language: Jupyter Notebook - Size: 15.4 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

959kissfm/AI-Powered-Research-Assistant-using-Langchain

An LLM application that generates a summary, a list of citations and references, and a response to a user's query based on the research paper's content.

Language: Python - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

SayamAlt/AI-Powered-Research-Assistant-using-Langchain

An LLM application that generates a summary, a list of citations and references, and a response to a user's query based on the research paper's content.

Language: Python - Size: 7.81 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

pranav-kural/ledaa-web-scrapper

Web scraper to scrape and prepare data for ingestion into the RAG pipeline of the LEDAA project.

Language: Python - Size: 19.5 KB - Last synced at: about 21 hours ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

skitsanos/streamlit-split-text

Text splitting example using Tiktoken

Language: Python - Size: 4.88 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

samliebl/word-matching

Matching strings between lists based on length

Language: JavaScript - Size: 1.95 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

ennanuel/large-text-splitter

A text splitter that helps handle character limits in ChatGPT, Gemini, and other text-based AI tools without reducing the characters in the text.

Language: JavaScript - Size: 55.7 KB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

jempe/text_splitter

Text Splitter: a Go library for splitting strings into smaller chunks based on specified lengths and optional delimiters.

Language: Go - Size: 11.7 KB - Last synced at: about 17 hours ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

saadtariq-ds/langchain_chat_with_data

Dive into LangChain, a powerful platform that lets you interact with your data like never before. This guide offers insights on its unique capabilities, helping you tap into your data in conversational ways.

Language: Jupyter Notebook - Size: 2.26 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ZarTek-Creole/tcl-eggdrop-textsplitter

Tcl script for Eggdrop on IRC that splits text into blocks of a specified length. It respects IRC formatting codes and makes IRC messages easier to manage and manipulate.

Language: Tcl - Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0