GitHub topics: tiktoken
baselinerepo/LLMs
Understanding Large Language Models
Language: CSS - Size: 34.3 MB - Last synced at: about 10 hours ago - Pushed at: about 11 hours ago - Stars: 0 - Forks: 0

SameerManan/rs-bpe
A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust
Language: Python - Size: 2.44 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 1

CNSeniorious000/free-chat
An elegant LLM chat UI forked from chatgpt-demo of @anse-app. Index site at https://free-chat.asia
Language: Svelte - Size: 3.63 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 182 - Forks: 54

jill64/cf-tiktoken
⏳ js-tiktoken on Cloudflare Pages
Language: TypeScript - Size: 345 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 1

ryhkml/ytingest
Extract YouTube video, feed it to any LLM as knowledge
Language: C - Size: 39.1 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

jaco-bro/MLX.zig
MLX.zig: Phi-4, Llama 3.2, and Whisper in Zig
Language: Zig - Size: 5.18 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 14 - Forks: 1

dqbd/tiktokenizer
Online playground for OpenAI tokenizers
Language: TypeScript - Size: 709 KB - Last synced at: 8 days ago - Pushed at: 2 months ago - Stars: 1,104 - Forks: 129

jimmc414/onefilellm
Specify a GitHub or local repo, GitHub pull request, arXiv or Sci-Hub paper, YouTube transcript, or documentation URL on the web, and scrape it into a text file and the clipboard for easier LLM ingestion
Language: Python - Size: 538 KB - Last synced at: 9 days ago - Pushed at: 24 days ago - Stars: 1,073 - Forks: 100

annnieglez/genai-travel-guide
This project is an AI-powered chatbot that provides real-time travel advice about Iceland. It utilizes Retrieval-Augmented Generation (RAG) by storing document embeddings in ChromaDB and retrieving relevant information to generate responses using a Large Language Model (LLM).
Language: Jupyter Notebook - Size: 117 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

stefanpietrusky/IECV1.5
Repository for the article in the online magazine Level Up Coding
Language: Python - Size: 11.5 MB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

AnuritiGupta26/ResearchMate-
Research Mate is an end-to-end LLM application built with LangChain, designed to help researchers, students, and professionals efficiently process and extract insights from research articles and online content. Users can input multiple research URLs, which the app processes and converts into useful information.
Language: Python - Size: 45.9 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

izikeros/count_tokens
Count tokens in a text file.
Language: Python - Size: 104 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 6 - Forks: 0

gweidart/rs-bpe
A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust
Language: Python - Size: 2.47 MB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

unitythemaker/tokdu
tokdu (Token Disk Usage) is a terminal-based utility that helps you analyze and visualize token usage in your codebase. Similar to the classic du (disk usage) command, tokdu shows you how many tokens your files and directories consume, which is essential when working with Large Language Models (LLMs) that have token limits.
Language: Python - Size: 2.48 MB - Last synced at: 4 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 0

xp-forge/openai
OpenAI APIs for XP Framework
Language: PHP - Size: 168 KB - Last synced at: 10 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

sewenew/tokenizer
C++ implementation of tokenizers, including tiktoken.
Language: C++ - Size: 761 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 2

tryAGI/Tiktoken
This project implements token calculation for OpenAI's gpt-4 and gpt-3.5-turbo models, specifically using the `cl100k_base` encoding.
Language: C# - Size: 3.82 MB - Last synced at: 14 days ago - Pushed at: 15 days ago - Stars: 75 - Forks: 4

johannschopplich/tokenx
📐 GPT token estimation and context size utilities without a full tokenizer
Language: TypeScript - Size: 353 KB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 21 - Forks: 1

openshieldai/openshield
OpenShield is a new generation security layer for AI models
Language: Go - Size: 2.26 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 74 - Forks: 7

cahya-wirawan/rwkv-tokenizer
A fast RWKV Tokenizer written in Rust
Language: Jupyter Notebook - Size: 1.9 MB - Last synced at: 15 days ago - Pushed at: 22 days ago - Stars: 44 - Forks: 3

tural00a1568/llm-chat-indexer
The LLM Chat Indexer is a clever tool designed to transform chaotic chat files into organized, searchable insights—ideal for anyone overwhelmed by digital conversations.
Language: Python - Size: 2.15 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

viniciusmecosta/CountTokensPython
Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: 20 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

marcelovicentegc/counttokens
Yet another CLI tool for counting tokens in text datasets using tiktoken.
Language: Python - Size: 18.6 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

shivendrra/shredword
Fast & efficient BPE tokenizer written in C & Python for LLM training
Language: C++ - Size: 14.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

pkoukk/tiktoken-go
Go version of tiktoken
Language: Go - Size: 1.15 MB - Last synced at: 28 days ago - Pushed at: 11 months ago - Stars: 738 - Forks: 87

madhurajayashanka/ai-travel-assistant-langchain
AI Travel Assistant uses Python, OpenAI API, Streamlit, SQLite & LangChain to generate smart, personalized travel itineraries.
Language: Python - Size: 0 Bytes - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

ElmiraGhorbani/chatgpt-long-term-memory
The ChatGPT Long Term Memory package is a powerful tool designed to empower your projects with the ability to handle a large number of simultaneous users and external sources.
Language: Python - Size: 43.9 KB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 56 - Forks: 2

rogerchang1108/OpenAI-API-Token-Counter-with-Tiktoken
This project harnesses the power of Tiktoken and the OpenAI API to create a Python Streamlit web application with a primary focus on token counting and price estimation.
Language: Python - Size: 278 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

aallam/ktoken
Kotlin multiplatform BPE tokenizer library for OpenAI models
Language: Kotlin - Size: 10.7 MB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 30 - Forks: 2

howdymic/tiktoken-server
Docker container to expose the OpenAI tokenizer as a REST service
Language: Python - Size: 4.88 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 8

kgruiz/PyTokenCounter
A simple Python library for tokenizing text and counting tokens. While currently only supporting OpenAI LLMs, it helps with text processing and managing token limits in AI applications.
Language: Python - Size: 433 KB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

hupe1980/go-tiktoken
✂️ OpenAI's tiktoken tokenizer written in Go
Language: Go - Size: 3.51 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 18 - Forks: 1

danny50610/bpe-tokeniser
PHP port of most of openai/tiktoken
Language: PHP - Size: 2.73 MB - Last synced at: 8 days ago - Pushed at: 8 months ago - Stars: 9 - Forks: 0

403errors/TubeQuery
TubeQuery is an LLM-based tool that answers queries about your video. Just input the video link; all questions are welcome!
Language: Python - Size: 557 KB - Last synced at: 15 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

akshay-kamath/personal-projects
Projects I made while self-learning.
Language: Jupyter Notebook - Size: 8.26 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

SandaruRF/AI-Voting-Assistant-Predictor-Comparator-Chatbot
Language: Jupyter Notebook - Size: 100 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

ReshiAdavan/Thoth
Industry-standard tokenizer designed for large-scale language models (GPT, Claude, Llama, etc.)
Language: Python - Size: 2.96 MB - Last synced at: 4 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

RahulDey12/tiktoken-php
A PHP implementation of OpenAI's BPE tokenizer tiktoken.
Language: PHP - Size: 95.7 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 7 - Forks: 0

oelmekki/tiktoken-cli
Simple wrapper around tiktoken to use it in your favorite language.
Language: Python - Size: 6.84 KB - Last synced at: 14 days ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 3

kojix2/tiktoken-c
C API for tiktoken-rs
Language: Rust - Size: 42 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 9 - Forks: 1

vstrickl/gpt-token-counter
Simple Python script to help me parse an entire video podcast transcript and create prompts for ChatGPT within its 4000-token limit.
Language: Python - Size: 1.64 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

maledorak/single-token-words
List of single token words for LLM usage
Language: Python - Size: 2.32 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

skitsanos/streamlit-split-text
Text splitting example using Tiktoken
Language: Python - Size: 4.88 KB - Last synced at: 17 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

farithadnan/DatasetForge
Extracts Google Sheets to JSONL for fine-tuning, estimates task costs with tiktoken.
Language: Python - Size: 50.8 KB - Last synced at: 16 days ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

P1ayer-1/ChatGPT-Web-vs-API-pricing
Count tokens to determine cost differential of ChatGPT Plus subscription and ChatGPT API
Language: Python - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

rubenselander/openai-function-tokens
Predict the exact openai token usage of functions
Language: Python - Size: 23.4 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 1

gcondeh/Tokens
Small utilities for counting tokens and cutting text strings
Language: Python - Size: 21.5 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

augustoomb/projeto-ia-langchain
Uses the LangChain framework for an API that answers questions based on documents (RAG)
Language: Python - Size: 35.2 KB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

dbtreasure/zig-bpe
Byte Pair Encoding (BPE) in the Zig programming language (0.13.0)
Language: Zig - Size: 1.84 MB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

schneiderfelipe/chat-splitter
Split chat messages by maximum chat completion token count
Language: Rust - Size: 21.5 KB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

wonyoung-jang/logseq-tokenizer
Logseq Markdown Tokenizer is a Python application that tokenizes and estimates prices for one to many markdown files.
Language: Python - Size: 176 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

flexchar/tiktoken-counter
Tiktoken-counter as standalone API
Language: Python - Size: 2.93 KB - Last synced at: 4 days ago - Pushed at: 8 months ago - Stars: 2 - Forks: 1

guanhui07/tiktoken-php Fork of yethee/tiktoken-php
This is a port of the tiktoken
Language: PHP - Size: 2.71 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

kojix2/tiktoken-cr
Tiktoken for Crystalists
Language: Crystal - Size: 137 KB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

biomchen/doc-split-estimator
Language: Python - Size: 13.7 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

haha-systems/toll
The OpenAI tiktoken library as a service. For counting the number of tokens in a message to an LLM like GPT.
Language: Python - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

tobi303x/Web-crawling-tool-for-NGO
Web crawling GUI app for non-governmental events, originally created in 2023 as a university project. Enjoy!
Language: Python - Size: 166 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

saschaschramm/chatgpt
Analysis of OpenAI's ChatGPT
Language: Jupyter Notebook - Size: 413 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 149 - Forks: 12

maxim-saplin/tiktoken-bench
Comparing OpenAI tokeniser (tiktoken) performance - stock Python/Rust vs JS/WASM
Language: Dart - Size: 5.37 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

JacobLinCool/Tiktoken-Calculator
Calculate the token count for GPT-4, GPT-3.5, GPT-3, and GPT-2.
Language: Python - Size: 68.4 KB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

GustavoHBraga/RecommendationsAI
This project is an application that uses generative AI to send product recommendations based on each customer's purchase profile
Language: Python - Size: 32.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

phukon/temporal-traverse 📦
Console-based game built on an LLM
Language: Python - Size: 2.93 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

w95/tiktoken
The Tiktoken API is a tool that enables developers to calculate the token usage of their OpenAI API requests before sending them, allowing for more efficient use of tokens.
Language: Python - Size: 2.93 KB - Last synced at: 12 months ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 0

krasnoturinsk/telegram_bot_support_auchan Fork of RedAlexDad/telegram_bot_support_auchan
A Telegram bot created during a hackathon to advise customers in the customer service area of the Auchan store
Language: Python - Size: 20.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mthli/tiktoken-android Fork of eisber/tiktoken
Run OpenAI tiktoken on Android 😃
Size: 70.3 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

mytechnotalent/kgpt
A custom GPT based on [Zero To Hero](https://karpathy.ai/zero-to-hero.html) utilizing tiktoken with the intent to augment AI Transformer-model education and reverse engineer GPT models from scratch.
Language: Python - Size: 1.17 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

cameronk/token-counter
Wraps @dqbd/tiktoken to count the number of tokens used by various OpenAI models.
Language: TypeScript - Size: 791 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

b0o/tiktoken-bench
A small Node.js benchmark suite for the tiktoken WASM port.
Language: TypeScript - Size: 12.7 KB - Last synced at: 19 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

functorism/gpt4-tokenizer-visualizer
GPT4 Tokenizer Visualizer
Language: TypeScript - Size: 10.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

peterheb/gotoken
Gotoken is a pure-Go implementation of the Python library openai/tiktoken.
Language: Go - Size: 6.36 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0
