An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: tiktoken

baselinerepo/LLMs

Understanding Large Language Models

Language: CSS - Size: 34.3 MB - Last synced at: about 10 hours ago - Pushed at: about 11 hours ago - Stars: 0 - Forks: 0

SameerManan/rs-bpe

A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust

Language: Python - Size: 2.44 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 1

CNSeniorious000/free-chat

An elegant LLM chat UI forked from chatgpt-demo of @anse-app. Index site at https://free-chat.asia

Language: Svelte - Size: 3.63 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 182 - Forks: 54

jill64/cf-tiktoken

⏳ js-tiktoken on Cloudflare Pages

Language: TypeScript - Size: 345 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 1

ryhkml/ytingest

Extract YouTube video, feed it to any LLM as knowledge

Language: C - Size: 39.1 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

jaco-bro/MLX.zig

MLX.zig: Phi-4, Llama 3.2, and Whisper in Zig

Language: Zig - Size: 5.18 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 14 - Forks: 1

dqbd/tiktokenizer

Online playground for OpenAI tokenizers

Language: TypeScript - Size: 709 KB - Last synced at: 8 days ago - Pushed at: 2 months ago - Stars: 1,104 - Forks: 129

jimmc414/onefilellm

Specify a GitHub or local repo, GitHub pull request, arXiv or Sci-Hub paper, YouTube transcript, or documentation URL on the web, and scrape it into a text file and the clipboard for easier LLM ingestion

Language: Python - Size: 538 KB - Last synced at: 9 days ago - Pushed at: 24 days ago - Stars: 1,073 - Forks: 100

annnieglez/genai-travel-guide

This project is an AI-powered chatbot that provides real-time travel advice about Iceland. It utilizes Retrieval-Augmented Generation (RAG) by storing document embeddings in ChromaDB and retrieving relevant information to generate responses using a Large Language Model (LLM).

Language: Jupyter Notebook - Size: 117 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

stefanpietrusky/IECV1.5

Repository for the article in the online magazine Level Up Coding

Language: Python - Size: 11.5 MB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

AnuritiGupta26/ResearchMate-

Research Mate is an end-to-end LLM model using LangChain, designed to assist researchers, students, and professionals in efficiently processing and extracting insights from research articles and online content. Users can input multiple research URLs, which the app processes and converts into useful information.

Language: Python - Size: 45.9 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

izikeros/count_tokens

Count tokens in a text file.

Language: Python - Size: 104 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 6 - Forks: 0

gweidart/rs-bpe

A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust

Language: Python - Size: 2.47 MB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

unitythemaker/tokdu

tokdu (Token Disk Usage) is a terminal-based utility that helps you analyze and visualize token usage in your codebase. Similar to the classic du (disk usage) command, tokdu shows you how many tokens your files and directories consume, which is essential when working with Large Language Models (LLMs) that have token limits.

Language: Python - Size: 2.48 MB - Last synced at: 4 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 0

xp-forge/openai

OpenAI APIs for XP Framework

Language: PHP - Size: 168 KB - Last synced at: 10 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

sewenew/tokenizer

C++ implementation of tokenizers, including tiktoken.

Language: C++ - Size: 761 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 2

tryAGI/Tiktoken

This project implements token calculation for OpenAI's gpt-4 and gpt-3.5-turbo models, specifically using the `cl100k_base` encoding.

Language: C# - Size: 3.82 MB - Last synced at: 14 days ago - Pushed at: 15 days ago - Stars: 75 - Forks: 4

johannschopplich/tokenx

📐 GPT token estimation and context size utilities without a full tokenizer

Language: TypeScript - Size: 353 KB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 21 - Forks: 1

openshieldai/openshield

OpenShield is a new generation security layer for AI models

Language: Go - Size: 2.26 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 74 - Forks: 7

cahya-wirawan/rwkv-tokenizer

A fast RWKV Tokenizer written in Rust

Language: Jupyter Notebook - Size: 1.9 MB - Last synced at: 15 days ago - Pushed at: 22 days ago - Stars: 44 - Forks: 3

tural00a1568/llm-chat-indexer

The LLM Chat Indexer is a clever tool designed to transform chaotic chat files into organized, searchable insights—ideal for anyone overwhelmed by digital conversations.

Language: Python - Size: 2.15 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

viniciusmecosta/CountTokensPython

Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: 20 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

marcelovicentegc/counttokens

Yet another CLI tool for counting tokens in text datasets using tiktoken.

Language: Python - Size: 18.6 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

shivendrra/shredword

Fast & efficient BPE tokenizer written in C & Python for LLM training

Language: C++ - Size: 14.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

pkoukk/tiktoken-go

Go version of tiktoken

Language: Go - Size: 1.15 MB - Last synced at: 28 days ago - Pushed at: 11 months ago - Stars: 738 - Forks: 87

madhurajayashanka/ai-travel-assistant-langchain

AI Travel Assistant uses Python, OpenAI API, Streamlit, SQLite & LangChain to generate smart, personalized travel itineraries.

Language: Python - Size: 0 Bytes - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

ElmiraGhorbani/chatgpt-long-term-memory

The ChatGPT Long Term Memory package is a powerful tool designed to empower your projects with the ability to handle a large number of simultaneous users and external sources.

Language: Python - Size: 43.9 KB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 56 - Forks: 2

rogerchang1108/OpenAI-API-Token-Counter-with-Tiktoken

This project harnesses the power of Tiktoken and the OpenAI API to create a Python Streamlit web application with a primary focus on token counting and price estimation.

Language: Python - Size: 278 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

aallam/ktoken

Kotlin multiplatform BPE tokenizer library for OpenAI models

Language: Kotlin - Size: 10.7 MB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 30 - Forks: 2

howdymic/tiktoken-server

Docker container to expose the OpenAI tokenizer as a REST service

Language: Python - Size: 4.88 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 8

kgruiz/PyTokenCounter

A simple Python library for tokenizing text and counting tokens. While currently only supporting OpenAI LLMs, it helps with text processing and managing token limits in AI applications.

Language: Python - Size: 433 KB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

hupe1980/go-tiktoken

✂️ OpenAI's tiktoken tokenizer written in Go

Language: Go - Size: 3.51 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 18 - Forks: 1

danny50610/bpe-tokeniser

PHP port of openai/tiktoken (most of it)

Language: PHP - Size: 2.73 MB - Last synced at: 8 days ago - Pushed at: 8 months ago - Stars: 9 - Forks: 0

403errors/TubeQuery

TubeQuery is an LLM-based model that handles all the queries related to your video. Just input the video link; all questions are welcome!

Language: Python - Size: 557 KB - Last synced at: 15 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

akshay-kamath/personal-projects

Projects I made while self-learning.

Language: Jupyter Notebook - Size: 8.26 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

SandaruRF/AI-Voting-Assistant-Predictor-Comparator-Chatbot

Language: Jupyter Notebook - Size: 100 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

ReshiAdavan/Thoth

Industry-standard tokenizer for large-scale language models (GPT, Claude, Llama, etc.)

Language: Python - Size: 2.96 MB - Last synced at: 4 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

RahulDey12/tiktoken-php

A PHP implementation of OpenAI's BPE tokenizer tiktoken.

Language: PHP - Size: 95.7 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 7 - Forks: 0

oelmekki/tiktoken-cli

Simple wrapper around tiktoken to use it in your favorite language.

Language: Python - Size: 6.84 KB - Last synced at: 14 days ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 3

kojix2/tiktoken-c

C API for tiktoken-rs

Language: Rust - Size: 42 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 9 - Forks: 1

vstrickl/gpt-token-counter

Simple Python script to help me parse through an entire video podcast transcript and create prompts for ChatGPT within its 4000 token limit.

Language: Python - Size: 1.64 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

maledorak/single-token-words

List of single token words for LLM usage

Language: Python - Size: 2.32 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

skitsanos/streamlit-split-text

Text splitting example using Tiktoken

Language: Python - Size: 4.88 KB - Last synced at: 17 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

farithadnan/DatasetForge

Extracts Google Sheets to JSONL for fine-tuning, estimates task costs with tiktoken.

Language: Python - Size: 50.8 KB - Last synced at: 16 days ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

P1ayer-1/ChatGPT-Web-vs-API-pricing

Count tokens to determine cost differential of ChatGPT Plus subscription and ChatGPT API

Language: Python - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

rubenselander/openai-function-tokens

Predict the exact OpenAI token usage of functions

Language: Python - Size: 23.4 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 1

gcondeh/Tokens

Small utilities for counting tokens and splitting text strings

Language: Python - Size: 21.5 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

augustoomb/projeto-ia-langchain

Uses the LangChain framework for an API that answers questions based on documents (RAG)

Language: Python - Size: 35.2 KB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

dbtreasure/zig-bpe

Byte Pair Encoding (BPE) in the Zig programming language (0.13.0)

Language: Zig - Size: 1.84 MB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

schneiderfelipe/chat-splitter

Split chat messages by maximum chat completion token count

Language: Rust - Size: 21.5 KB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

wonyoung-jang/logseq-tokenizer

Logseq Markdown Tokenizer is a Python application that tokenizes and estimates prices for one to many markdown files.

Language: Python - Size: 176 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

flexchar/tiktoken-counter

Tiktoken-counter as standalone API

Language: Python - Size: 2.93 KB - Last synced at: 4 days ago - Pushed at: 8 months ago - Stars: 2 - Forks: 1

guanhui07/tiktoken-php (fork of yethee/tiktoken-php)

This is a port of tiktoken

Language: PHP - Size: 2.71 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

kojix2/tiktoken-cr

Tiktoken for Crystalists

Language: Crystal - Size: 137 KB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

biomchen/doc-split-estimator

Language: Python - Size: 13.7 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

haha-systems/toll

The OpenAI tiktoken library as a service. For counting the number of tokens in a message to an LLM like GPT.

Language: Python - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

tobi303x/Web-crawling-tool-for-NGO

Web crawling GUI app for non-governmental events, originally created in 2023 as a university project. Enjoy!

Language: Python - Size: 166 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

saschaschramm/chatgpt

Analysis of OpenAI's ChatGPT

Language: Jupyter Notebook - Size: 413 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 149 - Forks: 12

maxim-saplin/tiktoken-bench

Comparing OpenAI tokeniser (tiktoken) performance - stock Python/Rust vs JS/WASM

Language: Dart - Size: 5.37 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

JacobLinCool/Tiktoken-Calculator

Calculate the token count for GPT-4, GPT-3.5, GPT-3, and GPT-2.

Language: Python - Size: 68.4 KB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

GustavoHBraga/RecommendationsAI

This project is an application that uses generative AI to send product recommendations based on each customer's purchasing profile

Language: Python - Size: 32.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

phukon/temporal-traverse 📦

Console-based game built on an LLM

Language: Python - Size: 2.93 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

w95/tiktoken

The Tiktoken API is a tool that enables developers to calculate the token usage of their OpenAI API requests before sending them, allowing for more efficient use of tokens.

Language: Python - Size: 2.93 KB - Last synced at: 12 months ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 0

krasnoturinsk/telegram_bot_support_auchan (fork of RedAlexDad/telegram_bot_support_auchan)

As part of a hackathon, a Telegram bot was created to advise customers of the Auchan store on customer service matters

Language: Python - Size: 20.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mthli/tiktoken-android (fork of eisber/tiktoken)

Run OpenAI tiktoken on Android 😃

Size: 70.3 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

mytechnotalent/kgpt

A custom GPT based on [Zero To Hero](https://karpathy.ai/zero-to-hero.html) utilizing tiktoken with the intent to augment AI Transformer-model education and reverse engineer GPT models from scratch.

Language: Python - Size: 1.17 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

cameronk/token-counter

Wraps @dqbd/tiktoken to count the number of tokens used by various OpenAI models.

Language: TypeScript - Size: 791 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

b0o/tiktoken-bench

A small Node.js benchmark suite for the tiktoken WASM port.

Language: TypeScript - Size: 12.7 KB - Last synced at: 19 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

functorism/gpt4-tokenizer-visualizer

GPT4 Tokenizer Visualizer

Language: TypeScript - Size: 10.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

peterheb/gotoken

Gotoken is a pure-Go implementation of the Python library openai/tiktoken.

Language: Go - Size: 6.36 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0