An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: unigram

UnigramDev/Unigram

Telegram for Windows

Language: C# - Size: 374 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 4,546 - Forks: 527

Systemcluster/kitoken

Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.

Language: Rust - Size: 27.3 MB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 29 - Forks: 0

hikmatazimzade/azerbaijani-tokenizer

High-Performance Azerbaijani Tokenizers (30% fewer tokens, 40% faster than multilingual alternatives)

Language: Jupyter Notebook - Size: 1.86 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

naztar0/Emugram (fork of UnigramDev/Unigram)

Telegram for Windows Modification

Language: C# - Size: 350 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

words/n-gram

Get n-grams from text

Language: JavaScript - Size: 88.9 KB - Last synced at: 26 days ago - Pushed at: almost 3 years ago - Stars: 83 - Forks: 17

francofrizzo/marcos-bot-js

Telegram bot that generates random messages using a Markov chain, now for Node.js

Language: TypeScript - Size: 154 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 8 - Forks: 0

UnigramDev/unigram.me 📦

The Unigram website, built with React and Bootstrap.

Language: HTML - Size: 63.3 MB - Last synced at: 19 days ago - Pushed at: about 8 years ago - Stars: 9 - Forks: 1

ollie283/language-models

Build unigram and bigram language models, implement Laplace smoothing and use the models to compute the perplexity of test corpora.

Language: Python - Size: 78.1 KB - Last synced at: 4 months ago - Pushed at: about 8 years ago - Stars: 84 - Forks: 42

pngo1997/N-gram-Language-Models

Builds N-gram language models and applies them to text generation.

Language: Jupyter Notebook - Size: 4.73 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

burcgokden/SentencePiece-Tokenizer-Wrapper-for-PLDR-LLM-KVG-cache

SentencePiece Tokenizer Wrapper implementation for PLDR-LLM with KV cache and G-cache

Language: Python - Size: 11.7 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

mediaexplorer74/UnigramMobile (fork of hihain/UnigramMobile)

A Telegram client planned to be optimized, one day, for the retro OS Microsoft Windows 10 Mobile.

Language: C# - Size: 191 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

AdrienPoupa/comp6721

COMP6721: Introduction to Artificial Intelligence Assignments (GOFAI, Machine Learning, NLP)

Language: HTML - Size: 9.01 MB - Last synced at: 5 months ago - Pushed at: almost 7 years ago - Stars: 2 - Forks: 1

burcgokden/Sentencepiece-Tokenizer-Wrapper-for-PLDR-LLM

A framework for building Sentencepiece tokenizer from a dataset

Language: Python - Size: 3.31 MB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

iglee/Perceptron-Text-Classifier

Sentiment Classification exercise with perceptron, feed-forward multilayer net, LSTM RNN, and RCNN!

Language: Python - Size: 575 KB - Last synced at: 4 months ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

pouriaSameti/NLP

The projects for the NLP course at the University of Isfahan.

Language: Jupyter Notebook - Size: 1.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

msamprovalaki/Context-Aware-Spelling-Corrector

Academic project centered around n-grams and their application in developing a spelling corrector with contextual awareness.

Language: Jupyter Notebook - Size: 2.14 MB - Last synced at: 11 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 1

farzeennimran/N-grams

Language: Jupyter Notebook - Size: 1.7 MB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

WonYong-Jang/Text-Data-Analysis

N gram language model

Language: Java - Size: 9.36 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

DmitryAsdre/UnigramTokenization

Unigram tokenization implemented from scratch

Language: Jupyter Notebook - Size: 424 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

iAmKankan/Natural-Language-Processing-NLP-Tutorial

NLP tutorials and guidelines to learn efficiently

Size: 123 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 1

VaasuDevanS/Natural-Language-Processing-Assignments

UNB Fall-2018 NLP Assignments 💬

Language: Python - Size: 23.5 MB - Last synced at: 5 months ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

Clealiya/Modeles_de_Markov

[FR - Duo] 2023 - 2024 Centrale Méditerranée AI Master | NLP project about Markov models

Language: Mask - Size: 37.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

motiurinfo/sentiment_classification

Performance evaluation of sentiment classification on movie reviews

Language: Python - Size: 20.9 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 0

mediaexplorer74/Unigram (fork of UnigramDev/Unigram)

Unigram RnD. Draft.

Language: C# - Size: 338 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

arxiver/Onepiecelang

Text segmentation solution using natural language processing.

Language: Jupyter Notebook - Size: 1010 KB - Last synced at: 11 days ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

JonathanMonga/word_ninja_dart

Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. Inspired by wordninja for Python.

Language: Dart - Size: 688 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

clj0020/python-web-crawler

Python web crawler implementing iterative deepening depth-first search

Language: Python - Size: 1.08 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

arpankapoor/ngram

find uni/bi/tri-grams from text files

Language: Rust - Size: 11.7 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

subtosilencio/unigrams_pt-br

Word segmentation to create unigrams in Portuguese (pt-br)

Language: Python - Size: 675 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

mehdiye5/WordSegmentation

The task for this project is to segment a sequence of English characters into the most likely word sequence.

Language: Python - Size: 2.09 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

susantabiswas/Word-Prediction-Ngram

Next Word Prediction using n-gram Probabilistic Model with various Smoothing Techniques

Language: Jupyter Notebook - Size: 49.8 KB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 30 - Forks: 12

pbgnz/automatic-language-identification

a probabilistic language identification system that identifies the language of a sentence

Language: Python - Size: 8.62 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Sepehr1812/NLP_AI_project

Final AI course of CE department at Amirkabir University of Technology (Tehran Polytechnic) - Winter 2020.

Language: Python - Size: 7.81 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Dev-47/unigram-forum-service

Language: JavaScript - Size: 490 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 4

Dev-47/unigram-accounts-service

Language: Python - Size: 36.1 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

dracula/unigram

🧛🏻‍♂️ Dark theme for Unigram

Size: 1.3 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

Dev-47/unigram-broadcast-service

Size: 1000 Bytes - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Heewon-Hailey/NLP-geolocation-classifier-for-tweets

implement a model to predict the country where the tweet comes from

Language: Jupyter Notebook - Size: 64.5 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

prajjwaldimri/Unigram (fork of UnigramDev/Unigram)

The Windows Universal version of Telegram, built by the community

Language: C# - Size: 38.8 MB - Last synced at: over 2 years ago - Pushed at: over 8 years ago - Stars: 2 - Forks: 0

wenhaofang/Tokenizer

Some demo tokenizers especially for Chinese, including Maximum Matching, UniGram, HMM, CRF.

Language: Python - Size: 2.33 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

aliazizii/Who-wrote-this-poem

NLP extra project at AUT Artificial Intelligence course (Fall 2020)

Language: Python - Size: 1.15 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

2arian3/Artificial-Intelligence

All projects for AUT's Principles and Applications of Artificial Intelligence course.

Language: Python - Size: 1.39 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

sudhanshusks/twitter_bot

Language: HTML - Size: 6.73 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 13 - Forks: 5

jrdodson/unigram-lm

Simple language model for computing unigram frequencies.

Language: Java - Size: 4.43 MB - Last synced at: over 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 1

rajatb115/Document-Reranking

Assignment on Document Reranking

Language: Python - Size: 345 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

JunhoKim94/Word2Vec_Study

Word2Vec using Hierarchical Softmax and Negative Sampling with Unigram & Subsampling

Language: Python - Size: 78 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

erfanghasemi/poet_detector

We have implemented an NLP project to identify the correct poet. The project uses bigram, unigram, and backoff models for smoothing.

Language: Jupyter Notebook - Size: 408 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

b05902062/mixture_of_unigram

Easy-to-use mixture-of-unigrams topic modeling tool

Language: Python - Size: 512 KB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

schmintendo/translate.py

This is a small program that takes two lists, zips them, and translates a file after making the translation dictionary.

Language: Python - Size: 17.1 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

albertusk95/nips-challenge-plagiarism-detection-vsm

Global NIPS Paper Implementation Challenge - Plagiarism Detection on Electronic Text Based Assignments Using Vector Space Model (iciafs14)

Language: Python - Size: 396 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 1

vgratian/phon_bigrams

[draft] phonological unigrams and bigrams

Language: Python - Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 0

kaushikhande/Ensemble_sentiment

Language: Python - Size: 1.48 MB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0