GitHub topics: tokenizing
HamedStack/HamedStack.SyntaxMania
Empowering you to create your own parser.
Language: C# - Size: 15.6 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

alasdairforsythe/tokenmonster
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & JavaScript
Language: Go - Size: 734 KB - Last synced at: 29 days ago - Pushed at: 10 months ago - Stars: 575 - Forks: 20

mina-faridi/Document-Ranking-with-Galago
Galago-related homework for an Information Retrieval course
Language: Java - Size: 1.85 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

bzick/tokenizer
Tokenizer (lexer) for golang
Language: Go - Size: 103 KB - Last synced at: 9 months ago - Pushed at: about 1 year ago - Stars: 89 - Forks: 5

phughesmcr/happynodetokenizer
JavaScript port of HappyFunTokenizer.py by Christopher Potts and HappierFunTokenizing.py by H. Andrew Schwartz
Language: TypeScript - Size: 1.64 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

made42/jackcomp
Compiler for the Jack language, as part of the Nand to Tetris courses
Language: Java - Size: 27.3 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

nqkhanh2002/Fake-News-Detection-with-Machine-Learning
In this work, I trained a Long Short-Term Memory (LSTM) network to detect fake news in a given news corpus. Media companies could use this project to predict automatically whether circulating news is fake, without having humans manually review thousands of news articles.
Language: Jupyter Notebook - Size: 48.2 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

sajmaru/Spam-Email-Detection
Spam Email Detection using Natural Language Processing 📨
Language: Python - Size: 969 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0
