GitHub / SpydazWebAI-NLP / Basic_Tokenizer2023
The Tokenizer is a text-processing library written in Visual Basic (VB.NET). It tokenizes text into words, sentences, characters, and n-grams, and is designed to be flexible, customizable, and easy to integrate into VB.NET projects.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SpydazWebAI-NLP%2FBasic_Tokenizer2023
PURL: pkg:github/SpydazWebAI-NLP/Basic_Tokenizer2023
Stars: 0
Forks: 1
Open issues: 0
License: MIT
Language: Visual Basic .NET
Size: 1.06 MB
Dependencies parsed at: Pending
Created at: almost 2 years ago
Updated at: almost 2 years ago
Pushed at: almost 2 years ago
Last synced at: almost 2 years ago
Topics: bpe, frequent-pattern-mining, ngrams, pmi, text-preprocessing, tokenization, tokenizer, vocabulary-builder