An open API service providing repository metadata for many open source software ecosystems.

framagit.org topics: natural language processing

nlp/substitutionstring

Modifies strings without losing information. Useful for cleaning, normalizing, de-noising, filtering, or any other task that inserts into or deletes from a string.

Last synced at: 14 days ago - Stars: 0 - Forks: 0
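
To make "modification without loss of information" concrete, here is a minimal sketch of the general idea: edits are recorded against the original string instead of overwriting it. The `TrackedString` class and its methods are hypothetical names chosen for illustration and are not the substitutionstring API; the sketch also assumes the recorded edits do not overlap.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedString:
    """Hypothetical helper for illustration, not the substitutionstring API."""

    original: str
    # Each edit is (start, end, replacement), in coordinates of the original string.
    edits: list = field(default_factory=list)

    def substitute(self, start, end, replacement):
        # Record the edit; the original string is never modified.
        self.edits.append((start, end, replacement))

    def render(self):
        # Apply the recorded edits from left to right (assumes no overlap).
        out, cursor = [], 0
        for start, end, replacement in sorted(self.edits):
            out.append(self.original[cursor:start])
            out.append(replacement)
            cursor = end
        out.append(self.original[cursor:])
        return "".join(out)


text = TrackedString("Hello,   world!!")
text.substitute(6, 9, " ")    # collapse three spaces into one
text.substitute(14, 16, "!")  # collapse "!!" into "!"
cleaned = text.render()
```

Because only the list of edits changes, the cleaned form and the original remain available side by side.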

nlp/iamtokenizing

Tokenizer classes for several NLP tasks: splitting a text on white space, using a regular expression, ... This package is based on the tokenspan package, see https://framagit.org/nlp/tokenspan

Last synced at: 17 days ago - Stars: 0 - Forks: 0
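
As a rough illustration of the two splitting strategies mentioned above (white space and regular expression), the sketch below uses only Python's standard `re` module. The `tokenize` function is a hypothetical name and is not part of iamtokenizing; keeping character offsets with each token is an assumption made here so that tokens can be mapped back to the source text.

```python
import re


def tokenize(text, pattern=r"\S+"):
    """Return (token, start, end) triples for every match of `pattern`.

    The default pattern matches runs of non-space characters, which is
    equivalent to splitting on white space.
    """
    return [(m.group(), m.start(), m.end()) for m in re.finditer(pattern, text)]


sentence = "Tokenize this, please."
by_space = tokenize(sentence)          # punctuation stays attached to words
by_word = tokenize(sentence, r"\w+")   # word characters only
```

Each triple satisfies `sentence[start:end] == token`, so no position information is lost by tokenizing.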

nlp/extractionstring

Extracts parts of a string in a versatile way, without destroying information from the parent string. Allows discontinuous parts of a string to be collected as an ExtractionString, and allows several string-splitting strategies to be applied to the same string at the same time.

Last synced at: 25 days ago - Stars: 0 - Forks: 0
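
The idea of collecting discontinuous parts of a parent string can be sketched with plain index ranges. The `extract` function below is hypothetical and does not reproduce the ExtractionString class; it only shows that a selection can be stored as ranges over an untouched parent string.

```python
def extract(parent, ranges):
    """Join the slices of `parent` named by (start, end) ranges."""
    return "".join(parent[start:end] for start, end in ranges)


parent = "The quick brown fox"
ranges = [(0, 4), (16, 19)]  # "The " and "fox", skipping the middle
selection = extract(parent, ranges)
```

Several different lists of ranges can coexist for the same parent string, which is one way to hold several splitting strategies at once.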

nlp/iambagging

Bag-of-words tools for natural language processing, which also associate a few graph representations with a document. The main interest of this module is that it is agnostic of the preprocessing, normalizing, cleaning, and tokenization protocols.

Last synced at: almost 2 years ago - Stars: 0 - Forks: 0
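
Being agnostic of the tokenization protocol can be illustrated by passing the tokenizer in as a parameter. This is a sketch with the standard library's `collections.Counter`; the `bag_of_words` function is a hypothetical name, not the iambagging API.

```python
from collections import Counter


def bag_of_words(text, tokenizer):
    """Count tokens produced by any callable that maps a string to tokens."""
    return Counter(tokenizer(text))


doc = "The cat saw the dog"
raw_bag = bag_of_words(doc, str.split)                         # case-sensitive
lower_bag = bag_of_words(doc, lambda t: t.lower().split())     # case-folded
```

The counting step is identical in both calls; only the caller-supplied preprocessing differs.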

nlp/iamnormalizing

Tools that normalize a text in a non-destructive way.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

Quent--y/extension-mozilla

Spelling dictionary for Mozilla Firefox, based on the Hunspell dictionary (https://gitlab.com/taissou/hunspell-files-for-occitan-lengadocian/-/tree/master/Files)

Last synced at: about 2 years ago - Stars: 0 - Forks: 0

nlp/tokenspan

Deprecated since September 2022. See https://framagit.org/nlp/extractionstring for improved tools to extract any sub-string from a parent string without losing information from the parent.

Last synced at: 4 days ago - Stars: 0 - Forks: 0