GitHub topics: language-modeling
DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
Size: 11.4 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 673 - Forks: 42

DRSY/EMO
[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)
Language: Python - Size: 37 MB - Last synced at: about 22 hours ago - Pushed at: over 1 year ago - Stars: 123 - Forks: 13

freon4dsl/Freon4dsl
Web Native language Workbench with Projectional Web Editor
Language: TypeScript - Size: 30.3 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 69 - Forks: 8

roddar92/linguistics_problems
Natural language processing in examples and games
Language: Jupyter Notebook - Size: 24.1 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 24 - Forks: 5

tonybeltramelli/Deep-Lyrics
Lyrics Generator aka Character-level Language Modeling with Multi-layer LSTM Recurrent Neural Network
Language: Python - Size: 12.7 KB - Last synced at: 11 days ago - Pushed at: over 7 years ago - Stars: 152 - Forks: 27

DmitryRyumin/ICASSP-2023-24-Papers
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
Language: Python - Size: 9.11 MB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 463 - Forks: 18

pemistahl/lingua-go
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
Language: Go - Size: 226 MB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 1,245 - Forks: 68

EgoAlpha/prompt-in-context-learning
Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.
Language: Jupyter Notebook - Size: 44.3 MB - Last synced at: 19 days ago - Pushed at: 24 days ago - Stars: 1,587 - Forks: 96

songlab-cal/tape
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.
Language: Python - Size: 840 KB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 696 - Forks: 132

muditbhargava66/PyxLSTM
Efficient Python library for Extended LSTM with exponential gating, memory mixing, and matrix memory for superior sequence modeling.
Language: Python - Size: 120 KB - Last synced at: 21 days ago - Pushed at: 12 months ago - Stars: 290 - Forks: 27

allenai/RL4LMs
A modular RL library to fine-tune language models to human preferences
Language: Python - Size: 29.1 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 2,307 - Forks: 197

quark0/darts
Differentiable architecture search for convolutional and recurrent networks
Language: Python - Size: 4.7 MB - Last synced at: 19 days ago - Pushed at: over 4 years ago - Stars: 3,961 - Forks: 835

uber-research/PPLM
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
Language: Python - Size: 2.36 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 1,146 - Forks: 204

google-deepmind/long-form-factuality
Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
Language: Python - Size: 755 KB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 606 - Forks: 73

euclaise/SlimTrainer
Full finetuning of large language models without large memory requirements
Language: Python - Size: 85 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 93 - Forks: 3

microsoft/CodeMixed-Text-Generator
This tool helps automatically generate grammatically valid synthetic code-mixed data using linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.
Language: Jupyter Notebook - Size: 3.79 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 55 - Forks: 12

BESSER-PEARL/BESSER
A Python-based low-modeling low-code platform for smart and AI-enhanced software
Language: Python - Size: 89.8 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 93 - Forks: 18

lucidrains/gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
Language: Python - Size: 34.1 MB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 100 - Forks: 4

dellison/WikiText.jl
Julia interface to the WikiText dataset.
Language: Julia - Size: 14.6 KB - Last synced at: 12 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

flatironinstitute/deepblast
Neural Networks for Protein Sequence Alignment
Language: Python - Size: 56.7 MB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 121 - Forks: 22

MagedSaeed/generate-sequences
A Python package for generating sequences (greedy and beam search) from PyTorch (not necessarily HF transformers) models.
Language: Python - Size: 1.11 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 17 - Forks: 0

jeffhj/LM-reasoning
This repository contains a collection of papers and resources on Reasoning in Large Language Models.
Size: 99.6 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 564 - Forks: 35

somosnlp/nlp-de-cero-a-cien
Hands-on course: NLP from zero to one hundred 🤗
Language: Jupyter Notebook - Size: 3.86 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 188 - Forks: 90

aalok-sathe/surprisal
A unified interface for computing surprisal (negative log probabilities) from language models! Supports neural, symbolic, and black-box API models.
Language: Python - Size: 888 KB - Last synced at: 25 days ago - Pushed at: 6 months ago - Stars: 40 - Forks: 10

majumderb/rezero
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"
Language: Python - Size: 42 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 408 - Forks: 53

BitcoinChatGPT/DeserializeSignature-Vulnerability-Algorithm
Learn about the DeserializeSignature vulnerability in Bitcoin's ECDSA signature algorithm and its potential impact on the security of Bitcoin transactions. Discover how the vulnerability can be exploited and what steps are being taken to mitigate the risk. Stay informed on the latest developments in Bitcoin security.
Language: Jupyter Notebook - Size: 1.72 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

shmsw25/FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
Language: Python - Size: 102 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 337 - Forks: 50

google-research/mozolm
MozoLM: A language model (LM) serving library
Language: C++ - Size: 10.4 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 45 - Forks: 12

CQCL/Quixer
Code repository for the preprint "Quixer: A Quantum Transformer Model"
Language: Python - Size: 48.8 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 22 - Forks: 9

UIC-Liu-Lab/ContinualLM
An Extensible Continual Learning Framework Focused on Language Models (LMs)
Language: Python - Size: 696 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 272 - Forks: 21

madaan/memprompt
A method to fix GPT-3 after deployment with user feedback, without re-training.
Language: Python - Size: 20.8 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 328 - Forks: 13

Sunnydreamrain/IndRNN_pytorch
Independently Recurrent Neural Networks (IndRNN) implemented in PyTorch.
Language: Python - Size: 3.05 MB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 135 - Forks: 31

songlab-cal/tape-neurips2019
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. (DEPRECATED)
Language: Python - Size: 136 KB - Last synced at: 26 days ago - Pushed at: almost 4 years ago - Stars: 120 - Forks: 35

dayyass/language-modeling
Pipeline for training Language Models using PyTorch.
Language: Python - Size: 68.4 KB - Last synced at: 5 days ago - Pushed at: about 3 years ago - Stars: 12 - Forks: 0

BitcoinChatGPT/Jacobian-Curve-Vulnerability-Algorithm
Discover the implications of the Jacobian Curve vulnerability in elliptic curve cryptography, particularly its impact on the Elliptic Curve Digital Signature Algorithm (ECDSA). This article explores how attackers can exploit this flaw to generate fraudulent transactions, create fake signatures, and compromise the integrity of blockchain systems.
Language: Jupyter Notebook - Size: 1.72 MB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 1

andstor/verified-smart-contracts
📄 Verified Ethereum Smart Contract dataset
Language: Python - Size: 42 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 29 - Forks: 4

bjam24/agh-natural-language-processing
This repository contains projects made for the NLP course at AGH UST in 2024/2025. They received the maximum grade of 5.0.
Language: Julia - Size: 25 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

TencentARC/FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Language: Python - Size: 7 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 32 - Forks: 1

yxuansu/SimCTG
[NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation
Language: Python - Size: 6.94 MB - Last synced at: 24 days ago - Pushed at: over 1 year ago - Stars: 471 - Forks: 40

shaoxiongji/fed-att
Attentive Federated Learning for Private NLM
Language: Python - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 61 - Forks: 17

Ingenious-c0der/Beluga
An esoteric programming language based on Turing Machines
Language: C++ - Size: 163 KB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

amazon-science/synthesizrr
Synthesizing realistic and diverse text-datasets from augmented LLMs
Language: Python - Size: 1.44 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 12 - Forks: 3

Madjakul/HALvesting
Harvests open research papers from HAL (Hyper Articles en Ligne).
Language: Python - Size: 490 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

Separius/BERT-keras 📦
Keras implementation of BERT with pre-trained weights
Language: Python - Size: 552 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 814 - Forks: 196

kmario23/KenLM-training
Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
Size: 5.86 KB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 114 - Forks: 21

google/BEGIN-dataset
A benchmark dataset for evaluating dialog system and natural language generation metrics.
Size: 3.5 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 36 - Forks: 5

andstor/verified-smart-contracts-audit
🐛 Verified smart contract dataset with vulnerability labeling
Size: 3.91 KB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 0

giganticode/codeprep
A toolkit for pre-processing large source code corpora
Language: Python - Size: 1.56 MB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 47 - Forks: 11

nikitas-theo/BERTtimeStories
Code implementation for our paper "BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training" as part of the 2024 BabyLM Challenge
Language: Python - Size: 1000 Bytes - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

indiejoseph/chinese-char-rnn 📦
Character-Level language models
Language: Python - Size: 2.08 MB - Last synced at: about 1 month ago - Pushed at: almost 8 years ago - Stars: 77 - Forks: 20

prajjwal1/language-modelling
LM, ULMFit et al.
Language: Python - Size: 491 KB - Last synced at: 22 days ago - Pushed at: over 5 years ago - Stars: 46 - Forks: 6

MyDarapy/gpt-1-from-scratch
Rewriting and pretraining GPT-1 from scratch. Implementing Multi-Head Attention (MHA) in PyTorch from the original paper "Improving Language Understanding by Generative Pre-Training" (https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
Language: Python - Size: 44.9 KB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

hirofumi0810/neural_sp
End-to-end ASR/LM implementation with PyTorch
Language: Python - Size: 8.66 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 596 - Forks: 139

oooranz/Baby-CoThought
Baby's CoThought: Leveraging LLMs for Enhanced Reasoning in Compact Models
Language: Python - Size: 61.3 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 17 - Forks: 4

IDSIA/recurrent-fwp
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers" (NeurIPS 2021)
Language: Python - Size: 5.61 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 48 - Forks: 5

mpuig/gpt2-fine-tuning
Fine-tune GPT2 to generate fake job experiences
Language: Jupyter Notebook - Size: 54.1 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 2

L0SG/relational-rnn-pytorch
An implementation of DeepMind's Relational Recurrent Neural Networks (NeurIPS 2018) in PyTorch.
Language: Python - Size: 4.49 MB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 245 - Forks: 35

BitcoinChatGPT/Gauss-Jacobi-Method-Algorithm
To use a pre-trained Bitcoin ChatGPT AI model to learn this method, you would first need to provide the model with a clear and concise description of the algorithm, including its purpose, prerequisites, and the mathematical principles behind it. How To Get PrivateKey of Bitcoin Wallet Address.
Language: Jupyter Notebook - Size: 1.7 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 3

OSU-STARLAB/LeaPformer
[ICML 2024] Official implementation of "LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions."
Language: Python - Size: 20.1 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 9 - Forks: 1

Helsinki-NLP/lm-vs-mt
A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives
Language: Python - Size: 1.15 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

apple/ml-interspeech2022-phi_rtn
Repository accompanying the Interspeech 2022 publication titled "Space-Efficient Representation of Entity-centric Query Language Models" by Van Gysel et al.
Size: 33.1 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 13 - Forks: 2

CLMBRs/lm-training
Repository for training transformer and recurrent language models via HuggingFace in an entirely configuration-file-driven manner.
Language: Python - Size: 95.7 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 8 - Forks: 0

tinhb92/rnn_darts_fastai
Implement Differentiable Architecture Search (DARTS) for RNN with fastai
Language: Jupyter Notebook - Size: 1.86 MB - Last synced at: 4 days ago - Pushed at: about 6 years ago - Stars: 24 - Forks: 3

rubypoddar/microsoft-phi3-language-model
Explore the power of Microsoft Phi-3 language model with this repository, featuring a versatile natural language processing tool. Leverage advanced text generation, summarization, and AI-driven creativity directly from the Phi-3 model. Dive into cutting-edge language capabilities for your projects.
Language: Jupyter Notebook - Size: 12.1 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 14 - Forks: 0

arrrrrmin/albert-guide
Understanding "A Lite BERT". A Transformer approach for learning self-supervised Language Models.
Language: Python - Size: 52.7 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 1

PraveenKumar-Rajendran/Udacity-Natural-Language-Processing-Engineer-Nanodegree
Projects Implemented for the Udacity Natural Language Processing Engineer Nanodegree Program
Size: 1.42 MB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

gidim/Babler
Data Collection System For NLP/Speech Recognition
Language: Java - Size: 32.7 MB - Last synced at: 25 days ago - Pushed at: about 4 years ago - Stars: 25 - Forks: 12

KIST-CSRC/Text-to-BatteryRecipe
Official source codes for implementing "Text-to-Battery Recipe: A language modeling-based protocol for automatic battery recipe extraction and retrieval"
Language: Python - Size: 27.9 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

jwchoi95/Text-to-BatteryRecipe
Official source codes for implementing "Text-to-Battery Recipe: A language modeling-based protocol for automatic battery recipe extraction and retrieval"
Language: Python - Size: 5.39 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

mit-han-lab/neurips-micronet
[JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion
Language: Jupyter Notebook - Size: 65.6 MB - Last synced at: 27 days ago - Pushed at: over 4 years ago - Stars: 40 - Forks: 8

meta-toolkit/meta
A Modern C++ Data Sciences Toolkit
Language: C++ - Size: 30.4 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 689 - Forks: 233

CODING-Enthusiast9857/Gemini_LLM_Application
It is an innovative repository housing a sophisticated Large Language Model (LLM) project, showcasing the intersection of advanced natural language processing and cutting-edge artificial intelligence. This repository serves as a comprehensive platform for the development, experimentation, and application of state-of-the-art language models.
Language: Python - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

sayhitosandy/Transformer-Speech-Classifier-LM
Implementation and exploration of transformer models for speech segment classification and language modeling.
Language: Jupyter Notebook - Size: 1.83 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

ShiningLab/PromptSub
This repository is for the paper Lexical Substitution as Causal Language Modeling. In Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024), Mexico City, Mexico. Association for Computational Linguistics.
Language: Python - Size: 4.32 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

quentin-riffault/Lingua
Language analyzer
Language: Python - Size: 132 KB - Last synced at: 12 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

marcruef/lingoai
Language Analysis for A.I.
Language: PHP - Size: 2.93 KB - Last synced at: 12 months ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

LarsHill/pointer-guided-pre-training
Code for the ECML 2024 paper "Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness"
Language: Python - Size: 63.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

albinjm/FinSpeech
A Speech Recognition Framework for Banking Interactions using Convolutional Recurrent Dense Neural Networks and Language Models
Language: Jupyter Notebook - Size: 188 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

BitcoinChatGPT/Fuzzing-Vulnerability-Algorithm
Learn about the Fuzzing vulnerability in Bitcoin's ECDSA signature algorithm and its potential impact on the security of Bitcoin transactions. Discover how the vulnerability can be exploited and what steps are being taken to mitigate the risk. Stay informed on the latest developments in Bitcoin security.
Language: Jupyter Notebook - Size: 1.7 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

kazuki-irie/dct-fast-weights
PyTorch implementation of DCT fast weight RNNs
Language: Python - Size: 11.7 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 0

Atenrev/forocoches-language-generation
This is a PyTorch implementation of a decoder-only transformer inspired by GPT-2. The model was trained from scratch on a custom dataset of over 1 million threads from the Spanish forum ForoCoches. The dataset is publicly available.
Language: Python - Size: 39.1 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

Atenrev/comics-dialogue-generation
PyTorch code for Automatic generation of comic dialogues. The purpose of this project is to generate subsequent dialogues given a multimodal context.
Language: Python - Size: 727 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

sbdzdz/thesis-scripts
Miscellaneous scripts and utilities for my master's thesis.
Language: Python - Size: 80.1 KB - Last synced at: about 1 year ago - Pushed at: almost 9 years ago - Stars: 0 - Forks: 1

ohmthanap/CS583_Deep-Learning
Learned knowledge and techniques in Deep Learning and also related tools: Python, Pytorch, Jupyter Notebook, RNN, CNN, Reinforcement Learning, LSTM, BERT, Language Modeling
Language: Jupyter Notebook - Size: 127 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

UIC-Liu-Lab/CPT
[EMNLP 2022] Continual Training of Language Models for Few-Shot Learning
Language: Python - Size: 808 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 40 - Forks: 1

paule32/Kurt_Goedel_Experiment
This is just a fun project based on the "Kurt Goedel" theorem of incomplete sentences. Produced with SBCL, a Common Lisp implementation. Free for non-profit usage.
Language: Common Lisp - Size: 44.9 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

cynthia/kosentences
Large scale unannotated Korean corpus for unsupervised tasks. (e.g. Language modeling)
Language: Python - Size: 15.6 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 27 - Forks: 6

lyeoni/pretraining-for-language-understanding
Pre-training of Language Models for Language Understanding
Language: Python - Size: 562 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 83 - Forks: 14

hltcoe/sandle
Run a large language modeling SANDbox in your Local Environment
Language: Python - Size: 2.3 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 1

ArmanBehnam/NLP
Natural language processing, including datasets, Farsi NLP, Automated Essay Scoring, Automatic Speech Recognition, etc.
Language: Jupyter Notebook - Size: 512 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

cedrickchee/BERT-pytorch (fork of codertimo/BERT-pytorch)
Google AI 2018 BERT PyTorch implementation
Language: Python - Size: 36.1 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

cedrickchee/awd-lstm-lm (fork of salesforce/awd-lstm-lm)
LSTM and QRNN Language Model Toolkit for PyTorch (adapted to fast.ai version)
Language: Python - Size: 50.8 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 1

Machine-Learning-Foundations/day_15_exercise_sequence_processing
Exercise on generative language modelling in JAX.
Language: Python - Size: 1.13 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

clovaai/group-transformer
Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING-2020).
Language: Python - Size: 51.8 KB - Last synced at: 9 days ago - Pushed at: over 4 years ago - Stars: 25 - Forks: 1

Mawazoni/KiswahiliModuleNooJ
The Kiswahili module is designed for the NooJ linguistic development environment and corpus processor. It comes with a 45,000-word Kiswahili-English dictionary and morphological grammars that detect all tenses of Kiswahili verbs. Mathieu ROY et al. 2017, 2018. CC BY-NC-SA 3.0
Size: 12.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

dimits-ts/text_analytics
Language Modelling (text generation, spell correction) and Sentiment Analysis / POS Tagging with MLP, RNN, CNN and BERT models and LLM prompting
Language: Jupyter Notebook - Size: 69.1 MB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

BoHuangLab/CELL-E_2
Encoder-only model for image-based protein predictions
Language: Python - Size: 12.9 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

suryatejreddy/Memeify
Code and Dataset for Memeify: A Large-scale Meme Generation System
Language: JavaScript - Size: 11.8 MB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 25 - Forks: 5

harsha-desaraju/HMM-Model-for-POS-tagging
Parts of Speech (POS) tagging for English using Hidden Markov Model.
Language: Python - Size: 6.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

young-zonglin/yangzl-deep-lm-keras
Language modeling using several deep models.
Language: Python - Size: 4.19 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0
