An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: language-modeling

DmitryRyumin/INTERSPEECH-2023-24-Papers

INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

Size: 11.4 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 673 - Forks: 42

DRSY/EMO

[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)

Language: Python - Size: 37 MB - Last synced at: about 22 hours ago - Pushed at: over 1 year ago - Stars: 123 - Forks: 13

freon4dsl/Freon4dsl

Web Native language Workbench with Projectional Web Editor

Language: TypeScript - Size: 30.3 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 69 - Forks: 8

roddar92/linguistics_problems

Natural language processing in examples and games

Language: Jupyter Notebook - Size: 24.1 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 24 - Forks: 5

tonybeltramelli/Deep-Lyrics

Lyrics Generator aka Character-level Language Modeling with Multi-layer LSTM Recurrent Neural Network

Language: Python - Size: 12.7 KB - Last synced at: 11 days ago - Pushed at: over 7 years ago - Stars: 152 - Forks: 27

DmitryRyumin/ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Language: Python - Size: 9.11 MB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 463 - Forks: 18

pemistahl/lingua-go

The most accurate natural language detection library for Go, suitable for short text and mixed-language text

Language: Go - Size: 226 MB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 1,245 - Forks: 68

EgoAlpha/prompt-in-context-learning

Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.

Language: Jupyter Notebook - Size: 44.3 MB - Last synced at: 19 days ago - Pushed at: 24 days ago - Stars: 1,587 - Forks: 96

songlab-cal/tape

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.

Language: Python - Size: 840 KB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 696 - Forks: 132

muditbhargava66/PyxLSTM

Efficient Python library for Extended LSTM with exponential gating, memory mixing, and matrix memory for superior sequence modeling.

Language: Python - Size: 120 KB - Last synced at: 21 days ago - Pushed at: 12 months ago - Stars: 290 - Forks: 27

allenai/RL4LMs

A modular RL library to fine-tune language models to human preferences

Language: Python - Size: 29.1 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 2,307 - Forks: 197

quark0/darts

Differentiable architecture search for convolutional and recurrent networks

Language: Python - Size: 4.7 MB - Last synced at: 19 days ago - Pushed at: over 4 years ago - Stars: 3,961 - Forks: 835

uber-research/PPLM

Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.

Language: Python - Size: 2.36 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 1,146 - Forks: 204

google-deepmind/long-form-factuality

Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".

Language: Python - Size: 755 KB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 606 - Forks: 73

euclaise/SlimTrainer

Full finetuning of large language models without large memory requirements

Language: Python - Size: 85 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 93 - Forks: 3

microsoft/CodeMixed-Text-Generator

This tool helps automatically generate grammatically valid synthetic code-mixed data by utilizing linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.

Language: Jupyter Notebook - Size: 3.79 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 55 - Forks: 12

BESSER-PEARL/BESSER

A Python-based low-modeling low-code platform for smart and AI-enhanced software

Language: Python - Size: 89.8 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 93 - Forks: 18

lucidrains/gated-state-spaces-pytorch

Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch

Language: Python - Size: 34.1 MB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 100 - Forks: 4

dellison/WikiText.jl

Julia interface to the WikiText dataset.

Language: Julia - Size: 14.6 KB - Last synced at: 12 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

flatironinstitute/deepblast

Neural Networks for Protein Sequence Alignment

Language: Python - Size: 56.7 MB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 121 - Forks: 22

MagedSaeed/generate-sequences

A Python package for generating sequences (greedy and beam search) from PyTorch models (not necessarily HF transformers).

Language: Python - Size: 1.11 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 17 - Forks: 0

jeffhj/LM-reasoning

This repository contains a collection of papers and resources on Reasoning in Large Language Models.

Size: 99.6 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 564 - Forks: 35

somosnlp/nlp-de-cero-a-cien

Practical course: NLP from zero to a hundred 🤗

Language: Jupyter Notebook - Size: 3.86 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 188 - Forks: 90

aalok-sathe/surprisal

A unified interface for computing surprisal (log probabilities) from language models! Supports neural, symbolic, and black-box API models.

Language: Python - Size: 888 KB - Last synced at: 25 days ago - Pushed at: 6 months ago - Stars: 40 - Forks: 10

majumderb/rezero

Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"

Language: Python - Size: 42 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 408 - Forks: 53

BitcoinChatGPT/DeserializeSignature-Vulnerability-Algorithm

Learn about the DeserializeSignature vulnerability in Bitcoin's ECDSA signature algorithm and its potential impact on the security of Bitcoin transactions. Discover how the vulnerability can be exploited and what steps are being taken to mitigate the risk. Stay informed on the latest developments in Bitcoin security.

Language: Jupyter Notebook - Size: 1.72 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

shmsw25/FActScore

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

Language: Python - Size: 102 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 337 - Forks: 50

google-research/mozolm

MozoLM: A language model (LM) serving library

Language: C++ - Size: 10.4 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 45 - Forks: 12

CQCL/Quixer

Code repository for the preprint "Quixer: A Quantum Transformer Model"

Language: Python - Size: 48.8 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 22 - Forks: 9

UIC-Liu-Lab/ContinualLM

An Extensible Continual Learning Framework Focused on Language Models (LMs)

Language: Python - Size: 696 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 272 - Forks: 21

madaan/memprompt

A method to fix GPT-3 after deployment with user feedback, without re-training.

Language: Python - Size: 20.8 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 328 - Forks: 13

Sunnydreamrain/IndRNN_pytorch

Independently Recurrent Neural Networks (IndRNN) implemented in pytorch.

Language: Python - Size: 3.05 MB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 135 - Forks: 31

songlab-cal/tape-neurips2019

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. (DEPRECATED)

Language: Python - Size: 136 KB - Last synced at: 26 days ago - Pushed at: almost 4 years ago - Stars: 120 - Forks: 35

dayyass/language-modeling

Pipeline for training Language Models using PyTorch.

Language: Python - Size: 68.4 KB - Last synced at: 5 days ago - Pushed at: about 3 years ago - Stars: 12 - Forks: 0

BitcoinChatGPT/Jacobian-Curve-Vulnerability-Algorithm

Discover the implications of the Jacobian Curve vulnerability in elliptic curve cryptography, particularly its impact on the Elliptic Curve Digital Signature Algorithm (ECDSA). This article explores how attackers can exploit this flaw to generate fraudulent transactions, create fake signatures, and compromise the integrity of blockchain systems.

Language: Jupyter Notebook - Size: 1.72 MB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 1

andstor/verified-smart-contracts

Verified Ethereum Smart Contract dataset

Language: Python - Size: 42 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 29 - Forks: 4

bjam24/agh-natural-language-processing

This repository contains projects made for the NLP course at the AGH UST in 2024/2025. They received the maximum grade of 5.0.

Language: Julia - Size: 25 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

TencentARC/FLM

Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)

Language: Python - Size: 7 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 32 - Forks: 1

yxuansu/SimCTG

[NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation

Language: Python - Size: 6.94 MB - Last synced at: 24 days ago - Pushed at: over 1 year ago - Stars: 471 - Forks: 40

shaoxiongji/fed-att

Attentive Federated Learning for Private NLM

Language: Python - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 61 - Forks: 17

Ingenious-c0der/Beluga

An esoteric programming language based on Turing Machines

Language: C++ - Size: 163 KB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

amazon-science/synthesizrr

Synthesizing realistic and diverse text-datasets from augmented LLMs

Language: Python - Size: 1.44 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 12 - Forks: 3

Madjakul/HALvesting

Harvests open research papers from HAL (Hyper Articles en Ligne).

Language: Python - Size: 490 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

Separius/BERT-keras 📦

Keras implementation of BERT with pre-trained weights

Language: Python - Size: 552 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 814 - Forks: 196

kmario23/KenLM-training

Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2

Size: 5.86 KB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 114 - Forks: 21

google/BEGIN-dataset

A benchmark dataset for evaluating dialog system and natural language generation metrics.

Size: 3.5 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 36 - Forks: 5

andstor/verified-smart-contracts-audit

Verified smart contract dataset with vulnerability labeling

Size: 3.91 KB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 0

giganticode/codeprep

A toolkit for pre-processing large source code corpora

Language: Python - Size: 1.56 MB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 47 - Forks: 11

nikitas-theo/BERTtimeStories

Code implementation for our paper "BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training" as part of the 2024 BabyLM Challenge

Language: Python - Size: 1000 Bytes - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

indiejoseph/chinese-char-rnn 📦

Character-Level language models

Language: Python - Size: 2.08 MB - Last synced at: about 1 month ago - Pushed at: almost 8 years ago - Stars: 77 - Forks: 20

prajjwal1/language-modelling

LM, ULMFit et al.

Language: Python - Size: 491 KB - Last synced at: 22 days ago - Pushed at: over 5 years ago - Stars: 46 - Forks: 6

MyDarapy/gpt-1-from-scratch

Rewriting and pretraining GPT-1 from scratch. Implementing Multihead Attention (MHA) in PyTorch from the original paper "Improving Language Understanding by Generative Pre-Training" (https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)

Language: Python - Size: 44.9 KB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

hirofumi0810/neural_sp

End-to-end ASR/LM implementation with PyTorch

Language: Python - Size: 8.66 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 596 - Forks: 139

oooranz/Baby-CoThought

Baby's CoThought: Leveraging LLMs for Enhanced Reasoning in Compact Models

Language: Python - Size: 61.3 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 17 - Forks: 4

IDSIA/recurrent-fwp

Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers" (NeurIPS 2021)

Language: Python - Size: 5.61 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 48 - Forks: 5

mpuig/gpt2-fine-tuning

Fine-tune GPT2 to generate fake job experiences

Language: Jupyter Notebook - Size: 54.1 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 2

L0SG/relational-rnn-pytorch

An implementation of DeepMind's Relational Recurrent Neural Networks (NeurIPS 2018) in PyTorch.

Language: Python - Size: 4.49 MB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 245 - Forks: 35

BitcoinChatGPT/Gauss-Jacobi-Method-Algorithm

To use a pre-trained Bitcoin ChatGPT AI model to learn this method, you would first need to provide the model with a clear and concise description of the algorithm, including its purpose, prerequisites, and the mathematical principles behind it. How To Get PrivateKey of Bitcoin Wallet Address.

Language: Jupyter Notebook - Size: 1.7 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 3

OSU-STARLAB/LeaPformer

[ICML 2024] Official implementation of "LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions."

Language: Python - Size: 20.1 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 9 - Forks: 1

Helsinki-NLP/lm-vs-mt

A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives

Language: Python - Size: 1.15 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

apple/ml-interspeech2022-phi_rtn

Repository accompanying the Interspeech 2022 publication titled "Space-Efficient Representation of Entity-centric Query Language Models" by Van Gysel et al.

Size: 33.1 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 13 - Forks: 2

CLMBRs/lm-training

Repository for training transformer and recurrent language models via HuggingFace in an entirely configuration-file driven manner.

Language: Python - Size: 95.7 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 8 - Forks: 0

tinhb92/rnn_darts_fastai

Implement Differentiable Architecture Search (DARTS) for RNN with fastai

Language: Jupyter Notebook - Size: 1.86 MB - Last synced at: 4 days ago - Pushed at: about 6 years ago - Stars: 24 - Forks: 3

rubypoddar/microsoft-phi3-language-model

Explore the power of Microsoft Phi-3 language model with this repository, featuring a versatile natural language processing tool. Leverage advanced text generation, summarization, and AI-driven creativity directly from the Phi-3 model. Dive into cutting-edge language capabilities for your projects.

Language: Jupyter Notebook - Size: 12.1 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 14 - Forks: 0

arrrrrmin/albert-guide

Understanding "A Lite BERT". A Transformer approach for learning self-supervised Language Models.

Language: Python - Size: 52.7 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 1

PraveenKumar-Rajendran/Udacity-Natural-Language-Processing-Engineer-Nanodegree

Projects Implemented for the Udacity Natural Language Processing Engineer Nanodegree Program

Size: 1.42 MB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

gidim/Babler

Data Collection System For NLP/Speech Recognition

Language: Java - Size: 32.7 MB - Last synced at: 25 days ago - Pushed at: about 4 years ago - Stars: 25 - Forks: 12

KIST-CSRC/Text-to-BatteryRecipe

Official source codes for implementing "Text-to-Battery Recipe: A language modeling-based protocol for automatic battery recipe extraction and retrieval"

Language: Python - Size: 27.9 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

jwchoi95/Text-to-BatteryRecipe

Official source codes for implementing "Text-to-Battery Recipe: A language modeling-based protocol for automatic battery recipe extraction and retrieval"

Language: Python - Size: 5.39 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

mit-han-lab/neurips-micronet

[JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion

Language: Jupyter Notebook - Size: 65.6 MB - Last synced at: 27 days ago - Pushed at: over 4 years ago - Stars: 40 - Forks: 8

meta-toolkit/meta

A Modern C++ Data Sciences Toolkit

Language: C++ - Size: 30.4 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 689 - Forks: 233

CODING-Enthusiast9857/Gemini_LLM_Application

It is an innovative repository housing a sophisticated Large Language Model (LLM) project, showcasing the intersection of advanced natural language processing and cutting-edge artificial intelligence. This repository serves as a comprehensive platform for the development, experimentation, and application of state-of-the-art language models.

Language: Python - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

sayhitosandy/Transformer-Speech-Classifier-LM

Implementation and exploration of transformer models for speech segment classification and language modeling.

Language: Jupyter Notebook - Size: 1.83 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

ShiningLab/PromptSub

This repository is for the paper Lexical Substitution as Causal Language Modeling. In Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024), Mexico City, Mexico. Association for Computational Linguistics.

Language: Python - Size: 4.32 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

quentin-riffault/Lingua

Language analyzer

Language: Python - Size: 132 KB - Last synced at: 12 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

marcruef/lingoai

Language Analysis for A.I.

Language: PHP - Size: 2.93 KB - Last synced at: 12 months ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

LarsHill/pointer-guided-pre-training

Code for the ECML 2024 paper "Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness"

Language: Python - Size: 63.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

albinjm/FinSpeech

A Speech Recognition Framework for Banking Interactions using Convolutional Recurrent Dense Neural Networks and Language Models

Language: Jupyter Notebook - Size: 188 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

BitcoinChatGPT/Fuzzing-Vulnerability-Algorithm

Learn about the Fuzzing vulnerability in Bitcoin's ECDSA signature algorithm and its potential impact on the security of Bitcoin transactions. Discover how the vulnerability can be exploited and what steps are being taken to mitigate the risk. Stay informed on the latest developments in Bitcoin security.

Language: Jupyter Notebook - Size: 1.7 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

kazuki-irie/dct-fast-weights

PyTorch implementation of DCT fast weight RNNs

Language: Python - Size: 11.7 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 0

Atenrev/forocoches-language-generation

This is a PyTorch implementation of a decoder-only transformer inspired by GPT-2. The model was trained from scratch on a custom dataset of over 1 million threads from the Spanish forum ForoCoches. The dataset is publicly available.

Language: Python - Size: 39.1 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

Atenrev/comics-dialogue-generation

PyTorch code for Automatic generation of comic dialogues. The purpose of this project is to generate subsequent dialogues given a multimodal context.

Language: Python - Size: 727 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

sbdzdz/thesis-scripts

Miscellaneous scripts and utilities for my master's thesis.

Language: Python - Size: 80.1 KB - Last synced at: about 1 year ago - Pushed at: almost 9 years ago - Stars: 0 - Forks: 1

ohmthanap/CS583_Deep-Learning

Learned knowledge and techniques in Deep Learning and also related tools: Python, Pytorch, Jupyter Notebook, RNN, CNN, Reinforcement Learning, LSTM, BERT, Language Modeling

Language: Jupyter Notebook - Size: 127 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

UIC-Liu-Lab/CPT

[EMNLP 2022] Continual Training of Language Models for Few-Shot Learning

Language: Python - Size: 808 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 40 - Forks: 1

paule32/Kurt_Goedel_Experiment

This is just a fun project based on the "Kurt Goedel" theorem of incomplete sentences. Produced with SBCL, a Common Lisp implementation. Free for non-profit usage.

Language: Common Lisp - Size: 44.9 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

cynthia/kosentences

Large scale unannotated Korean corpus for unsupervised tasks. (e.g. Language modeling)

Language: Python - Size: 15.6 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 27 - Forks: 6

lyeoni/pretraining-for-language-understanding

Pre-training of Language Models for Language Understanding

Language: Python - Size: 562 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 83 - Forks: 14

hltcoe/sandle

Run a large language modeling SANDbox in your Local Environment

Language: Python - Size: 2.3 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 1

ArmanBehnam/NLP

Natural language processing, including datasets, Farsi NLP, Automated Essay Scoring, Automatic Speech Recognition, etc.

Language: Jupyter Notebook - Size: 512 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

cedrickchee/BERT-pytorch (fork of codertimo/BERT-pytorch)

Google AI BERT 2018 pytorch implementation

Language: Python - Size: 36.1 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

cedrickchee/awd-lstm-lm (fork of salesforce/awd-lstm-lm)

LSTM and QRNN Language Model Toolkit for PyTorch (adapted to fast.ai version)

Language: Python - Size: 50.8 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 1

Machine-Learning-Foundations/day_15_exercise_sequence_processing

Exercise on generative language modelling in Jax.

Language: Python - Size: 1.13 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

clovaai/group-transformer

Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING-2020).

Language: Python - Size: 51.8 KB - Last synced at: 9 days ago - Pushed at: over 4 years ago - Stars: 25 - Forks: 1

Mawazoni/KiswahiliModuleNooJ

The Kiswahili module is designed for the NooJ linguistic development environment software and corpus processor. It comes with a 45,000-word Kiswahili-English dictionary and morphological grammars that detect all tenses of Kiswahili verbs. Mathieu ROY et al. 2017. 2018. CC BY NC SA 3.0

Size: 12.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

dimits-ts/text_analytics

Language Modelling (text generation, spell correction) and Sentiment Analysis / POS Tagging with MLP, RNN, CNN and BERT models and LLM prompting

Language: Jupyter Notebook - Size: 69.1 MB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

BoHuangLab/CELL-E_2

Encoder-only model for image-based protein predictions

Language: Python - Size: 12.9 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

suryatejreddy/Memeify

Code and Dataset for Memeify: A Large-scale Meme Generation System

Language: JavaScript - Size: 11.8 MB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 25 - Forks: 5

harsha-desaraju/HMM-Model-for-POS-tagging

Parts of Speech (POS) tagging for English using Hidden Markov Model.

Language: Python - Size: 6.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

young-zonglin/yangzl-deep-lm-keras

Language modeling using several deep models.

Language: Python - Size: 4.19 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

Related Keywords
language-modeling (255), nlp (67), natural-language-processing (59), deep-learning (46), pytorch (36), machine-learning (32), python (29), language-model (23), tensorflow (21), transformers (20), text-generation (19), lstm (18), recurrent-neural-networks (16), rnn (15), dataset (14), artificial-intelligence (12), ai (10), neural-networks (10), word-embeddings (10), speech-recognition (10), transformer (10), language (9), machine-translation (8), bert (8), natural-language-generation (7), llm (7), transfer-learning (7), deep-neural-networks (6), gpt-2 (6), gpt2 (6), attention-mechanism (6), language-processing (6), chatgpt (6), nlp-machine-learning (6), text-classification (6), sequence-to-sequence (5), protein-sequences (5), python3 (5), named-entity-recognition (5), text-processing (5), classification (5), colab-notebook (5), openai (5), datasets (5), question-answering (5), n-grams (5), linguistics (5), generative-models (5), language-generation (5), ngrams (5), sentiment-analysis (5), lstm-model (4), chatbot (4), nlg (4), asr (4), paper (4), automatic-speech-recognition (4), gru (4), reinforcement-learning (4), data-analysis (4), shakespeare (4), benchmark (4), t5 (4), text-summarization (4), computational-linguistics (4), natural-language-understanding (4), huggingface (4), bitcoin (4), bitcoin-wallet (4), word2vec (4), transformer-xl (4), large-language-models (4), keras (3), pretrained-models (3), pos-tagging (3), text-analysis (3), seq2seq (3), code-switching (3), cnn (3), pretraining (3), transformer-architecture (3), programming (3), few-shot-learning (3), information-retrieval (3), neural-network (3), jupyter-notebook (3), bert-model (3), gpt-3 (3), deeplearning (3), long-short-term-memory-models (3), generative-model (3), llms (3), fine-tuning (3), text-to-image (3), corpus (3), rnn-model (3), in-context-learning (3), topic-modeling (3), image-classification (3), research (3)