GitHub / taabishhh / LLM_Training
This project implements a distributed pipeline for training NLP models with Apache Spark and DeepLearning4J (DL4J). It uses a sliding-window approach for data preparation, positional embeddings for token encoding, and parallel Word2Vec model training. The model and training process are designed for scalability and op
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/taabishhh%2FLLM_Training
PURL: pkg:github/taabishhh/LLM_Training
Stars: 0
Forks: 0
Open issues: 0
License: apache-2.0
Language: Scala
Size: 17 MB
Dependencies parsed at: Pending
Created at: 10 months ago
Updated at: 8 days ago
Pushed at: 8 days ago
Last synced at: 4 days ago
Topics: apache-spark, deeplearning4j, dl4j, llm, llm-training, logback-classic, mapreduce-scala, scalatest, sliding-window, tensorflow, word2vec
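
The description mentions a sliding-window approach for data preparation. The following is a minimal sketch of that idea in plain Scala, not the repository's actual code: the object name `SlidingWindowSketch`, the `windows` helper, and the `windowSize` and `stride` parameters are all hypothetical, and pairing each window with the token that follows it is an assumption about how the training pairs are formed.

```scala
// Hypothetical helper, not taken from the repository: builds (context, target)
// pairs by sliding a fixed-size window over a token sequence.
object SlidingWindowSketch {

  /** Pairs each run of `windowSize` tokens with the token that follows it. */
  def windows(
      tokens: Seq[String],
      windowSize: Int,
      stride: Int = 1
  ): Seq[(Seq[String], String)] = {
    require(windowSize > 0 && stride > 0, "windowSize and stride must be positive")
    tokens
      .sliding(windowSize + 1, stride)      // context tokens plus one target token
      .filter(_.length == windowSize + 1)   // drop a trailing partial window
      .map(w => (w.init, w.last))           // split into (context, target)
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    val tokens = "the quick brown fox jumps".split(" ").toSeq
    windows(tokens, windowSize = 3).foreach { case (context, target) =>
      println(s"${context.mkString(" ")} -> $target")
    }
  }
}
```

In a Spark setting, a function like this could presumably be applied per sentence inside a `map` or `flatMap` over an RDD of tokenized text, which would let window generation run in parallel across partitions.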