An open API service providing repository metadata for many open source software ecosystems.

GitHub / taabishhh / LLM_Training

This project implements a distributed pipeline for NLP model training using Apache Spark and DeepLearning4J (DL4J). The methodology uses a sliding window approach for data preparation, positional embeddings for token encoding, and Word2Vec model training with parallel processing. The model and training process are designed for scalability and op
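
The description mentions a sliding window for data preparation and positional embeddings for token encoding. As a rough illustration of those two ideas, here is a minimal sketch in plain Scala. Everything in it is an assumption made for illustration: the names `SlidingWindowSketch`, `Sample`, `slidingWindows`, and `positionalEncoding`, the `windowSize` and `stride` parameters, and the choice of sinusoidal encoding are hypothetical and are not taken from the repository.

```scala
// Hypothetical sketch: names and parameters are illustrative, not the repository's code.
object SlidingWindowSketch {
  // One training sample: a context window of token ids and the token that follows it.
  final case class Sample(context: Vector[Int], target: Int)

  // Slide a window of `windowSize` tokens over the sequence, advancing by `stride`.
  // Each window is paired with the next token as its prediction target.
  def slidingWindows(tokens: Vector[Int], windowSize: Int, stride: Int = 1): Vector[Sample] = {
    require(windowSize > 0 && stride > 0, "windowSize and stride must be positive")
    tokens
      .sliding(windowSize + 1, stride)
      .filter(_.length == windowSize + 1) // drop a trailing partial window
      .map(w => Sample(w.init, w.last))
      .toVector
  }

  // Sinusoidal positional encoding, one common choice (assumed here; the project may differ).
  def positionalEncoding(position: Int, dim: Int): Vector[Double] =
    Vector.tabulate(dim) { i =>
      val angle = position / math.pow(10000.0, (2 * (i / 2)).toDouble / dim)
      if (i % 2 == 0) math.sin(angle) else math.cos(angle)
    }
}
```

In a Spark setting, a function like `slidingWindows` could plausibly be applied per document inside a `flatMap` so that windows are generated in parallel across partitions, though how the project actually distributes this step is not stated here.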

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/taabishhh%2FLLM_Training
PURL: pkg:github/taabishhh/LLM_Training

Stars: 0
Forks: 0
Open issues: 0

License: apache-2.0
Language: Scala
Size: 17 MB
Dependencies parsed at: Pending

Created at: 10 months ago
Updated at: 8 days ago
Pushed at: 8 days ago
Last synced at: 4 days ago

Topics: apache-spark, deeplearning4j, dl4j, llm, llm-training, logback-classic, mapreduce-scala, scalatest, sliding-window, tensorflow, word2vec
