GitHub / sathishkumar67 / GPT-2-IMDB-Sentiment-Fine-Tuning-with-PPO
Implements the Proximal Policy Optimization (PPO) algorithm to fine-tune GPT-2 on IMDB reviews so that it generates consistently positive reviews.
Stars: 0
Forks: 0
Open issues: 0
License: MIT
Language: Python
Size: 7.95 MB
Created at: about 1 year ago
Updated at: 11 months ago
Pushed at: 11 months ago
Last synced at: 11 months ago
Topics: gpt2, ppo, pytorch, reinforcement-learning, rlhf, text-generation, transformers