An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: gpu-optimization

Kuenoz/pytorch_training_optimization_using_tensordict_memory_mapping

Optimizing PyTorch Model Training by Wrapping Memory Mapped Tensors on an Nvidia GPU with TensorDict.

Language: Python - Size: 11.8 MB - Last synced at: about 18 hours ago - Pushed at: about 18 hours ago - Stars: 0 - Forks: 0

Dongskie43/nlp-engineering-hub

📚 Enterprise NLP systems and LLM applications. Features custom language model implementations, distributed training pipelines, and efficient inference systems. 🔤

Size: 1.95 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

OriYarden/pytorch_training_optimization_using_tensordict_memory_mapping

Optimizing PyTorch Model Training by Wrapping Memory Mapped Tensors on Nvidia GPUs with TensorDict.

Language: Python - Size: 11.9 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

0xf0011/cryptic-simglyph-allocator

Size: 0 Bytes - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

Md-Emon-Hasan/Fine-Tuning

End-to-end fine-tuning of Hugging Face models using LoRA, QLoRA, quantization, and PEFT techniques. Optimized for low-memory environments, with efficient model deployment.

Language: Jupyter Notebook - Size: 5.5 MB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

RobThePCGuy/Performance-Mod-Guide-For-Valheim

Boost Valheim's FPS to forge a smoother Viking journey!

Language: PowerShell - Size: 14.5 MB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 26 - Forks: 0

Berto70/nbody_cuda

Parallel N-Body algorithm with CUDA. Modern Computing for Physics - 2025 - UniPD

Language: Jupyter Notebook - Size: 17.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

icelook349/Nvidia-Driver-Tweaker-No-Crack

This repository provides a tool for tweaking and optimizing Nvidia graphics card drivers for better performance, stability, and custom configurations. It allows users to adjust various settings for optimal GPU performance and better gaming or rendering experience.

Size: 0 Bytes - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

fatalik2319/Nvidia-Driver-Tweaker-No-Crack

This repository provides a tool for tweaking and optimizing Nvidia graphics card drivers for better performance, stability, and custom configurations. It allows users to adjust various settings for optimal GPU performance and better gaming or rendering experience.

Size: 6.84 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

broomelasticheart/Claymore-Dual-Miner-Multi-Crypto-Mining

Claymore Dual Miner allows simultaneous mining of multiple cryptocurrencies, optimizing your mining profits while efficiently using GPU resources. ⛏️💰

Size: 7.81 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

BjornMelin/nlp-engineering-hub

📚 Enterprise NLP systems and LLM applications. Features custom language model implementations, distributed training pipelines, and efficient inference systems. 🔤

Size: 5.86 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 1

BjornMelin/edge-ai-engineering

📱 Optimized ML for edge devices. Showcasing efficient model deployment, GPU-CPU memory transfer optimization, and real-world edge AI applications. 🤖

Size: 5.86 KB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

AKKI0511/Traffic-Sign-Recognition

Traffic sign recognition using deep learning. Implemented and compared custom CNN and transfer learning models (ResNet50, MobileNetV2) with comprehensive evaluation metrics. Achieved 98.8% accuracy with a focus on real-world efficiency.

Language: Jupyter Notebook - Size: 171 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

raj200501/GPUOptimizerML

The GPU Optimizer for ML Models enhances GPU performance for machine learning. It offers advanced scheduling, real-time monitoring, and efficient resource management through a user-friendly web interface and robust API, integrating big data technologies for seamless data processing and model optimization.

Language: Python - Size: 56.6 KB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

GVProf/GVProf

GVProf: A Value Profiler for GPU-based Clusters

Language: Python - Size: 229 KB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 43 - Forks: 9

yui0/waifu2x-glsl

Fast waifu2x converter with GPU optimization

Language: C - Size: 40.4 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 26 - Forks: 8

yui0/waifu2x-ocl

Fast waifu2x converter with GPU optimization

Language: C - Size: 2.98 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 23 - Forks: 3

Related Keywords
gpu-optimization (17), python (6), cuda (5), machine-learning (4), transformers (3), gpu (3), nlp (3), memory-mapping (3), pytorch (3), nvidia (3), huggingface (3), graphics-card (2), gpu-tuning (2), gpu-performance (2), gaming-tools (2), driver-update (2), driver-tweaker (2), driver-optimization (2), driver-enhancement (2), driver-customization (2), computer-optimization (2), deep-learning (2), graphics-tuning (2), hardware-optimization (2), nvidia-driver (2), nvidia-tools (2), pc-performance (2), performance-tuning (2), system-performance (2), system-tuning (2), fast-waifu2x-converter (2), linux (2), macos (2), nyanko (2), resolution (2), waifu2x (2), artificial-intelligence (2), openai (2), large-language-models (2), ai (2), memory-mapped-tensors (2), optimization (2), pytorch-tensors (2), pytorch-training (2), language-models (2), langchain (2), huggingface-transformers (2), pytorch-training-optimization (2), torch (2), tensors (2), tensordict (2), iot (1), mobile-ml (1), model-optimization (1), tflite (1), cnn-architecture (1), computer-vision (1), convolutional-neural-networks-cnn (1), data-augmentation (1), image-classification (1), image-preprocessing (1), embedded-systems (1), edge-computing (1), profitability (1), multi-crypto-mining (1), mining-tools (1), mining-performance (1), mining-algorithms (1), gpu-mining (1), ethereum-mining (1), eth-mining (1), windows (1), waifu2x-ocl (1), opencl (1), waifu2x-glsl (1), gpgpu (1), glsl (1), glew (1), value-profiler (1), redundancy (1), profiler (1), patterns (1), instrumentation (1), data-flow (1), clusters (1), binary-analysis (1), secure-api (1), real-time-monitoring (1), model-management (1), gpu-scheduling (1), big-data-integration (1), transfer-learning (1), traffic-sign-recognition (1), tensorflow (1), resnet50 (1), neural-network (1), model-evaluation (1), model-comparison (1), mobilenetv2 (1), keras (1)