GitHub topics: nvidia-cuda
kartavyaantani/CUDA_IMAGE_PROCESSING
A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line
Language: Jupyter Notebook - Size: 5.4 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

Cat-Gawr/DeepSeek-FlashMLA
DeepSeek Flash MLA - DeepSeek - copy manual
Language: C++ - Size: 58.6 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

yashkathe/Image-Noise-Reduction-with-CUDA
This project analyzes an image denoising technique, median blur, comparing GPU-accelerated (Numba) and CPU-based (OpenCV) processing speeds.
Language: Jupyter Notebook - Size: 25.4 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 0

Ghostbird/local-ai
Docker compose files to quickly spin up local AI set-ups for fun and learning
Language: Shell - Size: 21.5 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

markods/GpuSeqAlign
A benchmark for dynamic-programming-based GPU sequence alignment algorithms.
Language: C++ - Size: 2.12 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Jimver/cuda-toolkit
GitHub Action to install CUDA
Language: TypeScript - Size: 9.41 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 170 - Forks: 64

NexusGPU/tensor-fusion-site
TensorFusion landing page and product docs
Language: CSS - Size: 1.78 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 5 - Forks: 1

m1k1o/go-transcode
On-demand transcoding origin server for live inputs and static files in Go using ffmpeg. Also with NVIDIA GPU hardware acceleration.
Language: Go - Size: 295 KB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 244 - Forks: 41

bigsk1/gpu-monitor
Real-time performance metrics and statistics for your Nvidia GPU
Language: HTML - Size: 1020 KB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 27 - Forks: 0

Koushikphy/Intro-to-CUDA-Fortran
A complete beginner's introduction to programming with CUDA Fortran
Size: 200 KB - Last synced at: 13 days ago - Pushed at: over 2 years ago - Stars: 26 - Forks: 1

KernFerm/exporting-YOLO
This repository contains scripts and commands for exporting YOLO models to different formats, including TensorRT (.engine) and ONNX (.onnx).
Language: Python - Size: 773 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 4 - Forks: 1

genn-team/genn
GeNN is a GPU-enhanced Neuronal Network simulation environment based on code generation for Nvidia CUDA.
Language: C++ - Size: 246 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 252 - Forks: 65

tgautam03/tGeMM
General Matrix Multiplication using NVIDIA Tensor Cores
Language: Cuda - Size: 47.9 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 13 - Forks: 3

ProjectPhysX/PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
Language: C++ - Size: 11.7 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 50 - Forks: 6

e-ago/hpgmg-cuda-async
GPUDirect Async implementation of HPGMG-FV CUDA
Language: Cuda - Size: 225 MB - Last synced at: 15 days ago - Pushed at: almost 7 years ago - Stars: 11 - Forks: 0

suvash/nixos-nvidia-cuda-python-docker-compose
A step-by-step guide to setting up Nvidia GPUs with CUDA support running on Docker (and Compose) containers on NixOS host
Language: Dockerfile - Size: 69.3 KB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 5

SohelRana-aiub-Pro/Region-Proposal-Object-Detection-using-Computer-Vision-Algorithms
https://docs.omniverse.nvidia.com/prod_install-guide/prod_install-guide/overview.html
Language: Jupyter Notebook - Size: 1.69 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

pc2/GPUInspector.jl
Inspecting GPUs with Julia
Language: Julia - Size: 3.27 MB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 43 - Forks: 5

m1k1o/hls-restream
Restream live content as HLS using ffmpeg in docker. Also with NVIDIA GPU hardware acceleration.
Language: Shell - Size: 45.9 KB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 42 - Forks: 25

neurite/debian-setup
Language: Shell - Size: 1.08 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 27 - Forks: 3

aibos-dev/development-container-template-dg
Docker Container Template
Language: Dockerfile - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

mattdean1/cuda
An implementation of parallel exclusive scan in CUDA
Language: Cuda - Size: 32.2 KB - Last synced at: 14 days ago - Pushed at: about 7 years ago - Stars: 62 - Forks: 23

jatolentino/gAIze
Gaze Correction Tool With AI
Language: TypeScript - Size: 9.86 MB - Last synced at: 6 days ago - Pushed at: 7 months ago - Stars: 2 - Forks: 0

kvdomingo/genai-lab
GenAI playground maximizing the use of open-source software and models
Language: Python - Size: 1.33 MB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

ameli/manylinux-cuda
manylinux docker images with CUDA Toolkit
Language: Dockerfile - Size: 62.5 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 10 - Forks: 4

SparseLinearAlgebra/cuBool
Sparse linear Boolean algebra for Nvidia Cuda
Language: C++ - Size: 38.8 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 24 - Forks: 4

SartajBhuvaji/Cuda
Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.
Language: Cuda - Size: 286 MB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Aviksaikat/Hashcat-GPU-fix-for-kali
My solution for Hashcat not detecting NVIDIA GPU for hybrid graphics setup
Size: 426 KB - Last synced at: about 1 hour ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

bradleydworak/opencv-cuda-ubuntu2204
Compile OpenCV with NVIDIA GPU CUDA support under Ubuntu 24.04
Size: 30.3 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 1

pjyi2147/CUDA_HTN_Workshop
Introduction to Nvidia CUDA workshop repository @ Hack the North 2024
Language: Jupyter Notebook - Size: 8.47 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 5 - Forks: 2

ramcovasu/finetunelocalllm
Finetuning a Local LLM Gemma 2 2B using Unsloth and your own custom dataset for Custom Attribute extraction from an unstructured content
Language: Python - Size: 9.77 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ItsMeDevRoland/Video-Frame-Converter
Convert your videos into frame-by-frame PNGs. Useful for rotoscoping.
Language: Python - Size: 41 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 1

robang74/isar-nvidia-debian
Build with ISAR an evaluation image based on Debian 11 (bullseye), selecting from nVidia GPU support (515.65.07) up to a graphics development environment with the full nVidia software stack (11.7.1) running a standard Debian kernel
Language: Shell - Size: 289 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

denyskryvytskyi/capgemini-cuda
CUDA implementation of vector addition, matrix multiplication, reduction and sorting
Language: Cuda - Size: 38.1 KB - Last synced at: 16 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

mahikshith/Transformer-Text-Summarizer-Fine-tuning-with-ETL-pipeline-and-Deployment
Fine tuning pre-trained transformer model for custom text summarization with ETL pipeline and end to end deployment
Language: Jupyter Notebook - Size: 116 KB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

Rtoax/2D3D-TI-FD-RTM-cuda
This is an open source program based on NVIDIA CUDA, which includes two-dimensional and three-dimensional VTI media forward simulation and reverse time migration imaging, two-dimensional TTI media reverse time migration imaging, and ADCIGs extraction of the above media
Language: Cuda - Size: 763 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 34 - Forks: 9

Rtoax/VTI-FD-CUDA-GTK
NVIDIA-based GPU Accelerated Finite Difference Forward Seismic Simulation of VTI Media
Language: C - Size: 2.6 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 26 - Forks: 6

TomTolleson/CUDA-Kernel-Benchmarking-Tool
A benchmarking tool in C++ that creates Cuda kernels and tests the overall system performance between CPU and GPU
Language: Cuda - Size: 10.7 KB - Last synced at: 22 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

orlandopalmeira/Trabalho-CP-2023-2024
Repository for the practical assignment of the Parallel Computing (CP) course - Master's in Informatics Engineering (MEI/MIEI) - University of Minho (UMinho)
Language: C++ - Size: 1.36 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

NexusGPU/tensor-fusion-docs Fork of Code2Life/vitepress-diataxis-template
TensorFusion product documents
Language: HTML - Size: 1.65 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 5 - Forks: 1

NDHANA94/ros_nvidia_container
ROS Noetic (can be set up for other distros) container with NVIDIA GPU-accelerated OpenGL for Gazebo and RViz.
Language: Dockerfile - Size: 14.6 KB - Last synced at: 14 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1

imsanjoykb/CUDA-Bootcamp
CUDA Programming Practices
Language: Cuda - Size: 6.14 MB - Last synced at: 12 days ago - Pushed at: about 3 years ago - Stars: 15 - Forks: 3

BSalita/WSL2-Kali-KDE-Docker-Nvidia
Notes for Installing WSL2 Kali Linux + KDE Plasma GUI + Docker + Nvidia
Size: 73.2 KB - Last synced at: 8 days ago - Pushed at: about 4 years ago - Stars: 8 - Forks: 2

kiwijuice56/cuda-mandelbox
Ray marching renderer of the 3D mandelbox fractal, accelerated with CUDA GPU code
Language: Cuda - Size: 58.1 MB - Last synced at: 24 days ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

clowdhaus/amazon-eks-gpu-ami 📦
Packer configuration for creating an Amazon EKS AMI for use with NVIDIA GPUs
Language: HCL - Size: 45.9 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

parzival-space/stable-diffusion-docker
Simple Click-and-Run Docker Image for Stable Diffusion WebUI
Language: Dockerfile - Size: 1.42 MB - Last synced at: 9 days ago - Pushed at: 8 months ago - Stars: 1 - Forks: 2

pranavvss/Hand-Face-recognition-with-python
Hand/face detection model using Python (no hardware such as Arduino or sensors required)
Language: Python - Size: 122 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

aaditya29/Parallel-Computing-And-CUDA
Learning about Parallel Computing and GPU programming using CUDA.
Language: C++ - Size: 47.9 KB - Last synced at: 20 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

tkob-vh/CUDA_kernels
Some general algorithms implemented in cuda.
Language: Cuda - Size: 76.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Mattjesc/PTQ-MeshGraphNet-NVIDIA-Modulus
Post-Training Quantization for MeshGraphNet Physics-Based ML Model: Cardiovascular Flow Simulation Implementation
Language: Python - Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

efecaliskannn/Pneumonia-Detection-with-CNN--VGG16--and-ResNet50-Deep-Learning-Models
This project aims to detect pneumonia using deep learning, a subset of artificial intelligence. It examines the performance of deep learning models, including CNN, VGG16, and ResNet50, in detecting pneumonia.
Size: 9.77 KB - Last synced at: 22 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

chrisbraddock/gpu-tune
This project analyzes GPU performance metrics from AI training and inference tasks, highlighting the optimal power settings for efficient operation. It generates visualizations to help identify the best max power settings for training and inference based on energy consumption.
Language: Python - Size: 283 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

raidan-x/Intel-for-display-NVIDIA-for-computing
Intel for display, NVIDIA for computing
Size: 12.7 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

sahilmgandhi/gpu-parallel-kernel-execution
Final Project for CS259 Spring 2019 - Evaluating Parallel Kernel Execution on GPUs
Language: TeX - Size: 31.6 MB - Last synced at: 6 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

lalitdotdev/transcribeX
Transcribe audio in minutes with OpenAI's WhisperV3 and Flash Attention v2 + Transformers without relying on third-party providers and APIs. Host it yourself or try it out.
Language: TypeScript - Size: 235 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Beuth-Erdelt/prometheus_nvlink_exporter
This script collects information about NVLink and PCI bus traffic of NVidia GPUs. Results are published as Prometheus metrics via a websocket.
Language: Python - Size: 10.7 KB - Last synced at: 7 days ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 0

UntouchedWagons/K3S-NVidia
A guide on using NVidia GPUs for transcoding or AI in Kubernetes
Size: 7.81 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 4 - Forks: 2

GTruf/Driver-Drowsiness-Detector
Prototype of an intelligent safety system for detecting driver drowsiness
Language: C++ - Size: 25.5 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

IsaacAlves7/nvidia-cuda
👁️🗨️📗 It's a repository of Nvidia CUDA programming.
Size: 6.84 KB - Last synced at: 25 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

cooleybyte/linvidia
nvidia drivers AND cuda for linux (drivers are the latest as of 2024)
Size: 3.91 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Hello-Computer-Science/Hello-CUDA
A repo for studying CUDA
Language: Cuda - Size: 13.7 KB - Last synced at: 18 days ago - Pushed at: about 3 years ago - Stars: 9 - Forks: 0

sukesh-ak/Nvidia-GPU-vs-CPU
Comparison of vector operations on CPU vs. GPU using Nvidia CUDA
Language: C - Size: 307 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

BBC-Esq/Nvidia_Gpu_Monitor
Realtime Monitor of Nvidia GPU Metrics with NVML Library
Language: Python - Size: 37.1 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

qulei123/fmpeg
Mainly adapts scale_npp in avfilter to support the RGB24 and BGR24 formats, in order to reduce CPU usage.
Language: C - Size: 17.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

johannlilly/nvidia-cuda
Sample projects and code associated with the installation of CUDA v11 for learning CUDA with C and C++.
Size: 95.1 MB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

gitctrlx/JetYOLO
JetYOLO: Speed through your DeepStream app development, cleverly and creatively.
Language: C++ - Size: 19 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

energypatrikhu/ShinobiNvidiaDocker
Language: Shell - Size: 36.1 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

americandatascience/alphai
AlphAI is a versatile Python toolkit for GPU profiling and analytics, supporting various tensor models. It enhances GPU server operations and serves as a client for American Data Science's notebook servers.
Language: Python - Size: 1 MB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

harshdeepsokhey/cse570-parallel-distributed-processing
Implementation of Parallel and Distributed Processing assignments
Size: 1.95 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

ProjectoOfficial/CUDA
Learn CUDA step by step, starting from zero, with these simple and free code examples (comments are provided!)
Language: Cuda - Size: 18.6 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

sandialabs/p3a
Portably Performant Physical Algebra
Language: C++ - Size: 645 KB - Last synced at: 14 days ago - Pushed at: about 2 years ago - Stars: 12 - Forks: 4

AbdusamadDev/Face-ID-Surveillance-Project
In-production, AI-based service for human face detection. Uses advanced and popular technologies to efficiently detect face identity. The system runs locally and needs a strong graphics card for best performance.
Language: Python - Size: 791 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

FabricSoul/pytorch-jupyter-cuda
A Docker image that can run PyTorch and JupyterLab
Language: Dockerfile - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Zwilla/BitCaine5_aeternity_miner
BitCaine5 the mining engine meets æternity blockchain technology
Language: Cuda - Size: 7.24 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 1

itm-unipi/Parallelized-Nearest-Neighbor-Upscaler
University Project for "Computer Architecture" course (MSc Computer Engineering @ University of Pisa). Implementation of a Parallelized Nearest Neighbor Upscaler using CUDA.
Language: C - Size: 44.6 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 2

kriation/docker-ethminer 📦
A docker container with a CUDA enabled build of ethminer
Language: Dockerfile - Size: 25.4 KB - Last synced at: 15 days ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 5

zocker-160/sheepit-docker-webUI
A lightweight docker container for the SheepIt! render farm with WebUI with CUDA support
Language: Shell - Size: 65.4 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

ghokun/nvidia-docker-host 📦
Installation details of a virtual machine with nvidia-docker to run CUDA in containers.
Size: 682 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

nopesir/thesis-project
Thesis: "Autocalibration of monocular cameras for autonomous driving scenarios", carried out in collaboration with Luxoft. This repository contains all the related code and the two implemented solution pipelines for Structure from Motion (SfM) implementation.
Language: C++ - Size: 120 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

kon-si/ntua_parlab
Semester assignment for ECE NTUA 3257 Parallel Processing
Language: C - Size: 5.79 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

fthbng77/segmentation_WebRTC
Language: JavaScript - Size: 3.61 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

enfiskutensykkel/ssd-gpu-dma
Build userspace NVMe drivers and storage applications with CUDA support
Language: C - Size: 711 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 261 - Forks: 35

maya-undefined/gpu-desktop-calculator
Language: Cuda - Size: 66.4 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

kaviles22/OpenCV-CUDA
How to compile OpenCV to use CUDA within a Docker image
Size: 55.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

xallard/scalable-tensorflow-example
Scalable TensorFlow is an open-source initiative tailored to harness the full potential of TensorFlow in high-demand scenarios, emphasizing distributed training and efficient model serving.
Size: 1.95 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bhattbhavesh91/rapids-cudf-23-10-release-demo
Supercharge Your Pandas Code with RAPIDS cuDF & pandas accelerator mode | RAPIDS cuDF 23.10
Language: Jupyter Notebook - Size: 23.4 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

fjramireg/StiffMa
StiffMa: Fast finite element STIFFness MAtrix generation in MATLAB by using GPU computing.
Language: MATLAB - Size: 68.4 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 5

ayazhassan/RT-CUDA-GUI-Development
Recent developments in Graphics Processing Units (GPUs) have opened a new challenge in harnessing their computing power as a new general-purpose computing paradigm with CUDA parallel programming. However, porting applications to CUDA remains a challenge for average programmers. We have developed a restructuring software compiler (RT-CUDA) with the best possible kernel optimizations to bridge the gap between high-level languages and the machine-dependent CUDA environment. RT-CUDA is based upon a set of compiler optimizations. RT-CUDA takes a C-like program and converts it into an optimized CUDA kernel, with user directives in a configuration file for guiding the compiler. While the invocation of external libraries is not possible with the OpenACC commercial compiler, RT-CUDA allows transparent invocation of the most optimized external math libraries like cuSparse and cuBLAS. For this, RT-CUDA uses interfacing APIs, error handling interpretation, and user-transparent programming. This enables efficient design of linear algebra solvers (LAS). Evaluation of RT-CUDA has been performed on a Tesla K20c GPU with a variety of basic linear algebra operators (M+, MM, MV, VV, etc.) as well as the programming of solvers of systems of linear equations like Jacobi and Conjugate Gradient. We obtained significant speedup over other compilers like OpenACC and GPGPU compilers. RT-CUDA facilitates the design of efficient parallel software for developing parallel simulators (reservoir simulators, molecular dynamics, etc.), which are critical for the Oil & Gas industry. We expect RT-CUDA to be needed by many industries dealing with science and engineering simulation on massively parallel computers like NVIDIA GPUs.
Language: C - Size: 8.38 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 0

gvvsnrnaveen/cuda
This repository contains various programs that can be written using the CUDA Toolkit.
Language: C - Size: 1.31 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

cloudgear-io/azure-bigcompute
GPU SKU usage for Ubuntu 16.04-LTS and CentOS 7.4; standard open-source scheduler deployments (Torque, SLURM, PBSPro) for HPC SKUs for CentOS 7.4-HPC with OMS. This is presently on the GAed CentOS-HPC A9/H16R/H16MR and GPU NC6/NC12/NC24. Latest Docker CE and nvidia-docker present in all. Updating for DIGITS for 2.0 on nvidia-runtime
Language: Shell - Size: 3.37 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 2

koulkoudakis/CUDA-FFT-Research
Language: Cuda - Size: 1.1 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

kangyolo/get-started-jetson-nano
Guidance for Nvidia Jetson Nano
Size: 816 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Fotic/CUDA-Project-2
Calculate minus of 2D arrays on GPU
Language: Cuda - Size: 488 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

Fotic/CUDA-Project
Calculate mean of 2D arrays on GPU
Language: Cuda - Size: 1.25 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

Wayne2Wang/WSL2InstallationGuide
Installation Guide for Windows Subsystem for Linux 2
Size: 32.2 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

fogoat/htmlmining
Unofficial HTMLCOIN Mining Group
Size: 2.79 MB - Last synced at: 5 days ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 4

Jaay7/Object-Detection
Detects vehicles and warns them if they get close to another vehicle.
Language: Python - Size: 6.84 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

c1ph3r-fsocitey/CUDA-CUDNN-install-Ubuntu22.04
Here are the steps to install CUDA v11.36 and cuDNN v8.5 on Ubuntu 22.04
Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Bader-Research/CUDARMAT
RMAT Graph Generator for NVIDIA CUDA
Language: Cuda - Size: 326 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

karthikeyann/cuda-calculator Fork of szho42/cuda-calculator
HTML/JS port of CUDA Occupancy Calculator
Language: CoffeeScript - Size: 170 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 13 - Forks: 7
