An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: nvidia-cuda

kartavyaantani/CUDA_IMAGE_PROCESSING

A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line

Language: Jupyter Notebook - Size: 5.4 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

Cat-Gawr/DeepSeek-FlashMLA

DeepSeek Flash MLA - DeepSeek - copy manual

Language: C++ - Size: 58.6 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

yashkathe/Image-Noise-Reduction-with-CUDA

This project analyzes an image denoising technique, median blur, comparing GPU-accelerated (Numba) and CPU-based (OpenCV) processing speeds.

Language: Jupyter Notebook - Size: 25.4 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 0

Ghostbird/local-ai

Docker compose files to quickly spin up local AI set-ups for fun and learning

Language: Shell - Size: 21.5 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

markods/GpuSeqAlign

A benchmark for dynamic-programming-based GPU sequence alignment algorithms.

Language: C++ - Size: 2.12 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Jimver/cuda-toolkit

GitHub Action to install CUDA

Language: TypeScript - Size: 9.41 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 170 - Forks: 64

NexusGPU/tensor-fusion-site

TensorFusion landing page and product docs

Language: CSS - Size: 1.78 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 5 - Forks: 1

m1k1o/go-transcode

On-demand transcoding origin server for live inputs and static files in Go using ffmpeg. Also with NVIDIA GPU hardware acceleration.

Language: Go - Size: 295 KB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 244 - Forks: 41

bigsk1/gpu-monitor

Real-time performance metrics and statistics for your Nvidia GPU

Language: HTML - Size: 1020 KB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 27 - Forks: 0

Koushikphy/Intro-to-CUDA-Fortran

A complete beginner's introduction to programming with CUDA Fortran

Size: 200 KB - Last synced at: 13 days ago - Pushed at: over 2 years ago - Stars: 26 - Forks: 1

KernFerm/exporting-YOLO

This repository contains scripts and commands for exporting YOLO models to different formats, including TensorRT (.engine) and ONNX (.onnx).

Language: Python - Size: 773 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 4 - Forks: 1

genn-team/genn

GeNN is a GPU-enhanced Neuronal Network simulation environment based on code generation for Nvidia CUDA.

Language: C++ - Size: 246 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 252 - Forks: 65

tgautam03/tGeMM

General Matrix Multiplication using NVIDIA Tensor Cores

Language: Cuda - Size: 47.9 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 13 - Forks: 3

ProjectPhysX/PTXprofiler

A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.

Language: C++ - Size: 11.7 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 50 - Forks: 6

e-ago/hpgmg-cuda-async

GPUDirect Async implementation of HPGMG-FV CUDA

Language: Cuda - Size: 225 MB - Last synced at: 15 days ago - Pushed at: almost 7 years ago - Stars: 11 - Forks: 0

suvash/nixos-nvidia-cuda-python-docker-compose

A step-by-step guide to setting up Nvidia GPUs with CUDA support running on Docker (and Compose) containers on a NixOS host

Language: Dockerfile - Size: 69.3 KB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 5

SohelRana-aiub-Pro/Region-Proposal-Object-Detection-using-Computer-Vision-Algorithms

https://docs.omniverse.nvidia.com/prod_install-guide/prod_install-guide/overview.html

Language: Jupyter Notebook - Size: 1.69 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

pc2/GPUInspector.jl

Inspecting GPUs with Julia

Language: Julia - Size: 3.27 MB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 43 - Forks: 5

m1k1o/hls-restream

Restream live content as HLS using ffmpeg in docker. Also with NVIDIA GPU hardware acceleration.

Language: Shell - Size: 45.9 KB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 42 - Forks: 25

neurite/debian-setup

Language: Shell - Size: 1.08 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 27 - Forks: 3

aibos-dev/development-container-template-dg

Docker Container Template

Language: Dockerfile - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

mattdean1/cuda

An implementation of parallel exclusive scan in CUDA

Language: Cuda - Size: 32.2 KB - Last synced at: 14 days ago - Pushed at: about 7 years ago - Stars: 62 - Forks: 23

jatolentino/gAIze

Gaze Correction Tool With AI

Language: TypeScript - Size: 9.86 MB - Last synced at: 6 days ago - Pushed at: 7 months ago - Stars: 2 - Forks: 0

kvdomingo/genai-lab

GenAI playground maximizing the use of open-source software and models

Language: Python - Size: 1.33 MB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

ameli/manylinux-cuda

manylinux docker images with CUDA Toolkit

Language: Dockerfile - Size: 62.5 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 10 - Forks: 4

SparseLinearAlgebra/cuBool

Sparse linear Boolean algebra for Nvidia Cuda

Language: C++ - Size: 38.8 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 24 - Forks: 4

SartajBhuvaji/Cuda

Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.

Language: Cuda - Size: 286 MB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Aviksaikat/Hashcat-GPU-fix-for-kali

My solution for Hashcat not detecting NVIDIA GPU for hybrid graphics setup

Size: 426 KB - Last synced at: about 1 hour ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

bradleydworak/opencv-cuda-ubuntu2204

Compile OpenCV with NVIDIA GPU CUDA support under Ubuntu 24.04

Size: 30.3 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 1

pjyi2147/CUDA_HTN_Workshop

Introduction to Nvidia CUDA workshop repository @ Hack the North 2024

Language: Jupyter Notebook - Size: 8.47 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 5 - Forks: 2

ramcovasu/finetunelocalllm

Finetuning a Local LLM Gemma 2 2B using Unsloth and your own custom dataset for Custom Attribute extraction from unstructured content

Language: Python - Size: 9.77 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ItsMeDevRoland/Video-Frame-Converter

Convert Your Videos into Frame By Frame PNGs... Useful for Rotoscoping

Language: Python - Size: 41 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 1

robang74/isar-nvidia-debian

Build with ISAR an evaluation image based on Debian 11 (bullseye), selecting from nVidia GPU support (515.65.07) up to a graphics development environment with the full nVidia software stack (11.7.1) running a standard Debian kernel

Language: Shell - Size: 289 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

denyskryvytskyi/capgemini-cuda

CUDA implementation of vector addition, matrix multiplication, reduction and sorting

Language: Cuda - Size: 38.1 KB - Last synced at: 16 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

mahikshith/Transformer-Text-Summarizer-Fine-tuning-with-ETL-pipeline-and-Deployment

Fine tuning pre-trained transformer model for custom text summarization with ETL pipeline and end to end deployment

Language: Jupyter Notebook - Size: 116 KB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

Rtoax/2D3D-TI-FD-RTM-cuda

This is an open source program based on NVIDIA CUDA, which includes two-dimensional and three-dimensional VTI media forward simulation and reverse time migration imaging, two-dimensional TTI media reverse time migration imaging, and ADCIGs extraction of the above media

Language: Cuda - Size: 763 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 34 - Forks: 9

Rtoax/VTI-FD-CUDA-GTK

NVIDIA-based GPU Accelerated Finite Difference Forward Seismic Simulation of VTI Media

Language: C - Size: 2.6 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 26 - Forks: 6

TomTolleson/CUDA-Kernel-Benchmarking-Tool

A benchmarking tool in C++ that creates Cuda kernels and tests the overall system performance between CPU and GPU

Language: Cuda - Size: 10.7 KB - Last synced at: 22 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

orlandopalmeira/Trabalho-CP-2023-2024

Repository for the practical assignment in the Parallel Computing (Computação Paralela, CP) course unit - Master's in Informatics Engineering (MEI/MIEI) - Universidade do Minho (UMinho)

Language: C++ - Size: 1.36 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

NexusGPU/tensor-fusion-docs Fork of Code2Life/vitepress-diataxis-template

TensorFusion product documents

Language: HTML - Size: 1.65 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 5 - Forks: 1

NDHANA94/ros_nvidia_container

ROS Noetic (can be set up for other distros) container with NVIDIA GPU-accelerated OpenGL for Gazebo and RViz.

Language: Dockerfile - Size: 14.6 KB - Last synced at: 14 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1

imsanjoykb/CUDA-Bootcamp

CUDA Programming Practices

Language: Cuda - Size: 6.14 MB - Last synced at: 12 days ago - Pushed at: about 3 years ago - Stars: 15 - Forks: 3

BSalita/WSL2-Kali-KDE-Docker-Nvidia

Notes for Installing WSL2 Kali Linux + KDE Plasma GUI + Docker + Nvidia

Size: 73.2 KB - Last synced at: 8 days ago - Pushed at: about 4 years ago - Stars: 8 - Forks: 2

kiwijuice56/cuda-mandelbox

Ray marching renderer of the 3D mandelbox fractal, accelerated with CUDA GPU code

Language: Cuda - Size: 58.1 MB - Last synced at: 24 days ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

clowdhaus/amazon-eks-gpu-ami 📦

Packer configuration for creating an Amazon EKS AMI for use with NVIDIA GPUs

Language: HCL - Size: 45.9 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

parzival-space/stable-diffusion-docker

Simple Click-and-Run Docker Image for Stable Diffusion WebUI

Language: Dockerfile - Size: 1.42 MB - Last synced at: 9 days ago - Pushed at: 8 months ago - Stars: 1 - Forks: 2

pranavvss/Hand-Face-recognition-with-python

Hand/face detection model using Python (no hardware such as Arduino or sensors required)

Language: Python - Size: 122 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

aaditya29/Parallel-Computing-And-CUDA

Learning about Parallel Computing and GPU programming using CUDA.

Language: C++ - Size: 47.9 KB - Last synced at: 20 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

tkob-vh/CUDA_kernels

Some general algorithms implemented in CUDA.

Language: Cuda - Size: 76.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Mattjesc/PTQ-MeshGraphNet-NVIDIA-Modulus

Post-Training Quantization for MeshGraphNet Physics-Based ML Model: Cardiovascular Flow Simulation Implementation

Language: Python - Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

efecaliskannn/Pneumonia-Detection-with-CNN--VGG16--and-ResNet50-Deep-Learning-Models

This project aims to detect pneumonia using deep learning, a subset of artificial intelligence. It examines the performance of deep learning models, including CNN, VGG16, and ResNet50, in detecting pneumonia.

Size: 9.77 KB - Last synced at: 22 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

chrisbraddock/gpu-tune

This project analyzes GPU performance metrics from AI training and inference tasks, highlighting the optimal power settings for efficient operation. It generates visualizations to help identify the best max power settings for training and inference based on energy consumption.

Language: Python - Size: 283 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

raidan-x/Intel-for-display-NVIDIA-for-computing

Intel for display, NVIDIA for computing

Size: 12.7 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

sahilmgandhi/gpu-parallel-kernel-execution

Final Project for CS259 Spring 2019 - Evaluating Parallel Kernel Execution on GPUs

Language: TeX - Size: 31.6 MB - Last synced at: 6 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

lalitdotdev/transcribeX

Transcribe audio in minutes with OpenAI's WhisperV3 and Flash Attention v2 + Transformers without relying on third-party providers and APIs. Host it yourself or try it out.

Language: TypeScript - Size: 235 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Beuth-Erdelt/prometheus_nvlink_exporter

This script collects information about NVLink and PCI bus traffic of NVidia GPUs. Results are published as Prometheus metrics via a websocket.

Language: Python - Size: 10.7 KB - Last synced at: 7 days ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 0

UntouchedWagons/K3S-NVidia

A guide on using NVidia GPUs for transcoding or AI in Kubernetes

Size: 7.81 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 4 - Forks: 2

GTruf/Driver-Drowsiness-Detector

Prototype of an intelligent safety system for detecting driver drowsiness

Language: C++ - Size: 25.5 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

IsaacAlves7/nvidia-cuda

👁️‍🗨️📗 It's a repository of Nvidia CUDA programming.

Size: 6.84 KB - Last synced at: 25 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

cooleybyte/linvidia

nvidia drivers AND cuda for linux (drivers are the latest as of 2024)

Size: 3.91 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Hello-Computer-Science/Hello-CUDA

A repo for studying CUDA

Language: Cuda - Size: 13.7 KB - Last synced at: 18 days ago - Pushed at: about 3 years ago - Stars: 9 - Forks: 0

sukesh-ak/Nvidia-GPU-vs-CPU

Comparison of vector operations on CPU vs GPU using Nvidia CUDA

Language: C - Size: 307 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

BBC-Esq/Nvidia_Gpu_Monitor

Realtime Monitor of Nvidia GPU Metrics with NVML Library

Language: Python - Size: 37.1 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

qulei123/fmpeg

Mainly adapts scale_npp in avfilter to support the RGB24 and BGR24 formats, in order to reduce CPU usage.

Language: C - Size: 17.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

johannlilly/nvidia-cuda

Sample projects and code associated with the installation of CUDA v11 for learning CUDA with C and C++.

Size: 95.1 MB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

gitctrlx/JetYOLO

JetYOLO: Speed through your DeepStream app development, cleverly and creatively.

Language: C++ - Size: 19 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

energypatrikhu/ShinobiNvidiaDocker

Language: Shell - Size: 36.1 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

americandatascience/alphai

AlphAI is a versatile Python toolkit for GPU profiling and analytics, supporting various tensor models. It enhances GPU server operations and serves as a client for American Data Science's notebook servers.

Language: Python - Size: 1 MB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

harshdeepsokhey/cse570-parallel-distributed-processing

Implementation of Parallel and Distributed Processing assignments

Size: 1.95 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

ProjectoOfficial/CUDA

Learn cuda step-by-step starting from 0 with these simple and free code examples (comments are provided!)

Language: Cuda - Size: 18.6 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

sandialabs/p3a

Portably Performant Physical Algebra

Language: C++ - Size: 645 KB - Last synced at: 14 days ago - Pushed at: about 2 years ago - Stars: 12 - Forks: 4

AbdusamadDev/Face-ID-Surveillance-Project

An in-production, AI-based service for human face detection. Uses advanced and popular technologies to efficiently detect face identity. The system runs locally and needs a strong graphics card for best performance.

Language: Python - Size: 791 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

FabricSoul/pytorch-jupyter-cuda

A Docker image that can run PyTorch and JupyterLab

Language: Dockerfile - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Zwilla/BitCaine5_aeternity_miner

BitCaine5 the mining engine meets æternity blockchain technology

Language: Cuda - Size: 7.24 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 1

itm-unipi/Parallelized-Nearest-Neighbor-Upscaler

University Project for "Computer Architecture" course (MSc Computer Engineering @ University of Pisa). Implementation of a Parallelized Nearest Neighbor Upscaler using CUDA.

Language: C - Size: 44.6 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 2

kriation/docker-ethminer 📦

A docker container with a CUDA enabled build of ethminer

Language: Dockerfile - Size: 25.4 KB - Last synced at: 15 days ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 5

zocker-160/sheepit-docker-webUI

A lightweight docker container for the SheepIt! render farm with WebUI with CUDA support

Language: Shell - Size: 65.4 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

ghokun/nvidia-docker-host 📦

Installation details of a virtual machine with nvidia-docker to run CUDA in containers.

Size: 682 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

nopesir/thesis-project

Thesis: "Autocalibration of monocular cameras for autonomous driving scenarios", carried out in collaboration with Luxoft. This repository contains all the related code and the two implemented solution pipelines for Structure from Motion (SfM) implementation.

Language: C++ - Size: 120 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

kon-si/ntua_parlab

Semester assignment for ECE NTUA 3257 Parallel Processing

Language: C - Size: 5.79 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

fthbng77/segmentation_WebRTC

Language: JavaScript - Size: 3.61 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

enfiskutensykkel/ssd-gpu-dma

Build userspace NVMe drivers and storage applications with CUDA support

Language: C - Size: 711 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 261 - Forks: 35

maya-undefined/gpu-desktop-calculator

Language: Cuda - Size: 66.4 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

kaviles22/OpenCV-CUDA

How to compile OpenCV to use CUDA within a Docker image

Size: 55.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

xallard/scalable-tensorflow-example

Scalable TensorFlow is an open-source initiative tailored to harness the full potential of TensorFlow in high-demand scenarios, emphasizing distributed training and efficient model serving.

Size: 1.95 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bhattbhavesh91/rapids-cudf-23-10-release-demo

Supercharge Your Pandas Code with RAPIDS cuDF & pandas accelerator mode | RAPIDS cuDF 23.10

Language: Jupyter Notebook - Size: 23.4 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

fjramireg/StiffMa

StiffMa: Fast finite element STIFFness MAtrix generation in MATLAB by using GPU computing.

Language: MATLAB - Size: 68.4 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 5

ayazhassan/RT-CUDA-GUI-Development

Recent development in Graphics Processing Units (GPUs) has opened a new challenge in harnessing their computing power as a new general-purpose computing paradigm with CUDA parallel programming. However, porting applications to CUDA remains a challenge for average programmers. We have developed a restructuring software compiler (RT-CUDA) with the best possible kernel optimizations to bridge the gap between high-level languages and the machine-dependent CUDA environment. RT-CUDA is based upon a set of compiler optimizations. RT-CUDA takes a C-like program and converts it into an optimized CUDA kernel, with user directives in a configuration file for guiding the compiler. While the invocation of external libraries is not possible with the OpenACC commercial compiler, RT-CUDA allows transparent invocation of the most optimized external math libraries like cuSparse and cuBLAS. For this, RT-CUDA uses interfacing APIs, error handling interpretation, and user-transparent programming. This enables efficient design of linear algebra solvers (LAS). Evaluation of RT-CUDA has been performed on a Tesla K20c GPU with a variety of basic linear algebra operators (M+, MM, MV, VV, etc.) as well as the programming of solvers of systems of linear equations like Jacobi and Conjugate Gradient. We obtained significant speedup over other compilers like OpenACC and GPGPU compilers. RT-CUDA facilitates the design of efficient parallel software for developing parallel simulators (reservoir simulators, molecular dynamics, etc.) which are critical for the Oil & Gas industry. We expect RT-CUDA to be needed by many industries dealing with science and engineering simulation on massively parallel computers like NVIDIA GPUs.

Language: C - Size: 8.38 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 0

gvvsnrnaveen/cuda

This repository contains various programs that can be written using the CUDA Toolkit.

Language: C - Size: 1.31 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

cloudgear-io/azure-bigcompute

GPU Sku usage for Ubuntu 16.04-LTS and CentOS 7.4, Standard Open Source Scheduler Deployments Torque, SLURM, PBSPro for HPC Skus for CentOS 7.4-HPC with OMS. This is presently on the GAed CentOS-HPC A9/H16R/H16MR and GPU NC6/NC12/NC24. Latest Docker CE and nvidia-docker present in all. Updating for DIGITS for 2.0 on nvidia-runtime

Language: Shell - Size: 3.37 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 2

koulkoudakis/CUDA-FFT-Research

Language: Cuda - Size: 1.1 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

kangyolo/get-started-jetson-nano

Guidance for Nvidia Jetson Nano

Size: 816 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Fotic/CUDA-Project-2

Calculate minus of 2D arrays on GPU

Language: Cuda - Size: 488 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

Fotic/CUDA-Project

Calculate mean of 2D arrays on GPU

Language: Cuda - Size: 1.25 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

Wayne2Wang/WSL2InstallationGuide

Installation Guide for Windows Subsystem for Linux 2

Size: 32.2 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

fogoat/htmlmining

Unofficial HTMLCOIN Mining Group

Size: 2.79 MB - Last synced at: 5 days ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 4

Jaay7/Object-Detection

Detects vehicles and warns them if they get close to another vehicle.

Language: Python - Size: 6.84 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

c1ph3r-fsocitey/CUDA-CUDNN-install-Ubuntu22.04

Here are the steps to install CUDA v11.36 and cuDNN v8.5 on Ubuntu 22.04

Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Bader-Research/CUDARMAT

RMAT Graph Generator for NVIDIA CUDA

Language: Cuda - Size: 326 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

karthikeyann/cuda-calculator Fork of szho42/cuda-calculator

HTML/JS port of CUDA Occupancy Calculator

Language: CoffeeScript - Size: 170 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 13 - Forks: 7