GitHub topics: gpu-computing

Repositories

kpet/clvk

Implementation of OpenCL 3.0 on Vulkan

Language: C++ - Size: 1.68 MB - Last synced at: about 7 hours ago - Pushed at: about 10 hours ago - Stars: 399 - Forks: 45

Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!

Language: C++ - Size: 14.6 MB - Last synced at: about 15 hours ago - Pushed at: 3 days ago - Stars: 1,664 - Forks: 203

preda/gpuowl

GPU Mersenne primality test.

Language: C++ - Size: 13.6 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 197 - Forks: 49

ewanwm/nuTens

Tensor based engine for calculating neutrino oscillation probabilities in a fast, flexible, and differentiable way

Language: C++ - Size: 6.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

KomputeProject/kompute

General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing use cases. Backed by the Linux Foundation.

Language: C++ - Size: 25.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,258 - Forks: 174

nsreelekha/driven_cavity_flow

2D lid-driven cavity flow using Chorin projection and GFDM (GPU acceleration and CPU implementation)

Language: Jupyter Notebook - Size: 120 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

tiktokfnf33/Rayleigh-Taylor-Instability-Simulation

CUDA Rayleigh-Taylor Instability Simulation. This repository features a high-performance simulation of the Rayleigh-Taylor instability using CUDA, Python, and C. Explore the implementation and results to understand fluid dynamics in a parallel computing context. 🖥️🚀

Language: Jupyter Notebook - Size: 13.9 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 1

Cyxuan0311/cuOP

cuOP is an operator library with support for CUDA and smart memory control

Language: C++ - Size: 73.2 KB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

lightbulb128/troy-nova

GPU/CUDA implementation of Leveled BFV/CKKS/BGV scheme.

Language: Cuda - Size: 976 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 35 - Forks: 8

tor4z/cov.hpp

A library for GPU computing with Vulkan and shaders

Language: C++ - Size: 93.8 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

LuxCoreRender/BlendLuxCore

Blender Integration for LuxCore

Language: Python - Size: 341 MB - Last synced at: 2 days ago - Pushed at: 11 days ago - Stars: 794 - Forks: 96

exospherehost/exospherehost

Mono repo for exosphere.host to simplify infrastructure once and for all.

Language: TypeScript - Size: 26 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 21 - Forks: 6

ComputationalRadiationPhysics/picongpu

Performance-Portable Particle-in-Cell Simulations for the Exascale Era ✨

Language: C++ - Size: 58.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 744 - Forks: 221

uncomplicate/clojurecl

ClojureCL is a Clojure library for parallel computations with OpenCL.

Language: Clojure - Size: 875 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 280 - Forks: 18

uncomplicate/clojurecuda

Clojure library for CUDA development

Language: Clojure - Size: 514 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 186 - Forks: 10

Seth024/architecture-guides

Explore essential architecture guides on modern patterns and design principles. Enhance your software projects with proven strategies. 🌐🚀

Size: 21.5 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

open-atmos/PySDM

Pythonic particle-based (super-droplet) warm-rain/aqueous-chemistry cloud microphysics package with box, parcel & 1D/2D prescribed-flow examples in Python, Julia and Matlab

Language: Python - Size: 71.9 MB - Last synced at: 1 day ago - Pushed at: 5 days ago - Stars: 76 - Forks: 46

MultiphaseFlowLab/MHIT36

Multi-GPU version of MHIT36 using cuDecomp

Language: Fortran - Size: 49.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 6 - Forks: 0

lovnishverma/CPU_VS_GPU

A comprehensive benchmarking tool to compare performance between CPU and GPU

Language: Python - Size: 626 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

mikbry/awesome-webgpu

😎 Curated list of awesome things around WebGPU ecosystem.

Size: 99.6 KB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 1,680 - Forks: 71

kousuke-nakano/orbkit

`orbkit` is a JAX-compatible toolkit for continuous ab initio quantum Monte Carlo (QMC) simulations, developed entirely from scratch using Python and JAX.

Language: Python - Size: 310 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 7 - Forks: 0

NVIDIA/thrust 📦

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

Language: C++ - Size: 17 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 4,980 - Forks: 763

beehive-lab/docker-tornadovm

Docker build scripts for TornadoVM on GPUs: https://github.com/beehive-lab/TornadoVM

Language: Shell - Size: 175 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 29 - Forks: 6

NVIDIA/MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

Language: C++ - Size: 20.6 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 1,338 - Forks: 101

FluidNumerics/SELF

Spectral Element Library in Fortran

Language: Fortran - Size: 48.6 MB - Last synced at: 4 days ago - Pushed at: 10 days ago - Stars: 81 - Forks: 13

IntelPython/dpctl

Python SYCL bindings and SYCL-based Python Array API library

Language: C++ - Size: 219 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 114 - Forks: 30

microsoft/pai 📦

Resource scheduling and cluster management for AI

Language: JavaScript - Size: 70.5 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 2,666 - Forks: 547

ProjectPhysX/OpenCL-Benchmark

A small OpenCL benchmark program to measure peak GPU/CPU performance.

Language: C++ - Size: 248 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 230 - Forks: 28

inducer/pycuda

CUDA integration for Python, plus shiny features

Language: Python - Size: 2.95 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 1,966 - Forks: 291

KernelTuner/kernel_tuner

Kernel Tuner

Language: Python - Size: 41.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 351 - Forks: 56

CEA-MetroCarac/pyvsnr

A Python library for computing the VSNR in 2D images. It provides both CPU and GPU implementations.

Language: Jupyter Notebook - Size: 24.8 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 9 - Forks: 0

uncomplicate/neanderthal

Fast Clojure Matrix Library

Language: Clojure - Size: 3.72 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,100 - Forks: 58

8e8bdba457c18cf692a95fe2ec67000b/VulkanCooperativeMatrixAttention

Vulkan & GLSL implementation of FlashAttention-2

Size: 1.95 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

software-mansion/TypeGPU

TypeScript library that enhances the WebGPU API, allowing resource management in a type-safe, declarative way.

Language: TypeScript - Size: 89.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 537 - Forks: 12

gyroflow/gyroflow

Video stabilization using gyroscope data

Language: Rust - Size: 81.3 MB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 7,598 - Forks: 344

trixi-gpu/trixi-gpu.github.io

Documentation for Trixi-GPU

Language: SCSS - Size: 10.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 0

Dpdl-io/DpdlEngine

Dpdl (Dynamic Packet Definition Language) is a rapid development programming language and constrained device framework with built-in database technology. Dpdl also enables the embedding and execution of multiple programming languages (C, C++, Python, etc.) directly within Dpdl code.

Language: C - Size: 981 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 0

recp/gpu

🔭 cross platform general purpose GPU library - optimized for rendering

Language: C - Size: 1.82 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 38 - Forks: 2

RobbiSixx/prysm

Prysm is a blazing-smart Puppeteer-based web scraper that doesn't just extract - it understands structure. Capable of scraping virtually any website with intelligent content detection and 14 specialized scroll strategies that adapt to different page layouts, Prysm excels at extracting content that other scrapers miss.

Language: JavaScript - Size: 1.22 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 0

SciML/SciMLBook

Parallel Computing and Scientific Machine Learning (SciML): Methods and Applications (MIT 18.337J/6.338J)

Language: HTML - Size: 120 MB - Last synced at: 5 days ago - Pushed at: 13 days ago - Stars: 1,919 - Forks: 351

NVIDIA/cccl

CUDA Core Compute Libraries

Language: C++ - Size: 84.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,736 - Forks: 234

xsuite/xsuite

Suite of python packages for multiparticle simulations of particle accelerators.

Language: Python - Size: 47 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 39 - Forks: 25

AccelerateHS/accelerate

Embedded language for high-performance array computations

Language: Haskell - Size: 15.4 MB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 928 - Forks: 123

LuxCoreRender/LuxCore

LuxCore source repository

Language: C++ - Size: 156 MB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 1,235 - Forks: 153

tkemmer/CuNESSie.jl

CUDA-accelerated Nonlocal Electrostatics in Structured Solvents

Language: Julia - Size: 207 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

BindsNET/bindsnet

Simulation of spiking neural networks (SNNs) using PyTorch.

Language: Python - Size: 39.1 MB - Last synced at: 7 days ago - Pushed at: 14 days ago - Stars: 1,600 - Forks: 336

haschka/multicube

Obtaining molecular partial charges using direct minimization

Language: C - Size: 2.01 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

mikeroyal/GPU-Guide

Graphics Processing Unit (GPU) Architecture Guide

Language: Shell - Size: 815 KB - Last synced at: about 17 hours ago - Pushed at: over 3 years ago - Stars: 221 - Forks: 19

catboost/catboost

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Language: C++ - Size: 1.5 GB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 8,461 - Forks: 1,231

gopmur/OpenSayal

Real-time 2D fluid simulator on CUDA

Language: Cuda - Size: 669 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 21 - Forks: 2

Tensor-Array/Tensor-Array

A C++ machine learning framework/library.

Language: C++ - Size: 11.8 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 2 - Forks: 0

statmlben/rankseg

RankSEG: A consistent ranking-based framework for segmentation

Language: Jupyter Notebook - Size: 63.5 MB - Last synced at: 2 days ago - Pushed at: 8 days ago - Stars: 26 - Forks: 2

ginkgo-project/ginkgo

Numerical linear algebra software package

Language: C++ - Size: 156 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 480 - Forks: 97

AnicetNgrt/jiro-nn

A Deep Learning and preprocessing framework in Rust with support for CPU and GPU.

Language: Rust - Size: 17.5 MB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 132 - Forks: 3

dyGiLa/dyGiLa

Repository of dyGiLa project

Language: C++ - Size: 1020 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 1

NonDairyNeutrino/Thesis

Thesis for the Computational Science Master's program at Central Washington University. 3D extension of an analog of cosmological particle creation in a Friedmann-Robertson-Walker universe by numerically simulating a Bose-Einstein condensate with a time-dependent scattering length.

Language: TeX - Size: 14.4 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 2 - Forks: 0

denosaurs/netsaur

Powerful Machine Learning library with GPU, CPU and WASM backends

Language: Rust - Size: 146 MB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 246 - Forks: 4

JuliaGPU/KernelAbstractions.jl

Heterogeneous programming in Julia

Language: Julia - Size: 4.41 MB - Last synced at: 10 days ago - Pushed at: 13 days ago - Stars: 438 - Forks: 74

ProjectPhysX/OpenCL-Wrapper

OpenCL is the most powerful programming language ever created. Yet the OpenCL C++ bindings are cumbersome and the code overhead prevents many people from getting started. I created this lightweight OpenCL-Wrapper to greatly simplify OpenCL software development with C++ while keeping functionality and performance.

Language: C++ - Size: 344 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 408 - Forks: 41

ROCm/hipBLASLt

[DEPRECATED] Moved to ROCm/rocm-libraries repo

Language: Assembly - Size: 1.15 GB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 109 - Forks: 142

beehive-lab/TornadoVM

TornadoVM: A practical and efficient heterogeneous programming framework for managed languages

Language: Java - Size: 152 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 1,280 - Forks: 120

akilandrews/hecbench-openmp-builder

Bash build scripts for HeCBench OpenMP offload benchmarks

Language: Shell - Size: 8.79 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

lachlan2k/phatcrack

Modern web-based distributed hashcracking solution, built on hashcat

Language: Go - Size: 10.7 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 141 - Forks: 12

google/tf-quant-finance

High-performance TensorFlow library for quantitative finance.

Language: Python - Size: 16.9 MB - Last synced at: 13 days ago - Pushed at: 4 months ago - Stars: 4,913 - Forks: 628

mumax/plus

More versatile and extensible GPU-accelerated micromagnetic simulator

Language: Python - Size: 90.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 19 - Forks: 5

tensorflow/lingvo

Lingvo

Language: Python - Size: 142 MB - Last synced at: 3 days ago - Pushed at: 26 days ago - Stars: 2,844 - Forks: 450

NonDairyNeutrino/PararealGPU.jl

A distributed and GPU-based implementation of the Parareal algorithm for parallel-in-time integration of equations of motion.

Language: Julia - Size: 3.37 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 3 - Forks: 0

datum-cloud/awesome-alt-clouds

A list of specialized clouds that span traditional infra, AI, data, connectivity, and more.

Size: 257 KB - Last synced at: 4 days ago - Pushed at: 21 days ago - Stars: 22 - Forks: 5

hhaoyan/opt-einsum-torch

Memory-efficient optimum einsum using opt_einsum planning and PyTorch kernels.

Language: Python - Size: 23.4 KB - Last synced at: 15 days ago - Pushed at: about 2 years ago - Stars: 16 - Forks: 2

AmesingFlank/taichi.js

Modern GPU Compute and Rendering in JavaScript

Language: TypeScript - Size: 220 MB - Last synced at: 9 days ago - Pushed at: 12 months ago - Stars: 502 - Forks: 19

mratsim/Arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends

Language: Nim - Size: 3.8 MB - Last synced at: 13 days ago - Pushed at: 4 months ago - Stars: 1,373 - Forks: 96

alfinauzikri/ROCm-RX6600XT

AMD ROCm Installation Guide on RX 6600 XT + TensorFlow and PyTorch

Size: 8.79 KB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 74 - Forks: 6

chr0n1x/rpi-talos

My homelab on TalosOS, and a variety of other hardware.

Language: Shell - Size: 1.24 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 3 - Forks: 0

ProjectPhysX/FluidX3D

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.

Language: C++ - Size: 20.8 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 4,480 - Forks: 389

ParaGroup/WindFlow

A C++17 Data Stream Processing Parallel Library for Multicores and GPUs

Language: C++ - Size: 48.9 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 85 - Forks: 19

dawn-gpu/node-webgpu

WebGPU for Node.js

Language: JavaScript - Size: 25.9 MB - Last synced at: 1 day ago - Pushed at: 20 days ago - Stars: 34 - Forks: 3

goki/vgpu

Vulkan GPU Framework for Graphics and Compute in Go, now developed at https://github.com/cogentcore/core/tree/main/gpu

Language: Go - Size: 1.4 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 32 - Forks: 4

KempnerInstitute/kempner-computing-handbook

Kempner Institute Computing Handbook

Language: JavaScript - Size: 66.4 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 12 - Forks: 5

RRZE-HPC/gpu-benches

collection of benchmarks to measure basic GPU capabilities

Language: C++ - Size: 1.78 MB - Last synced at: 13 days ago - Pushed at: 5 months ago - Stars: 386 - Forks: 55

PurdueRCAC/gpu-stress

Simple GPU stress utility

Language: Python - Size: 26.4 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0

hel-astro-lab/runko

Modern C++/python CPU/GPU plasma toolbox

Language: C++ - Size: 4.33 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 50 - Forks: 19

EMI-Group/evomo

EvoMO is a GPU-accelerated library for evolutionary multiobjective optimization (EMO)

Language: Python - Size: 999 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 90 - Forks: 9

calebwin/emu

The write-once-run-anywhere GPGPU library for Rust

Language: Rust - Size: 342 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 1,610 - Forks: 52

mfem/PyMFEM

Python wrapper for MFEM

Language: SWIG - Size: 25.9 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 257 - Forks: 64

TinkerTools/tinker-gpu

Tinker-GPU: Next Generation of Tinker with GPU Support

Language: C++ - Size: 38.4 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 51 - Forks: 29

AMReX-Agent/ExaEpi

An agent-based epidemiological simulation code using AMReX

Language: Jupyter Notebook - Size: 159 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 3 - Forks: 10

baggepinnen/MonteCarloMeasurements.jl

Propagation of distributions by Monte-Carlo sampling: Real number types with uncertainty represented by samples.

Language: Julia - Size: 5.24 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 276 - Forks: 18

Ricks-Lab/gpu-utils

A set of utilities for monitoring and customizing GPU performance

Language: Python - Size: 3.98 MB - Last synced at: 20 days ago - Pushed at: about 1 year ago - Stars: 152 - Forks: 24

bokiko/Kuzco-Multi-GPU-Inference.net

Kuzco : Multi-GPU Inference.net guide

Language: Shell - Size: 43.9 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 0

Nicolas-Ferre/wgso

WebGPU Shader Orchestrator to create GPU-native applications

Language: Rust - Size: 289 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

nosek002/exospherehost

Build and scale AI workflows effortlessly with Exospherehost. Our open-source infrastructure empowers creators to focus on innovation. 🚀🌍

Size: 1.06 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

ROCm/Tensile

[DEPRECATED] Moved to ROCm/rocm-libraries repo

Language: Python - Size: 95.2 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 246 - Forks: 166

coreylowman/dfdx

Deep learning in Rust, with shape checked tensors and neural networks

Language: Rust - Size: 2.6 MB - Last synced at: 22 days ago - Pushed at: 12 months ago - Stars: 1,824 - Forks: 106

bmahe/Genetics4J

Mirror - Genetic Algorithms and Genetic Programming library. https://genetics4j.org

Language: Java - Size: 15.1 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

knagrecha/saturn

Saturn accelerates the training of large-scale deep learning models with a novel joint optimization approach.

Language: Python - Size: 107 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 5

barbagroup/PetIBM

PetIBM - toolbox and applications of the immersed-boundary method on distributed-memory architectures

Language: C++ - Size: 14.9 MB - Last synced at: 7 days ago - Pushed at: almost 3 years ago - Stars: 110 - Forks: 52

IntelPython/DPEP

Data Parallel Extensions for Python*

Language: Jupyter Notebook - Size: 8.36 MB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 36 - Forks: 8

eth-cscs/DLA-Future-Fortran

Fortran interface for DLA-Future

Language: Fortran - Size: 1.64 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 6 - Forks: 1

Guillaume-Helbecque/GPU-accelerated-tree-search-Chapel

GPU-accelerated tree search: Investigating Chapel versus CUDA/HIP+X

Language: C - Size: 488 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 2 - Forks: 1

andrewmilson/ministark

🏃‍♂️💨 GPU accelerated STARK prover built on @arkworks-rs

Language: Rust - Size: 1.65 MB - Last synced at: 13 days ago - Pushed at: 8 months ago - Stars: 361 - Forks: 36

Related Keywords

gpu-computing (856), cuda (277), gpu (257), gpu-acceleration (139), opencl (101), gpu-programming (96), parallel-computing (85), python (78), deep-learning (75), machine-learning (73), cpp (73), gpgpu (63), high-performance-computing (48), hpc (48), c (48), nvidia (46), cuda-programming (44), parallel-programming (33), tensorflow (32), pytorch (32), docker (28), openmp (27), opengl (25), ai (24), rust (23), metal (23), vulkan (23), simulation (22), cuda-kernels (22), neural-networks (22), scientific-computing (20), nvidia-gpu (19), matrix-multiplication (18), python3 (18), deep-neural-networks (18), neural-network (17), parallel (17), cfd (16), artificial-intelligence (16), raytracing (15), image-processing (15), glsl (14), benchmark (14), nvidia-cuda (14), mpi (14), computational-fluid-dynamics (13), webgpu (13), cpu (13), distributed-computing (12), hip (12), tensor (12), julia (12), computer-vision (12), cplusplus (11), physics-simulation (11), sycl (11), shaders (11), graphics (11), graph-algorithms (11), high-performance (10), fortran (10), data-science (10), distributed-systems (10), rocm (10), java (10), cuda-toolkit (10), linear-algebra (10), swift (10), linux (10), convolutional-neural-networks (10), cudnn (9), gpgpu-computing (9), benchmarking (9), opencv (9), optimization (9), llm (9), jupyter-notebook (9), heterogeneous-parallel-programming (9), compute-shader (9), kubernetes (8), shader (8), cpp17 (8), javascript (8), physics (8), numpy (8), sparse-matrix (8), ray-tracing (8), amd (8), oneapi (8), tutorial (8), hpc-applications (8), openacc (7), numerical-methods (7), 3d (7), webgl (7), csharp (7), numba (7), jax (7), algorithms (7), c-plus-plus (7)