GitHub topics: gpu-computing
aledinola/AiyagariSparse
Infinite horizon dynamic programming with Howard acceleration and sparse matrices, tested on Aiyagari model
Language: MATLAB - Size: 24.4 KB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 0 - Forks: 0
software-mansion/TypeGPU
A modular and open-ended toolkit for WebGPU, with advanced type inference and the ability to write shaders in TypeScript
Language: TypeScript - Size: 256 MB - Last synced at: about 6 hours ago - Pushed at: about 6 hours ago - Stars: 1,026 - Forks: 25
JuliaGPU/KernelAbstractions.jl
Heterogeneous programming in Julia
Language: Julia - Size: 4.73 MB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 466 - Forks: 80
Guillaume-Helbecque/GPU-accelerated-tree-search-Chapel
GPU-accelerated tree search: Investigating Chapel versus CUDA/HIP+X
Language: Chapel - Size: 592 KB - Last synced at: about 7 hours ago - Pushed at: about 9 hours ago - Stars: 2 - Forks: 1
FAST-Imaging/FAST
A framework for high-performance medical image processing, neural network inference and visualization
Language: C++ - Size: 20 MB - Last synced at: about 13 hours ago - Pushed at: about 13 hours ago - Stars: 488 - Forks: 108
dyGiLa/dyGiLa
Repository of dyGiLa project
Language: C++ - Size: 1.07 MB - Last synced at: about 13 hours ago - Pushed at: about 15 hours ago - Stars: 0 - Forks: 1
JuliaAstroSim/AstroNbodySim.jl
Unitful and differentiable gravitational N-body simulation code in Julia
Language: Julia - Size: 19.6 MB - Last synced at: about 15 hours ago - Pushed at: about 17 hours ago - Stars: 34 - Forks: 0
NonDairyNeutrino/Thesis
Thesis for the Computational Science Master's program at Central Washington University. 3D extension of an analog of cosmological particle creation in a Friedmann-Robertson-Walker universe by numerically simulating a Bose-Einstein condensate with a time-dependent scattering length.
Language: TeX - Size: 20.4 MB - Last synced at: about 18 hours ago - Pushed at: about 19 hours ago - Stars: 3 - Forks: 0
ley995/triangle-splatting2
🎨 Implement differentiable rendering techniques using opaque triangles for advanced graphics applications with the Triangle Splatting+ framework.
Language: Python - Size: 3.08 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0
stotko/stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
Language: C++ - Size: 5.01 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,234 - Forks: 91
Avicted/hip-img-fx
Fast GPU-accelerated image filters (HIP) with a CPU fallback path for portability.
Language: C++ - Size: 15.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0
Lyynn777/CUDA-Bitonic-sort
Simple CUDA project to implement Bitonic Sort and compare it with normal CPU sorting.
Language: Jupyter Notebook - Size: 179 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0
google/tf-quant-finance
High-performance TensorFlow library for quantitative finance.
Language: Python - Size: 16.9 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 5,030 - Forks: 647
tinker495/PuXle
PuXle: Planning using jaX-based learning environments
Language: Python - Size: 1.14 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4 - Forks: 1
ProteusMRIgHIFU/BabelViscoFDTD
Software library for FDTD of viscoelastic equation using a staggered grid arrangement with support for GPU and CPU backends
Language: Jupyter Notebook - Size: 117 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 56 - Forks: 12
chr0n1x/rpi-talos
My homelab on TalosOS, and a variety of other hardware.
Language: Shell - Size: 1.4 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6 - Forks: 0
catboost/catboost
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Language: C++ - Size: 1.51 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 8,641 - Forks: 1,243
quantaosun/NAMD-FEP
Cloud-GPU-supported, NAMD3-based calculation of the binding free energy difference between two small molecules against the same protein target. This is probably one of the fastest FEP simulations on free GPU hardware that the general public can access.
Language: Jupyter Notebook - Size: 30.1 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 26 - Forks: 9
LennartPaduch/WebLBM
Fast and memory efficient Lattice Boltzmann CFD (D2Q9) for the browser running on the GPU via the WebGPU API https://weblbm.pages.dev/
Language: TypeScript - Size: 57.6 KB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0
tinker495/JAxtar
JAxtar is a project with a JAX-native implementation of a parallelizable A* & Q* solver for neural heuristic search research.
Language: Python - Size: 13.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 42 - Forks: 4
sean-hungerford/seedVR2_cudafull
🖥️ Enhance your video and image quality with SeedVR2 for ComfyUI, supporting multi-GPU setups for efficient upscaling.
Language: Python - Size: 4.74 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0
DESIGN4ADDITIVE/GPUCADforAM
Design Software for Additive Manufacturing
Language: C++ - Size: 1.48 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 10 - Forks: 2
maltsev-andrey/gpu-data-exploration
Data exploration tools for GPU computing benchmarks - Wikipedia & HDF5 sensor datasets
Language: Python - Size: 8.79 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0
xsuite/xsuite
Suite of Python packages for multiparticle simulations of particle accelerators.
Language: Python - Size: 58.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 44 - Forks: 29
LLNL/CARE
CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as loop fusion capability and a portable interface for many numerical algorithms. It provides all the basics for anyone wanting to write portable code.
Language: C++ - Size: 1.51 MB - Last synced at: about 1 hour ago - Pushed at: about 3 hours ago - Stars: 31 - Forks: 5
KernelTuner/kernel_tuner
Kernel Tuner
Language: Python - Size: 41.5 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 370 - Forks: 59
8e8bdba457c18cf692a95fe2ec67000b/VulkanCooperativeMatrixAttention
Vulkan & GLSL implementation of FlashAttention-2
Size: 1.95 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0
ComputationalRadiationPhysics/picongpu
Performance-Portable Particle-in-Cell Simulations for the Exascale Era ✨
Language: C++ - Size: 59.1 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 760 - Forks: 225
ZrobMiloudaa/jetson-orin-matmul-analysis
🔍 Analyze CUDA matrix multiplication performance and power consumption on NVIDIA Jetson Orin Nano across multiple implementations and settings.
Language: Python - Size: 9.36 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0
javimanotas/fractals
2D and 3D fractal rendering with Unity
Language: ShaderLab - Size: 62.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0
gyroflow/gyroflow
Video stabilization using gyroscope data
Language: Rust - Size: 83.8 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 7,898 - Forks: 363
Dipper180/nvidia-z33qn
🎨 Explore random, Nvidia-inspired projects in this unique collection, showcasing creativity and innovation for developers and tech enthusiasts alike.
Language: JavaScript - Size: 1.29 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0
TranNgocHieu/nodejs-native-gpu
🚀 Harness GPU power in Node.js with this native addon for accelerated computations, opening new possibilities for JavaScript developers.
Language: JavaScript - Size: 1.3 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0
Thando11/microsoft-dgx7u
🛠 Explore an experimental sandbox inspired by Microsoft, featuring random code, ideas, and prototypes for innovative development.
Language: JavaScript - Size: 1.29 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0
JLnorthwestern/GO-MELT
GO-MELT: GPU-Optimized Multilevel Execution of LPBF Thermal simulations
Language: Python - Size: 5.72 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 26 - Forks: 5
NVIDIA/cccl
CUDA Core Compute Libraries
Language: C++ - Size: 240 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,994 - Forks: 284
DeonDavisV/Nodexa-Chain-Core
🌐 Reward hosting providers on the Clore.ai marketplace with Nodexa-Chain-Core, a blockchain solution designed for efficiency and stability.
Language: C - Size: 7.59 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0
tkemmer/CuNESSie.jl
CUDA-accelerated Nonlocal Electrostatics in Structured Solvents
Language: Julia - Size: 223 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0
AccelerateHS/accelerate-llvm
LLVM backend for Accelerate
Language: Haskell - Size: 3.95 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 167 - Forks: 59
ginkgo-project/ginkgo
Numerical linear algebra software package
Language: C++ - Size: 158 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 522 - Forks: 99
Parsifal1916/Aster
A robust, simple yet complete n-body integrator
Language: C++ - Size: 5.93 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0
Zydak/Vulkan-Path-Tracer
Vulkan Path Tracer. Physically based path tracer made in Vulkan with Ray Tracing Pipeline.
Language: C++ - Size: 462 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 341 - Forks: 11
AdaptiveCpp/AdaptiveCpp
Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!
Language: C++ - Size: 14.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,723 - Forks: 199
NVIDIA/MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
Language: C++ - Size: 21.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,357 - Forks: 108
shivakrsna/LightGridShader
📺 Create stunning LED display effects with this Unity Shader Graph sample project, designed for clear, vibrant visuals in your applications.
Language: HLSL - Size: 1.33 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0
RobbiSixx/prysm
Prysm is a blazing-smart Puppeteer-based web scraper that doesn't just extract - it understands structure. Capable of scraping virtually any website with intelligent content detection and 14 specialized scroll strategies that adapt to different page layouts, Prysm excels at extracting content that other scrapers miss.
Language: JavaScript - Size: 1.22 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 4 - Forks: 0
exospherehost/exospherehost
Infra for scalable and reliable AI agents
Language: Python - Size: 34.7 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 153 - Forks: 38
elcruzo/cuda-conv
Lightweight CUDA kernel for 2D image convolution achieving 20x+ speedup. Built with CuPy for the NVIDIA Hackathon.
Language: Python - Size: 99.6 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0
beehive-lab/TornadoVM
TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
Language: Java - Size: 160 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,342 - Forks: 123
Maison-de-la-Simulation/miniPIC
Playground for computer science and HPC experiments applied to the Particle-In-Cell method.
Language: C++ - Size: 1020 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 1
nabilshadman/cuda-4-dummies
Lecture slides and exercise files of the CUDA 4 Dummies course (2025)
Language: Cuda - Size: 11.4 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0
lachlan2k/phatcrack
Modern web-based distributed hashcracking solution, built on hashcat
Language: Go - Size: 11.2 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 146 - Forks: 12
tensorflow/lingvo
Lingvo
Language: Python - Size: 142 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,851 - Forks: 452
cherubrock-seb/PrMers
Mersenne prime search using integer arithmetic and an IDBWT via an NTT executed on the GPU through OpenCL.
Language: C++ - Size: 59.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 8 - Forks: 3
uncomplicate/neanderthal
Fast Clojure Matrix Library
Language: Clojure - Size: 3.96 MB - Last synced at: 6 days ago - Pushed at: 11 days ago - Stars: 1,111 - Forks: 58
kpet/clvk
Implementation of OpenCL 3.0 on Vulkan
Language: C++ - Size: 1.74 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 412 - Forks: 45
ROCm/Tensile
[DEPRECATED] Moved to ROCm/rocm-libraries repo
Language: Python - Size: 98.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 254 - Forks: 167
BindsNET/bindsnet
Simulation of spiking neural networks (SNNs) using PyTorch.
Language: Python - Size: 61.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,634 - Forks: 340
oracle-quickstart/oci-gpu-scanner
GPU & cluster health and performance monitoring solution for OCI
Language: HCL - Size: 6.13 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 8 - Forks: 2
Dpdl-io/DpdlEngine
Dpdl (Dynamic Packet Definition Language) is a rapid development programming language and constrained device framework with built-in database and agent technology. Dpdl also enables the embedding and execution of multiple programming languages (C, C++, Python, etc.) directly within Dpdl code.
Language: C - Size: 899 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 4 - Forks: 0
ProjectPhysX/FluidX3D
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
Language: C++ - Size: 21.3 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 4,728 - Forks: 425
arkodeepsen/qwen-image
Production-ready RunPod serverless endpoint and pod for Qwen-Image (20B) - Text-to-image generation with exceptional English and Chinese text rendering
Language: Shell - Size: 14.6 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2 - Forks: 0
mdkhussairiee/Asrix-Labs-FastAPI-Sadtalker
A GPU-accelerated FastAPI microservice that generates talking-head videos from static images and audio using SadTalker.
Language: Python - Size: 23 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0
LuxCoreRender/LuxCore
LuxCore source repository
Language: C++ - Size: 155 MB - Last synced at: 8 days ago - Pushed at: 16 days ago - Stars: 1,262 - Forks: 156
mratsim/Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
Language: Nim - Size: 3.8 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 1,380 - Forks: 95
IntelPython/dpctl
Python SYCL bindings and SYCL-based Python Array API library
Language: C++ - Size: 222 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 117 - Forks: 31
tumaer/JAXFLUIDS
Differentiable Fluid Dynamics Package
Language: Python - Size: 12.6 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 466 - Forks: 85
tinker495/Xtructure
Xtructure: data structures for use in JAX
Language: Python - Size: 2.2 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 18 - Forks: 0
DiamondLightSource/fast-feedback-service
GPU based service to provide fast-feedback results
Language: C++ - Size: 1010 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 3
mikeroyal/GPU-Guide
Graphics Processing Unit (GPU) Architecture Guide
Language: Shell - Size: 815 KB - Last synced at: 9 days ago - Pushed at: almost 4 years ago - Stars: 245 - Forks: 19
shaazib-tanvir/wave-simulation
A GPU based simulation of the Wave Equation
Language: C - Size: 52.7 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0
ProjectPhysX/OpenCL-Wrapper
OpenCL is the most powerful programming language ever created. Yet the OpenCL C++ bindings are cumbersome and the code overhead prevents many people from getting started. I created this lightweight OpenCL-Wrapper to greatly simplify OpenCL software development with C++ while keeping functionality and performance.
Language: C++ - Size: 396 KB - Last synced at: 8 days ago - Pushed at: 14 days ago - Stars: 440 - Forks: 43
AmesingFlank/taichi.js
Modern GPU Compute and Rendering in JavaScript
Language: TypeScript - Size: 220 MB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 515 - Forks: 20
Finoptimize/agentaflow-sro-community
Manage AI and Machine Learning workloads more efficiently with lower cost: GPU Orchestration / Scheduling / Routing / Serving / Optimization / Observability for AI/ML systems
Language: Go - Size: 1.18 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 3
nadult/lucid
LucidRaster: real-time GPU software rasterizer for exact OIT
Language: C++ - Size: 1.38 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 54 - Forks: 0
ROCm/hipBLASLt
[DEPRECATED] Moved to ROCm/rocm-libraries repo
Language: Assembly - Size: 1.31 GB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 114 - Forks: 146
LuxCoreRender/BlendLuxCore
Blender Integration for LuxCore
Language: Python - Size: 341 MB - Last synced at: 12 days ago - Pushed at: 16 days ago - Stars: 815 - Forks: 98
johnh2o2/cuvarbase
Python library for fast time-series analysis on CUDA GPUs
Language: Jupyter Notebook - Size: 50.2 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 30 - Forks: 6
seanwevans/WarpDB
An on-GPU database
Language: C++ - Size: 618 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0
inducer/pycuda
CUDA integration for Python, plus shiny features
Language: Python - Size: 2.87 MB - Last synced at: 9 days ago - Pushed at: 23 days ago - Stars: 1,998 - Forks: 295
SciML/SciMLBook
Parallel Computing and Scientific Machine Learning (SciML): Methods and Applications (MIT 18.337J/6.338J)
Language: HTML - Size: 128 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 1,943 - Forks: 357
ProjectPhysX/OpenCL-Benchmark
A small OpenCL benchmark program to measure peak GPU/CPU performance.
Language: C++ - Size: 286 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 255 - Forks: 31
Cre4T3Tiv3/jetson-orin-matmul-analysis
Scientific CUDA benchmarking framework: 4 implementations x 3 power modes x 5 matrix sizes on Jetson Orin Nano. 1,282 GFLOPS peak, 90% performance @ 88% power (25W mode), 99.5% accuracy validation, edge AI deployment guide.
Language: Python - Size: 9.36 MB - Last synced at: 14 days ago - Pushed at: 20 days ago - Stars: 6 - Forks: 0
brandondube/prysm
physical optics: integrated modeling, phase retrieval, segmented systems, polynomials and fitting, sequential raytracing...
Language: Python - Size: 12.2 MB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 315 - Forks: 53
hpi-epic/gpucsl
Constraint-based Causal Structure Learning on GPUs.
Language: Python - Size: 140 KB - Last synced at: 5 days ago - Pushed at: almost 3 years ago - Stars: 41 - Forks: 1
NVIDIA/thrust 📦
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
Language: C++ - Size: 17 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 4,985 - Forks: 765
KempnerInstitute/kempner-computing-handbook
Kempner Institute Computing Handbook
Language: JavaScript - Size: 67.6 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 21 - Forks: 10
Brijes987/ChronoTrade
Modern C++23 trading engine with lock-free data structures, CUDA acceleration, coroutines, and SIMD optimizations. Achieves sub-microsecond order matching with comprehensive benchmarking and CI/CD pipeline.
Language: C++ - Size: 30.3 KB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0
Tennisee-data/benchHUB
benchHUB is a Python-based project to parse, aggregate, and visualize system and performance benchmarks. It includes a Streamlit dashboard to display and compare results.
Language: Python - Size: 1.36 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0
uncomplicate/clojurecuda
Clojure library for CUDA development
Language: Clojure - Size: 563 KB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 191 - Forks: 10
mikbry/awesome-webgpu
😎 Curated list of awesome things around WebGPU ecosystem.
Size: 108 KB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 1,749 - Forks: 74
knagrecha/saturn
Saturn accelerates the training of large-scale deep learning models with a novel joint optimization approach.
Language: Python - Size: 107 KB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 24 - Forks: 5
UchihaIthachi/sssp-apsp-hpc-openmp-cuda
🚀 High-performance implementations and benchmarks of SSSP and APSP algorithms (Bellman–Ford, Dijkstra, Floyd–Warshall, Johnson) in Serial, OpenMP, CUDA, and Hybrid CPU+GPU. Includes profiling, speedup plots, and HPC notebooks
Language: Jupyter Notebook - Size: 494 KB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0
microsoft/pai 📦
Resource scheduling and cluster management for AI
Language: JavaScript - Size: 70.5 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 2,677 - Forks: 549
datum-cloud/awesome-alt-clouds
A list of specialized clouds that span traditional infra, AI, data, connectivity, and more.
Size: 319 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 31 - Forks: 7
AccelerateHS/accelerate
Embedded language for high-performance array computations
Language: Haskell - Size: 15.4 MB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 938 - Forks: 128
chrisnewell91/QuanticCompute
GPU Micro-Clouds
Language: Python - Size: 2.6 MB - Last synced at: 17 days ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0
MultiphaseFlowLab/MHIT36
Multi-GPU version of MHIT36 using cuDecomp
Language: Fortran - Size: 52.4 MB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 18 - Forks: 4
hel-astro-lab/runko
Modern C++/Python CPU/GPU plasma toolbox
Language: C++ - Size: 4.24 MB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 52 - Forks: 19
ComputationalRadiationPhysics/cuda_memtest
Fork of CUDA GPU memtest 👓
Language: C++ - Size: 275 KB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 134 - Forks: 32