An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: gpu-computing

aledinola/AiyagariSparse

Infinite horizon dynamic programming with Howard acceleration and sparse matrices, tested on Aiyagari model

Language: MATLAB - Size: 24.4 KB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 0 - Forks: 0

software-mansion/TypeGPU

A modular and open-ended toolkit for WebGPU, with advanced type inference and the ability to write shaders in TypeScript

Language: TypeScript - Size: 256 MB - Last synced at: about 6 hours ago - Pushed at: about 6 hours ago - Stars: 1,026 - Forks: 25

JuliaGPU/KernelAbstractions.jl

Heterogeneous programming in Julia

Language: Julia - Size: 4.73 MB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 466 - Forks: 80

Guillaume-Helbecque/GPU-accelerated-tree-search-Chapel

GPU-accelerated tree search: Investigating Chapel versus CUDA/HIP+X

Language: Chapel - Size: 592 KB - Last synced at: about 7 hours ago - Pushed at: about 9 hours ago - Stars: 2 - Forks: 1

FAST-Imaging/FAST

A framework for high-performance medical image processing, neural network inference and visualization

Language: C++ - Size: 20 MB - Last synced at: about 13 hours ago - Pushed at: about 13 hours ago - Stars: 488 - Forks: 108

dyGiLa/dyGiLa

Repository of dyGiLa project

Language: C++ - Size: 1.07 MB - Last synced at: about 13 hours ago - Pushed at: about 15 hours ago - Stars: 0 - Forks: 1

JuliaAstroSim/AstroNbodySim.jl

Unitful and differentiable gravitational N-body simulation code in Julia

Language: Julia - Size: 19.6 MB - Last synced at: about 15 hours ago - Pushed at: about 17 hours ago - Stars: 34 - Forks: 0

NonDairyNeutrino/Thesis

Thesis for the Computational Science Master's program at Central Washington University. 3D extension of an analog of cosmological particle creation in a Friedmann-Robertson-Walker universe by numerically simulating a Bose-Einstein condensate with a time-dependent scattering length.

Language: TeX - Size: 20.4 MB - Last synced at: about 18 hours ago - Pushed at: about 19 hours ago - Stars: 3 - Forks: 0

ley995/triangle-splatting2

🎨 Implement differentiable rendering techniques using opaque triangles for advanced graphics applications with the Triangle Splatting+ framework.

Language: Python - Size: 3.08 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

stotko/stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

Language: C++ - Size: 5.01 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,234 - Forks: 91

Avicted/hip-img-fx

Fast GPU-accelerated image filters (HIP) with a CPU fallback path for portability.

Language: C++ - Size: 15.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Lyynn777/CUDA-Bitonic-sort

Simple CUDA project to implement Bitonic Sort and compare it with normal CPU sorting.

Language: Jupyter Notebook - Size: 179 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

google/tf-quant-finance

High-performance TensorFlow library for quantitative finance.

Language: Python - Size: 16.9 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 5,030 - Forks: 647

tinker495/PuXle

PuXle: Planning using jaX-based learning environments

Language: Python - Size: 1.14 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4 - Forks: 1

ProteusMRIgHIFU/BabelViscoFDTD

Software library for FDTD of viscoelastic equation using a staggered grid arrangement with support for GPU and CPU backends

Language: Jupyter Notebook - Size: 117 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 56 - Forks: 12

chr0n1x/rpi-talos

My homelab on TalosOS, and a variety of other hardware.

Language: Shell - Size: 1.4 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6 - Forks: 0

catboost/catboost

A fast, scalable, high-performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Language: C++ - Size: 1.51 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 8,641 - Forks: 1,243

quantaosun/NAMD-FEP

Cloud-GPU-supported, NAMD3-based calculation of the binding free energy difference between two small molecules against the same protein target. This is probably one of the fastest FEP simulations on FREE GPU hardware that the general public can access.

Language: Jupyter Notebook - Size: 30.1 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 26 - Forks: 9

LennartPaduch/WebLBM

Fast and memory efficient Lattice Boltzmann CFD (D2Q9) for the browser running on the GPU via the WebGPU API https://weblbm.pages.dev/

Language: TypeScript - Size: 57.6 KB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

tinker495/JAxtar

JAxtar is a project with a JAX-native implementation of a parallelizable A* & Q* solver for neural heuristic search research.

Language: Python - Size: 13.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 42 - Forks: 4

sean-hungerford/seedVR2_cudafull

🖥️ Enhance your video and image quality with SeedVR2 for ComfyUI, supporting multi-GPU setups for efficient upscaling.

Language: Python - Size: 4.74 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

DESIGN4ADDITIVE/GPUCADforAM

Design Software for Additive Manufacturing

Language: C++ - Size: 1.48 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 10 - Forks: 2

maltsev-andrey/gpu-data-exploration

Data exploration tools for GPU computing benchmarks - Wikipedia & HDF5 sensor datasets

Language: Python - Size: 8.79 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

xsuite/xsuite

Suite of Python packages for multiparticle simulations of particle accelerators.

Language: Python - Size: 58.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 44 - Forks: 29

LLNL/CARE

CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as loop fusion capability and a portable interface for many numerical algorithms. It provides all the basics for anyone wanting to write portable code.

Language: C++ - Size: 1.51 MB - Last synced at: about 1 hour ago - Pushed at: about 3 hours ago - Stars: 31 - Forks: 5

KernelTuner/kernel_tuner

Kernel Tuner

Language: Python - Size: 41.5 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 370 - Forks: 59

8e8bdba457c18cf692a95fe2ec67000b/VulkanCooperativeMatrixAttention

Vulkan & GLSL implementation of FlashAttention-2

Size: 1.95 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

ComputationalRadiationPhysics/picongpu

Performance-Portable Particle-in-Cell Simulations for the Exascale Era ✨

Language: C++ - Size: 59.1 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 760 - Forks: 225

ZrobMiloudaa/jetson-orin-matmul-analysis

🔍 Analyze CUDA matrix multiplication performance and power consumption on NVIDIA Jetson Orin Nano across multiple implementations and settings.

Language: Python - Size: 9.36 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

javimanotas/fractals

2D and 3D fractal rendering with Unity

Language: ShaderLab - Size: 62.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

gyroflow/gyroflow

Video stabilization using gyroscope data

Language: Rust - Size: 83.8 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 7,898 - Forks: 363

Dipper180/nvidia-z33qn

🎨 Explore random, Nvidia-inspired projects in this unique collection, showcasing creativity and innovation for developers and tech enthusiasts alike.

Language: JavaScript - Size: 1.29 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

TranNgocHieu/nodejs-native-gpu

🚀 Harness GPU power in Node.js with this native addon for accelerated computations, opening new possibilities for JavaScript developers.

Language: JavaScript - Size: 1.3 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Thando11/microsoft-dgx7u

🛠 Explore an experimental sandbox inspired by Microsoft, featuring random code, ideas, and prototypes for innovative development.

Language: JavaScript - Size: 1.29 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

JLnorthwestern/GO-MELT

GO-MELT: GPU-Optimized Multilevel Execution of LPBF Thermal simulations

Language: Python - Size: 5.72 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 26 - Forks: 5

NVIDIA/cccl

CUDA Core Compute Libraries

Language: C++ - Size: 240 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,994 - Forks: 284

DeonDavisV/Nodexa-Chain-Core

🌐 Reward hosting providers on the Clore.ai marketplace with Nodexa-Chain-Core, a blockchain solution designed for efficiency and stability.

Language: C - Size: 7.59 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

tkemmer/CuNESSie.jl

CUDA-accelerated Nonlocal Electrostatics in Structured Solvents

Language: Julia - Size: 223 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

AccelerateHS/accelerate-llvm

LLVM backend for Accelerate

Language: Haskell - Size: 3.95 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 167 - Forks: 59

ginkgo-project/ginkgo

Numerical linear algebra software package

Language: C++ - Size: 158 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 522 - Forks: 99

Parsifal1916/Aster

A robust, simple yet complete n-body integrator

Language: C++ - Size: 5.93 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

Zydak/Vulkan-Path-Tracer

Vulkan Path Tracer. Physically based path tracer made in Vulkan with Ray Tracing Pipeline.

Language: C++ - Size: 462 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 341 - Forks: 11

AdaptiveCpp/AdaptiveCpp

Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!

Language: C++ - Size: 14.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,723 - Forks: 199

NVIDIA/MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

Language: C++ - Size: 21.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,357 - Forks: 108

shivakrsna/LightGridShader

📺 Create stunning LED display effects with this Unity Shader Graph sample project, designed for clear, vibrant visuals in your applications.

Language: HLSL - Size: 1.33 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

RobbiSixx/prysm

Prysm is a blazing-smart Puppeteer-based web scraper that doesn't just extract - it understands structure. Capable of scraping virtually any website with intelligent content detection and 14 specialized scroll strategies that adapt to different page layouts, Prysm excels at extracting content that other scrapers miss.

Language: JavaScript - Size: 1.22 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 4 - Forks: 0

exospherehost/exospherehost

Infra for scalable and reliable AI agents

Language: Python - Size: 34.7 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 153 - Forks: 38

elcruzo/cuda-conv

Lightweight CUDA kernel for 2D image convolution achieving 20x+ speedup. Built with CuPy for the NVIDIA Hackathon.

Language: Python - Size: 99.6 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

beehive-lab/TornadoVM

TornadoVM: A practical and efficient heterogeneous programming framework for managed languages

Language: Java - Size: 160 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,342 - Forks: 123

Maison-de-la-Simulation/miniPIC

Playground for computer science and HPC experiments applied to the Particle-In-Cell method.

Language: C++ - Size: 1020 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 1

nabilshadman/cuda-4-dummies

Lecture slides and exercise files of the CUDA 4 Dummies course (2025)

Language: Cuda - Size: 11.4 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

lachlan2k/phatcrack

Modern web-based distributed hashcracking solution, built on hashcat

Language: Go - Size: 11.2 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 146 - Forks: 12

tensorflow/lingvo

Lingvo

Language: Python - Size: 142 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,851 - Forks: 452

cherubrock-seb/PrMers

Mersenne prime search using integer arithmetic and an IDBWT via an NTT executed on the GPU through OpenCL.

Language: C++ - Size: 59.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 8 - Forks: 3

uncomplicate/neanderthal

Fast Clojure Matrix Library

Language: Clojure - Size: 3.96 MB - Last synced at: 6 days ago - Pushed at: 11 days ago - Stars: 1,111 - Forks: 58

kpet/clvk

Implementation of OpenCL 3.0 on Vulkan

Language: C++ - Size: 1.74 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 412 - Forks: 45

ROCm/Tensile

[DEPRECATED] Moved to ROCm/rocm-libraries repo

Language: Python - Size: 98.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 254 - Forks: 167

BindsNET/bindsnet

Simulation of spiking neural networks (SNNs) using PyTorch.

Language: Python - Size: 61.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,634 - Forks: 340

oracle-quickstart/oci-gpu-scanner

GPU & cluster health and performance monitoring solution for OCI

Language: HCL - Size: 6.13 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 8 - Forks: 2

Dpdl-io/DpdlEngine

Dpdl (Dynamic Packet Definition Language) is a rapid development programming language and constrained device framework with built-in database and agent technology. Dpdl also enables the embedding and execution of multiple programming languages (C, C++, Python, etc.) directly within Dpdl code.

Language: C - Size: 899 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 4 - Forks: 0

ProjectPhysX/FluidX3D

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.

Language: C++ - Size: 21.3 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 4,728 - Forks: 425

arkodeepsen/qwen-image

Production-ready RunPod serverless endpoint and pod for Qwen-Image (20B) - Text-to-image generation with exceptional English and Chinese text rendering

Language: Shell - Size: 14.6 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2 - Forks: 0

mdkhussairiee/Asrix-Labs-FastAPI-Sadtalker

A GPU-accelerated FastAPI microservice that generates talking-head videos from static images and audio using SadTalker.

Language: Python - Size: 23 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

LuxCoreRender/LuxCore

LuxCore source repository

Language: C++ - Size: 155 MB - Last synced at: 8 days ago - Pushed at: 16 days ago - Stars: 1,262 - Forks: 156

mratsim/Arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends

Language: Nim - Size: 3.8 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 1,380 - Forks: 95

IntelPython/dpctl

Python SYCL bindings and SYCL-based Python Array API library

Language: C++ - Size: 222 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 117 - Forks: 31

tumaer/JAXFLUIDS

Differentiable Fluid Dynamics Package

Language: Python - Size: 12.6 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 466 - Forks: 85

tinker495/Xtructure

Xtructure is a data structure for use in JAX

Language: Python - Size: 2.2 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 18 - Forks: 0

DiamondLightSource/fast-feedback-service

GPU based service to provide fast-feedback results

Language: C++ - Size: 1010 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 3

mikeroyal/GPU-Guide

Graphics Processing Unit (GPU) Architecture Guide

Language: Shell - Size: 815 KB - Last synced at: 9 days ago - Pushed at: almost 4 years ago - Stars: 245 - Forks: 19

shaazib-tanvir/wave-simulation

A GPU based simulation of the Wave Equation

Language: C - Size: 52.7 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

ProjectPhysX/OpenCL-Wrapper

OpenCL is the most powerful programming language ever created. Yet the OpenCL C++ bindings are cumbersome and the code overhead prevents many people from getting started. I created this lightweight OpenCL-Wrapper to greatly simplify OpenCL software development with C++ while keeping functionality and performance.

Language: C++ - Size: 396 KB - Last synced at: 8 days ago - Pushed at: 14 days ago - Stars: 440 - Forks: 43

AmesingFlank/taichi.js

Modern GPU Compute and Rendering in JavaScript

Language: TypeScript - Size: 220 MB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 515 - Forks: 20

Finoptimize/agentaflow-sro-community

Manage AI and Machine Learning workloads more efficiently with lower cost: GPU Orchestration / Scheduling / Routing / Serving / Optimization / Observability for AI/ML systems

Language: Go - Size: 1.18 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 3

nadult/lucid

LucidRaster: real-time GPU software rasterizer for exact OIT

Language: C++ - Size: 1.38 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 54 - Forks: 0

ROCm/hipBLASLt

[DEPRECATED] Moved to ROCm/rocm-libraries repo

Language: Assembly - Size: 1.31 GB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 114 - Forks: 146

LuxCoreRender/BlendLuxCore

Blender Integration for LuxCore

Language: Python - Size: 341 MB - Last synced at: 12 days ago - Pushed at: 16 days ago - Stars: 815 - Forks: 98

johnh2o2/cuvarbase

Python library for fast time-series analysis on CUDA GPUs

Language: Jupyter Notebook - Size: 50.2 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 30 - Forks: 6

seanwevans/WarpDB

An on-GPU database

Language: C++ - Size: 618 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

inducer/pycuda

CUDA integration for Python, plus shiny features

Language: Python - Size: 2.87 MB - Last synced at: 9 days ago - Pushed at: 23 days ago - Stars: 1,998 - Forks: 295

SciML/SciMLBook

Parallel Computing and Scientific Machine Learning (SciML): Methods and Applications (MIT 18.337J/6.338J)

Language: HTML - Size: 128 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 1,943 - Forks: 357

ProjectPhysX/OpenCL-Benchmark

A small OpenCL benchmark program to measure peak GPU/CPU performance.

Language: C++ - Size: 286 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 255 - Forks: 31

Cre4T3Tiv3/jetson-orin-matmul-analysis

Scientific CUDA benchmarking framework: 4 implementations x 3 power modes x 5 matrix sizes on Jetson Orin Nano. 1,282 GFLOPS peak, 90% performance @ 88% power (25W mode), 99.5% accuracy validation, edge AI deployment guide.

Language: Python - Size: 9.36 MB - Last synced at: 14 days ago - Pushed at: 20 days ago - Stars: 6 - Forks: 0

brandondube/prysm

physical optics: integrated modeling, phase retrieval, segmented systems, polynomials and fitting, sequential raytracing...

Language: Python - Size: 12.2 MB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 315 - Forks: 53

hpi-epic/gpucsl

Constraint-based Causal Structure Learning on GPUs.

Language: Python - Size: 140 KB - Last synced at: 5 days ago - Pushed at: almost 3 years ago - Stars: 41 - Forks: 1

NVIDIA/thrust 📦

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

Language: C++ - Size: 17 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 4,985 - Forks: 765

KempnerInstitute/kempner-computing-handbook

Kempner Institute Computing Handbook

Language: JavaScript - Size: 67.6 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 21 - Forks: 10

Brijes987/ChronoTrade

Modern C++23 trading engine with lock-free data structures, CUDA acceleration, coroutines, and SIMD optimizations. Achieves sub-microsecond order matching with comprehensive benchmarking and CI/CD pipeline.

Language: C++ - Size: 30.3 KB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

Tennisee-data/benchHUB

benchHUB is a Python-based project to parse, aggregate, and visualize system and performance benchmarks. It includes a Streamlit dashboard to display and compare results.

Language: Python - Size: 1.36 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

uncomplicate/clojurecuda

Clojure library for CUDA development

Language: Clojure - Size: 563 KB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 191 - Forks: 10

mikbry/awesome-webgpu

😎 Curated list of awesome things around WebGPU ecosystem.

Size: 108 KB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 1,749 - Forks: 74

knagrecha/saturn

Saturn accelerates the training of large-scale deep learning models with a novel joint optimization approach.

Language: Python - Size: 107 KB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 24 - Forks: 5

UchihaIthachi/sssp-apsp-hpc-openmp-cuda

🚀 High-performance implementations and benchmarks of SSSP and APSP algorithms (Bellman–Ford, Dijkstra, Floyd–Warshall, Johnson) in Serial, OpenMP, CUDA, and Hybrid CPU+GPU. Includes profiling, speedup plots, and HPC notebooks

Language: Jupyter Notebook - Size: 494 KB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

microsoft/pai 📦

Resource scheduling and cluster management for AI

Language: JavaScript - Size: 70.5 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 2,677 - Forks: 549

datum-cloud/awesome-alt-clouds

A list of specialized clouds that span traditional infra, AI, data, connectivity, and more.

Size: 319 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 31 - Forks: 7

AccelerateHS/accelerate

Embedded language for high-performance array computations

Language: Haskell - Size: 15.4 MB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 938 - Forks: 128

chrisnewell91/QuanticCompute

GPU Micro-Clouds

Language: Python - Size: 2.6 MB - Last synced at: 17 days ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0

MultiphaseFlowLab/MHIT36

Multi-GPU version of MHIT36 using cuDecomp

Language: Fortran - Size: 52.4 MB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 18 - Forks: 4

hel-astro-lab/runko

Modern C++/Python CPU/GPU plasma toolbox

Language: C++ - Size: 4.24 MB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 52 - Forks: 19

ComputationalRadiationPhysics/cuda_memtest

Fork of CUDA GPU memtest 👓

Language: C++ - Size: 275 KB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 134 - Forks: 32