An open API service providing repository metadata for many open source software ecosystems.

Topic: "high-performance-computing"

taskflow/taskflow

A General-purpose Task-parallel Programming System using Modern C++

Language: C++ - Size: 143 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 11,475 - Forks: 1,343

Netflix/metaflow

Build, Manage and Deploy AI/ML Systems

Language: Python - Size: 44.3 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 9,661 - Forks: 933

google/tf-quant-finance

High-performance TensorFlow library for quantitative finance.

Language: Python - Size: 16.9 MB - Last synced at: 12 days ago - Pushed at: 9 months ago - Stars: 5,095 - Forks: 654

ProjectPhysX/FluidX3D

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.

Language: C++ - Size: 21.4 MB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 4,787 - Forks: 433

parallel101/course

High-Performance Parallel Programming and Optimization - Course Materials

Language: C++ - Size: 430 MB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 3,998 - Forks: 551

alpa-projects/alpa 📦

Training and serving large-scale neural networks with auto parallelization.

Language: Python - Size: 7.11 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 3,171 - Forks: 356

merrymercy/awesome-tensor-compilers

A list of awesome compiler projects and papers for tensor computation and deep learning.

Size: 98.6 KB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 2,698 - Forks: 323

bshoshany/thread-pool

BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library

Language: C++ - Size: 343 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 2,648 - Forks: 287

flame/blis

BLAS-like Library Instantiation Software Framework

Language: C - Size: 52.3 MB - Last synced at: 17 days ago - Pushed at: about 1 month ago - Stars: 2,568 - Forks: 405

kokkos/kokkos

Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction

Language: C++ - Size: 38.3 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 2,404 - Forks: 474

BOINC/boinc

Open-source software for volunteer computing and grid computing.

Language: PHP - Size: 273 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,271 - Forks: 504

mfem/mfem

Lightweight, general, scalable C++ library for finite element methods

Language: C++ - Size: 243 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,065 - Forks: 579

chapel-lang/chapel

A Productive Parallel Programming Language

Language: Chapel - Size: 1010 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1,962 - Forks: 438

hermit-os/hermit-rs

Hermit for Rust.

Language: Rust - Size: 2.4 MB - Last synced at: 4 days ago - Pushed at: 9 days ago - Stars: 1,852 - Forks: 104

AdaptiveCpp/AdaptiveCpp

Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!

Language: C++ - Size: 14.4 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 1,758 - Forks: 206

Maratyszcza/NNPACK

Acceleration package for neural networks on multi-core CPUs

Language: C - Size: 1.06 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 1,702 - Forks: 319

DLTcollab/sse2neon

A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation

Language: C++ - Size: 3.04 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1,465 - Forks: 231

hermit-os/kernel

A Rust-based, lightweight unikernel.

Language: Rust - Size: 64.3 MB - Last synced at: 3 days ago - Pushed at: 7 days ago - Stars: 1,390 - Forks: 113

mratsim/Arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends

Language: Nim - Size: 3.81 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,380 - Forks: 95

ropensci/drake

An R-focused pipeline toolkit for reproducibility and high-performance computing

Language: R - Size: 92.4 MB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 1,341 - Forks: 129

trilinos/Trilinos

Primary repository for the Trilinos Project

Language: C++ - Size: 819 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 1,339 - Forks: 607

sail-sg/envpool

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

Language: C++ - Size: 3.53 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1,195 - Forks: 117

Liu-xiandong/How_to_optimize_in_GPU

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.

Language: Cuda - Size: 1.25 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1,158 - Forks: 170

uncomplicate/neanderthal

Fast Clojure Matrix Library

Language: Clojure - Size: 3.98 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,114 - Forks: 58

ropensci/targets

Function-oriented Make-like declarative workflows for R

Language: R - Size: 7.15 MB - Last synced at: 16 days ago - Pushed at: about 1 month ago - Stars: 1,040 - Forks: 78

openmc-dev/openmc

OpenMC Monte Carlo Code

Language: Python - Size: 72.9 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 929 - Forks: 594

mateogianolio/vectorious

Linear algebra in TypeScript.

Language: TypeScript - Size: 42.8 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 921 - Forks: 43

precice/precice

A coupling library and ecosystem for partitioned multi-physics and multi-scale simulations, including surface and volume coupling.

Language: C++ - Size: 40.3 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 862 - Forks: 208

Geant4/geant4

Geant4 toolkit for the simulation of the passage of particles through matter - NIM A 506 (2003) 250-303

Language: C++ - Size: 355 MB - Last synced at: 16 days ago - Pushed at: 19 days ago - Stars: 741 - Forks: 359

brucefan1983/GPUMD

Graphics Processing Units Molecular Dynamics

Language: Cuda - Size: 317 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 698 - Forks: 164

AMReX-Codes/amrex

AMReX: Software Framework for Block Structured AMR

Language: C++ - Size: 56.1 MB - Last synced at: about 3 hours ago - Pushed at: 1 day ago - Stars: 688 - Forks: 430

MarioSieg/magnetron

(WIP) A small but powerful, homemade PyTorch from scratch.

Language: C - Size: 28.2 MB - Last synced at: 2 days ago - Pushed at: 4 days ago - Stars: 662 - Forks: 33

zanellia/prometeo

An experimental Python-to-C transpiler and domain specific language for embedded high-performance computing

Language: Python - Size: 1.93 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 641 - Forks: 34

llnl/sundials

Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.

Language: C - Size: 245 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 617 - Forks: 159

austinksmith/Hamsters.js

100% Vanilla Javascript Multithreading & Parallel Execution Library

Language: JavaScript - Size: 42.9 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 596 - Forks: 31

spcl/dace

DaCe - Data Centric Parallel Programming

Language: Python - Size: 155 MB - Last synced at: about 16 hours ago - Pushed at: 1 day ago - Stars: 568 - Forks: 148

DeveloperPaul123/thread-pool

A modern, fast, lightweight thread pool library based on C++2x

Language: C++ - Size: 729 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 527 - Forks: 42

mpi4jax/mpi4jax

Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python :zap:

Language: Python - Size: 5.08 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 506 - Forks: 32

CurvineIO/curvine

High-performance distributed multi-tier cache system. Built in Rust.

Language: Rust - Size: 2.45 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 505 - Forks: 63

3dem/relion

Image-processing software for cryo-electron microscopy

Language: C++ - Size: 58.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 504 - Forks: 222

pypr/pysph

A framework for Smoothed Particle Hydrodynamics in Python

Language: Python - Size: 7.09 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 500 - Forks: 144

neuronsimulator/nrn

NEURON Simulator

Language: C++ - Size: 164 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 482 - Forks: 130

Xiangyu-Hu/SPHinXsys

SPHinXsys provides C++ APIs for engineering simulation and optimization. It aims at complex systems driven by fluid, structure, multi-body dynamics and beyond. The multi-physics library is based on a unique and unified computational framework by which strong coupling has been achieved for all involved physics.

Language: C++ - Size: 252 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 475 - Forks: 339

philipturner/metal-flash-attention

FlashAttention (Metal Port)

Language: Swift - Size: 9.26 MB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 459 - Forks: 23

cselab/aphros

Finite volume solver for incompressible multiphase flows with surface tension. Foaming flows in complex geometries.

Language: C++ - Size: 205 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 426 - Forks: 51

GraphIt-DSL/graphit

GraphIt - A High-Performance Domain Specific Language for Graph Analytics

Language: C++ - Size: 8.48 MB - Last synced at: 8 months ago - Pushed at: about 3 years ago - Stars: 377 - Forks: 46

QMCPACK/qmcpack

Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support

Language: C++ - Size: 397 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 369 - Forks: 151

uncomplicate/bayadera

High-performance Bayesian Data Analysis on the GPU in Clojure

Language: Clojure - Size: 1020 KB - Last synced at: 7 months ago - Pushed at: over 5 years ago - Stars: 365 - Forks: 23

SciML/Surrogates.jl

Surrogate modeling and optimization for scientific machine learning (SciML)

Language: Julia - Size: 327 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 357 - Forks: 76

mrshaw01/software-engineer

A curated learning repository focused on High-Performance Computing (HPC) — covering fundamentals to advanced topics in CUDA, MPI, C++, and Python-C++ interoperability.

Language: C++ - Size: 41 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 355 - Forks: 61

huggingface/datablations

Scaling Data-Constrained Language Models

Language: Jupyter Notebook - Size: 45.8 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 342 - Forks: 19

Glavnokoman/vuh

Vulkan compute for people

Language: C++ - Size: 705 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 340 - Forks: 34

nebius/soperator

Run Slurm in Kubernetes

Language: Go - Size: 40.8 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 335 - Forks: 49

DragonSpit/HPCsharp

High performance algorithms in C#: SIMD/SSE, multi-core and faster

Language: C# - Size: 1.37 MB - Last synced at: about 19 hours ago - Pushed at: 2 days ago - Stars: 331 - Forks: 34

dionhaefner/pyhpc-benchmarks

A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python :rocket:

Language: Python - Size: 1.19 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 330 - Forks: 27

feelpp/feelpp

:gem: Feel++: Finite Element Embedded Language and Library in C++

Language: C++ - Size: 349 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 325 - Forks: 68

ornladios/ADIOS2

Next generation of ADIOS developed in the Exascale Computing Program

Language: C++ - Size: 33.7 MB - Last synced at: 1 day ago - Pushed at: 9 days ago - Stars: 313 - Forks: 144

r-lib/mirai

Minimalist Async Evaluation Framework for R

Language: R - Size: 15.5 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 295 - Forks: 17

zero-one-group/geni

A Clojure dataframe library that runs on Spark

Language: Clojure - Size: 1.86 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 294 - Forks: 27

mratsim/laser

The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers

Language: Nim - Size: 3.65 MB - Last synced at: 22 days ago - Pushed at: almost 2 years ago - Stars: 290 - Forks: 14

SciML/NonlinearSolve.jl

High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.

Language: Julia - Size: 41.6 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 284 - Forks: 60

uncomplicate/clojurecl

ClojureCL is a Clojure library for parallel computations with OpenCL.

Language: Clojure - Size: 910 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 280 - Forks: 18

hongbo-miao/hongbomiao.com

A personal research and development (R&D) lab that facilitates the sharing of knowledge.

Language: Python - Size: 1.03 GB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 279 - Forks: 45

Trinkle23897/Fast-Poisson-Image-Editing

A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.

Language: Python - Size: 2.88 MB - Last synced at: 4 months ago - Pushed at: about 3 years ago - Stars: 277 - Forks: 16

geodynamics/aspect

A parallel, extensible finite element code to simulate convection in both 2D and 3D models.

Language: C++ - Size: 384 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 272 - Forks: 261

ProjectPhysX/OpenCL-Benchmark

A small OpenCL benchmark program to measure peak GPU/CPU performance.

Language: C++ - Size: 294 KB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 267 - Forks: 35

Shenggan/awesome-distributed-ml

A curated list of awesome projects and papers for distributed training or inference

Size: 44.9 KB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 260 - Forks: 30

sourceryinstitute/OpenCoarrays

A parallel application binary interface for Fortran 2018 compilers.

Language: Fortran - Size: 8.52 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 258 - Forks: 55

flexi-framework/flexi

FLEXI: A high order discontinuous Galerkin framework for hyperbolic–parabolic conservation laws

Language: Fortran - Size: 75.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 257 - Forks: 69

ECP-copa/Cabana

Performance-portable library for particle-based simulations

Language: C++ - Size: 283 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 255 - Forks: 60

CaNS-World/CaNS

A code for fast, massively-parallel direct numerical simulations (DNS) of canonical flows

Language: Fortran - Size: 1.15 MB - Last synced at: 17 days ago - Pushed at: 23 days ago - Stars: 253 - Forks: 84

intel/intel-qs

High-performance simulator of quantum circuits

Language: C++ - Size: 17.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 250 - Forks: 74

cb-geo/mpm

CB-Geo High-Performance Material Point Method

Language: C++ - Size: 7.47 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 248 - Forks: 83

DLR-AMR/t8code

Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.

Language: C++ - Size: 148 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 247 - Forks: 62

flame/libflame

High-performance object-based library for DLA computations

Language: Fortran - Size: 31.3 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 245 - Forks: 84

CEED/libCEED

CEED Library: Code for Efficient Extensible Discretizations

Language: C - Size: 21 MB - Last synced at: 13 days ago - Pushed at: 14 days ago - Stars: 238 - Forks: 66

hermit-os/libhermit 📦

HermitCore: A C-based, lightweight unikernel

Language: C - Size: 42.6 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 227 - Forks: 42

arborx/ArborX

Performance-portable geometric search library

Language: C++ - Size: 5 MB - Last synced at: 9 days ago - Pushed at: 11 days ago - Stars: 220 - Forks: 45

esa/torchquad

Numerical integration in arbitrary dimensions on the GPU using PyTorch / TF / JAX

Language: Python - Size: 10.9 MB - Last synced at: 26 days ago - Pushed at: 5 months ago - Stars: 211 - Forks: 44

SlinkyProject/slurm-operator

Run Slurm on Kubernetes. A Slinky project.

Language: Go - Size: 3.69 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 209 - Forks: 54

penn-graphics-research/claymore

Language: Cuda - Size: 30.7 MB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 209 - Forks: 31

springer13/hptt

High-Performance Tensor Transpose library

Language: C++ - Size: 818 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 200 - Forks: 49

AvtechScientific/ASL

Advanced Simulation Library - hardware accelerated multiphysics simulation platform.

Language: C++ - Size: 24.1 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 200 - Forks: 54

tikv/minstant

Performant time measuring in Rust

Language: Rust - Size: 226 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 195 - Forks: 20

SciML/MethodOfLines.jl

Automatic Finite Difference PDE solving with Julia SciML

Language: Julia - Size: 370 MB - Last synced at: 8 days ago - Pushed at: 10 days ago - Stars: 194 - Forks: 38

mlr-org/batchtools

Tools for computation on batch systems

Language: R - Size: 6.74 MB - Last synced at: 15 days ago - Pushed at: 28 days ago - Stars: 182 - Forks: 52

hao-lh/the-books-making-you-better

This repo is a curated library to help you achieve a deeper understanding of what drives success and continuous improvement. Dive in and discover content that can expand your thinking, sharpen your expertise, and fuel your drive to improve, whether you’re exploring new fields, honing in-demand skills, or simply looking for fresh perspectives.

Size: 254 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 181 - Forks: 22

LibRapid/librapid

A highly optimised C++ library for mathematical applications and neural networks.

Language: C++ - Size: 30.3 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 176 - Forks: 10

lanl/vpic

Vector Particle-In-Cell (VPIC) Project

Language: C++ - Size: 23.1 MB - Last synced at: 22 days ago - Pushed at: 6 months ago - Stars: 171 - Forks: 77

rabauke/mpl

A C++17 message passing library based on MPI

Language: C++ - Size: 33.9 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 171 - Forks: 30

pranabdas/espresso

Notes and tutorials on Density Functional Theory calculation using Quantum ESPRESSO.

Language: Jupyter Notebook - Size: 57.2 MB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 169 - Forks: 53

kahypar/mt-kahypar

Mt-KaHyPar (Multi-Threaded Karlsruhe Hypergraph Partitioner) is a shared-memory multilevel graph and hypergraph partitioner equipped with parallel implementations of techniques used in the best sequential partitioning algorithms. Mt-KaHyPar can partition extremely large hypergraphs very fast and with high quality.

Language: C++ - Size: 37.6 MB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 168 - Forks: 33

Keysight/Jlsca

Side-channel toolkit in Julia

Language: Julia - Size: 30.7 MB - Last synced at: 5 months ago - Pushed at: almost 4 years ago - Stars: 165 - Forks: 34

Yihao-Shi/GeoTaichi

A Taichi-powered high-performance numerical simulator for multiscale and multifield geophysical problems

Language: Python - Size: 91.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 161 - Forks: 23

claudebarthels/infinity

A lightweight C++ RDMA library for InfiniBand networks.

Language: C++ - Size: 37.1 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 155 - Forks: 40

mschubert/clustermq

R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each via SSH

Language: R - Size: 6.28 MB - Last synced at: 15 days ago - Pushed at: 8 months ago - Stars: 153 - Forks: 28

dash-project/dash

DASH, the C++ Template Library for Distributed Data Structures with Support for Hierarchical Locality for HPC and Data-Driven Science

Language: C++ - Size: 14.5 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 150 - Forks: 43

eBay/accelerator 📦

The Accelerator is a tool for fast and reproducible processing of large amounts of data.

Language: Python - Size: 2.18 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 149 - Forks: 28

dftfeDevelopers/dftfe

DFT-FE: Real-space DFT calculations using Finite Elements

Language: C++ - Size: 92.1 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 148 - Forks: 42

parthenon-hpc-lab/parthenon

Parthenon AMR infrastructure

Language: C++ - Size: 101 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 147 - Forks: 40

Related Topics
hpc: 154 - parallel-computing: 143 - cuda: 103 - mpi: 96 - machine-learning: 95 - python: 93 - cpp: 86 - openmp: 75 - scientific-computing: 69 - gpu: 68 - c: 53 - high-performance: 53 - gpu-computing: 51 - parallel-programming: 47 - deep-learning: 44 - data-science: 43 - r: 39 - distributed-computing: 39 - simulation: 39 - multithreading: 31 - gpu-acceleration: 30 - parallel: 28 - slurm: 28 - pipeline: 27 - c-plus-plus: 27 - computational-fluid-dynamics: 26 - fortran: 26 - linear-algebra: 26 - julia: 26 - reproducibility: 26 - optimization: 25 - rstats: 25 - bioinformatics: 24 - opencl: 22 - cloud-computing: 21 - distributed-systems: 20 - gpu-programming: 20 - numerical-methods: 20 - image-processing: 19 - openmpi: 19 - fluid-dynamics: 19 - gpgpu: 18 - targets: 17 - matrix-multiplication: 17 - statistics: 17 - simd: 17 - cfd: 16 - nvidia: 16 - artificial-intelligence: 16 - reproducible-research: 16 - parallel-processing: 16 - cuda-programming: 16 - rust: 15 - python3: 15 - workflow: 15 - docker: 14 - java: 14 - science: 14 - hpc-applications: 14 - pytorch: 14 - r-package: 13 - molecular-dynamics: 13 - benchmark: 13 - compiler: 13 - concurrency: 12 - kokkos: 12 - computational-physics: 12 - cplusplus: 11 - cpp17: 11 - tensorflow: 11 - graph-algorithms: 11 - supercomputing: 11 - kubernetes: 11 - performance: 11 - openmp-parallelization: 10 - open-source: 10 - big-data: 10 - fluid-simulation: 10 - cuda-kernels: 9 - llm: 9 - high-throughput-computing: 9 - numerical-integration: 9 - density-functional-theory: 9 - discontinuous-galerkin: 9 - make: 9 - linux: 9 - partial-differential-equations: 9 - physics: 9 - sparse-matrix: 9 - computer-vision: 9 - openacc: 8 - neural-network: 8 - bash: 8 - finite-element-methods: 8 - cluster-computing: 8 - ai: 8 - heterogeneous-parallel-programming: 8 - quantum-chemistry: 8 - containers: 8 - tensor: 8