Topic: "parallel-computing"
taskflow/taskflow
A General-purpose Task-parallel Programming System using Modern C++
Language: C++ - Size: 142 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 11,321 - Forks: 1,323
joblib/joblib
Computing with Python functions.
Language: Python - Size: 4.24 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4,249 - Forks: 441
OpenNMT/CTranslate2
Fast inference engine for Transformer models
Language: C++ - Size: 14.5 MB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 4,048 - Forks: 402
parallel101/course
High-Performance Parallel Programming and Optimization - Course Materials
Language: C++ - Size: 430 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 3,998 - Forks: 551
amilajack/reading
A list of computer-science readings I recommend
Size: 523 MB - Last synced at: 7 months ago - Pushed at: about 3 years ago - Stars: 3,310 - Forks: 728
jofpin/turbit
Build applications, scripts, and automations powered by high-performance multicore computing using Node.js
Language: JavaScript - Size: 179 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 2,823 - Forks: 491
jmcarpenter2/swifter
A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner
Language: Python - Size: 2.15 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 2,611 - Forks: 104
kokkos/kokkos
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
Language: C++ - Size: 37.7 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 2,350 - Forks: 472
geatpy-dev/geatpy
Evolutionary algorithm toolbox and framework with high performance for Python
Language: Python - Size: 575 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 2,085 - Forks: 731
mfem/mfem
Lightweight, general, scalable C++ library for finite element methods
Language: C++ - Size: 261 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,015 - Forks: 570
NVIDIA/cccl
CUDA Core Compute Libraries
Language: C++ - Size: 240 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1,994 - Forks: 284
chapel-lang/chapel
a Productive Parallel Programming Language
Language: Chapel - Size: 1010 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1,948 - Forks: 433
zwang4/awesome-machine-learning-in-compilers
Must-read research papers and links to tools and datasets related to using machine learning for compilers and systems optimisation
Size: 394 KB - Last synced at: 27 days ago - Pushed at: about 2 months ago - Stars: 1,606 - Forks: 171
dealii/dealii
The development repository for the deal.II finite element library
Language: C++ - Size: 365 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,577 - Forks: 810
VcDevel/Vc
SIMD Vector Classes for C++
Language: C++ - Size: 11.2 MB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 1,511 - Forks: 153
pyper-dev/pyper
Concurrent Python made simple
Language: Python - Size: 462 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 1,503 - Forks: 30
JuliaSymbolics/Symbolics.jl
Symbolic programming for the next generation of numerical software
Language: Julia - Size: 38.2 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 1,469 - Forks: 173
ElmerCSC/elmerfem
Official git repository of Elmer FEM software
Language: Fortran - Size: 125 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1,414 - Forks: 353
mratsim/Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Language: Nim - Size: 3.8 MB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 1,380 - Forks: 95
beehive-lab/TornadoVM
TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
Language: Java - Size: 160 MB - Last synced at: about 12 hours ago - Pushed at: about 12 hours ago - Stars: 1,342 - Forks: 123
python-adaptive/adaptive
📈 Adaptive: parallel active learning of mathematical functions
Language: Python - Size: 5.85 MB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 1,203 - Forks: 62
mmstick/parallel 📦
This project now lives on in a rewrite at https://gitlab.redox-os.org/redox-os/parallel
Language: Rust - Size: 411 KB - Last synced at: 20 days ago - Pushed at: almost 8 years ago - Stars: 1,200 - Forks: 31
KratosMultiphysics/Kratos
Kratos Multiphysics (a.k.a. Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos has a BSD license and is written in C++ with an extensive Python interface.
Language: C++ - Size: 2.01 GB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,186 - Forks: 273
inducer/pyopencl
OpenCL integration for Python, plus shiny features
Language: Python - Size: 5.76 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,116 - Forks: 249
gunrock/gunrock
Programmable CUDA/C++ GPU Graph Analytics
Language: C++ - Size: 74.6 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 1,039 - Forks: 212
OSGeo/grass
GRASS - free and open-source geospatial processing engine
Language: C - Size: 442 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,028 - Forks: 370
AppiumTestDistribution/AppiumTestDistribution
A tool for running Android and iOS Appium tests in parallel across devices. If you like it, star it!
Language: Java - Size: 110 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 1,026 - Forks: 365
futureverse/future
🚀 R package: future: Unified Parallel and Distributed Processing in R for Everyone
Language: R - Size: 14.4 MB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 995 - Forks: 91
FEniCS/dolfinx
Next generation FEniCS problem solving environment
Language: C++ - Size: 65.7 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 988 - Forks: 223
AccelerateHS/accelerate
Embedded language for high-performance array computations
Language: Haskell - Size: 15.4 MB - Last synced at: 14 days ago - Pushed at: 16 days ago - Stars: 938 - Forks: 128
esa/pagmo2
A C++ platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.
Language: C++ - Size: 58.4 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 892 - Forks: 171
mpi4py/mpi4py
Python bindings for MPI
Language: Python - Size: 9.59 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 882 - Forks: 131
ConorWilliams/libfork
A bleeding-edge, lock-free, wait-free, continuation-stealing tasking library built on C++20's coroutines
Language: C++ - Size: 6.56 MB - Last synced at: 7 days ago - Pushed at: 8 months ago - Stars: 750 - Forks: 32
taskflow/awesome-parallel-computing
A curated list of awesome parallel computing resources
Size: 3.41 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 740 - Forks: 69
uxlfoundation/oneMath
oneAPI Math Library (oneMath)
Language: C++ - Size: 11.8 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 717 - Forks: 172
IntelPython/sdc 📦
Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler
Language: Python - Size: 15.8 MB - Last synced at: 6 months ago - Pushed at: almost 2 years ago - Stars: 644 - Forks: 64
OpenTimer/OpenTimer
A High-performance Timing Analysis Tool for VLSI Systems
Language: Verilog - Size: 329 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 633 - Forks: 159
LLNL/sundials
Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
Language: C - Size: 249 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 607 - Forks: 157
nwchemgit/nwchem
NWChem: Open Source High-Performance Computational Chemistry
Language: Fortran - Size: 342 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 571 - Forks: 180
LLNL/RAJA
RAJA Performance Portability Layer (C++)
Language: C++ - Size: 39.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 540 - Forks: 108
SimonBlanke/Hyperactive
An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.
Language: Python - Size: 31.2 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 528 - Forks: 51
alesgenova/post-me
📩 Use web Workers and other Windows through a simple Promise API
Language: TypeScript - Size: 801 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 520 - Forks: 13
esa/pygmo2
A Python platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.
Language: C++ - Size: 13.9 MB - Last synced at: 11 days ago - Pushed at: about 1 year ago - Stars: 505 - Forks: 64
NGSolve/ngsolve
Netgen/NGSolve is a high-performance multiphysics finite element software. It is widely used to analyze models from solid mechanics, fluid dynamics and electromagnetics. Thanks to its flexible Python interface, new physical equations and solution algorithms can be implemented easily.
Language: C++ - Size: 60 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 504 - Forks: 87
mpi4jax/mpi4jax
Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python ⚡
Language: Python - Size: 5.08 MB - Last synced at: about 22 hours ago - Pushed at: 16 days ago - Stars: 498 - Forks: 32
01alchemist/TurboScript 📦
Super charged typed JavaScript dialect for parallel programming which compiles to WebAssembly
Language: JavaScript - Size: 13.2 MB - Last synced at: 3 months ago - Pushed at: about 8 years ago - Stars: 496 - Forks: 35
FAST-Imaging/FAST
A framework for high-performance medical image processing, neural network inference and visualization
Language: C++ - Size: 20 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 487 - Forks: 107
constellation-rs/amadeus
Harmonious distributed data analysis in Rust.
Language: Rust - Size: 2.46 MB - Last synced at: 3 days ago - Pushed at: over 4 years ago - Stars: 482 - Forks: 25
mindspore-courses/step_into_llm
MindSpore online courses: Step into LLM
Language: Jupyter Notebook - Size: 246 MB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 478 - Forks: 122
luispedro/jug
Parallel programming with Python
Language: Python - Size: 2.34 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 459 - Forks: 62
pipefunc/pipefunc
Lightweight fast function pipeline (DAG) creation in pure Python for scientific workflows 🕸️🧪
Language: Python - Size: 2.96 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 425 - Forks: 17
ARM-software/mango
Parallel Hyperparameter Tuning in Python
Language: Jupyter Notebook - Size: 54.6 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 419 - Forks: 48
lehins/massiv
Efficient Haskell Arrays featuring Parallel computation
Language: Haskell - Size: 6.65 MB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 401 - Forks: 26
ChunelFeng/CThreadPool
A simple, easy-to-use, cross-platform C++ thread pool with excellent performance. Stars and forks are welcome.
Language: C++ - Size: 184 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 399 - Forks: 74
SmileiPIC/Smilei
Particle-in-cell code for plasma simulation
Language: C++ - Size: 118 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 393 - Forks: 133
BitairLabs/concurrent.js
Non-blocking Concurrent Computation for JavaScript RTEs (Web Browsers, Node.js, Deno & Bun)
Language: TypeScript - Size: 283 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 388 - Forks: 6
chengzeyi/ParaAttention
https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching
Language: Python - Size: 13.4 MB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 385 - Forks: 38
GraphIt-DSL/graphit
GraphIt - A High-Performance Domain Specific Language for Graph Analytics
Language: C++ - Size: 8.48 MB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 377 - Forks: 46
cmuparlay/parlaylib
A Toolkit for Programming Parallel Algorithms on Shared-Memory Multicore Machines
Language: C++ - Size: 1.17 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 372 - Forks: 74
kysucix/gipuma
Massively Parallel Multiview Stereopsis by Surface Normal Diffusion
Language: C++ - Size: 144 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 349 - Forks: 104
tirthajyoti/Spark-with-Python
Fundamentals of Spark with Python (using PySpark), code examples
Language: Jupyter Notebook - Size: 8.97 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 347 - Forks: 271
dionhaefner/pyhpc-benchmarks
A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python 🚀
Language: Python - Size: 1.19 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 330 - Forks: 27
feelpp/feelpp
💎 Feel++: Finite Element Embedded Language and Library in C++
Language: C++ - Size: 349 MB - Last synced at: 9 days ago - Pushed at: 12 days ago - Stars: 325 - Forks: 68
mtmucha/coros
An easy-to-use and fast library for task-based parallelism, utilizing coroutines.
Language: C++ - Size: 724 KB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 322 - Forks: 6
bodo-ai/Bodo
High-Performance Python Compute Engine for Data and AI
Language: Python - Size: 735 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 314 - Forks: 13
pothosware/PothosCore
The Pothos data-flow framework
Language: C++ - Size: 63 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 314 - Forks: 50
optimagic-dev/optimagic
optimagic is a Python package for numerical optimization. It is a unified interface to optimizers from SciPy, NLopt and other packages. optimagic's minimize function works just like SciPy's, so you don't have to adjust your code. You simply get more optimizers for free. On top of that, you get diagnostic tools, parallel numerical derivatives and more.
Language: Python - Size: 27.3 MB - Last synced at: 10 days ago - Pushed at: 12 days ago - Stars: 307 - Forks: 47
taskflow/work-stealing-queue
A fast work-stealing queue template in C++
Language: C++ - Size: 1010 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 306 - Forks: 39
XiaoSong9905/CUDA-Optimization-Guide
Xiao's CUDA Optimization Guide [Active Adding New Contents]
Size: 36.4 MB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 295 - Forks: 20
IntelLabs/ParallelAccelerator.jl 📦
The ParallelAccelerator package, part of the High Performance Scripting project at Intel Labs
Language: Julia - Size: 45.2 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 294 - Forks: 32
zero-one-group/geni
A Clojure dataframe library that runs on Spark
Language: Clojure - Size: 1.86 MB - Last synced at: 30 days ago - Pushed at: almost 2 years ago - Stars: 292 - Forks: 27
niedakh/pqdm
Comfortable parallel TQDM using concurrent.futures
Language: Python - Size: 86.9 KB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 289 - Forks: 9
r-lib/mirai
Minimalist Async Evaluation Framework for R
Language: R - Size: 14.6 MB - Last synced at: about 16 hours ago - Pushed at: about 16 hours ago - Stars: 288 - Forks: 16
BY571/Soft-Actor-Critic-and-Extensions
PyTorch implementation of Soft-Actor-Critic and Prioritized Experience Replay (PER) + Emphasizing Recent Experience (ERE) + Munchausen RL + D2RL and parallel Environments.
Language: Python - Size: 5.99 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 288 - Forks: 33
Trinkle23897/Fast-Poisson-Image-Editing
A fast Poisson image editing implementation that can use a multi-core CPU or GPU to handle high-resolution image input.
Language: Python - Size: 2.88 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 277 - Forks: 16
ashvardanian/ForkUnion
Lower-latency OpenMP-style minimalistic scoped thread-pool designed for 'Fork-Join' parallelism in Rust and C++, avoiding memory allocations, mutexes, CAS-primitives, and false-sharing on the hot path 🍴
Language: C++ - Size: 612 KB - Last synced at: 11 days ago - Pushed at: 13 days ago - Stars: 276 - Forks: 24
owensgroup/RXMesh
GPU-accelerated triangle mesh processing
Language: Cuda - Size: 11.5 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 274 - Forks: 38
pgiri/dispy
Distributed and Parallel Computing Framework with / for Python
Language: Python - Size: 3.76 MB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 267 - Forks: 54
mfem/PyMFEM
Python wrapper for MFEM
Language: SWIG - Size: 26 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 262 - Forks: 63
sourceryinstitute/OpenCoarrays
A parallel application binary interface for Fortran 2018 compilers.
Language: Fortran - Size: 8.52 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 258 - Forks: 55
DLR-AMR/t8code
Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.
Language: C++ - Size: 147 MB - Last synced at: 6 days ago - Pushed at: 8 days ago - Stars: 238 - Forks: 61
LLNL/SAMRAI
Structured Adaptive Mesh Refinement Application Infrastructure - a scalable C++ framework for block-structured AMR application development
Language: C++ - Size: 71 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 233 - Forks: 85
agenium-scale/boost.simd
Boost SIMD
Size: 192 KB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 232 - Forks: 47
vincentjzy/OpenCorr
Digital Image Correlation & Digital Volume Correlation Library
Language: C++ - Size: 352 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 231 - Forks: 57
charmplusplus/charm
The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
Language: C++ - Size: 200 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 224 - Forks: 55
LLNL/libROM
Data-driven model reduction library with an emphasis on large scale parallelism and linear subspace methods
Language: C++ - Size: 54.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 222 - Forks: 40
bh107/bohrium
Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
Language: C++ - Size: 32.4 MB - Last synced at: 11 days ago - Pushed at: almost 5 years ago - Stars: 222 - Forks: 31
Alpine-DAV/ascent
A flyweight in situ visualization and analysis runtime for multi-physics HPC simulations
Language: C++ - Size: 177 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 219 - Forks: 68
futureverse/future.apply
🚀 R package: future.apply - Apply Function to Elements in Parallel using Futures
Language: R - Size: 2.2 MB - Last synced at: 15 days ago - Pushed at: 21 days ago - Stars: 218 - Forks: 19
BWbwchen/MapReduce
An easy-to-use MapReduce parallel-computing framework in Go, inspired by the 2021 6.824 lab1. It currently supports multiple worker threads and multiple processes on a single machine.
Language: Go - Size: 2.6 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 214 - Forks: 13
bueler/p4pdes
C and Python examples from my book on using PETSc and Firedrake to solve PDEs
Language: C - Size: 4.49 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 212 - Forks: 79
privefl/bigsnpr
R package for the analysis of massive SNP arrays.
Language: R - Size: 109 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 210 - Forks: 45
grailbio/bigmachine
Bigmachine is a library for self-managing serverless computing in Go
Language: Go - Size: 635 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 201 - Forks: 20
krABMaga/krABMaga
krABMaga: A modern developing art for reliable and efficient Agent-based Model (ABM) simulation with the Rust language
Language: Rust - Size: 63.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 196 - Forks: 13
xupsh/pp4fpgas-cn-hls
HLS Project of pp4fpgas - https://github.com/xupsh/pp4fpgas-cn
Language: Jupyter Notebook - Size: 58.4 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 194 - Forks: 73
JohannesBuchner/UltraNest
Fit and compare complex models reliably and rapidly. Advanced nested sampling.
Language: Python - Size: 167 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 192 - Forks: 31
SCOREC/core
parallel finite element unstructured meshes
Language: C++ - Size: 11.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 189 - Forks: 65
merzlab/QUICK
QUICK: A GPU-enabled ab initio quantum chemistry software package
Language: C - Size: 74.6 MB - Last synced at: about 9 hours ago - Pushed at: about 11 hours ago - Stars: 188 - Forks: 49
SciML/MethodOfLines.jl
Automatic Finite Difference PDE solving with Julia SciML
Language: Julia - Size: 370 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 188 - Forks: 39
siemens/embb
Embedded Multicore Building Blocks (EMB²): Library for parallel programming of embedded systems. Star us on GitHub? +1
Language: C++ - Size: 18.9 MB - Last synced at: 12 days ago - Pushed at: almost 2 years ago - Stars: 188 - Forks: 41