An open API service providing repository metadata for many open source software ecosystems.

Topic: "parallel-computing"

taskflow/taskflow

A General-purpose Task-parallel Programming System using Modern C++

Language: C++ - Size: 142 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 11,321 - Forks: 1,323

joblib/joblib

Computing with Python functions.

Language: Python - Size: 4.24 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4,249 - Forks: 441

OpenNMT/CTranslate2

Fast inference engine for Transformer models

Language: C++ - Size: 14.5 MB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 4,048 - Forks: 402

parallel101/course

High-Performance Parallel Programming and Optimization - Course Materials

Language: C++ - Size: 430 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 3,998 - Forks: 551

amilajack/reading

A list of computer-science readings I recommend

Size: 523 MB - Last synced at: 7 months ago - Pushed at: about 3 years ago - Stars: 3,310 - Forks: 728

jofpin/turbit

Build applications, scripts, and automations powered by high-performance multicore computing using Node.js

Language: JavaScript - Size: 179 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 2,823 - Forks: 491

jmcarpenter2/swifter

A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

Language: Python - Size: 2.15 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 2,611 - Forks: 104

kokkos/kokkos

Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction

Language: C++ - Size: 37.7 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 2,350 - Forks: 472

geatpy-dev/geatpy

Evolutionary algorithm toolbox and framework with high performance for Python

Language: Python - Size: 575 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 2,085 - Forks: 731

mfem/mfem

Lightweight, general, scalable C++ library for finite element methods

Language: C++ - Size: 261 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,015 - Forks: 570

NVIDIA/cccl

CUDA Core Compute Libraries

Language: C++ - Size: 240 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1,994 - Forks: 284

chapel-lang/chapel

a Productive Parallel Programming Language

Language: Chapel - Size: 1010 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1,948 - Forks: 433

zwang4/awesome-machine-learning-in-compilers

Must read research papers and links to tools and datasets that are related to using machine learning for compilers and systems optimisation

Size: 394 KB - Last synced at: 27 days ago - Pushed at: about 2 months ago - Stars: 1,606 - Forks: 171

dealii/dealii

The development repository for the deal.II finite element library

Language: C++ - Size: 365 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,577 - Forks: 810

VcDevel/Vc

SIMD Vector Classes for C++

Language: C++ - Size: 11.2 MB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 1,511 - Forks: 153

pyper-dev/pyper

Concurrent Python made simple

Language: Python - Size: 462 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 1,503 - Forks: 30

JuliaSymbolics/Symbolics.jl

Symbolic programming for the next generation of numerical software

Language: Julia - Size: 38.2 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 1,469 - Forks: 173

ElmerCSC/elmerfem

Official git repository of Elmer FEM software

Language: Fortran - Size: 125 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1,414 - Forks: 353

mratsim/Arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends

Language: Nim - Size: 3.8 MB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 1,380 - Forks: 95

beehive-lab/TornadoVM

TornadoVM: A practical and efficient heterogeneous programming framework for managed languages

Language: Java - Size: 160 MB - Last synced at: about 12 hours ago - Pushed at: about 12 hours ago - Stars: 1,342 - Forks: 123

python-adaptive/adaptive

📈 Adaptive: parallel active learning of mathematical functions

Language: Python - Size: 5.85 MB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 1,203 - Forks: 62

mmstick/parallel 📦

This project now lives on in a rewrite at https://gitlab.redox-os.org/redox-os/parallel

Language: Rust - Size: 411 KB - Last synced at: 20 days ago - Pushed at: almost 8 years ago - Stars: 1,200 - Forks: 31

KratosMultiphysics/Kratos

Kratos Multiphysics (a.k.a. Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos is BSD-licensed and is written in C++ with an extensive Python interface.

Language: C++ - Size: 2.01 GB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,186 - Forks: 273

inducer/pyopencl

OpenCL integration for Python, plus shiny features

Language: Python - Size: 5.76 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,116 - Forks: 249

gunrock/gunrock

Programmable CUDA/C++ GPU Graph Analytics

Language: C++ - Size: 74.6 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 1,039 - Forks: 212

OSGeo/grass

GRASS - free and open-source geospatial processing engine

Language: C - Size: 442 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,028 - Forks: 370

AppiumTestDistribution/AppiumTestDistribution

A tool for running Android and iOS Appium tests in parallel across devices. If you like it, star it!

Language: Java - Size: 110 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 1,026 - Forks: 365

futureverse/future

🚀 R package: future: Unified Parallel and Distributed Processing in R for Everyone

Language: R - Size: 14.4 MB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 995 - Forks: 91

FEniCS/dolfinx

Next generation FEniCS problem solving environment

Language: C++ - Size: 65.7 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 988 - Forks: 223

AccelerateHS/accelerate

Embedded language for high-performance array computations

Language: Haskell - Size: 15.4 MB - Last synced at: 14 days ago - Pushed at: 16 days ago - Stars: 938 - Forks: 128

esa/pagmo2

A C++ platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.

Language: C++ - Size: 58.4 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 892 - Forks: 171

mpi4py/mpi4py

Python bindings for MPI

Language: Python - Size: 9.59 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 882 - Forks: 131

ConorWilliams/libfork

A bleeding-edge, lock-free, wait-free, continuation-stealing tasking library built on C++20's coroutines

Language: C++ - Size: 6.56 MB - Last synced at: 7 days ago - Pushed at: 8 months ago - Stars: 750 - Forks: 32

taskflow/awesome-parallel-computing

A curated list of awesome parallel computing resources

Size: 3.41 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 740 - Forks: 69

uxlfoundation/oneMath

oneAPI Math Library (oneMath)

Language: C++ - Size: 11.8 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 717 - Forks: 172

IntelPython/sdc 📦

Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler

Language: Python - Size: 15.8 MB - Last synced at: 6 months ago - Pushed at: almost 2 years ago - Stars: 644 - Forks: 64

OpenTimer/OpenTimer

A High-performance Timing Analysis Tool for VLSI Systems

Language: Verilog - Size: 329 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 633 - Forks: 159

LLNL/sundials

Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.

Language: C - Size: 249 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 607 - Forks: 157

nwchemgit/nwchem

NWChem: Open Source High-Performance Computational Chemistry

Language: Fortran - Size: 342 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 571 - Forks: 180

LLNL/RAJA

RAJA Performance Portability Layer (C++)

Language: C++ - Size: 39.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 540 - Forks: 108

SimonBlanke/Hyperactive

An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

Language: Python - Size: 31.2 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 528 - Forks: 51

alesgenova/post-me

📩 Use web Workers and other Windows through a simple Promise API

Language: TypeScript - Size: 801 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 520 - Forks: 13

esa/pygmo2

A Python platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.

Language: C++ - Size: 13.9 MB - Last synced at: 11 days ago - Pushed at: about 1 year ago - Stars: 505 - Forks: 64

NGSolve/ngsolve

Netgen/NGSolve is a high-performance multiphysics finite element software. It is widely used to analyze models from solid mechanics, fluid dynamics and electromagnetics. Thanks to its flexible Python interface, new physical equations and solution algorithms can be implemented easily.

Language: C++ - Size: 60 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 504 - Forks: 87

mpi4jax/mpi4jax

Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python ⚡

Language: Python - Size: 5.08 MB - Last synced at: about 22 hours ago - Pushed at: 16 days ago - Stars: 498 - Forks: 32

01alchemist/TurboScript 📦

Super charged typed JavaScript dialect for parallel programming which compiles to WebAssembly

Language: JavaScript - Size: 13.2 MB - Last synced at: 3 months ago - Pushed at: about 8 years ago - Stars: 496 - Forks: 35

FAST-Imaging/FAST

A framework for high-performance medical image processing, neural network inference and visualization

Language: C++ - Size: 20 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 487 - Forks: 107

constellation-rs/amadeus

Harmonious distributed data analysis in Rust.

Language: Rust - Size: 2.46 MB - Last synced at: 3 days ago - Pushed at: over 4 years ago - Stars: 482 - Forks: 25

mindspore-courses/step_into_llm

MindSpore online courses: Step into LLM

Language: Jupyter Notebook - Size: 246 MB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 478 - Forks: 122

luispedro/jug

Parallel programming with Python

Language: Python - Size: 2.34 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 459 - Forks: 62

pipefunc/pipefunc

Lightweight fast function pipeline (DAG) creation in pure Python for scientific workflows 🕸️🧪

Language: Python - Size: 2.96 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 425 - Forks: 17

ARM-software/mango

Parallel Hyperparameter Tuning in Python

Language: Jupyter Notebook - Size: 54.6 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 419 - Forks: 48

lehins/massiv

Efficient Haskell Arrays featuring Parallel computation

Language: Haskell - Size: 6.65 MB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 401 - Forks: 26

ChunelFeng/CThreadPool

A simple, easy-to-use, high-performance, cross-platform C++ thread pool. Stars and forks welcome.

Language: C++ - Size: 184 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 399 - Forks: 74

SmileiPIC/Smilei

Particle-in-cell code for plasma simulation

Language: C++ - Size: 118 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 393 - Forks: 133

BitairLabs/concurrent.js

Non-blocking Concurrent Computation for JavaScript RTEs (Web Browsers, Node.js, Deno & Bun)

Language: TypeScript - Size: 283 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 388 - Forks: 6

chengzeyi/ParaAttention

https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching

Language: Python - Size: 13.4 MB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 385 - Forks: 38

GraphIt-DSL/graphit

GraphIt - A High-Performance Domain Specific Language for Graph Analytics

Language: C++ - Size: 8.48 MB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 377 - Forks: 46

cmuparlay/parlaylib

A Toolkit for Programming Parallel Algorithms on Shared-Memory Multicore Machines

Language: C++ - Size: 1.17 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 372 - Forks: 74

kysucix/gipuma

Massively Parallel Multiview Stereopsis by Surface Normal Diffusion

Language: C++ - Size: 144 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 349 - Forks: 104

tirthajyoti/Spark-with-Python

Fundamentals of Spark with Python (using PySpark), code examples

Language: Jupyter Notebook - Size: 8.97 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 347 - Forks: 271

dionhaefner/pyhpc-benchmarks

A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python 🚀

Language: Python - Size: 1.19 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 330 - Forks: 27

feelpp/feelpp

💎 Feel++: Finite Element Embedded Language and Library in C++

Language: C++ - Size: 349 MB - Last synced at: 9 days ago - Pushed at: 12 days ago - Stars: 325 - Forks: 68

mtmucha/coros

An easy-to-use and fast library for task-based parallelism, utilizing coroutines.

Language: C++ - Size: 724 KB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 322 - Forks: 6

bodo-ai/Bodo

High-Performance Python Compute Engine for Data and AI

Language: Python - Size: 735 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 314 - Forks: 13

pothosware/PothosCore

The Pothos data-flow framework

Language: C++ - Size: 63 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 314 - Forks: 50

optimagic-dev/optimagic

optimagic is a Python package for numerical optimization. It is a unified interface to optimizers from SciPy, NLopt and other packages. optimagic's minimize function works just like SciPy's, so you don't have to adjust your code. You simply get more optimizers for free. On top of that, you get diagnostic tools, parallel numerical derivatives and more.

Language: Python - Size: 27.3 MB - Last synced at: 10 days ago - Pushed at: 12 days ago - Stars: 307 - Forks: 47

taskflow/work-stealing-queue

A fast work-stealing queue template in C++

Language: C++ - Size: 1010 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 306 - Forks: 39

XiaoSong9905/CUDA-Optimization-Guide

Xiao's CUDA Optimization Guide [Active Adding New Contents]

Size: 36.4 MB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 295 - Forks: 20

IntelLabs/ParallelAccelerator.jl 📦

The ParallelAccelerator package, part of the High Performance Scripting project at Intel Labs

Language: Julia - Size: 45.2 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 294 - Forks: 32

zero-one-group/geni

A Clojure dataframe library that runs on Spark

Language: Clojure - Size: 1.86 MB - Last synced at: 30 days ago - Pushed at: almost 2 years ago - Stars: 292 - Forks: 27

niedakh/pqdm

Comfortable parallel TQDM using concurrent.futures

Language: Python - Size: 86.9 KB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 289 - Forks: 9

r-lib/mirai

Minimalist Async Evaluation Framework for R

Language: R - Size: 14.6 MB - Last synced at: about 16 hours ago - Pushed at: about 16 hours ago - Stars: 288 - Forks: 16

BY571/Soft-Actor-Critic-and-Extensions

PyTorch implementation of Soft-Actor-Critic and Prioritized Experience Replay (PER) + Emphasizing Recent Experience (ERE) + Munchausen RL + D2RL and parallel Environments.

Language: Python - Size: 5.99 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 288 - Forks: 33

Trinkle23897/Fast-Poisson-Image-Editing

A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.

Language: Python - Size: 2.88 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 277 - Forks: 16

ashvardanian/ForkUnion

Lower-latency OpenMP-style minimalistic scoped thread-pool designed for 'Fork-Join' parallelism in Rust and C++, avoiding memory allocations, mutexes, CAS-primitives, and false-sharing on the hot path 🍴

Language: C++ - Size: 612 KB - Last synced at: 11 days ago - Pushed at: 13 days ago - Stars: 276 - Forks: 24

owensgroup/RXMesh

GPU-accelerated triangle mesh processing

Language: Cuda - Size: 11.5 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 274 - Forks: 38

pgiri/dispy

Distributed and Parallel Computing Framework with / for Python

Language: Python - Size: 3.76 MB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 267 - Forks: 54

mfem/PyMFEM

Python wrapper for MFEM

Language: SWIG - Size: 26 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 262 - Forks: 63

sourceryinstitute/OpenCoarrays

A parallel application binary interface for Fortran 2018 compilers.

Language: Fortran - Size: 8.52 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 258 - Forks: 55

DLR-AMR/t8code

Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.

Language: C++ - Size: 147 MB - Last synced at: 6 days ago - Pushed at: 8 days ago - Stars: 238 - Forks: 61

LLNL/SAMRAI

Structured Adaptive Mesh Refinement Application Infrastructure - a scalable C++ framework for block-structured AMR application development

Language: C++ - Size: 71 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 233 - Forks: 85

agenium-scale/boost.simd

Boost SIMD

Size: 192 KB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 232 - Forks: 47

vincentjzy/OpenCorr

Digital Image Correlation & Digital Volume Correlation Library

Language: C++ - Size: 352 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 231 - Forks: 57

charmplusplus/charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.

Language: C++ - Size: 200 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 224 - Forks: 55

LLNL/libROM

Data-driven model reduction library with an emphasis on large scale parallelism and linear subspace methods

Language: C++ - Size: 54.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 222 - Forks: 40

bh107/bohrium

Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX

Language: C++ - Size: 32.4 MB - Last synced at: 11 days ago - Pushed at: almost 5 years ago - Stars: 222 - Forks: 31

Alpine-DAV/ascent

A flyweight in situ visualization and analysis runtime for multi-physics HPC simulations

Language: C++ - Size: 177 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 219 - Forks: 68

futureverse/future.apply

🚀 R package: future.apply - Apply Function to Elements in Parallel using Futures

Language: R - Size: 2.2 MB - Last synced at: 15 days ago - Pushed at: 21 days ago - Stars: 218 - Forks: 19

BWbwchen/MapReduce

An easy-to-use MapReduce parallel-computing framework in Go, inspired by the 2021 6.824 lab1. It currently supports multiple worker threads and multiple processes on a single machine.

Language: Go - Size: 2.6 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 214 - Forks: 13

bueler/p4pdes

C and Python examples from my book on using PETSc and Firedrake to solve PDEs

Language: C - Size: 4.49 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 212 - Forks: 79

privefl/bigsnpr

R package for the analysis of massive SNP arrays.

Language: R - Size: 109 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 210 - Forks: 45

grailbio/bigmachine

Bigmachine is a library for self-managing serverless computing in Go

Language: Go - Size: 635 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 201 - Forks: 20

krABMaga/krABMaga

krABMaga: A modern developing art for reliable and efficient Agent-based Model (ABM) simulation with the Rust language

Language: Rust - Size: 63.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 196 - Forks: 13

xupsh/pp4fpgas-cn-hls

HLS Project of pp4fpgas - https://github.com/xupsh/pp4fpgas-cn

Language: Jupyter Notebook - Size: 58.4 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 194 - Forks: 73

JohannesBuchner/UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.

Language: Python - Size: 167 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 192 - Forks: 31

SCOREC/core

parallel finite element unstructured meshes

Language: C++ - Size: 11.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 189 - Forks: 65

merzlab/QUICK

QUICK: A GPU-enabled ab initio quantum chemistry software package

Language: C - Size: 74.6 MB - Last synced at: about 9 hours ago - Pushed at: about 11 hours ago - Stars: 188 - Forks: 49

SciML/MethodOfLines.jl

Automatic Finite Difference PDE solving with Julia SciML

Language: Julia - Size: 370 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 188 - Forks: 39

siemens/embb

Embedded Multicore Building Blocks (EMB²): Library for parallel programming of embedded systems. Star us on GitHub? +1

Language: C++ - Size: 18.9 MB - Last synced at: 12 days ago - Pushed at: almost 2 years ago - Stars: 188 - Forks: 41