An open API service providing repository metadata for many open source software ecosystems.

Topic: "parallel-computing"

taskflow/taskflow

A General-purpose Task-parallel Programming System using Modern C++

Language: C++ - Size: 142 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 11,405 - Forks: 1,332

joblib/joblib

Computing with Python functions.

Language: Python - Size: 4.24 MB - Last synced at: 10 days ago - Pushed at: 25 days ago - Stars: 4,274 - Forks: 442

OpenNMT/CTranslate2

Fast inference engine for Transformer models

Language: C++ - Size: 15.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 4,151 - Forks: 424

parallel101/course

High-Performance Parallel Programming and Optimization - Course Materials

Language: C++ - Size: 430 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 3,998 - Forks: 551

amilajack/reading

A list of computer-science readings I recommend

Size: 523 MB - Last synced at: 8 months ago - Pushed at: about 3 years ago - Stars: 3,310 - Forks: 728

jofpin/turbit

Build applications, scripts, and automations powered by high-performance multicore computing using Node.js

Language: JavaScript - Size: 179 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2,823 - Forks: 491

jmcarpenter2/swifter

A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

Language: Python - Size: 2.15 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 2,611 - Forks: 104

kokkos/kokkos

Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction

Language: C++ - Size: 38.2 MB - Last synced at: about 21 hours ago - Pushed at: 3 days ago - Stars: 2,382 - Forks: 473

geatpy-dev/geatpy

Evolutionary algorithm toolbox and framework with high performance for Python

Language: Python - Size: 575 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 2,085 - Forks: 731

mfem/mfem

Lightweight, general, scalable C++ library for finite element methods

Language: C++ - Size: 240 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2,037 - Forks: 573

NVIDIA/cccl

CUDA Core Compute Libraries

Language: C++ - Size: 340 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2,035 - Forks: 294

chapel-lang/chapel

a Productive Parallel Programming Language

Language: Chapel - Size: 1010 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,951 - Forks: 436

zwang4/awesome-machine-learning-in-compilers

Must read research papers and links to tools and datasets that are related to using machine learning for compilers and systems optimisation

Size: 394 KB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 1,624 - Forks: 172

dealii/dealii

The development repository for the deal.II finite element library

Language: C++ - Size: 365 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,589 - Forks: 813

VcDevel/Vc

SIMD Vector Classes for C++

Language: C++ - Size: 11.2 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 1,511 - Forks: 153

pyper-dev/pyper

Concurrent Python made simple

Language: Python - Size: 462 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 1,503 - Forks: 30

JuliaSymbolics/Symbolics.jl

Symbolic programming for the next generation of numerical software

Language: Julia - Size: 38.7 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1,470 - Forks: 174

ElmerCSC/elmerfem

Official git repository of Elmer FEM software

Language: Fortran - Size: 125 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1,440 - Forks: 354

mratsim/Arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends

Language: Nim - Size: 3.81 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 1,380 - Forks: 95

beehive-lab/TornadoVM

TornadoVM: A practical and efficient heterogeneous programming framework for managed languages

Language: Java - Size: 160 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1,344 - Forks: 124

python-adaptive/adaptive

📈 Adaptive: parallel active learning of mathematical functions

Language: Python - Size: 5.85 MB - Last synced at: about 16 hours ago - Pushed at: 4 days ago - Stars: 1,206 - Forks: 62

mmstick/parallel 📦

This project now lives on in a rewrite at https://gitlab.redox-os.org/redox-os/parallel

Language: Rust - Size: 411 KB - Last synced at: about 2 months ago - Pushed at: almost 8 years ago - Stars: 1,200 - Forks: 31

KratosMultiphysics/Kratos

Kratos Multiphysics (a.k.a. Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos has a BSD license and is written in C++ with an extensive Python interface.

Language: C++ - Size: 2.08 GB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1,192 - Forks: 274

inducer/pyopencl

OpenCL integration for Python, plus shiny features

Language: Python - Size: 5.77 MB - Last synced at: about 21 hours ago - Pushed at: 4 days ago - Stars: 1,120 - Forks: 248

gunrock/gunrock

Programmable CUDA/C++ GPU Graph Analytics

Language: C++ - Size: 74.6 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1,039 - Forks: 212

OSGeo/grass

GRASS - free and open-source geospatial processing engine

Language: C - Size: 456 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,036 - Forks: 372

AppiumTestDistribution/AppiumTestDistribution

A tool for running android and iOS appium tests in parallel across devices... U like it STAR it !

Language: Java - Size: 110 MB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 1,036 - Forks: 366

FEniCS/dolfinx

Next generation FEniCS problem solving environment

Language: C++ - Size: 65.8 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,004 - Forks: 224

futureverse/future

🚀 R package: future: Unified Parallel and Distributed Processing in R for Everyone

Language: R - Size: 15.7 MB - Last synced at: about 20 hours ago - Pushed at: 3 days ago - Stars: 994 - Forks: 92

AccelerateHS/accelerate

Embedded language for high-performance array computations

Language: Haskell - Size: 15.5 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 942 - Forks: 130

esa/pagmo2

A C++ platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.

Language: C++ - Size: 58.4 MB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 896 - Forks: 171

mpi4py/mpi4py

Python bindings for MPI

Language: Python - Size: 9.66 MB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 883 - Forks: 131

ConorWilliams/libfork

A bleeding-edge, lock-free, wait-free, continuation-stealing tasking library built on C++20's coroutines

Language: C++ - Size: 6.56 MB - Last synced at: 8 days ago - Pushed at: 9 months ago - Stars: 765 - Forks: 36

taskflow/awesome-parallel-computing

A curated list of awesome parallel computing resources

Size: 3.42 MB - Last synced at: 10 days ago - Pushed at: 22 days ago - Stars: 760 - Forks: 70

uxlfoundation/oneMath

oneAPI Math Library (oneMath)

Language: C++ - Size: 11.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 717 - Forks: 172

IntelPython/sdc 📦

Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler

Language: Python - Size: 15.8 MB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 644 - Forks: 64

OpenTimer/OpenTimer

A High-performance Timing Analysis Tool for VLSI Systems

Language: Verilog - Size: 329 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 633 - Forks: 159

LLNL/sundials

Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.

Language: C - Size: 251 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 611 - Forks: 159

nwchemgit/nwchem

NWChem: Open Source High-Performance Computational Chemistry

Language: Fortran - Size: 342 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 573 - Forks: 180

LLNL/RAJA

RAJA Performance Portability Layer (C++)

Language: C++ - Size: 39.2 MB - Last synced at: about 22 hours ago - Pushed at: 3 days ago - Stars: 549 - Forks: 109

SimonBlanke/Hyperactive

An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

Language: Python - Size: 31.2 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 529 - Forks: 53

alesgenova/post-me

📩 Use web Workers and other Windows through a simple Promise API

Language: TypeScript - Size: 801 KB - Last synced at: 25 days ago - Pushed at: almost 5 years ago - Stars: 520 - Forks: 13

NGSolve/ngsolve

Netgen/NGSolve is a high performance multiphysics finite element software. It is widely used to analyze models from solid mechanics, fluid dynamics and electromagnetics. Due to its flexible Python interface new physical equations and solution algorithms can be implemented easily.

Language: C++ - Size: 60.2 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 507 - Forks: 87

esa/pygmo2

A Python platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.

Language: C++ - Size: 13.9 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 507 - Forks: 65

mpi4jax/mpi4jax

Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python ⚡

Language: Python - Size: 5.08 MB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 498 - Forks: 31

01alchemist/TurboScript 📦

Super charged typed JavaScript dialect for parallel programming which compiles to WebAssembly

Language: JavaScript - Size: 13.2 MB - Last synced at: 4 months ago - Pushed at: about 8 years ago - Stars: 496 - Forks: 35

FAST-Imaging/FAST

A framework for high-performance medical image processing, neural network inference and visualization

Language: C++ - Size: 20.1 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 488 - Forks: 108

constellation-rs/amadeus

Harmonious distributed data analysis in Rust.

Language: Rust - Size: 2.46 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 482 - Forks: 25

mindspore-courses/step_into_llm

MindSpore online courses: Step into LLM

Language: Jupyter Notebook - Size: 246 MB - Last synced at: 8 days ago - Pushed at: 10 days ago - Stars: 480 - Forks: 123

luispedro/jug

Parallel programming with Python

Language: Python - Size: 2.34 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 459 - Forks: 62

pipefunc/pipefunc

Lightweight fast function pipeline (DAG) creation in pure Python for scientific (HPC) workflows 🕸️🧪

Language: Python - Size: 2.81 MB - Last synced at: about 5 hours ago - Pushed at: 2 days ago - Stars: 430 - Forks: 17

ARM-software/mango

Parallel Hyperparameter Tuning in Python

Language: Jupyter Notebook - Size: 54.6 MB - Last synced at: 19 days ago - Pushed at: 9 months ago - Stars: 418 - Forks: 47

lehins/massiv

Efficient Haskell Arrays featuring Parallel computation

Language: Haskell - Size: 6.65 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 403 - Forks: 26

ChunelFeng/CThreadPool

A simple, easy-to-use, high-performance, cross-platform C++ thread pool. Stars and forks welcome.

Language: C++ - Size: 184 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 399 - Forks: 74

cmuparlay/parlaylib

A Toolkit for Programming Parallel Algorithms on Shared-Memory Multicore Machines

Language: C++ - Size: 1.18 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 395 - Forks: 75

SmileiPIC/Smilei

Particle-in-cell code for plasma simulation

Language: C++ - Size: 118 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 393 - Forks: 133

BitairLabs/concurrent.js

Non-blocking Concurrent Computation for JavaScript RTEs (Web Browsers, Node.js, Deno & Bun)

Language: TypeScript - Size: 283 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 388 - Forks: 6

chengzeyi/ParaAttention

https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching

Language: Python - Size: 13.4 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 385 - Forks: 38

GraphIt-DSL/graphit

GraphIt - A High-Performance Domain Specific Language for Graph Analytics

Language: C++ - Size: 8.48 MB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 377 - Forks: 46

kysucix/gipuma

Massively Parallel Multiview Stereopsis by Surface Normal Diffusion

Language: C++ - Size: 144 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 349 - Forks: 104

tirthajyoti/Spark-with-Python

Fundamentals of Spark with Python (using PySpark), code examples

Language: Jupyter Notebook - Size: 8.97 MB - Last synced at: 6 months ago - Pushed at: about 3 years ago - Stars: 347 - Forks: 271

dionhaefner/pyhpc-benchmarks

A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python 🚀

Language: Python - Size: 1.19 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 330 - Forks: 27

mtmucha/coros

An easy-to-use and fast library for task-based parallelism, utilizing coroutines.

Language: C++ - Size: 724 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 329 - Forks: 8

bodo-ai/Bodo

High Performance Data Processing in Python

Language: Python - Size: 739 MB - Last synced at: about 22 hours ago - Pushed at: 1 day ago - Stars: 328 - Forks: 14

feelpp/feelpp

💎 Feel++: Finite Element Embedded Language and Library in C++

Language: C++ - Size: 349 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 325 - Forks: 68

pothosware/PothosCore

The Pothos data-flow framework

Language: C++ - Size: 63 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 314 - Forks: 50

optimagic-dev/optimagic

optimagic is a Python package for numerical optimization. It is a unified interface to optimizers from SciPy, NlOpt and other packages. optimagic's minimize function works just like SciPy's, so you don't have to adjust your code. You simply get more optimizers for free. On top you get diagnostic tools, parallel numerical derivatives and more.

Language: Python - Size: 27.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 309 - Forks: 47

taskflow/work-stealing-queue

A fast work-stealing queue template in C++

Language: C++ - Size: 1010 KB - Last synced at: 6 months ago - Pushed at: almost 2 years ago - Stars: 306 - Forks: 39

niedakh/pqdm

Comfortable parallel TQDM using concurrent.futures

Language: Python - Size: 86.9 KB - Last synced at: 19 days ago - Pushed at: 12 months ago - Stars: 298 - Forks: 9

XiaoSong9905/CUDA-Optimization-Guide

Xiao's CUDA Optimization Guide [Active Adding New Contents]

Size: 36.4 MB - Last synced at: 7 months ago - Pushed at: about 3 years ago - Stars: 295 - Forks: 20

zero-one-group/geni

A Clojure dataframe library that runs on Spark

Language: Clojure - Size: 1.86 MB - Last synced at: 25 days ago - Pushed at: about 2 years ago - Stars: 294 - Forks: 27

IntelLabs/ParallelAccelerator.jl 📦

The ParallelAccelerator package, part of the High Performance Scripting project at Intel Labs

Language: Julia - Size: 45.2 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 294 - Forks: 32

r-lib/mirai

Minimalist Async Evaluation Framework for R

Language: R - Size: 15.6 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 292 - Forks: 17

BY571/Soft-Actor-Critic-and-Extensions

PyTorch implementation of Soft-Actor-Critic and Prioritized Experience Replay (PER) + Emphasizing Recent Experience (ERE) + Munchausen RL + D2RL and parallel Environments.

Language: Python - Size: 5.99 MB - Last synced at: 4 months ago - Pushed at: almost 5 years ago - Stars: 288 - Forks: 33

owensgroup/RXMesh

GPU-accelerated triangle mesh processing

Language: Cuda - Size: 11 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 281 - Forks: 39

Trinkle23897/Fast-Poisson-Image-Editing

A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.

Language: Python - Size: 2.88 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 277 - Forks: 16

ashvardanian/ForkUnion

Lower-latency OpenMP-style minimalistic scoped thread-pool designed for 'Fork-Join' parallelism in Rust and C++, avoiding memory allocations, mutexes, CAS-primitives, and false-sharing on the hot path 🍴

Language: C++ - Size: 612 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 276 - Forks: 24

mfem/PyMFEM

Python wrapper for MFEM

Language: SWIG - Size: 26.1 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 273 - Forks: 64

pgiri/dispy

Distributed and Parallel Computing Framework with / for Python

Language: Python - Size: 3.76 MB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 267 - Forks: 53

sourceryinstitute/OpenCoarrays

A parallel application binary interface for Fortran 2018 compilers.

Language: Fortran - Size: 8.52 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 258 - Forks: 55

DLR-AMR/t8code

Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.

Language: C++ - Size: 147 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 246 - Forks: 61

LLNL/SAMRAI

Structured Adaptive Mesh Refinement Application Infrastructure - a scalable C++ framework for block-structured AMR application development

Language: C++ - Size: 71 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 234 - Forks: 86

agenium-scale/boost.simd

Boost SIMD

Size: 192 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 232 - Forks: 47

vincentjzy/OpenCorr

Digital Image Correlation & Digital Volume Correlation Library

Language: C++ - Size: 352 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 231 - Forks: 57

LLNL/libROM

Data-driven model reduction library with an emphasis on large scale parallelism and linear subspace methods

Language: C++ - Size: 54.5 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 225 - Forks: 43

Alpine-DAV/ascent

A flyweight in situ visualization and analysis runtime for multi-physics HPC simulations

Language: C++ - Size: 177 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 224 - Forks: 71

charmplusplus/charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.

Language: C++ - Size: 199 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 224 - Forks: 55

bh107/bohrium

Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX

Language: C++ - Size: 32.4 MB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 222 - Forks: 31

futureverse/future.apply

🚀 R package: future.apply - Apply Function to Elements in Parallel using Futures

Language: R - Size: 2.2 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 218 - Forks: 19

BWbwchen/MapReduce

An easy-to-use MapReduce parallel-computing framework in Go, inspired by the 2021 6.824 lab1. It currently supports multiple worker threads and multiple processes on a single machine.

Language: Go - Size: 2.6 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 214 - Forks: 13

bueler/p4pdes

C and Python examples from my book on using PETSc and Firedrake to solve PDEs

Language: C - Size: 4.49 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 213 - Forks: 79

privefl/bigsnpr

R package for the analysis of massive SNP arrays.

Language: R - Size: 109 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 210 - Forks: 45

grailbio/bigmachine

Bigmachine is a library for self-managing serverless computing in Go

Language: Go - Size: 635 KB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 201 - Forks: 20

krABMaga/krABMaga

krABMaga: A modern developing art for reliable and efficient Agent-based Model (ABM) simulation with the Rust language

Language: Rust - Size: 63.6 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 196 - Forks: 13

xupsh/pp4fpgas-cn-hls

HLS Project of pp4fpgas - https://github.com/xupsh/pp4fpgas-cn

Language: Jupyter Notebook - Size: 58.4 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 194 - Forks: 73

JohannesBuchner/UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.

Language: Python - Size: 167 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 192 - Forks: 31

merzlab/QUICK

QUICK: A GPU-enabled ab initio quantum chemistry software package

Language: C - Size: 74.6 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 191 - Forks: 50

SciML/MethodOfLines.jl

Automatic Finite Difference PDE solving with Julia SciML

Language: Julia - Size: 370 MB - Last synced at: 19 days ago - Pushed at: 23 days ago - Stars: 190 - Forks: 39

SCOREC/core

parallel finite element unstructured meshes

Language: C++ - Size: 12.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 189 - Forks: 65

siemens/embb

Embedded Multicore Building Blocks (EMB²): Library for parallel programming of embedded systems. Star us on GitHub? +1

Language: C++ - Size: 18.9 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 188 - Forks: 41
