GitHub topics: heterogeneous-parallel-programming
inducer/pyopencl
OpenCL integration for Python, plus shiny features
Language: Python - Size: 5.59 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,096 - Forks: 246

taskflow/taskflow
A General-purpose Task-parallel Programming System using Modern C++
Language: C++ - Size: 137 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 10,810 - Forks: 1,272

pocl/pocl
pocl - Portable Computing Language
Language: C - Size: 60.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 979 - Forks: 265

JuliaGPU/KernelAbstractions.jl
Heterogeneous programming in Julia
Language: Julia - Size: 3.88 MB - Last synced at: 6 days ago - Pushed at: 13 days ago - Stars: 425 - Forks: 74

flecsi/flecsi
Flexible Computational Science (FleCSI) Project
Language: C++ - Size: 373 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 17 - Forks: 11

mosaic-group/openfpm
OpenFPM: A scalable open framework for particle and particle-mesh codes on parallel computers
Language: C++ - Size: 35.8 MB - Last synced at: 23 days ago - Pushed at: 24 days ago - Stars: 17 - Forks: 11

bsc-pm-ompss-at-fpga/ait
The Accelerator Integration Tool (AIT) automatically integrates OmpSs@FPGA accelerators into FPGA designs using different vendor backends
Language: Tcl - Size: 10.5 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 5 - Forks: 2

taskflow/awesome-parallel-computing
A curated list of awesome parallel computing resources
Size: 3.41 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 722 - Forks: 68

bsc-pm-ompss-at-fpga/xtasks
Library implementing a common interface to manage FPGA tasks
Language: C - Size: 698 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

alpaka-group/alpaka
Abstraction Library for Parallel Kernel Acceleration :llama:
Language: C++ - Size: 17.9 MB - Last synced at: 27 days ago - Pushed at: about 1 month ago - Stars: 372 - Forks: 76

pulp-platform/hero
Heterogeneous Research Platform (HERO) for exploration of heterogeneous computers consisting of programmable many-core accelerators and an application-class host CPU, including full-stack software and hardware.
Language: SystemVerilog - Size: 61.8 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 101 - Forks: 25

triSYCL/triSYCL
Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group
Language: C++ - Size: 382 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 441 - Forks: 98

Heteroflow/Heteroflow
Concurrent CPU-GPU Programming using Task Models
Language: C++ - Size: 1.58 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 101 - Forks: 13

embeddedcrab/stm32mp1_multicore_comm
Projects done on the STM32MP157C-DK2 kit. Communication between multiple cores in a multithreaded environment using C/C++.
Language: C - Size: 23.4 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

artecs-group/Juliana.jl
A tool for converting specific Julia GPU code written in CUDA.jl into abstract multi-backend code with KernelAbstractions.jl.
Language: Julia - Size: 210 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

GMAP/NPB-GPU
NAS Parallel Benchmarks for evaluating GPU and APIs
Language: C++ - Size: 321 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 22 - Forks: 7

PlatformAwareProgramming/PlatformAware.jl
Platform-aware programming in Julia
Language: Julia - Size: 1.56 MB - Last synced at: 19 days ago - Pushed at: 5 months ago - Stars: 13 - Forks: 1

bsc-pm-ompss-at-fpga/ompss-2-at-fpga-releases
Meta-repository for OmpSs-2@FPGA releases
Language: Makefile - Size: 67.4 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 1

unisa-hpc/SYgraph
A portable header-only library for graph analytics tasks on heterogeneous GPUs
Language: C++ - Size: 351 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 4 - Forks: 0

beehive-lab/levelzero-jni
Intel LevelZero JNI library for TornadoVM
Language: Java - Size: 333 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 12 - Forks: 6

eZWALT/Graphic-Cards-And-Accelerators Fork of AleexHrB/TGA-FIB
FIB Graphic cards and accelerators (TGA) 2023-24 Q1 final project
Language: C - Size: 1.06 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

sejoonoh/GTA-Tensor
High-Performance Tucker Factorization on Heterogeneous Platforms (GTA) - TPDS 2019
Language: C++ - Size: 3.12 MB - Last synced at: 9 months ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 3

tkob-vh/CUDA_kernels
Some general algorithms implemented in CUDA.
Language: Cuda - Size: 76.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

intel/hdk 📦
A low-level execution library for analytic data processing.
Language: C++ - Size: 66.3 MB - Last synced at: 9 months ago - Pushed at: about 1 year ago - Stars: 30 - Forks: 14

cggos/hpc
High-Performance Computing: CPU Instructions, GPU OpenCL & CUDA, etc. :sunny:
Language: Python - Size: 2.66 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 14 - Forks: 4

shaanzie/QuickRef
Quick references to notes on specific topics and their basic introductions
Language: C - Size: 10.3 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 12 - Forks: 1

nellogan/distributed_compy
Distributed_compy is a distributed computing library that offers multi-threading, heterogeneous (CPU + multi-GPU), and multi-node support
Language: Python - Size: 6.83 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

ParCoreLab/BeyondMoore
BeyondMoore has an ambitious goal to develop a software framework that performs static and dynamic optimizations, issues accelerator-initiated data transfers, and reasons about parallel execution strategies that exploit both processor and memory heterogeneity.
Size: 25.4 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

tudorv91/SparkJNI
A heterogeneous Apache Spark framework.
Language: Java - Size: 5.83 MB - Last synced at: about 2 months ago - Pushed at: about 8 years ago - Stars: 19 - Forks: 2

codes1gn/chopper 📦
Composable Computing Platform targeting Large-scale Heterogeneous Computing
Language: C++ - Size: 31.3 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

GuillermoFdez98/Video-pipeline-for-event-based-sensors
This project provides a video pipeline using event-based sensors to capture the vision process. It can run on PC and Xilinx Pynq-Z2, using an abstraction library between the user and the architecture. The repository contains the main codes and the designs of the complete system.
Language: VHDL - Size: 87.2 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

deevashwer/Heterogeneous-GPU-Connected-Components
Heterogeneous Parallel implementation to solve the Connected Components problem using OpenMP, CUDA and OpenCL.
Language: Cuda - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 2

tugrul512bit/UfSaCL
Ultra fast simulated annealing with OpenCL & multiple accelerators, GPUs, CPUs.
Language: C++ - Size: 310 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

soumyasen1809/SYCL-Basics
Basic introduction to SYCL
Language: C++ - Size: 61.5 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

robclu/ripple
A library for simplified distributed computing across any heterogeneous architectures (cpu + gpu), with tensor support, and polymorphic data layouts for optimal performance! Ripple enables you to scale quickly without sacrificing performance!
Language: C++ - Size: 1.47 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 0

chenxuhao/gardenia
GARDENIA: Graph Analytics Repository for Designing Efficient Next-generation Accelerators
Language: C++ - Size: 1.24 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 26 - Forks: 7

SuperbTUM/GloVe-GPU
GloVe representation with parallelization
Language: Python - Size: 3.31 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

architector1324/EasyCL
OpenCL-based lightweight C++ computing library
Language: C++ - Size: 311 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 7 - Forks: 0

pulp-platform/hero-gcc-toolchain 📦
⚠ DEPRECATED ⚠ HERO toolchain with support for RISC-V offloading over OpenMP Accelerator Execution Model
Language: Shell - Size: 65.4 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

pulp-platform/hero-openmp-examples 📦
⚠ DEPRECATED ⚠ HERO OpenMP Heterogeneous Execution Model Examples
Language: C - Size: 1.45 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

alessandrocapotondi/nvidia-jetson-llvm-builder
Builder script for Clang/LLVM10 compiler for Nvidia Jetson Nano (could be extended to other Jetson boards) with OpenMP 4.5 offloading support.
Size: 24.4 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 0

zaen-archive/weasel
Weasel language is a project I created as a proof of concept that heterogeneous computing can be supported internally within a language.
Language: C++ - Size: 429 KB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

taimurrabuske/adc_ica_calibration_opencl
Simulating ADC background calibration algorithms in OpenCL
Language: Python - Size: 1.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

pulp-platform/hero-sdk 📦
⚠ DEPRECATED ⚠ HERO Software Development Kit
Language: Shell - Size: 1.04 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 21 - Forks: 7

yas-sim/python-dpcpp-extension-sample-code
Python extension sample code using Intel oneAPI DPC++. The extension does a simple image processing using DPC++ kernel.
Language: C++ - Size: 1.89 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

bsc-pm-ompss-at-fpga/ompss-at-fpga-releases
Meta-repository for OmpSs@FPGA releases
Language: Dockerfile - Size: 186 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

mcsweeney90/heterogeneous_optimistic_finish_time
To accompany the paper "An efficient new static scheduling heuristic for accelerated architectures".
Language: Python - Size: 2.72 GB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 4

vineeths96/Heterogeneous-Systems
We present an algorithm to dynamically adjust the data assigned for each worker at every epoch during the training in a heterogeneous cluster. We empirically evaluate the performance of the dynamic partitioning by training deep neural networks on the CIFAR10 dataset.
Language: Python - Size: 1.02 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 1

nctu-homeworks/PP-hw5 📦
Language: C - Size: 156 KB - Last synced at: over 1 year ago - Pushed at: over 10 years ago - Stars: 0 - Forks: 0

nctu-homeworks/PP-hw4 📦
Language: Cuda - Size: 164 KB - Last synced at: over 1 year ago - Pushed at: over 10 years ago - Stars: 0 - Forks: 0

Tolisz/blurhash
blurhash algorithm implemented on GPU (OpenCL)
Language: C - Size: 2.42 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

nkusla/epidemic-GPU-optimization
Repository for scientific project in Petnica Science Center (2020)
Language: C - Size: 319 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

esa-tu-darmstadt/daphne-benchmark
The Darmstadt Automotive Parallel HeterogeNEous (DAPHNE) Benchmark-Suite
Language: C++ - Size: 18.3 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 1

olutosinbanjo/Hello_World_dpcpp
Intel Data Parallel C++, DPC++, for beginners
Language: C++ - Size: 1.23 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

architector1324/EasyCL2
OpenCL-based lightweight C computing library
Language: C - Size: 35.2 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

g4m3r0/MagmaDNN-Benchmarsuite
MagmaDNN Benchmarksuite for heterogeneous architectures
Language: C++ - Size: 14.6 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

nielsAD/hgb
Graph-processing benchmarking framework that targets heterogeneous architectures.
Language: C - Size: 920 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

lucasdavid/convolutional
Convolutional Nets implemented in PyCUDA.
Language: Python - Size: 16.5 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

PRiME-project/PRiME-Framework
Repository for development of the PRiME Framework software.
Language: C++ - Size: 11.2 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 2 - Forks: 2

ixiDev/OpenCLHelloWorld
An example of an OpenCL/C++ "Hello World" program
Language: C++ - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

whitelok/gpu-computation-gems-codes
GPU Computing Gems Jade 2012 Edition: practical example code
Language: Cuda - Size: 155 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

SuperbTUM/kmeans-pycuda
A general k-means algorithm with L2 distance using PyCUDA
Language: Jupyter Notebook - Size: 499 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

GLaDAP/heterogeneous_computing_project
Heterogeneous parallel programming exercise using OpenMP and CUDA to parallelize image filters
Language: C - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

KaoCC/HeterogeneousQueue
The Heterogeneous Queuing Framework utilizing Fibers
Language: C++ - Size: 2.8 MB - Last synced at: 29 days ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 1

DDreher/OpenCLSandbox
A sandbox project to learn the concepts of working with massively parallel processors using OpenCL.
Language: C++ - Size: 232 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

tanmayv25/GaussianProcessRegression
Using Nvidia K20 to accelerate Gaussian Process Regression
Language: Cuda - Size: 72.3 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

stormsinbrewing/HSLOG
An automated heterogeneous log management script written in Python and automated using a DevOps pipeline in the ELK Stack.
Size: 458 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

7aitsev/ocl_labs
Computing with OpenCL: labs for a part of a Parallel Programming course
Language: C - Size: 42 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

echuraev/OpenCL-Practice
This repository contains source code of practices from different presentations that I made about OpenCL.
Language: C++ - Size: 545 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

KaoCC/HeteroBench
HeteroBench is a collection of numerous OpenCL benchmarks
Language: C++ - Size: 2.04 MB - Last synced at: 2 months ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0
