An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: heterogeneous-parallel-programming

inducer/pyopencl

OpenCL integration for Python, plus shiny features

Language: Python - Size: 5.59 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,096 - Forks: 246

taskflow/taskflow

A General-purpose Task-parallel Programming System using Modern C++

Language: C++ - Size: 137 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 10,810 - Forks: 1,272

pocl/pocl

pocl - Portable Computing Language

Language: C - Size: 60.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 979 - Forks: 265

JuliaGPU/KernelAbstractions.jl

Heterogeneous programming in Julia

Language: Julia - Size: 3.88 MB - Last synced at: 6 days ago - Pushed at: 13 days ago - Stars: 425 - Forks: 74

flecsi/flecsi

Flexible Computational Science (FleCSI) Project

Language: C++ - Size: 373 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 17 - Forks: 11

mosaic-group/openfpm

OpenFPM: A scalable open framework for particle and particle-mesh codes on parallel computers

Language: C++ - Size: 35.8 MB - Last synced at: 23 days ago - Pushed at: 24 days ago - Stars: 17 - Forks: 11

bsc-pm-ompss-at-fpga/ait

The Accelerator Integration Tool (AIT) automatically integrates OmpSs@FPGA accelerators into FPGA designs using different vendor backends

Language: Tcl - Size: 10.5 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 5 - Forks: 2

taskflow/awesome-parallel-computing

A curated list of awesome parallel computing resources

Size: 3.41 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 722 - Forks: 68

bsc-pm-ompss-at-fpga/xtasks

Library implementing a common interface to manage FPGA tasks

Language: C - Size: 698 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

alpaka-group/alpaka

Abstraction Library for Parallel Kernel Acceleration :llama:

Language: C++ - Size: 17.9 MB - Last synced at: 27 days ago - Pushed at: about 1 month ago - Stars: 372 - Forks: 76

pulp-platform/hero

Heterogeneous Research Platform (HERO) for exploration of heterogeneous computers consisting of programmable many-core accelerators and an application-class host CPU, including full-stack software and hardware.

Language: SystemVerilog - Size: 61.8 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 101 - Forks: 25

triSYCL/triSYCL

Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group

Language: C++ - Size: 382 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 441 - Forks: 98

Heteroflow/Heteroflow

Concurrent CPU-GPU Programming using Task Models

Language: C++ - Size: 1.58 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 101 - Forks: 13

embeddedcrab/stm32mp1_multicore_comm

Projects done on the STM32MP157C-DK2 kit. Communication between multiple cores in a multithreaded environment using C/C++.

Language: C - Size: 23.4 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

artecs-group/Juliana.jl

A tool for converting specific Julia GPU code written in CUDA.jl into abstract multi-backend code with KernelAbstractions.jl.

Language: Julia - Size: 210 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

GMAP/NPB-GPU

NAS Parallel Benchmarks for evaluating GPU and APIs

Language: C++ - Size: 321 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 22 - Forks: 7

PlatformAwareProgramming/PlatformAware.jl

Platform-aware programming in Julia

Language: Julia - Size: 1.56 MB - Last synced at: 19 days ago - Pushed at: 5 months ago - Stars: 13 - Forks: 1

bsc-pm-ompss-at-fpga/ompss-2-at-fpga-releases

Meta-repository for OmpSs-2@FPGA releases

Language: Makefile - Size: 67.4 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 1

unisa-hpc/SYgraph

A portable header-only library for graph analytics tasks on heterogeneous GPUs

Language: C++ - Size: 351 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 4 - Forks: 0

beehive-lab/levelzero-jni

Intel LevelZero JNI library for TornadoVM

Language: Java - Size: 333 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 12 - Forks: 6

eZWALT/Graphic-Cards-And-Accelerators (fork of AleexHrB/TGA-FIB)

FIB Graphic cards and accelerators (TGA) 2023-24 Q1 final project

Language: C - Size: 1.06 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

sejoonoh/GTA-Tensor

High-Performance Tucker Factorization on Heterogeneous Platforms (GTA) - TPDS 2019

Language: C++ - Size: 3.12 MB - Last synced at: 9 months ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 3

tkob-vh/CUDA_kernels

Some general algorithms implemented in CUDA.

Language: Cuda - Size: 76.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

intel/hdk 📦

A low-level execution library for analytic data processing.

Language: C++ - Size: 66.3 MB - Last synced at: 9 months ago - Pushed at: about 1 year ago - Stars: 30 - Forks: 14

cggos/hpc

High-Performance Computing: CPU Instructions, GPU OpenCL & CUDA, etc. :sunny:

Language: Python - Size: 2.66 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 14 - Forks: 4

shaanzie/QuickRef

Quick references to notes on specific topics and their basic introductions

Language: C - Size: 10.3 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 12 - Forks: 1

nellogan/distributed_compy

Distributed_compy is a distributed computing library that offers multi-threading, heterogeneous (CPU + multi-GPU), and multi-node support

Language: Python - Size: 6.83 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

ParCoreLab/BeyondMoore

BeyondMoore has an ambitious goal to develop a software framework that performs static and dynamic optimizations, issues accelerator-initiated data transfers, and reasons about parallel execution strategies that exploit both processor and memory heterogeneity.

Size: 25.4 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

tudorv91/SparkJNI

A heterogeneous Apache Spark framework.

Language: Java - Size: 5.83 MB - Last synced at: about 2 months ago - Pushed at: about 8 years ago - Stars: 19 - Forks: 2

codes1gn/chopper 📦

Composable Computing Platform targeting Large-scale Heterogeneous Computing

Language: C++ - Size: 31.3 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

GuillermoFdez98/Video-pipeline-for-event-based-sensors

This project provides a video pipeline using event-based sensors to capture the vision process. It can run on a PC and on the Xilinx Pynq-Z2, using an abstraction library between the user and the architecture. The repository contains the main code and the designs of the complete system.

Language: VHDL - Size: 87.2 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

deevashwer/Heterogeneous-GPU-Connected-Components

Heterogeneous Parallel implementation to solve the Connected Components problem using OpenMP, CUDA and OpenCL.

Language: Cuda - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 2

tugrul512bit/UfSaCL

Ultra fast simulated annealing with OpenCL & multiple accelerators, GPUs, CPUs.

Language: C++ - Size: 310 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

soumyasen1809/SYCL-Basics

Basic introduction to SYCL

Language: C++ - Size: 61.5 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

robclu/ripple

A library for simplified distributed computing across any heterogeneous architectures (cpu + gpu), with tensor support, and polymorphic data layouts for optimal performance! Ripple enables you to scale quickly without sacrificing performance!

Language: C++ - Size: 1.47 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 0

chenxuhao/gardenia

GARDENIA: Graph Analytics Repository for Designing Efficient Next-generation Accelerators

Language: C++ - Size: 1.24 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 26 - Forks: 7

SuperbTUM/GloVe-GPU

GloVe representation with parallelization

Language: Python - Size: 3.31 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

architector1324/EasyCL

OpenCL-based lightweight C++ computing library

Language: C++ - Size: 311 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 7 - Forks: 0

pulp-platform/hero-gcc-toolchain 📦

⛔ DEPRECATED ⛔ HERO toolchain with support for RISC-V offloading over OpenMP Accelerator Execution Model

Language: Shell - Size: 65.4 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

pulp-platform/hero-openmp-examples 📦

⛔ DEPRECATED ⛔ HERO OpenMP Heterogeneous Execution Model Examples

Language: C - Size: 1.45 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

alessandrocapotondi/nvidia-jetson-llvm-builder

Builder script for Clang/LLVM10 compiler for Nvidia Jetson Nano (could be extended to other Jetson boards) with OpenMP 4.5 offloading support.

Size: 24.4 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 0

zaen-archive/weasel

Weasel language is a project I created as a proof of concept that heterogeneous computing can be supported internally within a language.

Language: C++ - Size: 429 KB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

taimurrabuske/adc_ica_calibration_opencl

Simulating ADC background calibration algorithms in OpenCL

Language: Python - Size: 1.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

pulp-platform/hero-sdk 📦

⛔ DEPRECATED ⛔ HERO Software Development Kit

Language: Shell - Size: 1.04 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 21 - Forks: 7

yas-sim/python-dpcpp-extension-sample-code

Python extension sample code using Intel oneAPI DPC++. The extension does a simple image processing using DPC++ kernel.

Language: C++ - Size: 1.89 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

bsc-pm-ompss-at-fpga/ompss-at-fpga-releases

Meta-repository for OmpSs@FPGA releases

Language: Dockerfile - Size: 186 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

mcsweeney90/heterogeneous_optimistic_finish_time

To accompany the paper "An efficient new static scheduling heuristic for accelerated architectures".

Language: Python - Size: 2.72 GB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 4

vineeths96/Heterogeneous-Systems

We present an algorithm to dynamically adjust the data assigned for each worker at every epoch during the training in a heterogeneous cluster. We empirically evaluate the performance of the dynamic partitioning by training deep neural networks on the CIFAR10 dataset.

Language: Python - Size: 1.02 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 1

nctu-homeworks/PP-hw5 📦

Language: C - Size: 156 KB - Last synced at: over 1 year ago - Pushed at: over 10 years ago - Stars: 0 - Forks: 0

nctu-homeworks/PP-hw4 📦

Language: Cuda - Size: 164 KB - Last synced at: over 1 year ago - Pushed at: over 10 years ago - Stars: 0 - Forks: 0

Tolisz/blurhash

blurhash algorithm implemented on GPU (OpenCL)

Language: C - Size: 2.42 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

nkusla/epidemic-GPU-optimization

Repository for scientific project in Petnica Science Center (2020)

Language: C - Size: 319 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

esa-tu-darmstadt/daphne-benchmark

The Darmstadt Automotive Parallel HeterogeNEous (DAPHNE) Benchmark-Suite

Language: C++ - Size: 18.3 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 1

olutosinbanjo/Hello_World_dpcpp

Intel Data Parallel C++, DPC++, for beginners

Language: C++ - Size: 1.23 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

architector1324/EasyCL2

OpenCL-based lightweight C computing library

Language: C - Size: 35.2 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

g4m3r0/MagmaDNN-Benchmarsuite

MagmaDNN benchmark suite for heterogeneous architectures

Language: C++ - Size: 14.6 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

nielsAD/hgb

Graph-processing benchmarking framework that targets heterogeneous architectures.

Language: C - Size: 920 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

lucasdavid/convolutional

Convolutional nets implemented in PyCUDA.

Language: Python - Size: 16.5 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

PRiME-project/PRiME-Framework

Repository for development of the PRiME Framework software.

Language: C++ - Size: 11.2 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 2 - Forks: 2

ixiDev/OpenCLHelloWorld

An example of OpenCL/C++ helloWorld programme

Language: C++ - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

whitelok/gpu-computation-gems-codes

Practical example code for GPU Computing Gems Jade Edition (2012)

Language: Cuda - Size: 155 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

SuperbTUM/kmeans-pycuda

A general k-means algorithm with L2 distance using PyCUDA

Language: Jupyter Notebook - Size: 499 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

GLaDAP/heterogeneous_computing_project

Heterogeneous parallel programming exercise using OpenMP and CUDA to parallelize image filters

Language: C - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

KaoCC/HeterogeneousQueue

The Heterogeneous Queuing Framework utilizing Fibers

Language: C++ - Size: 2.8 MB - Last synced at: 29 days ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 1

DDreher/OpenCLSandbox

A sandbox project to learn the concepts of working with massively parallel processors using OpenCL.

Language: C++ - Size: 232 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

tanmayv25/GaussianProcessRegression

Using Nvidia K20 to accelerate Gaussian Process Regression

Language: Cuda - Size: 72.3 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

stormsinbrewing/HSLOG

An automated heterogeneous log management script created in Python and automated using a DevOps pipeline in the ELK Stack.

Size: 458 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

7aitsev/ocl_labs

Computing with OpenCL: labs for a part of a Parallel Programming course

Language: C - Size: 42 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

echuraev/OpenCL-Practice

This repository contains source code of practices from different presentations that I made about OpenCL.

Language: C++ - Size: 545 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

KaoCC/HeteroBench

HeteroBench is a collection of numerous OpenCL benchmarks

Language: C++ - Size: 2.04 MB - Last synced at: 2 months ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0