GitHub topics: openacc
trholding/llama2.c Fork of karpathy/llama2.c
Llama 2 Everywhere (L2E)
Language: C - Size: 6.95 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 1,517 - Forks: 43

MFlowCode/MFC
Exascale simulation of multiphase/physics fluid dynamics
Language: Fortran - Size: 517 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 221 - Forks: 98

pyccel/pyccel
Python extension language using accelerators
Language: Python - Size: 19.8 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 364 - Forks: 60

nakib/elphbolt
A solver for the coupled and decoupled electron and phonon Boltzmann transport equations.
Language: Fortran - Size: 12.5 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 48 - Forks: 28

predsci/POT3D
POT3D: High Performance Potential Field Solver
Language: Fortran - Size: 23.9 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 46 - Forks: 26

ParRes/Kernels
This is a set of simple programs that can be used to explore the features of a parallel platform.
Language: C - Size: 23.6 MB - Last synced at: 25 days ago - Pushed at: 26 days ago - Stars: 427 - Forks: 109

UoB-HPC/BabelStream
STREAM, for lots of devices written in many programming models
Language: C++ - Size: 2.36 MB - Last synced at: 21 days ago - Pushed at: 9 months ago - Stars: 333 - Forks: 118

j3soon/nways_accelerated_programming Fork of openhackathons-org/nways_accelerated_programming
N-Ways to GPU Programming Bootcamp
Language: Jupyter Notebook - Size: 25.2 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 4 - Forks: 4

Jishnuraj07/EdgeAI-NVIDIA-NIM-based-RAG-chatbot
NVIDIA NIM-based RAG application deployed locally (LLM, embedding model, and reranking model), optimized to use a GPU cluster
Language: Python - Size: 49.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

OpenACC/openacc-best-practices-guide
The sources for the OpenACC Programming and Best Practices Guide.
Language: Python - Size: 2.91 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 36 - Forks: 14

eric2003/OneFLOW
Large-Scale Multiphysics Scientific Simulation Environment: OneFLOW CFD
Language: C++ - Size: 113 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 258 - Forks: 82

alpaka-group/alpaka
Abstraction Library for Parallel Kernel Acceleration
Language: C++ - Size: 17.9 MB - Last synced at: 28 days ago - Pushed at: about 2 months ago - Stars: 372 - Forks: 76

szaghi/FUNDAL
Fortran UNified Device Acceleration Library
Language: Fortran - Size: 2.96 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 14 - Forks: 2

openhackathons-org/gpubootcamp
This repository consists of GPU bootcamp material for HPC and AI
Language: Jupyter Notebook - Size: 261 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 532 - Forks: 256

jeng1220/openacc_fortran_examples
Simple OpenACC Fortran Examples
Language: Fortran - Size: 119 KB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 56 - Forks: 10

peterdschwartz/SPEL_OpenACC
Python tool designed for E3SM Land Model to create unit-tests and code-insertion of GPU compiler directives.
Language: Python - Size: 3.81 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 2

ROCm/gpufort
GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
Language: Fortran - Size: 7.48 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 162 - Forks: 16

jefflarkin/acc-events
Language: Fortran - Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 1

ShadyBoukhary/GPU-research-FFT-OpenACC-CUDA
Case studies constitute a modern interdisciplinary and valuable teaching practice which plays a critical and fundamental role in the development of new skills and the formation of new knowledge. This research studies the behavior and performance of two interdisciplinary and widely adopted scientific kernels, a Fast Fourier Transform and Matrix Multiplication. Both routines are implemented in the two current most popular many-core programming models, CUDA and OpenACC. A Fast Fourier Transform (FFT) samples a signal over a period of time and divides it into its frequency components, computing the Discrete Fourier Transform (DFT) of a sequence. Unlike the traditional approach to computing a DFT, FFT algorithms reduce the complexity of the problem from O(n²) to O(n log₂ n). Matrix multiplication is a cornerstone routine in Mathematics, Artificial Intelligence and Machine Learning. This research also shows that the nature of the problem plays a crucial role in determining what many-core model will provide the highest benefit in performance.
Language: Cuda - Size: 9.12 MB - Last synced at: 21 days ago - Pushed at: over 6 years ago - Stars: 13 - Forks: 3

OpenACC/openacc-training-materials
Training materials provided by OpenACC.org.
Language: C - Size: 16.8 MB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 88 - Forks: 28

usnistgov/hiperc
High Performance Computing Strategies for Boundary Value Problems
Language: HTML - Size: 63.4 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 10

jefflarkin/miniWeather Fork of mrnorman/miniWeather
A parallel programming training mini app simulating weather-like flows
Language: C++ - Size: 8.24 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Rodolfo-Gallegos/Brownian-Dynamics-Simulation-OpenACC-OpenMP
This is my thesis work for the Bachelor's degree in Physics.
Language: C++ - Size: 5.55 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

nlg550/ZPIC_OmpSs2
Parallel 2D EM-PIC kinetic plasma simulator based on the ZPIC suite
Language: C - Size: 952 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 1

KaiErikNiermann/hpc-uzh-notes
These are some notes for the High Performance Computing course taught at UZH
Size: 108 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

lele394/Fortran-N-Body-OpenMP-OpenACC
Fortran N-Body simulation parallelized on CPU using OpenMP and GPU using OpenACC
Language: Jupyter Notebook - Size: 384 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

predsci/multigpu-test-code
This code mimics the basic MPI+OpenACC tasks of PSI's MAS Solar MHD code, for use with testing multi-GPU multi-node clusters
Language: Fortran - Size: 36.1 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

jefflarkin/openacc-interoperability
Interoperability examples for OpenACC.
Language: C - Size: 43 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 49 - Forks: 23

mnicely/computeWorks_examples
Matrix multiplication example performed with OpenMP, OpenACC, BLAS, cuBLAS, and CUDA
Language: C++ - Size: 834 KB - Last synced at: 21 days ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 1

yasahi-hpc/P3-miniapps
Kinetic plasma simulation code parallelized with C++ parallel algorithm
Language: C++ - Size: 4.91 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 0

tan2/geoflac-old
Code for lithospheric scale geodynamics
Language: Fortran - Size: 1.2 MB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 7 - Forks: 11

intel/intel-application-migration-tool-for-openacc-to-openmp
OpenACC* to OpenMP* API assisting migration tool
Language: Python - Size: 257 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 32 - Forks: 6

pkestene/tsp
Traveling salesman problem solved with different programming models
Language: C++ - Size: 56.6 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1

khaki3/acc-saturator
Equality Saturation Framework for Directive-Based GPU Code
Language: C - Size: 214 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

MFlowCode/MicroFC
A micro MFC and CFD mini-app
Language: Fortran - Size: 35.3 MB - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

TommasoTarchi/Advanced_HPC-Final_assignments
Work in progress...
Language: C - Size: 1.15 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

jnbntz/gpu-edu-workshops
Code examples for CUDA and OpenACC
Language: Cuda - Size: 9.04 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 34 - Forks: 18

openhackathons-org/nways_accelerated_programming
N-Ways to GPU Programming Bootcamp
Language: Jupyter Notebook - Size: 21.7 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 54 - Forks: 27

OpenACCUserGroup/openacc-users-group
Language: C - Size: 6.69 MB - Last synced at: 10 months ago - Pushed at: almost 8 years ago - Stars: 82 - Forks: 24

claw-project/claw-compiler
CLAW Compiler for Performance Portability
Language: Java - Size: 8.48 MB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 41 - Forks: 15

OpenACC/openacc-interoperability-examples Fork of jefflarkin/openacc-interoperability
Interoperability examples for OpenACC.
Language: C - Size: 199 KB - Last synced at: 10 months ago - Pushed at: about 10 years ago - Stars: 6 - Forks: 6

openhackathons-org/HPC_Profiler
Profiling with NVIDIA Nsight Tools Bootcamp
Language: C++ - Size: 23.5 MB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1

Tamerkobba/Parallel_Matrix_Mul
Parallelizing Matrix multiplication using CUDA C and OpenACC
Language: Jupyter Notebook - Size: 51.8 KB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Hopobcn/FWI
RTM
Language: C - Size: 754 KB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 36 - Forks: 15

dc-fukuoka/gpu_ring
A test of GPUDirect with CUDA/OpenACC.
Language: C - Size: 7.81 KB - Last synced at: 10 months ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 0

eafit-apolo/2DPartInt
Soil particles contact simulation
Language: C - Size: 1.17 MB - Last synced at: 10 months ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 0

phbastosa/seismic_tomography_3D
Master degree project using object-oriented programming
Language: C++ - Size: 18.8 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

Programacao-Paralela-e-Distribuida/programacao-paralela-e-distribuida.github.io
Web page for the book Programação Paralela e Distribuída (Parallel and Distributed Programming)
Language: HTML - Size: 124 KB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

danxuZhang/HPC
HPC and Parallel Computing Learning Notes and Code
Language: Cuda - Size: 41 KB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dc-fukuoka/jacobi
jacobi - a benchmark that solves the 2D Laplace equation with the Jacobi iterative method. A GPU or Xeon Phi can be used.
Language: Fortran - Size: 24.4 KB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 7 - Forks: 4

nachovizzo/saxpy_openacc_cpp
My way of thinking about OpenACC, C++, and Parallel computing in general
Language: C++ - Size: 16.6 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

RodrigoOt/nvptx-tools Fork of SourceryTools/nvptx-tools
nvptx-tools: a collection of tools for use with nvptx-none GCC toolchains.
Language: C - Size: 874 KB - Last synced at: 10 months ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

PawseySC/sc20-gpu-offloading
Materials for "Differences between OpenACC and OpenMP offloading models" tutorial.
Language: C - Size: 650 KB - Last synced at: 10 months ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 5

stfc/PSycloneBench
Various benchmarks used to inform PSyclone optimisations
Language: Fortran - Size: 18.7 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 5

marcjoos-cea/dumses-hybrid
CFD/MHD code for astrophysics
Language: Fortran - Size: 24.2 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1

yasahi-hpc/vlp4d_mpi
MPI+Kokkos/OpenACC/OpenMP4.5/stdpar implementation of vlp4d
Language: C++ - Size: 374 KB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

gabrielchristo/prog-dist
Parallel and Distributed Programming
Language: C - Size: 7.93 MB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

dc-fukuoka/gpumm
gpumm - matrix-matrix multiplication using CUDA, cuBLAS, cuBLASXt, and OpenACC.
Language: Cuda - Size: 7.6 MB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

kuldeep-tolia/OpenACC_FORTRAN_Codes
OpenACC GPU parallelization for various numerical methods and miscellaneous problems using FORTRAN
Language: Fortran - Size: 136 KB - Last synced at: 10 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

sotanmochi/HPC-Samples
Code Samples for CUDA, OpenACC and OpenMP
Language: C++ - Size: 16.6 KB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

olcf-tutorials/openmp_offloading
OpenMP programming tips for GPU offloading
Language: C++ - Size: 46.9 KB - Last synced at: 10 months ago - Pushed at: over 5 years ago - Stars: 5 - Forks: 2

kuldeep-tolia/OpenACC_C_Codes
OpenACC GPU parallelization for various numerical methods and miscellaneous problems using C
Language: C - Size: 42 KB - Last synced at: 10 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ENCCS/openacc
OpenACC
Language: C - Size: 3.24 MB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 2

MaxStrange/pyACC
OpenACC for Python
Language: Python - Size: 184 KB - Last synced at: 19 days ago - Pushed at: almost 6 years ago - Stars: 20 - Forks: 1

WalterNadalin/ParallelJacobi
Numerical solution of the Laplace equation implementing the Jacobi method
Language: C - Size: 44.1 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

goatandsheep/mandelboxes 📦
mandelboxes for a course, SFWR ENG 4F03
Language: C++ - Size: 3.86 MB - Last synced at: 10 months ago - Pushed at: almost 9 years ago - Stars: 1 - Forks: 0

muriloboratto/benchmark-mode-optimization-GPU
Benchmark Matrix Multiply on GPU Environment.
Language: Shell - Size: 49.8 KB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

piyueh/PETSC-OpenACC Fork of olcf/PETSC-OpenACC
A mirror to https://github.com/olcf/PETSC-OpenACC -- An example of accelerating PETSc with OpenACC
Language: Shell - Size: 286 KB - Last synced at: 10 months ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 0

larsgeb/fd-wave-modelling-gpu
Forward 2D elastic wave equation modelling using either OpenMP or OpenACC. Compiles with PGI compiler.
Language: C++ - Size: 28.3 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

hahnjo/CGxx
Object-Oriented Implementation of the Conjugate Gradients Method
Language: C - Size: 204 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

ajarmusch/Testsuite
OpenACC Validation and Verification Testsuite repository
Language: JavaScript - Size: 1.53 MB - Last synced at: 10 months ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

estradjm/Parallel-Gaussian-Blurring
LANL Parallel Computing Summer Research Institute 2017 GPU Exercise - C implementation of Gaussian Blurring of .ppm format image
Language: C - Size: 483 KB - Last synced at: 10 months ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 1

xstupi00/N-Body-OpenACC
Parallel Computations on GPU - Project - N-Body-OpenACC
Language: Python - Size: 76 MB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

diamantopoulos/diamantopoulos.github.io
Dionysios Diamantopoulos Web Edition
Language: JavaScript - Size: 10.5 MB - Last synced at: 10 months ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

capellil/IHPCSS_Programming_challenge_2019
The repository containing everything you need to compete in the IHPCSS 2019 programming challenge.
Language: Fortran - Size: 1.14 MB - Last synced at: about 2 months ago - Pushed at: almost 6 years ago - Stars: 9 - Forks: 11

paveon/PCG-NBody-OpenACC
[VUT FIT] OpenACC N-Body simulation project for the PCG course
Language: C++ - Size: 2.35 MB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

gilbertobastos/prj_perceptron_multicamadas_OpenACC_NVIDIA
Simple parallel implementation of the multilayer perceptron using OpenACC, targeting NVIDIA GPUs
Language: C - Size: 36.1 KB - Last synced at: 10 months ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

gilbertobastos/prj_perceptron_multicamadas_OpenACC_MULTICORE
Simple parallel implementation of the multilayer perceptron using OpenACC, targeting CPUs
Language: C - Size: 34.2 KB - Last synced at: 10 months ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

pyccel/lampy
Extension of Pyccel for functional programming
Language: Python - Size: 1.52 MB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

dc-fukuoka/can
can - a simple dense matrix-matrix multiplication benchmark with MPI/OpenMP/OpenACC. MPI version is based on Cannon's algorithm.
Language: Fortran - Size: 27.3 KB - Last synced at: 10 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 2

PawseySC/2D-Laplace-Offload
The main purpose of this tutorial is to present similarities and differences between device offload mechanisms available in OpenACC and OpenMP standards.
Language: C - Size: 25.4 KB - Last synced at: 10 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 2

milladgit/rodinia Fork of qbunia/rodinia
Rodinia 2.1 benchmark modified to run with OpenACC 2.7 and PGI 18.4
Language: C - Size: 344 MB - Last synced at: 10 months ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

dc-fukuoka/mandelbrot
Mandelbrot set by MPI/OpenMP/OpenACC.
Language: Fortran - Size: 73.2 KB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

truongdangqe/hello
Some Fortran codes to practice programming in Fortran.
Language: Fortran - Size: 48.8 KB - Last synced at: 2 months ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

DonAurelio/coder
The base for a web-based parallel programming environment built on a microservice approach
Language: JavaScript - Size: 7.41 MB - Last synced at: 29 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

Raienryu97/parallelizationstudy
A performance study of various parallelisation tools on a few benchmarks
Language: C++ - Size: 51 MB - Last synced at: 10 months ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

spino327/NAS_SHOC_OpenACC_2.5
Code repository for paper "Exploring translation of OpenMP to OpenACC 2.5: Lessons Learned"
Language: Python - Size: 39.1 KB - Last synced at: 10 months ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 2

RodrigoOt/OpenaccBuildScript
Just another build script for GCC and nvptx
Language: Shell - Size: 39.1 KB - Last synced at: 10 months ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

AndiH/jarvice-gtc19-power-image
Docker Image for JARVICE
Language: Dockerfile - Size: 807 KB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

Trick-17/backends
Interchangeable backends in C++, OpenMP, CUDA, OpenCL, OpenACC
Language: C++ - Size: 80.1 KB - Last synced at: 2 months ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 0
