Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: avx-512

nidud/asmc

MASM-compatible assembler

Language: Assembly - Size: 67.9 MB - Last synced: about 4 hours ago - Pushed: 1 day ago - Stars: 12 - Forks: 4

simdutf/is_utf8

Fast C++ function "is_utf8": checks if the input is valid UTF-8. Made of a single source file. Optimized for ARM NEON, x64 SSE, AVX2 and AVX-512.

Language: C++ - Size: 134 KB - Last synced: about 6 hours ago - Pushed: 1 day ago - Stars: 44 - Forks: 5

simdutf/simdutf

Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.

Language: C++ - Size: 5.22 MB - Last synced: about 21 hours ago - Pushed: 4 days ago - Stars: 965 - Forks: 60

RoaringBitmap/CRoaring

Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks

Language: C - Size: 47 MB - Last synced: about 21 hours ago - Pushed: 7 days ago - Stars: 1,460 - Forks: 259

aff3ct/MIPP

MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX, AVX-512 and SVE (length specific).

Language: C++ - Size: 2.01 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 463 - Forks: 86

quasilyte/avx512test

Utility that was used to generate initial Go AVX-512 encoder test suite.

Language: Assembly - Size: 1.46 MB - Last synced: 11 days ago - Pushed: about 5 years ago - Stars: 9 - Forks: 0

SESAME-Synchrotron/orbit-feedback

Design of the Fast-Orbit Feedback correction for SESAME's accelerator

Language: C - Size: 62.5 KB - Last synced: 21 days ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

google/highway

Performance-portable, length-agnostic SIMD with runtime dispatch

Language: C++ - Size: 22.5 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 3,609 - Forks: 291

awesome-simd/awesome-simd

A curated list of awesome SIMD frameworks, libraries and software

Size: 38.1 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 105 - Forks: 14

intel/hexl

Intel® Homomorphic Encryption Acceleration Library accelerates modular arithmetic operations used in homomorphic encryption

Language: C++ - Size: 5.44 MB - Last synced: 27 days ago - Pushed: 8 months ago - Stars: 207 - Forks: 47

jonicho/simd-radix-sort

A generic and efficient SIMD implementation of MSB Radix Sort with separate key and payload datastreams, supporting arbitrary key and payload data types. Written in C++ and accompanied by a bachelor's thesis.

Language: C++ - Size: 992 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 4 - Forks: 0

antoinecarme/xeon-phi-data

Data for Intel Xeon-Phi server used in PyAF tests

Language: Python - Size: 43.8 MB - Last synced: about 13 hours ago - Pushed: over 3 years ago - Stars: 2 - Forks: 1

hubery-tao/fast_math

High-speed math functions based on AVX-512 intrinsics

Language: C++ - Size: 71.3 KB - Last synced: 3 months ago - Pushed: almost 2 years ago - Stars: 4 - Forks: 1

romz-pl/matrix-matrix-multiply

Algorithms for matrix matrix multiplication, dgemm, AVX-256, AVX-512

Language: C++ - Size: 55.7 KB - Last synced: 25 days ago - Pushed: almost 3 years ago - Stars: 10 - Forks: 2

swojtasiak/fcml-lib

A general purpose machine code manipulation library for x86-32 (IA-32) and x86-64 (AMD64) architectures (Assembler, Disassembler, Library).

Language: C - Size: 22.9 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 81 - Forks: 24

bgin/Radar_ElectroOptical_Simulation

(REOS) Radar and ElectroOptical Simulation Framework written in Fortran.

Language: Fortran - Size: 39.2 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 45 - Forks: 14

intel/document-level-sentiment-analysis

Document Level Sentiment Analysis is an end-to-end deep learning workflow that uses the Hugging Face transformers API to perform a document-level "classification" task: analyzing the sentiment of an input document containing English sentences or paragraphs.

Language: Python - Size: 239 KB - Last synced: 27 days ago - Pushed: 5 months ago - Stars: 6 - Forks: 3

twest820/AVX-512

AVX-512 documentation beyond what Intel provides

Size: 1.27 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 28 - Forks: 3

pcineverdies/FFT-AVX-512 📦

Fast Fourier Transform implementation through the x86 AVX-512 SIMD extension

Language: C++ - Size: 7.81 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 1 - Forks: 1

ammarfaizi2/memcpy_benchmark

Benchmark to show which is the fastest memcpy.

Language: Assembly - Size: 54.7 KB - Last synced: 10 days ago - Pushed: over 3 years ago - Stars: 10 - Forks: 2

rainerzufalldererste/hypersonic-rle-kit

The fastest Run-Length-Encoding on the Planet (for x64)

Language: C - Size: 1.69 MB - Last synced: 10 months ago - Pushed: 11 months ago - Stars: 18 - Forks: 4

falseywinchnet/tomatofft

The Tomato Patch FFT is the fastest FFT in the world, but it is by no means efficient.

Size: 8.53 MB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0

nsomatilda/Matilda

Matilda is a library to repeatedly multiply a constant matrix with a variable vector

Language: C++ - Size: 38.1 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

tugrul512bit/VectorizedKernel

Running GPGPU-like kernels on CPU with auto-vectorization for SSE/AVX/AVX512 SIMD Architectures

Language: C++ - Size: 241 KB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 5 - Forks: 0

gfurtadoalmeida/study-assembly-x64

Projects and annotations used to learn x64 assembly

Language: C++ - Size: 194 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 2 - Forks: 0

lemonjesus/avx512-polyline

An implementation of Google's Encoded Polyline algorithm in AVX512 because why not. Perhaps the fastest and least portable polyline encoder out there?

Language: C - Size: 33.2 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

DmitryYurov/BitsCount

Count set bits in an integer

Language: C++ - Size: 30.3 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

venovako/VecKog

The vectorized (AVX-512) batched singular value decomposition algorithm for matrices of order two.

Language: C - Size: 107 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0

MamarezaAlipour/AVX-Hole

AVX-Hole C++ SIMD Library

Language: C++ - Size: 133 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 15 - Forks: 0

harshapathuri86/parallel-codes

Language: C++ - Size: 2.47 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

zingaburga/SIMDflate

Experimental speed-oriented DEFLATE implementation, based on AVX-512

Language: C++ - Size: 130 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

kvr000/zbynek-cxx-exp

Zbynek's various C and C++ experiments

Language: C++ - Size: 40 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

stefan-zobel/cramer

Some loose performance experiments with Agner Fog's VCL

Language: C++ - Size: 770 KB - Last synced: 26 days ago - Pushed: 27 days ago - Stars: 0 - Forks: 0

toshioendo/hoalgos

Implementation of Hierarchy Oblivious Algorithms

Language: C++ - Size: 731 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 3 - Forks: 0

PhuNH/hpc-lab

Scientific Computing - High-Performance Computing Practical Course in WS18-19 at TUM

Language: C++ - Size: 1.08 MB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0