Topic: "simd-instructions"
google/highway
Performance-portable, length-agnostic SIMD with runtime dispatch
Language: C++ - Size: 28.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4,560 - Forks: 343

xtensor-stack/xsimd
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
Language: C++ - Size: 3.8 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 2,349 - Forks: 264

lakshayg/tensorflow-build-archived 📦
TensorFlow binaries supporting AVX, FMA, SSE
Language: Shell - Size: 212 MB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 1,921 - Forks: 231

VcDevel/Vc
SIMD Vector Classes for C++
Language: C++ - Size: 11.2 MB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 1,479 - Forks: 151

ashvardanian/SimSIMD
Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, & SVE2 📐
Language: C - Size: 1.83 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,344 - Forks: 79

gnuradio/volk
The Vector Optimized Library of Kernels
Language: C++ - Size: 3.69 MB - Last synced at: 11 days ago - Pushed at: about 2 months ago - Stars: 579 - Forks: 211

fast-pack/simdcomp
A simple C library for compressing lists of integers using binary packing
Language: C - Size: 593 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 498 - Forks: 53

fast-pack/SIMDCompressionAndIntersection
A C++ library to compress and intersect sorted lists of integers using SIMD instructions
Language: C++ - Size: 1.33 MB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 433 - Forks: 59

agenium-scale/nsimd
Agenium Scale vectorization library for CPUs and GPUs
Language: C - Size: 6.92 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 333 - Forks: 30

DragonSpit/HPCsharp
High performance algorithms in C#: SIMD/SSE, multi-core and faster
Language: C# - Size: 1.15 MB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 271 - Forks: 32

lakshayg/tensorflow-build
TensorFlow binaries supporting AVX, FMA, SSE
Size: 17.6 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 236 - Forks: 44

fast-pack/MaskedVByte
Fast decoder for VByte-compressed integers
Language: C - Size: 72.3 KB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 122 - Forks: 19

lemire/SIMDxorshift
Fast random number generators: Vectorized (SIMD) version of xorshift128+
Language: C - Size: 34.2 KB - Last synced at: 23 days ago - Pushed at: almost 5 years ago - Stars: 113 - Forks: 15

fast-pack/dictionary
High-performance dictionary coding
Language: C++ - Size: 169 KB - Last synced at: 5 days ago - Pushed at: about 8 years ago - Stars: 104 - Forks: 10

cloudflare/sliceslice-rs
A fast implementation of single-pattern substring search using SIMD acceleration.
Language: Rust - Size: 350 KB - Last synced at: 16 days ago - Pushed at: 7 months ago - Stars: 96 - Forks: 18

edanor/umesimd
UME::SIMD A library for explicit simd vectorization.
Language: C++ - Size: 5.89 MB - Last synced at: 10 days ago - Pushed at: over 7 years ago - Stars: 91 - Forks: 16

lsp-plugins/lsp-dsp-lib
DSP library for signal processing
Language: C++ - Size: 3.61 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 75 - Forks: 18

RealTimeChris/Jsonifier
A few classes for extremely fast json parsing/serializing in modern C++. Possibly the fastest json parser in C++. Possibly the fastest json serializer in C++.
Language: C++ - Size: 256 MB - Last synced at: 25 days ago - Pushed at: 28 days ago - Stars: 75 - Forks: 6

bgin/Radar-ElectroOptical-Simulation
(REOS) Radar and Electro-Optical Simulation Framework written in C++.
Language: C++ - Size: 31.1 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 58 - Forks: 20

lemire/FastDifferentialCoding
Fast differential coding functions (using SIMD instructions)
Language: C - Size: 10.7 KB - Last synced at: 21 days ago - Pushed at: over 7 years ago - Stars: 52 - Forks: 9

mklarqvist/positional-popcount
Fast C functions for computing the positional popcount (pospopcnt).
Language: C - Size: 545 KB - Last synced at: 11 months ago - Pushed at: over 5 years ago - Stars: 51 - Forks: 5

badamczewski/SimpleIntrinsics
This project aims to rename all C# intrinsic names to their more compact C/C++ counterparts that the industry uses.
Language: C# - Size: 57.6 KB - Last synced at: 6 days ago - Pushed at: over 4 years ago - Stars: 50 - Forks: 2

mklarqvist/libalgebra
Fast C header-only library for popcnt, pospopcnt, and set algebraic operations
Language: C - Size: 92.8 KB - Last synced at: 11 months ago - Pushed at: over 5 years ago - Stars: 42 - Forks: 8

frtru/GemParticles
Particle engine built on OpenGL used to produce various visual effects.
Language: C++ - Size: 187 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 32 - Forks: 4

PatwinchIR/ultra-sort
DSL for SIMD Sorting on AVX2 & AVX512
Language: C++ - Size: 6.43 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 30 - Forks: 2

jermp/mutable_rank_select
A SIMD-based C++ library providing rank/select queries over mutable bitmaps.
Language: C++ - Size: 350 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 29 - Forks: 4

Technologicat/cython-sse-example
Simple example for embedding SSE2 assembly in Cython projects
Language: Python - Size: 5.86 KB - Last synced at: 16 days ago - Pushed at: almost 8 years ago - Stars: 22 - Forks: 5

jermp/psds
Efficient Prefix-Sum data structures in C++.
Language: C++ - Size: 303 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 1

SungJJinKang/std_find_simd
std::find simd version
Language: C++ - Size: 49.8 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 20 - Forks: 1

nulidangxueshen/ALBUS
A Method for efficiently processing SpMV using SIMD and load balancing
Language: C++ - Size: 50.8 KB - Last synced at: 4 days ago - Pushed at: about 3 years ago - Stars: 16 - Forks: 2

nulidangxueshen/CSR2
A New Format for SIMD-accelerated SpMV
Language: C++ - Size: 145 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 16 - Forks: 2

lemire/vectorclass
Random number generator for large applications using vector instructions
Language: C++ - Size: 602 KB - Last synced at: 16 days ago - Pushed at: about 9 years ago - Stars: 15 - Forks: 3

andrelrt/litesimd
Litesimd is a no overhead, header only, C++ library for SIMD processing, specialized on SIMD comparison and data shuffle.
Language: C++ - Size: 335 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 14 - Forks: 0

unevens/avec
A little library for using SIMD instructions for x86 and ARM, wrapping Agner Fog's vectorclass for x86 and filling some of its functionality for ARM, and providing containers for aligned memory with views and interleaving/deinterleaving.
Language: C++ - Size: 700 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 11 - Forks: 1

ashvardanian/vector-dossier
Vector Dossier is a CLI tool that statically analyzes vectorization depth of programs and libraries
Language: Jupyter Notebook - Size: 1.21 MB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 8 - Forks: 0

ms0g/vml
SIMD-accelerated Vector math lib
Language: Assembly - Size: 29.3 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 1

aminya/minijson
Minify JSON files fast! Supports Comments. Uses D, C, and AVX2 and SSE4_1 SIMD.
Language: D - Size: 693 KB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 7 - Forks: 1

unevens/audio-dsp
A collection of template classes for audio dsp using SIMD instructions.
Language: C++ - Size: 151 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 1

minio/go-cv 📦
Golang wrapper for https://github.com/ermig1979/Simd
Language: Go - Size: 391 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 4

sahmad98/vstring
Vectorized String Helper Functions
Language: C++ - Size: 61.5 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 0

zamronypj/simd
Simple Pascal demo project showing how to use Single Instruction Multiple Data (SIMD) with Intel SSE instructions
Language: Pascal - Size: 110 KB - Last synced at: 12 months ago - Pushed at: about 8 years ago - Stars: 6 - Forks: 1

Nemandza82/Symd
C++ header only template library designed to make it easier to write high-performance SIMD (SSE, AVX, Neon) and multi-threaded code.
Language: C++ - Size: 1.46 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 5 - Forks: 3

BlueLort/Lort-Renderer
C++ Optimized Software Renderer using SDL2.0
Language: C - Size: 10 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 1

FakhriFki77/DUGL
Dust Ultimate Game Library: A full featured x86 C/Assembly Game library using software renderer
Language: Assembly - Size: 6.24 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 1

m3y54m/sobel-simd-opencv
Using SIMD instructions in image processing using OpenCV
Language: C++ - Size: 114 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

sshekh/fast-lda
Course project in 'How to write Fast Numerical Code' on optimized implementation of latent dirichlet allocation
Language: C - Size: 63.3 MB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 3 - Forks: 0

andrejajanja/taylor_compiler
LLVM based JIT for approximating numerical integrals on discrete number of samples
Language: Rust - Size: 259 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

z1skgr/SIMD-instruction-MPI-PTHREADS-parallism
Parallelism standards for accelerating performance on calculations for detection of positive DNA selection
Language: C - Size: 866 KB - Last synced at: 23 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

12Acorns/Portfolio-SIMDExtensions
A source-generated library that makes SIMD instructions easier to use while maintaining the performance expected for each platform.
Language: C# - Size: 44.9 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 2 - Forks: 0

RandomHashTags/swift-intrinsics
Unlock SIMD intrinsics for Swift.
Language: Swift - Size: 24.4 KB - Last synced at: about 21 hours ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

bsgbryan/roc
A thoroughly-modern real-time simulation engine
Language: TypeScript - Size: 326 KB - Last synced at: 21 days ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

ChrisWhealy/wasm_simd
Test framework for WASM SIMD instructions
Language: JavaScript - Size: 1.28 MB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

rsusik/cf2
Approximate pattern matching with Counting Filter on q-grams using SSE instructions (CF2)
Language: C++ - Size: 11.7 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

shadergz/rust-sse-studies
A collection of code for understanding the use cases of the SIMD extensions in the Rust programming language
Language: Rust - Size: 4.88 KB - Last synced at: about 14 hours ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

shadergz/simd-studies
A study collection of SSE technology use cases
Language: C - Size: 7.81 KB - Last synced at: about 14 hours ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

yuan-energy/small_tensor
A high performance small tensor library for inelastic finite element simulation
Language: C++ - Size: 7.51 MB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1

andrelrt/SimdTTM
Simd To The Masses
Size: 80.1 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

ell-hol/simd-parallelized-haar-transform
8x speedup of 1D Haar-Transform using intel SIMD intrinsics
Language: C - Size: 116 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

PierreGe/microprocessor-architecture
Includes the 3 labs: Risc16 (ASM), dspic33 (C), and a SIMD application
Language: C++ - Size: 3.54 MB - Last synced at: 12 months ago - Pushed at: almost 9 years ago - Stars: 1 - Forks: 0

korbolkoinc/uuids
High performance C++ uuid generator
Language: C++ - Size: 71.3 KB - Last synced at: 13 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 1

hunyadi/simdparse
High-speed parser with vector instructions
Language: C++ - Size: 116 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

mtumilowicz/java17-mesi-false-sharing-processor-optimisations-workshop
Introduction to cache coherence: false sharing, MESI protocol and vectorization
Language: Java - Size: 428 KB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

WolfHex/Mystic
A C++17 header-only library that provides compile-time string encryption and decryption using SIMD instructions
Language: C++ - Size: 135 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

alighanbari2002/Parallel-Programming-Course-Projects
Parallel Programming course projects demonstrating various parallelism techniques with SIMD SSE3, OMP, and POSIX threads, including Intel Parallel Studio for analysis and parallelization.
Language: C++ - Size: 7.32 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Maged152/Intel-Intrinsics-CPP-Wrapper
Intel Intrinsics C++ Wrapper
Language: C++ - Size: 199 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

t0re199/ARCHP_PROJECT
C & Assembly optimized version of the Stochastic Gradient Descent x SoftSVM x Polynomial Kernel Method algorithm
Language: C - Size: 32.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Daksh2060/vectorized-arrays-demo
This C++ program is a demonstration of array vectorization techniques utilized in the AVX2 SIMD Assembly library, being run with C++ arrays through the vector class library created by Agner Fog. An ASM version of the same process has been implemented for comparison.
Language: C++ - Size: 654 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Geolm/simd
Neon/AVX simd library, vector size agnostic
Language: C - Size: 269 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

jacek13/findPrimes
A program with a graphical interface designed to search for prime numbers. The application uses vector instructions (SIMD) from the x64 assembler level.
Language: C++ - Size: 5.04 MB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

d3phys/mandelbrot
Mandelbrot set SIMD optimization
Language: C++ - Size: 107 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

noorus/nmath
Tiny common performance maths library.
Language: C++ - Size: 140 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

chi-0828/SIMD_vips
rewrite CON_INT function to build SIMD-version convolution
Language: C - Size: 19.2 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

michal3pol/JAFiltrObrazu
Academic project - embossing filter. Uses assembler with SIMD instructions in a desktop app (C++). Comments are mainly in Polish - don't get scared
Language: C++ - Size: 62.5 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ndoll1998/FairPT
A fairly optimized cpu-only path tracer
Language: C++ - Size: 1.95 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

CGHoussem/TD_simd
Homework assignment, tutorial on SIMD
Language: Assembly - Size: 285 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

dendisuhubdy/simple_vector_classes Fork of VcDevel/Vc
SIMD Vector Classes for C++
Language: C++ - Size: 9.15 MB - Last synced at: 11 months ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

tflahaul/libsimd
Tiny single header library of standard functions using SIMD optimization
Language: C - Size: 18.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 1

kenwi/fibs
Different solutions/benchmarks for generating the Fibonacci sequence.
Language: C# - Size: 17.6 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

iworeushankaonce/ips_arch
Language: C - Size: 647 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

Jose-agg/SIMD_Multihilo
Research project on running applications with SIMD instructions and multithreading
Language: C++ - Size: 1.76 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

chatrasen/SP_parallel_lib
A parallel library of the SP algorithm for batch verification of ECDSA signatures
Language: C++ - Size: 24.4 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

JoaquinVillagra/Laboratorio_1_HPC
Lab 1 for the High-Performance Computing course. Universidad de Santiago de Chile, second semester 2017.
Language: C - Size: 1.28 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

sgryjp/simd_test
SIMD instruction benchmark
Language: C - Size: 19.5 KB - Last synced at: 6 days ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

JAICHANGPARK/SIMD_C_FLOAT32BIT
Language: C - Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: almost 9 years ago - Stars: 0 - Forks: 0
