Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: avx2
bgin/Radar-ElectroOptical-Simulation
(REOS) Radar and Electro-Optical Simulation Framework written in C++.
Language: C++ - Size: 28.3 MB - Last synced: about 8 hours ago - Pushed: about 9 hours ago - Stars: 51 - Forks: 16
simdjson/simdjson
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
Language: C++ - Size: 56 MB - Last synced: about 16 hours ago - Pushed: about 22 hours ago - Stars: 18,486 - Forks: 968
kbalt/ezk-image
Convert between common image/video formats
Language: Rust - Size: 283 KB - Last synced: about 16 hours ago - Pushed: about 18 hours ago - Stars: 1 - Forks: 0
simdutf/is_utf8
Fast C++ function "is_utf8": checks if the input is valid UTF-8. Made of a single source file. Optimized for ARM NEON, x64 SSE, AVX2 and AVX-512.
Language: C++ - Size: 134 KB - Last synced: about 2 hours ago - Pushed: about 23 hours ago - Stars: 44 - Forks: 5
simdutf/simdutf
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.
Language: C++ - Size: 5.22 MB - Last synced: about 16 hours ago - Pushed: 3 days ago - Stars: 965 - Forks: 60
RoaringBitmap/CRoaring
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks
Language: C - Size: 47 MB - Last synced: about 16 hours ago - Pushed: 6 days ago - Stars: 1,460 - Forks: 259
path-racer/pathlib
Lightweight AVX-optimized containers and routines for the Path game engine.
Language: C - Size: 102 MB - Last synced: about 19 hours ago - Pushed: 1 day ago - Stars: 1 - Forks: 0
microsoft/DirectXMath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
Language: C++ - Size: 2.18 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 1,485 - Forks: 227
axze-az/cftal
a template based C++ short vector library with vectorized faithfully rounded elementary functions
Language: C++ - Size: 61.5 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 0 - Forks: 0
ashvardanian/SimSIMD
Up to 200x Faster Inner Products and Vector Similarity for Python, JavaScript, Rust, and C, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE
Language: C - Size: 685 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 727 - Forks: 35
EricGrange/DWScript
Delphi Web Script general purpose scripting engine
Language: Pascal - Size: 191 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 156 - Forks: 47
OpenNMT/CTranslate2
Fast inference engine for Transformer models
Language: C++ - Size: 13.5 MB - Last synced: 4 days ago - Pushed: 11 days ago - Stars: 2,828 - Forks: 249
intel/x86-simd-sort
C++ template library for high performance SIMD based sorting algorithms
Language: C++ - Size: 1000 KB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 798 - Forks: 47
lssfau/ExaStencils
Mirror of the official ExaStencils Project repository. Please open pull requests on GitLab: https://i10git.cs.fau.de/exastencils/exastencils
Language: Scala - Size: 299 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 3 - Forks: 1
rsonquery/rsonpath
Blazing fast JSONPath query engine written in Rust.
Language: Rust - Size: 61.5 MB - Last synced: 6 days ago - Pushed: 7 days ago - Stars: 42 - Forks: 5
RRZE-HPC/OSACA
Open Source Architecture Code Analyzer
Language: Jupyter Notebook - Size: 8.19 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 274 - Forks: 15
libxsmm/libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
Language: C - Size: 297 MB - Last synced: 25 days ago - Pushed: 25 days ago - Stars: 795 - Forks: 181
lemire/fastbase64
SIMD-accelerated base64 codecs
Language: C - Size: 4 MB - Last synced: about 16 hours ago - Pushed: about 1 month ago - Stars: 420 - Forks: 32
parlance-zz/Pulse
Pre/post-processing utility for generating quantized log-normally-distributed spike-intervals from raw audio, and back again (C++, AVX2)
Language: C++ - Size: 6.14 MB - Last synced: 9 days ago - Pushed: over 1 year ago - Stars: 7 - Forks: 1
minio/highwayhash
Native Go version of HighwayHash with optimized assembly implementations on Intel and ARM. Able to process over 10 GB/sec on a single core on Intel CPUs - https://en.wikipedia.org/wiki/HighwayHash
Language: Go - Size: 98.6 KB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 857 - Forks: 67
t-hishinuma/DD-AVX_v3
Library of High Precision Sparse Matrix Operations Accelerated by SIMD
Language: C++ - Size: 1.17 MB - Last synced: 10 days ago - Pushed: almost 3 years ago - Stars: 40 - Forks: 3
HJLebbink/asm-dude
Visual Studio extension for assembly syntax highlighting and code completion in assembly files and the disassembly window
Language: C# - Size: 80.2 MB - Last synced: 9 days ago - Pushed: about 1 month ago - Stars: 4,104 - Forks: 94
pypy/fast-utf8-methods
Fast UTF-8 utility methods
Language: HTML - Size: 1.26 MB - Last synced: 10 days ago - Pushed: almost 7 years ago - Stars: 2 - Forks: 0
VcDevel/Vc
SIMD Vector Classes for C++
Language: C++ - Size: 11 MB - Last synced: 9 days ago - Pushed: 3 months ago - Stars: 1,420 - Forks: 150
mcountryman/simd-adler32
A SIMD-accelerated Adler-32 hash algorithm implementation.
Language: Rust - Size: 202 KB - Last synced: 11 days ago - Pushed: about 1 month ago - Stars: 31 - Forks: 5
andyD123/DR3
DR3 enables users to write vectorised code using generic lambdas and filters. Switch instruction set just by changing enclosing namespace
Language: C++ - Size: 19.2 MB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 25 - Forks: 5
maddsua/simd
AVX2 and SSE2 use cases and benchmarks
Language: C++ - Size: 131 KB - Last synced: 13 days ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0
ClaudiuHKS/Se-Capabilities
Se Capabilities
Language: C++ - Size: 9.77 KB - Last synced: 13 days ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0
albertomontesg/fast-tsne
How to Write Fast Numerical Code Project @ ETHZ 2017
Language: Jupyter Notebook - Size: 140 MB - Last synced: 14 days ago - Pushed: over 6 years ago - Stars: 4 - Forks: 0
Auburn/FastSIMD
Low level generic SIMD wrapper for x86, ARM, WASM with dynamic dispatch
Language: C++ - Size: 228 KB - Last synced: 10 days ago - Pushed: 14 days ago - Stars: 23 - Forks: 2
matthewkolbe/LitMath
A collection of SIMD (AVX2 & AVX512) accelerated mathematical functions for .NET
Language: C# - Size: 175 KB - Last synced: 14 days ago - Pushed: about 1 month ago - Stars: 44 - Forks: 2
powturbo/Turbo-Run-Length-Encoding
TurboRLE-Fastest Run Length Encoding
Language: C - Size: 15.2 MB - Last synced: 9 days ago - Pushed: about 1 year ago - Stars: 277 - Forks: 27
manodeep/Corrfunc
⚡️⚡️⚡️ Blazing fast correlation functions on the CPU.
Language: C - Size: 150 MB - Last synced: 7 days ago - Pushed: about 1 month ago - Stars: 162 - Forks: 49
Auburn/FastNoiseSIMD
C++ SIMD Noise Library
Language: C++ - Size: 206 KB - Last synced: 10 days ago - Pushed: about 3 years ago - Stars: 604 - Forks: 89
simd-everywhere/simde
Implementations of SIMD instruction sets for systems which don't natively support them.
Language: C - Size: 35 MB - Last synced: 18 days ago - Pushed: 20 days ago - Stars: 2,168 - Forks: 225
IAKOBVS/jstring
C String Library
Language: C - Size: 4.88 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 1 - Forks: 0
VectorChief/QuadRay-engine
Realtime raytracer using SIMD on ARM, MIPS, PPC and x86
Language: C - Size: 14.6 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 25 - Forks: 4
VectorChief/UniSIMD-assembler
SIMD macro assembler unified for ARM, MIPS, PPC and x86
Language: C - Size: 9.11 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 85 - Forks: 7
WojciechMula/toys
Storage for my snippets, toy programs, etc.
Language: C++ - Size: 2.34 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 311 - Forks: 37
maj113/Counter
High performance character occurrence counter
Language: C++ - Size: 21.5 KB - Last synced: 18 days ago - Pushed: 18 days ago - Stars: 1 - Forks: 0
JohT/convolution-benchmarks
Benchmark convolution implementations in C++ with Catch2 visualized with Vega-Lite
Language: C++ - Size: 5.94 MB - Last synced: 25 days ago - Pushed: 26 days ago - Stars: 1 - Forks: 1
kimwalisch/libpopcnt
Fast C/C++ bit population count library
Language: C - Size: 170 KB - Last synced: 10 days ago - Pushed: about 2 months ago - Stars: 298 - Forks: 36
RobRich999/Chromium_Clang
Chromium browser compiled with the Clang/LLVM compiler.
Size: 1.62 MB - Last synced: 24 days ago - Pushed: 24 days ago - Stars: 144 - Forks: 10
cdl-saarland/rv
RV: A Unified Region Vectorizer for LLVM
Language: C++ - Size: 8.36 MB - Last synced: 19 days ago - Pushed: 29 days ago - Stars: 94 - Forks: 13
p12tic/libsimdpp
Portable header-only C++ low level SIMD library
Language: C++ - Size: 4.38 MB - Last synced: 25 days ago - Pushed: 5 months ago - Stars: 1,187 - Forks: 132
jfalcou/eve
Expressive Vector Engine - SIMD in C++ Goes Brrrr
Language: C++ - Size: 44.2 MB - Last synced: 27 days ago - Pushed: 27 days ago - Stars: 842 - Forks: 51
google/highway
Performance-portable, length-agnostic SIMD with runtime dispatch
Language: C++ - Size: 22.5 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 3,609 - Forks: 291
jvdd/argminmax
Efficient argmin & argmax
Language: Rust - Size: 536 KB - Last synced: 30 days ago - Pushed: about 1 month ago - Stars: 51 - Forks: 5
minio/md5-simd
Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.
Language: Go - Size: 698 KB - Last synced: 27 days ago - Pushed: over 1 year ago - Stars: 161 - Forks: 18
powturbo/TurboPFor-Integer-Compression
Fastest Integer Compression
Language: C - Size: 5.9 MB - Last synced: 29 days ago - Pushed: 2 months ago - Stars: 741 - Forks: 109
awesome-simd/awesome-simd
A curated list of awesome SIMD frameworks, libraries and software
Size: 38.1 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 105 - Forks: 14
localcc/lightningscanner-rs
A lightning-fast memory pattern scanner, capable of scanning gigabytes of data per second
Language: Rust - Size: 11.7 KB - Last synced: 27 days ago - Pushed: about 1 month ago - Stars: 9 - Forks: 3
pre-eth/adam
ADAM is an actively developed CSPRNG inspired by ISAAC64
Language: C - Size: 889 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 15 - Forks: 0
cloudflare/sliceslice-rs
A fast implementation of single-pattern substring search using SIMD acceleration.
Language: Rust - Size: 334 KB - Last synced: 3 days ago - Pushed: 3 months ago - Stars: 87 - Forks: 16
bluescarni/rakau
C++17 N-body Barnes-Hut on heterogeneous hardware architectures
Language: C++ - Size: 1.26 MB - Last synced: 9 days ago - Pushed: almost 4 years ago - Stars: 20 - Forks: 5
EgorBo/SimdJsonSharp
C# bindings for lemire/simdjson (and full C# port)
Language: C# - Size: 4.01 MB - Last synced: 14 days ago - Pushed: over 1 year ago - Stars: 630 - Forks: 76
WojciechMula/sse-popcount
SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html
Language: C++ - Size: 299 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 305 - Forks: 47
Daniel-Liu-c0deb0t/block-aligner
SIMD-accelerated library for computing global and X-drop affine gap penalty sequence-to-sequence or sequence-to-profile alignments using an adaptive block-based algorithm.
Language: Jupyter Notebook - Size: 2.34 MB - Last synced: about 1 month ago - Pushed: 8 months ago - Stars: 118 - Forks: 6
WojciechMula/base64simd
Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)
Language: C++ - Size: 401 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 147 - Forks: 13
sahmad98/vstring
Vectorized String Helper Functions
Language: C++ - Size: 61.5 KB - Last synced: about 2 months ago - Pushed: over 4 years ago - Stars: 6 - Forks: 0
cristian-bicheru/detect-simd
Python library to detect CPU SIMD capabilities.
Language: C - Size: 31.3 KB - Last synced: 6 days ago - Pushed: about 3 years ago - Stars: 3 - Forks: 0
ltlollo/lattice
Vectorized primitives on Intel AVX/AVX2 for some Ring-LWE problems
Language: C - Size: 45.9 KB - Last synced: about 2 months ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0
ltlollo/algos
A stash of useful algorithms
Language: C - Size: 115 KB - Last synced: about 2 months ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0
lithander/Leorik
Leorik is a strong, open-source UCI chess engine written in C#
Language: C# - Size: 17.6 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 18 - Forks: 3
powturbo/Turbo-Base64
Turbo Base64 - Fastest Base64 SIMD:SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!
Language: C - Size: 439 KB - Last synced: 2 months ago - Pushed: 9 months ago - Stars: 245 - Forks: 36
wx257osn2/qoixx
Single Header Quite Fast QOI(Quite OK Image Format) Implementation written in C++20
Language: C++ - Size: 72.3 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 30 - Forks: 3
mind/wheels
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Size: 39.1 KB - Last synced: about 1 month ago - Pushed: almost 5 years ago - Stars: 888 - Forks: 109
GargamelJR1/ImageEditor_WinForms
Simple image editor made using Windows Forms. It includes three versions of each function (C#, C++ and ASM) and allows comparing their processing times.
Language: C# - Size: 62.5 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
guzba/nimsimd
Pleasant Nim bindings for SIMD instruction sets.
Language: Nim - Size: 65.4 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 60 - Forks: 6
rusticstuff/simdutf8
SIMD-accelerated UTF-8 validation for Rust.
Language: Rust - Size: 2.73 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 500 - Forks: 22
Daniel-Liu-c0deb0t/triple_accel
Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.
Language: Rust - Size: 182 KB - Last synced: 11 days ago - Pushed: about 1 year ago - Stars: 93 - Forks: 10
Alex313031/Thorium-Linux-AVX2
Repo to serve AVX2 Linux builds of Thorium. https://github.com/Alex313031/Thorium/
Size: 9.77 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 26 - Forks: 0
lovell/highwayhash
Node.js implementation of HighwayHash, Google's fast and strong hash function
Language: JavaScript - Size: 115 KB - Last synced: 26 days ago - Pushed: over 2 years ago - Stars: 210 - Forks: 20
MrUnbelievable92/MaxMath
A C# SIMD math library for use with Unity only, substantially extending Unity.Mathematics by new types and functions, using Unity.Burst.
Language: C# - Size: 2.95 MB - Last synced: 3 months ago - Pushed: 7 months ago - Stars: 103 - Forks: 9
RickWong/go-aoc
Advent of Code in Go. First make it work, then right, then fast, then simple. Going for all puzzles < 1s.
Language: Go - Size: 298 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
dzaima/intrinsics-viewer
x86-64, ARM, and RVV intrinsics viewer
Language: JavaScript - Size: 727 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 16 - Forks: 1
WojciechMula/parsing-int-series
Parse multiple decimal integers separated by arbitrary number of delimiters
Language: C++ - Size: 280 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 28 - Forks: 5
Geolm/math_intrinsics
One-header-file library that implements missing transcendental math functions (cos, sin, acos, and more) using 100% AVX/Neon instructions (no branching)
Language: C - Size: 213 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
NIR3X/FastXor.cpp
FastXor - SIMD-based XOR Encryption
Language: C++ - Size: 24.4 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
Daksh2060/vectorized-arrays-demo
This C++ program is a demonstration of array vectorization techniques utilized in the AVX2 SIMD Assembly library, being run with C++ arrays through the vector class library created by Agner Fog. An ASM version of the same process has been implemented for comparison.
Language: C++ - Size: 654 KB - Last synced: 24 days ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
swojtasiak/fcml-lib
A general purpose machine code manipulation library for x86-32 (IA-32) and x86-64 (AMD64) architectures (Assembler, Disassembler, Library).
Language: C - Size: 22.9 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 81 - Forks: 24
Alex313031/Thorium-Win-AVX2
Repo to serve AVX2 Windows builds of Thorium. https://github.com/Alex313031/Thorium/
Language: Batchfile - Size: 294 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 358 - Forks: 8
fekle/tensorflow-docker-gpu-avx2
A little script to build tensorflow gpu images with avx2
Language: Shell - Size: 23.4 KB - Last synced: 4 months ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0
tk-yoshimura/AvxUInt
AVX-accelerated BigUInt arithmetic implementations
Language: C# - Size: 236 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
SungJJinKang/EveryCulling
This library integrates multiple culling methods into one library.
Language: C++ - Size: 620 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 117 - Forks: 9
alainesp/simd-function
Python library to metaprogram C/C++ functions using SIMD instruction sets
Size: 145 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
ptahmose/colortwist
an exercise in SIMD-optimization
Language: C++ - Size: 4.08 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 1 - Forks: 0
FCLC/AdvancedCiderXtensions
Measure accelerate BLAS performance
Language: Swift - Size: 65.4 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 5 - Forks: 1
zbjornson/bson-to-json
Fast BSON to JSON string transcoder
Language: C++ - Size: 282 KB - Last synced: 2 days ago - Pushed: about 4 years ago - Stars: 10 - Forks: 4
morian/leek
SSE/AVX2/AVX512 onion v2 address generator.
Language: C - Size: 215 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 4 - Forks: 0
hpjansson/smolscale
Fast, embeddable C code for smooth image scaling and pixel format conversion
Language: C - Size: 1.35 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 24 - Forks: 2
alignedalignof/avx-image-integral
Image integral calculation using AVX
Language: C++ - Size: 131 KB - Last synced: 5 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0
alignedalignof/avx-4x8-filter
Small fixed size image correlation filter implemented with AVX
Language: C++ - Size: 2.93 KB - Last synced: 5 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0
alignedalignof/swtormousedroid
A Star Wars: The Old Republic mouselook application with configurable click binds
Size: 8.7 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 2 - Forks: 0
bgin/Radar_ElectroOptical_Simulation
(REOS) Radar and ElectroOptical Simulation Framework written in Fortran.
Language: Fortran - Size: 39.2 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 45 - Forks: 14
nevinbaiju/transformer_cpp_ITCS-5182
Optimization of attention layers for efficient inference on the CPU and GPU. It covers optimizations for AVX and CUDA as well as efficient memory processing techniques.
Language: C++ - Size: 96.7 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0
AutomataLab/JSONSki
JSONPath Streaming with Bit-Parallel Fast-Forwarding
Language: C++ - Size: 3.65 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 19 - Forks: 4
stuarthayhurst/battleships
Battleships opponent and compute experiments, with AVX2 / AVX-512
Language: Python - Size: 86.9 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
bamert/pmstereo (fork of tetterl/pmstereo)
Implements the PatchMatch stereo algorithm. AVX2 intrinsics for Intel.
Language: C++ - Size: 2.16 MB - Last synced: 5 months ago - Pushed: over 3 years ago - Stars: 1 - Forks: 1
RobRich999/LLVM_Optimized_AVX2
Clang/LLVM built with AVX2, Polly, PGO, LTO, BOLT, and other optimizations.
Size: 186 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 5 - Forks: 2