Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: avx2

bgin/Radar-ElectroOptical-Simulation

(REOS) Radar and Electro-Optical Simulation Framework written in C++.

Language: C++ - Size: 28.3 MB - Last synced: about 8 hours ago - Pushed: about 9 hours ago - Stars: 51 - Forks: 16

simdjson/simdjson

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

Language: C++ - Size: 56 MB - Last synced: about 16 hours ago - Pushed: about 22 hours ago - Stars: 18,486 - Forks: 968

kbalt/ezk-image

Convert between common image/video formats

Language: Rust - Size: 283 KB - Last synced: about 16 hours ago - Pushed: about 18 hours ago - Stars: 1 - Forks: 0

simdutf/is_utf8

Fast C++ function "is_utf8": checks if the input is valid UTF-8. Made of a single source file. Optimized for ARM NEON, x64 SSE, AVX2 and AVX-512.

Language: C++ - Size: 134 KB - Last synced: about 2 hours ago - Pushed: about 23 hours ago - Stars: 44 - Forks: 5

simdutf/simdutf

Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.

Language: C++ - Size: 5.22 MB - Last synced: about 16 hours ago - Pushed: 3 days ago - Stars: 965 - Forks: 60

RoaringBitmap/CRoaring

Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks

Language: C - Size: 47 MB - Last synced: about 16 hours ago - Pushed: 6 days ago - Stars: 1,460 - Forks: 259

path-racer/pathlib

Lightweight AVX-optimized containers and routines for the Path game engine.

Language: C - Size: 102 MB - Last synced: about 19 hours ago - Pushed: 1 day ago - Stars: 1 - Forks: 0

microsoft/DirectXMath

DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps

Language: C++ - Size: 2.18 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 1,485 - Forks: 227

axze-az/cftal

a template based C++ short vector library with vectorized faithfully rounded elementary functions

Language: C++ - Size: 61.5 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 0 - Forks: 0

ashvardanian/SimSIMD

Up to 200x Faster Inner Products and Vector Similarity - for Python, JavaScript, Rust, and C, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐

Language: C - Size: 685 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 727 - Forks: 35

EricGrange/DWScript

Delphi Web Script general purpose scripting engine

Language: Pascal - Size: 191 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 156 - Forks: 47

OpenNMT/CTranslate2

Fast inference engine for Transformer models

Language: C++ - Size: 13.5 MB - Last synced: 4 days ago - Pushed: 11 days ago - Stars: 2,828 - Forks: 249

intel/x86-simd-sort

C++ template library for high performance SIMD based sorting algorithms

Language: C++ - Size: 1000 KB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 798 - Forks: 47

lssfau/ExaStencils

Mirror of the official ExaStencils Project repository. Please open pull requests on GitLab: https://i10git.cs.fau.de/exastencils/exastencils

Language: Scala - Size: 299 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 3 - Forks: 1

rsonquery/rsonpath

Blazing fast JSONPath query engine written in Rust.

Language: Rust - Size: 61.5 MB - Last synced: 6 days ago - Pushed: 7 days ago - Stars: 42 - Forks: 5

RRZE-HPC/OSACA

Open Source Architecture Code Analyzer

Language: Jupyter Notebook - Size: 8.19 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 274 - Forks: 15

libxsmm/libxsmm

Library for specialized dense and sparse matrix operations, and deep learning primitives.

Language: C - Size: 297 MB - Last synced: 25 days ago - Pushed: 25 days ago - Stars: 795 - Forks: 181

lemire/fastbase64

SIMD-accelerated base64 codecs

Language: C - Size: 4 MB - Last synced: about 16 hours ago - Pushed: about 1 month ago - Stars: 420 - Forks: 32

parlance-zz/Pulse

Pre/post-processing utility for generating quantized log-normally-distributed spike-intervals from raw audio, and back again (C++, AVX2)

Language: C++ - Size: 6.14 MB - Last synced: 9 days ago - Pushed: over 1 year ago - Stars: 7 - Forks: 1

minio/highwayhash

Native Go version of HighwayHash with optimized assembly implementations on Intel and ARM. Able to process over 10 GB/sec on a single core on Intel CPUs - https://en.wikipedia.org/wiki/HighwayHash

Language: Go - Size: 98.6 KB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 857 - Forks: 67

t-hishinuma/DD-AVX_v3

Library of High Precision Sparse Matrix Operations Accelerated by SIMD

Language: C++ - Size: 1.17 MB - Last synced: 10 days ago - Pushed: almost 3 years ago - Stars: 40 - Forks: 3

HJLebbink/asm-dude

Visual Studio extension for assembly syntax highlighting and code completion in assembly files and the disassembly window

Language: C# - Size: 80.2 MB - Last synced: 9 days ago - Pushed: about 1 month ago - Stars: 4,104 - Forks: 94

pypy/fast-utf8-methods

Fast UTF-8 utility methods

Language: HTML - Size: 1.26 MB - Last synced: 10 days ago - Pushed: almost 7 years ago - Stars: 2 - Forks: 0

VcDevel/Vc

SIMD Vector Classes for C++

Language: C++ - Size: 11 MB - Last synced: 9 days ago - Pushed: 3 months ago - Stars: 1,420 - Forks: 150

mcountryman/simd-adler32

A SIMD-accelerated Adler-32 hash algorithm implementation.

Language: Rust - Size: 202 KB - Last synced: 11 days ago - Pushed: about 1 month ago - Stars: 31 - Forks: 5

andyD123/DR3

DR3 enables users to write vectorised code using generic lambdas and filters. Switch instruction set just by changing enclosing namespace

Language: C++ - Size: 19.2 MB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 25 - Forks: 5

maddsua/simd

AVX2 and SSE2 use cases and benchmarks

Language: C++ - Size: 131 KB - Last synced: 13 days ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

ClaudiuHKS/Se-Capabilities

Se Capabilities

Language: C++ - Size: 9.77 KB - Last synced: 13 days ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0

albertomontesg/fast-tsne

How to Write Fast Numerical Code Project @ ETHZ 2017

Language: Jupyter Notebook - Size: 140 MB - Last synced: 14 days ago - Pushed: over 6 years ago - Stars: 4 - Forks: 0

Auburn/FastSIMD

Low level generic SIMD wrapper for x86, ARM, WASM with dynamic dispatch

Language: C++ - Size: 228 KB - Last synced: 10 days ago - Pushed: 14 days ago - Stars: 23 - Forks: 2

matthewkolbe/LitMath

A collection of SIMD (AVX2 & AVX512) accelerated mathematical functions for .NET

Language: C# - Size: 175 KB - Last synced: 14 days ago - Pushed: about 1 month ago - Stars: 44 - Forks: 2

powturbo/Turbo-Run-Length-Encoding

TurboRLE-Fastest Run Length Encoding

Language: C - Size: 15.2 MB - Last synced: 9 days ago - Pushed: about 1 year ago - Stars: 277 - Forks: 27

manodeep/Corrfunc

⚡️⚡️⚡️ Blazing fast correlation functions on the CPU.

Language: C - Size: 150 MB - Last synced: 7 days ago - Pushed: about 1 month ago - Stars: 162 - Forks: 49

Auburn/FastNoiseSIMD 📦

C++ SIMD Noise Library

Language: C++ - Size: 206 KB - Last synced: 10 days ago - Pushed: about 3 years ago - Stars: 604 - Forks: 89

simd-everywhere/simde

Implementations of SIMD instruction sets for systems which don't natively support them.

Language: C - Size: 35 MB - Last synced: 18 days ago - Pushed: 20 days ago - Stars: 2,168 - Forks: 225

IAKOBVS/jstring

C String Library

Language: C - Size: 4.88 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 1 - Forks: 0

VectorChief/QuadRay-engine

Realtime raytracer using SIMD on ARM, MIPS, PPC and x86

Language: C - Size: 14.6 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 25 - Forks: 4

VectorChief/UniSIMD-assembler

SIMD macro assembler unified for ARM, MIPS, PPC and x86

Language: C - Size: 9.11 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 85 - Forks: 7

WojciechMula/toys

Storage for my snippets, toy programs, etc.

Language: C++ - Size: 2.34 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 311 - Forks: 37

maj113/Counter

High performance character occurrence counter

Language: C++ - Size: 21.5 KB - Last synced: 18 days ago - Pushed: 18 days ago - Stars: 1 - Forks: 0

JohT/convolution-benchmarks

Benchmark convolution implementations in C++ with Catch2 visualized with Vega-Lite

Language: C++ - Size: 5.94 MB - Last synced: 25 days ago - Pushed: 26 days ago - Stars: 1 - Forks: 1

kimwalisch/libpopcnt

🚀 Fast C/C++ bit population count library

Language: C - Size: 170 KB - Last synced: 10 days ago - Pushed: about 2 months ago - Stars: 298 - Forks: 36

RobRich999/Chromium_Clang

Chromium browser compiled with the Clang/LLVM compiler.

Size: 1.62 MB - Last synced: 24 days ago - Pushed: 24 days ago - Stars: 144 - Forks: 10

cdl-saarland/rv

RV: A Unified Region Vectorizer for LLVM

Language: C++ - Size: 8.36 MB - Last synced: 19 days ago - Pushed: 29 days ago - Stars: 94 - Forks: 13

p12tic/libsimdpp

Portable header-only C++ low level SIMD library

Language: C++ - Size: 4.38 MB - Last synced: 25 days ago - Pushed: 5 months ago - Stars: 1,187 - Forks: 132

jfalcou/eve

Expressive Vector Engine - SIMD in C++ Goes Brrrr

Language: C++ - Size: 44.2 MB - Last synced: 27 days ago - Pushed: 27 days ago - Stars: 842 - Forks: 51

google/highway

Performance-portable, length-agnostic SIMD with runtime dispatch

Language: C++ - Size: 22.5 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 3,609 - Forks: 291

jvdd/argminmax

Efficient argmin & argmax

Language: Rust - Size: 536 KB - Last synced: 30 days ago - Pushed: about 1 month ago - Stars: 51 - Forks: 5

minio/md5-simd

Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.

Language: Go - Size: 698 KB - Last synced: 27 days ago - Pushed: over 1 year ago - Stars: 161 - Forks: 18

powturbo/TurboPFor-Integer-Compression

Fastest Integer Compression

Language: C - Size: 5.9 MB - Last synced: 29 days ago - Pushed: 2 months ago - Stars: 741 - Forks: 109

awesome-simd/awesome-simd

A curated list of awesome SIMD frameworks, libraries and software

Size: 38.1 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 105 - Forks: 14

localcc/lightningscanner-rs

A lightning-fast memory pattern scanner, capable of scanning gigabytes of data per second

Language: Rust - Size: 11.7 KB - Last synced: 27 days ago - Pushed: about 1 month ago - Stars: 9 - Forks: 3

pre-eth/adam

ADAM is an actively developed CSPRNG inspired by ISAAC64

Language: C - Size: 889 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 15 - Forks: 0

cloudflare/sliceslice-rs

A fast implementation of single-pattern substring search using SIMD acceleration.

Language: Rust - Size: 334 KB - Last synced: 3 days ago - Pushed: 3 months ago - Stars: 87 - Forks: 16

bluescarni/rakau

C++17 N-body Barnes-Hut on heterogeneous hardware architectures

Language: C++ - Size: 1.26 MB - Last synced: 9 days ago - Pushed: almost 4 years ago - Stars: 20 - Forks: 5

EgorBo/SimdJsonSharp

C# bindings for lemire/simdjson (and full C# port)

Language: C# - Size: 4.01 MB - Last synced: 14 days ago - Pushed: over 1 year ago - Stars: 630 - Forks: 76

WojciechMula/sse-popcount

SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html

Language: C++ - Size: 299 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 305 - Forks: 47

Daniel-Liu-c0deb0t/block-aligner

SIMD-accelerated library for computing global and X-drop affine gap penalty sequence-to-sequence or sequence-to-profile alignments using an adaptive block-based algorithm.

Language: Jupyter Notebook - Size: 2.34 MB - Last synced: about 1 month ago - Pushed: 8 months ago - Stars: 118 - Forks: 6

WojciechMula/base64simd

Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)

Language: C++ - Size: 401 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 147 - Forks: 13

sahmad98/vstring

Vectorized String Helper Functions

Language: C++ - Size: 61.5 KB - Last synced: about 2 months ago - Pushed: over 4 years ago - Stars: 6 - Forks: 0

cristian-bicheru/detect-simd

Python library to detect CPU SIMD capabilities.

Language: C - Size: 31.3 KB - Last synced: 6 days ago - Pushed: about 3 years ago - Stars: 3 - Forks: 0

ltlollo/lattice

Vectorized primitives on Intel AVX/AVX2 for some Ring-LWE problems

Language: C - Size: 45.9 KB - Last synced: about 2 months ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0

ltlollo/algos

A stash of useful algorithms

Language: C - Size: 115 KB - Last synced: about 2 months ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0

lithander/Leorik

Leorik is a strong, open-source UCI chess engine written in C#

Language: C# - Size: 17.6 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 18 - Forks: 3

powturbo/Turbo-Base64

Turbo Base64 - Fastest Base64 SIMD:SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!

Language: C - Size: 439 KB - Last synced: 2 months ago - Pushed: 9 months ago - Stars: 245 - Forks: 36

wx257osn2/qoixx

Single Header Quite Fast QOI(Quite OK Image Format) Implementation written in C++20

Language: C++ - Size: 72.3 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 30 - Forks: 3

mind/wheels

Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)

Size: 39.1 KB - Last synced: about 1 month ago - Pushed: almost 5 years ago - Stars: 888 - Forks: 109

GargamelJR1/ImageEditor_WinForms

Simple image editor made using Windows Forms. It includes three versions of each function (C#, C++ and ASM) and allows comparing their processing times.

Language: C# - Size: 62.5 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

guzba/nimsimd

Pleasant Nim bindings for SIMD instruction sets.

Language: Nim - Size: 65.4 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 60 - Forks: 6

rusticstuff/simdutf8

SIMD-accelerated UTF-8 validation for Rust.

Language: Rust - Size: 2.73 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 500 - Forks: 22

Daniel-Liu-c0deb0t/triple_accel

Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.

Language: Rust - Size: 182 KB - Last synced: 11 days ago - Pushed: about 1 year ago - Stars: 93 - Forks: 10

Alex313031/Thorium-Linux-AVX2

Repo to serve AVX2 Linux builds of Thorium. https://github.com/Alex313031/Thorium/

Size: 9.77 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 26 - Forks: 0

lovell/highwayhash

Node.js implementation of HighwayHash, Google's fast and strong hash function

Language: JavaScript - Size: 115 KB - Last synced: 26 days ago - Pushed: over 2 years ago - Stars: 210 - Forks: 20

MrUnbelievable92/MaxMath

A C# SIMD math library for use with Unity only, substantially extending Unity.Mathematics by new types and functions, using Unity.Burst.

Language: C# - Size: 2.95 MB - Last synced: 3 months ago - Pushed: 7 months ago - Stars: 103 - Forks: 9

RickWong/go-aoc

Advent of Code in Go. First make it work, then right, then fast, then simple. Going for all puzzles < 1s.

Language: Go - Size: 298 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

dzaima/intrinsics-viewer

x86-64, ARM, and RVV intrinsics viewer

Language: JavaScript - Size: 727 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 16 - Forks: 1

WojciechMula/parsing-int-series

Parse multiple decimal integers separated by arbitrary number of delimiters

Language: C++ - Size: 280 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 28 - Forks: 5

Geolm/math_intrinsics

Single-header library that implements missing transcendental math functions (cos, sin, acos, and more) using 100% AVX/Neon instructions (no branching)

Language: C - Size: 213 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

NIR3X/FastXor.cpp

FastXor - SIMD-based XOR Encryption

Language: C++ - Size: 24.4 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

Daksh2060/vectorized-arrays-demo

This C++ program is a demonstration of array vectorization techniques utilized in the AVX2 SIMD Assembly library, being run with C++ arrays through the vector class library created by Agner Fog. An ASM version of the same process has been implemented for comparison.

Language: C++ - Size: 654 KB - Last synced: 24 days ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

swojtasiak/fcml-lib

A general purpose machine code manipulation library for x86-32 (IA-32) and x86-64 (AMD64) architectures (Assembler, Disassembler, Library).

Language: C - Size: 22.9 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 81 - Forks: 24

Alex313031/Thorium-Win-AVX2

Repo to serve AVX2 Windows builds of Thorium. https://github.com/Alex313031/Thorium/

Language: Batchfile - Size: 294 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 358 - Forks: 8

fekle/tensorflow-docker-gpu-avx2

A little script to build tensorflow gpu images with avx2

Language: Shell - Size: 23.4 KB - Last synced: 4 months ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0

tk-yoshimura/AvxUInt

AVX-Accelerated BigUInt Arithmetic Implementations

Language: C# - Size: 236 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

SungJJinKang/EveryCulling

This library integrates multiple culling methods into one library.

Language: C++ - Size: 620 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 117 - Forks: 9

alainesp/simd-function

Python library to metaprogram C/C++ functions using SIMD instruction sets

Size: 145 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

ptahmose/colortwist

an exercise in SIMD-optimization

Language: C++ - Size: 4.08 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 1 - Forks: 0

FCLC/AdvancedCiderXtensions

Measure accelerate BLAS performance

Language: Swift - Size: 65.4 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 5 - Forks: 1

zbjornson/bson-to-json

Fast BSON to JSON string transcoder

Language: C++ - Size: 282 KB - Last synced: 2 days ago - Pushed: about 4 years ago - Stars: 10 - Forks: 4

morian/leek

SSE/AVX2/AVX512 onion v2 address generator.

Language: C - Size: 215 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 4 - Forks: 0

hpjansson/smolscale

Fast, embeddable C code for smooth image scaling and pixel format conversion

Language: C - Size: 1.35 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 24 - Forks: 2

alignedalignof/avx-image-integral

Image integral calculation using AVX

Language: C++ - Size: 131 KB - Last synced: 5 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

alignedalignof/avx-4x8-filter

Small fixed size image correlation filter implemented with AVX

Language: C++ - Size: 2.93 KB - Last synced: 5 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

alignedalignof/swtormousedroid

A Star Wars: The Old Republic mouselook application with configurable click binds

Size: 8.7 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 2 - Forks: 0

bgin/Radar_ElectroOptical_Simulation

(REOS) Radar and ElectroOptical Simulation Framework written in Fortran.

Language: Fortran - Size: 39.2 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 45 - Forks: 14

nevinbaiju/transformer_cpp_ITCS-5182

Optimization of Attention layers for efficient inferencing on the CPU and GPU. It covers optimizations for AVX and CUDA, as well as efficient memory processing techniques.

Language: C++ - Size: 96.7 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

AutomataLab/JSONSki

JSONPath Streaming with Bit-Parallel Fast-Forwarding

Language: C++ - Size: 3.65 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 19 - Forks: 4

stuarthayhurst/battleships

Battleships opponent and compute experiments, with AVX2 / AVX-512

Language: Python - Size: 86.9 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

bamert/pmstereo (fork of tetterl/pmstereo)

Implements the PatchMatch stereo algorithm. AVX2 intrinsics for Intel.

Language: C++ - Size: 2.16 MB - Last synced: 5 months ago - Pushed: over 3 years ago - Stars: 1 - Forks: 1

RobRich999/LLVM_Optimized_AVX2

Clang/LLVM built with AVX2, Polly, PGO, LTO, BOLT, and other optimizations.

Size: 186 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 5 - Forks: 2