GitHub topics: intrinsics
google/highway
Performance-portable, length-agnostic SIMD with runtime dispatch
Language: C++ - Size: 26.7 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4,610 - Forks: 348

awesome-simd/awesome-simd
A curated list of awesome SIMD frameworks, libraries and software
Size: 64.5 KB - Last synced at: about 18 hours ago - Pushed at: 8 months ago - Stars: 181 - Forks: 16

OpenNMT/CTranslate2
Fast inference engine for Transformer models
Language: C++ - Size: 14.5 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 3,785 - Forks: 354

xavetar/Fasma
Spectrum of components
Language: Rust - Size: 162 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

Lokathor/safe_arch
Exposes arch-specific intrinsics as safe functions (via cfg).
Language: Rust - Size: 360 KB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 55 - Forks: 10

manodeep/Corrfunc
⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
Language: C - Size: 150 MB - Last synced at: 14 days ago - Pushed at: 3 months ago - Stars: 175 - Forks: 56

Dr-Noob/peakperf
Achieve peak performance on x86 CPUs and NVIDIA GPUs
Language: C++ - Size: 250 KB - Last synced at: 20 days ago - Pushed at: 7 months ago - Stars: 69 - Forks: 15

simstim-star/DirectXMath-in-C
Port of https://github.com/microsoft/DirectXMath to C
Language: C - Size: 35.2 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 1 - Forks: 0

jiegec/unofficial-loongarch-intrinsics-guide
Unofficial LoongArch Intrinsics Guide
Language: C - Size: 3.61 MB - Last synced at: 19 days ago - Pushed at: about 2 months ago - Stars: 51 - Forks: 8

HJLebbink/intrinsics-dude
Open-source Visual Studio extension for compiler intrinsics in C/C++
Language: HTML - Size: 4.71 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 73 - Forks: 4

jedisct1/untrinsics
Header-only portable implementations of common Intel intrinsics, including cryptographic instructions.
Language: C - Size: 10.7 KB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 13 - Forks: 1

AdamNiederer/faster
SIMD for humans
Language: Rust - Size: 453 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 1,576 - Forks: 51

EgorBo/IntrinsicsPlayground
My toys to play with SSE/AVX in pure C# (.NET Core 2.1)
Language: C# - Size: 1.62 MB - Last synced at: 21 days ago - Pushed at: over 6 years ago - Stars: 64 - Forks: 5

urbanjost/M_intrinsics
man-page style descriptions of Fortran intrinsics for use as a reference for developers and tutorials
Language: Fortran - Size: 161 MB - Last synced at: about 18 hours ago - Pushed at: about 1 month ago - Stars: 20 - Forks: 3

tttapa/ARM-NEON-Compositor
Fast SIMD alpha overlay and blending for Raspberry Pi and other ARM systems.
Language: C++ - Size: 9.05 MB - Last synced at: 21 days ago - Pushed at: almost 5 years ago - Stars: 22 - Forks: 2

karlobratko/L1d-cache-optimization
Practical L1d cache optimizations on matrix multiplication
Language: C - Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ljharb/get-intrinsic
Get and robustly cache all JS language-level intrinsics at first require time.
Language: JavaScript - Size: 227 KB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 26 - Forks: 4

mku11/Salmon-AES-CTR
Salmon is an AES-256 CTR encryption library with built-in integrity, parallel operations, and seekable stream support. It provides a high level API for encrypting data, byte streams, and a virtual drive API for encrypted local and remote files. Optimized for Intel x86_64, ARM64, and GPU cards.
Language: Java - Size: 76.7 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 5 - Forks: 1

n0thhhing/zeon
ARM/ARM64 Neon intrinsics implemented in Zig
Language: Zig - Size: 446 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 9 - Forks: 0

yxlao/camtools
CamTools: Camera Tools for Computer Vision
Language: Python - Size: 756 KB - Last synced at: 28 days ago - Pushed at: 3 months ago - Stars: 189 - Forks: 12

dyfcalid/CameraCalibration
Fisheye or Normal Camera Intrinsic and Extrinsic Calibration. Surround Camera Bird Eye View Generator.
Language: Python - Size: 11.5 MB - Last synced at: about 2 months ago - Pushed at: 12 months ago - Stars: 689 - Forks: 187

zchrissirhcz/neon_sim
Implement ARM NEON intrinsics in C++
Language: C++ - Size: 562 KB - Last synced at: 29 days ago - Pushed at: 12 months ago - Stars: 20 - Forks: 4

piotte13/SIMD-Visualiser
A tool to graphically visualize SIMD code
Language: JavaScript - Size: 28.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 677 - Forks: 43

howjmay/neon2rvv
A translator from ARM NEON intrinsics to a RISC-V Vector (RVV) extension implementation
Language: C++ - Size: 1.6 MB - Last synced at: 27 days ago - Pushed at: 9 months ago - Stars: 30 - Forks: 7

badamczewski/SimpleIntrinsics
This project aims to rename all C# intrinsic names to their more compact C/C++ counterparts that the industry uses.
Language: C# - Size: 57.6 KB - Last synced at: 4 days ago - Pushed at: over 4 years ago - Stars: 50 - Forks: 2

voidxno/fast-recursive-sha256
Fast Recursive SHA256
Language: C++ - Size: 110 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 12 - Forks: 0

Metalnem/sha256-armv8 📦
Accelerated SHA-256 computation in pure C# using ARMv8 SHA-256 compiler intrinsics
Language: C# - Size: 12.7 KB - Last synced at: 6 days ago - Pushed at: almost 7 years ago - Stars: 10 - Forks: 0

Sibras/ShiftIntrinsicGuide
A GUI for viewing Intel intrinsic information combined with uops.info measurement data.
Language: C++ - Size: 487 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 1

ThiagoFBastos/Closest-String-Problem
Heuristics (ILS, Simulated Annealing, and Genetic Algorithm) for the closest string problem
Language: C++ - Size: 143 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

KabalMcBlade/ECS-API
ECS-API is an ECS API framework, built to be high-performance yet lightweight and easy to use
Language: C++ - Size: 146 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

RandomHashTags/swift-intrinsics
Unlock SIMD intrinsics for Swift.
Language: Swift - Size: 24.4 KB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

WojciechMula/man-intrinsics
Create man pages from information used by Intel Intrinsics Guide and optionally uops.info
Language: Python - Size: 106 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 45 - Forks: 5

Lokad/FasterMath
Intrinsics accelerated math functions for .NET Core - trading accuracy for performance
Language: C# - Size: 45.9 KB - Last synced at: 16 days ago - Pushed at: over 5 years ago - Stars: 17 - Forks: 2

Mampenda/Multiprogramming
This is a short recap of multiprogramming with Java created for the course at UiB.
Language: Java - Size: 9.81 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Geolm/math_intrinsics
Single-header library that implements missing transcendental math functions (cos, sin, acos, and more) using 100% AVX/Neon instructions (no branching)
Language: C - Size: 216 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 0

sbstndb/vectorized_find
Experiments with the find function on vectors in C++
Language: C++ - Size: 53.7 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

JishinMaster/simd_utils
A header only library implementing common mathematical functions using SIMD intrinsics
Language: C - Size: 1.46 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 95 - Forks: 21

c9ts/romulus
C wrapper for SIMD intrinsics
Language: C - Size: 4.88 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

turborium/SSE2Sample
Example of using SSE2
Language: Pascal - Size: 3.01 MB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0

AuburnSounds/intel-intrinsics
The Dlang SIMD library
Language: D - Size: 4.07 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 68 - Forks: 11

rikba/hypersim_multiview
Implements and analyses frame-to-frame pixel reprojection for the Hypersim Evermotion data set.
Language: Python - Size: 4.3 MB - Last synced at: 26 days ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 1

evrhel/MatrixUtil
A C++ header-only library for vector, matrix, and quaternion math.
Language: C++ - Size: 138 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 4 - Forks: 1

CBGonzalez/Core3Intrinsics-Intro
Taking the new `System.Runtime.Intrinsics` namespace for a spin
Language: C# - Size: 74.2 KB - Last synced at: 22 days ago - Pushed at: over 5 years ago - Stars: 32 - Forks: 2

Technologicat/cython-sse-example
Simple example for embedding SSE2 assembly in Cython projects
Language: Python - Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: about 8 years ago - Stars: 22 - Forks: 5

aidenfmunro/MandelbrotSet
Mandelbrot visualization & AVX2 optimization.
Language: C++ - Size: 330 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

maj113/Counter
High performance character occurrence counter
Language: C++ - Size: 21.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Computeiful/BiRandom
Fast AES-based Bijective PRNG in C using CPU intrinsics
Language: C - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

CoffeeBeforeArch/spinlocks
Example implementations of spinlocks
Language: C++ - Size: 318 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 18 - Forks: 2

naitri/camera-calibration
A Python implementation of Zhang, Z., "A Flexible New Technique for Camera Calibration"
Language: Python - Size: 29 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

soaringleefighting/x86_asm_intrin_demo
This project is the demo for x86 assembly and intrinsic optimization.
Language: Assembly - Size: 39.1 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

danieldotwav/CPU-Manufacturer-ID-Decoder
This program utilizes C++ and specific Windows compiler intrinsics to decode and display the CPU manufacturer ID by accessing the CPU's built-in CPUID information. It demonstrates how to extract and format this low-level information for display.
Language: C++ - Size: 9.77 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

FernandoGPinto/Intrinsics
.NET 7 has brought very significant performance improvements to several LINQ queries by vectorising the processing of various data structures. While the improvements are considerable, I wondered whether further gains were possible by resorting to .NET Hardware Intrinsics, particularly for large datasets.
Language: C# - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Metalnem/alignment-and-pipelining 📦
Benchmarks showing the difference between the naive intrinsics usage and the optimized code that takes advantage of alignment and pipelining
Language: C# - Size: 11.7 KB - Last synced at: 6 days ago - Pushed at: almost 7 years ago - Stars: 12 - Forks: 0

Metalnem/aes-armv8 📦
Accelerated AES computation in pure C# using ARMv8 AES compiler intrinsics
Language: C# - Size: 12.7 KB - Last synced at: 6 days ago - Pushed at: almost 7 years ago - Stars: 10 - Forks: 1

siddharthKatageri/tsai-camera-calibration
Implementation of Tsai camera calibration to compute the camera's intrinsic and extrinsic parameters, with visualization of the extrinsics.
Language: Python - Size: 13.6 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 2

debOliveira/myCameraCalibrator 📦
MATLAB and Python camera calibration framework
Language: Jupyter Notebook - Size: 62.6 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

sergey-zinchenko/IntrinsicsCompat
Library that maps intrinsics to their equivalent implementations on different hardware platforms
Language: C# - Size: 2.93 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

GMUCERG/PQC_NEON
NEON implementation of NIST lattice-based PQC finalists
Language: C - Size: 11.3 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 5

gaujay/simd_collection
A collection of highly optimized, SIMD-accelerated (SSE, AVX, FMA, NEON) functions written in C
Language: C - Size: 1.89 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 13 - Forks: 1

Reavenk/Datum
A C# library that represents and boxes intrinsic datatypes explicitly outside of the .NET runtime.
Language: C# - Size: 34.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

artoolkitx/artoolkitx-calibration (fork of edennz/artoolkit6-calibration)
artoolkitX camera calibration Android, macOS, Linux, and iOS app.
Language: C - Size: 68.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 3

TheAssemblyArmada/BaseConfig
Shareable module for basic configuration and platform detection. Used by all hosted projects.
Language: C - Size: 59.6 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 4

per-framework/intrinsics.cpp
Low level intrinsics not provided in the C++ standard
Language: C++ - Size: 8.79 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

ptrvsrg/NSU-computer-and-peripherals
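Lab assignments for the third-semester course "Computers and Peripheral Devices" at the Faculty of Information Technologies, Novosibirsk State University (FIT NSU).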
Language: Assembly - Size: 7.99 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ptrvsrg/NSU-homework-C
Homework for the first-year "Programming" course at the Faculty of Information Technologies, Novosibirsk State University (FIT NSU)
Language: C - Size: 1.11 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

jsburke/HighPerf
Private Repo for collaboration and work in EC527
Language: C++ - Size: 6.15 MB - Last synced at: almost 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

xDiaym/luhncode
AVX2 version of Luhn algorithm
Language: C++ - Size: 16.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

andjsrk/curried-intrinsic
Curried intrinsics
Language: TypeScript - Size: 39.1 KB - Last synced at: about 7 hours ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

proydakov/cppzone
My C/C++/Intrinsic, OpenGL/OpenGLES2 experiments for desktop computers.
Language: C++ - Size: 21.1 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 2

welikethestock/libutil
Language: C - Size: 514 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

awr1/cpuwhat
Nim utilities for advanced CPU operations: CPU identification, ISA extension detection, bindings to assorted intrinsics
Language: Nim - Size: 76.2 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 27 - Forks: 2

chamorajg/pytorch_depth_and_motion_planning
This project is an unofficial PyTorch implementation of https://github.com/google-research/google-research/tree/master/depth_and_motion_learning
Language: Python - Size: 279 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 24 - Forks: 5

pvphan/camera-calibration
Camera calibration from scratch
Language: Python - Size: 376 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

SleepingSoul/intrinsics-benchmark
Project uses Google Benchmark to test a few intrinsics implementations (AVX and AVX2) against MSVC maximum optimizations
Language: C++ - Size: 49.8 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

joshlvmh/iqtree_arm_neon
IQ-TREE ported to work for systems with ARM NEON ISA
Language: C++ - Size: 7.59 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1

ManuCanedo/fractal-generator
A discoverable fractal world.
Language: C++ - Size: 73.2 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 1

Metalnem/aes-gcm-siv 📦
.NET Core 3.0 implementation of AES-GCM-SIV nonce misuse-resistant authenticated encryption
Language: C# - Size: 2.54 MB - Last synced at: 6 days ago - Pushed at: over 6 years ago - Stars: 22 - Forks: 5

kovdan01/parallel-computing 📦
Language: C++ - Size: 590 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

jkivilin/camellia-simd-aesni
Camellia cipher SIMD vector implementations for x86 (with AES-NI, VAES and/or GFNI instructions), ARM (with ARMv8 Crypto Extension instructions) and POWER (with VMX+VSX+crypto instructions)
Language: C - Size: 306 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 0

MateusPincho/chessboardCalibration
Calibrate your camera using a chessboard
Language: Jupyter Notebook - Size: 6.54 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

dmckinnon/calibration
What is a camera calibration, why is it necessary, and how do we compute it?
Language: C++ - Size: 4.46 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 13 - Forks: 2

sebaFlame/NippyWard.Text
A high-performance UTF-8 implementation for .NET
Language: C# - Size: 95.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

adel-elmala/optimization-playGround
learning code optimization based on hardware resources, making a simple image processing library to test different approaches
Language: C - Size: 12.7 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

dkurt/cv_winter_camp_2022
Language: C++ - Size: 24.4 KB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 28

matmuher/mandelbrot_optimization
Intrinsics-based optimization of a program that computes points belonging to the Mandelbrot set.
Language: C - Size: 1.9 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

d3phys/mandelbrot
Mandelbrot set SIMD optimization
Language: C++ - Size: 107 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

olekscode/IntrinsicsAVX
Implementation of several algorithms using Intel AVX intrinsic instructions
Language: C - Size: 39.1 KB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 1

ljharb/primordials
node core's "primordials" module, but robust for use in a published package
Size: 1.95 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

rodolfoap/cameracalibration
OpenCV4 C++ camera calibration in a few lines
Language: C++ - Size: 691 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 2

ogurets/popcnt_emulator
Pintool library for running Quantum Break on pre-SSE4.2 CPUs
Language: C++ - Size: 21.5 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 17 - Forks: 5

mike-barber/rust-fast-linear-estimator
Fast linear and logistic estimation using Rust intrinsics and C# (Intel and ARM)
Language: Rust - Size: 756 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

ThomasRetornaz/poutre
Generic ImageProcessing library
Language: C++ - Size: 11.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Baleg00/RBLibs
Various libraries on different subjects, such as cryptography, mathematics and utilities.
Language: C++ - Size: 120 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

ifd3f/bigfloat
High-performance, high-precision 80-bit floating point library
Language: C++ - Size: 326 KB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

olekscode/Cholesky-AVX
Optimizing Cholesky Factorization with Intel AVX Instructions
Language: C++ - Size: 1.45 MB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0
