GitHub topics: half-precision

Repositories

KernelTuner/kernel_float

CUDA/HIP header-only library for low-precision (16 bit, 8 bit) and vectorized GPU kernel development

Language: C++ - Size: 6.86 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 11 - Forks: 2

petamoriken/float16

IEEE 754 half-precision floating-point ponyfill

Language: JavaScript - Size: 10.1 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 102 - Forks: 9

adelj88/rocm_wmma_gemm

WMMA GEMM in ROCm for RDNA GPUs

Language: C++ - Size: 1.02 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 4 - Forks: 0

Large collection of number systems providing custom arithmetic for mixed-precision algorithm development and optimization for AI, Machine Learning, Computer Vision, Signal Processing, CAE, EDA, control, optimization, estimation, and approximation.

Language: C++ - Size: 117 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 460 - Forks: 67

Maratyszcza/FP16

Conversion to/from half-precision floating point formats

Language: C++ - Size: 135 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 362 - Forks: 97

dyeo/dym

The DYM Math Library for Graphics and Game Programming

Language: C++ - Size: 240 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 2 - Forks: 2

canbula/ieee754

Python module which finds the IEEE-754 representation of a floating point number.

Language: Python - Size: 85.9 KB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 29 - Forks: 5

enp1s0/cuMpSGEMM

Fast SGEMM emulation on Tensor Cores

Language: Cuda - Size: 476 KB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 13 - Forks: 1

x448/float16

float16 provides IEEE 754 half-precision format (binary16) with correct conversions to/from float32

Language: Go - Size: 197 KB - Last synced at: 28 days ago - Pushed at: about 1 month ago - Stars: 81 - Forks: 8

shibatch/tlfloat

C++ template library for floating point operations

Language: C++ - Size: 740 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 29 - Forks: 2

DivergentClouds/subleq-linear

An implementation of the Subleq OISC using only linear operations on half-precision (16 bit) IEEE-754 floats (and a loop).

Language: Zig - Size: 22.5 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

minhhn2910/cuda-half2 📦

Convert CUDA programs from float data type to half or half2 with SIMDization

Language: C++ - Size: 144 MB - Last synced at: 8 days ago - Pushed at: over 6 years ago - Stars: 20 - Forks: 6

stdlib-js/constants-float16

Half-precision floating-point mathematical constants.

Language: JavaScript - Size: 629 KB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

higham/chop

Round matrix elements to lower precision in MATLAB

Language: MATLAB - Size: 52.7 KB - Last synced at: 7 days ago - Pushed at: about 3 years ago - Stars: 37 - Forks: 11

stdlib-js/constants-float16-sqrt-eps

Square root of half-precision floating-point epsilon.

Language: JavaScript - Size: 325 KB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-num-bytes

Size (in bytes) of a half-precision floating-point number.

Language: JavaScript - Size: 313 KB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-cbrt-eps

Cube root of half-precision floating-point epsilon.

Language: JavaScript - Size: 319 KB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

SomeRandomiOSDev/Half

Swift Half-Precision Floating Point

Language: Swift - Size: 209 KB - Last synced at: 13 days ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 3

joeltg/fp16

Half-precision 16-bit floating point numbers

Language: TypeScript - Size: 476 KB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

artyom-beilis/float16

Half-float library for C and for Z80

Language: C - Size: 22.5 KB - Last synced at: 6 months ago - Pushed at: over 5 years ago - Stars: 34 - Forks: 7

oleks/binary16

Emulating binary, half-precision IEEE-754 (2008) floats

Language: C - Size: 29.3 KB - Last synced at: 5 months ago - Pushed at: over 8 years ago - Stars: 2 - Forks: 0

yowidin/fast-half-float

Fast half-precision floating-point operations for C++

Language: C++ - Size: 8.79 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bindog/pytorch-model-parallel

A memory-balanced and communication-efficient model-parallel implementation of a FullyConnected layer with CrossEntropyLoss in PyTorch

Language: Python - Size: 85 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 74 - Forks: 20

nitronoid/floatingPoint

Language: C++ - Size: 29.3 KB - Last synced at: almost 2 years ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 0

jamesalbert/halfprec

Half-precision assembly interface for C

Language: Assembly - Size: 9.77 KB - Last synced at: 6 months ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 0

hma02/cublasHgemm-P100

Code for testing native float16 matrix multiplication performance on Tesla P100 and V100 GPUs using cublasHgemm

Language: Cuda - Size: 18.6 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 35 - Forks: 11

steven-varga/h5cpp

C++17 templates between [stl::vector | armadillo | eigen3 | ublas | blitz++] and HDF5 datasets

Language: C++ - Size: 21.9 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 139 - Forks: 32

fengwang/float16_t (fork of acgessler/half_float)

C++20 implementation of a 16-bit floating-point type mimicking most of the IEEE 754 behavior. Single file and header-only.

Language: C++ - Size: 204 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 19 - Forks: 5

Related Keywords

half-precision (35), floating-point (16), float16 (10), ieee754 (9), math (7), float (6), javascript (5), fp16 (5), binary16 (5), bfloat16 (5), half (5), cuda (5), const (4), mathematics (4), stdlib (4), gpu (4), nodejs (4), node-js (4), node (4), matrix (3), cpp20 (3), arithmetic (3), gemm (3), 16bit (3), constant (3), epsilon (2), eps (2), real (2), matlab (2), c (2), library (2), z80 (2), zx-spectrum (2), rounding (2), 16-bit (2), constexpr (2), double-precision (2), assembly (2), cplusplus (2), cpp (2), arbitrary-precision (2), typescript (2), mixed-precision (2), stochastic-rounding (1), carthage (1), bfloat (1), cocoapods (1), header-only (1), hacktoberfest (1), ublas (1), ios (1), macos (1), swift (1), swiftpm (1), tvos (1), watchos (1), stl (1), serialization-library (1), delphi (1), sizeof (1), size-of (1), neslib (1), size (1), nbytes (1), random-number-generators (1), bytes (1), minifloat (1), bits (1), quantization (1), mips-instructions (1), android (1), caffe (1), opencl (1), hdf5-library (1), hdf5-datasets (1), doge (1), nasm (1), hdf5 (1), cublas (1), p100 (1), precision (1), v100 (1), h5cpp (1), armadillo (1), blaze (1), blitz (1), cpp17 (1), dataset (1), dlib (1), eigen3 (1), z80asm (1), persistence (1), distributed-training (1), model-parallel (1), mpi (1), itpp (1), pytorch (1), re-id (1), 32bit (1), 48bit (1)