Topic: "ieee754"
tc39/proposal-decimal
Built-in exact decimal numbers for JavaScript
Language: HTML - Size: 1.17 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 535 - Forks: 18

libcg/bfp
Beyond Floating Point - Posit C/C++ implementation
Language: C++ - Size: 85 KB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 289 - Forks: 25

VoidStarKat/half-rs
Half-precision floating point types f16 and bf16 for Rust.
Language: Rust - Size: 622 KB - Last synced at: 1 day ago - Pushed at: 13 days ago - Stars: 248 - Forks: 59

Kimbatt/soft-float-starter-pack
Software implementation of floating point numbers and operations
Language: C# - Size: 34.2 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 135 - Forks: 24

feross/ieee754
Read/write IEEE754 floating point numbers from/to a Buffer or array-like object.
Language: JavaScript - Size: 47.9 KB - Last synced at: 13 days ago - Pushed at: over 3 years ago - Stars: 121 - Forks: 37

petamoriken/float16
Stage 3 IEEE 754 half-precision floating-point ponyfill
Language: JavaScript - Size: 9.61 MB - Last synced at: 12 days ago - Pushed at: 19 days ago - Stars: 100 - Forks: 7

x448/float16
float16 provides IEEE 754 half-precision format (binary16) with correct conversions to/from float32
Language: Go - Size: 178 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 73 - Forks: 8

hukenovs/fp23fftk
Floating point Forward/Inverse Fast Fourier Transform (FFT) IP-core for newest Xilinx FPGAs (Source lang. - VHDL).
Language: VHDL - Size: 1.27 MB - Last synced at: 18 days ago - Pushed at: almost 3 years ago - Stars: 58 - Forks: 18

LiraNuna/soft-ieee754
Software implementation of IEEE 754 floating-point numbers of any size
Language: C++ - Size: 38.1 KB - Last synced at: 2 days ago - Pushed at: over 4 years ago - Stars: 52 - Forks: 11

eruffaldi/cppPosit 📦
C++ posit implementation
Language: C++ - Size: 2.99 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 1

KarenUllrich/pytorch-binary-converter
Turning float tensors to binary tensors according to IEEE-754 standard.
Language: Python - Size: 29.3 KB - Last synced at: 19 days ago - Pushed at: almost 6 years ago - Stars: 38 - Forks: 10

canbula/NumericalAnalysis
Repository for Numerical Analysis course given by Assoc. Prof. Dr. Bora Canbula at Computer Engineering Department of Manisa Celal Bayar University.
Language: Python - Size: 19.7 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 28 - Forks: 70

huonw/ieee754
Low-level manipulations of IEEE754 floating-point numbers.
Language: Rust - Size: 639 KB - Last synced at: 9 days ago - Pushed at: almost 2 years ago - Stars: 28 - Forks: 4

canbula/ieee754
Python module which finds the IEEE-754 representation of a floating point number.
Language: Python - Size: 85.9 KB - Last synced at: 26 days ago - Pushed at: about 1 year ago - Stars: 27 - Forks: 5

shibatch/tlfloat
C++ template library for floating point operations
Language: C++ - Size: 674 KB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 26 - Forks: 2

kkimdev/ieee754-types
Single header file C++ library that provides IEEE 754 floating point types.
Language: C++ - Size: 35.2 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 23 - Forks: 6

LeventErkok/crackNum
Convert to/from IEEE-754 HP/SP/DP formats
Language: Haskell - Size: 141 KB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 21 - Forks: 8

Daniel-Abrecht/IEEE754_binary_encoder
A C library for converting float and double values to binary
Language: C - Size: 1.95 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 9

FirebirdSQL/decimal-java
Java library to encode and decode IEEE-754 decimals
Language: Java - Size: 842 KB - Last synced at: 14 days ago - Pushed at: 7 months ago - Stars: 12 - Forks: 3

lordazzi/calc-js
Handle JavaScript operations, avoiding the native problems of the language
Language: TypeScript - Size: 641 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 4

denishoornaert/Chisel3-Float-Type
Chisel3 implementation of IEEE-754 compliant floating point data type (logic & representation)
Language: Scala - Size: 845 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 9 - Forks: 2

hukenovs/fp32_logic
Floating point FP32 core HDL for Xilinx FPGAs. Includes base converters and some math functions.
Language: VHDL - Size: 26.4 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 9 - Forks: 3

mgriebling/MGDecimal
IEEE Decimal arithmetic 128-, 64-, and 32-bit types built entirely in Swift.
Language: Swift - Size: 5.58 MB - Last synced at: 8 days ago - Pushed at: 8 months ago - Stars: 8 - Forks: 0

sebastianlipponer/zorder_knn
Floating point morton order comparison operator.
Language: C++ - Size: 27.3 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 8 - Forks: 2

thunderpoot/pointy
Interactive IEEE 754 floating point calculator/visualiser written in Perl
Language: Perl - Size: 1.6 MB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 0

smasher164/fma
Software implementation of fused multiply-add (FMA) for 64-bit floats
Language: Go - Size: 38.1 KB - Last synced at: 2 days ago - Pushed at: over 4 years ago - Stars: 8 - Forks: 0

justjavac/IEEE-754 Fork of cvickery/IEEE-754
Online tool: analyze IEEE-754 floating-point numbers
Language: JavaScript - Size: 225 KB - Last synced at: 14 days ago - Pushed at: almost 12 years ago - Stars: 8 - Forks: 5

RobTillaart/IEEE754tools
Arduino library to manipulate IEEE754 float numbers fast. (experimental)
Language: C - Size: 25.4 KB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 0

AlexSp3/Basenumber.js
A BigDecimal library for arbitrary precision that allows you to work with numbers in different bases from 2 to 36.
Language: JavaScript - Size: 784 KB - Last synced at: 11 months ago - Pushed at: about 3 years ago - Stars: 7 - Forks: 0

shahsaumya00/Floating-Point-Adder
32 bit pipelined binary floating point adder using IEEE-754 Single Precision Format in Verilog
Language: Verilog - Size: 30.3 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 3

optframe/kahan-float
Kahan Floating-Point (C++ implementation)
Language: C++ - Size: 336 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 1

dleged/number-operator
🧮 Perform correct +, -, *, / operations on the front end
Language: JavaScript - Size: 1.95 KB - Last synced at: 9 months ago - Pushed at: about 5 years ago - Stars: 6 - Forks: 0

TheThirdOne/JSoftFloat
An implementation of the IEEE 754-2008 standard
Language: Java - Size: 56.6 KB - Last synced at: 15 days ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 3

LeventErkok/FloatingHex
Hexadecimal Floats for Haskell
Language: Haskell - Size: 15.6 KB - Last synced at: 12 days ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 1

QuantumGorilla/Numeric-methods-for-engineers
Numerical methods for engineers: finding roots, solving matrices, finding functions from given values, evaluating integrals whose analytical solution is exhausting, and approximating solutions of differential equations.
Language: MATLAB - Size: 20.5 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 2

crookseta/missing-values
.NET 8 generic-math-compatible mathematics library.
Language: C# - Size: 3.61 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 1

verificarlo/significantdigits
Solid statistical analysis of Stochastic Arithmetic.
Language: Python - Size: 73.2 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 2

bigbit/bigbitjs
Node.js implementation of the BigBit standard with an online conversion and comparison tool
Language: JavaScript - Size: 2.07 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 1

jenska/float
80-bit IEEE 754 extended double precision floating-point library for Go
Language: Go - Size: 42 KB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 1

12NaN/MIPS-Assembly-Course-Projects
Projects that were done for my CS14 (Assembly language) course that used the MIPS assembly language.
Language: Assembly - Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 5

patrickjmcd/ieee754-binary16-modbus-plc
Converts a modbus WORD into a Binary16 REAL value for Micro800 PLCs
Size: 4.88 KB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 4 - Forks: 0

stdlib-js/constants-float64-ln-two-pi
Natural logarithm of 2π.
Language: JavaScript - Size: 359 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 3 - Forks: 0

stdlib-js/math-base-special-ldexp
Multiply a double-precision floating-point number by an integer power of two.
Language: C - Size: 908 KB - Last synced at: 9 days ago - Pushed at: 14 days ago - Stars: 3 - Forks: 0

stdlib-js/constants-float32-max
Maximum single-precision floating-point number.
Language: JavaScript - Size: 352 KB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 1

stdlib-js/constants-float32-smallest-subnormal
Smallest positive single-precision floating-point subnormal number.
Language: JavaScript - Size: 333 KB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

sbaldzenka/dec_to_ieee754_converter
Console program for converting decimal numbers to IEEE 754 float32 format.
Language: C - Size: 14.6 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 3 - Forks: 0

IsaMorphic/QuadrupleLib
QuadrupleLib is a modern implementation of the IEEE 754 binary128 floating point number type for .NET 7 and above based on the UInt128 built-in.
Language: C# - Size: 32.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

Veloctor/IEEE754Inspector
IEEE754 binary visualizer
Language: C# - Size: 86.9 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

chakravala/Wilkinson.jl
Toolkit for studying numerical analysis and floating point algebra round-off
Language: Julia - Size: 29.3 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 1

OlimilO1402/IEEE754_Infinity
Definition of infinity in floating point standard
Language: VBA - Size: 58.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

b-sullender/CAPI
General C/C++ programming library with data algorithms and CPU based image processing
Language: C - Size: 6.34 MB - Last synced at: 19 days ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 0

JeffreySarnoff/IEEEFloats.jl
Standard conformant constants for Float64, Float32, Float16
Language: Julia - Size: 127 KB - Last synced at: 16 days ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

stdlib-js/constants-float16-max-safe-integer
Maximum safe half-precision floating-point integer.
Language: JavaScript - Size: 323 KB - Last synced at: about 12 hours ago - Pushed at: about 14 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-two
Square root of 2.
Language: JavaScript - Size: 319 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-pi
Square root of π.
Language: JavaScript - Size: 320 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-half
Square root of 1/2.
Language: JavaScript - Size: 354 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-eps
Square root of double-precision floating-point epsilon.
Language: JavaScript - Size: 326 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-half-pi
Square root of 0.5π.
Language: JavaScript - Size: 318 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-min-base2-exponent-subnormal
The minimum biased base 2 exponent for a subnormal double-precision floating-point number.
Language: JavaScript - Size: 339 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-min-base10-exponent
The minimum base 10 exponent for a normal double-precision floating-point number.
Language: JavaScript - Size: 327 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-smallest-normal
Smallest positive double-precision floating-point normal number.
Language: JavaScript - Size: 387 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-max-base2-exponent
The maximum biased base 2 exponent for a double-precision floating-point number.
Language: JavaScript - Size: 342 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-precision
Effective number of bits in the significand of a double-precision floating-point number.
Language: JavaScript - Size: 308 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-ln-two
Natural logarithm of 2.
Language: JavaScript - Size: 373 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-log2-e
Base 2 logarithm of Euler's number.
Language: JavaScript - Size: 365 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-ninf
Double-precision floating-point negative infinity.
Language: JavaScript - Size: 392 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-ln-sqrt-two-pi
Natural logarithm of the square root of 2π.
Language: JavaScript - Size: 342 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-half-pi
1/2 times π.
Language: JavaScript - Size: 315 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-max-safe-integer
Maximum safe double-precision floating-point integer.
Language: JavaScript - Size: 362 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-max-base10-exponent-subnormal
The maximum base 10 exponent for a subnormal double-precision floating-point number.
Language: JavaScript - Size: 296 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-fourth-pi
1/4 times π.
Language: JavaScript - Size: 304 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-e
Euler's number.
Language: JavaScript - Size: 297 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 2 - Forks: 0

Jiuso1/easyP754
IEEE P754 representation calculator.
Language: Java - Size: 18.2 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-flipsignf
Return a single-precision floating-point number with the magnitude of x and the sign of x*y.
Language: Python - Size: 737 KB - Last synced at: 9 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-flipsign
Return a double-precision floating-point number with the magnitude of x and the sign of x*y.
Language: Python - Size: 1.06 MB - Last synced at: 9 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-copysignf
Return a single-precision floating-point number with the magnitude of x and the sign of y.
Language: Python - Size: 758 KB - Last synced at: 9 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-copysign
Return a double-precision floating-point number with the magnitude of x and the sign of y.
Language: Python - Size: 1.08 MB - Last synced at: 9 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-frexp
Split a double-precision floating-point number into a normalized fraction and an integer power of two.
Language: JavaScript - Size: 809 KB - Last synced at: 9 days ago - Pushed at: 14 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-apery
Apéry's constant.
Language: JavaScript - Size: 347 KB - Last synced at: 8 days ago - Pushed at: 15 days ago - Stars: 2 - Forks: 0

stdlib-js/assert-is-complex64array
Test if a value is a Complex64Array.
Language: JavaScript - Size: 1.13 MB - Last synced at: 9 days ago - Pushed at: 15 days ago - Stars: 2 - Forks: 0

stdlib-js/number-float32
Utilities for single-precision floating-point numbers.
Language: JavaScript - Size: 1.46 MB - Last synced at: 9 days ago - Pushed at: 15 days ago - Stars: 2 - Forks: 0

stdlib-js/number-float32-base
Base utilities for single-precision floating-point numbers.
Language: JavaScript - Size: 1.53 MB - Last synced at: 9 days ago - Pushed at: 15 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-utils-float64-epsilon-difference
Compute the relative difference of two real numbers in units of double-precision floating-point epsilon.
Language: JavaScript - Size: 776 KB - Last synced at: 9 days ago - Pushed at: 22 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-two-pi
2π.
Language: JavaScript - Size: 319 KB - Last synced at: 1 day ago - Pushed at: 22 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-cbrt-eps
Cube root of single-precision floating-point epsilon.
Language: JavaScript - Size: 538 KB - Last synced at: 9 days ago - Pushed at: 22 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-modf
Decompose a double-precision floating-point number into integral and fractional parts.
Language: C - Size: 852 KB - Last synced at: 9 days ago - Pushed at: 22 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-max
Maximum half-precision floating-point number.
Language: JavaScript - Size: 320 KB - Last synced at: 8 days ago - Pushed at: 29 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-assert-is-infinite
Test if a double-precision floating-point numeric value is infinite.
Language: Python - Size: 397 KB - Last synced at: 8 days ago - Pushed at: 29 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-cbrt-eps
Cube root of double-precision floating-point epsilon.
Language: JavaScript - Size: 342 KB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/math-base-assert-is-positive-zerof
Test if a single-precision floating-point numeric value is positive zero.
Language: Python - Size: 542 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/math-base-assert-is-nanf
Test if a single-precision floating-point numeric value is NaN.
Language: Python - Size: 355 KB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-ln-pi
Natural logarithm of π.
Language: JavaScript - Size: 372 KB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-eps
Difference between one and the smallest value greater than one that can be represented as a half-precision floating-point number.
Language: JavaScript - Size: 366 KB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-sqrt-eps
Square root of single-precision floating-point epsilon.
Language: JavaScript - Size: 526 KB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/number-float32-base-normalize
Return a normal number `y` and exponent `exp` satisfying `x = y * 2^exp`.
Language: JavaScript - Size: 696 KB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-exponent-bias
The bias of a single-precision floating-point number's exponent.
Language: JavaScript - Size: 324 KB - Last synced at: 10 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-exponent-bias
The bias of a half-precision floating-point number's exponent.
Language: JavaScript - Size: 312 KB - Last synced at: 8 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/number-float64-base-normalize
Return a normal number `y` and exponent `exp` satisfying `x = y * 2^exp`.
Language: JavaScript - Size: 561 KB - Last synced at: 8 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-high-word-abs-mask
High word mask for excluding the sign bit of a double-precision floating-point number.
Language: JavaScript - Size: 286 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-exponent-mask
Mask for the exponent of a single-precision floating-point number.
Language: JavaScript - Size: 314 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0
