Topic: "ieee754"
tc39/proposal-decimal
Built-in exact decimal numbers for JavaScript
Language: HTML - Size: 1.42 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 549 - Forks: 19

libcg/bfp
Beyond Floating Point - Posit C/C++ implementation
Language: C++ - Size: 85 KB - Last synced at: 30 days ago - Pushed at: about 1 year ago - Stars: 293 - Forks: 28

VoidStarKat/half-rs
Half-precision floating point types f16 and bf16 for Rust.
Language: Rust - Size: 490 KB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 256 - Forks: 62

Kimbatt/soft-float-starter-pack
Software implementation of floating point numbers and operations
Language: C# - Size: 34.2 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 135 - Forks: 24

feross/ieee754
Read/write IEEE754 floating point numbers from/to a Buffer or array-like object.
Language: JavaScript - Size: 47.9 KB - Last synced at: 8 days ago - Pushed at: almost 4 years ago - Stars: 121 - Forks: 37

petamoriken/float16
Stage 3 IEEE 754 half-precision floating-point ponyfill
Language: JavaScript - Size: 10.3 MB - Last synced at: about 22 hours ago - Pushed at: about 22 hours ago - Stars: 101 - Forks: 8

x448/float16
float16 provides IEEE 754 half-precision format (binary16) with correct conversions to/from float32
Language: Go - Size: 190 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 75 - Forks: 8

hukenovs/fp23fftk
Floating-point forward/inverse Fast Fourier Transform (FFT) IP core for the newest Xilinx FPGAs (source language: VHDL).
Language: VHDL - Size: 1.27 MB - Last synced at: 23 days ago - Pushed at: almost 3 years ago - Stars: 58 - Forks: 18

LiraNuna/soft-ieee754
Software implementation of IEEE 754 floating-point numbers of any size
Language: C++ - Size: 38.1 KB - Last synced at: 10 days ago - Pushed at: over 4 years ago - Stars: 53 - Forks: 11

eruffaldi/cppPosit 📦
C++ posit implementation
Language: C++ - Size: 2.99 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 42 - Forks: 1

KarenUllrich/pytorch-binary-converter
Converts float tensors to binary tensors according to the IEEE-754 standard.
Language: Python - Size: 29.3 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 38 - Forks: 10

canbula/NumericalAnalysis
Repository for the Numerical Analysis course given by Assoc. Prof. Dr. Bora Canbula at the Computer Engineering Department of Manisa Celal Bayar University.
Language: Python - Size: 19.7 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 28 - Forks: 70

huonw/ieee754
Low-level manipulations of IEEE754 floating-point numbers.
Language: Rust - Size: 639 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 28 - Forks: 4

canbula/ieee754
Python module which finds the IEEE-754 representation of a floating point number.
Language: Python - Size: 85.9 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 5

shibatch/tlfloat
C++ template library for floating point operations
Language: C++ - Size: 674 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 26 - Forks: 2

kkimdev/ieee754-types
Single header file C++ library that provides IEEE 754 floating point types.
Language: C++ - Size: 35.2 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 23 - Forks: 6

LeventErkok/crackNum
Convert to/from IEEE-754 HP/SP/DP formats
Language: Haskell - Size: 141 KB - Last synced at: 18 days ago - Pushed at: 7 months ago - Stars: 21 - Forks: 8

Daniel-Abrecht/IEEE754_binary_encoder
A C library for converting float and double values to binary
Language: C - Size: 1.95 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 9

FirebirdSQL/decimal-java
Java library to encode and decode IEEE-754 decimals
Language: Java - Size: 892 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 12 - Forks: 3

lordazzi/calc-js
Handle JavaScript operations, avoiding the native problems of the language
Language: TypeScript - Size: 700 KB - Last synced at: 29 days ago - Pushed at: 9 months ago - Stars: 10 - Forks: 4

denishoornaert/Chisel3-Float-Type
Chisel3 implementation of IEEE-754 compliant floating point data type (logic & representation)
Language: Scala - Size: 845 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 9 - Forks: 2

hukenovs/fp32_logic
Floating-point FP32 core in HDL for Xilinx FPGAs. Includes base converters and some math functions.
Language: VHDL - Size: 26.4 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 9 - Forks: 3

mgriebling/MGDecimal
IEEE Decimal arithmetic 128-, 64-, and 32-bit types built entirely in Swift.
Language: Swift - Size: 5.58 MB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 8 - Forks: 0

sebastianlipponer/zorder_knn
Floating point morton order comparison operator.
Language: C++ - Size: 27.3 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 2

thunderpoot/pointy
Interactive IEEE 754 floating point calculator/visualiser written in Perl
Language: Perl - Size: 1.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 8 - Forks: 0

smasher164/fma
Software implementation of fused multiply-add (FMA) for 64-bit floats
Language: Go - Size: 38.1 KB - Last synced at: 2 days ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 0

justjavac/IEEE-754 (fork of cvickery/IEEE-754)
Online tool: analyze IEEE-754 floating-point numbers
Language: JavaScript - Size: 225 KB - Last synced at: 2 months ago - Pushed at: about 12 years ago - Stars: 8 - Forks: 5

RobTillaart/IEEE754tools
Arduino library to manipulate IEEE754 float numbers fast. (experimental)
Language: C - Size: 25.4 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 0

AlexSp3/Basenumber.js
A BigDecimal library for arbitrary precision that allows you to work with numbers in different bases from 2 to 36.
Language: JavaScript - Size: 784 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 0

shahsaumya00/Floating-Point-Adder
32 bit pipelined binary floating point adder using IEEE-754 Single Precision Format in Verilog
Language: Verilog - Size: 30.3 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 3

optframe/kahan-float
Kahan Floating-Point (C++ implementation)
Language: C++ - Size: 336 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 1

dleged/number-operator
🧮 Perform correct +, -, *, / operations on the front end
Language: JavaScript - Size: 1.95 KB - Last synced at: 10 months ago - Pushed at: about 5 years ago - Stars: 6 - Forks: 0

TheThirdOne/JSoftFloat
An implementation of the IEEE 754-2008 standard
Language: Java - Size: 56.6 KB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 3

LeventErkok/FloatingHex
Hexadecimal Floats for Haskell
Language: Haskell - Size: 15.6 KB - Last synced at: about 2 months ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 1

QuantumGorilla/Numeric-methods-for-engineers
Numerical methods for engineers: finding roots, solving matrix equations, finding functions from given values, evaluating integrals whose analytical solution is laborious, and approximating solutions of differential equations.
Language: MATLAB - Size: 20.5 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 2

crookseta/missing-values
.NET 8 mathematics library compatible with generic math.
Language: C# - Size: 3.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 1

verificarlo/significantdigits
Solid statistical analysis of Stochastic Arithmetic.
Language: Python - Size: 73.2 KB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 2

bigbit/bigbitjs
Node.js implementation of the BigBit standard with an online conversion and comparison tool
Language: JavaScript - Size: 2.07 MB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

jenska/float
80-bit IEEE 754 extended double precision floating-point library for Go
Language: Go - Size: 42 KB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 1

12NaN/MIPS-Assembly-Course-Projects
Projects that were done for my CS14 (Assembly language) course that used the MIPS assembly language.
Language: Assembly - Size: 8.79 KB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 5

patrickjmcd/ieee754-binary16-modbus-plc
Converts a Modbus WORD into a Binary16 REAL value for Micro800 PLCs
Size: 4.88 KB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 4 - Forks: 0

stdlib-js/constants-float32-max
Maximum single-precision floating-point number.
Language: JavaScript - Size: 363 KB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 3 - Forks: 1

stdlib-js/constants-float64-ln-two-pi
Natural logarithm of 2π.
Language: JavaScript - Size: 343 KB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 3 - Forks: 0

Jiuso1/easyP754
IEEE P754 representation calculator.
Language: Java - Size: 18.3 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

stdlib-js/math-base-special-ldexp
Multiply a double-precision floating-point number by an integer power of two.
Language: C - Size: 908 KB - Last synced at: 20 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

stdlib-js/constants-float32-smallest-subnormal
Smallest positive single-precision floating-point subnormal number.
Language: JavaScript - Size: 333 KB - Last synced at: 26 days ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

sbaldzenka/dec_to_ieee754_converter
Console program for converting decimal numbers to the IEEE 754 float32 format.
Language: C - Size: 14.6 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 0

IsaMorphic/QuadrupleLib
QuadrupleLib is a modern implementation of the IEEE 754 binary128 floating point number type for .NET 7 and above based on the UInt128 built-in.
Language: C# - Size: 32.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

Veloctor/IEEE754Inspector
IEEE754 binary visualizer
Language: C# - Size: 86.9 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

chakravala/Wilkinson.jl
Toolkit for studying numerical analysis and floating point algebra round-off
Language: Julia - Size: 29.3 KB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

OlimilO1402/IEEE754_Infinity
Definition of infinity in floating point standard
Language: VBA - Size: 58.6 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

b-sullender/CAPI
General C/C++ programming library with data algorithms and CPU based image processing
Language: C - Size: 6.34 MB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

JeffreySarnoff/IEEEFloats.jl
Standard conformant constants for Float64, Float32, Float16
Language: Julia - Size: 127 KB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

sigurd4/custom_float
Customizable floating point types, with all standard floating point operations implemented from scratch.
Language: Rust - Size: 75 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-flipsign
Return a double-precision floating-point number with the magnitude of x and the sign of x*y.
Language: Python - Size: 1.07 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16
Half-precision floating-point mathematical constants.
Language: JavaScript - Size: 629 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-max-safe-integer
Maximum safe double-precision floating-point integer.
Language: JavaScript - Size: 344 KB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-frexp
Split a double-precision floating-point number into a normalized fraction and an integer power of two.
Language: JavaScript - Size: 815 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-modf
Decompose a double-precision floating-point number into integral and fractional parts.
Language: JavaScript - Size: 856 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-assert-is-positive-zerof
Test if a single-precision floating-point numeric value is positive zero.
Language: Python - Size: 565 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-cbrt-eps
Cube root of double-precision floating-point epsilon.
Language: JavaScript - Size: 365 KB - Last synced at: 6 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

stdlib-js/number-float64-base-normalize
Return a normal number `y` and exponent `exp` satisfying `x = y * 2^exp`.
Language: JavaScript - Size: 566 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-half
Square root of 1/2.
Language: JavaScript - Size: 356 KB - Last synced at: 1 day ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-half-pi
Square root of 0.5π.
Language: JavaScript - Size: 318 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-min-safe-integer
Minimum safe single-precision floating-point integer.
Language: JavaScript - Size: 318 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-ln-sqrt-two-pi
Natural logarithm of the square root of 2π.
Language: JavaScript - Size: 348 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-eps
Difference between one and the smallest value greater than one that can be represented as a half-precision floating-point number.
Language: JavaScript - Size: 343 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-ln-pi
Natural logarithm of π.
Language: JavaScript - Size: 354 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-exponent-bias
The bias of a single-precision floating-point number's exponent.
Language: JavaScript - Size: 344 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-eps
Square root of double-precision floating-point epsilon.
Language: JavaScript - Size: 328 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-fourth-pi
1/4 times π.
Language: JavaScript - Size: 307 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-two-pi
2π.
Language: JavaScript - Size: 323 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 0

stdlib-js/number-float32
Utilities for single-precision floating-point numbers.
Language: JavaScript - Size: 1.46 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-min-base10-exponent
The minimum base 10 exponent for a normal double-precision floating-point number.
Language: JavaScript - Size: 333 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 2 - Forks: 0

stdlib-js/math-base-special-copysignf
Return a single-precision floating-point number with the magnitude of x and the sign of y.
Language: Python - Size: 758 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/number-float32-base
Base utilities for single-precision floating-point numbers.
Language: JavaScript - Size: 1.54 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-max-safe-integer
Maximum safe half-precision floating-point integer.
Language: JavaScript - Size: 332 KB - Last synced at: 8 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-sqrt-eps
Square root of half-precision floating-point epsilon.
Language: JavaScript - Size: 325 KB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-precision
Effective number of bits in the significand of a half-precision floating-point number.
Language: JavaScript - Size: 318 KB - Last synced at: 24 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-min-safe-integer
Minimum safe half-precision floating-point integer.
Language: JavaScript - Size: 354 KB - Last synced at: 12 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-cbrt-eps
Cube root of single-precision floating-point epsilon.
Language: JavaScript - Size: 544 KB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-exponent-bias
The bias of a half-precision floating-point number's exponent.
Language: JavaScript - Size: 319 KB - Last synced at: 14 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-eps
Difference between one and the smallest value greater than one that can be represented as a single-precision floating-point number.
Language: JavaScript - Size: 529 KB - Last synced at: 4 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float32-max-safe-integer
Maximum safe single-precision floating-point integer.
Language: JavaScript - Size: 301 KB - Last synced at: 8 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-num-bytes
Size (in bytes) of a half-precision floating-point number.
Language: JavaScript - Size: 313 KB - Last synced at: 12 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float16-max
Maximum half-precision floating-point number.
Language: JavaScript - Size: 330 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-two
Square root of 2.
Language: JavaScript - Size: 319 KB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-sqrt-pi
Square root of π.
Language: JavaScript - Size: 320 KB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-min-base2-exponent-subnormal
The minimum biased base 2 exponent for a subnormal double-precision floating-point number.
Language: JavaScript - Size: 346 KB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-smallest-normal
Smallest positive double-precision floating-point normal number.
Language: JavaScript - Size: 396 KB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-max-base2-exponent
The maximum biased base 2 exponent for a double-precision floating-point number.
Language: JavaScript - Size: 342 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-precision
Effective number of bits in the significand of a double-precision floating-point number.
Language: JavaScript - Size: 317 KB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-ln-two
Natural logarithm of 2.
Language: JavaScript - Size: 379 KB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-log2-e
Base 2 logarithm of Euler's number.
Language: JavaScript - Size: 341 KB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-ninf
Double-precision floating-point negative infinity.
Language: JavaScript - Size: 396 KB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-half-pi
1/2 times π.
Language: JavaScript - Size: 322 KB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-max-base10-exponent-subnormal
The maximum base 10 exponent for a subnormal double-precision floating-point number.
Language: JavaScript - Size: 306 KB - Last synced at: 12 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-e
Euler's number.
Language: JavaScript - Size: 297 KB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-float64-apery
Apéry's constant.
Language: JavaScript - Size: 356 KB - Last synced at: 21 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

stdlib-js/math-base-assert-is-positive-zero
Test if a double-precision floating-point numeric value is positive zero.
Language: Python - Size: 453 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0
