GitHub topics: xla
flaport/sax
S + Autograd + XLA :: S-parameter based frequency domain circuit simulations and optimizations using JAX.
Language: Python - Size: 72.7 MB - Last synced at: about 7 hours ago - Pushed at: 19 days ago - Stars: 94 - Forks: 24

LuxDL/Lux.jl
Elegant and Performant Deep Learning
Language: Julia - Size: 294 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 609 - Forks: 74

wcxve/xspex
JAX interface for XSPEC spectral models.
Language: Python - Size: 288 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 4 - Forks: 1

gomlx/gomlx
GoMLX: An Accelerated Machine Learning Framework For Go
Language: Go - Size: 299 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 871 - Forks: 38

zml/zml
Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild
Language: Zig - Size: 3.72 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2,520 - Forks: 95

n2cholas/awesome-jax
JAX - A curated list of resources https://github.com/google/jax
Size: 258 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,911 - Forks: 148

cdelv/JaxDEM
Easy to use and blazing fast JAX-based library for high-performance 2D/3D Discrete Element Method (DEM) simulations.
Language: Python - Size: 2.59 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 3 - Forks: 0

elixir-nx/nx
Multi-dimensional arrays (tensors) and numerical definitions for Elixir
Language: Elixir - Size: 7.02 MB - Last synced at: 6 days ago - Pushed at: 16 days ago - Stars: 2,804 - Forks: 211

pytorch/xla
Enabling PyTorch on XLA Devices (e.g. Google TPU)
Language: Python - Size: 90.6 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2,661 - Forks: 563

felafax/felafax
Felafax is building AI infra for non-NVIDIA GPUs
Language: Jupyter Notebook - Size: 3.36 MB - Last synced at: 7 days ago - Pushed at: 8 months ago - Stars: 567 - Forks: 38

mpi4jax/mpi4jax
Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python ⚡
Language: Python - Size: 5.08 MB - Last synced at: 3 days ago - Pushed at: 17 days ago - Stars: 491 - Forks: 32

gordicaleksa/get-started-with-JAX
The purpose of this repo is to make it easy to get started with JAX, Flax, and Haiku. It contains my "Machine Learning with JAX" series of tutorials (YouTube videos and Jupyter Notebooks) as well as the content I found useful while learning about the JAX ecosystem.
Language: Jupyter Notebook - Size: 1.78 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 748 - Forks: 113

r-xla/stablehlo
Create StableHLO programs in R
Language: R - Size: 1.59 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 3 - Forks: 1

r-xla/pjrt
R Interface to PJRT
Language: C++ - Size: 1.63 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 0

bahremsd/tmmax
A fast transfer matrix method written in JAX for modelling optical multilayer thin films
Language: Jupyter Notebook - Size: 28.1 MB - Last synced at: 2 days ago - Pushed at: 20 days ago - Stars: 20 - Forks: 4

bahremsd/katmer
katmer is a powerful library for optimizing the design of optical thin films using automatic differentiation via JAX and Equinox, enabling efficient and accurate inverse design solutions.
Language: Python - Size: 785 KB - Last synced at: 4 days ago - Pushed at: 20 days ago - Stars: 3 - Forks: 0

HomebrewML/revlib
Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload
Language: Python - Size: 131 KB - Last synced at: 10 days ago - Pushed at: about 3 years ago - Stars: 129 - Forks: 6

DifferentiableUniverseInitiative/jaxDecomp
JAX bindings for the NVIDIA cuDecomp library
Language: Python - Size: 60 MB - Last synced at: 19 days ago - Pushed at: 2 months ago - Stars: 38 - Forks: 1

AlibabaPAI/torchacc
PyTorch distributed training acceleration framework
Language: Python - Size: 33 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 51 - Forks: 9

nv-legate/multimesh-jax
PjRt plugin and Python APIs for MPMD workflows in JAX
Language: C++ - Size: 35.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 0

nv-legate/zuku
Event-based runtime for gang-scheduling multi-GPU operations across sharded arrays
Language: C++ - Size: 97.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

MoFHeka/xla-launcher
XLA Launcher is a high-performance, lightweight C++ library designed to provide a simple interface for loading and executing computation graphs represented in the StableHLO format.
Language: C++ - Size: 138 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 1

paolodelia99/tf-quant-finance Fork of google/tf-quant-finance
High-performance TensorFlow library for quantitative finance.
Language: Python - Size: 19.1 MB - Last synced at: 20 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

JuliaGPU/XLA.jl 📦
Julia on TPUs
Language: Julia - Size: 7.94 MB - Last synced at: about 15 hours ago - Pushed at: over 4 years ago - Stars: 222 - Forks: 19

0xSooki/permanent-boost
A high performance Python permanent calculator implemented in C++
Language: Jupyter Notebook - Size: 406 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

petrkozorezov/elixir-nx-livebook-devenv-example
Example of using Elixir/Nx with the CUDA backend in Livebook in a Devenv environment
Language: Nix - Size: 4.88 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

nv-legate/multimesh-jax-workflows
Language: Python - Size: 438 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

0xSooki/extending-jax
JAX Custom Operations with C++ and CUDA (using Pybind11)
Language: Python - Size: 19.5 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

kamalkraj/ALBERT-TF2.0
ALBERT model Pretraining and Fine Tuning using TF2.0
Language: Python - Size: 235 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 202 - Forks: 44

dfm/extending-jax
Extending JAX with custom C++ and CUDA code
Language: Python - Size: 125 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 394 - Forks: 23

bahremsd/tmmax-workshop
Workshop given in a graduate-level thin-film coatings course at ITU
Language: Jupyter Notebook - Size: 2.73 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

scala-network/scala-pool
Official scala pool repository
Language: JavaScript - Size: 1.92 MB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 24 - Forks: 13

sayakpaul/keras-xla-benchmarks
Presents comprehensive benchmarks of XLA-compatible pre-trained models in Keras.
Language: Jupyter Notebook - Size: 844 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 37 - Forks: 2

mzguntalan/neptune
[WIP] Neptune: JAX-interoperable library in Haskell.
Language: Haskell - Size: 85.9 KB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 9 - Forks: 0

walln/loadax
Dataloading for JAX
Language: Python - Size: 1.17 MB - Last synced at: 22 days ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

AlibabaPAI/FlashModels
Fast and easy distributed model training examples.
Language: Python - Size: 42.9 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 9 - Forks: 4

inoryy/tensorflow-optimized-wheels
TensorFlow wheels built for the latest CUDA/cuDNN with performance flags enabled: SSE, AVX, FMA, XLA
Size: 23.4 KB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 119 - Forks: 9

HuiResearch/tfbert
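Pre-trained model usage based on TensorFlow 1.x, with support for single-machine multi-GPU training, gradient accumulation, XLA acceleration, and mixed precision. Allows flexible training, validation, and prediction.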
Language: Python - Size: 7.45 MB - Last synced at: 3 months ago - Pushed at: about 4 years ago - Stars: 58 - Forks: 11

AndreiMoraru123/Super-Resolution
Modern Graph TensorFlow implementation of Super-Resolution GAN
Language: Python - Size: 54.7 KB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

onnx/onnx-xla 📦
XLA integration of Open Neural Network Exchange (ONNX)
Language: C++ - Size: 46.9 KB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 19 - Forks: 9

aklein4/MonArc
A practical method for training energy-based language models.
Language: Python - Size: 592 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

kmkolasinski/tensorflow-nanoGPT
Example of how to train GPT-2 (XLA + AMP), export to SavedModel, and serve with TensorFlow Serving
Language: Jupyter Notebook - Size: 602 KB - Last synced at: 22 days ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

googleinterns/paksha
Compiling JAX to WebAssembly for exploring client-side machine learning
Language: WebAssembly - Size: 836 KB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

sayakpaul/you-dont-know-tensorflow
Contains materials for my talk "You don't know TensorFlow".
Language: Jupyter Notebook - Size: 92.8 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 2

jhashekhar/multilingual-clf
Classification of a multilingual dataset using pre-trained models trained only on English training data. The model is trained on TPUs using PyTorch and the torch_xla library.
Language: Python - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

InikoPro/mineveruscoinonarm
Mine Verus Coin on ARM devices such as Pi boards, tablets, mobile phones, and others.
Size: 32.2 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

alessblaze/torch-xla Fork of pytorch/xla
Enabling PyTorch on XLA Devices (e.g. Google TPU)
Language: C++ - Size: 40.1 MB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sayakpaul/xla-benchmark-sd
Provides code to serialize the different models involved in Stable Diffusion as SavedModels and to compile them with XLA.
Language: Python - Size: 27.3 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 3

jhn-nt/data-snax
Versatile Data Ingestion Pipelines for Jax
Language: Python - Size: 569 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

VertexC/dl-infer-perf
Deep learning inference performance analysis
Language: Python - Size: 511 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1

scala-network/StellitePay-API
DEPRECATED ⛔️
Language: PHP - Size: 17.6 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 3

ReturnToFirst/FastTFWorkflow
Tutorial on how to make slow TensorFlow training faster
Language: Jupyter Notebook - Size: 128 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

nguyentruonglau/keras-classify
Optimal choice for 🛰 classification problem.
Language: Python - Size: 1.27 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

gottingen/tf-reading
TensorFlow code reading
Language: C++ - Size: 66.4 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

SalamanderXing/jax-gaussian-bayes
A Multivariate Gaussian Bayes classifier written using JAX
Language: Python - Size: 5.86 KB - Last synced at: 6 days ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

sseung0703/TF2-jit-compile-on-multi-gpu
TensorFlow 2 training code with JIT compilation on multiple GPUs.
Language: Python - Size: 157 KB - Last synced at: 4 months ago - Pushed at: over 4 years ago - Stars: 17 - Forks: 2

mugithi/google-terraform-pytorch-tpu
Automated provisioner of a Google Cloud TPU environment for training in PyTorch
Language: HCL - Size: 20.7 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 4 - Forks: 3

aleksander-haugas/XLArig-proxy Fork of xmrig/xmrig-proxy
Scala (XLA) Stratum protocol proxy with panthera (not official)
Language: C++ - Size: 2.6 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

aleksander-haugas/XLArig Fork of scala-network/XLArig
An XMRig fork with support for latest XLA PoW algorithms
Language: C++ - Size: 4.72 MB - Last synced at: 6 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

AkashSDas/cassava-leaf-disease-classification
Deep learning solution for Cassava Leaf Disease Classification, a Kaggle Research Code Competition, using TensorFlow.
Language: Jupyter Notebook - Size: 15.4 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1

hhhhhojeihsu/tensorflow_1.8_woboq
Woboq code browser for TensorFlow v1.8 with XLA enabled
Size: 62.2 MB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0
