Own your
Artificial Intelligence
We have designed a new chip that dramatically lowers the barrier to creating frontier artificial intelligence.
a simpler chip
−1 · 0 · +1
for the densest possible intelligence
The Problem
Today's silicon runs AI at 4–32 bits
Most of them are wasted
No silicon today runs natively at 1 bit
So we built one.
The Chip
1-bit native silicon
Training and Inference
Per-MAC silicon
Traditional MAC
Multiplier · Carry-save tree · Adder
Ternary MAC
Mux · Adder
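The comparison above can be sketched in software. This is a minimal illustration, not a description of the actual circuit: when a weight is restricted to −1, 0, or +1, the multiply step reduces to selecting +activation, 0, or −activation (the mux) and adding the result to the accumulator (the adder). The function names `ternary_mac` and `ternary_dot` are hypothetical and chosen only for this example.

```python
def ternary_mac(acc, weight, activation):
    """One multiply-accumulate step for a weight in {-1, 0, +1}, with no multiplier."""
    if weight not in (-1, 0, 1):
        raise ValueError("weight must be -1, 0, or +1")
    # Mux: pick the addend based on the weight.
    if weight == 1:
        addend = activation
    elif weight == -1:
        addend = -activation
    else:
        addend = 0
    # Adder: accumulate the selected value.
    return acc + addend


def ternary_dot(weights, activations):
    """Dot product of ternary weights and integer activations (hypothetical helper)."""
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    acc = 0
    for w, a in zip(weights, activations):
        acc = ternary_mac(acc, w, a)
    return acc


weights = [1, 0, -1, 1]
activations = [3, 5, 2, -4]
result = ternary_dot(weights, activations)
# Same answer as an ordinary multiply-based dot product.
assert result == sum(w * a for w, a in zip(weights, activations))
```

The result matches a conventional dot product, but no step in the loop multiplies two numbers.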
A chip for the era of 1-bit neural networks
Formats
- W1A4
- W1.58A4
- W1.58A6
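Reading these names in the usual W/A convention (weight bits, then activation bits), "1.58" is the information content of a three-valued weight: log2(3) ≈ 1.585 bits. The sketch below illustrates that figure by packing five ternary weights into one byte, which works because 3^5 = 243 fits in 256. It is an assumption made for illustration only; the page does not say how the hardware stores weights, and `pack_trits` and `unpack_trits` are hypothetical names.

```python
import math

# Information content of one ternary weight, in bits.
BITS_PER_TRIT = math.log2(3)


def pack_trits(trits):
    """Pack up to five values from {-1, 0, +1} into one integer in 0..242 (base 3)."""
    if len(trits) > 5:
        raise ValueError("at most five ternary values fit in one byte")
    value = 0
    for t in reversed(trits):
        if t not in (-1, 0, 1):
            raise ValueError("values must be -1, 0, or +1")
        value = value * 3 + (t + 1)  # map -1, 0, +1 to digits 0, 1, 2
    return value


def unpack_trits(byte, count=5):
    """Recover `count` ternary values from a packed byte."""
    trits = []
    for _ in range(count):
        byte, digit = divmod(byte, 3)
        trits.append(digit - 1)  # map digits 0, 1, 2 back to -1, 0, +1
    return trits


original = [1, 0, -1, 1, -1]
packed = pack_trits(original)
assert 0 <= packed < 256
assert unpack_trits(packed) == original
```

Five weights per byte works out to 1.6 bits per weight, close to the 1.585-bit lower bound.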
Workload
- Training
- Inference
Software
PyTorch
Status
Live on FPGA
How it works
model = "meta-llama/Llama-3.1-8B"  # or any torch.nn.Module
01
Pick a model
point to a Hugging Face model ID, or bring your own PyTorch module
baud.compile(model)
02
Compile
our compiler converts your model and weights to a format our hardware understands
model.generate()
model.train()
03
Accelerate training and inference
get accelerated training, RL fine-tuning, and inference on our hardware
Live Now
Deployed on FPGAs
Our architecture is running on FPGAs today. Accessible to developers right now.
Models
Up to 32B parameters
Pretraining
Supported
RL fine-tuning
Supported
Inference
Supported
Next
ASIC Deployment
Architecture verified on the GlobalFoundries 12nm process; first silicon in progress
Models
1T+ parameters
Pretraining
Accelerated
RL fine-tuning
Accelerated
Inference
Accelerated
Availability
Coming soon
