!pip install -q ivy
!pip install -q dm-haiku
!git clone https://github.com/unifyai/models.git
# Installing models package from cloned repository! 😄
!cd models/ && pip install .
exit()
Ivy AlexNet demo
In this demo, we show how an AlexNet model written in Ivy native code can be used for image classification and integrated with all three of the major ML frameworks: PyTorch, TensorFlow and JAX.
Installation
Since we want the newly installed packages to be importable, the notebook will automatically restart after the first cell runs.
You can then select Runtime -> Run all once the notebook has restarted, to run the remaining cells.
Make sure you run this demo with GPU enabled!
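If you are not sure whether a GPU runtime is attached, a quick check along the following lines will confirm it before you run the rest of the cells (a minimal sketch using PyTorch, which the notebook relies on anyway; the assert message is just illustrative).
import torch

# fail early if no CUDA device is visible to the runtime
assert torch.cuda.is_available(), "No GPU detected - please switch to a GPU runtime"
print("Using GPU:", torch.cuda.get_device_name(0))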
Data Preparation
First we need to download the ImageNet classes and preprocess the image.
!wget https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
!wget https://raw.githubusercontent.com/unifyai/models/master/images/cat.jpg
filename = "cat.jpg"
# Preprocess torch image
import torch
from torchvision import transforms
from PIL import Image
import numpy as np
import warnings
import time
warnings.filterwarnings('ignore')

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ),
])

torch_img = Image.open(filename)
torch_img = preprocess(torch_img)
torch_img = torch.unsqueeze(torch_img, 0)

img = torch_img.numpy()
from IPython.display import Image, display
display(Image(filename))
Ivy AlexNet inference in Torch
We import the Ivy native implementation of AlexNet. The code for this model is given in the appendix at the end of this notebook.
import ivy

ivy.set_backend("torch")

from ivy_models.alexnet import alexnet
ivy_alexnet = alexnet()
To make the model as fast as possible, we can call ivy.compile(). This can take a moment, but it is a one-time cost.
ivy_alexnet = ivy.compile(ivy_alexnet, args=(ivy.asarray(torch_img.cuda()),))
output = ivy.softmax(ivy_alexnet(ivy.asarray(img)))  # pass the image to the model
classes = ivy.argsort(output[0], descending=True)[:3]  # get the top 3 classes
logits = ivy.gather(output[0], classes)  # get the logits
print("Indices of the top 3 classes are:", classes)
print("Logits of the top 3 classes are:", logits)
print("Categories of the top 3 classes are:", [categories[i] for i in classes.to_list()])
Indices of the top 3 classes are: ivy.array([282, 281, 285], dev=gpu:0)
Logits of the top 3 classes are: ivy.array([0.64773697, 0.29496649, 0.04526037], dev=gpu:0)
Categories of the top 3 classes are: ['tiger cat', 'tabby', 'Egyptian cat']
We can confirm that the model gets the same results as the torchvision implementation.
from torchvision.models import alexnet as torch_alexnet
from torchvision.models import AlexNet_Weights
torch_alexnet = torch_alexnet(weights=AlexNet_Weights.IMAGENET1K_V1, dropout=0).to("cuda")

torch_output = torch.softmax(torch_alexnet(torch_img.cuda()), dim=1)
torch_classes = torch.argsort(torch_output[0], descending=True)[:3]
torch_logits = torch.take(torch_output[0], torch_classes)
print("Indices of the top 3 classes are:", torch_classes)
print("Logits of the top 3 classes are:", torch_logits)
print("Categories of the top 3 classes are:", [categories[i] for i in torch_classes])
Indices of the top 3 classes are: tensor([282, 281, 285], device='cuda:0')
Logits of the top 3 classes are: tensor([0.6477, 0.2950, 0.0453], device='cuda:0', grad_fn=<TakeBackward0>)
Categories of the top 3 classes are: ['tiger cat', 'tabby', 'Egyptian cat']
Great! We can see that the classes and corresponding logits are the same in the Ivy native model and the torchvision model.
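For a stricter check than comparing the printed values by eye, something along these lines verifies that the two sets of logits agree within a small tolerance (a minimal sketch; it assumes the logits and torch_logits variables from the cells above are still in scope).
import numpy as np

# compare the Ivy (torch backend) logits against the torchvision logits numerically
ivy_logits_np = ivy.to_numpy(logits)
torch_logits_np = torch_logits.detach().cpu().numpy()
print("Logits match:", np.allclose(ivy_logits_np, torch_logits_np, atol=1e-4))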
TensorFlow inference
With an Ivy native model, it is simple to change the backend, which lets us use the model in a different framework. First we’ll try TensorFlow.
import tensorflow as tf

ivy.set_backend("tensorflow")
ivy_alexnet = alexnet()
Once the backend is set to TensorFlow, we can use TensorFlow to do our logit post-processing.
st = time.perf_counter()
raw_logits = ivy_alexnet(ivy.asarray(img))  # pass the image to the model
latency = time.perf_counter() - st

output = tf.nn.softmax(raw_logits)
classes = tf.argsort(output[0], axis=-1, direction="DESCENDING")[:3]  # get the top 3 classes
logits = tf.gather(output[0], classes)  # get the logits
print("Latency:", latency)
print("Indices of the top 3 classes are:", classes)
print("Logits of the top 3 classes are:", logits)
print("Categories of the top 3 classes are:", [categories[i] for i in classes.numpy().tolist()])
Latency: 10.652289830999962
Indices of the top 3 classes are: tf.Tensor([282 281 285], shape=(3,), dtype=int32)
Logits of the top 3 classes are: tf.Tensor([0.6477362 0.29496726 0.04526032], shape=(3,), dtype=float32)
Categories of the top 3 classes are: ['tiger cat', 'tabby', 'Egyptian cat']
As expected, the results match those from the Torch backend! If you had another model or post-processing routine written in TensorFlow, Ivy makes it simple to feed one into the other, without having to (carefully) rewrite everything for a single framework. It also means you can easily try out different backends to see which one is the quickest for your particular model and hardware.
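As a rough, hypothetical sketch of how such a comparison might look (not part of the original demo; it assumes the alexnet constructor and the img array from above, rebuilds the model for each backend, and times a single uncompiled forward pass, so the numbers are only indicative):
def time_backend(backend_name):
    # rebuild the Ivy model under the given backend and time one forward pass
    ivy.set_backend(backend_name)
    model = alexnet()
    start = time.perf_counter()
    model(ivy.asarray(img))
    return time.perf_counter() - start

for name in ["torch", "tensorflow"]:  # ends on "tensorflow", matching the rest of this section
    print(name, "forward pass took", time_backend(name), "seconds")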
Again, we can call ivy.compile to speed up inference.
ivy_alexnet = ivy.compile(ivy_alexnet, args=(ivy.asarray(img),))
Repeating the previous inference, we see that the compiled model gets the same results as before, and is faster.
st = time.perf_counter()
raw_logits = ivy_alexnet(ivy.asarray(img))  # pass the image to the model
latency = time.perf_counter() - st

output = tf.nn.softmax(raw_logits)
classes = tf.argsort(output[0], axis=-1, direction="DESCENDING")[:3]  # get the top 3 classes
logits = tf.gather(output[0], classes)  # get the logits
print("Latency:", latency)
print("Indices of the top 3 classes are:", classes)
print("Logits of the top 3 classes are:", logits)
print("Categories of the top 3 classes are:", [categories[i] for i in classes.numpy().tolist()])
Latency: 0.026875037000081647
Indices of the top 3 classes are: tf.Tensor([282 281 285], shape=(3,), dtype=int32)
Logits of the top 3 classes are: tf.Tensor([0.6477362 0.29496726 0.04526032], shape=(3,), dtype=float32)
Categories of the top 3 classes are: ['tiger cat', 'tabby', 'Egyptian cat']
JAX inference
# Overrides Jax's default behavior of preallocating 75% of GPU memory
# Temporary fix until this is handled by ivy's graph compiler
import os

os.environ["XLA_PYTHON_CLIENT_ALLOCATOR"] = "platform"

import jax

ivy.set_backend("jax")
ivy_alexnet = alexnet()
ivy_alexnet = ivy.compile(ivy_alexnet, args=(ivy.asarray(img),))
ivy_alexnet = jax.jit(ivy_alexnet)

img_jax = jax.device_put(jax.numpy.asarray(img), device=jax.devices()[0])
# warm-up
for _ in range(5):
    _ = ivy_alexnet(img_jax)

st = time.perf_counter()
raw_logits = ivy_alexnet(img_jax)  # pass the image to the model
latency = time.perf_counter() - st

output = jax.nn.softmax(raw_logits)
classes = jax.numpy.argsort(-output[0])[:3]  # get the top 3 classes
logits = ivy.gather(output[0], classes)  # get the logits
print("Latency:", latency)
print("Indices of the top 3 classes are:", classes)
print("Logits of the top 3 classes are:", logits)
print("Categories of the top 3 classes are:", [categories[i] for i in np.array(classes).tolist()])
Latency: 0.0022192720000475674
Indices of the top 3 classes are: [282 281 285]
Logits of the top 3 classes are: ivy.array([0.64773613, 0.29496723, 0.04526032], dev=gpu:0)
Categories of the top 3 classes are: ['tiger cat', 'tabby', 'Egyptian cat']
We get the same top three classes as before, with logits that match to several decimal places. Note again that we were able to use JAX functions in calculating the top three classes.
Let’s try another image
!wget https://raw.githubusercontent.com/unifyai/models/master/images/dog.jpg
filename = "dog.jpg"

# Preprocess torch image
from torchvision import transforms
from PIL import Image
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ),
])

torch_img = Image.open(filename)
torch_img = preprocess(torch_img)
torch_img = torch.unsqueeze(torch_img, 0)

img = torch_img.numpy()
img_jax = jax.device_put(jax.numpy.asarray(img), device=jax.devices()[0])
from IPython.display import Image, display
display(Image(filename))
st = time.perf_counter()
raw_logits = ivy_alexnet(img_jax)  # pass the image to the model
latency = time.perf_counter() - st

output = jax.nn.softmax(raw_logits)
classes = jax.numpy.argsort(-output[0])[:3]  # get the top 3 classes
logits = ivy.gather(output[0], classes)  # get the logits
print("Latency:", latency)
print("Indices of the top 3 classes are:", classes)
print("Logits of the top 3 classes are:", logits)
print("Categories of the top 3 classes are:", [categories[i] for i in np.array(classes).tolist()])
Latency: 0.006431100999861883
Indices of the top 3 classes are: [258 104 259]
Logits of the top 3 classes are: ivy.array([0.72447652, 0.13937832, 0.05874982], dev=gpu:0)
Categories of the top 3 classes are: ['Samoyed', 'wallaby', 'Pomeranian']
Note that the incorrect prediction of “wallaby” is down to the AlexNet model itself, as the torchvision version returns the same logits!
st = time.perf_counter()
raw_logits = torch_alexnet(torch_img.cuda())
latency = time.perf_counter() - st

torch_output = torch.softmax(raw_logits, dim=1)
torch_classes = torch.argsort(torch_output[0], descending=True)[:3]
torch_logits = torch.take(torch_output[0], torch_classes)
print("Latency:", latency)
print("Indices of the top 3 classes are:", torch_classes)
print("Logits of the top 3 classes are:", torch_logits)
print("Categories of the top 3 classes are:", [categories[i] for i in torch_classes])
Latency: 0.004749261999904775
Indices of the top 3 classes are: tensor([258, 104, 259], device='cuda:0')
Logits of the top 3 classes are: tensor([0.7245, 0.1394, 0.0587], device='cuda:0', grad_fn=<TakeBackward0>)
Categories of the top 3 classes are: ['Samoyed', 'wallaby', 'Pomeranian']
Appendix (Ivy code for AlexNet implementation)
As promised, here is the Ivy native source code for the AlexNet model used in this demo.
class AlexNet(ivy.Module):
    """An Ivy native implementation of AlexNet"""
    def __init__(self, num_classes=1000, dropout=0, v=None):
        self.num_classes = num_classes
        self.dropout = dropout
        super(AlexNet, self).__init__(v=v)

    def _build(self, *args, **kwargs):
        self.features = ivy.Sequential(
            ivy.Conv2D(3, 64, [11, 11], [4, 4], 2, data_format="NCHW"),
            ivy.ReLU(),
            ivy.MaxPool2D(3, 2, 0, data_format="NCHW"),
            ivy.Conv2D(64, 192, [5, 5], [1, 1], 2, data_format="NCHW"),
            ivy.ReLU(),
            ivy.MaxPool2D(3, 2, 0, data_format="NCHW"),
            ivy.Conv2D(192, 384, [3, 3], 1, 1, data_format="NCHW"),
            ivy.ReLU(),
            ivy.Conv2D(384, 256, [3, 3], 1, 1, data_format="NCHW"),
            ivy.ReLU(),
            ivy.Conv2D(256, 256, [3, 3], 1, 1, data_format="NCHW"),
            ivy.ReLU(),
            ivy.MaxPool2D(3, 2, 0, data_format="NCHW"),
        )
        self.avgpool = ivy.AdaptiveAvgPool2d((6, 6))
        self.classifier = ivy.Sequential(
            ivy.Dropout(prob=self.dropout),
            ivy.Linear(256 * 6 * 6, 4096),
            ivy.ReLU(),
            ivy.Dropout(prob=self.dropout),
            ivy.Linear(4096, 4096),
            ivy.ReLU(),
            ivy.Linear(4096, self.num_classes),
        )

    def _forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = ivy.reshape(x, (x.shape[0], -1))
        x = self.classifier(x)
        return x
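As a minimal usage sketch (not part of the original demo), the class above can be instantiated directly and run on a random NCHW input. Note that this builds a model with freshly initialised weights; the alexnet() constructor used earlier in the demo also loads the pretrained ImageNet weights.
ivy.set_backend("torch")
model = AlexNet(num_classes=1000, dropout=0)       # randomly initialised weights
dummy = ivy.random_normal(shape=(1, 3, 224, 224))  # a single NCHW image-sized input
out = model(dummy)
print(out.shape)  # expected: (1, 1000)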