NVIDIA Jetson Nano: Zero to AI Hero - Bare Metal Deep Learning Setup Guide

Introduction

The NVIDIA Jetson Nano packs impressive AI capabilities into a compact, power-efficient device. With its 128-core Maxwell GPU delivering up to 472 GFLOPS (FP16), it's well suited to edge AI applications like computer vision, robotics, and smart IoT devices. This guide walks you through setting up your Jetson Nano for deep learning projects in clear, manageable steps.

What You'll Need

Hardware

  • NVIDIA Jetson Nano Developer Kit

  • MicroSD Card (32GB or larger, UHS-1 speed recommended)

  • Power Supply (5V 4A with barrel jack, or 5V 2A via micro-USB)

  • USB keyboard and mouse

  • HDMI monitor

  • Ethernet connection or compatible WiFi adapter

Software

  • Computer with SD card reader

  • SD card flashing tool (BalenaEtcher recommended)

  • Latest JetPack SD card image from NVIDIA

Setup Process

1. Prepare Your Jetson Nano (15 minutes)

  1. Download the JetPack image:

    • Visit NVIDIA's Jetson Download Center and download the latest Jetson Nano SD card image
  2. Flash the SD card:

    • Install BalenaEtcher on your computer

    • Open BalenaEtcher and select the downloaded image

    • Select your SD card as the target

    • Click "Flash" and wait for the process to complete

  3. Hardware assembly:

    • Insert the SD card into the Jetson Nano

    • Connect the display, keyboard, and mouse

    • Connect the power supply last

2. First Boot and System Configuration (10 minutes)

  1. Initial setup:

    • Power on the Jetson Nano

    • Follow the on-screen instructions to:

      • Accept the license agreement

      • Set your language and region

      • Create a username and password

      • Configure your network

  2. Update the system:

    sudo apt update
    sudo apt upgrade -y
    
  3. Install essential development tools:

    sudo apt install -y git cmake python3-pip
    sudo apt install -y build-essential libatlas-base-dev gfortran
    

3. Setting Up the Deep Learning Environment (15 minutes)

  1. Expand swap space (helps prevent out-of-memory errors):

    sudo fallocate -l 4G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    echo '/swapfile swap swap defaults 0 0' | sudo tee -a /etc/fstab
    
  2. Install TensorFlow and Keras:

    # The index URL must match your JetPack release; the Jetson Nano tops out at JetPack 4.6.x (jp/v46)
    sudo pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v46 tensorflow
    # Optional: TensorFlow 2.x already bundles Keras as tf.keras
    pip3 install keras
    
  3. Install PyTorch (alternative deep learning framework):

    # This wheel targets JetPack 4.x (Python 3.6); install its runtime dependencies first
    sudo apt-get install -y libopenblas-base libopenmpi-dev
    wget https://nvidia.box.com/shared/static/p57jwntv436lfrd78inwl7iml6p13fzh.whl -O torch-1.8.0-cp36-cp36m-linux_aarch64.whl
    pip3 install torch-1.8.0-cp36-cp36m-linux_aarch64.whl
    
  4. Verify installation:

    python3 -c "import tensorflow as tf; print(tf.__version__)"
    python3 -c "import torch; print(torch.__version__)"
    
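
Version strings alone don't prove the GPU is reachable. A quick framework-level check uses standard TensorFlow and PyTorch calls, nothing Jetson-specific:

import tensorflow as tf
import torch

# List the GPUs visible to TensorFlow (the Nano should report one)
print(tf.config.list_physical_devices('GPU'))

# Confirm PyTorch can reach CUDA and print the device name
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
else:
    print("CUDA not available to PyTorch")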

4. Running Your First Deep Learning Model (10 minutes)

Create a file named mnist_test.py with this content:

import tensorflow as tf
import time

# Load the MNIST dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize the data
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build a simple neural network model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train the model
print("Training model...")
start_time = time.time()
model.fit(x_train, y_train, epochs=5)
training_time = time.time() - start_time

# Evaluate the model
print("Evaluating model...")
test_loss, test_acc = model.evaluate(x_test, y_test)

print(f"Training time: {training_time:.2f} seconds")
print(f"Test accuracy: {test_acc:.4f}")

Run the script:

python3 mnist_test.py

This will train a simple neural network on the MNIST dataset and display the training time and accuracy.
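
Training time is only half the story on an edge device; inference latency usually matters more. Since the script saves its model, you can time single-image predictions with a short follow-up script:

import time

import numpy as np
import tensorflow as tf

# Reload the model saved by mnist_test.py
model = tf.keras.models.load_model("mnist_model.h5")

# Time 100 single-image predictions on random 28x28 inputs
x = np.random.rand(100, 28, 28).astype("float32")
start = time.time()
model.predict(x, batch_size=1, verbose=0)
elapsed = time.time() - start
print(f"Average latency: {elapsed / 100 * 1000:.2f} ms per image")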

Optimizing Performance

Memory Management

  • Close unnecessary applications when training models

  • Use smaller batch sizes for large models

  • Consider model quantization for inference (see the sketch after this list)
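
As a minimal sketch of that last point: TensorFlow Lite's post-training dynamic-range quantization shrinks a trained Keras model in a few lines, reusing the MNIST model saved earlier:

import tensorflow as tf

# Reload the trained MNIST model
model = tf.keras.models.load_model("mnist_model.h5")

# Apply post-training dynamic-range quantization via TensorFlow Lite
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# The resulting file is typically around 4x smaller than the float32 model
with open("mnist_model_quant.tflite", "wb") as f:
    f.write(tflite_model)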

GPU Acceleration

  • Use TensorRT for faster inference:

    sudo apt-get install python3-libnvinfer-dev
    sudo apt-get install uff-converter-tf
    
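
For TensorFlow models specifically, the TF-TRT bridge can rewrite supported parts of a SavedModel to run as TensorRT engines. A minimal sketch, assuming you have already exported a SavedModel (both directory names below are placeholders):

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert a SavedModel with TF-TRT; supported subgraphs become TensorRT engines
converter = trt.TrtGraphConverterV2(input_saved_model_dir="saved_model_dir")
converter.convert()
converter.save("trt_saved_model_dir")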

Practical Applications

  1. Computer Vision: Object detection, face recognition, gesture recognition

  2. Robotics: Navigation, obstacle avoidance, human-robot interaction

  3. IoT: Smart homes, industrial monitoring, predictive maintenance

Troubleshooting Common Issues

  1. System freezes during training:

    • Check power supply (5V 4A recommended)

    • Reduce batch size

    • Ensure adequate cooling

  2. Out of memory errors:

    • Increase swap space

    • Use smaller models or reduce batch size

    • Optimize model architecture

  3. Slow performance:

    • Enable maximum-performance mode, then lock the clocks:

      sudo nvpmodel -m 0
      sudo jetson_clocks
      
    • Use TensorRT optimization

    • Consider quantizing models to int8 precision

Implementing a Camera-Based Application (15 minutes)

Let's create a simple real-time object detection application using your Jetson Nano's camera capabilities.

1. Setting up the camera

  1. Install OpenCV and required packages:

    sudo apt-get install -y python3-opencv
    pip3 install pillow numpy matplotlib
    
  2. Camera connection:

    • For USB webcam: simply plug it into a USB port

    • For Raspberry Pi Camera Module v2:

      • Power off your Jetson Nano

      • Connect the camera to the MIPI CSI camera connector

      • Boot the Jetson Nano

  3. Test your camera:

    # For CSI camera (the default source)
    nvgstcapture-1.0
    
    # For USB camera (v4l2 source; adjust the device node to match /dev/video*)
    nvgstcapture-1.0 --camsrc=0 --cap-dev-node=0
    

    Type 'q' and press Enter in the terminal to exit the camera test.
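
You can also sanity-check a USB camera from Python before moving on; this snippet grabs a single frame with OpenCV and reports its size (device index 0 means the first webcam):

import cv2

# Grab one frame from the first USB camera (/dev/video0)
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

if ret:
    print(f"Camera OK, frame size: {frame.shape[1]}x{frame.shape[0]}")
else:
    print("Camera not detected")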

2. Building a real-time object detection app

Create a file named object_detection.py with this content:

import cv2
import numpy as np
import time

# Load pre-trained MobileNet-SSD model
net = cv2.dnn.readNetFromCaffe(
    'deploy.prototxt',
    'mobilenet_iter_73000.caffemodel'
)

# Model files: deploy.prototxt and mobilenet_iter_73000.caffemodel
# (see step 3 below for the wget download commands)

# Class labels MobileNet-SSD was trained on
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]

# Random color for each class
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))

# Initialize camera (0 = first USB webcam; CSI cameras need a GStreamer
# pipeline instead; see the note after this script)
camera = cv2.VideoCapture(0)
time.sleep(2.0)  # Warm up the camera

print("Starting object detection - press 'q' to quit")

while True:
    # Read frame
    ret, frame = camera.read()
    if not ret:
        print("Failed to capture image")
        break
        
    # Prepare input for neural network
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    
    # Forward pass through the network
    net.setInput(blob)
    detections = net.forward()
    
    # Process detection results
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        
        # Filter weak detections
        if confidence > 0.2:
            # Get class label
            class_id = int(detections[0, 0, i, 1])
            
            # Calculate bounding box coordinates
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            
            # Draw bounding box and label
            label = f"{CLASSES[class_id]}: {confidence:.2f}"
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                          COLORS[class_id], 2)
            y = startY - 15 if startY - 15 > 15 else startY + 15
            cv2.putText(frame, label, (startX, y),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[class_id], 2)
    
    # Display result
    cv2.imshow("Object Detection", frame)
    
    # Exit on 'q' key press
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# Clean up
camera.release()
cv2.destroyAllWindows()
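
As written, the script opens the first USB webcam. For a CSI camera, OpenCV needs a GStreamer pipeline built on nvarguscamerasrc instead of a plain device index; a commonly used pipeline looks like this (resolution and framerate are examples you can adjust):

import cv2

# GStreamer pipeline for a CSI camera via nvarguscamerasrc
GST_PIPELINE = (
    "nvarguscamerasrc ! "
    "video/x-raw(memory:NVMM), width=1280, height=720, framerate=30/1 ! "
    "nvvidconv ! video/x-raw, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink"
)

# Use this in place of cv2.VideoCapture(0) in object_detection.py
camera = cv2.VideoCapture(GST_PIPELINE, cv2.CAP_GSTREAMER)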

3. Download pre-trained model files

wget https://github.com/PINTO0309/MobileNet-SSD-RealSense/raw/master/caffemodel/MobileNetSSD/deploy.prototxt
wget https://github.com/PINTO0309/MobileNet-SSD-RealSense/raw/master/caffemodel/MobileNetSSD/mobilenet_iter_73000.caffemodel

4. Run the application

python3 object_detection.py

This application:

  • Captures video from your camera in real-time

  • Processes each frame through a pre-trained MobileNet-SSD model

  • Detects and labels objects like people, animals, vehicles, and common household items

  • Displays bounding boxes around detected objects with confidence scores

5. Understanding the performance

This demo runs each frame through OpenCV's DNN module. Note that the stock python3-opencv package executes inference on the CPU; moving it to the Nano's GPU requires a CUDA-enabled OpenCV build or a TensorRT-based pipeline. Even so, it demonstrates how this small device can perform complex AI tasks at the edge without sending data to the cloud. You may notice that:

  • The model processes roughly 10-15 frames per second, depending on camera resolution and inference backend (see the measurement sketch after this list)

  • Lower confidence thresholds (in the code) increase detection sensitivity but may introduce false positives

  • The MobileNet-SSD model balances speed and accuracy, making it suitable for edge devices
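
To measure throughput on your own setup rather than taking these numbers on faith, you can time the capture-plus-detection loop directly; this sketch reuses the model files downloaded in step 3:

import time

import cv2

# Load the detector and open the first USB camera
net = cv2.dnn.readNetFromCaffe('deploy.prototxt', 'mobilenet_iter_73000.caffemodel')
camera = cv2.VideoCapture(0)

# Time 100 frames of capture + inference (no drawing or display)
frames, start = 0, time.time()
while frames < 100:
    ret, frame = camera.read()
    if not ret:
        break
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    net.forward()
    frames += 1

camera.release()
elapsed = time.time() - start
print(f"Throughput: {frames / elapsed:.1f} FPS over {frames} frames")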

Next Steps

  1. Enhance your camera application:

    • Add motion detection capabilities

    • Implement person counting or activity recognition

    • Connect to cloud services for alerts or data logging

  2. Deploy models to production:

    • Containerize your application with Docker

    • Create systemd services for automatic startup

    • Implement power management for extended battery life

  3. Join the community:

    • NVIDIA Developer Forums

    • JetsonHacks tutorials

    • GitHub projects and examples

Conclusion

The NVIDIA Jetson Nano offers impressive deep learning capabilities in an affordable, power-efficient package. By following this guide, you've set up a complete development environment capable of training and running sophisticated AI models. Whether you're building a smart robot, an intelligent camera system, or an IoT device, the Jetson Nano provides the computational power needed for edge AI applications.

As you continue your journey, remember that optimization is key when working with edge devices. Focus on model efficiency, power management, and application-specific tuning to get the most out of your Jetson Nano.

Happy building!
