DEEPSTREAM SDK

(1)

DEEPSTREAM SDK

(2)

2

REALTIME STREAMING VIDEO ANALYTICS FROM EDGE TO CLOUD

Access Control

Retail Analytics

Traffic Engineering

Content Filtering Managing operations

Optical Inspection

Parking Management

Managing Logistics

(3)

EASILY DEPLOY AI

Tools and Libraries To Accelerate Your AI Models

TensorFlow

PyTorch

Automatic Mixed Precision

(AMP) DLProf

Nsight Systems Analyze Your model

Train

TensorFlow-TRT Integrated Model

Inference

TensorRT

Scale

Triton Inference

Server

Transfer

Learning Video streaming

Data Loading Library

(DALI)

Pre-Trained model Data pipeline

TRTorch Integrated Model

(4)

4

EASILY DEPLOY AI

Tools and Libraries To Accelerate Your AI Models

TensorFlow

PyTorch

(AMP) DLProf

Train

Inference

TensorRT

Scale

Triton Inference

Server

Transfer Learning

DeepStream SDK

Video streaming

Transfer Learning Toolkit Data Loading

Library (DALI)

Speech

Language Computer

Vision Recommender

(5)

FRAMEWORK FOR ANALYZING VIDEO

STREAM &BATCH PROCESSING MULTIMEDIA APIs

MULTIMEDIA APIs

TENSORRT, CUDA

PRE-PROCESS TRACK, DETECT, CLASSIFY

METADATA PROCESSING DECODE

COMPOSITE

CUDA

LOCAL DISPLAY

REMOTE DISPLAY

Metadata

(6)

6

UNDERSTAND THE BASICS:

DEEPSTREAM SDK

(7)

WHAT IS DEEPSTREAM?

CUDA-X

Kubernetes ON GPUs NVIDIA Containers RT CUDA

DEEPSTREAM SDK

Multimedia TensorRT

Hardware Accelerated

Plugins Docker Containers

Reference Applications &

Orchestration Recipes Azure IOT Runtime

Applications and Services

(8)

8

DEEPSTREAM GRAPH ARCHITECTURE

CAPTURE

DECODE, CAMERA PROCESS

SCALE, DEWARP,

CROP

STREAM MGMT

DETECT &

CLASSIFY TRACKING ON SCREEN

DISPLAY OUTPUT

CPU NVDEC GPU CPU GPU GPU GPU HDMI

VIC CPU VIC SATA

ISP

RTSP/RAW DECODE/ISP BATCHING TRACKING VIZULIZATION DISPLAY/

STORAGE DNN(s)

IMAGE PROCESSING

DLA

(9)

DEEPSTREAM - MANY INDUSTRIES, FLEXIBLE DEPLOYMENT

(10)

10

DEEPSTREAM CONCEPTS

(11)

GSTREAMER FOUNDATIONS

The DeepStream SDK is based on the open source GStreamer multimedia framework. There are a few key concepts in GStreamer that we need to touch on before getting started. These include Elements, Pads, Buffers, and Caps. We will be describing them at a high level, but

encourage those who are interested in the details to read the GStreamer documentation to

learn more.

(12)

https://en.wikipedia.org/wiki/GStreamer 12

(13)

GSTREAMER

Src Plugin src sink Filter Plugin src sink Sink Plugin

BIN

Plugin1 Plugin2

src sink

PIPELINE

SOURCE BIN PROCESS BIN SINK BIN

Video

File Decoder Scale Filter Display

Level Component Function

1 PLUGINS Basic building block connected through PADs

2 BINS

A container for a collection of plugins

3 PIPELINE

Top level bin providing a bus and managing the synchronization

(14)

14

DEEPSTREAM BUILDING BLOCK

•

A plugin model-based pipeline architecture

•

Graph-based pipeline interface to allow high-level component

interconnect

•

Heterogenous processing on GPU and CPU

•

Hides parallelization and

synchronization under the hood

•

Inherently multi-threaded

Input +

[Metadata] Output +

Metadata

Low Level API

Hardware

GPU PLUGIN

LOW LEVEL LIB

(15)

(16)

16

DEEPSTREAM ACCELERATED PLUGIN

Plugin Name Functionality

Gst-nvvideo4linux2 Hardware accelerated decode and encode

Gst-nvinfer DL inference for detection, classification and segmentation

Gst-nvtracker Reference object trackers; KLT, IOU, NvDCF

Gst-nvmsgbroker Messaging to cloud

Gst-nvstreammux Stream aggregation, multiplexing, and batching

Gst-nvdsosd Draw boxes and text overlay

Gst-nvmultistreamtiler Renders frames 2D grid array

Gst-nveglglessink Accelerated X11 / EGL rendering

Gst-nvvideoconvert Scaling, format conversion, rotation

Gst-nvdewarp Dewarping for fish-eye degree cameras

Gst-nvmsgconv Metadata generation

Gst-nvsegvisual Visualizes segmentation results

Get-out Hardware accelerated optical flow

Gst-nvinferserver Run Triton Inference Server

(17)

VIDEO DECODER

(18)

18

DEWARPER

(19)

GST-NVSTREAMMUX

(20)

20

GST-NVINFER

(21)

GST-NVTILER

(22)

22

GST-NVTRACKER

(23)

METADATA TO MESSAGE BROKERS

(24)

24

GETTING STARTED

DEEPSTREAM

(25)

BUILD DEEPSTREAM PIPELINE WITH TENSORRT

Run application

$gst-launch-1.0 rtspsrc location=<rtsp://.stream> protocols=tcp ! rtph264depay name=demux ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvinfer config-file-path=config_infer_primary_ssd.txt batch-size=1

unique-id=1 interval=3 ! nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_klt.so ! nvvideoconvert !

RTSP DECODE BATCH

OBJECT DETECTION

(SSD)

RTSP Tracker

TensorRT engine

PIPELINE

(26)

26

BUILD DEEPSTREAM PIPELINE WITH TENSORRT

Configuration file for the inference

gpu-id= <Device ID of GPU> ex) 0

net-scale-factor= <Pixel normalization factor> ex) 0.0039215697906911373 model-engine-file= <Pathname of the serialized TensorRT engine file>

model-file= <Pathname of caffemodel file>

proto-file= <Pathname of prototxt file>

onnx-file= <Pathname of ONNX model file>

labelfile-path= <Pathname of a text file containing the labels for the model>

int8-calib-file= <Pathname of INT8 calibration file>

batch-size= <Number of frames to be inferred together>

network-mode= <Data format to be used by inference, 0:FP32, 1:INT8, 2:FP16> ex) 0 num-detected-classes= <Number of classes detected by the network> ex) 4

interval= <Specifies the number of consecutive batches to be skipped for inference> ex) 3

parse-bbox-func-name= <Name of the custom bounding box parsing function> ex) NvDsInferParseCustomSSD custom-lib-path= <Pathname of a library containing custom method> ex) libnvdsinfer_custom_impl_ssd.so

config_infer_primary.txt

(27)

TRITON INFERENCE SERVER

Plugin!

(28)

28

TESNORRT AND TRITON INFERENCE SERVER

Trade off

TensorRT Triton Server Pros Highest throughput Highest flexibility Cons Custom layers require

writing plugins

Less performant than a

TensorRT solution

(29)

(30)

30

TRANSFER LEARNING TOOLKIT

Make a TRT-engine from TLT

NGC¹

NGC¹ : NVIDIA GPU CLOUD (https://ngc.nvidia.com/)

DEPLOY ENGINE BUILD

ENGINE OUTPUT

MODEL RETRAIN

PRUNE TRAIN WITH

NEW DATA PRETRAINED

MODEL

https://github.com/cycoslee/ai_developers_day_2019/blob/master/3_inference/Session2/tlt_ssd.ipynb

(31)

TRANSFER LEARNING TOOLKIT

Support commonly used DNNs

(32)

32

TRANSFER LEARNING TOOLKIT

Pretrained model from NGC

See TLT Details on NVIDIA developer site

https://developer.nvidia.com/transfer-learning-toolkit

(33)

TRANSFER LEARNING TOOLKIT

Key capability

Pruning

Scene Adaptation

Prune Retrain

Adapt Data

Train with new data from another vantage point, camera location, or added attribute

Same network adapting to different angles and vantage points

Same network adapting to new data

Network compression

(34)

34

PRUNING MODELS

Reduce model size and increase throughput Incrementally retrain model after pruning to recover accuracy

1 2

Network - ResNet 18 4-class (Car, Person, Bicycle, Road sign) EXAMPLE Memory size - 46.2 MB to 6.7 MB

FPS - 16 fps to 30 fps

6.5x

Model Size Reduction

>2x

Throughput

Increase

(35)

SPARSITY ACCELERATION WITH A100

with Tensor Cores

(36)

36

SCENE ADAPTATION

Camera location vantage point

Train with new data from another vantage point, camera location, or added attribute

Person with blue shirt

(37)

NEW CLASSES

Easy to modify models to add new classes

Train, Prune and Retrain new networ

k

New network detecting

Ambulance

Sedan SUV Truck Van Coupe LargeV

Ambulance Edit Config spec

Add new class Ambulance Pretrained

network

(38)

38

END-TO-END THROUGHPUT

DeepStream SDK with purpose-built models from TLT

(39)

TAKEAWAY

Tools and Libraries To Accelerate Your AI Models

TensorFlow

PyTorch

(AMP) DLProf

Train

Inference

TensorRT

Scale

Triton Inference

Server

Transfer

Learning Video streaming

Data Loading Library

(DALI)

(40)

40

TAKEAWAY

Tools and Libraries To Accelerate Your AI Models

TensorFlow

PyTorch

(AMP) DLProf

Train

Inference

TensorRT

Scale

Triton Inference

Server

Transfer Learning

DeepStream SDK

Video streaming

Transfer Learning Toolkit Data Loading

Library (DALI)

Speech

Language Computer

Vision Recommender

(41)

(42)

42

LEARN MORE ABOUT DEEPSTREAM

Helpful Links & Examples on NGC

▪

Important Web Link

: DeepStream SDK Download

▪

Technical Blogs

: Breaking the Boundaries of Intelligent Video Analytics with DeepStream SDK 3.0 : Multi-Camera Large-Scale Intelligent Video Analytics with DeepStream SDK

▪

GitHub

: Reference Apps using Deepstream 4.0 : Deepstream Python API

: An example of using DeepStream SDK for redaction

: DeepStream 3.0 - 360 Degree Smart Parking Application

▪

NVIDIA GPU CLOUD(NGC) containers : DeepStream for Tesla

: DeepStream for Jetson : Smart Parking Detection

: Intelligent Video Analytics Demo

(43)

LEARN MORE ABOUT TLT

▪

Important Web Link

:https://developer.nvidia.com/transfer-learning-toolkit

▪

Technical Blogs

: Training with Custom Pretrained Models Using the NVIDIA Transfer Learning Toolkit : What is transfer Learning

: Pruning models with NVIDIA Transfer Learning Toolkit

: Build and Deploy Accurate Deep Learning Models for Intelligent Image and Video Analytics

▪

NVIDIA GPU CLOUD(NGC) containers

Helpful Links

(44)