DEEPSTREAM SDK
2
REALTIME STREAMING VIDEO ANALYTICS FROM EDGE TO CLOUD
Access Control
Retail Analytics
Traffic Engineering
Content Filtering Managing operations
Optical Inspection
Parking Management
Managing Logistics
EASILY DEPLOY AI
Tools and Libraries To Accelerate Your AI Models
TensorFlow
PyTorch
Automatic Mixed Precision
(AMP) DLProf
Nsight Systems Analyze Your model
Train
TensorFlow-TRT Integrated Model
Inference
TensorRT
Scale
Triton Inference
Server
Transfer
Learning Video streaming
Data Loading Library
(DALI)
Pre-Trained model Data pipeline
TRTorch Integrated Model
4
EASILY DEPLOY AI
Tools and Libraries To Accelerate Your AI Models
TensorFlow
PyTorch
Automatic Mixed Precision
(AMP) DLProf
Nsight Systems Analyze Your model
Train
TensorFlow-TRT Integrated Model
Inference
TensorRT
Scale
Triton Inference
Server
Transfer Learning
DeepStream SDK
Video streaming
Transfer Learning Toolkit Data Loading
Library (DALI)
Speech
Language Computer
Vision Recommender
Pre-Trained model Data pipeline
TRTorch Integrated Model
FRAMEWORK FOR ANALYZING VIDEO
STREAM &BATCH PROCESSING MULTIMEDIA APIs
MULTIMEDIA APIs
TENSORRT, CUDA
PRE-PROCESS TRACK, DETECT, CLASSIFY
METADATA PROCESSING DECODE
COMPOSITE
CUDA
LOCAL DISPLAY
REMOTE DISPLAY
Metadata
6
UNDERSTAND THE BASICS:
DEEPSTREAM SDK
WHAT IS DEEPSTREAM?
CUDA-X
Kubernetes ON GPUs NVIDIA Containers RT CUDA
DEEPSTREAM SDK
Multimedia TensorRT
Hardware Accelerated
Plugins Docker Containers
Reference Applications &
Orchestration Recipes Azure IOT Runtime
Applications and Services
8
DEEPSTREAM GRAPH ARCHITECTURE
CAPTURE
DECODE, CAMERA PROCESS
SCALE, DEWARP,
CROP
STREAM MGMT
DETECT &
CLASSIFY TRACKING ON SCREEN
DISPLAY OUTPUT
CPU NVDEC GPU CPU GPU GPU GPU HDMI
VIC CPU VIC SATA
ISP
RTSP/RAW DECODE/ISP BATCHING TRACKING VIZULIZATION DISPLAY/
STORAGE DNN(s)
IMAGE PROCESSING
DLA
DEEPSTREAM - MANY INDUSTRIES, FLEXIBLE DEPLOYMENT
10
DEEPSTREAM CONCEPTS
GSTREAMER FOUNDATIONS
The DeepStream SDK is based on the open source GStreamer multimedia framework. There are a few key concepts in GStreamer that we need to touch on before getting started. These include Elements, Pads, Buffers, and Caps. We will be describing them at a high level, but
encourage those who are interested in the details to read the GStreamer documentation to
learn more.
https://en.wikipedia.org/wiki/GStreamer 12
GSTREAMER
Src Plugin src sink Filter Plugin src sink Sink Plugin
BIN
Plugin1 Plugin2
src sink
PIPELINE
SOURCE BIN PROCESS BIN SINK BIN
Video
File Decoder Scale Filter Display
Level Component Function
1 PLUGINS Basic building block connected through PADs
2 BINS
A container for a collection of plugins
3 PIPELINE
Top level bin providing a bus and managing the synchronization
14
DEEPSTREAM BUILDING BLOCK
•
A plugin model-based pipeline architecture
•
Graph-based pipeline interface to allow high-level component
interconnect
•
Heterogenous processing on GPU and CPU
•
Hides parallelization and
synchronization under the hood
•
Inherently multi-threaded
Input +
[Metadata] Output +
Metadata
Low Level API
Hardware
GPU PLUGIN
LOW LEVEL LIB
16
DEEPSTREAM ACCELERATED PLUGIN
Plugin Name Functionality
Gst-nvvideo4linux2 Hardware accelerated decode and encode
Gst-nvinfer DL inference for detection, classification and segmentation
Gst-nvtracker Reference object trackers; KLT, IOU, NvDCF
Gst-nvmsgbroker Messaging to cloud
Gst-nvstreammux Stream aggregation, multiplexing, and batching
Gst-nvdsosd Draw boxes and text overlay
Gst-nvmultistreamtiler Renders frames 2D grid array
Gst-nveglglessink Accelerated X11 / EGL rendering
Gst-nvvideoconvert Scaling, format conversion, rotation
Gst-nvdewarp Dewarping for fish-eye degree cameras
Gst-nvmsgconv Metadata generation
Gst-nvsegvisual Visualizes segmentation results
Get-out Hardware accelerated optical flow
Gst-nvinferserver Run Triton Inference Server
VIDEO DECODER
18
DEWARPER
GST-NVSTREAMMUX
20
GST-NVINFER
GST-NVTILER
22
GST-NVTRACKER
METADATA TO MESSAGE BROKERS
24
GETTING STARTED
DEEPSTREAM
BUILD DEEPSTREAM PIPELINE WITH TENSORRT
Run application
$gst-launch-1.0 rtspsrc location=<rtsp://.stream> protocols=tcp ! rtph264depay name=demux ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvinfer config-file-path=config_infer_primary_ssd.txt batch-size=1
unique-id=1 interval=3 ! nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_klt.so ! nvvideoconvert !
RTSP DECODE BATCH
OBJECT DETECTION
(SSD)
RTSP Tracker
TensorRT engine
PIPELINE
26
BUILD DEEPSTREAM PIPELINE WITH TENSORRT
Configuration file for the inference
gpu-id= <Device ID of GPU> ex) 0
net-scale-factor= <Pixel normalization factor> ex) 0.0039215697906911373 model-engine-file= <Pathname of the serialized TensorRT engine file>
model-file= <Pathname of caffemodel file>
proto-file= <Pathname of prototxt file>
onnx-file= <Pathname of ONNX model file>
labelfile-path= <Pathname of a text file containing the labels for the model>
int8-calib-file= <Pathname of INT8 calibration file>
batch-size= <Number of frames to be inferred together>
network-mode= <Data format to be used by inference, 0:FP32, 1:INT8, 2:FP16> ex) 0 num-detected-classes= <Number of classes detected by the network> ex) 4
interval= <Specifies the number of consecutive batches to be skipped for inference> ex) 3
parse-bbox-func-name= <Name of the custom bounding box parsing function> ex) NvDsInferParseCustomSSD custom-lib-path= <Pathname of a library containing custom method> ex) libnvdsinfer_custom_impl_ssd.so
config_infer_primary.txt
TRITON INFERENCE SERVER
Plugin!
28
TESNORRT AND TRITON INFERENCE SERVER
Trade off
TensorRT Triton Server Pros Highest throughput Highest flexibility Cons Custom layers require
writing plugins
Less performant than a
TensorRT solution
30
TRANSFER LEARNING TOOLKIT
Make a TRT-engine from TLT
NGC1
NGC1 : NVIDIA GPU CLOUD (https://ngc.nvidia.com/)
DEPLOY ENGINE BUILD
ENGINE OUTPUT
MODEL RETRAIN
PRUNE TRAIN WITH
NEW DATA PRETRAINED
MODEL
https://github.com/cycoslee/ai_developers_day_2019/blob/master/3_inference/Session2/tlt_ssd.ipynb
TRANSFER LEARNING TOOLKIT
Support commonly used DNNs
32
TRANSFER LEARNING TOOLKIT
Pretrained model from NGC
See TLT Details on NVIDIA developer site
https://developer.nvidia.com/transfer-learning-toolkit
TRANSFER LEARNING TOOLKIT
Key capability
Pruning
Scene Adaptation
Prune Retrain
Adapt Data
Train with new data from another vantage point, camera location, or added attribute
Same network adapting to different angles and vantage points
Same network adapting to new data
Network compression
34
PRUNING MODELS
Reduce model size and increase throughput Incrementally retrain model after pruning to recover accuracy
1 2
Network - ResNet 18 4-class (Car, Person, Bicycle, Road sign) EXAMPLE Memory size - 46.2 MB to 6.7 MB
FPS - 16 fps to 30 fps
6.5x
Model Size Reduction
>2x
Throughput
Increase
SPARSITY ACCELERATION WITH A100
with Tensor Cores
36
SCENE ADAPTATION
Camera location vantage point
Train with new data from another vantage point, camera location, or added attribute
Person with blue shirt
NEW CLASSES
Easy to modify models to add new classes
Train, Prune and Retrain new networ
k
New network detecting
Ambulance
Sedan SUV Truck Van Coupe LargeV
Sedan SUV Truck Van Coupe LargeV
Ambulance Edit Config spec
Add new class Ambulance Pretrained
network
38
END-TO-END THROUGHPUT
DeepStream SDK with purpose-built models from TLT
TAKEAWAY
Tools and Libraries To Accelerate Your AI Models
TensorFlow
PyTorch
Automatic Mixed Precision
(AMP) DLProf
Nsight Systems Analyze Your model
Train
TensorFlow-TRT Integrated Model
Inference
TensorRT
Scale
Triton Inference
Server
Transfer
Learning Video streaming
Data Loading Library
(DALI)
Pre-Trained model Data pipeline
TRTorch Integrated Model
40
TAKEAWAY
Tools and Libraries To Accelerate Your AI Models
TensorFlow
PyTorch
Automatic Mixed Precision
(AMP) DLProf
Nsight Systems Analyze Your model
Train
TensorFlow-TRT Integrated Model
Inference
TensorRT
Scale
Triton Inference
Server
Transfer Learning
DeepStream SDK
Video streaming
Transfer Learning Toolkit Data Loading
Library (DALI)
Speech
Language Computer
Vision Recommender
Pre-Trained model Data pipeline
TRTorch Integrated Model
42
LEARN MORE ABOUT DEEPSTREAM
Helpful Links & Examples on NGC
▪
Important Web Link
: DeepStream SDK Download
▪
Technical Blogs
: Breaking the Boundaries of Intelligent Video Analytics with DeepStream SDK 3.0 : Multi-Camera Large-Scale Intelligent Video Analytics with DeepStream SDK
▪
GitHub
: Reference Apps using Deepstream 4.0 : Deepstream Python API
: An example of using DeepStream SDK for redaction
: DeepStream 3.0 - 360 Degree Smart Parking Application
▪
NVIDIA GPU CLOUD(NGC) containers : DeepStream for Tesla
: DeepStream for Jetson : Smart Parking Detection
: Intelligent Video Analytics Demo
LEARN MORE ABOUT TLT
▪
Important Web Link
:https://developer.nvidia.com/transfer-learning-toolkit
▪
Technical Blogs
: Training with Custom Pretrained Models Using the NVIDIA Transfer Learning Toolkit : What is transfer Learning
: Pruning models with NVIDIA Transfer Learning Toolkit
: Build and Deploy Accurate Deep Learning Models for Intelligent Image and Video Analytics
▪