Goal:
Learn the main components of video content analysis, such as feature detection and description, object segmentation, and object detection and tracking. Learn advanced object classification techniques based on Deep Learning. Learn the basics of 3D multi-view geometry, 3D sensing principles and 3D model reconstruction. Learn the practical aspects of implementing these methods by programming (C++, Python, TensorFlow) a surveillance application that uses a UAV (drone) as the data-capturing device.
Content:
The course content is broadly divided into two areas.
- Techniques for object detection and recognition, and for feature extraction and analysis, such as SIFT and HOG. Semantic-level processing for understanding events and scenes, including human behavior. Classification techniques for understanding objects and events, starting from classical algorithms such as K-means and SVM (support vector machine) and moving on to the basics of learning with neural networks. This part gradually builds up to the fundamentals and practical applications of Deep Learning.
- 3D processing based on the pinhole camera model, multi-view processing and calibration. Registration of 3D datasets, 3D reconstruction with TSDF (truncated signed distance function) models, an introduction to SLAM, RGB-Depth processing, and specific algorithms such as g2o and bundle adjustment. The 3D processing modules conclude with plane and object segmentation in 3D.
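As a small illustration of the pinhole camera model named in the second area, the projection of a 3D point onto the image plane can be sketched as follows. This is only a minimal sketch, not course material: the function name `project_pinhole` and the intrinsic values (focal lengths `fx`, `fy` and principal point `cx`, `cy`) are assumptions chosen for the example, and an ideal camera without lens distortion is assumed.

```python
def project_pinhole(point_xyz, fx, fy, cx, cy):
    """Project a 3D point (camera coordinates) to pixel coordinates.

    Assumes an ideal pinhole camera looking along +Z, with no lens
    distortion. fx, fy are focal lengths in pixels; (cx, cy) is the
    principal point in pixels.
    """
    x, y, z = point_xyz
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    # Perspective division by depth, then shift to the principal point.
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v


# Hypothetical intrinsics for a 640x480 image.
u, v = project_pinhole((0.5, -0.25, 2.0), fx=800.0, fy=800.0, cx=320.0, cy=240.0)
print(u, v)  # 520.0 140.0
```

A point on the optical axis (x = y = 0) always maps to the principal point, which is a quick sanity check for any calibration result.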
The programming assignments apply this knowledge and these algorithms (or parts of them), giving students a framework for experimenting with video content understanding and 3D image-based modeling in surveillance applications.
The assignments are based on C++ / Python / TensorFlow programming.
Prior knowledge:
Recommended: 5LSE0 Multimedia video coding and architectures
Date | Time | Room |
---|---|---|
27 aug | 09.30-17.30 | Flux 1.07 |
28 aug | 09.30-17.30 | Flux 1.07 |
29 aug | 09.30-17.30 | Flux 1.07 |
30 aug | 09.30-17.30 | Flux 1.07 |
31 aug | 09.30-17.30 | Flux 1.07 |