3D Object Detection, Recognition, Segmentation, & Position Using Deep Learning (II)

  • Tseng, Din-Chang (PI)

Project Details


This proposal was originally submitted for three years of grant support, but only one year was funded; we have therefore revised the original project content to apply for the remaining two years of execution. In this project, we pursue detection and recognition rates of 99%, 99.5%, 99.9%, and even 99.99% by modifying existing techniques for special-field applications; we do not aim to develop new techniques tested on common databases such as PASCAL VOC, ImageNet, and MS COCO, where the goal is 70% to 80% performance with 1% to 3% increments. In special-field applications, we need the principles of CNN-related theory to improve existing techniques and reach higher performance. In the years approaching my retirement, I have studied more than 100 related papers, systematically prepared more than 50 well-known CNN models, and supervised average-level students in modifying network structures, modules, functions, and algorithms to reach 99.5% detection and recognition rates; I have not had enough time to complete papers for top journals or top conferences.
This research project is a two-year project. In this project, we want to develop deep-learning techniques to improve the effectiveness and efficiency of 3D object detection, recognition, segmentation, and positioning applications. For each year of study, we propose two theoretical topics on developing CNNs and two application topics on 3D objects. In the past year, the two theoretical research topics were: (1) improving CNN-based detection and recognition of objects to adapt to large size variation and (2) analyzing the performance of different fusion structures on 2D and 3D images; the two application topics were: (1) comparing CNN-based object detection and recognition using RGB images and (2) CNN-based object detection and recognition using RGBD data.
In the first year of this project, the two theoretical research topics are: (1) developing a 3D CNN to acquire the 9 DoF parameters of 3D objects and (2) using a GAN to correct the distance error of a 3D camera; the two application topics are: (1) applying the 3D CNN to 9 DoF estimation of 3D objects and (2) developing a bin-picking robot arm system. In the second year, the two theoretical research topics are: (1) improving the performance and speed of the CNN and (2) modifying the CNN by adding a segmentation function; the two application topics are: (1) CNN-based 3D object detection, recognition, and segmentation and (2) 3D object position estimation for autonomous indoor vehicles.
This study builds on our previous fruitful research results and focuses on fixed topics to develop special CNN systems that solve tricky problems in visual detection, recognition, segmentation, and positioning. The principal investigator's original research field is computer vision; he has studied computer vision techniques for more than thirty years and has several years of experience applying deep learning techniques to computer vision problems. In the past two years, he has collaborated separately with three companies and ITRI to develop CNN techniques for object detection/recognition and defect inspection on PCBs; thus, we have the ability to carry out this research project.
Effective start/end date: 1/08/20 to 31/07/21

UN Sustainable Development Goals

In 2015, UN member states agreed to 17 global Sustainable Development Goals (SDGs) to end poverty, protect the planet and ensure prosperity for all. This project contributes towards the following SDG(s):

  • SDG 8 - Decent Work and Economic Growth
  • SDG 9 - Industry, Innovation, and Infrastructure
  • SDG 17 - Partnerships for the Goals

Keywords

  • deep learning
  • convolutional neural network
  • computer vision
  • 3D object detection
  • 3D object recognition
  • 3D object segmentation
  • 3D object position

