Representing Objects using Global 3D Relational Features for Recognition Tasks

Wail Mustafa

Research output: Book/Report › Ph.D. thesis


In robotic systems, visual interpretation of the environment is an essential element in a variety of applications, especially those involving manipulation of objects. Interpreting the environment is often done in terms of recognizing objects using machine learning approaches. For user acceptance, the recognition must be reliable and the learning must be fast, i.e. only a few training samples should be needed.

In this thesis, a framework utilized in three recognition tasks is presented. The framework combines effective learning algorithms with strong visual representations. For representing objects, we derive global descriptors that encode shape using viewpoint-invariant features obtained from multiple sensors observing the scene. Objects are also described by color independently, which allows color and shape to be combined when the task requires it. For more robust color description, color calibration is performed. The framework was used in three recognition tasks: object instance recognition, object category recognition, and object spatial relationship recognition.
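Since shape and color are described independently, a natural way to combine them when a task needs both is to normalize each descriptor separately and concatenate. The snippet below is a minimal sketch of that idea; the descriptor lengths, the `l2_normalize` helper, and the random placeholder vectors are illustrative assumptions, not the thesis's actual descriptors.

```python
import numpy as np

def l2_normalize(v):
    """Scale a descriptor to unit length so each modality contributes comparably."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def combine_descriptors(shape_desc, color_desc):
    """Concatenate independently normalized shape and color descriptors."""
    return np.concatenate([l2_normalize(shape_desc), l2_normalize(color_desc)])

# Hypothetical fixed-length descriptors for one observed object.
rng = np.random.default_rng(0)
shape_desc = rng.random(64)   # e.g. a histogram over pairwise 3D feature relations
color_desc = rng.random(16)   # e.g. a hue histogram
combined = combine_descriptors(shape_desc, color_desc)
print(combined.shape)  # (80,)
```

Normalizing before concatenation keeps one modality from dominating the distance metric; dropping the color half recovers a shape-only descriptor when color is irrelevant to the task.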

For the object instance recognition task, we present a system that utilizes color and scale-variant shape representations. We achieve both high performance and learning efficiency. The learning efficiency is expressed in terms of scalability to many objects while requiring only a few training samples. The system has also been applied in a real-world application, in which its reliability allowed initiating higher-level semantic interpretations of complex scenes.

In the object category recognition task, we present a system capable of assigning multiple, nested categories to novel objects using a method developed for this purpose. Integrating this method with other multi-label learning approaches is also investigated. The system uses a scale-invariant shape representation and is shown to outperform a state-of-the-art method, particularly when only a few samples are used for training. Both systems are benchmarked using a multi-view object dataset of 100 objects specially created for this purpose.
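Assigning multiple, nested categories at once is a multi-label problem: each label gets its own binary decision, so an object can be, say, both "container" and "cup". The toy one-vs-rest classifier below sketches that setup; the nearest-centroid rule, the class name, and the 2D toy data are illustrative assumptions standing in for the thesis's actual learning method.

```python
import numpy as np

class OneVsRestCentroid:
    """Minimal one-vs-rest multi-label classifier: one positive/negative
    centroid pair per label, so several labels can fire for one sample."""

    def fit(self, X, Y):
        # Y is a binary indicator matrix of shape (n_samples, n_labels).
        self.pos = np.array([X[Y[:, j] == 1].mean(axis=0) for j in range(Y.shape[1])])
        self.neg = np.array([X[Y[:, j] == 0].mean(axis=0) for j in range(Y.shape[1])])
        return self

    def predict(self, X):
        # A label fires when the sample is closer to its positive centroid.
        d_pos = np.linalg.norm(X[:, None, :] - self.pos[None], axis=2)
        d_neg = np.linalg.norm(X[:, None, :] - self.neg[None], axis=2)
        return (d_pos < d_neg).astype(int)

# Toy data: label 0 = "container" (broad), label 1 = "cup" (nested inside it).
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
Y = np.array([[1, 1], [1, 1], [0, 0], [0, 0]])
clf = OneVsRestCentroid().fit(X, Y)
print(clf.predict(np.array([[0.05, 0.0]])))  # [[1 1]] — both nested labels assigned
```

Because each label is decided independently, nested label sets fall out naturally; enforcing consistency between a category and its parent is where a dedicated method, like the one developed in the thesis, goes beyond this plain one-vs-rest baseline.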

Finally, we show that, by defining global descriptors between objects, the framework can be extended to perform recognition of object spatial relationships. In this task, we show that high recognition performance can be achieved for a number of spatial relationships between objects in a simulated environment.
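A descriptor "defined between objects" takes a pair of objects rather than a single one as input, for instance the offset between their centroids and their closest-point distance. The sketch below illustrates that idea on two toy point clouds; the specific features, the `pairwise_relation_features` name, and the stacked-cuboids example are assumptions for illustration, not the thesis's actual relational descriptor.

```python
import numpy as np

def pairwise_relation_features(points_a, points_b):
    """Illustrative global descriptor between two objects: centroid offset
    (3 values) plus the minimum point-to-point distance (1 value)."""
    ca, cb = points_a.mean(axis=0), points_b.mean(axis=0)
    offset = cb - ca
    # Minimum pairwise distance serves as a crude contact/proximity cue.
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=2)
    return np.concatenate([offset, [d.min()]])

# Two toy point clouds: flat cuboid B resting on top of flat cuboid A.
rng = np.random.default_rng(0)
a = rng.random((100, 3)) * [1.0, 1.0, 0.2]
b = rng.random((100, 3)) * [1.0, 1.0, 0.2] + [0.0, 0.0, 0.2]
feat = pairwise_relation_features(a, b)
print(feat.shape)  # (4,)
```

A classifier trained on such pairwise features could then label relations like "on top of" (positive vertical offset, near-zero gap), which is the kind of recognition the extended framework performs in the simulated environment.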
Original language: English
Publication status: Published - 2015
Externally published: Yes


