To operate in human environments, a robot's semantic perception must overcome open-world challenges such as novel objects and domain gaps. Autonomous deployment in such environments therefore requires robots to update their knowledge and learn without supervision. We investigate how a robot can autonomously discover novel semantic classes and improve accuracy on known classes while exploring an unknown environment. To this end, we develop a general framework for mapping and clustering that we then use to generate a self-supervised learning signal for updating a semantic segmentation model. In particular, we show how clustering parameters can be optimized during deployment and that fusing multiple observation modalities improves novel object discovery compared to prior work.
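The following minimal sketch illustrates this loop under strong simplifications: synthetic per-point features stand in for segmentation embeddings, a concatenated array for the map, DBSCAN for the clustering step, and a linear classifier for the segmentation head. All names and parameters are illustrative, not our implementation.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# 1. Mapping: fuse multi-frame observations into one aggregated point set
#    (three synthetic "frames" of 2D features stand in for real embeddings).
frames = [rng.normal(loc=c, scale=1.0, size=(100, 2)) for c in (0.0, 6.0, 12.0)]
map_features = np.concatenate(frames)

# 2. Clustering: group map points without fixing the number of clusters.
pseudo_labels = DBSCAN(eps=0.7, min_samples=5).fit_predict(map_features)
valid = pseudo_labels != -1                     # drop DBSCAN noise points

# 3. Self-supervised update: cluster ids become pseudo-labels for refining
#    the (stand-in) segmentation head.
head = SGDClassifier(random_state=0)
head.fit(map_features[valid], pseudo_labels[valid])
print("discovered pseudo-classes:", sorted(set(pseudo_labels[valid])))
```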
Many existing works use k-means clustering, but its assumption that the number of clusters is known in advance is inadequate for open-world perception. Without a known number of clusters, however, algorithms often get stuck in local minima of over- or under-clustering. We propose an automatic optimization that yields clustering parameters aligned with the segmentation model.
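A minimal sketch of such an optimization, assuming DBSCAN as the clustering algorithm and adjusted mutual information against the model's predictions as the alignment criterion (both are illustrative choices, not necessarily ours):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_mutual_info_score

rng = np.random.default_rng(0)

# Toy stand-ins: embedding features with an unknown cluster count, plus the
# segmentation model's (imperfect) predictions on the same points.
features, blob_ids = make_blobs(n_samples=600, centers=5,
                                cluster_std=0.6, random_state=0)
model_labels = blob_ids.copy()
wrong = rng.choice(600, size=120, replace=False)   # 20% simulated model errors
model_labels[wrong] = rng.integers(0, 5, size=120)

best_eps, best_score = None, -np.inf
for eps in np.linspace(0.2, 2.0, 10):              # candidate density thresholds
    clusters = DBSCAN(eps=eps, min_samples=10).fit_predict(features)
    keep = clusters != -1                          # score only non-noise points
    if keep.sum() < 2 or len(set(clusters[keep])) < 2:
        continue                                   # degenerate clustering
    score = adjusted_mutual_info_score(model_labels[keep], clusters[keep])
    if score > best_score:
        best_eps, best_score = eps, score

if best_eps is not None:
    print(f"selected eps={best_eps:.2f} (model agreement={best_score:.3f})")
```

Because the criterion measures agreement with the model only on confidently clustered points, such a sweep can prefer parameter values that neither over- nor under-segment relative to the segmentation model.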
By integrating single-frame anomaly detection with mapping, we can reliably identify the unknown parts of a scene.
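A minimal sketch of such a fusion, assuming a uniform voxel grid over which per-frame anomaly scores are averaged; the voxel size, threshold, and averaging rule below are illustrative choices:

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
VOXEL = 0.2    # voxel edge length in meters (assumed)
THRESH = 0.6   # fused-score threshold for "unknown" (assumed)

sums = defaultdict(float)    # per-voxel accumulated anomaly score
counts = defaultdict(int)    # per-voxel observation count

for _ in range(5):                                  # five simulated frames
    points = rng.uniform(0.0, 1.0, size=(200, 3))   # 3D points in the map frame
    scores = rng.uniform(0.0, 1.0, size=200)        # per-point anomaly scores
    for p, s in zip(points, scores):
        key = tuple((p // VOXEL).astype(int))       # voxel index of the point
        sums[key] += s
        counts[key] += 1

# Multi-view fusion: a voxel is flagged as unknown only if its anomaly
# score stays high when averaged over all observations of that voxel.
unknown = [v for v in sums if sums[v] / counts[v] > THRESH]
print(f"{len(unknown)} of {len(sums)} voxels flagged as unknown")
```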