selected

Sim2Real Diffusion: Leveraging Foundation Vision Language Models for Adaptive Automated Driving

Abstract: Simulation-based design, optimization, and validation of autonomous vehicles have proven to be crucial for their improvement over the years. Nevertheless, the ultimate measure of effectiveness is their successful transition from simulation to reality (sim2real).

Point Cloud Self-supervised Learning via 3D to Multi-view Masked Learner

Abstract: Recently, multi-modal masked autoencoders (MAE) has been introduced in 3D self-supervised learning, offering enhanced feature learning by leveraging both 2D and 3D data to capture richer cross-modal representations. However, these approaches have two limitations: (1) they inefficiently require both 2D and 3D modalities as inputs, even though the inherent multi-view properties of 3D point clouds already contain 2D modality.

SAM-Guided Masked Token Prediction for 3D Scene Understanding

Abstract: Foundation models have significantly enhanced 2D task performance, and recent works like Bridge3D have successfully applied these models to improve 3D scene understanding through knowledge distillation, marking considerable advancements. Nonetheless, challenges such as the misalignment between 2D and 3D representations and the persistent long-tail distribution in 3D datasets still restrict the effectiveness of knowledge distillation from 2D to 3D using foundation models.

An Update on International Robotic Wheelchair Development

Abstract: Disability knows no boarders, so the development of assistive technology is an international effort. This review is a follow up to our previous comprehensive review (Leaman 2017) and a recent mini-review (Sivakanthan 2022).

Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models

Abstract: Foundation models have made significant strides in 2D and language tasks such as image segmentation, object detection, and visual-language understanding. Nevertheless, their potential to enhance 3D scene representation learning remains largely untapped due to the domain gap.