A research project proposes a novel method for implementing an Indoor Navigation System: a monocular SLAM-based Indoor Navigation System (bib | DOI | pdf). In recent years, navigation research has expanded significantly. While outdoor navigation has reached commercial-level efficiency, indoor navigation systems (INS) still lag behind their outdoor counterparts. Outdoor systems rely primarily on GPS and inertial trackers, widely adopted since the early 2000s. Indoor navigation commonly uses Bluetooth Low Energy (BLE) beacon technology, but BLE has efficiency limitations compared to outdoor solutions. Other technologies, such as Wi-Fi, lidar, and infrared sensors, are also used for indoor navigation.
Among various approaches to implementing INS, vision-based solutions are promising for their usability and operability. Vision-based navigation aligns with how humans recognize environments by associating unique landmarks and objects. A vision-based INS typically involves a network of markers, such as QR codes, barcodes, ArUco markers, or customized patterns, scanned by users as they navigate. While this method is accurate and straightforward, it relies heavily on the availability of markers. If markers are inaccessible, locating the next one can become time-consuming and confusing. Implementing a new marker-based INS in an unfamiliar location can also be tedious.
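The appeal of marker-based navigation is that localization reduces to a table lookup. A minimal sketch (the marker IDs and floor coordinates below are illustrative, not from the paper): scanning a marker yields the user's position directly.

```python
# Hypothetical marker registry: each deployed marker ID maps to a known
# (x, y) floor position in metres. Scanning a marker localizes the user
# instantly -- but only if a marker is within reach.
MARKER_POSITIONS = {
    "aruco_17": (4.0, 1.5),   # hallway entrance
    "aruco_23": (12.0, 1.5),  # elevator lobby
}

def locate(marker_id):
    """Return the (x, y) floor position for a scanned marker, or None."""
    return MARKER_POSITIONS.get(marker_id)
```

This also makes the method's weakness concrete: between markers, `locate` has nothing to return, which is exactly the gap a markerless approach aims to close.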
A better approach is simultaneous localization and mapping (SLAM), a computational process that enables a device to build a 3D map of an unknown environment while locating itself within it. SLAM applications include robotic vacuum cleaners, autonomous vehicles, other robots, and extended reality. With advancements in mobile camera technology, iOS and Android now ship augmented reality development kits, ARKit and ARCore respectively. However, challenges such as map accuracy and the storage requirements of 3D point clouds remain.
In this paper, we propose a markerless, vision-based, cost-effective, real-time solution for indoor navigation using visual SLAM. Visual SLAM's ability to operate on monocular cameras is crucial since our system uses the mobile device’s built-in camera. Advances in computer vision algorithms have demonstrated the potential of monocular visual SLAM on mobile devices. For scalability and to overcome AR SDK challenges, we use ARKit’s ARWorldMap module. ARWorldMap allows users to create location-based 3D maps, containing details such as coordinates and object identifiers.
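To make the map data concrete, here is a hypothetical sketch of the metadata a client might upload alongside a serialized world map. The field names and values are our assumption for illustration, not ARKit's actual ARWorldMap schema: the point is that each stored map pairs an identifier with location tags that carry coordinates and object identifiers.

```python
import json

# Illustrative server-side record for one indoor space (all names are
# hypothetical): the serialized ARWorldMap blob would be stored separately,
# while this metadata lets the server index tags by ID and 3D position.
map_record = {
    "map_id": "floor-2-east",
    "tags": [
        {"id": "room-201", "position": [3.2, 0.0, -7.5]},
        {"id": "exit-b",   "position": [0.0, 0.0, -15.0]},
    ],
}

payload = json.dumps(map_record)   # what a client might upload
restored = json.loads(payload)     # what a navigating client would receive
```

Keeping the tag metadata as lightweight JSON means navigation clients can fetch the path graph without downloading the full 3D point cloud up front.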
Indoor navigation systems are still evolving. This paper addresses existing limitations in vision-based systems with features designed to enhance usability and operability.
Our system includes two main phases: (1) Generating a 3D map of the environment and (2) Localizing the device using ARWorldMap coordinates. Figure 2 illustrates the system architecture, showing interactions between mobile devices (clients) and server components that enable navigation.
The proposed system uses a shared server for storing and retrieving 3D maps of indoor spaces. Initially, the mobile device scans the environment using ARWorldMap, constructing a 3D map and defining shortest paths between location tags, which it uploads to the server. During navigation, the device receives the 3D map and path data, locating itself within the environment to guide users.
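The shortest-path computation between location tags can be sketched with Dijkstra's algorithm over a weighted graph. The graph below is illustrative (tag names and distances are our assumption, not from the paper), with edge weights as walking distances in metres.

```python
import heapq

# Hypothetical tag graph for one floor: adjacency dict of
# {tag: {neighbour: walking_distance_in_metres}}.
GRAPH = {
    "entrance": {"lobby": 5.0},
    "lobby":    {"entrance": 5.0, "room-201": 8.0, "exit-b": 12.0},
    "room-201": {"lobby": 8.0},
    "exit-b":   {"lobby": 12.0},
}

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm: return (tag sequence, total distance) from
    start to goal, or (None, inf) if no route exists."""
    queue = [(0.0, start, [start])]  # (cost so far, current tag, path)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path, cost
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None, float("inf")
```

Precomputing these paths at map-creation time, as the system does, keeps the navigation phase to a cheap lookup on the device.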
The main limitation is that the system requires consistency in the physical environment, as changes in object color intensity can disrupt mapping. The current implementation is limited to iOS, although a Unity plugin could enable cross-platform compatibility across iOS and Android. Future work includes extending the system to Android and integrating it with mature outdoor navigation systems to provide a seamless user experience.
Want to join the squad? Feel free to reach out to me anytime! Here's my contact info: