A research project proposes a novel method for implementing an Indoor Navigation System: a monocular SLAM-based Indoor Navigation System (bib | DOI | pdf). In recent years, navigation research has expanded significantly. While outdoor navigation has reached commercial-level efficiency, indoor navigation systems (INS) still lag behind their outdoor counterparts. Outdoor systems rely primarily on GPS and inertial trackers, widely adopted since the early 2000s. Indoor navigation commonly uses Bluetooth Low Energy (BLE) beacon technology, but BLE has efficiency limitations compared to outdoor solutions. Other technologies, such as Wi-Fi, lidar, and infrared sensors, are also used for indoor navigation.
Among various approaches to implementing INS, vision-based solutions are promising for their usability and operability. Vision-based navigation aligns with how humans recognize environments by associating unique landmarks and objects. A vision-based INS typically involves a network of markers, such as QR codes, barcodes, ArUco markers, or customized patterns, scanned by users as they navigate. While this method is accurate and straightforward, it relies heavily on the availability of markers. If markers are inaccessible, locating the next one can become time-consuming and confusing. Implementing a new marker-based INS in an unfamiliar location can also be tedious.
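The appeal of marker-based navigation is that localization reduces to a table lookup. A minimal sketch (the marker IDs and floor coordinates below are illustrative, not from the paper): scanning a marker yields the user's position directly.

```python
# Hypothetical marker registry: each deployed marker ID maps to a known
# (x, y) floor position in metres. Scanning a marker localizes the user
# instantly -- but only if a marker is within reach.
MARKER_POSITIONS = {
    "aruco_17": (4.0, 1.5),   # hallway entrance
    "aruco_23": (12.0, 1.5),  # elevator lobby
}

def locate(marker_id):
    """Return the (x, y) floor position for a scanned marker, or None."""
    return MARKER_POSITIONS.get(marker_id)
```

This also makes the method's weakness concrete: between markers, `locate` has nothing to return, which is exactly the gap a markerless approach aims to close.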
A better approach is simultaneous localization and mapping (SLAM), a computational process that enables a device to build a 3D map of an unknown environment while locating itself within it. SLAM applications include robotic vacuum cleaners, autonomous vehicles, other robots, and extended reality. With advancements in mobile camera technology, iOS and Android now ship augmented reality development kits, ARKit and ARCore respectively. However, challenges such as map accuracy and the storage requirements of 3D point clouds remain.
In this paper, we propose a markerless, vision-based, cost-effective, real-time solution for indoor navigation using visual SLAM. Visual SLAM's ability to operate on monocular cameras is crucial since our system uses the mobile device’s built-in camera. Advances in computer vision algorithms have demonstrated the potential of monocular visual SLAM on mobile devices. For scalability and to overcome AR SDK challenges, we use ARKit’s ARWorldMap module. ARWorldMap allows users to create location-based 3D maps, containing details such as coordinates and object identifiers.
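To make the map data concrete, here is a hypothetical sketch of the metadata a client might upload alongside a serialized world map. The field names and values are our assumption for illustration, not ARKit's actual ARWorldMap schema: the point is that each stored map pairs an identifier with location tags that carry coordinates and object identifiers.

```python
import json

# Illustrative server-side record for one indoor space (all names are
# hypothetical): the serialized ARWorldMap blob would be stored separately,
# while this metadata lets the server index tags by ID and 3D position.
map_record = {
    "map_id": "floor-2-east",
    "tags": [
        {"id": "room-201", "position": [3.2, 0.0, -7.5]},
        {"id": "exit-b",   "position": [0.0, 0.0, -15.0]},
    ],
}

payload = json.dumps(map_record)   # what a client might upload
restored = json.loads(payload)     # what a navigating client would receive
```

Keeping the tag metadata as lightweight JSON means navigation clients can fetch the path graph without downloading the full 3D point cloud up front.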
Indoor navigation systems are still evolving. This paper addresses existing limitations in vision-based systems with features designed to enhance usability and operability.
Our system includes two main phases: (1) Generating a 3D map of the environment and (2) Localizing the device using ARWorldMap coordinates. Figure 2 illustrates the system architecture, showing interactions between mobile devices (clients) and server components that enable navigation.
The proposed system uses a shared server for storing and retrieving 3D maps of indoor spaces. Initially, the mobile device scans the environment using ARWorldMap, constructing a 3D map and defining shortest paths between location tags, which it uploads to the server. During navigation, the device receives the 3D map and path data, locating itself within the environment to guide users.
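The shortest-path computation between location tags can be sketched with Dijkstra's algorithm over a weighted graph. The graph below is illustrative (tag names and distances are our assumption, not from the paper), with edge weights as walking distances in metres.

```python
import heapq

# Hypothetical tag graph for one floor: adjacency dict of
# {tag: {neighbour: walking_distance_in_metres}}.
GRAPH = {
    "entrance": {"lobby": 5.0},
    "lobby":    {"entrance": 5.0, "room-201": 8.0, "exit-b": 12.0},
    "room-201": {"lobby": 8.0},
    "exit-b":   {"lobby": 12.0},
}

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm: return (tag sequence, total distance) from
    start to goal, or (None, inf) if no route exists."""
    queue = [(0.0, start, [start])]  # (cost so far, current tag, path)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path, cost
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None, float("inf")
```

Precomputing these paths at map-creation time, as the system does, keeps the navigation phase to a cheap lookup on the device.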
The main limitation is that the system requires consistency in the physical environment, as changes in object color intensity can disrupt mapping. The current implementation is limited to iOS, although a Unity plugin could enable cross-platform compatibility across iOS and Android. Future work includes extending the system to Android and integrating it with mature outdoor navigation systems to provide a seamless user experience.
Want to join the squad? Feel free to reach out to me anytime! Here's my contact info: