Notes on Indoor Navigation

Indoor Navigation is one of the most promising research areas in pervasive computing. I have been working on Indoor Navigation for a couple of years now. With this article, I want to highlight some of my research results together with some open questions for further investigation.

The Area of Indoor Navigation

When you work in the field of Indoor Navigation, you soon realize that the term means something completely different depending on whom you talk to. For a large group of scientists, Indoor Navigation is about inferring the spatial coordinates of mobile devices inside buildings, similar to GPS. In the rest of this text, we will refer to this as Indoor Positioning. Other people define Indoor Navigation as a solution to the demand for a guidance experience inside buildings similar to the one we are used to from car navigation with GPS. This is what we will call Indoor Guidance. Underlying all of these situations is spatial information about buildings, which many applications need. The area of creating such data is often called Indoor Mapping, while the area of using these maps for services is often called Indoor Navigation or Indoor Routing.

These terms are used with different meanings all the time. What the community has agreed on so far, however, is that there are some building blocks. For a concrete application, some building blocks can be omitted, and in general each building block can be implemented with a different mindset, model, or algorithm. In my opinion, there are the following blocks:

  • Indoor Map Information
  • Indoor Position Determination
  • Information Fusion
  • Routing, Route Description, and Guidance

I myself have been working on most of these (at least a bit) and want to highlight some of my research results and ideas in the various areas.

Indoor Map Information

Indoor map information basically comes in several flavours. One way to distinguish various approaches is via the dimensionality of maps. The most important choices of dimensionality are

  • 2D Maps: Buildings are represented in 2D, that is on a flat surface
  • 2.5D Maps: Buildings are basically represented in 2D, but there are several layers, one for each story of a building.
  • 3D Maps: Buildings are represented as 3D objects in space.

The choice of dimensionality has a severe impact on the complexity of the map and, unfortunately, also on the complexity of generating the map and of performing computations in it. Therefore, none of these choices is better than the others in general. It always depends on the use case and situation. For example, with a 3D laser scanner and camera system such as the one developed by NavVis, it is possible to create a 3D map of a building easily and quickly. However, it is not easy to provide navigation services from this data directly, and you need to survey the building beforehand. In the robotics domain, 2D maps are used very often. In this case, the limited dimension just reflects the limited ability of the robot (especially for driving robots) to use or experience a third dimension. Additionally, filtering techniques such as particle filters are most often run in 2D maps, sometimes in 2.5D. This is because the number of hypotheses (e.g., particles) would explode with the number of dimensions of the space. Finally, note that most user interfaces are 2D. Yes, there might be more 3D indoor maps in the near future for Virtual Reality or Augmented Reality applications, but we are not there yet.

Another distinction with respect to map information concerns the actual representation of the data:

  • Occupancy Grid Maps are maps in which a pixel (or voxel in 3D space) is either occupied or free.
  • Vector Drawings are maps in which the geometry is composed of (usually) simple geometric objects including points, lines, polylines, surfaces and others.
  • Point Clouds are maps in which any observed point is put into a full 3D space as a point.

In this area, I have been working on a toolchain in order to extract sufficient map information from CAD drawings of Munich airport. In expectation of changes in the floorplan over time (construction work, but even seasonal effects such as a Christmas market), we decided to model all navigational data as an overlay to the CAD data, so that we could easily exchange and update the underlying CAD dataset.

Screenshot showing a university building together with automatically extracted information about doors

However, we also wanted to exploit the existing CAD data as much as possible. In essence, we designed a semi-automated system that finds and classifies several kinds of building objects, which greatly reduces the time needed, and that gradually builds a large grid-based navigation graph for the building.

Grid-based graphs have their limitations and it is not easy to set the correct scale. In the following grid graph, few vertices are being used to model the given space, but exploration with this grid resolution was unable to model the small room in the middle.

A graph sampled from a low resolution grid
A graph sampled from a higher resolution grid

A smaller grid length repairs this limitation, but comes with a dramatic increase in the number of vertices. However, for usual buildings, the number of vertices will always be small compared to the continental-scale outdoor street networks used in street navigation.
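The resolution trade-off can be reproduced with a small sketch. Everything in it is an assumption made for illustration and not the toolchain described above: the floor plan is a made-up union of axis-aligned rectangles (a hall, a small room, and a door joining them), coordinates are in centimetres, and the names is_free and sample_grid_graph are hypothetical.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in centimetres
Point = Tuple[int, int]


def is_free(p: Point, free_rects: List[Rect]) -> bool:
    """A point is walkable if it lies strictly inside at least one free rectangle."""
    x, y = p
    return any(x0 < x < x1 and y0 < y < y1 for x0, y0, x1, y1 in free_rects)


def sample_grid_graph(free_rects: List[Rect], width: int, height: int,
                      step: int) -> Dict[Point, List[Point]]:
    """Sample vertices on a regular grid and connect walkable 4-neighbours."""
    vertices = {
        (x, y)
        for x in range(0, width + 1, step)
        for y in range(0, height + 1, step)
        if is_free((x, y), free_rects)
    }
    graph: Dict[Point, List[Point]] = {v: [] for v in vertices}
    for x, y in vertices:
        for nx, ny in ((x + step, y), (x, y + step)):
            midpoint = ((x + nx) // 2, (y + ny) // 2)
            # Only the midpoint is tested, so thin obstacles between two
            # vertices can still be missed (a deliberate simplification).
            if (nx, ny) in vertices and is_free(midpoint, free_rects):
                graph[(x, y)].append((nx, ny))
                graph[(nx, ny)].append((x, y))
    return graph


# Hypothetical floor plan: a 10 m x 4 m hall, a 1 m x 1 m room, and a door.
FLOOR = [(0, 0, 1000, 400), (450, 400, 550, 500), (480, 380, 520, 420)]

coarse = sample_grid_graph(FLOOR, 1000, 500, step=200)
fine = sample_grid_graph(FLOOR, 1000, 500, step=50)
print(len(coarse), len(fine))
```

With these made-up dimensions, the 2 m grid yields 4 vertices and none of them lies in the small room, while the 0.5 m grid yields 135 vertices and reaches the room through the door.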

In summary, depending on the application, a different type of map seems to be appropriate. On the other hand, we need interoperability and, possibly, standardization activities in order to better exploit the knowledge extracted from site surveys and manual map generation. From a research perspective, one of the most important questions could be formulated like this:

How can we transform map data from one representation into another in a seamless way, such that we gain the flexibility of choosing one map representation while surveying (e.g., a 2D vector map) and using another (e.g., an occupancy grid) in a given application?
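One direction of such a transformation, from a 2D vector map to an occupancy grid, is comparatively easy and can be sketched in a few lines. The sketch assumes that walls are given as straight line segments in metres and that sampling each segment every half cell is good enough; the function name rasterize_walls and the example walls are hypothetical. The opposite direction, recovering clean vector geometry from a grid, is much harder.

```python
import math
from typing import List, Tuple

Segment = Tuple[float, float, float, float]  # wall from (x0, y0) to (x1, y1), in metres


def rasterize_walls(walls: List[Segment], width_m: float, height_m: float,
                    cell_m: float) -> List[List[int]]:
    """Return grid[row][col] with 1 = occupied and 0 = free."""
    cols = math.ceil(width_m / cell_m)
    rows = math.ceil(height_m / cell_m)
    grid = [[0] * cols for _ in range(rows)]
    for x0, y0, x1, y1 in walls:
        length = math.hypot(x1 - x0, y1 - y0)
        # Sample the segment every half cell. This is a simplification: a cell
        # that a diagonal wall only clips at a corner can be skipped.
        samples = max(1, math.ceil(length / (cell_m / 2)))
        for i in range(samples + 1):
            t = i / samples
            col = min(cols - 1, max(0, int((x0 + t * (x1 - x0)) / cell_m)))
            row = min(rows - 1, max(0, int((y0 + t * (y1 - y0)) / cell_m)))
            grid[row][col] = 1
    return grid


# Hypothetical 4 m x 3 m area with one horizontal and one vertical wall.
walls = [(0.5, 0.5, 3.5, 0.5), (2.5, 0.5, 2.5, 2.5)]
for row in rasterize_walls(walls, 4.0, 3.0, 1.0):
    print(row)
```

An exact grid traversal would avoid the skipped-corner case, at the cost of a longer routine.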

In my research, some aspects of this question have been discussed in depth (e.g., using computer vision to extract building objects like doors, elevators, and escalators from a blueprint @citation), but a lot of the problems in this area are unsolved and extremely interesting in practice.

Personally, I expect the mapping agencies (Google Maps, HERE, and others) to try to map indoor spaces in a representation that fits their outdoor maps. Startups (including NavVis) and other companies without such a large data basis will more likely follow a principle of minimal modelling effort. Consequently, we can expect to get access to a large set of building information bases in different representations, and it would be a good idea to prepare for this challenge.

Indoor Position Determination

Finding the location of a mobile asset is basically a prerequisite to navigation. Without a position in a map, it is very difficult to find and present useful location-based information for mobile users. However, there are several perspectives on what is actually meant by position.


Most commonly, a position refers to coordinates in a reference frame or, in other words, a single unique location on the globe. This is because GNSS positioning usually delivers coordinates in this way. However, other positioning systems are based on data mining (e.g., Wi-Fi fingerprinting), on map matching (e.g., terrain navigation), or on proximity (e.g., iBeacon and Eddystone). Many of those systems are still designed and evaluated in terms of coordinates. It is clear, however, that we can also relax location to larger spaces (e.g., the name of the street, the homotopy class of a path in a terrain, the room name). Unfortunately, this is widely ignored by the positioning community because it is extremely difficult to assess the quality of a location in these coarse and non-numeric settings. Still, room-based positioning systems that map sensor information to room labels have been proposed.
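To make the idea of room-based positioning concrete, here is a minimal sketch of a nearest-neighbour classifier that maps a Wi-Fi scan directly to a room label without computing any coordinates. It is only one possible design and not a specific published system: the radio map, the access point names, the placeholder value of -100 dBm for unseen access points, and the function names are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Scan = Dict[str, float]          # access point id -> RSSI in dBm
MISSING_DBM = -100.0             # assumed value for access points not heard


def signal_distance(a: Scan, b: Scan) -> float:
    """Euclidean distance in signal space over the union of access points."""
    aps = set(a) | set(b)
    return math.sqrt(sum((a.get(ap, MISSING_DBM) - b.get(ap, MISSING_DBM)) ** 2
                         for ap in aps))


def predict_room(scan: Scan, radio_map: List[Tuple[str, Scan]], k: int = 3) -> str:
    """Majority vote over the room labels of the k nearest reference scans."""
    nearest = sorted(radio_map, key=lambda entry: signal_distance(scan, entry[1]))[:k]
    votes = Counter(room for room, _ in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical survey data: (room label, reference scan).
RADIO_MAP = [
    ("lab", {"ap1": -40.0, "ap2": -70.0}),
    ("lab", {"ap1": -45.0, "ap2": -72.0}),
    ("kitchen", {"ap1": -75.0, "ap2": -42.0}),
    ("kitchen", {"ap1": -78.0, "ap2": -40.0}),
    ("hall", {"ap1": -60.0, "ap2": -60.0}),
]

print(predict_room({"ap1": -43.0, "ap2": -69.0}, RADIO_MAP))
```

The output is a label, so the usual error metrics in metres do not apply; an evaluation would have to count correctly labelled scans instead.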

One relatively new idea of mine is to represent position in terms of a complete trajectory: You are either on it or not. Several services have been proposed in this way, for example, estimating the indoor location you are heading for.

The following video shows a demo application of this concept:

In this video, you can clearly see that the system works well in predicting the next location. In particular, it has more or less no delay, as you can see when we pass the door at time 1:25. Up to that point, it is reasonable to assume that we are heading for the room. At the moment we pass this door and do not enter the stairs, the system realizes that we are not going to this room. Impressive, right?

While this application does not provide any spatial location at all, it is easy to set up and use. It replaces the common filtering techniques (Kalman filter, particle filter, etc.) that reduce measurement noise with a holistic trajectory computing approach based on the Fréchet distance. This motivates the second general challenge in indoor navigation related to positioning:
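For readers who have not met the Fréchet distance, the following sketch computes its discrete variant between two point sequences by dynamic programming and uses it to pick the closest known trajectory. This is a simplification chosen for brevity and not necessarily the computation used in the demo; the function names and the example trajectories are hypothetical. Because math.dist accepts points of any dimension, the same code also works on traces in a high-dimensional signal space.

```python
import math
from typing import Dict, List, Sequence, Tuple

Trace = List[Tuple[float, ...]]


def discrete_frechet(p: Sequence[Sequence[float]], q: Sequence[Sequence[float]]) -> float:
    """Discrete Fréchet distance between two non-empty point sequences."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance on p, on q, or on both, whichever keeps the leash shortest.
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


def closest_trajectory(trace: Trace, known: Dict[str, Trace]) -> str:
    """Label of the known trajectory with the smallest distance to the trace."""
    return min(known, key=lambda label: discrete_frechet(trace, known[label]))


# Hypothetical reference trajectories in a 2D space.
KNOWN = {
    "to_room": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0)],
    "to_stairs": [(0.0, 0.0), (1.0, 0.0), (1.0, -1.0), (1.0, -2.0)],
}

print(closest_trajectory([(0.0, 0.1), (1.1, 0.0), (2.0, 0.2), (2.1, 0.9)], KNOWN))
```

A prediction service along these lines would compare the trace observed so far against prefixes of the known trajectories rather than against complete ones.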

How can we overcome the limitations of coordinate-based systems, and how can we integrate coordinate-based and coordinate-free approaches in pervasive computing environments?

Information Fusion

A vital part of indoor positioning and, therefore, also of indoor navigation is multisensor navigation. As single systems (e.g., GNSS, Wi-Fi, etc.) do not suffice to provide indoor orientation for indoor navigation systems, one basic approach is to integrate all available measurements, that is, to use Wi-Fi, GNSS, IMUs, vision, and other systems in combination.

For example, in the following figure showing the Wi-Fi signal space as simulated by the wall attenuation factor model, one might expect the contours to form circles. Their great deviation from circles shows that distance is not actually determined by signal strength: In fact, we have to fuse lots of these models together, as is being done in Wi-Fi fingerprinting.

Complexity of Wi-Fi signal space which needs information fusion.
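The effect can be illustrated numerically. In the sketch below, the wall attenuation factor model is written as I understand it: a log-distance path loss minus a fixed attenuation per wall between transmitter and receiver, capped at a maximum number of walls. All parameter values (reference power, path loss exponent, attenuation per wall, cap) are illustrative assumptions and not measured values, and the function names are hypothetical.

```python
import math


def waf_rssi(distance_m: float, walls: int, p_d0: float = -40.0, d0_m: float = 1.0,
             path_loss_exp: float = 2.0, waf_db: float = 3.0, max_walls: int = 4) -> float:
    """Predicted RSSI in dBm at a given distance with a number of walls in between."""
    walls_counted = min(walls, max_walls)   # beyond the cap, extra walls are ignored
    path_loss = 10.0 * path_loss_exp * math.log10(distance_m / d0_m)
    return p_d0 - path_loss - walls_counted * waf_db


def naive_range(rssi_dbm: float, p_d0: float = -40.0, d0_m: float = 1.0,
                path_loss_exp: float = 2.0) -> float:
    """Distance estimate that inverts the path loss and ignores walls."""
    return d0_m * 10.0 ** ((p_d0 - rssi_dbm) / (10.0 * path_loss_exp))


# Two receivers at the same distance of 10 m, one of them behind two walls.
open_space = waf_rssi(10.0, walls=0)
behind_walls = waf_rssi(10.0, walls=2)
print(open_space, behind_walls, naive_range(behind_walls))
```

With these assumed parameters, the receiver behind two walls sees a signal 6 dB weaker, and a wall-agnostic range estimate places it at roughly twice its true distance. This is why contours of equal signal strength follow the floor plan rather than circles.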

For information fusion, there are several approaches. Some are suited to real-time applications but make strong assumptions about the tracked system, such as the Kalman filter and its variants, which assume local linearity and Gaussian noise. Others, such as particle filtering, handle general systems and fulfil guarantees in a wide range of application scenarios. The latter, however, need a large and often prohibitive amount of computation.
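To show why the Kalman filter is so cheap, here is a minimal one-dimensional sketch. It assumes a static state with additive Gaussian process and measurement noise, so the whole belief is a mean and a variance and each measurement costs a handful of arithmetic operations. The function name kalman_step and the numbers are hypothetical. A particle filter would instead carry many weighted samples of the state through every step.

```python
from typing import Iterable, Tuple


def kalman_step(mean: float, var: float, measurement: float,
                process_var: float, meas_var: float) -> Tuple[float, float]:
    """One predict and update cycle of a scalar Kalman filter."""
    # Predict: the state is assumed static, so only the uncertainty grows.
    var_pred = var + process_var
    # Update: blend prediction and measurement according to their variances.
    gain = var_pred / (var_pred + meas_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var_pred
    return new_mean, new_var


def track(measurements: Iterable[float], mean: float, var: float,
          process_var: float, meas_var: float) -> Tuple[float, float]:
    """Run the filter over a sequence of measurements."""
    for z in measurements:
        mean, var = kalman_step(mean, var, z, process_var, meas_var)
    return mean, var


# Hypothetical noisy distance readings in metres around a true value near 5.
print(track([5.3, 4.8, 5.1, 4.9, 5.2], mean=0.0, var=100.0,
            process_var=0.01, meas_var=0.25))
```

The price for this efficiency is the Gaussian assumption: a belief with two separate peaks, for example "in this corridor or in the one above", cannot be represented by a single mean and variance.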

While both approaches have been investigated in depth, the natural question of whether there are systems in between should be discussed more often. Are there more efficient ways than sampling to track sensor information along with its accuracy? Are there generalizations of the assumptions behind the various Kalman filters?

But more generally, I doubt that filtering is always a good idea. It is a good idea when the ultimate goal is to integrate measurements in order to generate position information in coordinate space. In the terminology of deep learning, however, this is similar to a very low-dimensional autoencoder: process all information into two values x and y, such that the expectation of observing the information at this point in space is high. This autoencoding perspective makes it clear that a lot of effects need to be rejected even though they can be informative. We have exploited such regular errors made by simple Wi-Fi positioning by applying trajectory computing to traces in high-dimensional Wi-Fi signal space, and we are ourselves impressed by the amount of information that seems to be extractable from those spaces, where positioning would fail and rely on the filter. The video above is an example of a result taken from this perspective.

This motivates the third general challenge for indoor location-based services:

Are we able to fuse the multitude of sensor information in such a way that good intermediate results in the form of spatial coordinates are made available, while useful information disturbing the signal situation is retained and made available in addition to these coordinates?

Routing, Route Description, and Guidance

Indoor Navigation research has concentrated largely on the positioning aspects of this challenge. This is reasonable for at least two reasons: first, it is a great challenge with a clear evaluation metric (e.g., mean squared error) and, second, without major advances in coarser forms of spatial computing, it is an inevitable input to a navigation system.

A very basic fact that makes indoor geometry largely different from many outdoor scenarios is the missing triangle inequality. In the triangle depicted below, the triangle inequality would read: Taking the stairs is always faster than taking the escalator and stepping two meters to the left.


When we assume that we are given a positioning system that works indoors with moderate noise and errors, how would we proceed towards indoor location-based services? Of course, we could map a lot of buildings in various forms as described in the first section on Indoor Map Information. This would lead to a system in which you can provide turn-by-turn navigation just as you would in a car. However, the amount of attention given to a screen while walking through an unknown building is a lot smaller. One fairy tale often told by practitioners in this area goes like this: We now provide a navigation system of this type, because soon everyone will be wearing Google Glass or a similar AR headset. Other researchers avoid this problem of unavailable visual user interfaces altogether by providing specialized services for the blind. Unfortunately, there is not much discussion about how we should actually calculate shortest paths in buildings, how we can describe them, and how we can generate usable guidance from them.
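As a baseline for that discussion, the following sketch runs Dijkstra's algorithm on a tiny, made-up building graph whose edge weights are walking times in seconds. The graph, the node names, and the times are hypothetical. The escalator is modelled as a one-way edge, which already makes the cost of a route differ from the cost of the way back, something Euclidean distance cannot express.

```python
import heapq
import math
from typing import Dict

Graph = Dict[str, Dict[str, float]]   # node -> {neighbour: walking time in seconds}


def shortest_time(graph: Graph, source: str, target: str) -> float:
    """Dijkstra's algorithm; returns math.inf if the target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        time, node = heapq.heappop(queue)
        if node == target:
            return time
        if time > best.get(node, math.inf):
            continue                      # stale queue entry
        for neighbour, cost in graph.get(node, {}).items():
            candidate = time + cost
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Hypothetical two-storey building: the escalator only runs upwards.
BUILDING: Graph = {
    "lobby": {"escalator_bottom": 5.0, "stairs_bottom": 8.0},
    "escalator_bottom": {"lobby": 5.0, "escalator_top": 20.0},
    "escalator_top": {"gallery": 5.0},
    "stairs_bottom": {"lobby": 8.0, "stairs_top": 30.0},
    "stairs_top": {"stairs_bottom": 30.0, "gallery": 4.0},
    "gallery": {"escalator_top": 5.0, "stairs_top": 4.0},
}

print(shortest_time(BUILDING, "lobby", "gallery"))
print(shortest_time(BUILDING, "gallery", "lobby"))
```

A graph like this answers the routing question only; describing the resulting route and turning it into usable guidance remain separate problems.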

Some research projects are now starting on these topics and investigate questions such as spatial sketches: Can we simplify given map information together with a given route into a form that human beings can visually understand in milliseconds? The idea is that you just raise your smartphone for a second, have a look, and then continue your journey.


With a sequence of papers on the combinatorial complexity of finding alternative routes in buildings, we have just presented research in another direction: How do we handle the free space situation, as opposed to the graph situation in street networks, without a combinatorial explosion or unnatural paths?

In summary, I believe that a lot of progress will be seen in this area in the next decade, as positioning is becoming more and more mature and cheap. Therefore, the fourth challenge for indoor location-based services might be formulated as follows:

How can we deal with the complex indoor space and the limited user interface situation in indoor navigation as opposed to outdoor navigation?

Final thoughts

With this short article, I have tried to express my personal view of the developments in the beautiful area of indoor location-based services. A lot of commercial and personal opportunities are present in this area. Additionally, some surprisingly hard but beautiful problems remain largely unsolved. So:

Join the area of indoor navigation research! It is promising...