Projecting Trackable Thermal Patterns for Dynamic Computer Vision
(CVPR '24)
Mark Sheinin, Aswin Sankaranarayanan, and Srinivasa Narasimhan
The problem: Pattern recognition requires, well, *patterns*
It is no coincidence that "Pattern Recognition" is in the name of the most prominent computer vision conference (i.e., CVPR). The ability to detect and recognize the same pattern across different views is fundamental to dynamic vision tasks, namely tasks where the observer (i.e., the camera) must navigate and map a given environment. But what if the environment doesn't have enough good texture to facilitate this task?
Sure, you can always find a better room for your NeRF or Gaussian-splatting demo, but a navigating and mapping robot must operate in whatever environment it is given, including long dark roads at night, pipelines, featureless hallways, and textureless objects it must scan.
[Images: environments and objects lacking 'good' texture for feature extraction and matching (generated with deepai.org)]
But what if we could give our robots the ability to 'paint' auxiliary patterns on textureless environment regions to improve their navigation and mapping?
The solution: Painting light with light
Well, we cannot equip our robots with buckets of paint. But we can use something very similar to a device that already exists on many of our robots -- like autonomous cars -- and that is the lidar. Lidars measure depth by scanning a laser across surfaces in the environment. When the laser hits a surface, some of the light energy is absorbed by the material, causing a slight increase in the material's surface temperature. While invisible to the naked eye, this temperature increase can be seen by a thermal camera, which images in the infrared domain. So, a robot can use its laser to paint tailor-made heat patterns and 'see' them with a thermal camera, using them to facilitate navigation and mapping.

[Imaging system schematic and demo: a rotating object with pattern projection. Our laser 'paints' a dot pattern that 'sticks' to the object's surface; we track the heat points to generate point matches between frames, and feed the matches to COLMAP to recover the camera motion and a sparse 3D model of the object.]

But wait, don't heat patterns evaporate with time?
Indeed they do, and that is a problem: a surface pattern imaged now will appear different when imaged in future frames. This brightness inconsistency is incompatible with various off-the-shelf dynamic vision algorithms and may degrade their performance (as shown below). So, how do we fix this problem?
Answer: We train a neural network to reverse the brightness inconsistency, as explained next.

A frame taken at time t+dt differs from a frame taken at time t in two ways: (a) points that existed at time t have undergone heat diffusion, making them appear smoother and dimmer in frame t+dt, and (b) new points were projected onto the surface, appearing in frame t+dt but not in frame t. Accordingly, our network performs two functions: it takes the later frame t+dt as input, removes the newly added points that appear only in frame t+dt, and 'undiffuses' the existing points in frame t+dt to match their appearance in frame t. The network also outputs the newly added points, which is useful for tracking initialization (details in the paper).

[Illustration: at time t+1 with no new points, heat diffusion causes the surface points to appear smoother and dimmer than at time t; with new points, the system projects fresh heat points to constantly reinforce the heat pattern. Legend: points existing in both frames vs. new points not existing in frame t.]

The effect of the UNET correction can be seen in the example below.

[Example: the frame at time t+1 is passed through the UNET to produce a corrected frame matching the frame at time t. Shown on a scene object imaged as thermal video with pattern projection -- notice how the heat points rapidly diffuse after projection. COLMAP outputs and recovered 3D shapes are compared without and with heat diffusion correction.]
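To build intuition for why existing points look smoother and dimmer in later frames, here is a minimal NumPy sketch (not the paper's actual model) that evolves a single projected heat dot with an explicit finite-difference step of the 2D heat equation; the step size `alpha` and grid size are illustrative assumptions.

```python
import numpy as np

def diffuse(T, alpha=0.2, steps=10):
    """Explicit finite-difference steps of the 2D heat equation
    (periodic boundaries). Models how a projected heat dot
    smooths out and dims between frames."""
    T = T.copy()
    for _ in range(steps):
        lap = (np.roll(T, 1, 0) + np.roll(T, -1, 0)
               + np.roll(T, 1, 1) + np.roll(T, -1, 1) - 4 * T)
        T += alpha * lap
    return T

# frame at time t: one freshly projected heat dot
frame_t = np.zeros((21, 21))
frame_t[10, 10] = 1.0

# frame at time t+dt: the same dot after diffusion
frame_tdt = diffuse(frame_t)

print(frame_tdt.max())  # peak is much dimmer than 1.0
print(frame_tdt.sum())  # total heat is conserved by diffusion
```

The 'undiffusion' the network learns is the inverse of this forward process: sharpening and re-brightening the dots so a later frame matches an earlier one.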
Additional applications
Our system is useful for various dynamic vision tasks, such as:

Object tracking
[We project and track a thermal pattern on a textureless planar object, then superimpose a picture on the plane's surface.]

Optical flow
[We project a thermal pattern on a rotating textureless planar object and compute optical flow between consecutive frame pairs, with no explicit point tracking.]

Indoor localization
[We put the system on a cart, directed at the floor using a mirror, then make a loop around the office desks. Tracking the projected thermal pattern yields the recovered camera (cart) trajectory.]
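Several of these applications boil down to associating heat dots across frames. As a sketch of that step (our own illustration, not code from the paper), the following matches dot centroids between two frames by nearest neighbour under a small-motion assumption; the coordinates and `max_dist` threshold are made up for the example.

```python
import numpy as np

def match_points(pts_t, pts_tdt, max_dist=3.0):
    """Nearest-neighbour matching of heat-dot centroids between
    frames, assuming small inter-frame motion.
    Returns (i, j) index pairs into pts_t and pts_tdt."""
    matches = []
    for i, p in enumerate(pts_t):
        d = np.linalg.norm(pts_tdt - p, axis=1)
        j = int(np.argmin(d))
        if d[j] <= max_dist:
            matches.append((i, j))
    return matches

pts_t = np.array([[10.0, 10.0], [30.0, 12.0], [50.0, 40.0]])
pts_tdt = pts_t + np.array([1.0, 0.5])          # small camera motion
pts_tdt = np.vstack([pts_tdt, [[80.0, 80.0]]])  # a newly projected dot

print(match_points(pts_t, pts_tdt))  # → [(0, 0), (1, 1), (2, 2)]
```

The newly projected dot has no partner within `max_dist`, so it stays unmatched; such unmatched detections are exactly the new points used to initialize fresh tracks.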
Great, now tell me when it doesn't work (limitations)
Our method relies on the laser's ability to heat up surface spots rapidly. However, some materials are not amenable to such an operation. To heat up a surface point 'well,' the surface material must have a low albedo at the laser's wavelength, high emissivity, and low thermal conductivity (so that the absorbed heat does not diffuse away too quickly). Therefore, materials such as glass and metals will show degraded performance with our method. Also, unlike visible-light SfM, our patterns cannot facilitate loop closure across temporally distant frames, because the patterns evaporate entirely after some time.
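A back-of-envelope calculation shows why absorption and material properties matter. All numbers below are illustrative assumptions (not from the paper): if a spot absorbs 1 mJ of laser energy into a thin surface layer of a plastic-like material, the lumped-capacitance temperature rise is energy divided by (mass × specific heat).

```python
# Back-of-envelope temperature rise of a laser-heated surface spot.
# All values are illustrative assumptions, not measurements from the paper.
absorbed_energy = 1e-3   # J, laser energy absorbed by the spot
spot_area = 1e-6         # m^2 (a 1 mm^2 spot)
depth = 1e-4             # m, assumed heated layer thickness
density = 1000.0         # kg/m^3 (plastic-like material)
specific_heat = 1500.0   # J/(kg K)

mass = density * spot_area * depth          # kg of heated material
delta_T = absorbed_energy / (mass * specific_heat)
print(f"{delta_T:.1f} K")  # ≈ 6.7 K
```

A high-albedo or highly conductive material (glass, metal) absorbs less energy or spreads it away immediately, shrinking this temperature rise below what the thermal camera can reliably track.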
BibTeX
@inproceedings{Sheinin:2024,
title = {Projecting Trackable Thermal Patterns for Dynamic Computer Vision},
author={Sheinin, Mark and Sankaranarayanan, Aswin and Narasimhan, Srinivasa G.},
booktitle={Proc. IEEE/CVF CVPR},
year={2024},
}