An artificial intelligence and machine learning method developed by Army researchers generates a visible face image from a thermal image of a person's face captured in nighttime or low-light conditions. This development could lead to improved real-time biometrics and post-mission forensic analysis for covert nighttime operations.
A conceptual illustration for thermal-to-visible synthesis for interoperability with existing visible-based facial recognition systems. (Image credit: Courtesy Eric Proctor, William Parks, and Benjamin S. Riggan)
Thermal cameras such as Forward Looking Infrared (FLIR) sensors are actively deployed on ground and aerial vehicles, in watchtowers and at checkpoints for surveillance purposes. More recently, thermal cameras have become available as body-worn cameras. The ability to perform automatic face recognition at night using such thermal cameras is useful for alerting a soldier that an individual is a person of interest, such as someone on a watch list.
The motivations for this technology -- created by Drs. Benjamin S. Riggan, Nathaniel J. Short and Shuowen "Sean" Hu, from the U.S. Army Research Laboratory -- are to improve both automatic and human-matching capabilities.
"This technology enables matching between thermal face images and existing biometric face databases/watch lists that only contain visible face imagery," said Riggan, a research scientist. "The technology provides a way for humans to visually compare visible and thermal facial imagery through thermal-to-visible face synthesis."
He said that under low-light and nighttime conditions, there is insufficient light for a conventional camera to capture facial imagery for recognition without active illumination such as a spotlight or flash, which would reveal the position of the surveillance camera. Thermal cameras, which capture the heat signature naturally radiating from living skin tissue, are well suited to such conditions.
When using thermal cameras to capture facial imagery, the main challenge is that the captured thermal image must be matched against a watch list or gallery that only contains conventional visible imagery from known persons of interest. Therefore, the problem becomes what is referred to as cross-spectrum, or heterogeneous, face recognition. In this case, facial probe imagery acquired in one modality is matched against a gallery database acquired using a different imaging modality.
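The probe-versus-gallery matching described above can be illustrated with a minimal sketch. It assumes that every face image, whatever its modality, has already been turned into a fixed-length embedding vector by some face recognition model; the function names, the toy three-dimensional embeddings and the 0.5 threshold are all hypothetical and are not taken from the ARL work.

```python
import numpy as np

def cosine_scores(probe, gallery):
    """Cosine similarity between one probe vector and each gallery row."""
    probe = probe / np.linalg.norm(probe)
    gallery = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return gallery @ probe

def match_probe(probe, gallery, gallery_ids, threshold=0.5):
    """Return (best_id, score), or (None, score) if no entry clears the threshold."""
    scores = cosine_scores(probe, gallery)
    best = int(np.argmax(scores))
    score = float(scores[best])
    if score < threshold:
        return None, score
    return gallery_ids[best], score

# Toy gallery of visible-spectrum embeddings (illustrative values only).
gallery_ids = ["person_a", "person_b", "person_c"]
gallery = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])

# Toy embedding of a probe face, e.g. one synthesized from a thermal capture.
probe = np.array([0.1, 0.9, 0.1])
match_id, score = match_probe(probe, gallery, gallery_ids)
print(match_id, round(score, 3))
```

In a cross-spectrum setting the difficulty is that embeddings computed directly from thermal imagery do not line up with visible-spectrum embeddings, which is why a synthesis or adaptation step is needed before a comparison like this one becomes meaningful.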
This method leverages modern domain adaptation techniques based on deep neural networks. The approach has two main parts: a non-linear regression model that maps a given thermal image into a corresponding visible latent representation, and an optimization problem that projects the latent representation back into the image space.
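The two parts described above can be sketched on toy data. This is not the ARL implementation: the deep networks are replaced by deliberately simple stand-ins (a random linear map as the visible feature extractor, and random tanh features with ridge regression as the non-linear regressor), and all names, sizes and data below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D_THERM, D_LAT, D_IMG = 12, 8, 16  # toy sizes, chosen arbitrarily

# Stand-in for a visible-domain feature extractor (a deep network in practice).
W_feat = rng.normal(size=(D_LAT, D_IMG)) / np.sqrt(D_IMG)

def visible_features(image):
    return W_feat @ image

def fit_regressor(thermal, latents, hidden=32, ridge=1e-3):
    """Part 1: fit a simple non-linear regressor (random tanh features + ridge)."""
    w_hidden = np.random.default_rng(1).normal(size=(thermal.shape[1], hidden))
    h = np.tanh(thermal @ w_hidden)
    w_out = np.linalg.solve(h.T @ h + ridge * np.eye(hidden), h.T @ latents)
    return w_hidden, w_out

def predict_latent(thermal_vec, model):
    """Map one thermal feature vector to a predicted visible latent vector."""
    w_hidden, w_out = model
    return np.tanh(thermal_vec @ w_hidden) @ w_out

def synthesize(z_target, steps=500):
    """Part 2: gradient descent on ||visible_features(x) - z_target||^2."""
    # Step size 1/L, where L = 2 * sigma_max(W_feat)^2 bounds the curvature.
    step = 1.0 / (2.0 * np.linalg.norm(W_feat, 2) ** 2)
    x = np.zeros(D_IMG)
    for _ in range(steps):
        residual = visible_features(x) - z_target
        x -= step * 2.0 * (W_feat.T @ residual)
    return x

# Toy paired training data: thermal vectors and their visible counterparts.
thermal_train = rng.normal(size=(200, D_THERM))
visible_train = np.tanh(thermal_train @ rng.normal(size=(D_THERM, D_IMG)))
latent_train = visible_train @ W_feat.T

model = fit_regressor(thermal_train, latent_train)
z_pred = predict_latent(thermal_train[0], model)   # thermal -> visible latent
image = synthesize(z_pred)                         # latent -> image space
print(np.linalg.norm(visible_features(image) - z_pred))
```

The design point the sketch tries to convey is the separation of concerns: the regressor only has to predict a compact latent representation, while the image itself is recovered by searching for pixels whose features agree with that prediction.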
The research was presented in a technical paper titled "Thermal to Visible Synthesis of Face Images using Multiple Regions" at the IEEE Winter Conference on Applications of Computer Vision, or WACV, in Lake Tahoe, Nevada, in March 2018. The conference brought together scientists and scholars from industry, academia and government.
At the conference, the Army researchers showed that integrating global information (features from across the whole face) with local information (features from discriminative fiducial regions such as the eyes, nose and mouth) improved the discriminability of the synthesized imagery. They demonstrated how the thermal-to-visible mapped representations from both local and global regions in the thermal face signature could be used in combination to synthesize a refined visible face image.
The optimization problem for synthesizing an image attempts to jointly preserve the shape of the whole face and the appearance of the local fiducial details. Using the synthesized thermal-to-visible imagery and existing visible gallery imagery, they performed face verification experiments using a common open source deep neural network architecture for face recognition. The architecture used is explicitly designed for visible-based face recognition. The most surprising result is that their approach achieved better verification performance than a generative adversarial network-based technique, which had previously shown photo-realistic properties.
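One way to picture an objective that balances the whole face against local fiducial regions is the sketch below. It is an illustrative assumption, not the loss from the paper: raw pixels stand in for learned features purely for brevity, and the region boxes, the 32x32 image size and the `local_weight` parameter are hypothetical.

```python
import numpy as np

# Hypothetical fiducial regions on a 32x32 face: (row_start, row_end, col_start, col_end).
REGIONS = {
    "eyes": (8, 14, 4, 28),
    "nose": (14, 22, 11, 21),
    "mouth": (22, 28, 8, 24),
}

def crop(image, box):
    r0, r1, c0, c1 = box
    return image[r0:r1, c0:c1]

def multi_region_loss(image, global_target, local_targets, local_weight=1.0):
    """Global mean squared error plus weighted mean squared errors on local regions."""
    global_term = np.mean((image - global_target) ** 2)
    local_term = sum(
        np.mean((crop(image, REGIONS[name]) - target) ** 2)
        for name, target in local_targets.items()
    )
    return float(global_term + local_weight * local_term)

# Toy targets: a random "face" and the crops of its fiducial regions.
rng = np.random.default_rng(0)
target = rng.random((32, 32))
local_targets = {name: crop(target, box) for name, box in REGIONS.items()}

perfect = multi_region_loss(target, target, local_targets)
blank = multi_region_loss(np.zeros((32, 32)), target, local_targets)
print(perfect, blank)
```

Raising `local_weight` in such a formulation would push the synthesized image toward sharper eyes, nose and mouth at some cost to overall face shape, while lowering it would do the opposite.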
Riggan attributes this result to the fact that the game-theoretic objective for GANs directly seeks to produce imagery that is sufficiently similar in dynamic range and photo-like appearance to the training imagery, while sometimes neglecting to preserve identifying features. The technique developed by ARL preserves identity information to improve discriminability, that is, to deliver better recognition accuracy for both automatic face recognition algorithms and human adjudication.
As part of the paper presentation, the Army Research Laboratory (ARL) researchers gave a near real-time demonstration of this technology. The proof-of-concept demonstration used a FLIR Boson 320 thermal camera and a laptop running the algorithm in near real-time. It showed the audience that a captured thermal image of a person can be used to create a synthesized visible image in situ. The work received the best paper award in the faces/biometrics session of the conference, out of more than 70 papers presented.
Going forward, Riggan said he and his colleagues will continue this research under the sponsorship of the Defense Forensics and Biometrics Agency to develop a robust nighttime face recognition capability for soldiers.