A Cycle Ride to HDR: Semantics Aware Self-Supervised Framework for Unpaired LDR-to-HDR Image Translation
The official repository, with supplementary material, of the paper: A Cycle Ride to HDR!!
This project is carried out at the Human-Centered AI Lab in the Faculty of Information Technology, Monash University, Melbourne (Clayton), Australia.
Project Members -
Hrishav Bakul Barua (Monash University and TCS Research, Kolkata, India),
Kalin Stefanov (Monash University, Melbourne, Australia),
Lemuel Lai En Che (Monash University, Malaysia),
Abhinav Dhall (Indian Institute of Technology (IIT) Ropar, India and Flinders University, Adelaide, Australia),
KokSheik Wong (Monash University, Malaysia), and
Ganesh Krishnasami (Monash University, Malaysia).
For any queries kindly contact: hbarua@acm.org / hrishav.barua@ieee.org
This work is supported by the prestigious Global Excellence and Mobility Scholarship (GEMS), Monash University. This research is also supported, in part, by the E-Science fund under the project: Innovative High Dynamic Range Imaging - From Information Hiding to Its Applications (Grant No. 01-02-10-SF0327). Funding in the form of the Monash IT Student Research (iSR) Scheme 2023 also contributed to this work.
Low Dynamic Range (LDR) to High Dynamic Range (HDR) image translation is an important computer vision problem. There is a significant body of research utilizing both conventional non-learning methods and modern data-driven approaches, focusing on using both single-exposed and multi-exposed LDR images for HDR image reconstruction. However, most current state-of-the-art methods require high-quality paired {LDR,HDR} datasets for model training. In addition, there is limited literature on using unpaired datasets for this task, where the model learns a mapping between domains, i.e., LDR ↔ HDR. To address the limitations of current methods, such as the paired data constraint, as well as unwanted blurring and visual artifacts in the reconstructed HDR, we propose a method that uses a modified cycle-consistent adversarial architecture and utilizes unpaired {LDR,HDR} datasets for training. The method introduces novel generators to address visual artifact removal, and an encoder and loss to address semantic consistency, another under-explored topic. The method achieves state-of-the-art results across several benchmark datasets and reconstructs high-quality HDR images.
Left: Overview of the proposed method architecture, where x and y represent LDR and HDR images, respectively. The method is trained with five objectives: adversarial, cycle consistency, identity, contrastive, and semantic segmentation. GX and GY are the generators, while DX and DY are the discriminators. E(.) is the Contrastive Language-Image Pre-training (CLIP) encoder. Right: Overview of the proposed generators based on our novel feedback-based U-Net architecture.
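To connect the caption to code, here is a minimal PyTorch-style sketch of how the five objectives could be combined into a single generator loss. The loss weights (LAMBDA_*), the least-squares GAN form, and the generator naming (GY: LDR → HDR, GX: HDR → LDR) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

# Hypothetical loss weights -- illustrative placeholders, not the paper's values.
LAMBDA_CYC, LAMBDA_IDT, LAMBDA_CON, LAMBDA_SEM = 10.0, 5.0, 1.0, 1.0

def generator_objective(x, y, G_X, G_Y, D_X, D_Y, contrastive, semantic):
    """Combine the five objectives named in the caption (sketch only).

    x: batch of LDR images, y: batch of HDR images.
    G_Y: LDR -> HDR, G_X: HDR -> LDR (naming assumed for illustration).
    """
    fake_y, fake_x = G_Y(x), G_X(y)

    # Adversarial term (least-squares GAN form, assumed here).
    pred_y, pred_x = D_Y(fake_y), D_X(fake_x)
    loss_adv = F.mse_loss(pred_y, torch.ones_like(pred_y)) \
             + F.mse_loss(pred_x, torch.ones_like(pred_x))

    # Cycle consistency: LDR -> HDR -> LDR (and vice versa) must round-trip.
    loss_cyc = F.l1_loss(G_X(fake_y), x) + F.l1_loss(G_Y(fake_x), y)

    # Identity: a generator fed its own target domain should change nothing.
    loss_idt = F.l1_loss(G_Y(y), y) + F.l1_loss(G_X(x), x)

    # Contrastive and semantic-segmentation terms (see the sketches below).
    loss_con = contrastive(x, fake_y)
    loss_sem = semantic(x, fake_y)

    return (loss_adv + LAMBDA_CYC * loss_cyc + LAMBDA_IDT * loss_idt
            + LAMBDA_CON * loss_con + LAMBDA_SEM * loss_sem)
```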
Please check out the paper for more details!!
Depiction of the cycle consistency loss Lcyc using an image from the DrTMO dataset.
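As a concrete reference, the standard CycleGAN-style form of this loss can be sketched as follows; the paper's exact norm or weighting may differ.

```python
import torch.nn.functional as F

def cycle_consistency_loss(x, y, G_X, G_Y):
    # Forward cycle: LDR -> HDR -> LDR should recover the input LDR x.
    # Backward cycle: HDR -> LDR -> HDR should recover the input HDR y.
    # Standard CycleGAN L1 formulation, shown here for illustration only.
    return F.l1_loss(G_X(G_Y(x)), x) + F.l1_loss(G_Y(G_X(y)), y)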
Left: Depiction of the contrastive loss Lcon with positive and negative pairs.
Please check out the paper for more details!!
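A hedged sketch of how an InfoNCE-style contrastive term over the CLIP encoder E(.) could look. Pairing each input LDR with its own reconstruction as the positive, and the rest of the batch as negatives, is an illustrative assumption, not necessarily the paper's exact pairing.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(E, x, fake_y, tau=0.07):
    """InfoNCE-style sketch over CLIP embeddings E(.).

    Treats each (input, reconstruction) pair in the batch as positive
    and all other batch items as negatives -- an illustrative choice.
    """
    z_x = F.normalize(E(x), dim=-1)        # (B, D) anchor embeddings
    z_y = F.normalize(E(fake_y), dim=-1)   # (B, D) reconstruction embeddings
    logits = z_x @ z_y.t() / tau           # (B, B) scaled cosine similarities
    targets = torch.arange(z_x.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)
```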
Method | Input | Output | Unpaired | Context-aware | Semantics | Artifacts | Tone-mapping |
---|---|---|---|---|---|---|---|
PSENet (WACV'23) | SE | D | ❌ | ❌ | ❌ | ❌ | ❌ |
SingleHDR(W) (WACV'23) | SE | I | ❌ | ❌ | ❌ | ❌ | ❌ |
UPHDR-GAN (TCSVT'22) | ME | D | ✅ | ❌ | ❌ | ✅ | ❌ |
SelfHDR (ICLR'24) | ME | I | ❌ | ❌ | ❌ | ✅ | ❌ |
KUNet (IJCAI'22) | SE | D | ❌ | ❌ | ✅ | ❌ | ❌ |
Ghost-free HDR (ECCV'22) | ME | D | ❌ | ✅ | ❌ | ✅ | ❌ |
DITMO | SE | I | ❌ | ❌ | ✅ | ✅ | ❌ |
(ours) | SE | D | ✅ | ✅ | ✅ | ✅ | ✅ |
Input: type of LDR input (SE: single-exposed, ME: multi-exposed); Output: reconstructs HDR directly (D) or via an intermediate multi-exposed LDR stack (I); Unpaired: uses unpaired data; Context-aware: uses local/global image information and the relationships among entities in the image; Semantics: uses color/texture information and the identity of items in the image; Artifacts: handles visual artifacts; Tone-mapping: also performs tone-mapping, i.e., HDR → LDR.
HDR reconstruction (inverse tone-mapping) learned with our self-supervised learning approach. Quantitative comparison with supervised (gray) and unsupervised/weakly-supervised/self-supervised (black) learning methods trained on the paired datasets HDRTV, NTIRE, and HDR-Synth & HDR-Real.
LDR reconstruction (tone-mapping) learned with our self-supervised learning approach. Quantitative comparison with state-of-the-art tone-mapping operators.
Comparison of the SingleHDR(W) U-Net with and without our feedback mechanism on images from the DrTMO dataset. Replacing the original U-Net of SingleHDR(W) with the proposed feedback U-Net (mod) yields a clear improvement: the original U-Net produces many artifacts in the output HDR images, whereas our modified version with feedback reconstructs artifact-free HDR images.
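The exact feedback mechanism is defined in the paper; purely as orientation, here is a generic feedback-over-U-Net sketch in the spirit of feedback networks such as FHDR (referenced below), where the previous iteration's output is concatenated to the input for a few refinement passes. All names and the iteration count are placeholders.

```python
import torch
import torch.nn as nn

class FeedbackUNet(nn.Module):
    """Generic feedback-over-U-Net sketch (names are placeholders).

    The wrapped U-Net is unrolled for `steps` iterations; each iteration
    sees the original input concatenated with the previous iteration's
    output, so later passes can correct artifacts left by earlier ones.
    This follows the general feedback idea, not the paper's exact design.
    """
    def __init__(self, unet: nn.Module, in_ch: int = 3, steps: int = 3):
        super().__init__()
        self.unet = unet      # any U-Net taking 2*in_ch channels, emitting in_ch
        self.in_ch = in_ch
        self.steps = steps

    def forward(self, x):
        assert x.size(1) == self.in_ch
        fb = torch.zeros_like(x)          # feedback starts empty
        for _ in range(self.steps):
            fb = self.unet(torch.cat([x, fb], dim=1))
        return fb                         # or supervise every iteration's output
```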
Please check out the paper for more details!!
ACM TOG 2017 | HDRCNN - HDR Image Reconstruction from a Single Exposure Using Deep CNNs | Code
ACM TOG 2017 | DrTMO - Deep Reverse Tone Mapping | Code
Eurographics 2018 | ExpandNet - A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range Content | Code
GlobalSIP 2019 | FHDR - HDR Image Reconstruction from a Single LDR Image Using Feedback Network | Code
CVPR 2020 | SingleHDR - Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline | Code
CVPRW 2021 | Two-stage HDR - A Two-Stage Deep Network for High Dynamic Range Image Reconstruction | Code
IEEE TIP 2021 | HDR-GAN - HDR Image Reconstruction from Multi-Exposed LDR Images with Large Motions | Code
IJCAI 2022 | KUNet - Imaging Knowledge-Inspired Single HDR Image Reconstruction | Code
ECCV 2022 | Ghost-free HDR - Ghost-free High Dynamic Range Imaging with Context-aware Transformer | Code
APSIPA 2023 | ArtHDR-Net - Perceptually Realistic and Accurate HDR Content Creation | Code
ICIP 2024 | HistoHDR-Net - Histogram Equalization for Single LDR to HDR Image Translation | Code
IEEE TCSVT 2022 | UPHDR-GAN - Generative Adversarial Network for High Dynamic Range Imaging with Unpaired Data | Code
WACV 2023 | PSENet - Progressive Self-Enhancement Network for Unsupervised Extreme-Light Image Enhancement | Code
WACV 2023 | SingleHDR(W) - Single-Image HDR Reconstruction by Multi-Exposure Generation | Code
ICLR 2024 | SelfHDR - Self-Supervised High Dynamic Range Imaging with Multi-Exposure Images in Dynamic Scenes | Code
VPQM 2015 | HDR-Eye - Visual Attention in LDR and HDR Images | Dataset
ACM TOG 2017 | Kalantari et al. - Deep High Dynamic Range Imaging of Dynamic Scenes | Dataset
IEEE Access 2020 | LDR-HDR Pair - Dynamic Range Expansion Using Cumulative Histogram Learning for High Dynamic Range Image Generation | Dataset
CVPR 2020 | HDR-Synth & HDR-Real - Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline | Dataset
ICCV 2021 | HDRTV - A New Journey from SDRTV to HDRTV | Dataset
CVPR 2021 | NTIRE - NTIRE 2021 Challenge on High Dynamic Range Imaging: Dataset, Methods and Results | Dataset
ACM TOG 2017 | DrTMO - Deep Reverse Tone Mapping | Dataset
GTA-HDR - A Large-Scale Synthetic Dataset for HDR Image Reconstruction | Dataset
μ-law operator | Link
Reinhard's operator | Link
Photomatix | Link
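For reference, minimal NumPy sketches of the two classical operators listed above (the μ-law compressor and Reinhard's global operator); μ = 5000 and the normalization step are common conventions, not prescriptions.

```python
import numpy as np

def mu_law(h, mu=5000.0):
    # mu-law range compression: T(H) = log(1 + mu*H) / log(1 + mu),
    # with H first normalized to [0, 1]. mu = 5000 is the value commonly
    # used when evaluating HDR reconstruction (e.g., mu-PSNR).
    h = h / (h.max() + 1e-8)
    return np.log1p(mu * h) / np.log1p(mu)

def reinhard(h):
    # Reinhard's global operator: L_out = L / (1 + L), applied per pixel.
    # The full operator also supports key and white-point parameters.
    return h / (1.0 + h)
```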
PSNR - Peak Signal-to-Noise Ratio | Link
SSIM - Structural Similarity Index Measure | Link
LPIPS - Learned Perceptual Image Patch Similarity | Link
HDR-VDP-2 - High Dynamic Range Visual Differences Predictor | Link
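A small NumPy sketch of PSNR, plus the μ-law-domain variant (μ-PSNR) often reported for HDR comparisons; the normalization convention is an assumption here and varies across benchmarks.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    # Peak Signal-to-Noise Ratio in dB; inputs assumed to lie in [0, max_val].
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(max_val ** 2 / (mse + 1e-12))

def mu_psnr(pred_hdr, target_hdr, mu=5000.0):
    # PSNR after mu-law tone mapping (see the operator sketch above), a
    # common way to compare HDR images in a more perceptual domain.
    peak = target_hdr.max() + 1e-8  # normalization convention varies by benchmark
    t = lambda h: np.log1p(mu * np.clip(h / peak, 0.0, 1.0)) / np.log1p(mu)
    return psnr(t(pred_hdr), t(target_hdr))
```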