
A Cycle Ride to HDR: Semantics Aware Self-Supervised Framework for Unpaired LDR-to-HDR Image Translation

My Image

The official repository of the paper (with supplementary material): A Cycle Ride to HDR!!

About the project

This project is carried out at the Human-Centered AI Lab in the Faculty of Information Technology, Monash University, Melbourne (Clayton), Australia.

Project Members -

Hrishav Bakul Barua (Monash University and TCS Research, Kolkata, India),
Kalin Stefanov (Monash University, Melbourne, Australia),
Lemuel Lai En Che (Monash University, Malaysia),
Abhinav Dhall (Indian Institute of Technology (IIT) Ropar, India and Flinders University, Adelaide, Australia),
KokSheik Wong (Monash University, Malaysia), and
Ganesh Krishnasami (Monash University, Malaysia).

For any queries kindly contact: hbarua@acm.org / hrishav.barua@ieee.org

Funding details

This work is supported by the prestigious Global Excellence and Mobility Scholarship (GEMS), Monash University. This research is also supported, in part, by the E-Science fund under the project: Innovative High Dynamic Range Imaging - From Information Hiding to Its Applications (Grant No. 01-02-10-SF0327). Funding in the form of Monash IT Student Research (iSR) Scheme 2023 also contributed to this work.

Overview

Low Dynamic Range (LDR) to High Dynamic Range (HDR) image translation is an important computer vision problem. A significant body of research, spanning both conventional non-learning methods and modern data-driven approaches, uses single-exposed and multi-exposed LDR for HDR image reconstruction. However, most current state-of-the-art methods require high-quality paired {LDR,HDR} datasets for model training, and there is limited literature on using unpaired datasets, where the model learns a mapping between domains, i.e., LDR ↔ HDR. To address the limitations of current methods, such as the paired-data constraint and the unwanted blurring and visual artifacts in the reconstructed HDR, we propose a method that uses a modified cycle-consistent adversarial architecture and trains on unpaired {LDR,HDR} datasets. The method introduces novel generators that address visual-artifact removal, and an encoder and loss that address semantic consistency, another under-explored topic. The method achieves state-of-the-art results across several benchmark datasets and reconstructs high-quality HDR images.

My Image

Left: Overview of the proposed method architecture, where x and y represent LDR and HDR images, respectively. The method is trained with five objectives: adversarial, cycle consistency, identity, contrastive, and semantic segmentation. GX and GY are the generators, and DX and DY are the discriminators. E(.) is the Contrastive Language-Image Pretraining (CLIP) encoder. Right: Overview of the proposed generators based on our novel feedback-based U-Net architecture. The left part ($${\color{blue}blue}$$) is the encoder and the right part ($${\color{red}red}$$) is the decoder. The decoder is inside the feedback iteration loop.
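To make the unpaired training setup concrete, here is a minimal runnable sketch of one cycle-consistency/identity step. The generator modules, the direction GX: LDR → HDR, and the loss weights are all illustrative stand-ins, not the paper's actual models or settings; the adversarial, contrastive, and semantic terms are omitted for brevity.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the paper's generators; any image-to-image
# networks could be slotted in here.
G_X = nn.Conv2d(3, 3, 3, padding=1)   # assumed LDR -> HDR generator (stand-in)
G_Y = nn.Conv2d(3, 3, 3, padding=1)   # assumed HDR -> LDR generator (stand-in)
l1 = nn.L1Loss()

def cycle_step(x_ldr, y_hdr, w_cyc=10.0, w_idt=5.0):
    """One unpaired step; weights are illustrative, not the paper's values."""
    fake_hdr = G_X(x_ldr)              # LDR -> HDR
    fake_ldr = G_Y(y_hdr)              # HDR -> LDR
    # Cycle consistency: translate back and compare to the original input.
    l_cyc = l1(G_Y(fake_hdr), x_ldr) + l1(G_X(fake_ldr), y_hdr)
    # Identity: a generator fed its own target domain should act as identity.
    l_idt = l1(G_X(y_hdr), y_hdr) + l1(G_Y(x_ldr), x_ldr)
    return w_cyc * l_cyc + w_idt * l_idt

loss = cycle_step(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
```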

Please check out the paper for more details!!

Loss mechanisms

My Image

Depiction of the cycle consistency loss Lcyc using an image from the DrTMO dataset.
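For reference, Lcyc follows the standard cycle-consistency form, shown here with the L1 norm and assuming GX maps LDR → HDR (see the paper for the exact weighting and notation convention):

$$\mathcal{L}_{cyc} = \mathbb{E}_{x}\left[\lVert G_Y(G_X(x)) - x \rVert_1\right] + \mathbb{E}_{y}\left[\lVert G_X(G_Y(y)) - y \rVert_1\right]$$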

My Image

Left: Depiction of the contrastive loss Lcon. Positive ($${\color{green}green}$$) and negative ($${\color{red}red}$$) pairs in a batch. We use a histogram-equalized version of the LDR processed using the OpenCV function equalizeHist. Right: Depiction of the semantic segmentation loss Lsem. We use Segment Anything (SAM) to generate segmentation classes in the histogram-equalized LDR and reconstructed tone-mapped HDR images.
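The histogram-equalization preprocessing mentioned above is a one-liner in OpenCV. A minimal sketch, assuming the equalization is applied to the luma channel of a YCrCb decomposition (one common convention; the file name is hypothetical and the paper's exact preprocessing may differ):

```python
import cv2

def equalize_ldr(bgr):
    """Histogram-equalize an 8-bit LDR image. equalizeHist operates on
    single-channel images, so we equalize the luma channel only."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)

ldr = cv2.imread("input_ldr.png")   # hypothetical file name
ldr_eq = equalize_ldr(ldr)
```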

Please check out the paper for more details!!

State-of-the-art comparison on high-level parameters

Comparison summary table

| Method | Input | Output | Unpaired | Context-aware | Semantics | Artifacts | Tone-mapping |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PSENet (WACV'23) | SE | D | | | | | |
| SingleHDR(W) (WACV'23) | SE | I | | | | | |
| UPHDR-GAN (TCSVT'22) | ME | D | | | | | |
| SelfHDR (ICLR'24) | ME | I | | | | | |
| KUNet (IJCAI'22) | SE | D | | | | | |
| Ghost-free HDR (ECCV'22) | ME | D | | | | | |
| DITMO | SE | I | | | | | |
| Ours | SE | D | | | | | |

Input: LDR used as input (SE: single-exposed, ME: multi-exposed). Output: reconstructs HDR directly (D) or a multi-exposed LDR stack first (I). Unpaired: uses unpaired data. Context-aware: uses local/global image information and relationships among entities in the image. Semantics: uses color/texture information and the identity of items in the image. Artifacts: handles visual artifacts. Tone-mapping: also performs tone-mapping, i.e., HDR → LDR.

Experiments and Results

LDR → HDR:

My Image

HDR reconstruction (inverse tone-mapping) learned with our self-supervised learning approach. Quantitative comparison with supervised (gray) and unsupervised/weakly-supervised/self-supervised (black) learning methods trained on the paired datasets HDRTV, NTIRE, and HDR-Synth & HDR-Real.

HDR → LDR:

My Image

LDR reconstruction (tone-mapping) learned with our self-supervised learning approach. Quantitative comparison with state-of-the-art tone-mapping operators.

My Image

Comparison of the SingleHDR(W) U-Net with and without our feedback mechanism on images from the DrTMO dataset. The figure illustrates the improvement in SingleHDR(W) when its original U-Net is replaced with the proposed feedback U-Net (mod). The original U-Net produces many artifacts in the output HDR images, whereas our modified version with feedback reconstructs artifact-free HDR images.
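To illustrate the feedback idea, here is a schematic PyTorch sketch of a decoder unrolled inside a feedback loop. TinyEncoder, TinyDecoder, the channel counts, and the number of steps are all hypothetical stand-ins; the paper's feedback U-Net differs in its actual blocks and skip connections.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
    def forward(self, x):
        return torch.relu(self.conv(x))

class TinyDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Consumes encoder features concatenated with the previous state.
        self.conv = nn.Conv2d(32, 16, 3, padding=1)
    def forward(self, feats, state):
        return torch.relu(self.conv(torch.cat([feats, state], dim=1)))

class FeedbackUNet(nn.Module):
    """The encoder runs once; the decoder is unrolled for `steps`
    iterations, each conditioned on the previous decoder output."""
    def __init__(self, steps=3):
        super().__init__()
        self.enc, self.dec, self.steps = TinyEncoder(), TinyDecoder(), steps
        self.head = nn.Conv2d(16, 3, 3, padding=1)  # project back to an image
    def forward(self, x):
        feats = self.enc(x)
        state = torch.zeros_like(feats)   # initial feedback state
        for _ in range(self.steps):       # decoder inside the feedback loop
            state = self.dec(feats, state)
        return self.head(state)

out = FeedbackUNet()(torch.rand(1, 3, 64, 64))  # -> (1, 3, 64, 64)
```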

Please check out the paper for more details!!

Visual results

My Image

Please check out the paper for more details!!

Our work utilizes the following awesome works and datasets:

State-of-the-art DL models for LDR → HDR reconstruction/translation

Supervised:

ACM TOG 2017 | HDRCNN - HDR image reconstruction from a single exposure using deep CNNs | Code

ACM TOG 2017 | DrTMO - Deep Reverse Tone Mapping | Code

Eurographics 2018 | ExpandNet - A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range Content | Code

GlobalSIP 2019 | FHDR - HDR Image Reconstruction from a Single LDR Image using Feedback Network | Code

CVPR 2020 | SingleHDR - Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline | Code

CVPRW 2021 | Two-stage HDR - A two-stage deep network for high dynamic range image reconstruction | Code

IEEE TIP 2021 | HDR-GAN - HDR Image Reconstruction from Multi-Exposed LDR Images with Large Motions | Code

IJCAI 2022 | KUNet - Imaging Knowledge-Inspired Single HDR Image Reconstruction | Code

ECCV 2022 | Ghost-free HDR - Ghost-free High Dynamic Range Imaging with Context-aware Transformer | Code

APSIPA 2023 | ArtHDR-Net - Perceptually Realistic and Accurate HDR Content Creation | Code

ICIP 2024 | HistoHDR-Net - Histogram Equalization for Single LDR to HDR Image Translation | Code

Unsupervised/weakly-supervised/self-supervised:

IEEE TCSVT 2022 | UPHDR-GAN - Generative Adversarial Network for High Dynamic Range Imaging with Unpaired Data | Code

WACV 2023 | PSENet - Progressive Self-Enhancement Network for Unsupervised Extreme-Light Image Enhancement | Code

WACV 2023 | SingleHDR - Single-Image HDR Reconstruction by Multi-Exposure Generation | Code

ICLR 2024 | SelfHDR - Self-Supervised High Dynamic Range Imaging with Multi-Exposure Images in Dynamic Scenes | Code

State-of-the-art DL datasets for LDR → HDR reconstruction/translation

VPQM 2015 | HDR-Eye - Visual Attention in LDR and HDR Images | Dataset

ACM TOG 2017 | Kalantari et al. - Deep High Dynamic Range Imaging of Dynamic Scenes | Dataset

IEEE Access 2020 | LDR-HDR Pair - Dynamic Range Expansion Using Cumulative Histogram Learning for High Dynamic Range Image Generation | Dataset

CVPR 2020 | HDR-Synth & HDR-Real - Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline | Dataset

ICCV 2021 | HDRTV - A New Journey from SDRTV to HDRTV | Dataset

CVPR 2021 | NTIRE - NTIRE 2021 Challenge on High Dynamic Range Imaging: Dataset, Methods and Results | Dataset

ACM TOG 2017 | DrTMO - Deep Reverse Tone Mapping | Dataset

GTA-HDR - A Large-Scale Synthetic Dataset for HDR Image Reconstruction | Dataset

Popular HDR → LDR tone-mapping operators used

μ-law operator (sketched after this list) | Link

Reinhard's operator | Link

Photomatix | Link
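Of the operators above, the μ-law is simple enough to state inline: T(H) = log(1 + μH) / log(1 + μ) for H normalized to [0, 1]. A minimal NumPy sketch, assuming μ = 5000, the value commonly used in the HDR literature (the setting used in the paper may differ):

```python
import numpy as np

def mu_law_tonemap(hdr, mu=5000.0):
    """μ-law range compression: T(H) = log(1 + μ·H) / log(1 + μ),
    applied to an HDR image normalized to [0, 1]."""
    h = hdr / max(hdr.max(), 1e-8)          # normalize to [0, 1]
    return np.log1p(mu * h) / np.log1p(mu)
```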

Evaluation metrics used

PSNR - Peak Signal to Noise Ratio (sketched after this list) | Link

SSIM - Structural Similarity Index Measure | Link

LPIPS - Learned Perceptual Image Patch Similarity | Link

HDR-VDP-2 - High Dynamic Range Visual Differences Predictor | Link
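For reference, PSNR (the first metric above) reduces to a few lines of NumPy, whereas SSIM, LPIPS, and HDR-VDP-2 require their respective packages. A minimal sketch:

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB: 10·log10(peak² / MSE).
    `peak` is the maximum possible pixel value (1.0 for normalized
    images, 255 for 8-bit)."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    mse = max(mse, 1e-12)  # guard against log(inf) for identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```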