Released at WIFS 2022 (Shanghai, China)
Abstract: Within an operational framework, the covers used by a steganographer are likely to come from different sensors and different processing pipelines than the ones used by researchers to train their steganalysis models. Thus, a performance gap is unavoidable when it comes to out-of-distribution covers, an extremely frequent scenario called Cover-Source Mismatch (CSM). Here, we explore a grid of processing pipelines to study the origins of CSM, to better understand it, and to better tackle it. A set-covering greedy algorithm is used to select representative pipelines minimizing the maximum regret between a representative and the pipelines it covers. Our main contribution is a methodology for generating relevant bases able to tackle operational CSM.
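For clarity, here is how the regret criterion behind this selection can be read (the notation below is ours and only a rough formalization, not copied verbatim from the paper): writing P_E(i → j) for the error probability of a detector trained on source i and tested on source j, the regret of representing j by i is the performance lost compared to training directly on j.

```latex
% Regret of representing source j by source i (our notation, assumed):
\mathrm{regret}(i \to j) = P_E(i \to j) - P_E(j \to j)

% A set S of representatives covers the grid G with radius r when
\forall j \in G,\ \exists\, i \in S \;:\; \mathrm{regret}(i \to j) \le r
```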
- The file `pipelines.csv` is a directory of the pipelines, disclosing their parameters and identifying each of them precisely with a number.
- The file `RAW_DATABASE.csv` contains information about all the RAW images we used in our experiments. They all come from the ALASKA database.
- The file `FLICKR_BASE.csv` contains information about all the FLICKR images we used as our wild base in our last experiment. They were extracted from the FLICKR website and are copyright-free.
- The folder `1-Developing` contains code for developing RAW images the way we did (a minimal sketch of such a pipeline is given after this list).
- The folder `2-Clustering` contains code for extracting relevant pipelines from the grid using the greedy set-covering algorithm we used. There is also a playground notebook to help you reproduce some of the results presented in the paper. Don't hesitate to use our PE/Regret matrix to derive new conclusions.
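As an illustration of what a development pipeline looks like, here is a minimal sketch in Python. It assumes the `rawpy` and `Pillow` packages and a generic set of parameters (demosaicing algorithm, resize kernel, JPEG quality factor); the actual scripts live in `1-Developing` and the exact parameter grid is described in `pipelines.csv`, so treat this as an outline rather than a faithful reproduction of our pipelines.

```python
# Minimal sketch of a RAW development pipeline (illustrative only):
# demosaic a RAW file, crop/resize it, then JPEG-compress it.
import rawpy
from PIL import Image

def develop(raw_path, out_path, size=512, quality=95):
    # Demosaic the RAW file; rawpy exposes several demosaicing algorithms
    # through the `demosaic_algorithm` argument of postprocess().
    with rawpy.imread(raw_path) as raw:
        rgb = raw.postprocess(
            demosaic_algorithm=rawpy.DemosaicAlgorithm.AHD,  # one choice among several
            use_camera_wb=True,
            output_bps=8,
        )

    img = Image.fromarray(rgb)

    # Center-crop to a square, then resize (the resampling kernel is another
    # pipeline parameter one could vary, e.g. LANCZOS vs BILINEAR).
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side)).resize(
        (size, size), Image.LANCZOS
    )

    # Grayscale conversion then JPEG compression with a given quality factor;
    # grayscale is just one possible choice here, color pipelines are equally valid.
    img.convert("L").save(out_path, "JPEG", quality=quality)

# Example: develop("example.dng", "example.jpg", size=512, quality=95)
```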
To reproduce our experiments and run your own, please follow our Installation Instructions.
Using a maximum regret radius of 10%, the greedy algorithm returned a set of 5 pipelines. Hence, these 5 sources are enough to cover every other source of the grid within a regret of 10%. In other words:
Whatever source you consider from the grid, you can always find a representative among the 5 selected such that training on this representative gives a test performance almost as satisfying as training directly on the original source, the maximum performance difference being 10%.
Illustration of the covering obtained with a maximum regret radius of 10%
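Below is a minimal sketch of how such a covering can be computed from a regret matrix, assuming a NumPy array `regret[i, j]` holding the regret of training on pipeline `i` and testing on pipeline `j` (the actual implementation and the PE/Regret matrices live in `2-Clustering`; the function and variable names here are ours). At each step, the greedy algorithm adds the pipeline that covers the largest number of still-uncovered sources within the chosen radius.

```python
import numpy as np

def greedy_set_cover(regret: np.ndarray, radius: float = 0.10):
    """Select representative pipelines so that every pipeline of the grid
    has at least one representative within `radius` regret.

    regret[i, j] is assumed to be the regret of training on pipeline i
    and testing on pipeline j (illustrative convention, not necessarily
    the exact one used in 2-Clustering).
    """
    n = regret.shape[0]
    covers = regret <= radius           # covers[i, j]: i represents j within the radius
    uncovered = np.ones(n, dtype=bool)  # sources not yet covered
    representatives = []

    while uncovered.any():
        # Pick the pipeline covering the most still-uncovered sources.
        gains = (covers & uncovered).sum(axis=1)
        best = int(np.argmax(gains))
        if gains[best] == 0:
            # No pipeline can cover the remaining sources within the radius.
            break
        representatives.append(best)
        uncovered &= ~covers[best]

    return representatives

# Example with a random regret matrix (for illustration only):
# rng = np.random.default_rng(0)
# R = rng.uniform(0, 0.3, size=(40, 40)); np.fill_diagonal(R, 0.0)
# print(greedy_set_cover(R, radius=0.10))
```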
@article{giboulot:hal-02631559,
TITLE = {{Effects and Solutions of Cover-Source Mismatch in Image Steganalysis}},
AUTHOR = {Giboulot, Quentin and Cogranne, R{\'e}mi and Borghys, Dirk and Bas, Patrick},
URL = {https://hal-utt.archives-ouvertes.fr/hal-02631559},
JOURNAL = {{Signal Processing: Image Communication}},
PUBLISHER = {{Elsevier}},
VOLUME = {86},
YEAR = {2020},
MONTH = Aug,
DOI = {10.1016/j.image.2020.115888},
KEYWORDS = {Steganography ; Steganalysis ; Cover-Source Mismatch ; Image processing ; Image Heterogeneity},
PDF = {https://hal-utt.archives-ouvertes.fr/hal-02631559/file/ImageCommunication_Final.pdf},
HAL_ID = {hal-02631559},
HAL_VERSION = {v1},
}
@inproceedings{giboulot:hal-03694662,
TITLE = {{The Cover Source Mismatch Problem in Deep-Learning Steganalysis}},
AUTHOR = {Giboulot, Quentin and Bas, Patrick and Cogranne, R{\'e}mi and Borghys, Dirk},
URL = {https://hal-utt.archives-ouvertes.fr/hal-03694662},
BOOKTITLE = {{European Signal Processing Conference}},
ADDRESS = {Belgrade, Serbia},
YEAR = {2022},
MONTH = Aug,
PDF = {https://hal-utt.archives-ouvertes.fr/hal-03694662/file/Giboulot_EUSIPCO_2022.pdf},
HAL_ID = {hal-03694662},
HAL_VERSION = {v1},
}
@inproceedings{cogranne:hal-02147763,
TITLE = {{The ALASKA Steganalysis Challenge: A First Step Towards Steganalysis ``Into The Wild''}},
AUTHOR = {Cogranne, R{\'e}mi and Giboulot, Quentin and Bas, Patrick},
URL = {https://hal.archives-ouvertes.fr/hal-02147763},
BOOKTITLE = {{ACM IH\&MMSec (Information Hiding \& Multimedia Security)}},
ADDRESS = {Paris, France},
SERIES = {ACM IH\&MMSec (Information Hiding \& Multimedia Security)},
YEAR = {2019},
MONTH = Jul,
DOI = {10.1145/3335203.3335726},
KEYWORDS = {steganography ; steganalysis ; contest ; forensics ; Security and privacy},
PDF = {https://hal.archives-ouvertes.fr/hal-02147763/file/ALASKA_lesson_learn_Vsubmitted.pdf},
HAL_ID = {hal-02147763},
HAL_VERSION = {v1},
}
@inproceedings{sepak,
TITLE = {{Formalizing Cover-Source Mismatch as a Robust Optimization}},
AUTHOR = {{\v S}ep{\'a}k, Dominik and Adam, Luk{\'a}{\v s} and Pevn{\'y}, Tom{\'a}{\v s}},
BOOKTITLE = {{EUSIPCO: European Signal Processing Conference}},
ADDRESS = {Belgrade, Serbia},
YEAR = {2022},
MONTH = Sep,
}
@inproceedings{10.1145/3437880.3460395,
TITLE = {{How to Pretrain for Steganalysis}},
AUTHOR = {Butora, Jan and Yousfi, Yassine and Fridrich, Jessica},
YEAR = {2021},
ISBN = {9781450382953},
PUBLISHER = {Association for Computing Machinery},
ADDRESS = {New York, NY, USA},
URL = {https://doi.org/10.1145/3437880.3460395},
DOI = {10.1145/3437880.3460395},
BOOKTITLE = {Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security},
PAGES = {143--148},
NUMPAGES = {6},
KEYWORDS = {steganalysis, imagenet, convolutional neural network, JPEG, transfer learning},
LOCATION = {Virtual Event, Belgium},
SERIES = {IH\&MMSec '21}
}
@inproceedings{abecidan:hal-03840926,
TITLE = {{Using Set Covering to Generate Databases for Holistic Steganalysis}},
AUTHOR = {Abecidan, Rony and Itier, Vincent and Boulanger, J{\'e}r{\'e}mie and Bas, Patrick and Pevn{\'y}, Tom{\'a}{\v s}},
URL = {https://hal.archives-ouvertes.fr/hal-03840926},
BOOKTITLE = {{IEEE International Workshop on Information Forensics and Security (WIFS 2022)}},
ADDRESS = {Shanghai, China},
YEAR = {2022},
MONTH = Dec,
PDF = {https://hal.archives-ouvertes.fr/hal-03840926/file/2022_wifs.pdf},
HAL_ID = {hal-03840926},
HAL_VERSION = {v1},
}
Our experiments were made possible thanks to the computing resources of IDRIS, granted by GENCI under allocation 2021-AD011013285. This work received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No 101021687 (project "UNCOVER") and from the French Defense & Innovation Agency. The work of Tomáš Pevný was supported by the Czech Ministry of Education grant 19-29680L.