mr-pylin/media-processing-workshop

📷 Media Processing Workshop

Implementation of various concepts around Digital Media (Image/Video) Processing (DMP) topics.

📖 Table of Contents

Main Notebooks

  1. Introduction
    How to read/plot an image using the matplotlib package
    Image properties: color, dtype, depth, resolution, ...
  2. Basic Modifications
    Crop, Flip, Circular Shift, Rotation
  3. Interpolations
    Nearest Neighbor, Bilinear, Bicubic, Lanczos interpolation
  4. Intensity Transformation
    Negative, Logarithm, Power-Law (Gamma correction), Piecewise-Linear Transform
  5. Histogram
    Histogram Stretching, Shrinking, Sliding
    Global Histogram Equalization
    Local Histogram Equalization (Adaptive Histogram Equalization)
    Adaptive Contrast Enhancement (ACE)
    Histogram Matching (Specification)
  6. Convolution
    1D Convolution
    2D Convolution (Grayscale/RGB image)
  7. Fourier Transform
    Basis vectors (1D) / images (2D)
    Forward/Backward Fourier Transform
    Fast Fourier Transform (FFT)
    Ideal Low-Pass filter
    Cardinal Sine (sinc) filter
    Ringing Effect
    Shift, Rotation, Flip effect in frequency domain
    Image sharpening using a Gaussian high-pass filter
    Periodic noise removal
  8. Cosine Transform
    Basis vectors (1D) / images (2D)
    Forward/Backward Cosine Transform
    Compression Effect (DFT vs DCT)
    Zonal Masking
  9. Quality Assessment
    Mean Squared Error (MSE)
    Signal-to-Noise Ratio (SNR)
    Peak Signal-to-Noise Ratio (PSNR)
    Structural Similarity Index (SSIM)
    Root Mean Square Error (RMSE)
    Mean Absolute Error (MAE)
    Mean Structural Similarity Index (MSSIM)
    Visual Information Fidelity (VIF)
    Feature Similarity Index (FSIM)
    Multi-Scale Structural Similarity Index (MS-SSIM)
  10. Steganography
    Steganography using least significant bits
  11. JPEG codec
    JPEG Encoder & Decoder
  12. MPEG codec
    MPEG Encoder & Decoder
  13. Image Registration
    Aligning multiple images into a common coordinate system
  14. Image Stitching
    Combining multiple images to create a single larger image [Panorama]
  15. Optical Flow
    Optical Flow using Lucas-Kanade & Farneback algorithms
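
The Quality Assessment notebook listed above covers MSE and PSNR. As a rough illustration only, the two metrics can be sketched in NumPy as shown here. This is not the repository's implementation: the function names `mse` and `psnr` and the `max_value` parameter are assumptions made for this example, and the test image is synthetic.

```python
import numpy as np


def mse(reference: np.ndarray, distorted: np.ndarray) -> float:
    """Mean squared error between two images of identical shape."""
    if reference.shape != distorted.shape:
        raise ValueError("images must have the same shape")
    # Cast to float first so that uint8 subtraction cannot wrap around.
    diff = reference.astype(np.float64) - distorted.astype(np.float64)
    return float(np.mean(diff ** 2))


def psnr(reference: np.ndarray, distorted: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; max_value is 255 for 8-bit images."""
    error = mse(reference, distorted)
    if error == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / error))


# Synthetic 8-bit image plus small uniform noise (hypothetical data).
rng = np.random.default_rng(seed=0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noise = rng.integers(-5, 6, size=image.shape)
noisy = np.clip(image.astype(np.int16) + noise, 0, 255).astype(np.uint8)

print(f"MSE:  {mse(image, noisy):.3f}")
print(f"PSNR: {psnr(image, noisy):.2f} dB")
```

Casting to `float64` before subtracting is the main design point: subtracting two `uint8` arrays directly would wrap around and silently corrupt the error.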

Utilities

Implementations of several concepts used in the main notebooks.

  • Padding and Convolution
    Provides utility functions for 1D and 2D convolution, along with flexible padding options.
  • DCT Implementation
    Discrete Cosine Transform (DCT) implementation for 1D and 2D signals.
  • DFT Implementation
    Implementations of 1D and 2D Discrete Fourier Transform (DFT) and related functions.
  • Filter Functions Implementation
    Implementations of various 2D filter functions including ideal, Gaussian, sinc, Butterworth, Chebyshev, Bessel, and block masks.
  • JPEG Codec Implementation
    Class-based implementation of JPEG encoder and decoder using discrete cosine transform (DCT) and quantization.
  • MPEG Codec Implementation
    Class-based implementation of MPEG encoder and decoder using discrete cosine transform (DCT) and quantization.
  • Quality Assessment Metrics
    Implementations of common metrics, e.g. MSE, SNR, PSNR, SSIM, RMSE, MAE, ...
  • Spatial Modifications
    Implementations of spatial-domain concepts, e.g. interpolations and histograms.
  • Steganography
    Implementation of a simple steganography method using Least Significant Bits (LSB).
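
To give a feel for the last utility, here is a minimal sketch of LSB steganography in NumPy. It is an illustration under stated assumptions, not the repository's code: the names `embed_lsb` and `extract_lsb` are hypothetical, the cover image is assumed to be `uint8`, and the payload length is passed explicitly instead of being stored in the image.

```python
import numpy as np


def embed_lsb(cover: np.ndarray, payload: bytes) -> np.ndarray:
    """Hide payload bits in the least significant bit of each cover sample."""
    if cover.dtype != np.uint8:
        raise TypeError("this sketch assumes an 8-bit (uint8) cover image")
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    if bits.size > cover.size:
        raise ValueError("payload does not fit into the cover image")
    stego = cover.copy().ravel()  # flat view of a fresh copy; cover is untouched
    # Clear the lowest bit of the first samples, then write one payload bit each.
    stego[: bits.size] = (stego[: bits.size] & 0xFE) | bits
    return stego.reshape(cover.shape)


def extract_lsb(stego: np.ndarray, n_bytes: int) -> bytes:
    """Read n_bytes back from the least significant bits."""
    bits = stego.ravel()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()


# Synthetic cover image (hypothetical data).
rng = np.random.default_rng(seed=1)
cover = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
message = b"DMP"

stego = embed_lsb(cover, message)
print(extract_lsb(stego, len(message)))
print(int(np.max(np.abs(stego.astype(np.int16) - cover.astype(np.int16)))))
```

Because only the lowest bit of each sample changes, no pixel value moves by more than 1, which is why the embedded data is visually imperceptible.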

📋 Prerequisites

📝 TODO

  • 04: Adaptive Contrast Enhancement (ACE)
  • 04: Histogram Matching (Specification)
  • 14: Sparse Optical Flow using Lucas-Kanade

⚙️ Setup

This project was developed using Python v3.12.3. If the pinned dependency versions fail to install or run, try this exact Python version.
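
A quick way to see whether the current interpreter matches is a small check like the one sketched here. The helper name `matches_tested_version` is hypothetical, and comparing only the major and minor numbers is an assumption made for this example; the README itself names only v3.12.3.

```python
import sys

# Version named in this README; only major.minor is compared (an assumption).
TESTED_VERSION = (3, 12, 3)


def matches_tested_version(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is on the same major.minor release."""
    return tuple(version_info[:2]) == TESTED_VERSION[:2]


if matches_tested_version():
    print("Python version matches the one used for development.")
else:
    current = ".".join(str(part) for part in sys.version_info[:3])
    tested = ".".join(str(part) for part in TESTED_VERSION)
    print(f"Running Python {current}; the project was developed with {tested}.")
```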

📦 Installing Dependencies

You can install all dependencies listed in requirements.txt using pip.

pip install -r requirements.txt

🛠️ Usage Instructions

  • Open the root folder with VS Code
    • Windows/Linux: Ctrl + K followed by Ctrl + O
    • macOS: Cmd + K followed by Cmd + O
  • Open .ipynb files using the Jupyter extension integrated with VS Code
  • Allow VS Code to install any recommended dependencies for working with Jupyter Notebooks.
  • Note: Jupyter is integrated with both VS Code & Google Colab

🔗 Useful Links

  • ffmpeg & ffprobe:
    • ffmpeg is a Swiss Army knife for media, converting and manipulating audio and video files in a wide range of formats.
    • Link: github.com/BtbN/FFmpeg-Builds
  • YUV4MPEG Videos:
  • Video Quality Measurement Tool (VQMT):
  • yuv-player:
  • H.264 (AVC) codec:
  • H.265 (HEVC) codec:
  • H.266 (VVC) codec:
  • NumPy:
    • A fundamental package for scientific computing in Python, providing support for arrays, matrices, and a large collection of mathematical functions.
    • Official site: numpy.org
  • Matplotlib:
    • A comprehensive library for creating static, animated, and interactive visualizations in Python.
    • Official site: matplotlib.org
  • OpenCV:
    • A powerful library for computer vision and image processing, supporting real-time operations on images and videos in Python and other languages.
    • Official site: opencv.org

🔍 Find Me

Any mistakes, suggestions, or contributions? Feel free to reach out to me at:

I look forward to connecting with you! 🏃‍♂️

©️ Copyright Information

  • Digital Image Processing by Gonzalez & Woods:
| Image | Copyright Owner | Address |
| --- | --- | --- |
| CH02_Fig0222(b)(cameraman).tif | Massachusetts Institute of Technology | MIT.edu |
| CH03_Fig0309(a)(washed_out_aerial_image).tif | NASA | nasa.gov |
| CH03_Fig0326(a)(embedded_square_noisy_512).tif | - | imageprocessingplace.com |
| CH03_Fig0354(a)(einstein_orig).tif | Public domain | - |
| CH06_Fig0638(a)(lenna_RGB).tif | Public domain | - |
| CH06_FigP0606(color_bars).tif | - | - |
  • Third-Party Assets:
    • Additional images located in ./assets/images/third_party/ are used with permission or according to their original licenses.
    • Attributions and references to original sources are included in the code where these images are used.
| Image | Copyright Owner | Address |
| --- | --- | --- |
| nature_1.jpg | - | pexels.com |
| nature_2.jpg | - | pexels.com |
  • Miscellaneous assets:
| Image | Copyright Owner | Address |
| --- | --- | --- |
| keyboard_1.jpg | Amirhossein Heydari | github.com/mr-pylin |
| keyboard_2.jpg | Amirhossein Heydari | github.com/mr-pylin |
| test.tif | Amirhossein Heydari | github.com/mr-pylin |

📄 License

This project is licensed under the Apache License 2.0.
You are free to use, modify, and distribute this code, but you must include copies of both the LICENSE and NOTICE files in any distribution of your work.
Note: Assets listed in the tables above may have their own licenses.
