Wishlist of Future Capabilities

This page identifies functions that are desired but not yet implemented. See Demo #1 for definitions.

Open source compression NR metric trained on the VCRDCI dataset

Our goal in creating the VCRDCI dataset was to provide training data for an NR metric that predicts the quality impact of compression, but ignores all other impairments. It must be faster and more accurate than NR metric dipIQ.

Open source camera noise NR metric trained on the ITSnoise dataset

Our goal in creating the ITSnoise dataset was to provide training data for an NR metric that predicts the quality impact of noise produced by camera capture in low light, but ignores all other impairments.

NR metric identifying the likelihood that H.264, H.265, and AV1 will yield significantly different results

While H.264, H.265, and AV1 are known to produce different quality at a given bitrate, the relationship among media is usually the same or similar for each codec. We are interested in a metric that would identify media that are likely to trigger atypical responses from one of these codecs. For example, this could be used to understand whether the lower cost of H.264 compression is adequate, or whether the higher cost of H.265 compression is justified because the quality will increase more than usual.
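
One way such atypical media might be labeled for training is sketched below. This is only an illustration under stated assumptions: the function name `flag_atypical`, the input `gains` (a per-clip quality difference between two codecs at the same bitrate, measured by any quality score), and the threshold are all hypothetical, and the robust z-score is a swapped-in generic outlier test rather than a method proposed on this page.

```python
from statistics import median


def flag_atypical(gains, threshold=3.0):
    """Return indices of clips whose codec-to-codec quality gain is atypical.

    gains: per-clip quality difference, e.g. the H.265 score minus the
    H.264 score at the same bitrate (hypothetical input).
    A clip is flagged when its robust z-score, based on the median and the
    median absolute deviation (MAD), exceeds the threshold.
    """
    center = median(gains)
    mad = median(abs(g - center) for g in gains)
    if mad == 0:
        # More than half of the clips share the same gain; flag any that differ.
        return [i for i, g in enumerate(gains) if g != center]
    # 1.4826 scales the MAD to the standard deviation of a normal distribution.
    scale = 1.4826 * mad
    return [i for i, g in enumerate(gains) if abs(g - center) / scale > threshold]
```

For example, `flag_atypical([0.30, 0.32, 0.28, 0.31, 0.29, 0.95, 0.30])` flags only the clip at index 5, whose gain is far larger than usual. An NR metric of the kind described above would need to predict such flags from the media alone, without encoding it.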

Interpretability and Adequacy

These terms were proposed by Dr. Jorge Caviedes of Arizona State University (ASU) during the December 2022 VQEG meeting, while describing subjective assessment of medical images. Interpretability assesses the ability of a professional to correctly interpret the media when performing a task (e.g., reach a medical diagnosis). Adequacy assesses whether the media captures the correct information (e.g., if using digital fingerprinting, does the image contain a full fingerprint).

The desired future capability is to predict these aspects of video quality for tasks. These factors may help explain media quality assessment for tasks.

Python Implementation

People who cannot afford MATLAB licenses have expressed interest in a Python implementation of this repository.

Analysis Techniques for Ordered Data

Wang et al. make available the University of Southern California (USC) Just Noticeable Difference (JND) dataset. The USC JND dataset would be suitable for training NR metrics, if we had statistical methods for evaluating the performance of a metric on JND data. These methods would also let experts quickly create datasets with objective JND ratings, based on expert knowledge (e.g., bitrate reduction, resolution subsampling).

NR metric dipIQ provides analysis techniques that may be suitable.
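
As a rough illustration of what an analysis for ordered data could look like, the sketch below computes the fraction of ordered pairs that a metric ranks in the same direction. It is a simple stand-in statistic, not the technique used by dipIQ, and the names `pairwise_agreement`, `pairs`, and `scores` are hypothetical. It assumes that higher metric values mean better quality.

```python
def pairwise_agreement(pairs, scores):
    """Fraction of ordered pairs that the metric ranks in the same order.

    pairs:  iterable of (better_id, worse_id) tuples taken from ordered data,
            e.g. a clip and the same clip at a lower bitrate.
    scores: dict mapping each media id to the metric output
            (assumed: higher means better quality).
    A tie in the metric counts as half an agreement.
    """
    total = 0
    agree = 0.0
    for better, worse in pairs:
        total += 1
        if scores[better] > scores[worse]:
            agree += 1.0
        elif scores[better] == scores[worse]:
            agree += 0.5
    if total == 0:
        raise ValueError("no pairs supplied")
    return agree / total
```

A value near 1.0 would suggest the metric respects the known ordering, while a value near 0.5 would be no better than guessing. Deciding whether a given value is statistically meaningful is the open problem this wishlist item describes.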

Bitstream Reader

The code in this repository could support bitstream algorithms for video quality analysis, if bitstream support were added to read_media.m. Ioannis Katsavounidis and Margaret Pinson propose the following.

Most of the quality information in an encoded video's bitstream is associated with quantization parameter (QP), quantization scale (QS), and motion vectors (MV). QP and QS are likely related, and each block may have zero, one, or two MVs. Video coders use various block sizes, which may differ within a single image. Thus, it would be easiest to report all values on a per-pixel basis.

We would like a function that reads the video bitstream and returns the following values for each pixel:

| Variable | Definition |
| --- | --- |
| qp | quantization parameter |
| qs | quantization scale |
| x1 | relative horizontal coordinate of MV1 |
| y1 | relative vertical coordinate of MV1 |
| t1 | relative time coordinate of MV1 |
| wt1 | weight of MV1 [0..1] |
| flag1 | whether MV1 exists |
| x2 | relative horizontal coordinate of MV2 |
| y2 | relative vertical coordinate of MV2 |
| t2 | relative time coordinate of MV2 |
| wt2 | weight of MV2 [0..1] |
| flag2 | whether MV2 exists |

where:

  • MV1 is motion vector 1
  • MV2 is motion vector 2
  • Neither MV1 nor MV2 will exist for "I" frames
  • Both MV1 and MV2 will exist for "B" frames
  • Only MV1 will exist for "P" frames
  • Negative values indicate up or left
  • Positive values indicate down or right
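
The per-pixel reporting described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Block` record that a bitstream parser would produce. The names `Block`, `MotionVector`, `NAMES`, and `expand_to_pixels` are not part of this repository. Each block's values are copied to every pixel the block covers, so blocks of different sizes within one image need no special handling.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# (x, y, t, weight): relative horizontal, vertical, and time coordinates plus weight
MotionVector = Tuple[int, int, int, float]


@dataclass
class Block:
    """Hypothetical block-level record produced by a bitstream parser."""
    top: int
    left: int
    height: int
    width: int
    qp: int
    qs: float
    mv1: Optional[MotionVector] = None  # None when MV1 does not exist
    mv2: Optional[MotionVector] = None  # None when MV2 does not exist


NAMES = ["qp", "qs", "x1", "y1", "t1", "wt1", "flag1",
         "x2", "y2", "t2", "wt2", "flag2"]


def expand_to_pixels(blocks: List[Block], rows: int, cols: int) -> Dict[str, list]:
    """Return one rows-by-cols plane (list of lists) per variable in NAMES."""
    planes = {name: [[0] * cols for _ in range(rows)] for name in NAMES}
    for b in blocks:
        values = {"qp": b.qp, "qs": b.qs}
        for n, mv in (("1", b.mv1), ("2", b.mv2)):
            # A missing motion vector is reported as zeros with its flag cleared.
            x, y, t, wt = mv if mv is not None else (0, 0, 0, 0.0)
            values.update({"x" + n: x, "y" + n: y, "t" + n: t,
                           "wt" + n: wt, "flag" + n: int(mv is not None)})
        # Copy the block's values to every pixel it covers, clipped to the frame.
        for r in range(max(b.top, 0), min(b.top + b.height, rows)):
            for c in range(max(b.left, 0), min(b.left + b.width, cols)):
                for name, v in values.items():
                    planes[name][r][c] = v
    return planes
```

For instance, a 2x4 frame covered by one 2x2 block with a single motion vector and one 2x2 block with none yields a `flag1` plane whose rows are `[1, 1, 0, 0]`. Plain lists are used here to keep the sketch dependency-free; an array library would be the natural choice for real frame sizes.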

The ffmpeg software would be a suitable starting point for this function. The easiest solution would be to create a modified version of ffmpeg that saves the above information. Values x1, x2, y1, and y2 must ignore the conventions of the coding standard and instead be measured relative to the current frame as stated above, to avoid confusion among users of this repository.
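
The conversion to frame-relative coordinates might look like the sketch below. FFmpeg can export motion vectors as side data (the `+export_mvs` flag), and the hypothetical `ExportedMV` record here mirrors a subset of the fields in FFmpeg's `AVMotionVector` structure. The function `to_relative` is an assumption for illustration only. It also assumes that `source` carries a signed frame offset; in practice that field may only indicate direction (past or future), in which case a modified decoder would need to supply the true temporal distance.

```python
from typing import NamedTuple, Tuple


class ExportedMV(NamedTuple):
    """Hypothetical record mirroring a subset of FFmpeg's AVMotionVector fields."""
    source: int  # negative: reference is in the past; positive: in the future
    src_x: int   # absolute position of the reference block
    src_y: int
    dst_x: int   # absolute position of the block in the current frame
    dst_y: int


def to_relative(mv: ExportedMV) -> Tuple[int, int, int]:
    """Return (x, y, t) of the reference block relative to the current block.

    With image coordinates (origin at top left), negative x means left and
    negative y means up, matching the sign convention listed above.
    """
    return (mv.src_x - mv.dst_x, mv.src_y - mv.dst_y, mv.source)
```

For example, `to_relative(ExportedMV(-1, 12, 20, 16, 16))` gives `(-4, 4, -1)`: the reference block lies 4 pixels left and 4 pixels down, in an earlier frame.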

Independent Metric

Currently, there is no way to export a metric outside of the MATLAB ecosystem. The NR metric must either be implemented in another language, or the user must write a wrapper that reads an image or video and calculates the metric.