ITU-T P.1203 and P.1204 model development

We developed several video quality models that use bitstream-based data to predict subjective video and audiovisual quality in the context of HTTP-based adaptive streaming (HAS), such as MPEG-DASH (Dynamic Adaptive Streaming over HTTP).

Overall P.1203 and P.1204 structure. Here, the P.1204 models are complementary to the Pv-module, P.1203.1, and provide short-term video-quality estimations. The P.1204.3 model developed by our group is bitstream-based, like the P.1203.1 Pv-module, as described further below.

ITU-T P.1204.3

ITU-T P.1204.3 is part of ITU-T P.1204, a set of video-quality models developed for up to UHD-1/4K resolution. ITU-T P.1204.3 is a short-term video quality prediction model that uses full bitstream data to estimate video-quality scores.

It provides two outputs:

A segment-level score for an input bitstream of a video of 5 to 10 sec duration, on a 5-point scale [1,5], reflecting the Mean Opinion Score (MOS) collected from more than 600 subjects in a series of laboratory tests on almost 5,000 video sequences. These quality tests were carried out during the so-called P.NATS Phase 2 competition conducted within ITU-T’s Question Q14, Study Group 12, in collaboration with the Video Quality Experts Group (VQEG).
A per-1-sec video-quality score that can be used for longer-term quality or QoE integration, following the model architecture of ITU-T Rec. P.1203 (see details below).

The P.1204.3 standard is accessible here.

Reference Implementation

We have created an open-source reference implementation that is now available for ITU-T P.1204.3. The model and its evaluation are described in an accompanying conference paper, see (Rao et al., IEEE QoMEX 2020).

Mode 3 Video Parser

To run the reference implementation, a bitstream parser is required, which is also available open source. This bitstream parser can be used for H.264, H.265 and VP9 encoded videos. It extracts a number of features from the bitstream, such as QP-values and statistics about motion vectors and transform coefficients. The bitstream parser is described in more detail in (Rao et al., IEEE QoMEX 2020).
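
As a rough illustration of the kind of statistics such a parser makes available, the following sketch averages QP values per frame type. The record layout (`frame_type`, `qp`) and the sample values are assumptions made for this example only; they are not the parser's actual output format, and the sketch is not part of the P.1204.3 model.

```python
from collections import defaultdict
from statistics import mean


def mean_qp_by_frame_type(frames):
    """Average the QP of hypothetical per-frame records, grouped by frame type."""
    qp_values = defaultdict(list)
    for frame in frames:
        qp_values[frame["frame_type"]].append(frame["qp"])
    return {frame_type: mean(values) for frame_type, values in qp_values.items()}


# Hypothetical per-frame records; a real parser report will look different.
frames = [
    {"frame_type": "I", "qp": 24},
    {"frame_type": "P", "qp": 28},
    {"frame_type": "P", "qp": 30},
    {"frame_type": "B", "qp": 33},
]

qp_stats = mean_qp_by_frame_type(frames)
print(qp_stats)
```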

Open Access Video Quality Test Dataset – AVT-VQDB-UHD-1

We further published a large-scale video quality database for UHD-1; the subjective data as well as most of the videos used can be downloaded. The database is described in (Rao et al., IEEE ISM 2019). In addition to the validation carried out during the P.1204 standard development, we used this database for a complementary evaluation of the ITU-T P.1204.3 model, see (Rao et al., IEEE QoMEX 2020).

ITU-T P.1203

ITU-T Rec. P.1203 is the world’s first standard for measuring the Quality of Experience of HTTP Adaptive Streaming services for longer viewing sessions between 1 and 5 min duration.

P.1203 comprises three modules:

Short-term video-quality module Pv (“P” for “prediction”; ITU-T Rec. P.1203.1), providing per-1-sec video-quality scores on the aforementioned 5-point “MOS scale”. The bitstream model is available in different “Modes” that take input information of different complexity, depending on what is available to a corresponding probe. Input information ranges from metadata such as audio codec used, video resolution and framerate, audio and video bitrate (Mode 0) over information about encoded frame type and size (Mode 1) to frame-type specific QP information available from full access to the bitstream (Modes 2 and 3). The Pv model was initially developed for H.264/MPEG-4 AVC encoding.
Short-term audio-quality module Pa (ITU-T Rec. P.1203.2), delivering per-1-sec audio-quality scores on the 5-point “MOS scale”. The audio-quality module can handle a variety of audio codecs and is based on metadata only (“Mode 0”).
Quality integration module Pq (ITU-T Rec. P.1203.3), delivering (a) a per-1-sec audiovisual quality score, (b) an integral audiovisual quality score for the complete session, and (c) as main output an integral session quality score. The latter reflects the Quality of Experience (QoE) resulting from audio and video quality as well as typical adaptive-streaming factors: quality adaptation and the resulting visible or audible switches, the initial loading delay at the beginning of a session, and stalling events that occur when the playout buffer is depleted.
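
The relation between available input information and the Pv modes described above can be sketched as follows. This is only an illustration of the mode description on this page; the names `InputLevel` and `select_input_level` are hypothetical and do not come from the standard or the reference software. Modes 2 and 3 are grouped because the description above does not distinguish them further.

```python
from enum import Enum


class InputLevel(Enum):
    """Input information levels as described for the Pv module (hypothetical names)."""
    METADATA = "Mode 0"              # codec, resolution, framerate, bitrate
    FRAME_INFO = "Mode 1"            # additionally frame types and sizes
    BITSTREAM_QP = "Modes 2 and 3"   # additionally frame-type specific QP


def select_input_level(has_frame_info: bool, has_qp: bool) -> InputLevel:
    """Pick the most detailed level that the probe's available data supports."""
    if has_qp:
        return InputLevel.BITSTREAM_QP
    if has_frame_info:
        return InputLevel.FRAME_INFO
    return InputLevel.METADATA


# A probe that only sees metadata falls back to Mode 0.
print(select_input_level(has_frame_info=False, has_qp=False).value)
```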

The accompanying general standard document that outlines the application scope and other more general features of P.1203 is available as ITU-T Rec. P.1203.

Related scientific publications from our group describing specific model components are (Raake et al., IEEE QoMEX 2017) (scalable video-quality model for different types of input information) and (Robitza et al., ACM MMSys 2018) (open source implementation for P.1203, see below).

Reference Software

We developed a reference implementation of the ITU-T Rec. P.1203 standard. It is described in (Robitza et al., ACM MMSys 2018).

Open Dataset

An open dataset was created for the ITU-T P.1203 model, which contains training and validation databases from the standardization procedure. The database is described in (Robitza et al., ACM MMSys 2018) together with the open-source model implementation.

Technical Report

We have compiled a technical report detailing the model performance of the P.1203 series and P.1204.3 models on open datasets.

Codec Extensions

Moreover, to include the complementary video codecs H.265/MPEG-H HEVC and VP9, we developed a codec extension for the Mode 0 part of P.1203.

Additional Software

During development of the different bitstream models, we conducted several subjective video-quality tests. Further, we performed analyses with other state-of-the-art video-quality metrics and models.

avrateNG

To run a subjective test, we use our test tool avrateNG. It is a rating system for video, images, and general multimedia content, based on a web interface (server-client architecture).

AVrate Voyager

To run an online/remote/crowd test for image/video/audio quality assessment, the developed AVrate Voyager tool can be used.

cencro

Cencro is a center-cropped variant of Netflix’s VMAF (Video Multi-Method Assessment Fusion), which we used to compare our models during development. Because only the center crop of each frame is evaluated, it runs significantly faster than full-frame VMAF, with only a slight decrease in prediction accuracy.
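
The center-cropping step itself can be sketched in a few lines. This is a minimal illustration of center cropping in general, not cencro's implementation; the function names and the 640x360 crop size are assumptions chosen for the example, and frames are represented as plain nested lists to keep the sketch dependency-free.

```python
def center_crop_box(width, height, crop_width, crop_height):
    """Return (x, y, w, h) of a crop window centered in a width x height frame."""
    if crop_width > width or crop_height > height:
        raise ValueError("crop window must fit inside the frame")
    x = (width - crop_width) // 2
    y = (height - crop_height) // 2
    return x, y, crop_width, crop_height


def center_crop(frame, crop_width, crop_height):
    """Crop a frame given as a list of pixel rows to its centered window."""
    height, width = len(frame), len(frame[0])
    x, y, w, h = center_crop_box(width, height, crop_width, crop_height)
    return [row[x:x + w] for row in frame[y:y + h]]


# Crop window for a UHD-1 frame; the 640x360 size is an arbitrary example.
box = center_crop_box(3840, 2160, 640, 360)

# Tiny 4x4 "frame" cropped to its central 2x2 block.
small = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
cropped = center_crop(small, 2, 2)
print(box, cropped)
```

Any speedup comes from the quality metric having to process only the pixels inside the crop window instead of the full frame.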

Processing Chain for P.NATS Phase 2 and other repos

The processing chain was used to generate sequences for the P.NATS Phase 2 / AVHD-AS project of ITU-T SG12 and VQEG.

Who are we?

The Audiovisual Technology Group (AVT) is part of the Institute of Media Technology at TU Ilmenau, Germany, and is headed by Prof. Alexander Raake. The group deals with the function, application and perception of audio and video equipment and systems.

Publications

2022

Ramachandra Rao, Rakesh Rao, Göring, Steve, and Raake, Alexander. “AVQBits-Adaptive Video Quality Model Based on Bitstream Information for Various Video Applications.” IEEE Access. vol. 10. 2022. [url]
Robitza, Werner, Rakesh Rao Ramachandra-Rao, Steve Göring, Alexander Dethof, and Alexander Raake. “Deploying the ITU-T P.1203 QoE model in the wild and retraining for new codecs.” In Proceedings of the 1st Mile-High Video Conference, pp. 121-122 (2022). [url]

2021

Göring, Steve, Rakesh Rao Ramachandra Rao, Stephan Fremerey, and Alexander Raake. “AVrate Voyager: an open source online testing platform.” In 2021 IEEE 23rd International Workshop on Multimedia Signal Processing (MMSP), pp. 1-6. IEEE, 2021. [url]
Göring, Steve, Rakesh Rao Ramachandra Rao, Bernhard Feiten, and Alexander Raake. “Modular framework and instances of pixel-based video quality models for UHD-1/4K.” IEEE Access 9 (2021): 31842-31864. [url]
Rao, Rakesh Rao Ramachandra, Steve Göring, and Alexander Raake. “Towards High Resolution Video Quality Assessment in the Crowd.” In 2021 13th International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-6. IEEE, 2021. [url]
Robitza, Werner, Rakesh Rao Ramachandra Rao, Steve Göring, and Alexander Raake. “Impact of Spatial and Temporal Information on Video Quality and Compressibility.” In 2021 13th International Conference on Quality of Multimedia Experience (QoMEX), pp. 65-68. IEEE, 2021. [url]
Ramachandra Rao, Rakesh Rao, Steve Göring, and Alexander Raake. “Enhancement of Pixel-based Video Quality Models using Meta-data.” Electronic Imaging 2021, no. 9 (2021). [url]

2020

Raake, Alexander, Silvio Borer, Shahid M. Satti, Jörgen Gustafsson, Rakesh Rao Ramachandra Rao, Stefano Medagli, Peter List et al. “Multi-model standard for bitstream-, pixel-based and hybrid video quality assessment of UHD/4K: ITU-T P.1204.” IEEE Access 8 (2020): 193020-193049. [url]
Rakesh Rao Ramachandra Rao, Steve Göring, Robert Steger, Saman Zadtootaghaj, Nabajeet Barman, Stephan Fremerey, Sebastian Möller and Alexander Raake. “A Large-scale Evaluation of the bitstream-based video-quality model ITU-T P.1204.3 on Gaming Content.” 22nd International Workshop on Multimedia Signal Processing (MMSP), IEEE. Tampere, Finland. Sep 2020. [url]
Susanna Schwarzmann, Nick Hainke, Thomas Zinner, Christian Sieber, Werner Robitza and Alexander Raake. “Comparing fixed and variable segment durations for adaptive video streaming: a holistic analysis.” Proceedings of the 11th ACM Multimedia Systems Conference. 2020. [url]
Rakesh Rao Ramachandra Rao, Steve Göring, Werner Robitza, Alexander Raake, Bernhard Feiten, Peter List, and Ulf Wüstenhagen. “Bitstream-based Model Standard for 4K/UHD: ITU-T P.1204.3 – Model Details, Evaluation, Analysis and Open Source Implementation.” Twelfth International Conference on Quality of Multimedia Experience (QoMEX). Athlone, Ireland. May 2020. [url]
Werner Robitza, Alexander M. Dethof, Steve Göring, Alexander Raake, Tim Polzehl, and Andre Beyer. “Are You Still Watching? Streaming Video Quality and Engagement Assessment in the Crowd.” Twelfth International Conference on Quality of Multimedia Experience (QoMEX). Athlone, Ireland. May 2020. [url]

2019

Steve Göring, Christopher Krämmer, and Alexander Raake. “cencro – Speedup of Video Quality Calculation using Center Cropping.” 21st IEEE International Symposium on Multimedia (2019 IEEE ISM). Dec 2019. [url]
Rakesh Rao Ramachandra Rao, Steve Göring, Werner Robitza, Bernhard Feiten, and Alexander Raake. “AVT-VQDB-UHD-1: A Large Scale Video Quality Database for UHD-1.” 21st IEEE International Symposium on Multimedia (2019 IEEE ISM). Dec 2019. [url]
Rakesh Rao Ramachandra Rao, Steve Göring, Patrick Vogel, Nicolas Pachatz, Juan Jose Villamar Villarreal, Werner Robitza, Peter List, Bernhard Feiten, and Alexander Raake. “Adaptive video streaming with current codecs and formats: Extensions to parametric video quality model ITU-T P.1203.” Electronic Imaging. 2019. [url]

2018

Werner Robitza, Steve Göring, Alexander Raake, David Lindegren, Gunnar Heikkilä, Jörgen Gustafsson, Peter List, Bernhard Feiten, Ulf Wüstenhagen, Marie-Neige Garcia, Kazuhisa Yamagishi, and Simon Broom. “HTTP Adaptive Streaming QoE Estimation with ITU-T Rec. P.1203 – Open Databases and Software.” 9th ACM Multimedia Systems Conference. Amsterdam. 2018. [url]
Werner Robitza, Dhananjaya G. Kittur, Alexander M. Dethof, Steve Göring, Bernhard Feiten and Alexander Raake. “Measuring YouTube QoE with ITU-T P.1203 under Constrained Bandwidth Conditions.” Tenth International Conference on Quality of Multimedia Experience (QoMEX). IEEE. 2018. [url]

2017

Steve Göring, Alexander Raake and Bernhard Feiten. “A framework for QoE analysis of encrypted video streams.” Ninth International Conference on Quality of Multimedia Experience (QoMEX). May 2017. [url]
Alexander Raake, Marie-Neige Garcia, Werner Robitza, Peter List, Steve Göring and Bernhard Feiten. “A bitstream-based, scalable video-quality model for HTTP adaptive streaming: ITU-T P.1203.1.” Ninth International Conference on Quality of Multimedia Experience (QoMEX). May 2017. [url]
Robitza, Werner, Marie-Neige Garcia, and Alexander Raake. “A modular HTTP adaptive streaming QoE model - Candidate for ITU-T P.1203 (‘P.NATS’).” Ninth International Conference on Quality of Multimedia Experience (QoMEX). IEEE, 2017. [url]