FaceQ
An overview of the content of FaceQ: rating comparisons across eight dimensions. Each column presents a pair of intuitive examples for one dimension, with red indicating the better rating and blue the worse one. From left to right, the subsets are face generation, face customization, and
face restoration. The last row displays the corresponding prompts, reference image-prompt pairs, and GT-LQ image pairs.
F-bench
Average MOS comparison across all models and dimensions. (a) Face generation. (b) Face customization. (c) Face
restoration. The models are arranged clockwise by release date.
F-Eval
The overall framework of F-Eval. F-Eval evaluates quality, authenticity, correspondence, and identity fidelity in a one-for-all framework. It processes both single and paired images, along with prompts, to produce quality scores. It consists of three encoders,
namely a vision encoder, a face encoder, and a text tokenizer, to process multi-modal inputs. Their features are projected into a shared
space by trained projectors. A pre-trained large language model fuses the features and is fine-tuned with four LoRA experts.
The specific LoRA expert is activated by the dimension ID, which is predicted by a trainable router.
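The routed mixture-of-LoRA idea described above can be sketched as follows. This is a toy NumPy sketch, not the paper's implementation: the feature dimensions, the linear-classifier form of the router, and the zero-initialized LoRA B matrices are all assumptions; only the count of four experts and the dimension-ID routing come from the description.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; four LoRA experts, one per evaluation dimension
# (quality / authenticity / correspondence / identity fidelity).
D_IN, D_OUT, RANK, N_EXPERTS = 16, 16, 4, 4

# Frozen base projection standing in for one weight matrix of the LLM.
W_base = rng.standard_normal((D_OUT, D_IN)) * 0.02

# One low-rank (A, B) pair per expert; B starts at zero (standard LoRA init),
# so each expert initially contributes nothing and is trained from there.
lora_A = [rng.standard_normal((RANK, D_IN)) * 0.02 for _ in range(N_EXPERTS)]
lora_B = [np.zeros((D_OUT, RANK)) for _ in range(N_EXPERTS)]

# Trainable router: here a linear classifier over the fused features
# that predicts the dimension ID.
W_router = rng.standard_normal((N_EXPERTS, D_IN)) * 0.02

def route(x):
    """Predict the dimension ID (expert index) for input features x."""
    return int(np.argmax(W_router @ x))

def forward(x):
    """Frozen base layer plus the single LoRA expert selected by the router."""
    k = route(x)
    delta = lora_B[k] @ (lora_A[k] @ x)  # low-rank update of the chosen expert
    return W_base @ x + delta, k

x = rng.standard_normal(D_IN)
y, expert_id = forward(x)
print(expert_id)  # index of the expert that handled this input
```

Only one expert's low-rank update is applied per input, so each dimension gets a specialized adapter while the base model stays shared and frozen.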
Performance on generation and customization tasks
Performance of state-of-the-art models and the proposed F-Eval on our established FaceQ database in terms of the quality scoring task.
Performance on restoration task
Performance comparison between state-of-the-art methods and the proposed F-Eval on the face restoration subset of FaceQ.