MirrorPPR: Exemplar-Based Portrait Photo Retouching

Zhihong Liu1, Zheng Li1, Jiachun Jin1, Siqi Kou1,2, Yitao Jian1,2, Fengpei Yu2, Zhijie Deng1
1Shanghai Jiao Tong University 2Triverse AI
Figure: teaser example for the operation “Widen nasal alae”, with Source (exemplar before retouching) and Query (query before retouching) panels.

Instead of relying on ambiguous text instructions, MirrorPPR infers fine-grained structural retouching operations (such as facial-feature adjustment, face-contour refinement, and body-proportion reshaping) from a before-and-after exemplar pair and transfers them to a new portrait, preserving identity and leaving unrelated regions unchanged.

Abstract

While text-guided image editing has made remarkable progress, it remains limited in structural portrait retouching: textual descriptions struggle to convey fine-grained changes to facial features and body proportions. To address this gap, we introduce Exemplar-Based Portrait Photo Retouching, where the model is given a before-and-after exemplar pair and tasked with inferring the demonstrated retouching operations and applying them to a new query image. Existing exemplar-based editing methods primarily focus on tasks with pronounced visual transformations. In contrast, structural portrait retouching involves extremely delicate and localized modifications, making accurate extraction and transfer of these edits challenging. To tackle this, we propose MirrorPPR, a framework specifically designed to capture and transfer subtle structural retouching operations. Our method uses a Retouching Operation Extractor to capture the subtle differences within the exemplar pair. The extracted representations are then injected into a pre-trained Diffusion Transformer (DiT) through a connector and Low-Rank Adaptation (LoRA) modules. Furthermore, cross-identity training pairs are hard to construct because the same retouching operation rarely aligns across different portraits. To overcome this, we propose a data self-augmentation paradigm that keeps the retouching operations strictly aligned. To alleviate data scarcity and support this new task, we introduce MirrorPPR47M, a large-scale dataset with over 47 million retouched pairs. By structuring the dataset into simulated and professional subsets, we enable progressive, easy-to-hard curriculum learning. Extensive experiments demonstrate that MirrorPPR significantly outperforms existing baselines in both retouching quality and identity preservation.

Method

MirrorPPR formulates portrait photo retouching as exemplar-based retouching operation transfer. Given an exemplar source image Xs, its retouched counterpart Xt, and a new query portrait Xq, the goal is to generate a retouched result that applies the same operations demonstrated by the exemplar pair. The framework first uses a Retouching Operation Extractor to capture the subtle differences between Xs and Xt: a frozen MAE provides local, structure-aware patch features, while a trainable R-Former uses learnable query tokens to distill these features into a compact operation representation. The extractor is pre-trained with an auxiliary reconstruction task, encouraging the representation to encode precise retouching intent. After pre-training, the extracted operation representation is mapped by a connector into the conditioning space of a frozen dual-stream DiT backbone. Since the task is driven by visual demonstrations rather than text instructions, MirrorPPR conditions the diffusion model on the query image and the extracted retouching operation. The R-Former, connector, and newly added LoRA modules are jointly fine-tuned, enabling the model to learn retouching operation transfer while preserving the strong prior of the pre-trained editing model.
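As a rough illustration of how learnable query tokens can distill a variable number of patch features into a fixed-size representation, the sketch below implements a single cross-attention step in plain Python. It is not the R-Former itself: the projection-free attention, the token counts, the feature dimension, and the choice to concatenate source and target features are all assumptions made for illustration, and random vectors stand in for MAE features.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, features):
    """Single-head cross-attention without learned projections.

    queries:  n_q vectors of size d (stand-ins for learnable query tokens)
    features: n_p vectors of size d (stand-ins for frozen-encoder patch features)
    Returns (outputs, weights); outputs holds n_q vectors of size d.
    """
    d = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        # Scaled dot-product score between this query and every patch feature.
        scores = [sum(qi * fi for qi, fi in zip(q, f)) / math.sqrt(d) for f in features]
        w = softmax(scores)
        # Each output is an attention-weighted average of the patch features.
        out = [sum(w[j] * features[j][k] for j in range(len(features))) for k in range(d)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


def random_vectors(n, d, rng):
    """Generate n random d-dimensional vectors (placeholder data)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
DIM, NUM_QUERIES, NUM_PATCHES = 8, 4, 16  # hypothetical sizes

query_tokens = random_vectors(NUM_QUERIES, DIM, rng)
feats_source = random_vectors(NUM_PATCHES, DIM, rng)  # stand-in for MAE features of Xs
feats_target = random_vectors(NUM_PATCHES, DIM, rng)  # stand-in for MAE features of Xt

# Assumption: source and target features are concatenated along the token axis.
operation_tokens, attn = cross_attend(query_tokens, feats_source + feats_target)
print(len(operation_tokens), len(operation_tokens[0]))  # 4 8
```

The output size depends only on the number of query tokens, which is what makes the representation compact regardless of how many patches the encoder produces.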

MirrorPPR architecture

Pipeline stages: (1) extract operation, (2) inject into DiT, (3) transfer to query.

Data Self-Augmentation

A central challenge in exemplar-based portrait retouching is constructing valid cross-identity training quadruplets. Because portraits differ in pose, shot scale, occlusion, and visible body regions, the same local retouching operation may not be applicable or spatially aligned across different identities. Meanwhile, the naive same-identity construction, where the exemplar source is directly reused as the query, introduces pixel-level shortcut learning: the model may simply copy coordinate-wise differences instead of understanding the underlying retouching semantics.

MirrorPPR addresses this with Data Self-Augmentation. For each source-target exemplar pair (Xs, Xt), we apply the same random spatial augmentation A, such as scaling, cropping, rotation, or horizontal flipping, to both images, forming the query Xq = A(Xs) and its retouched training target Yq = A(Xt). This construction keeps the retouching operation strictly aligned between the exemplar pair and the query pair, while breaking their absolute coordinate correspondence. As a result, the model is encouraged to transfer the demonstrated operation according to the query portrait’s own spatial layout, enabling robust cross-identity generalization.
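The construction above can be sketched on toy images stored as nested lists. This is a minimal illustration, assuming only cropping and horizontal flipping (scaling and rotation need interpolation and are omitted here); the function names, image sizes, and sampling ranges are hypothetical.

```python
import random


def hflip(img):
    """Mirror an image (list of rows) left to right."""
    return [row[::-1] for row in img]


def crop(img, top, left, height, width):
    """Cut a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def sample_augmentation(height, width, rng):
    """Sample ONE spatial augmentation A and return it as a reusable function."""
    crop_h = rng.randint(height // 2, height)
    crop_w = rng.randint(width // 2, width)
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    flip = rng.random() < 0.5

    def apply(img):
        out = crop(img, top, left, crop_h, crop_w)
        return hflip(out) if flip else out

    return apply


def diff(before, after):
    """Per-pixel difference map, a crude stand-in for 'the retouching operation'."""
    return [[b - a for a, b in zip(ra, rb)] for ra, rb in zip(before, after)]


# Toy 6x6 exemplar source and a "retouched" copy with a small local edit.
x_s = [[10 * r + c for c in range(6)] for r in range(6)]
x_t = [row[:] for row in x_s]
for r in (2, 3):
    for c in (1, 2):
        x_t[r][c] += 5

rng = random.Random(7)
aug = sample_augmentation(6, 6, rng)

# The same A is applied to both images: Xq = A(Xs), Yq = A(Xt).
x_q, y_q = aug(x_s), aug(x_t)

# The operation stays aligned: the query pair's difference is A(exemplar difference).
print(diff(x_q, y_q) == aug(diff(x_s, x_t)))  # True
```

Because A is shared, the edit in the query pair is the exemplar edit moved to new coordinates, so copying coordinate-wise differences from the exemplar no longer reproduces the target.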

Data self-augmentation illustration

MirrorPPR47M Dataset

MirrorPPR47M is a large-scale dataset designed for exemplar-based portrait retouching. It contains over 47 million retouched pairs covering facial features, face contours, and body proportions. To make subtle real-world retouching learnable, the dataset is organized for an easy-to-hard curriculum: a Simulated Retouching Subset provides pronounced geometric deformations for learning fundamental structural variations, while a Professional Retouching Subset provides authentic and fine-grained retouching operations.
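One simple way to realize an easy-to-hard curriculum is a two-stage schedule that samples from the simulated subset first and the professional subset afterwards. The sketch below assumes a hard switch at a fixed step; the text does not specify the actual schedule, so `switch_step`, the batch size, and the placeholder pair names are purely illustrative.

```python
import random


def curriculum_batches(simulated, professional, total_steps, switch_step, batch_size, seed=0):
    """Yield (step, stage, batch), drawing from the easy subset before switch_step.

    Assumption: a hard switch between subsets. A gradual mixing schedule
    would be an equally plausible reading of "progressive" curriculum learning.
    """
    rng = random.Random(seed)
    for step in range(total_steps):
        stage = "simulated" if step < switch_step else "professional"
        pool = simulated if stage == "simulated" else professional
        yield step, stage, [rng.choice(pool) for _ in range(batch_size)]


# Placeholder identifiers standing in for retouched training pairs.
simulated_pairs = [f"sim_{i}" for i in range(5)]
professional_pairs = [f"pro_{i}" for i in range(5)]

schedule = list(
    curriculum_batches(
        simulated_pairs, professional_pairs, total_steps=6, switch_step=3, batch_size=2
    )
)
for step, stage, batch in schedule:
    print(step, stage, batch)
```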

The simulated subset is built from 30,171 high-quality FFHQ images and uses Landmark-Guided Local Warping (LLW) to generate 8 base facial operation types, each with two opposite directions, resulting in 808,439 retouched pairs. The professional subset is built from 3,789 4K–8K portraits from PPR10K and applies 27 professional retouching operations, including 18 facial-feature operations, 4 face-shape operations, and 5 body-proportion operations, yielding 46,642,845 finely retouched pairs. Together with the self-augmentation pipeline, MirrorPPR47M provides operation-aligned yet spatially decoupled training data for learning realistic structural portrait retouching.
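The counts quoted above can be cross-checked with a few lines of arithmetic. The numbers are taken directly from the text; the per-image averages are derived here and are not reported in the source.

```python
SIM_IMAGES, SIM_PAIRS = 30_171, 808_439
PRO_IMAGES, PRO_PAIRS = 3_789, 46_642_845

# Total size of the dataset ("over 47 million retouched pairs").
total_pairs = SIM_PAIRS + PRO_PAIRS
print(total_pairs)  # 47451284

# 8 base operation types, each with two opposite directions.
sim_directional_ops = 8 * 2
print(sim_directional_ops)  # 16

# Professional operations by group should add up to 27.
pro_ops = {"facial_feature": 18, "face_shape": 4, "body_proportion": 5}
print(sum(pro_ops.values()))  # 27

# Derived averages: retouched pairs per source image in each subset.
print(round(SIM_PAIRS / SIM_IMAGES, 1))  # 26.8
print(round(PRO_PAIRS / PRO_IMAGES))     # 12310
```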

808K simulated retouched pairs
46.6M professional retouched pairs
8 simulated retouching operation types
27 professional retouching operation types

Experimental Results

We evaluate MirrorPPR in a cross-identity setting where the exemplar pair and the query portrait come from different identities, matching the intended inference scenario rather than the self-augmented training construction. We use two benchmarks: SimFace-100, which contains 100 combinations of 8 LLW-based facial retouching operations applied to 12 face images, and ProPortrait-500, which contains 500 combinations of 27 professional operations applied to 40 high-quality portraits. MirrorPPR is compared with strong baselines from three categories: multi-reference image editing, exemplar-based image editing, and text-guided image editing.

The results show that existing multi-reference and exemplar-based methods often misinterpret the task as image blending, copying, or face swapping. Text-guided methods achieve better reconstruction but still suffer from identity drift and imprecise control over fine-grained structural changes. In contrast, MirrorPPR consistently transfers the demonstrated structural operations with high fidelity and strong identity preservation.

Quantitative Comparisons

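Quantitative comparison results with baselines on SimFace-100.

| Category | Model | PSNR ↑ | SSIM ↑ | LPIPS ↓ | Face Similarity ↑ |
|---|---|---|---|---|---|
| Multi-Reference Image Editing | Qwen-Image-Edit-2511 | 9.06 | 0.468 | 0.745 | 0.207 |
| Multi-Reference Image Editing | FLUX.2-dev | 9.28 | 0.481 | 0.698 | 0.110 |
| Multi-Reference Image Editing | Nano Banana 2 | 16.72 | 0.784 | 0.329 | 0.556 |
| Multi-Reference Image Editing | Seedream 4.5 | 13.01 | 0.709 | 0.501 | 0.351 |
| Exemplar-based | Qwen-Image-Edit-2511-ICEdit-LoRA | 9.21 | 0.533 | 0.640 | 0.300 |
| Exemplar-based | RelationAdapter | 16.57 | 0.698 | 0.543 | 0.204 |
| Exemplar-based | EditTransfer | 15.68 | 0.691 | 0.492 | 0.464 |
| Text-guided | Qwen-Image-Edit-2511 | 25.80 | 0.862 | 0.260 | 0.463 |
| Text-guided | FLUX.2-dev | 22.44 | 0.804 | 0.301 | 0.531 |
| Text-guided | Nano Banana 2 | 24.25 | 0.860 | 0.239 | 0.601 |
| Text-guided | Seedream 4.5 | 18.01 | 0.788 | 0.368 | 0.600 |
| Ours | MirrorPPR-Face | 32.25 | 0.909 | 0.186 | 0.937 |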
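
Quantitative comparison results with baselines on ProPortrait-500.

| Category | Model | PSNR ↑ | SSIM ↑ | LPIPS ↓ | Face Similarity ↑ |
|---|---|---|---|---|---|
| Multi-Reference Image Editing | Qwen-Image-Edit-2511 | 10.23 | 0.538 | 0.645 | 0.413 |
| Multi-Reference Image Editing | FLUX.2-dev | 9.36 | 0.466 | 0.728 | 0.220 |
| Multi-Reference Image Editing | Nano Banana 2 | 17.72 | 0.835 | 0.250 | 0.811 |
| Multi-Reference Image Editing | Seedream 4.5 | 12.12 | 0.689 | 0.436 | 0.705 |
| Exemplar-based | Qwen-Image-Edit-2511-ICEdit-LoRA | 12.06 | 0.631 | 0.564 | 0.606 |
| Exemplar-based | RelationAdapter | 15.74 | 0.709 | 0.586 | 0.283 |
| Exemplar-based | EditTransfer | 18.32 | 0.748 | 0.481 | 0.457 |
| Text-guided | Qwen-Image-Edit-2511 | 20.85 | 0.732 | 0.387 | 0.501 |
| Text-guided | FLUX.2-dev | 19.94 | 0.748 | 0.345 | 0.616 |
| Text-guided | Nano Banana 2 | 27.45 | 0.904 | 0.183 | 0.667 |
| Text-guided | Seedream 4.5 | 16.43 | 0.770 | 0.378 | 0.782 |
| Ours | MirrorPPR-Pro | 32.65 | 0.927 | 0.200 | 0.960 |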
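
For readers unfamiliar with the metrics in these tables, the sketch below computes PSNR from its standard definition, 10 · log10(MAX² / MSE), on toy grayscale images. It also shows cosine similarity between two embedding vectors, which is a common way to score face similarity; the source does not state which face-recognition model or similarity measure was used, so that part is an assumption, and the embeddings are made-up numbers.

```python
import math


def psnr(reference, result, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_ref = [p for row in reference for p in row]
    flat_res = [p for row in result for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_res)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (assumed face-embedding score)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy 2x2 images: every pixel is off by 10, so MSE = 100.
ground_truth = [[100, 100], [100, 100]]
prediction = [[110, 110], [110, 110]]
print(round(psnr(ground_truth, prediction), 2))  # 28.13

# Hypothetical embeddings of the query face and the retouched result.
emb_query = [1.0, 0.0, 1.0]
emb_result = [1.0, 0.0, 1.0]
print(round(cosine_similarity(emb_query, emb_result), 3))  # 1.0
```

Higher PSNR means the output is closer to the ground-truth retouched image pixel by pixel, while a higher similarity score indicates that the identity of the query portrait is better preserved.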

Qualitative Comparisons

BibTeX