[Paper Review] Cali-Sketch: Stroke Calibration and Completion for High-Quality Face Image Generation from Poorly-Drawn Sketches
- Paper: Cali-Sketch: Stroke Calibration and Completion for High-Quality Face Image Generation from Poorly-Drawn Sketches (arXiv 2019): arxiv
- GAN-Zoos! (a collection of GAN posts)
Cali-Sketch: poorly-drawn-sketch to photo-realistic-image generation
- Stroke Calibration Network (SCN): a network that turns an abstract sketch into a detailed one
- Image Synthesis Network (ISN): a network that generates an image from the refined sketch
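The two-stage structure can be sketched as a simple composition. This is only an illustration: `cali_sketch`, `toy_scn`, and `toy_isn` are hypothetical names, and the toy functions are placeholders standing in for trained networks, not anything from the paper.

```python
def cali_sketch(sketch, scn, isn):
    """Run the two stages in order and return both outputs."""
    refined = scn(sketch)   # stage 1: calibrate and complete the strokes
    photo = isn(refined)    # stage 2: synthesize the face photo
    return refined, photo

# Toy stand-ins so the sketch runs; a real SCN/ISN would be trained networks.
toy_scn = lambda s: [min(1.0, 2.0 * v) for v in s]
toy_isn = lambda r: [1.0 - v for v in r]

refined, photo = cali_sketch([0.1, 0.6], toy_scn, toy_isn)
```

The point of the split is that the ISN only ever sees the refined sketch, never the raw poorly-drawn input.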
(1) Stroke Calibration Network (SCN)
- A network that corrects inconsequent strokes and adds detail to the input sketch → refined sketch $R$
- $\mathbf{E}_{gt}$: global contours $\mathbf{E}_{gc}$ + local details $\mathbf{E}_{ld}$
- calibration loss $\mathcal{L}_{CL}$
  - global contour loss: corrects inconsequent strokes
    - uses HED (Holistically-Nested Edge Detection)
  - local detail loss: adds fine details
    - uses a Canny edge map
  - both of the losses above are based on the feature matching loss
- GAN Loss
\[\begin{aligned}\min _{D_{1}} L_{\mathrm{adv}, \mathrm{SCN}}\left(D_{1}\right)=& \frac{1}{2} \mathbb{E}_{\boldsymbol{x} \sim p(\boldsymbol{x})}\left[\left(D_{1}(\boldsymbol{x})-b\right)^{2}\right]+\\& \frac{1}{2} \mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}\left[\left(D_{1}\left(G_{1}(\boldsymbol{z})\right)-a\right)^{2}\right] \\\min _{G_{1}} L_{\mathrm{adv}, \mathrm{SCN}}\left(G_{1}\right)=& \frac{1}{2} \mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}\left[\left(D_{1}\left(G_{1}(\boldsymbol{z})\right)-c\right)^{2}\right]\end{aligned}\]
  - where $a$ and $b$ denote the labels for fake data and real data, respectively, and $c$ is the value that $G$ wants $D$ to believe for fake data
- Total Loss
\[\min _{G_{1}} \max _{D_{1}} \mathcal{L}_{G_{1}}=\min _{G_{1}}\left(\max _{D_{1}}\left(\mathcal{L}_{\mathrm{adv}, \mathrm{SCN}}\right)+\lambda \mathcal{L}_{CL}\right)\]
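The least-squares GAN terms and the SCN generator objective above can be written out in a few lines of plain Python. This is a minimal sketch, not the authors' implementation: the label values `a=0`, `b=1`, `c=1` and the weight `lam` are assumptions (the review does not give them), the calibration loss is passed in as an already-computed number, and the function names are made up for illustration.

```python
def mean(values):
    """Empirical stand-in for the expectation over a batch."""
    return sum(values) / len(values)

def lsgan_d_loss(d_real, d_fake, a=0.0, b=1.0):
    """Discriminator loss: push D(x) toward b and D(G(z)) toward a.
    The defaults a=0, b=1 are assumed, not taken from the paper."""
    real_term = mean([(d - b) ** 2 for d in d_real])
    fake_term = mean([(d - a) ** 2 for d in d_fake])
    return 0.5 * real_term + 0.5 * fake_term

def lsgan_g_loss(d_fake, c=1.0):
    """Generator loss: push D(G(z)) toward c (c=1 is assumed)."""
    return 0.5 * mean([(d - c) ** 2 for d in d_fake])

def scn_generator_objective(d_fake, calibration_loss, lam=1.0, c=1.0):
    """Adversarial term plus the lambda-weighted calibration loss."""
    return lsgan_g_loss(d_fake, c) + lam * calibration_loss
```

For example, `lsgan_d_loss(d_real=[0.9, 1.0], d_fake=[0.1, 0.0])` is close to zero because the discriminator already separates real from fake, while `lsgan_g_loss([0.1, 0.0])` is large because the generator has not fooled it yet.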
(2) Image Synthesis Network (ISN)
- A network that takes the refined sketch $R$ produced by the SCN as input and generates a photo-realistic face photo $P$
- Reconstruction Loss
\[\mathcal{L}_{\ell_{1}}=\mathbb{E}\left[\sum_{i} \frac{1}{N_{i}}\left\|\mathbf{I}_{g t}-\mathbf{P}\right\|_{1}\right]\]
- Perceptual Loss
  - computes the distance between the two activated features extracted by a pre-trained VGG-19 network
- Style Loss
\[\mathcal{L}_{\text {style }}=\mathbb{E}_{j}\left[\left\|G_{j}^{\phi}\left(\mathbf{I}_{g t}\right)-G_{j}^{\phi}(\mathbf{P})\right\|_{1}\right]\]
- GAN Loss
- Total variation loss
\[\mathcal{L}_{\mathrm{tv}}=\left\|\nabla_{x} \mathbf{P}-\nabla_{y} \mathbf{P}\right\|_{1}\]
- Total Loss
\[\mathcal{L}_{G_{2}}=\lambda_{1} \mathcal{L}_{\ell_{1}}+\lambda_{2} \mathcal{L}_{\mathrm{adv}, \mathrm{ISN}}+\lambda_{3} \mathcal{L}_{\text {percep }}+\lambda_{4} \mathcal{L}_{\text {style }}+\mathcal{L}_{\mathrm{tv}}\]
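To make the style loss and the weighted total concrete, here is a small pure-Python sketch. It assumes that $G_{j}^{\phi}$ is the Gram matrix of the layer-$j$ feature map, which is the usual reading of a style loss; the $1/(C \cdot N)$ normalization, the default weights, and all function names are assumptions for illustration. Feature maps are plain nested lists here rather than real VGG-19 activations.

```python
def gram_matrix(features):
    """features: list of C channels, each a flat list of N activations.
    The 1 / (C * N) normalization is an assumption."""
    c, n = len(features), len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / (c * n) for fj in features]
            for fi in features]

def style_loss(feats_gt, feats_pred):
    """Mean over layers j of the L1 distance between Gram matrices."""
    per_layer = []
    for f_gt, f_p in zip(feats_gt, feats_pred):
        g_gt, g_p = gram_matrix(f_gt), gram_matrix(f_p)
        dist = sum(abs(x - y)
                   for row_gt, row_p in zip(g_gt, g_p)
                   for x, y in zip(row_gt, row_p))
        per_layer.append(dist)
    return sum(per_layer) / len(per_layer)

def isn_total_loss(l1, adv, percep, style, tv, lambdas=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum in the order of the total-loss formula; tv has weight 1.
    The default lambdas are placeholders, not the paper's values."""
    w_l1, w_adv, w_percep, w_style = lambdas
    return w_l1 * l1 + w_adv * adv + w_percep * percep + w_style * style + tv
```

Each individual loss is passed to `isn_total_loss` as an already-computed scalar, so the function only shows how the $\lambda$ weights combine them.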