[Paper Review] Cali-Sketch: Stroke Calibration and Completion for High-Quality Face Image Generation from Poorly-Drawn Sketches 논문 리뷰

업데이트: February 21, 2022

Paper: Cali-Sketch: Stroke Calibration and Completion for High-Quality Face Image Generation from Poorly-Drawn Sketches (arxiv 2019): arxiv
GAN-Zoos! (GAN 포스팅 모음집)

Cali-Sketch: poorly-drawn-sketch to photo-realistic-image generation

Stroke Calibration Network (SCN): abstract한 sketch를 detail하게 만들어주는 network

Image Synthesis Network (ISN): refined sketch로부터 이미지를 생성해주는 network

(1) Stroke Calibration Network (SCN)

inconsequent stroke를 수정하고 input sketch를 detail하게 수정해주는 network → refined sketch $R$

\[\mathbf{R}=G_{1}\left(\mathbf{S}, \mathbf{E}_{g t}\right)\]

$\mathbf{E}{g t}$ : global contours $\mathbf{E}{g c}$ + local details $\mathbf{E}_{ld}$
calibration loss $\mathcal{L}_{CL}$
- global contour loss: inconsequent stroke를 수정
  - HED detection
- local detail loss: 세부적인 detail을 추가
  - Canny edge map
- 위 두 loss 모두 feature matching loss를 base로 하고 있음
\[\mathcal{L}_{C L}=\mathbb{E}\left[\sum_{i=1}^{T} \frac{1}{N_{i}}\left\|D_{1}^{(i)}\left(\mathbf{E}_{g t}[j]\right)-D_{1}^{(i)}(\mathbf{R})\right\|_{1}\right]\]
GAN Loss
\[\begin{aligned}\min _{D_{1}} L_{\mathrm{adv}, \mathrm{SCN}}\left(D_{1}\right)=& \frac{1}{2} \mathbb{E}_{\boldsymbol{x} \sim p(\boldsymbol{x})}\left[\left(D_{1}(\boldsymbol{x})-b\right)^{2}\right]+\\& \frac{1}{2} \mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}\left[\left(D_{1}\left(G_{1}(\boldsymbol{z})\right)-a\right)^{2}\right] \\\min _{G_{1}} L_{\mathrm{adv}, \mathrm{SCN}}\left(G_{1}\right)=& \frac{1}{2} \mathbb{E}_{\boldsymbol{z} \sim p_{z}(\boldsymbol{z})}\left[\left(D_{1}\left(G_{1}(\boldsymbol{z})\right)-c\right)^{2}\right]\end{aligned}\]
- where a, b, c denotes the labels for fake data and real data and the value that G wants D to believe for fake data respectively
Total Loss
\[\min _{G_{1}} \max _{D_{1}} \mathcal{L}_{G_{1}}=\min _{G_{1}}\left(\max _{D_{1}}\left(\mathcal{L}_{a d v, S C N}\right)+\lambda \mathcal{L}_{C L}\right)\]

(2) Image Synthesis Network (ISN)

SCN을 통해 얻은 refined sketch $R$ 를 input으로 삼아 photo-realistic face photo $P$ 를 생성하는 network

\[\mathbf{P}=G_{2}\left(\mathbf{R}, \mathbf{I}_{g t}\right)\]

Reconstruction Loss
\[\mathcal{L}_{\ell_{1}}=\mathbb{E}\left[\sum_{i} \frac{1}{N_{i}}\left\|\mathbf{I}_{g t}-\mathbf{P}\right\|_{1}\right]\]
Perceptual Loss
- VGG-19 pre-trained network로 두 activated feature 사이에 거리를 계산
\[\mathcal{L}_{\text {percep }}=\mathbb{E}\left[\sum_{i} \frac{1}{N_{i}}\left\|\phi_{i}\left(\mathbf{I}_{g t}\right)-\phi_{i}(\mathbf{P})\right\|_{1}\right]\]
Style Loss
\[\mathcal{L}_{\text {style }}=\mathbb{E}_{j}\left[\left\|G_{j}^{\phi}\left(\mathbf{I}_{g t}\right)-G_{j}^{\phi}(\mathbf{P})\right\|_{1}\right]\]
GAN Loss
Total variation loss
\[\mathcal{L}_{\mathrm{tv}}=\left\|\nabla_{x} \mathbf{P}-\nabla_{y} \mathbf{P}\right\|_{1}\]
Total Loss
\[\mathcal{L}_{G_{2}}=\lambda_{1} \mathcal{L}_{\ell_{1}}+\lambda_{2} \mathcal{L}_{a d v, I S N}+\lambda_{3} \mathcal{L}_{\text {percep }}+\lambda_{4} \mathcal{L}_{\text {style }}+\mathcal{L}_{\mathrm{tv}}\]

Training

Result

Twitter Facebook LinkedIn

Jihye Back

[Paper Review] Cali-Sketch: Stroke Calibration and Completion for High-Quality Face Image Generation from Poorly-Drawn Sketches 논문 리뷰

(1) Stroke Calibration Network (SCN)

(2) Image Synthesis Network (ISN)

Training

Result

공유하기

댓글남기기

참고

[Paper Review] RAFT: Adapting Language Model to Domain Specific RAG 논문 리뷰

[4] RAG (Retriever Augumented Generation)

[3-2] A Survey of Large Language Models - Adaptation of LLMs

[3-1] A Survey of Large Language Models - 다양한 LLMs부터, Data, Architecture, Training 까지