Contents

## Subgraph Isomorphism ๋ฌธ์  ์๊ฐ

Subgraph Isomorphism์ด๋, ์ฟผ๋ฆฌ ๊ทธ๋ํ $q$์ ๋ฐ์ดํฐ ๊ทธ๋ํ $G$๊ฐ ์ฃผ์ด์ง๋ ์ํฉ์์, $G$๊ฐ $q$์ isomorphicํ subgraph๋ฅผ ๊ฐ๋์ง ์ฌ๋ถ๋ฅผ ํ๋จํ๋ ๋ฌธ์ ์๋๋ค. ์ด ๋ฌธ์ ๋ NP-Complete์์ด ์ฆ๋ช๋์ด ์์ต๋๋ค. NP-Complete๋ฅผ ์ฆ๋ชํ๋ ๊ฒ์ ์ฐ๋ฆฌ์ ๋ผ์์ ๊ทธ๋ ๊ฒ ์ค์ํ์ง ์์ง๋ง, ์ ๊น ์๊ฐํด ๋ณด๋ฉด, Clique Problem (Subgraph Isomorphism์์, $q$๊ฐ ์ ์  $n$๊ฐ์ง๋ฆฌ ์์ ๊ทธ๋ํ $K_n$์ผ๋ก ํ์ ๋๋ ๋ฒ์ ) ๋ณด๋ค๋ ์ ์ด๋ ์ด๋ ค์ด ๋ฌธ์ ์์ ์ ์ ์์ต๋๋ค. Clique ๋ฌธ์ ๋ Karp๊ฐ NP-Complete๋ผ๋ ๊ฐ๋์ ์ ์ํ๊ณ  ์ฆ๋ชํ์ ๋ ๋์จ 21๊ฐ์ ์ค๋ฆฌ์ง๋ํ NP-Complete ๋ฌธ์  ์ค ํ๋๋ก, General SAT๋ก๋ถํฐ ํ์๋๋ฏ๋ก NP-Complete์๋๋ค.1

์ฐ๋ฆฌ๋ Subgraph์ ๊ดํ ๋ฌธ์ ๋ฅผ ์ ๊ทผํ๋ฉด์, ๋ค์๊ณผ ๊ฐ์ด Embedding์ ์ ์ํฉ๋๋ค.
Vertex Labeled graph $G, q$๊ฐ ์ฃผ์ด์ก์ ๋, ํจ์ $f : V_q \to V_G$๊ฐ ์กด์ฌํ์ฌ, $q$์์ edge $(u_1, u_2)$์ ๋ํด ํญ์ $G$์์ edge $(v_1 = f(u_1), v_2 = f(u_2))$ ๋ฅผ ์ฐพ์ ์ ์์ ๋, ์ด๋ฅผ Embedding ์ด๋ผ ํฉ๋๋ค. ๋จ, Labeled graph์ด๋ฏ๋ก $q$์์ $u$์ label๊ณผ $G$์์ $f(u)$์ label์ด ํญ์ ๊ฐ์์ผ ํฉ๋๋ค. ์์ผ๋ก ํธ์์ $u_1 \dots u_n$์ $q$์, $v_1 \dots v_n$์ $G$์ vertex๋ฅผ ์๋ฏธํ๋ ๊ฒ์ผ๋ก ์ฐ๊ฒ ์ต๋๋ค.

๋น์ทํ์ง๋ง ์ข๋ Application ์ค๋ฌ์ด ๋ฌธ์  ๋ ๊ฐ์ธ Subgraph search์ matching์, ๋ค์๊ณผ ๊ฐ์ด ์ ์๋ฉ๋๋ค.

• Subgraph Search๋, ๋ฐ์ดํฐ ๊ทธ๋ํ๊ฐ ํ๋๊ฐ ์๋๋ผ ์ฌ๋ฌ ๊ฐ $G_1, G_2, \dots G_n$์ ์งํฉ์ด ์ฃผ์ด์ง๊ณ , $q$์ embedding์ ๊ฐ๋ ๊ทธ๋ํ๋ค์ ๋ชจ๋ ์ฐพ๋ ๋ฌธ์ ์๋๋ค.
• Subgraph Matching์ด๋, ๋ฐ์ดํฐ ๊ทธ๋ํ๊ฐ ํ๋ ์ฃผ์ด์ง๊ณ , ์ฟผ๋ฆฌ ๊ทธ๋ํ๊ฐ ์ฃผ์ด์ ธ์, ๋ฐ์ดํฐ ๊ทธ๋ํ์์ $q$์ embedding์ ๋ชจ๋ (๋๋ ์ข๋ ํ์ค์ ์ผ๋ก, ๊ฐ๋ฅํํ ๋ง์ด) ์ฐพ๋ ๋ฌธ์ ์๋๋ค.

์ฆ, subgraph matching์ ํธ๋ ์๊ณ ๋ฆฌ์ฆ์๊ฒ embedding์ ํ๋ ์ฐพ๊ณ  return true ํ๋๋ก ํ๋ฉด, subgraph isomorphism์ด ๋๊ณ , ๋ค์ ์ด๊ฒ์ ์ฌ๋ฌ ๋ฐ์ดํฐ ๊ทธ๋ํ์ ๋ํด ๋ฐ๋ณตํ๋ฉด subgraph search๊ฐ ๋ฉ๋๋ค. ์ด ๊ธ์์๋, ์ธ ๋ฌธ์  ๋ชจ๋๋ฅผ ๋์ถฉ Subgraph Isomorphism์ด๋ผ๊ณ  ํ์น๊ณ  ๋งฅ๋ฝ์ ์ดํดํ๊ธฐ๋ก ํ๊ฒ ์ต๋๋ค.

## Subgraph Isomorphism Algorithms

์ด๋ฐ ์ด๋ ค์ด ๋ฌธ์ ์ ๋ํ ์ ๊ทผ์๋ ํฌ๊ฒ ์ธ ๊ฐ์ง๊ฐ ์์ต๋๋ค.

• ๋ฌธ์ ์ ์ ์ฝ ์๋ ๋ฒ์ ์ ๋ง๋ค์ด์, ๊ทธ ๋ฌธ์ ๋ฅผ ๋น ๋ฅด๊ฒ ํ๊ณ ์ ์๋ํฉ๋๋ค. ์ฃผ๋ก ์ฌ์ฉํ๋ ์ ๊ทผ์ผ๋ก๋ Chromatic Number๊ฐ $k$ ์ดํ์ธ ๊ทธ๋ํ, ํ๋ฉด ๊ทธ๋ํ, Sparseํ ๊ทธ๋ํ ๋ฑ์ ๋ํด ์๋ํฉ๋๋ค. ์ด ๋ฐฉํฅ์ ์ฐ๊ตฌ๋ ์ฃผ๋ก ์๊ณ ๋ฆฌ์ฆ์ด ๋คํญ ์๊ฐ ๋น์ทํ๊ฒ ์ค์ด๋ค๋ฉฐ (NP-Completeํ ๋ฌธ์ ์ ์ผ๋ถ๊ธด ํ์ง๋ง, ์ถ๊ฐ์ ์ธ ์กฐ๊ฑด์ ์ ์ฝํ์ผ๋ฏ๋ก ์ด๊ฒ ๊ฐ๋ฅํฉ๋๋ค) ์ํ์ ์ผ๋ก ์๋ฐํ๊ฒ ์ฆ๋ชํฉ๋๋ค.
• Randomized ์๊ณ ๋ฆฌ์ฆ์ด๋ ์ต์ ํ ํํ์ ์๊ณ ๋ฆฌ์ฆ์ ๋ง๋ค์ด์, expected time complexity๋ฅผ ์ค์ด๊ณ ์ ํฉ๋๋ค.
• ์ผ๋ฐ์ ์ธ Case์ ๋ํด, ํด๋ฆฌ์คํฑํ๊ฒ ๋น ๋ฅธ ์๊ณ ๋ฆฌ์ฆ์ ์ฐพ๊ณ , ์ด๋ฅผ ํฐ ๋ฐ์ดํฐ์์ ๋ํด ์คํ์ ํตํด ๊ฒ์ฆํ๋ ๋ฐฉํฅ์ด ์์ต๋๋ค.

2020๋ 2ํ๊ธฐ์ ์ธํด์ญ์ ์งํํ๋ฉด์ ์ฃผ๋ก 3๋ฒ์ ํด๋นํ๋ ์ชฝ์ ๊ณต๋ถํ๋๋ฐ, Sub-iso์์ 2๋ฒ์ ์ด๋ค ์์ธ์ง ์ ๋ชจ๋ฅด๊ฒ ๊ณ  (๋ณธ์  ์์ต๋๋ค) 1๋ฒ์ ์ฌ๋ฌ ์ฌ๋ฏธ์๋ ๊ฒฐ๊ณผ๋ค์ด ์์ง๋ง ์์ง ์์ธํ ์ฝ์ด๋ณด์ง๋ ๋ชปํ์ต๋๋ค.

์ด ๋ฌธ์ ์ 3๋ฒ ์ ๊ทผ์๋ ๋ค์ ํฌ๊ฒ ์ธ๊ฐ์ง ์ ๊ทผ์ด ์์ต๋๋ค.

• Ullmann (1976) ์ผ๋ก๋ถํฐ ์์ํ๋ Backtracking ๊ธฐ๋ฐ์ ์๊ณ ๋ฆฌ์ฆ์ผ๋ก, ๊ฐ์ฅ ์์ฐ์ค๋ฝ๊ฒ ์๊ฐํ  ์ ์๋ ๋ฐฑํธ๋ํน์ ๊ธฐ๋ฐํฉ๋๋ค.
• Artificial Intelligence ์ชฝ์์๋ ์ด ๋ฌธ์ ๋ฅผ ์๋นํ ์ค์ํ๊ฒ ๋ณด๊ณ  ์์ด์, ์ด์ชฝ์ ์ ๊ทผ๋ ์์ต๋๋ค. Graph ์์์ ๋ญ๊ฐ๋ฅผ ์ด์ฌํ ํ์ต์ํค๋ ๋ฐฉ๋ฒ์๋๋ค.
• Contraint Programming ์ด๋ผ๋ ์ ๊ธฐํ ๋ฐฉ๋ฒ๋ก ์ ๊ธฐ๋ฐํ๋ ์ ๊ทผ์ด ์์ต๋๋ค. ์ ๋ ์ด์ชฝ์ ์ ๋ชจ๋ฅด์ง๋ง, ์ด๋ก ์ ์ผ๋ก ์๋นํ ์ฌ๋ฐ๋ ๋ด์ฉ๋ค์ด ๋ง๋ค๊ณ  ๋ค์์ต๋๋ค.

2๋ฒ์ ์ฃผ๋ก Graph mining ๊ฐ์ ๋ฐฉํฅ์ผ๋ก ์ ๊ทผํ์ฌ ์ฝ๊ฐ ๋ฐฉํฅ์ฑ์ด ๋ค๋ฅด๊ธฐ ๋๋ฌธ์, 1๋ฒ๊ณผ 3๋ฒ์ ์ค์ง์ ์ผ๋ก ์๊ณ ๋ฆฌ์ฆ์ ์ธ ์ ๊ทผ์ผ๋ก ๋ณผ ์ ์์ต๋๋ค. 3๋ฒ์ ํ์ฌ SOTA๋ Glasgow๋ผ๋ ํ๋ก๊ทธ๋จ์ด ์๊ณ , 1๋ฒ์ ๊ฒฝ์ฐ CFL-Match๋ผ๋ ์๊ณ ๋ฆฌ์ฆ์ ๊ธฐ์ ์ผ๋ก CECI, DAF ๋ฑ์ด ์ฐ๊ตฌ๋์์ต๋๋ค.

## Backtracking Algorithms

CFL-Match, CECI, DAF๋ฅผ ๋น๋กฏํ์ฌ ๋ง์ ์๊ณ ๋ฆฌ์ฆ๋ค์ด ํฌ๊ฒ 3๋จ๊ณ๋ก ์ด ๋ฌธ์ ๋ฅผ ํด๊ฒฐํฉ๋๋ค. Filtering - Matching order generation - Backtracking์ธ๋ฐ, ๊ฐ๊ฐ์ด ์ด๋ค ๋๋์ธ์ง ์์๋ณด๊ฒ ์ต๋๋ค.

### Filtering

Filtering์ด๋, ๋์ ํ ๋งค์นญ์ด ์๋๋ ์ ๋ค์ ๋จผ์  ์ณ๋ด๋ ๋ฐฉ๋ฒ์๋๋ค. ์ด๋ Candidate Vertex Set์ด๋ผ๋ ๊ฐ๋์ด ๋ฑ์ฅํ๋๋ฐ, $q$์ ์ ์  $u$์ ๋ํด $u$๊ฐ ๋งคํ๋  ์ ์๋ $G$์ vertex๋ค์ด๋ผ๊ณ  ๋ณผ ์ ์์ต๋๋ค. ์๋ฅผ ๋ค์ด, Label์ด ๋ค๋ฅธ ์ ์ ์ ์์ ๊ณ ๋ คํ  ํ์๊ฐ ์์ต๋๋ค. ์กฐ๊ธ ๋ ๋ณต์กํ ์์๋ก๋, $q$์์ 1๋ฒ ์ ์ ์ผ๋ก๋ถํฐ label์ด $a$์ธ ์ ์ ์ผ๋ก ๊ฐ๋ ๊ฐ์ ์ด ์๋ ์ํฉ์ ์๊ฐํด ๋ณด๊ฒ ์ต๋๋ค. $G$์ ์ ์  10๋ฒ์ ๋ํด, 10๋ฒ ์ ์ ์ neighbor๋ค ์ค label์ด $a$์ธ ์ ์ ์ด ํ๋๋ ์๋ค๋ฉด, $u_1$ ์ $v_{10}$์ผ๋ก ๋งค์นญํ๋ ๋งคํ์ ์กด์ฌํ  ์ ์์ต๋๋ค. ์ด๋ฌํ ์ ์ ๋ค์ ์ต๋ํ ๊ฐํ๊ฒ ํํฐ๋งํด์ ์ ๊ฑฐํ๋ฉด, ๋ฐฑํธ๋ํนํ  ๋์์ด ์ค์ด๋ค ๊ฒ์๋๋ค. ์ด๋ Candidate vertex set์ $u_i$๊ฐ ๋งคํ๋  ์ ์๋ $v_j$ ๋ค์ ์งํฉ $C_i$๋ฅผ ๋งํ๋ฉฐ, ์๊ณ ๋ฆฌ์ฆ์ ๋ฐ๋ผ์๋ ์ด ๊ณผ์ ์ ์ข๋ ์ ํ๋ ๋ฐฉ๋ฒ์ ์ ์ํ๋ ๊ฒฝ์ฐ ์๋ฃ๊ตฌ์กฐ์ ์ด๋ฆ์ด ๋ฌ๋ผ์ง๊ธฐ๋ ํ์ง๋ง ๋๋ต์ ์ผ๋ก๋ ์ด๋ ์ต๋๋ค.

### Matching Order

๋ฌธ์ ์ ํน์ง ์, ์ด๋ค ์์๋ก ์ ์ ๋ค์ matchingํด ๋๊ฐ๋์ง๋ search space๊ฐ ๊ฐ์ํ๋ ์๋๋ฅผ ์ข์ฐํ๋ ๋งค์ฐ ์ค์ํ ์์์๋๋ค. Backtracking์ ํ๊ธฐ ์ํด์๋ ์ด matching order๊ฐ ํ์ํ๋ฐ, ์ดํ ๋ฐฑํธ๋ํน์ ์ํํ๋ ์์ (์ ์์ $q_V$์ permutation) ๋ฅผ matching order๋ผ๊ณ  ๋ถ๋ฆ๋๋ค. ์๊ณ ๋ฆฌ์ฆ๋ง๋ค ๋ค๋ฅธ matching order๋ฅผ ์ฌ์ฉํ๊ฒ ๋ฉ๋๋ค.

### Backtracking

๋ฐฑํธ๋ํน์ ๋จ์ํ ํ๋ฉด ๋์ง๋ง, ์ด ๊ณผ์ ์์ ๋ค์ํ ์ต์ ํ๊ฐ ๊ฐ๋ฅํฉ๋๋ค. ๋จผ์  ๋ฐฑํธ๋ํน์ ์ํด์๋ extendableํ candidate๋ฅผ ์ก์์ผ ํ๋๋ฐโฆ

1. ํ์ฌ์ partial embedding $M$์์ ์ฌ์ฉํ๋ ์ ๋ค์ ๋ค์ ์ฌ์ฉํ  ์ ์๊ณ
2. ์ง๊ธ ๋ด๊ฐ $u$๋ฅผ ๋ณด๊ณ  ์๋ค๋ฉด, $u$์ neighbor๋ค ์ค $M$์์ ์ด๋ฏธ ๋งคํ๋ ์ ์ ๋ค์ ๋ํด, ๊ทธ ์ ์ ๋ค ๋ชจ๋์ ์ฐ๊ฒฐ์ด ๊ฐ๋ฅํด์ผ ํฉ๋๋ค. ์ฆ, $u_3$์ด $u_1$, $u_2$ ์ ์ฐ๊ฒฐ๋์ด ์๊ณ , $u_1, u_2$๋ฅผ ๊ฐ๊ฐ $v_a, v_b$์ ์ฐ๊ฒฐ๋์ด ์์ผ๋ฉด, $u_3$์ ์ ์ด๋ $v_a, v_b$์ ์ฐ๊ฒฐ๋ ์ ๋ค ์ค์ ๊ณจ๋ผ์ผ ํ๋ค๋ ๊ฒ์๋๋ค. ์ด ๋๊ฐ์ง๊ฐ ๊ฐ์ฅ ๊ธฐ๋ณธ์ด ๋ฉ๋๋ค. ์ฌ๊ธฐ์ ์ถ๊ฐ๋ก DAF์ ๊ฒฝ์ฐ Failing set๊ณผ ๊ฐ์ ์ต์ ํ ๊ธฐ๋ฒ๋ค์ ์ ์ํ๊ธฐ๋ ํ์ต๋๋ค.

## References / Papers

• Sun, S., & Luo, Q. (2020). In-Memory Subgraph Matching: An In-depth Study. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1083โ1098. https://doi.org/10.1145/3318464.3380581 : Subgraph Isomorphism ๋ฐฉ๋ฒ๋ค์ ๋น๊ตํ๊ณ , ์ด๋ค์ ๋ชจ๋ ๊ตฌํํ์ฌ ํต์ผ๋ ํ๋ ์์ํฌ ์์์ ์คํํ ๋ผ๋ฌธ์๋๋ค. ์ด ๊ธ์ ๊ฑฐ์ ์ด ๋ผ๋ฌธ์ ๊ธฐ๋ฐํ ์ ๋ฆฌ ํฌ์คํ์ธ๋ฐ, ๋ผ๋ฌธ์ ๋ฉ์ธ์ธ ์คํ ๊ฒฐ๊ณผ๋ฅผ ์ ๋ฆฌํ์ง ์์๊ธฐ ๋๋ฌธ์ ๊ทธ๋ ๊ฒ ๋ถ๋ฅํด๋์ง๋ ์์์ต๋๋ค.

1. ์ด๋ถ๋ถ์ ์ฆ๋ช์ ์ด ๊ธ์ Scope๋ฅผ ๋๋ฌด ๋ง์ด ๋ฒ์ด๋ฉ๋๋ค.ย โฉ