Back to : ml-study
Contents

Motivation

  • Complex, nonlinear hypotheses
    • We could throw in many polynomial features, but that is hard to apply to problems with many features.
    • What if there are 100 features, or more? Choosing appropriate higher-order terms becomes very difficult.
  • ex) Computer Vision: is this image a car?
    • Can we recognize the original image from its pixel intensity matrix?
    • Classification problem.
    • Feature size = number of pixels (×3 if RGB)
      • With that many features, algorithms like logistic regression are no longer practical.
  • Goal : an algorithm that mimics the brain.
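To get a sense of the feature counts mentioned above, here is a quick back-of-the-envelope calculation (a sketch; the 50×50 image size is an illustrative assumption, not from these notes):

```python
# Rough feature counts for an image classification problem.
# Assumes a hypothetical 50x50 grayscale image; RGB triples the count.
width, height = 50, 50

n_gray = width * height            # one feature per pixel intensity
n_rgb = n_gray * 3                 # one feature per color channel
# Number of quadratic terms x_i * x_j (i <= j) that a polynomial
# hypothesis would add on top of the raw pixel features:
n_quadratic = n_gray * (n_gray + 1) // 2

print(n_gray)       # 2500
print(n_rgb)        # 7500
print(n_quadratic)  # 3126250
```

Even at this modest resolution, the quadratic feature set alone runs into the millions, which is why a plain logistic regression over polynomial features does not scale here.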

Background

  • Hugely popular from the 80s through the early 90s, but had faded by the late 90s.
  • Computationally expensive.
  • Modern hardware can afford this much computation => STATE OF THE ART
  • The brain handles a huge range of functions. Do we have to implement each of them (language, vision, hearing…) separately?
  • NO. According to the ONE LEARNING ALGORITHM HYPOTHESIS, what actually runs in the brain is a single learning algorithm.
    • If the auditory input is severed and the visual input is wired to that region instead, the brain remaps on its own and adapts.
    • Brain rewiring experiment
    • Wiring other sensors into the brain (direction sense, etc.) also works reasonably well.
    • Each function is probably not a separate piece of software.
  • Neuron : the basic cell of the nervous system.
    • Inputs (Dendrites)
    • Outputs (Axons)
    • Can be thought of as a basic computational unit with I/O.

Neuron Model

  • Logistic Unit : think of a neuron that takes $x_1, x_2, x_3$ as input and computes $h_\theta(x)$.
  • Layer structure (Neural Network) : stack layers of neurons, where each layer takes the previous neurons' outputs and computes new values from them.
  • Hidden layers sit between Layer 1 (Input Layer) and the final layer (Output Layer).
  • Additional techniques such as bias units are used.
  • $a_i^{(j)}$ : "Activation" of unit $i$ in layer $j$
  • $\Theta^{(j)}$ : matrix of weights controlling the mapping from layer $j$ to layer $j+1$.
    picture 1
  • Forward Propagation์€ Vectorize๋ฅผ ํ†ตํ•ด ๋น„๊ต์  ํšจ์œจ์ ์œผ๋กœ ์—ฐ์‚ฐ ๊ฐ€๋Šฅํ•˜๋‹ค.
  • ์ด ๋ฐฉ๋ฒ•์ด ์™œ ์ข‹์€๊ฐ€?
    • ๋งจ ๋ Layer (Output Layer) ๋Š” ์ผ์ข…์˜ Logistic regression
    • ๊ทธ ์ด์ „์˜ Hidden layer๋Š” ๊ทธ ์ž์ฒด๊ฐ€ Learning๋œ ๊ฒฐ๊ณผ๋ฌผ. ์ฆ‰, feature ์ž์ฒด๊ฐ€ ํ•™์Šต์„ ํ†ตํ•ด ๋ฐœ์ „ํ•œ๋‹ค.
    • Flexibleํ•œ ๋ฐฉ๋ฒ•.
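The vectorized forward propagation described above can be sketched with NumPy (a minimal sketch; the layer sizes and random weights are illustrative assumptions, not trained values):

```python
import numpy as np

def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(x, thetas):
    """Vectorized forward propagation.

    x: input vector (without the bias unit).
    thetas: list of weight matrices, where Theta^{(j)} has shape
            (units in layer j+1, units in layer j + 1 bias).
    """
    a = x
    for theta in thetas:
        a = np.insert(a, 0, 1.0)   # prepend bias unit a_0 = 1
        a = sigmoid(theta @ a)     # a^{(j+1)} = g(Theta^{(j)} a^{(j)})
    return a                       # activations of the output layer

# Illustrative network: 3 inputs -> 4 hidden units -> 1 output.
rng = np.random.default_rng(0)
thetas = [rng.standard_normal((4, 4)),   # Theta^(1): 4 x (3+1)
          rng.standard_normal((1, 5))]   # Theta^(2): 1 x (4+1)
h = forward_propagate(np.array([1.0, 0.5, -0.2]), thetas)
print(h)  # h_theta(x), a value in (0, 1)
```

Note how each layer is just one matrix-vector product plus an elementwise sigmoid, and how the output layer computation is exactly a logistic regression over the learned hidden-layer features.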