Normal Equation

  • Instead of iterating until convergence to a minimum, find the optimal $\theta$ analytically.
  • ex) For $J(\theta) = a\theta^2 + b\theta + c$ ($a > 0$), the minimizer is easily seen to be $\theta = -\frac{b}{2a}$.
  • How do we do this when the parameter $\theta$ is a vector?
  • => Vector Calculus. Find the $\theta$ for which $\pdv{}{\theta_i} J(\theta)$ is 0 for every $i$.
  • Stack the features of the training examples into a matrix $X$ (one row per example), and the corresponding target values into a vector $y$.
  • It is known that $\theta = (X^T X)^{-1} X^T y$ is the solution for our Linear Regression.
  • Techniques such as feature scaling are unnecessary.
  • Compared with Gradient Descent:
    • Pros : No learning rate $\alpha$ to think about, and no need to iterate to find a suitable $\alpha$.
    • Cons : Matrix multiplication and inversion are very slow. $X^T X$ grows with the number of features $n$ and costs roughly $O(n^3)$ to invert, so the method becomes impractical when $n$ is large.
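The formula above can be sketched in NumPy as follows. The helper name `normal_equation` and the toy data are assumptions made for illustration, not part of the original notes. The sketch solves the linear system $(X^T X)\theta = X^T y$ rather than forming the inverse explicitly, which gives the same $\theta$ when $X^T X$ is invertible.

```python
import numpy as np

def normal_equation(X, y):
    """Return theta solving (X^T X) theta = X^T y (hypothetical helper)."""
    # Solving the system directly avoids computing the explicit inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical toy data generated from y = 1 + 2x with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept term
y = 1.0 + 2.0 * x

theta = normal_equation(X, y)
print(theta)  # intercept and slope; close to 1 and 2 for this data
```

No learning rate and no feature scaling appear anywhere in the sketch, which matches the points above.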

Noninvertible Case

  • What if $X^T X$ is not invertible?
  • Use the pseudoinverse (the pinv function in Octave).
  • Broadly, there are two cases:
    • Two features are in fact linearly related.
      • ex) size in feet^2 and size in m^2
      • The design matrix $X$ has linearly dependent columns.
      • Redundant features -> Throw away.
    • Too many features.
      • There are few data points but many features.
      • Drop some features, or use Regularization.
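A minimal sketch of the redundant-feature case, using NumPy's `np.linalg.pinv` in place of Octave's `pinv`. The data are made up for illustration, and the explicit `tol` and `rcond` values are assumptions chosen for this toy example so that the numerically zero singular value is reliably discarded.

```python
import numpy as np

# Hypothetical data: the same size expressed in m^2 and in feet^2,
# so the last two columns of X are linearly dependent.
size_m2 = np.array([50.0, 80.0, 100.0, 120.0])
size_ft2 = size_m2 * 10.7639
X = np.column_stack([np.ones_like(size_m2), size_m2, size_ft2])
y = 3.0 * size_m2  # made-up target values

# X has 3 columns but only 2 independent directions.
rank = np.linalg.matrix_rank(X, tol=1e-8)

# X^T X is singular, so use the pseudoinverse instead of the inverse.
theta = np.linalg.pinv(X.T @ X, rcond=1e-10) @ X.T @ y

# theta is not unique here, but the fitted values still reproduce y.
pred = X @ theta
print(rank, np.allclose(pred, y))
```

Dropping either `size_m2` or `size_ft2` from $X$ makes $X^T X$ invertible again, which is the "throw away redundant features" fix listed above.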