
gradientvector
2 posts

[Math] Understanding Deep-Learning Training Methods
Notes I originally kept in Notion during my boot camp, rewritten here as a study exercise. These are my own interpretations, so they may contain mistakes; correction requests are welcome. ** No lecture materials are used ** ** Commercial use is prohibited **
Today's Keywords: neural network, softmax, activation function, backpropagation, chain rule
Nonlinear models: the neural network. Each input x from the full data X is sent to another space as a product with a weight matrix W, plus an intercept b. The output vector's dimension changes from d to p, so connecting x to the output o takes p linear models (the rows of W). The softmax function lets the model's output be interpreted as probabilities: to solve a classification problem, pass the model output through softmax to get a prediction, softmax(o) = softmax(Wx + b). To train… 2025. 1. 20.
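
The relation softmax(o) = softmax(Wx + b) from the excerpt can be checked numerically in a few lines. This is a minimal NumPy sketch, not code from the post; the dimensions d, p and the random W, b, x are assumed values for illustration:

```python
import numpy as np

def softmax(o):
    # Shift by the max before exponentiating: softmax is invariant to
    # adding a constant to every logit, and this avoids overflow.
    o = o - np.max(o, axis=-1, keepdims=True)
    exp_o = np.exp(o)
    return exp_o / np.sum(exp_o, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
d, p = 4, 3                   # input dim d, output dim p (assumed for the demo)
W = rng.normal(size=(p, d))   # weight matrix: p row-models, one per output
b = rng.normal(size=p)        # intercept vector
x = rng.normal(size=d)        # one d-dimensional input sample

probs = softmax(W @ x + b)    # o = Wx + b, then squash into probabilities
print(probs, probs.sum())     # p entries in (0, 1) that sum to 1
```

The max-shift inside softmax is a standard numerical-stability trick and changes nothing mathematically.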
[Math] Gradient Descent (a gentle taste of differentiation)
Notes I originally kept in Notion during my boot camp, rewritten here as a study exercise. These are my own interpretations, so they may contain mistakes; correction requests are welcome. ** No lecture materials are used ** ** Commercial use is prohibited **
Today's Keywords: differentiation, slope, gradient descent, partial derivative, gradient vector
Differentiation measures how a function's value changes as its variable moves, and it is used heavily in optimization. The derivative is the slope of the tangent line, so the function needs to be smooth (in particular continuous). Once you know the tangent slope at a point, you know how to move: to increase the function value, add the derivative; to decrease it, subtract the derivative. If the derivative is negative (the function rises toward the left), x + f'(x) < x moves the point left and the function value increases; if the derivative is positive (the function rises toward the right), x + f'(x) > x moves the point right and the function value again increases. sympy.s… 2024. 12. 16.
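
The excerpt breaks off at sympy.s…, so the exact SymPy code is unknown. Below is a hedged sketch of the idea it describes: gradient descent driven by a symbolically computed derivative. The test function, starting point, and learning rate are assumptions for illustration, not values from the post:

```python
import sympy as sym

x = sym.symbols('x')
f = x**2 + 2*x + 3          # hypothetical convex function (not from the post)
fprime = sym.diff(f, x)     # symbolic derivative f'(x) = 2x + 2

point, lr = 10.0, 0.1       # assumed starting point and learning rate
for _ in range(100):
    grad = float(fprime.subs(x, point))
    point -= lr * grad      # descent: subtract the derivative to decrease f
print(point)                # converges toward the minimizer x = -1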
๋ฐ˜์‘ํ˜•

