The Difference Between a Plain DeepNet and ResNet

December 12, 2022

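ResNet is a technique that improves model performance by adding skip connections: each layer's input is carried forward and added to that layer's output, so the layer itself only has to learn the residual. It is often explained as having solved the vanishing gradient problem by adding the input of an earlier layer directly to a later one. In this post I briefly summarize, as a note to myself, how a plain DeepNet and ResNet differ.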

DeepNet

h1 = f1(x)
h2 = f2(h1)
h3 = f3(h2)
h4 = f4(h3)
y = f5(h4)

ResNet

h1 = f1(x) + x
h2 = f2(h1) + h1
h3 = f3(h2) + h2
h4 = f4(h3) + h3
y = f5(h4) + h4
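The two forward passes above can be sketched in a few lines of Python. This is only a minimal illustration: the scalar layer `tanh(w * h)` and the weight values are assumptions made up for the example and stand in for whatever f1 to f5 actually are.

```python
import math


def layer(h, w):
    # Hypothetical scalar layer f_t(h) = tanh(w * h); a stand-in for f1..f5.
    return math.tanh(w * h)


def deepnet_forward(x, weights):
    # Plain DeepNet: h_t = f_t(h_{t-1})
    h = x
    for w in weights:
        h = layer(h, w)
    return h


def resnet_forward(x, weights):
    # ResNet: h_t = f_t(h_{t-1}) + h_{t-1}
    h = x
    for w in weights:
        h = layer(h, w) + h
    return h


if __name__ == "__main__":
    weights = [0.5, -0.3, 0.8, 0.1, -0.6]  # illustrative values only
    x = 1.0
    print("DeepNet:", deepnet_forward(x, weights))
    print("ResNet: ", resnet_forward(x, weights))
```

One thing this makes easy to see: if every layer outputs zero (for example, all weights are 0), the plain DeepNet loses the input entirely, while the ResNet passes x through unchanged.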

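Written in general form, the DeepNet update is h_t = f_t(h_{t-1}), while the ResNet update is h_t = f_t(h_{t-1}) + h_{t-1}. The difference is that ResNet adds the input from the previous step, h_{t-1}. This form, in which the current value h is not only passed to the function but also added to the result, resembles the Euler method, and in fact the Neural ODE paper says the following.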

These iterative updates can be seen as an Euler discretization of a continuous transformation (Lu et al., 2017; Haber and Ruthotto, 2017; Ruthotto and Haber, 2018).
Neural Ordinary Differential Equations [Chen+, NeurIPS18]
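To make the correspondence concrete, here is a small sketch comparing an Euler step h ← h + dt * f(h) with the ResNet update h ← f(h) + h. The dynamics `f(h) = -0.5 * h` are a hypothetical choice purely for illustration, and a single f is shared across all steps for simplicity (in the equations above each layer has its own f_t).

```python
import math


def f(h):
    # Hypothetical dynamics for illustration only: dh/dt = -0.5 * h
    return -0.5 * h


def euler(h0, dt, steps):
    # Explicit Euler method: h <- h + dt * f(h)
    h = h0
    for _ in range(steps):
        h = h + dt * f(h)
    return h


def residual_stack(h0, layers):
    # ResNet-style update: h <- f(h) + h
    h = h0
    for _ in range(layers):
        h = f(h) + h
    return h


if __name__ == "__main__":
    # With step size dt = 1, five Euler steps match five residual layers.
    print(euler(1.0, 1.0, 5), residual_stack(1.0, 5))
    # Shrinking dt (more, smaller steps) approaches the exact ODE solution.
    print(euler(1.0, 0.001, 5000), math.exp(-0.5 * 5.0))
```

In other words, a residual block behaves like one Euler step with step size 1, and taking the step size toward zero gives the continuous transformation that the quote refers to.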

References

https://proceedings.neurips.cc/paper/2018/file/69386f6bb1dfed68692a24c8686939b9-Paper.pdf

Time Series Analysis

Posted by vastee
