Computing a Combinatorial Optimization Problem with Octave (Neural Network Algorithm)
Core formulas of the neural network algorithm:
1. Forward Propagation
Layer 1 (hidden layer):
\(a^{[1]} = g(W^{[1]} x + b^{[1]})\)
Layer 2 (output layer):
\(a^{[2]} = g(W^{[2]} a^{[1]} + b^{[2]})\)
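As a sketch, the two-layer forward pass maps directly onto Octave matrix operations. The layer sizes below (2 inputs, 3 hidden units, 1 output) and the constant initial values are illustrative assumptions, not part of the example that follows:

```octave
% Forward pass for a 2-3-1 network (layer sizes and values are illustrative)
x  = [0.5; -0.2];                          % input column vector, n_x = 2
W1 = 0.1*ones(3, 2); b1 = 0.2*ones(3, 1);  % hidden layer, n_h = 3
W2 = 0.3*ones(1, 3); b2 = 0.4;             % output layer, n_y = 1
g  = @(z) 1 ./ (1 + exp(-z));              % sigmoid activation, element-wise

a1 = g(W1*x + b1);   % a^[1], 3x1
a2 = g(W2*a1 + b2);  % a^[2], the network's prediction, 1x1
```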
2. Loss Function (Error)
\(J = \frac{1}{m} \sum_{i=1}^{m} L(\hat{y}^{[i]}, y^{[i]})\)
3. Backpropagation
That is, use the chain rule to compute the gradients, then update the weights:
Output-layer error:
\(\delta^{[2]} = a^{[2]} - y\)
Hidden-layer error (\(\odot\) denotes element-wise multiplication):
\(\delta^{[1]} = (W^{[2]})^T \delta^{[2]} \odot g'(z^{[1]})\)
4. Weight Update
\(W^{[l]} = W^{[l]} - \alpha \cdot \delta^{[l]} (a^{[l-1]})^T\)
5. Bias Update
\(b^{[l]} = b^{[l]} - \alpha \cdot \delta^{[l]}\)
Example:
Consider the following neural network:
- Input: \(x = 0.5\)
- Target value: \(y = 0.8\)
- Initial weights: \(W_1=0.1,\, b_1=0.2,\, W_2=0.3,\, b_2=0.4\)
- Activation function: sigmoid \(\sigma(z)=\frac{1}{1+e^{-z}}\)
- Learning rate: \(\eta=0.1\)
1) Forward propagation
\(z_1 = 0.1\times0.5 + 0.2 = 0.25\)
\(a_1 = \sigma(0.25) \approx 0.5622\)
\(z_2 = 0.3\times0.5622 + 0.4 \approx 0.5687\)
\(\hat{y}=a_2=\sigma(0.5687)\approx0.6386\)
Loss:
\(L=\frac12(0.6386-0.8)^2\approx0.0130\)
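The hand-computed forward pass can be checked in one short Octave session (variable names mirror the formulas above):

```octave
% Parameters from the example above
x = 0.5; y = 0.8;
W1 = 0.1; b1 = 0.2; W2 = 0.3; b2 = 0.4;
sigma = @(z) 1 ./ (1 + exp(-z));   % sigmoid

z1 = W1*x + b1;           % 0.25
a1 = sigma(z1);           % approx 0.5622
z2 = W2*a1 + b2;          % approx 0.5687
y_hat = sigma(z2);        % approx 0.64
L = 0.5*(y_hat - y)^2     % approx 0.0130
```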
2) Backpropagation (computing the gradients)
Derivative of the sigmoid: \(\sigma'(z) = \sigma(z)(1-\sigma(z))\)
\(\delta_2 = (\hat{y}-y)\cdot a_2(1-a_2) \approx -0.03725\)
\(\delta_1 = \delta_2 W_2 \cdot a_1(1-a_1) \approx -0.00275\)
Update the weights and biases:
\(W_2 \leftarrow W_2 - \eta \cdot \delta_2 a_1\)
\(b_2 \leftarrow b_2 - \eta \cdot \delta_2\)
\(W_1 \leftarrow W_1 - \eta \cdot \delta_1 x\)
\(b_1 \leftarrow b_1 - \eta \cdot \delta_1\)
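One full update step can likewise be checked numerically. The block below plugs in the rounded values from the forward pass above; the resulting parameter values are approximate for that reason:

```octave
% Values carried over from the forward pass above (rounded)
x = 0.5; y = 0.8; W1 = 0.1; b1 = 0.2; W2 = 0.3; b2 = 0.4; eta = 0.1;
a1 = 0.5622; y_hat = 0.6386;

delta2 = (y_hat - y) * y_hat * (1 - y_hat);  % approx -0.0373
delta1 = delta2 * W2 * a1 * (1 - a1);        % approx -0.0028

W2 = W2 - eta * delta2 * a1;   % approx 0.3021
b2 = b2 - eta * delta2;        % approx 0.4037
W1 = W1 - eta * delta1 * x;    % approx 0.1001
b1 = b1 - eta * delta1;        % approx 0.2003
```

Because the error \(\hat{y}-y\) is negative, every parameter moves slightly upward, pushing the prediction toward the target.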
Iterating a few hundred to a few thousand times drives the loss toward zero. The Octave code:
% Data
x = 0.5;
y = 0.8;

% Initialize parameters
W1 = 0.1; b1 = 0.2;
W2 = 0.3; b2 = 0.4;
lr = 0.1;   % learning rate

% Iterate 1000 times
for iter = 1:1000
  % Forward propagation
  z1 = W1*x + b1;
  a1 = 1./(1 + exp(-z1));
  z2 = W2*a1 + b2;
  y_hat = 1./(1 + exp(-z2));

  % Loss
  loss = 0.5*(y_hat - y)^2;

  % Backpropagation
  delta2 = (y_hat - y) .* y_hat .* (1 - y_hat);
  delta1 = delta2 * W2 .* a1 .* (1 - a1);

  % Parameter updates
  W2 = W2 - lr * delta2 * a1;
  b2 = b2 - lr * delta2;
  W1 = W1 - lr * delta1 * x;
  b1 = b1 - lr * delta1;
end

% Results
disp('Final prediction:'); disp(y_hat);
disp('Loss:'); disp(loss);
-----------------------------
Final prediction:
0.7977
Loss:
2.6511e-06
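A standard way to sanity-check backpropagation formulas like the ones above (not part of the original post, but a useful habit) is a finite-difference gradient check: perturb one parameter slightly and compare the resulting change in loss with the analytic gradient.

```octave
% Finite-difference check of dL/dW2 at the initial parameters (illustrative)
x = 0.5; y = 0.8; W1 = 0.1; b1 = 0.2; W2 = 0.3; b2 = 0.4;
sigma = @(z) 1 ./ (1 + exp(-z));
lossfun = @(w2) 0.5*(sigma(w2*sigma(W1*x + b1) + b2) - y)^2;

h = 1e-6;
numeric = (lossfun(W2 + h) - lossfun(W2 - h)) / (2*h);

a1 = sigma(W1*x + b1);
y_hat = sigma(W2*a1 + b2);
analytic = (y_hat - y) * y_hat * (1 - y_hat) * a1;   % delta2 * a1

disp(abs(numeric - analytic));   % should be very small if backprop is correct
```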