Professional English W12: What problem can the backpropagation algorithm solve?
What problem does backpropagation solve? Explain it in your own words.
Backpropagation solves the problem of the high cost of calculating partial derivatives.

The goal of machine learning is to minimize the cost function, which is a composite function of the weights and biases. Gradient descent is generally adopted to achieve this goal. In real cases, a complex model often requires the partial derivatives of a multi-layer composite function with respect to all of its variables, and calculating them costs a lot.
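To make this concrete, here is a minimal sketch of gradient descent on a toy cost function. The cost C(w, b) = (w*x + b - y)^2, the single data point (x, y) and the learning rate are all illustrative assumptions, not something taken from the text above.

```python
# Toy gradient descent on C(w, b) = (w*x + b - y)^2 for one data point.
# All values here (data point, learning rate, step count) are assumed
# for illustration only.

def cost(w, b, x, y):
    return (w * x + b - y) ** 2

def gradients(w, b, x, y):
    # Partial derivatives of C with respect to w and b, by the chain rule.
    error = w * x + b - y
    return 2 * error * x, 2 * error

w, b = 0.0, 0.0
x, y = 2.0, 3.0   # one toy training example
lr = 0.05         # learning rate (assumed)
for _ in range(100):
    dw, db = gradients(w, b, x, y)
    w -= lr * dw
    b -= lr * db

print(cost(w, b, x, y))  # close to 0 after the updates
```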
Why is the traditional method of calculating partial derivatives so redundant that it becomes a problem? Consider, for instance, an upper node p and a lower node q, and suppose we want the partial derivative of p with respect to q. We have to find all paths from node q to node p, take the product of all the partial derivatives along each path, and then add up the products over all paths to obtain ∂p/∂q. Because many paths share the same edges, the same local derivatives get multiplied over and over again.
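As an assumed toy illustration of this "sum over all paths of the products of partial derivatives" rule, take a small graph q → u → p and q → v → p with u = 2q, v = 3q and p = u*v. The sketch below enumerates every path explicitly, which is exactly the redundant procedure described above.

```python
# Naive path enumeration on a toy graph (assumed example):
#   q -> u -> p and q -> v -> p, with u = 2q, v = 3q, p = u * v.
# Edge values are the local partial derivatives evaluated at q = 1.

q = 1.0
u, v = 2 * q, 3 * q

# edges[node] = list of (child, local partial derivative d(child)/d(node))
edges = {
    "q": [("u", 2.0), ("v", 3.0)],  # du/dq = 2, dv/dq = 3
    "u": [("p", v)],                # dp/du = v
    "v": [("p", u)],                # dp/dv = u
    "p": [],
}

def all_paths(node, target):
    """Return every path from node to target as a list of edge partials."""
    if node == target:
        return [[]]
    paths = []
    for child, partial in edges[node]:
        for rest in all_paths(child, target):
            paths.append([partial] + rest)
    return paths

# Sum over all paths of the product of the partials along each path.
dp_dq = 0.0
for path in all_paths("q", "p"):
    product = 1.0
    for partial in path:
        product *= partial
    dp_dq += product

print(dp_dq)  # 2*v + 3*u = 12.0, matching d(6*q**2)/dq = 12*q at q = 1
```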
By contrast, backpropagation needs to visit each path only once to obtain the partial derivatives of the top vertex with respect to all of the lower nodes, thus eliminating the redundant calculation.
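A minimal reverse-mode sketch on the same assumed toy graph shows the contrast: sweeping backwards from the top node p, each edge is handled once, and the partial derivatives of p with respect to all lower nodes (u, v and q) come out in a single pass.

```python
# Reverse sweep on the same toy graph (assumed example): q -> u -> p,
# q -> v -> p, with u = 2q, v = 3q, p = u * v, evaluated at q = 1.

q = 1.0
u, v = 2 * q, 3 * q

# parents[node] = list of (parent, local partial derivative d(node)/d(parent))
parents = {
    "p": [("u", v), ("v", u)],  # dp/du = v, dp/dv = u
    "u": [("q", 2.0)],          # du/dq = 2
    "v": [("q", 3.0)],          # dv/dq = 3
    "q": [],
}

grad = {"p": 1.0}                    # dp/dp = 1
for node in ["p", "u", "v", "q"]:    # reverse topological order
    for parent, partial in parents[node]:
        # each edge is used exactly once to propagate the cached value
        grad[parent] = grad.get(parent, 0.0) + grad[node] * partial

print(grad)  # {'p': 1.0, 'u': 3.0, 'v': 2.0, 'q': 12.0}
```

The cached values in grad are what remove the redundancy: the partial sums stored for each node are reused for every node below it, instead of re-multiplying the same local derivatives once per path.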