How do I implement the following pseudocode in Python?

This is Pseudocode 1:

Neural Network

Forward feeding (FF)
Goal: Feed a sample x to the input layer and get the prediction for x from
the output layer. In other words, if we view the NN as a function or black
box, FF calculates the value of that function for a given input x.

FF(x)
1  x(0) ← [1; x]                       // add the bias feature
2  for ℓ = 1, ..., L do
3      s(ℓ) ← (W(ℓ))^T x(ℓ−1)
4      x(ℓ) ← [1; θ(s(ℓ))]            // θ is the vectorized activation func.
5  return [x(L)]_{1..d(L)}            // [·]_{1..d(L)}: the bias entry x(L)_0 ≡ 1 is excluded
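In NumPy, Pseudocode 1 maps over almost line by line. A minimal sketch, assuming θ is the logistic sigmoid (the pseudocode leaves the activation abstract) and that the weights are passed in as a Python list of matrices, where the entry for layer ℓ has shape (d(ℓ−1)+1) × d(ℓ) with row 0 holding the bias weights:

```python
import numpy as np

def sigmoid(z):
    # Assumed activation; the pseudocode's theta can be any vectorized function.
    return 1.0 / (1.0 + np.exp(-z))

def FF(x, W):
    """Forward feeding: x is a 1-D sample, W a list of L weight matrices."""
    x_l = np.concatenate(([1.0], x))                  # x(0) <- [1; x], add bias feature
    for W_l in W:                                     # for l = 1, ..., L
        s_l = W_l.T @ x_l                             # s(l) <- (W(l))^T x(l-1)
        x_l = np.concatenate(([1.0], sigmoid(s_l)))   # x(l) <- [1; theta(s(l))]
    return x_l[1:]                                    # exclude the bias x(L)_0 = 1
```

Each layer prepends the constant 1 before applying the next weight matrix, mirroring lines 1 and 4; the final slice drops that bias entry, mirroring line 5.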

This is Pseudocode 2:

Vectorized SGD for NN training, minibatch size N′ ≤ N

NN_SGD_v3(D, η, T, N′)   /* D: the training set of size N; η: learning rate;
                            T: #rounds of training; N′ ≤ N: minibatch size. */
1   Init W(1), ..., W(L) with random numbers.   // every matrix W(ℓ) is (d(ℓ−1)+1) × d(ℓ)
2   for t = 1, 2, ..., T do
3       let D′ = (X, Y) be N′ samples randomly picked from D
4       X(0) ← [1 X]                            // add the bias feature column
5       for ℓ = 1, ..., L do                    /* forward feeding; save S(ℓ) & X(ℓ) */
6           S(ℓ) ← X(ℓ−1) W(ℓ);  X(ℓ) ← [1 θ(S(ℓ))]       // add the bias column. Slide 32
7       E ← numpy.sum(([X(L)]_{∗,1..d(L)} − Y) ⊙ ([X(L)]_{∗,1..d(L)} − Y)) ∗ (1/N′)
                                                // error for this minibatch. Slide 44
8       ∆(L) ← 2 ([X(L)]_{∗,1..d(L)} − Y) ⊙ θ′(S(L))
                                                // [·]_{∗,1..d(L)}: drop the 0th (bias) column. Slide 40
9       for ℓ = L−1, ..., 1 do                  /* back propagation; save ∆(ℓ) */
10          ∆(ℓ) ← θ′(S(ℓ)) ⊙ (∆(ℓ+1) ([W(ℓ+1)]_{1..d(ℓ),∗})^T)        // Slide 40
11      for ℓ = 1, ..., L do
            /* The next statement vectorizes (via NumPy's Einstein summation) this loop:
               G(ℓ) ← [0]; for i = 0, ..., N′−1 do G(ℓ) ← G(ℓ) + (1/N′) (X(ℓ−1)_i)^T ∆(ℓ)_i */
12          G(ℓ) ← numpy.einsum('ij,ik->jk', X(ℓ−1), ∆(ℓ)) ∗ (1/N′)    // calculate gradient
13      for ℓ = 1, ..., L do  W(ℓ) ← W(ℓ) − η G(ℓ)         // update weights. Slide 37

Here ⊙ denotes the elementwise (Hadamard) product.
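The comment above line 12 says the einsum call is the vectorization of a per-sample outer-product loop. A quick numerical check of that equivalence, with made-up shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
X_prev = rng.standard_normal((8, 5))   # plays X(l-1): N' x (d(l-1)+1)
Delta = rng.standard_normal((8, 3))    # plays Delta(l): N' x d(l)
Np = X_prev.shape[0]                   # N'

# Loop form from the pseudocode's comment: G <- sum_i (1/N') (X(l-1)_i)^T Delta(l)_i.
G_loop = np.zeros((5, 3))
for i in range(Np):
    G_loop += np.outer(X_prev[i], Delta[i]) / Np

# Vectorized form from line 12.
G_einsum = np.einsum('ij,ik->jk', X_prev, Delta) / Np

print(np.allclose(G_loop, G_einsum))   # True
```

The subscripts 'ij,ik->jk' sum over the shared sample index i, which is exactly the matrix product X(ℓ−1)^T ∆(ℓ).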

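Putting both pieces together, here is a sketch of NN_SGD_v3 in NumPy. The sigmoid activation, the extra layer_dims argument (the pseudocode never says how the layer widths are supplied), and the seed parameter are all assumptions; E from line 7 is computed only for monitoring, so it is not returned.

```python
import numpy as np

def sigmoid(z):
    # Assumed activation theta.
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Its derivative, theta'.
    s = sigmoid(z)
    return s * (1.0 - s)

def NN_SGD_v3(D, eta, T, N_prime, layer_dims, seed=None):
    """D = (X_all, Y_all); layer_dims = [d(0), ..., d(L)] (assumed argument)."""
    rng = np.random.default_rng(seed)
    X_all, Y_all = D
    N = X_all.shape[0]
    # 1: init W(l) with random numbers; each W(l) is (d(l-1)+1) x d(l)
    W = [0.1 * rng.standard_normal((layer_dims[l - 1] + 1, layer_dims[l]))
         for l in range(1, len(layer_dims))]
    L = len(W)
    for t in range(T):                                    # 2
        idx = rng.choice(N, size=N_prime, replace=False)  # 3: pick N' samples
        X, Y = X_all[idx], Y_all[idx]
        Xs = [np.hstack([np.ones((N_prime, 1)), X])]      # 4: X(0) <- [1 X]
        Ss = [None]                                       # keep S 1-indexed
        for l in range(1, L + 1):                         # 5-6: forward feeding
            S_l = Xs[l - 1] @ W[l - 1]
            Ss.append(S_l)
            Xs.append(np.hstack([np.ones((N_prime, 1)), sigmoid(S_l)]))
        E = np.sum((Xs[L][:, 1:] - Y) ** 2) / N_prime     # 7: minibatch error (monitoring)
        Deltas = [None] * (L + 1)
        Deltas[L] = 2.0 * (Xs[L][:, 1:] - Y) * sigmoid_prime(Ss[L])  # 8
        for l in range(L - 1, 0, -1):                     # 9-10: back propagation
            # W[l] here is W(l+1); slicing [1:, :] drops the bias row.
            Deltas[l] = sigmoid_prime(Ss[l]) * (Deltas[l + 1] @ W[l][1:, :].T)
        G = [np.einsum('ij,ik->jk', Xs[l - 1], Deltas[l]) / N_prime  # 11-12
             for l in range(1, L + 1)]
        for l in range(L):                                # 13: update weights
            W[l] = W[l] - eta * G[l]
    return W
```

For example, NN_SGD_v3((X, Y), eta=0.1, T=100, N_prime=10, layer_dims=[3, 4, 2]) trains a 3-4-2 network on (X, Y).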