Artificial Intelligence

March 20, 2023
Algorithms 2 (djw1005)

In the following, $N$ is a feedforward neural network architecture taking a vector
$$x^T = (\, x_1 \;\; x_2 \;\; \cdots \;\; x_n \,)$$
of $n$ inputs. The complete collection of weights for the network is denoted $w$, and the output produced by the network when applied to input $x$ using weights $w$ is denoted $N(w, x)$. The number of outputs is arbitrary. We have a sequence $s$ of $m$ labelled training examples
$$s = ((x_1, l_1), (x_2, l_2), \ldots, (x_m, l_m))$$
where the $l_i$ denote vectors of desired outputs. Let $E(w; (x_i, l_i))$ denote some measure of the error that $N$ makes when applied to the $i$th labelled training example. Assuming that each node in the network computes a weighted summation of its inputs, followed by an activation function, such that node $j$ in the network computes a function
$$g\!\left( w_0^{(j)} + \sum_{i=1}^{k} w_i^{(j)} \,\mathrm{input}(i) \right)$$
of its $k$ inputs, where $g$ is some activation function, derive in full the backpropagation algorithm for calculating the gradient
$$\frac{\partial E}{\partial w} = \left( \frac{\partial E}{\partial w_1} \;\; \frac{\partial E}{\partial w_2} \;\; \cdots \;\; \frac{\partial E}{\partial w_W} \right)^T$$
for the $i$th labelled example, where $w_1, \ldots, w_W$ denotes the complete collection of $W$ weights in the network.
[20 marks]
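The derivation asked for above can be illustrated concretely. The following is a minimal sketch (an assumption for illustration, not the model answer): backpropagation for a single-hidden-layer network in which each node computes $g(w_0 + \sum_i w_i \,\mathrm{input}(i))$ with sigmoid activation $g$, using the squared error $E(w; (x, l)) = \frac{1}{2}\|N(w,x) - l\|^2$ as the error measure. All names (`forward`, `backprop`, the layer shapes) are hypothetical choices for this sketch; the gradient is checked against a central finite difference.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(W1, b1, W2, b2, x):
    """Forward pass; each node computes g(w0 + sum_i w_i * input(i))."""
    a1 = W1 @ x + b1          # hidden pre-activations
    z1 = sigmoid(a1)          # hidden outputs
    a2 = W2 @ z1 + b2         # output pre-activations
    y = sigmoid(a2)           # network outputs N(w, x)
    return a1, z1, a2, y

def backprop(W1, b1, W2, b2, x, l):
    """Gradients of E for one labelled example (x, l)."""
    a1, z1, a2, y = forward(W1, b1, W2, b2, x)
    # Output nodes: delta_j = dE/da_j = (y_j - l_j) * g'(a_j),
    # with g'(a) = g(a)(1 - g(a)) for the sigmoid.
    d2 = (y - l) * y * (1 - y)
    # Hidden nodes: delta_j = g'(a_j) * sum_k w_{jk} delta_k
    # (sum over nodes k that node j feeds into).
    d1 = z1 * (1 - z1) * (W2.T @ d2)
    # Weight gradients: dE/dw_{ij} = delta_j * input(i); bias uses input 1.
    return np.outer(d2, z1), d2, np.outer(d1, x), d1  # dW2, db2, dW1, db1

# Tiny example: 3 inputs, 4 hidden nodes, 1 output.
rng = np.random.default_rng(0)
x, l = rng.normal(size=3), np.array([1.0])
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)
dW2, db2, dW1, db1 = backprop(W1, b1, W2, b2, x, l)

def E(W1_):
    """Squared error as a function of the first-layer weights."""
    *_, y = forward(W1_, b1, W2, b2, x)
    return 0.5 * np.sum((y - l) ** 2)

# Central finite-difference check on one weight.
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
W1m = W1.copy(); W1m[0, 0] -= eps
numeric = (E(W1p) - E(W1m)) / (2 * eps)
assert abs(numeric - dW1[0, 0]) < 1e-6
```

The two `delta` recurrences are the heart of the derivation: the output-layer deltas come directly from the chain rule on $E$, and each hidden delta is a weighted sum of the deltas of the nodes it feeds, which is what lets the gradient be computed in one backward sweep.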