1) What are the characteristics of a good training set for the approach of using neural networks in test case reduction? Explain the effects that the training set has on the performance of this approach.

Answer: A good training set for this approach is one that is sampled uniformly over the input space, guaranteeing that each range of inputs has a representative sample in the training set. Such a training set causes the resulting neural network to mimic the actual program behaviour, leading to a reduction in the test suite while still adequately testing all possible paths in the program. On the other hand, an inadequate training set will produce a neural network that does not contain the rules for the input ranges that were not exercised by the training data. This can cause the user to remove important test cases from the test suite (namely the ones that exercise those input ranges), which ultimately leaves significant paths in the program untested.

---------------------------

2) Explain how the rule-extraction phase allows us to actually reduce the number of test cases.

Answer: After the rule-extraction phase, we get an idea of how the input space is partitioned by the rules in the program. We can therefore collapse the many test cases whose values fall in one partition of the input space into a single representative test case for that partition, because all such values follow the same program path, as inferred from the extracted rule.

---------------------------

3) "The pruning phase might cause the neural network representation to produce test suites that might avoid testing important parts of the program." Do you agree with the previous statement? Justify your answer.

Answer: This depends heavily on the quality of the training set (its coverage of the input space) and on the threshold set for the penalty function in the pruning phase.
If these criteria are met, then the pruning phase is guaranteed to remove only the edges that represent rules that are not actually in the real program: a low weight on such an edge indicates that it is not exercised by the training set, in other words that the specific input on one end of the edge does not affect the output associated with the corresponding hidden node. On the other hand, if the training set is not well distributed over the input space, the low weight of an edge might result from insufficient training data rather than from the edge not representing an actual rule in the program.
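To make the pruning step concrete, here is a minimal sketch of magnitude-based edge pruning. The weight values and the `THRESHOLD` constant are illustrative assumptions, not taken from the approach being discussed; the point is that an edge whose weight stays below the threshold is treated as "not exercised" and removed, which is only safe if the training set covered the corresponding input range.

```python
import numpy as np

# Toy weight matrix from 3 inputs to 2 hidden nodes (illustrative values).
# A near-zero weight suggests the input does not influence that hidden node
# on the training data -- but only if the training set covered the input space.
weights = np.array([
    [0.92, 0.03],  # input 0
    [0.01, 0.88],  # input 1
    [0.75, 0.02],  # input 2
])

THRESHOLD = 0.1  # penalty-function threshold chosen by the tester (assumed)

def prune(w, threshold):
    """Zero out edges whose magnitude falls below the threshold."""
    pruned = w.copy()
    pruned[np.abs(pruned) < threshold] = 0.0
    return pruned

pruned = prune(weights, THRESHOLD)
# Count how many edges survive pruning.
print(int(np.count_nonzero(pruned)))  # 3 edges remain
```

If the training data never exercised input 1, its low weight on the first hidden node could reflect missing data rather than a genuinely absent rule, which is exactly the risk the answer above describes.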