VI. RESULTS


The main purpose of this chapter is to report the results relevant to the discussion of hierarchical learning and to the hypotheses stated in Chapter IV. Since a complete view of each example was already presented, their details are skipped here. In this chapter we deal with the simulation data itself and its relationship to the hypotheses. Appendix E includes the tabular data that we summarize in this chapter.

VI.1. Coordination Mode Tests

Hypothesis I:

IT IS HYPOTHESIZED THAT THERE IS A TOP-DOWN/BOTTOM-UP COORDINATION MODE FOR OPTIMAL LEARNING.

To test the hypothesis we ran the following benchmark simulation:

INPUT PARAMETERS

As discussed before, we varied the degree of coordination from 0.6 to 1.0 and ran 25 cases, each with a different random seed to start the simulation. We then measured three metrics: the average probability of selecting the correct path to a goal, the trial count, and the amount of dynamic memory generated.
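For concreteness, the structure of this benchmark can be sketched as follows. Python is used here purely for illustration (it is not the implementation language of the simulator), and run_hclsa, RunResult, and the dummy values they return are hypothetical stand-ins for the actual HCLSA program:

    from dataclasses import dataclass
    import random

    @dataclass
    class RunResult:
        avg_probability: float   # average probability of selecting the correct path
        trial_count: int         # trials needed to learn the goals
        structures: int          # dynamic memory structures generated

    def run_hclsa(coordination: float, seed: int) -> RunResult:
        # Stand-in only: the real HCLSA simulator would run here.
        rng = random.Random(seed)
        return RunResult(rng.random(), rng.randrange(100, 500), rng.randrange(50, 200))

    COORDINATION_VALUES = [0.6, 0.7, 0.8, 0.9, 1.0]
    N_CASES = 25   # independent random seeds per coordination value

    for coord in COORDINATION_VALUES:
        runs = [run_hclsa(coord, seed) for seed in range(N_CASES)]
        mean_p = sum(r.avg_probability for r in runs) / N_CASES
        print(f"coordination={coord}: mean probability = {mean_p:.3f}")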
Concerning the first metric, Figure VI.1 presents average values of the probabilities of selecting the correct path to a given goal of the simulation. Ranges for the true mean were computed and are presented within the figure (see Appendix E.xi). We see an increasing pattern in the probabilities as the degree of Bottom-Up learning in the coordination increases.

The data for the second metric, the trial count, presented in Figure VI.2, was obtained following the methodology described in the previous chapter (see Appendix E.ix). We were surprised to find that for this metric the best values were obtained at a coordination value of 0.7; initially we had expected the trial count to keep decreasing as the degree of Bottom-Up learning increased. Although this did not completely happen, the value for a coordination of 1.0, i.e., full Bottom-Up, is still close to the best one obtained.

The final metric measures how much dynamic memory was generated. Figure VI.3 presents the results obtained (see Appendix E.ii). For this case we found a decrease in the number of structures generated as the coordination value was increased. It is interesting to note that the sharpest change occurs between coordination values of 0.6 and 0.7.

Further discussion of these results, as well as an overall performance metric, is presented in the next chapter.


VI.2. Confidence Level Tests

Hypothesis II:

IT IS HYPOTHESIZED THAT FOR A TOP-DOWN/BOTTOM-UP COORDINATION MODE, THERE IS AN OPTIMAL NUMBER OF INTERACTIONS BETWEEN LOWER-LEVEL CLSA's AND THE ENVIRONMENT BEFORE MOVING UP TO HIGHER-LEVEL ONES.

In order to test this hypothesis we evaluated the performance of the model with the following benchmark simulation:

INPUT PARAMETERS
We tested the performance of the model for different values of the confidence level, namely 5, 10, and 20. The same performance metrics as discussed in the previous section were used. Concerning the average probability of selecting a goal, Figure VI.4 shows an increase in the probabilities as the confidence level increases (see Appendix E.v).

This is explained by the fact that as more time is spent learning each goal, the average probability increases. Concerning the trial count, Figure VI.5 also shows an increase (see Appendix E.x). Again, the longer a low-level goal is attempted, the more time is used in learning the path. However, it is important to note that, as the data shows, the increase in trial count is not linearly dependent on the confidence values.
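The role of the confidence level can be illustrated with the following sketch. All names are hypothetical, and the stopping criterion shown (a cumulative count of successes) is an assumption; the actual simulator may count interactions differently:

    def learn_until_confident(attempt_goal, confidence, max_trials=1000):
        # Attempt a lower-level goal until it has been achieved
        # `confidence` times; only then does control move up a level.
        # Assumption: successes are counted cumulatively here.
        successes, trials = 0, 0
        while successes < confidence and trials < max_trials:
            trials += 1
            if attempt_goal():      # one interaction with the environment
                successes += 1
        return trials

    # e.g.: import random; learn_until_confident(lambda: random.random() < 0.8, 5)

Under this reading, a higher confidence level directly buys more interactions per goal, which is consistent with the increases seen in Figures VI.4 and VI.5.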

To test the final metric we ran the following benchmark simulation:

INPUT PARAMETERS
  • Theta = 1.0
  • Levels = 4
  • Goals = 18 (18 goals to be learned)
  • Algedonic Scheme = 5 (Linear with expectation and delta error)
  • Learn = 1.0 (Full Bottom-Up Learning)
For this case, Figure VI.6 shows that for a confidence level of 10, less memory was required (see Appendix E.iv). However, the difference is rather small, and may be explained by the fact that the essential structures are generated in the first two or three attempts to reach a goal. After that, the same structures are reused without having to generate new ones.
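A minimal sketch of this reuse of structures, under the assumption that generated structures are cached by goal, is the following:

    structures = {}   # goal id -> generated dynamic structure

    def get_structure(goal_id, build):
        # Only the first attempt at a goal allocates; later attempts
        # reuse the cached structure, so memory growth levels off
        # after the first two or three attempts per goal.
        if goal_id not in structures:
            structures[goal_id] = build(goal_id)
        return structures[goal_id]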


VI.3. Lower-Level Learning

Hypothesis III:

IT IS HYPOTHESIZED THAT IF HIERARCHICAL LEARNING OCCURS, THEN LOWER-LEVEL GOALS WILL BE LEARNED BETTER THAN HIGHER-LEVEL ONES USING A BOTTOM-UP COORDINATION MODE.

In order to test this hypothesis we ran the following benchmark simulation:

INPUT PARAMETERS
  • Theta = 1.0
  • Levels = 4 (4 levels in the hierarchy)
  • Goals = 39 (18 goals to be learned)
  • Algedonic Scheme = 5 (Linear with expectation and delta error)
  • Learn = 1.0 (Full Bottom-Up Learning)
The metric of interest in this case is the average probability of selecting the correct path. The values were computed for the various levels with a sample size of 30 (see Appendix E.vii). Figure VI.7 shows that the average probability for the lowest level is better than for the other two levels considered. Statistical significance was found for the differences between the probabilities for level 3 and level 4, thus validating the hypothesis stated above. This is due to the recursive nature of the algedonic process in the HCLSA model: although a goal at a low level is directly attempted only five times, whenever a higher-level goal requires obtaining that low-level goal, the HCLSA must call it once again with its corresponding algedonic loop. Therefore, the path is automatically rewarded again and its probability of selection is increased.
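The following sketch illustrates this recursive effect. The Goal class, its fields, and the linear reward step are hypothetical simplifications, not the actual HCLSA data structures:

    from dataclasses import dataclass, field
    from typing import List
    import random

    @dataclass
    class Goal:
        # Hypothetical simplification of a goal with its own CLSA state.
        p_correct: float = 0.5          # prob. of selecting the correct path
        subgoals: List["Goal"] = field(default_factory=list)

        def attempt(self) -> bool:
            if self.subgoals:            # compound goal: achieve all subgoals
                success = all(g.attempt() for g in self.subgoals)
            else:                        # primitive goal: one path selection
                success = random.random() < self.p_correct
            if success:                  # the algedonic loop fires at every level
                self.p_correct += 0.1 * (1.0 - self.p_correct)   # linear reward
            return success

Every attempt at a compound goal re-invokes and, on success, re-rewards its subgoals, which is why lower-level probabilities keep improving after their own five direct trials are over.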

In order to corroborate this recursive effect, Figure VI.8 plots sample means of the average probability of selecting the correct path at the lowest level versus trial number. The average probability after 5 interactions is only 0.842; however, after higher-level goals were learned, the final overall probability increased to 0.917, as presented in Figure VI.7 (see Appendix E.vii and E.xii). Such an increase is explained by the fact that the higher-level CLSA's had to call lower-level CLSA's to achieve their goals, with the corresponding algedonic mechanism. We shall draw some analogies to this fact in the next chapter.

Finally, in order to measure how well a goal is learned, we could also have used the average normalized entropy: as the average probability presented in Figure VI.7 increases, the entropy must decrease. However, that data would be difficult to compare with the rest of the results, because entropy only measures how much organization exists in a transition matrix, not how accurate that organization is.
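For reference, the average normalized entropy of a transition matrix can be computed as sketched below; this is the standard definition, which we assume matches the one used in the simulation:

    import math

    def avg_normalized_entropy(matrix):
        # Average normalized row entropy of a transition matrix:
        # 0 means rows fully concentrated on one transition, 1 means
        # uniform rows. Note that it measures organization only, not
        # whether the organization favors the *correct* transition.
        n = len(matrix[0])                       # transitions per row
        total = 0.0
        for row in matrix:
            h = -sum(p * math.log(p) for p in row if p > 0.0)
            total += h / math.log(n)
        return total / len(matrix)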

VII. CONCLUSIONS


In this chapter we briefly review the results obtained in the simulation and provide a general performance metric. We also suggest directions for further research as well as some enhancements to the model.

VII.1. Discussion of Results

The idea of hierarchies for learning has existed for a long time in the areas of research relevant to this study. In fact, we have used some of the propositions reported in the literature in the design of the HCLSA model. An adequate nomenclature for the field of learning automata is necessary, and we have provided one. More theoretical work is certainly needed, especially in the areas of hierarchical communication and convergence of algedonic schemes. The concept of a recursive model for learning has been studied, and analogies to it in our daily life are easily found: whenever we prepare a manuscript, we reinforce the subgoal of writing. Going back to our example of the novice programmer, as he develops sophisticated computer applications, he will reinforce his knowledge of the specific programming language he is using, and so on.

Following this idea, and according to the data obtained in the simulation, it was found that for the scheme used it was not necessary to remain at a given goal for more than five interactions. As a higher-level CLSA calls for a lower-level one, the algedonic loop created by their dual interaction - i.e., CLSA1.k.j - reinforces the correct path automatically, thus increasing its probability of selection. Since the main emphasis in learning automata theory is on adaptability, recursive models require algedonic schemes that do not saturate transition matrices; guidelines for such schemes were discussed in the present work. Considering how well goals are learned, i.e., the average probability of converging to the correct solution, the best results were obtained with a 100 percent Bottom-Up coordination mode, i.e., lower-level goals should be learned first. The same was true concerning memory requirements, i.e., the number of structures generated. However, when the trial count was taken into account, a combined 70 percent Bottom-Up, 30 percent Top-Down mode gave the best results.
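A generic example of a non-saturating linear update is sketched below. It is illustrative only, and is not necessarily the exact "linear with expectation and delta error" scheme used in the simulation:

    def reward(row, chosen, a=0.1):
        # Generic linear reward update over a probability row
        # (illustrative; not necessarily the simulation's scheme 5).
        # The step shrinks as probabilities approach 0 or 1, so the
        # row never saturates into an unchangeable 0/1 distribution.
        for i in range(len(row)):
            if i == chosen:
                row[i] += a * (1.0 - row[i])
            else:
                row[i] -= a * row[i]
        return row

Because each update moves the row only a fraction of the remaining distance toward the bounds, the probabilities stay strictly between 0 and 1 and the automaton remains adaptable.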


VII.2. Performance Metric

In this section we present an overall performance metric in order to evaluate the data used to test Hypotheses I and II of Chapter IV. The function is arbitrary, and its exponents may require modification if any of the variables considered is assumed to be more or less important. Our intention here is merely illustrative and descriptive, since the raw data has already been presented and discussed. The function used is given by:

f(p, sg, tc) = p / (tc * sg)

where p is the average probability of selecting the correct path, tc is the trial count, and sg is the number of structures generated.

Figure VII.1 presents the data for the coordination value. In this case the best coordination value was 70 percent and the worst was 60 percent; complete Bottom-Up learning also provided a good value. Figure VII.2 presents the data for the confidence level. In this case the best value by far was obtained at a confidence level of 5. Here we suggest again that the type of application of the model should determine the right exponents for the independent variables in the performance function. For instance, if memory is a constraint, then sg could be raised to a given power, e.g., f = p / (tc * sg^2).
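In code, the performance function and its memory-weighted variant amount to the following (the parameter names follow the definitions above):

    def performance(p, sg, tc, sg_exp=1):
        # Overall performance function f = p / (tc * sg**sg_exp);
        # raise sg_exp above 1 when memory is the binding constraint.
        return p / (tc * sg ** sg_exp)

    # e.g. performance(p=0.9, sg=120, tc=300)             # baseline form
    #      performance(p=0.9, sg=120, tc=300, sg_exp=2)   # memory-weighted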


VII.3. Suggestions for Further Research

Concerning the current version of the program: although the model discussed is flexible in nature, the implementation lacks some of the flexibility of the HCLSA as presented in Chapter IV. The idea of dynamic structures based on linked lists should be maintained; however, in order to test a longer sequence of productions, a file dump capability ought to be added so that such structures can be stored on disk for updating and later use. If the model is to be used for a learning game, better interactive communication is needed.
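Such a dump capability could be as simple as the following sketch, which uses Python's pickle module purely as an illustration of the idea (the original program would need an equivalent in its own language):

    import pickle

    def dump_structures(structures, path="hclsa_structures.pkl"):
        # Write the dynamic (linked) structures to disk after a run...
        with open(path, "wb") as f:
            pickle.dump(structures, f)

    def load_structures(path="hclsa_structures.pkl"):
        # ...and reload them later to continue learning where it stopped.
        with open(path, "rb") as f:
            return pickle.load(f)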

The main difficulty in a hierarchical organization is to define the number of levels (verticalization) versus the optimal number of transitions per level (horizontalization). If the model is enhanced with a hierarchical set of symbolic production rules, it will become possible to analyze the challenging questions that derive from this trade-off.

The study of hierarchical symbolic production rules requires work in the areas of learning automata, formal languages, and discrete simulation.


VII.4. Concluding Remarks

The HCLSA model discussed is very flexible because it gives the automaton the ability to learn transitions at various levels independently. Furthermore, it may be possible to achieve learning of different goals by establishing new CLSA's that use previous lower-level CLSA's in a rather different arrangement. Two questions were asked in the introduction of this dissertation; we believe that both can be answered affirmatively and optimistically. Machine learning is possible, and the structure to support it must be hierarchical in nature if it is ever going to allow for reasonable complexity.

When this piece of research was begun, the HCLSA model did not exist except in the mind of this author. Thanks to the help of the members of this dissertation committee, the model came into being. The work presented here has provided a background on the model so that it may be a subject of further study. Although some topics must be re-evaluated and new topics must be investigated, the general philosophy should remain as it has been presented. This is, then, not an end to the research, but a continuing step toward a better understanding of the simulation of learning.