CIRCULARITIES IN NETS AND THE CONCEPT OF FUNCTIONAL MATRICES¹ [199]

R. Moreno-Diaz and W.S. McCulloch

The problem of reentrance or circularity has obstructed the progress of logic and mathematics for a couple of millenia, of neurophysiology for 330 years, and of engineering and physics for at least 200 years. Warren McCulloch began worrying about it in 1917, and it became acute for him in 1923, when he conceived information flowing through ranks of formal neurons as mendelian genes flow in a hereditary net. But he could make nothing of it until he had the help of Walter Pitts, with whom he worked for more than a year, chiefly on Part III of their paper of 1943, “A Logical Calculus of the Ideas Immanent in Nervous Activity”,(1) which became an opening wedge for what is now called the Algebraic Theory of Finite Automata. They managed to prove, albeit obscurely, three theorems which depend in a vague way on the theory of relative prime numbers. They concluded that neural nets without loops, provided with scanners and tape, were equivalent to Turing machines and that nets with loops could compute, without scanners and tape, some of the numbers a Turing machine can compute, but such nets could not compute all the numbers that a Turing machine can compute, and they can compute no others. No new theorems concerning closed loops appeared until 1955, when David A. Huffman,(2) working from prime polynomials on a Galois field, initiated the theory of linear shift registers composed of .delays and logical gates, called zero-sum adders, which compute the exclusive “or.”

In the ensuing eleven years, many new theorems concerning neural nets appeared, including a way of linearizing nonlinear cases by increasing the number of delays, and, in 1960, James H. Massey(3, 4) devised an algorithm generating the minimum linear shift register to produce a given sequence of digits. With his help Jose L. Simŏes da Fonseca(5) invented a way of mathematically linearizing any nonlinear shift register and an algorithm for its construction in minimal length, which is never longer and usually shorter than the linear one.

In 1965, McCulloch was working on the more general problem as to what modes of oscillation a net of N neurons could embody. For one neuron it is one mode; for two neurons, twenty modes; for three, he could neither exhaust them nor count them. Carl P. J. Schnabel(6) succeeded: It is the sum from k = 2 to k = 2^N of k − 1 factorial, times the combination of 2^N taken k at a time, which is much more than (2^N − 1)! and hence grows fantastically as N increases. The mode of oscillation that the net will enjoy is uniquely determined by a constant input.

In the next year, Roberto Moreno-Díaz,(7) using a theorem of Manuel Blum(8) proved that all possible modes of oscillation for a net of N neurons could be realized by a single net with a number of input lines equal to the logarithm to the base two of the number of possible modes. Then, thanks to Lewey Gilstrap's tensor representations of loop-free nets, we began two streams of study which have converged to produce the work we are about to report.

Stability of Neural Nets

One stream concerns the stability of neural nets, solved by Moreno-Díaz, with suggestions from Manuel Blum, employing the so-called state-transition matrices of automata theory. For a neural net (with or without loops) the state-transition matrix for a fixed input—actually, a binary relation between pairs of states—is a square matrix, T^m, of dimensions 2^N^N $T_{ij}^m$ term is 1 if the net goes from state i to state j under the input X_m and is 0 otherwise. In the general case of connectivity, in which every neuron may be connected to itself and to all others, a net of N neurons may be given by a set of N Boolean equations of the form

{Y_k}(t) = {f_k}[X(t - 1);\,Y(t - 1)]\,\,\,\,(k = 1,2,3...N)\,\,\,\,\,\,(7 - 1)

where X( t − 1) is the input vector and Y(t − 1) is the state, that is, the collection of the outputs of the neurons—both at time t − 1.

X_m $T_{ij}^m$ of the state transition matrix is given by

T_{ij}^m = {f_1}{({X_m};{Y_i})^a} \times {f_2}{({X_m};{Y_i})^b} \times \cdot \cdot \cdot \times {f_x}{({X_m};{Y_i})^f}\,\,\,\,(7 - 2)

where (a, b,. . . , f) is the string of zeros and ones that correspond to state Y_j following Gilstrap's convention in which f⁰ = f̄ (negation) and f¹ = f; The term T^m has exactly one 1 per row. Necessary and sufficient for the existence of at least one stable state is the occurrence of at least one 1 in the principal diagonal. From such a representation, the necessary and sufficient conditions for oscillations of a given length, as well as the length of transients, can be found. If the net is probabilistic, its state transition matrix P^mX_m $P_{ij}^m$ term gives the probability of transition from state i to state j.

The Logic of Relations

The second stream began with the understanding of Charles Saunders Peirce's representation of the Logic of Relations. For diadic relations in extenso, he has a square array, the intersections of whose rows and columns signify individuals between whom the relation exists by a 1 and does not exist by a 0. This representation enabled him to form relative products such as “A loves someone who serves B” and relative sums such as “take any individual; then, A loves him or he serves B.” It is on these relative operations that the important theorems of the logic of relations depend. The state-transition matrices for each input appear, thus, to be representations of diadic relations in extenso. Peirce's individuals in the relations correspond to the states of the net.

The Functional Matrix

But the relation we seek to represent is fundamentally triadic and appears diadic only by restriction to a fixed input. We defined a mint as a three-dimensional array in which are stacked all of the two-dimensional state-transitional representations for each of the possible inputs. Mints enjoy relative sums and relative products, and we have used them as. Peirce defined them. Thus, our vertical dimension comprises the inputs. Please note that, in the mint, the intersection of any three orthogonal lines are only 1s or 0s and hence are directly applicable to deterministic nets. This type of triadic relation among state, input, and new state is a particular type of triadic relation in which the vertical dimension comprises individuals ( the inputs ) of a nature different from the states. In this case, by looking down through the mint, we can form what we call a functional matrix in which every position contains a functional expression of the input variables. Each takes the value 1 or 0 for each value of the inputs, according to what the value, 1 or 0, of that position was in the original mint. Thus, every position in the functional matrix is, for deterministic nets, a Boolean function of Boolean or non-Boolean variables (the input variables) or, expressed in the language of the theory of relations, is a triadic relation projected into a diadic relation by using the third subscript—or blank—as a parameter.

Thus, the functional matrix, T(X)_ij, of a net of N neurons is such that, for each input X_m, the following equation holds:

T{({X_m})_{ij}} = T_{ij}^m\,\,\,\,\,\,(7 - 3)

For Boolean inputs, a generalization of Equation 7-2 leads to the expression:

T{(X)_{ij}} = {f_1}{(X;{Y_i})^a} \times {f_2}{(X;{Y_i})^b} \times \cdot \cdot \cdot \times {f_N}{(X;{Y_i})^f}\,\,\,\,\,\,(7 - 4)

T{(X)_{ij}} = \sum\limits_{m - 1}^{m = {2^M}} {T_{ij}^mx_1^\alpha \times x_1^\beta \times \cdot \cdot \cdot \times x_M^\mu } \,\,\,\,\,\,(7 - 5)

where Σ is the sign for the inclusive “or,” and (α, β, . . . μ) is the string of 0s and 1s that define X_m following the same convention as that above.

Given a 2^N × 2^N matrix, T(X), in which every term is a function of M input variables, X, and such that for every X_m term T (X_m) has exactly one 1 per row, there is a net that corresponds to it. That is, one can find a set of equations of the form of Equation 7-1. Indeed, let the subscripts i and j of T(X)_ij be put into one-to-one correspondence with 2^N states. Let {Y_k} be the set of states for which the kth neuron fires; that is, the states having a 1 in the kth position; {Y_k} contains 2^N-1 states. Then, the Boolean function, f_k, corresponding to the kth neuron is

{f_k} = \sum\limits_{i = 1}^{i = {2^N}} {\sum\limits_{j = 1}^{j = {2^{N - 1}}} {T{{(X)}_{ij}}y_1^\alpha \times y_2^\beta \times \cdot \cdot \cdot \times y_N^f\,\,\,\,\,\,\,(7 - 6)} }

for all Y_i = (a, b,. . . , f) and for all Y_j ε {Y_k}

Just as a Universal Turing machine can be made specific for the computation of a particular number by having a portion of its tape serve as a program, so can a “universal net” of N neurons be made specific to embody any net of N neurons (with or without loops) by a proper encoding of its inputs. Its composition may seem trivial. It is. To construct it, form a mint by stacking all the possible Boolean matrices having exactly one 1 per row. There are 2^N×2N of these matrices. From Equation 7-6, find the functions of each neuron. A number L = N × 2^N of input lines to the universal net is necessary. Let these input lines be z₁, z₂, . . . , z_L or, in short, Z. To embody a particular net, all that is necessary is an encoder that transforms the inputs X to the net being embodied into some—or maybe all—of the inputs Z. The encoder is a net without loops—and, so, a degenerate case of the theory. Its functional representation is such that all functions in each column are the same. The same is true for a decoder. A net followed by a one-to-one decoder which samples every neuron of the net is then equivalent to an alphabetical permutation of the neurons of the net.

When one wants to extend his theory to nondeterministic nets, all that is necessary is to replace the functional expressions by multivalued or probabilistic functions, as McCulloch and his co-workers did in answer to the challenge in probabilistic logic posed by John Von Neumann. It follows that one can handle such nets as if they were deterministic, receiving their inputs from the appropriate encoder. This complements the work of Shmoul Winograd and Jack D. Cowan(9) on reliable computation in the presence of noise, but in no sense replaces it. The more general case of a probabilistic net is that in which none of the transition probabilities, P_ij, are zero. A universal net and a probabilistic encoder will then duplicate the action of the probabilistic net. Consider a probabilistic encoder in which input X_m produces 2^L output configurations (L = 2^N ×N $Z_n^m$ $P_n^m$ $Z_n^m$ N $Z_n^m$ Tⁿ $P_{ij}^m$ that this universal net goes from state i to state j. This is

P_{ij}^m = \sum\limits_{n = 1}^{n = {2^L}} {T_{ij}^nP_n^m\,\,\,\,\,\,\,(7 - 7)}

Σ’ $P_{ij}^m$ $P_n^m$ for a probabilistic encoder that, with the universal net, will act as the given probabilistic net. If some of the P_ij are zero, one may not need the universal net.

Designated Discussion

WARREN s. MCCULLOCH (Cambridge, Massachusetts): With your kind permission, I would like to yield my time for discussion to your previous speaker, because I know he has things to say about the logic of relations, which is really just forming now, which will be much more important to you than any remarks that I can make.

General Discussion

ROBERTO MORENO-DÍAZ (Madrid, Spain): Actually, the logic of triadic relations is the one that we are busy on, because the logic of binary relations was almost completed by Peirce, who started it first; and we produced sometime ago a note in which we formalized the possible calculus for triadic relations.

This calculus is based on Charles Saunders Peirce's notations which came into our minds, thanks to the discussions, as I said before, with Lewey Gilstrap.

We start talking about families, family relations, and how to present them in binary form. We then were able to understand the first few papers of Charles Saunders Peirce, and from that we sat down for several days and arrived at some conclusions. We went back to Peirce and found the conclusions there.

We kept going. Most of the things were already in Peirce's writings with the result that the only thing we actually have done is to change the notation so it is less confusing now than it was for us when we started reading it.

It is possible to develop a calculus of triadic relations. That is to say, there are three things involved instead of two (which is represented as a kind of a matrix). For triads, this would be a kind of tensorial representation, in a way that parallels very closely the way the calculus of binary relations was developed. In other words, it is possible to define closed operations between relations.

These closed operations are of two types in the same sense that there are two types of closed operations in the case of binary relations.

One is nonrelative products, which is isomorphic with the Boolean algebra, and the other is relative products. The relative products are the really interesting ones in the calculus of relations.

We have developed three types of relative products that are closed in this sense. When these operations are made between triads, they give back triads.

The initial reason for attacking this problem of triads was not exactly to discover their possible application to the study of neural nets. Rather it was how to put intention in the net, or quite precisely how to express formal intention. Intention, as has been many times said, is a triadic structure. It is something for something to mean, directed to some end. So we thought we had to start with an understanding of what the theory of triadic relations had to say. How far could one go by applying this theory?

There are many traps in the development of the calculus of relations. It is very easy to get caught, and unfortunately this is what we first did. It is very easy to develop a calculus of relations in extenso, not taking intention into account. We did not immediately realize that we were falling into the same trap that, for example, Wiener did when he published a note on diadic relations in which he thought that he had reduced all the operations to operations between classes.

Without realizing that, we kept going on triadic relations and defined a few terms, but we finally found out that these were actually relations in extenso rather than intenso, which was where we were initially attempting to go.

The possibilities are still open, and, unfortunately, while we have some results, they are not results we would like to show to you. We would like to develop a more complete theory.

Footnotes

Acknowledgments

This work was supported in part by the National Institutes of Health (Grant 5 ROI NB-04985-04) and was done partly at the Instrumentation Laboratory, M.I.T., under the auspices of DSR Project 55-257 sponsored by the Bioscience Division of the National Aeronautics and Space Administration (Contract NSR 82-009-138).

References

McCulloch, W. S., and Pitts, W. H. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 9:127-247, 1943.

Huffman, D. A. The Synthesis of Linear Sequential Networks. In Cherry, C. (Ed.), Information Theory. London: Butterworth, 1955. Pp. 77-95.

Kubista, T., and Massey, J. L. Pseudo-Linear Generalized Shift Register. Department of Electrical Engineering, University of Notre Dame, Notre Dame, Indiana, 1966.

Massey, J. L. Shift-Register Synthesis and Applications. Quarterly Progress Report No. 85, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Mass., July 15, 1967. Pp. 239-240.

Simŏes da Fonseca, J. L. Synthesis and Linearization of Nonlinear Feedback Shift Register—Basis of a Model of Memory. Quarterly Progress Report No. 86, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Mass., July 15, 1967. Pp. 355-366.

Schnabel, C. P. J. Number of Modes of Oscillation of a Net of N Neurons. Quarterly Progress Report No. 80, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Mass., January 15, 1966. P. 253.

Moreno-Díaz, R. Realizability of a Neural Network Capable of All Possible Modes of Oscillation. Quarterly Progress Report No. 82, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Mass., July 15, 1966. Pp. 280-285.

Blum, M. Properties of a Neuron with Many Inputs. In von Foerster, H., and Gopf, G. (Eds.), Principles of Self-Organization. New York: Pergamon, 1961.

Winograd, S., and Cowan, J. D. Reliable Computation in the Presence of Noise. Cambridge, Mass.: M.I.T. Press, 1963.

For further research:

Wordcloud: Boolean, Calculus, Cambridge, Case, Compute, Develop, Encoder, Equation, Expression, Form, Functional, General, Given, Input, Institute, Laboratory, Linear, Logic, Loops, Matrix, McCulloch, Mint, Modes, Moreno-Diaz, Net, Neural, Neurons, Number, Operations, Oscillation, Peirce, Possible, Probabilistic, Probability, Products, Relations, Relative, Representation, Shift, State, Sum, Term, Theory, Times, Triadic, Universal, Work

Keywords: Neurons, Matrices, Net, Immanent, Work, Nets, Machine, Registers, Numbers, Automata

Google Books: http://asclinks.live/bfno

Google Scholar: http://asclinks.live/0wl9

Jstor: http://asclinks.live/7rll

1 Reprinted from Biocybernetics of the Central Nervous System, pp. 145-150, Proctor, L.D. (Ed), Publ. by Little, Brown, & Co., June 1969. ↩

CIRCULARITIES IN NETS AND THE CONCEPT OF FUNCTIONAL MATRICES1 [199]