Warren S. McCulloch


Norbert Wiener, five years my senior, wrote elegantly of Control and Communication in the Animal and the Machine as he saw it coming, and christened it Cybernetics. For years his scientific activities had been leading to it. He had been unable to pursue experimental biology because of nearsightedness and manual clumsiness, yet he never lost interest in physiology. He had trained as a philosopher, with a bent for symbolic logic, but few posts for logicians were available. In World War I, after an engineering apprenticeship for General Electric at Lynn, he was at last enlisted and joined the staff of mathematicians and engineers working on ballistics at the Aberdeen Proving Grounds. He had already become a mathematician with a background in biology, engineering, philosophy and logic when I, as a second class seaman, was engaged in marlin-spike seamanship and semaphore. To me these were topology and communication. What background I then had was from an early interest in theology and a few courses in geometry, algebra, theory of numbers and synthetic projective geometry, plus a thorough working knowledge of spherical trigonometry that I had picked up from old whaling captains. Whereas Wiener emerged from the war an ardent student of mathematics relevant to the real world of physics, engineering and biology, I plunged into the epistemic problems of mathematics that had captured my fancy as a freshman at Haverford College. I remember well that when I told our Quaker philosopher, Rufus Jones, that all I wanted to know was: "What is a number that a man may know it; and a man, that he may know a number?" he prophesied, "Friend, thee will be busy as long as thee lives." He was right. Though I did not know it then, I had become a cyberneticist.

My Training in Epistemology

Since I was in the officers' training school at Yale when my active duty terminated, I remained there, majoring in philosophy and minoring in psychology. After the hard work of Haverford, Yale presented few academic difficulties and I always had time for extra courses. What struck me even then was that one almost always learns important things from the wrong professors; it was in economics, for instance, for which I read Das Kapital, that I acquired both adequate German and the right slant for understanding Hegel. After that I read Kant's Critique of Pure Reason in German. In Woodruff's course in biology I learned how to read and respect Aristotle, unfortunately only in translation. I learned the great epistemic problem of relativity from Boltwood, who taught us theoretical chemistry. Through Aristotle's knowledge of living things I learned to understand his logic and hence his metaphysics. The mental and material axes merely intersected in the substance, the This One. From then on mechanism and teleology held no contradiction. They have in common the dialectical argument with its logic of becoming which is necessary for cybernetics, and they complement each other from the sides of matter, or body, and form, or mind.

It is Kant's conception of a synthetic a priori and his conviction that this depended upon the physical, the anatomy and physiology of the body, that, coming through Helmholtz, became the Leitmotiv of Rudolf Magnus, perhaps most explicitly in his Linnaean Lecture on "The Physiological A Priori". Magnus' student, J.G. Dusser de Barenne, was my teacher.

The main theme of the work of my group in neurophysiology in the Research Laboratory of Electronics at the Massachusetts Institute of Technology has been in this tradition, namely, experimental epistemology, attempting to understand the physiological foundation of perception. It should never be forgotten that Kant told the great neuroanatomist, Soemmerring, that the cerebro-spinal fluid could not be the sensorium commune.

Returning to Boltwood, he was very clear about the bearing of relativity, not only on our physical frames of reference, in the sense of space, time and movement, but also on any other set of axes appropriate to an observer coping by measurement and perception with his own changes in a changing world. In particular, he dealt with olfaction. The nose of a chemist working with an ester must be able to identify both halves of the salt, the organic base and the organic acid. By sniffing, say, for the acid alone, he can alter his perception of the salt, and in this way analyze the compound. Mixtures of compounds are more difficult. But the rate of olfactory adaptation and its persistence differ for both the particular radicals and their concentrations.

Boltwood had earned his Ph.D. for separating the isotopes of lead, and so was well aware of the revolution in physics when the billiard balls of antiquity turned out to be almost empty space with particles floating about in regular orbits. He had followed every revolution in physics since 1908 and expected more; notably that man would learn to split the unsplittable, the atom, and that it, or some isotopes, would explode in ways that might trigger others, and so explain the stars.

Complaining to me of his increasing forgetfulness, Boltwood asked me to spend one evening (5:00 p.m. to 5:00 a.m.) with him while he outlined for me the history of science. I had already had a year of study of it, but it had never before come together in my mind with that organization and that selection of the requisite diverse items. He began with guesses as to what the Greeks could have inherited from Egyptians, Mesopotamians, and Phoenicians, whom he called "Semite sailors". Then he romped through the pre-Socratics with a general thesis phrased, "They were not afraid to think." I cannot recover the details, but the general thesis was clear and stuck. If you want to get behind the fake explanation of how our world comes to be as it is, then you must find the facts for yourself and accept nothing in addition but logic and mathematics. Then he said: "Skip the Socratics and sociology." From the Schoolmen I knew better than he the Scholastic period, and kept my mouth shut.

He began again with Galileo Galilei, who split natural science down the middle by refusing to admit mind, or anima, as an explanation of physical events. We agreed that this was the way to make physics, but that it left the mind, anima, a sort of self-sufficient item capable of wandering around the world and having ideas, even perceptions, of a world devoid of anima. He thought this was a great triumph, for it produced physics (Galileo), astronomy (Kepler), and celestial mechanics (Newton), "but . . ." and, after a long time: "When I was isolating the isotopes at night, I spent the day training pairs of cats to climb ladders fastened at the top; and, with gloves on their front paws, to box for the top rungs. If you want to train animals, you can succeed by giving them permission to do what they like to do. Remember that students are animals with curiosity and a willingness to help, so the best way to teach them to become scientists is to tackle a tough problem with them and make mistakes!"

One other thing that I learned from Boltwood concerns holistic arguments at which our professors of science were then scoffing. He pointed out that one has an electromotive force (E.M.F.) only in a closed loop of three components, and any substitution of another metal for one of these creates two new junctions but only one new measurement. Hence a single junction potential is not measurable because it is not thinkable.

Our psychologists had gone I.Q.-ing and become statistical. When I doubted the appropriateness of their ways I was told to read C.S. Peirce, who had found that the normal distribution curve fitted his vast collection of data. Peirce thought it did, but many years later E.B. Wilson gave me his analysis of the variance of Peirce's data. The curve, like most real ones, was too flat on the top, too steep on the sides, and too high in the tails.

Anyhow, I had begun to read Peirce. Then I turned to Russell and Whitehead (Principia Mathematica), who in the calculus of atomic propositions were seeking for the least event that could be true and false. Leibnitz's problem of the petit perception had become the psychophysical problem of the just noticeable difference, called JND, which has since been found in the middle range of perception to be roughly proportional to the size of the stimulus. Let us suppose that in estimating the weight of loaded pillboxes we had a JND of two percent. Suppose now we had pillboxes arranged in ascending order increasing by one percent of their weights. Then each even pillbox differs by the JND, or 2%, from its even neighbor and the odds differ by the same amount from their odd neighbors, but none are discernible from their immediate neighbors in the whole series. Clearly the subject is in receipt of signals such that when they sum he perceives the differences.
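The pillbox argument can be checked with a few lines of arithmetic. The following is only a minimal sketch: the base weight, the number of boxes, and the rule that a pair is discernible exactly when the relative difference reaches the JND are assumptions made for illustration, and the function names are hypothetical.

```python
# Assumed values taken from the text: a JND of 2% and boxes rising by 1%.
JND = 0.02
STEP = 0.01

def weights(n, base=100.0, step=STEP):
    """Weights of n pillboxes, each 1% heavier than the one before (base is arbitrary)."""
    return [base * (1 + step) ** i for i in range(n)]

def discernible(a, b, jnd=JND):
    """Assumed rule: two weights are told apart only if they differ by at least the JND."""
    lighter, heavier = sorted((a, b))
    return (heavier - lighter) / lighter >= jnd

boxes = weights(10)
# Immediate neighbors differ by 1%: below the JND.
adjacent = [discernible(boxes[i], boxes[i + 1]) for i in range(len(boxes) - 1)]
# Every second box differs by 1.01**2 - 1 = 2.01%: at or above the JND.
alternate = [discernible(boxes[i], boxes[i + 2]) for i in range(len(boxes) - 2)]

print(any(adjacent), all(alternate))
```

No immediate neighbors are discernible, yet every box is discernible from the next box but one, so two sub-threshold steps must sum to a perceptible difference.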

In searching for these unit signals of perception I came on the work of Rene Descartes, fortunately the complete work, which contained his conclusions of eight years of dissection in Leiden. In the translations of Descartes we used at Yale none of this appeared. For him the least true or false event would have been his postulated hydraulic pulse in a single tube, now called an axon, true if it was properly excited, and false if it arose ectopically. He thought that nerves were composed of parallel tubes too fine for him to see them individually even under his magnifying glass. Each tube was filled with liquid in which pulses of hydraulic pressure went from brain and spinal marrow to muscles, causing them to contract; and each tube had a fine thread in it which, as the muscle contracted, signaled back to the control nervous system to close down the valve. As far as I know, this is the first use of the notion of inverse feedback, and so of the reflex. Finally, Descartes had the notion that there were enough tubes in parallel to transpose the images of the eyes into the brain, but he did not believe that there could be enough tubes in parallel to convey the picture to the central valves; so from there on he conceived that it had to be conveyed by temporal sequences of pulses which need no more resemble the picture they describe than our words resemble the thing they describe. This is probably the first coding theorem, and to one familiar with the Morse Code this was particularly appealing!

What I learned of Descartes in philosophy courses was that he was somewhat impious in saying we had to get straight our ideas of God if we wanted to make good machines, and that he was obviously a poor metaphysician because he had left the mind as a spectator running up and down the side lines. That was exactly what Galileo had done to create physics. Descartes' automaton was good physics. As for me, I do not want my formal and final causes to be efficient or material "causes", but I want all four to make sense of living things and machines that simulate them; from all of which I conclude that cybernetics really starts with Descartes rather than with Leibnitz. That it is only in recent years that I have been able to understand Leibnitz will become clear when we come to computers.

The best course in philosophy I had at Yale covered Locke, Berkeley and Hume. Because I had inherited the copy of Locke's Concerning Human Understanding that had persuaded my great grandfather, Bradley, to free all of his slaves, I had read the text and was heartily disappointed. To me it seemed thoroughly confused. With my scholastic background he seemed never to distinguish essence from existence. So I was delighted when I came on Newton accusing Locke of mistaking ideas for common notions, and he might have added: confusing both with their notation. These distinctions are crucial in cybernetics. Among communication engineers one is apt to fall into their jargon and refer to a given circuit, hardware in a box, as "the logic"; and to speak of "a number in memory", meaning the magnetic state of certain ferrites. It is like speaking of a gramophone disk or a printed score as Beethoven's Fifth Symphony. These are capable of being in places one or many at a time whereas the symphony, as essence, is certainly not. The other distinction, between a sphere, i.e., the locus of all points equidistant from a given point, and a ball, however nearly round, is, I think, nearer to the gist of Plato's myth of the cave of the sun, for the sphere belongs to the pure mathematician, the ball to the physicist, the engineer and the biologist.

I learned this chiefly from Jay Hambidge of the Yale Art School, who was measuring Greek vases looking for "the root rectangles". Plato had said that, of all forms that are divided into three equal rectangles, that one is most beautiful whose altitude is to its base as √3 is to 1. Early Greek vases began with √2 to 1 and two equal parts, then went on to √3 to 1, then to √5 to 1, and finally to (√5+1)/2. This, the Golden Section, made by dividing a line into two parts such that the lesser is to the greater as the greater is to their sum, defines a rectangle with which, by using a carpenter's square and a compass, one can get "a whirling square", which, rounded off, is a good approximation to an equiangular spiral, of which construction Plato says scornfully that it is not mathematics. He would probably have said the same of ratios of successive Fibonacci numbers tending to the Golden Section.
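Both properties of the Golden Section mentioned here are easy to verify numerically. The sketch below is an illustration only; the function names and the choice of a unit line and twenty Fibonacci terms are assumptions, not anything taken from Hambidge.

```python
import math

# The Golden Section as the ratio (sqrt(5) + 1) / 2.
PHI = (math.sqrt(5) + 1) / 2

def fibonacci(n):
    """First n Fibonacci numbers, starting 1, 1, 2, 3, 5, ..."""
    seq = [1, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]

def golden_cut(length=1.0):
    """Divide a line so that lesser : greater equals greater : whole."""
    greater = length / PHI
    lesser = length - greater
    return lesser, greater

fib = fibonacci(20)
# Ratios of successive Fibonacci numbers: 1/1, 2/1, 3/2, 5/3, 8/5, ...
ratios = [fib[i + 1] / fib[i] for i in range(len(fib) - 1)]
lesser, greater = golden_cut(1.0)

print(ratios[-1], PHI)                  # the ratios tend to the Golden Section
print(lesser / greater, greater / 1.0)  # lesser : greater = greater : sum
```

The successive ratios alternate above and below the limit while closing in on it, which is why a finite count of small whole numbers can approximate the section so well.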

The scales of a pine cone develop in right and left hand sets of spirals and the number of right hand spirals of scales is always prime to the number of left hand spirals. Hambidge started me counting the spirals of pine cones. The numbers are ordinarily as 3 to 5, 5 to 8, 8 to 13, 13 to 21, and some large ones 21 to 34. I am told that there are some with 34 to 55. Certainly no one ever laid out for a pine cone how it should grow. In 1911, D'Arcy Thompson had published the answer. Given that the growing central stalk has two branches at nearly the same level, and not precisely opposite, spiralling up the stem at any pitch and velocity, then when one counts down to the lowest recognizable spiral the rights and lefts are necessarily Fibonacci. This, then, is a self-organizing system that with the absolute minimum of information produces the pine cone. Please note that it has strictly no symmetry of rotation, for it must be turned 360° to coincide with itself. Clearly it needed one thing more to account for its digitalized form. Hambidge started me counting then, about 1920-1921, and I continued to count pine cones until 1964. Since a pine cone has F(n) spirals one way and F(n+1) spirals the other, it seems obvious that, when one counts up the spirals with F(n) = 5, the number up this way must be the same as the number up the other multiplied by the reciprocal of F(n)/F(n+1) (say, 5/8; i.e., by 8/5). Well, it just is not so. Finally you call yourself an idiot for you have counted the top five times one way and eight times the other. So the difference is 3; only it is not! Then you turn the cone over and start from the top and find the second equally simple consequence of the second least constraint.
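That the two spiral counts are "prime to" one another follows from their being consecutive Fibonacci numbers, whose greatest common divisor is always 1. A brief check, offered as a sketch with a hypothetical helper name and an arbitrary cutoff of nine pairs:

```python
from math import gcd

def fibonacci_pairs(count):
    """Consecutive Fibonacci pairs (F(n), F(n+1)), starting from (1, 2)."""
    a, b = 1, 2
    pairs = []
    for _ in range(count):
        pairs.append((a, b))
        a, b = b, a + b
    return pairs

pairs = fibonacci_pairs(9)
# Two counts are "prime to" each other when their greatest common divisor is 1.
coprime = [gcd(a, b) == 1 for a, b in pairs]

for (a, b), ok in zip(pairs, coprime):
    print(f"{a} to {b}: coprime={ok}")
```

The pairs generated include every count reported above, from 3 to 5 up to 34 to 55.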

That chance and the nature of numbers account for the self-organization of such a growing system nudges the cyberneticist toward Peirce's notion that, given a stochastic world, order will evolve. It will become reasonable. My difficulty was that it took me from 1920 to 1964 to learn to count. If, as you count down the pine cone, you mark each spiral that you count on the right hand red, and on the left green, you will find out what you are really doing.

To return briefly to Berkeley: I easily understood his text as a sort of stretching of the epistemology of relativistic physics to mundane perceptions (F.S.C. Northrop just told me Einstein had thought so!) but it left me incredulous. My classmates never understood Berkeley and I know my professor was professedly puzzled. To think of the Holy Ghost as tensor invariance has the flavor of Lewis Carroll's Alice climbing through the looking glass onto a new leaf of space which is a chessboard, upon which the pieces are alive with a memory that works both ways!

That two-way memory Hume lacked. He had a succession of perceptions but no perception of succession. He was about the same age when he wrote his Tractatus as I was when I read it, and being a fellow Scot I could fairly see the wheels going around in his head. What Duns Scotus had called a firm proposition residing in the soul, Hume called a habit of the mind. And, whereas Scotus distinguished carefully between Causa causalis, or bound cause, and Causa casualis, or accidental cause, Hume makes no distinction. The biological purport of the bound cause is the law of the conservation of species; like begets like. Without causality the primary properties of physical things went the way of their secondary qualities, but, strangely enough, logic and arithmetic remained and allowed him to define equal numbers: "When each has a unit corresponding to a unit of the other we pronounce them equal," and he continues, "and it is for the want of such a measure that geometry can scarce be considered an exact and perfect science." Hume had already seen that with number one could compute precisely forever. I had begun to find my way toward the nature of number, but I went the wrong way.

In 1920 I began to attempt to construct a logic for transitive verbs of action and contra-transitive verbs of perception, etc. I continued it through many pitfalls until February 1923, when I came under the influence of T.H. Morgan, who talked to me much and earnestly about genes and inheritance. He was disturbed by having to consider a gene as a physical thing on a given chromosome, which in any case is a very specific hunk of matter, and, at the same time, having to think of a gene as a determinant of a set of properties in a homogeneous population. Just as two men can be of one mind, so identical twins are of one genetic complex. Here was a least message that could be written many places and read out differently in different contexts. Just as hereditary information could flow through generations, so could sensory information flow through ranks of neurons. This was certainly the right model. Both nets are anastomotic; that is, information from every input may be effective at every output, and that by many paths.

I was on my slow way. In 1923, I took my Master's degree in psychology with a thesis on which I had started out of incredulity of Hambidge's assertion that root rectangles and the Golden Section are aesthetically preferred by most people. He was right. It does not take long to train a man to the limit of his capacity in judging lengths, areas, and volumes. If his JND is two percent in length, it is two percent in area and two percent in volume. Therefore he cannot be making estimates of areas or volumes on the basis of length, for he would be off by a factor of two and three times as much. Hambidge had said that even so his knowledge of shape was well-nigh perfect. Using black cards with white lines, I started to test this, not by asking: "Which of these shapes have such and such properties?" or: "Set this card to such and such a ratio," but simply: "Which shape do you like better of these two?" and: "Set this card to any shape you like." Trained observers, architects, sculptors, painters, came rapidly to the root rectangles and the Golden Section. People unused to making esthetic judgments came more slowly. But what astounded me and made me keep on testing long after I had turned in my thesis was this, that a man with a JND of two percent would set the card correctly to the third decimal point, so that I had to read it with a vernier. Hambidge was correct! Man does live in a world of relations.
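The "factor of two and three times as much" is ordinary error propagation: a relative error in length is roughly doubled in an area and tripled in a volume computed from that length. A worked check, as a sketch only, assuming areas and volumes scale as the square and cube of a single length (the function name is hypothetical):

```python
def relative_error_of_power(length_error, power):
    """Relative error in length**power caused by a relative error in length."""
    return (1 + length_error) ** power - 1

length_jnd = 0.02                                     # two percent, as in the text
area_err = relative_error_of_power(length_jnd, 2)     # 1.02**2 - 1 = 0.0404
volume_err = relative_error_of_power(length_jnd, 3)   # 1.02**3 - 1 = 0.061208

print(f"length {length_jnd:.2%}, area {area_err:.2%}, volume {volume_err:.2%}")
```

An observer judging area or volume through length would therefore show a JND of about four or six percent, not the two percent actually found.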

My Acquisition of the Experimental Techniques for a True Epistemology

During those years in psychology I had come to know Woodworth very well and had acquired a healthy interest in physiological psychology and learned to appreciate Gardner Murphy, from whom I learned the history of all the schools of psychology from the pre-Socratics to behaviorists. The rest of my time I had spent in mathematics, physics, chemistry and neuroanatomy.

The next fall I entered medical school with the avowed intent of learning enough physiology of man to understand how brains work. The four years at the College of Physicians and Surgeons were a hard grind, but I was repaid by excellent and enthusiastic teachers in neuroanatomy, comparative neuroanatomy, neuropathology and neurophysiology. Everything else seemed secondary to the function of the master tissue! When I began to see patients, it was injuries of peripheral nerves, of spine, and skull that intrigued me. Neurology and neuropsychiatry delighted and enthralled me. I had intended to go back to the laboratory as soon as I graduated, but I joined Foster Kennedy's staff at Bellevue, first as intern and then as resident on his neurological service, which included neurosurgery. Those were the years 1927, 1928 and 1929. But, in spite of the busy life of the intern, I was forever studying anything that might lead me to a theory of nervous function. My fellow intern, Samuel Bernard Wortis, accused me of trying to write an equation for the working of the brain. I am still trying to!

This was the period of post-encephalitic Parkinsonism, and our commonest cases had pill-rolling tremors. Here was a problem of activity in a closed loop coming out of the ventral root of the spinal cord, returning from the contracting muscle over the dorsal root, and so out again to the same muscle. From what little we knew of the delays in the spine and of the conduction times, it was difficult to say whether a tremor of, say, three to five per second was par for the course.

The question that Sam and I discussed most often was whether reverberation in this loop was a vicious circle or whether something central was reverberating at that frequency and sending out signals for contraction from one part of the loop. He also had many epileptic patients, some due to injury, infection, and chemical insult to the brain, and others of familial types. Here we wondered whether there was not in them a vicious circle within the brain. We even wondered whether curarizing the patient might not stop not only his thrashing about, but also the central nervous activity concomitant with it. Yet I doubt whether any of us in those days thought of reverberant activity in brains either as intrinsic to them or as a normal process. No anatomist had produced convincing pictures of loops closed within the central nervous system, though Ramón y Cajal had suggested them. No spontaneous activity of neurons was known and no activity that was regenerative over a closed path.

The first theoretical paper on this score appeared in Brain in July of the following year, 1930, entitled, "A Theoretical Application to Some Neurological Problems of the Properties of Excitation Waves Which Move in Closed Circuitry," by Lawrence S. Kubie. (Unfortunately, I did not come on it then. Nearly forty years later, it reads as if it were written today.) Some months or so later, S.W. Ranson had to postulate closed loops for other purposes. Alexander Forbes told Birdsie Renshaw and me that he did believe in delay chains but not in reverberating chains, and gave us the references to those papers, but that was years later.

Kubie, like the rest of us, was very familiar with the reflex, as defined about 1819 by Magendie, as a process begun in some part of the body initiating impulses that proceed over dorsal roots to the spinal cord whence they are reflected over ventral roots to the part where they arose and there stop or reverse the process that gives rise to them. I had even struggled through a muddy German outline of Sechenov's Reflexes of the Brain, but nowhere had I found Kubie's papers that today sound so modern.

Here indeed were some early sources of cybernetics.

The year 1929 was important to me in two ways: my Yale classmate, C.H. Prescott, working at the Bell Telephone Laboratories, introduced me to an older mathematician, R. Hartley, who was trying to quantify the amount of information that could be transmitted over a noisy line, and it was he who gave me a reference to the definition of information by C.S. Peirce as a third kind of quantity, being the disjunction of all of those statements in which the term in question was subject or predicate, antecedent or consequent. This was the bud of the American definition of information as a quantity. It flowered in Shannon's Mathematical Theory of Communication in 1948. Shannon's definition rests on an ensemble of possible messages in a language known to sender and receiver, and is a measure of the improbability of the message, its surprise value, written -Σ p log p, which resembles negative entropy, but lacks the dimensional constant for energy.
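Shannon's measure is short enough to compute directly. The sketch below assumes logarithms to base 2, so the unit is the bit, and a finite ensemble given as a list of probabilities; the function names are hypothetical.

```python
import math

def surprisal(p):
    """Surprise value of one message of probability p, in bits: -log2 p."""
    return -math.log2(p)

def entropy(probs):
    """Average surprise of an ensemble of messages: -sum of p * log2 p."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_coin = [0.5, 0.5]
four_equal = [0.25, 0.25, 0.25, 0.25]
biased = [0.9, 0.1]

print(entropy(fair_coin))    # one bit per toss
print(entropy(four_equal))   # two bits per message
print(entropy(biased))       # less than one bit: the likely message surprises little
```

The improbable message carries the larger surprise, and the ensemble as a whole carries most when all its messages are equally likely.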

The second of those surprising ways of looking at information in 1931 came one evening when one of our friends translated to a group of us a paper of Goedel's which had just appeared. It was the one on the arithmetization of logic.

In the years 1929-1931 I went on with neurophysiology and studied mathematics and mathematical physics, with little direct relation to cybernetics, but I taught physiological psychology at Seth Low Junior College in Brooklyn, New York, where I explored my theory of information flowing through ranks of neurons very successfully, until I tried to close the loop. Not then having the notion of delay, I thought: "One cannot close the loop, one cannot be one's own ancestor!" I knew I needed more intellectual tools, but not what they were.

Then, in the depth of the Depression, I went to Rockland State Hospital to earn money, but fortunately there I met Eilhard von Domarus. He was a superb clinician and a well trained philosopher. I learned much of both from him, and in return I helped him rewrite in English his Ph.D. thesis, "On the Philosophic Foundations of Psychology and Psychiatry," which he had written in an incredible mixture of German, English, Latin and Greek for my old friend, Professor F.S.C. Northrop of Yale. It took over a year of nightly sessions to do it, but I came away with a real understanding of what was important for me in German philosophy and German neuropsychiatry, without which I would never have come to a definition of thinking that fits cybernetics. His analysis of the abuses of the word "consciousness" in our language has helped me out of many an altercation. The important meaning in forensic medicine is a triadic relation: A is conscious of B with C if A and C can both bear witness as to B. This cannot be sought in any one man, whereas Peter Abelard's "conscientia", things thought about together, and so the idea of ideas, can be embodied in a single brain and, today, in a digital computer. Domarus was thirty years ahead of his world.

I went next to Yale to work with Dusser de Barenne.

The ensuing years at Yale were filled with scientifically enlightening discoveries, so much so that Dusser would slam me on the back and shout: "We discover too much!" He was the only student of Rudolf Magnus who brought on the great tradition from Helmholtz and Kant; he was a psychiatrist whose problem was sensation and perception and their embodiments in activities of the nervous system. He was the inventor of the method of local strychninization of the brain, which made it possible to identify the central structures whose chemical excitation elicited all the clinical signs of hyperalgesia and hyperaesthesia and paralgesia, which we neurologists know as behavioral signs, not merely as symptoms or complaints. By combining these studies with electrical recordings, we were able to map the directed pathways of the regions of the cortex to other structures and other regions of the cortex, producing maps of circuit action from which the wiring diagram could be deduced. These pathways are even today being slowly confirmed by the meticulous methods of neuroanatomy. This is but one of the many things that went well as science, but its importance for cybernetics was its setting in the flesh. Its setting in the blood came from measurements of pH and similar items, usually with the help of Leslie Nims, Physical Chemist at Yale. For me it proved that brains do not secrete thought as the liver secretes bile, but that they compute thoughts the way computing machines calculate numbers.

At that time I found of all Whitehead's philosophy his analysis of the percipient event most helpful, and it pushed me into studying the Monadology of Leibnitz. When one is working on the physics and chemistry of the anesthetized brain, as I was, one is doing biophysics and biochemistry necessary for neurophysiology, but falling short of physiology because the nervous system is then deprived of its functions; but even if it were working properly it would still be only physics and chemistry and not physiology unless one were studying the function also. Here the seventeenth paragraph of the Monadology begins:

Moreover, it must be confessed that perception and that which depends upon it are inexplicable on mechanical grounds, that is to say, by means of figures and motions. And supposing there were a machine, so constructed as to think, feel, and have perception, it might be conceived as increased in size, while keeping the same proportions, so that one might go into it as into a mill. That being so, we should, on examining its interior, find only parts which work one upon another, and never anything by which to explain a perception.

Two of Leibnitz's machines that add, subtract, multiply and divide we know to exist: one is in Munich, the other (or an early copy of it) I have seen in the Museum of Scientific Instruments in Florence. The action is by gears with which Leibnitz was not entirely satisfied. It is interesting that, except for the abacus, logical machines of computation preceded discrete numerical machines, which we call digital, by so many centuries. Leibnitz wanted a truly discrete step, for reasons Hume later specified, more in keeping with the tradition of the improbable Ramon Llull whom Leibnitz admired. Llull (1235-1315) had used, as symbols, two intersecting circles for two arguments. Euler used three for three arguments. Venn pushed it to six, but his closed forms are no longer circles. Lewis Carroll pictured six by using a 4 x 4 square with a diagonal in each direction in every square. Minsky and Selfridge have shown how by sine waves doubling in frequency and diminishing in amplitude one can extend these forms indefinitely. C.S. Peirce had broken down the Llullian circles into a single chiastic (X) symbol, the left for the first argument, the right for the second, above for both, below for neither, but the editor did not see fit to include them in the collected works, although Peirce had asked that they be used for his amphex, i.e., "not both" and "neither nor", which Sheffer had to rediscover for himself many years later. I made the chiastic symbols for myself in 1941 to teach logic. Any of these symbolisms can be used to signify to what Boolean functions of its inputs a given device will respond. They proved most useful in visualizing probabilistic functions and representing so-called "don't care conditions".
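The chiastic symbol lends itself to a direct encoding: each of the four places of the X is either marked or not, so there are sixteen symbols, one for each Boolean function of two arguments. The sketch below is an illustration only; representing a symbol as a set of marked places, and all the names used, are assumptions rather than Peirce's or my own notation.

```python
from itertools import product

# The four places of the X: left = first argument alone, right = second alone,
# above = both, below = neither (as described in the text).
PLACES = ("left", "right", "above", "below")

def place(a, b):
    """Which place of the X corresponds to a given pair of truth values."""
    if a and b:
        return "above"
    if a:
        return "left"
    if b:
        return "right"
    return "below"

def make_function(marked):
    """Boolean function that is true exactly in the marked places."""
    marked = frozenset(marked)
    return lambda a, b: place(a, b) in marked

def all_symbols():
    """Every way of marking the four places: 2**4 = 16 symbols."""
    return [
        frozenset(p for p, on in zip(PLACES, bits) if on)
        for bits in product((False, True), repeat=4)
    ]

NOT_BOTH = make_function({"left", "right", "below"})   # everything except "above"
NEITHER_NOR = make_function({"below"})                 # only "below"

print(len(all_symbols()))
print(NOT_BOTH(True, True), NEITHER_NOR(False, False))
```

Marking a place with a probability instead of a yes or no, or leaving it undecided, extends the same picture to probabilistic functions and to conditions that do not matter.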

The Anastomosis of Cybernetics

In the fall of 1941 I went to Chicago to build up a team of specialists to lay the biological foundations for the Department of Psychiatry of the University of Illinois. This was to be primarily neuroanatomy and neurophysiology and whatever physics and chemistry I thought appropriate. We were not going to work on behavioral problems that were being handled well in several other places. As neurophysiologists we were concerned with the chemistry of the brain. Because the ratio of O2 consumed to CO2 produced by the brain is nearly one, and because many psychiatric patients were known to be resistant to insulin, we worked on the regulation of carbohydrate metabolism in health and disease.

I was fortunate in our electrical engineer, Craig Goodwin, who was excellent in the theory and design of regulatory devices.

He called my attention to Clerk Maxwell's famous paper, "On Governors", and to H.S. Black's paper on Stabilized Feedback Amplifiers of 1934. He helped me think through all of the diseases of the systems with inverse feedback. I understood automatic volume controls and self-tuning devices. Above all, I learned from him that when the mathematics of our hardware, of nonlinear oscillators and their couplings, was beyond us, we could still build a working model and think in terms of it. Moreover, with him I learned to think statistically about brain waves.

In 1941 I presented my notions on the flow of information through ranks of neurons to Rashevsky's seminar in the Committee on Mathematical Biology of the University of Chicago and met Walter Pitts, who then was about seventeen years old. He was working on a mathematical theory of learning and I was much impressed. He was interested in problems of circularity, how to handle regenerative nervous activity in closed loops.

I had to suppose such loops to account for epileptic activity of surgically isolated brain and even of undercut cortex. Lorente de Nó had shown their significance in vestibular nystagmus. I wanted them to account for causalgia persisting after amputation of a painful limb and even after section of the spinothalamic tract; I wanted them to account for the early stages of memory and conditioning. I wanted them to account for compulsive behavior, for anxiety and for the effects of shock therapy. These appeared to be processes that once started seemed to run on in various ways. Since there obviously were negative feedbacks within the brain, why not regenerative ones? For two years Walter and I worked on these problems whose solution depended upon modular mathematics of which I knew nothing, but Walter did. We needed a rigorous terminology and Walter had it from Carnap, with whom he had been studying. We, I should say Walter Pitts, finally got it in proper form and we published in 1943, "A Logical Calculus of the Ideas Immanent in Nervous Activity". H.D. Landahl immediately joined us in a note applying the logical calculus statistically. The crucial third part of our first article is rigorous but opaque and there is an error in a subscript. In substance what it proved via its three theorems is that a net made of threshold devices, formal neurons, can compute those and only those numbers that a Turing machine can compute with a finite tape.

The formal neurons were deliberately as impoverished as possible. Real neurons can not only compute any Boolean function of their inputs, but many others. Threshold devices can compute only those functions which, in a Birkhoff lattice, or N-dimensional cube, can be separated by a hyperplane. With two arguments this is fourteen out of sixteen; only the "exclusive or" and "if and only if" are missing. With three arguments only 104 out of 256 are what are called "threshold computables". The problem is one of counting the number of planes, and so again I learned to count. We need a Euler to solve it in general. The threshold functions have been computed by Robert Winder for seven arguments, and he estimates that the fraction of all functions diminishes for large N more rapidly than 2^(N^2) / 2^(2^N).
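The two small counts quoted here can be checked by brute force. The Python sketch below is an illustration added for the reader, not Winder's method: it enumerates single threshold units with small integer weights and collects the distinct truth tables they realize. The function name `threshold_tables` and the weight bound `max_weight` are assumptions of the sketch; a bound of 3 is enough for two and three arguments but would not be for large N.

```python
from itertools import product

def threshold_tables(n, max_weight=3):
    """Truth tables computable by one threshold unit with small integer weights.

    Assumption for this sketch: integer weights in [-max_weight, max_weight]
    are enough, which holds for n <= 3 but not for large n.
    """
    inputs = list(product((0, 1), repeat=n))
    weight_choices = range(-max_weight, max_weight + 1)
    tables = set()
    for weights in product(weight_choices, repeat=n):
        # Thresholds span every value the weighted sum can reach, plus one
        # above the maximum so that the constant-0 function is included.
        for theta in range(-n * max_weight, n * max_weight + 2):
            table = tuple(
                int(sum(w * x for w, x in zip(weights, point)) >= theta)
                for point in inputs
            )
            tables.add(table)
    return tables

two = threshold_tables(2)
three = threshold_tables(3)
all_two = set(product((0, 1), repeat=4))
missing = sorted(all_two - two)

print(len(two), "of", len(all_two))   # 14 of 16
print(len(three), "of", 2 ** 8)       # 104 of 256
print(missing)                        # [(0, 1, 1, 0), (1, 0, 0, 1)]
```

With inputs ordered 00, 01, 10, 11, the two missing tables are the exclusive or and its negation, the "if and only if".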

I am tired of hearing of the Pitts and McCulloch neuron, but it was a fair description of available hardware and this is one reason why reliable nets of unreliable components are beyond our technology to date.

Fortunately for our calculus, von Neumann used our article in teaching the general theory of digital computers and it gave rise to the algebraic theory of finite automata. It formed the basis on which I solved the Heterarchy of Values Determined by the Topology of Nervous Nets, and, again with Walter Pitts, of How We Know Universals.

In 1943, Kenneth Craik published his little book called The Nature of Explanation, which I read five times before I realized why Einstein said it was a great book. Of his life I know only secondhand from Professor Dreyer of Edinburgh and from Sir Frederick Bartlett and Lord Adrian, all of whom considered him a genius. Thanks to their interest, Dr. Stephen L. Sherwood was entrusted with Craik's residual writings which we studied and organized and Sherwood edited to form a second little book, The Nature of Psychology. Unfortunately for science, Craik was struck by a truck and died at the age of 32, but his work has changed the course of British physiological psychology. It is close to cybernetics. Craik thought of human memory as a model of the world with us in it, which we update every tenth of a second for position, every two tenths for velocity, and every three tenths for acceleration as long as we are awake. Shortly after the War, there formed in London a group of young scientists, each with two sciences, who thought at one time of calling themselves by his name, but finally became the Ratio Club. From this group and their friends came Ashby's pioneering Design for a Brain; Grey Walter's tortoise; and the work of Albert Uttley at the National Physical Laboratory, Teddington, England, on perceptive machinery on a probabilistic basis.

But I want to emphasize Donald MacKay, now Professor of Communication at the University of Keele, who, with Dennis Gabor and Colin Cherry, initiated the English School of Information theory which is as appropriate to some scientific and biological problems as the Shannon measure of the American School is to communication. In a communication code the price of a simple BIT, or answer to a yes or no question, remains constant even when we halve the area of ignorance by every question, as we can in playing Twenty Questions. In making measurements, the price varies: the range of uncertainty diminishes inversely as the square root of the number of samples we average. For this and other reasons the English speak of an amount of information as a vector whose length and direction represent complementary measures of information-content, whereas the Americans speak of information as a quantity, Peirce's third kind of quantity. It is what you get with one metron per logon.
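The contrast between the constant price of a bit and the rising price of a measurement can be put in numbers. The Python sketch below is only an arithmetic illustration under textbook assumptions (equally likely alternatives, independent readings of equal spread); the function names and the parameter `sigma` are hypothetical and appear nowhere in the sources discussed.

```python
import math

def questions_needed(alternatives):
    """Yes-or-no questions needed to single out one of `alternatives` equally likely items."""
    return math.ceil(math.log2(alternatives))

def uncertainty(samples, sigma=1.0):
    """Standard error of the mean of `samples` independent readings with spread `sigma`."""
    return sigma / math.sqrt(samples)

print(questions_needed(2 ** 20))          # 20
print(uncertainty(4), uncertainty(16))    # 0.5 0.25
```

Each question in Twenty Questions halves the field at the same cost, so twenty of them separate about a million alternatives. Each halving of the uncertainty of a measurement, by contrast, costs four times as many samples as the last.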

Before I turn away from England there are two others, not guilty of the solemnity of the square hat. I doubt if Stafford Beer has any degree, though I happen to know he has refused several professorships, preferring to stay in business and make cybernetics pay him and industry handsomely. He knows how to pull able people to him and how to defend them. He has the humor, the diction and the prose of G.K. Chesterton whom I knew in New York City in the early 20s. I met Stafford when he was modeling industries and factories on computers for British United Steel. That was about half a year after the first international conference in Namur, where he and Gordon Pask gave the outstanding papers. Gordon, after a couple of years toward an M.D. degree, shied off into electrical engineering problems and formed his own company to work on self-organizing systems and on teaching machines in which the machine learns the learner.

But I must return to 1943. I had known of Norbert Wiener's work on adjustable filters from C.H. Prescott, and of his attempts at biology from Stanley Cobb. I had met Norbert with our mutual friend, Arturo Rosenblueth, who was working with Walter B. Cannon on homeostasis, on the so-called Porter phenomenon and on the supersensitivity of denervated structures, all cybernetic problems. I was amazed at Norbert's exact knowledge, pointed questions and clear thinking in neurophysiology. He talked also of various kinds of computation and was happy with my notion of brains as, to a first guess, digital computers, with the possibility that it was the temporal succession of impulses that might constitute the signal proper. Then Norbert spoke of predictive filters in terms of the response of linear systems to noise and, finally, of almost periodic time series. I have tried in vain to date that meeting, probably in the spring of 1940 or 1941, for in the fall of 1941 came Pearl Harbor. It must have been shortly thereafter that Julian Bigelow came to Boston and began to help Norbert Wiener on problems of prediction for gun laying and I know not what else. They worked first on linear prediction, using the method of least squares, and then on nonlinear predictions. Part of these studies were published in a book we called the Yellow Peril for its yellow cover and its torturous mathematics.

Norbert told me, and I believe wrote somewhere, that it was Julian who had impressed on him the importance of feedback in guidance. To be more precise, what Julian did contribute most surely was this: that it was not some particular physical thing such as energy or length or voltage, but only information (conveyed by any means) as to the outcome of one's previous act that, for example, a pilot needed to fly a plane to its destination. This I take to be the crucial and the central notion of cybernetics. Once Bigelow noted that it did not matter how the information was carried, only that the machine or organism be informed of the outcome of its previous act, cybernetics was born.

Be that as it may, Norbert, Arturo and Julian wrote a paper on teleological mechanism, the substance of which was presented at a Josiah Macy Foundation meeting in New York City in 1942. Thereafter Arturo and I conspired with Frank Fremont-Smith to hold a series of interdisciplinary meetings to spread these ideas to those who could use them. Before these could get under way, came the meeting of engineers, physiologists and mathematicians at Princeton in the late winter of 1943-1944, described by Norbert in the introduction to Cybernetics. Here I first met Johnny von Neumann.

Winter 1943-1944

Lorente de Nó and I, as physiologists, were asked to consider the second of two hypothetical black boxes that the Allies had liberated from the Germans. No one knew what they were supposed to do or how they were to do it. The first box had been opened and exploded. Both had inputs and outputs, so labelled. The question was phrased unforgettably: "This is the enemy's machine. You always have to find out what it does and how it does it. What shall we do?" By the time the question had become that well defined, Norbert was snoring at the top of his lungs and his cigar ashes were falling on his stomach. But when Lorente and I had tried to answer, Norbert rose abruptly and said: "You could of course give it all possible sinusoidal frequencies one after the other and record the output, but it would be better to feed it noise, say white noise; you might call this a Rorschach." Before I could challenge his notion of a Rorschach, many engineers' voices broke in. Then, for the first time, I caught the sparkle in Johnny von Neumann's eye. I had never seen him before and I did not know who he was. He read my face like an open book. He knew that a stimulus for man or machine must be shaped to match nearly some of his feature-filters, and that white noise would not do. There followed a wonderful duel: Norbert with an enormous club chasing Johnny, and Johnny with a rapier waltzing around Norbert, at the end of which they went to lunch arm in arm. The later part of this meeting was spent listening to engineering and mathematics appropriate to these problems. We all agreed that there should be an interdisciplinary gathering of this kind more often.

That was in the academic year 1943-1944, and was followed from 1946 on by the 10 Macy meetings on circular causal and feedback systems. I had a fairly free hand and excellent advisors in gathering the group. Frank Fremont-Smith and I agreed that it should never exceed 25 regular members including guests, and that we should always have at least two of each kind: two mathematicians, two neurophysiologists, two neuroanatomists, two psychologists, two engineers, two neuropsychiatrists, etc. Every speaker knew there was at least one in the audience who knew his jargon. I agreed to chair, provided Frank would sit next to me and kick my stupid shins. I could count on Margaret Mead's keeping a flowsheet of the discussion in her head and on Walter Pitts' understanding everybody. Even so, working in our shirt sleeves for days on end at every meeting, morning, lunch, afternoon, cocktails, supper and evening, we were unable to behave in a familiar, friendly, or even civil manner. The first five meetings were intolerable. Some participants left in tears, never to return. We tried some sessions with and some without recording, but nothing was printable. The smoke, the noise, the smell of battle were not printable. Of our first meeting Norbert wrote that it was largely devoted to didactic papers by those of us who had been present at the Princeton meeting, and to a general assessment of the importance of the field by all present. In fact it was, characteristically, without any papers, and everyone who tried to speak was challenged again and again for his obscurity. I can still remember Norbert in a loud voice pleading or commanding: "May I finish my sentence?" and hearing his noisy antagonist, who was pointing at me or at Frank, shouting: "Don't stop me when I am interrupting." Margaret Mead records that in the heat of battle she broke a tooth and did not even notice it until after the meeting.
We finally learned that every scientist is a layman outside his discipline and that he must be addressed as such. The sixth through the tenth meetings went more calmly and were edited by Hans-Lukas Teuber, Margaret Mead and Heinz von Foerster, and published by the Josiah Macy Foundation, so I need not describe them. I would like to express my perennial gratitude to Frank Fremont-Smith and all the unrepentant interrupters for my education in the circular causal and feedback difficulties, which taught me to tolerate other people's and even my own follies or foibles. Since then I have wept at more than one interdisciplinary meeting, but never departed on that or any other score. I had learned to listen to others and even to myself.

I first saw Norbert's book, Cybernetics, proper, in galley at a Macy meeting. Norbert's cataracts were opaque and he had written his equations large on the blackboard, but could not read them. I had only an hour's chance to skim through the book and we discussed its possible circulation which we both underestimated grossly. The historical introduction is true to the events as I had experienced them; remember, mutatis mutandis, he was a Roundhead; I, a cavalier! His philosophy was one I understood and with which I largely concurred. The mathematics was superbly to the point and the text accompanying the equations was sufficient for the mathematical moron. I could and did praise it to him then and there, but he remained overanxious and too tense about the book's real value. He was like a prophet with a message that had to be delivered, not merely like a 12-year-old boy seeking approbation. I could easily understand how Julian Bigelow must have felt about it, for Julian disliked publicity and sought completion before publication. Years later, when the book had become a best seller and Norbert a figure in the public eye, Norbert wrote what I believe is the best description of cybernetics thus:

The whole background of my ideas on cybernetics lies in the record of my earlier work. Because I was interested in the theory of communication, I was forced to consider the theory of information and, above all, that partial information which our knowledge of one part of a system gives us of the rest of it. Because I had studied harmonic analysis and had been aware that the problem of continuous spectra drives us back on the consideration of functions and curves too irregular to belong to the classical repertory of analysis, I formed a new respect for the irregular and a new concept of the essential irregularity of the universe. Because I had worked in the closest possible way with physicists and engineers, I knew that our data can never be precise. Because I had some contact with the complicated mechanism of the nervous system, I knew that the world about us is accessible only through a nervous system and that our information concerning it is confined to what limited information the nervous system can transmit.

It is no coincidence that my first childish essay into philosophy, written when I was in high school and not 11 years old, was called The Theory of Ignorance. Even at that time I was struck with the impossibility of originating a perfectly tight theory with the aid of so loose a mechanism as the human mind. And when I studied with Bertrand Russell, I could not bring myself to believe in the existence of a closed set of postulates for all logic, leaving no room for any arbitrariness in the system defined by them. Here, without the justification of their superb technique, I foresaw something of the critique of Russell which was later to be carried out by Goedel and his followers, who have given real grounds for the denial of the existence of any single closed logic following in a closed and rigid way from a body of stated rules.

To me, logic and learning and all mental activity have always been incomprehensible as a complete and closed picture and have been understandable only as a process by which man puts himself en rapport with his environment. It is the battle for learning which is significant, and not victory. Every victory that is absolute is followed at once by the Twilight of the Gods, in which the very concept of victory is dissolved in the moment of its attainment.

We are swimming upstream against a great torrent of disorganization, which tends to reduce everything to the heat-death of equilibrium and sameness described in the second law of thermodynamics. What Maxwell, Boltzmann and Gibbs meant by this heat-death in physics has a counterpart in the ethics of Kierkegaard, who pointed out that we live in a chaotic moral universe. In this, our main obligation is to establish arbitrary enclaves of order and system. These enclaves will not remain there indefinitely by any momentum of their own after we have once established them. Like the Red Queen, we cannot stay where we are without running as fast as we can.

We are fighting for a definite victory in the indefinite future. It is the greatest possible victory to be, to continue to be, and to have been. No defeat can deprive us of the success of having existed for some moment of time in a universe that seems indifferent to us.

This is no defeatism, it is rather a sense of tragedy in a world in which necessity is represented by an inevitable disappearance of differentiation. The declaration of our own nature and the attempt to build up an enclave of organization in the face of nature's overwhelming tendency to disorder is an insolence against the gods and the iron necessity that they impose. Here lies tragedy, but here lies glory too.

These were the ideas I wished to synthesize in my book on cybernetics.

My former student, and still my collaborator, Jerome Y. Lettvin, had gone to Boston City Hospital in neurology in 1943, and he had brought Norbert and Walter Pitts together on account of Walter's and my paper on the Logical Calculus. Walter agreed with Norbert as to the importance of activity in random nets. Walter found a way of computing this by means of the probability of a neuron being fired in terms of the probability of activity in the neurons affecting it. He presented this at one of the Macy cybernetics meetings. Johnny von Neumann saw its importance in attacking the problem of shock waves. Walter became a Guggenheim Fellow and went on to work with Norbert as described by him in the introduction to Cybernetics. They worked on the syncytium of heart muscle, which is random in detail of connections of neighbors by protoplasmic bridges, and on the pools of motor neurons with fairly random connections to them from afferent peripheral neurons playing upon them. Walter and I had less time together until Norbert arranged with Julius A. Stratton and Jerome B. Wiesner for me to join Jerry Lettvin and Walter in the Research Laboratory of Electronics at Massachusetts Institute of Technology, where we remained together until Walter's death on May 14, 1969.

After the publication of Cybernetics, Norbert rarely had time for the Macy meetings, but they continued to make us a group disciplined in interdisciplinary give-and-take, which came out at the first Hixon Symposium in 1948. Johnny von Neumann had spent hours after the meetings and other long sessions with me, for he had become excited by the possibility of formulating the central problems of cybernetics as he had done for games. He chose to consider first reproduction, about which he spoke at the first Hixon Symposium. He suggested that, given a computing machine, a program and an assembler with a sea of parts and a duplicator of the tape, there is no reason why this cannot be a simplest machine that can reproduce itself. If, by error, it makes a simpler, that machine will fail to reproduce, but if a more complicated, it may reproduce and machines may so evolve. With Heinz von Foerster, Johnny discussed self-organization, the problem being the possibility of things essentially chaotic becoming organized, whether stars or crystals, and, if organisms, whether this organization was inherently informational. Did the system accept its information from the environment to stave off entropy or disorganization? What has this to do with learning, above all learning to perceive?

With me Johnny had talked chiefly of the possibilities of building reliable computers of unreliable components, which became the theme of my "Toward a Probabilistic Logic". He always nudged me to continue. I did, and much later, with the help of Manuel Blum and Leo A.M. Verbeek, managed to make a logic where the functions, not merely the arguments, were only probable. It did not give the kind of redundancy of coding which Shannon was able to make use of in proving the noisy-channel coding theorem which forms the basis of information theory. Our kind of probabilistic logic did not therefore give rise to an analog of channel capacity in the form of a minimum level of redundancy above which arbitrarily reliable computation with unreliable components could be achieved. This was finally achieved in my laboratory by S. Winograd and J.D. Cowan. In constructing our probabilistic logic we had been forced to abandon von Neumann's multiplexing scheme, in which each neuron listens to two and speaks to two, for one in which each neuron listens to many and speaks to many, the so-called anastomotic net. Winograd and Cowan used anastomotic nets, but in such a way as to incorporate the redundant error-correcting codes used in coding for noisy communication channels into the logical functions computed by each neuron in the net. The probabilistic logic which results from their construction is functionally redundant: there are many more possible functions stored in the net than in our own anastomotic nets, and it is this redundancy which is used to obtain reliable computation. Given the assumption that the rate at which neurons malfunction is not tied to the number of inputs playing on each one, the Winograd-Cowan construction is in one-to-one correspondence with that of Shannon's which led to the noisy-channel coding theorem and the existence of a channel capacity. On this basis there is an analogous computational capacity for reliable computers made of unreliable components.
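The bare idea that redundancy can buy reliability is easy to illustrate, though the illustration falls far short of the constructions described above. The Python sketch below is neither von Neumann's multiplexing nor the Winograd-Cowan coding scheme: it simply triplicates (or further replicates) an unreliable component and takes a majority vote. It assumes independent failures with probability `p` and a perfectly reliable voter, both of which are simplifications, and the name `majority_error` is hypothetical.

```python
from math import comb

def majority_error(p, n):
    """Chance that more than half of n independent copies err, each with probability p (n odd)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

# Error of the voted output as the number of redundant copies grows.
for n in (1, 3, 9, 27):
    print(n, majority_error(0.1, n))
```

With a failure rate of one in ten, three copies already bring the error of the vote below three in a hundred, and it keeps falling as copies are added. What simple repetition cannot do is reach arbitrary reliability at a bounded level of redundancy, which is the point of the capacity result.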

The interesting thing in the present context is that for sufficiently extensive coding almost any code, say one made at random, is nearly optimal. In 1967 Stuart Kauffman came to me for three months, to work on his idea that a random net of neurons, each listening to two neurons, and each speaking to two, and the 16 Boolean functions tossed into the neurons at random, would form a good model of epigenesis. These nets must exhibit recurrent behavior sequences called state cycles. Kauffman found that with two inputs per element, nets typically do have very short, stable state cycles, and very few state cycles. Interpreting each state cycle (a stable mode of behavior of the model genetic net) as a distinct cell type, he modeled the epigenetic landscape as flow among the cell types induced by random minimal perturbations to a running system. He found that such systems flow down an epigenetic landscape to a subset of cell types and remain trapped among them. In addition, each cell type can only directly flow into a few other cell types, a characteristic of all metazoan cells. Simulation proved him right in model genetic nets of anything up to 8,000 components, beyond which computer memory balks. In all the ensuing time I have tried, and so have several mathematical friends, to get a closed recursive procedure to continue when computers have bogged down for lack of memory, but to date with no success. It does look as though one does not need miracles or intricate clockwork to explain ontogenesis, but only unfamiliar properties of chance and number.
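A toy version of such a net is simple to simulate. The Python sketch below is an illustration under stated assumptions, not Kauffman's program: each element draws two input elements and one of the 16 two-input Boolean functions at random, all elements update synchronously, and the run stops when a state repeats. Unlike the nets described above, the number of elements each one speaks to is left to chance. The names `random_net`, `step` and `state_cycle`, the net size and the seed are all hypothetical choices.

```python
import random

def random_net(n, k=2, seed=0):
    """Wire n elements: each gets k random inputs and a random truth table."""
    rng = random.Random(seed)
    inputs = [tuple(rng.randrange(n) for _ in range(k)) for _ in range(n)]
    # A random table of 2**k bits is one of the 16 Boolean functions when k = 2.
    tables = [tuple(rng.randrange(2) for _ in range(2 ** k)) for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Update every element synchronously from the current state."""
    new_state = []
    for sources, table in zip(inputs, tables):
        index = 0
        for source in sources:
            index = (index << 1) | state[source]
        new_state.append(table[index])
    return tuple(new_state)

def state_cycle(state, inputs, tables):
    """Return (transient length, cycle length) of the trajectory from `state`."""
    first_seen = {}
    t = 0
    while state not in first_seen:
        first_seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return first_seen[state], t - first_seen[state]

inputs, tables = random_net(16, seed=1)
start = tuple(random.Random(2).randrange(2) for _ in range(16))
transient, length = state_cycle(start, inputs, tables)
print("transient:", transient, "cycle length:", length)
```

Because the state space is finite and the update is deterministic, every trajectory must fall into a state cycle. The claim of interest is how short and how few those cycles typically are, which a reader can probe by varying the seed and the starting state.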

Unfortunately for cybernetics, Johnny became chief scientist of the Atomic Energy Commission and then died young. Norbert kept on struggling with three kinds of problems: The first, the statistical; the second, the coupling of nonlinear oscillators; and finally, continuous nonlinear prediction. These, and combinations of them, proved too tough and the last four years of his life, 1960-1964, were chiefly taken up with philosophical problems, often with an ethical or moral slant. This resulted in an unfortunate use of the word Cybernetics in international meetings of a group of physicians, which Norbert christened the "Rheumatism of Cybernetics", and by other groups of would-be social reformers, all of which extensions Norbert has expressly rejected in his answer to Margaret Mead and Gregory Bateson in the very introduction to Cybernetics. There is no doubt but that there are cybernetic problems in them, but we lack sufficiently long runs of uncorrupted data to apply the mathematical tools at our command. These things made us hesitate to set up an American Society for Cybernetics. When we did, it was to forestall the opportunists and the do-gooders. Perhaps we will succeed. The Russians have.

In the realm of neurophysiology cybernetic ideas are really at work. Regenerative and inverse feedback are properly handled. Better conceptions of the distribution of functions over tissues with only partially ordered connections are appearing, and our own paper by J. Lettvin, H.R. Maturana, W. McCulloch and W. Pitts, "What the Frog's Eye Tells the Frog's Brain", pointed the way to a more profitable approach to experimental epistemology. M.L. Minsky and S.A. Papert recently reviewed the limitations of perceptrons without closed loops, determined in large measure by their logical depth, and have given a fairly up-to-date bibliography. In practice Louis Sutro's group, including me, has found Azriel Rosenfeld's procedures most useful for specifying the binocular vision required by a robot to get about a three-dimensional world.

Lewey O. Gilstrap (of Adaptronics) has so solved the electronics of adaptive control of air- and spacecraft, that control even for vertical takeoff and landing may soon cease to be a pressing problem.

The behavior of closed loops in digital devices lay dormant waiting for mathematical development from "A Logical Calculus of the Ideas Immanent in Nervous Activity" of 1943 to 1956, when David Huffman gave us a theory of shift registers in terms of prime polynomials over Galois fields. As this branch of modular mathematics is comprehensible in the lower predicate calculus, it is known to be consistent and complete. Two years ago, James L. Massey invented an algorithm that will produce the minimum linear shift register to embody a given sequence, say of zeroes and ones. A shift register is linear if it consists of a string of delaying components, each signaling to the next and the last to the first by a line going through one or more gates each of which passes on that signal, provided it does not simultaneously receive one from a particular delayer or vice versa. By using other logical components in this loop, it is often possible to construct shorter strings of delayers to do the job. José L. Simoes da Fonseca showed how to linearize all of these problems in theory, and he and Massey found the algorithm for making them. These can, in theory, account for the learning of a sequence of skilled acts, as in learning to play a given piano composition. In theory now this aspect of cybernetics can go to any elaboration on a sound basis. This should help us to understand the basal ganglia which presumably are the habitat of these delay chains.
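The gate described here, which passes a signal only when it does not simultaneously receive another, is the exclusive or, so a linear shift register over zeroes and ones obeys a recurrence modulo 2. The Python sketch below gives the textbook form of the shortest-register synthesis now usually called the Berlekamp-Massey algorithm; it is offered as an illustration and is not taken from Massey's paper. The function names, the example taps and the starting fill are hypothetical.

```python
def lfsr_sequence(taps, fill, length):
    """Generate bits with s[i] = XOR of s[i - t] for t in taps, starting from `fill`."""
    bits = list(fill)
    while len(bits) < length:
        bit = 0
        for t in taps:
            bit ^= bits[-t]
        bits.append(bit)
    return bits[:length]

def shortest_register(bits):
    """Return (L, c): the minimum register length and connection coefficients c[0..L].

    The coefficients satisfy c[0] = 1 and, for every i >= L,
    bits[i] = XOR over j = 1..L of (c[j] AND bits[i - j]).
    """
    n = len(bits)
    c = [1] + [0] * n      # current connection polynomial
    b = [1] + [0] * n      # polynomial saved at the last length change
    L, m = 0, -1
    for i in range(n):
        # Discrepancy between bit i and what the current register predicts.
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, previous
    return L, c[:L + 1]

sequence = lfsr_sequence(taps=(3, 4), fill=(1, 0, 0, 0), length=30)
L, c = shortest_register(sequence)
print(L, c)   # 4 [1, 0, 0, 1, 1]
```

Given thirty bits produced by a four-stage register tapped at the third and fourth delayers, the procedure recovers both the length and the taps. Fed an arbitrary sequence, it returns the shortest linear register that would reproduce it.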

Modern electrical engineering theory and practice in control, feed forward, interval timing and autocorrelation to bring signals up out of noise bid fair to give us a clear notion of cerebellar activity. The cerebrum, whose business seems to be to take habits, make guesses and lay plans, certainly presents no insuperable problems in the handling of information on the inductive side with its appropriate uncertainty, and the mechanism of multiple-trace formation is under competent chemical, physical and physiological investigation, even including the hippocampus, without which man makes no new traces. Such problems as remain in cortex and thalamus seem purely parochial.

Back in 1952 I had drowned myself in data on the reticular formation of the brain, the abductive organ that commits the entire animal to one rather than to any other of a small number of incompatible modes of behavior, like fight, fly, sleep, eat or make love. For man there may be 14 at least and 18 at most such modes according to whether one includes incompatible reflexes, say for swallowing and breathing. I do not like to experiment when I have no hypothesis to disprove. The anatomy of the reticular core, thanks to the Golgi work of the Scheibels and Valverde, the sections at right angles to the core through the bends of the flexures of the midbrain, thanks to Jan Droogleever Fortuyn, the characterization of its cells, thanks to Walle Nauta, and the studies of the behavior of its cells singly and in combination, thanks to Vahe Amassian and the Scheibels, have, since 1952, given us a sharp limitation of what any theory of the function of the reticular core may postulate. William Kilmer and I, working for five or six years, were convinced that neither the theory of coupled oscillators nor the theory of iterated nets could handle the problem. We finally turned to the ars combinatoria and were forced to a simulation on a digital computer. The simulation, with its several nonlinearities, now does all we can ask of it, and we are ready to look into a version in hardware, for a roving robot -Mars. Yet a good model is still a long way from a clean theory by which to design experiments. The way to it may go over a study by Roberto Moreno-Diaz of a neuronal net that is universal in the same sense as a Turing machine. The net proper is easily built by Manuel Blum's construction. Its circuit action is Boolean and deterministic, and can easily be described by Roberto's functional state transition matrix. It is preceded by an encoder that may be probabilistic or continuous but is circle-free.
And it is followed by a decoder that samples all cells of the universal net and yields all required permutations. If this will suffice for a full description of the circuit action of the reticular core, and I think it will, we shall still need to look into the logic of relations!

Even this is under way. Roberto's student, Pepe Mira, and José da Fonseca are at work on it. Herman and Kotelly and their friends, as well as James Bever, Christopher Longyear, and I are also at work on it, each from his own angle. There are nearly enough of us in close communication to achieve "critical mass".

That is where cybernetics is today.

It was born in 1943, christened in 1948, and came of age five years ago in the early 1960s. In its short majority it has certainly done best for those fields where it was conceived. It has been a challenge to logic and to mathematics, an inspiration to neurophysiology and to the theory of automata, including artificial intelligence, and bionics or robotology. To the social sciences it is still mere suspiration.

Above all, it is ready to officiate at the expiration of philosophical Dualism and Reductionism, to use MacKay's phrase, "The Nothing-buttery" of Mentalism and of Materialism. Our world is again one, and so are we. Or at least, that is how it looks to one who was never a prodigy and is not a mathematician.


1 Reprinted from ASC Forum, Volume VI, No. 2-Summer 1974 (pp 5-16)