Communication is the discriminatory response of an organism to a stimulus (Cherry, 1957). If we are to reckon with communication beyond formal rhetoric or syntax, whether English or computer graphics, we must address ourselves to the versatility of the discriminating mechanism: the interface. In this case the interface is the point of contact and interaction between a machine and the “information environment,” most often the physical environment itself.
We have looked at graphic interfaces for one, and teletypes for another, but a dialogue demands a redundant and multichanneled concoction of sensory and motor devices far beyond these two mechanisms. We are talking about a total observation channel for an architecture machine.
For a machine to have an image of a designer, of a problem, or of a physical environment, three properties are inherently necessary: an event, a manifestation, a representation. The event can be visual, auditory, olfactory, tactile, extrasensory, or a motor command. The manifestation measures the event with the appropriate parameters: luminance, frequency, brain wavelength, angle of rotation, and so forth. The representation is the act of mapping the information into a receptacle that is compatible with the organism’s processing characteristics. These three properties—event, manifestation, representation—form the interface between any two organisms. The aspect of this interface with which we are primarily concerned is the manifestation, encompassed primarily by a piece of hardware.
In an architect-machine relationship, perhaps the most interesting sensory interfaces are auditory and visual. Machines that are capable of visual perception and speech recognition are two of the prime targets of researchers in artificial intelligence. Someday, machines that can see and hear will be commonplace. Setting aside the phantasmagoria of robot designers, consider speaking to a machine that sees you: a machine with eyes and ears, a machine that walks and talks. In our present culture the thought is either frightening, foolish, or, to some, quite realistic. To our children it will be an ordinary daily occurrence. To Mortimer Taube (1961) it is offensive. To Marvin Minsky (1966) it is obvious.
In the meantime, extrapolations into the future should recognize current problems of implementation. In the January 1967 issue of Datamation, Leslie Mezei portrayed a conversation with a machine that could listen and talk. A fragment went as follows:
Professor:
What time is it?
Computer:
It is 8:30 p.m., Thursday, December 5, 1985.
Professor:
I think I would like to start with some music today, let’s have some chamber music.
Computer:
You listen to Telemann most.
Professor:
No, something earlier.
Computer:
What about.…
Six months later, in the August issue of Datamation, B. W. Boehm parodied the sketch, implanting aspects of present-day dilemmas:
Professor:
What time is it?
Computer:
I did not catch the last word. Or was it two words?
Professor:
What is the time?
Computer:
The dime is a copper coin worth one-tenth of a dollar. The word derives from the Latin decem, meaning.…
Professor:
No. No. What is the time? The time?
Computer:
It is 8:30 p.m., Thursday, December 5, 1985. We have been having some trouble with your linguals recently. Sometimes I can’t tell your d’s from your t’s. Let’s practice them. Watch the display screen for the intonation pattern, and repeat after me: Teddy’s daddy toted two dead toads to Detroit.
Professor:
Teddy’s daddy toted.…
Nilo Lindgren’s (1965a and b) comprehensive survey describes a host of intriguing research efforts in speech recognition, all of which fall into one of three categories: the auditory sensation, the acoustical disturbance freely propagating through air, and a sequence of articulatory events in a psychological structure. The reader should also refer to the recent works of Bobrow and Klatt (1968), Reddy and Vicens (1968), and Rabiner (1968).
Beyond giving a machine ears, giving a machine eyes is extremely critical to architecture machines. Just on the hunch that a blind machine will have shortcomings similar to those of a blind architect, the relevance of a seeing machine warrants research. Outside of the design professions, giving machines eyes is of imminent importance. For instance, space exploration will eventually require machines that can both see and process the seen information. This is because the remote monitoring of a space robot’s movements by earthlings requires too much transmission time (to Mars and back, for example), and a machine would crash into that which it is told to avoid only because the message to stop might arrive too late. More domestic applications involve visual discrimination of simple objects. Eventually, machines will package your purchased goods at the counter of your neighborhood supermarket.
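The transmission-time argument can be made concrete with a little arithmetic. The sketch below is only an illustration: the Earth-Mars distances are assumed round figures (roughly 55 million km at closest approach and 400 million km at the farthest), neither of which comes from the text, and the rover speed is a hypothetical value chosen for the example.

```python
# Round-trip signal delay between Earth and Mars, and the distance a robot
# covers "blind" while an image travels to Earth and a stop command returns.
# Distances are approximate, assumed values; the rover speed is hypothetical.

SPEED_OF_LIGHT_KM_S = 299_792.458


def round_trip_delay_s(distance_km: float) -> float:
    """Seconds for a picture to reach Earth plus a command to come back."""
    return 2.0 * distance_km / SPEED_OF_LIGHT_KM_S


def blind_travel_m(distance_km: float, speed_m_s: float) -> float:
    """Metres travelled before a remotely issued 'stop' can take effect."""
    return round_trip_delay_s(distance_km) * speed_m_s


if __name__ == "__main__":
    hypothetical_speed_m_s = 0.05  # assumed crawl speed, for illustration only
    for label, distance_km in (("closest", 55e6), ("farthest", 400e6)):
        delay = round_trip_delay_s(distance_km)
        travelled = blind_travel_m(distance_km, hypothetical_speed_m_s)
        print(f"{label}: {delay / 60:.1f} min round trip, "
              f"{travelled:.0f} m travelled before a stop command arrives")
```

Under these assumptions the round trip takes somewhere between about six and forty-five minutes, which is why the obstacle must be seen and processed by the machine itself.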
Oliver Selfridge (and Neisser, 1963) is credited with the founding works in pattern recognition. His mechanism, PANDEMONIUM, would observe many localized visual characteristics. Each local verdict as to what was seen would be voiced by “demons” (thus, pandemonium), and with enough pieces of local evidence the pattern could be recognized. The more recent work of Marvin Minsky and Seymour Papert (1969) has extensively shown that solely local information is not enough; certain general observations are necessary in order to achieve complete visual discrimination.
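The idea of local demons voicing verdicts can be sketched in a few lines. What follows is a toy loosely modeled on the description above, not Selfridge's actual program: the two feature demons, the weights of the cognitive demons, and the 3-by-3 test images are all invented for illustration.

```python
from typing import Callable, Dict, List

Image = List[List[int]]  # small binary picture, 1 = dark cell


# Feature demons: each looks for one local characteristic and "shouts"
# 1.0 if it finds it, 0.0 otherwise. (Hypothetical features.)
def vertical_stroke(img: Image) -> float:
    return float(any(all(row[c] for row in img) for c in range(len(img[0]))))


def horizontal_stroke(img: Image) -> float:
    return float(any(all(row) for row in img))


FEATURE_DEMONS: Dict[str, Callable[[Image], float]] = {
    "vertical": vertical_stroke,
    "horizontal": horizontal_stroke,
}

# Cognitive demons: one per pattern. Each weighs the feature shouts it
# cares about; the weights here are assumed values, not learned ones.
COGNITIVE_DEMONS: Dict[str, Dict[str, float]] = {
    "I": {"vertical": 1.0, "horizontal": -1.0},
    "-": {"vertical": -1.0, "horizontal": 1.0},
    "+": {"vertical": 0.6, "horizontal": 0.6},
}


def recognize(img: Image) -> str:
    """Decision demon: pick the pattern whose demon shouts loudest."""
    shouts = {name: demon(img) for name, demon in FEATURE_DEMONS.items()}
    loudness = {
        pattern: sum(weight * shouts[feature] for feature, weight in weights.items())
        for pattern, weights in COGNITIVE_DEMONS.items()
    }
    return max(loudness, key=loudness.get)


if __name__ == "__main__":
    bar = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
    cross = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
    print(recognize(bar), recognize(cross))
```

Every demon in this toy reports only on a local feature, which is precisely the kind of evidence that, by the Minsky-Papert argument, cannot by itself settle every question of visual discrimination.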
At present, these works are being applied to architectural problems as an exercise preliminary to the construction of an architecture machine. Anthony Platt and Mark Drazen are applying the Minsky-Papert eye to the problem of looking at physical models (Negroponte, 1969d). The interim goal of this exercise is to observe, recognize, and determine the “intents” of several models built from plastic blocks. Combined with Platt’s previously described LEARN, this experiment is an attempt at machine learning through machine seeing. In contrast to describing criteria and asking the machine to generate physical form, this exercise focuses on generating criteria from physical form.
A second example of interfacing with the real world is Steven Gregory’s GROPE (Negroponte et al., 1969b). GROPE is a small mobile unit that crawls over maps, in this case Passonneau and Wurman’s (1966) Urban Atlas maps. It employs a low-resolution seeing mechanism constructed with simple photocells that register only states of on or off, “I see light” or “I don’t see light.” In contrast to the Platt experiment, GROPE knows nothing about images; it deploys a controller that must be furnished with a context and a role (as opposed to a goal: play chess as opposed to winning at chess). GROPE’s role is to seek out “interesting things.” To determine future moves, the little robot compares where he has been to where he is, compares the past to the present, and occasionally employs random numbers to avoid ruts. The onlooking human or architecture machine observes what is “interesting” by observing GROPE’s behavior rather than by receiving the testimony that this or that is “interesting.” At present, some aspects of GROPE are simulated and other aspects use the local computing power on GROPE’s plastic back. GROPE will be one of the first appendages to an architecture machine, because it is an interface that explores the real world. An architecture machine must watch devices such as GROPE and observe their behavior rather than listen to their comments.
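A controller of this general kind can be sketched as follows. This is a hypothetical reconstruction, not Gregory's implementation: the binary map, the rule of turning when the photocell reading changes, the wander probability, and the names `grope` and `dwell_counts` are all assumptions made for illustration.

```python
import random
from collections import Counter
from typing import List, Tuple

Map = List[List[int]]  # 1 = "I see light", 0 = "I don't see light"
Cell = Tuple[int, int]
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # east, south, west, north


def grope(world: Map, start: Cell, steps: int,
          wander: float = 0.1, seed: int = 0) -> List[Cell]:
    """Crawl over a binary map and return the sequence of cells visited."""
    rng = random.Random(seed)
    rows, cols = len(world), len(world[0])
    r, c = start
    heading = 0
    previous = world[r][c]
    path = [(r, c)]
    for _ in range(steps):
        current = world[r][c]
        if rng.random() < wander:
            heading = rng.randrange(4)      # random number to avoid ruts
        elif current != previous:
            heading = (heading + 1) % 4     # past differs from present: turn and linger
        previous = current
        dr, dc = MOVES[heading]
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            r, c = nr, nc
        else:
            heading = (heading + 1) % 4     # bumped into the map edge: turn
        path.append((r, c))
    return path


def dwell_counts(path: List[Cell], top: int = 3) -> List[Tuple[Cell, int]]:
    """What an onlooker sees: the cells where the robot spent the most time."""
    return Counter(path).most_common(top)


if __name__ == "__main__":
    world = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
    path = grope(world, (0, 0), steps=50, seed=1)
    print(dwell_counts(path))
```

The robot never reports that a cell is “interesting”; `dwell_counts` stands in for the onlooker, who infers interest only from where the robot chose to spend its time.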
But why not supply the machine with a coordinate description of the form on punch cards and proceed with the same experiment? Why must a machine actually see it? The answer is twofold. First, if the machine were supplied a nonvisual input, the machine could not learn to solicit such information without depending on humans. Second, it turns out that the computational task of simply seeing, the physiology of vision (as opposed to the psychology of perception), involves a set of heuristics that are apparently those very rules of thumb that were missing from LEARN, that made LEARN a mannerist rather than a student.
It seems natural that architecture machines would be superb clients for sophisticated sensors. Architecture itself demands a sensory involvement. Cardboard models and line drawings describe some of the physical and some of the visual worlds, but who has ever smelt a model, heard a model, lived in a model? Most surely, computer-aided architecture is the best client for “full interfacing.” Designers need an involvement with the sensory aspects of our physical environments, and it is not difficult to imagine that their machine partners need a similar involvement.