Lev Manovich on Fri, 21 Aug 1998 08:56:00 +0200 (MET DST)

Lev Manovich


The Most Popular Moving Image Sequence of All Times

Don't you wish that somebody, in 1895, 1897 or at least in 1903, had realized
the fundamental significance of cinema's emergence and produced a
comprehensive record of the new medium's emergence? Interviews with
the audiences; a systematic account of the narrative strategies,
scenography and camera positions as they developed year by year; an
analysis of the connections between the emerging language of cinema
and different forms of popular entertainment which coexisted with it,
would have been invaluable. But, of course, these records do not exist.
Instead, we are left with newspaper reports, diaries of cinema's
inventors, programs of film showings and other bits and pieces -- a set
of random and unevenly distributed historical samples.
        Today we are living in the midst of an emerging  new medium -
the metamedium of the digital computer. All information becomes
encoded in one code; all cultural objects become computer programs,
something which is not only seen, heard or read, but first of all stored
and transmitted, compiled and executed. In contrast to a hundred years
ago, when cinema was coming into being, we are fully aware of the
significance of this new media revolution. And yet I am afraid that
future theorists and historians of computer media will be left with not
much more than the equivalents of newspaper reviews and random
bits of evidence similar to those left from cinema's first decades.
They will find that the analytical texts from our
era are fully aware of the significance of the computer's takeover of culture
yet, by and large, contain speculations about the future
rather than a record and a theory of the present. Future researchers will
wonder why the theoreticians, who already had plenty of experience
analyzing older cultural forms, did not try to describe computer
media's semiotic codes, modes of address, and audience reception
patterns. If, for instance, they painstakingly reconstructed how cinema
emerged out of preceding cultural forms (panorama, optical toys, peep
shows), why didn't they attempt to construct a similar genealogy for
the language of computer media at the moment when it was just
coming into being, while the elements of previous cultural forms
going into its making were still clearly visible, still recognizable before
melting into a new unity? Where were the theoreticians at the
moment when the icons and the buttons of multimedia interfaces
were like wet paint on a just-completed painting, before they became
a universal convention and thus slipped into invisibility? Or, at the
moment when the designers of Myst were debugging their code,
converting graphics to 8-bit and massaging QuickTime clips? Or, at the
historical moment when a young 20-something programmer at
Netscape took the chewing gum out of his mouth, sipped warm Coke
out of the can -- he was at a computer for 16 hours straight, trying to
meet a marketing deadline -- and, finally satisfied with its small file
size, saved a short animation of stars moving across the night sky, the
animation which was to appear in the upper right corner of Netscape
Navigator, thus becoming the most widely seen moving image
sequence ever -- until the next release.
        The following is an attempt at both a record and a theory -- of the
present. Just as film historians traced the development of film
language during cinema's first decades, I want to describe and
understand the logic driving the development of the language of
computer media. It is tempting to extend this parallel a little further
and to speculate whether today this new language is already getting
closer to acquiring its final and stable form, just as film language
acquired its "classical" form during the 1910's. Or are the 1990's more
like the 1890's, because future computer media language will be
entirely different from the one used today? [1] In either case, by trying to
understand which cultural forces are shaping the development of this
language, we may be in a better position both to predict its future
course and to offer different alternatives. For just as avant-garde
filmmakers throughout cinema's existence offered alternatives to its
particular narrative audio-visual regime, the task of an avant-garde
computer artist today is to offer alternatives to the existing language of
computer media. This can be better accomplished if we have a theory of
how "mainstream" language is currently structured.
        Does it make sense to theorize the present when it seems to be
changing so fast? It is a gamble. If subsequent developments prove the
theoretical projections of this text to be correct, I win. But, if the
language of computer media develops in a different direction than the
one suggested by the present analysis, this does not mean that I
automatically lose. Rather, the analysis presented here will become a
record of possibilities which were never realized, of a
horizon which was visible to us today but later became unimaginable.
        We no longer think of the history of cinema as a linear march
towards only one possible language, or as a progression towards more
and more accurate verisimilitude. Rather, we have come to see its
history as a succession of distinct and equally expressive languages,
each with its own aesthetic variables, each new language closing off
some of the possibilities of the previous one -- a cultural logic not
dissimilar to Kuhn's analysis of scientific paradigms. [2] Similarly, every
stage in the history of computer media offers its own aesthetic
opportunities, as well as its own imagination of the future -- in short,
its own "research paradigm." This paradigm is modified or even
abandoned at the next stage. In this paper I want to record the "research
paradigm" of new media during its first decade before it slips into
invisibility.

Cultural Interfaces

During the 1990s, the cultural role of a digital computer has changed
from a tool to a medium. In the beginning of the decade, a computer
was still largely thought of as a simulation of a typewriter, a paintbrush
or a drafting ruler -- in other words, as a tool used to produce cultural
content which, once created, will be stored and distributed in its
appropriate media: printed page, film, photographic print, electronic
recording. By the end of the decade, the computer's public image has
begun to shift to one of a universal machine, used not only to author,
but also to store, distribute and access all media. All culture, past and
present, is beginning to be filtered through a computer, with its
particular human-computer interface.
        The term human-computer interface (HCI) describes the ways in
which the user interacts with a computer. HCI includes physical input
and output devices such as a monitor, a keyboard, and a mouse. It also
consists of metaphors used to conceptualize the organization of
computer data. For instance, the Macintosh interface introduced by
Apple in 1984 uses the metaphor of files and folders arranged on a
desktop. Finally, HCI also includes ways of manipulating this data, i.e. a
grammar of meaningful actions which the user can perform on it. An
example of this grammar is the set of commands used in a command-line
interface such as DOS and UNIX: copy file, delete file, set date, open
port, list directory, and so on.
        As the role of a computer is shifting from being a tool to a
universal media machine, we are increasingly "interfacing" to
predominantly cultural data: texts, photographs, films, music, virtual
environments. In short, we are no longer interfacing to a computer but
to culture encoded in digital form. I would like to introduce the term
"cultural interfaces" to describe evolving interfaces used by the
designers of Web sites, CD-ROM and DVD-ROM titles, multimedia
encyclopedias, online museums, computer games and other digital
cultural objects.
        If you need to remind yourself what a typical cultural interface
looked like in 1997, go back in time and click to a random Web page.
You are likely to see something which graphically resembles a
magazine layout from the same decade. The page is dominated by text:
headlines, hyperlinks, blocks of copy. Within this text are a few media
elements: graphics, photographs, perhaps a QuickTime movie and a
VRML scene. The page also includes radio buttons and a pull-down
menu which allows you to choose an item from the list. Finally there
is a search engine: type a word or a phrase, hit the search button and
the computer will scan through a file or a database trying to match your
entry.
        For another example of a prototypical cultural interface of the
1990s, you may load (assuming it would still run on your computer)
the best-known CD-ROM of the 1990s - Myst (Broderbund, 1993).
Its opening clearly recalls a movie: credits slowly scroll across the
screen, accompanied by a movie-like soundtrack to set the mood. Next,
the computer screen shows a book open in the middle, waiting for your
mouse click. Next, an element of a familiar Macintosh interface makes
an appearance, reminding you that along with being a new
movie/book hybrid, Myst is also a computer application: you can adjust
sound volume and graphics quality by selecting from a standard
Macintosh-style menu in the upper part of the screen. Finally, you
are taken inside the game, where the interplay between the printed
word and cinema continues. A virtual camera frames images of an
island which dissolve into one another. At the same time, you keep
encountering books and letters, which take over the screen, providing
you with clues on how to progress in the game.
        Given that computer media is simply a set of characters and
numbers stored in a computer, there are numerous ways in which it
could be presented to a user. Yet, as always happens with cultural
languages, only a few of these possibilities actually appear viable in a
given historical moment. Just as early fifteenth century Italian painters
could only conceive of painting in a very particular way - quite
different from, say, sixteenth century Dutch painters - today's digital
designers and artists use a small set of action grammars and metaphors
out of a much larger set of all possibilities.
        Why do cultural interfaces - web pages, CD-ROM titles,
computer games - look the way they do? Why do designers organize
computer data in certain ways and not in others? Why do they employ
some interface metaphors and not others?
        My theory is that there are three key cultural forms which are
shaping cultural interfaces in the 1990s. What are these forms? The
answer to this puzzle can be found in the opening sequence of Myst
which activates them before our eyes, one by one. The first form is
cinema. The second form is the printed word. The third form is a
general-purpose human-computer interface (HCI).
        At the time of this writing (1997), it appears that out of the three,
the influence of cinema is becoming more and more important. So,
despite frequent pronouncements that cinema is dead, it is actually on
its way to becoming a general purpose cultural interface, a set of
techniques and tools which can be used to interact with any cultural
data. Accordingly, I will devote the largest section of this article to the
discussion of the ways in which cinematic techniques structure cultural
interfaces.
        As should become clear from the following, I use the words
"cinema" and "printed word" as shortcuts. They stand not for
particular objects, such as a film or a novel, but rather for larger
cultural traditions (we can also use such words as cultural forms,
mechanisms, languages or media). "Cinema" thus includes mobile
camera, representation of space, editing techniques, narrative
conventions, activity of a spectator -- in short, different elements of
cinematic perception, language and reception. Their presence is not
limited to the twentieth-century institution of fiction films: they can
already be found in panoramas, magic lantern slides, theater and other
nineteenth-century cultural forms; similarly, since the middle of the
twentieth century, they are present not only in films but also in
television and video programs.  In the case of the "printed word" I am
also referring to a set of conventions which have developed over many
centuries (some even before the invention of print) and which today
are shared by numerous forms of printed matter, from magazines to
instruction manuals: a rectangular page containing one or more
columns of text; illustrations or other graphics framed by the text; pages
which follow each other sequentially; a table of contents and an index.
        The modern human-computer interface has a much shorter history
than the printed word or cinema -- but it is still a history. Its principles
such as direct manipulation of objects on the screen, overlapping
windows, iconic representation, and dynamic menus were gradually
developed over a few decades, from the early 1950s to the early 1980s,
when they finally appeared in commercial systems such as Xerox Star
(1981), the Apple Lisa (1983), and most importantly the Apple
Macintosh (1984). [3] Since then, they have become an accepted
convention for operating a computer, and a cultural language in their
own right.
        Cinema, the printed word and human-computer interface: each
of these traditions has developed its own unique ways of organizing
information, presenting it to the user, correlating space and time
with each other, and structuring human experience
in the process of accessing information. Pages of text
and a table of contents; 3-D spaces framed by a rectangular frame which
can be navigated using a mobile point of view; hierarchical menus,
variables, parameters, copy/paste and search/replace operations -- these
and other elements of these three traditions are shaping cultural
interfaces today. Cinema, the printed word and HCI: they are the three
main reservoirs of metaphors and strategies for organizing
information which feed cultural interfaces.
        Bringing cinema, the printed word and HCI together
and treating them as occupying the same conceptual plane has an
additional advantage -- a theoretical bonus. It is only natural to think of
them as belonging to two different kinds of cultural species, so to speak.
If HCI is a general purpose tool which can be used to manipulate any
kind of data, both the printed word and cinema are less general: they
offer ways to organize particular types of data: text in the case of print,
audio-visual narrative taking place in a 3-D space in the case of cinema.
HCI is a system of controls to operate a machine; the printed word and
cinema are cultural traditions, distinct ways to record human memory
and human experience, mechanisms for cultural and social exchange
of information. Bringing HCI, the printed word and cinema together
allows us to see that the three have more in common than we may
anticipate at first. On the one hand, being a part of our culture now for
half a century, HCI already represents a powerful cultural tradition, a
cultural language offering its own ways to represent human memory
and human experience. This language speaks in the form of discrete
objects organized in hierarchies (hierarchical file system), or as catalogs
(databases), or as objects linked together through hyperlinks
(hypermedia). On the other hand, we begin to see that the printed word
and cinema also can be thought of as interfaces, even though
historically they have been tied to particular kinds of data. Each has its
own grammar of actions, each comes with its own metaphors, each
offers a particular physical interface. A book or a magazine is a solid
object consisting of separate pages; the actions include going from
page to page linearly, marking individual pages and using a table of
contents. In the case of cinema, its physical interface is a particular
architectural arrangement of a movie theater; its metaphor is a
window opening up into a virtual 3-D space.
        Today, as media is being "liberated" from its traditional physical
storage media - paper, film, stone, glass, magnetic tape - the elements
of printed word interface and cinema interface, which previously were
hardwired to the content, become "liberated" as well. A digital designer
can freely mix pages and virtual cameras, table of contents and screens,
bookmarks and points of view. No longer embedded within particular
texts and films, these organizational strategies are now free floating in
our culture, available for use in new contexts. In this respect, printed
word and cinema have indeed become interfaces -- rich sets of
metaphors, ways of navigating through content, ways of accessing and
storing data. For a user, both conceptually and psychologically, their
elements exist on the same plane as radio buttons, pull-down menus,
command line calls and other elements of the standard human-computer
interface.
        Let us now discuss some of the elements of these three cultural
traditions -- cinema, the printed word and HCI -- to see how they are
shaping the language of cultural interfaces.

I. Printed Word

In the 1980's, as PC's and word processing software became
commonplace, text became the first cultural medium to be subjected to
digitization in a massive way. But already in the 1960's, two and a half
decades before the concept of digital media was born, researchers were
thinking about having the sum total of human written production --
books, encyclopedias, technical articles, works of fiction and so on --
available online (Ted Nelson's Xanadu project [4]).
        Text is unique among other media types. It plays a privileged
role in computer culture. On the one hand, it is one media type among
others. But, on the other hand, it is a meta-language of digital media, a
code in which all other media are represented: coordinates of 3-D
objects, pixel values of digital images, the formatting of a page in
HTML. It is also the primary means of communication between a
computer and a user: one types single line commands or runs
computer programs written in a subset of English; the other responds
by displaying error codes or text messages. [5]
        If a computer uses text as its meta-language, cultural interfaces in
their turn inherit the principles of text organization developed by
human civilization throughout its existence. One of these is a page: a
rectangular surface containing a limited amount of information,
designed to be accessed in some order, and having a particular
relationship to other pages. In its modern form, the page is born in the
first centuries of the Christian era when the clay tablets and papyrus
rolls are replaced by a codex - the collection of written pages stitched
together on one side.
        Cultural interfaces rely on our familiarity with the "page
interface" while also trying to stretch its definition to include new
concepts made possible by a computer. In 1984, Apple introduced a
graphical user interface which presented information in overlapping
windows stacked behind one another -- essentially, a set of book pages.
The user was given the ability to go back and forth between these pages,
as well as to scroll through individual pages. In this way, a traditional
page was redefined as a virtual page, a surface which can be much
larger than the limited surface of a computer screen. In 1987, Apple
shipped the popular HyperCard program, which extended the page concept
in new ways. Now the users were able to include multimedia elements
within the pages, as well as to establish links between pages regardless
of their ordering.  A few years later, designers of HTML stretched the
concept of a page even more by enabling the creation of distributed
documents, where different parts of a document are located on
different computers connected through the network. With this
development, a long process of gradual "virtualization" of the page
reached a new stage. Messages written on clay tablets, which were
almost indestructible, were replaced by ink on paper. Ink, in its turn,
was replaced by bits of computer memory, making characters on an
electronic screen. Finally, with HTML, which allows parts of a single
page to be located on different computers, the page became even more
fluid and unstable.
        The conceptual development of the page in digital media can
also be read in a different way - not as further development of a codex
form, but as a return to earlier forms such as the papyrus roll of ancient
Egypt, Greece and Rome. Scrolling through the contents of a computer
window or a World Wide Web page has more in common with
unrolling than with turning the pages of a modern book. In the case of the
Web of the 1990s, the similarity with a roll is even stronger because the
information is not available all at once, but arrives sequentially, top to
bottom, as though the roll is being unrolled.
        A good example of how cultural interfaces stretch the definition
of a page while mixing together its different historical forms is the Web
page designed in 1997 by the British design collective antirom for
HotWired RGB Gallery. [6] The designers have created a large surface
containing rectangular blocks of texts in different font sizes, arranged
without any apparent order. The user is invited to skip from one block
to another moving in any direction. Here, the different directions of
reading used in different cultures are combined together in a single
page.
        By the mid 1990's, Web pages included a variety of media types --
but they were still essentially pages. Different media elements -- graphics,
photographs, digital video, sound and 3-D worlds -- were embedded
within rectangular surfaces containing text. VRML evangelists wanted
to overturn this hierarchy by imagining a future in which the World
Wide Web is rendered as a giant 3-D space, with all the other media
types, including text, existing within it. [7] Given that the history of a
page stretches for thousands of years, I think it is unlikely that it would
disappear so quickly.
        While the 1990's cultural interfaces have retained the modern
page format, they also have come to rely on a new way of organizing
and accessing texts which has little precedent within the book tradition --
hyperlinking. We may be tempted to trace hyperlinking to earlier
forms and practices of non-sequential text organization, such as the
Torah's interpretations and footnotes, but it is actually fundamentally
different from them. Both the Torah's interpretations and footnotes
imply a master-slave relationship between one text and another. But in
the case of hyperlinking, no such relationship of hierarchy is assumed.
The two sources connected through hyperlinking have equal weight;
they exist on the same level of importance. Thus the acceptance of
hyperlinking in the 1980's can be read as a perfect reflection of
contemporary culture with its suspicion of all hierarchies, and its
aesthetics of collage where radically different sources are brought
together within a single cultural object ("post-modernism").
        Traditionally, texts encoded human knowledge and memory,
instructed, inspired, and seduced their readers to adopt new ideas, new
ways of interpreting the world, new ideologies. In short, the word was
always linked to the art of rhetoric. While it is probably possible to
invent a new rhetoric of hypermedia, which will use hyperlinking not
to distract the reader from the argument (as it is often the case today),
but instead to further convince him/her of the argument's validity, the
sheer existence and popularity of hyperlinking exemplifies the
continuing decline of the field of rhetoric in the modern era. Ancient
and Medieval scholars classified hundreds of different rhetorical
figures. In the middle of the twentieth century Roman Jakobson, under
the influence of the computer's binary logic, information theory and
cybernetics to which he was exposed at MIT, radically reduced rhetoric
to just two figures: metaphor and metonymy. [8] Finally, in the 1990's,
World Wide Web hyperlinking has privileged the single figure of
metonymy at the expense of all others. [9] Hyperlinking leads the reader
from one text to another, ad infinitum. Contrary to the popular image, in
which digital media collapses all human culture into a single giant
library (which implies the existence of some ordering system), or a
single giant book (which implies a narrative progression), it may be
more accurate to think of the resulting object as an infinite flat surface
composed of individual texts in no particular order -- as in the antirom
design for HotWired. Expanding this comparison further, we can note
that Random Access Memory, the concept behind the group's name,
also implies the lack of any hierarchy: any RAM location can be
accessed as quickly as any other. In contrast to the older storage media
of book, film, and magnetic tape, where data is organized sequentially
and linearly, thus suggesting the presence of a narrative or a rhetorical
trajectory, RAM "flattens" the data. Rather than seducing the user
through the careful arrangement of arguments and examples, points
and counterpoints, changing rhythms of presentation (i.e., the rate of
data streaming, to use contemporary language), simulated false paths
and orchestrated breakthroughs, cultural interfaces, like RAM itself,
bombard the users with all the data at once. [10]
        In the 1980's many critics described one of the key effects of
"post-modernism" as that of spatialization: privileging space over
time, flattening historical time, refusing grand narratives. Digital
media, which evolved during the same decade, accomplished this
spatialization quite literally. It replaced sequential storage with
random-access storage; hierarchical organization of information with a
flattened hypertext; psychological movement of narrative in novel and
cinema with physical movement through space, as witnessed by
endless computer animated fly-throughs or computer games such as
Myst and countless others. In short, time becomes a flat image or a
landscape, something to look at or navigate through. If there is a new
rhetoric or aesthetic which is possible here, it may have less to do with
the ordering of time by a writer or an orator, and more with spatial
wandering. The hypertext reader is like Robinson Crusoe, walking
through the sand and water, picking up a navigation journal, a rotten
fruit, an instrument whose purpose he does not know; leaving
imprints in the sand, which, like computer hyperlinks, follow from
one found object to another.

