Update

February 7th, 2010

First of all, I had to turn off comments, which were filled with spam. Web 2.0 indeed.

Second, work is being a pest and using up study time, so this is brief.

For somebody who has mocked MaxMSP so religiously over the years, it humbles me to admit I have no other reasonable option for my development process at this point: Jitter is the only tool that can process an image as an array of data, such that I can perform statistical analysis on video materials. There may be more complex methods but I am an arts student, and know my place.

Somewhere, Miller Puckette is laughing.

Actually I’ve been here before. When I was younger I discovered APL, which is A Programming Language. It’s both elegant and hellish in that it uses unique symbols for the code that look more like runes than any modern script.


At one point I had little stickers on the Amiga keyboard with these symbols. I bought the textbook and was scribbling what appeared to be fluent Martian on bits of paper. I guess it was around the time that most 20-year-olds start using MaxMSP.

APL treats a matrix as a single variable. So summing two images is like 1+1=2. Finding the overall hue of an image is a single line of code. Making a piece of music louder is a few keystrokes. A ‘Game of Life’ in one line.
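
To give a flavour of that in a present-day language, here is a minimal sketch in Python with NumPy standing in for APL. The tiny images, the sample values and the `life_step` name are all invented for illustration, and a plain mean stands in for 'overall hue'; it shows whole-array thinking and is not a transcription of any APL program.

```python
import numpy as np

# Two tiny 2x2 greyscale "images" held as whole arrays (values are made up).
a = np.array([[0.1, 0.2], [0.3, 0.4]])
b = np.array([[0.4, 0.3], [0.2, 0.1]])

total = a + b            # summing two images reads like 1 + 1
mean_level = a.mean()    # one statistic over the whole image in one call

# A few made-up audio samples; "louder" is a single multiplication.
samples = np.array([0.0, 0.25, -0.5])
louder = samples * 2


def life_step(grid):
    """One Game of Life generation on a wrap-around grid of 0s and 1s."""
    # Count the eight neighbours of every cell at once by shifting the grid.
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # A cell is alive next turn with 3 neighbours, or if alive now with 2.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)
```

Nothing here steps through individual pixels or samples by hand; the array is the unit of work, which is the appeal described above.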

APL was wonderful, but what killed me was there was no simple way to get that data out as an image or sound. APL was mute by malice.

Perhaps the disappointment of those days is why I am afraid to start on Max. Plus the fact that it tends to destroy the musicality of anyone who touches it.

What does a VJ do?

January 28th, 2010

To be honest that’s a concealed version of ‘what is art?’ and similarly likely to end in ruin. Thinking aloud: allow me a shallow (and biased) starting point in the hope that it will improve over time. In crafting a story about the work of a VJ we can start to see what my hypothetical machine will need to offer.

VJ is derived from DJ, or disc jockey, which implies that in the work they select from a library of existing sources (the ‘discs’) and sequence them into a programme. They do not create the sources within the work, but they may connect or overlay them into a montage; each juxtaposition sets up a relationship in the audience’s mind, and together these build a longer flow of impressions. The benchmark is an arc perhaps like that of a Beethoven symphony: questions, debates, tensions, resolution. (Side note – that’s my understanding of a symphony but it needs further study.)

The work depends on the library, which you would expect to align with interests or preoccupations of the VJ. For example my video library includes many clips of people staring quizzically at the camera or to one side – I know that when I play these one after the other I create tense ‘unspoken dialogue’ between these people. Another VJ might have many similar shapes – balloons, apples, doorknobs – playing with the correspondence between all these unrelated objects. The clips are collected with a preconception of how they will connect (like LEGO pieces) – that’s important I think. The machine must assist in this intention.

The VJ then plans a pathway through the imagery, or moves from place to place according to intuition, the way that people tell oral history. I tend to plan carefully; others are bolder. But I think ‘a pathway’ is on the right track* – the clips are arranged in a space and the VJ travels through it, twisting and turning in performance. How the clips are arranged in the space is a key puzzle in this whole project.

*sorry.


I have in mind an example from psychology. The Myers-Briggs typology model is a systematization of personality types. It has four main axes: Extraversion – Introversion, Sensing – Intuition, Thinking – Feeling, Judging – Perceiving. According to proponents a personality type can be located somewhere in this 4D space via questionnaire and thus summarized in a set of figures. For opponents this kind of ‘psychometrics’ is no better than astrology, or a nasty attempt to quantify the mind. Nevertheless it’s an interesting model for the machinery I’m designing, which will always be a subset of human ability. Are there a small number of vectors that could be used to arrange the clips used in a VJ work? Is it possible to take the many arbitrary tags and groups in use and condense them into a systematic, navigable space?
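
As a thought experiment, here is what a library arranged on a handful of axes might look like. This is only a sketch in Python: the clip names, the four axes and every score are made up, and straight-line distance is just one assumed way of deciding which clips sit 'near' each other.

```python
import math

# Hypothetical clips scored 0..1 on four made-up axes:
# (energy, warmth, faces, motion). Names and numbers are illustrative only.
clips = {
    "stare_left":  (0.2, 0.4, 0.9, 0.1),
    "stare_right": (0.3, 0.4, 0.9, 0.2),
    "balloons":    (0.6, 0.8, 0.0, 0.5),
    "traffic":     (0.9, 0.2, 0.1, 0.9),
}


def nearest(name, library):
    """Return the other clip names, closest first, by Euclidean distance."""
    origin = library[name]
    others = [n for n in library if n != name]
    return sorted(others, key=lambda n: math.dist(origin, library[n]))


print(nearest("stare_left", clips))
```

With these invented scores the two staring clips land next to each other, which is the kind of neighbourhood a VJ would want to find without tagging it by hand. Whether four axes, or any small number, are enough is exactly the open question.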

Microsoft Research uses “face, directionality, energy, edge, color, and spatial distribution” in their Show Similar Image technology for Bing. Without yet having read the publications, it’s plausible that these represent 6 axes that worked best for their searches. But every different artistic work might require some very different dimensions.

The path would represent an envelope of excitation – perhaps slow at first with rising intensity – or it could move between extremes as a contrast. Considering T Visionarium (which is going to be a constant comparison) the notable problem for me was that the source library had no particular preoccupation and there was no pathway drawn through the material. It took a day of television and blew it apart, but had no particular investment in putting it back together – scenes of a feather just flocked together. The path should be my focus.

I can imagine an interface where the artist chooses the vectors on which the sources are arranged (or magically these are derived from the clips) and then takes a baton and conducts a path through this space. The work can then be scored, refined and reproduced by others. This of course may be no better than astrology :-)
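
A rough sketch of how such a conducted path might be written down and replayed, in Python. Everything in it is an assumption for illustration: the two axes, the clip positions, the waypoints and the `perform` name are invented, and picking the nearest clip at each step is only one possible rule.

```python
import math

# Hypothetical clips placed on two made-up axes: (intensity, warmth).
clips = {
    "fog":      (0.1, 0.3),
    "faces":    (0.3, 0.6),
    "balloons": (0.6, 0.8),
    "traffic":  (0.95, 0.15),
}

# The "score": waypoints that the conductor's baton passes through.
path = [(0.0, 0.3), (0.5, 0.8), (1.0, 0.2)]


def interpolate(a, b, steps):
    """Evenly spaced points from a to b, including both ends."""
    return [
        tuple(x + (y - x) * i / steps for x, y in zip(a, b))
        for i in range(steps + 1)
    ]


def perform(path, library, steps=4):
    """Walk the path and list the nearest clip at each step, skipping repeats."""
    sequence = []
    for start, end in zip(path, path[1:]):
        for point in interpolate(start, end, steps):
            clip = min(library, key=lambda n: math.dist(point, library[n]))
            if not sequence or sequence[-1] != clip:
                sequence.append(clip)
    return sequence


print(perform(path, clips))
```

Because the path is just a list of points, it can be stored, edited and handed to someone else with a different library, which is what being scored and reproduced by others would amount to here.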

The Genesis of the idea.

January 26th, 2010

In 2006 I was approached by John Jacobs of ABC Radio fame. He had seen a VJ performance I’d done recently and thought I might offer some help to a project he was planning called Umami. The idea (expressed in my own words) went like this: Umami would watch the news feeds coming into ABC-TV all day and then condense them into the most interesting parts. Then late at night it would recall the images it had stored during the day as an ambient abstract flow that would be entertainment for late night viewers – something to have playing in the background along with the music and conversation that you may already have going. It would acknowledge and power the night-owl channel surfing that TV has offered for decades.

Now I am aware that there are quite a few bits in there that raise questions, and we will come back to them all, I promise.

John would be in charge of the idea and I would help out on the technical side of it. He wanted to get ABC-TV interested and hoped that we could get a demo going in a reasonable time frame. While John is considered a national treasure by some at the broadcaster, there was much doubt about the project, and the more I looked into it, the more doubt I had myself.

Then came The Pool, which is an ABC project to address shared media. Rather than my describing it here, you can see it all at the link. The thought came that rather than Umami watching the airwaves it could sit at the edge of the Pool and work with the media there, which would be already tagged and permitted and so on. I sat in on a few of the initial meetings at the ABC as an ‘interested artist’, but Pool had enough things to figure out without the added burden of our schemes, and I fell out of that loop. John moved down to Melbourne and I had to find a new job, and so things moved on.

But I knew that Umami needed some serious attention if it was going to ever happen and some of that involved the kind of research and development that I was not trained to do. On the advice of Norie Neumark I went back to university and undertook an undergraduate thesis, and despite feeling like the worst kind of idiot the whole time I managed to earn entry and a scholarship (which might come in handy some day). So now, four years later, I can begin.

Some of the issues that face this project.

Thinking back to the original idea, there are some words that gloss over processes that need a strong definition. By what rules does the system ‘condense’ the material – that is, by what rules does it delete or ignore? What do we mean by ‘interesting’ and how can a machine decide what is interesting or not? When it recalls the material by what rules does it arrange the work and what decides the abstraction and filters applied to the vision? All of this assumes that we can automate the business of the film editor or VJ, which assumes that we can define that in machine terms. I think that this is very uncertain.

To my mind Umami is a near fit for Freud’s concept of ‘dream work’. It takes material from the day’s experiences, condenses it according to its value in expressing unconscious themes and arranges it in a puzzle that slips by a censor in a late night recall – seemingly meaningless but heavy with unconscious meaning. Although Freud meant this to be an accurate description of a brain mechanism, it is a pseudoscience, more poetry than the basis for a real-world device. Nevertheless the concept of the ‘dream work’ is for me the only interesting aspect of recycled art. Taking sections of existing media and reassembling them to imply a new thought is a kind of alchemy that turns video lead into gold. Alchemy, I am convinced by Duchamp, is the basis of art.

Diverging from the Umami idea.

Rather than make the machine act like a human, let’s move the other way. We assume a human operator and translate their (neurotic?) wishes into machine terms. An illustration: PostScript is a computer language but it’s usually created via a high level tool such as Illustrator; very rarely does a human write the actual code. MIDI is a serial stream of bytes, usually written out in hexadecimal, but generally it is created as notes in a composing tool such as Pro Tools. We describe what we want to see or hear in human terms and the machine handles the translation. To answer one question above – ‘interesting’ is a decision made by a composer, and passed to the computer as guidelines for action.
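
The MIDI half of that illustration can be made concrete with a small Python sketch. The function names are invented for this example, and it assumes the common convention that middle C is written C4 and numbered 60 (some tools call the same note C3). The message layout itself is standard: a Note On is a status byte of 0x90 plus the channel, followed by the note number and the velocity.

```python
# Semitone offsets of the natural notes within an octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_number(name):
    """Convert a name such as 'C4' or 'F#3' to a MIDI note number.

    Assumes the convention that C4 (middle C) is note 60.
    """
    letter, rest = name[0].upper(), name[1:]
    offset = NOTE_OFFSETS[letter]
    if rest.startswith("#"):
        offset, rest = offset + 1, rest[1:]
    elif rest.startswith("b"):
        offset, rest = offset - 1, rest[1:]
    return 12 * (int(rest) + 1) + offset


def note_on(name, velocity=64, channel=0):
    """Build the three bytes of a MIDI Note On message."""
    return bytes([0x90 | channel, note_number(name), velocity])


print(note_on("C4").hex(" "))
```

The composer thinks ‘middle C, medium loud’; the wire carries three bytes. The translation in between is the part nobody wants to do by hand.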

Next question – how can we capture wishes as the kind of data a computer can parse? My idea is this: we need to survey how people do their work now, then form a pseudocode from that. Simple example – what really constitutes Picasso’s ‘blue period’? When Picasso grabbed the Cyan control and raised it, what was the link with his mind? This pseudocode would need to be different for each conductor. So the machine will present an abstraction layer into which the conductor’s meaning can be mapped. For Picasso it would be a surface that presented meaningful terms, and for the machine a set of parameters that describe activity.

That is a very bald overview of the idea, and the next post will dig more thoroughly into the actual plan of action.