Weekly Report 12

What I Made This Week

This week I made a lot of good progress across the different installation projects, and I secured a space to actually install them: Gallery 181 in the design building, next to Carly's work, is now reserved! I added a great deal of interactivity to the Emergent Garden and made solid progress on two somewhat smaller works that could be included as interactive, introspective experiences with AI, with working titles of "How Original?" and "The Single Greatest Piece of Art That Has Ever, Can Ever, or Will Ever be Made" (a bit of a pretentious title, but I'll add more context below on what it actually means and entails).

Emergent Garden

Like I said, I added much more interactivity to the Emergent Garden, combining aspects of the drawing tool I made a while back with a gesture system that lets the user draw shapes that are then interpreted as different flowers. The main "brush" is a simple circle with a feedback effect that keeps previous frames visible, positioned at the midpoint between the user's index finger and thumb. By moving their hand around they can draw shapes on the canvas, and by pinching their fingers together or apart they can change the size of the brush. To change the color, the user gives a thumbs up to the camera, which cycles the brush color through red, purple, blue, green, orange, and white. Finally, the user can clear the entire canvas by pointing a finger upward, so they don't have to worry about erasing every single shape they've made. I added pictogram instructions that stay fixed on screen over the camera feed, and the shapes users draw are slightly transparent so they can still see their hands and themselves as they draw.
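
For reference, here is a minimal sketch of the brush mechanics described above, assuming MediaPipe Hands and OpenCV. The keyboard stand-ins for the thumbs-up and point-up gestures, the color list, and the blend weights are illustrative placeholders rather than the actual installation code.

```python
# Sketch: a circular brush anchored at the thumb/index midpoint, with pinch distance
# controlling brush size and a persistent canvas blended semi-transparently over the feed.
import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7)

# Approximate BGR values for the cycle: red, purple, blue, green, orange, white.
COLORS = [(0, 0, 255), (255, 0, 255), (255, 0, 0), (0, 255, 0), (0, 165, 255), (255, 255, 255)]
color_idx = 0

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
canvas = np.zeros_like(frame)  # persistent layer so previous strokes stay visible

while ok:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        h, w = frame.shape[:2]
        thumb = np.array([lm[4].x * w, lm[4].y * h])   # thumb tip
        index = np.array([lm[8].x * w, lm[8].y * h])   # index fingertip
        cx, cy = (thumb + index) / 2                   # brush sits at the midpoint
        radius = max(int(np.linalg.norm(thumb - index) / 2), 2)  # pinch controls size
        cv2.circle(canvas, (int(cx), int(cy)), radius, COLORS[color_idx], -1)

    # Blend strokes over the live feed so users still see their hands and themselves.
    out = cv2.addWeighted(frame, 1.0, canvas, 0.6, 0)
    cv2.imshow("Emergent Garden (sketch)", out)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("c"):        # stand-in for the thumbs-up gesture: cycle brush color
        color_idx = (color_idx + 1) % len(COLORS)
    elif key == ord("x"):      # stand-in for the point-up gesture: clear the canvas
        canvas[:] = 0
    elif key == ord("q"):
        break

cap.release()
```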
Right now the motion tracking accounts for only one hand, but soon I'd like to add tracking for more than one hand to support a more collaborative kind of interaction. Another issue is that the camera feed is at a 16:9 resolution while the AI can only output a 1:1 square, which means part of the drawing canvas (the camera feed) is currently hidden behind the AI canvas. It's not much of a usability issue, but some of the drawing can end up under the AI canvas, producing shapes that don't correspond to what the user actually sees. The catch-all erase function partly works around this, but it's an area I'd like to improve; more importantly, it limits how much shape exploration is possible, since a wider canvas would allow more variation over time and across users. That said, it is a lower priority than the collaborative functionality I'd like to build.

I've also begun setting up a system for changing the AI prompts, but I'm still undecided whether I want a MIDI pad to control it, or some other option such as cycling slowly over time between flowers, insects, and fauna, with the Emergent [title] changing to let users know what their shapes are currently being interpreted as (a rough sketch of the cycle-over-time option is below). I'd also like to make the small lines of code overlaid on the AI canvas dynamic and changing, to further emphasize the collaboration between the user and the system.

Now that I have the gallery space, I can start thinking more about presentation options. Right now the two feeds are juxtaposed, which I think works well, but I can consider alternatives to a TV monitor in the space, such as projecting the AI output onto the wall next to some posters I make, or projecting the entire feed. Something to think about, and the projection could be incorporated with another new project I've made good progress on.
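
As a rough sketch of that cycle-over-time option: the prompt theme could rotate on a fixed interval while the on-screen title updates to match. The interval, the placeholder titles, and the prompt templates below are assumptions for illustration, not settled design decisions.

```python
# Sketch: rotate the diffusion prompt between flowers, insects, and fauna over time,
# returning a matching display title for the overlay.
import time

THEMES = [
    ("Emergent Garden", "a field of blooming flowers, soft natural light"),
    ("Emergent Swarm", "iridescent insects in flight, macro detail"),
    ("Emergent Fauna", "wild animals moving through tall grass at dusk"),
]
CYCLE_SECONDS = 180  # how long each theme holds before rotating (placeholder)

def current_theme(start_time: float) -> tuple[str, str]:
    """Return (display_title, diffusion_prompt) for the current point in the cycle."""
    elapsed = time.time() - start_time
    idx = int(elapsed // CYCLE_SECONDS) % len(THEMES)
    return THEMES[idx]

# Poll once per frame: render `title` over the camera feed, send `prompt` to the model.
start = time.time()
title, prompt = current_theme(start)
```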

How Original?

This is a new project that stems from AI image databases. These databases comprise billions of images scraped from the internet (often without permission) to be used in training. For an image generator to be able to handle any given prompt, the images cover all sorts of ground: from art to photography, from nature and industry to animals and people. Across all that coverage, one thing unites every image in the database: the human behind it. Even for a picture of a tree deep in the Amazon rainforest, or of a galaxy billions of light years away, the underlying factor that makes the image possible to view and interpret in the first place is human ingenuity, whether from the person who took the picture or from the years of collaboration and accumulated knowledge behind the image-making devices themselves. This is true of AI as well: the technology behind these processes has taken years of hard human work to achieve, and the training images even more, spanning from the present day back to the beginning of documented history.

The way these images are collected and repurposed has left some people perturbed by the process, and maybe rightfully so, since permission is rarely given for an image to be fed into a model. However, taking from one source and repurposing it into something new is, in fact, not new. From Renaissance ideas of antiquity and humanism inspired by ancient Greek philosophy, to the architecture of U.S. government buildings directly modeled on ancient Roman architecture, our inventions and technologies are seldom, if ever, truly invented from scratch; they borrow ideas from the past. Insert Virgil Abloh's 3% rule: to make something culturally significant, you only need to modify 3% of something that already exists. Machines and AI have a unique way of interpreting the preexisting images and data fed into them, "remembering" the past differently than we do. In the case of gesture and body tracking, a machine trained on millions of human bodies and gestures across history is much better at measuring distances between points and vectors than it is at actually understanding historical significance or emotion. This is not necessarily a flaw, but more a confrontation with the limits of AI and machine logic, and the irreducible complexity of human expression.

This preamble leads to the "How Original?" project. The idea is to match a live gesture from a viewer with a historical image of a person making a nearly identical gesture. In this instance, AI, often framed as a tool of the future, is repurposed as a medium for engaging with our collective past (an interaction with time, and a focus of the third project discussed in this post).
Currently, the pictures come from open-source training sets used in machine vision, and they don't match particularly well with users' live gestures. This stand-in database was mostly a means to get the backend logic working: mapping the joints in a photo and roughly matching them to the live stream. The good news is that the hard coding work is mostly complete; what's left is gathering a database of open-source historical photos (I'm targeting anything pre-1960 so that the gap between the "now" and the "then" is more drastic). The bad news is that this will likely be tedious and manual, since I have to find pictures that are of decent resolution and show a person's full body, and I'll need enough of them to cover the full range of possible gestures. I'm not targeting complete similarity between poses; I think a similarity score of around 0.7 would do the trick in getting the idea across (a sketch of the matching logic is below). The project is about viewing history as a living collage and peeling back layers of time, where our technologies and our gestures are both individual and collective, and where repetitions of ideas are not coincidence or con-work but a continuity of humanity and our stubborn refusal to let go of the past. The idea of working with AI through time is a strong basis for the last project I've put a little work into: "The Single Greatest Piece of Art That Has Ever, Can Ever, or Will Ever be Made".
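
Here is that sketch of the matching logic under some assumptions: each archive photo has precomputed 2D joint positions, the live pose is normalized the same way, and the closest photo above a roughly 0.7 cosine-similarity score wins. The normalization and scoring shown are one reasonable approach rather than the exact backend.

```python
# Sketch: compare a live pose against precomputed joint maps of historical photos.
import numpy as np

def normalize_pose(keypoints: np.ndarray) -> np.ndarray:
    """Center an (N, 2) array of joints on its mean and scale to unit size, so position
    in the frame and body size don't dominate the comparison."""
    centered = keypoints - keypoints.mean(axis=0)
    scale = np.linalg.norm(centered)
    return centered / scale if scale > 0 else centered

def pose_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two normalized poses (1.0 means identical)."""
    a, b = normalize_pose(a).ravel(), normalize_pose(b).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(live_pose: np.ndarray, archive: dict[str, np.ndarray], threshold: float = 0.7):
    """Return (photo_name, score) for the closest historical pose above the threshold,
    or None if nothing in the archive is similar enough."""
    scored = [(name, pose_similarity(live_pose, joints)) for name, joints in archive.items()]
    name, score = max(scored, key=lambda item: item[1])
    return (name, score) if score >= threshold else None
```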

The Single Greatest Piece of Art That Has Ever, Can Ever, or Will Ever be Made

Spoiler alert: it's not "The Single Greatest Piece of Art That Has Ever, Can Ever, or Will Ever be Made". However, if you tell an AI image model to create "The Single Greatest Piece of Art That Has Ever, Can Ever, or Will Ever be Made", it will generate an image of "The Single Greatest Piece of Art That Has Ever, Can Ever, or Will Ever be Made". But as you'd guess, what it generates is more likely to be "The Single Greatest Piece of Meh Confined to a Square Canvas".

Similarly, in the hype cycle AI is often heralded as the technology that will change everyone's lives for the better and make work easier, but has it actually? Or has it just increased CEOs' profit margins and produced funny slop memes on TikTok? In the hate cycle, people say AI will take common folks' jobs, ruin the environment, or that the bubble will burst soon, the stock market will fall, a depression will commence, and the end of the world will soon follow. But how much is AI impacting the environment compared to things like the oil leaked from massive shipping vessels, or the already existing data centers that house the servers for things like Google and Instagram, which people don't seem to have as much of a problem with? Will the bubble actually burst, or will the progress of AI just lead to massive expansion? A common endpoint for both sides is the idea of a singularity looming over our heads that, in either case, would cause massive upheaval in everyone's lives: maybe not good, maybe not bad, but certainly massive change. These unknowns and uncertainties can only be resolved with time.

Time is a funny thing when it comes to AI. As in the previous section, AI connects us to our collective past and our potential future. In the case of image or video generation, AI produces near-instantly what could take a traditional artist weeks or months. What if we subverted this time-to-generate, lengthening it from almost instantaneous to hours, weeks, years, or even centuries?

Using time as a supplementary medium is not an entirely new concept (repetition through time again). The pitch drop experiment is the world's longest-running "experiment": it watches pitch (a highly viscous liquid comparable to tar) drip from a funnel into a beaker. Because the pitch is so viscous, the time between individual drops is anywhere from 8 to 14 years. This has sensationalized it in a way, with a constant live stream that fills up with viewers whenever a drop is about to... drop. Given how much pitch remains and how infrequently it drips, the experiment is expected to keep going for at least another hundred years before it runs out, with only 9 drops so far. On the more artistic side, a John Malkovich movie called 100 Years will fittingly be released in 2115, 100 years after it was made and likely after everyone involved in making it has passed. Lastly, the project I find most interesting, and the one on the longest time scale, is the Zeitpyramide (time pyramid): a sculptural art project in which 120 concrete blocks will be assembled into a pyramid over the course of almost 1,200 years, with one new block added every 10 years. It started in 1993, and as of 2023 there are only 4 blocks in the structure; on the current schedule, the project will be complete in the year 3183. If you include evolution as a creative act, then we're talking about a time frame of tens of millions of years, and if you include the natural beauties of the world like the Grand Canyon, then billions of years of rock, magma, and water crushing and grinding together. Beautiful spiral galaxies on the edges of the known universe took even longer.

All of this is to say that our perception of time and the unveiling of the unknown can often be disappointing, and our expectations should be kept in check. The pitch drop experiment will keep dripping, John Malkovich's movie was sponsored by a cognac brand that ages its product for 100 years, we already have visualizations of the time pyramid's completion in the form of diagrams and pitch concepts, and evolution and star formation act on such long time scales that they may as well be static to us (as some actually believe). In the case of AI and "The Single Greatest Piece of Art That Has Ever, Can Ever, or Will Ever be Made", it's important to keep our wits about us and not let the hype/hate get to us. The promise of "The Single Greatest Piece of Art That Has Ever, Can Ever, or Will Ever be Made", like AI's benefits and consequences, is manipulated through time and language. Big words and high-fidelity tech demos only show a piece of the picture; just as in this project, the AI denoising of a timeless, incontestable "masterpiece" slowly shows progress over time, but the final outcome is often disappointing compared to the buildup of hype we can be trapped in.
As it stands right now, I have a ComfyUI setup where I can extend the denoising steps up to a point, which pushes the generation of an image from a couple of seconds to a couple of minutes. If I actually wanted to keep the AI going for longer than a day, my computer and GPU would probably burst into flames, so I need to think about ways to work around this limitation to achieve the intended effect, and to introduce more design elements that push the concept of a prolonged process and the manipulation of time and hype further.
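
One possible workaround, sketched here with the diffusers library as a stand-in for the ComfyUI graph: rather than keeping the GPU saturated for days, insert a pause between denoising steps so a finite number of steps stretches across a chosen wall-clock duration. The model id, step count, and target duration below are placeholders, and the per-step callback assumes a recent diffusers version.

```python
# Sketch: stretch one image's generation across hours by sleeping between denoising steps,
# so the GPU idles most of the time instead of running flat-out.
import time
import torch
from diffusers import StableDiffusionPipeline

TARGET_DURATION_S = 8 * 60 * 60          # spread one image over ~8 hours (illustrative)
NUM_STEPS = 200                          # more steps than usual, but still finite
PAUSE_PER_STEP = TARGET_DURATION_S / NUM_STEPS

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",  # placeholder model id
    torch_dtype=torch.float16,
).to("cuda")

def slow_down(pipe, step, timestep, callback_kwargs):
    # Sleep between steps; intermediate latents could also be decoded and shown here
    # to display the "masterpiece" slowly resolving over time.
    time.sleep(PAUSE_PER_STEP)
    return callback_kwargs

image = pipe(
    "a maximally creative piece of art that has no equivalent in any medium",
    num_inference_steps=NUM_STEPS,
    callback_on_step_end=slow_down,
).images[0]
image.save("greatest_art_so_far.png")
```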

I think it would be a bit of a disservice to the concept if I knew at all what the AI would generate over a conceptual 100 years, but in setting up the system and getting a prompt going, I had to generate at least a couple of images. With a massive prompt along the lines of "create a maximally creative piece of art that has no equivalent using any medium", the first image it generated had, I thought, a bit of poetic irony to it, though to be fair I made all of that meaning myself.
It's interesting that the seemingly maximally creative, singular best piece of art ever created by an AI would be of a dark, cube-like shape... a black box, if you will, burst open to reveal the visage of a human head. As with all great art, I'll leave it up to audience interpretation, since again the AI image itself has no inherent meaning other than what we make of it.

I think the concept is potentially strong as it is, but selling the execution as a thesis project leaves a bit more to be desired. If I can frame it correctly as an installation piece, as another interaction through time, this one focused through the lens of the hype/hate cycle, it could work a bit better.

Where the Next Steps are Leading

Evidently, reading has been sidelined this week to crunch on these projects, which I will make a strong push to flesh out as much as possible both before the semester ends and before my committee meeting. Beyond continuing these projects, I now have the Muse 2 brainwave scanner and need to build a project around it. The scanner is a little finicky and not the most consistent, so I will have to build accordingly to make it as usable as possible given its limitations. Combining it with more reliable face tracking and the emotive-reactive project makes more sense this way: the face tracking can drive the dynamism of the project for a user to interact with, while the brainwave scanner can act more as something that wakes the project from an idle state, rather than relying heavily on the brainwave data it produces.

Alongside the progress on the installations, I've also had the chance to conduct 3 interviews with AI practitioners to get some insight into how they understand AI. I'll need to lay those out and decode some common themes, and maybe reach out to more practitioners in the coming weeks; since the interviews are low-intensity, I could conduct many more depending on who responds. Lastly, workshop planning will take more precedence as I push forward with these projects, and finalizing it will likely be a goal of mine through winter break.

