Incom is the communication platform of Hochschule Anhalt, Department of Design

Artificial Intelligence in Visual Effects

Using artificial intelligence and machine learning to accelerate visual effects work and 3D animation


Artificial intelligence and machine learning already play a huge role in our everyday lives. The digital services we use rely heavily on AI, and our mobile devices are packed with technology that leverages machine learning, from content curation and biometric authentication to computational photography.

How are our jobs as designers going to change under the influence of these emerging technologies? Will AI take over most of what we do or will it be just a tool making our work more efficient?

In our project, we explored which design tools already utilise artificial intelligence and machine learning, with a focus on visual effects work and 3D animation. Using some of these tools, we created a short (and quite rubbishy) visual effects shot, along with a technical breakdown of how we made it. Technical breakdowns are usually extremely boring and highly specific, so we tried to make ours as entertaining as possible.

Make sure to take a look at the “More Tools” section as well for additional tools that did not make it into our video.

Watch the Video

AR Motion Tracking

Tracking the camera motion in live-action film footage has never been easier. The augmented reality capabilities of modern smartphones enable automatic motion tracking while filming instead of having to track the footage traditionally using markers.

This works incredibly well, even in situations that can be difficult to track the traditional way. However, manual correction may be required if the footage is very shaky and/or contains a lot of motion blur.

We used the app CamTrackAR (iPhone and iPad) by FXhome to record and simultaneously track our footage.

Human Digitisation

We experimented with a relatively new machine learning algorithm called PIFuHD (short for “Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization”) that reconstructs 3D geometry from a single photo of a person.

Running PIFuHD requires a fairly high-end computer. Fortunately, a demo is available on Google Colaboratory that leverages the power of Google’s cloud computers. However, the output resolution of the 3D geometry in the demo is limited. For higher resolution, we would have to run PIFuHD on a local machine. The Google Colaboratory output was sufficient for us though.

The process is pretty straightforward (there is also a video tutorial). All we had to do was upload photos of us, execute the predefined code on the cloud computer, and download the generated 3D models.

The results are very rough and not (yet) suited for close-up shots, but they are good enough for our purposes.

Animating with Mixamo

Animating 3D characters is usually an extremely time-consuming task. Before the actual animation process can begin, the character model needs to be rigged. A rig is essentially a simplified skeleton that moves and deforms the 3D mesh along with it. Once the character model is rigged, it can be animated using motion capture equipment or by hand. Manual character animation is very difficult to get right, and without a lot of practice the animated movements tend to look unnatural.
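
To make the idea of a rig a bit more concrete, here is a minimal 2D sketch of linear blend skinning, one common way for a skeleton to deform a mesh. It is a toy illustration only: the function names, the two-bone "arm", and the weights are assumptions made up for this example and are not taken from Mixamo or any other tool.

```python
import math


def rotate_about(point, pivot, angle_deg):
    """Rotate a 2D point around a pivot (the bone's joint) by angle_deg."""
    a = math.radians(angle_deg)
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    return (pivot[0] + dx * math.cos(a) - dy * math.sin(a),
            pivot[1] + dx * math.sin(a) + dy * math.cos(a))


def skin_vertex(vertex, bones, weights):
    """Blend the positions each bone would move the vertex to, by weight."""
    x = y = 0.0
    for (pivot, angle_deg), w in zip(bones, weights):
        px, py = rotate_about(vertex, pivot, angle_deg)
        x += w * px
        y += w * py
    return (x, y)


# Hypothetical arm: the upper arm stays put, the forearm bends 90 degrees.
bones = [((0.0, 0.0), 0.0),   # upper arm, joint at the shoulder
         ((1.0, 0.0), 90.0)]  # forearm, joint at the elbow

# A vertex at the hand follows the forearm only.
hand = skin_vertex((2.0, 0.0), bones, weights=[0.0, 1.0])
# A vertex near the elbow is pulled equally by both bones.
near_elbow = skin_vertex((1.2, 0.0), bones, weights=[0.5, 0.5])
print(hand, near_elbow)
```

Rigging a character largely means placing such joints and choosing such weights for every vertex of the mesh, which is why doing it by hand takes so long.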

Adobe Mixamo can help us with that. It offers automatic character rigging that uses machine learning under the hood. The results are surprisingly good, even for our not-so-clean character models. Mixamo also has a large library of professionally created, customisable animations, from walking and running to climbing and dancing.

Using Mixamo enables us to get from a plain 3D mesh to a fully rigged and animated character in about five minutes.

Image Denoising

To match the live-action footage as closely as possible, we decided to render the 3D elements using Cycles, Blender’s path-tracing engine. It essentially simulates millions of light rays bouncing around in the scene to create a photorealistic image; each simulated ray path is called a sample. Simulating more light rays results in an image with less noise, but it also increases the render time significantly.
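
The trade-off between samples and noise can be illustrated with a toy Monte Carlo estimate. This is not how Cycles is implemented: the sketch only estimates a single pixel's brightness by averaging random numbers, and the function names and the stand-in "light" distribution are assumptions made for illustration.

```python
import random
import statistics


def render_pixel(samples, rng):
    """Estimate a pixel's brightness by averaging random light samples.

    Each sample stands in for one traced light path; here it is simply a
    uniform random number, so the true brightness is 0.5.
    """
    return sum(rng.random() for _ in range(samples)) / samples


def pixel_noise(samples, pixels=2000, seed=1):
    """Spread (standard deviation) across pixels that should look identical."""
    rng = random.Random(seed)
    values = [render_pixel(samples, rng) for _ in range(pixels)]
    return statistics.pstdev(values)


noise_16 = pixel_noise(16)
noise_256 = pixel_noise(256)
print(f"noise at 16 samples:  {noise_16:.4f}")
print(f"noise at 256 samples: {noise_256:.4f}")
```

In this kind of estimate, noise falls roughly with the square root of the sample count, so 16 times as many samples (and about 16 times the work) only reduces the noise by a factor of about four. Brute-force sampling gets expensive quickly.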

There’s a solution for this issue though: denoising, and AI denoising in particular! Using the rendered colour plus some geometry information, it is capable of removing noise from rendered images and even from the live preview in the 3D application’s viewport. This allows us to get clean-looking results at a much lower sample count, decreasing the render time dramatically. It is also very easy to use: Intel’s Open Image Denoise is integrated into Blender and many other 3D applications and can be used without any complicated setup.

More Tools

We have also experimented with more tools enabled by artificial intelligence and machine learning that did not make it into our “breakdown” video. Some of them are in a very early stage and/or are complicated to set up, and some did not fit our specific use case. Nevertheless, they are fun to try out and may be useful in the future.

Flowframes for Video Frame Interpolation

This free app (Windows only) makes video frame interpolation extremely easy. It uses artificial intelligence to increase the frame rate of videos with little to no noticeable loss of quality. For example, you can create a smooth 24 fps video from a 12 fps stop-motion animation or use it to make slow-motion shots even more dramatic.

This could also be beneficial for 3D animation, because rendering only every second frame and interpolating the rest would cut the rendering time almost in half. Unfortunately, it does not work with transparency yet, which was crucial for our video.
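
The saving can be sketched with a little arithmetic. The example below also fills in a skipped frame with a plain linear blend of its neighbours, which is only a stand-in: Flowframes relies on trained interpolation models, not simple averaging. All names and the shot length are illustrative assumptions.

```python
def frames_to_render(total_frames):
    """Indices that must be rendered if every second frame is interpolated.

    The last frame is always rendered so the final gap has two neighbours.
    """
    if total_frames <= 0:
        return []
    rendered = list(range(0, total_frames, 2))
    if rendered[-1] != total_frames - 1:
        rendered.append(total_frames - 1)
    return rendered


def blend(frame_a, frame_b, t=0.5):
    """Naive in-between frame: per-pixel linear mix of two frames."""
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]


total = 240  # e.g. a 10-second shot at 24 fps
rendered = frames_to_render(total)
saving = 1 - len(rendered) / total
print(f"render {len(rendered)} of {total} frames ({saving:.0%} saved)")

# Frames as flat lists of grey values; the in-between is their midpoint.
middle = blend([0.0, 0.2, 1.0], [1.0, 0.4, 0.0])
print(middle)
```

A plain blend like this produces ghosting on anything that moves, which is exactly the problem that learned interpolation tries to avoid.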

TecoGAN for Video Upscaling

TecoGAN is a very impressive neural network that is capable of upscaling a video’s resolution and “guessing” the details that are not present in the source video.

This open-source app (Windows only) is a rudimentary implementation of TecoGAN and requires additional setup.

Speech Synthesis

This open-source demo makes it easy to do “voice cloning” or speech synthesis. It takes an audio recording (5 seconds or longer) of someone speaking and tries to synthesise the voice. You can then let this voice say anything you want using text-to-speech. The quality is not very good and not suitable for production, but you can get some interesting and also creepy results.

Monster Mash for Casual 3D Modelling and Animation

Monster Mash is a really fun approach to sketch-based 3D modelling and animation. It is very easy to use, so be sure to check it out!

While Monster Mash is not based on artificial intelligence, it is still very impressive and can make our work a little bit easier.