Even Disney is Investing in AI: A Look at Face Re-Aging for Visual Effects

Written by whatsai | Published 2022/12/24
Tech Story Tags: ai | disney | machine-learning | computer-vision | youtubers | hackernoon-top-story | artificial-intelligence | youtube-transcripts | web-monetization | hackernoon-es | hackernoon-hi | hackernoon-zh | hackernoon-vi | hackernoon-fr | hackernoon-pt | hackernoon-ja

TLDR: Re-aging a face in a picture is usually done by skilled artists using Photoshop or a similar tool to edit your pictures. Worse, in a video, they have to do this kind of manual editing for every frame! Just imagine the amount of work needed for that. Well, here's both a solution and a new problem to this situation... via the TL;DR App

Whether it be for fun in a Snapchat filter, for a movie, or even to remove a few wrinkles, we all have a use in mind for being able to change our age in a picture.
This is usually done by skilled artists using Photoshop or a similar tool to edit your pictures. Worse, in a video, they have to do this kind of manual editing for every frame! Just imagine the amount of work needed for that. Well, here's both a solution and a new problem to this situation...


Video Transcript

Whether it be for a fun Snapchat filter, for a movie, or even to remove a few wrinkles, we all have a use in mind for being able to change our age in a picture. This is usually done by skilled artists using Photoshop or a similar tool to edit your pictures. Worse, in a video, they have to do this kind of manual editing for every frame. Just imagine the amount of work needed for that. Well, here's both a solution and a new problem to this situation: Disney's most recent publication, FRAN, can do that automatically.

This is a big deal for the film industry, allowing you to instantly re-age someone for a whole movie at very little cost. However, it's a problem for artists, as it simultaneously cuts some job opportunities and helps them cut long and tedious work hours to focus on talent-related tasks. Something cool here is that they created a FRAN-based tool for artists to use and edit the results, making their work more efficient by focusing on improving the details rather than monotonously copy-pasting edits from one frame to another. I'd love to hear your thoughts about that in the comments below, or chat on our Discord community, Learn AI Together. But for this video, let's focus once more on the purely positive side of this work: the scientific progress they made in the digital re-aging of faces in video.

You've been seeing the results of this new FRAN algorithm, and I believe you can already agree on how amazing these results look. Just look at how much more realistic it looks compared to other state-of-the-art re-aging approaches, which contain many artifacts and fail to keep the person's identity the same. Plus, FRAN's approach does not require the faces to be centered, as these other approaches do, which makes it even more impressive.

What's even more incredible is how simple their approach is. First, FRAN unsurprisingly stands for Face Re-aging Network. This means the model is able to take a face and change how old the person looks, with consistency, realism, and high-resolution results across variable expressions, viewpoints, and lighting conditions. For movies, the actor's apparent age is usually changed by the production team using dedicated costumes, hairstyles, etc. to depict the intended age, and only the face is left for digital artists to edit frame by frame, which is where FRAN comes in, focusing strictly on the skin regions of the face. They also focus on adult ages, since movies already have efficient, different techniques for re-aging very young characters, whose whole bodies and face shapes are different and smaller in those cases.

But how can they take a face from any position and just change its appearance to add or remove a few dozen years? The main difficulty is that they have no ground truth for this task, meaning they cannot train an algorithm to replicate before-and-after pictures, since they don't have them: very few examples exist of the same person 20 or more years apart, captured from every angle. They need an approach different from conventional supervised learning, where you try to replicate the examples you already have in your dataset.

Typically, researchers tackle this problem using powerful models trained on generated fake faces of all ages. Though the results are pretty impressive, they mainly work on centered and frontal faces, due to the fake training faces generated for them. Thus, the results hardly generalize to real-world scenes, since they do not really keep the identity of the person: the model was not trained on the same person at different periods of time, but just on a variety of different people of different ages. And such static models can hardly produce realistic facial movements either, because they were trained on static images; they don't know about real-world mechanics, lighting changes, etc.

Their first contribution is tackling this gap in the number of images of the same person at different ages. Their goal here is to do the same thing as previous approaches, but with a small tweak: they will still be using generated fake faces, but they will build a dataset full of the same faces at different ages. So basically the same person, with the same background and same everything, except the age, to have the algorithm focus strictly on the face at different ages. They figured that even if these previous approaches do not generalize well to real-world and video scenes, they still understand the aging process really well, so they could be used to generate more images of the same person at different ages as a first step toward building a better dataset.

This step is done using a model called SAM, which can take a person's face that is perfectly centered and re-age it. It is only used to construct the set of before-and-after pictures for training their FRAN algorithm. This step is necessary since our algorithms are too dumb to generalize from a few examples as we humans do, and we cannot get nearly as many pictures of real faces with the same lighting, background, and clothing at different ages; they must be artificially generated.

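To make that more concrete, here is a minimal sketch of how such a paired dataset could be assembled. This is not Disney's code: `sample_synthetic_face` and `age_editing_model` are hypothetical placeholders standing in for a face generator and a pretrained age-editing model such as SAM, and the list of target ages is arbitrary.

```python
# Minimal sketch (not the paper's code): building a synthetic paired re-aging dataset.
# `sample_synthetic_face` and `age_editing_model` are hypothetical stand-ins for a
# face generator and a pretrained age-editing model (e.g., SAM).

TARGET_AGES = [20, 30, 40, 50, 60, 70, 80]  # arbitrary ages covered by the paired dataset

def build_paired_dataset(num_identities, sample_synthetic_face, age_editing_model):
    """Return training samples pairing the same synthetic face at two different ages."""
    dataset = []
    for _ in range(num_identities):
        face = sample_synthetic_face()  # one random, non-existent identity
        # Re-age the exact same face to every target age: same identity, same
        # background, same lighting -- only the apparent age changes.
        aged = {age: age_editing_model(face, target_age=age) for age in TARGET_AGES}
        # Store every (source age, target age) combination as a training example.
        for src_age in TARGET_AGES:
            for tgt_age in TARGET_AGES:
                if src_age == tgt_age:
                    continue
                dataset.append({
                    "input_image": aged[src_age],
                    "target_image": aged[tgt_age],
                    "input_age": src_age,
                    "output_age": tgt_age,
                })
    return dataset
```
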
Their second contribution is using this new set of images they created and training an algorithm able to replicate this process on real-world scenes, with good consistency across video frames. The algorithm they built is in fact quite simple and similar to most image-to-image translation algorithms you will find. They use a U-Net architecture, which takes an input age, an output age, and an image, and learns the best way to transform that image into a new one by encoding it into the most meaningful space possible and decoding it into the new image. So the network learns to take any image and map it into what we call a latent space, where we have our encodings. This latent space basically contains all the information the network has learned for its specific task, so basically the different facial features of this particular individual, but it does not contain information about the image background or other features that aren't necessary for re-aging.

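As a rough illustration of this kind of conditioning, here is a small PyTorch-style sketch. It is not the paper's architecture: the layer sizes are invented, skip connections are omitted for brevity, and the only idea it borrows from the paper is feeding the input and output ages to the network alongside the image, here as extra constant-valued channels.

```python
# Illustrative sketch only (not FRAN's actual architecture): a tiny U-Net-style
# encoder-decoder conditioned on input and output ages, which are normalized and
# broadcast as constant-valued planes concatenated with the RGB frame.
import torch
import torch.nn as nn

class TinyReagingNet(nn.Module):
    def __init__(self, base_channels=32):
        super().__init__()
        # 5 input channels: RGB + input-age plane + output-age plane.
        self.encoder = nn.Sequential(
            nn.Conv2d(5, base_channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(base_channels, base_channels * 2, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base_channels * 2, base_channels, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(base_channels, 3, 4, stride=2, padding=1),
        )

    def forward(self, rgb, input_age, output_age):
        # rgb: (B, 3, H, W) in [0, 1]; ages: (B,) tensors in years.
        b, _, h, w = rgb.shape
        in_plane = (input_age / 100.0).view(b, 1, 1, 1).expand(b, 1, h, w)
        out_plane = (output_age / 100.0).view(b, 1, 1, 1).expand(b, 1, h, w)
        x = torch.cat([rgb, in_plane, out_plane], dim=1)  # (B, 5, H, W)
        latent = self.encoder(x)     # compressed, task-specific representation
        return self.decoder(latent)  # 3-channel map the same size as the input frame
```
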
It then uses this information to predict some kind of re-aging mask. This mask contains only the parts that need to be edited in the picture to create the re-aging effect, making the task much more manageable than predicting the whole image all over again, and we simply merge this predicted mask with our initial image to get the re-aged face. This mask is the main reason why their approach is so much better at preserving the person's identity: they limit their network's field of action to the re-aging modifications only, not the whole image or even the whole face. When you can't make it more intelligent, just make it more specific.

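Continuing the sketch above, merging such a prediction back into the frame could be as simple as treating the network's output as an additive, per-pixel delta. The clamping and the additive formulation here are assumptions made for illustration, not details taken from the paper.

```python
# Continuing the illustrative sketch: treat the network output as an additive,
# per-pixel re-aging delta and merge it back into the original frame.
import torch

def reage_frame(model, rgb, input_age, output_age):
    """rgb: (B, 3, H, W) in [0, 1]; ages: (B,) tensors in years."""
    delta = model(rgb, input_age, output_age)  # non-zero only where aging shows
    return (rgb + delta).clamp(0.0, 1.0)       # untouched regions stay identical

# Example usage: re-age a single 256x256 frame from 35 to 65 years old.
model = TinyReagingNet()
frame = torch.rand(1, 3, 256, 256)
older = reage_frame(model, frame, torch.tensor([35.0]), torch.tensor([65.0]))
```
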
The model is trained following a GAN approach, which means, as we covered in many videos, that it uses another model, the one you see here on the right, called a discriminator. It is trained simultaneously and used to judge whether the generated re-aged image looks like the ones in the training dataset, basically rating the results to guide the training. And voilà, this is how FRAN helps you re-age a face to anywhere between 18 and 85 years old.

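To give a feel for what that adversarial training looks like, here is a schematic, generic GAN-style training step. The losses and their weights are placeholders rather than the paper's exact recipe; `generator` stands for any callable mapping (frame, input age, output age) to a re-aged frame, such as the `reage_frame` wrapper above, and `discriminator` for a network that outputs a realism score.

```python
# Schematic GAN training step (not the paper's exact losses or weights).
import torch
import torch.nn.functional as F

def training_step(generator, discriminator, g_opt, d_opt, batch):
    frame, target = batch["input_image"], batch["target_image"]
    in_age, out_age = batch["input_age"], batch["output_age"]

    # --- Discriminator step: real target frames vs. generated re-aged frames. ---
    with torch.no_grad():
        fake = generator(frame, in_age, out_age)
    d_real = discriminator(target)
    d_fake = discriminator(fake)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # --- Generator step: match the synthetic target and fool the discriminator. ---
    fake = generator(frame, in_age, out_age)
    score = discriminator(fake)
    recon_loss = F.l1_loss(fake, target)                    # pixel-wise match to the target
    adv_loss = F.binary_cross_entropy_with_logits(
        score, torch.ones_like(score))                      # "look real" term
    g_loss = recon_loss + 0.1 * adv_loss                    # weighting is a placeholder
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```
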
Of course, this was just a simple overview of this new Disney research publication, and I'd recommend reading their excellent paper for more information and an analysis of the results. If you are unfamiliar with GANs, I suggest watching the short introduction video I made about them. Thank you for watching, and I will see you next time with another amazing paper!

Written by whatsai | I explain Artificial Intelligence terms and news to non-experts.
Published by HackerNoon on 2022/12/24