Oh look it's me as a student of Spanish and French
back in the 1990s. Why am I choosing to lead with a snapshot from the last century? Well by the
time this photo was taken I had spent every day for the previous six years trying to get my tongue
around three non-English languages, and despite my thousands of hours of linguistic study the best
you could have expected of me overseas would be to buy a bus ticket without too much of a comedy accent.
But today, thanks to artificial intelligence, the 2024 version of me has become a miraculous
polyglot. I can master dozens of world languages, no effort required, no conjugation of verbs,
no lists of vocab. As an imaginary test case let's explore what AI's breakthroughs in
human language might mean for something as simple as a 10-second publicity video. Come to our
library, the university has world class teachers, a diverse student demographic and all the latest
technology! Well, obviously not a real video. For the purposes of demonstration I put together
that 10 second clip to stand in for some genuine publicity material. I know what you were thinking
as you were watching the video. You were thinking that it's going to need some finessing before it
hits the international film circuit, and you're right. True enough, in its current form it is of
limited interest outside of the English-speaking market, but that is okay because now I can use a
web-based AI tool to translate my video into any one of 29 languages. I've chosen HeyGen because
I've used it before and it's worked very well for me. I have uploaded my video exactly as it was and
it now becomes as complex as choosing my target language. The publicity for this site says they
do 40 plus languages I can only find 29 in this list but it's still better than I can speak
myself. So, I pick my destination language and I hit translate this video. It will
then take 5 or 10 minutes to generate and at that point I have myself a video dubbed into
a different language without needing to refilm and without needing to find a native
speaker to redub for me. So, how about we take a look at our test
video again, this time dubbed into Romanian. And now again in Hindi. That certainly shows that I can transform an existing
video without hiring a translator and a voice actor. So, now I have the option for parallel
versions of the video in any selection of 29 languages but perhaps the viewers might find a
voiceover video a bit impersonal. Maybe I want to target the audience better by putting a human
face up on screen. Maybe somebody to whom the demographic might better relate. Luckily, I can
do just that. I don't need to hire actors and I don't need to find native speakers. I can go
to one of several services that would allow me to create a 3D avatar that talks. I've gone here
to VidNoz because I've used it before. There are other sites available, but they charge. This one seems
to be free for most things. I will whiz through this, but basically I will create a video. I will
start with a blank template for demonstration purposes and I will pick my avatar. So, at
this point I get a series of talking faces that I can use to host my video.
So, let's arbitrarily pick one that I think will appeal to my demographic.
It's this guy here, and so I've chosen him and we now choose what he's going to say and
in what language. I can now choose again from a very wide range of languages. For
demonstration let's imagine this man is Catalan. I will of course need to create
myself a script in Catalan, but that's not difficult because I can paste it into
ChatGPT. Translate into Catalan and, my script, there we go. So, I have a translated script.
I can just copy that and stick it in here, and it really is as simple as that. There's all
sorts of sophisticated things I can do, such as adding images to the background or
adding a moving video to the background, but basically that's all I need to do to create
a talking head in a variety of languages. Let's take a look at the results, starting with a
version of our video optimized for the Japanese audience. And how about Zulu? All right the avatars are still a bit robotic and
Uncanny Valley, but the technology is moving along so quickly. How realistic do you think they'll
look in six months or a year? How soon until we can create an avatar that is undetectable? My guess
is, not very long. But let's imagine that I want to make my video more personal by using my own
voice, the human touch. Traditionally, I would have had to sit down and record the narration,
endlessly umming ahhing and fluffing my lines. A simple voiceover might have taken me ages to get
right. Thankfully in 2024 I can get AI to clone my voice. So, to create a clone of my own voice I
will come here to ElevenLabs who at the moment seem to be leading the world in voice cloning
and other forms of voice synthesis. It is a very straightforward process to create a copy of
my voice. I will go to voices, then create. I will add a generative or clone voice. I will choose instant
voice cloning here because it's the free option. I will then give it a name, a voice name that is.
I would hit the microphone and at that point be prompted to record a series of 25- to 30-second
samples of my voice, and I would then hit 'add voice'. ElevenLabs would take a few minutes
to generate it but at that point I would have a very versatile fully working copy of my own
voice that I could use. This is the cloned version of my voice. I think it's convincing
enough, certainly not bad based on a couple of minutes of me reading from Wikipedia. Maybe
I'm not dropping enough T's and there certainly aren't any ums and ahhs but a perfectly plausible
rendition. It certainly would let me record long voiceovers without fluffing my lines no matter
how long the text. I could create an audio of my narration and re-record it painlessly anytime
the content needed updating. All I would have to do would be copy and paste the script into
ElevenLabs and hit generate. Plus I can now use my clone voice to create audio in any one of 30
languages. Would you like to hear a few samples? So, using my clone voice I can now generate
29 voiceovers in different languages but maybe even that's not ambitious enough? Perhaps I
want to personalize this video to the max by having my own face talking on screen in a foreign
language. I could try redubbing the video with my multilingual cloned narrations but that would
definitely suffer from the same problem as any Bruce Lee film, namely that the voice would get
comically out of sync with the lip movements. But in these days of AI I'm sure we can do much
better than that. If we go back to HeyGen where we did some translations earlier this time I
can pick the video translate option. At this point I would upload a copy of the video with my
talking head speaking English. I would choose my target language which in this case is going to
be Mandarin and then I would click translate this video. At this point it can take quite a
while to run because I am not a paid subscriber so I have to wait behind everybody else. So,
for instance when I generated my last video there were about 2,000 people ahead of me in
the queue and it took a couple of hours, but it's still free. What HeyGen is going to do is
translate what I'm saying from English into Mandarin, clone my voice, create a Mandarin voice-over
of it and manipulate my lip movements to match up. And this is the same technology that
very recently allowed the Argentinian President at Davos to seemingly give a
speech in English. Let's take a look at the final result which should show a mediocre
linguist from the 1990s talking confidently in Mandarin. That Mandarin speaking finale wraps
it up from me. Artificial Intelligence will inevitably bring staggering paradigm shifts to
the traditional barriers of language. I don't expect any of us can imagine the ramifications
of a world in which everyone can potentially communicate with everyone else. I hope you
have been interested to watch my musings on the ways in which, with no
budget and very little effort, you can potentially repurpose any of your
presentations, documentation, or training materials to cater for an international
audience and please don't judge me on my videography!