I’ve managed to link my cloned avatar (using HeyGen) to my cloned voice (via ElevenLabs) so that it reads my blog post from November 2013, The benefits of walking (#170). The process was straightforward.
The content pertaining to Aristotle’s four causes is fairly dense, and for interested readers some material is easier to read than to listen to. I feel the content and the medium could be better matched. I’ll also work on a friendlier avatar voice, one with pauses and expression.
You might wonder what the benefits are of planting words I have written into the face and voice of a proxy of myself.
You can stop, start and replay a video. Thanks to smartphones, you can do other things while listening or watching. I assume audiences are receptive to information delivered via a human voice in ways that differ from how they absorb content through the printed word. (See post Voices without bodies.)
The presentation here is obviously fake, and ethical practice suggests that any synthetic elements of content or presentation be declared. So I think of voice and face cloning not as a way to fool viewers, but simply as a means of presentation that communicates differently, and in different contexts.
It is a costly process, especially in the extravagant processing time on the server side. It took close to one hour for the HeyGen platform to deliver this 9-minute video. As with other AI and data-intensive services, the costs amplify environmental pressures.
Here’s a reading of this post by Speechify.