Microsoft’s new voice-simulating AI, VALL-E, presents both opportunities and pitfalls

Microsoft recently announced that it has developed a new artificial intelligence that can simulate a person’s voice after listening to just three seconds of an audio recording. VALL-E is a neural codec language model. According to the company’s paper, the AI encodes speech into discrete codes and then uses those codes to generate waveforms that sound like the speaker, even preserving the speaker’s timbre and emotional tone.

Fortunately, Microsoft’s principles of responsible AI have led the company to withhold the AI’s code. Clearly, there is potential for unethical uses of this technology. Potentially nefarious uses range from bypassing audio biometric locks, to creating realistic deepfakes, to generally causing mayhem and distress.

Consider a low-tech audio spoof: In the UK, two on-air Australian radio personalities tricked a hospital caring for Kate Middleton into believing that the Queen and then Prince Charles had called to speak to the Duchess. The nurse who took the call committed suicide soon after. Notably, aside from social and professional ostracism, the two broadcasters never faced any criminal or civil charges.

In addition to the concerns above, there may also be the issue of widespread infringement of a person’s right of publicity, which is a form of intellectual property.

In 2004, the Israeli Supreme Court in Alonel v. McDonald recognized the right of publicity outside the scope of privacy laws. This right provides a form of ownership and control over one’s image, name, and voice. Later, in 2016, the right was expanded in a lawsuit against two Israeli companies, Beverly Hills Fashion and Ha-Mashbir, which were allegedly using the artist Salvador Dali’s name for commercial purposes (the case of Fundacio Gala Salvador Dali v. VS Marketing). Under this ruling, an individual’s right to their likeness and other attributes was expanded and held to be a transferable right that, like other intellectual property rights, continues for years after death.

The Israeli line of cases protects one’s voice and likeness. But what about VALL-E’s ability to imitate that voice? Is this also an infringement of the right of publicity?

There are two major US cases in this area. In a 1992 ruling, singer Tom Waits, known for his distinctive voice described as “like how you would sound if you drank a quart of bourbon, smoked a pack of cigarettes and swallowed a packet of razors… late at night, after not sleeping for three days,” successfully sued snack company Frito-Lay for $2.5 million for using a Tom Waits impersonator in a Doritos commercial.

In an earlier 1988 ruling, the Ninth Circuit Court similarly found that a commercial using an actor with a voice that sounds like Bette Midler violated Midler’s rights under California law. According to the ruling: “Where the distinctive voice of a professional singer is widely known and deliberately imitated in order to sell a product, the sellers have appropriated what is not theirs and committed a California tort.”

To wit, under California law: “Any person who knowingly uses the name or voice of another person…for advertising or selling purposes…without that person’s prior consent…is liable for any damages to the person or persons injured as a result.”

In both cases, while the courts were concerned with protecting consumers from deceptive practices and false advertising, they also found that where a voice is “sufficient evidence of a celebrity’s identity, the right of publicity protects against its imitation for commercial purposes without the celebrity’s participation and consent.” According to this line of decisions, the non-consensual use of Microsoft’s artificial intelligence to imitate a person’s voice, especially the voice of a celebrity for commercial purposes, could be an infringement of the right of publicity.

Likewise, in France, the right to one’s image extends to one’s voice, even for people who are not famous and apparently regardless of any commercial use.

However, despite these and other jurisdictions providing some rights over one’s voice, there is no shortage of comedians who successfully mimic the voices of famous personalities, even building their careers on these imitation skills, all seemingly without legal consequences.

Take, for example, the comics on Eretz Nehederet or Saturday Night Live who clearly benefit from such voice impersonation. If these shows can make their living off of someone else’s voice, then maybe VALL-E can also impersonate other people’s voices for fun and even for profit?

Or maybe not. It seems that a distinction must be made between the narrow aims of comedic impressions and the use of my voice or your voice for all other purposes. Perhaps a comparison could be drawn to the distinction that copyright law makes between fair use defenses for parody and for satire.

Parody and satire are closely related forms of comedy, and both can be used to convey important messages. However, under laws such as Section 19 of the Israeli Copyright Law of 2007, fair use is more likely to succeed as a defense against alleged copyright infringement for a parody of a work than for satire using the same copyrighted work. Similar statutes or court rulings exist in Canada and the United States.

This distinction between parody and satire exists because parody uses the protected work to comment on the work itself. Parodists are unlikely to obtain permission from their target, so the law needs to provide greater protection to enable the desired discourse; the means and the ends are closely related. In contrast, satire uses the protected work to provide broad commentary, not necessarily in relation to the work itself. As such, the law often considers the infringement an unnecessary and indefensible means, despite the laudable end.

By analogy: when VALL-E is used to imitate a voice with the intent of creating speech that specifically relates to that individual, for example to create an AI version of Eretz Nehederet, this could be considered fair use and protected speech, at least under right of publicity laws. Why should artificial intelligence be more liable than a human impressionist?

By contrast, if VALL-E is used for the non-consensual imitation of a person’s voice for a purpose unrelated to the voice itself, for example, where any other voice would be equally useful for the purposes of that speech, then such use could be considered an infringement of the right of publicity.

In the ongoing battle over the differences between humans and AI in content creation, AI is currently losing the argument that it should be treated the same as a human. Perhaps a successful fair use defense of an AI-created voice parody will start to turn the tide.

Professor Dov Greenbaum is Director of the Zvi Meitar Institute for Legal Implications of Emerging Technologies at the Harry Radzyner Law School at Reichman University.
