Tuesday, June 17, 2025

AI models aren’t copycats but learners just like us

At the heart of all these cases lies copyright law's prohibition against reproducing literary and artistic works without the owner's permission. AI companies admittedly 'train' their models on text, audio and video material scraped from the internet. Since much of the output these models generate resembles that material, there is a presumption that they have somehow 'copied' these works without the permission of copyright holders.

Copyright law was established in response to the very first innovation of the information age: the printing press. When publishers realized that unauthorized reprints of the works they had commissioned were being sold at a fraction of the price they charged, they asked for legal protection, not only for the physical books they had printed but also for the ideas contained within them.

This necessitated a new form of legal protection, one that broadened the concept of ownership beyond just the tangible forms in which content is sold (like books, paintings and vinyl records) to include the intangible ideas they hold. As new technologies emerged, these protections evolved to encompass them; this is how we prevent pirates from selling bootleg DVDs, counterfeiters from duplicating merchandise and websites from displaying images without permission. Now, with AI, copyright law is also being adapted to accommodate it.

AI companies process large volumes of text and images to develop their models. Text is first broken down into smaller units called tokens, while images are represented as arrays of pixel values. Transformer architectures then process these tokens to learn the statistical relationships between them, while diffusion models learn to remove 'noise' from fields of random pixels until coherent images emerge.
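The idea of tokenization can be illustrated with a toy word-level example. Production systems use learned subword vocabularies (such as byte-pair encoding) rather than whole words, and the function names here are purely illustrative, but the principle is the same: text becomes a sequence of integer IDs that a model can compute over.

```python
def build_vocab(corpus):
    """Assign an integer ID to each distinct word in the corpus."""
    vocab = {}
    for word in corpus.split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab

def tokenize(text, vocab):
    """Map each word to its ID; words the model never saw get -1."""
    return [vocab.get(word, -1) for word in text.split()]

vocab = build_vocab("the cat sat on the mat")
print(tokenize("the mat sat", vocab))  # -> [0, 4, 2]
```

Note that the vocabulary holds only word-to-number mappings, not the training text itself, which foreshadows the point made below about models storing patterns rather than copies.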

As a result, AI models do not store the content they process for retrieval on demand. Instead, they identify statistical patterns within the data so that, when prompted, they can apply that knowledge to generate content that most appropriately responds to the requests made. In the case of large language models (LLMs), this involves predicting the next word, sentence or paragraph. In the case of diffusion models, it entails progressively eliminating noise until an image appears. 
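The claim that models learn statistical patterns rather than store text can be made concrete with a deliberately simple sketch: a bigram counter that predicts the next word from frequency statistics. Real LLMs learn vastly richer patterns with neural networks, and these function names are assumptions for illustration, but the same distinction holds: what is retained is counts and weights, not a retrievable copy of the training text.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(model, word):
    """Return the most frequent successor of a word, or None if unseen."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

model = train_bigrams("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # -> 'cat'
```

The trained `model` contains only tallies such as {'cat': 2, 'mat': 1}; the original sentence cannot be retrieved from it on demand, only regenerated probabilistically.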

So, even though Midjourney may have been trained on millions of images, it hasn't 'copied' them into its memory. All it has done is derive statistical patterns from those images, encoding general principles of composition, colour and form as mathematical weights.

This process is remarkably similar to human learning. When we read, our eyes scan the words in a book, but our brains don’t store an exact facsimile of that text; instead, they merely retain the ideas and concepts it embodies. When art students study the Great Masters, they do so not to replicate them perfectly, but to absorb techniques, principles of composition and visual approaches, allowing this enhanced understanding to improve their own artistic skills.

This is the essence of human creativity. Exposure to existing works has always been essential for new expression. Shakespeare borrowed his plots. Picasso learnt from African masks. Jazz musicians transformed classical forms. Great artists are also great students, building their own individual styles upon those of masters who came before them. If the fundamental essence of human learning does not violate intellectual property laws, should we not apply the same logic to AI as well?

There is no doubt that this will affect creative industries. But we have been here before. Every wave of technological evolution has been disruptive. The power loom displaced handloom weavers; recorded music rendered live musicians redundant; and the film industry disrupted live theatre, just as OTT streaming has led to fewer people going to cinemas. In much the same way, AI is going to disrupt artists of all sorts: graphic designers, musicians, authors, actors and film producers.

Incumbents have always resisted change. But early victories have often been pyrrhic. Although the music industry succeeded in shutting down Napster’s free file-sharing service, within a decade music had become entirely digital, distributed over online platforms using the very model they had fought so hard to squash. 

Creative enterprises that succeed in the long run are those that embrace change. 

The London-based advertising agency group WPP recently created a Super Bowl advertisement entirely with AI—no sets, no actors and no crew. It cost far less than it would otherwise have and was completed in a fraction of the time. The ad agency group also offers an AI platform called WPP Open that can take simple text prompts and turn them into social media ads in a matter of minutes—a service that over 50,000 people are already using.

The only way to prevent yourself from being disrupted is to disrupt yourself.

The author is a partner at Trilegal and the author of ‘The Third Way: India’s Revolutionary Approach to Data Governance’. His X handle is @matthan.
