2024 is shaping up to be the year when AI issues truly come to a head, with multiple ethical and legal debacles already piling up only a week into the new year. Just recently, Wizards of the Coast, Respawn Entertainment (developers of Apex Legends), and Wacom were mocked on social media for using AI-generated images in their promotions.
OpenAI (the company behind ChatGPT) is now virtually pleading for copyright exceptions from the UK Parliament. In a submission to the House of Lords Communications and Digital Select Committee, OpenAI argued that creating sophisticated AI tools like ChatGPT without access to copyrighted material is “impossible.”
OpenAI emphasized that, given the extensive scope of copyright, which covers countless forms of human expression, it could not develop its AI models without such content. According to a report by The Telegraph, OpenAI’s submission includes this statement:
“Because copyright today covers virtually every sort of human expression – including blogposts, photographs, forum posts, scraps of software code, and government documents – it would be impossible to train today’s leading AI models without using copyrighted materials… Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.”
Brazenly ignoring the obvious and moral choice of “don’t bloody do it, then”, OpenAI is asking for special exceptions and sidestepping the proper practice of licensing and crediting the holders of these copyrights.
This submission requesting special exceptions to avoid copyright issues comes on the heels of legal challenges faced by OpenAI, including a lawsuit by The New York Times in December. The newspaper accused OpenAI and Microsoft of the “unlawful use” of its work to develop AI products.
OpenAI defended its stance, asserting that training is “fair use” and expressing confidence that the lawsuit lacks merit. In its own response to the lawsuit, the company says the regurgitation of exact content is a “rare bug”.
The New York Times’ lawsuit followed a similar legal action in September involving 17 authors, including renowned figure George R.R. Martin, alleging “systematic theft on a mass scale.” The legal landscape further evolved in August when a U.S. District Judge upheld a U.S. Copyright Office finding that AI-generated art could not be copyrighted.
The underlying case, which dates back to 2018, emphasized the critical connection between the human mind and creative expression as a basis for copyright protection, while also reinforcing why AI imagery is no match for the work of human artists.
Holy shit! I finally figured it out! It took me a long time! When OpenAI filed as a “nonprofit corporation organized exclusively for charitable and/or educational purposes” with an interest in “public benefit” (screenshot below), what they meant was that they would ask for (&… pic.twitter.com/nVEXtzsxV9
— Gary Marcus (@GaryMarcus) January 9, 2024
The controversy surrounding AI and copyright extends beyond ChatGPT, as evidenced by a recent IEEE report highlighting concerns about AI image services, including OpenAI’s DALL-E 3, Midjourney, and applications from other companies.
The report, co-authored by AI expert Gary Marcus and digital illustrator Reid Southen, pointed out instances of “plagiaristic outputs” where AI models reproduced copyrighted scenes from films and video games based on their training data.
OpenAI has faced a series of challenges in recent months, including the reinstatement of Sam Altman as CEO in November following a board-led firing. The company, backed significantly by Microsoft, is now valued at over $80 billion according to reports. Microsoft has invested more than $10 billion in OpenAI over the past year and has integrated OpenAI’s technology into its Bing search engine.
The legal ramifications of AI-generated content remain contentious, with debates around the liability of AI vendors and customers for potential copyright infringement. OpenAI has faced criticism for not fully disclosing the training data used for its AI models, raising questions about transparency in the industry.
As the legal landscape continues to evolve, the challenges posed by AI’s reliance on copyrighted material underscore the need for a delicate balance between technological advancement and ethical considerations.
OpenAI’s submission to the House of Lords reflects the ongoing effort to navigate these complexities while emphasizing the indispensability of copyrighted data in training advanced AI models. Is there a middle ground that all parties involved will be happy with? It seems unlikely, and this burgeoning industry needs to be regulated early, before more damage is done.