Difference between revisions of "Copyright issues behind ChatGPT's creation"

From SI410
(1. Can you copyright the output of a generative AI model, and if so, who owns it?)
(Authorship belongs to non-humans(ChatGPT))
 
(42 intermediate revisions by the same user not shown)
Line 1: Line 1:
ChatGPT(Chat Generative Pre-trained Transformer) is a new chatbot model released by OpenAI, an artificial intelligence research lab, on November 30, 2022. The model uses natural language processing tools powered by artificial intelligence technology. ChatGPT is able to conduct conversations by learning and understanding modern human language, mainly English, and can also interact based on the contextual information of the chat. It performs chatting and communicating behavior truly like a human, and even completes tasks as writing emails, video scripts, translation, and code under certain scenarios.<ref>What is CHATGPT and why does it matter? here's everything you need to know. ZDNET. (n.d.). Retrieved January 27, 2023, from https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/ </ref>
+
ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot model released by OpenAI, an artificial intelligence research lab, on November 30, 2022. The model uses natural language processing techniques powered by artificial intelligence. ChatGPT can conduct conversations in human language, mainly English, and can respond based on the context of the chat. Its conversational behavior closely resembles a human's, and it can even complete tasks such as writing emails, video scripts, translations, and code in certain scenarios.<ref>What is CHATGPT and why does it matter? here's everything you need to know. ZDNET. (n.d.). Retrieved January 27, 2023, from https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/ </ref>
  
 
To train the model behind ChatGPT, a huge amount of data was collected from the Internet and used with both supervised and reinforcement learning techniques. The answers delivered by ChatGPT are sometimes highly similar to answers written online by human authors. At other times, it summarizes multiple human-written answers from its training dataset. Whether ChatGPT's output can be considered original is highly debated, and ethical and legal issues such as copyright are receiving more and more attention from the general public.

To train the model behind ChatGPT, a huge amount of data was collected from the Internet and used with both supervised and reinforcement learning techniques. The answers delivered by ChatGPT are sometimes highly similar to answers written online by human authors. At other times, it summarizes multiple human-written answers from its training dataset. Whether ChatGPT's output can be considered original is highly debated, and ethical and legal issues such as copyright are receiving more and more attention from the general public.
 +
 +
 +
[[File:chatgpt.jpeg|400px|thumb|engadget - OpenAI will soon test a paid version of its hit ChatGPT bot<ref>Fingas, J. (2023, January 11). OpenAI will soon test a paid version of its hit Chatgpt Bot. Engadget. Retrieved February 11, 2023, from https://www.engadget.com/openai-chatgpt-professional-paid-chatbot-143004442.html </ref>]]
  
  
 
==Copyright==
 
==Copyright==
Copyright refers to the ownership of a creative work. Issues of copyright are mainly related to the use, distribution and protection of creative works. Creative works can be with formats in literary, artistic, educational or musical background. Copyright is intended to protect the originality of the idea created by the author with the form of a creative work, not the idea itself.<ref>Stim, Rich (27 March 2013). ["Copyright Basics FAQ"](https://fairuse.stanford.edu/overview/faqs/copyright-basics/). The Center for Internet and Society Fair Use Project. Stanford University. Retrieved 21 July 2019.</ref>
+
Copyright is a legal framework that gives creators of original works exclusive rights to control the use and distribution of their works. The aim of copyright is to foster creativity by offering authors, artists, and other creators incentives to produce new works. Copyright law protects a broad spectrum of works, including literature, music, software, film, photography, and architecture.<ref>Stim, Rich (27 March 2013). ["Copyright Basics FAQ"](https://fairuse.stanford.edu/overview/faqs/copyright-basics/). The Center for Internet and Society Fair Use Project. Stanford University. Retrieved 21 July 2019.</ref>
 +
 
 +
Under copyright law, the owner of an original work is granted exclusive control over the following aspects of the work:<ref>Stokes, S. (n.d.). Art and copyright. Google Books. Retrieved February 11, 2023, from https://books.google.com/books?id=h-XBqKIryaQC&as_brr=3</ref>
 +
 
 +
* Reproduction: The right to make copies of the work.
 +
* Distribution: The right to sell, rent, or otherwise distribute copies of the work.
 +
* Display: The right to show the work in public.
 +
* Performance: The right to perform the work in public, such as a play or musical composition.
 +
* Derivative Works: The right to make adaptations or alterations of the original work.
 +
 
 +
In many jurisdictions, the moment a work is created and fixed in a tangible medium, such as writing a book or composing a song, it is automatically protected by copyright law. Registering the work with the relevant copyright office provides evidence of ownership and can assist in the enforcement of rights in legal proceedings.<ref>Service unavailable. GOV.UK. (n.d.). Retrieved February 11, 2023, from https://www.ipo.gov.uk/copy/c-claim/c-register.htm</ref>
 +
 
 +
Additionally, copyright law acknowledges certain exceptions and limitations to the copyright owner's exclusive rights. In the United States, the principle of fair use permits limited utilization of copyrighted works without the copyright owner's authorization for specific purposes, such as criticism, commentary, news reporting, education, scholarship, or research.
 +
 
 +
A fundamental aspect of copyright law is the granting of exclusive control over the use of a work to the copyright owner. Additionally, the law allows for the transfer of these rights, enabling the copyright owner to sell, license, or otherwise transfer the right to utilize the work to a third party. Such transfers are frequently executed through agreements such as publishing or licensing contracts.<ref>Yu, P. K. (n.d.). Intellectual property and information wealth: Issues and practices in the Digital age, volume 1. Google Books. Retrieved February 11, 2023, from https://books.google.com/books/about/Intellectual_Property_and_Information_We.html?id=bnW8ypT9_pIC </ref>
  
 +
International treaties and agreements, including the Berne Convention and the World Intellectual Property Organization (WIPO) Copyright Treaty, also extend copyright protection across multiple countries. This enables creators to secure protection for their works on a worldwide basis.<ref>MacQueen, H. L. (n.d.). Contemporary intellectual property: Law and policy. Google Books. Retrieved February 11, 2023, from https://books.google.com/books?id=_Iwcn4pT0OoC </ref>
  
 
==History==
 
==History==
In the past, generative AI would not rise copyright issues. Back to 2010s, most of the AI models were still under development and had a lot of problems generating works. Their creation is far below the the human level either in complexity or in aesthetics. Models could only generate blurry artworks with black-and-white faces. Chatbots were far behind the maturity of conducting regular conversation.  
+
The examination of the relationship between copyright and AI technology is a continuously developing field and it is anticipated that it will continue to evolve as AI technology becomes increasingly prevalent in society.
  
However, with a series of responses deliberately picked from the best responses of generative AI, an illusion of what AI model could do impressed the general public. Inspired by modern science fictions and other medias, rumors on AI threats human beings soon caught people's attention. That being said, the generative AI was still harmless to human content creators, even though with narrow and well-defined tasks, they could generate some results. <ref>Roose, K. (2022, December 5). _The brilliance and weirdness of chatgpt_. The New York Times. Retrieved January 27, 2023, from https://www.nytimes.com/2022/12/05/technology/chatgpt-ai-twitter.html</ref>
+
====Mid-20th century====
 +
As advancements were made in computer and AI technology, questions were raised about the potential effect on intellectual property laws. Despite the technology being in its early stages, these discussions centered primarily on theoretical concerns.
 +
 
 +
====1980s and 1990s====
 +
With the rise in the usage of personal computers and the internet, early AI systems were developed that were capable of producing original works, such as music and poetry. This led to the examination of the question of whether AI-generated content can be considered original and qualify for copyright protection.
 +
 
 +
====Late 1990s to early 2000s====
 +
As the advancement and usage of AI systems increased, attention was given to the topic of AI and copyright. Views were divided with some experts considering AI-generated content as not being original and thus not deserving of copyright protection, while others believed AI systems should be recognized as the creators of the works they produced.
 +
 
 +
====Mid-2010s====
 +
As advancements in AI technology persisted, several lawsuits relating to copyright and AI were initiated. One such example was the lawsuit filed by a group of photographers against Google in 2014, regarding the use of their images in the company's street view mapping service. Another case was a lawsuit brought by a musician against a music streaming service, Spotify, for the use of his compositions in its playlist recommendations.
 +
 
 +
====Late 2010s to present====
 +
The appropriate legal framework for AI and copyright remains a topic of ongoing discussion and debate. While some countries have enacted specific laws to address the issue, a clear consensus has yet to be reached. Meanwhile, organizations and experts are calling for a more nuanced approach that takes into account the unique characteristics of AI and its various applications in generating original works. <ref>Roose, K. (2022, December 5). The brilliance and weirdness of chatgpt. The New York Times. Retrieved January 27, 2023, from https://www.nytimes.com/2022/12/05/technology/chatgpt-ai-twitter.html</ref>
 +
 
 +
 
 +
==The model behind ChatGPT==
 +
ChatGPT is a transformer-based language model, a type of artificial intelligence. The model is trained on a vast amount of text data, utilizing artificial neural networks to generate text based on the patterns learned from the text data during its training.<ref>Fingas, J. (2023, January 11). OpenAI will soon test a paid version of its hit Chatgpt Bot. Engadget. Retrieved February 11, 2023, from https://www.engadget.com/openai-chatgpt-professional-paid-chatbot-143004442.html </ref>
 +
 
 +
When a prompt or question is provided to ChatGPT, the model processes the text and generates a response one word (more precisely, one token) at a time. At each step it predicts the next word in the sequence from the preceding words: the model assigns a probability to each candidate word and then selects a highly probable one, in the simplest case the word with the highest probability.
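
The next-word step described above can be illustrated with a minimal sketch. It is not OpenAI's implementation: the candidate words and their scores in <code>toy_scores</code> are invented for illustration, and the function names are hypothetical. The sketch only shows the general idea of turning scores into probabilities (a softmax) and then taking the most probable word (greedy selection).

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(s - highest) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


def pick_next_word(scores):
    """Greedy choice: return the most probable word and the full distribution."""
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical scores for the word that follows
# "The first president of the United States was George"
toy_scores = {"Washington": 5.0, "Bush": 2.0, "Orwell": 0.5}

next_word, probabilities = pick_next_word(toy_scores)
print(next_word)  # prints "Washington"
```

A real language model computes such scores over tens of thousands of tokens with a neural network, and deployed systems often sample from the distribution rather than always taking the top word, which is one reason the same prompt can yield different responses.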
 +
 
 +
The "transformer" aspect of the model's name refers to a specific type of neural network architecture that is utilized to process the text data. The transformer architecture is designed to handle sequences of data, making it appropriate for language modeling.
 +
 
 +
 
 +
==ChatGPT's Strength==
 +
ChatGPT is capable of generating text with a high level of fluency and coherence that is similar to human language. It is a useful resource for a wide range of natural language processing (NLP) applications.
 +
 
 +
====Question Answering====
 +
ChatGPT has the ability to comprehend and provide answers to questions on a diverse array of subjects, such as history, science, and current events. For instance, if a query of "Who was the first president of the United States?" is made, ChatGPT would provide the response of "The first president of the United States was George Washington." <ref>Fingas, J. (2023, January 11). OpenAI will soon test a paid version of its hit Chatgpt Bot. Engadget. Retrieved February 11, 2023, from https://www.engadget.com/openai-chatgpt-professional-paid-chatbot-143004442.html </ref>
 +
 
 +
====Text Summarization====
 +
ChatGPT is capable of generating a brief summary of a longer piece of text, such as an article, a news story, or a research paper. The model can analyze the input text, extract the most relevant information, and condense it into a shortened form that retains the core meaning of the original text.
 +
   
 +
====Conversational Modeling====
 +
ChatGPT is capable of generating responses in a conversational manner, making it well-suited for the development of chatbots. The model can understand the context and intent behind user inputs, and generate appropriate and coherent responses. For instance, if the query "How are you today?" is made, ChatGPT could respond with "I am functioning well, thank you for asking. How are you today?" <ref>OpenAI's CHATGPT is scary good at my job, but it can't replace me (yet). ZDNET. (n.d.). Retrieved February 11, 2023, from https://www.zdnet.com/article/openais-chatgpt-is-scary-good-at-my-job-but-it-cant-replace-me-yet/ </ref> 
 +
 
 +
====Text Generation====
 +
ChatGPT has the ability to produce new text based on a specified prompt. This can be applied to tasks such as language translation, story writing, or generating responses in a chatbot. The model utilizes its comprehension of language patterns and grammar to generate coherent and diverse text that is consistent with the specified prompt. As an example, if the prompt "Write a short story about a magical world" is provided, ChatGPT could generate a story describing a fantastical place filled with mythical creatures and spells. <ref>OpenAI's CHATGPT is scary good at my job, but it can't replace me (yet). ZDNET. (n.d.). Retrieved February 11, 2023, from https://www.zdnet.com/article/openais-chatgpt-is-scary-good-at-my-job-but-it-cant-replace-me-yet/ </ref>
  
  
 
==ChatGPT's Limitation==
 
==ChatGPT's Limitation==
Although ChatGPT appears to be quite remarkable, it still has limitations. These restrictions include the inability to respond to questions that are phrased in a particular way since it requires rephrasing in order to comprehend the question from the the conversational background.<ref> What is CHATGPT and why does it matter? here's everything you need to know. ZDNET. (n.d.). Retrieved January 27, 2023, from [https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/](https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/)</ref> Though ChatGPT can tell the difference between "appropriate" and "inappropriate" requests, it can still process "inappropriate request", which is not like OpenAI designed it to be. Users have found ways around pre-set principles of processing requests. "inappropriate requests", like generating instructions for illegal activities, can still be made by rephrasing the request as a hypothetical though experiment.
+
ChatGPT, despite being regarded as advanced, still faces certain constraints in its functionality.
 +
 
 +
====Fact checking====
 +
ChatGPT is trained on a large amount of text data from the internet, which can include false or inaccurate information. As a result, the model may generate responses that are not entirely accurate. It is important to critically evaluate the information generated by ChatGPT and corroborate it with other sources.
 +
 
 +
====Common sense reasoning====
 +
ChatGPT is not designed to have a deep understanding of common sense knowledge and may struggle with tasks that require this kind of understanding. For example, it may generate responses that are logically inconsistent or do not align with real-world expectations. <ref>Chatgpt: Threat or menace?: Inside higher ed. Higher Ed Gamma. (n.d.). Retrieved February 11, 2023, from https://www.insidehighered.com/blogs/higher-ed-gamma/chatgpt-threat-or-menace</ref>
 +
 
 +
====Ethical considerations====
 +
Like other AI models, ChatGPT does not weigh ethical considerations the way a person does when generating text. It may generate responses that are insensitive, inappropriate, or offensive, and it is up to human users to intervene and prevent such responses from being used. <ref>Bogost, I. (2022, December 16). CHATGPT is dumber than you think. The Atlantic. Retrieved February 11, 2023, from https://www.theatlantic.com/technology/archive/2022/12/chatgpt-openai-artificial-intelligence-writing-ethics/672386/</ref>
 +
 
 +
====Legal considerations====
 +
While ChatGPT is equipped to differentiate between requests that are appropriate and those that are not, it still has the ability to process requests that fall outside of the parameters set by OpenAI. Some users have found ways to circumvent the established principles for processing requests.<ref>What is CHATGPT and why does it matter? here's everything you need to know. ZDNET. (n.d.). Retrieved January 27, 2023, from https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/ </ref>
  
Another significant drawback is the poor quality of the replies it provides, which occasionally seem reasonable but are overly vague and unpractical. When it encounters confusing words, ChatGPT tends to make assumptions about how to interpret those words instead of asking the user for further clarification. This interactive behavior often results in a confusion to its users.
+
====Question Answering====
 +
While ChatGPT can generate coherent and consistent responses for general conversations, it may still face difficulty in comprehending questions that are expressed in specific ways, which necessitates rephrasing for accurate understanding.
  
 
==Polarized views==
 
==Polarized views==
With all of its impressive creations and limitations, ChatGPT has received many attentions. After it was posted for public testing on 30th November 2022, within the 1st week of its launch, ChatGPT has reached 1 million users.<ref> Ruby, D., & About The Author Daniel Ruby Content writer with 10+ years of experience. I write across a range of subjects. (2023, January 2). CHATGPT statistics for 2023: Comprehensive facts and data. Demand Sage. Retrieved January 27, 2023, from https://www.demandsage.com/chatgpt-statistics/ </ref>
+
With all of its impressive creations and limitations, ChatGPT has received much attention. It was released for public testing on November 30, 2022, and reached 1 million users within the first week of its launch.<ref>Ruby, D. (2023, January 2). CHATGPT statistics for 2023: Comprehensive facts and data. Demand Sage. Retrieved January 27, 2023, from https://www.demandsage.com/chatgpt-statistics/ </ref>
From one side, people think these technologies were undoubtedly capable of violating copyright laws, and they would soon be subject to major legal repercussions. Others said the reverse, with similar assurance: that everything taking place in the realm of generative AI is legal and above board, and any legal actions are bound to fail.
+
  
==Questions need to be answered==
+
The question of who holds the copyright for the output produced by AI systems like ChatGPT is complex. Opinions differ both on whether the technology itself is legal and on whether its output can violate copyright laws.
===1. Can you copyright the output of a generative AI model, and if so, who owns it?===
+
  
Regarding intellectual property, Bern Elliot, analyst at Gartner, states that the model for ChatGPT "is trained on a corpus of creative works and it is yet unknown what the legal precedent may be for reuse of this content, assuming it was formed from the intellectual property of other human creators."<ref>Why is ChatGPT making waves in the AI market? Gartner. (n.d.). Retrieved January 29, 2023, from https://www.gartner.com/en/newsroom/press-releases/2022-12-08-why-is-chatgpt-making-waves-in-the-ai-market</ref>
+
One perspective is that AI systems like ChatGPT lack the capacity to hold a copyright, as they are not human and therefore do not possess the legal right to intellectual property. According to this viewpoint, the copyright for the AI's output would belong to the human creator or owner of the system, such as OpenAI in the case of ChatGPT.
  
 +
Alternatively, there are those who argue that the output generated by AI systems like ChatGPT can be seen as a form of original expression and that the AI system should be granted copyright protection. Those in favor of this viewpoint argue that AI systems like ChatGPT have the ability to generate distinctive and innovative output that is not merely a reflection of the training data. Thus, this output should be protected by copyright laws.
 +
 +
The issue of copyright ownership for outputs created by AI remains unresolved in many countries and continues to be a subject of discussion. Laws and regulations related to AI and copyright can differ depending on the jurisdiction and the specific circumstances. Consulting a legal expert in accordance with specific needs and circumstances is a common course of action. <ref>Hillemann, D., & Zimprich, S. (2022, December 9). Chatgpt - legal challenges, legal opportunities. Fieldfisher. Retrieved February 11, 2023, from https://www.fieldfisher.com/en/insights/chatgpt-legal-challenges-legal-opportunities </ref>
 +
 +
 +
==Issues behind copyright==
 +
===1. Can you copyright the output of a generative AI model, and if so, who owns it?===
 +
Regarding intellectual property, Bern Elliot, analyst at Gartner, states that the model for ChatGPT "is trained on a corpus of creative works and it is yet unknown what the legal precedent may be for reuse of this content, assuming it was formed from the intellectual property of other human creators."<ref>Why is ChatGPT making waves in the AI market? Gartner. (n.d.). Retrieved January 29, 2023, from https://www.gartner.com/en/newsroom/press-releases/2022-12-08-why-is-chatgpt-making-waves-in-the-ai-market</ref>
  
 
====Authorship belongs to non-humans(ChatGPT)====
 
====Authorship belongs to non-humans(ChatGPT)====
In general, it is not acceptable for non-humans, like ChatGPT, to claim authorship. In the US, there is no copyright protection for works generated solely by a machine. <ref>Vincent, J. (2022, November 15). The scary truth about AI copyright is nobody knows what will happen next. The Verge. Retrieved January 27, 2023, from https://www.theverge.com/23444685/generative-ai-copyright-infringement-legal-fair-use-training-data</ref> For a work to enjoy copyright protection under current U.S. law, “the work must be the result of original and creative authorship by a human author.“<ref>McKendrick, J. (2022, December 26). Who ultimately owns content generated by CHATGPT and other AI platforms? Forbes. Retrieved January 29, 2023, from https://www.forbes.com/sites/joemckendrick/2022/12/21/who-ultimately-owns-content-generated-by-chatgpt-and-other-ai-platforms/?sh=7205359e5423</ref>If there is an ongoing copyright dispute over AI-generated content, one way to dispute the requirement of human authorship is to either appeal a Copyright Office registration denial or pursue an infringer after failing to register copyrights with the Copyright Office. 
+
In the current legal framework of the United States, it is generally accepted that non-human entities, such as ChatGPT, cannot hold authorship rights for works they generate.<ref>Vincent, J. (2022, November 15). The scary truth about AI copyright is nobody knows what will happen next. The Verge. Retrieved January 27, 2023, from https://www.theverge.com/23444685/generative-ai-copyright-infringement-legal-fair-use-training-data</ref> Copyright protection under current U.S. law requires that a work must be the result of original and creative authorship by a human author. However, there may be instances where the question of AI-generated content and authorship arises, and these cases may be addressed through the appeal of a Copyright Office registration denial or through legal action after a failure to register copyrights with the Copyright Office.
  
 
In either case, the legislative history of the necessity for human authorship and later legal decisions upholding the requirement will be heavily debated.
 
In either case, the legislative history of the necessity for human authorship and later legal decisions upholding the requirement will be heavily debated.
  
===2. if you own the copyright to the input used to train an AI, does that give you any legal claim over the model or the content it creates?=== 
+
====Authorship belongs to humans====
 +
For generative AI in general, a generated work is likely to be treated in one of three ways when ownership is decided.<ref>McKendrick, J. (2022, December 26). Who ultimately owns content generated by CHATGPT and other AI platforms? Forbes. Retrieved January 29, 2023, from https://www.forbes.com/sites/joemckendrick/2022/12/21/who-ultimately-owns-content-generated-by-chatgpt-and-other-ai-platforms/?sh=7205359e5423</ref>
  
When deciding if something is fair use, there are a number of considerations, explains Daniel Gervais, a professor at Vanderbilt Law School who specializes in intellectual property law and has written extensively on how this intersects with AI. Two factors, though, have “much, much more prominence,” he says. “What’s the purpose or nature of the use and what’s the impact on the market.” In other words: does the use-case change the nature of the material in some way (usually described as a “transformative” use), and does it threaten the livelihood of the original creator by competing with their works?
+
# a work that became public domain as soon as it was created 
 +
# a work that is derived from the resources the AI tool was trained on. Who owns the dataset used to train the AI tool and the degree of similarity between any given work in the training dataset and the AI work are two common factors that affect the ownership of the derived work. 
 +
# a work considered as an innovative creation of the human who is directing the AI. 
  
===3. What kind of legal restraints could — or should — be put in place on data collection? In other words, can there be peace between the people building these systems and those whose data is needed to create them?===
+
The first and second outcomes can be applied to the copyright issues raised by ChatGPT's output. The third, however, requires a clear measure of how much the human contributed to the work relative to the AI. In the case of ChatGPT, the human operator typically makes only a limited contribution to the creation process, so the third outcome is usually not applicable.
  
  
==Potential Solution==
+
===2. Commercial use of the output of a generative AI model===
For many creators, it seems the damage has already been done. But AI startups are at least suggesting new approaches for the future. One obvious step forward is for AI researchers to simply create databases where there is no possibility of copyright infringement — either because the material has been properly licensed or because it’s been created for the specific purpose of AI training. One such example is “The Stack” — a dataset for training AI designed to specifically avoid accusations of copyright infringement. It includes only code with the most permissive possible open-source licensing and offers developers an easy way to remove their data on request. Its creators say their model could be used throughout the industry.<ref>Vincent, J. (2022, November 15). The scary truth about AI copyright is nobody knows what will happen next. The Verge. Retrieved January 27, 2023, from https://www.theverge.com/23444685/generative-ai-copyright-infringement-legal-fair-use-training-data</ref>
+
The utilization of content produced by ChatGPT for commercial purposes requires obtaining the necessary permissions and licenses. ChatGPT, a large language model developed by OpenAI, generates text based on the context of an interaction and the responses generated may vary in accordance with the input received. <ref>Loafars. (2023, January 27). Is chat GPT free for commercial use? Chat GPT Pro. Retrieved February 11, 2023, from https://opchatgptai.com/is-chat-gpt-free-for-commercial-use/ </ref> A license from OpenAI or the relevant rights holders may be required to utilize the content generated by ChatGPT for commercial purposes, which can depend on the specific circumstances of the use case. Obtaining the necessary permissions and licenses prior to utilizing any content for commercial purposes is the standard procedure in such cases.
  
 +
Whether users are responsible for obtaining a license from OpenAI before using content generated by ChatGPT commercially remains an open question. Additionally, when ChatGPT is used to transform a copyrighted work (for example, summarizing a book or translating it from English into another language), it is unclear whether paid permission from the author or publisher is needed. How OpenAI and relevant third parties will react to commercial use of ChatGPT remains uncertain.<ref>Hillemann, D., & Zimprich, S. (2022, December 9). Chatgpt - legal challenges, legal opportunities. Fieldfisher. Retrieved February 11, 2023, from https://www.fieldfisher.com/en/insights/chatgpt-legal-challenges-legal-opportunities </ref>
 +
 +
==Potential Solution==
 +
For many creators, it appears that copyright infringement has already occurred. However, the companies that develop generative AI are proposing new strategies to address copyright issues in the future. In response to infringement concerns, datasets in which every item belongs to the public domain have been created and used for AI training.
  
 +
"The Stack," a dataset for AI training created explicitly to avoid claims of copyright infringement, is an example of that approach. The dataset includes only code released under permissive open-source licenses. For sources whose ownership is not explicitly stated, it traces them back to their issuer and asks for permission before using them. If the ownership of a source changes after it has been included in "The Stack," developers can easily have that source removed on request.<ref>Vincent, J. (2022, November 15). The scary truth about AI copyright is nobody knows what will happen next. The Verge. Retrieved January 27, 2023, from https://www.theverge.com/23444685/generative-ai-copyright-infringement-legal-fair-use-training-data</ref> According to its creators, this model could be used throughout the industry as a way to address copyright issues related to generative AI.
  
Still working on it (from Daniel Wang)
 
  
 
==References==
 
==References==

Latest revision as of 06:19, 11 February 2023

ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot model released by OpenAI, an artificial intelligence research lab, on November 30, 2022. The model uses natural language processing techniques powered by artificial intelligence. ChatGPT can conduct conversations in human language, mainly English, and can respond based on the context of the chat. Its conversational behavior closely resembles a human's, and it can even complete tasks such as writing emails, video scripts, translations, and code in certain scenarios.[1]

To train the model behind ChatGPT, a huge amount of data is collected from the Internet and applied to both supervised and reinforcement machine learning techniques. The answers delivered by ChatGPT, sometimes, are highly similar to the answers online created by human authors. Other times, it summarizes multiple answers, created by human authors, from its training dataset. Whether the creation of ChatGPT is considered to have originality is highly debating. Ethical issues like copyright get more and more attention from the general public.


engadget - OpenAI will soon test a paid version of its hit ChatGPT bot[2]


Copyright

Copyright is a legal framework that gives creators of original works the exclusive right to control the use and distribution of those works. The aim of copyright is to foster creativity by offering authors, artists, and other creators incentives to generate new works. Copyright law protects a broad spectrum of works, including literature, music, software, film, photography, and architecture.[3]

Under copyright law, the owner of an original work is granted exclusive control over the following aspects of the work:[4]

  • Reproduction: The right to make copies of the work.
  • Distribution: The right to sell, rent, or otherwise distribute copies of the work.
  • Display: The right to show the work in public.
  • Performance: The right to perform the work in public, such as a play or musical composition.
  • Derivative Works: The right to make adaptations or alterations of the original work.

In many jurisdictions, the moment a work is created and fixed in a tangible medium, such as writing a book or composing a song, it is automatically protected by copyright law. Registering the work with the relevant copyright office provides evidence of ownership and can assist in the enforcement of rights in legal proceedings.[5]

Additionally, copyright law acknowledges certain exceptions and limitations to the copyright owner's exclusive rights. In the United States, the principle of fair use permits limited utilization of copyrighted works without the copyright owner's authorization for specific purposes, such as criticism, commentary, news reporting, education, scholarship, or research.

A fundamental aspect of copyright law is the granting of exclusive control over the use of a work to the copyright owner. Additionally, the law allows for the transfer of these rights, enabling the copyright owner to sell, license, or otherwise transfer the right to utilize the work to a third party. Such transfers are frequently executed through agreements such as publishing or licensing contracts.[6]

International treaties and agreements, including the Berne Convention and the World Intellectual Property Organization (WIPO) Copyright Treaty, also extend copyright protection across multiple countries. This enables creators to secure protection for their works on a worldwide basis.[7]

History

The examination of the relationship between copyright and AI technology is a continuously developing field and it is anticipated that it will continue to evolve as AI technology becomes increasingly prevalent in society.

Mid-20th century

As advancements were made in computer and AI technology, questions were raised about the potential effect on intellectual property laws. Despite the technology being in its early stages, these discussions centered primarily on theoretical concerns.

1980s and 1990s

With the rise in the usage of personal computers and the internet, early AI systems were developed that were capable of producing original works, such as music and poetry. This led to the examination of the question of whether AI-generated content can be considered original and qualify for copyright protection.

Late 1990s to early 2000s

As the advancement and usage of AI systems increased, attention was given to the topic of AI and copyright. Views were divided with some experts considering AI-generated content as not being original and thus not deserving of copyright protection, while others believed AI systems should be recognized as the creators of the works they produced.

Mid-2010s

As advancements in AI technology persisted, several lawsuits relating to copyright and AI were initiated. One such example was the lawsuit filed by a group of photographers against Google in 2014, regarding the use of their images in the company's street view mapping service. Another case was a lawsuit brought by a musician against a music streaming service, Spotify, for the use of his compositions in its playlist recommendations.

Late 2010s to present

The appropriate legal framework for AI and copyright remains a topic of ongoing discussion and debate. While some countries have enacted specific laws to address the issue, a clear consensus has yet to be reached. Meanwhile, organizations and experts are calling for a more nuanced approach that takes into account the unique characteristics of AI and its various applications in generating original works. [8]


The model behind ChatGPT

ChatGPT is a transformer-based language model, a type of artificial intelligence. The model is trained on a vast amount of text data, utilizing artificial neural networks to generate text based on the patterns learned from the text data during its training.[9]

When a prompt or question is provided to ChatGPT, the model processes the text and generates a response one word at a time. At each step it predicts the next word in the sequence from the preceding words: it computes a probability for each candidate word and then selects a highly probable one, typically the most likely word or one sampled from among the most likely candidates.
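The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate words and their scores are made up, the helper names `softmax` and `pick_next_word` are hypothetical, and the sketch shows the simplest variant, which always takes the most probable word. It is not OpenAI's implementation.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the peak for numerical stability
    exps = {word: math.exp(s - peak) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

def pick_next_word(scores):
    """Return the most probable next word and the full distribution."""
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    return best, probs

# Made-up scores for the word that follows
# "The first president of the United States was George"
toy_scores = {"Washington": 4.0, "Bush": 2.0, "Clooney": 0.5}

word, probs = pick_next_word(toy_scores)
print(word)
print({w: round(p, 3) for w, p in probs.items()})
```

Repeating this step, each time appending the chosen word to the input, produces a full response.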

The "transformer" aspect of the model's name refers to a specific type of neural network architecture that is utilized to process the text data. The transformer architecture is designed to handle sequences of data, making it appropriate for language modeling.


ChatGPT's Strengths

ChatGPT is capable of generating text with a high level of fluency and coherence that is similar to human language. It is a useful resource for a wide range of NLP applications.

Question Answering

ChatGPT can comprehend and answer questions on a diverse array of subjects, such as history, science, and current events. For instance, asked "Who was the first president of the United States?", ChatGPT would respond, "The first president of the United States was George Washington."[10]

Text Summarization

ChatGPT is capable of generating a brief summary of a longer piece of text, such as an article, a news story, or a research paper. The model can analyze the input text, extract the most relevant information, and condense it into a shortened form that retains the core meaning of the original text.

Conversational Modeling

ChatGPT can generate responses in a conversational manner, making it well-suited for the development of chatbots. The model can understand the context and intent behind user inputs and generate appropriate, coherent responses. For instance, asked "How are you today?", ChatGPT could respond with "I am functioning well, thank you for asking. How are you today?"[11]

Text Generation

ChatGPT has the ability to produce new text based on a specified prompt. This can be applied to tasks such as language translation, story writing, or generating responses in a chatbot. The model utilizes its comprehension of language patterns and grammar to generate coherent and diverse text that is consistent with the specified prompt. As an example, if the prompt "Write a short story about a magical world" is provided, ChatGPT could generate a story describing a fantastical place filled with mythical creatures and spells. [12]


ChatGPT's Limitations

ChatGPT, despite being regarded as advanced, still faces certain constraints in its functionality.

Fact checking

ChatGPT is trained on a large amount of text data from the internet, which can include false or inaccurate information. As a result, the model may generate responses that are not entirely accurate. It is important to critically evaluate the information generated by ChatGPT and corroborate it with other sources.

Common sense reasoning

ChatGPT is not designed to have a deep understanding of common sense knowledge and may struggle with tasks that require this kind of understanding. For example, it may generate responses that are logically inconsistent or do not align with real-world expectations. [13]

Ethical considerations

Like all AI models, ChatGPT is not capable of weighing ethical considerations when generating text. It may generate responses that are insensitive, inappropriate, or offensive, and it is up to human users to intervene and prevent such responses from being used.[14]

Legal considerations

While ChatGPT is equipped to differentiate between requests that are appropriate and those that are not, it still has the ability to process requests that fall outside of the parameters set by OpenAI. Some users have found ways to circumvent the established principles for processing requests.[15]

Question Answering

While ChatGPT can generate coherent and consistent responses for general conversations, it may still face difficulty in comprehending questions that are expressed in specific ways, which necessitates rephrasing for accurate understanding.

Polarized views

With its impressive capabilities and its limitations, ChatGPT has received much attention. After its release for public testing on November 30, 2022, ChatGPT reached 1 million users within its first week.[16]

The question of who holds the copyright for output produced by AI systems like ChatGPT is complex. Opinions differ on the legality of the technology and its potential to violate copyright law.

One perspective is that AI systems like ChatGPT lack the capacity to hold a copyright, as they are not human and therefore do not possess the legal right to intellectual property. According to this viewpoint, the copyright for the AI's output would belong to the human creator or owner of the system, such as OpenAI in the case of ChatGPT.

Alternatively, there are those who argue that the output generated by AI systems like ChatGPT can be seen as a form of original expression and that the AI system should be granted copyright protection. Those in favor of this viewpoint argue that AI systems like ChatGPT have the ability to generate distinctive and innovative output that is not merely a reflection of the training data. Thus, this output should be protected by copyright laws.

The issue of copyright ownership for outputs created by AI remains unresolved in many countries and continues to be a subject of discussion. Laws and regulations related to AI and copyright can differ depending on the jurisdiction and the specific circumstances. Consulting a legal expert in accordance with specific needs and circumstances is a common course of action. [17]


Issues behind copyright

1. Can you copyright the output of a generative AI model, and if so, who owns it?

Regarding intellectual property, Bern Elliot, analyst at Gartner, states that the model for ChatGPT "is trained on a corpus of creative works and it is yet unknown what the legal precedent may be for reuse of this content, assuming it was formed from the intellectual property of other human creators."[18]

Authorship belongs to non-humans (ChatGPT)

In the current legal framework of the United States, it is generally accepted that non-human entities, such as ChatGPT, cannot hold authorship rights for works they generate.[19] Copyright protection under current U.S. law requires that a work must be the result of original and creative authorship by a human author. However, there may be instances where the question of AI-generated content and authorship arises, and these cases may be addressed through the appeal of a Copyright Office registration denial or through legal action after a failure to register copyrights with the Copyright Office.

In either case, the legislative history of the necessity for human authorship and later legal decisions upholding the requirement will be heavily debated.

Authorship belongs to humans

For generative AI in general, ownership of a generated work is likely to fall into one of three categories:[20]

  1. A work that enters the public domain as soon as it is created.
  2. A work derived from the resources the AI tool was trained on. Two common factors affect ownership of such a derived work: who owns the dataset used to train the AI tool, and how similar the AI work is to any given work in that dataset.
  3. A work considered an innovative creation of the human who directs the AI.

The first and second categories can be applied to the copyright issues of ChatGPT's output. The third, however, requires a clear measure of how much the human contributed relative to the AI in generating the work. In the case of ChatGPT, the human operator contributes only to a limited degree, so the third category usually does not apply.


2. Commercial use of the output of a generative AI model

The utilization of content produced by ChatGPT for commercial purposes requires obtaining the necessary permissions and licenses. ChatGPT, a large language model developed by OpenAI, generates text based on the context of an interaction and the responses generated may vary in accordance with the input received. [21] A license from OpenAI or the relevant rights holders may be required to utilize the content generated by ChatGPT for commercial purposes, which can depend on the specific circumstances of the use case. Obtaining the necessary permissions and licenses prior to utilizing any content for commercial purposes is the standard procedure in such cases.

Responsibility for obtaining a license from OpenAI for commercial use of content generated by ChatGPT rests with the user, though whether such a license is actually required remains an open question. Additionally, when ChatGPT is used to condense or adapt a copyrighted work (for example, translating a book from English into another language), questions arise about whether paid permission from the author or publisher is needed. How OpenAI and relevant third parties will react to commercial use of ChatGPT remains uncertain.[22]

Potential Solution

It appears that copyright infringement has already occurred for many creators. However, companies developing generative AI are proposing new strategies to resolve the copyright issues their systems raise in the future. In response to infringement concerns, datasets whose contents all belong to the public domain have been created and used for AI training.

"The Stack," a dataset for AI training created explicitly to avoid claims of copyright infringement, is an example of that approach. Where possible, the dataset includes only material under open-source licenses. Where ownership is not explicitly stated, it traces the source back to its issuer and asks for permission before use. If the ownership of a source changes after it has been included in "The Stack," developers have an easy way to have that source removed on request.[23] According to its creators, this model could be used throughout the industry as a solid way to solve copyright issues related to generative AI.
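The filtering rules described above can be sketched as follows. This is a minimal illustration and not the actual pipeline behind "The Stack": the `SourceFile` record, the `ALLOWED_LICENSES` allow-list, the `build_training_set` function, and the repository names are all hypothetical, and the real list of accepted licenses is not given in the source.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical allow-list; the licenses actually accepted are an assumption.
ALLOWED_LICENSES = {"mit", "apache-2.0", "bsd-3-clause"}

@dataclass(frozen=True)
class SourceFile:
    repo: str
    license: Optional[str]  # None when no license or owner is stated

def build_training_set(files, opted_out_repos):
    """Keep only files with an allowed license whose owners have not opted out."""
    kept = []
    for f in files:
        if f.license is None:
            continue  # ownership unclear: exclude until permission is obtained
        if f.license.lower() not in ALLOWED_LICENSES:
            continue  # license is not on the allow-list
        if f.repo in opted_out_repos:
            continue  # owner asked for removal
        kept.append(f)
    return kept

files = [
    SourceFile("alice/parser", "MIT"),
    SourceFile("bob/engine", "Proprietary"),
    SourceFile("carol/utils", None),
    SourceFile("dave/tools", "Apache-2.0"),
]
kept = build_training_set(files, opted_out_repos={"dave/tools"})
print([f.repo for f in kept])
```

Keeping the opt-out list separate from the license check means a removal request can be honored by rebuilding the set, without changing the allow-list.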


References

  1. What is CHATGPT and why does it matter? here's everything you need to know. ZDNET. (n.d.). Retrieved January 27, 2023, from https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/
  2. Fingas, J. (2023, January 11). OpenAI will soon test a paid version of its hit Chatgpt Bot. Engadget. Retrieved February 11, 2023, from https://www.engadget.com/openai-chatgpt-professional-paid-chatbot-143004442.html
  3. Stim, R. (2013, March 27). Copyright basics FAQ. The Center for Internet and Society Fair Use Project, Stanford University. Retrieved July 21, 2019, from https://fairuse.stanford.edu/overview/faqs/copyright-basics/
  4. Stokes, S. (n.d.). Art and copyright. Google Books. Retrieved February 11, 2023, from https://books.google.com/books?id=h-XBqKIryaQC&as_brr=3
  5. Service unavailable. GOV.UK. (n.d.). Retrieved February 11, 2023, from https://www.ipo.gov.uk/copy/c-claim/c-register.htm
  6. Yu, P. K. (n.d.). Intellectual property and information wealth: Issues and practices in the Digital age, volume 1. Google Books. Retrieved February 11, 2023, from https://books.google.com/books/about/Intellectual_Property_and_Information_We.html?id=bnW8ypT9_pIC
  7. MacQueen, H. L. (n.d.). Contemporary intellectual property: Law and policy. Google Books. Retrieved February 11, 2023, from https://books.google.com/books?id=_Iwcn4pT0OoC
  8. Roose, K. (2022, December 5). The brilliance and weirdness of chatgpt. The New York Times. Retrieved January 27, 2023, from https://www.nytimes.com/2022/12/05/technology/chatgpt-ai-twitter.html
  9. Fingas, J. (2023, January 11). OpenAI will soon test a paid version of its hit Chatgpt Bot. Engadget. Retrieved February 11, 2023, from https://www.engadget.com/openai-chatgpt-professional-paid-chatbot-143004442.html
  10. Fingas, J. (2023, January 11). OpenAI will soon test a paid version of its hit Chatgpt Bot. Engadget. Retrieved February 11, 2023, from https://www.engadget.com/openai-chatgpt-professional-paid-chatbot-143004442.html
  11. OpenAI's CHATGPT is scary good at my job, but it can't replace me (yet). ZDNET. (n.d.). Retrieved February 11, 2023, from https://www.zdnet.com/article/openais-chatgpt-is-scary-good-at-my-job-but-it-cant-replace-me-yet/
  12. OpenAI's CHATGPT is scary good at my job, but it can't replace me (yet). ZDNET. (n.d.). Retrieved February 11, 2023, from https://www.zdnet.com/article/openais-chatgpt-is-scary-good-at-my-job-but-it-cant-replace-me-yet/
  13. Chatgpt: Threat or menace?: Inside higher ed. Higher Ed Gamma. (n.d.). Retrieved February 11, 2023, from https://www.insidehighered.com/blogs/higher-ed-gamma/chatgpt-threat-or-menace
  14. Bogost, I. (2022, December 16). CHATGPT is dumber than you think. The Atlantic. Retrieved February 11, 2023, from https://www.theatlantic.com/technology/archive/2022/12/chatgpt-openai-artificial-intelligence-writing-ethics/672386/
  15. What is CHATGPT and why does it matter? here's everything you need to know. ZDNET. (n.d.). Retrieved January 27, 2023, from https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/
  16. Ruby, D. (2023, January 2). CHATGPT statistics for 2023: Comprehensive facts and data. Demand Sage. Retrieved January 27, 2023, from https://www.demandsage.com/chatgpt-statistics/
  17. Hillemann, D., & Zimprich, S. (2022, December 9). Chatgpt - legal challenges, legal opportunities. Fieldfisher. Retrieved February 11, 2023, from https://www.fieldfisher.com/en/insights/chatgpt-legal-challenges-legal-opportunities
  18. Why is ChatGPT making waves in the AI market? Gartner. (n.d.). Retrieved January 29, 2023, from https://www.gartner.com/en/newsroom/press-releases/2022-12-08-why-is-chatgpt-making-waves-in-the-ai-market
  19. Vincent, J. (2022, November 15). The scary truth about AI copyright is nobody knows what will happen next. The Verge. Retrieved January 27, 2023, from https://www.theverge.com/23444685/generative-ai-copyright-infringement-legal-fair-use-training-data
  20. McKendrick, J. (2022, December 26). Who ultimately owns content generated by CHATGPT and other AI platforms? Forbes. Retrieved January 29, 2023, from https://www.forbes.com/sites/joemckendrick/2022/12/21/who-ultimately-owns-content-generated-by-chatgpt-and-other-ai-platforms/?sh=7205359e5423
  21. Loafars. (2023, January 27). Is chat GPT free for commercial use? Chat GPT Pro. Retrieved February 11, 2023, from https://opchatgptai.com/is-chat-gpt-free-for-commercial-use/
  22. Hillemann, D., & Zimprich, S. (2022, December 9). Chatgpt - legal challenges, legal opportunities. Fieldfisher. Retrieved February 11, 2023, from https://www.fieldfisher.com/en/insights/chatgpt-legal-challenges-legal-opportunities
  23. Vincent, J. (2022, November 15). The scary truth about AI copyright is nobody knows what will happen next. The Verge. Retrieved January 27, 2023, from https://www.theverge.com/23444685/generative-ai-copyright-infringement-legal-fair-use-training-data