The U.S. Copyright Office has published announcements and information regarding AI and copyright law and policy. For the most up-to-date notices, see the Copyright and Artificial Intelligence page on its official website.
Algorithmic Bias
While machines themselves do not have inherent biases, the data used to train AI systems are selected and produced by humans. Unintended outcomes exist as a result of many factors, including how algorithms are designed, what datasets are used in training, and how those datasets are created. Algorithmic bias exists in many of the algorithms we use today, including search engines and social media platforms, as well as in artificial intelligence systems such as ChatGPT and Midjourney.
Many companies use AI in their recruitment and hiring processes. While this may seem helpful at first glance, the algorithms are no more impartial than humans. Amazon, for instance, discontinued its use of AI in hiring because of the biases discovered in the algorithm. The dataset was made up of resumes submitted over a period of 10 years, and the algorithm learned which of those resumes had succeeded with a predominantly white male managerial staff. As a result, the AI software more often chose resumes from male applicants with white-sounding names and downgraded female candidates who had attended women's colleges and universities, reproducing the gender bias in the historical data.
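One simple way to look for this kind of skew is to compare how often a screening tool selects applicants from different groups. The Python sketch below is a minimal illustration only: the group labels, the toy outcomes, and the function names `selection_rates` and `impact_ratio` are all hypothetical and do not come from Amazon's system or from any real dataset.

```python
from collections import Counter


def selection_rates(decisions):
    """Return the share of applicants selected in each group.

    `decisions` is a list of (group, selected) pairs, where `selected` is a bool.
    """
    totals = Counter()
    chosen = Counter()
    for group, selected in decisions:
        totals[group] += 1
        if selected:
            chosen[group] += 1
    return {group: chosen[group] / totals[group] for group in totals}


def impact_ratio(rates):
    """Ratio of the lowest selection rate to the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Made-up screening outcomes, for illustration only (hypothetical groups).
toy_decisions = (
    [("group_a", True)] * 6 + [("group_a", False)] * 4
    + [("group_b", True)] * 3 + [("group_b", False)] * 7
)

rates = selection_rates(toy_decisions)
ratio = impact_ratio(rates)
print(rates)
print(ratio)
```

A ratio well below 1.0 does not prove that a tool is biased, but it signals that the outcomes for one group deserve a closer look.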
Facial Recognition
In the early days of photography, balancing tone and color when developing film was based on an image of a white woman. This practice caused photographs of black faces to come out grey, or caused those faces to disappear into a dark background (Huang). In much the same way, facial recognition systems have been trained more on white male faces than on any others. As a result of this bias, facial recognition software is less accurate for other groups of people (Lee, Resnick, and Barton).
Dataset Acquisition
In the case of generative AI in particular, much of the training content was included in the datasets without the appropriate permissions. Crawling Reddit threads and other publicly posted conversations is a grey area, but many creatives have found that their copyrighted works were also included in the datasets used for training. While you cannot have ChatGPT type out the full text of a copyrighted work, concerns and complaints about the use of such works in training data have been raised (Bloomberg Law).
Sources
Bloomberg Law. Copyright Chaos: Legal Implications of Generative AI. March 2023. Retrieved January 8, 2024. https://www.bloomberglaw.com/external/document/XDDQ1PNK000000/copyrights-professional-perspective-copyright-chaos-legal-implic
Huang, Solaya. Time for a new lens: The hidden racism behind photography. Calgary Journal, February 28, 2021. Retrieved January 8, 2024. https://calgaryjournal.ca/2021/02/28/time-for-a-new-lens-the-hidden-racism-behind-photography/
Lee, Nicol Turner, Paul Resnick, and Genie Barton. Algorithmic bias detection and mitigation: Best practices and policies to reduce consumer harms. Brookings Institution, May 22, 2019. Retrieved January 8, 2024. https://www.brookings.edu/articles/algorithmic-bias-detection-and-mitigation-best-practices-and-policies-to-reduce-consumer-harms/
The following resources approach generative AI from a critical perspective. After evaluating these claims, how does this information impact how you will use generative AI tools?
Source: Butler University's AI in the Classroom Lib Guide, which is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.