
ai matters: open source and generative ai

Open Source Software, Open Source AI, Open Washing and Intellectual Property

This article picks up where the series left off, at the intersection of copyright and generative AI. If you’re just joining, it may be helpful to take a look at the series introduction, technology overview, and part 1, where we introduced some of the issues at the intersection of “traditional” copyright and generative AI. In this article, we will examine that same intersection through a slightly different lens: that of open source software.

Open Source Software: Origin Story and Primer

For those less familiar, open source software is interesting because it is both a philosophical movement and a legal structure for how we convey software. As with most things, background and context matter here.

One of the issues I have found most often when working within organizations on free and open source matters is confusion around definitions and around identifying where the IP issues arise within the software. So if you’re less familiar, it’s worth taking a moment to better understand the terms and the origin stories behind them.

Additionally, as I hope you will notice throughout many of these articles, the current IP systems may not be well-suited to address issues of AI, and open source is no exception. Thus, when reexamining the structures of the system, it is often helpful to understand their origins as well as the initial intent of the legal tools, including copyright as applied to software and the open source licensing regime.

To many, open source software is considered, in origin at least, an offshoot of the free software movement.[1] The free software movement is an ideological approach to software which emerged in the 1980s in response to the proliferation of software and the recognition of its value by society, including corporations. Free software started “officially”, in many estimations, with the launch of Richard Stallman’s GNU Project in 1983, and it was popularized by the early 1990s with the release of Linus Torvalds’s Linux kernel[2]. The movement established the 4 essential freedoms[3] of “free software”.[4] However, as initially stated, free and open source software are each rooted in philosophical movements, and therefore, opinions differ on their alignment.[5] In many estimations, open source is based on a utilitarian approach. It follows a logic along the lines of: allowing more developers to contribute to the development process and aligning incentives for bug-free code creates a better software solution. Therefore, it is in everyone’s best interest (private individuals as well as corporations) to find a way to work together, without requiring unduly burdensome compromises or administrative processes. Essentially, open source may still be seen to advance some of the values of the free software movement, namely that collaboration and “freedom” are a better approach than closed, proprietary development, even if it is “just” for practical reasons (e.g., higher quality software solutions). Software is considered open source if it meets the definition of open source managed and held by the Open Source Initiative (OSI).[6] The organization also holds the list of “approved” open source licenses, which will be relevant to understanding later sections.

The open source software licensing scheme originated in large part from the concept of “copyleft[7]” licensing. This concept was developed by the Free Software Foundation as a way to ideologically upend traditional copyright. The community believed that releasing software “free of copyright” (assuming it was possible) would likely result in those outside the community obtaining and copyrighting the works for themselves. To keep the software “free”, they therefore devised a licensing scheme to apply these copyleft principles to the code, namely through a set of licenses that would require anyone who redistributes the software or any derivations of it, whether or not they modify it, to pass along the freedoms to copy and change it (as well as other license terms). As mentioned in the previous article, some believed copyright was ideologically created to encourage the expression and communication of creative ideas, or as some have interpreted, to “protect the artists” from others stealing their ideas. The open source licensing regime was created, in effect, to use copyright (to a greater or lesser extent, depending on the type of license) to encourage or require the open sharing of the expressed ideas, in this case, the ideas expressed through software code.

But we are getting ahead of ourselves here, because we still need to talk briefly about the intersections of IP (and copyright) and software.

Software and IP

This seems like a good time to talk about how software falls into the scope of copyright at all, as many will probably (and rightly so) associate software and technology with the patent space of IP.

A few types of intellectual property may be relevant to this discussion, and therefore will be addressed briefly: patents, copyright and trade secret.

Patents are often said to protect “ideas”[8], but in practice they protect useful inventions that are novel and non-obvious. They historically have provided a limited monopoly to incentivize innovation. Copyright, by contrast, protects the expression of original works of authorship. In other words, and as applied to software, copyright protects the software code itself (e.g., the expression of the creativity in the implementation details), while patents may protect the functionality of the software (e.g., what the software allows the computer system to do). The copyright owner has control only over the work that he/she/they created. Therefore, infringement may occur if software is copied without the permission of the author. However, the copyright owner does not have control over someone else’s independent creation. This may sound obvious, but it is in contrast to a patent right, which may be enforced against another entity if the innovation falls within the scope of the patent claims, even if that entity was unaware of the existing patent right and thus did not copy or had no intention of copying the concept. Therefore, the patent owner has control (the right to exclude) over the patented innovation, even if it was independently created by another.

Trade secrets should also be mentioned. They protect valuable information that must be kept confidential in order to merit protection. Openly sharing a trade secret, even once it has been designated as such, generally destroys any legal protections that would otherwise be afforded the trade secret. Trade secret protection can be a viable alternative to other types of IP when the owner deems it more valuable and/or more realistic to protect its interests through non-disclosure.

Copyright’s application to software (“computer programs”) originally arose from an interpretation of software as a “literary work” under the Copyright Act of 1976[9], which was later amended to add a definition of computer programs to 17 U.S.C. §101, among other changes. Then in 1983, Apple v. Franklin Computer Corp., 714 F.2d 1240, established that binary code (machine-readable code), in addition to source code (the human-readable code), was copyright protected. From there, as software grew to be more and more widespread, much of the global approach to copyright as it pertains to software began to harmonize.

Open Source Software Licenses

A final building block, and then we can start to examine how these pieces may fit together. Open source software licenses are unique in that, in my experience, a software developer often knows much more about them than a general practice attorney does. Therefore, it is worth touching on both the legal and technical aspects.

A license is a type of contract that can be used to give one party (the licensee) rights to another party’s (the licensor’s) intellectual property. Since a license is a type of contract, it can have many different scopes, requirements and/or conditions.

As mentioned earlier, the free and open source community took the concept of protecting software with copyright and upended it by applying a set of “copyleft” licenses to various configurations of software. In other words, they created and applied a set of copyright licenses for various pieces of code, with a set of distinct requirements. Notably, many open source licenses are conditional on compliance with certain obligations, such as providing proper attribution or reciprocal licensing. From there, a set of licenses considered less restrictive than copyleft were created to accommodate various needs of both proprietary and non-proprietary developers[10]. Many of these licenses revolved around the concepts of copying (including reproduction and derivative works[11]), and the right to distribute.[12]

The software copyright licenses were initially developed using language that was relatively close to standard copyright language; however, as the community has grown and new licenses have emerged, many of them use language that is fairly open or casual, but are still effective in conveying rights and stipulating obligations and ownership.

As mentioned earlier, the Open Source Initiative[13] is the organization which facilitates the approval process and manages the list of approved open source licenses. These open source licenses can generally be divided into two broad categories: permissive and restrictive. However, in practice (such as in a corporate open source compliance program), they generally fall into three categories: copyleft (“restrictive”), permissive, and those in between, referred to as “weak copyleft”. Each category gives a general sense of the requirements and/or obligations that attach, based on the license, to the user (such as a developer, a distributor, or a corporation).

The names are somewhat intuitive, with copyleft being the most restrictive, in the sense that these licenses contain the most stringent requirements with regard to sharing (e.g., redistribution) of code.[14] Strong copyleft licenses, or licenses identified as restrictive, are generally classified as such because they contain a clause that is often discussed as the ability to “infect” code. While this understanding is not completely accurate, at a high level the concern with code licensed under a strong copyleft license is that if it is implemented into a system, all code within that system that it contacts is required to be disclosed under the terms of the license. Thus, the concern is that if a piece of restricted code is used by developers, the entirety of the company’s code is obliged to be shared, at least upon user request.[15]

On the other end of the spectrum are permissive licenses, which often allow for free use of the code with only minimal restrictions outside of attribution.[16] Between these two license types is the third, often referred to as “weak copyleft”, which often imposes some additional or unique requirements, but fewer than a strong copyleft license like GPL 2.0.[17]
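To make the three-way split concrete, here is a minimal Python sketch of how a compliance tool might map a license name to one of the categories described above. The table contains only the example licenses named in footnotes 14, 16 and 17; the names LicenseCategory, EXAMPLE_LICENSES and categorize are hypothetical, and the matching rule (ignore case, spaces and punctuation) is an assumption made for illustration, not an established standard.

```python
from enum import Enum


class LicenseCategory(Enum):
    STRONG_COPYLEFT = "strong copyleft"
    WEAK_COPYLEFT = "weak copyleft"
    PERMISSIVE = "permissive"
    UNKNOWN = "unknown"


# Example licenses taken from this article's footnotes (hypothetical table;
# a real compliance program would maintain a longer, reviewed list).
EXAMPLE_LICENSES = {
    "GPL 2.0": LicenseCategory.STRONG_COPYLEFT,
    "GPL 3.0": LicenseCategory.STRONG_COPYLEFT,
    "CC BY-SA": LicenseCategory.STRONG_COPYLEFT,
    "MPL": LicenseCategory.WEAK_COPYLEFT,
    "CPL": LicenseCategory.WEAK_COPYLEFT,
    "LGPL v2": LicenseCategory.WEAK_COPYLEFT,
    "MIT": LicenseCategory.PERMISSIVE,
    "BSD": LicenseCategory.PERMISSIVE,
    "Apache 2.0": LicenseCategory.PERMISSIVE,
}


def _normalize(name: str) -> str:
    """Lowercase and drop spaces/punctuation so 'GPL-2.0' matches 'GPL 2.0'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


_LOOKUP = {_normalize(name): cat for name, cat in EXAMPLE_LICENSES.items()}


def categorize(license_name: str) -> LicenseCategory:
    """Return the category for a license name, or UNKNOWN if it is not listed."""
    return _LOOKUP.get(_normalize(license_name), LicenseCategory.UNKNOWN)


print(categorize("GPL-2.0").value)
print(categorize("Some Custom License").value)
```

Returning UNKNOWN rather than guessing is deliberate: an unrecognized license is exactly the case that should be routed to a person for review.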

Open Source Software Compliance and AI

While this is not a new issue, the rise of generative AI has only increased the importance of properly setting up and (the hard part) implementing, maintaining and scaling an open source compliance program. Many organizations lack the awareness or prioritization needed to establish a right-sized open source program for all software in the company. Without at least a basic accounting for open source software (such as a comprehensive overview of all open source components, including open source datasets) and an approach to managing it, many companies are opening themselves up to enormous risk should anyone ever review their software, as in the Tesla situation described in footnote 15. Most sophisticated tech companies have a robust compliance or awareness program that includes an element of active contribution to the open source community through planned open source releases via code sharing repositories like GitHub. However, the ubiquity of software development and the near-necessity of working with open source code in nearly any modern software development project mean that just about every company should have at least a basic, but active, open source compliance program, regardless of size or industry, lest it find itself in violation of license terms and/or required to release its proprietary code to the market.
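As a rough illustration of what a "basic accounting" could look like, the following Python sketch keeps an inventory of open source components (code and datasets) and lists the entries that would merit a closer look. Everything here is hypothetical: the Component fields, the component names, and the two review rules (no license recorded; strong copyleft code in a distributed product) are simplifying assumptions, and a real program would apply rules set with counsel.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Strong copyleft examples named in this article (assumed flag list).
STRONG_COPYLEFT_EXAMPLES = {"GPL 2.0", "GPL 3.0"}


@dataclass(frozen=True)
class Component:
    name: str                # hypothetical component name
    kind: str                # "code" or "dataset"
    license: Optional[str]   # None when no license has been recorded
    distributed: bool        # shipped outside the organization?


def needs_review(inventory: List[Component]) -> List[Tuple[str, str]]:
    """Return (component name, reason) pairs that warrant a closer look."""
    flagged = []
    for component in inventory:
        if component.license is None:
            flagged.append((component.name, "no license recorded"))
        elif component.license in STRONG_COPYLEFT_EXAMPLES and component.distributed:
            flagged.append((component.name, "strong copyleft in distributed product"))
    return flagged


inventory = [
    Component("parser-lib", "code", "MIT", True),
    Component("kernel-module", "code", "GPL 2.0", True),
    Component("scraped-corpus", "dataset", None, False),
    Component("internal-tool-dep", "code", "GPL 3.0", False),
]

for name, reason in needs_review(inventory):
    print(f"{name}: {reason}")
```

Note that the dataset with no recorded license is flagged even though it is not distributed; in a generative AI context, undocumented training data is one of the gaps such an inventory is meant to surface.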

Open Source Software and AI

With this basic knowledge of both software copyright and open source licensing, the issues surrounding generative AI are relatively simple to state. Of course, a problem being “relatively simple” in no way indicates the ease or simplicity of the solution; oftentimes quite the opposite.

Let’s divide the issues into two categories: 1) software developers against the large AI model companies and 2) developers in the OSS community wondering how to protect, define and identify licenses, specifically for “Open Source AI”.

First, we will follow in the vein of the earlier piece by tracking some relevant litigation that contends with the current definitions of copyright in the software space. Specifically, we look at a seminal case, Doe v. GitHub, wherein a group of developers filed a class action suit against GitHub, Microsoft, and OpenAI back in 2022. The complainants challenged the legality of GitHub’s Copilot and OpenAI Codex. For those unfamiliar, GitHub Copilot is a generative AI tool made to help software developers by suggesting pieces of code, and OpenAI Codex is the model powering GitHub Copilot. This was the first major suit that implicated copyright issues against OpenAI in the software space.

The complaint asserted many violations, including two copyright-related claims that were directed toward violations of the open source license terms (essentially, a contracts claim) and of the Digital Millennium Copyright Act[18] (DMCA). The open source license violation assertions were based on copyright and contract law, in that the licenses were OSS licenses, which, as discussed above, are copyright licenses that license rights to use and/or distribute the copyrighted code. The theory is that by using (and copying) the code, the user accepts the terms of the license. Then, in direct violation of the contractual (e.g., license) terms, GitHub’s Copilot allegedly reproduced portions of the code without appropriate attribution and without including the copyright notice and/or the license itself with the code, as the terms of the license often require.[19] Similarly, the DMCA claims also hinged on the issue of copyright. Because the DMCA is part of the body of US copyright law, the work at issue must be subject to copyright protection for the DMCA provisions to attach. The subsections of the DMCA allegedly violated are those directed toward copyright management information (CMI), and in particular toward removing or altering CMI such as information about the author, copyright owner, and copyright notice. Therefore, according to the complaint, the use and reproduction of the open source code (without its accompanying open source licenses) violated the DMCA.

Much of the initial complaint was dismissed and amended claims were refiled.

However, earlier this summer, in a decision[20] unsealed in late July, a judge dismissed with prejudice[21] all but 2[22] claims[23], including the DMCA claims, leaving only two claims standing: open source license violation and breach of contract. This is due to the judge’s determination that the code allegedly copied from developers wasn’t similar enough to the original code to support the copyright violation claim needed for the DMCA provisions to apply.

Doe v. GitHub will likely continue, with the unnamed developer plaintiffs and their lawyers continuing to attempt to identify just what shape some of the rights of a group of open source software developers will take in the world of generative AI and big tech.

Open Source Generative AI, Openwashing, and Toward a New Definition of Open Source AI

A second issue at the intersection of software, open source and generative AI is around the development of “open” or “open source” AI. Given the benefits of open source software as it has developed over the past nearly 30 years, many companies have been hyping the “openness” of their models, speaking the same language as the open source community, and citing the now well-established benefits of “open” over “closed” development. However, many in academia and the open source community have noted that the “openness” some large tech companies claim fails along many of the key freedoms fundamental to the open movement. Accordingly, momentum is building around the claims of “openwashing”[24] and calls for transparency on the definitions of openness, specifically around “open AI”. For example, in their article recently profiled in Nature,[25] researchers in the Netherlands explain the current situation well: “When generative AI follows a release-by-blogpost[26] model, it is reaping the benefits of mimicking scientific communication – including associations of reproducibility and rigor – without actually doing the work. And when generative AI co-opts the term open source, it is reaping the benefits of libre culture – including associations of transparency and associated freedoms – without actually contributing to the commons…[t]here is ample evidence that as a communication strategy, open-washing is highly effective.” (emphasis added).[27],[28] One well-known example is the article published by Meta founder and CEO Mark Zuckerberg, “Open Source AI Is the Path Forward”.
In the July 2024 piece, he talks about the history of open source and accurately recounts its rise to popularity by following the development and adoption of Linux, from an outlier in a closed-development world to “the industry standard foundation for both cloud computing and the operating systems that run most mobile devices…”,[29] grounding the release of Meta’s Llama squarely in the open source context. Many have begun to note, however, that Meta’s Llama only discloses the weights in the model. While this disclosure may be helpful, it lacks much of what developers have come to consider standard with something marked “open source”, namely, as cited in a recent Financial Times article, “a full understanding of the systems”[30]. Similarly, Google has released “open” models that come with significant use restrictions. While much ambiguity exists on what exactly “a full understanding” might mean, there seems to be some consensus that an indication of and access to training datasets is essential to understanding (and reproducing) a generative AI system. But more on that in just a bit.

One obvious reason for corporations to participate in what has been identified as openwashing is to garner the good will of the public and to be seen as collaborators in and contributors to innovation and development. Another reason may be less obvious, but is equally relevant: the EU AI Act has been described as a “choose your own adventure” by Kate Downing in her piece of the same title, “Choose Your Own Adventure: The EU AI Act and Openish AI”.[32] As she aptly notes, the EU AI Act creates exemptions for “software and data, including models, released under a free and open-source license that allows them to be openly shared and where users can freely access, use, modify and redistribute them…”[33]. We will not go into the details here, but suffice to say, there is the potential for significant resource savings (time, costs, and all that is consumed in the ambiguity of the often trial-and-error implementation of a new law) by simply labeling the asset (likely an AI model) as “open”. In this case, since the current “definition” provided by the EU AI Act is fairly ambiguous and does not require the presence of the “4 freedoms” as originally elucidated by the open source community, there is much space for interpretation and the potential for stretching the definition to suit interests that may not align with the spirit of the Act, or at least of the open source community.

Before we get too skeptical of big tech interests and begin to make only negative inferences in the presence of any and all unknowns, it is important to note that there are many complex ambiguities to the term “openness” in the generative AI space. As discussed in previous articles, “generative AI” covers a vast scope of models, technologies and systems that are often interconnected, and many complexities exist regarding the interplay of transparency and profit-driven growth, even in the “simple” exercise of developing a global definition of “open source AI”.

However, there is an awareness that being proactive in this space is essential to ensure innovation is not hampered and opportunities are not lost. For at least these reasons, similar to the developer community’s desire to organically develop, articulate, and host both a philosophy and collection of licenses under the heading OSI, efforts have been growing and consolidating to establish a definition for open source AI. While there are many similarities between open source software requirements and open source AI needs, it has become increasingly apparent that new definitions are needed, and that the current licenses are insufficient to properly build “openness” into the growing generative AI development community.

As acknowledged by the OSI team spearheading the redefinition efforts, “AI systems are growing more complex and pervasive every day. The traditional view of Open Source code and licenses, when applied to AI components are not sufficient to guarantee the freedoms to use, study, share and modify the systems. It is time to address the question: What does it mean for an AI system to be Open Source?” While the initiative aims to create a new definition, they are following their traditional process of bringing together global experts “to establish a shared set of principles that can recreate permissionless, pragmatic and simplified collaboration for AI practitioners” and the validation process will be “community-led, open and public.”

In late August, OSI released its latest version[34] of the definition for Open Source AI to the public for comment, a draft which many believe to be very near the final draft[35]. The definition includes an initial reference to and incorporation of the original “4 freedoms”, which I reproduce here[36], as much of the confusion around “freedoms” often revolves around an interpretation of “free” as “at no cost” versus, what is actually meant, namely the following, the freedoms to:

  1. Use the system for any purpose and without having to ask permission.
  2. Study how the system works and inspect its components.
  3. Modify the system for any purpose, including to change its output.
  4. Share the system for others to use with or without modifications, for any purpose.

Grounded in these foundational freedoms, the definition goes on to stipulate the disclosure of 3 key categories: data information (“sufficiently detailed information about the data used to train the system, so that a skilled person can recreate a substantially equivalent system using the same or similar data. Data information shall be made available with licenses that comply with the Open Source Definition.”), code (“The source code used to train and run the system, made available with OSI-approved licenses. For example, if used, this would include code used for pre-processing data, code used for training, validation, and testing, supporting libraries like tokenizers…”), and weights (“the model weights and parameters, made available with OSI-approved terms. For example, this might include checkpoints from key intermediate stages of training as well as the final optimizer state.”). OSI has also helpfully published a draft checklist[37] to accompany the definition, which makes concrete what is required, what is optional, and what is not relevant.
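The three disclosure categories lend themselves to a simple checklist. The Python sketch below is a deliberately reduced illustration, not OSI's checklist: it records only whether each of the three categories quoted above is disclosed and reports what is missing. The names AIRelease and missing_categories are hypothetical, and the sketch ignores the licensing terms that the draft also attaches to each category.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AIRelease:
    """Which of the draft definition's three categories a release discloses."""
    data_information: bool
    code: bool
    weights: bool


def missing_categories(release: AIRelease) -> List[str]:
    """List the required categories that the release does not disclose."""
    required = {
        "data information": release.data_information,
        "code": release.code,
        "weights": release.weights,
    }
    return [name for name, present in required.items() if not present]


# A hypothetical weights-only release, the pattern criticized as openwashing.
weights_only = AIRelease(data_information=False, code=False, weights=True)
print(missing_categories(weights_only))
```

Under these assumptions, a release that publishes only its weights falls short on two of the three categories, which is the gap the openwashing critique points to.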

Notably, the definition states that it “does not take any stance as to whether model parameters require a license, or any other legal instruments, and whether they can be legally controlled by any such instruments once disclosed and shared.”

However, even assuming the best and that a definition will soon be finalized, questions of implementation, use, and enforceability remain. Suffice to say, this is an area that is absolutely worth following, as the decisions and implementation will certainly be a significant, if not THE deciding, force in the direction, speed, and accessibility of the future of generative AI.

[1] If you don’t know about the free software movement, a great place to start is by checking out the Free Software Foundation’s page here: https://www.fsf.org

[2] If you’re interested in the full history (including a timeline) of the open source software movement you can find it in the OSI archives: https://web.archive.org/web/20021001164015/http://www.opensource.org/docs/history.php

[3] A program is free software if the program’s users have the four essential freedoms:

  • The freedom to run the program as you wish, for any purpose (freedom 0).
  • The freedom to study how the program works and change it, so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help others (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

[4] Notably, “Think “free” as in free speech, not as in free beer” became popularized as a slogan used by the free software folks.

[5] For example, GNU (the operating system created by the FSF) explicitly states in its “Philosophy” page that “‘Open source’ is something different: it has a very different philosophy based on different values. Its practical definition is different too, but nearly all open source programs are in fact free.” The page includes another article written by founder Richard Stallman titled “Why Open Source misses the point of Free Software” that states “The terms “free software” and “open source” stand for almost the same range of programs. However, they say deeply different things about those programs, based on different values. The free software movement campaigns for freedom for the users of computing; it is a movement for freedom and justice. By contrast, the open source idea values mainly practical advantage and does not campaign for principles. This is why we do not agree with open source, and do not use that term.” https://www.gnu.org/philosophy/open-source-misses-the-point.html.

Nevertheless, as stated, in practice, they often work quite similarly and in industry are generally considered together and indicated as “F/OSS” or “(F)OSS” or “Free and Open Source Software”.

[6] The Open Source Initiative (OSI) is the organization that holds the list of licenses commonly recognized as “official” (aka “OSI Approved”) open source licenses, and it is coordinating the effort to establish a common definition for Open Source AI. OSI is a non-profit based in California. OSI page, Open Source Definition, accessed 26 Aug. 2024.

[7] The “left” of copyright is intended to be the opposite of “right” in copyright, not “left” as in “to leave”. To read more about the origins find the article “What is Copyleft?” on the GNU page here: https://www.gnu.org/licenses/copyleft.en.html

[8] Though “ideas” themselves are considered abstract and therefore not patentable (see 35 U.S.C. §101), that distinction is a conversation for another piece.

[9] The amendment adding this definition became law Dec 12, 1980, after Congress appointed a National Commission on New Technological Uses of Copyrighted Works (CONTU) to review and make recommendations about computer programs and other “new” (at the time) technologies, including photocopying and computer databases. CONTU recommended a new definition be added to Section 101.

[10] Primarily, but not exclusively, the various core licenses were developed to accommodate the needs of corporations as well as the rise of the importance of patents, and more specifically software patents, to corporate entities.

[11] Copying and copying with modifications often invoke the concept of “derivative works”. The term derivative works comes from the US Copyright Act, and therefore it is considered a “term of art” that has a statutory definition (rather than the dictionary definition). The exact definition and application of this term is still a subject of debate within the (F)OSS community, namely determining “how much change is enough or too much change?”

[12] Like derivative works, distribution is still an open question when it comes to its boundaries. For example, a common definition is “the provision of a copy of a piece of software, in binary or source code form, to another entity (an individual or organization outside your company or organization)”. Source: OpenChain. However, this often raises the question of what is an “individual outside your organization” or a “company outside your organization”, as these are often contractual matters, many times with nuance created for reasons related to employment law (e.g., contracting employees, subcontractors, collaborating entities, suppliers, etc.).

[13] https://opensource.org/licenses

[14] Some examples of restrictive or strong copyleft licenses include GPL 2.0, GPL 3.0, CC BY-SA.

[15] For example, in its early years Tesla used GPL licensed (“GPLed”) code in its software and was for many years noncompliant with the license terms. Tesla received numerous public complaints from copyright holders requesting the code and asserting Tesla’s non-compliance, most notably those from the Software Freedom Conservancy. This resulted in Tesla eventually (in mid-2018) releasing some of its code publicly (specifically for Autopilot and the Model S/X infotainment system software) on GitHub. See, for example, https://electrek.co/2018/05/19/tesla-releases-softwar-open-source-licences/.

[16] Some examples of permissive licenses include MIT, BSD, Apache 2.0.

[17] Some examples of weak copyleft licenses include MPL (Mozilla Public License), CPL (Common Public License), LGPL v2 and others.

[18] A digital rights management law; a 1998 law intended to provide a legal basis for the illegality of producing and disseminating technology, services, or devices intended to circumvent measures that control access to copyrighted material, including criminalization of circumventing that access control itself, even if no copyright infringement is detected.

[19] However, as Jeffrey W. Gluck emphasizes in his excellent contribution in the Journal of Emerging Issues in Litigation, without the copyright attaching to the code, “the license/contracts would be null and void, as plaintiffs would have had nothing to offer as part of the licenses/contracts.” p. 33, “Copyright Issues in Generative AI for Software: Doe v. Github Inc.,” Jeffrey W. Gluck, JEIL, Winter 2024, Vol. 4, No. 1, pp. 29-35.

[20] Read the full decision here: https://www.documentcloud.org/documents/24796955-github-copilot-claims-dismissed?responsive=1&title=1

[21] i.e., the developers can’t refile the claim

[22] of the original 22 claims filed.

[23] including requests for unjust enrichment and punitive damages.

[24] “‘Companies have been known to misuse the term when marketing their models,’ says Avijit Ghosh, an applied policy researcher at Hugging Face.” Williams, Rhiannon and James O’Donnell. “We finally have a definition for open source AI” MIT Tech Review. 22 August 2024.

[25] “Not all ‘open source’ AI models are actually open: here’s a ranking”. NATURE. 19 June 2024.

[26] The report authors’ term for the recent release strategy of many of the open generative AI models, namely “[the models] were first made public in a blogpost or press release touting their openness. For instance, TII’s Falcon 70B…as the “top-ranked open-source AI model”, Stability AI’s Stable Beluga as “open access”, Mistral “we have the best open source models”…the strongest claims to label open source…come from Meta and its Llama 2 and Llama 2 models.” P. 1776.

[27] “Rethinking open source generative AI: open washing and the EU AI Act” at P.1776.

[28] In many ways, this is of course reminiscent of the “greenwashing” we have seen a backlash against since corporate marketing and PR jumped on the climate crisis as a marketing asset.

In the open source world, it is also a bit reminiscent of Elon Musk’s 2014 announcement that he was “open sourcing” all of Tesla’s patents (in a post no longer available on the Tesla blog, but which can be read here: https://www.latimes.com/business/autos/la-fi-hy-elon-musk-opens-tesla-patents-20140612-story.html). It conveyed all the altruism while conveying few, if any, of the actual benefits.

[29] Zuckerberg, Mark. “Open Source AI Is the Path Forward” 24 July 2024 https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/. Accessed 27 Aug 2024.

[30] Waters, Richard. “We are a long way from truly open-source AI” 22 August 2024., https://www.ft.com/content/c7ab2cf3-deaf-4de4-9dc7-46eadc84e2a0

[32] “Choose your Own Adventure: The EU AI Act and Openish AI” accessed 27 Aug 2024 https://katedowninglaw.com/2024/02/06/choose-your-own-adventure-the-eu-ai-act-and-openish-ai-2/

[33] EU AI Act, section 60i

[34] Version 0.0.9, accessible here: https://opensource.org/deepdive/drafts/open-source-ai-definition-draft-v-0-0-9

[35] Williams, Rhiannon and James O’Donnell. “We finally have a definition for open-source AI” 22 August 2024. MIT Technology Review. https://www.technologyreview.com/2024/08/22/1097224/we-finally-have-a-definition-for-open-source-ai/

[36] And may be referenced in the earlier footnote, fn 3, at the introduction of this article.

[37] https://opensource.org/deepdive/drafts/the-open-source-ai-definition-checklist-draft-v-0-0-9