Open Thinkering

What do we mean when we talk about 'openness' in (generative) AI?

This blog post is the result of writing a too-long email in response to someone who wanted to think through what ‘open’ means when it comes to AI. In what follows, I’m going to completely disregard anything relating to copyright or IP. I’m also only going to mention environmental concerns as ‘externalities’. That’s not to downplay anything, but just to keep focused on the issue at hand.


Let’s start with the tension between ‘Free’ and ‘Open’ in the world of software. As you may know, the Free Software Foundation’s Four Freedoms are:

  • The freedom to run the program as you wish, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help others (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

So, if we treat training data as playing the role that source code plays for conventional software, then without access to that data the conditions of Freedom 1 can’t be satisfied.

The Open Source Initiative (OSI), an organisation which not everyone thinks is awesome, ran a global co-design process involving participants from more than 35 countries to come up with the Open Source AI Definition (OSAID). This definition says that an ‘Open Source AI’ is one that grants freedoms to:

  • Use the system for any purpose and without having to ask for permission.
  • Study how the system works and inspect its components.
  • Modify the system for any purpose, including to change its output.
  • Share the system for others to use with or without modifications, for any purpose.

Given that definition, Meta’s ‘open source’ Llama models aren’t open (because they have commercial restrictions), and neither are those made by Stability AI or Mistral, which insist on large businesses purchasing enterprise licenses. People have accused them of ‘openwashing’, but I think it’s actually a lot more complicated than that.

There are, after all, fifty shades of open, which is why WAO tends to define open as an attitude.

[Image: a green door with a yellow sign saying “We Are Open”; caption: “OPEN IS AN ATTITUDE”. CC BY-ND Visual Thinkery for WAO]

As you may be aware, with Large Language Models (LLMs) we’re not just talking about data but about the strength of connections in different layers of the model, commonly referred to as weights (there’s a short code sketch of what this looks like in practice after the definition below). So perhaps a better approach to understanding ‘openness’ is the Open Weight Definition, which requires:

1. Free Redistribution

The license shall not restrict any party from selling or giving away the weights as a component of an aggregate distribution containing components from several different sources. The license shall not require a royalty or other fee for such sale.

2. Model Weights

The product must include the weights, and must allow distribution of the weights. Where some form of a product is not distributed with the weights, there must be a well-publicized means of obtaining the weights for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The weights must be the actual form in which a practitioner would use the weights. Deliberately obfuscated weights are not allowed. Intermediate forms such as training checkpoints or partial states are not allowed. Transformed versions such as quantized or optimized forms are only allowed if clearly distinguished. 

3. Derived Works

The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original weights.

4. Integrity of The Author’s Work 

The license must explicitly permit the distribution of weights derived from modifications. The license may require derived works to carry a different name or version number from the original weights. 

5. No Discrimination Against Persons or Groups

The license must not discriminate against any person or group of persons. 

6. No Discrimination Against Fields of Endeavor

The license must not restrict anyone from making use of the weights in a specific field of endeavor. For example, it may not restrict the weights from being used in a business, or from being used for genetic research. 

7. Distribution of License

The rights attached to the weights must apply to all to whom the weights are redistributed without the need for execution of an additional license by those parties. 

8. License Must Not Be Specific to a Product

The rights attached to the weights must not depend on the weights being part of a particular distribution. If the weights are extracted from that distribution and used or distributed within the terms of their license, all parties to whom the weights are redistributed should have the same rights as those that are granted in conjunction with the original distribution. 

9. License Must Not Restrict Other Software

The license must not place restrictions on other components that are distributed along with the weights. For example, the license must not insist that all other models distributed on the same medium must be Open Weight. 

10. License Must Be Technology-Neutral

No provision of the license may be predicated on any individual technology or style of interface.
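
To make the idea of ‘weights’ a little more concrete, here’s a minimal sketch using PyTorch. The toy model is a stand-in for an LLM (which would have billions of parameters rather than a few hundred), but the principle is the same: the weights are just named tensors of numbers, and an open-weight release ships those tensors without necessarily shipping the data they were learned from.

```python
# Minimal sketch: what "model weights" actually are, using a toy PyTorch model
# as a stand-in for an LLM. Real models have billions of these values.
import torch
import torch.nn as nn

toy_model = nn.Sequential(
    nn.Linear(8, 16),  # first layer: 8 inputs -> 16 hidden units
    nn.ReLU(),
    nn.Linear(16, 2),  # final layer: 16 hidden units -> 2 outputs
)

# The state_dict maps layer names to tensors of learned parameters.
# An open-weight release distributes these tensors, not the training data.
for name, tensor in toy_model.state_dict().items():
    print(f"{name}: shape={tuple(tensor.shape)}")

# Saving the state_dict is, in miniature, what 'releasing the weights' means.
torch.save(toy_model.state_dict(), "toy_weights.pt")
```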

At the moment, I think the biggest barrier to ‘open’ AI is cost. Having worked at Open Source organisations such as Mozilla and Moodle, I know that Open Source development approaches can be tricky to monetise. And it’s extremely expensive to train these models: even DeepSeek R1 cost millions (rather than hundreds of millions) of dollars to train. So even before we consider externalities such as environmental impact, training these models and then “giving them away” (which is how some people understand working openly) is a tough business decision to make.

(Aside: this is why I think huge, well-endowed universities and countries with sovereign wealth funds should be doing this!)

I use Perplexity most days, and it allows users to switch between models when searching. Recently, it has turned on by default the ability to auto-switch between models depending on the scope of the query. This is a technique Laura and I used (manually) when writing the Friends of the Earth report using RecurseChat. These days I’d use Msty (see my HOWTO video).
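
For what it’s worth, the manual version of that technique is easy to sketch. The code below is a hypothetical illustration rather than how Perplexity, RecurseChat, or Msty actually do it: the model names and the scope heuristic are made-up assumptions, but they show the general idea of routing a query to a heavier or lighter model depending on its scope.

```python
# Hypothetical sketch of scope-based model routing. The model names and the
# scoring heuristic are illustrative assumptions, not any product's actual logic.

LIGHT_MODEL = "small-fast-model"       # placeholder name
HEAVY_MODEL = "large-reasoning-model"  # placeholder name


def estimate_scope(query: str) -> int:
    """Crude proxy for query scope: longer, multi-part questions score higher."""
    score = len(query.split())
    score += 10 * query.count("?")
    for keyword in ("compare", "analyse", "summarise", "step by step"):
        if keyword in query.lower():
            score += 15
    return score


def choose_model(query: str, threshold: int = 40) -> str:
    """Route a query to the heavier model only when its estimated scope is large."""
    return HEAVY_MODEL if estimate_scope(query) > threshold else LIGHT_MODEL


if __name__ == "__main__":
    print(choose_model("When was Moodle first released?"))
    print(choose_model("Compare the OSAID and the Open Weight Definition "
                       "step by step, with examples for each clause?"))
```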

There’s much more to say in this area, but for now a final word: I think the ‘technical’ openness of LLMs is less interesting than the ‘cultural’ aspect, i.e. the weights rather than the data. Initiatives such as PD12M are a great idea and useful for specific situations, but sometimes I feel like people are stuck in the McLuhan-esque situation of “looking at the present through a rear-view mirror, so that we march backwards into the future.”

The important thing here is the why, rather than over-rotating on the how.

(Another aside: this is what often frustrates me about the ‘Open Education’ community, especially re: OERs.)