The Third Pillar: Preservation

We need to talk about preservation.

If we accept that starting from Human Rights is necessary, …

if we accept that “scale” is about capacity of MOSS gardeners more than anything else, …

… then how do we preserve a commons?

This question is strongly intertwined with the previous one, but slightly less about the ideal to attain, and more about practical approaches.

The first thing we have to do is consider what attacks on the commons we currently know of, before we can turn to defending against them.

Vendor Lock-In

The historical attack that spawned the Free and Open Source software movements is “vendor lock-in”. By choosing some vendor system that comes with proprietary software, we effectively choose to be locked into that vendor’s ecosystem.

Stallman’s realization was that he needed to obtain source code and, failing that, to provide alternative source code that ideally does much the same.

In order to protect this alternative source code from then being appropriated by the vendor, he created copyleft as a mechanism to ensure that the software remains accessible to everyone.

Copyleft as a protection mechanism has succeeded. This kind of hard vendor lock-in has a well-understood and proven defense.

License Defensibility

Folk are quick to point out that a license is just a piece of paper (or digital text), and you either enforce it or can wipe yourself with it.

While this is always going to be true in extreme cases – nobody is going to believe Hitler would honour a copyleft agreement – throwing out the baby with the bathwater is not an appropriate response.

There will always be cases where the license can be enforced, but isn’t.

So the highly abstract way to phrase a defense here is to increase the capacity of the community to enforce license agreements.

There are several approaches to this, but they tend to cluster around legal defense funds, and perhaps copyright assignment to a body that will take legal action on behalf of a project.

While the specifics of such approaches can vary, the general idea is sound. This kind of attack has a defense, which “just” needs to be implemented.

Commons Enclosure

Since time immemorial, enclosure of the commons has been a popular attack. The term enclosure comes from land ownership, where monied folk bought land surrounding common land, and then fenced off this newly acquired land. As a result, the common land became inaccessible except by the (paid) permission of the land owner.

Digital commons, of which open source software is one component, can similarly be surrounded by fences. In software, there are several popular strategies for doing this.

Open Core

Under an open core model, open source software is the core of some offering. This core is designed to be extensible by plugins, and licenses are chosen so that the plugins are not affected by copyleft virality clauses, if any.

Vendors then publish proprietary plugins, and make customers depend on those. This can include offering additional services tied to one or more proprietary plugins, which customers rely on. An “ideal” open core model gives customers little choice in the matter, for example when the plugin contributes to regulatory compliance or some such.

“Enclosure” here isn’t a precise translation from the historic use, but it is a good description of the same effect: the commons can only be meaningfully accessed by paying a tithe to the landlord.

Relicensing

An unfortunate and repeated occurrence is when some group of people gains effective control over an open source project, and then decides to relicense it under a proprietary license. The community can fork the project just before the license change, but any additions made to the original software remain closed.

In a sense, this is the more literal translation of the historic enclosure to the digital realm: access is literally closed off.

What’s worth highlighting, however, is that the point is not to deny access as such. The point is for existing users to be forced to do business with the landlord.

Embrace, Extend, Extinguish

A common tactic against open source is the “Embrace, Extend, Extinguish” (EEE) model, popularized by Microsoft. Here, a vendor first appears to support an open source project. Then it extends the project with modifications that largely matter to the vendor’s customers, and nobody else. Finally, in some form or another, they “extinguish” the project. This could mean any number of things – but it usually means dropping the support the community has come to rely on so suddenly and swiftly, that the project cannot easily recover.

Then, customers are “forced” to use the landlord’s proprietary solution, as the open source alternative is no longer viable.

A famous example of this tactic was employed by Google around twenty years ago. Just as XHTML 2.0 was about to be finalized as a standard, “concern trolling”, which mostly came from Google, delayed the adoption of the standard for years.

Then an “alternative proposal” was sprung into the discussion, which also was largely backed by Googlers: instead of quibbling endlessly over how the final phrasing of the standard might be, let’s introduce HTML5 as a “living standard”, which is expressly allowed to evolve over time.

Of course, such a “living standard” opens up many areas of work – so many, in fact, that only the best funded browser manufacturer can hope to send representatives to each of the working groups, and then send multiple. This effectively lets Google steer the majority of the HTML5 “standard”.

EEE works on software, standards, and other collaboration artefacts.

Defenses against Commons Enclosure

Although the specific activities by which the commons are enclosed vary, there is an underlying pattern to each of them: a single entity fields the majority of contributions to a project, and so gains effective control. They may or may not turn this into explicit control at some point.

Therefore, the defense against such tactics is obvious: much as in e.g. political forums, MOSS projects must adopt governance processes by which participants collectively determine the direction of the project, and no single entity can do so unilaterally.

Democratic systems around the world have, some better, some worse, demonstrated how this kind of governance may be achieved. MOSS projects must learn from the better ones.

A simple example, certainly not sufficient, might be to limit the number of representatives from a single vendor with voting rights. It is that kind of community management that is required to keep MOSS gardens flourishing.
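To make the vendor cap concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `eligible_voters`, the cap of two seats, and the rule that seats go to members in list order are hypothetical, not taken from any real project's governance.

```python
from collections import Counter


def eligible_voters(members, max_per_vendor=2):
    """Return the members who may vote, capping each vendor's voting seats.

    `members` is a list of (name, vendor) pairs, assumed to be ordered by
    seniority. A vendor of None marks an unaffiliated contributor, who is
    never capped. Both conventions are illustrative assumptions.
    """
    seats = Counter()
    voters = []
    for name, vendor in members:
        if vendor is not None:
            if seats[vendor] >= max_per_vendor:
                continue  # this vendor already holds its maximum of seats
            seats[vendor] += 1
        voters.append(name)
    return voters


members = [
    ("alice", "BigCorp"),
    ("bob", "BigCorp"),
    ("carol", None),
    ("dave", "BigCorp"),
    ("erin", "SmallCo"),
]
print(eligible_voters(members))
```

With a cap of two, the third BigCorp representative can still contribute, but does not get a vote. As the text says, a rule like this is not sufficient on its own; it only shows how little machinery the basic idea needs.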

Commons Exploitation

A related but distinct kind of tactic is for single participants to exploit the commons. In historic times, this might have been the farmer with the largest herd of sheep driving their animals to the commons early in the day, so that any subsequent herd finds little to graze.

Permissive Licensing

Permissive (sometimes called MIT-style) licensing does not contain copyleft provisions. A vendor can therefore extend open source software with proprietary code without concern, and ship only the modified software, without source code.

Permissive licensing works similarly to commons enclosure, even though it does nothing to the open source project itself: the vendor’s modified version presents itself to consumers as the only viable choice, as it adds features the user requires.

In that, it is similar to the Open Core tactic described above. It is therefore unsurprising that quite a few Open Core projects use permissive licensing, as they work well in tandem.

The obvious defense is to retain strong copyleft provisions.

Generative AI and License Washing

Much can be said about generative AI, or LLMs, in the context of Open Source.

In the context of “exploitation”, the case is very simple: feeding LLMs with open source software as training data produces a vendor controlled artefact (the model), which then produces more code. It is a literal exploitation of the work of others.

Moreover, doing so “license washes” the open source contribution. That is, if the “training data” was licensed under a copyleft provision, the code produced by the model is not. Assuming that LLMs were even capable of such a feat, a model tasked with producing the same functionality as a piece of software that is part of its “training data” could replace the open source software without license restrictions.

The only defense against this is to forbid the software being used for LLM training data, or to make such use infeasible.

Though copyright works only on copies of works, licenses can contain near arbitrary requirements of the licensee. It is therefore also possible to use the copyleft trick against license washing: add a provision that using the software as training data is only permitted on the express agreement that the resulting model and all software it produces are published under the same license.

Summary of Defenses

I consider the battle against proprietary software either fully won or fully lost. It really depends on your goal: is it that open source is used everywhere?

To me, that is a red herring. I do not care about open source. I care about free access to knowledge by everyone. The proliferation of permissive licenses and/or open core type business models tells me that “free access” is something we still do not enjoy. Unfortunately, every “open source” maintainer who chooses a permissive license, for whatever reason, fights on the wrong side of this.

The real point is this: copyleft as a mechanism may only be one tool in the toolbox, but it is a very effective one. We need to use more of it, not less.

Denial of Service

One of the most pervasive attacks against a project is what I will loosely call a denial of service (DoS) attack. This represents any form of interaction that overwhelms maintainer capacity to the point where they cannot continue with their intended tasks any longer.

The example I listed above where HTML5 led to Google’s de-facto control over web standards is actually a kind of DoS scenario. Here, so many changes are discussed in parallel that small groups of maintainers have no hope of dealing with them meaningfully.

The difference here is the target. A “true” DoS targets the maintainers. Google’s attack was designed to exclude less prepared competitors (the “maintainers” here would be the W3C editors, etc. who continued to work just fine.)

There are a number of DoS scenarios I have seen over the years. The key differentiation between a DoS scenario and mere maintainer overwhelm is that individual tasks for maintainers have little individual meaning compared to the overall campaign.

It is not really worth discussing many different scenarios of this kind, as the mitigation strategy is the same: update contribution guidelines to exclude tasks that follow a DoS pattern, then ignore (or close) those that match the pattern. If a set of community members can be seen to be behind them, ban those members.

The ability to create many accounts as well as automation tools may make this difficult in practice. This isn’t the place for discussing technical means; it’s the policy that matters.

LLMs / generative AI

There is one subset of DoS attacks that bears special mention, namely the use of LLM-backed coding or analysis tools.

I find myself in discussions fairly often where their usefulness is praised. In a sense, I don’t care how useful they are, because they’re fascist.

But there are some specific patterns that have emerged, that are worth addressing briefly. The mitigation, however, is to ban LLM-backed contributions outright. Projects may consider permitting specific contributors to use specific tools, however, at least for a limited time.

LLM Code Generation

Code generation is seen as a boon to productivity, but that is a false metric. As discussed in the second pillar, scale in MOSS is all about the human factor.

Code generation is a DoS attack on the human factor, specifically on the community’s capacity building.

It takes a few steps to fully grasp, but it’s really quite simple.

Businesses are sometimes required to present so-called “continuity plans”. These plans outline how a business may restore operations after a critical failure, or how to transfer control over an asset to another business, in case of disaster.

The latter category is a last-ditch contingency. Nobody wants to be there. But the first part is highly important.

Because “critical failure” may mean the loss of a long-time contributor. Open source is old enough now that several projects have been effectively abandoned because their maintainer passed away. Others had to deal with intense maintainer burnout.

Having a contingency plan – such as the one the Linux kernel adopted – is great.

You know what’s better? Not having single points of failure.

The only way you can erase SPoFs in your project’s organization is to onboard more contributors effectively enough that they can take over the tasks of others. In other words, to build capacity.
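Finding those single points of failure can start very simply. The sketch below is illustrative only: the function name `single_points_of_failure` and the hand-maintained mapping from project areas to the people able to look after them are assumptions, not an existing tool or format.

```python
def single_points_of_failure(area_maintainers):
    """Return, sorted, the areas that at most one person can maintain.

    `area_maintainers` maps an area of the project to the people who
    understand it well enough to take it over (a hypothetical, manually
    curated mapping).
    """
    return sorted(
        area
        for area, people in area_maintainers.items()
        if len(set(people)) <= 1
    )


areas = {
    "release process": ["alice"],
    "parser": ["alice", "bob"],
    "website": ["carol"],
    "ci": ["bob", "carol", "alice"],
}
print(single_points_of_failure(areas))
```

The list it prints is not the fix. It only tells you where onboarding effort, that is capacity building, is most urgently needed.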

Code generation attacks a project’s ability to build capacity, because code generation removes, in practice if not in principle, the need to deeply understand a project.

Lacking understanding means that key positions cannot be filled again in the case of a major incident.

Which means the promise of productivity is a red herring (even if true). It’s an attack on capacity building.

LLM Code Analysis

News such as the fact that the USA banned export of Anthropic’s new models seems to underline that LLM-backed code analysis is great.

It can be. But the false positive rate is immense.

False positives present themselves as exploitable security issues, when in practice, the scenarios in which they can be exploited are so absurd and impractical, that they can effectively be treated as having negligible urgency.

Unfortunately, merely classifying issues of this kind eats up more resources than many projects can bring to bear. A seemingly useful tool has led to a denial of service.

This is where the exception to such tooling may be interesting: it is not without merit that individual core maintainers may use such analysis tools to preempt malicious exploits. But they need to be able to do so at their own pace, not driven by outside reports.

As much as it pains me to say this considering how LLMs combine negligible merit with disastrous side-effects, such an activity may simply be necessary in a world in which attackers use this kind of tool.

But there exist options that do not require this, as I will outline in a bit.

Supply Chain Attacks

The canonical definition of a supply chain attack is one where a project is indirectly attacked by undermining its supply chain, i.e. other projects that it depends on.

But there is another “supply chain” attack worth considering. Both, however, revolve around the notion of a “supply chain”.

There are two sides to that coin, and ignoring one won’t make it go away:

Open source licensing explicitly states that there are no warranties extended of any kind.
Businesses building on open source need there to be a supply chain, in some cases also to fulfil regulations under which they operate.

Critics of “supply chain” arguments often raise that first point to show that such arguments are entirely without merit. And that is correct, in the technical kind of way that can miss a larger point.

Yes, MOSS gardening is about human scale. Being able to supply a business with value is not a particularly MOSS-y value.

But if the garden is large enough to support it? Then helping businesses may mean that MOSS catches on – in a similar way that permissive licensing helped open source catch on.

So the canonical supply chain attack is about upstream projects. The variation that is worth discussing is about downstream projects: when another project makes MOSS a part of its “supply chain”, and then demands that the MOSS project acts accordingly.

This actually applies more widely than the “business” case I started out with. It could just as easily be another open source project that has delusions of a supply chain nature.

The defense IMHO is neither to reject the notion of a supply chain outright, nor to bend over backwards to accommodate related demands.

I think the only long-term workable defense is to point at the human scale of MOSS gardening, and demand reciprocity: I understand your needs. I would like to satisfy them. This little MOSS garden cannot do so without more capacity. Help me build it.

The specific kind of help offered and required may vary. But it’s not rude to demand help in kind. MOSS gardening is all about community and reciprocity, and that needs to be understood all around.

Conclusion

I previously outlined a number of attacks on human scale MOSS gardening, and broad defenses against them. In no way is this meant to be a complete list, nor are the defenses meant to be perfect.

The reason for this list is that it illustrates one key point that a fair few open source projects do not like to accept: open source is deeply political.

This isn’t about party politics. It isn’t about a Left-Right political spectrum (at least not directly). It most certainly isn’t about specific political issues that woke-fearing folk are worried about.

Politics, etymologically, refers to matters of the “polis”, the city-state. In terms of open source, it refers to all the “soft skills” and organizational issues that anyone merely scratching an itch with a few lines of code does not mean to invite.

The bad news is: denial doesn’t make this go away.

But there is also good news.

The good news is that the second pillar already provides all the answers. At least, that is, under one condition: embrace that MOSS is political.

Assume I wrote a shell script, mostly for myself. Assume it scratches a highly individual itch.

Then imagine I have a random passer-by ask me about my project’s code of conduct. Of course this is a ridiculous notion, more work than this throwaway code was worth, and what does that person even think?

Unless… unless this is a signal. Specifically, it’s a signal that this little garden may actually be a little bit more than a hobby.

MOSS scale doesn’t require you to provide a shell script with a code of conduct. It doesn’t require any code of conduct you might have to be perfect.

All it requires is openness to the scale changing.

That turns a rejection of such suggestions into a kinder response: I didn’t think a CoC was necessary. Would you continue interacting without one? Are there specific concerns, or questions you wish answered?

Chances are, the Nth time you need to answer the same questions, you’d much rather point people at a pre-written text. Chances are also that if you’ve been asked N times, there is already a lively community of contributors, at which point the question of what the community should have as behaviour standards is perfectly valid.

MOSS scale provides many of the specific answers to preservation efforts, but requires an acceptance that gardening MOSS is deeply political.