Repositioning ForgeFed? Scope to Code Forges or Free Software Development Lifecycle (FSDL)

aschrijver · May 30, 2022, 9:05am

(This topic is copied from this chatroom discussion)

Welcome, fr33domlover. I am not directly involved with forge federation projects, but I regularly jump into these rooms as a general advocate for anything Fediverse, hoping to inspire applications that will take fedi to its full potential. I don’t know if I’m able to attend a meeting and really don’t have free time to spend. But there’s one point of advocacy I made on a number of occasions and will make again now…

A code forge is a bundled set of tools that help make software development more efficient. It represents an application and the features it offers relate to an ‘application domain’ (that of ‘code forges’). For the federation you can slice it like that and then call your AP specification project e.g. “ForgeFed”.

But in doing so you might sell yourself short, and run a high risk of squandering a unique opportunity. By specifying the application domain the specifications become most usable and obvious to use for code forges. They may and likely will become much harder to use in the more general context of the top-level ‘business domain’ of Software Development. (A ‘business domain’ is a particular field of expertise, not to be confused with commercial businesses).

IMHO one of two different approaches should be chosen for the ForgeFed project:

1) ForgeFed is repositioned to model the various business domains of software development.
2) ForgeFed will clearly limit its scope to a single business domain of software development.

(In both cases ForgeFed may not be the best name for the project, but that is another matter)

For the rationale to 1) it is best to start looking at Github. With its popularity, network effects and FOMO it has established a truly dominant position in the software development community. There’s much more than the code forge alone, as Github is the center of an enormous ecosystem of vendors that offer value-added services and tools covering the entire software development lifecycle. This cements Github’s position, and they can selectively adopt attractive new features into their platform, making it a de-facto walled garden. Gitea and others are eternally trying to catch up.

Now with ActivityPub federation the entire software development lifecycle can be opened and democratized! That is the humongous opportunity that exists. If the ForgeFed specifications were to grow (hypothetically) and encompass Project, Board, Comment, CI Pipeline or whatever other concepts, then there is a need to split into business domains. Project and Board for instance are also concepts that exist in Trello, and a Trello board might federate with a Github or a Gitea board. But Trello won’t do that, or won’t become aware if things are tightly interwoven with other forge features they don’t offer.

So to me it makes sense to slice according to business domains. Instead of the application domain of ‘code forge’ there’d be sub-domains of Revision Control, Project Management, Task Management, Quality Assurance, etcetera. There’d be a whole set of different specifications that evolve separately, each at their own pace and likely with different people involved. But they can be collected under one ‘umbrella’, and I suggested a name for that: Free Software Development Lifecycle or FSDL.

Now I can understand (and this was mentioned by dachary.org) that such a larger scope needs a different organization and people committed to it. I think it does not need all that much extra effort, because it is something that will grow incrementally over a prolonged period of time. The upfront effort is more in the repositioning it involves (these specs may live at fsdl.forge.es where multiple parties are already collaborating).

But if this idea is too much and no one feels like being involved then we come to option 2)

This is the approach where a FSDL may eventually be established, but not right now. In that case (and it looks like ForgeFed is already largely scoped like this), ForgeFed may deliberately restrict itself to one particular sub-domain, do that really well, launch it in production, and then see what the future holds. That sub-domain looks to be Revision Control and aligns with the typical functionality that Git offers, but then on the code forge’s end.

bill-auger · May 30, 2022, 7:20pm

i may be misunderstanding this - the jargon in that post is non-intuitive to me - forge-fed can be discussed/presented much more plainly

it is definitely concerned with more than only the VCS, and is definitely not specific to git; but it is limited to “forges”, whatever a forge is

the goals of forge-fed could be summed up as “everything that someone could do by clicking a mouse on a forge website, should be possible without a web browser” - ie: all forge operations should be possible using only a ‘curl’ client - once you have that layer of simplicity in place, everything is possible, including cross-forge interoperability and whole project migration - that simplicity is complicated only by cross-server authentication - the only fundamental constraint is which set of operations is common across forges (which requests the target forge is willing to fulfill) - the logistics of it could be summed up as “translating forge-fed requests into CRUD operations on the remote forge database” - any concerns which do not fit that generalization do not need to be supported

the concerns for “business domains” and “lifecycles” only complicate people’s thinking - maybe this is more about expanding the scope of the project, to accommodate things that forges do not already do? - at least for the initial ‘core’ protocol, it is sufficient to let the forges decide which features to support - forge-fed could accommodate whatever forges permit their users to do with a mouse, or via an API - everything that forges do is represented canonically in their databases; and i dont see any reason for forge-fed to accommodate anything that forges generally do not do

however, there could always be increasing (but backward-compatible) compatibility versions defined over time, extensions, or whatever is necessary for special use-cases as they arise, beyond the core functionality which is common across forges - if any of those special extensions need to be relatively complex, of course new projects could form around them, dedicated to that domain

i think it is most critical to keep the ‘core’ protocol as limited and generic as possible, and suggest to implementers that any unrecognized messages should be ignored, in order to avoid proliferation of incompatible protocols, while allowing for arbitrary extensions transparently
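The "ignore unrecognized messages" rule can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the handler names, the choice of `Create` and `Follow` as the supported subset, and the return values are hypothetical and not taken from the ForgeFed specification.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[dict], str]


def handle_create(activity: dict) -> str:
    # Hypothetical handler: report what kind of object was created.
    return f"created {activity['object']['type']}"


def handle_follow(activity: dict) -> str:
    # Hypothetical handler: report who wants to follow.
    return f"follow request from {activity['actor']}"


# Hypothetical 'core' subset that this particular node chooses to support.
HANDLERS: Dict[str, Handler] = {
    "Create": handle_create,
    "Follow": handle_follow,
}


def dispatch(activity: dict) -> Optional[str]:
    """Run the handler for a known activity type; silently ignore the rest."""
    kind = activity.get("type")
    if not isinstance(kind, str):
        return None  # missing or multi-valued type: treated as unrecognized here
    handler = HANDLERS.get(kind)
    if handler is None:
        return None  # unrecognized message: ignore rather than reject
    return handler(activity)
```

A node built this way keeps working when peers send extension messages it has never heard of; `dispatch({"type": "focal:ReminderCard"})` simply returns `None`.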

am i missing something? - why cant we think of it so simply?

aschrijver · May 30, 2022, 9:00pm

Well, it is rather intuitive really, though likely my poor explanation skills made it seem complex

If you start modeling, then depending on how you define what you are modeling, you likely end up with wholly different models.

“Given code forges - whatever a forge is - we can ‘reverse-engineer’ common features found in these applications and turn them into interoperable specifications for code forges to adopt.”

Versus …

“Given common Revision Control needs of developers, as implemented in code forges and related tools we can define an interoperable specification for anyone implementing revision control.”
- And on to the next sub-domain, e.g. Project Management spec that maps well onto code forges (but also on Trello).

Then on the terminology…

Lifecycle. Any software project has a lifecycle and part of that lifecycle is supported by features in code forges. As it happens more and more of this lifecycle is added to forges as features (especially Github with others catching up, and certainly in its ecosystem the entire lifecycle is represented). It is also called software development process, but I find that less descriptive. During a project lifecycle there are multiple processes working in parallel or sequentially, and stages going from the project’s inception to its end-of-life.

Business domain. Any software project always translates some real-world requirements into abstractions written as code. The field of expertise where you are automating something with your software is called the business domain. Whether you explicitly name it or not, you always do ‘domain modeling’ in one way or another. Especially when writing reusable specifications imho recognizing the domain will be of great help in the modeling effort.

I call it business domain, but the Wikipedia page is not up to par. The term comes from domain driven design. A business domain is more generic than an application domain, where additional implementation details of the particular application may give a slightly different model.

The business domain is most valuable. Imagine there exist no developer tools and you ask a developer (the domain expert): “What do you need to be more productive in coding with your team?”. No developer would answer: “I need a code forge”. Instead they would say “I need to be able to see the revisions each of us makes, and be able to act if there are issues with them”. That’s the basis to further drill down to their needs.

Don’t want to pile on more terminology, but sub-domains are also called Bounded Contexts. Here’s an example:

[Diagram: Domain model with a Sales context and a Support context]

Picture that for e.g. Revision Control and Task Management contexts. Trello would be interested in Task Management, not so much in Revision Control, while most forges are likely interested to implement both (or gradually adopt context by context).

You also see duplication of concepts in each of the contexts in that diagram. Each context is internally consistent. They can stand on their own, and some concepts have slightly different meaning in other sub-domains. If (hypothetically) a code forge would have Dependency Management, then while modeling that there’s need to know about the concept of Repository, but this model is not interested in all the detailed properties of a Repository that may exist in the Revision Control context.
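To make the duplication of concepts across contexts concrete, here is a small sketch. The class names and every field are hypothetical; they only illustrate that each context keeps its own, internally consistent notion of a repository and that crossing the boundary means translating between them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RevisionControlRepository:
    """Repository as seen in a Revision Control context: full detail."""
    id: str
    default_branch: str
    branches: List[str] = field(default_factory=list)
    clone_url: str = ""


@dataclass
class DependencyRepository:
    """Repository as seen in a Dependency Management context: a pointer plus a version."""
    id: str
    version: str


def to_dependency(repo: RevisionControlRepository, version: str) -> DependencyRepository:
    """Translate across the context boundary, keeping only the shared identity."""
    return DependencyRepository(id=repo.id, version=version)
```

The two classes share an `id` and nothing else, so the Dependency Management model can evolve without tracking every property that Revision Control adds.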

bill-auger · May 30, 2022, 9:58pm

that was a great explanation - thanks

i dont think there needs to be such broad models - it can be maximally generic

a forge is nothing but a collection of common project management tools - tools which already existed separately long before forges were popular - ie: forges offer nothing unique - they merely unify what are truly distinct tools, into a common, convenient interface - just because the model is “everything that forges do”, that does not imply that anyone needs to use any forge, or use it for “everything”

for example, a stand-alone bug tracker could fork or mirror tickets from a forge and implement and respond to whichever forge-fed requests are related to managing tickets, and ignore any other requests - a mailing list, forum, or “social” website could implement and respond to whichever forge-fed requests are related to subscribing to the forge activity stream or tickets, and for posting comments to tickets, and ignore any other requests

i suppose im saying that each service/client/peer could define their own model, drawn from the common set of “things that forges do”

i still think that all of that can be condensed into a few simpler questions, which will satisfy most people, for the core feature-set:

Q1: should it support only git concerns?
A: NO

Q2: should it support only VCS concerns?
A: NO

Q3: should it support everything that forges commonly do?
A: YES

Q4: should it support anything that forges do not commonly do?
A: NO

Q5: should it expect that any node will support all features?
A: NO

Q6: should it support (“tolerate”, actually) arbitrary extensions for “everything else”?
A: YES

then it is simply left to forges to converge upon the common set of Q3 features; and let the community develop and support the Q6 features, and let every (partially) compatible node define their own models/use-cases

in case it is not obvious, i will add that Q5 and Q6 are non-negotiable - a federated system can not rely on any node to be fully or even partially compliant - it must tolerate them being offline, insane, or outright hostile
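As a rough sketch of what tolerating "insane or hostile" peers can mean at the parsing layer: the function below never raises on bad input and returns `None` instead. The size cap and the rule that a message must be a JSON object with a string `type` are assumptions made for this example, not requirements quoted from any specification.

```python
import json
from typing import Optional

MAX_BYTES = 64 * 1024  # hypothetical size cap for one incoming message


def parse_incoming(raw: bytes) -> Optional[dict]:
    """Return the decoded message, or None if it is oversized or malformed."""
    if len(raw) > MAX_BYTES:
        return None
    try:
        data = json.loads(raw)
    except (ValueError, RecursionError):
        # Invalid JSON, invalid encoding, or absurdly deep nesting.
        return None
    if not isinstance(data, dict) or not isinstance(data.get("type"), str):
        return None
    return data
```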

aschrijver · June 3, 2022, 7:33am

Yes. I support all the Q’s. Still the question remains, especially relating to Q6, how to best slice the model. Let’s look at one example: Ticket.

{
  "@context": [
      "https://www.w3.org/ns/activitystreams",
      "https://forgefed.peers.community/ns"
  ],
  "type": "Ticket",
  "id": "https://example.dev/alice/myrepo/issues/42",
  "context": "https://example.dev/alice/myrepo",
  "attributedTo": "https://dev.community/bob",
  "summary": "Nothing works!",
  "content": "<p>Please fix. <i>Everything</i> is broken!</p>",
  "mediaType": "text/html",
  "source": {
      "content": "Please fix. *Everything* is broken!",
      "mediaType": "text/markdown; variant=CommonMark"
  },
  "assignedTo": "https://example.dev/alice",
  "isResolved": false
}

Doesn’t have too many custom properties just yet. In fact there’s just two now. The assignedTo and the Boolean field isResolved are defined in the namespace https://forgefed.peers.community/ns.
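A quick way to see that split is to separate the ticket's properties by vocabulary. This is only a sketch: the set of ForgeFed terms is hard-coded from the statement above rather than resolved from the real JSON-LD context, and the ticket is abbreviated. Note that the `Ticket` type itself also comes from ForgeFed, even though `type` is a core property.

```python
# Abbreviated copy of the ticket shown above.
ticket = {
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        "https://forgefed.peers.community/ns",
    ],
    "type": "Ticket",
    "id": "https://example.dev/alice/myrepo/issues/42",
    "attributedTo": "https://dev.community/bob",
    "summary": "Nothing works!",
    "assignedTo": "https://example.dev/alice",
    "isResolved": False,
}

# Assumption: the only ForgeFed-defined properties are the two named in the post.
FORGEFED_TERMS = {"assignedTo", "isResolved"}


def split_properties(obj: dict):
    """Split an object into (core properties, ForgeFed extension properties)."""
    core, extension = {}, {}
    for key, value in obj.items():
        if key == "@context":
            continue
        target = extension if key in FORGEFED_TERMS else core
        target[key] = value
    return core, extension


core, extension = split_properties(ticket)
```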

Now suppose I’m working for Mattermost, and developing the Focalboard project. I bump into ForgeFed and say “Wow, we could provide project boards for any federated forge repository”.

What ForgeFed calls Ticket they call Cards or Tasks, but that maps well. Here’s a Focalboard task:

[Screenshot: focalboard-task]

What we see here:

- isResolved is meaningless to them. Instead they have a Status property with a custom enum that represents board columns.
- They have the ability to add custom properties on the fly and define custom card types (which might be modeled as "type": ["Ticket", "focal:ReminderCard"] or something).
- Not shown in this screenshot, but they have time-tracking fields that have some correspondence to time-tracking found in code forges as well.
- They have a Reviewed property that might be mapped onto some ForgeFed construct, except that ForgeFed will likely model reviews in the context of Merge Requests / PR’s and may be incompatible.
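The mismatch between a Boolean isResolved and a column-based Status can be sketched as a lossy round trip. The card shape and the column names ("To Do", "In Progress", "Done") are hypothetical and not taken from Focalboard's actual data model.

```python
# Assumption: only cards in a "Done" column count as resolved.
DONE_STATUSES = {"Done"}


def card_to_ticket(card: dict) -> dict:
    """Map a board card onto a ForgeFed-style Ticket, collapsing status to a Boolean."""
    return {
        "type": "Ticket",
        "summary": card["title"],
        "isResolved": card["status"] in DONE_STATUSES,
    }


def ticket_to_card(ticket: dict) -> dict:
    """Map a Ticket back onto a card; only two columns can be recovered."""
    return {
        "title": ticket["summary"],
        "status": "Done" if ticket.get("isResolved") else "To Do",
    }


card = {"title": "Nothing works!", "status": "In Progress"}
round_tripped = ticket_to_card(card_to_ticket(card))
# round_tripped["status"] is "To Do": the "In Progress" column did not survive.
```

Under these assumptions any board with more than two columns loses information when it federates through isResolved alone, which is the kind of pressure that pushes a vendor towards its own extension namespace.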

The developer pushes ahead and after some months Focalboard proudly releases their ‘ActivityPub federation support’. The federated messages they send across the Fediverse look something like:

{
  "@context": [
      "https://www.w3.org/ns/activitystreams",
      "https://forgefed.peers.community/ns",
      "https://mattermost.com/focalboard/2.0/ns"
  ],
  ...
}

Weird things here. All their messages reference a namespace full of Code Forge stuff, even for Mattermost uses where no forge is involved.

Anyway, suppose this integration is exceptionally well received. Mattermost / Focalboard extension providers love the well-designed Focalboard AP extension. So in a short time many Marketplace Extensions also add federation support, extending on the ForgeFed and "https://mattermost.com/focalboard/2.0/ns" namespaces.

Now Trello wakes up. They are a bit late to the party, but see the opportunity of federated boards clearly now. They start building their own support. Except they find that in order to get a good position in the market and catch up to Mattermost, they should build on Mattermost’s extension as well. The Mattermost namespace became a de-facto standard. So taking what they must, their federation msgs look like:

{
  "@context": [
      "https://www.w3.org/ns/activitystreams",
      "https://forgefed.peers.community/ns",
      "https://mattermost.com/focalboard/2.0/ns",
      "https://mattermost.com/extensions/jira/4.0/ns",
      "https://trello.com/remote/engineering/board"
  ],
  ...
}

This is getting messy, and Trello is none too happy they have to reference a namespace of their competitor. Also, with the ForgeFed initiative well under steam and powering ahead, there’s a continuous addition of properties to this ‘core’ namespace that were first defined in extensions to it. This causes some projects and vendors in this new ecosystem, who are less reliant on code forge integration, to ditch the ForgeFed namespace altogether.

All in all there’s an explosion of complexity. Not all vendors document their extension that well, and it becomes very hard to track evolution of every single namespace. Adding federation support is becoming less attractive, as the ecosystem becomes fragmented. If only they had organized better from the start.

Enter the FSDL

Consider the situation where everyone had mapped their extension against the Free Software Development Lifecycle, a crowdsourced set of specifications with some processes and governance in place, so that tool providers can add their extensions to it or expand / evolve existing ones based on consensus in the broader ecosystem.

Trello’s federated messages might look like this now:

{
  "@context": [
      "https://www.w3.org/ns/activitystreams",
      "https://w3id.org/fsdl/task-management/1.0",
      "https://w3id.org/fsdl/project-management/2.0",
      "https://w3id.org/fsdl/time-tracking/draft/latest"
  ],
  ...
}

Sourcehut might implement:

{
  "@context": [
      "https://www.w3.org/ns/activitystreams",
      "https://w3id.org/fsdl/revision-control/3.0",
      "https://w3id.org/fsdl/task-management/1.0",
      "https://sourcehut.org/mail/mailing-lists/ns#"
  ],
  ...
}

And Codeberg on their Gitea instance, might be allowing other Gitea hosters to use their CI and have:

{
  "@context": [
      "https://www.w3.org/ns/activitystreams",
      "https://w3id.org/fsdl/task-management/2.0",
      "https://w3id.org/fsdl/project-management/2.0",
      "https://w3id.org/fsdl/continuous-integration/1.0",
      "https://codeberg.org/CI/Woodpecker/notifications"
  ],
  ...
}

I don’t know if the above is the best way to go about it. But my general plea is that the evolution of the ecosystem that federation support enables should be well considered. There’s a huge potential and opportunity in considering the broader context of Software Development (the same scope Github already considers for their platform), and that warrants a more holistic approach.
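One thing per-domain namespaces would make easy is finding out which sub-domains two peers have in common. The sketch below compares the hypothetical context URLs from the three examples literally, string by string; a real JSON-LD processor would compare expanded term IRIs instead, so treat this purely as an illustration.

```python
AS2 = "https://www.w3.org/ns/activitystreams"

# The hypothetical @context lists from the examples above.
trello = [
    AS2,
    "https://w3id.org/fsdl/task-management/1.0",
    "https://w3id.org/fsdl/project-management/2.0",
    "https://w3id.org/fsdl/time-tracking/draft/latest",
]
sourcehut = [
    AS2,
    "https://w3id.org/fsdl/revision-control/3.0",
    "https://w3id.org/fsdl/task-management/1.0",
    "https://sourcehut.org/mail/mailing-lists/ns#",
]
codeberg = [
    AS2,
    "https://w3id.org/fsdl/task-management/2.0",
    "https://w3id.org/fsdl/project-management/2.0",
    "https://w3id.org/fsdl/continuous-integration/1.0",
    "https://codeberg.org/CI/Woodpecker/notifications",
]


def shared_contexts(a, b):
    """Contexts both peers declare, ignoring the ActivityStreams base context."""
    other = set(b)
    return [ctx for ctx in a if ctx in other and ctx != AS2]


print(shared_contexts(trello, sourcehut))    # ['https://w3id.org/fsdl/task-management/1.0']
print(shared_contexts(trello, codeberg))     # ['https://w3id.org/fsdl/project-management/2.0']
print(shared_contexts(sourcehut, codeberg))  # []
```

The empty result for the last pair comes from the differing task-management versions (1.0 versus 2.0), which hints that version compatibility rules would have to be part of whatever governance the specifications get.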

The example with the namespaces is not just hypothetical. This is happening now in the Fediverse. If I want to create a video platform I likely have to extend upon a Peertube namespace. The mistake of never thinking about the evolution of the ecosystem at large has already been made. The W3C SocialCG / SocialHub community that was meant to dedicate itself to that has languished. Individual projects that are already successful and popular just go their own way. Other developers try to bring improvement, but way after the ‘damage is done’. For instance there’s an effort to create a single big Turtle specification for the Fediverse, that has these namespaces in it:


# ActivityPub and Community
@prefix as: <http://www.w3.org/ns/activitystreams#> .
@prefix sec: <https://w3id.org/security/v1> .
@prefix toot: <http://joinmastodon.org/ns#> .
@prefix pt: <https://joinpeertube.org/ns#> .
@prefix mz: <https://joinmobilizon.org/ns#> .
@prefix lm: <https://join-lemmy.org#> .
@prefix yf: <https://yuforium.com/ns/activitypub> .

(Which IMHO is the wrong approach, as this obviously does not scale and makes no sense if you are in a particular domain)

Without an approach that defines the standardization processes and governance very clearly - not just for core forge features, but for extensions on top of that - the whole ForgeFed ecosystem will imho inevitably end up where the current Fediverse is: A state where ad-hoc interoperability (everyone extending on the fly, not contributing back to the ecosystem) and post-facto interoperability (the dominant leader in the ecosystem you must follow) leads to innovation that grinds to a halt and an ecosystem that can no longer easily evolve.

I have summarized some of the great challenges we face in: Major challenges for the Fediverse.

dachary · June 4, 2022, 12:45pm

The problem is clearly articulated and the solution you propose (a standardization process) is sensible. It is a very ambitious undertaking and would require someone working full-time for months to get it started, to create the conditions that make it possible. It would be very beneficial to both Forgefed and forgefriends, I think.

Any ideas / lead regarding how to move forward in a concrete way?

aschrijver · June 13, 2022, 7:16am

I think there are multiple ways to go forward with this. Though some might require the effort you mention, you don’t have these kinds of resources at your disposal atm. Also IMHO such dedication is only needed once you start really positioning and fostering the larger FSDL ecosystem.

Until that time it may be enough to have the broader vision on the horizon and be wary of risks to the various projects that would lead you astray from the path towards it.

You know me as a passionate advocate for the Fediverse, who spent countless hours promoting the ecosystem in all its breadth and width. With my investigation into the state of the Fediverse I am also increasingly desperate and worried about its future outlook.

It is clear to me that the laissez-faire (anarchistic (?)) way that the fedi has evolved on top of the initial standards has brought it into a position where it is very hard and challenging to put things into a better state. You might say there’s a metric ton of ‘tech debt’ at protocol / interoperability level in the ecosystem at large. The effort and dedication it takes to fix this is likely much larger than doing things right upfront.

Forge federation / FSDL are exploring a wholly different set of domains than Microblogging and that provides opportunities to avoid the same pitfalls from the start.

I recently found the IETF Evolvability, Deployability, & Maintainability Working Group that is preparing drafts on maintaining protocols, and I urge people in the forge federation community to at least read the section on Protocol Decay in: The Harmful Consequences of the Robustness Principle

And also look at other documents of this WG, such as: RFC 9170: Long-Term Viability of Protocol Extension Mechanisms

The FSDL is also part of the Social Coding Movement that is yet to be launched. We are discussing challenges to any FOSS ecosystem and Fediverse in particular - and as discussed here - in Social Coding Foundations and I invite you to join the chatroom to keep in the loop.

aschrijver · June 19, 2022, 8:42pm

I’d like to refer to some chat in the Forge Federation General chatroom relating to this topic (which ends with this chat message). The discussion starts on how to attain compatibility with Mastodon, followed by a note to focus first on domains that are more important (i.e. forge federation), informational pointers on domain driven design and bounded contexts, and an example where a use case description should probably not contain mention of UI elements, but focus on the domain use case first (domains first, UX/UI second).

aschrijver · June 30, 2022, 6:05am

Forges and the positioning of Github and Gitlab

In chatroom discussion @dachary posted:

But … every developer knows what a forge is. Even though they all have different ideas, they more or less relate to “a set of tools they use to work on software”.

I first learned the term when I bumped into ForgeFed. Never heard or used it before then. I’d want to say, be careful, as there are biases creeping in because of our exposure to terms and technology. Same maybe with name recognition of Gitea, where likely the vast unkempt hordes using Github don’t know about it. Github has a large Wikipedia page. Do you know how many times the word “forge” is used? 6 times… as part of the ‘brand’ name “SourceForge”.

This relates to why I am an advocate of modeling consistent sub-domains that map directly to the software development lifecycle, i.e. to categorized activities that IT technologists (multiple stakeholder types, the domain experts) perform. If we look at Github’s introduction text on Wikipedia:

GitHub, Inc. is a provider of Internet hosting for software development and version control using Git. It offers the distributed version control and source code management (SCM) functionality of Git, plus its own features. It provides access control and several collaboration features such as bug tracking, feature requests, task management, continuous integration, and wikis for every project.

I think this is not the best split from a domain analysis perspective, yet as a fresh start to this analysis I’d note down:

“Internet hosting for software development” → top-level domain
Distributed version control, source code management, bug tracking, feature requests, task management, continuous integration → sub-domains
Access control, wikis → supporting domains

Another example: the Gitlab Wikipedia page. It has 1 occurrence of “SourceForge”, in the footnotes listing “Bug Tracking Systems”.

Here in the introductory text Gitlab is described as:

a DevOps software package that combines the ability to develop, secure, and operate software in a single application.

→ top-level domain

It would be interesting to study the websites of both companies and distill their mission / vision / product portfolio positioning from there.

Going to the Github About page it becomes immediately clear what their “completeness of vision” entails…

Github: Where the world builds software

Millions of developers and companies build, ship, and maintain their software on GitHub—the largest and most advanced development platform in the world.

Key words: Build software. Build, ship, maintain. Most advanced development platform.

There’s no direct mission/vision statement, but just on the About page some more indicators on how they view their own products and their scope:

We’re focused on fighting for developer rights by shaping the policies that promote their interests and the future of software.

Let’s do the same for Gitlab from their About page…

Gitlab: The One DevOps Platform

From planning to production, bring teams together in one application. Ship secure code faster, deploy to any cloud, and drive business results.

For every stage of the DevOps Lifecycle

Eliminate point solution tool sprawl with our comprehensive platform.

They actually have a lifecycle diagram on their page:

Plan → Create → Verify → Package → Secure → Release → Configure → Monitor → Protect → Manage

Their about page provides a wealth of information on doing productization well.

Now, it becomes interesting to see how Gitea contrasts to that. From the landing page there are only 2 direct top-level domain / product-related statements:

A painless self-hosted Git service.
Lightweight code hosting solution

The rest of the page, while relevant, highlights that it is FOSS, that it is maintained by a community, that it is easy to install + host. In terms of productization it is by far not the worst I’ve seen in FOSS circles, but there’s a lot to improve. On the Open Strategy megathread I recently contrasted the Lemmy and Mastodon landing pages as a good example (where Lemmy represents a typical FOSS website).

Also discussed in chat by me in reaction to Ryuno Ki on future plans of MS/Github:

I have mentioned my thoughts on where I think Github is headed in earlier chat (scroll some miles upwards)

TL;DR going to a situation where your browser is a dumb UI terminal, and on any project complete executing environments run in the background / in the cloud (Azure mostly), and you just edit the code. Even git and your code files may be abstracted away.

If I were them and went for max. lock-in I’d abstract away code files altogether. Have you navigate a language and architecture dependent logical data model instead. Then if you work offline you sync an execution environment package that has the abstract syntax tree of that. Bye bye git, the future has arrived.

There might be a git repo export to comply to regulation (like GDPR, Digital Markets and Services Acts of the EU).

Let’s thwart that future with federated FSDL

There is the general concern / risk that the use of words such as “forge” leads to a dogmatic approach and to not seeing the broader landscape, because there’s a certain notion of what a forge is and is not.

I have argued before: A forge is really a type of application, it does not demarcate a generic ‘business domain’. There’s no would-be developer that says “give me a forge, I want to be a developer”. Instead they master skill after skill, explore domain after domain and become domain experts in fields of software development.

Re: the word “Forge”: it is interesting to analyse the Wikipedia page that mentions it. It is undoubtedly written by a FOSS developer. Now if you would ‘productize’ such a page in the same way Github / Gitlab did for their platforms… there’d be a big improvement.

aschrijver · July 2, 2022, 7:43pm

An important article by sfconservancy and likely an interesting discussion at: Give up GitHub: The time has come | Hacker News

There’s more stuff to add to the FSDL topic today. One quote in particular points back to chat we had today:

The FOSS development methodology is GitHub’s product, which they’ve proprietarized and repackaged with our active (if often unwitting) help.

→ FOSS development methodology === FSDL

Direct article link: Give Up GitHub: The Time Has Come! - Conservancy Blog - Software Freedom Conservancy

aschrijver · July 2, 2022, 7:47pm

I have documented a placeholder best-practice for the crowdsourced Social Coding Movement website, that has FSDL as an example: The Open Ecosystem

aschrijver · July 7, 2022, 5:45am

There are early discussions of tying Social Coding FSDL to the planned forg.es umbrella organization, and form an Ecosystem Alliance. That would mean that all discussion on positioning and vision will shift to that level, and ForgeFed is free to adopt whatever they prefer.

See also: Positioning Friendly Forge Format - #12 by aschrijver

To be continued (likely in a different topic).

aschrijver · August 22, 2022, 1:47pm

Continued on Discuss Social Coding

This thread is continued on the social coding forum. Please add follow-ups over there or in the social coding chatrooms.
