DRAFT: Fedeproxy is written in Go to share code with Gitea

In early 2021 the fedeproxy project was created to build communication bridges between forges. It would, for example, allow someone using Gitea to create a bug report on a project hosted on GitLab. At that time community members knew very little about the ActivityPub protocol, the Gitea codebase or its community. But they knew it would be years before all forges were able to communicate with each other using some kind of standard protocol. So they decided to bootstrap the fedeproxy codebase using the language that they were most comfortable with: Python.

How the decision was made

During the summer of 2021, while working on the fedeproxy codebase, they got a better understanding of what is at stake when implementing ActivityPub. They also got a chance to better understand how the Gitea community is organized. The work that was done on a grant application to implement federation in Gitea was an opportunity for personal interactions with a few of the most active members of the Gitea project. In July 2021 they also became co-administrators of the Chapril Gitea instance.

In August 2021, doubts arose about whether pursuing the implementation of fedeproxy in Python was the best option, and a rewrite in Go was proposed. There was no pushback, on the contrary. A month later fedeproxy community members agreed, although Go is not their native programming language.

Where Gitea fits in

In retrospect it was a mistake to begin with a Python codebase. But fedeproxy community members did not know then what they know now. It would be an even bigger mistake to keep going with the current codebase for the sole reason that it exists, because it would deprive fedeproxy of the benefits listed below.

Instead of bootstrapping an entirely new codebase, fedeproxy will be bootstrapped with a Gitea fork which already implements:

  • Dump/restore feature;
  • Migration feature which includes API interactions with a range of forges;
  • Embryo of an upload feature;
  • Mirroring infrastructure, although limited to git repositories;
  • REST server;
  • In memory representation of issues, pull requests etc.;
  • Release process.

A large part of the Gitea codebase is irrelevant in the context of fedeproxy and the fedeproxy project, as a whole, does not fit into Gitea. It would be tempting to only cherry-pick from Gitea in order to populate the initial fedeproxy codebase. But it would then be very difficult to track later changes in Gitea and merge them into the fedeproxy codebase. Taking on the entire codebase makes it easier, as long as the fedeproxy code is carefully crafted to minimize the risk of conflicts.

A concrete example of code that will be shared is the integration of go-fed, which should be almost identical in fedeproxy and Gitea. Whatever improvement is implemented in fedeproxy can be contributed to Gitea, and vice versa. Another example is the recently added private key generation.

There are well-known examples of successful projects that are based on a codebase that has an independent lifecycle, such as Grsecurity, a set of patches on top of the Linux kernel. The methods and tools to manage such a project are different from what is common when creating a codebase from scratch, but they exist (see git rerere for an example).

Impact and way forward

It makes intuitive sense to start in this way because fedeproxy’s ultimate goal is to disappear when all forges are able to communicate with each other. For instance, when Gitea is able to gracefully federate issues across instances, fedeproxy will become redundant for this particular feature and the associated code can be removed from fedeproxy. The same will be true for GitLab when it natively allows issues to be federated between instances using a standard protocol and vocabulary. And when all forges use the same standard to federate issues, the fedeproxy code bridging issues can be removed entirely.

@pilou this is essentially a summary of the discussions we had over the past two weeks on this topic. It would help a lot if you could write down the objections you raised regarding the switch to Go/Gitea: they were valuable and I may have missed a few :blush:

I am afraid I do not fully understand the “dissolve into Gitea” part. Do you mean that developing something for e.g. GitLab and other forges is no longer part of FedeProxy, or that it will be in a wholly separate fork of GitLab, or - maybe - once FedeProxy is a mature project you’ll just have solid specs for forge federation and any forge can grab it for themselves to implement stuff (Gitea being a reference implementation)?

You are correct, this is ambiguous and incorrect. Fedeproxy may become a transparent proxy for Gitea and no longer be useful for it. But it will be a while before it becomes redundant for GitLab and in the meantime it can’t be “dissolved”. I liked the image but it is confusing and needs rewording.

I updated the description and title for clarity @aschrijver, what do you think?

I am afraid it is still not clear enough. But it may be because I am insufficiently in the loop of project objectives and scope. What I think / deduced:

  • What: Federated issue management (MVP?)
  • How: Standard protocol and vocabulary (data format)
  • Where: Forge-specific implementations (?), starting with Gitea (Golang)

Gitea is fully committed to adding federation support, which is great. What purpose does a full fork serve, given that you need to keep it in sync and that FedeProxy federation would have subtle differences with Gitea federation?

I would understand this fork if FedeProxy were an independently installed application that you deploy side-by-side with a forge product and which communicates via APIs. Then nothing will dissolve over time, however. But this would make sense to me: you bootstrap with Gitea, strip what you do not need, and build upon what remains. It would also allow a self-hosted FedeProxy-only instance (in the sort of forge-specific Ports & remote Adapters architecture we discussed some time ago). It would be a reference implementation where best practices for protocol and vocabulary are used, with full documentation and tests accompanying it.

In this setup the “where” would be:

  • Where: FedeProxy module, independently installable, optionally self-hostable (and reference impl. for forge devs)

Another reason for a fork (with the original “where”) might be that the MVP of Gitea would differ in requirements and use cases from the MVP of ForgeFed while protocol and vocab are evolving, but would ultimately be integrated into Gitea, after which ForgeFed ‘dissolves’ their Golang impl. and moves on to the next forge federation support quest.

Yes on the What and How. No on the Where.

It is.

Here is an example based on what will happen:

  • fedeproxy merges the add user settings key/value DB table pull request in its codebase
  • fedeproxy implements the ActivityPub’s endpoint that returns information about an actor, including its public key (e.g. /users/loic), based on the code from the “add user settings key/value DB table” pull request
  • a pull request is opened in Gitea to propose the code implementing this new endpoint
  • the pull request is merged into Gitea
  • fedeproxy merges back the Gitea codebase and there no longer is a difference between Gitea and fedeproxy on this particular feature.

This is what I meant by “dissolution”. It would of course be much, much less work if the endpoint could be implemented directly in Gitea instead. But the lifecycles of fedeproxy and Gitea are different and it would block any progress on fedeproxy for months while waiting on Gitea.

There are parts of fedeproxy that do not make sense for Gitea to include, such as acting as a reverse proxy for the ActivityPub protocol on behalf of a GitLab instance. This will not be “dissolved”.

It is very close to what I have in mind, with the following difference:

  • Do not strip the unneeded code, just ignore it
  • Merge the Gitea codebase on a weekly basis in fedeproxy
  • Build fedeproxy specific code in a way that minimizes the likelihood of a merge conflict

Another example of something that is not in scope for Gitea but is supported by fedeproxy: Mercurial. Whatever code deals with Mercurial won’t be contributed back to Gitea. Not that it’s impossible in theory but the likelihood of that happening is very low in the next two years.

I think I understand now. The FedeProxy fork (which could technically also be a Gitea branch if it didn’t have to address broader requirements for other forges) will gradually “meld” anything purely Gitea-related up to the point where Gitea has native “federation support”. At that point the FedeProxy Golang codebase will take on a life of its own, and become a more independent app that supports ever more forges and is installed side by side with them (like in the case of GitLab).

When project “independence stage” is reached you may still incorporate Gitea code, but less frequently and there’s no need for the high-effort syncing anymore. Similarly Gitea from then on will evolve their own federation capability on their own, now based on mature standard specifications that evolve too.

If I am on the right track in my line of thinking, then eventually you’ll want to strip, rather than carry the ballast of ‘dead code’.

Your feedback is invaluable: when I wrote the initial version I completely failed to communicate what I intended. We’re getting close although there still are significant differences between what you reworded and what I intended to convey.

Well put.

The fedeproxy codebase has a life of its own from the very beginning (i.e. October 2021). It replaces the current codebase entirely and will support Gitea and GitLab, as the current codebase does. Since it is written in Go and includes all the Gitea codebase, supporting Gitea will be easier. But in both cases fedeproxy will communicate with the forge via the API and will be installed side by side with it.

In other words the following happen in parallel, not in sequence:

  • Implementation of the fedeproxy server as a service that runs side by side with an existing Gitea or GitLab instance
  • Contributions from the fedeproxy codebase to the Gitea codebase to further federation
  • Merges from the Gitea codebase into the fedeproxy codebase to benefit from the work done by the Gitea developers regarding federation (and other parts of the code that are common)

Exactly: over time the parts that are common to Gitea and fedeproxy will be identical, as they would be if they came from a shared library.

Yes, although “then on” suggests a sequence of events. It really happens all in parallel. Fedeproxy makes progress at its own pace. And Gitea has its own development lifecycle happening at the same time.

I concur. Initially though (i.e. for the next few years) I think stripping would make things more difficult than carrying the entire Gitea codebase.

@aschrijver I wrote a third version of the draft and changed the title based on our discussion. I’m quite happy with it but let me know if it feels wrong. I will be grateful if you disagree (but I won’t be cross if you think it makes sense :crazy_face:). Thanks again!

Yes, this was poorly phrased on my part, and it was the way I understood it. Good to mention explicitly that it happens in parallel with building interoperability with other forges (GitLab).

Yeah, we meant to say the same, once again I could have expressed it better. You might say that Gitea after “independence stage” is in a slightly different role. They go from tight participants to the role of client / consumer of FedeProxy (though can still contribute as actively as before).

Good title, good text. I don’t know where you intend to publish (blog?). There are some slight improvement suggestions in terms of structure and language use…

You might add H2 paragraphs, and restructure with essence / TL;DR summary first e.g.:

  • Intro text (no header needed): “We have made some important decisions on the FedeProxy project direction. Instead of bootstrapping an entirely new codebase, we decided […]” and follow with the bullet list.
  • H2 par “How we got there”. First four paragraphs + first bullet list here.
  • H2 par “Where Gitea fits in”. “A large part of the Gitea codebase […] A concrete example […]”
  • H2 par “Impact and way forward”. Last paragraph here.

I would strongly advise turning the “I”s into “we” and referring to yourself in the 3rd person. You already sought consent on these decisions, so you can talk in a narrative voice representing the entire community. You can also sign off that way: “FedeProxy Community, Loïc Dachary”. That will make the project feel more inclusive and collectively owned.

I initially used the first person because there was no consensus. But you are right that this is not necessary, I’ll rephrase with your suggestions and publish it on the blog of fedeproxy. Exciting times :slight_smile:

It is now published. I am very happy about this collaborative writing, it is a pleasure working with you @aschrijver :+1:
