Deadline August 1st, 2021
https://nlnet.nl/assure/
https://nlnet.nl/propose/
Thematic call: NLnet; NGI Assure
Your name: Loïc Dachary
Email address: loic@dachary.org
Phone numbers: [redacted]
Organisation (if any): Easter-Eggs
Country: France
Project name: Friendly Forge Format (F3): an Open Standard for secure communication between software forges
Website / wiki: About the Friendly Forge Format (F3)
Abstract: Can you explain the whole project and its expected outcome(s). (1200 characters)
There is no Open File Standard to share the content of a software project in a forge such as GitHub or GitLab, only undocumented internal formats with no guarantee of integrity or portability. This is one of the main obstacles to interoperability and federation between forges. When a project is hosted on a centralized forge, even its owners have no assurance that it was not tampered with: the supply chain may be at risk. Obtaining the full state of a software project is a challenge.
The Friendly Forge Format (F3) is a digitally signed Open File Format for storing the state of a software project obtained from a forge: issues, pull requests… as well as the VCS (Git…). It is designed to provide strong assurances regarding its integrity and the identity of the authors. It can conveniently be stored for later reference and compared to detect malicious or accidental changes.
It also enables forges to communicate with each other by reducing the complexity from N² to N. With F3 a forge can communicate with all the others by implementing a conversion to and from F3 (N). Without F3 each forge needs to implement a conversion to and from the specific format of every other forge (N²).
Have you been involved with projects or organisations relevant to this project before? And if so, can you tell us a bit about your contributions?
I worked full time with https://forgefriends.org, https://forgefed.org, https://gitea.io and https://www.softwareheritage.org/ for the past eighteen months to advance forge federation.
I made contributions to the ForgeFed specification and played an active role in 2022 in reviving the project with a broader community.
The forgefriends project is a proxy designed to enable federation for software forges that do not yet implement it natively. I participated in forgefriends from the start, in January 2021. I authored most of the code and documentation that exist today. I published activity reports on a monthly basis and organized videoconferences to keep the larger community up to date.
I have worked with the Gitea project since late 2021 to natively implement federation with code contributions. I filed a successful grant application (2022) for the benefit of the Gitea owners to allow them to work on forge federation.
I designed a novel storage system for Software Heritage in early 2021, implemented a proof of concept and derived ideas on how federation can play a role in preserving software.
Back in 2001 I created and maintained https://savannah.gnu.org as an alternative to SourceForge.
Requested Amount: 50,000€
Explain what the requested budget will be used for? Does the project have other funding sources, both past and present? (If you want, you can in addition attach a budget at the bottom of the form)
The Friendly Forge Format (F3) is new and has no funding sources.
The budget will exclusively be used as an income source for Loïc Dachary (480€ per day) to enable him to work on the project over a period of twelve months.
The roots of F3 can be traced back to closely related projects that received funding in the past. It emerged as an idea while my work was funded by the “Storing Efficiently Our Software Heritage” grant ( NLnet; Storing Efficiently Our Software Heritage ), which has since been completed. The forgefriends project in which the Friendly Forge Format project was drafted (around March 2022) was funded by DAPSI from March 2021 to October 2021 ( fedeproxy - DAPSI - Data Portability & Services Incubator ).
Compare your own project with existing or historical efforts.
The State of the Forge Federation: 2021 to 2023, published in June 2022, contains a detailed description of the projects related to F3. F3 is designed to be a building block that can be reused by all of them to facilitate the implementation of forge federation features.
F3 is different from ForgeFed. ForgeFed is an ActivityPub extension with its own vocabulary and models represented in JSON-LD. F3 is a JSON-based Open File Format providing a strongly consistent representation of a software project at a given point in time. Despite these differences, there are overlaps: they both need to define a glossary of terms and explain concepts that are common between forges. This already led to contributions to ForgeFed and more are expected in the future.
F3 emerged in the context of the forgefriends project where the Gitea internal file format was improved to make room for federated features. The effort duplication of maintaining internal file formats in all existing forges and its consequences inspired the authors to create F3 and publish it as an Open File Format.
F3 is at the crossroads of a number of forge-related projects funded by NGI in the recent past: ForgeFed, Storing Efficiently Our Software Heritage, Federated software forges with Gitea, Contribute to all Free Software from the comfort of your forge. It adds an essential piece to the puzzle that will eventually be completed and allow forges to continuously exchange data using open standards.
The forgefriends and ForgeFlux forge federation proxies will include F3 in their upcoming releases. This integration will be a showcase demonstrating how the Go and Python APIs can be used for integration in other software forges.
Deploying F3 in production is challenging because it does not yet have a reassuring reputation of stability and robustness. When problems are discovered, they will require a level of understanding and an investment in time from system administrators that most service providers consider too costly. The Hostea service provider is committed to advancing forge federation and will deploy F3 as soon as it is available. It will likely be the first production instance supporting F3.
What are significant technical challenges you expect to solve during the project, if any?
Reliable release process
A release of F3 is a specification document composed of the description of many forge artifacts that evolve independently. It is bound to the reference implementation that shows how each aspect of the specifications works. Dataset generators and fixtures can then be used by the reference implementation to verify the conformance with the specifications.
These three parts (specifications, reference implementation and datasets) must be used to establish a Quality Assurance process that verifies an F3 release candidate is consistent before it is published.
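As an illustration only, the consistency check between the three parts could be sketched as follows. The artifact names, the fixtures and the `round_trip` helper are hypothetical stand-ins: the actual specification, datasets and reference implementation are not defined in this proposal.

```python
import json

# Hypothetical list of artifacts described by the specification.
SPEC_ARTIFACTS = {"issue", "pull_request", "release"}

# Hypothetical fixtures, one sample document per artifact.
FIXTURES = {
    "issue": {"index": 1, "title": "Crash on startup", "state": "open"},
    "pull_request": {"index": 2, "title": "Fix crash", "state": "closed"},
}


def round_trip(artifact):
    """Stand-in for the reference implementation: serialize, then parse back."""
    return json.loads(json.dumps(artifact, sort_keys=True))


def check_release_candidate(spec_artifacts, fixtures):
    """Return a list of inconsistencies; an empty list means consistent."""
    problems = []
    for name in sorted(spec_artifacts):
        if name not in fixtures:
            problems.append(f"{name}: specified but no fixture exercises it")
        elif round_trip(fixtures[name]) != fixtures[name]:
            problems.append(f"{name}: fixture does not survive a round trip")
    for name in sorted(set(fixtures) - set(spec_artifacts)):
        problems.append(f"{name}: fixture exists but is not in the specification")
    return problems


problems = check_release_candidate(SPEC_ARTIFACTS, FIXTURES)
for problem in problems:
    print(problem)
```

In this sketch a release candidate would be published only when the list of problems is empty; here the missing `release` fixture is reported.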
Representing a very large number of forge artifacts
A forge is an unbounded set of tools (e.g. issues, pull requests, comments, releases, VCS, etc.) that are used by developers when they work on a particular software project. Each of these tools also has an unbounded set of features. It is common for a forge to provide tools and features for which there is no equivalent in another forge (e.g. there are mailing lists on SourceHut but not in Gitea).
Even when there is an equivalence (e.g. Woodpecker CI provides the same kind of functionality as GitHub Actions), representing both in F3 is challenging because their capabilities often have subtle variations.
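One possible way to cope with such variations, shown here as a sketch and not as the F3 design, is a small common core plus a per-forge extension area. The `CIRun` class, the `forge_a` and `forge_b` names and their status vocabularies are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CIRun:
    """Hypothetical common core shared by all CI systems."""
    commit: str
    status: str  # normalized to "success", "failure" or "running"
    extensions: dict = field(default_factory=dict)  # forge-specific leftovers


# Assumed, simplified status vocabularies of two imaginary forges.
STATUS_MAP = {
    "forge_a": {"passed": "success", "failed": "failure", "in_progress": "running"},
    "forge_b": {"success": "success", "failure": "failure", "pending": "running"},
}


def to_common(forge, raw):
    """Map a forge-specific CI run to the common core, keeping unknown fields."""
    known = {"commit", "status"}
    return CIRun(
        commit=raw["commit"],
        status=STATUS_MAP[forge][raw["status"]],
        extensions={forge: {k: v for k, v in raw.items() if k not in known}},
    )


run = to_common(
    "forge_a",
    {"commit": "abc123", "status": "passed", "matrix": ["linux", "arm64"]},
)
print(run)
```

The fields that have an equivalent everywhere are normalized, while features that exist in a single forge are preserved under its own key instead of being dropped.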
Robust test infrastructure
The reference implementation must track in real time the evolution of forges for which it provides conversion to and from F3. A continuous integration pipeline must be run against all supported versions as soon as a new change is proposed. It involves bootstrapping forges from scratch and feeding them with sample data created from fixtures, which is a resource-intensive process.
Ensuring the stability of the CI pipeline is, in itself, a core engineering challenge. A flaky pipeline that fails randomly or is too sensitive to environmental problems will report spurious failures. When there are too many, the developers and contributors maintaining the reference implementations may spend more time debugging the CI than improving F3.
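A common mitigation, given here as a sketch under the assumption that a job can simply be rerun, is to retry a failed job a bounded number of times and to report it as flaky rather than failed when a retry succeeds. The `classify` function is hypothetical and not part of any existing CI system.

```python
def classify(job, attempts=3):
    """Run job() up to `attempts` times; job() returns True on success.

    Returns "pass" if the first run succeeds, "flaky" if a later run
    succeeds, and "fail" if every run fails.
    """
    results = []
    for _ in range(attempts):
        ok = job()
        results.append(ok)
        if ok:
            break
    if results[0]:
        return "pass"
    if results[-1]:
        return "flaky"
    return "fail"


# Deterministic stand-ins for real CI jobs.
outcomes = iter([False, True])
print(classify(lambda: next(outcomes)))  # fails once, then succeeds
print(classify(lambda: True))
print(classify(lambda: False))
```

Keeping flaky results separate from real failures lets maintainers track unstable jobs without blocking contributors on them.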
Seamless contributor onboarding
Setting up a local development environment that allows a contributor to modify:
- the JSON Schema of the specifications
- the reference implementation
must be made as easy as possible so that they can debug problems and try new ideas. It is critically important because the very large number of details that F3 needs to address requires crowdsourcing development. Without a seamless contributor onboarding process there will be too much friction for that to happen effectively.
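To give an idea of the kind of quick local check a contributor could run after editing a schema, here is a minimal sketch. The `ISSUE_SCHEMA` content is invented for the example, and `validate` only understands `required` and two `type` values: it is a stand-in for a complete JSON Schema validator, not a replacement.

```python
# Hypothetical schema fragment for an issue, written as a Python dict.
ISSUE_SCHEMA = {
    "type": "object",
    "required": ["index", "title", "state"],
    "properties": {
        "index": {"type": "integer"},
        "title": {"type": "string"},
        "state": {"type": "string"},
    },
}

# Only the two JSON Schema types used above are supported by this sketch.
PY_TYPES = {"integer": int, "string": str}


def validate(instance, schema):
    """Return a list of error messages; an empty list means valid."""
    errors = [
        f"missing field: {name}"
        for name in schema["required"]
        if name not in instance
    ]
    for name, rule in schema["properties"].items():
        if name in instance and not isinstance(instance[name], PY_TYPES[rule["type"]]):
            errors.append(f"{name}: expected {rule['type']}")
    return errors


errors = validate({"index": "1", "title": "Crash on startup"}, ISSUE_SCHEMA)
for error in errors:
    print(error)
```

A contributor changing the schema could run such a check against the fixtures before opening a pull request.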
Describe the ecosystem of the project, and how you will engage with relevant actors and promote the outcomes?
F3 was first announced publicly when the State of the Forge Federation: 2021 to 2023 was published in June 2022. Although the idea first emerged in early 2022, it is still largely unknown and has never been described formally.
Monthly reports and videoconferences
A monthly report will be published and disseminated to provide a high-level overview of the evolution of the F3 specification and reference implementations. Each report will also be discussed during a monthly videoconference where active participants can better explain what they are doing and why.
In combination with regular developer releases, this will achieve two goals:
- allow future users to keep in touch and plan for F3 integration in their development workflow
- encourage contributors to participate by clarifying what is being done and where more help is needed
Forgefriends
F3 will be released as an integral part of forgefriends from the start.
Hostea
The Hostea hosting provider currently offers vanilla Gitea instances. It will also offer forgefriends instances as soon as they are released. This will enable anyone to try the import / export features provided by F3 as soon as they are available.
Gitea
Discussions began with Gitea early on since F3 is derived from the Gitea internal format. A pull request to merge F3 as an integral part of upcoming releases will be worked on until it is accepted. The strongest incentives for Gitea to use F3 are robustness and active development, both of which are lacking in the legacy export/import codebase.
ForgeFlux & Pagure
The author of ForgeFlux is committed to using the Python reference implementation of F3 as part of its first release scheduled next year. The author of the ForgeFed plugin for Pagure will be contacted and assistance will be provided to use F3 via the Python reference implementation.
GitLab
The Go reference implementation will support importing from GitLab CE and, to an extent limited by the API, exporting to GitLab CE. Merge requests will be opened in the GitLab CE tracker to advocate for features required for federation (e.g. mapping of Issue ids, etc.).
ForgeFed
Improvements to the ForgeFed specifications will be proposed so that data contained in an F3 archive can be translated and sent over ActivityPub. It will make it easier for forges to implement federation: they would otherwise have to figure out how to map their internal data structures or format into ForgeFed models and vocabulary.
What should we do in the other case, e.g. when your project is not immediately selected?
X I want NLnet Foundation to erase all information should my project proposal not be granted
https://nlnet.nl/assure/guideforapplicants/
The goal of NGI Assure is to support projects that design and engineer reusable building blocks for the Next Generation Internet as part of a complete, strong chain of assurances for all stakeholders regarding the source and integrity of identities, identifiers, data, cyberphysical systems, service components and processes. Furthermore contributions can be made to address underlying real-world challenges in deploying and validating such building blocks, such as energy efficiency and sustainability, scalability and throughput, security, privacy/confidentiality, plausible deniability, robustness and crypto-agility, side-channel resistance, interoperability, governance and compliance to regulatory frameworks - where needed to turn the above into reproducible and trustworthy end-to-end solutions that can withstand the hostile battle grounds of the modern internet.