NGI Assure call deadline 01/08/2022

Reply sent September 21st, 2022.


Hi [redacted],

Establishing an Open Standard that can be trusted by software forge implementors and users is a formidable undertaking that will require years of work. The goal in the context of this funding is to lay a sound foundation to bootstrap this effort, which includes:

  • a lifecycle relying on both paid work and crowdsourcing
  • a scope defined and implemented by all stakeholders
  • an adoption based on incremental inclusion in software forges

In practice, the following lines of work are required to achieve this first iteration:

  • Publish and improve the F3 specifications
  • Conduct User Research to identify the needs of forge users, forge implementors and forge maintainers and focus on what matters most to them
  • Publish and develop a reference implementation of the F3 specifications
  • Create a roadmap and a lifecycle for the next iteration of the F3 specifications and reference implementation
  • Establish a dialog with all stakeholders to create a consensus on F3
  • Set up an environment favorable to crowdsourcing the specifications and the reference implementation

The detailed answers to your questions can be found below, as well as a high-level workplan. Please let me know if something is unclear or raises a concern.

Cheers

Scope

The role and associated feature set of forges are continuously evolving. Vendors continue to increase their features, for instance integration of donation buttons, license compliance support, CMS and wiki functionality etc. Where do you draw the line (and by what criteria), and stop implementing additional features?

In this first iteration of F3, the line is drawn by the limited resources available.

User research conducted in 2021 identified that “issues” are the most pressing need, therefore it is included. The following are also included because they are essential to Gitea, which is the primary community targeted by F3, and adoption would be unlikely if they were not: user, project, label, milestone, topic, release, asset, pull_request, repository, comment, reaction, review.

This feature set will have to expand during the next iterations of F3. This will happen as a result of a dialog between the implementors of the specification and the stakeholders. There will always be a tension between desirable features and the effort required to implement and maintain them, but F3 must always be a living specification.

To what extent can F3 be an archiving format? How good is the expected feature coverage of the spec when looking at the whole spectrum of version control systems (e.g. Fossil, Sourcehut, Gitolite, Pagure, Pijul, etc) and proprietary services like Github? Is the idea for the spec to be a superset of all (theoretically) possible features of all solutions available today, or an opinionated subset of that?

Although it may be ambitious to claim that F3 is suitable for archiving a software project at this point in time, the reality is that there is no other format that even tries to do the same. The GitLab import/export format is undocumented and unsuitable for archival, and the same is true of GitHub etc. However imperfect F3 may be for archival, it is the best there is. Since the work on F3 is based on a bottom-up approach, it is effectively an opinionated subset of the existing features, and this subset will grow over time as more features are covered.

At some point, when F3 becomes mature enough, forge implementors will have a motivation to ensure there is a parity between the features they provide and the features F3 can represent. It will be a condition for them to be able to import/mirror software projects originating from other forges. It will take time to get there but it is also the best way for F3 to achieve a healthy lifecycle.

Today’s forges also still lack quite some features. Will you for instance also integrate software translations, which are now often outsourced to external translation tools like Weblate? Will you handle groups, access control/rights, key management/signing etc? Is the spec extensible to cover whatever a project needs?

The coding.social social movement that emerged last year aims at identifying all domains involved in the lifecycle of Free Software. All of them, including translations, are potentially in scope for inclusion in F3. It is likely that domains identified during dialog in coding.social will lead to proposals to modify the scope of F3.

Lifecycle

How do you see the life cycle/future maintenance of the specification?

F3 is going to be defined by an iterative process based on the following:

  • A feature is identified and defined during informal discussions with stakeholders. It could be within coding.social or because it emerges in some major forges, etc.
  • User Research is conducted and the report concludes that the stakeholders (forge users, implementors, operators) value this feature
  • The F3 specifications and reference implementations are modified to include that feature
  • A version of F3 is released

It is very similar to the life cycle of software. Although the feature set of F3 may change less frequently over time, creating Free Software is essentially a social process and the odds are it will keep being a living specification for the foreseeable future.

Have you considered engaging with a standards body, e.g. OASIS?

Standards bodies are unfamiliar to everyone involved in F3, which is why engaging with one is not included in this iteration. A standards body is however the logical context in which to develop a standard, and best efforts will be made to learn about the work it implies and to apply.

Adoption

Projects that are currently autonomous could be hesitant to switch away from their existing internal abstractions, serialisations and on-disk file formats - unless their developers have 100% confidence that services are not disrupted by migration and users are guaranteed to not lose information. How realistic do you consider it that others can natively adopt an externally defined file format as-is in their applications, or even spend time looking into such a possibility - unless they are involved early in the definition of the format?

Since F3 emerged from the Gitea internal import/export format, the idea that F3 immediately becomes a substitute was considered. But that involves significant and challenging changes to the internal infrastructure, even though the feature set of F3 is an exact match with the feature set of the internal import/export.

Instead, it was decided to add support for F3, as an alternate format for import/export which is also capable of mirroring (an important feature for forge federation). The downside of this approach could be work redundancy: the existing import/export needs to be maintained while users gain confidence that F3 is stable enough to match their needs. However, the current codebase is not being worked on actively and there is no indication that it will be within the year to come. Therefore there will be very little redundant work and the hope is that this approach will allow F3 to mature within the Gitea codebase. It will then become the most popular export/import module and replace the old one.

While this strategy may work for Gitea, a different one will have to be defined for Sourcehut, GitLab, Pagure, Fossil etc. Realistically each forge will require an ad-hoc strategy if F3 is to be adopted. From a technical standpoint having a Go reference implementation is an asset: it can be used as a binary library in other programming languages. For instance, in order for Pagure to use F3, a Python module linked with the Go F3 library can be implemented. Even though this is work, it is an order of magnitude simpler than rewriting the entire F3 implementation from scratch, down to the last detail and the last corner case.

To summarize, the strategy for adoption is based on:

  • Initial inclusion as an optional import/export format capable of mirroring, with links to federation
  • Availability of a reference implementation in Go to be reused by all languages

On top of this comes the expected amount of advocacy, glue code and dialog with forge implementors.

Main tasks breakdown

Could you provide a breakdown of the main tasks, and the associated effort?

Here are the main tasks, which can be further subdivided per-feature (assets, issue, pull request, etc.) into a detailed workplan with associated work efforts.

Specification and documentation

The F3 Specification includes:

  • An introduction
  • A guide to concepts
  • JSON Schema with embedded documentation
  • Release notes
  • A normative file hierarchy
  • A glossary of terms and their definition

Milestones:

  • The JSON Schemas for F3 are published in a dedicated repository
  • A formatted online version of the documentation
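
To give an idea of what checking a document against the specification involves, here is a minimal sketch in Go. It is not the F3 schema: the Issue type and its fields (index, title, state, labels) are assumptions made up for the illustration. It also does not perform JSON Schema validation, for which the Go standard library has no support. Strict decoding with encoding/json is used as a stand-in: unknown fields are rejected and a few constraints are checked by hand.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// Issue is a hypothetical, heavily simplified issue document.
// The field names are assumptions, not the published F3 schema.
type Issue struct {
	Index  int64    `json:"index"`
	Title  string   `json:"title"`
	State  string   `json:"state"`
	Labels []string `json:"labels,omitempty"`
}

// ValidateIssue decodes raw JSON strictly and applies a few hand-written
// constraints. It stands in for real JSON Schema validation.
func ValidateIssue(raw []byte) (*Issue, error) {
	dec := json.NewDecoder(bytes.NewReader(raw))
	// Reject any field that the (assumed) format does not define.
	dec.DisallowUnknownFields()

	var issue Issue
	if err := dec.Decode(&issue); err != nil {
		return nil, fmt.Errorf("decoding issue: %w", err)
	}
	if issue.Index <= 0 {
		return nil, errors.New("index must be a positive integer")
	}
	if issue.Title == "" {
		return nil, errors.New("title is required")
	}
	if issue.State != "open" && issue.State != "closed" {
		return nil, fmt.Errorf("unknown state %q", issue.State)
	}
	return &issue, nil
}

func main() {
	good := []byte(`{"index": 1, "title": "Crash on startup", "state": "open"}`)
	bad := []byte(`{"index": 2, "title": "Typo", "state": "open", "colour": "red"}`)

	for _, doc := range [][]byte{good, bad} {
		if _, err := ValidateIssue(doc); err != nil {
			fmt.Println("invalid:", err)
		} else {
			fmt.Println("valid")
		}
	}
}
```

With published JSON Schemas, the hand-written checks would be replaced by a schema validator, so that every implementation, whatever its language, enforces the same rules.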

Monthly reports and videoconferences

Stakeholders (forge users, implementors and operators) are invited to monthly videoconferences to express their needs and keep up to date with the development of F3. The outcome of this dialog is expected to be included in the User Research report and define the next iteration of the F3 specifications.

Milestones:

  • A report with detailed description of the work done over the past month
  • A recording of the videoconference

Documentation of the F3 lifecycle

  • Guide for newcomers
  • Roadmap
  • Low-hanging fruit
  • Reference for specification developers

Milestones:

  • A repository with the documentation
  • A formatted online version of the documentation

First release

The first F3 release is a bundle that includes:

  • The specifications and documentation
  • The Go reference implementation

They are verified to be consistent and tagged with the same version number.

Milestone: simultaneous publication of F3 version 1.0.0 at:

Go package reference implementation

A reference implementation of F3 in Go provides:

  • An API for integration in a forge written in Go
  • Validation of an F3 archive (JSON Schema validation, etc.)
  • Import and Export support for Gitea and GitLab
  • Dataset generators and fixtures to verify the conformance with the specifications
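
The shape such an API could take is sketched below, under stated assumptions: the Forge interface, the MemoryForge type and the Mirror function are hypothetical names invented for this illustration and are not taken from the F3 codebase. The sketch only shows the idea that import and export behind a common interface are enough to mirror a project from one forge to another.

```go
package main

import (
	"fmt"
	"sort"
)

// Issue is a hypothetical, simplified stand-in for an F3 issue.
type Issue struct {
	Index int64
	Title string
	State string
}

// Forge is a hypothetical driver interface; the actual API may differ.
type Forge interface {
	ExportIssues() ([]Issue, error)
	ImportIssue(Issue) error
}

// MemoryForge keeps issues in memory. It stands in for a real forge
// such as Gitea or GitLab.
type MemoryForge struct {
	issues map[int64]Issue
}

// NewMemoryForge returns a forge pre-populated with the given issues.
func NewMemoryForge(issues ...Issue) *MemoryForge {
	f := &MemoryForge{issues: make(map[int64]Issue)}
	for _, issue := range issues {
		f.issues[issue.Index] = issue
	}
	return f
}

// ExportIssues returns all issues sorted by index.
func (f *MemoryForge) ExportIssues() ([]Issue, error) {
	out := make([]Issue, 0, len(f.issues))
	for _, issue := range f.issues {
		out = append(out, issue)
	}
	sort.Slice(out, func(a, b int) bool { return out[a].Index < out[b].Index })
	return out, nil
}

// ImportIssue creates or updates an issue, so mirroring can be repeated.
func (f *MemoryForge) ImportIssue(issue Issue) error {
	f.issues[issue.Index] = issue
	return nil
}

// Mirror copies every issue from src to dst and returns how many were copied.
func Mirror(src, dst Forge) (int, error) {
	issues, err := src.ExportIssues()
	if err != nil {
		return 0, fmt.Errorf("export: %w", err)
	}
	for _, issue := range issues {
		if err := dst.ImportIssue(issue); err != nil {
			return 0, fmt.Errorf("import issue %d: %w", issue.Index, err)
		}
	}
	return len(issues), nil
}

func main() {
	src := NewMemoryForge(
		Issue{Index: 1, Title: "Crash on startup", State: "open"},
		Issue{Index: 2, Title: "Typo in README", State: "closed"},
	)
	dst := NewMemoryForge()

	n, err := Mirror(src, dst)
	if err != nil {
		panic(err)
	}
	fmt.Printf("mirrored %d issues\n", n)
}
```

Because the import step updates existing issues instead of duplicating them, running Mirror a second time leaves the destination unchanged. Repeated mirroring depends on this property.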

Milestone: The Go package is published at https://pkg.go.dev/

Integration in the Gitea codebase

The F3 Go reference implementation is used as an alternative to the internal format used by the repository dump and restore features in the Gitea codebase.

Milestones: pull request merged in https://forgefriends.org or https://gitea.io

User Research

User Research involving forge implementors, forge users and forge operators is conducted to determine which features are most important for them to develop during the next iteration of the F3 specification.

Milestones:

  • A User Research report including three emerging themes