Higher Education and Research in France. software forges: analysis, current limitations, actions

Bonjour,

I was informed today that a working group is studying software forges and how they could be used in academia in France. The summary of their progress is public (rare & appreciated :+1: ). It is all in French, but I will translate it for the benefit of non-French-speaking people participating in forgefriends so that they can comment and contribute (this is part of the effort to improve linguistic diversity).

https://www.ouvrirlascience.fr/the-free-and-open-source-software-expert-group/

@sibichakkaravarthy this may be of particular interest to you in the context of the ongoing involvement of AIR on the general topic of reproducible software?

Cheers

In the following, HER means Higher Education and Research.


title: “Forges: needs analysis, identification of current limitations, proposals for action”.
author: "WG 2: Tools and good technical and social practices"
date: April 2022


Needs

Creation of public and private projects, to allow the design of the software internally and its public release when its level of maturity allows it.

Version management

  • commit, branches, releases
  • collaboration (merge request, history, sharing etc)
  • backup/archiving

Ticket management

Continuous integration for various platforms (to make it easy to build deliverables from the source code).
For developers, the goal is also to test and validate the code on different OSes and configurations as development proceeds. The “reproducible” aspect offered by CI (for example via images stored in registries) is also very important.
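As a rough illustration of testing "for different OS/config", the sketch below enumerates one CI job per (OS, configuration) pair. The OS names, configuration names and job naming scheme are hypothetical and not taken from the working group's document; a real forge would express this in its own CI configuration format.

```python
from itertools import product

# Hypothetical targets; a real project would list what it actually supports.
OPERATING_SYSTEMS = ["linux", "macos", "windows"]
CONFIGURATIONS = ["debug", "release"]


def build_matrix(systems, configurations):
    """Return one CI job description per (OS, configuration) pair."""
    return [
        {"name": f"test-{system}-{config}", "os": system, "config": config}
        for system, config in product(systems, configurations)
    ]


jobs = build_matrix(OPERATING_SYSTEMS, CONFIGURATIONS)
for job in jobs:
    print(job["name"])
```

The number of jobs grows as the product of the two lists, which is one reason shared runners become a resource question for forge administrators.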

Ability to interact with users outside HER (tickets, code contributions, documentation).

Integration of the state of the art in software engineering (static code analysis, testing, quality assurance, documentation, etc.).

Also use these platforms for collaborative writing of documentation and web pages, using tools that transform structured text files (Markdown, AsciiDoc). This requires continuous integration with the transformation tools and deployment to a public area (such as GitHub/GitLab Pages).

Use of standard tools to facilitate training and acculturation.

List of “public” HER forges

INRIA (gitlab)

gitlab.inria.fr

CNRS (gitlab)

src.koda.cnrs.fr

gitlab.in2p3.fr

plmlab.math.cnrs.fr

gitlab.huma-num.fr

IRD (gitlab)

forge.ird.fr

CIRAD (gitlab)

gitlab.cirad.fr

INRAE (Gitlab)

forgemia.inra.fr

RENATER (fusionforge)

sourcesup.renater.fr

Laboratories or universities

Add here the (public) forges of labs/universities, if you know of any.

gricad-gitlab.univ-grenoble-alpes.fr

gitub.u-bordeaux.fr Accueil · Wiki · Administrator / infosgithub · GitLab

gitlab.lip6.fr

git.unicaen.fr

In Lille, to be checked (access without authentication, where you can have an account and create projects): gitlab.univ-lille.fr, gitlab.cristal.univ-lille.fr, gitlab-etu.fil.univ-lille.fr, gitlab-ens.univ-lille.fr, gitlab-fil.univ-lille.fr

Current forge features

| Forge | Identification | Outside HER | Continuous integration | Other services |
|---|---|---|---|---|
| gitlab.inria.fr | Inria | external sponsored guests | GitLab-CI | |
| src.koda.cnrs.fr | CNRS (Janus) | No | | |
| gitlab.in2p3.fr | EduGAIN | external users | ? | |
| plmlab.math.cnrs.fr | Renater | No | ? | |
| forge.ird.fr | Renater | CRU accounts - others in progress | | gitlab pages |
| gitlab.cirad.fr | Renater | No | Gitlab-CI | |
| sourcesup.renater.fr | Renater | No | ? | |
| gricad-gitlab.univ-grenoble-alpes.fr | UGA | Yes: on simple registration (without validation) but with limited permissions. | Gitlab-CI (with shared runners) | |
| gitub.u-bordeaux.fr | Université de Bordeaux | After | | |
| gitlab.lip6.fr | LDAP LIP6 | No | GitLab-CI (without public runner) | |
| gitlab.huma-num.fr | HumanID | No (but flexible in reality) | GitLab-CI | |

Limitations

The main current limitation of existing forges is that it is often not possible for someone outside of HER to access them. Most forges require identity-federation access (at the HER or institution level) to use all their features.

Although some allow the creation of external accounts, these are often difficult to obtain (for example, on gitlab.inria.fr an external account must be “sponsored” by a member of an Inria project team) and limited (a GitLab external account cannot create its own projects). It is therefore often impossible, or difficult, to propose changes with this kind of account, because doing so requires forking the original project and thus creating a project of one’s own on the forge. Reporting a bug requires obtaining an account beforehand, which can be very difficult.

It is possible to synchronize local (HER) forges with public forges (gitlab.com or github.com). However, you lose the centralization of information. In this case you have to disable the public ticket system to keep the tickets on the local forge. But this solution is not practical for contributors, who have to use two accounts: one for tickets (bug reports, suggestions) and another for contributions (code, docs).
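A minimal sketch of what such a synchronization can look like at the git level, assuming a one-way mirror from the local forge to a public one. The URLs, the `workdir` name and the `mirror_commands` helper are hypothetical; only the git subcommands and flags (`clone --mirror`, `remote update --prune`, `push --mirror`) are real. The function only builds the commands so it can be inspected without touching any repository.

```python
import shlex


def mirror_commands(source_url, mirror_url, workdir="project.git", first_run=True):
    """Return the git commands for a one-way mirror from source_url to mirror_url."""
    if first_run:
        # Bare clone that copies every ref (branches, tags, notes).
        fetch = ["git", "clone", "--mirror", source_url, workdir]
    else:
        # Refresh an existing mirror clone and drop refs deleted upstream.
        fetch = ["git", "-C", workdir, "remote", "update", "--prune"]
    # Overwrite the refs on the public forge with the local state.
    push = ["git", "-C", workdir, "push", "--mirror", mirror_url]
    return [fetch, push]


# Hypothetical URLs, for illustration only.
for command in mirror_commands(
    "https://forge.example.org/team/project.git",
    "https://github.com/example/project.git",
):
    print(shlex.join(command))
```

A mirror like this carries git refs only. Tickets, merge requests and discussions stay on the forge where they were created, which is why the information ends up split between two places.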

Another limitation is the absence of a single showcase for the software production of the HER, or even of a single EPST or university. This limitation can be compensated by a catalog of HER software production (see WG 1).

Speaking of showcases, there are currently many forges in which the creation of public projects is not “validated”. So we find public projects that range from “hello world” to a big application that has been maintained for years. There is no filter. This is a problem for a showcase (which is meant to highlight projects of a certain “quality”). It is not a problem if you want to provide a technical platform.

The lack of control (or the weak control) over project creation makes managing these projects complicated. It is difficult (impossible?) for administrators to determine which projects should be kept. Another difficulty is managing user accounts as their status changes (people joining or leaving HER, etc.).

Another complicated point to manage (due to the possibility of registering on the platform without validation) is spam via snippets and issues for public projects.

Another important point: offering runners for continuous integration raises security issues. Public forges have restricted or simply disabled the accounts that allowed free use of CPU time on runners.

Actions

  • To be discussed: Encourage HER forges to adopt identity federation such as eduGAIN. This would make it easier for HER staff to contribute to existing projects. (N.B. For non-HER contributions, would allowing OAuth authentication via GitHub.com be feasible?)
  • To be discussed: Support Forgefriends, a project to enable forge federation. The federation of forges could, in the long run, solve some of the limitations mentioned above, by allowing people to contribute to a project from the forge of their choice. There is also ForgeFed, linked to the fediverse, to be looked into.
  • Identify possible training courses in the HER on the use of forges (licenses, git, continuous integration, quality assurance, etc.) for technical audiences (computer scientists) and non-technical ones: DevLog, GDR GPL, internal training at IRD, CNRS, INRIA (experimentation and development service, SED). (see Je code : Les bonnes pratiques de développement logiciel)
  • Provide ideas for a policy on opening up code

Ideas (to be explored)

Use of OpenID? OpenID Connect OmniAuth provider | GitLab

What about GitHub?

The CodeGouv platform references 2351 repositories under the HER.
93% of these repositories are hosted on GitHub. The 168 remaining repositories mostly use GitLab instances: forgemia.inra.fr (16), framagit.org (11), git.unicaen.fr (21), gitlab.inria.fr (76), gitlab.com (27)…
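The figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; the variable names are illustrative, and the per-instance list is partial because the original ends with "…".

```python
# Figures quoted above from the CodeGouv platform (HER repositories).
TOTAL_REPOS = 2351
NON_GITHUB_REPOS = 168

# Per-instance counts listed in the text; the list is explicitly partial.
LISTED_INSTANCES = {
    "forgemia.inra.fr": 16,
    "framagit.org": 11,
    "git.unicaen.fr": 21,
    "gitlab.inria.fr": 76,
    "gitlab.com": 27,
}

github_repos = TOTAL_REPOS - NON_GITHUB_REPOS
github_share = 100 * github_repos / TOTAL_REPOS
listed_total = sum(LISTED_INSTANCES.values())
unlisted = NON_GITHUB_REPOS - listed_total

print(f"GitHub: {github_repos} repositories ({github_share:.1f}%)")
print(f"Listed instances: {listed_total}, other forges: {unlisted}")
```

This gives 2183 repositories on GitHub (92.9%, consistent with the rounded 93%), 151 on the five listed instances, and 17 on forges not named in the list.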


Here is the mail I sent in reply to the proposal to comment on the document above and to participate in a videoconference to discuss forgefriends on June 17th, 2022. This is a translation from French and the original version can be found at the end of this message.


Evening,

With Roberto Di Cosmo participating in this working group and all that has been done to preserve the software in https://www.softwareheritage.org/, I will only comment on the federation of forges aspect and the vision carried by forgefriends[0], as I see it. I specify that it is a personal view and not a collective one: the project being horizontal, each member carries their own vision[1].

Ideally, every free software project should be found in multiple copies everywhere, identical or almost identical, continuously updated on software forges that constantly communicate with each other. When a researcher opens a bug on her forge, it is copied to other forges and the answers that another person may bring to it come back to her, without her needing to know on which forge they were originally written. A bit like e-mail communication, really.
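To make the idea concrete, here is a toy in-memory model of that behaviour: a comment posted on one forge is relayed to every connected forge, and replies travel back the same way. The `Forge` class and the forge names are hypothetical, and this is not how ForgeFed or ActivityPub work on the wire; it only illustrates the copy-and-relay principle.

```python
class Forge:
    """Toy forge that relays every issue comment to its peers."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.issues = {}  # issue id -> list of (origin forge, text)

    def connect(self, other):
        """Create a two-way link between two forges."""
        self.peers.append(other)
        other.peers.append(self)

    def post(self, issue_id, text):
        """Record a comment written locally and relay it to the federation."""
        self._receive(issue_id, self.name, text)

    def _receive(self, issue_id, origin, text):
        comment = (origin, text)
        thread = self.issues.setdefault(issue_id, [])
        if comment in thread:
            return  # already seen: this is what stops relay loops
        thread.append(comment)
        for peer in self.peers:
            peer._receive(issue_id, origin, text)


lab = Forge("lab-forge")
university = Forge("university-forge")
public = Forge("public-forge")
lab.connect(university)
university.connect(public)

lab.post("bug-1", "Crash when loading the dataset")
public.post("bug-1", "Fixed in the latest commit")
print(lab.issues["bug-1"])
```

The researcher on `lab-forge` sees the reply even though her forge has no direct link to `public-forge`. Deduplicating on (origin, text) is a simplification; a real protocol would use unique message identifiers.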

Since software projects then exist in multiple copies, their durability is much better and the task of Software Heritage is made easier. Each person can choose the forge they prefer according to their needs of the moment. They have at their fingertips the union of the functionalities of all the forges, instead of being confronted with the Cornelian choice that always amounts to giving up what a particular forge does not offer.

The way to reach this ideal goal is very simple and faces only one obstacle: the popularity of the centralized model carried by GitHub / GitLab, locked in by the proprietary software they publish and the billions artificially at stake (let’s compare this to the budget of Wikipedia to take a more reasonable measure). Everyone agrees, privately, that this is far from desirable. But there are very few people working on this, all of whom are involved to varying degrees at https://forgefriends.org.

This being said, what do we do in practice? Dreams are very good but in everyday life they are quite frustrating. My choice is to progress towards the ideal while avoiding any path that would make it impossible (or very difficult) to reach. If I were in the position of an organization providing resources to HER (which I am not, so I am well aware that the advice is more a projection of my imagination than anything else), I would do this:

  1. deploy a free software GitLab forge (GitLab CE) exclusively because it is the best compromise between freedom and features
  2. sign a support contract with a company totally independent from GitLab to avoid constant pressure to acquire proprietary extensions
  3. stay informed about the progress of projects aiming to make GitLab communicate with other software forges
  4. support these projects as much as possible to increase the probability of their success
  5. try to use projects that claim to make forges communicate as soon as they announce that it is possible (which is probably not true in most cases, let’s face it)

In other words, (i) choose a pragmatic solution while avoiding the pitfalls (ii) concretely support the projects that tend towards the ideal, even if they look rather shaky from a distance.

To conclude, I would be happy to talk about this in person with you, as it is a subject that I care about. To be in the spirit of radical transparency of the forgefriends project, I would really appreciate it if this part could be recorded and broadcast publicly (I can take care of the technical aspect), just as the monthly videoconferences[2] are, even the ones that organize the grant applications.

This gives me the opportunity to answer another question, about the forgefed project [3]. After more than a year of silence, work resumed two weeks ago on the formalization of this vocabulary, which describes software forges and is intended to be used with the ActivityPub protocol. A first public videoconference will take place on June 13, 2022 [4] and its recording will be available if you cannot attend.

Cheers

[0] Definition and scope of the forgefriends project
[1] Forgefriends Community Manifesto
[2] Forgefriends monthly update, May 19h 2022, 7pm-8pm UTC+2
[3] https://forum.forgefriends.org/c/development/forgefed/18
[4] Forgefed videoconference - June 13th 2pm UTC


Bonsoir,

Avec Roberto Di Cosmo participant à ce groupe de travail et tout ce qui a été fait pour préserver le logiciel dans https://www.softwareheritage.org/, je vais seulement commenter sur l’aspect fédération des forges et la vision portée par forgefriends[0], telle que je la conçois. Je précise que c’est une vue personnelle et non collective: le projet étant horizontal chacun membre porte sa vision[1].

L’idéal serait que tout projet logiciel libre se trouve à de multiple exemplaires un peu partout, à l’identique ou presque, mis à jour en continu sur des forges logicielles qui communiquent constamment entre elles. Quand une chercheuse ouvre un bug sur sa forge, c’est copié sur d’autres forges et les réponses qu’une autre personne peut y apporter lui reviennent, sans qu’elle ait besoin de savoir sur quelle forge cela a été rédigé à l’origine. Un peu comme on communique parcourriel, finalement.

Comme les projets logiciels existent alors à plusieurs exemplaires, leur pérennité est bien meilleure et la tache de Software Heritage grandement facilitée. Et chaque personne peut choisir la forge qu’elle préfère selon ses besoins du moment, elle a sous les doigts l’union des fonctionnalités de toutes les forges au lieu de se trouver confronté au choix cornélien qui revient toujours à renoncer à ce qu’une forge particulière ne propose pas.

Le chemin pour arriver à cet objectif idéal est très simple et se heurte à un seul obstacle: la popularité du modèle centralisé porté par GitHub / GitLab, bloqué par le logiciel propriétaire qu’ils éditent et les milliards artificiellement mis en jeu (comparons cela au budget de wikipedia pour prendre une mesure plus raisonnable). Tout le monde s’accorde a dire, en privé, que c’est loin d’être souhaitable. Mais il y a très peu de personnes qui travaillent sur le sujet, toutes impliquées à divers degré sur https://forgefriends.org.

Ceci étant posé, on en fait quoi en pratique ? Les rêves c’est très bien mais au quotidien c’est assez frustrant. Mon choix est de progresser vers l’idéal en évitant de m’engager dans une voie qui rendrait cela impossible (ou bien très difficile). Si j’étais dans la position d’une organisation fournissant des moyens à l’ESR (ce que je ne suis pas donc j’ai bien conscience que le conseil est plus une projection de mon imagination qu’autre chose), je ferais ceci:

  1. déployer une forge GitLab logiciel libre (GitLab CE) exclusivement parce que c’est aujourd’hui le meilleur compromis entre liberté et fonctionnalités
  2. souscrire un contrat de support à une entreprise totalement indépendante de GitLab afin d’éviter de subir une pression constante pour acquérir des extensions propriétaire
  3. se tenir informé des progrès des projets visant à faire communiquer GitLab avec d’autres forges logicielles
  4. soutenir ces projets autant que possible pour augmenter les probabilités qu’ils aboutissent
  5. essayer d’utiliser les projets qui prétendent faire communiquer des forges dès qu’ils annoncent que c’est possible (ce qui est probablement faux dans la plupart des cas, soyons réalistes)

Autrement dit, (i) choisir une solution pragmatique en évitant les embûches (ii) soutenir concrètement les projets qui tendent vers l’idéal, même s’ils ont l’air assez bancals vus de loin.

Pour conclure j’en parlerais volontiers de vive voix avec vous car c’est un sujet qui me tient à cœur. Pour être dans l’esprit de transparence radicale du projet forgefriends, j’apprécierais beaucoup si cette partie pouvait être enregistrée et diffusée publiquement (je peux me charger de l’aspect technique). Tout comme le sont les vidéoconférences mensuelles[2] et même celles qui organisent les demandes de financement.

Ce qui me donne l’occasion de répondre à une autre interrogation, à propos du projet forgefed[3]. Après plus d’un an de silence, le travail reprend depuis deux semaines sur la formalisation de ce vocabulaire décrivant les forges logiciels et destiné à être utilisé par le protocole ActivityPub. Une première vidéoconférence publique aura lieu le 13 Juin, 2022[4] et son enregistrement sera disponible si vous ne pouvez y participer.

A++

[0] Definition and scope of the forgefriends project
[1] Forgefriends Community Manifesto
[2] Forgefriends monthly update, May 19h 2022, 7pm-8pm UTC+2
[3] https://forum.forgefriends.org/c/development/forgefed/18
[4] Forgefed videoconference - June 13th 2pm UTC


I did not receive a reply to the mail above and assumed there was no interest from them. But I received a mail today, with a recording of the meeting where they were expecting me. Here is my reply:


Hi,

Thanks for the link to the recording, which I listened to carefully. Indeed there was a misunderstanding: since I did not receive a reply to my email, I did not write that down in my agenda. And I should have, because the discussion was most interesting. It gives me a lot to think about because it is the first time I have heard that kind of testimony, these concrete problems and real-world feedback. More than anything else, it makes me want to ask questions and understand better.

I’m available to attend another meeting, if you want to. But in all honesty, I confess that the state of forge federation is only theoretical at this point. I would be surprised if it turns out to solve a problem you actually have in the year to come :slight_smile:

Cheers


Bonjour,

Merci pour le lien de l’enregistrement que j’ai écouté avec attention. Effectivement il y a eu un malentendu: n’ayant pas eu de réponse à mon courriel je n’ai pas inscrit cela dans mon agenda. Et j’aurais du car la discussion était des plus intéressantes. Elle me donne beaucoup à penser car c’est la première fois que j’entends ces témoignages, ces problèmes concrets et ces retours d’expérience. Avant tout cela me donne envie de poser des questions et de mieux comprendre.

Je suis disponible pour assister à une prochaine réunion si vous le souhaitez. Mais, en toute honnêteté, je confesse qu’au stade ou en est la fédération des forges elle ne présente qu’un intérêt théorique. Je serais le premier surpris si elle s’avère résoudre un problème concret pour vous dans l’année qui vient :slight_smile:

A++

Another videoconference is scheduled for July 7th, 2022 at 11am. I’ll attend and publish the recording here.