Thank you for this writeup.
I had overlooked the fact that you wanted to set up a proxy that would use the API to get the data, then expose that data using ActivityPub. For some reason I was thinking about modifying GitLab, but of course that cannot be done with GitHub. OK.
That also explains why you want to export everything, store it on disk and diff it against the new export the next time you query the project data.
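To check that I understand the export-and-diff idea, here is a minimal sketch in Python. It assumes the export is a JSON object keyed by some stable identifier (the `issue/1` style keys and the function names are made up for illustration, not taken from your proxy):

```python
import json
from pathlib import Path


def load_snapshot(path):
    """Return the previously stored snapshot, or an empty dict on the first run."""
    p = Path(path)
    if not p.exists():
        return {}
    return json.loads(p.read_text(encoding="utf-8"))


def save_snapshot(path, snapshot):
    """Write the snapshot to disk so the next run can compare against it."""
    text = json.dumps(snapshot, indent=2, sort_keys=True)
    Path(path).write_text(text, encoding="utf-8")


def diff_snapshots(old, new):
    """Compare two snapshots keyed by a stable identifier (hypothetical keys)."""
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    changed = sorted(k for k in new if k in old and old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}
```

Each entry in `added`, `removed` or `changed` would then be something the proxy may want to announce on the ActivityPub side.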
Could we separate the two communication channels? There is one between the forge and the proxy, and another one between the proxy and the fediverse using ActivityPub.
My comments were targeted at the second communication channel. This is where I think RDF written as JSON-LD with the doap* ontologies makes the most sense, because the DOAP data model is already in use.
For example https://www.cubicweb.org/project/cubicweb?vid=doap is a description of the cubicweb project using DOAP. It is serialized as RDF/XML instead of RDF/JSON-LD, but that is a detail. What is important is that any application that can load and transform DOAP data can make sense of it. I am convinced this is something to leverage and build on.
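To make this more concrete, here is a rough sketch of what a DOAP description serialized as JSON-LD could look like when built by the proxy. The input field names (`name`, `web_url`, `description`, `clone_url`) and the example URLs are assumptions, not the real output of any forge API; only the DOAP namespace and the `doap:` terms come from the DOAP vocabulary:

```python
import json

# Namespace of the DOAP vocabulary.
DOAP_NS = "http://usefulinc.com/ns/doap#"


def project_to_doap_jsonld(project):
    """Map a forge project record (hypothetical field names) to DOAP as JSON-LD."""
    doc = {
        "@context": {"doap": DOAP_NS},
        "@id": project["web_url"],
        "@type": "doap:Project",
        "doap:name": project["name"],
        "doap:homepage": {"@id": project["web_url"]},
    }
    # Optional fields are only emitted when the forge provided them.
    if project.get("description"):
        doc["doap:description"] = project["description"]
    if project.get("clone_url"):
        doc["doap:repository"] = {
            "@type": "doap:GitRepository",
            "doap:location": {"@id": project["clone_url"]},
        }
    return doc


if __name__ == "__main__":
    example = {
        "name": "example-project",
        "web_url": "https://forge.example.org/example-project",
        "clone_url": "https://forge.example.org/example-project.git",
    }
    print(json.dumps(project_to_doap_jsonld(example), indent=2))
```

How such a document would be attached to or referenced from ActivityPub objects is a separate question that this sketch does not try to answer.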