mCaptcha - NLnet grant application - August 2022

realaravinth · October 5, 2022, 4:29pm

My response to the email above:

Hello <redacted>,

Apologies for the delayed response.

I invited @gusted[0], a Codeberg volunteer who is leading the mCaptcha deployment in Codeberg[1], to participate in the grant and he kindly accepted. He implemented mCaptcha support in Gitea[2] through a Go client library that he developed[3].

I will respond to the other questions asked by Friday.

Once again, I apologize for the delay. I got caught up with some things offline.

Warm regards,
Aravinth

---
[0]: https://gusted.xyz
[1]: https://codeberg.org/Codeberg/Community/issues/479#issuecomment-600240
[2]: https://github.com/go-gitea/gitea/pull/20458
[3]: https://codeberg.org/Gusted/mCaptcha

I’ll post a draft response to the other questions in a moment.

realaravinth · October 5, 2022, 5:15pm

Response from NLnet to the above email:

Hi Aravinth,

> I invited @gusted[0], a Codeberg volunteer who is leading the mCaptcha deployment in Codeberg[1], to participate in the grant and he kindly accepted. He implemented mCaptcha support in Gitea[2] through a Go client library that he developed[3].

that is excellent. Have you discussed an amount for that contribution and an associated rate (we'll have to do that soon anyway should the project be accepted)?

> Once again, I apologize for the delay. I got caught up with some things offline.

These things happen, hopefully things are good now 😉

Best,
<redacted>

My response to the above email:

Hi <redacted>,

Thanks for the swift response! 😄

> that is excellent. Have you discussed an amount for that contribution and an associated rate (we'll have to do that soon anyway should the project be accepted)?

I'm drafting a response to the questions asked earlier. I'll work with Gusted and include a list of the tasks that he's willing to work on, along with his rates.

> These things happen, hopefully things are good now 😉

Thanks for understanding, and yes things are much better now :D

Warm regards,
Aravinth

realaravinth · October 5, 2022, 5:37pm

My response to the questions asked:

What are your thoughts on PrivacyPass?

The benefits of implementing PrivacyPass within mCaptcha are marginal. PrivacyPass is designed to improve the experience of visitors using VPNs and Tor. So it assumes that Tor/VPN visitors have bad experiences with CAPTCHAs, which isn’t true for mCaptcha.

Also, PrivacyPass is disabled when an attack/surge is detected. mCaptcha detects surges in seconds and increases the difficulty factor to contain the surge. So if PrivacyPass is implemented, then it will only be used in normal conditions, when the CAPTCHA takes less than 200ms to solve. Using PrivacyPass in this situation doesn’t yield significant UX improvements for the increase in code complexity.

Would the Codeberg team be willing to join this project?

As mentioned in the previous email, I invited @gusted to participate in the grant, and he kindly agreed.

Gusted’s objectives:

TODO: @gusted, please add the list of tasks that you are interested in working on. Also, kindly mention your rates against the tasks.

You requested 28500 euro, equivalent to one year of fulltime work. Could you provide a breakdown of the main tasks, and the associated effort?

Tasks

The grant application includes a full list of tasks, which is also available here.

Objective 1: Proof-of-Work accessibility:

Difficulty rating: intermediate

mCaptcha currently has two Proof-of-Work libraries: WebAssembly and a JavaScript polyfill. The survey must collect benchmarks using both, since a visitor might end up using either library. Percentile scores must be calculated on the aggregated results, so that webmasters who integrate mCaptcha in their websites can make informed decisions on difficulty factors that will work for most of their visitors. The results must also be published under open-access licenses.

The benchmark code partially exists, but the processing and publishing mechanisms don’t exist yet.
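
To illustrate the kind of processing this objective calls for, here is a minimal sketch of a nearest-rank percentile calculation over solve times. The sample numbers, the variable names and the choice of the nearest-rank method are all assumptions made for illustration; this is not the survey's actual processing code.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(samples, points=(50, 90, 99)):
    """Map each requested percentile to its value."""
    return {p: percentile(samples, p) for p in points}


# Hypothetical solve times in milliseconds at one difficulty factor.
wasm_ms = [40, 55, 60, 72, 90, 110, 130, 150, 210, 400]
js_ms = [300, 420, 480, 510, 640, 700, 820, 950, 1300, 2600]

print("WebAssembly:", summarize(wasm_ms))
print("JavaScript polyfill:", summarize(js_ms))
```

A webmaster could read such a summary as "90% of surveyed devices solved this difficulty factor within N ms" for each library, and pick a difficulty factor accordingly.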

Objective 2: Horizontal scaling

Difficulty rating: hard

mCaptcha uses a leaky bucket algorithm to scale Proof-of-Work difficulty in response to traffic. The implementation that currently exists within mCaptcha isn’t distributed, so it is a bottleneck for deployments on popular websites.

So I must implement a distributed version of the same algorithm. The new implementation must also be verified for correctness. To verify it, I’ll have to create Infrastructure-as-Code for automated deployment in a test environment.

Both the distributed leaky bucket algorithm and the full-system Infrastructure-as-Code are time-consuming, so this objective is rated “hard”.
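
For readers unfamiliar with the idea, the following is a minimal single-node sketch of leaky-bucket difficulty scaling. It is not mCaptcha's actual implementation: the class name, the leak rate, and the threshold and difficulty values are hypothetical, and timestamps are passed in explicitly so the example runs deterministically. A distributed version would additionally need the counter to be shared and kept consistent across nodes, which is the hard part this objective refers to.

```python
class LeakyBucket:
    """A counter that fills by one per visitor and drains at a fixed rate."""

    def __init__(self, leak_per_second, levels):
        # levels: (visitor_threshold, difficulty_factor) pairs, hypothetical values.
        self.leak_per_second = leak_per_second
        self.levels = sorted(levels)
        self.count = 0.0
        self.last = 0.0

    def _leak(self, now):
        # Drain the bucket in proportion to the time elapsed since the last update.
        elapsed = max(0.0, now - self.last)
        self.count = max(0.0, self.count - elapsed * self.leak_per_second)
        self.last = now

    def difficulty(self):
        # Pick the highest level whose threshold the current count has reached.
        factor = self.levels[0][1]
        for threshold, level_factor in self.levels:
            if self.count >= threshold:
                factor = level_factor
        return factor

    def add_visitor(self, now):
        # Record one visitor at time `now` and return the difficulty to serve.
        self._leak(now)
        self.count += 1
        return self.difficulty()


bucket = LeakyBucket(
    leak_per_second=1.0,
    levels=[(0, 500), (5, 5_000), (10, 50_000)],
)

# A burst of ten visitors at t=0 pushes the difficulty up level by level.
burst = [bucket.add_visitor(0.0) for _ in range(10)]

# Twenty seconds later the bucket has drained and difficulty is back to normal.
calm = bucket.add_visitor(20.0)
print(burst, calm)
```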

Objective 3: Integration test

Difficulty rating: hard

mCaptcha, at the moment, is maintained solely by me. Full system integration tests covering all configuration matrices will significantly improve quality and stability, and ease maintenance.

Currently, extensive unit tests exist within individual programs and libraries, but full system integration tests don’t exist. In order to set this up, I’ll have to deploy a test runner (requested as part of this application), write Infrastructure-as-Code to set up the test environment, and run the tests periodically.

It is an involved and time-consuming process and so it is rated “hard”.

dachary · October 5, 2022, 8:09pm

You will find my comments and suggestions in the chatroom

Gusted · October 8, 2022, 3:46pm

I will be working on making libraries (in Go, Rust, JavaScript, etc.) to interact with mCaptcha’s API. Building each library includes designing a general structure (shared across programming languages), documenting the library, implementing tests, and of course writing the code itself.

I will be working at Codeberg to deploy an mCaptcha instance that will be used in combination with codeberg.org’s Gitea instance. So as a task (if needed; I’m not sure whether this will actually be taken up), I can help other server admins set up an mCaptcha instance.

Given the experience gained from setting up mCaptcha in a real-world scenario, I will be able to improve the documentation and the process of setting up and maintaining an mCaptcha instance.

Feel free to reword this for the grant so the language and tone stay consistent.

Gusted · October 8, 2022, 6:37pm

For the tasks that I will be working on, I will be requesting a rate of €20/h, which was determined by the difficulty of these tasks. @realaravinth Are you fine with this?

realaravinth · October 10, 2022, 9:20am

Perfect!

realaravinth · October 12, 2022, 5:27am

Received response from NLnet:

Dear Aravinth and Gusted,


you applied to the first NGI0 Entrust open call from NLnet, round August 2022. We have kept you in suspense for a while, because this call was the single largest in our history in terms of proposals that needed to be processed. This is done, however, and currently a selection of the projects is pending the final stage review by an independent review committee to validate their eligibility, and we are happy to inform you that this includes your project "mCaptcha" (2022-08-142). Should your project pass that final hurdle (which under normal circumstances it should, but please do not seek external publicity until it is officially confirmed), the selection will be made public and we will contact you in order to establish a Memorandum of Understanding. The final amount of the grant will be determined at that point.

We will then also need to share some information about the project both with the general audience and with the European Commission. In the interest of time, we ask you to prepare a **one paragraph management summary** of the project. For examples we refer you to https://nlnet.nl/thema/NGIZeroPET.html

We kindly request you to send us this summary as soon as possible.

If you meanwhile have any questions, please let us know.

Kind regards,
on behalf of NLnet foundation,

cc @gusted

realaravinth · October 13, 2022, 1:55pm

My response:

Hello <redacted>,

Thanks for the good news. Here's the management summary that you asked for:

----------

Existing CAPTCHA systems expect visitors to identify objects to prevent spam, which makes the web inaccessible to persons with cognitive, auditory, and visual special needs. They log Internet Protocol (IP) addresses and use tracking technologies, like cookies, to track and profile their users across the internet. IP logging and cookie-based tracking are privacy-invasive, inaccurate, and impossible to use with anonymizing technologies like Tor and VPNs. Censors can abuse the opaque nature of these systems to prevent certain groups from accessing certain types of information. Independent testing for bias is not possible since the documentation doesn't exist for their methods and algorithms.


mCaptcha is an attempt at creating a self-hosted alternative to reCAPTCHA and hCaptcha with a focus on privacy, transparency, user experience, and accessibility. mCaptcha’s Proof of Work (PoW) mechanism uses strong cryptographic principles that guarantee idempotency and transparency. mCaptcha doesn’t log IP addresses and doesn’t require tracking user activity across the internet. Censors can’t use mCaptcha to deny access to information without detection. Also, the PoW mechanism requires minimal user interaction to solve the CAPTCHA, which will significantly improve the accessibility of the web.

----------

Warm regards,
Aravinth

realaravinth · October 13, 2022, 5:31pm

Response from NLnet:

Hi Aravinth and Gusted,

> Thanks for the good news. Here's the management summary that you asked for:

thank you very much for the summary, much appreciated!

We will keep you posted on the outcome of the external review committee.

Meanwhile, take care!

Best,
<redacted>

realaravinth · October 24, 2022, 12:15pm

We are in!

Dear Aravinth and Gusted,


Congratulations! We have received the green light from the independent review committee. That means your project "mCaptcha" (2022-08-142) is one of the selected proposals eligible to receive a grant from NLnet foundation in the August 2022 NGI0 Entrust call!

We should set up a call in order to undertake the necessary further steps - leading up to a Memorandum of Understanding that includes a concrete project plan with pertinent milestones. Note that the final amount of the grant will be determined in dialogue with you, also taking into account any new insights during the negotiations.

We at NLnet Foundation are very much looking forward to working with you, together with our partners in NGI0 Entrust - which we will tell you more about during our upcoming call. You will also find the key information in the attached document.

Can you please indicate some convenient dates in the coming weeks?

If you meanwhile have any questions, please let us know.

Kind regards,
on behalf of NLnet foundation,
<redacted>

cc: @gusted

realaravinth · October 24, 2022, 12:38pm

Ran a poll to determine ideal dates. I have tests from October 28th to November 10th. Hopefully NLnet is available on the 26th.

My response:

Dear <redacted>,

Thanks for the email, this is great news! :)

Gusted and I are available on October 26th from 14:00-16:00 UTC+2. Please let me know if this timing is convenient for you.

Warm regards,
Aravinth

realaravinth · November 2, 2022, 4:39am

I forgot to post the following messages:

From me to NLnet on 22:09 IST 25th October, 2022:

Dear <redacted>,

Apologies for requesting the meeting on such short notice. It's just that Gusted is busy this week except for the 26th[0], and I will be unavailable from 1st to 7th November.

If 26th is too tight, can you please let me know if we can pick a date after November 7th?

Thanks!

Warm regards,
Aravinth


[0]: https://framadate.org/mcaptcha-nlnet-2022-01

Received response from NLnet at 22:27 IST 31st October, 2022:

Sorry for the delay, we are very busy rounding up two previous grant programmes, which finish today. Tomorrow we will start planning our calls. A call next week, after the 7th, is fine.

— <redacted>, NLnet

The above email is from a different person at NLnet. My original handler sent the following response at 22:27 on 31st October, 2022:

Hi Aravinth and Gusted,

> Gusted and I are available on October 26th from 14:00-16:00 UTC+2. Please let me know if this timing is convenient for you.

we didn't make that date, apologies - we were very busy with a deadline.
Would you have some new dates for us? For instance on Friday the 11th?

Best,
<redacted>

For the record, I’ve been the only person receiving these emails. I have requested @gusted’s email address and will start cc’ing him in future communications.

dachary · November 2, 2022, 7:15am

Maybe make an alias that forwards to both of you. That works for the Gitea application.

realaravinth · November 2, 2022, 7:37am

Sent the following email:

Hello <redacted>,

Gusted and I are available from 12PM - 4PM UTC+1 on the 11th. Please let me know if this timing is inconvenient, and we will come up with alternate dates :)

Warm regards,
Aravinth

realaravinth · November 2, 2022, 8:54am

Received positive response on the timing from NLnet:

Hi Aravinth, Gusted,


> Gusted and I are available from 12PM - 4PM UTC+1 on the 11th. Please let me know if this timing is inconvenient, and we will come up with alternate dates 😄

that works beautifully. I've noted you down for 12h CET. I am aware that this is a late hour for you, so as early as possible 😉

I hope you can find the time to look into the intake document we sent before.

I suggest we use an open source tool called Galene, which we host an instance of ourselves:

<redacted>

(you can use any name or password) Let me know if that works for you.

Looking forward to meeting you both and discussing the project!

Best,
<redacted>

realaravinth · November 2, 2022, 11:00am

Sent confirmation email:

Hello <redacted>,

 > that works beautifully. I've noted you down for 12h CET. I am aware that
 > this is a late hour for you, so as early as possible ;)

Thank you, it is perfect!

> Let me know if that works for you.

And we confirm that we are able to log into the Galene instance.

Looking forward to meeting you!

Warm regards,
Aravinth

dachary · November 7, 2022, 8:41pm

You are in the final phases and may have doubts about what you can and cannot publish. From asking during a previous grant, the answer is: everything, except for the exact amounts and anything you are explicitly told not to publish, such as a recording of the intake call. To be on the safe side, you can explicitly ask your contact.

So for instance, there is nothing confidential in the intake document or the MoU, except for private data and amounts, which you can conveniently redact. It would be great if you published them when the time comes so there are more examples out there, not just what I published.

realaravinth · November 8, 2022, 8:49am

Thanks for the tip! I’ll take it up with my handler and publish everything that is safe to publish.

realaravinth · November 18, 2022, 12:42pm

@gusted and I have come up with a draft work plan for the grant. The work plan includes all tasks that were included in the proposal and some additional ones that @gusted will be working on.

We haven’t assigned weights to tasks, but we will be assigning concrete amounts against them soon.

Some backlog in reporting:

Gusted and I attended an intake call with NLnet where some basic information regarding the grant and our collaboration was presented. I hadn’t received prior approval from them to publish our communications, so they had some reservations about it. Going forward, I’ll seek their permission before uploading communications here.

We also received a draft MoU yesterday (2022-11-17). The MoU was missing some information, so I emailed today requesting clarification.