25 Mar 2021 |
irl | they did not have an api when we were initially looking at matomo | 08:52:37 |
irl | only the javascript tracking code | 08:52:47 |
irl | but swapping out BSD/MIT licensed code for AGPL without much warning put me right off. you can argue that AGPL would then mean I have to release the source code of the entire website on request just because I embed the tracking code. | 08:54:03 |
irl | that's just liability I don't want to have to deal with, especially from an organisation that is more committed to profit than it is to free software. | 08:55:17 |
gina_h | The Markup is working on building their own version of analytics -- I reached out to them about Clean Insights but so far no response | 15:42:12 |
31 Mar 2021 |
Nathan GP (he) | We've got a new video up: https://www.youtube.com/watch?v=NT6LmCTpQ4k and a tweet about it here: https://twitter.com/guardianproject/status/1377351643407085568 | 20:07:19 |
2 Apr 2021 |
gina_h | In reply to @n8fr8:matrix.org: Nathan has a very Mr. Rogers vibe in this video. 😀 | 14:27:19 |
| simonft joined the room. | 16:49:15 |
simonft | Hello! I work at The Markup, where we've been exploring building privacy-preserving analytics, with goals that seem to align nicely with Clean Insights. | 16:50:43 |
Nathan GP (he) | Welcome, simonft :) In this room from the team are threeletteracronym and irl, our lead developers on the SDKs and infrastructure; crwinfrey, our consent UX lead; and a bunch of the rest of the team and community | 16:52:47 |
Nathan GP (he) | Glad you're already on this path. Was the intent of your project just as a tool for your own site/needs, or to create something general purpose? Is it for measuring web interaction, apps, both? | 16:53:34 |
simonft | The intention is first and foremost for our own site/needs, but the thinking is that if there's nothing that really fits our needs and we're going to have to build some stuff on our own, we may as well make it something others can use too. | 16:55:46 |
simonft | And we're just interested in web interaction right now, as we don't have any mobile apps | 16:56:24 |
simonft | We've been in the requirements gathering stage for a while, though I'm hoping we'll have something reasonably solid on that front soon | 16:57:32 |
Nathan GP (he) | Got it. We have 3 things that could be relevant: 1) our JS SDK, which was originally designed for Node and React apps but which we are now packaging for easier integration with the web; 2) our Python SDK, which could be integrated server side (along with our JS SDK if you are using Node); and 3) our work on privacy-preserving web logging, and how to import and analyze those logs | 16:57:54 |
Nathan GP (he) | Great. Would be happy to talk through that as well on a call sometime. The work we're doing with Consent UX, and trying to figure out how/when to ask for it, and what insights you are hoping to gain, and how you communicate the value of that to your audience, is a big piece of all of this. | 16:58:36 |
Nathan GP (he) | We have a bunch of ideas there, and can help you sort these things out. The actual code implementation or log configuration isn't hard... moving away from "spy on and record everything, and then we'll figure it out later" is. | 17:00:31 |
simonft | Thanks! We're currently doing something similar to the Matomo proxy, but instead nginx is sending IP-less logs to a script that sends them up to Matomo. This gets us raw pageviews, but the rest of the data in Matomo ends up being quite a bit less useful | 17:08:01 |
Nathan GP (he) | Yes, we've had somewhat the same issue, with Matomo quickly losing usefulness, at least in its default dashboard configuration. | 17:09:05 |
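A minimal sketch of the kind of log-forwarding script described above, not The Markup's actual code. The log line format and the names `LOG_RE`, `tracking_params`, and `tracking_url` are assumptions made for illustration. The query parameters (`idsite`, `rec`, `url`, `ua`, `urlref`, `cdt`) and the `matomo.php` endpoint follow Matomo's Tracking HTTP API as commonly documented; check them against your Matomo version. The sketch only builds the request URL and does not send it.

```python
import re
from datetime import datetime, timezone
from urllib.parse import urlencode

# Hypothetical IP-less nginx log_format: [time] "request" status "referrer" "user agent"
LOG_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) "(?P<referrer>[^"]*)" "(?P<ua>[^"]*)"'
)


def tracking_params(line, site_id, base_url):
    """Turn one log line into tracking parameters, or None if it is skipped."""
    m = LOG_RE.match(line)
    if m is None or m["method"] != "GET" or m["status"] != "200":
        return None
    # Convert the logged timestamp to UTC before formatting it for `cdt`.
    when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
    when = when.astimezone(timezone.utc)
    params = {
        "idsite": site_id,
        "rec": 1,
        "url": base_url + m["path"],
        "ua": m["ua"],
        "cdt": when.strftime("%Y-%m-%d %H:%M:%S"),
    }
    if m["referrer"] not in ("", "-"):
        params["urlref"] = m["referrer"]
    return params


def tracking_url(matomo_base, params):
    """Build the request URL for the tracking endpoint (nothing is sent here)."""
    return f"{matomo_base}/matomo.php?{urlencode(params)}"
```

Because no IP address or persistent visitor ID is forwarded, Matomo has little to group requests by, which is consistent with the observation that only raw pageviews stay useful.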
Nathan GP (he) | Some of the work we are doing is related to custom events, so you can decide you want to understand use of a certain feature on the site, or interaction with aspects of a specific story or content | 17:09:41 |
Nathan GP (he) | This is what F-Droid did for their app popularity contest, for instance, or to try and understand how often users are failing to install an app | 17:10:04 |
simonft | Oh interesting. I think that would be useful to us. | 17:13:50 |
simonft | We've also been looking at self-hosting Plausible. Last time I looked into it I had some concerns about backing up the visitor data without also backing up the rotating salt they're using to make it extremely difficult to brute-force the hashes. They were thinking about ways to fix that though. | 17:14:00 |
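To make the backup concern concrete, here is a rough sketch of a rotating-salt visitor hash. It is not Plausible's implementation: SHA-256 is swapped in for simplicity, and the names `DailySalt` and `visitor_id` are hypothetical. The point it illustrates is that the salt lives only in memory and is replaced each day, so stored hashes are hard to brute-force only as long as the salt is never written out alongside them.

```python
import hashlib
import secrets
from datetime import date


class DailySalt:
    """Keeps one random salt per day, in memory only, and forgets the old one."""

    def __init__(self):
        self._day = None
        self._salt = None

    def for_day(self, day):
        # A new day replaces the salt; the previous salt is not kept anywhere.
        if day != self._day:
            self._day = day
            self._salt = secrets.token_bytes(16)
        return self._salt


def visitor_id(salt, domain, ip, user_agent):
    """Hash salt + domain + IP + user agent into a short pseudonymous ID."""
    h = hashlib.sha256()
    for part in (salt, domain.encode(), ip.encode(), user_agent.encode()):
        # Length-prefix each field so field boundaries cannot be confused.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()[:16]


salts = DailySalt()
today = visitor_id(salts.for_day(date(2021, 4, 2)), "example.org", "203.0.113.7", "UA")
tomorrow = visitor_id(salts.for_day(date(2021, 4, 3)), "example.org", "203.0.113.7", "UA")
```

Within a day the same visitor maps to the same ID, so daily uniques can be counted; across days the IDs are unlinkable. Backing up the salt together with the visitor data would undo that property, which is the trade-off raised above.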
simonft | And we've had on-and-off talks with Ian Goldberg about various possible differential privacy schemes if we want to get fancy. One of the things we're thinking about is how badly we want numbers of "unique" visitors, and how correct that number needs to be | 17:15:07 |
Nathan GP (he) | Diff Privacy can start working well at a large scale, though as a way to get a specific count of visitors, yeah, it can be tricky. It could be useful to get a rough idea of typeface preference, or to do A/B testing of one layout versus another. | 17:20:41 |
Nathan GP (he) | I think it could be interesting to try to do some analysis of the raw page views to group them into sessions, and then estimate unique visitors from there. Overall, I will admit implementing CI for web visitors is not our strongest area, but we are putting more effort into it now, as it keeps coming up. Even those with apps want to be able to link them to web traffic, and have one system for both. | 17:23:09 |
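One simple local differential privacy technique that fits the "rough idea at large scale" use case is randomized response. The sketch below is an illustration only, not a scheme anyone in this conversation proposed; the function names are hypothetical. Each client reports its true yes/no answer (say, "prefers layout B") half the time and a fair coin flip otherwise, which gives epsilon = ln 3. The aggregate share can still be estimated because the noise is known.

```python
import random


def randomized_response(truth, rng):
    """Report the true bit with probability 1/2, otherwise a fair coin flip."""
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_share(reports):
    """Invert the noise: P(report is True) = 0.5 * true_share + 0.25."""
    observed = sum(reports) / len(reports)
    return 2 * (observed - 0.25)


# Simulate 100,000 visitors, 30% of whom truly prefer layout B.
rng = random.Random(0)
truths = [i < 30_000 for i in range(100_000)]
reports = [randomized_response(t, rng) for t in truths]
estimate = estimate_share(reports)
```

With this many reports the estimate lands close to the true 30%, while any single report says very little about the visitor who sent it. With only a few hundred reports the estimate gets noisy, which is why this works better for rough shares than for exact visitor counts.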
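A rough sketch of the sessionization idea, under stated assumptions: page views are `(timestamp, key)` pairs, where `key` is whatever coarse grouping the IP-less logs still allow (for example the user agent string), and a gap of more than 30 minutes starts a new session. Both the grouping key and the 30-minute threshold are assumptions, and `count_sessions` is a hypothetical name. Visitors who share a key get merged, so the result is a loose proxy for unique visitors, not a count of them.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def count_sessions(pageviews, gap=timedelta(minutes=30)):
    """Count sessions in an iterable of (timestamp, key) pairs."""
    by_key = defaultdict(list)
    for ts, key in pageviews:
        by_key[key].append(ts)

    sessions = 0
    for times in by_key.values():
        times.sort()
        sessions += 1  # the first page view for a key always opens a session
        for prev, cur in zip(times, times[1:]):
            if cur - prev > gap:
                sessions += 1  # a long silence starts a new session
    return sessions


views = [
    (datetime(2021, 4, 2, 10, 0), "ua-a"),
    (datetime(2021, 4, 2, 10, 10), "ua-a"),
    (datetime(2021, 4, 2, 11, 0), "ua-a"),
    (datetime(2021, 4, 2, 10, 5), "ua-b"),
]
total = count_sessions(views)  # two sessions for ua-a, one for ua-b
```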
gina_h | So glad you're here @simonft! | 19:13:04 |
simonft | glad to be here! and glad others are thinking about these problems | 19:37:32 |