25 Mar 2021
@irl:irl.xyzirlthat's just liability I don't want to have to deal with, especially from an organisation that is more committed to profit than it is to free software.08:55:17
@gina_h:matrix.orggina_hThe Markup is working on building their own version of analytics -- I reached out to them about Clean Insights but so far no response15:42:12
31 Mar 2021
@n8fr8:matrix.orgn8fr8 (he)We've got a new video up: https://www.youtube.com/watch?v=NT6LmCTpQ4k and a tweet about it here: https://twitter.com/guardianproject/status/1377351643407085568 20:07:19
2 Apr 2021
Nathan has a very Mr. Rogers vibe in this video. 😀
@simonft:matrix.orgsimonft joined the room.16:49:15
@simonft:matrix.orgsimonftHello! I work at The Markup, where we've been exploring building privacy-preserving analytics, with goals that seem to align nicely with clean insights.16:50:43
@n8fr8:matrix.orgn8fr8 (he) Welcome, simonft :) In this room from the team are threeletteracronym and irl, our lead developers on the SDKs and infrastructure, crwinfrey, our consent UX lead, and a bunch of the rest of the team and community 16:52:47
@n8fr8:matrix.orgn8fr8 (he)Glad you're already on this path. Was the intent of your project just as a tool for your own site/needs, or to create something general purpose? Is it for measuring web interaction, apps, both?16:53:34
@simonft:matrix.orgsimonftThe intention is first and foremost for our own site/needs, but the thinking is that if there's not something that really fits our needs and we're going to have to build some stuff on our own we may as well make it something others can use as well.16:55:46
@simonft:matrix.orgsimonftAnd we're just interested in web interaction right now, as we don't have any mobile apps16:56:24
@simonft:matrix.orgsimonftWe've been in the requirements gathering stage for a while, though I'm hoping we'll have something reasonably solid on that front soon16:57:32
@n8fr8:matrix.orgn8fr8 (he)Got it. We have 3 things that could be relevant: 1) Our JS SDK which was originally designed for Node and React apps, but that we are packaging now for easier integration with the web 2) Our Python SDK which could be integrated server side (along with our JS SDK if you are using Node), and then 3) our work on privacy preserving web logging, and how to import and analyze those16:57:54
@n8fr8:matrix.orgn8fr8 (he)Great. Would be happy to talk through that as well on a call sometime. The work we're doing with Consent UX, and trying to figure out how/when to ask for it, and what insights you are hoping to gain, and how you communicate the value of that to your audience, is a big piece of all of this.16:58:36
@n8fr8:matrix.orgn8fr8 (he)We have a bunch of ideas there, and can help you sort these things out. The actual code implementation or log configuration isn't hard... moving away from "spy on and record everything, and then we'll figure it out later" is.17:00:31
@simonft:matrix.orgsimonftThanks! We're currently doing something similar to the matomo proxy, but instead nginx is sending IP-less logs to a script that sends them up to matomo. This gets us raw pageviews, but the rest of the data in matomo ends up being quite a bit less useful17:08:01
@n8fr8:matrix.orgn8fr8 (he)Yes, we've had somewhat the same issue, with Matomo quickly losing usefulness, at least in its default dashboard configuration. 17:09:05
@n8fr8:matrix.orgn8fr8 (he)Some of the work we are doing is related to custom events, so you can decide you want to understand use of a certain feature on the site, or interaction with aspects of a specific story or content17:09:41
@n8fr8:matrix.orgn8fr8 (he)This is what F-Droid did for their app popularity contest, for instance, or to try and understand how often users are failing to install an app17:10:04
@simonft:matrix.orgsimonftOh interesting. I think that would be useful to us.17:13:50
@simonft:matrix.orgsimonftWe've also been looking at self-hosting plausible. Last time I looked into it I had some concerns about backing up the visitor data without also backing up the rotating salt they're using to make it extremely difficult to brute force the hashes. They were thinking about ways to fix that though.17:14:00
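For readers unfamiliar with the rotating-salt idea mentioned above, here is a minimal sketch in Python. It is not Plausible's code: the class `RotatingSalt`, the function `visitor_id`, and the choice of inputs (domain, IP address, user agent) are assumptions made for illustration. The point it tries to show is that a visitor identifier is only stable while the day's salt exists, and that anyone holding both the stored hashes and the salt can attempt to brute force them, which is why backing the two up together is a concern.

```python
import hashlib
import hmac
import secrets


class RotatingSalt:
    """Holds one random salt per day and forgets the previous one (hypothetical helper)."""

    def __init__(self):
        self._day = None
        self._salt = None

    def salt_for(self, day):
        # Asking for a new day replaces, and so discards, the old salt.
        if day != self._day:
            self._day = day
            self._salt = secrets.token_bytes(32)
        return self._salt


def visitor_id(salt, domain, ip, user_agent):
    """Keyed hash of the request fields; only comparable within one salt period."""
    message = "\x00".join([domain, ip, user_agent]).encode("utf-8")
    return hmac.new(salt, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    salts = RotatingSalt()
    today = visitor_id(salts.salt_for("2021-04-02"), "example.org", "203.0.113.7", "UA")
    tomorrow = visitor_id(salts.salt_for("2021-04-03"), "example.org", "203.0.113.7", "UA")
    print(today == tomorrow)
```

In this sketch the salt lives only in memory, so a backup of the stored identifiers alone could not be linked to new visits after a restore. That is one possible trade-off, not necessarily the one any particular product makes.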
@simonft:matrix.orgsimonftAnd we've had on-and-off talks with Ian Goldberg about various possible differential privacy schemes if we want to get fancy. One of the things we're thinking about is how badly we want numbers of "unique" visitors, and how correct that number needs to be17:15:07
@n8fr8:matrix.orgn8fr8 (he)Diff Privacy can start working well at a large scale, though as a way to get a specific count of visitors, yeah, it can be tricky. It could be useful to get a rough idea of typeface preference, or doing A/B testing of one layout versus another.17:20:41
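As a rough illustration of why differential privacy suits aggregate questions such as A/B testing better than exact visitor counts, here is a sketch of classic randomized response in Python. This is a swapped-in textbook technique, not something the participants said they use. The function names and the `p_truth` parameter are assumptions for the example. Each client reports its true answer only with probability `p_truth` and otherwise reports a fair coin flip, and the server corrects for the known noise when estimating the overall rate.

```python
import random


def randomized_response(truth, p_truth, rng):
    """Report the true bit with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports, p_truth):
    """Invert the noise: observed = p_truth * true_rate + (1 - p_truth) / 2."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) / 2) / p_truth


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical population: 30% of visitors prefer layout B.
    truths = [i < 30000 for i in range(100000)]
    reports = [randomized_response(t, 0.5, rng) for t in truths]
    print(round(estimate_true_rate(reports, 0.5), 3))
```

No single report reveals much about one visitor, and the estimate only becomes tight with many reports, which matches the point about needing a large scale. With `p_truth = 0.5` this mechanism corresponds to a privacy parameter of ε = ln 3.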
@n8fr8:matrix.orgn8fr8 (he)I think it could be interesting to try to do some analysis of the raw page views to group them into sessions, and then estimate unique visitors from there. Overall, I will admit implementing CI for web visitors is not our strongest area, but we are putting more effort into it now, as it keeps coming up. Even for those with apps, they want to be able to link it to web traffic, and have one system for both.17:23:09
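One possible first step toward grouping IP-less page views into sessions is a referrer heuristic, sketched below in Python. This is an assumption about how it could be done, not a method described in the conversation: a page view whose referrer is empty or points at another host is treated as the start of a session, while a view referred from the site itself is treated as a continuation. The function names and the `site_host` parameter are hypothetical.

```python
from urllib.parse import urlsplit


def is_entry(referrer, site_host):
    """True if this page view looks like the first one in a session."""
    # Logs commonly record a missing referrer as "-" or an empty string.
    if not referrer or referrer == "-":
        return True
    return urlsplit(referrer).hostname != site_host


def count_sessions(referrers, site_host):
    """Estimate the number of sessions as the number of entry page views."""
    return sum(1 for r in referrers if is_entry(r, site_host))


if __name__ == "__main__":
    referrers = [
        "-",
        "https://example.org/",
        "https://news.example.com/story",
        "",
        "https://example.org/about",
    ]
    print(count_sessions(referrers, "example.org"))
```

This counts sessions rather than unique visitors. A returning reader produces several sessions, and browsers that strip referrers inflate the number, so any visitor estimate built on top of it would need further assumptions.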
@gina_h:matrix.orggina_hSo glad you're here @simonft! 19:13:04
@simonft:matrix.orgsimonftglad to be here! and glad others are thinking about these problems19:37:32
3 Apr 2021
@eighthave:matrix.org_hc hi simonft, welcome! glad to have the publisher's perspective here. You might be interested in the comparison between the Clean Insights approach and differential privacy https://guardianproject.info/2021/03/02/new-insights-into-clean-analytics/ 11:18:44
@eighthave:matrix.org_hc in other Clean Insights news, Apple is rejecting apps with SDKs that do not get user consent https://9to5mac.com/2021/04/01/app-store-now-rejecting-apps-using-third-party-sdks-that-collect-user-data-without-consent/ 11:20:05
6 Apr 2021
@simonft:matrix.orgsimonft _hc: thanks for that comparison! 14:24:45

