!gawonsIgQvFMKiMyYX:matrix.org

cleaninsights-discussion

36 Members
Discussion about cleaninsights.org and privacy-preserving measurement
6 Servers



Sender (display name): Message [Time]
2 Apr 2021
@n8fr8:matrix.org (n8fr8): We have a bunch of ideas there, and can help you sort these things out. The actual code implementation or log configuration isn't hard... moving away from "spy on and record everything, and then we'll figure it out later" is. [17:00:31]
@simonft:matrix.org (simonft): Thanks! We're currently doing something similar to the Matomo proxy, but instead nginx is sending IP-less logs to a script that sends them up to Matomo. This gets us raw pageviews, but the rest of the data in Matomo ends up being quite a bit less useful. [17:08:01]
@n8fr8:matrix.org (n8fr8): Yes, we've had somewhat the same issue, with Matomo quickly losing usefulness, at least in its default dashboard configuration. [17:09:05]
@n8fr8:matrix.org (n8fr8): Some of the work we are doing is related to custom events, so you can decide you want to understand use of a certain feature on the site, or interaction with aspects of a specific story or content. [17:09:41]
@n8fr8:matrix.org (n8fr8): This is what F-Droid did for their app popularity contest, for instance, or to try and understand how often users are failing to install an app. [17:10:04]
@simonft:matrix.org (simonft): Oh interesting. I think that would be useful to us. [17:13:50]
@simonft:matrix.org (simonft): We've also been looking at self-hosting Plausible. Last time I looked into it I had some concerns about backing up the visitor data without also backing up the rotating salt they're using to make it extremely difficult to brute-force the hashes. They were thinking about ways to fix that though. [17:14:00]
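The rotating-salt idea mentioned in the message above can be sketched roughly as follows. This is only an illustration, loosely modeled on the general "hash a daily salt together with request attributes" approach; it is not Plausible's actual code. The class name `RotatingSaltHasher`, its method names, and the choice of SHA-256 with a 16-byte salt are all assumptions made for the example.

```python
import hashlib
import secrets
from datetime import date


class RotatingSaltHasher:
    """Hypothetical helper: visitor hashes that are only comparable within one day."""

    def __init__(self):
        # The salt lives in memory only. It is never written to disk or backups,
        # so stored hashes cannot be brute-forced once the salt is gone.
        self._salt = secrets.token_bytes(16)
        self._salt_day = None

    def _rotate_if_needed(self, today):
        # Replace the salt whenever the calendar day changes.
        if self._salt_day != today:
            self._salt = secrets.token_bytes(16)
            self._salt_day = today

    def visitor_id(self, domain, ip, user_agent, today=None):
        today = today or date.today()
        self._rotate_if_needed(today)
        h = hashlib.sha256()
        h.update(self._salt)
        for part in (domain, ip, user_agent):
            h.update(b"\x00")  # separator so adjacent fields cannot run together
            h.update(part.encode("utf-8"))
        return h.hexdigest()
```

With this sketch, the same visitor gets the same hash within a day, which is enough to count daily uniques, and an unrelated hash the next day. The stored hashes can be backed up on their own because the salt that would be needed to brute-force them is never persisted.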
@simonft:matrix.org (simonft): And we've had on-and-off talks with Ian Goldberg about various possible differential privacy schemes if we want to get fancy. One of the things we're thinking about is how badly we want numbers of "unique" visitors, and how correct that number needs to be. [17:15:07]
@n8fr8:matrix.org (n8fr8): Diff Privacy can start working well at a large scale, though as a way to get a specific count of visitors, yeah, it can be tricky. It could be useful for getting a rough idea of typeface preference, or for doing A/B testing of one layout versus another. [17:20:41]
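To illustrate why this kind of approach works better at large scale, here is a minimal sketch of randomized response, one simple local differential privacy technique. It is not something the participants say they use; the function names `randomize` and `estimate_fraction` and the parameter `p_truth` are assumptions for the example. Each client reports a yes/no answer (say, "prefers layout B") truthfully only some of the time, and the server corrects for the known noise in aggregate.

```python
import random


def randomize(truth, p_truth=0.75, rng=random):
    """Report the true bit with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_fraction(reports, p_truth=0.75):
    """Invert the noise: E[observed] = p_truth * f + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth
```

No single report reveals a visitor's real answer with certainty, and the estimate is only tight when there are many reports, which is why it suits rough preference or A/B questions better than exact visitor counts.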
@n8fr8:matrix.org (n8fr8): I think it could be interesting to try to do some analysis of the raw page views to group them into sessions, and then estimate unique visitors from there. Overall, I will admit implementing CI for web visitors is not our strongest area, but we are putting more effort into it now, as it keeps coming up. Even for those with apps, they want to be able to link it to web traffic, and have one system for both. [17:23:09]
@gina_h:matrix.org (gina_h): So glad you're here @simonft! [19:13:04]
@simonft:matrix.org (simonft): glad to be here! and glad others are thinking about these problems [19:37:32]
3 Apr 2021
@eighthave:matrix.org: hi simonft, welcome! glad to have the publisher's perspective here. You might be interested in the comparison between the Clean Insights approach and differential privacy https://guardianproject.info/2021/03/02/new-insights-into-clean-analytics/ [11:18:44]
@eighthave:matrix.org: in other Clean Insights news, Apple is rejecting apps with SDKs that do not get user consent https://9to5mac.com/2021/04/01/app-store-now-rejecting-apps-using-third-party-sdks-that-collect-user-data-without-consent/ [11:20:05]
6 Apr 2021
@simonft:matrix.org (simonft): _hc: thanks for that comparison! [14:24:45]
@simonft:matrix.org (simonft): Do people here have thoughts on Prio? https://www.abetterinternet.org/prio/ [14:25:07]
@eighthave:matrix.org: Clean Insights does various kinds of aggregation like what is mentioned in that brief description. My first thought is that it sounds like we have the same goals [14:30:15]
@eighthave:matrix.org: they don't seem to have anything you can use yet [14:30:29]
@eighthave:matrix.org: it seems very focused on one specific part of the problem: the aggregation [14:31:05]
@simonft:matrix.org (simonft): yeah. https://github.com/abetterinternet/prio-server is a thing but there aren't a lot more details on how to actually run it [14:31:45]
@simonft:matrix.org (simonft): they reached out to us but I don't have much info about why [14:31:55]
@eighthave:matrix.org:
Documentation
Integration Guide for Android and iOS Applications (Coming Soon)
How to Operate a Data Share Processing System (Coming Soon)
[14:32:06]
@eighthave:matrix.org: based on their paper, their use case is more like differential privacy, where you assume that you're aiming to collect PII [14:33:41]
@eighthave:matrix.org: we're trying to push analytics without gathering PII and see how far that can get us [14:34:05]
@simonft:matrix.org (simonft): When you talk about PII, does that include some sort of unique device identifiers for the device/user? E.g. I could imagine estimating uniques, or say page visits per user, without collecting or storing things like name, username, location [14:42:22]
@eighthave:matrix.org: yes, unique IDs are PII [16:24:40]
@eighthave:matrix.org: it's really hard to gather a lot of data tied to a pseudonym, then keep it anonymous [16:25:10]
@eighthave:matrix.org: unless the types of data gathered are really restricted [16:25:27]
@eighthave:matrix.org: I mean they are probably not considered PII by GDPR, unless things are deanonymized. [16:26:14]
@eighthave:matrix.org: avoiding tracking people makes a lot of other things much easier [16:26:34]



Room Version: 5