Wikimedia SRE

374 Members
📟 https://icinga.wikimedia.org/alerts 📟 https://www.mediawiki.org/wiki/SRE | Channel is logged at https://wm-bot.wmflabs.org/libera_logs/%23wikimedia-sre/


6 Oct 2022
@bblack:libera.chatbblack we have a lot of tooling vs problem-space mismatches :/ 16:54:01
@bblack:libera.chatbblack (tooling at the lowest level I mean, e.g. vsthrottle as a universal solution) 16:54:37
@cdanis:libera.chatcdanis bblack: yeah, that's one of the reasons why I was proposing haproxy, it's quite a bit more flexible in what it can track 16:56:40
@bblack:libera.chatbblack yeah, it makes sense in general for that reason, and it's the frontmost layer in terms of defending the rest of the stack 16:57:18
@cdanis:libera.chatcdanis re: the research _joe_ mentioned, research is a bit generous but there is some anecdata comparison at https://phabricator.wikimedia.org/F35546836 16:57:23
@cdanis:libera.chatcdanis very long morning and I need to step away for a bit, back in about an hour 16:59:07
@bblack:libera.chatbblack back on the kernel stuff, in modules/cacheproxy/manifests/performance.pp we currently define (for the cp nodes) 17:04:00
@bblack:libera.chatbblack qdisc => 'fq flow_limit 300 buckets 8192 maxrate 256mbit', 17:04:05
@bblack:libera.chatbblack which the interface-rps script applies at the per-queue level for our multiq cards, so for example cp1075 in "tc qdisc" will show 12 queues each running that set of fq params. 17:04:42
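A minimal sketch of what "the same fq params on every queue" could look like as `tc` commands, assuming an `mq` root qdisc with one class per hardware queue. The function name, the `eth0` device, and the `100:` root handle are hypothetical; this is not the actual interface-rps script, only an illustration of the layout bblack describes.

```python
def fq_commands(dev: str, num_queues: int, fq_params: str,
                root_handle: str = "100") -> list[str]:
    """Build tc commands that attach one fq qdisc under each mq class."""
    # Root mq qdisc: exposes one class per hardware TX queue.
    cmds = [f"tc qdisc replace dev {dev} root handle {root_handle}: mq"]
    for queue in range(1, num_queues + 1):
        # tc class minor numbers are hexadecimal, so queue 10 is written 'a'.
        cmds.append(
            f"tc qdisc replace dev {dev} parent {root_handle}:{queue:x} {fq_params}"
        )
    return cmds


# Parameter string quoted from the Puppet manifest line above.
FQ_PARAMS = "fq flow_limit 300 buckets 8192 maxrate 256mbit"

if __name__ == "__main__":
    # 12 queues, matching the cp1075 example; the device name is assumed.
    for cmd in fq_commands("eth0", 12, FQ_PARAMS):
        print(cmd)
```

The sketch only prints the commands rather than running them, so it can be inspected without touching a live interface.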
@bblack:libera.chatbblack very little has been done in this area, those parameters are very rough guesses 17:05:03
@bblack:libera.chatbblack there are other qdiscs we could use, other fairness policies, the bucketing could be wildly wrong, there's no overall shape limit, etc 17:05:29
@bblack:libera.chatbblack this is basically attacking one edge of the problem, from down there. how do you ensure all the various tcp flows coming out of these machines are fair to each other and the network, basically (which more targets the "few IPs downloading lots of stuff" problem than the hotlinking/mobileapp sort of thing) 17:07:44
@bblack:libera.chatbblack we could even shape the whole 10G card down to a reasonable traffic level 17:08:50
@bblack:libera.chatbblack (given the total outbound capacity of all the cp nodes at a site is larger than the total possible transit+peering out) 17:09:16
@bblack:libera.chatbblack by doing that you're still "inducing failure" by dropping packets, but by doing it proactively with an aim towards per-flow fairness, more of the good stuff is getting through and less of the bad stuff. 17:10:27
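To make "per-flow fairness" under a shaped limit concrete, here is a small sketch of a max-min fair allocation: small flows get everything they ask for, and the heavy flows split what is left equally. This is the textbook ideal that fair-queueing schedulers approximate, not a description of how fq works internally, and the demand and capacity numbers are invented for illustration.

```python
def max_min_fair_share(demands: list[float], capacity: float) -> list[float]:
    """Return a max-min fair allocation of `capacity` across flow demands."""
    alloc = [0.0] * len(demands)
    remaining = capacity
    # Visit flows from smallest to largest demand.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for pos, i in enumerate(order):
        # Equal split of what is left among the flows not yet served.
        share = remaining / (len(order) - pos)
        # A flow never gets more than it asked for; leftovers roll forward.
        alloc[i] = min(demands[i], share)
        remaining -= alloc[i]
    return alloc


if __name__ == "__main__":
    # Hypothetical demands in Mbit/s: two light flows and two bulk downloaders.
    print(max_min_fair_share([10, 50, 400, 900], capacity=500))
```

With these made-up numbers the two light flows are untouched and the two bulk flows are each held to 220 Mbit/s, which is the "more of the good stuff gets through" outcome described above.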
@bblack:libera.chatbblack (unfortunately, tc can't see L7, so we can't fix a widely-hotlinked/embedded image this way :/) 17:12:14
@topranks:libera.chattopranks bblack: late to this, but would be interested to discuss. We are looking to introduce some QoS/categorisation on the network level. So for instance instead of dropping traffic below the 10G, it could instead be marked on the host as drop eligible, and the network could be left to decide if it ultimately needs to be dropped upstream. Lots of moving parts here, and none of it is a cure for insufficient bandwidth. But be interested to get 20:35:43
@topranks:libera.chattopranks your thoughts. 20:35:46
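One way "marked on the host as drop eligible" could look from an application's point of view is setting a DSCP codepoint with a high drop precedence on the socket. The sketch below assumes DSCP marking with AF13 (RFC 2597) over IPv4; the channel does not say which marking scheme or which layer (application, tc, nftables) would actually be used, and the helper names are hypothetical.

```python
import socket

# DSCP AF13: Assured Forwarding class 1, high drop precedence (RFC 2597).
AF13 = 14


def dscp_to_tos(dscp: int) -> int:
    """Convert a 6-bit DSCP codepoint to the legacy 8-bit TOS byte value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    # DSCP occupies the upper six bits; the lower two bits are ECN.
    return dscp << 2


def mark_drop_eligible(sock: socket.socket, dscp: int = AF13) -> None:
    """Mark an IPv4 socket's outgoing packets with the given DSCP value."""
    # IPv6 sockets would need IPV6_TCLASS instead; not handled in this sketch.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        mark_drop_eligible(s)
        print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))
```

The marking only expresses a preference; whether anything is dropped is still decided by the network devices upstream, as topranks notes.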
@cdanis:libera.chatcdanis topranks: that does give us some more options, cool! 20:41:34
@cdanis:libera.chatcdanis at some point I also want to pick your/XioNoX's brains about egress traffic management options (mostly out of curiosity, since it seems hard to get right at our scale with anything floss) 20:43:02
@topranks:libera.chattopranks cdanis: yep, absolutely. 20:44:01
@topranks:libera.chattopranks by egress traffic management you mean right out at the edge towards the internet? 20:44:13
@topranks:libera.chattopranks this stuff isn't easy, but I think we could pull off a better integration of host-level and network-level control than many manage to do 20:45:39
@cdanis:libera.chatcdanis topranks: yeah, either by knowing to limit traffic at the host level, or, by knowing to send one AS’s egress split between peering and transit 20:54:27
@topranks:libera.chattopranks The last one is the real trick. Some of the SDN / SD-WAN stuff is built to tackle that, and also feed back performance info to the path selection. Very difficult to achieve at our scale, and for so many different destinations though. 20:59:24
@topranks:libera.chattopranks On a manual level, for AS X, you can do BGP things to make it all seem equal and do ECMP. But weighted between one and the other, and adapting dynamically, is where it really gets hard. 21:00:39
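As a toy illustration of the "weighted between one and the other" idea, the sketch below hashes each flow to a path in proportion to static weights, so a flow always takes the same path while the aggregate approximates the requested split. The path names, weights, and flow tuple are hypothetical, routers do this in the forwarding plane rather than in Python, and the dynamic adaptation topranks calls the hard part is not modelled at all.

```python
import hashlib


def pick_path(flow: tuple, weights: dict[str, int]) -> str:
    """Map a flow to a path, with probability proportional to its weight."""
    total = sum(weights.values())
    if total <= 0 or any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative and sum to > 0")
    # Stable hash of the flow tuple, so one flow never flaps between paths.
    digest = hashlib.sha256(repr(flow).encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    # Walk the paths in a fixed order and find the range the bucket falls in.
    for path, weight in sorted(weights.items()):
        if bucket < weight:
            return path
        bucket -= weight
    raise AssertionError("unreachable: bucket is always below the total")


if __name__ == "__main__":
    # Hypothetical 70/30 split of one AS's egress between peering and transit.
    weights = {"peering": 7, "transit": 3}
    flow = ("198.51.100.7", 443, "203.0.113.9", 51234, "tcp")
    print(pick_path(flow, weights))
```

Hashing per flow rather than per packet keeps a TCP connection on one path, which avoids reordering at the cost of a coarser split when a few flows carry most of the bytes.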
@Jeff_Green:libera.chat@Jeff_Green:libera.chat left the room. 21:45:13
@cdanis:libera.chatcdanis yeah, it seems quite hard to do at all, and I’m sure I don’t even realize the half of it :) 22:32:45
