!RcWPWcZrMeBxOGaalX:matrix.org

Dendrite

1680 Members | 662 Servers
Second-generation Matrix homeserver written in Go! | https://matrix-org.github.io/dendrite/ | #dendrite-dev:matrix.org for development, #dendrite-alerts:matrix.org for release notifications



Sender | Message | Time
30 Jan 2023
@andrewm:element.io Andrew Morgan (anoa) changed their display name from Andrew Morgan (anoa) [back Jan 28th] to Andrew Morgan (anoa). 10:00:15
@benj:matrix.brandt-art.synology.me Benjamin joined the room. 10:06:42
@elcedepeter:matsch.xyz joined the room. 10:06:57
@mat:matdoes.dev mat changed their profile picture. 10:30:04
@carroarmato0:matrix.carroarmato0.be carroarmato0 joined the room. 10:38:41
@matthias:asra.gr matthias

Hi, I'm doing some load tests using https://gitlab.futo.org/load-testing/matrix-locust on monolith and noticed some strange behaviour with 0.11.0 and Postgres 15: The test logs in and creates a room with ~30 invitees, all of them on the same domain (no federation). I see around 10k(!) SQL queries and it takes around 6 seconds for a single room. I noticed a lot of transaction rollbacks (71). When looking into the queries, I see some duplicated queries (probably due to rollbacks), but e.g. the following query is done about 5,000 times:

SELECT event_type_nid, event_state_key_nid, event_nid FROM roomserver_events WHERE event_nid = ANY($1) AND (CARDINALITY($2::bigint[]) = 0 OR event_type_nid = ANY($2)) AND (CARDINALITY($3::bigint[]) = 0 OR event_state_key_nid = ANY($3)) ORDER BY event_type_nid, event_state_key_nid ASC

The arguments of the query change, so there are about 1.1k different values for the above query.

I think I made a setup mistake, but has anybody seen this before? Dendrite's log does not contain anything (using level 'warn'). I'll check if older versions show the same behaviour, but any help would be appreciated. (Register and login on the same server are working flawlessly.)

14:47:17
@matthias:asra.grmatthias
In reply to @matthias:asra.gr


Seems like every invitee is taking ~200 ms, so 30 × 0.2 s = 6 seconds.
14:55:07
@s7evink:matrix.org Till: Is that for initial syncs or incremental ones? 14:55:38
@matthias:asra.gr matthias: You mean /sync? No /sync is happening, only /login and then a single /createRoom with 30 invitees. 14:58:12
@s7evink:matrix.org Till: Hu, fun. 14:59:36
@elcedepeter:matsch.xyz left the room. 15:01:34
@matthias:asra.gr matthias: :-) I think I also saw relatively slow behavior on <0.11.0 during createRoom, but I had problems with locust, too. Only now did I discover the number of queries. I'll check whether older versions show the same number of queries. 15:02:37
@s7evink:matrix.org Till: There are currently https://github.com/matrix-org/dendrite/issues/2911 and https://github.com/matrix-org/dendrite/issues/2777; #2777 mentions the same query. 15:04:46
@_neb_github:matrix.org Github: https://github.com/matrix-org/dendrite/issues/2777 : /sync performance slow since v0.10.0 15:04:47
@matthias:asra.gr matthias: Thanks for the issues! So I'll check 0.9.9 first. 15:14:55
@matthias:asra.gr matthias: Interesting: with 0.9.9 it is slightly better (5 s vs. 6 s) and there is only a single ROLLBACK left; however, the number of queries is the same (~10k). 15:53:56
@jassu:kapsi.fi Jassu: Yeah, I’ve been debugging those SQL-related weirdnesses too, but haven’t had free time to continue recently. Another big problem with sync seems to be huge parameter sets given to some queries. Like a multi-megabyte list to ANY( … ) for example. 15:59:24
@matthias:asra.gr matthias: On 0.3.0 there is at least only 1/3 of the number of queries, but it still takes >5 s for a single room. So I assume it's really the number of invitees that causes the delay. I've gone back as far as 0.1.0, which is again slightly faster than 0.2.0 (~4.5 s). I'll check the number of queries again and compare it to a room creation without any invitees or with only one invitee. 16:39:00
@kegan:matrix.org Kegan
In reply to @jassu:kapsi.fi
We should be chunking them if we aren't already. In practice, having such large param sets isn't really an issue, insofar as the alternative isn't really much better and is significantly slower.
18:18:22
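
A minimal sketch of the chunking idea from the message above, written in Python (the language of the locust load tests). It assumes a hypothetical `fetch_batch` callback that runs one `... = ANY($1)` query per batch; the default batch size of 1000 is an arbitrary assumption, not a value taken from Dendrite.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(ids: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `ids` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def query_in_chunks(
    ids: Sequence[int],
    fetch_batch: Callable[[List[int]], List],  # hypothetical: runs one query per batch
    size: int = 1000,                          # assumed batch size
) -> List:
    """Call `fetch_batch` once per chunk and concatenate the returned rows."""
    rows: List = []
    for batch in chunked(ids, size):
        rows.extend(fetch_batch(batch))
    return rows
```

Chunking caps the size of each parameter list but increases the number of round trips, so it is a trade-off rather than a fix for querying too much data in the first place.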
@kegan:matrix.org Kegan: Ideally we wouldn't be needing to query such vast amounts of data in the first place. 18:18:40
@jassu:kapsi.fi Jassu: Huh? How is that slower? That kind of stuff should be selected within the SQL query itself, not fetched from the DB by another query, parsed in Dendrite and then used as a multi-megabyte parameter… -_- 18:31:32
@jassu:kapsi.fi Jassu

I don’t have the full picture of what exactly would be needed to serve everything the client expects, but I got it down to a minute on my Raspberry Pi when I actually combined two of the heaviest queries into one and returned the whole set in one go from the DB in a single optimized query.

Compared to the original one not finishing overnight, before I started fiddling with that thing.

18:35:07
@jassu:kapsi.fi Jassu: It’s always considerably faster to tell psql exactly what is wanted, and then build the query to get that and exactly that out of there. Add indexes as required, of course. 18:37:04
@jassu:kapsi.fi Jassu: So, from literally not finishing the query in 7+ hours down to about a minute, when going from Dendrite’s exact query on my live DB to one that actually got considerably more of the required set covered. 🤷🏼‍♀️ 18:39:10
@jassu:kapsi.fi Jassu: A relational database is not a KV store. 18:40:19
@neilalexander:matrix.org neilalexander: The problem with the SQL backend in Dendrite is more the opinionated style, i.e. queries for a table are generally grouped together in the same file, generally don't reach into other tables, don't join unless absolutely essential. 19:24:52
@neilalexander:matrix.org neilalexander: Those barriers could be broken down and you can push more work on the SQL engine. 19:25:04
@neilalexander:matrix.org neilalexander: ... but maybe you make more headaches for the devs in doing so :D 19:25:16
@neilalexander:matrix.org neilalexander: There are a lot of unnecessary database round trips though, and /sync is just a monstrosity that vomits out huge quantities of data anyway. 19:25:44
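
To illustrate the round-trip point discussed above, here is a small sketch in Python comparing one lookup query per row against a single JOIN. It uses an in-memory SQLite database as a stand-in for Postgres, and the `events` and `event_types` tables are hypothetical and are not Dendrite's actual schema.

```python
import sqlite3


def setup() -> sqlite3.Connection:
    """Create a tiny in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE event_types (type_nid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE events (event_nid INTEGER PRIMARY KEY, type_nid INTEGER);
        INSERT INTO event_types VALUES (1, 'm.room.create'), (2, 'm.room.member');
        INSERT INTO events VALUES (10, 1), (11, 2), (12, 2);
    """)
    return conn


def names_with_round_trips(conn):
    """One query for the events, then one extra query per event (N+1 queries)."""
    queries = 1
    result = []
    events = conn.execute(
        "SELECT event_nid, type_nid FROM events ORDER BY event_nid"
    ).fetchall()
    for event_nid, type_nid in events:
        name = conn.execute(
            "SELECT name FROM event_types WHERE type_nid = ?", (type_nid,)
        ).fetchone()[0]
        queries += 1
        result.append((event_nid, name))
    return result, queries


def names_with_join(conn):
    """The same result from a single query that lets the database do the join."""
    rows = conn.execute(
        "SELECT e.event_nid, t.name FROM events e "
        "JOIN event_types t ON t.type_nid = e.type_nid "
        "ORDER BY e.event_nid"
    ).fetchall()
    return rows, 1
```

Both functions return the same rows; the first issues one query per event plus one, the second issues a single query regardless of how many events there are.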
@abuse:matrix.org Administrator banned @adsusa:matrix.org (spam). 21:37:55


