!XaqDhxuTIlvldquJaV:matrix.org

#synapse-dev:matrix.org

1311 Members
If you are not a dev contributing code to Synapse, please talk in #synapse:matrix.org instead. This room is *only* for coordination of dev work. If you want to grab the attention of a dev, do it in #synapse:matrix.org. 162 Servers

Load older messages


SenderMessageTime
18 May 2022
@richvdh:sw1v.orgrichvdhI think we need some other mechanism for locking a room17:33:14
@richvdh:sw1v.orgrichvdh Erik: any thoughts on this? 17:33:32
@erikj:jki.reErikOh, hmm, why did the tests not catch that17:40:19
@erikj:jki.reErik
In reply to @richvdh:sw1v.org
I can't really believe that deleting from rooms first is the right solution there
We can move the rooms deletion back where it was, I think I just moved it up mostly for sanity reasons. I think the SELECT FOR SHARE in the persist events code path is the important thing there
17:47:23
@erikj:jki.reErikThe way I was thinking of it was: we're modifying the room, so check that the room actually exists and that its not going to be deleted from under us17:47:57
@richvdh:sw1v.orgrichvdh
In reply to @erikj:jki.re
Oh, hmm, why did the tests not catch that
presumably, because we don't test purging a room that has done any federation?
17:49:38
@erikj:jki.reErik

The alternatives I think are:

  1. Use an advisory lock, but that gives worse guarantees
  2. Make a second table that we DELETE / SELECT FOR SHARE on, but I'm not really sure what that buys you,
  3. Add foreign key constraints from events -> rooms, which is probably the right way, but involves adding a foriegn key constraint which I think is painful? Maybe I'm wrong there
17:50:55
@richvdh:sw1v.orgrichvdhI don't think adding FKs is that painful actually17:52:50
@richvdh:sw1v.orgrichvdhit can be done async17:53:26
@richvdh:sw1v.orgrichvdh well, moving the rooms deletion back down will certainly fix the immediate problem 17:54:02
@clokep:matrix.orgclokep

I don't think adding FKs is that painful actually

I think this is true as long as your tables meet the constraints properly. 😄

19:04:55
@clokep:matrix.orgclokep Nick Mills-Barrett: Out of curiosity, do you all have jaeger setup? Could help diagnose some of the issues you all are seeing. 19:18:50
@clokep:matrix.orgclokepI haven't been following the higher CPU stuff going on though so probably won't be able to help diagnose much. 😢 19:19:13
@fizzadar:beeper.comNick Mills-Barrett
In reply to @clokep:matrix.org
Nick Mills-Barrett: Out of curiosity, do you all have jaeger setup? Could help diagnose some of the issues you all are seeing.
we don't unfortunately!
19:22:08
@clokep:matrix.orgclokepIt's quite heavy, I can't blame you there!19:22:35
@dmrobertson:matrix.orgdmrScreenshot from 2022-05-18 20-39-09.png
Download Screenshot from 2022-05-18 20-39-09.png
19:39:35
@dmrobertson:matrix.orgdmrphwoar19:39:37
@clokep:matrix.orgclokepimage.png
Download image.png
19:42:47
* @clokep:matrix.orgclokep can't wait until Tuesday...19:42:50
@dmrobertson:matrix.orgdmrWTF happened to this sytest run on develop? https://github.com/matrix-org/synapse/runs/6495580753?check_suite_focus=true19:44:04
@clokep:matrix.orgclokepThat seems terrifying...19:45:11
@clokep:matrix.orgclokepI have no idea.19:45:13
@richvdh:sw1v.orgrichvdhyou mean all the test failures?20:16:02
@dmrobertson:matrix.orgdmryeah, exactly20:17:37
@richvdh:sw1v.orgrichvdhyeah, dunno20:26:27
@richvdh:sw1v.orgrichvdhit looks like some failure in replication, though I can't immediately see what20:27:03
@richvdh:sw1v.orgrichvdh

oh, here we go

2022-05-18 19:14:15,877 - synapse.replication.tcp.redis - 112 - INFO - sentinel - Connected to redis
2022-05-18 19:14:15,877 - synapse.replication.tcp.redis - 120 - INFO - subscribe-replication-0 - Sending redis SUBSCRIBE for localhost:8840
2022-05-18 19:14:15,878 - synapse.replication.tcp.redis - 123 - INFO - subscribe-replication-0 - Successfully subscribed to redis stream, sending REPLICATE command
2022-05-18 19:14:15,878 - synapse.metrics.background_process_metrics - 247 - ERROR - subscribe-replication-0 - Background process 'subscribe-replication' threw an exception
Traceback (most recent call last):
  File "/synapse/synapse/metrics/background_process_metrics.py", line 243, in run
    return await func(*args, **kwargs)
  File "/synapse/synapse/replication/tcp/redis.py", line 126, in _send_subscribe
    await self._async_send_command(ReplicateCommand())
  File "/synapse/synapse/replication/tcp/redis.py", line 220, in _async_send_command
    self.synapse_stream_name, encoded_string
  File "/venv/lib/python3.7/site-packages/twisted/internet/defer.py", line 959, in send
    raise result.value
Exception: getConnection yielded with context subscribe-replication-0 rather than sentinel, yielded on line 2355 in /venv/lib/python3.7/site-packages/txredisapi.py
20:29:08
@richvdh:sw1v.orgrichvdhthat's one of the workers failing to set up replication, unless my eyes deceive me20:29:59
@californiatoke:matrix.org@californiatoke:matrix.org joined the room.20:38:53
@abuse:matrix.orgAdministrator banned @californiatoke:matrix.org@californiatoke:matrix.org (spam).22:10:55

There are no newer messages yet.


Back to Room List