!HEfOZuCWbFqQODrTRb:matrix.org

Xapian

38 Members
The Xapian search engine library https://xapian.org/ | GSoC https://trac.xapian.org/wiki/GSoCProjectIdeas | FAQ https://trac.xapian.org/wiki/FAQ | doc https://getting-started-with-xapian.readthedocs.io/ | If you have a question, just ask it and WAIT for a reply (there's often a delay due to timezones)
2 Servers

SenderMessageTime
16 May 2021
@dipanshu:matrix.orgdipanshu left the room.16:00:21
@freenode_bremner:matrix.orgbremnerif I start with an empty database, make a series of non-flushed transactions, commit_transaction all but the last, and then cancel the last transaction, should I be surprised that the database appears empty again?22:59:34
@freenode_bremner:matrix.orgbremner I guess that is what "However, if cancel_transaction is called (or if commit_transaction isn't called before the WritableDatabase object is destroyed) then any changes which were pending before the transaction began will also be discarded." is warning me about 23:03:32
@olly.nz:matrix.orgollyan "unflushed" transaction really just groups what's inside it23:05:49
@olly.nz:matrix.orgollye.g. if you wanted to update a document being split into two such that you ensure either the full document is there or the two split parts are, but not any intermediate state23:07:37
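
The split-document case olly describes could be sketched roughly as below. This is only an illustration: `replace_with_split`, `old_docid`, `part_a` and `part_b` are hypothetical names, and `db` is assumed to be a `xapian.WritableDatabase` (or any object exposing the same `begin_transaction`, `delete_document`, `add_document`, `commit_transaction` and `cancel_transaction` methods).

```python
def replace_with_split(db, old_docid, part_a, part_b):
    """Swap one document for its two parts as a single all-or-nothing group.

    Hypothetical helper: `db` is assumed to behave like xapian.WritableDatabase.
    """
    # Passing False asks for an "unflushed" transaction: the changes are
    # grouped together, but no commit is forced at the transaction boundaries.
    db.begin_transaction(False)
    try:
        db.delete_document(old_docid)
        id_a = db.add_document(part_a)
        id_b = db.add_document(part_b)
    except Exception:
        # Per the documentation bremner quotes, cancelling also discards
        # changes that were pending before this transaction began.
        db.cancel_transaction()
        raise
    db.commit_transaction()
    return id_a, id_b
```

Either the original document or both parts end up in the pending changes, never one part alone. Nothing here makes the change durable; that still needs a later commit().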
@olly.nz:matrix.orgollynot sure it's well named23:08:03
@olly.nz:matrix.orgollyit's from when the commit() method was called flush()23:08:14
@olly.nz:matrix.orgollywhich also wasn't the best name, hence we changed it23:08:29
@olly.nz:matrix.orgollydoes anyone know if anything else has a similar concept?23:09:22
@freenode_bremner:matrix.orgbremner right. I'm just not sure if the transactions in notmuch-new (indexing a single email) are big enough to be sensibly flushed/committed 23:09:42
@olly.nz:matrix.orgollyyou could explicitly commit() between some transactions23:10:51
@olly.nz:matrix.orgollycommitting on every email indexed would be much slower I'd expect23:11:26
@freenode_bremner:matrix.orgbremner yeah. I think we did that once and didn't enjoy it 23:11:42
@olly.nz:matrix.orgollycurrently the auto flush threshold will fire at the same interval in terms of changed documents as outside transactions, but if we're in a transaction it only flushes changes to disk, and doesn't commit them23:14:00
@olly.nz:matrix.orgollypossibly it'd be better if that also triggered a commit after the current transaction23:14:46
@freenode_bremner:matrix.orgbremnerthat would be a nice feature.23:15:18
@olly.nz:matrix.orgollyi don't really have a good idea how widely used transactions are, or how they're actually being used23:15:21
@freenode_bremner:matrix.orgbremnerwe just group all the modifications due to indexing one email together so the indexing can be interrupted cleanly23:20:46
@olly.nz:matrix.orgollythe current "autoflush" threshold conflates two different things really23:24:35
@olly.nz:matrix.orgollyone is that you don't want to keep caching changes in memory indefinitely, because you'll run out of memory23:25:09
@olly.nz:matrix.orgollyand the other is that it may be useful to periodically commit changes automatically23:25:41
@olly.nz:matrix.orgollyoutside a transaction it does both, but inside it does just the first23:26:02
@olly.nz:matrix.orgollyand a commit implies flushing out the pending changes, but not vice versa23:26:33
@freenode_bremner:matrix.orgbremneroh, so autoflush in a transaction is really just resource management, no durability effects?23:27:02
@olly.nz:matrix.orgollyyes23:27:08
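
To make the flush/commit distinction concrete, here is a toy in-memory model of the behaviour as described in this conversation. It is not Xapian's implementation: `ToyWritableDB`, its attributes, and the tiny `flush_threshold` are all made up for illustration, and the model is a deliberate simplification.

```python
class ToyWritableDB:
    """Toy model of the semantics discussed above. NOT Xapian code;
    every name in this class is hypothetical."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.in_memory = []   # changes still cached in RAM
        self.flushed = []     # written out, but not yet committed
        self.committed = []   # what a reader of the database would see
        self.in_transaction = False

    def begin_transaction(self):
        self.in_transaction = True

    def commit_transaction(self):
        # An unflushed transaction only groups changes; ending it does
        # not make them durable.
        self.in_transaction = False

    def cancel_transaction(self):
        # Discards everything uncommitted, including changes that were
        # pending before this transaction began.
        self.in_memory.clear()
        self.flushed.clear()
        self.in_transaction = False

    def _flush(self):
        self.flushed.extend(self.in_memory)
        self.in_memory.clear()

    def commit(self):
        # A commit implies a flush, but not vice versa.
        self._flush()
        self.committed.extend(self.flushed)
        self.flushed.clear()

    def add_document(self, doc):
        self.in_memory.append(doc)
        if len(self.in_memory) >= self.flush_threshold:
            if self.in_transaction:
                self._flush()   # resource management only
            else:
                self.commit()   # flush and commit


# bremner's scenario: several unflushed transactions, the last one cancelled.
db = ToyWritableDB(flush_threshold=3)
batches = [["a", "b"], ["c", "d"], ["e"]]
for i, batch in enumerate(batches):
    db.begin_transaction()
    for doc in batch:
        db.add_document(doc)
    if i < len(batches) - 1:
        db.commit_transaction()
db.cancel_transaction()
print(db.committed)  # [] : nothing was ever committed
```

In this model the threshold fires inside the second transaction, but only moves changes out of memory, so the final cancel still leaves the database looking empty.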
@olly.nz:matrix.orgollyfor honey, I'm hoping to have a rolling flush to keep the disc write rate much more even23:27:24
@freenode_bremner:matrix.orgbremnerthen we really should be committing every N messages indexed, for some N23:27:39
@freenode_bremner:matrix.orgbremnerI guess worst case is you restart indexing.23:28:01
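
A minimal sketch of "commit every N messages", assuming `db` is a `xapian.WritableDatabase` or something with the same methods. `index_all`, `index_one` and the default of 1000 are hypothetical; this is not how notmuch actually does it.

```python
def index_all(db, messages, index_one, commit_every=1000):
    """Index each message in its own unflushed transaction and call
    commit() after every `commit_every` messages.

    Hypothetical sketch: `index_one(db, message)` stands in for whatever
    applies the changes for a single email.
    """
    since_commit = 0
    for message in messages:
        db.begin_transaction(False)  # unflushed: grouping only
        try:
            index_one(db, message)
        except BaseException:
            # Also covers KeyboardInterrupt. Cancelling discards all
            # uncommitted changes, so up to `commit_every` messages may
            # need to be indexed again.
            db.cancel_transaction()
            raise
        db.commit_transaction()
        since_commit += 1
        if since_commit >= commit_every:
            db.commit()
            since_commit = 0
    if since_commit:
        db.commit()  # make the final partial batch durable
```

The choice of N trades commit overhead against how much work is repeated after an interruption.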
@olly.nz:matrix.orgollythere's an argument for time-based commits, though it makes debugging crashes harder as they'll probably tend to be less reproducible23:28:44
@olly.nz:matrix.orgollybut something like "only have to repeat at most a minute of indexing" if interrupted might make sense for something like notmuch23:30:13
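
The time-based variant could look something like this. Again only a sketch: `index_all_timed`, `index_one`, `max_seconds` and the injectable `clock` are hypothetical names, and `db` is assumed to expose the `xapian.WritableDatabase` transaction and commit methods.

```python
import time


def index_all_timed(db, messages, index_one, max_seconds=60.0,
                    clock=time.monotonic):
    """Commit whenever at least `max_seconds` have passed since the last
    commit, so an interruption loses roughly that much indexing at most.

    Hypothetical sketch; `clock` is injectable so tests can fake time.
    """
    last_commit = clock()
    dirty = False
    for message in messages:
        db.begin_transaction(False)  # unflushed: grouping only
        try:
            index_one(db, message)
        except BaseException:
            db.cancel_transaction()  # drops everything since last commit
            raise
        db.commit_transaction()
        dirty = True
        # Only checked between messages, so one very slow message can
        # overshoot the budget.
        if clock() - last_commit >= max_seconds:
            db.commit()
            last_commit = clock()
            dirty = False
    if dirty:
        db.commit()
```

Passing a deterministic `clock` is one way to soften the reproducibility concern olly raises, at least in tests.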