!vVDCbufQqGEdAdfclU:matrix.org

neorg treesitter

576 Members
3 Servers

Sender  Message  Time
31 Dec 2023
@_discord_750666437420384297:t2bot.ioboltless implemented heading/slides/indent_segments without any problem 12:02:15
@_discord_750666437420384297:t2bot.ioboltless …if complexity of scanner doesn’t count as a problem 12:02:57
1 Jan 2024
@_discord_750666437420384297:t2bot.ioboltless hmmm the parser is overflowing its state count with superscript/subscript
tree-sitter uses uint16_t for the state count, and with superscript/subscript the parser has slightly more states than that can hold (about 70000, versus a uint16_t maximum of 65535)
05:11:00
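To illustrate the overflow described above, here is a minimal sketch. It assumes a state count of exactly 70000 (the message only says "about 70000"). The helper names `fits` and `wrap_uint16` are hypothetical, and `ctypes` is used only to mimic how a C uint16_t would wrap around; none of this is tree-sitter code.

```python
import ctypes

UINT16_MAX = 2**16 - 1  # 65535, the largest value a uint16_t can hold
UINT32_MAX = 2**32 - 1  # 4294967295, the largest value a uint32_t can hold


def fits(state_count, max_value):
    """Return True if state_count can be stored without overflow."""
    return 0 <= state_count <= max_value


def wrap_uint16(value):
    """Mimic storing value in a C uint16_t (keeps only the low 16 bits)."""
    return ctypes.c_uint16(value).value


state_count = 70000  # assumption: the chat only gives an approximate figure

print(fits(state_count, UINT16_MAX))  # False
print(wrap_uint16(state_count))       # 4464
print(fits(state_count, UINT32_MAX))  # True
```

Under that assumption, a state numbered 70000 would wrap to 4464 and collide with an unrelated state, which is why widening the type to uint32_t makes the overflow go away.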
@_discord_750666437420384297:t2bot.ioboltless trying to reduce the parser size to get at least one of superscript/subscript working 05:20:54
@_discord_750666437420384297:t2bot.ioboltless I thought I could move on to block types so people can work on pandoc/lsp/formatter and so on,
until I realized I completely forgot to implement the null modifier 🤦‍♂️
05:42:04
@_discord_750666437420384297:t2bot.ioboltless if anyone is willing to help with the v3 parser,
the current main problem is that the parser state count is too big (tree-sitter uses uint16_t for the state count, and the generated parser has more than 70000 states). two options:
- patch tree-sitter to use uint32_t
- reduce parser size to avoid overflow

I’m not a C/C++ expert, so help with the first solution would be really appreciated.
08:11:17
@_discord_750666437420384297:t2bot.ioboltless changing uint16_t to uint32_t solves the problem,
with a 22MiB parser size (without the null modifier)
09:56:09
@_discord_83007807036588032:t2bot.iopurewater What is an acceptable parser size 09:56:54
@_discord_750666437420384297:t2bot.ioboltless definitely not 22MiB 09:58:51
@_discord_750666437420384297:t2bot.ioboltless largest parser I know is tree-sitter-ocaml (4.9MiB) 09:59:45
@_discord_83007807036588032:t2bot.iopurewater I mean is it reasonable to expect it to be smaller 10:00:51
@_discord_750666437420384297:t2bot.ioboltless v1 was a huge mess and had quite a lot of problems
but its size was about 3.5MiB iirc
10:02:17
@_discord_83007807036588032:t2bot.iopurewater Do you know what's causing the state size to balloon 10:03:40
@_discord_83007807036588032:t2bot.iopurewater I will have like a 2 hour period later to look at it 10:04:39
@_discord_750666437420384297:t2bot.ioboltless 1. we have quite a lot of markup types (attached modifiers, verbatim/free-form variant linkables)
2. we should allow unclosed markup (like markdown and many other markup languages do)
3. we can’t make the parser parse in separate contexts (expecting closed/unclosed) because theoretically the parser could have 2^9 versions at runtime
4. .vhyrro’s last solution for 3 is to unpack the inner part of each markup, so the parser has a single unpacked, lexed version instead of multiple different versions until it meets a valid close token.
5. and that solution is now causing too many states in the parser

this is what’s happening, in my understanding.
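The 2^9 figure in point 3 can be sketched as a count of open/closed combinations. This is only an illustration. It assumes nine markup kinds that can each independently be open or closed; the chat gives the count but not the list, so the names `markup_1` to `markup_9` and the helper `open_contexts` are hypothetical placeholders.

```python
from itertools import product

# Hypothetical placeholder names: the chat only gives the count, not the kinds.
MARKUP_KINDS = [f"markup_{i}" for i in range(1, 10)]


def open_contexts(kinds):
    """Yield every combination of open/closed flags, one dict per context."""
    for flags in product((False, True), repeat=len(kinds)):
        yield dict(zip(kinds, flags))


contexts = list(open_contexts(MARKUP_KINDS))
print(len(contexts))  # 512
```

Each of those 512 contexts would need its own variant of the inline rules if the grammar tracked them separately, which is the blow-up that point 3 rules out.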
10:11:08
@_discord_750666437420384297:t2bot.ioboltless + why didn’t v1 have this problem?
-> the v1 parser did all of the parser’s job in the external lexer (scanner.cc), so the parser itself had no states,
which was bad design for many reasons
10:14:18
@_discord_750666437420384297:t2bot.ioboltless my hands are shaking… 11:39:41
@_discord_750666437420384297:t2bot.ioboltless sort of working parser with 236K..?? wth 11:40:01
@_discord_83007807036588032:t2bot.iopurewater Wait what 11:45:51
@_discord_83007807036588032:t2bot.iopurewater How 11:45:58
@_discord_83007807036588032:t2bot.iopurewater Can I hire you 11:46:05
@_discord_83007807036588032:t2bot.iopurewater I don't know what for 11:46:12
@_discord_750666437420384297:t2bot.ioboltless nope it’s not… still have some previous issues 11:46:15

Room Version: 9