31 Dec 2023 |
boltless | implemented heading/slides/indent_segments without any problem | 12:02:15 |
boltless | …if complexity of scanner doesn’t count as a problem | 12:02:57 |
| sockstealingbastard joined the room. | 14:19:46 |
| ntbbloodbath changed their profile picture. | 22:25:48 |
| b4mbus changed their profile picture. | 23:06:45 |
1 Jan 2024 |
| ntbbloodbath changed their profile picture. | 02:53:03 |
boltless | hmmm, the parser is overflowing its state count with superscript/subscript. tree-sitter uses uint16_t for it, and with superscript/subscript the parser has slightly more states than that can hold (about 70000) | 05:11:00 |
boltless | trying to reduce the parser size to get at least one of superscript/subscript working | 05:20:54 |
boltless | I thought I could move on to block types so people can work on pandoc/lsp/formatter and so on, until I realized I completely forgot to implement the null modifier 🤦‍♂️ | 05:42:04 |
boltless | if anyone is willing to help with the v3 parser, the current main problem is that the parser state count is too big (tree-sitter uses uint16_t for the state count, and the generated parser has more than 70000 states). two options:
- patch tree-sitter to use uint32_t
- reduce parser size to avoid overflow
I’m not a C/C++ expert, so help on the first solution would be really helpful. | 08:11:17 |
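(Editor's note: a minimal sketch of the arithmetic behind the overflow, assuming the approximate figure of 70000 states quoted above. The helper names `fits` and `wrap_uint16` are hypothetical and are not part of tree-sitter. The chat does not say how the generator actually fails, so the wrap-around only shows what a plain 16-bit truncation would do to such an id.)

```python
# Largest values representable in unsigned 16-bit and 32-bit integers.
UINT16_MAX = 2**16 - 1  # 65535
UINT32_MAX = 2**32 - 1  # 4294967295

# Approximate state count quoted in the chat (assumption: "about 70000").
STATE_COUNT = 70_000


def fits(state_count: int, max_value: int) -> bool:
    """Return True if every state id up to state_count fits in the given range."""
    return 0 <= state_count <= max_value


def wrap_uint16(state_id: int) -> int:
    """Simulate truncating a value to 16 bits, as a C uint16_t conversion would."""
    return state_id & 0xFFFF


if __name__ == "__main__":
    print("fits in uint16_t:", fits(STATE_COUNT, UINT16_MAX))
    print("fits in uint32_t:", fits(STATE_COUNT, UINT32_MAX))
    print("70000 truncated to 16 bits:", wrap_uint16(STATE_COUNT))
```

The two options in the message map onto this directly: widen the type so `fits` holds, or bring the state count back under 65535.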
boltless | changing uint16_t to uint32_t solves the problem with 22MiB parser size (without null modifier) | 09:56:09 |
purewater | What is an acceptable parser size | 09:56:54 |
boltless | definitely not 22MiB | 09:58:51 |
boltless | largest parser I know is tree-sitter-ocaml (4.9MiB) | 09:59:45 |
purewater | I mean is it reasonable to expect it to be smaller | 10:00:51 |
boltless | v1 was a huge mess and had quite a lot of problems, but its size is about 3.5MiB iirc | 10:02:17 |
purewater | Do you know what's causing the state size to balloon | 10:03:40 |
purewater | I will have like a 2 hour period later to look at it | 10:04:39 |
boltless | 1. we have quite a lot of markup types (attached modifiers, verbatim/free-form variant linkables)
2. we should allow unclosed markup (like markdown and many other markup languages do)
3. we can’t make the parser parse in separate contexts (expecting closed/unclosed), because in theory the parser could have 2^9 versions at runtime
4. .vhyrro’s last solution for point 3 is to unpack the inner part of each markup, so the parser has a single unpacked, lexed version instead of multiple different versions until it meets a valid close token
5. and that solution is now causing too many states in the parser
this is what’s happening, in my understanding. | 10:11:08 |
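(Editor's note: a small sketch of where a figure like 2^9 can come from, assuming it means nine markup kinds that can each independently be "open, expecting a close token" or not. The chat gives only the count, so the names in `MARKUP_KINDS` are hypothetical placeholders, and this is not how the v3 grammar is written.)

```python
from itertools import product

# Hypothetical placeholder names: the chat only implies a count of 9.
MARKUP_KINDS = [f"markup_{i}" for i in range(9)]


def parser_contexts(kinds):
    """Yield every combination of open/closed flags, one flag per markup kind."""
    for flags in product((False, True), repeat=len(kinds)):
        yield dict(zip(kinds, flags))


def context_count(kinds) -> int:
    """Count the distinct contexts a context-per-combination parser would need."""
    return sum(1 for _ in parser_contexts(kinds))


if __name__ == "__main__":
    # Each extra markup kind doubles the number of contexts.
    for n in (1, 2, 3, 9):
        print(n, "kinds ->", context_count(MARKUP_KINDS[:n]), "contexts")
```

Because each additional kind doubles the count, generating one parser variant per combination gets out of hand quickly, which is the motivation given in point 3 for not going that route.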
boltless | + why didn’t v1 have this problem? -> the v1 parser does all of the parser’s work in the external lexer (scanner.cc), so the parser itself has no states, which was bad design for many reasons | 10:14:18 |
boltless | my hands are shaking… | 11:39:41 |
boltless | sort of working parser with 236K..?? wth | 11:40:01 |
purewater | Wait what | 11:45:51 |
purewater | How | 11:45:58 |
purewater | Can I hire you | 11:46:05 |
purewater | I don't know what for | 11:46:12 |
boltless | nope it’s not… still have some previous issues | 11:46:15 |