!arfzDnJwymdIWXAJOl:matrix.org

_Labs - Analysis

307 Members
2 Servers



Sender Message Time
17 May 2022
@_discord_239371209202991105:t2bot.io Oxey but it's annoying though because from just a unicode value or something like that it's hard to say if a character is a symbol or not 16:11:21
@_discord_239371209202991105:t2bot.io Oxey * but it's annoying though because from just a unicode value or something like that it's impossible to say if a character is a symbol or not 16:11:28
@_discord_239371209202991105:t2bot.io Oxey so unless you then also have data about general accepted chars you can't do shit 👍 16:11:51
@_discord_239371209202991105:t2bot.io Oxey actually I could do this I think 16:12:00
@_discord_169285177481101312:t2bot.io Apsu I mean, there are classifications in Unicode 16:12:16
@_discord_239371209202991105:t2bot.io Oxey 16:12:33
@_discord_169285177481101312:t2bot.io Apsu https://unicodebook.readthedocs.io/unicode.html 16:12:34
@_discord_239371209202991105:t2bot.io Oxey joking of course tell me more 16:12:36
@_discord_239371209202991105:t2bot.io Oxey 😳 16:12:37
@_discord_169285177481101312:t2bot.io Apsu
Unicode 6.0 has 7 character categories, and each category has subcategories:

Letter (L): lowercase (Ll), modifier (Lm), titlecase (Lt), uppercase (Lu), other (Lo)
Mark (M): spacing combining (Mc), enclosing (Me), non-spacing (Mn)
Number (N): decimal digit (Nd), letter (Nl), other (No)
Punctuation (P): connector (Pc), dash (Pd), initial quote (Pi), final quote (Pf), open (Ps), close (Pe), other (Po)
Symbol (S): currency (Sc), modifier (Sk), math (Sm), other (So)
Separator (Z): line (Zl), paragraph (Zp), space (Zs)
Other (C): control (Cc), format (Cf), not assigned (Cn), private use (Co), surrogate (Cs)
16:12:41
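The category list above maps directly onto a lookup. A minimal sketch, assuming Python's standard `unicodedata` module is acceptable here; the helper names `major_category` and `is_symbol` are made up for illustration and were not mentioned in the chat:

```python
import unicodedata


def major_category(ch: str) -> str:
    """Return the one-letter major category (L, M, N, P, S, Z or C)."""
    # unicodedata.category returns the two-letter subcategory, e.g. "Lu" or "Sm".
    return unicodedata.category(ch)[0]


def is_symbol(ch: str) -> bool:
    """True if the character is in any Symbol subcategory (Sc, Sk, Sm, So)."""
    return major_category(ch) == "S"


if __name__ == "__main__":
    # Print the subcategory and the symbol check for a few sample characters.
    for ch in "a+$!":
        print(repr(ch), unicodedata.category(ch), is_symbol(ch))
```

Note that the Unicode notion of "symbol" may not match what a keyboard layout analyzer wants: `!` is classified as punctuation (Po), not as a symbol.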
@_discord_239371209202991105:t2bot.io Oxey 1100 numbers 16:13:26
@_discord_169285177481101312:t2bot.io Apsu You can decompose and normalize too 16:13:28
@_discord_169285177481101312:t2bot.io Apsu There are libraries for this of course 16:13:36
@_discord_169285177481101312:t2bot.io Apsu So from any unicode character sequence, you should be able to extract letters just fine 16:13:52
@_discord_169285177481101312:t2bot.io Apsu And punct 16:13:58
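One way the decompose-then-extract idea could look, as a rough sketch rather than anything Apsu specified: it assumes NFD normalization and Python's standard `unicodedata` module, and `letters_and_punct` is a hypothetical helper name.

```python
import unicodedata


def letters_and_punct(text: str) -> tuple[list[str], list[str]]:
    """Split text into base letters and punctuation, dropping everything else."""
    # NFD decomposes precomposed characters, e.g. "é" becomes "e" + U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep any Letter subcategory (Ll, Lm, Lt, Lu, Lo).
    letters = [c for c in decomposed if unicodedata.category(c).startswith("L")]
    # Keep any Punctuation subcategory (Pc, Pd, Pi, Pf, Ps, Pe, Po).
    punct = [c for c in decomposed if unicodedata.category(c).startswith("P")]
    return letters, punct


if __name__ == "__main__":
    letters, punct = letters_and_punct("Caf\u00e9, d\u00e9j\u00e0!")
    print("".join(letters), punct)
```

Combining marks (category Mn) are discarded after decomposition, so accented letters collapse to their base letter. Whether that is desirable depends on the corpus: for languages where the accented letter is a separate key, it may be better to skip the normalization step.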
@_discord_239371209202991105:t2bot.io Oxey 😳 16:22:35
@_discord_239371209202991105:t2bot.io Oxey I might 16:22:55
@_discord_444585600318701568:t2bot.io Tanamr mine also does thumb keys but you need to give it a ton of data before it learns what is good and bad 16:28:33
@_discord_444585600318701568:t2bot.io Tanamr and tell it not to use oe pinky 16:28:38
@_discord_328195824238723072:t2bot.io ec0vid Right that's how it goes around here. Everyone redoing the same thing (their own analyzer) rather than working together.
F
18:44:53
@_discord_239371209202991105:t2bot.io Oxey 18:45:04
@_discord_239371209202991105:t2bot.io Oxey we just like to do our own silly little projects 18:45:11
@_discord_239371209202991105:t2bot.io Oxey that's all 18:45:14
@_discord_239371209202991105:t2bot.io Oxey 18:45:16
18 May 2022
