!cxPCiPlsXnajakSrqd:matrix.org

Array languages

887 Members
General discussion about APL-like array languages. See #array:matrix.org for other rooms23 Servers

Load older messages


SenderMessageTime
14 Jan 2025
@ngn:matrix.orgngnit accepts three params: void* pointer to memory, length, and itemsize so it's universal - works for 64bit, 32bit, 8bit22:22:26
@ngn:matrix.orgngnit's from my private library for PE22:24:42
@dzaima:matrix.orgdzaima ngn it's somewhere here with some levels of macro abstractions (incl. calling into Singeli for the cumulative sum steps), not gonna be able to easily extract it out (written by Marshall, not me) 22:24:58
@ngn:matrix.orgngn

here is mine:

V srtV(V*r,U n,U w){G*g;CO U b=1<<8*SZ(*g);U c[w][b+1];MS(c,0,SZ c);g=r;F(n,Fj(w,c[j][1+*g++]++))
 V*t=malloc(n*w);Q(t);F(w,U*ci=c[i];sumsU(ci+1,b-1);g=r;g+=i;Fj(n,MC(t+ci[g[w*j]]++*w,r+j*w,w))SWP(r,t))free(t);}

i'll add comment later

22:27:44
@ngn:matrix.orgngn

dzaima: Kamila

//universal sorter - sorts items of any given size
//r:array sorted in place
//n:length - number of items
//w:width - size of an item in bytes, must be even
V srtV(V*r,U n,U w){
 G*g;                      //same as r but reinterpreted as bytes (G means unsigned char)
 CO U b=1<<8*SZ(*g);       //256
 U c[w][b+1];              //counters
 MS(c,0,SZ c);             //memset with 0s
 g=r;
 F(n,Fj(w,c[j][1+*g++]++)) //go through g only once and do the counting for all of the w passes
 V*t=malloc(n*w);          //temporary helper array, will be swapped with r an even number of times
 Q(t);                     //assert(t!=NULL); - unnecessary
 F(w,                      //do w iterations, on the ith iteration sort by ith-least-significant byte
  U*ci=c[i];               //counters for the current iteration
  sumsU(ci+1,b-1);         //like +\ in k,  i should optimize it one day..
  g=r;g+=i;                //now I can change this to g=r+i; previously r had a different type from g
  Fj(n,
   MC(t+ci[g[w*j]]++*w,    //memcpy an item to the temp array; the compiler should replace it with
      r+j*w,               // a simple store instruction when srtV() is inlined and w is constant
      w);
  )
  SWP(r,t);                //swap
 )
 free(t);
}
//one day i'll also add a parameter for "key size", so i can sort by part of the item
//i'm still not sure if that or "grading" is more sensible

//"instantiate" srtV() for types U (32bit) and W (64bit); i'll add more when i need them
#define M(T) V srt##T(T*r,U n){srtV(r,n,SZ(T));}
 M(U)M(W)
#undef M
22:54:47
@_discord_506589628418097152:t2bot.iosilas why'd you not do G*g=r ? 23:27:41
@ngn:matrix.orgngn i started with separate code for srtU() and srtW(), some things were different, there are some traces left.. 23:31:49
@ngn:matrix.orgngni'll change it, thanks23:33:04
@ngn:matrix.orgngnah, i remembered! :)23:34:27
@ngn:matrix.orgngn b depends on the sizeof the type of G (because i wanted to experiment with a 16bit g too)
c's size depends on b
and i wanted to initialize c with ={}; instead of a memset, so it had to move g=r after c (otherwise gcc refuses to compile the ={};
but it still didnt work and i left it like this..
23:39:37
@ngn:matrix.orgngn * b depends on the sizeof the type of g (because i wanted to experiment with a 16bit g too)
c's size depends on b
and i wanted to initialize c with ={}; instead of a memset, so it had to move g=r after c (otherwise gcc refuses to compile the ={};
but it still didnt work and i left it like this..
23:39:52
@_discord_506589628418097152:t2bot.iosilas cool, feel I'll need to read more to get the logic but looks interesting 23:51:22
@ngn:matrix.orgngnthe logic of radix sort or the logic of my implementation of it?23:56:21
@ngn:matrix.orgngnradix sort is simple: sort by least significant bytes, then by the second-least significant, etc23:57:20
@ngn:matrix.orgngneach pass is a sort by counting23:57:50
@_discord_506589628418097152:t2bot.iosilas your implementation - don't get the need for multi-dimension counter array, although is that for the number of rounds? 23:58:40
@ngn:matrix.orgngn it's because i //go through g only once and do the counting for all of the w passes 23:59:10
@ngn:matrix.orgngn in other words, c[i] are the counters for the ith pass 23:59:38
@_discord_506589628418097152:t2bot.iosilas right, thought so - thanks 23:59:56
15 Jan 2025
@ngn:matrix.orgngnit's a slightly better memory access pattern - access g[0], g[1], g[2].. instead of g[0], g[8], g[16].. and then g[1], g[9], g[17], and then..00:01:00
@_discord_506589628418097152:t2bot.iosilas indeed 00:02:06
@ngn:matrix.orgngn

new improved version:

inline V srtV(V*r,U n,U w){U c[w][257];MS(c,0,SZ c);G*g=r;F(n,Fj(w,c[j][1+*g++]++))V*t=malloc(n*w);
 F(w,U*ci=c[i];sumsU(ci,256);g=r+i;Fj(n,MC(t+ci[g[w*j]]++*w,r+j*w,w))SWP(r,t))I(w&1,SWP(r,t);MC(r,t,n*w))free(t);}
00:05:41
@ngn:matrix.orgngn explicit inline to increase chances of getting proper speed in other environments
support for odd w
surrendered to the thought that 8bit g works best (not 16bit) and hardcoded 256 (and 257)
00:09:00
@ngn:matrix.orgngn * explicit inline to increase chances of getting proper speed in other environments
support for odd w
surrendered to the thought that 8bit g works best (not 16bit) and hardcoded 256 (and 257)
rounded sumsU's args, it may simplify its vectorization "one day"
00:10:04
@ngn:matrix.orgngn it looks like Marshall doesnt want compilers to steal his job :) - he has separate copies for all possible types (and even those are macros), manual loop unrolling.. 00:29:42
@ngn:matrix.orgngnbtw, what's the difference between "radix sort i8" and "counting sort i8"?00:31:05
@dzaima:matrix.orgdzaima there is a good amount of variation between the copies - reusing buffers, handling the last byte separately (cause signed ints; includes having said buffer be offset). Perhaps you could move all that in loops but I'm not sure if the result would be much clearer 00:38:48
@dzaima:matrix.orgdzaima also there's a radix sort impl in Singeli here that's less duplicatey (along with a bunch of other sort algorithms in the same project) 00:43:35
@ngn:matrix.orgngntoo many languages00:58:53
@ngn:matrix.orgngn dzaima: do you have a fast plain c version of +\ by any chance? 01:01:01

Show newer messages


Back to Room ListRoom Version: 6