ejolson
Posts: 3831
Joined: Tue Mar 18, 2014 11:47 am

Re: Liberation through Computer Literacy

Tue Nov 12, 2019 8:58 pm

John_Spikowski wrote:
Tue Nov 12, 2019 8:54 pm
Back on topic:

Am I correct by saying the following languages have Tatami submissions?

C
Perl
ScriptBasic
8th
I also wrote a solution in Visual Basic some time ago and later one in Commodore Basic. There were also Fortran and PL/I solutions posted.

Heater
Posts: 13925
Joined: Tue Jul 17, 2012 3:02 pm

Re: Liberation through Computer Literacy

Tue Nov 12, 2019 9:08 pm

Rust.
Memory in C++ is a leaky abstraction .

jalih
Posts: 94
Joined: Mon Apr 15, 2019 3:54 pm

Re: Liberation through Computer Literacy

Tue Nov 12, 2019 9:30 pm

Here are updated Tatami codes for PL/1, Fortran and 8th. I use ancient PL/1 compiler for Windows but it still produces the fastest code of the bunch. PL/1 code takes about 3 seconds to run, Fortran code doubles that time and runs in about 6.5 seconds. Execution time for 8th in this task is poor and takes about 9 minutes!

8th:

Code: Select all

\ Based on Tatami.c written by jcyr on Raspberrypi.org forums

100000000 constant N-MAX
N-MAX n:sqrt n:int constant N-MAX-SQRT

N-MAX b:new true b:writable constant v

: swap+-  \ a b c f -- a (a+b+1) (a+c-1)
  drop
  swap 2 pick n:+ n:1+
  swap 2 pick n:+ n:1- ;

: l4  \ i k2 k3 k4 -- i k2 k3 k4
  \ i k2 k3 k4 j
  4 pick over n:*
  v over b:@ n:1+ rot swap b:! 2drop ;

: l3
  \ i
  dup 3 n:+             \ i k2=i+3
  over dup n:+ 4 n:-    \ i k2 k3=i+i-4
  repeat  \ i k2 k3
    2dup n:> not 2over n:* N-MAX n:< and if
      N-MAX n:1- 3 pick n:/ n:int     \  i k2 k3 k4
      2dup n:< if
        drop dup
      then
      ' l4 3 pick 2 pick loop   \ i k2 k3 k4
      swap+-
    else
      break
    then
  again
  2drop drop
  2 step ;

: l2  \ i k2 k3 k4 -- i k2 k3 k4
  \ i k2 k3 k4 j
  4 pick over n:*
  v over b:@ n:1+ rot swap b:! 2drop
  2 step ;

: l1
  \ i
  dup 3 n:+             \ i k2=i+3
  over dup n:+ 4 n:-    \ i k2 k3=i+i-4
  repeat  \ i k2 k3
    2dup n:> not 2over n:* N-MAX n:< and if
      N-MAX n:1- 3 pick n:/ n:int     \  i k2 k3 k4
      2dup n:< if
        drop dup
      then
      ' l2 3 pick 2 pick loop   \ i k2 k3 k4
      swap+-
    else
      break
    then
  again
  2drop drop
  2 step ;

: tatami  \ n -- s
  ' l1 7 N-MAX-SQRT n:1- loop
  ' l3 8 N-MAX-SQRT n:1- loop
  v swap 1 a:close b:new b:search nip ;

: app:main
  200 tatami null? not if
    "The smallest room size s for which T(s) = 200 is %d.\n" s:strfmt .
  else
    drop "Not found\n" .
  then
  bye ;
  
PL/1:

Code: Select all


*PROCESS MARGINS(1,180) LIBS(SINGLE,STATIC);
*PROCESS OPTIMIZE(3) DFT(REORDER);
*PROCESS PP(MACRO);


/* Based on Tatami.c written by jcyr on Raspberrypi.org forums */
 tatami: proc options(main);
   dcl N_MAX fixed bin(31) value(100000000);
   dcl N_MAX_SQRT fixed bin(31) value(sqrt(N_MAX));

   dcl v(0:N_MAX-1) fixed bin(8) unsigned ctl;

   dcl (i, j, k2, k3, k4) fixed bin(31);

   allocate v(0:N_MAX-1);

   do i = 7 to N_MAX_SQRT - 1 by 2;
     k2 = i + 3;
     k3 = i + i - 4;
     do while((k2 <= k3) & ((i * k2) < N_MAX));
       k4 = (N_MAX - 1) / i;
       if k3 < k4 then k4 = k3;
       do j = k2 to k4 by 2;
         v(i*j)+=1;
       end;
       k2 += i + 1;
       k3 += i - 1;
     end;
   end;

   do i = 8 to N_MAX_SQRT - 1 by 2;
     k2 = i + 3;
     k3 = i + i - 4;
     do while((k2 <= k3) & ((i * k2) < N_MAX));
       k4 = (N_MAX - 1) / i;
       if k3 < k4 then k4 = k3;
       do j = k2 to k4;
         v(i*j)+=1;
       end;
       k2 += i + 1;
       k3 += i - 1;
     end;
   end;

   do i = 0 upthru N_MAX - 1;
     if v(i) = 200 then leave;
   end;

   put skip list('The smallest room size s for which T(s) = 200 is:', i);

   free v;
 end tatami;
Fortran:

Code: Select all

! Based on Tatami.c written by jcyr on Raspberrypi.org forums
program tatami
  implicit none

  integer, parameter :: N_MAX = 100000000
  integer, parameter :: N_MAX_SQRT = sqrt(real(N_MAX))

  integer :: ierror
  integer, dimension(:), allocatable :: v 

  integer :: i, j, k2, k3, k4

  allocate(v(0:N_MAX-1), stat=ierror)
  if (ierror /= 0) then
    print *, 'Could not allocate vector'
    stop
  end if
  
  do i = 7, N_MAX_SQRT - 1, 2
    k2 = i + 3
    k3 = i + i - 4
    do while((k2 <= k3) .and. ((i * k2) < N_MAX))
      k4 = (N_MAX - 1) / i
      if (k3 < k4) k4 = k3
      do j = k2, k4, 2
        v(i*j) = v(i*j) + 1
      end do
      k2 = k2 + i + 1
      k3 = k3 + i - 1
    end do    
  end do

  do i = 8, N_MAX_SQRT - 1, 2
    k2 = i + 3
    k3 = i + i - 4
    do while((k2 <= k3) .and. ((i * k2) < N_MAX))
      k4 = (N_MAX - 1) / i
      if (k3 < k4) k4 = k3
      do j = k2, k4, 1
        v(i*j) = v(i*j) + 1
      end do
      k2 = k2 + i + 1
      k3 = k3 + i - 1
    end do    
  end do
  
  do i = 0, N_MAX - 1, 1
    if (v(i) == 200) exit
  end do
    
  print *, 'The smallest room size s for which T(s) = 200 is:', i
  
  deallocate(v)  
end program tatami
Last edited by jalih on Wed Nov 13, 2019 4:56 pm, edited 1 time in total.

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Tue Nov 12, 2019 9:51 pm

I forgot about your VB6 version. I even converted it to ScriptBasic no less.

Is a Tatami 200 chart in the making?

A nice change ScriptBasic not being the retard language in the group.

Array Class
The future evolution of the TA extension module.

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Wed Nov 13, 2019 2:29 am

If you're running ScriptBasic with scriba only, here is how to use an extension module without having to create a config file and set an environmental variable.

1, Download/unzip the attach ta.so extension module to where you like.

2. Modify the the posted tatamix.sb script to use a full path and attribute in the LIB string.

3. time ./scriba tatamix.sb 200

Code: Select all

DECLARE SUB do_v    ALIAS "do_v"    LIB "/home/pi/sbrt/examples/ta.so"
DECLARE SUB find_s  ALIAS "find_s"  LIB "/home/pi/sbrt/examples/ta.so"
My ta.so is in the examples directory where tatamix.sb resides.
Attachments
scriba.zip
(150.42 KiB) Downloaded 6 times
ta.so.zip
(1.89 KiB) Downloaded 4 times

gkreidl
Posts: 6139
Joined: Thu Jan 26, 2012 1:07 pm
Location: Germany

Re: Liberation through Computer Literacy

Wed Nov 13, 2019 9:43 am

Some people have been asking for a Python implementation of the Tatami challenge. I wasn't really interested for a number of reasons but finally got caught nevertheless. But I wanted a version that is more versatile (being able to find solutions for larger values of s) and while working on it I found an optimization which should be applicable to most other implementations as well.

Both versions I provide (tatami.py and tatami_opt.py) may take up to two arguments: s and nMax. So it's possible to play arround with different values. The current algorithm will break with nMax >= 147026880, because the stored values may be bigger than 255. So the program uses bytearrays, Python list or arrays from the array module (as a last resort, because it is the slowest), depending on the size of nMax, maximum size of Python lists and memory usage. Here is the standard version (tatami.py):

Code: Select all

#!/usr/bin/python3
import sys
from math import sqrt

def tatami(s):
    for i in range(7,nMaxSqrt,2):
        k2 = i + 3
        k3 = i + i - 4
        while ((k2 <= k3) and ((i * k2) < nMax)):
            k4 = (nMax-1) // i
            if k3 < k4:
                k4 = k3
            for j in range(k2,k4+1,2):
                v[i*j] += 1
            k2 += i+1
            k3 += i-1
    for i in range(8,nMaxSqrt,2):
        k2 = i + 3
        k3 = i + i - 4
        while ((k2 <= k3) and ((i * k2) < nMax)):
            k4 = (nMax-1) // i
            if k3 < k4:
                k4 = k3
            for j in range(k2,k4+1):
                v[i*j] += 1
            k2 += i + 1
            k3 += i - 1
    try:
        return v.index(s)
    except:
        return -1

s = 200
nMax = 100000000

if len(sys.argv) > 1:
    try:
        s = int(sys.argv[1])
    except:
        pass
if len(sys.argv) > 2:
    try:
        nMax = int(sys.argv[2])
    except:
        pass

if (nMax & 1) != 0:
    nMax += -1
              
if nMax > 536870912 and sys.maxsize == 2147483647:
    try:
        from array import array
        v = array('I',[0])*nMax
    except:
        print("Not enough memory!")
        sys.exit(0)
elif nMax >= 147026880:
    try:
        v = [0]*nMax
    except:
        try:
            from array import array
            v = array('I',[0])*nMax
        except:
            print("Not enough memory!")
            sys.exit(0)
else:
    v = bytearray(nMax)

nMaxSqrt = int(sqrt(nMax))

res = tatami(s)
if res > -1:
    print("The smallest room size s for which T(s) = "+str(s)+" is "+str(res))
else:
    print("No solution found for s=" +str(s)+" and nMax="+str(nMax))
    maxs = 0
    maxi = 0
    for i in range(0,nMax):
        if v[i] > maxs:
            maxs = v[i]
            maxi = i
    print("Maximal T(s) found = "+str(maxs)+" for room size = "+str(maxi))
Of course it will run much slower than a C version.:

Code: Select all

time ./tatami.py
The smallest room size s for which T(s) = 200 is 85765680

real	1m29,730s
user	1m29,230s
sys	0m0,240s
But it will give a nice performance time when run from pypy3:

Code: Select all

time pypy3 ./tatami.py
The smallest room size s for which T(s) = 200 is 85765680

real	0m16,484s
user	0m16,034s
sys	0m0,430s
The optimization takes into account, that only even room sizes are allowed. Half of the size of the v-Array is never used. So it's possible to reduce the dimension of the v-array to nMax/2. It needs an additional shift by one to the right within the tatami function: v[(i*j) >> 1] += 1. The result has to be multiplied by 2 in the end. The overhead should be small and the search time for the result will be reduced. Here's the code of the optimized Python version (tatami_opt.py):

Code: Select all

#!/usr/bin/python3
import sys
from math import sqrt

def tatami(s):
    for i in range(7,nMaxSqrt,2):
        k2 = i + 3
        k3 = i + i - 4
        while ((k2 <= k3) and ((i * k2) < nMax)):
            k4 = (nMax-1) // i
            if k3 < k4:
                k4 = k3
            for j in range(k2,k4+1,2):
                v[(i*j) >> 1] += 1
            k2 += i+1
            k3 += i-1
    for i in range(8,nMaxSqrt,2):
        k2 = i + 3
        k3 = i + i - 4
        while ((k2 <= k3) and ((i * k2) < nMax)):
            k4 = (nMax-1) // i
            if k3 < k4:
                k4 = k3
            for j in range(k2,k4+1):
                v[(i*j) >> 1] += 1
            k2 += i + 1
            k3 += i - 1
    try:
        return v.index(s)
    except:
        return -1

s = 200
nMax = 100000000

if len(sys.argv) > 1:
    try:
        s = int(sys.argv[1])
    except:
        pass
if len(sys.argv) > 2:
    try:
        nMax = int(sys.argv[2])
    except:
        pass

if (nMax & 1) != 0:
    nMax += -1

if nMax > 536870912*2 and sys.maxsize == 2147483647:
    try:
        from array import array
        v = array('I',[0])*(nMax//2)
    except:
        print("Not enough memory!")
        sys.exit(0)
elif nMax >= 147026880:
    try:
        v = [0]*(nMax//2)
    except:
        try:
            from array import array
            v = array('I',[0])*(nMax//2)
        except:
            print("Not enough memory!")
            sys.exit(0)
else:
    v = bytearray(nMax//2)

nMaxSqrt = int(sqrt(nMax))

res = tatami(s)*2
if res > -1:
    print("The smallest room size s for which T(s) = "+str(s)+" is "+str(res))
else:
    print("No solution found for s=" +str(s)+" and nMax="+str(nMax))
    maxs = 0
    maxi = 0
    for i in range(0,nMax//2):
        if v[i] > maxs:
            maxs = v[i]
            maxi = i*2
    print("Maximal T(s) found = "+str(maxs)+" for room size = "+str(maxi))
It runs a bit slower in interpreter mode, but even faster with pypy3:

Code: Select all

time ./tatami_opt.py
The smallest room size s for which T(s) = 200 is 85765680

real	1m39,995s
user	1m39,781s
sys	0m0,140s

time pypy3 ./tatami_opt.py
The smallest room size s for which T(s) = 200 is 85765680

real	0m15,124s
user	0m14,871s
sys	0m0,210s
Now let us search for a solution for T(s) = 300:

Code: Select all

time pypy3 ./tatami_opt.py 300 300000000
The smallest room size s for which T(s) = 300 is 294053760

real	0m49,615s
user	0m48,143s
sys	0m1,400s
All timings are from a RPi 4B/4G
Minimal Kiosk Browser (kweb)
Slim, fast webkit browser with support for audio+video+playlists+youtube+pdf+download
Optional fullscreen kiosk mode and command interface for embedded applications
Includes omxplayerGUI, an X front end for omxplayer

ejolson
Posts: 3831
Joined: Tue Mar 18, 2014 11:47 am

Re: Liberation through Computer Literacy

Wed Nov 13, 2019 2:56 pm

gkreidl wrote:
Wed Nov 13, 2019 9:43 am
It runs a bit slower in interpreter mode, but even faster with pypy3:
Storing T(s) only for even values of s is a nice optimisation. In addition to being easier on the cache it takes less memory.

I was talking with the lead developer of FidoBasic about how to present the results of the challenge. The canine coder suggested that everyone should receive stars for any correctly working code submitted. When asked what kind of stars, the developer clicked and found
Wikipedia wrote:Sirius is known colloquially as the "Dog Star", reflecting its prominence in its constellation, Canis Major (the Greater Dog)
Then Fido became barking mad. What? growled the canine coder. That's an egregious error. Sirius is the colloquial; Dog Star is the proper scientific name.

I reflected and then remarked, I not sure we need stars. Didn't someone offer to provide free tatami mats to everyone entering the challenge? However, Fido was busy editing Wikipedia and did not respond. It's amazing how liberating it is to know how to use a computer.

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Wed Nov 13, 2019 5:15 pm

The optimization takes into account, that only even room sizes are allowed. Half of the size of the v-Array is never used. 
This sounds like a precursor to threads.

Can X and Y be done In separate threads?

Tatami is rectangles not squares.

scriba tatamixy.sb 200 100


User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 1:08 am

My vote for the next iteration of the Tatami challenge is to pass both X and Y dimensions and not assume the room is square.

User avatar
Paeryn
Posts: 2749
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 1:41 am

John_Spikowski wrote:
Thu Nov 14, 2019 1:08 am
My vote for the next iteration of the Tatami challenge is to pass both X and Y dimensions and not assume the room is square.
It doesn't assume the room is square, the calculations are on the area of the room, T(s) being the number of rooms with an area of s where the width is not greater than than the height that can't be covered completely. The width <= height prevents counting duplicate identical rooms (e.g. a room of 7x10 is just a rotated copy of 10x7).
She who travels light — forgot something.

jcyr
Posts: 503
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 2:48 am

gkreidl wrote:
Wed Nov 13, 2019 9:43 am
The optimization takes into account, that only even room sizes are allowed. Half of the size of the v-Array is never used. So it's possible to reduce the dimension of the v-array to nMax/2. It needs an additional shift by one to the right within the tatami function: v[(i*j) >> 1] += 1.
Nice, but that extra right shift in the innermost loops has a perceptible impact on performance!

Code: Select all

[email protected]:~/tatami $ gcc -O3 -march=native -mtune=cortex-a72 -o tatami tatami.c 
[email protected]:~/tatami $ time ./tatami 200
T(85765680) = 200

real    0m9.889s
user    0m9.538s
sys     0m0.351s

Code: Select all

[email protected]:~/tatami $ gcc -O3 -march=native -mtune=cortex-a72 -o tatami tatami-with-shift.c  
[email protected]:~/tatami $ time ./tatami 200
T(85765680) = 200

real    0m11.159s
user    0m10.903s
sys     0m0.251s
Might as well cancel out the shift by making V an array of shorts, getting back the performance at the expense of memory, with the added bonus of being able to calculate T beyond 255.

Code: Select all

[email protected]:~/tatami $ gcc -O3 -march=native -mtune=cortex-a72 -o tatami tatami-with-shift-shorts.c 
[email protected]:~/tatami $ time ./tatami 200
T(85765680) = 200

real    0m9.873s
user    0m9.512s
sys     0m0.361s

Code: Select all

// tatami-with-shift-shorts.c

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define nMax 100000000
#define nMaxSqrt sqrt(nMax)

unsigned short v[nMax / 2];

int Tatami(int s)
{
    for (int i = 7; i < nMaxSqrt; i += 2)
    {
        int k2 = i + 3;
        int k3 = i + i - 4;
        while ((k2 <= k3) && ((i * k2) < nMax))
        {
            int k4 = (nMax - 1) / i;
            if (k3 < k4)
                k4 = k3;
            for (int j = k2; j <= k4; j += 2)
                ++v[(i * j) / 2];
            k2 += i + 1;
            k3 += i - 1;
        }
    }
    for (int i = 8; i < nMaxSqrt; i += 2)
    {
        int k2 = i + 3;
        int k3 = i + i - 4;
        while ((k2 <= k3) && ((i * k2) < nMax))
        {
            int k4 = (nMax - 1) / i;
            if (k3 < k4)
                k4 = k3;
            for (int j = k2; j <= k4; ++j)
                ++v[(i * j) / 2];
            k2 += i + 1;
            k3 += i - 1;
        }
    }

    for (int i = 0; i < nMax / 2; ++i)
        if (v[i] == s)
            return i + i;
}

int main(int ac, char* av[])
{
    int s = atoi(av[1]);
    printf("T(%d) = %d\n", Tatami(s), s);
}
Despite all of this the task of figuring out the upper bound of V for any given s is still guess work.
Last edited by jcyr on Thu Nov 14, 2019 5:47 am, edited 2 times in total.
It's um...uh...well it's kinda like...and it's got a bit of...

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 3:04 am

A slight improvement for ScriptBasic. Only the extension module code was changed.

No one has download the scriba or ta.so zips. Was that another waste of my time?

Haven't heard much from Heater. Is he still mapping the universe or him barfing on his keyboard take him offline?

Code: Select all

/* Tatami Array Get / Set / Add
UXLIBS: -lm
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "../../basext.h"
#include "cbasic.h"

DIM AS unsigned char v[50000000];

/****************************
 Extension Module Functions
****************************/

besVERSION_NEGOTIATE
  RETURN_FUNCTION((int)INTERFACE_VERSION);
besEND

besSUB_START
  DIM AS long PTR p;
  besMODULEPOINTER = besALLOC(sizeof(long));
  IF (besMODULEPOINTER EQ NULL) THEN_DO RETURN_FUNCTION(0);
  p = (long PTR)besMODULEPOINTER;
  RETURN_FUNCTION(0);
besEND

besSUB_FINISH
  DIM AS long PTR p;
  p = (long PTR)besMODULEPOINTER;
  IF (p EQ NULL) THEN_DO RETURN_FUNCTION(0);
  RETURN_FUNCTION(0);
besEND


/************************
 Tatami array functions
************************/

besFUNCTION(do_v)
  DIM AS long start;
  DIM AS long end;
  DIM AS long value;
  DIM AS long idx;
  besARGUMENTS("iii")
    AT start, AT end, AT value
  besARGEND
  DEF_FOR (idx = start TO idx <= end STEP INCR idx + 2)
  BEGIN_FOR
    v[(value * idx) / 2] += 1;
  NEXT
  besRETURNVALUE = NULL;
besEND 

besFUNCTION(find_s)
  DIM AS long s;
  DIM AS long idx;
  besARGUMENTS("i")
    AT s
  besARGEND
  DEF_FOR (idx = 0 TO idx <= 49999999 STEP INCR idx)
  BEGIN_FOR
    IF (v[idx] == s) THEN
      besRETURN_LONG(idx + idx);
         EXIT_FOR
    END_IF
  NEXT
besEND  
Output (Laptop)

Code: Select all

[email protected]:~/sbrt/examples$ time scriba tatamix.sb 200
The smallest room size s for which T(s) = 200 is 85765680

real	0m4.859s
user	0m4.815s
sys	0m0.044s
[email protected]:~/sbrt/examples$ 
Output (RPI 4B 4GB)

Code: Select all

[email protected]:~/sbrt/examples $ time scriba tatamix.sb 200
The smallest room size s for which T(s) = 200 is 85765680

real	0m23.399s
user	0m23.267s
sys	0m0.121s
[email protected]:~/sbrt/examples $ 
Attachments
ta.so.zip
(1.9 KiB) Downloaded 6 times

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 5:12 am

Tatami Python 200 (optimized)

Output (Laptop)

Code: Select all

[email protected]:~/sbrt/examples$ time ./tatami.py 200
The smallest room size s for which T(s) = 200 is 85765680

real	0m39.373s
user	0m39.358s
sys	0m0.012s
[email protected]:~/sbrt/examples$ 
Output (RPi 4B 4GB)

Code: Select all

[email protected]:~/sbrt/examples $ time ./tatami.py 200
The smallest room size s for which T(s) = 200 is 85765680

real	1m45.614s
user	1m45.450s
sys	0m0.131s
[email protected]:~/sbrt/examples $ 
Last edited by John_Spikowski on Thu Nov 14, 2019 5:40 am, edited 2 times in total.

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 5:36 am

Tatami C (optimized)

Output (Laptop)

Code: Select all

[email protected]:~/sbrt/examples$ time ./tatami 200
T(85765680) = 200

real	0m3.312s
user	0m3.258s
sys	0m0.053s
[email protected]:~/sbrt/examples$ 
Output (RPi 4B 4GB)[/b]

Code: Select all

[email protected]:~/sbrt/examples $ time ./tatami 200
T(85765680) = 200

real	0m10.417s
user	0m10.025s
sys	0m0.392s
[email protected]:~/sbrt/examples $ 

Heater
Posts: 13925
Joined: Tue Jul 17, 2012 3:02 pm

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 5:58 am

John_Spikowski,
Haven't heard much from Heater. Is he still mapping the universe or him barfing on his keyboard take him offline?
Still here, following along intently.

Strangely enough mapping is exactly what I have been doing. Only a small corner of the universe mind.

Was hoping to make a Rust version of one of the recent optimized C tatami codes. That might have to wait till the week end.
Memory in C++ is a leaky abstraction .

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 6:18 am

Thanks for the update. A Rust version would be nice.

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 6:40 am

It might be possible to run the two FOR/NEXT routines that update the V arrary as threads updating their own instance of V and returning the arrays as strings.

The final FOR/NEXT that searches for S could take the two thread array return strings as arguments and combine them before finding the S index.

Wouldn't searching backward from UBOUND towards zero be faster?

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 7:06 am

Removed invalid code.
Last edited by John_Spikowski on Thu Nov 14, 2019 4:01 pm, edited 1 time in total.

jcyr
Posts: 503
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 7:22 am

Code: Select all

[email protected]:~/tatami $ gcc -O3 -march=native -mtune=cortex-a72 -o tatami tatami-with-shift-shorts.c -lm
[email protected]:~/tatami $ time ./tatami 400 1640000000 
T(1639915200) = 400

real    4m33.364s
user    4m29.735s
sys     0m3.551s
It's um...uh...well it's kinda like...and it's got a bit of...

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 7:55 am

Show us the code!

ejolson
Posts: 3831
Joined: Tue Mar 18, 2014 11:47 am

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 9:57 am

John_Spikowski wrote:
Thu Nov 14, 2019 7:06 am
Doing the reverse search shaved off a bit more time.
Searching from large values of s until T(s)=n does not necessarily find the least s that does the same. The fact that the same value for s is returned when n=200 is coincidence, so this approach does not count as working code. To further see this try solving T(s)=1 without changing smax and note that the correct answer of s=70 is not returned.

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 3:55 pm

I'm going to take a shot at doing this in ScriptBasic as a string for V. It won't be as fast a the extension module version but should be able to do 200 native.

jcyr
Posts: 503
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 5:19 pm

Tatami threaded

Code: Select all

[email protected]:~/tatami $ g++ -march=native -mtune=cortex-a72 -O3 -o tatami tatami-threaded.cpp -lm -lpthread
[email protected]:~/tatami $ time ./tatami 400 1640000000                                                      
T(1639915200) = 400

real    3m31.257s
user    6m26.697s
sys     0m3.540s
Tatami

Code: Select all

[email protected]:~/tatami $ gcc -O3 -march=native -mtune=cortex-a72 -o tatami tatami-with-shift-shorts.c -lm
[email protected]:~/tatami $ time ./tatami 400 1640000000 
T(1639915200) = 400

real    4m33.364s
user    4m29.735s
sys     0m3.551s

Code: Select all

// tatami-threaded.cpp

#include <math.h>

#include <iostream>
#include <thread>

#define nMaxDefault 100000000  // default
//#define nMax 85765682        // for T(s) = 200
//#define nMax 1640000000      // for T(s) = 400

unsigned nMax = nMaxDefault;
unsigned nMaxSqrt;
unsigned short* v;

static void odd()
{
    for (int i = 7; i < nMaxSqrt; i += 2)
    {
        int k2 = i + 3;
        int k3 = i + i - 4;
        while ((k2 <= k3) && ((i * k2) < nMax))
        {
            int k4 = (nMax - 1) / i;
            if (k3 < k4)
                k4 = k3;
            for (int j = k2; j <= k4; j += 2)
                ++v[(i * j) / 2];
            k2 += i + 1;
            k3 += i - 1;
        }
    }
}

static void even()
{
    for (int i = 8; i < nMaxSqrt; i += 2)
    {
        int k2 = i + 3;
        int k3 = i + i - 4;
        while ((k2 <= k3) && ((i * k2) < nMax))
        {
            int k4 = (nMax - 1) / i;
            if (k3 < k4)
                k4 = k3;
            for (int j = k2; j <= k4; ++j)
                ++v[(i * j) / 2];
            k2 += i + 1;
            k3 += i - 1;
        }
    }
}

int Tatami(int s)
{
    v = new unsigned short[nMax / 2];
    for (unsigned i; i < (nMax / 2); i++)
        v[i] = 0;
    nMaxSqrt = sqrt(nMax);

    std::thread evenThread(even);
    std::thread oddThread(odd);
    evenThread.join();
    oddThread.join();

    for (int i = 0; i < nMax / 2; ++i)
        if (v[i] == s)
            return i + i;
    return 0; // shouldn't happen
}

int main(int ac, char* av[])
{
    int s = atoi(av[1]);
    if (ac > 2)
        nMax = strtoul(av[2], NULL, 10);
    std::cout << "T(" << Tatami(s) << ") = " << s << std::endl;
}
It's um...uh...well it's kinda like...and it's got a bit of...

User avatar
John_Spikowski
Posts: 1614
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA
Contact: Website Twitter

Re: Liberation through Computer Literacy

Thu Nov 14, 2019 5:24 pm

Tatami threaded
🥳

Thanks!
The pthread_join() function waits for the thread specified by thread to terminate. If that thread has already terminated, then pthread_join() returns immediately. The thread specified by thread must be joinable. If retval is not NULL, then pthread_join() copies the exit status of the target thread (i.e., the value that the target thread supplied to pthread_exit(3)) into the location pointed to by retval. If the target thread was canceled, then PTHREAD_CANCELED is placed in the location pointed to by retval. If multiple threads simultaneously try to join with the same thread, the results are undefined. If the thread calling pthread_join() is canceled, then the target thread will remain joinable (i.e., it will not be detached).

Return to “General programming discussion”