jahboater
Posts: 4838
Joined: Wed Feb 04, 2015 6:38 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 9:53 am

Another question:

Without using isapha(), how would you reduce this "if" to a single test instead of four?

Code: Select all

#include <stdio.h>

int
main( int argc, char *argv[] )
{
  int ch = getchar();

  if( (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') )
    printf( "is alpha" );
}
Not saying you should do this in real code!

User avatar
rpdom
Posts: 15583
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 10:31 am

A nasty way of doing it would be to build an array of char (0 to 255 or -128 to 127 depending on the system) with each element set to 0 or 1, then use something like

Code: Select all

if( validchars[ch] == 1 )
    printf( "is alpha" );
Sometimes a lookup table is the fastest way to do things. It will only take 256 bytes of memory.

1dot0
Posts: 430
Joined: Mon Nov 28, 2016 12:31 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 10:41 am

rpdom wrote:A nasty way of doing it would be to build an array of char (0 to 255 or -128 to 127 depending on the system) with each element set to 0 or 1, then use something like

Code: Select all

if( validchars[ch] == 1 )
    printf( "is alpha" );
Sometimes a lookup table is the fastest way to do things. It will only take 256 bytes of memory.
haha, in that case one had to perform 127-255 comparisons before, i.e. for building the table - instead of just 4 :lol:

User avatar
rurwin
Forum Moderator
Forum Moderator
Posts: 4258
Joined: Mon Jan 09, 2012 3:16 pm
Contact: Website

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 10:44 am

Interesting fact: isalpha() is not a valid replacement for that code and might even crash.

Certain C libraries (<achem>Microsoft</achem>) throw an exception if they are given a negative argument. If you check the BSD sources you will find a comment to the effect that this might be a good idea. At least one compiler manufacturer has done so. getchar() returns EOF if it encounters an end-of-file, and that is usually (always?) -1.

This can be reduced to a single line, but I've left is as three in a function for clarity with the test code for proof.

Code: Select all

#include <stdio.h>
#define bool int

bool isalph(unsigned int c)
{
    unsigned int a = (c-1)^64;
    unsigned int b = a & ~0x20;
    return b < 26;
}
    
int main(int argc, char**argv)
{
	int i, j;
    for (i = -16; i < 272; i+=16)
    {
        printf("\n%03d: ", i);
        for (j = 0; j < 16; j++)
        {
            if (isalph(i+j))
            {
                if (i+j < 33 && i + j != 127)
                    printf("+");
                else
                    printf("%c", i+j);
            }
			else
				printf(" ");
        }
    }
    printf ("\n");
}

User avatar
rpdom
Posts: 15583
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 10:58 am

1dot0 wrote:
rpdom wrote:A nasty way of doing it would be to build an array of char (0 to 255 or -128 to 127 depending on the system) with each element set to 0 or 1, then use something like

Code: Select all

if( validchars[ch] == 1 )
    printf( "is alpha" );
Sometimes a lookup table is the fastest way to do things. It will only take 256 bytes of memory.
haha, in that case one had to perform 127-255 comparisons before, i.e. for building the table - instead of just 4 :lol:
No, just hard code it in the source. That will require no processing to build the table at runtime. It may even be possible to do it as a macro, or have it in a previously generated source file.

User avatar
rurwin
Forum Moderator
Forum Moderator
Posts: 4258
Joined: Mon Jan 09, 2012 3:16 pm
Contact: Website

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 11:12 am

rpdom wrote:
1dot0 wrote:
rpdom wrote:A nasty way of doing it would be to build an array of char (0 to 255 or -128 to 127 depending on the system) with each element set to 0 or 1, then use something like

Code: Select all

if( validchars[ch] == 1 )
    printf( "is alpha" );
Sometimes a lookup table is the fastest way to do things. It will only take 256 bytes of memory.
haha, in that case one had to perform 127-255 comparisons before, i.e. for building the table - instead of just 4 :lol:
No, just hard code it in the source. That will require no processing to build the table at runtime. It may even be possible to do it as a macro, or have it in a previously generated source file.
That gives an array access out of bounds at end-of-file but a dirty +1 would fix it.

User avatar
jojopi
Posts: 3085
Joined: Tue Oct 11, 2011 8:38 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 11:18 am

jahboater wrote:Without using isapha(), how would you reduce this "if" to a single test instead of four?
Simply by compiling it:

Code: Select all

        bic     r0, r0, #32
        sub     r0, r0, #65
        cmp     r0, #25
Note that subtracting 65 is an improvement over rurwin's solution.

1dot0
Posts: 430
Joined: Mon Nov 28, 2016 12:31 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 12:37 pm

rpdom wrote:
1dot0 wrote:
rpdom wrote:A nasty way of doing it would be to build an array of char (0 to 255 or -128 to 127 depending on the system) with each element set to 0 or 1, then use something like

Code: Select all

if( validchars[ch] == 1 )
    printf( "is alpha" );
Sometimes a lookup table is the fastest way to do things. It will only take 256 bytes of memory.
haha, in that case one had to perform 127-255 comparisons before, i.e. for building the table - instead of just 4 :lol:
No, just hard code it in the source. That will require no processing to build the table at runtime. It may even be possible to do it as a macro, or have it in a previously generated source file.
yes, of course, that's what I meant: for hardcoding the lookup table :D

1dot0
Posts: 430
Joined: Mon Nov 28, 2016 12:31 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 12:39 pm

jojopi wrote:
jahboater wrote:Without using isapha(), how would you reduce this "if" to a single test instead of four?
Simply by compiling it:

Code: Select all

        bic     r0, r0, #32
        sub     r0, r0, #65
        cmp     r0, #25
Note that subtracting 65 is an improvement over rurwin's solution.
assembler is cheating! :D

1dot0
Posts: 430
Joined: Mon Nov 28, 2016 12:31 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 12:44 pm

using toupper(c) at least would reduce the comparisons to just 2, i.e. >='A' and <='Z' :geek:

jahboater
Posts: 4838
Joined: Wed Feb 04, 2015 6:38 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 1:09 pm

jojopi wrote:
jahboater wrote:Without using isapha(), how would you reduce this "if" to a single test instead of four?
Simply by compiling it:

Code: Select all

        bic     r0, r0, #32
        sub     r0, r0, #65
        cmp     r0, #25
Note that subtracting 65 is an improvement over rurwin's solution.
Yes!
and in 64-bit mode it is "and w0,w0,-33" instead of the bic (bit clear).

Note that it ignores the short circuit semantics of ||, but that doesn't matter, its always correct.
It can handle any integer value, for example characters from ncurses getch() which can be larger than 255 as well as negative values. Any table lookup solution will need extra code to avoid an out of bounds access. Furthermore a table lookup involves a memory read which is very slow and involves even more code to load the table start address (on the Pi). The above solution is all done in one single register, no memory references, and three instructions.

I was hoping no one would use the compiler!
I was wondering if any of the experienced programmers here could find a better solution to the compiler generated code "if( (ch & -33) - 64 < 26 )" or even match it. It supports my observation that for day to day (not SIMD) code, the compiler is often better than a human, there is no point in assembler optimization, leave it in readable C!

User avatar
rurwin
Forum Moderator
Forum Moderator
Posts: 4258
Joined: Mon Jan 09, 2012 3:16 pm
Contact: Website

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 2:13 pm

jojopi wrote:Note that subtracting 65 is an improvement over rurwin's solution.
I'm happy enough that I got as close as one instruction to the optimum. It reminds me of the Two Laws of Optimisation:
  • Don't do it.
  • (For experts only) Don't do it now.
I've also seen horrible code coming out of a compiler (compiling for debug, so limited optimisation). It took

Code: Select all

array[big expression] ++;
And calculated the expression twice -- the whole reason a lot of people would tell you for the ++ operator. (It still avoids code duplication, so it's still useful.)

jahboater
Posts: 4838
Joined: Wed Feb 04, 2015 6:38 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 3:02 pm

rurwin wrote:
jojopi wrote:I've also seen horrible code coming out of a compiler (compiling for debug, so limited optimisation).
Yes, its only fair to turn optimization on, use the latest version of the compiler, and target the right CPU, if comparing human to machine!

1dot0
Posts: 430
Joined: Mon Nov 28, 2016 12:31 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 4:22 pm

jahboater wrote:
jojopi wrote:
jahboater wrote:Without using isapha(), how would you reduce this "if" to a single test instead of four?
Simply by compiling it:

Code: Select all

        bic     r0, r0, #32
        sub     r0, r0, #65
        cmp     r0, #25
Note that subtracting 65 is an improvement over rurwin's solution.
Yes!
and in 64-bit mode it is "and w0,w0,-33" instead of the bic (bit clear).

Note that it ignores the short circuit semantics of ||, but that doesn't matter, its always correct.
It can handle any integer value, for example characters from ncurses getch() which can be larger than 255 as well as negative values. Any table lookup solution will need extra code to avoid an out of bounds access. Furthermore a table lookup involves a memory read which is very slow and involves even more code to load the table start address (on the Pi). The above solution is all done in one single register, no memory references, and three instructions.

I was hoping no one would use the compiler!
I was wondering if any of the experienced programmers here could find a better solution to the compiler generated code "if( (ch & -33) - 64 < 26 )" or even match it. It supports my observation that for day to day (not SIMD) code, the compiler is often better than a human, there is no point in assembler optimization, leave it in readable C!
what does this (ch & -33) in if( (ch & -33) - 64 < 26 ) do?

User avatar
Paeryn
Posts: 2744
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 5:00 pm

1dot0 wrote:what does this (ch & -33) in if( (ch & -33) - 64 < 26 ) do?
It clears bit 5 so effectively converts a lower case letter into an upper case letter. Though I prefer ~32 to -33, it looks cleaner to my eye.

Also shouldn't the C version be

Code: Select all

if ( ((unsigned int)((ch & -33) - 65)) < 26)
to make -ve values after the subtraction into high +ve values.
She who travels light — forgot something.

jahboater
Posts: 4838
Joined: Wed Feb 04, 2015 6:38 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 5:32 pm

Paeryn wrote:Also shouldn't the C version be

Code: Select all

if ( ((unsigned int)((ch & -33) - 65)) < 26)
to make -ve values after the subtraction into high +ve values.
Agreed.

1dot0
Posts: 430
Joined: Mon Nov 28, 2016 12:31 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 5:58 pm

Paeryn wrote:
1dot0 wrote:what does this (ch & -33) in if( (ch & -33) - 64 < 26 ) do?
It clears bit 5 so effectively converts a lower case letter into an upper case letter. Though I prefer ~32 to -33, it looks cleaner to my eye.
so not obfuscated:
if( (toupper(ch) -65 <26) )
because 'A' is 65 and 'Z' is 65+25

but what about ASCII(32)=' ' ... ASCII(64)='@''? they are negative then after -65, but still true for <26 though?

User avatar
Paeryn
Posts: 2744
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 6:16 pm

1dot0 wrote:so not obfuscated:
if( (toupper(ch) -65 <26) )
because 'A' is 65 and 'Z' is 65+25

but what about ASCII(32)=' ' ... ASCII(64)='@'? they are negative then, but still true for <26 though?
They would end up -ve which is why I added the cast to unsigned, that makes -ve numbers be interpretted as very high +ve numbers and hence stops the comparison from incorrectly registering true.

Using toupper() sort of doesn't qualify for this exercise as it necessitates extra tests to make sure it only affects letters and the exercise was to do it with only one test.
She who travels light — forgot something.

User avatar
PeterO
Posts: 5138
Joined: Sun Jul 22, 2012 4:14 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 7:54 pm

I've been busy all day at TNMOC so missed this fun :-)
PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

jahboater
Posts: 4838
Joined: Wed Feb 04, 2015 6:38 pm

Re: How would you ...... in C ? (+ other C discussions)

Sat Jul 15, 2017 8:27 pm

1dot0 wrote: but what about ASCII(32)=' ' ... ASCII(64)='@''? they are negative then after -65, but still true for <26 though?
No. If its treated as unsigned then any negative signed number becomes a very large. So it works.

As a simpler example, consider:

int n;
if( n >= 0 && n < 20 )

The compiler will omit the "n >= 0" test, because -5 (say) is 0xFFFFFFFB, definitely > 20!

jahboater
Posts: 4838
Joined: Wed Feb 04, 2015 6:38 pm

Re: How would you ...... in C ? (+ other C discussions)

Sun Jul 16, 2017 4:55 am

rurwin wrote: I've also seen horrible code coming out of a compiler (compiling for debug, so limited optimisation). It took

Code: Select all

array[big expression] ++;
And calculated the expression twice -- the whole reason a lot of people would tell you for the ++ operator. (It still avoids code duplication, so it's still useful.)
Please could you post a small example where this happens?

If I can reproduce it, I will raise it as a bug with GCC.

User avatar
rurwin
Forum Moderator
Forum Moderator
Posts: 4258
Joined: Mon Jan 09, 2012 3:16 pm
Contact: Website

Re: How would you ...... in C ? (+ other C discussions)

Sun Jul 16, 2017 8:23 am

It was the Microsoft C compiler. And it was many years ago -- probably the version just before it became Visual C.

And it was not a bug. The compiler is allowed to use any translation it likes with the same effect. The expression it doubled had no side effects.

jahboater
Posts: 4838
Joined: Wed Feb 04, 2015 6:38 pm

Re: How would you ...... in C ? (+ other C discussions)

Sun Jul 16, 2017 8:37 am

rurwin wrote:It was the Microsoft C compiler. And it was many years ago -- probably the version just before it became Visual C.
OK.
Compilers have improved over the (20+?) years.
You should take a look at the code produced by a recent GCC - its very unusual to see unexplained extra instructions, and from experience, when you take the time to try and fully understand whats going on, you eventually realize the compiler was right!
And it was not a bug. The compiler is allowed to use any translation it likes with the same effect.
It is a "missed optimization" bug, and since a large expression was duplicated, perhaps higher than normal severity.
If GCC ever produced code like you describe (which it doesn't) there would be a huge fuss and many issues would be raised!

LdB
Posts: 1317
Joined: Wed Dec 07, 2016 2:29 pm

Re: How would you ...... in C ? (+ other C discussions)

Sun Jul 16, 2017 3:39 pm

You need to get out more GCC in general is horrible at optimization with structures, I truely get sick of holding it's stupid hand. It misses them continously and I don't care what version you use. It has been raised countless times and never fixed and we are talking at least 10 years.

Try this one which is standard GPIO code for a Pi.

What causes the problem is I am carrying the IO base address in memory and it needs to make a volatile pointer to GPIO->GPSET and GPIO->GPCLR and usually it's too stupid to work out they are just offsets off the same base address.

Please don't tell me to make it a constant it's a memory value because my code auto-detects the base address so it can run on any model Pi ... so no changing the problem to get an optimization. What I am interested in is the code for gpio_output

Code: Select all

#include <stdint.h>
/*--------------------------------------------------------------------------}
{    RASPBERRY PI GPIO HARDWARE REGISTERS - BCM2835.PDF Manual Section 6	}
{--------------------------------------------------------------------------*/
struct __attribute__((__packed__, aligned(4))) GPIORegisters {
	volatile uint32_t GPFSEL[6];									// 0x00  GPFSEL0 - GPFSEL[5]
	uint32_t reserved1;												// 0x18  reserved
	volatile uint32_t GPSET[2];										// 0x1C  GPSET0 - GPSET1;
	uint32_t reserved2;												// 0x24  reserved
	volatile uint32_t GPCLR[2];										// 0x28  GPCLR0 - GPCLR1
	uint32_t reserved3;												// 0x30  reserved
	const volatile uint32_t GPLEV[2];								// 0x34  GPLEV0 - GPLEV1   ** Read only hence const
	uint32_t reserved4;												// 0x3C  reserved
	volatile uint32_t GPEDS[2];										// 0x40  GPEDS0 - GPEDS1 
	uint32_t reserved5;												// 0x48  reserved
	volatile uint32_t GPREN[2];										// 0x4C  GPREN0 - GPREN1;	 
	uint32_t reserved6;												// 0x54  reserved
	volatile uint32_t GPFEN[2];										// 0x58  GPFEN0 - GPFEN1;
	uint32_t reserved7;												// 0x60  reserved
	volatile uint32_t GPHEN[2];										// 0x64  GPHEN0 - GPHEN1;
	uint32_t reserved8;												// 0x6c  reserved
	volatile uint32_t GPLEN[2];										// 0x70  GPLEN0 - GPLEN1;
	uint32_t reserved9;												// 0x78  reserved
	volatile uint32_t GPAREN[2];									// 0x7C  GPAREN0 - GPAREN1;
	uint32_t reserved10;											// 0x84  reserved
	volatile uint32_t GPAFEN[2]; 									// 0x88  GPAFEN0 - GPAFEN1;
	uint32_t reserved11;											// 0x90  reserved
	volatile uint32_t GPPUD; 										// 0x94  GPPUD 
	volatile uint32_t GPPUDCLK[2]; 									// 0x98  GPPUDCLK0 - GPPUDCLK1;
};

uint32_t RPi_IO_Base_Addr = 0x3F000000;

#define GPIO ((volatile __attribute__((aligned(4))) struct GPIORegisters*) (RPi_IO_Base_Addr + 0x200000))

bool gpio_output(int gpio, bool on) {
	if (gpio < 0 || gpio > 54) return false;						// Check GPIO pin number valid, return false if invalid
	uint32_t bit = 1 << (gpio % 32);								// Create mask bit
	if (on) {														// ON request
		GPIO->GPSET[gpio / 32] = bit;								// Set bit to make GPIO high output
	} else {
		GPIO->GPCLR[gpio / 32] = bit;								// Set bit to make GPIO low output
	}
	return true;													// Return true
}
Most times on most distros it generates this sort of thing with -02 .. love the double load of the address and it's to stupid to compact the different branches.

Code: Select all

  58              		.fpu neon-vfpv4
  59              		.type	gpio_output, %function
  60              	gpio_output:
  61              		@ args = 0, pretend = 0, frame = 0
  62              		@ frame_needed = 0, uses_anonymous_args = 0
  63              		@ link register save eliminated.
  64 0000 360050E3 		cmp	r0, #54
  65 0004 1600008A 		bhi	.L12
  66 0008 1F3000E2 		and	r3, r0, #31
  67 000c 0120A0E3 		mov	r2, #1
  68 0010 000051E3 		cmp	r1, #0
  69 0014 12C3A0E1 		lsl	ip, r2, r3
  70 0018 0800001A 		bne	.L13
  71 001c 001000E3 		movw	r1, #:lower16:RPi_IO_Base_Addr
  72 0020 001040E3 		movt	r1, #:upper16:RPi_IO_Base_Addr
  73 0024 C032A0E1 		asr	r3, r0, #5
  74 0028 0200A0E1 		mov	r0, r2
  75 002c 0A3083E2 		add	r3, r3, #10
  76 0030 002091E5 		ldr	r2, [r1]
  77 0034 022682E2 		add	r2, r2, #2097152
  78 0038 03C182E7 		str	ip, [r2, r3, lsl #2]
  79 003c 1EFF2FE1 		bx	lr
  80              	.L13:
  81 0040 002000E3 		movw	r2, #:lower16:RPi_IO_Base_Addr
  82 0044 002040E3 		movt	r2, #:upper16:RPi_IO_Base_Addr
  83 0048 C032A0E1 		asr	r3, r0, #5
  84 004c 0100A0E1 		mov	r0, r1
  85 0050 002092E5 		ldr	r2, [r2]
  86 0054 033182E0 		add	r3, r2, r3, lsl #2
  87 0058 023683E2 		add	r3, r3, #2097152
  88 005c 1CC083E5 		str	ip, [r3, #28]
  89 0060 1EFF2FE1 		bx	lr
  90              	.L12:
  91 0064 0000A0E3 		mov	r0, #0
  92 0068 1EFF2FE1 		bx	lr
  93              		.size	gpio_output, .-gpio_output
Every now and again on a distro you can convince it to at least load a single load the address but then it still keeps the two branches.

Code: Select all

gpio_output(int, bool):
        cmp     r0, #54
        bhi     .L4
        mov     r2, #1
        and     r3, r0, #31
        cmp     r1, #0
        lsl     ip, r2, r3
        bne     .L6
        ldr     r1, .L7
        asr     r3, r0, #5
        ldr     r1, [r1]
        add     r3, r3, #10
        add     r1, r1, #2097152
        mov     r0, r2
        str     ip, [r1, r3, lsl #2]
        bx      lr
.L6:
        ldr     r2, .L7
        asr     r3, r0, #5
        ldr     r0, [r2]
        add     r3, r3, #6
        add     r0, r0, #2097152
        add     r3, r0, r3, lsl #2
        str     ip, [r3, #4]
        mov     r0, r1
        bx      lr
.L4:
        mov     r0, #0
        bx      lr
.L7:
        .word   .LANCHOR0
RPi_IO_Base_Addr:
        .word   1056964608
Now feed the same code into CLang 4.0 (I dont have ARM one setup at home but 386 code will show same). One register load compacted out of two full branches to a single.

Code: Select all

gpio_output(int, bool):                      # @gpio_output(int, bool)
        cmp     edi, 54
        jbe     .LBB0_2
        xor     eax, eax
        ret
.LBB0_2:
        mov     eax, 1
        mov     ecx, edi
        shl     eax, cl
        mov     ecx, 2097152
        add     ecx, dword ptr [rip + RPi_IO_Base_Addr]
        lea     rdx, [rcx + 40]
        add     rcx, 28
        test    sil, sil
        cmove   rcx, rdx
        shr     edi, 5
        mov     dword ptr [rcx + 4*rdi], eax
        mov     al, 1
        ret

RPi_IO_Base_Addr:
        .long   1056964608              # 0x3f000000
So please don't tell me GCC is good at optimization compared to CLang it's horrible. There is nothing wrong with GCC code it's just a million miles from optimized because of the struct and the struct isn't even misaligned (it gets worse with that).

1dot0
Posts: 430
Joined: Mon Nov 28, 2016 12:31 pm

Re: How would you ...... in C ? (+ other C discussions)

Sun Jul 16, 2017 4:29 pm

jahboater wrote:
1dot0 wrote: but what about ASCII(32)=' ' ... ASCII(64)='@''? they are negative then after -65, but still true for <26 though?
No. If its treated as unsigned then any negative signed number becomes a very large. So it works.

As a simpler example, consider:

int n;
if( n >= 0 && n < 20 )

The compiler will omit the "n >= 0" test, because -5 (say) is 0xFFFFFFFB, definitely > 20!
hell, that is really f*cking sophicated! :shock: :ugeek: 8-)

Return to “C/C++”