User avatar
PeterO
Posts: 4726
Joined: Sun Jul 22, 2012 4:14 pm

Re: Dynamic sized arrays

Sat Jun 02, 2018 6:30 am

ejolson wrote:
Sat Jun 02, 2018 6:18 am
PeterO wrote:
Mon May 28, 2018 11:42 am
Does anyone have a better solution?
I think you have one too many levels on indirection. Try something like
int (*array)[ys];
and then reference the array as
array[x][y]
using the same syntax as any other two-dimension array.
Most Excellent (and non-bogus) (*) ! :D 8-) :idea:
PeterO

(*) That's a "Bill and Ted" reference (for our younger readers) :lol:
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

ejolson
Posts: 3029
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Sat Jun 02, 2018 6:36 am

GuruMeditation wrote:
Thu May 31, 2018 11:42 pm
Heater wrote:
Thu May 31, 2018 11:16 pm
I think you will find there is no malloc behind C99 arrays. It's all done on the stack.
If so then the feature is really only useful for very small local arrays.
Set your stack size to unlimited using setrlimit to overcome small default stack sizes. Then one can store very large arrays on the stack.

While all stack allocated variables are local, you can always assign their addresses to a global pointer. Then they are local no more.

User avatar
rpdom
Posts: 14417
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: Dynamic sized arrays

Sat Jun 02, 2018 8:04 am

PeterO wrote:
Sat Jun 02, 2018 6:30 am
That's a "Bill and Ted" reference (for our younger readers) :lol:
They'll learn when B&T3 comes out next year :-)

GuruMeditation
Posts: 33
Joined: Thu May 31, 2018 3:33 pm
Location: UK

Re: Dynamic sized arrays

Sat Jun 02, 2018 10:37 am

ejolson wrote:
Sat Jun 02, 2018 6:36 am
Set your stack size to unlimited using setrlimit to overcome small default stack sizes. Then one can store very large arrays on the stack.
That's not exactly a good practice.
ejolson wrote:
Sat Jun 02, 2018 6:36 am
While all stack allocated variables are local, you can always assign their addresses to a global pointer. Then they are local no more.
Data on the stack won't be valid anymore as soon as the function returns. Then you have a pointer to random data on the stack, ready to make your software crash.

jahboater
Posts: 4424
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Sat Jun 02, 2018 12:33 pm

GuruMeditation wrote:
Sat Jun 02, 2018 10:37 am
ejolson wrote:
Sat Jun 02, 2018 6:36 am
Set your stack size to unlimited using setrlimit to overcome small default stack sizes. Then one can store very large arrays on the stack.
That's not exactly a good practice.
The default is 8MB by the way.
GuruMeditation wrote:
Sat Jun 02, 2018 10:37 am
ejolson wrote:
Sat Jun 02, 2018 6:36 am
While all stack allocated variables are local, you can always assign their addresses to a global pointer. Then they are local no more.
Data on the stack won't be valid anymore as soon as the function returns. Then you have a pointer to random data on the stack, ready to make your software crash.
Not necessarily. It is perfectly OK to pass the address of stack allocated variables to a function further down the stack. For example if you allocated a large array in main() you could safely pass a pointer to it to any other function (apart from co-routines I suppose).

GuruMeditation
Posts: 33
Joined: Thu May 31, 2018 3:33 pm
Location: UK

Re: Dynamic sized arrays

Sat Jun 02, 2018 1:15 pm

jahboater wrote:
Sat Jun 02, 2018 12:33 pm
Not necessarily. It is perfectly OK to pass the address of stack allocated variables to a function further down the stack. For example if you allocated a large array in main() you could safely pass a pointer to it to any other function (apart from co-routines I suppose).
Because the function hasn't returned yet and is calling another function. This is not the same as returning or setting a pointer for global use, which is an absolute no-no.

The stack is for temporary, smallish data. Allocating a large amount of data in the stack in main() for essentially global use does not strike me as good practice as they is no need to do that.

Another aspect is that using large variable length arrays on the stack, basically what is proposed here, makes it impossible to work out the appropriate size to reserve for the stack.

In fact, standards like MISRA outright forbid doing that. There is of course no need to follow all these guidelines for desktop software but it's still worth having a thought about these considerations because they are good practices that help produce reliable software.

jahboater
Posts: 4424
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Sat Jun 02, 2018 1:28 pm

I agree.

As for standards, I like the CERT C coding standard.
Beginners should read it.

ejolson
Posts: 3029
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Sat Jun 02, 2018 3:36 pm

GuruMeditation wrote:
Sat Jun 02, 2018 1:15 pm
jahboater wrote:
Sat Jun 02, 2018 12:33 pm
Not necessarily. It is perfectly OK to pass the address of stack allocated variables to a function further down the stack. For example if you allocated a large array in main() you could safely pass a pointer to it to any other function (apart from co-routines I suppose).
Because the function hasn't returned yet and is calling another function. This is not the same as returning or setting a pointer for global use, which is an absolute no-no.

The stack is for temporary, smallish data. Allocating a large amount of data in the stack in main() for essentially global use does not strike me as good practice as they is no need to do that.

Another aspect is that using large variable length arrays on the stack, basically what is proposed here, makes it impossible to work out the appropriate size to reserve for the stack.

In fact, standards like MISRA outright forbid doing that. There is of course no need to follow all these guidelines for desktop software but it's still worth having a thought about these considerations because they are good practices that help produce reliable software.
How is not knowing how much memory is needed for the stack any different than not knowing how much is needed for the heap?

Outside of purpose-built embedded control systems such as those covered by MISRA standards, it is reasonable to write programs that can operate on different sizes of datasets without needing to be recompiled. While you can count the amount of data before hand, check if there is available memory and set limits on the stack or heap accordingly, programs that use too much memory without checking should not crash a modern Linux operating system.

Heater
Posts: 12623
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Sat Jun 02, 2018 6:24 pm

ejolson,
How is not knowing how much memory is needed for the stack any different than not knowing how much is needed for the heap?
It's very different if the objects put in the memory have to live longer than the execution time of the function that creates them.

When a function returns the stuff it put on the stack is gone. The stuff it put on the heap is still there.
Outside of purpose-built embedded control systems such as those covered by MISRA standards, it is reasonable to write programs that can operate on different sizes of datasets without needing to be recompiled.
Indeed it is.

If one is building a dedicated embedded system one probably has control of all other processes on the machine. Hence one can ensure the memory required is available.
...programs that use too much memory without checking should not crash a modern Linux operating system.
Indeed.

Perhaps the Linux out of memory detector finds my program is using too much memory. Then the OOM killer kills my process in order to save the OS.

Fine, except now my system has failed anyway.

jahboater
Posts: 4424
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Sat Jun 02, 2018 7:06 pm

ejolson wrote:
Sat Jun 02, 2018 3:36 pm
How is not knowing how much memory is needed for the stack any different than not knowing how much is needed for the heap?
There are a few differences ...
The stack has a default limit of 8MB. The heap does not of course.
In addition, large heap allocations go in a different segment - they mmap /dev/zero rather than extending the data segment, which avoids RLIMIT_DATA.
Finally, on RISC machines (the Pi) it can take multiple instructions to adjust the stack downwards by a large amount.

ejolson
Posts: 3029
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Sun Jun 03, 2018 4:47 am

Those are all good points.
jahboater wrote:
Sat Jun 02, 2018 7:06 pm
Finally, on RISC machines (the Pi) it can take multiple instructions to adjust the stack downwards by a large amount.
Even so, my understanding is that allocating and freeing memory from the heap is slower than adjusting the stack.
Heater wrote:
Sat Jun 02, 2018 6:24 pm
When a function returns the stuff it put on the stack is gone. The stuff it put on the heap is still there.
The fact that memory is automatically freed when the function returns and the speed at which this happens are the two main features of stack-allocated local variables, whether variable length or anything else.
Heater wrote:
Sat Jun 02, 2018 6:24 pm
Perhaps the Linux out of memory detector finds my program is using too much memory. Then the OOM killer kills my process in order to save the OS.
Either stack or heap allocated memory can trigger an out of memory condition because both are optimistically overcommitted by Linux using the virtual memory address space. In this sense any program running on Linux which allocates virtual address space not backed by physical memory at run time may cause an out of memory condition at some unknown point in the future.

The example given in the original post allocated memory from the heap using malloc and is not about stack-allocated local variables. In that example, pointers to variable-length arrays are used for notational convenience so the standard notation

array[x][y]

works instead of the construct

array[x*ys+y]

as necessary for C++ and earlier C standards.

jahboater
Posts: 4424
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Sun Jun 03, 2018 6:47 am

ejolson wrote:
Sun Jun 03, 2018 4:47 am
Those are all good points.
jahboater wrote:
Sat Jun 02, 2018 7:06 pm
Finally, on RISC machines (the Pi) it can take multiple instructions to adjust the stack downwards by a large amount.
Even so, my understanding is that allocating and freeing memory from the heap is slower than adjusting the stack.
Agreed. Also see alloca(). It is just arithmetic for the stack, whereas the heap is a function call.

User avatar
PeterO
Posts: 4726
Joined: Sun Jul 22, 2012 4:14 pm

Re: Dynamic sized arrays

Sun Jun 03, 2018 7:21 am

ejolson wrote:
Sun Jun 03, 2018 4:47 am
The example given in the original post allocated memory from the heap using malloc and is not about stack-allocated local variables. In that example, pointers to variable-length arrays are used for notational convenience so the standard notation

array[x][y]

works instead of the construct

array[x*ys+y]

as necessary for C++ and earlier C standards.
Which is what the thread was supposed to be about :roll:
PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

GuruMeditation
Posts: 33
Joined: Thu May 31, 2018 3:33 pm
Location: UK

Re: Dynamic sized arrays

Sun Jun 03, 2018 9:19 am

ejolson wrote:
Sat Jun 02, 2018 3:36 pm
Outside of purpose-built embedded control systems such as those covered by MISRA standards, it is reasonable to write programs that can operate on different sizes of datasets without needing to be recompiled.
Just for the record: None of what I wrote suggests or implies that programs would need recompiling to operate on different data sizes. It was about sound programming practices.

Although such practices may be more relaxed in a desktop environment, a lot of those practices are valid in all cases.

Heater
Posts: 12623
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 05, 2018 8:15 am

PeterO,

I just came across an idea for dynamically growing arrays called "stretchy buffers" and thought might be an ideal fit for your problem.

At first glance stretchy buffers looks like just some wrapper code around malloc and realloc but they have some clever tricks up their sleeve which result in some very useful features:

1) Type safety. The arrays managed by stretchy buffers are type safe, no more messing around with casting to and from void* all over the place.

2) Generality. The stretchy buffer code can be used to create dynamic arrays of any type. No more creating the same wrapper over and over for different types.

4) stretchy buffer arrays can be accessed with the normal C array index syntax "myArry[42]" rather than having to access elements through a wrapper function call.

The upshot of all this is that code that uses stretchy buffers can be very neat, like so:

Code: Select all

    // An example type to be used for stretchy buffer elements
    struct point {
        int x;
        int y;
        int z;
    };

    // A stretchy buffer of points
    struct point *points = NULL;

    // Create a lot of points and record then in the points buffer
    int n = 1234;
    for (int i = 0; i < n; i++) {
        struct point newPoint;
        newPoint.x = i;
        newPoint.y = 2 * i;
        newPoint.z = 3 * i;
        buf_push(points, newPoint);   // Note: The first call to buf_push creates the stretchy buffer structure
    }

    // Print all the points in  the buffer
    for (int i = 0; i < n; i++) {
        printf("Point %d: x = %d, y = %d, z = %d \n", i, points[i].x, points[i].y, points[i].z);
    }
    // Note the convenient way normal array indexing is used to access the stretchy buffer

    // Print some buffer statistics
    printf("There are %lu points in the buffer\n", buf_len(points));
    printf("The points buffer has a capacity of %lu\n", buf_cap(points));
    printf("The points buffer is %lu bytes in size\n", buf_sizeof(points));
The full stretchy buffer code is attached here.
Attachments
stretchy-buffers.tgz
(1.59 KiB) Downloaded 20 times

User avatar
PeterO
Posts: 4726
Joined: Sun Jul 22, 2012 4:14 pm

Re: Dynamic sized arrays

Tue Jun 05, 2018 9:52 am

Heater wrote:
Tue Jun 05, 2018 8:15 am
PeterO,
I just came across an idea for dynamically growing arrays called "stretchy buffers" and thought might be an ideal fit for your problem.
As I said some way back up the thread I don't want or need "Dynamic" arrays (I used the wrong word), all I wanted was a clean way to deal with arrays of unknown size at compile time. My code can find out how big the array needs to be at run time and
viewtopic.php?f=33&t=214605#p1322948 gives a better syntax than my first attempt.
PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

Heater
Posts: 12623
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 05, 2018 9:56 am

I found what seems to be the code that originally inspired the code that I attached above.

https://github.com/nothings/stb/blob/ma ... y_buffer.h

The code I posted was from the Ion compiler in Per Vognsen's Bitwise project: https://github.com/pervognsen/bitwise

Heater
Posts: 12623
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 05, 2018 10:57 am

PeterO,
As I said some way back up the thread I don't want or need "Dynamic" arrays (I used the wrong word), all I wanted was a clean way to deal with arrays of unknown size at compile time.
Sorry if I have not followed all the details of this thread exactly.

I think you used the right word. If the size of an array cannot be statically determined at compile time, if it is created at whatever size dynamically at run time then that is a "dynamic" array.

Perhaps you don't need what stretchy buffers offer in your current program but they do solve the problem outlined in your opening post. In a very neat way. And without needing to parse the input twice.

It is a very common situation to be reading data from a file or other source of unknown size and parsing it into data structures of unknown size. That requires dynamic arrays (Unless one limits the maximum size one can handle or hogs all the machines memory). Perhaps the stretchy buffers will be of help to you or anyone else reading this thread with such problems.

I think they are neat anyway :)

jahboater
Posts: 4424
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Tue Jun 05, 2018 11:14 am

Heater wrote:
Tue Jun 05, 2018 10:57 am
I think they are neat anyway :)
I think so too.
But perhaps when you start using things like this, C++ would be cleaner and possibly faster.

Heater
Posts: 12623
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 05, 2018 12:58 pm

Normally now a days I agree. I would just reach for a std::vector.

However, in terms of code "cleanliness" the C and C++ versions of my example above look almost the same. There still places where C is needed, perhaps C++ is not available for the target or the programmer just hates the mind bending complexity of C++ :)

Interestingly the C++ version using std::vector results in an executable size exactly the same as the C version using stretchy buffers. Like so:

Code: Select all

$ cat stretchy-buffers.cpp
#include<stdio.h>
#include <vector>

// An example type to be used for stretchy buffer elements
struct point {
    int x;
    int y;
    int z;
};

int main(int argc, char* argv[]) {

    // A stretchy buffer of points
    std::vector<point> points;

    // Create a lot of points and record then in the points buffer
    int n = 1234;
    for (int i = 0; i < n; i++) {
        struct point newPoint;
        newPoint.x = i;
        newPoint.y = 2 * i;
        newPoint.z = 3 * i;
        points.push_back(newPoint);
    }

    // Print all the points in  the buffer
    for (int i = 0; i < n; i++) {
        printf("Point %d: x = %d, y = %d, z = %d \n", i, points[i].x, points[i].y, points[i].z);
    }
    // Note the convenient way normal array addressing is used to access the stretch buffer

    // Print some buffer statistics
    printf("There are %lu points in the buffer\n", points.size());
    printf("The points buffer has a capacity of %lu\n", points.capacity());
    printf("The points buffer is %lu bytes in size\n", sizeof(point) * points.capacity());
}

$ g++ -Wall -Os -fdata-sections -ffunction-sections stretchy-buffers.cpp -o stretchy-buffers.elf -Wl,--gc-sections -Wl,--strip-all
$ size stretchy-buffers
   text    data     bss     dec     hex filename
   2581     584       8    3173     c65 stretchy-buffers
Hardly seems worth checking the performance, the generated code is probably very similar.

jahboater
Posts: 4424
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Tue Jun 05, 2018 2:11 pm

Heater wrote:
Tue Jun 05, 2018 12:58 pm
However, in terms of code "cleanliness" the C and C++ versions of my example above look almost the same.
Yes indeed. That does seem to be the intention:-

Code: Select all

// This implements an approximation to C++ vector<> for C, in that it
//    provides a generic definition for dynamic arrays which you can
//    still access in a type safe way using arr[i] or *(arr+i). 
The implementation is a bit hairy though, whereas std::vector is provided, is extremely well tested, and possibly recognized by the compiler.
the mind bending complexity of C++
+1 !!

Heater
Posts: 12623
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 05, 2018 8:30 pm

jahboater,

Yes, the implementation looks a bit hairy. I have never looked at it but I suspect the source code of std::vector is a lot more hairy and incomprehensible.

Yes, std::vector is provided, and is extremely well tested. On the other hand that stretchy buffer thing is actually very short and sweet. Everything it can do can be tested easily.

That's before we think about the order of magnitude of complexity required of the compiler to support std::vector. What with templates and all.

Anyway, no matter. I'm intrigued by the fact that the code generated for the stretchy buffer solution in C is exactly, to the byte, the same size as the vector solution in C++.

That makes me suspect the generated code is actually, instruction for instruction, the same.

It would not be the first time I have seen this. In a comparison of a C++ program with classes and the same functionality in C with functions taking a struct pointer parameter.

Perhaps another nail in the coffin of the recurring argument that many have, that C++ creates bloated executables.

When I have a moment I will look into this some more.

jahboater
Posts: 4424
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Wed Jun 06, 2018 6:27 am

Heater wrote:
Tue Jun 05, 2018 8:30 pm
Yes, the implementation looks a bit hairy. I have never looked at it but I suspect the source code of std::vector is a lot more hairy and incomprehensible.
Oh yes definitely. But that is someone else's worry!
Perhaps another nail in the coffin of the recurring argument that many have, that C++ creates bloated executables.
I think that the more information that the compiler has about the real purpose of the code, the more chance it has to optimize. Otherwise it either has to recognize common idioms or try and deduce whats going on. So the greater expressive power of C++ should help.

ejolson
Posts: 3029
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Wed Jun 06, 2018 7:00 am

jahboater wrote:
Wed Jun 06, 2018 6:27 am
I think that the more information that the compiler has about the real purpose of the code, the more chance it has to optimize.
That makes sense.

Note that std::vector and the stretchy buffers are implemented as libraries in C++ whereas variable length arrays are built into the compiler of the C language. Similar is the case with complex variables: they are built into the compiler of the C language, but provided only through an object library in C++.

What both C and C++ are missing are portable built-in vector operations.

Heater
Posts: 12623
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Wed Jun 06, 2018 11:24 am

Note that std::vector and the stretchy buffers are not anything like the variable length arrays in C.

vectors and buffers are allocated from the heap, variable length arrays are on the stack.
vectors and buffers can be shrunk variable length arrays cannot.
vectors and buffers live as long as you like variable length arrays only exist for the life of the scope they are in.

Return to “C/C++”