jahboater
Posts: 2939
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Tue Jun 12, 2018 8:12 pm

Valgrind is good at detecting memory leaks.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 12, 2018 8:18 pm

Yep.

The "sanitizers" that came with Clang/LLVM and I believe now with GCC are even better. Detecting memory leaks, use after free and so on.

jahboater
Posts: 2939
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Tue Jun 12, 2018 8:21 pm

Heater wrote:
Tue Jun 12, 2018 8:18 pm
Yep.

The "sanitizers" that came with Clang/LLVM and I believe now with GCC are even better. Detecting memory leaks, use after free and so on.
You mean "-fsanitize=leak" for gcc.
I run code through both, just to be on the safe side!

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 12, 2018 8:35 pm

Yeah. I think so. I have only ever used the sanitizers with Clang.

Some years ago I needed to find a memory leak one of our contractors had left us with. Valgrind was so slow and memory hungry it was not help.

Of course the leak only showed up on our embedded system target. So perhaps the modern sanitizers would not have worked their either.

Note: Never higher app developers that can't count bytes and cycles.

jahboater
Posts: 2939
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Tue Jun 12, 2018 8:47 pm

Yes, the code runs very fast with the sanitizers on. Valgrind is slow because it checks every single instruction. This causes problems on CISC machines (x86) because its not practical to keep up with all the new complex instructions, so some are not recognized.
For the gcc sanitizers I use:-

Code: Select all

SAN = -fno-sanitize-recover=all          \
      -fsanitize=undefined               \
      -fsanitize=address                 \
      -fsanitize-address-use-after-scope \
      -fsanitize=leak                    \
      -fsanitize=bounds                  \
      -fsanitize=bounds-strict           \
      -fsanitize=integer-divide-by-zero  \
      -fsanitize=float-divide-by-zero    \
      -fsanitize=float-cast-overflow     \
      -fsanitize=unreachable             \
      -fsanitize=vla-bound               \
      -fsanitize=null                    \
      -fsanitize=signed-integer-overflow \
      -fsanitize=object-size             \
      -fsanitize=bool                    \
      -fsanitize=enum                    \
      -fsanitize=return                  \
      -fsanitize=shift                   \
      -fsanitize=alignment               \
      -fsanitize=builtin                 \
      -fsanitize=pointer-compare         \
      -fsanitize=pointer-subtract        \
      -fsanitize=pointer-overflow
-fsanitize=vla-bound
is relevant to this thread.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 12, 2018 9:02 pm

Why don't we just use Ada?

Ada checks for all of that stuff out of the box. No extra tools required.

ejolson
Posts: 1903
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Wed Jun 13, 2018 3:46 am

Heater wrote:
Tue Jun 12, 2018 9:02 pm
Why don't we just use Ada?
Because this is the C/C++ forum?
Heater wrote:
Tue Jun 12, 2018 7:46 pm
This is a deep misunderstanding.
If a program (or constructor) allocates System V shared memory, it will not be reclaimed by the operating system upon program termination. Please look at shmget, shmat and shmctl for more information. I think POSIX shared memory has similar properties.

A program that is supposed to run forever shouldn't continually malloc and free because the heap may become fragmented and trigger an out of memory condition anyway. There are a surprising number of apparent memory leaks that result from such issues.

If static memory allocation is not sufficient and you need to allocate memory for a task that won't be needed after the task if done, the memory should be allocated on the stack. This is exactly what is going on with the array s[n] in the row-major computation of the matrix 1-norm. Allocating from the stack is important for long running code but even more important for multi-threaded code because each thread has its own stack whereas allocations from the heap, in addition to being slower, pass through a global lock.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Wed Jun 13, 2018 5:21 am

ejolson,
If a program (or constructor) allocates System V shared memory, it will not be reclaimed by the operating system upon program termination. Please look at shmget, shmat and shmctl for more information. I think POSIX shared memory has similar properties.
Quite so. Similar to how when a program creates a file on disk one expects that file to be there after the program terminates. No surprises there. Been there, done that.

We have been talking about regular memory allocation though so I don't see the relevance.
A program that is supposed to run forever shouldn't continually malloc and free...
Perhaps you claim they should not but they do. Million, billions, of them running on servers all around the world all day every day. From web servers to your banks databases. Frantically mallocing and freeing, newing and deleting, memory all the time as they get work to do.

A lot of languages do all this without the programmer having to think much about allocating and deallocating: Java, C#, Python, Javascript, Ruby, C++ and many others.
...because the heap may become fragmented
And so it does. Fragmentation can make it seem that you are using more memory than you think. Strangely enough that is not such a big problem. All those billions of servers keep running...
...and trigger an out of memory condition anyway.
It happens. Steps are taken to ensure out of memory conditions are not reached.
There are a surprising number of apparent memory leaks that result from such issues.
Indeed there are. Everyone does it though. That is why we have tools like valgrind and the sanitizers, to detect memory leaks. Higher level languages like Javascript, Python, Java etc take care of this for you (mostly)
If static memory allocation is not sufficient and you need to allocate memory for a task that won't be needed after the task if done, the memory should be allocated on the stack. This is exactly what is going on with the array s[n] in the row-major computation of the matrix 1-norm
Except that is not what happens most of the time.
Allocating from the stack is important for long running code but even more important for multi-threaded code because each thread has its own stack whereas allocations from the heap, in addition to being slower, pass through a global lock.
Threading and mutual exclusion is a whole other can of worms. If you are using a global lock for all the data shared between threads you are doing it wrong and your performance will be terrible. I have plenty of threaded code that shares data with no locks at all. See "lock free programming" http://preshing.com/20120612/an-introdu ... ogramming/

Allocating and deallocating memory may be slower than just using the stack. In practice that is not a huge concern. If 90% of your time is spent processing that data then the 10% overhead of allocation and deallocation is hardly worth worrying about. The matnormal example is such a case for large matrices. On the other hand if you are processing millions of smaller matrices then perhaps the overheads of memory allocation dominate and it's time to think about not doing that.

jahboater
Posts: 2939
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Wed Jun 13, 2018 7:06 am

Heater wrote:
Wed Jun 13, 2018 5:21 am
A program that is supposed to run forever shouldn't continually malloc and free...
...because the heap may become fragmented
And so it does. Fragmentation can make it seem that you are using more memory than you think. Strangely enough that is not such a big problem. All those billions of servers keep running...
Heap fragmentation in long running programs is something I used to worry about long ago.
I am not sure it is such a problem now - malloc is mature and sophisticated.
Anyway, for data items that are constantly being allocated and free'd over and over again, and just happen to include handy pointer (which is common):-

Code: Select all

typedef struct verb
{
  struct verb *next; 
  ..........
  ..........
}
VERB;
you can simply stick them on a free list instead if calling free(), then instead of malloc(), just pop it off the list, a couple of instructions (or call malloc if the list is empty).
Much faster and zero fragmentation.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Wed Jun 13, 2018 12:39 pm

I sympathize. Back in the day, when I did a lot of work on embedded systems that had little memory, little performance and real-time garantees were esssential, I though dynamic memory allocation was nuts and had no use for it. We did not have the time for all that allocating and deallocating, we could not tolerate the erratic timing that introduced. Such systems should run forever so one had to be sure memory would never run out, in which case you need to know where you memory is from the get go and put things in it where you like. Everything statically allocated.
..you can simply stick them on a free list instead if calling free(), then instead of malloc(), just pop it off the list,...
Yes, there is sometimes a lot to be gained by managing memory oneself.

Problem is, for a lot of jobs, a web server might be a canonical example, you don't have nice neat fixed sized objects that can be packed into arrays or whatever. You have no idea how many clients will be connected at any time, you have no idea what size their requests will be, you have no idea how big the responses will be. Managing all that manually in your program would be a nightmare of complexity. Meh, just abstract that away to the programming language and suck up the drop in efficiency.

jahboater
Posts: 2939
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Wed Jun 13, 2018 3:00 pm

Heater wrote:
Wed Jun 13, 2018 12:39 pm
Problem is, for a lot of jobs, a web server might be a canonical example, you don't have nice neat fixed sized objects that can be packed into arrays or whatever. You have no idea how many clients will be connected at any time, you have no idea what size their requests will be, you have no idea how big the responses will be. Managing all that manually in your program would be a nightmare of complexity. Meh, just abstract that away to the programming language and suck up the drop in efficiency.
Yes, definitely. I only suggested the simple case above because the coding complexity is near zero for a good gain.

I once wrote a network switching system, with store and forward, routing etc. It had to be very fast and stay up indefinitely. The messages could of arbitrary size. After trying a couple of private memory allocation schemes, I gave up and used plain malloc() - which performed brilliantly! Lesson learned.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Wed Jun 13, 2018 7:30 pm

I guess we are on the same page then.

ejolson
Posts: 1903
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Thu Jun 14, 2018 8:06 pm

Heater wrote:
Wed Jun 13, 2018 7:30 pm
I guess we are on the same page then.
One can get lucky with malloc and free because not all usage patterns lead to fragmentation. My suspicion is that many of those good usage patterns are, in fact, stack oriented and there is no need to perform them using a heap.

Serving webpages is not an example of a long running process. Originally NCSA httpd could be run from inetd.conf with a new process spawned for every request. With all the patches that led to the Apache server, this is no longer the case. However, the program is now structured in such a way that a certain number of child processes are forked to serve the requests. Each child serves MaxRequestsPerChild or MaxConnectionsPerChild before it dies, presumably to avoid progressive heap fragmentation. Note that the current default setting for MaxConnectionsPerChild is to never die, which indicates that in typical cases the heap fragmentation issue may be solved for long running Apache child processes.

An example of heap fragmentation appears with Gnome desktop applications and services. Apparently the way glibc uses malloc and free quickly causes heap fragmentation leading to programs that consume much more memory than needed. There had been talk about mitigating some of the issues with a custom memory manager that tries to free unused blocks in the heap back to the operating system by unmapping them. I don't know what the status of that is; maybe trimming the heap was enough. Still, I recently read a report on a heap-fragmentation related bug with a long running process in either KDE or Gnome within in the last year. Though I looked, I unfortunately can't find the link.

While we have strayed a little from the original topic of variable length arrays in C, it is worth observing that variable length arrays in C can be allocated on the heap as well as the stack. In particular, allocating a variable length array on the heap was demonstrated in the original C version of the matrix 1-norm program for the matrix A in the main routine.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Thu Jun 14, 2018 11:59 pm

ejolson,

I think I agree with a lot you say there. Except for this:
Serving web pages is not an example of a long running process.
It most certainly is. As far as I understand the history of such things it goes like this:

1) Original web servers crudely forked a new process to handle every incoming request. As such whatever memory those processes malloced was freed by the OS when the request was done and the process terminated. Even if the request handler process was to lazy to do it my itself.

2) Later other web servers created a new thread for each request. Much more light weight and efficient than creating a whole process. Basically the same as above when it comes to memory allocation. Correct me if I am wrong.

So far so good. The web server itself could run forever, never mind what happened to the processes/threads it spawned.

3) Modern web servers do not create any process or thread to handle a request. They use the event driven programming model. Even more efficient. Think nginx or a node.js server.

As such, we expect the webserver, a single process/thread, to run forever.

Luckily, my nginx and node.js web servers have been doing exactly that for some years now.
...it is worth observing that variable length arrays in C can be allocated on the heap as well as the stack
How?

Are you saying that we can somehow do this:

Code: Select all

int main (int argc, char* argv[]) {
  int x[argc];
}
But with x being allocated on the heap instead of the stack?

ejolson
Posts: 1903
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Fri Jun 15, 2018 8:05 am

Heater wrote:
Thu Jun 14, 2018 11:59 pm
ejolson,

I think I agree with a lot you say there. Except for this:
Serving web pages is not an example of a long running process.
It most certainly is.
The operating system needs to keep running, but the time to process one request for a web page is measured in milliseconds.

In C use malloc to allocate memory from the heap. Note that you really end up initialising a pointer to a variable length array, which is only interesting for two-dimensional arrays and higher. It should be clear what is going on in my example code as well as the original code posted in this thread.

jahboater
Posts: 2939
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Fri Jun 15, 2018 8:42 am

ejolson wrote:
Fri Jun 15, 2018 8:05 am
Serving web pages is not an example of a long running process.
It most certainly is.
ejolson wrote:
Fri Jun 15, 2018 8:05 am
The operating system needs to keep running, but the time to process one request for a web page is measured in milliseconds.
What Heater is saying is that the web server process runs indefinitely nowadays. And therefore has to worry about long term memory issues. Web servers no longer use the traditional server model where each request is dealt with by a separate forked process.

Basic stuff .....

Code: Select all

int main (int argc, char* argv[]) {
  int x[argc];    // vla on stack, lifetime as function instance
  int *y = alloca( argc * sizeof(int) );  //  ditto
  int *z = malloc( argc * sizeof(int) );  // vla on heap, whole program lifetime
}

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Fri Jun 15, 2018 10:36 am

ejolson,
The operating system needs to keep running, but the time to process one request for a web page is measured in milliseconds.
That is true of good old fashioned web servers: recieve request, read HTML from file or create it dynamically, respond, close connection.

Put a database behind there and that request could take far longer to complete.

With modern HTTP the connection between client and server can be open for a long time, essentially indefinitely. For example when a server is continuously pushing data in real-time to browsers using web sockets.

Either way, web servers like nginx and those made with node.js do not spawn processes or threads, they have an event loop. Essentially they are a single process/thread that runs forever.
In C use malloc to allocate memory from the heap.
Of course, like my stretchy buffers example way back in this thread.
Note that you really end up initialising a pointer to a variable length array, which is only interesting for two-dimensional arrays and higher.
How so? People use dynamically allocated memory for one dimensional arrays, buffers, strings, etc all the time.

ejolson
Posts: 1903
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Sat Jun 16, 2018 12:25 am

Heater wrote:
Fri Jun 15, 2018 10:36 am
How so? People use dynamically allocated memory for one dimensional arrays, buffers, strings, etc all the time.
Exactly. So there is nothing new with that. Only when dealing with multi-dimensional arrays does the fact that you can specify the range of the second, third and higher indices at run time make things different than usual C pointers.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Sat Jun 16, 2018 7:45 am

ejolson,
there is nothing new with that. Only when dealing with multi-dimensional arrays does the fact that you can specify the range of the second, third and higher indices at run time make things different than usual C pointers
Sounds like you are now talking about C99's variable length arrays. I was replying to your stament about other dynamic memory allocations, to wit: "In C use malloc to allocate memory from the heap....which is only interesting for two-dimensional arrays".

I'm not suggesting VLA's a are a bad thing or not interesting. It's just that as far as I can see they are of limited use, what with only having the lifetime of the function that creates them and, not being suitable for large arrays, and not being dynamically resizable.

If there are bad things about VLA's it's that:

1) It diverges the C standard from C++, which might be inconvenient sometimes.

2) Your program will crash with a segfault if they become too big and there is no way to check if that might happen.
Last edited by Heater on Sat Jun 16, 2018 6:23 pm, edited 1 time in total.

jahboater
Posts: 2939
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Sat Jun 16, 2018 3:46 pm

Heater wrote:
Sat Jun 16, 2018 7:45 am
If there are bad things about VLA's it's that:

1) It diverges the C standard from C++, which might be inconvenient sometimes.
They work fine in C++17 by the way.

I saw this in the C99 Rationale (the second paragraph is amusing) ...
Minimize incompatibilities with C++. The Committee recognizes the need for a clear and
defensible plan for addressing the compatibility issue with C++. The Committee endorses the
principle of maintaining the largest common subset clearly and from the outset. Such a principle
should satisfy the requirement to maximize overlap of the languages while maintaining a
distinction between them and allowing them to evolve separately.

The Committee is content to let C++ be the big and ambitious language. While some features of
C++ may well be embraced, it is not the Committee’s intention that C become C++.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Sat Jun 16, 2018 7:35 pm

jahboater,
They work fine in C++17 by the way.
I don't believe they do. What compiler are you using? My gcc 5.4.0 has some support from C++17 but I cannot get a simple example to work:

Code: Select all

$ cat vla.c
#include <stdlib.h>

double process(int n, double A[n][n]){
    return A[n-1][n-1];
}

int main (int argc, char* argv[]) {
  int size = 10;

  double A[size][size];

  double X = process(size, A);

  return (int)X;
}

$ gcc -O3 -Wall -o vla  vla.c
$ cp vla.c vla.cpp
$ g++ -std=c++17 -O3 -Wall -o vla vla.cpp
vla.cpp:3:33: error: use of parameter outside function body before ‘]’ token
 double process(int n, double A[n][n]){
                                 ^
vla.cpp:3:36: error: use of parameter outside function body before ‘]’ token
 double process(int n, double A[n][n]){
                                    ^
vla.cpp: In function ‘double process(...)’:
vla.cpp:4:12: error: ‘A’ was not declared in this scope
     return A[n-1][n-1];
            ^
vla.cpp:4:14: error: ‘n’ was not declared in this scope
     return A[n-1][n-1];
I can't find any reference to C style VLA's being in C++17.

I have read hint that VLA's are supported as an extension to GCC and Clang. Can't find a way to make it work though.

ejolson
Posts: 1903
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Sat Jun 16, 2018 8:11 pm

Heater wrote:
Sat Jun 16, 2018 7:45 am
If there are bad things about VLA's it's that:

1) It diverges the C standard from C++, which might be inconvenient sometimes.

2) Your program will crash with a segfault if they become too big and there is no way to check if that might happen.
Please look at my original code to compute the matrix 1-norm to see how to create a variable-length two-dimensional array allocated from the heap. For your convenience I will excerpt the relevant lines here

Code: Select all

    int n=256;
    if(argc==2) n=atoi(argv[1]);
//  double A[n][n];
    double (*A)[n]=malloc(sizeof(double)*n*n);
Note how the stack-allocated variable-length array A has been commented out and directly replaced by a heap-allocated version. In fact, doing this was exactly the point of the original post, which also allocated the variable-length array from the heap.

While variable-length arrays advance C beyond C++, they is very useful to people who are working with matrices. In particular, the addition of variable-length arrays and a native complex-number data type finally make C99 general purpose enough to be convenient for numerical work.

We've discussed crashing related to dynamic memory allocation at length. A defensively written program could check the current limits on the stack and where the pointer is before allocating more memory, or simply set the stack limit to infinity. The latter option is similar to using malloc on Linux, as malloc will happily return non-zero virtual addresses that cause an out-of-memory condition and subsequent program termination if actually used. In either case, to be exactly correct one should also inspect the current state of the Linux virtual memory system to know how many physical pages are left to back the virtual addresses and how much swap is in use. Programs of that complexity are likely to be less reliable than simpler ones.

Heater
Posts: 9842
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Sat Jun 16, 2018 8:33 pm

ejolson,

I do get the idea about putting C99 variable sized arrays on the heap. It's pretty neat.
A defensively written program could check the current limits on the stack and where the pointer is before allocating more memory, or simply set the stack limit to infinity.
I don't see how that can ever be robust. As you say later, Linux will tell you you have whatever memory want, including stack. Only later when you actually use that space Linux can say "Sorry, I lied" and segfault you. Do you have a bullet proof way to do those checks?
...similar to using malloc on Linux, as malloc will happily return non-zero virtual addresses that cause an out-of-memory condition and subsequent program termination if actually used....
Exactly.
...to be exactly correct one should also inspect the current state of the Linux virtual memory system to know how many physical pages are left to back the virtual addresses and how much swap is in use.
Again, I don't see how to do this robustly. Your program is in competition with whatever else is running on the machine. There is a race condition, a time window between your program checking what memory is available and actually using it. In that time window some other process can get in and grab it. POOF! You segfault.

Do you have a bulletproof example of doing this?

Anyway, you changed the subject again. My claim was that C++ does not support C99 style Variable Length Arrays.

ejolson
Posts: 1903
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Sat Jun 16, 2018 9:15 pm

Heater wrote:
Sat Jun 16, 2018 8:33 pm
Do you have a bulletproof example of doing this?
Maybe systemd does this. That's why I said the additional complexity is likely to be less reliable.
Heater wrote:
Sat Jun 16, 2018 8:33 pm
Anyway, you changed the subject again. My claim was that C++ does not support C99 style Variable Length Arrays.
This is my understanding as well and why I asked for possible C++ versions of my matrix 1-norm code. The students here learn C++ in the first and second semesters and typically take my course in the 5th or 7th semester. While C is close to C++, it would be easier to build directly off their C++ knowledge. However, I have yet to find a natural way to work with variable sized arrays in C++ that is also compatible with LAPACK addressing modes.

jahboater
Posts: 2939
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Sat Jun 16, 2018 10:01 pm

Heater,

I am using gcc 8.1 (which has full support for C++17)

This simpler example compiles cleanly with C++

Code: Select all

#include <iostream>

int
main( int argc, const char *argv[] )
{
  int x[argc][argc];
}
Your example, on the other hand, fails in the same way here as it did for you.
Hmm...

Return to “C/C++”

Who is online

Users browsing this forum: grininmonkey and 6 guests