Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Wed Jun 06, 2018 6:38 pm

Oops I messed that up. The C++ version using vector is bigger that the C version using strechy buffers.

Code: Select all

$ gcc -Wall -Os -fdata-sections -ffunction-sections stretchy-buffers.c -o stretchy-buffers -Wl,--gc-sections -Wl,--strip-all
$ size stretchy-buffers
   text    data     bss     dec     hex filename
   2503     576       8    3087     c0f stretchy-buffers
$ g++ -Wall -Os -fdata-sections -ffunction-sections stretchy-buffers.cpp -o stretchy-buffers -Wl,--gc-sections -Wl,--strip-all
$ size stretchy-buffers
   text    data     bss     dec     hex filename
   3045     616       8    3669     e55 stretchy-buffers
So the C++ version is about 16% bigger. The "C++ is bloated" claimer's were correct.

Or perhaps not, considering that std::vector has many other useful features it can be excused being a bit bigger.

jahboater
Posts: 2858
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Wed Jun 06, 2018 6:47 pm

Interesting set of compiler options?
You can replace "-Wl,--strip-all" with "-s" by the way
Last edited by jahboater on Wed Jun 06, 2018 6:48 pm, edited 1 time in total.

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Wed Jun 06, 2018 6:48 pm

But what about the performance, std::vector vs stretchy buffers ?

Here is a test of stretchy buffers in C:

Code: Select all

$ cat stretchy-buffers.c
#define ITERS 100
int main(int argc, char* argv[]) {
//    buf_test();

    // An example type to be used for stretchy buffer elements
    struct point {
        int x;
        int y;
        int z;
    };

    // A stretchy buffer of points
    struct point *points = NULL;
    int n = 8000000;

    for (int count = 0; count < ITERS; count++) {

        // Create a lot of points and record then in the points buffer
        for (int i = 0; i < n; i++) {
            struct point newPoint;
            newPoint.x = i;
            newPoint.y = 2 * i;
            newPoint.z = 3 * i;
            buf_push(points, newPoint);   // Note: The first call to buf_push creates the stretchy buffer structure
        }

        if (count < ITERS - 1) {
            buf_free(points);
        }
    }

    // Print some buffer statistics
    printf("There are %lu points in the buffer\n", buf_len(points));
    printf("The points buffer has a capacity of %lu\n", buf_cap(points));
    printf("The points buffer is %lu bytes in size\n", buf_sizeof(points));
}

$ gcc -Wall -Os -fdata-sections -ffunction-sections stretchy-buffers.c -o stretchy-buffers_c -Wl,--gc-sections -Wl,--strip-all
$ time ./stretchy-buffers_c
There are 8000000 points in the buffer
The points buffer has a capacity of 8388608
The points buffer is 96000000 bytes in size

real    0m21.689s
user    0m4.078s
sys     0m17.547s
Here is a test doing the same thing with std::vector

Code: Select all

$ cat stretchy-buffers.cpp
#include<stdio.h>
#include <vector>

// An example type to be used for stretchy buffer elements
struct point {
    int x;
    int y;
    int z;
};

#define ITERS 100
int main(int argc, char* argv[]) {

    // A stretchy buffer of points
    std::vector<point> points;

    // Create a lot of points and record then in the points buffer
    int n = 8000000;

    for (int count = 0; count < ITERS; count++) {
        for (int i = 0; i < n; i++) {
            struct point newPoint;
            newPoint.x = i;
            newPoint.y = 2 * i;
            newPoint.z = 3 * i;
            points.push_back(newPoint);
        }

        if (count < ITERS - 1) {
            points.clear();
            std::vector<point>(points).swap(points);
        }
    }

    // Print some buffer statistics
    printf("There are %lu points in the buffer\n", points.size());
    printf("The points buffer has a capacity of %lu\n", points.capacity());
    printf("The points buffer is %lu bytes in size\n", sizeof(point) * points.capacity());
}

$ g++ -Wall -Os -fdata-sections -ffunction-sections stretchy-buffers.cpp -o stretchy-buffers_cpp -Wl,--gc-sections -Wl,--strip-all
$ time ./stretchy-buffers_cpp
There are 8000000 points in the buffer
The points buffer has a capacity of 8388608
The points buffer is 100663296 bytes in size

real    0m30.912s
user    0m7.047s
sys     0m23.859s

Hmm... std::vector is takes 44% longer than stretchy buffers.

jahboater
Posts: 2858
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Wed Jun 06, 2018 7:02 pm

The points buffer size is larger too.
More red tape perhaps.

ejolson
Posts: 1843
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Wed Jun 06, 2018 7:48 pm

Heater wrote:
Tue Jun 05, 2018 10:57 am
PeterO,
As I said some way back up the thread I don't want or need "Dynamic" arrays (I used the wrong word), all I wanted was a clean way to deal with arrays of unknown size at compile time.
Sorry if I have not followed all the details of this thread exactly.

I think you used the right word. If the size of an array cannot be statically determined at compile time, if it is created at whatever size dynamically at run time then that is a "dynamic" array.
I think the usual distinctions are

fixed length -- determined at compile time
variable length -- determined at run time
dynamic length -- changes during run time

Here is an example that hopefully illustrates a natural use of C variable-length arrays.

Code: Select all

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <strings.h>
#include <sys/time.h>
#include <sys/resource.h>

/*  Compute the matrix 1-norm using either a row-major or column-major
    algorithm while illustrating a paradigmatic use of stack-allocated 
    variable-length one and two-dimensional arrays.  */

static double tic_time;
void tic() {
    struct timeval tp;
    gettimeofday(&tp,0);
    tic_time=tp.tv_sec+tp.tv_usec*1.0e-6;
}
double toc() {
    struct timeval tp;
    gettimeofday(&tp,0);
    return (tp.tv_sec+tp.tv_usec*1.0e-6)-tic_time;
}

// Column-major computation of |A|_1
double norm1cm(int n,double A[n][n]){
    double r=0;
    for(int j=0;j<n;j++){
        double s=0;
        for(int i=0;i<n;i++){
            s+=fabs(A[i][j]);
        }
        if(s>r) r=s;
    }
    return r;
}

// Row-major computation of |A|_1
double norm1rm(int n,double A[n][n]){
    double s[n];
    bzero(s,sizeof(s));
    for(int i=0;i<n;i++){
        for(int j=0;j<n;j++){
            s[j]+=fabs(A[i][j]);
        }
    }
    double r=0;
    for(int j=0;j<n;j++){
        if(s[j]>r) r=s[j];
    }
    return r;
}

int main(int argc,char *argv[argc]){
    int n=4096;
    if(argc==2) n=atoi(argv[1]);
//    double A[n][n];
    double (*A)[n]=malloc(sizeof(double)*n*n);
    srandom(19721);
    for(int i=0;i<n;i++){
        for(int j=0;j<n;j++){
            A[i][j]=2.0*random()/RAND_MAX-1.0;
        }
    }
    printf("#%-9s %9s %22s %15s\n","method","n","value","time");

    typedef double (*function)(int n,double A[n][n]);
    struct test {
        char *name;
        function norm;
    } test[] = {
        {"norm1cm", norm1cm},
        {"norm1rm", norm1rm}
    };

    for(int i=0;i<sizeof(test)/sizeof(struct test);i++){
        tic();
        double r=test[i].norm(n,A);
        double t=toc();
        printf("%-10s %9d %22.14e %15.8f\n",test[i].name,n,r,t);
    }
    exit(0);
}
Output when run on a Raspberry Pi 3B+ is as follows.

Code: Select all

$ ./matnorm1 
#method            n                  value            time
norm1cm         4096   2.11124447615968e+03      3.23258995
norm1rm         4096   2.11124447615968e+03      0.19290999
I would be very happy to see an equivalent C++ version.

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Wed Jun 06, 2018 9:04 pm

ejolson,
I would be very happy to see an equivalent C++ version.
Ha! I gave up trying to convert that to C++ after half an hour.

Two dimensional std::vectors are a pig.

Then I realized the resulting code would be as ugly as hell.

Maybe I just had the wrong idea, any C++ gurus around here?

User avatar
PeterO
Posts: 4237
Joined: Sun Jul 22, 2012 4:14 pm

Re: Dynamic sized arrays

Wed Jun 06, 2018 10:16 pm

ejolson wrote:
Wed Jun 06, 2018 7:48 pm
I think the usual distinctions are

fixed length -- determined at compile time
variable length -- determined at run time
dynamic length -- changes during run time
By that list i should have called them "variable length".

PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),Aeromodelling,1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

User avatar
Paeryn
Posts: 2115
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Dynamic sized arrays

Thu Jun 07, 2018 12:00 am

Heater wrote:
Wed Jun 06, 2018 9:04 pm
ejolson,
I would be very happy to see an equivalent C++ version.
Ha! I gave up trying to convert that to C++ after half an hour.

Two dimensional std::vectors are a pig.

Then I realized the resulting code would be as ugly as hell.

Maybe I just had the wrong idea, any C++ gurus around here?
It's not ugly, looks neater to me.

Code: Select all

[email protected]:~/Programming/asm/dynarr $ cat matnorm.cc
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cmath>
#include <strings.h>
#include <sys/time.h>
#include <sys/resource.h>

/*  Compute the matrix 1-norm using either a row-major or column-major
    algorithm while illustrating a paradigmatic use of stack-allocated
    variable-length one and two-dimensional arrays.  */

static double tic_time;
void tic() {
    struct timeval tp;
    gettimeofday(&tp,0);
    tic_time=tp.tv_sec+tp.tv_usec*1.0e-6;
}
double toc() {
    struct timeval tp;
    gettimeofday(&tp,0);
    return (tp.tv_sec+tp.tv_usec*1.0e-6)-tic_time;
}

// Column-major computation of |A|_1
/*** Original C version of function
double norm1cm(int n,double A[n][n]){
    double r=0;
    for(int j=0;j<n;j++){
        double s=0;
        for(int i=0;i<n;i++){
            s+=fabs(A[i][j]);
        }
        if(s>r) r=s;
    }
    return r;
}
*/

typedef std::vector<std::vector<double>> Array_2d_t;

double norm1cm(const Array_2d_t &A)
{
    const int size = A.size();
    double r = 0.0;
    for(int j = 0; j < size; j++) {
        double s = 0.0;
        for(int i = 0; i < size; i++) {
            s += fabs(A[i][j]);
        }
        if (s > r) r = s;
    }
    return r;
}

// Row-major computation of |A|_1
/*** Original C version of function
double norm1rm(int n,double A[n][n]){
    double s[n];
    bzero(s,sizeof(s));
    for(int i=0;i<n;i++){
        for(int j=0;j<n;j++){
            s[j]+=fabs(A[i][j]);
        }
    }
    double r=0;
    for(int j=0;j<n;j++){
        if(s[j]>r) r=s[j];
    }
    return r;
}
*/

double norm1rm(const Array_2d_t &A)
{
    const int size = A.size();
    double s[size] = {};
    for (const auto& i : A) {
        int k = 0;
        for (const auto& j : i) {
            s[k++] += fabs(j);
        }
    }
    double r = 0.0;
    for (const auto& i : s) {
        if (i > r) r = i;
    }
    return r;
}

double norm1rmAlt(const Array_2d_t &A)
{
    const int size = A.size();
    double s[size] = {};
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < size; j++) {
            s[j] += fabs(A[i][j]);
        }
    }
    double r = 0.0;
    for (int i = 0; i < size; i++) {
        if (s[i] > r) r = s[i];
    }
    return r;
}


int main(int argc,char *argv[]){
    int n=4096;
    if(argc==2) n=atoi(argv[1]);
//    double (*A)[n]=malloc(sizeof(double)*n*n);
    Array_2d_t A(n, std::vector<double>(n));
    srandom(19721);
/*
    for(int i=0;i<n;i++){
        for(int j=0;j<n;j++){
            A[i][j]=2.0*random()/RAND_MAX-1.0;
        }
    }
*/
    for (auto& row : A) {
        for (auto& item : row) {
            item = 2.0*random()/RAND_MAX-1.0;
        }
    }
    std::printf("#%-9s %9s %22s %15s\n","method","n","value","time");

    typedef double (*function)(const Array_2d_t &A);
    struct test {
        const char *name;
        function norm;
    } test[] = {
        {"norm1cm", norm1cm},
        {"norm1rm", norm1rm},
        {"norm1rmAlt", norm1rmAlt}
    };

    for(int i=0;i<sizeof(test)/sizeof(struct test);i++){
        tic();
        double r=test[i].norm(A);
        double t=toc();
        std::printf("%-10s %9d %22.14e %15.8f\n",test[i].name,n,r,t);
    }
    exit(0);
}

Code: Select all

[email protected]:~/Programming/asm/dynarr $ gcc matnorm.c -o matnorm -Os
[email protected]:~/Programming/asm/dynarr $ g++ matnorm.cc -o matnorm2 -Os
[email protected]:~/Programming/asm/dynarr $ ./matnorm
#method            n                  value            time
norm1cm         4096   2.11124447615968e+03      3.29556799
norm1rm         4096   2.11124447615968e+03      0.21998692
[email protected]:~/Programming/asm/dynarr $ ./matnorm2
#method            n                  value            time
norm1cm         4096   2.11124447615968e+03      2.04459190
norm1rm         4096   2.11124447615968e+03      0.24787402
norm1rmAlt      4096   2.11124447615968e+03      0.21867704
I did the row-major in two ways, the first using iterators to access the arrays and then an Alt version with traditional array-based access.

<Edit>
  • Added const to functions and iterators.
  • Isn't strict ISO C++, ISO doesn't allow VLAs, specifically the array double s[size] in functions norm1rm() and norm1rmAlt(), both gcc and clang allow it, but clang is still strict on not allowing an initialiser list for it.
Last edited by Paeryn on Thu Jun 07, 2018 1:59 am, edited 2 times in total.
She who travels light — forgot something.

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Thu Jun 07, 2018 12:39 am

That does not look so bad. Nice to see I was on the right track before I gave up!

User avatar
Paeryn
Posts: 2115
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Dynamic sized arrays

Thu Jun 07, 2018 12:51 am

Heater wrote:
Thu Jun 07, 2018 12:39 am
That does not look so bad. Nice to see I was on the right track before I gave up!
I had problems with the typedef for a while, until I realised it'd tried naming it 2d_Array_t... That and my first version didn't pass the arrays by reference so was taking ages to copy the arrays each time. Must be this heat that is addling my brain.
She who travels light — forgot something.

ejolson
Posts: 1843
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Fri Jun 08, 2018 1:46 am

Paeryn wrote:
Thu Jun 07, 2018 12:00 am
Isn't strict ISO C++, ISO doesn't allow VLAs, specifically the array double s[size] in functions norm1rm() and norm1rmAlt(), both gcc and clang allow it, but clang is still strict on not allowing an initialiser list for it.
Thanks for taking the effort to write and post that code.

Since variable-length arrays are supported by the gcc and clang extensions of C++ does that mean the original code would work essentially as is?

On my Pi 3B+ the C++ std::vector version yields

Code: Select all

$ ./matnorm1cpp
#method            n                  value            time
norm1cm         4096   2.11124447615968e+03      2.21937203
norm1rm         4096   2.11124447615968e+03      0.19473890
norm1rmAlt      4096   2.11124447615968e+03      0.19482007
which implies the row-major algorithms are as fast as the C code.

Interestingly, the column major algorithm is relatively faster than before. Since the C++ code actually allocates an array of pointers to vectors, the memory used to store the matrix is not contiguous. By chance there is an additional 8 bytes between each row. As a result, scanning down a column reads memory at addresses spaced 4097*8 apart rather than 4096*8. This is better in terms of cache associativity.

Running the original code for a matrix of size 4097 yields the similar speedup

Code: Select all

$ ./matnorm1 4097
#method            n                  value            time
norm1cm         4097   2.11957570900236e+03      1.24055589
norm1rm         4097   2.11957570900236e+03      0.19432401
for the column-major version of the algorithm with, of course, different values for the norms.

Here are a couple comments on the C++ version of the code:

1. Instead of using j to denote a double-precision floating-point value it might be better to use x.

2. The inner loop contains an additional integer counter k to keep track of the column. This is unneeded if the loops are taken over integer indexes rather than pointers.

3. The memory to store the entire matrix is not contiguous. In fact, if the heap were fragmented from running other code prior to this, there may not even be a constant stride between rows of the matrix.

Not having a constant stride between rows makes it difficult to call any LAPACK routine for the matrix. Because the optimized LAPACK implementations which exist for all major computer architectures run significantly faster than most hand-crafted code, this last point may be more important than it appears.

jahboater
Posts: 2858
Joined: Wed Feb 04, 2015 6:38 pm

Re: Dynamic sized arrays

Fri Jun 08, 2018 6:20 am

Does anyone know anything about std::valarray which is designed specifically to allow optimization, presumably with SIMD? Looks to me like they are fixed size though.
https://stackoverflow.com/questions/160 ... -vs-vector
and
http://en.cppreference.com/w/cpp/numeric/valarray

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Sun Jun 10, 2018 3:36 pm

ejolson,
I would be very happy to see an equivalent C++ version.
So I thought about a different way to get a variable length 2D array in C++. I'm not happy with the vector of vector solutions. That is not anything like the same, it's an array of objects, which happen to be arrays. They can be spread all over memory rather than in a single neat C array kind of space.

An obvious solution is to create class for a square array. Like so:

Code: Select all

class SquareArray {
public:
    SquareArray(int size) {
        this->size = size;
        a = std::vector<double>(size * size);        
    }

    __attribute__((always_inline)) double& ref(int x, int y) {
        return a[x * size + y];
    }

    int dim () {
        return this->size;
    }

private:
    std::vector<double> a;
    int size = 0;
};
This maintains the array as one contiguous space in memory.
Then you can make a variable sized square array like so:

Code: Select all

auto A = SquareArray(n);
And access its elements like so:

Code: Select all

double x = A.ref(i, j);
A.ref(j, i) = x;
The results are (C version vs C++ square array version) as follows:

Code: Select all

$ ./matnormal_c
#method            n                  value            time
norm1cm        24576   1.24706024724666e+04     27.22231388
norm1rm        24576   1.24706024724666e+04      0.41067195
./matnormal_cpp
#method            n                  value            time
norm1cm        24576   1.24706024724666e+04     26.36430192
norm1rm        24576   1.24706024724666e+04      0.86408710
Not much in it when the cache is being thrashed. But still the C version is faster for the cache friendly code.
Note: I multiplied the dimensions by 6 to get some decent execution time on the amd64 machine.

Here is the complete C++ version:

Code: Select all

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <strings.h>
#include <sys/time.h>
#include <sys/resource.h>

#include <vector>

/*  Compute the matrix 1-norm using either a row-major or column-major
    algorithm while illustrating a paradigmatic use of stack-allocated 
    variable-length one and two-dimensional arrays.  */

static double tic_time;
void tic() {
    struct timeval tp;
    gettimeofday(&tp,0);
    tic_time=tp.tv_sec+tp.tv_usec*1.0e-6;
}
double toc() {
    struct timeval tp;
    gettimeofday(&tp,0);
    return (tp.tv_sec+tp.tv_usec*1.0e-6)-tic_time;
}

class SquareArray {
public:
    SquareArray(int size) {
        this->size = size;
        a = std::vector<double>(size * size);        
    }

    __attribute__((always_inline)) double& ref(int x, int y) {
        return a[x * size + y];
    }

    int dim () {
        return this->size;
    }

private:
    std::vector<double> a;
    int size = 0;
};

// Column-major computation of |A|_1
double norm1cm(SquareArray& A){
    int n = A.dim();
    double r=0;
    for(int j=0;j<n;j++){
        double s=0;
        for(int i=0;i<n;i++){
            s+=fabs(A.ref(i, j));
            //s+=fabs(A[i][j]);
        }
        if(s>r) r=s;
    }
    return r;
}

// Row-major computation of |A|_1
double norm1rm(SquareArray& A){
    int n = A.dim();
    double s[n];
    bzero(s,sizeof(s));
    for(int i=0;i<n;i++){
        for(int j=0;j<n;j++){
            s[j]+=fabs(A.ref(i, j));
        }
    }
    double r=0;
    for(int j=0;j<n;j++){
        if(s[j]>r) r=s[j];
    }
    return r;
}

int main(int argc,char *argv[]){
    int n=4096 * 6;
    if(argc==2) n=atoi(argv[1]);
    auto A = SquareArray(n);
    srandom(19721);
    for(int i=0;i<n;i++){
        for(int j=0;j<n;j++){
            A.ref(i, j)=2.0*random()/RAND_MAX-1.0;
        }
    }
    printf("#%-9s %9s %22s %15s\n","method","n","value","time");

    typedef double (*function)(SquareArray& A);
    struct test {
        char *name;
        function norm;
    } test[] = {
        {"norm1cm", norm1cm},
        {"norm1rm", norm1rm}
    };

    for(int i=0;i<sizeof(test)/sizeof(struct test);i++){
        tic();
        double r=test[i].norm(A);
        double t=toc();
        printf("%-10s %9d %22.14e %15.8f\n",test[i].name,n,r,t);
    }
    exit(0);
}
[code]

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Sun Jun 10, 2018 4:11 pm

Oddly paeryn's solution using vector is significantly slower than my simulated variable sized array solution for large arrays. But faster for small, cache friendly sizes, Like so:

Code: Select all

$ ./matnormal_paeryn_cpp
#method            n                  value            time
norm1cm        24576   1.24706024724666e+04     37.58417201
norm1rm        24576   1.24706024724666e+04      0.41677094
norm1rmAlt     24576   1.24706024724666e+04      0.39906001
Again, with the array dimensions multiplied by 6 from the originals.

ejolson
Posts: 1843
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Sun Jun 10, 2018 6:45 pm

Heater wrote,

I'm not happy with the vector of vector solutions.

My reply,

Your solution also appears a reasonable C++ choice.

I wonder if some object oriented overloading trickery could collect the i and j arguments from the multiple brackets in A[i][j] and then pass them on to the ref(i,j) function.

Why do you think the result is twice as slow?

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Sun Jun 10, 2018 10:46 pm

ejolson,
Why do you think the result is twice as slow?
Just now I have no idea.

At the end of the day, to access a 2D array you need a base address and an offset in one direction. Then an offset in the other direction. One of which requires a multiply to get to the right array element.

Which is what I was trying to achieve in my C++ example.

Where does the time go?

ejolson
Posts: 1843
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Mon Jun 11, 2018 12:41 am

Heater wrote:
Sun Jun 10, 2018 10:46 pm
ejolson,
Why do you think the result is twice as slow?
Just now I have no idea.
Assuming you have the optimiser turned on, my guess is that something is preventing it from moving the multiplication i*size outside the inner loop.

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Mon Jun 11, 2018 10:10 am

All versions compiled with -O3.

The resulting assembler is opaque enough that I can't see what is what.

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Mon Jun 11, 2018 10:29 pm

Edit: Ignore the timing results here. Something weird was up last night to slow this down. Running the same code this morning things work as expected!

OK, riddle me this, any C++ gurus out there.

When I change my little variable size square array class to use an old school "new double[size * size];" to hold the data the execution time of norm1cm becomes 3 times longer. Far worse still the norm1rm execution time balloons out to 27.16589689. That's about 5 times and 40 times slower respectively !

Given that both the vector and the "new"ed array are, ultimately just a blob of memory on the heap why the dramatic difference?

Code: Select all

$ ./matnormal_cpp
#method            n                  value            time
norm1cm        24576   1.24706024724666e+04    120.54688907
norm1rm        24576   1.24706024724666e+04     27.16589689
The code:

Code: Select all

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <strings.h>
#include <sys/time.h>
#include <sys/resource.h>

#include <vector>

/*  Compute the matrix 1-norm using either a row-major or column-major
    algorithm. Uses a variable size 2D array class to maintin the array
*/

static double tic_time;
void tic() {
    struct timeval tp;
    gettimeofday(&tp,0);
    tic_time=tp.tv_sec+tp.tv_usec*1.0e-6;
}
double toc() {
    struct timeval tp;
    gettimeofday(&tp,0);
    return (tp.tv_sec+tp.tv_usec*1.0e-6)-tic_time;
}

class SquareArray {
public:
    SquareArray(int size) {
        this->size = size;
        a = new double[size * size];
    }

    __attribute__((always_inline)) double& ref(int x, int y) {
        return a[x * size + y];
    }

    int dim () {
        return this->size;
    }

private:
    double *a;
    int size = 0;
};

// Column-major computation of |A|_1
double norm1cm(SquareArray& A){
    int n = A.dim();
    double r=0;
    for(int j=0;j<n;j++){
        double s=0;
        for(int i=0;i<n;i++){
            s+=fabs(A.ref(i, j));
            //s+=fabs(A[i][j]);
        }
        if(s>r) r=s;
    }
    return r;
}

// Row-major computation of |A|_1
double norm1rm(SquareArray& A){
    int n = A.dim();
    double s[n];
    bzero(s,sizeof(s));
    for(int i=0;i<n;i++){
        for(int j=0;j<n;j++){
            s[j]+=fabs(A.ref(i, j));
        }
    }
    double r=0;
    for(int j=0;j<n;j++){
        if(s[j]>r) r=s[j];
    }
    return r;
}

int main(int argc,char *argv[]){
    int n=4096 * 6;
    if(argc==2) n=atoi(argv[1]);
    auto A = SquareArray(n);
    srandom(19721);
    for(int i=0;i<n;i++){
        for(int j=0;j<n;j++){
            A.ref(i, j)=2.0*random()/RAND_MAX-1.0;
        }
    }
    printf("#%-9s %9s %22s %15s\n","method","n","value","time");

//    typedef double (*function)(int n,double A[n][n]);
    typedef double (*function)(SquareArray& A);
    struct test {
        char *name;
        function norm;
    } test[] = {
        {"norm1cm", norm1cm},
        {"norm1rm", norm1rm}
    };

    for(int i=0;i<sizeof(test)/sizeof(struct test);i++){
        tic();
        double r=test[i].norm(A);
        double t=toc();
        printf("%-10s %9d %22.14e %15.8f\n",test[i].name,n,r,t);
    }
    exit(0);
}
Last edited by Heater on Tue Jun 12, 2018 6:58 am, edited 1 time in total.

User avatar
Paeryn
Posts: 2115
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Dynamic sized arrays

Tue Jun 12, 2018 12:41 am

Heater wrote:
Mon Jun 11, 2018 10:29 pm
OK, riddle me this, any C++ gurus out there.

When I change my little variable size square array class to use an old school "new double[size * size];" to hold the data the execution time of norm1cm becomes 3 times longer. Far worse still the norm1rm execution time balloons out to 27.16589689. That's about 5 times and 40 times slower respectively !

Given that both the vector and the "new"ed array are, ultimately just a blob of memory on the heap why the dramatic difference?

Code: Select all

$ ./matnormal_cpp
#method            n                  value            time
norm1cm        24576   1.24706024724666e+04    120.54688907
norm1rm        24576   1.24706024724666e+04     27.16589689
Something up there, new shouldn't be much different than std::vector<> speed-wise, did the new version hit the swap? I just tried it on mine although with n = 16384 as my linux vbox is only configured with 3GB. This is compiled with gcc-5 at optimisation level -O3 but even with -O0 the new version is slightly faster.

Code: Select all

[email protected] ~/programming/rpi/dynarr $ ./mat_norm_vec 16384
#method            n                  value            time
norm1cm        16384   8.34485295437500e+03      7.99577999
norm1rm        16384   8.34485295437500e+03      0.16259909
[email protected] ~/programming/rpi/dynarr $ ./mat_norm_new 16384
#method            n                  value            time
norm1cm        16384   8.34485295437500e+03      7.33245897
norm1rm        16384   8.34485295437500e+03      0.16113400
She who travels light — forgot something.

ejolson
Posts: 1843
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Tue Jun 12, 2018 12:45 am

Heater wrote:
Mon Jun 11, 2018 10:29 pm
just a blob of memory on the heap why the dramatic difference?
Could it be alignment or NUMA effects? What happens on a Pi?

Maybe it is time to try JavaScript.

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 12, 2018 7:20 am

Paeryn,

You are right, something was up there last night. Although I did run both versions alternately 4 or 5 times so it looked like real effect. Perhaps Chrome was being exceptionally piggy of memory/cpu at the time.

When I run the same code again this morning timings are as expected. This is the C++ version with "new int[size * size]":

Code: Select all

$ ./matnormal_cpp
#method            n                  value            time
norm1cm        24576   1.24706024724666e+04     31.64375281
norm1rm        24576   1.24706024724666e+04      0.80307794
$ ./matnormal_cpp
#method            n                  value            time
norm1cm        24576   1.24706024724666e+04     32.27062011
norm1rm        24576   1.24706024724666e+04      0.79047108
This is the C version with variable size arrays:
$ ./matnormal_c
#method n value time
norm1cm 24576 1.24706024724666e+04 32.59459114
norm1rm 24576 1.24706024724666e+04 0.78661394
$ ./matnormal_c

Code: Select all

#method            n                  value            time
norm1cm        24576   1.24706024724666e+04     30.69593096
norm1rm        24576   1.24706024724666e+04      0.79263496
Essentially the C and C++ versions perform equally now.

I guess the old school method using "new" is frowned upon today. It requires you remember to have a corresponding "delete" somewhere (Which my code does not have, the SquareArray class should have a destructor defined to do that). But I kind of like it better than the clunky vector syntax in this case.

ejolson
Posts: 1843
Joined: Tue Mar 18, 2014 11:47 am

Re: Dynamic sized arrays

Tue Jun 12, 2018 4:01 pm

Heater wrote:
Tue Jun 12, 2018 7:20 am
I guess the old school method using "new" is frowned upon today. It requires you remember to have a corresponding "delete" somewhere.
With malloc it is clear that free does not have to be called at the end of the program because the heap is automatically freed by the operating system upon termination. I'm not sure about new and delete, because instead of freeing memory from the heap a destructor might perform file output, manipulate System V semaphores and shared memory or perform some other action that would persist after program termination.

Does the C++ runtime automatically call the destructors for all objects which haven't already had their destructors called at termination?

User avatar
Paeryn
Posts: 2115
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Dynamic sized arrays

Tue Jun 12, 2018 4:46 pm

ejolson wrote:
Tue Jun 12, 2018 4:01 pm
Heater wrote:
Tue Jun 12, 2018 7:20 am
I guess the old school method using "new" is frowned upon today. It requires you remember to have a corresponding "delete" somewhere.
With malloc it is clear that free does not have to be called at the end of the program because the heap is automatically freed by the operating system upon termination. I'm not sure about new and delete, because instead of freeing memory from the heap a destructor might perform file output, manipulate System V semaphores and shared memory or perform some other action that would persist after program termination.

Does the C++ runtime automatically call the destructors for all objects which haven't already had their destructors called at termination?
Depends, destructors are called for automatic variables but not ones you create with new

Code: Select all

int main() {
    Array auto_var(3);  // This will have its destructor called
    Array *ptr_var(4);  // This will not have its destructor called
    return 0;
}
She who travels light — forgot something.

Heater
Posts: 9727
Joined: Tue Jul 17, 2012 3:02 pm

Re: Dynamic sized arrays

Tue Jun 12, 2018 7:46 pm

ejolson,
With malloc it is clear that free does not have to be called at the end of the program because the heap is automatically freed by the operating system upon termination.
and
Does the C++ runtime automatically call the destructors for all objects which haven't already had their destructors called at termination?
This is a deep misunderstanding.

If your program runs, does a thing, and terminates, the operating system will clean up and reclaim whatever resources the program has used. Memory, sockets, whatever. No matter if those destructors have been called or not.

In that way one can be sloppy about not matching "free" with "malloc" or "delete" with "new". The OS takes care of it when we are done.

Where I come from programs are expected to run forever. That means that if it uses "new" or "malloc" to get something done, it had better "delete" or "free" it when that little task is done. Else we are going to eat memory with dead objects and eventually get killed with an out of memory error.

Same applies to opening and closing sockets or whatever else.

Return to “C/C++”

Who is online

Users browsing this forum: No registered users and 5 guests