1. Started with 2019-09-26-raspbian-buster.zip.
2. Working in a pyvenv virtual environment built with the native Python 3.7.3.
3. Building the numpy module was fairly straightforward, apart from the gfortran issue.
4. Found that the linalg submodule is very slow: about 20 to 50 times slower than on the RasPi 3B+.
5. I suspected inappropriate libraries (OpenBLAS vs. ATLAS, for example) and swapped some of them, but this made no difference.
6. I even tried installing numpy with 'pip install --no-binary :all: numpy', but the result was the same: the resulting np.linalg was just as slow.
7. I inserted into the RasPi 4 one of the microSD cards that work properly on the RasPi 3B+ and earlier boards, but numpy remains just as slow.
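Since steps 5 and 6 depend on which BLAS/LAPACK library numpy actually loads at run time, one way to check is to look at the shared objects mapped into the Python process after importing numpy.linalg. The following is only a minimal sketch, assuming a Linux system with /proc/self/maps; the function names blas_like_libraries and loaded_blas_libraries are made up for illustration, and the name filter (blas, lapack, atlas) is a guess at what the relevant files are called.

```python
import re


def blas_like_libraries(maps_text):
    """Return sorted, unique paths of mapped shared objects whose file
    name looks like a BLAS/LAPACK library (hypothetical name filter)."""
    pattern = re.compile(r"blas|lapack|atlas", re.IGNORECASE)
    paths = set()
    for line in maps_text.splitlines():
        # /proc/<pid>/maps columns: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        if ".so" in name and pattern.search(name):
            paths.add(path)
    return sorted(paths)


def loaded_blas_libraries():
    """Import numpy.linalg, then report the BLAS-like libraries that are
    mapped into this process (Linux only, assumes /proc is mounted)."""
    import numpy.linalg  # noqa: F401  (forces the LAPACK backend to load)
    with open("/proc/self/maps") as fh:
        return blas_like_libraries(fh.read())
```

Calling print(loaded_blas_libraries()) inside the same virtual environment on each board would show whether both are really using the same library files.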
I'll show some code to clarify what I have done so far:
* NumPy on the old card works fine on the RasPi 3B+:
(pve37) fukuda@raspi23:~/pve37% ipython
Python 3.7.3 (default, Apr 3 2019, 05:39:12)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.8.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import numpy as np
In [2]: A = np.random.rand(256, 256)
In [3]: %timeit B = np.linalg.inv(A)
28.9 ms ± 325 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [4]: %timeit C = np.fft.fft2(A)
21.8 ms ± 705 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [5]: np.show_config()
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
NOT AVAILABLE
atlas_3_10_blas_threads_info:
NOT AVAILABLE
atlas_3_10_blas_info:
NOT AVAILABLE
atlas_blas_threads_info:
NOT AVAILABLE
atlas_blas_info:
language = c
define_macros = [('HAVE_CBLAS', None), ('NO_ATLAS_INFO', -1)]
libraries = ['f77blas', 'cblas', 'atlas', 'f77blas', 'cblas']
library_dirs = ['/usr/lib/arm-linux-gnueabihf']
blas_opt_info:
language = c
define_macros = [('HAVE_CBLAS', None), ('NO_ATLAS_INFO', -1)]
libraries = ['f77blas', 'cblas', 'atlas', 'f77blas', 'cblas']
library_dirs = ['/usr/lib/arm-linux-gnueabihf']
lapack_mkl_info:
NOT AVAILABLE
openblas_lapack_info:
NOT AVAILABLE
openblas_clapack_info:
NOT AVAILABLE
flame_info:
NOT AVAILABLE
atlas_3_10_threads_info:
NOT AVAILABLE
atlas_3_10_info:
NOT AVAILABLE
atlas_threads_info:
NOT AVAILABLE
atlas_info:
language = f77
libraries = ['lapack', 'f77blas', 'cblas', 'atlas', 'f77blas', 'cblas']
library_dirs = ['/usr/lib/arm-linux-gnueabihf']
define_macros = [('NO_ATLAS_INFO', -1)]
lapack_opt_info:
language = f77
libraries = ['lapack', 'f77blas', 'cblas', 'atlas', 'f77blas', 'cblas']
library_dirs = ['/usr/lib/arm-linux-gnueabihf']
define_macros = [('NO_ATLAS_INFO', -1)]
* The same card and the same NumPy on the RasPi 4:
In [4]: A = np.random.rand(256, 256)
In [5]: %timeit B = np.linalg.inv(A)
1.3 s ± 225 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [6]: %timeit C = np.fft.fft2(A)
13 ms ± 11.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
It looks as though the RasPi 3 and RasPi 4 treat the same binary in significantly different ways, and I have no idea where to look next. Any advice or suggestions would be appreciated.
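To reproduce the comparison outside IPython, a small script along these lines could be run unchanged on both boards. This is a rough sketch only: best_time_ms and run_benchmark are hypothetical helper names, and the np.dot timing is an addition not present in the sessions above, included on the assumption that it helps separate plain BLAS performance from the LAPACK routine behind np.linalg.inv.

```python
import timeit


def best_time_ms(func, repeat=5, number=3):
    """Best per-call wall time in milliseconds over `repeat` rounds
    of `number` calls each."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number * 1000.0


def run_benchmark(n=256):
    """Time the operations from the sessions above on an n x n matrix."""
    import numpy as np  # imported here so the timing helper has no numpy dependency

    a = np.random.rand(n, n)
    return {
        "linalg.inv": best_time_ms(lambda: np.linalg.inv(a)),
        "fft.fft2": best_time_ms(lambda: np.fft.fft2(a)),
        # Extra case (assumption): matrix product, which exercises BLAS only.
        "dot": best_time_ms(lambda: np.dot(a, a)),
    }


def print_report(results):
    """Print one line per operation, e.g. 'linalg.inv     28.90 ms'."""
    for name, ms in results.items():
        print("{:<12s} {:10.2f} ms".format(name, ms))
```

Running print_report(run_benchmark()) on the RasPi 3B+ and on the RasPi 4 with the same card gives numbers that can be compared directly; if only linalg.inv is slow while dot and fft2 are not, that would point at the LAPACK side rather than at numpy as a whole.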