sonettguy
Posts: 142
Joined: Wed Jan 10, 2018 7:29 pm
Location: texas, USA

trouble adding a subset of matrix elements (numpy)

Tue May 28, 2019 9:53 pm

I'm trying to add a subset of elements from one column of a mult-dimensional matrix, but I cannot figure out how to do it elegantly. I would like to use a running summation to make a decision. Here, the decision is simply to print a line. To illustrate, I have created a file called temp3_add. It is as follows. As you will see, the next to last line is not correct (the line with np.sum). What should I replace it with? I cannot find an example of code where only a portion of one dimension is summed.

Code: Select all

#!/usr/bin/env python3
# Looking into adding few elements of a martix
#
import numpy as np
pwr=np.random.randint(0,6,(2,2,2,20))
print("pwr is",pwr.shape)
print(pwr[:,:,:,0:11])
for i in range(2):
  for j in range(2):
    for k in range(2):
      for m in range(20):
        if m>8:
          if np.sum(pwr[i,j,k,m:m-9])<4:
            print("sum of ten less than four:",np.sum(pwr[i,j,k,m:m-9]),i,j,k,m)
The result follows. It repeats until the end. The 'sum' is always zero, since I seem to be using the 'np.sum' improperly. It seems it is only used to sum an entire column or row (or a whole matrix), rather than summing only ten elements of a 20-element row. Below, I print only the first 11 entries in the 20-element row. I was hoping to see the first two sums as 1+2+3+0+0+4+3+2+2+1=18 and 2+3+0+0+4+3+2+2+1+0=17 (neither of which would have printed since they are each greater than 4).

Code: Select all

user@piname:~/esi $ python3 temp3_add
pwr is (2, 2, 2, 20)
[[[[1 2 3 0 0 4 3 2 2 1 0]
[2 2 5 4 0 5 3 0 5 1 5]]

[[4 4 1 0 0 1 1 0 5 2 5]
[2 2 2 4 3 5 0 0 3 2 3]]]

[[[5 4 4 5 4 3 2 4 2 4 2]
[3 2 3 2 2 3 1 2 0 1 2]]

[[1 5 0 1 0 0 0 1 5 4 3]
[1 2 3 2 2 2 2 0 1 3 1]]]]
sum of ten less than four: 0 0 0 0 9
sum of ten less than four: 0 0 0 0 10
sum of ten less than four: 0 0 0 0 11
sum of ten less than four: 0 0 0 0 12
sum of ten less than four: 0 0 0 0 13
sum of ten less than four: 0 0 0 0 14
sum of ten less than four: 0 0 0 0 15
sum of ten less than four: 0 0 0 0 16
sum of ten less than four: 0 0 0 0 17
sum of ten less than four: 0 0 0 0 18
sum of ten less than four: 0 0 0 0 19
sum of ten less than four: 0 0 0 1 9
sum of ten less than four: 0 0 0 1 10
sum of ten less than four: 0 0 0 1 11
sum of ten less than four: 0 0 0 1 12
sum of ten less than four: 0 0 0 1 13
sum of ten less than four: 0 0 0 1 14
sum of ten less than four: 0 0 0 1 15
sum of ten less than four: 0 0 0 1 16
sum of ten less than four: 0 0 0 1 17
sum of ten less than four: 0 0 0 1 18
sum of ten less than four: 0 0 0 1 19

User avatar
paddyg
Posts: 2541
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: trouble adding a subset of matrix elements (numpy)

Tue May 28, 2019 11:33 pm

You basically don't use loops with numpy. If you do it is much less efficient and is slower than standard python.

Code: Select all

#you can create a view on your array like this
pwr_slice = pwr[:,:,:,9:]
#and an index of all the values > 5 like this
ix = pwr_slice < 4
# and sum the slice at those indexes like this
np.sum(pwr_slice[ix])
# or all in one line (which might run quicker for bigger arrays but is harder to follow)
np.sum(pwr[:,:,:,9:][pwr[:,:,:,9:] < 4])
PS looking at your code a bit more I see that you are wanting to take sums over a sliding slice of an array. That's probably quite awkward to do in numpy. Your code looks to have the index values the wrong way round in the last dimension i.e should be pwr[i,j,k,m-10:m] However you can sum over part of an array and only over a certain dimension so to do the kind of thing you were aiming at I would

Code: Select all

import numpy as np

pwr = np.random.randint(0, 6, (200000,2,2,20)) # need turn the volume up to get any hits. But that's why you would use numpy

# this is one way to do it which will just give you the answers but less info about where the match occurred
for m in range(11): 
  ix = np.sum(pwr[:,:,:,m:m+10], axis=3) < 4
  if ix.any():
    print("from {} to {}: {}".format(m, m+10, pwr[:,:,:,m:m+10][ix]))

# np.where creates a tuple of individual index vals so you can see where the hits were
for m in range(11):
  ix = np.where(np.sum(pwr[:,:,:,m:m+10], axis=3) < 4)
  if len(ix[0]) > 0:
    for v in range(len(ix[0])):
      i, j, k = ix[0][v], ix[1][v], ix[2][v]
      print("i={},j={},k={} m from {} to {}: {}".format(i, j, k, m, m+10, pwr[i,j,k,m:m+10]))
Of course it depends very much on what your final goal is. This might still be the wrong approach to your problem!
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

sonettguy
Posts: 142
Joined: Wed Jan 10, 2018 7:29 pm
Location: texas, USA

Re: trouble adding a subset of matrix elements (numpy)

Thu May 30, 2019 3:19 pm

Thanks very much, @paddyg. I need to adjust my thinking based on your first observation. Good advice. I'll work on this and see if I can get out of the 'loop' mentality.

sonettguy
Posts: 142
Joined: Wed Jan 10, 2018 7:29 pm
Location: texas, USA

Re: trouble adding a subset of matrix elements (numpy) SOLVED

Thu May 30, 2019 10:31 pm

[Spacing corrected, as pointed out in the reply that follows this message.]

So, lesson #1: numpy doesn't like range specifications that count backwards. I replaced indices [m:m-9] with [m-9:m] and my code worked.

Lesson #2: The most important lesson. Numpy matrix functions often take the place of "for" loops. Who knew. This is a startling revelation to me and will cause me to do a lot of modifications elsewhere in my code, beyond what I use as an example here. For completeness, here is the new code, including that suggested by paddyg (although I couldn't get his print statement working so added my own).

Code: Select all

#!/usr/bin/env python3
# Looking into adding few elements of a martix
#
import numpy as np
pwr=np.random.randint(0,6,(2,2,2,20))
print("pwr is",pwr.shape)
print(pwr[:,:,:,0:11])
for i in range(2):
  for j in range(2):
    for k in range(2):
      for m in range(20):
        if m>8:
          if(np.sum(pwr[i,j,k,m-9:m]))<18:
            print("sum of ten less than 18:",np.sum(pwr[i,j,k,m-9:m]),i,j,k,m)
print("--------------------------------------------------------------------------")
for m in range(11): ix=np.where(np.sum(pwr[:,:,:,m:m+10],axis=3)<18)
  if len(ix[0])>0:
    for v in range(len(ix[0])):
      i,j,k=ix[0][v],ix[1][v],ix[2][v]
      print("i=(),j=(),k=() m from () to ():()".format(i,j,k,m,m+10,np.sum(pwr[i,j,k,m:m+10])))
      print(i,j,k,m,"to",m+10)
And here is the output. The two, oddly, don't match up, but I'm not going to pursue it. It's only an example.

Code: Select all

$ python3 temp3_add
pwr is (2, 2, 2, 20)
[[[[4 3 0 5 2 0 4 4 1 3 3]
[2 0 1 3 1 0 0 0 5 3 2]]

[[1 0 4 0 0 3 3 5 3 5 4]
[4 2 5 2 1 1 1 2 2 4 1]]]


[[[4 5 3 5 3 0 2 4 3 3 2]
[3 0 2 4 5 3 1 2 2 1 2]]

[[4 5 5 1 4 0 4 4 5 3 4]
[0 0 0 0 5 0 5 5 2 2 5]]]]
sum of ten less than 18: 12 0 0 1 9
sum of ten less than 18: 13 0 0 1 10
sum of ten less than 18: 15 0 0 1 11
sum of ten less than 18: 15 0 0 1 12
sum of ten less than 18: 16 0 0 1 13
sum of ten less than 18: 16 0 0 1 14
sum of ten less than 18: 16 0 0 1 15
sum of ten less than 18: 16 0 0 1 18
sum of ten less than 18: 17 0 0 1 19
sum of ten less than 18: 17 0 1 1 12
sum of ten less than 18: 17 1 0 0 18
sum of ten less than 18: 17 1 0 1 14
sum of ten less than 18: 16 1 0 1 15
sum of ten less than 18: 17 1 1 1 9
--------------------------------------------------------------------------
i=(),j=(),k=() m from () to ():()
0 0 1 0 to 10
i=(),j=(),k=() m from () to ():()
0 0 1 1 to 11
i=(),j=(),k=() m from () to ():()
0 0 1 2 to 12
i=(),j=(),k=() m from () to ():()
0 0 1 4 to 14
i=(),j=(),k=() m from () to ():()
0 0 1 5 to 15
Last edited by sonettguy on Fri May 31, 2019 3:08 am, edited 1 time in total.

User avatar
paddyg
Posts: 2541
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: trouble adding a subset of matrix elements (numpy)

Thu May 30, 2019 11:17 pm

Hi, yes, if you find yourself using loops then probably you should not use numpy - apart from things like the moving sum thing you're using. If your final problem is along the lines of your example (2,2,2,20) array then using standard python arrays will be simpler and faster (though time obviously isn't an issue) If, on the other hand, your real data has thousands of elements in some dimensions then you need to get rid of your loops (if you want to stick to numpy). numba or cython can convert normal python into very fast executables.

You should replace the indentation in the code you posted with spaces (you should always use spaces for indentation in python imho) otherwise it doesn't really make any sense (but I think I can figure out what it's supposed to be doing ;)

Your loop version is doing a different thing from my version as you are summing over eleven slices of nine i.e 0:9, 1:10, 2:11... 10:19 whereas I sum over ten slices of ten 0:10, 1:11, 2:12 ... 9:19

I think the "python {}".format(x) system requires curly brackets inside the string that you are formatting with other values. Normal brackets just print as, well, normal brackets.

Paddy
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

User avatar
paddyg
Posts: 2541
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: trouble adding a subset of matrix elements (numpy)

Fri May 31, 2019 7:32 am

PS python can use ranges that count backwards using step -1 and just as with positive steps it starts with the first value and stops the iteration before equaling the last value... but the way you are doing it will run into problems getting to the zeroth index:

Code: Select all

# if m == 12
pwr[m-9:m] #=> pwr[3:12:1] = 3,4,5,6,7,8,9,10,11
equivalent
pwr[m-1:m-10:-1] #=> pwr[11:2:-1] = 11,10,9,8,7,6,5,4,3
# but when m == 9
pwr[0:9:1] => 0,1,2,3,4,5,6,7,8
pwr[8:-1:-1] => empty array because negative index values work back from end of array
# you would have to do something horrible like
pwr[m-1:m-10 if m > 9 else None:-1]
So only worth using negative steps if you really need them!

Paddy
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

sonettguy
Posts: 142
Joined: Wed Jan 10, 2018 7:29 pm
Location: texas, USA

Re: trouble adding a subset of matrix elements (numpy)

Fri May 31, 2019 1:13 pm

Thanks. Yes, I don't know how many times 'not including the last value' has gotten me. Many. Good to know you can direct numpy to count backwards. In this case it is not necessary, but good to know for the future. I am dealing with a matrix with [2,2,48000.4]. It is data captured at a high sample for two antennas. The pi is just temporary for development and will be replaced with a faster processor/DSP pair eventually. Now, I must go clean up several areas with 'for' loops....

User avatar
paddyg
Posts: 2541
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: trouble adding a subset of matrix elements (numpy)

Fri May 31, 2019 1:44 pm

Ah, now you mention *that* it sounds like convolution might be what you want! There's numpy.convolve and various scipy functions. By using something like that you can get round the for loop over value of m (and therefor speed things up. If you have to upgrade your processor to make the loop system fast enough then improving the performance might enable you to stick withe RPi!)

Maybe describe what your final objective is and I or someone else might be able to help with a really neat one line solution
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

sonettguy
Posts: 142
Joined: Wed Jan 10, 2018 7:29 pm
Location: texas, USA

Re: trouble adding a subset of matrix elements (numpy)

Fri May 31, 2019 9:33 pm

Well, first, thank you for offering. I'm sitting here trying to think exactly how to set this up in an open forum and provide enough information for the case to make sense. It is a small section of a rather lengthy body of code. I'll give it a shot.

I read data passed to me from another system which has captured signals arriving at two antennas, which I refer to as inputs. I may have two such systems, which I refer to as manifolds. The data is passed to me as a 48,000 complex datapoint stream, 24,000 for input 1 followed by 24,000 for input 2. I calculate power, which is where we begin this discussion. The power matrix is pwr[manifolds, inputs, datapoints/input]. That happens to be [1,2,24000] in the simplest test case.

I want to find the rising edge of the first pulse in each input dataset. I have already reduced the datapoints from 24,000 to 4096. I know the first pulse lies somewhere within that 4096 slice. I know that the pulse width is 2048. Once I have it, I take an FFT of the pulse. That part is already written and tested. The portion I am currently writing is the section finding that rising edge. For this test, I have visually verified that the pulse actually starts at datapoint 1025. That is why you will see print statement looking at points around that figure.

This is what I have currently.

Code: Select all

# This section finds the pulse in the input data stream
# In the test case 'Example #2' that means 2048 of 4096 datapoints
#
# Calculate Power
pwr=np.add(np.square(data.real),np.square(data.imag))
print("'pwr' matrix is",pwr.shape)
print(pwr[:,:,1015:1040])
N0=np.mean(pwr)
print ("N0 is average power",N0)
print("---------------------------------------------------------------")
# With the average power, find the pulse
# Here we use a gross detection, then fine tune it (to save computing time) 
# First, redefine 'pwr' as a boolean matrix 1=exceeds threshold, 0=doesn't 
pwr=np.where(pwr>N0/2,np.int(1),np.int(0)) 
# change pwr matrix to boolean based on threshold 
print("pwr as 1s and 0s is",pwr.shape) 
print(pwr[:,:,1015:1040])

# examine pwr for the beginning of the first pulse in each input
pwr=pwr.resize(manifolds,inputs,np.int(indat/10),10)
print("pwr ready for summing is",pwr.shape)
print(pwr[:,:,0-6,:])
pwrsum=np.sum(pwr,axis=3)
print("pwrsum is",pwrsum.shape)
print(pwrsum[:,:,0:40])
The last section doesn't work as is. I am trying to resize the power matrix to include a new dimension with each ten datapoints. (Thus, I abandon the rolling sum.) I will sum that dimension, which I believe will retract the matrix back to three dimensions (at least it will if np.sum works like np.average in that respect) giving me a new matrix 'pwrsum.' Then, I will find the ten-point section that has the rising edge and write more code (which I haven't started) to find the exact datapoint (in 'pwr') where it rises. That is my idea to speed it up.

Code: Select all

The current code, in the last section, gives me a scalar for pwrsum.
'pwr' matrix is (1, 2, 4096)
[[[2.59167454e-05 6.22989088e-06 1.86896511e-05 2.61999725e-05
2.01763086e-05 1.54260481e-05 2.14214460e-06 5.43702520e-06
1.12911418e-05 1.45992022e-03 1.16419753e-03 3.66901296e-03
1.47404067e-03 4.23945951e-03 8.81071269e-03 7.73278334e-03
4.84947192e-04 1.24258423e-03 2.37445083e-03 2.31586487e-03
3.59474264e-03 1.21379329e-03 5.22641165e-03 9.74239688e-04
3.29744981e-03]
[1.28948850e-05 4.23178476e-06 3.61211223e-05 1.19399182e-05
1.71741414e-05 1.00669951e-07 6.56602973e-06 1.00561338e-05
7.75410970e-06 2.02694631e-04 2.13421632e-03 6.29514901e-04
5.82015401e-03 6.08337756e-05 6.53554687e-03 8.22032727e-03
6.05838853e-03 9.52441538e-05 2.24991930e-03 8.19040495e-04
4.25328955e-03 3.20162519e-03 1.81849150e-04 6.46625573e-03
5.93614369e-04]]]
N0 is average power 0.0015424418591345873
---------------------------------------------------------------
pwr as 1s and 0s is (1, 2, 4096)
[[[0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1]
[0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 1 1 1 1 0 1 0]]]
Traceback (most recent call last):
File "scat005.py", line 112, in <module>
print("pwr ready for summing is",pwr.shape)
AttributeError: 'NoneType' object has no attribute 'shape'
This may not be enough information to actually help, but this is about as much as I can provide. Thanks for taking a look.

User avatar
paddyg
Posts: 2541
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: trouble adding a subset of matrix elements (numpy)

Fri May 31, 2019 11:42 pm

Hi, that looks interesting. What are the characteristics of the data stream and the pulse? i.e. how much noise is there and can the pulse be determined by looking at a short length (10 values) or do you have to check that it remains high for 2048 values as you say? Is the pulse always relative to the mean of the whole data stream - if the pulse is smaller than the base values then the mean will be lower and the pulse harder to differentiate.

Here is an example of picking out a pulse in noisy data you can mess around with the size of the noise, signal and window size for smoothing. I use a threshold of the mid point between the min and max of the smoothed signal (excluding the start and end) The pulse point is the index of the first nonzero from the condition pwr_smoothed > mid_val

Code: Select all

import numpy as np
import matplotlib.pyplot as plt
W_SZ = 50

x = np.linspace(0.0, 1.0, 4096)
#pwr = np.random.randn(4096) * 0.02 + 0.3 # try these different noisinesses
pwr = np.random.randn(4096) * 0.05 + 0.3
#pwr = np.random.randn(4096) * 0.1 + 0.3

window = 1.0 - np.abs(np.linspace(-1.0, 1.0, W_SZ)) # this is a triangular filter
window /= np.sum(window) # normalize to sum = 1

pwr[1234:1567] += 0.1 # make a pulse

pwr_smoothed = np.convolve(pwr, window, 'same')
mean_line = np.ones_like(x) * np.mean(pwr)
mid_val = (np.max(pwr_smoothed[W_SZ:-W_SZ]) + np.min(pwr_smoothed[W_SZ:-W_SZ])) * 0.5
mid_line = np.ones_like(x) * mid_val

plt.plot(x, pwr, '.', x, pwr_smoothed, x, mean_line, x, mid_line)
plt.show()

print("estimated start of pulse {} c.f. actual start at 1234".format((pwr_smoothed > mid_val).nonzero()[0][0]))
Figure_1.jpg
Figure_1.jpg (110 KiB) Viewed 926 times
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

sonettguy
Posts: 142
Joined: Wed Jan 10, 2018 7:29 pm
Location: texas, USA

Re: trouble adding a subset of matrix elements (numpy)

Sat Jun 01, 2019 12:33 am

Nice. You are quite talented. I'm jealous of your matplotlib. I have it one another pi, but on this one, I can't get it to work. I tried installing with pip3, then conda. It always says "no such file or directory." I read where that happens with multiple installations, so I painstakingly uninstalled what I could, then searched and deleted other occurrences. After a full day of chasing files, I felt I had cleaned it up. I reinstalled with conda and it gave me the same result. I don't know if there is some config file somewhere with two references or what, but I don't have time to chase it.

I'll have to study convolve. That's new to me.

My raw data looks like this, here showing the first six pulses and second showing the edge of the first pulse.
Image

Image
When I get to the power matrix and alter it to ones and zeros, it looks like this...first showing the first two pulses, second showing the edge of the first pulse.
Image

Image

User avatar
paddyg
Posts: 2541
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: trouble adding a subset of matrix elements (numpy)

Sat Jun 01, 2019 8:13 am

Installing modules is the worst aspect of python by a long way: complicated by the fact that there are two incompatible versions maintained in parallel, and that the default (if you type 'python' on raspbian it uses python2) is not going to be supported in a few months time!

Your data looks pretty clean. I would still use convolve() to smooth the pwr series then look for the first point that the smoothed pwr goes over (max + min)/2. If I increase the separation between the background and pulse levels in my example code I get the exact start of the pulse every time.

You can do pretty good smoothing with simple rectangular convolutions ie window = np.array([0.1] * 10) but I have found that the triangular array works well generally.

Paddy

PS I find the code easier to read (it's all relative as numpy is pretty alien) if I stick to the more normal 'math' style formulas rather than the old 'function' style numpy i.e.

Code: Select all

# Calculate Power
pwr = data.real ** 2 + data.imag ** 2 #<<<<<<<<<<
print("'pwr' matrix is",pwr.shape)
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

sonettguy
Posts: 142
Joined: Wed Jan 10, 2018 7:29 pm
Location: texas, USA

Re: trouble adding a subset of matrix elements (numpy)

Sat Jun 01, 2019 1:29 pm

Good suggestion. I wonder how execution time changes function vs. math formula. Here, execution time trumps ease of reading.

User avatar
paddyg
Posts: 2541
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: trouble adding a subset of matrix elements (numpy)

Sat Jun 01, 2019 1:49 pm

Essentially doesn't make any difference, in this case - sometimes the python syntax is quicker, sometimes slower! See

Code: Select all

import timeit

setup = '''
import numpy as np
a = np.random.rand(24_000) + np.random.rand(24_000) * 1j
'''

fns = ['''
b = a.real ** 2 + a.imag ** 2
''', '''
c = np.add(np.square(a.real), np.square(a.imag))
''']

for f in fns:
  print(timeit.timeit(f, setup, number=10_000))
On this laptop this takes 0.67s each. That's 6.7μs to square and sum 24k values. (and 1.3μs for 4096 values)
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

Return to “Python”