forked from nipraxis/textbook
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathboolean_indexing_nd.Rmd
192 lines (143 loc) · 5.6 KB
/
boolean_indexing_nd.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
---
jupyter:
jupytext:
notebook_metadata_filter: all,-language_info
split_at_heading: true
text_representation:
extension: .Rmd
format_name: rmarkdown
format_version: '1.2'
jupytext_version: 1.13.7
kernelspec:
display_name: Python 3
language: python
name: python3
orphan: true
---
# Boolean indexing with more than one dimension
```{python}
# import common modules
import numpy as np # the Python array package
import matplotlib.pyplot as plt # the Python plotting package
```
First we make a 3D array of shape (4, 3, 2)
```{python}
# create an array of numbers from 0 to 11 inclusive, reshape this into a 4 rows by 3 columns array
plane0 = np.reshape(np.arange(12), (4, 3))
plane0
```
```{python}
# create an array of numbers from 100 to 111 inclusive, reshape this into a 4 rows by 3 columns array
plane1 = np.reshape(np.arange(100, 112), (4, 3))
plane1
```
```{python}
# create a 3D array, full of 0s, with 4 rows, 3 columns and 2 planes, all filled with 0s
arr_3d = np.zeros((4, 3, 2))
# set the first plane to be equal to plane0
arr_3d[:, :, 0] = plane0
# set the first plane to be equal to plane1
arr_3d[:, :, 1] = plane1
# show the array
arr_3d
```
Pictured in 3D space, this array looks as below:
![](images/arr_3d_planes.jpg)
We can index this with a one-dimensional Boolean array. This selects
elements from the first axis.
```{python}
bool_1d = np.array([False, True, True, False])
arr_3d[bool_1d]
```
Pictured in 3D space, the first axis is the rows, across each plane. So, using `[False, True, True, False]` to index `arr_3d` is equivalent to stating "give me the the 2nd and 3rd rows, across both remaining planes, from `arr_3d`".
Put another way, the 1D Boolean index is saying, "Give me only the *rows* with True, and all the columns and all the planes".
Remember that the `True` and `False` values in the boolean array act as 'switches': only the elements corresponding to `True` make it into the final array:
![](images/arr_3d_1d_bool.jpg)
```{python}
# show the final array, from indexing arr_3d with bool_1d
arr_3d[bool_1d]
```
We can also index with a two-dimensional Boolean array.
Put another way, we want "Only the rows, columns with a True value, and all the planes".
```{python}
bool_2d = np.array([[False, True, False],
[True, False, True],
[True, False, False],
[False, False, True],
])
bool_2d
```
If we index with this array, it selects elements
from the first *two* axes.
Put another way, we want "Only the rows, columns with a True value, and all the planes".
```{python}
# index arr_3d with bool_2d
arr_3d[bool_2d]
```
You can think of the `bool_2d` array as 'switching' on or off individual columns of the `arr_3d` array. If we show this as a sequence, pictured in 3D space, it looks as follows:
![](images/arr_3d_2d_bool.jpg)
In this case, using `bool_2d` to index `arr_3d` yields an array with one plane, with 5 rows and 2 columns:
```{python}
arr_3d[bool_2d]
```
```{python}
# show the shape of the arr_3d array, when indexed with the bool_2d boolean array
arr_3d[bool_2d].shape
```
We can even index with a 3D array, this selects elements over all three
dimensions. In which order does it get the elements?
```{python}
# A zero (=False) array of all Booleans (dtype=bool)
bool_3d = np.zeros((4, 3, 2), dtype=bool)
bool_3d
```
```{python}
# Fill in the first plane with the original 2D array
bool_3d[:, :, 0] = bool_2d
# Fill in the second plane with another 2D array.
another_bool_2d = np.array([[True, True, False],
[False, False, False],
[True, True, False],
[True, False, False],
])
bool_3d[:, :, 1] = another_bool_2d
bool_3d
```
```{python}
# Use bool_3d to index arr_3d
arr_3d[bool_3d]
```
![](images/arr_3d_3d_bool.jpg)
Now we have done the indexing in one, two and three dimensions, let us think about the rule for the shape.
The Boolean index selects the items on the corresponding dimension(s).
Therefore:
* A 1D index on its own selects rows (and leaves columns and planes intact).
The output has one *row* for each True in the array, and the original number
of columns and planes. It is therefore 3D.
* A 2D index on its own selects row, column elements (and leaves planes
intact). The output has one *row* for each True in the 2D array (each
selected row, column element), and the original number of planes. It is
therefore 2D.
* A 3D index selects row, column, plane elements. The output has one *row* for
each True in the 3D array (each selected row, column, plane element). It is
therefore 1D.
What's the rule?
The dimensions in the indexing Boolean array get collapsed to a single
dimension. Call that the Boolean dimension. The dimension has the same number
of elements are there are True values in the Boolean array. The remaining
dimensions persist. In our case:
* 1D Boolean collapses the single dimension to a single dimension - giving
Boolean dimension x columns x planes.
* 2D Boolean collapses two dimensions to a single dimension - giving Boolean
dimension x planes.
* 3D Boolean collapses all three dimensions to a single dimension - giving
Boolean dimension.
In each of the cases above, the Boolean dimension has the same length as the
number of True elements in the indexing Boolean array.
## But wait
We can also mix 1D Boolean arrays with ordinary slicing to select elements on
a single axis. E.g. the code below returns the entire second plane of `arr_3d`:
```{python}
bool_1d_dim3 = np.array([False, True])
arr_3d[:, :, bool_1d_dim3]
```