NUMPY_LIB_PLAYGROUND


Slicing - None, ... (three dots), and Multiple Dimensions

I found these quite confusing, so I am noting them down (see [REF]).

First, when None appears inside an array index, it inserts a new axis of length 1 at that position (numpy.newaxis is an alias of None). To verify:

>>> import numpy as np
>>> np.newaxis # np.newaxis is None, so the REPL prints nothing;
>>> type(np.newaxis)
<class 'NoneType'>

Verify that None adds a new axis when used in an index:

>>> a = np.arange(25).reshape((5,5))
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> print(a[:, None],'\n\nShape is: ', a[:, None].shape)
[[[ 0 1 2 3 4]]

[[ 5 6 7 8 9]]

[[10 11 12 13 14]]

[[15 16 17 18 19]]

[[20 21 22 23 24]]]

Shape is: (5, 1, 5)

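In the code above, None sits in the second index position (a[:, None]), so NumPy inserts a new axis there. Originally the second axis held the 5 elements 0, 1, 2, 3, 4 of each row; after None is applied, the second axis has length 1 and holds the single row [ 0 1 2 3 4], whose 5 elements now live on a third axis. None wraps the 5 elements in one more pair of brackets without changing them, and the shape goes from (5, 5) to (5, 1, 5). So remember: wherever None is placed in the index, a new axis of length 1 appears at that position. More examples: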

>>> a[:, :, None]
array([[[ 0],
[ 1],
[ 2],
[ 3],
[ 4]],

[[ 5],
[ 6],
[ 7],
[ 8],
[ 9]],

[[10],
[11],
[12],
[13],
[14]],

[[15],
[16],
[17],
[18],
[19]],

[[20],
[21],
[22],
[23],
[24]]])
>>> a[:, :, None, None] # : (axis 1), : (axis 2), None (axis 3), None (axis 4);
array([[[[ 0]],

[[ 1]],

[[ 2]],

[[ 3]],

[[ 4]]],


[[[ 5]],

[[ 6]],

[[ 7]],

[[ 8]],

[[ 9]]],


[[[10]],

[[11]],

[[12]],

[[13]],

[[14]]],


[[[15]],

[[16]],

[[17]],

[[18]],

[[19]]],


[[[20]],

[[21]],

[[22]],

[[23]],

[[24]]]])

All right, enough None; the output is already unreadable. Let's try an example that is less ugly but raises an error:

>>> a[:, :, None, :]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: too many indices for array

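The code above fails because the index contains three slices (:), while array a has only two axes, so the last : has no axis of a to index. (None only inserts a new axis and never consumes an existing one, so None by itself cannot cause this error, wherever it is placed.)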
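As a quick check of the rule, here is a small sketch (the names shapes and raised are mine, chosen only for illustration) that records the shape produced by None in each index position and confirms that a[:, :, None, :] raises IndexError on a 2-D array.

```python
import numpy as np

a = np.arange(25).reshape((5, 5))

# Each None inserts one new axis of length 1 at its position in the index.
shapes = {
    "a[None, :, :]": a[None, :, :].shape,              # (1, 5, 5)
    "a[:, None, :]": a[:, None, :].shape,              # (5, 1, 5)
    "a[:, :, None]": a[:, :, None].shape,              # (5, 5, 1)
    "a[:, None, :, None]": a[:, None, :, None].shape,  # (5, 1, 5, 1)
}
for expr, shape in shapes.items():
    print(expr, "->", shape)

# Three slices need three existing axes, which the 2-D array a does not have.
try:
    a[:, :, None, :]
    raised = False
except IndexError:
    raised = True
print("IndexError raised:", raised)
```

Counting only the non-None entries of an index tells you how many existing axes it needs.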

And what does ... (three dots, the Ellipsis object) mean in an index? It stands for as many colons as are needed to cover all the remaining axes:

>>> a[..., None]
array([[[ 0],
[ 1],
[ 2],
[ 3],
[ 4]],

[[ 5],
[ 6],
[ 7],
[ 8],
[ 9]],

[[10],
[11],
[12],
[13],
[14]],

[[15],
[16],
[17],
[18],
[19]],

[[20],
[21],
[22],
[23],
[24]]])
>>> a[..., None].shape
(5, 5, 1)
>>> a[None, ...]
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]]])
>>> a[None, ...].shape
(1, 5, 5)
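To see the "omitted colons" reading on something other than a 2-D array, here is a minimal sketch on a 3-D array (x and its shape are my own choice for illustration). It compares each ... index with the spelled-out version.

```python
import numpy as np

x = np.arange(24).reshape((2, 3, 4))

# ... expands to as many ':' as needed to cover the axes not indexed explicitly.
same_last = np.array_equal(x[..., 0], x[:, :, 0])
same_first = np.array_equal(x[0, ...], x[0, :, :])
same_middle = np.array_equal(x[0, ..., 1], x[0, :, 1])
print(same_last, same_first, same_middle)

# Combined with None: add a trailing or leading axis whatever the number of axes.
print(x[..., None].shape)   # (2, 3, 4, 1)
print(x[None, ...].shape)   # (1, 2, 3, 4)

# The literal ... is the built-in Ellipsis object.
print(... is Ellipsis)      # True
```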

Enough, enough. All is emptiness, all is None (including that last sentence).


Axis

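For axis in NumPy, it is best to look at the example in the official documentation ("The N-dimensional array", Calculation section [REF]); its example is simple, concise and accurate (compared with blog posts):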

# Example of the axis argument
# A 3-dimensional array of size 3 x 3 x 3, summed over each of its three axes
>>> x = np.arange(27).reshape((3, 3, 3))
>>> x
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
>>> x.sum(axis=0)
array([[27, 30, 33],
[36, 39, 42],
[45, 48, 51]])
>>> # for sum, axis is the first keyword, so we may omit it,
>>> # specifying only its value
>>> x.sum(0), x.sum(1), x.sum(2)
(array([[27, 30, 33],
[36, 39, 42],
[45, 48, 51]]),
array([[ 9, 12, 15],
[36, 39, 42],
[63, 66, 69]]),
array([[ 3, 12, 21],
[30, 39, 48],
[57, 66, 75]]))

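In the example above, to work out what axis=0/1/2 each produce, first find the shape of the array, which here is (3, 3, 3), where:

  • the first 3 corresponds to axis=0: it is the direction that runs through the elements e1, e2, e3 inside the outermost [] of array([e1,e2,e3]); axis=0 operates across these e1, e2, e3;
  • the second 3 corresponds to axis=1: it is the direction that runs through the elements at the second level of nesting, i.e. array([[e1,e2,e3],[e1,e2,e3],[e1,e2,e3]]); axis=1 operates across these e1, e2, e3;
  • the third 3 corresponds to axis=2: it is the direction that runs through the elements at the third level of nesting, i.e. array([[[e1,e2,e3],[...],[...]],[[e1,e2,e3],[...],[...]],[[e1,e2,e3],[...],[...]]]); axis=2 operates across these e1, e2, e3.

With this explanation, the results of x.sum(0), x.sum(1), x.sum(2) in the official example are easy to explain: each one sums all the elements that lie across the specified axis. (Note: an axis is an abstract concept, neither intuitive nor direct. It is enough to get the logic straight, and it is normal for that to take a while; put another way, if the logic could be sorted out quickly, would it still count as abstract?)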
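One way I find useful for checking the "across" reading is to rebuild each sum by hand from slices. The sketch below does that for the same 3 x 3 x 3 array; the variable names result_shapes and kept_shapes are mine.

```python
import numpy as np

x = np.arange(27).reshape((3, 3, 3))

# The summed axis disappears from the result's shape; keepdims=True keeps it as length 1.
result_shapes = [x.sum(axis=k).shape for k in range(x.ndim)]
kept_shapes = [x.sum(axis=k, keepdims=True).shape for k in range(x.ndim)]
print(result_shapes)   # [(3, 3), (3, 3), (3, 3)]
print(kept_shapes)     # [(1, 3, 3), (3, 1, 3), (3, 3, 1)]

# axis=0: add the three outermost blocks element by element.
axis0_ok = np.array_equal(x.sum(axis=0), x[0] + x[1] + x[2])
# axis=1: inside each block, add its three rows.
axis1_ok = np.array_equal(x.sum(axis=1), x[:, 0, :] + x[:, 1, :] + x[:, 2, :])
# axis=2: inside each row, add its three entries.
axis2_ok = np.array_equal(x.sum(axis=2), x[:, :, 0] + x[:, :, 1] + x[:, :, 2])
print(axis0_ok, axis1_ok, axis2_ok)
```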

NumPy User Guide[REF]

NumPy basics[REF]

Data types[REF]

Indexing[REF]

Others - Miscellaneous Notes

numpy.load() function [REF]

Function: numpy.load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII')
Description: Load arrays or pickled objects from .npy, .npz or pickled files.

load() returns an array, tuple, dict, etc., i.e. the "data stored in the file". (For .npz files, the returned instance of the NpzFile class must be closed to avoid leaking file descriptors.) Some examples:

# Store data to disk, and load it again;
>>> np.save('/tmp/123', np.array([[1, 2, 3], [4, 5, 6]]))
>>> np.load('/tmp/123.npy')
array([[1, 2, 3],
[4, 5, 6]])
# Store several arrays in one .npz archive, and load it again (np.savez does not compress; np.savez_compressed does);
>>> a=np.array([[1, 2, 3], [4, 5, 6]])
>>> b=np.array([1, 2])
>>> np.savez('/tmp/123.npz', a=a, b=b)
>>> data = np.load('/tmp/123.npz')
>>> data['a'] # the key is the keyword name 'a' from the "a=a" argument of savez();
array([[1, 2, 3],
       [4, 5, 6]])
>>> data['b'] # the key is the keyword name 'b' from the "b=b" argument of savez();
array([1, 2])
>>> data.close()
# Mem-map the stored array, and then access the second row directly from disk;
>>> X = np.load('/tmp/123.npy', mmap_mode='r')
>>> X[1, :] # Second row;
memmap([4, 5, 6])
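Since forgetting data.close() is easy, a with block is one way to have the NpzFile closed automatically. The sketch below is my own variation on the examples above: it writes to a temporary directory instead of /tmp, and the file name arrays.npz is made up.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([1, 2])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "arrays.npz")  # hypothetical file name
    np.savez(path, a=a, b=b)

    # NpzFile works as a context manager, so it is closed when the block exits.
    with np.load(path) as data:
        keys = sorted(data.files)   # names given as keywords to savez()
        loaded_a = data["a"]        # each array is read from the archive on access
        loaded_b = data["b"]

print(keys)
print(loaded_a)
print(loaded_b)
```

The arrays are pulled out inside the with block because the archive can no longer be read once it is closed.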

Extra: Special Methods in Python

For standard library functions:

ndarray.__copy__() Used if copy.copy is called on an array.
ndarray.__deepcopy__() Used if copy.deepcopy is called on an array.
ndarray.__reduce__() For pickling.
ndarray.__setstate__(state, /) For unpickling.

Basic customization:

ndarray.__new__(*args, **kwargs) Create and return a new object.
ndarray.__array__() Returns either a new reference to self if dtype is not given or a new array of provided data type if dtype is different from the current dtype of the array.
ndarray.__array_wrap__()

Container customization: (see Indexing)

ndarray.__len__(self, /) Return len(self).
ndarray.__getitem__(self, key, /) Return self[key].
ndarray.__setitem__(self, key, value, /) Set self[key] to value.
ndarray.__contains__(self, key, /) Return key in self.

Conversion: the operations int(), float() and complex(). They work only on arrays that have one element in them and return the appropriate scalar.

String representations:

ndarray.__str__(self, /) Return str(self).
ndarray.__repr__(self, /) Return repr(self).
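These methods are rarely called by name; ordinary Python syntax and standard-library functions trigger them. A small sketch of that mapping (the array arr and the other variable names are mine):

```python
import copy
import pickle

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

shallow = copy.copy(arr)                     # uses ndarray.__copy__
deep = copy.deepcopy(arr)                    # uses ndarray.__deepcopy__
restored = pickle.loads(pickle.dumps(arr))   # uses __reduce__ and __setstate__

length = len(arr)        # ndarray.__len__: length of the first axis
has_five = 5 in arr      # ndarray.__contains__
item = arr[1, 2]         # ndarray.__getitem__
arr[0, 0] = 10           # ndarray.__setitem__

# copy.copy on an ndarray copies the data, so the copies keep the old value.
print(length, has_five, item)
print(shallow[0, 0], deep[0, 0], restored[0, 0])
print(str(arr))          # ndarray.__str__
print(repr(arr))         # ndarray.__repr__
```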

numpy.squeeze() function [REF]

Function: numpy.squeeze(a, axis=None)
Description: Removes the axes of length 1 (single-dimensional entries) from the input array a.

Returns: ndarray (squeezed), i.e. the input array, but with all or a subset of the dimensions of length 1 removed. This is always a itself or a view into a.

Raises: ValueError, if axis is not None and an axis being squeezed is not of length 1.

Examples:

>>> x = np.array([[[0], [1], [2]]])
>>> x.shape
(1, 3, 1)

>>> np.squeeze(x)
array([0, 1, 2])
>>> np.squeeze(x).shape
(3,)

>>> np.squeeze(x, axis=0)
array([[0],
[1],
[2]])
>>> np.squeeze(x, axis=0).shape
(3, 1)

>>> np.squeeze(x, axis=1).shape
Traceback (most recent call last):
...
ValueError: cannot select an axis to squeeze out which has size not equal to one

>>> np.squeeze(x, axis=2)
array([[0, 1, 2]])
>>> np.squeeze(x, axis=2).shape
(1, 3)
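squeeze is roughly the inverse of indexing with None. The sketch below (variable names are mine) adds two length-1 axes with None, removes them again, and checks that the result is still a view of the original data.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))

expanded = a[:, None, :, None]          # shape (2, 1, 3, 1)
squeezed = np.squeeze(expanded)         # all length-1 axes removed: (2, 3)
partly = np.squeeze(expanded, axis=1)   # only axis 1 removed: (2, 3, 1)
print(expanded.shape, squeezed.shape, partly.shape)

# No data is copied along the way: the squeezed result shares memory with a.
shares = np.shares_memory(squeezed, a)
print(shares)

# Writing through the view therefore changes a as well.
squeezed[0, 0] = 100
print(a[0, 0])   # 100
```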

numpy.reshape() function [REF]

Function: numpy.reshape(a, newshape, order='C')
Description: Gives a new shape to an array without changing its data.

Returns: ndarray (reshaped_array). This will be a new view object if possible; otherwise, it will be a copy. (Note: there is no guarantee of the memory layout (C- or Fortran-contiguous) of the returned array.)

(See also: ndarray.reshape, the equivalent method.)

# Examples:
>>> a = np.array([[1,2,3], [4,5,6]])
>>> np.reshape(a, 6) # the default order is 'C': C-like index ordering;
array([1, 2, 3, 4, 5, 6])
>>> np.reshape(a, 6, order='F') # Fortran-like index ordering;
array([1, 4, 2, 5, 3, 6])
>>> np.reshape(a, (3,-1)) # The unspecified value is inferred to be 2;
array([[1, 2],
[3, 4],
[5, 6]])
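The "view if possible, otherwise a copy" rule can be observed with np.shares_memory. In the sketch below (variable names are mine), reshaping a C-contiguous array gives a view, while flattening its transpose forces a copy because the elements are no longer laid out with a regular stride.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))

view = a.reshape((3, 2))    # a is C-contiguous, so no copy is needed
copied = a.T.reshape(6)     # the transpose cannot be flattened without copying

is_view = np.shares_memory(view, a)
is_copy = not np.shares_memory(copied, a)
print(is_view, is_copy)
print(copied)               # [0 3 1 4 2 5]
```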

numpy.random.randn() function [REF]

Function: numpy.random.randn(d0, d1, …, dn)
Description: Return a sample (or samples) from the "standard normal" distribution.

Parameters: d0, d1, …, dn (int, optional), the dimensions of the returned array; they must be non-negative. If no argument is given, a single Python float is returned. (Note: the result has dtype float and is sampled from "a univariate normal (Gaussian) distribution of mean 0 and variance 1".)

Returns: ndarray or float (Z), a (d0, d1, …, dn)-shaped array of floating-point samples from the standard normal distribution, or a single such float if no parameters were supplied.

(Note: for random samples from N(mu, sigma^2), use: sigma * np.random.randn(...) + mu)

(Note: see also standard_normal, which is similar but takes a tuple as its argument.)

Examples:

>>> np.random.randn()
2.1923875335537315 # random;
# Two-by-four array of samples from N(3, 6.25);
>>> 2.5 * np.random.randn(2, 4) + 3
array([[-4.49401501, 4.00950034, -1.81814867, 7.29718677], #random
[ 0.39924804, 4.68456316, 4.99394529, 4.84057254]]) #random
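To convince myself that sigma * randn(...) + mu really gives N(mu, sigma^2), the sketch below draws many samples and looks at their mean and standard deviation. The seed, the sample count and the variable names are my own choices; the statistics will only be close to 3 and 2.5, not exact.

```python
import numpy as np

np.random.seed(0)  # fixed seed so that the run is repeatable

mu, sigma = 3.0, 2.5
samples = sigma * np.random.randn(100_000) + mu

sample_mean = samples.mean()
sample_std = samples.std()
print(samples.shape)
print(sample_mean, sample_std)   # close to mu and sigma

# standard_normal takes the shape as one tuple instead of separate arguments.
shape_a = np.random.randn(2, 4).shape
shape_b = np.random.standard_normal((2, 4)).shape
print(shape_a == shape_b)
```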

numpy.clip() function [REF]

Function: numpy.clip(a, a_min, a_max, out=None, **kwargs)
Description: Clip (limit) the values in an array. Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.

Returns: ndarray (clipped_array), an array with the elements of a, but where values < a_min are replaced with a_min, and those > a_max with a_max.
(Note: the out parameter is an optional ndarray in which the result is placed; "it may be the input array for in-place clipping".)

Examples:

>>> a = np.arange(10)
>>> np.clip(a, 1, 8)
array([1, 1, 2, 3, 4, 5, 6, 7, 8, 8])
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, 3, 6, out=a)
array([3, 3, 3, 3, 4, 5, 6, 6, 6, 6])
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, [3, 4, 1, 1, 1, 4, 4, 4, 4, 4], 8) # a_min is an array;
array([3, 4, 2, 3, 4, 5, 6, 7, 8, 8])
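Another way to read clip is as a maximum followed by a minimum. The sketch below (variable names are mine) checks that reading against np.clip and also shows a one-sided clip, where None is passed for the bound that should not apply.

```python
import numpy as np

a = np.arange(10)

clipped = np.clip(a, 1, 8)
by_hand = np.minimum(np.maximum(a, 1), 8)   # raise to a_min, then cap at a_max
same = np.array_equal(clipped, by_hand)
print(same)

# Passing None for a_min leaves the lower side unclipped.
upper_only = np.clip(a, None, 6)
print(upper_only)   # [0 1 2 3 4 5 6 6 6 6]
```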

TODO: Array creation routines

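Too tired to write this one up: [REF].

PEP and the Zen of Python

PEP is short for Python Enhancement Proposal. The background: Python's core developers discuss issues, proposals and plans about every aspect of Python on mailing lists; once such an issue, proposal or plan has been reviewed and approved by the core developers, it becomes a formal document (a PEP) that serves as a public record.

The official PEP address is [REF], which is also the address of PEP 0; that PEP is the index of all PEPs ("Index of Python Enhancement Proposals (PEPs)"). Among them, PEP 20 [REF] is the famous "Zen of Python". The enhancement proposals below, for example, all cover features that show up in the Python code we write (#learning-to-fish# and #knowing-why#):

PEP         Description
PEP 318     Decorators for Functions and Methods: decorators.
PEP 282     A Logging System: the logging standard library module.
PEP 3101    Advanced String Formatting: string formatting.
PEP 380     Syntax for Delegating to a Subgenerator: introduces the "yield from" syntax.
PEP 492     Coroutines with async and await syntax: introduces the async/await syntax.
PEP 3135    New Super: how super is used in Python 3.

Ref: [blog post 1], [blog post 2], and [10 must-reads].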

numpy.append() function [REF]

Function: numpy.append(arr, values, axis=None)
Description: Appends one array to another (append values to the end of an array).
Parameters:
arr (array_like): Values are appended to a copy of this array.
values (array_like): These values are appended to a copy of arr. It must be of the correct shape (the same shape as arr, excluding axis). If axis is not specified, values can be any shape and will be flattened before use.
axis (int, optional): The axis along which values are appended. If axis is not given, both arr and values are flattened before use.
Returns: append (ndarray): A copy of arr with values appended to axis. Note that append does not occur in-place: a new array is allocated and filled. If axis is None, out is a flattened array.
# Examples;
>>> np.append([1, 2, 3], [[4, 5, 6], [7, 8, 9]])
array([1, 2, 3, 4, 5, 6, 7, 8, 9]) # Axis is None, flattened;
# When axis is specified, values must have the correct shape;
>>> np.append([[1, 2, 3], [4, 5, 6]], [[7, 8, 9]], axis=0) # both arrays are 2-D and match on every axis except axis 0;
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> np.append([[1, 2, 3], [4, 5, 6]], [7, 8, 9], axis=0) # values is 1-D while arr is 2-D;
Traceback (most recent call last):
...
ValueError: arrays must have same number of dimensions
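The failing call can be repaired by giving the 1-D row a leading axis with None, which ties back to the slicing section. The sketch below (variable names are mine) does that and also confirms that append leaves the original array untouched.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([7, 8, 9])

# row has shape (3,); row[None, :] has shape (1, 3), which matches arr on axis 1.
stacked = np.append(arr, row[None, :], axis=0)
print(stacked.shape)   # (3, 3)

# append returns a new array; arr keeps its shape and does not share memory with it.
independent = not np.shares_memory(stacked, arr)
print(arr.shape, independent)

# With an axis given, append behaves like concatenate on the two arrays.
same = np.array_equal(stacked, np.concatenate((arr, row[None, :]), axis=0))
print(same)
```

Because every call allocates a new array, growing an array row by row in a loop copies the data each time; collecting rows in a list and converting once is usually the cheaper pattern.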

numpy.resize() function [REF]

Function: numpy.resize(a, new_shape)
Description: Return a new array with the specified shape. (The key point is how NumPy fills in and rearranges the elements of the array.)
Parameters:
a (array_like): Array to be resized.
new_shape (int or tuple of int): Shape of resized array.
Returns: reshaped_array (ndarray): The new array is formed from the data in the old array, repeated if necessary to fill out the required number of elements. The data are repeated in the order that they are stored in memory.
# Examples;
>>> a=np.array([[0,1],[2,3]])
>>> np.resize(a,(2,3))
array([[0, 1, 2], # Data repeated;
[3, 0, 1]])
>>> np.resize(a,(1,4))
array([[0, 1, 2, 3]])
>>> np.resize(a,(2,4)) # Data repeated;
array([[0, 1, 2, 3],
[0, 1, 2, 3]])
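The filling rule can be reproduced by hand: flatten the input, read elements cyclically, then reshape. The sketch below checks that reading against np.resize; it is my own reconstruction, not NumPy's actual implementation. It also shows the ndarray.resize method, which works in place and pads with zeros instead of repeating (refcheck=False is passed only so the call does not depend on how many references to b exist).

```python
import numpy as np

a = np.array([[0, 1], [2, 3]])
new_shape = (2, 4)
n = int(np.prod(new_shape))

# Flatten a, take n elements cyclically, then reshape to the target shape.
cycled = a.ravel()[np.arange(n) % a.size].reshape(new_shape)
same = np.array_equal(cycled, np.resize(a, new_shape))
print(same)

# The method variant modifies b in place and fills the extra slots with zeros.
b = np.array([[0, 1], [2, 3]])
b.resize((2, 3), refcheck=False)
print(b)
```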

numpy.trim_zeros() function [REF]

Function: numpy.trim_zeros(filt, trim='fb')
Description: Trim the leading and/or trailing zeros from a 1-D array or sequence.
Parameters:
filt (1-D array or sequence): Input array.
trim (str, optional): A string with 'f' representing trim from front and 'b' to trim from back. Default is 'fb', trim zeros from both front and back of the array.
Returns: trimmed (1-D array or sequence): The result of trimming the input. The input data type is preserved.
# Examples;
>>> a = np.array((0, 0, 0, 1, 2, 3, 0, 2, 1, 0))
>>> np.trim_zeros(a)
array([1, 2, 3, 0, 2, 1])
>>> np.trim_zeros(a, 'b')
array([0, 0, 0, 1, 2, 3, 0, 2, 1])
# The input data type is preserved, list/tuple in means list/tuple out;
>>> np.trim_zeros([0, 1, 2, 0])
[1, 2]
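The examples above cover the default and 'b'; the sketch below adds the 'f' case and rebuilds the default behaviour by hand from the positions of the non-zero entries, which also makes it clear that zeros in the middle are kept. The variable names are mine.

```python
import numpy as np

a = np.array([0, 0, 0, 1, 2, 3, 0, 2, 1, 0])

front_only = np.trim_zeros(a, 'f')   # drop leading zeros only
both = np.trim_zeros(a)              # default 'fb': drop leading and trailing zeros
print(front_only)   # [1 2 3 0 2 1 0]
print(both)         # [1 2 3 0 2 1]

# By hand: slice from the first non-zero position to the last one.
nonzero = np.flatnonzero(a)
by_hand = a[nonzero[0]:nonzero[-1] + 1]
same = np.array_equal(both, by_hand)
print(same)
```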