kwarray.util_numpy module
Numpy specific extensions
- kwarray.util_numpy.boolmask(indices, shape=None)
Constructs an array of booleans where an item is True if its position is in indices; otherwise it is False. This can be viewed as the inverse of numpy.where().
- Parameters:
indices (NDArray) – list of integer indices
shape (int | tuple) – shape of the returned mask (an int gives the length of a 1-D mask). If not specified, the minimal possible shape to incorporate all the indices is used. In general, it is best practice to always specify this argument.
- Returns:
mask - mask[idx] is True if idx in indices
- Return type:
NDArray[Any, Int]
Example
>>> indices = [0, 1, 4]
>>> mask = boolmask(indices, shape=6)
>>> assert np.all(mask == [True, True, False, False, True, False])
>>> mask = boolmask(indices)
>>> assert np.all(mask == [True, True, False, False, True])
Example
>>> import kwarray
>>> import ubelt as ub  # NOQA
>>> indices = np.array([(0, 0), (1, 1), (2, 1)])
>>> shape = (3, 3)
>>> mask = kwarray.boolmask(indices, shape)
>>> result = ub.urepr(mask, with_dtype=0)
>>> print(result)
np.array([[ True, False, False],
          [False,  True, False],
          [False,  True, False]])
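The "inverse of numpy.where()" relationship can be illustrated in plain NumPy. The following is a minimal sketch for the 1-D case only; `boolmask_sketch` is a hypothetical helper written for this illustration and is not kwarray's implementation.

```python
import numpy as np


def boolmask_sketch(indices, shape):
    """Hypothetical 1-D stand-in for boolmask (an assumption, not kwarray code)."""
    mask = np.zeros(shape, dtype=bool)  # start with every position False
    mask[np.asarray(indices)] = True    # flip the listed positions to True
    return mask


indices = np.array([0, 1, 4])
mask = boolmask_sketch(indices, shape=6)

# numpy.where goes the other way: from a mask back to the True positions
recovered = np.where(mask)[0]
```

Passing the shape explicitly, as recommended above, keeps trailing False entries that could not be inferred from the indices alone.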
- kwarray.util_numpy.iter_reduce_ufunc(ufunc, arrs, out=None, default=None)
Constant memory iteration and reduction.
Applies ufunc from left to right over the input arrays.
- Parameters:
ufunc (Callable) – called on each pair of consecutive ndarrays
arrs (Iterator[NDArray]) – iterator of ndarrays
default (object) – return value when iterator is empty
- Returns:
if len(arrs) == 0, returns default;
if len(arrs) == 1, returns arrs[0];
if len(arrs) >= 2, returns ufunc(…ufunc(ufunc(arrs[0], arrs[1]), arrs[2]),…arrs[n-1])
- Return type:
NDArray
Example
>>> arr_list = [
...     np.array([0, 1, 2, 3, 8, 9]),
...     np.array([4, 1, 2, 3, 4, 5]),
...     np.array([0, 5, 2, 3, 4, 5]),
...     np.array([1, 1, 6, 3, 4, 5]),
...     np.array([0, 1, 2, 7, 4, 5])
... ]
>>> memory = np.array([9, 9, 9, 9, 9, 9])
>>> gen_memory = memory.copy()
>>> def arr_gen(arr_list, gen_memory):
...     for arr in arr_list:
...         gen_memory[:] = arr
...         yield gen_memory
>>> print('memory = %r' % (memory,))
>>> print('gen_memory = %r' % (gen_memory,))
>>> ufunc = np.maximum
>>> res1 = iter_reduce_ufunc(ufunc, iter(arr_list), out=None)
>>> res2 = iter_reduce_ufunc(ufunc, iter(arr_list), out=memory)
>>> res3 = iter_reduce_ufunc(ufunc, arr_gen(arr_list, gen_memory), out=memory)
>>> print('res1 = %r' % (res1,))
>>> print('res2 = %r' % (res2,))
>>> print('res3 = %r' % (res3,))
>>> print('memory = %r' % (memory,))
>>> print('gen_memory = %r' % (gen_memory,))
>>> assert np.all(res1 == res2)
>>> assert np.all(res2 == res3)
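The left-to-right reduction described above can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not kwarray's code: `iter_reduce_sketch` is a hypothetical name, and unlike the documented behaviour for a single input it copies the first array rather than returning it directly.

```python
import numpy as np


def iter_reduce_sketch(ufunc, arrs, out=None, default=None):
    """Hypothetical left fold of a binary ufunc over an iterator of arrays."""
    arrs = iter(arrs)
    try:
        first = next(arrs)
    except StopIteration:
        return default              # empty iterator: fall back to the default
    if out is None:
        out = first.copy()          # assumption: copy so inputs stay untouched
    else:
        out[:] = first              # reuse the caller's buffer
    for arr in arrs:
        ufunc(out, arr, out=out)    # accumulate in place, one array at a time
    return out


arr_list = [
    np.array([0, 1, 2, 3, 8, 9]),
    np.array([4, 1, 2, 3, 4, 5]),
    np.array([0, 5, 2, 3, 4, 5]),
]
result = iter_reduce_sketch(np.maximum, iter(arr_list))
```

Only one accumulator array is alive at any time, which is why a generator that reuses a single buffer (like arr_gen in the example above) can be reduced without materialising every array.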
- kwarray.util_numpy.isect_flags(arr, other)
Check which items in an array intersect with another set of items
- Parameters:
arr (NDArray) – items to check
other (Iterable) – items to look for in arr
- Returns:
booleans corresponding to arr, indicating whether each item in arr is also contained in other.
- Return type:
NDArray
Example
>>> arr = np.array([
>>>     [1, 2, 3, 4],
>>>     [5, 6, 3, 4],
>>>     [1, 1, 3, 4],
>>> ])
>>> other = np.array([1, 4, 6])
>>> mask = isect_flags(arr, other)
>>> print(mask)
[[ True False False  True]
 [False  True False  True]
 [ True  True False  True]]
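For comparison, plain NumPy offers numpy.isin for elementwise membership tests. The snippet below reproduces the mask from the example above with it; whether isect_flags is implemented this way is not stated here, so treat this as a sketch of the concept only.

```python
import numpy as np

arr = np.array([
    [1, 2, 3, 4],
    [5, 6, 3, 4],
    [1, 1, 3, 4],
])
other = np.array([1, 4, 6])

# np.isin tests each element of arr for membership in other and
# returns a boolean array with the same shape as arr
mask = np.isin(arr, other)
```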
- kwarray.util_numpy.atleast_nd(arr, n, front=False)
View inputs as arrays with at least n dimensions.
- Parameters:
arr (ArrayLike) – An array-like object. Non-array inputs are converted to arrays. Arrays that already have n or more dimensions are preserved.
n (int) – number of dimensions to ensure
front (bool) – if True, new dimensions are added to the front of the array; otherwise they are added to the back. Defaults to False.
- Returns:
An array with a.ndim >= n. Copies are avoided where possible, and views with n or more dimensions are returned. For example, with n=3 and front=False, a 1-D array of shape (N,) becomes a view of shape (N, 1, 1), and a 2-D array of shape (M, N) becomes a view of shape (M, N, 1).
- Return type:
NDArray
See also
numpy.atleast_1d, numpy.atleast_2d, numpy.atleast_3d
Example
>>> n = 2
>>> arr = np.array([1, 1, 1])
>>> arr_ = atleast_nd(arr, n)
>>> import ubelt as ub  # NOQA
>>> result = ub.urepr(arr_.tolist(), nl=0)
>>> print(result)
[[1], [1], [1]]
Example
>>> n = 4
>>> arr1 = [1, 1, 1]
>>> arr2 = np.array(0)
>>> arr3 = np.array([[[[[1]]]]])
>>> arr1_ = atleast_nd(arr1, n)
>>> arr2_ = atleast_nd(arr2, n)
>>> arr3_ = atleast_nd(arr3, n)
>>> import ubelt as ub  # NOQA
>>> result1 = ub.urepr(arr1_.tolist(), nl=0)
>>> result2 = ub.urepr(arr2_.tolist(), nl=0)
>>> result3 = ub.urepr(arr3_.tolist(), nl=0)
>>> result = '\n'.join([result1, result2, result3])
>>> print(result)
[[[[1]]], [[[1]]], [[[1]]]]
[[[[0]]]]
[[[[[1]]]]]
Note
Extensive benchmarks are in kwarray/dev/bench_atleast_nd.py
These demonstrate that this function is statistically faster than the numpy variants, although the difference is small. On average this function takes 480ns versus numpy which takes 790ns.
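The padding behaviour shown in the examples can be sketched with a reshape that appends (or prepends) singleton axes. `atleast_nd_sketch` is a hypothetical helper for illustration; it is not the benchmarked kwarray implementation and makes no claim about its speed.

```python
import numpy as np


def atleast_nd_sketch(arr, n, front=False):
    """Hypothetical stand-in: pad the shape with 1s until ndim >= n."""
    arr = np.asanyarray(arr)        # non-array inputs are converted
    missing = n - arr.ndim
    if missing <= 0:
        return arr                  # already has n or more dimensions
    extra = (1,) * missing
    new_shape = extra + arr.shape if front else arr.shape + extra
    return arr.reshape(new_shape)   # a view whenever NumPy can avoid a copy


base = np.array([1, 1, 1])
back = atleast_nd_sketch(base, 2)                 # shape (3, 1)
front = atleast_nd_sketch(base, 2, front=True)    # shape (1, 3)
```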
- kwarray.util_numpy.argmaxima(arr, num, axis=None, ordered=True)
Returns the indices of the num largest values. This can be significantly faster than using argsort.
- Parameters:
arr (NDArray) – input array
num (int) – number of maximum indices to return
axis (int | None) – axis to find maxima over. If None this is equivalent to using arr.ravel().
ordered (bool) – if False, returns the maximum elements in an arbitrary order; otherwise they are in descending order. (Setting this to False is a bit faster.)
Todo
[ ] if num is None, return arg for all values equal to the maximum
- Returns:
NDArray
Example
>>> # Test cases with axis=None
>>> arr = (np.random.rand(100) * 100).astype(int)
>>> for num in range(0, len(arr) + 1):
>>>     idxs = argmaxima(arr, num)
>>>     idxs2 = argmaxima(arr, num, ordered=False)
>>>     assert np.all(arr[idxs] == np.array(sorted(arr)[::-1][:len(idxs)])), 'ordered=True must return in order'
>>>     assert sorted(idxs2) == sorted(idxs), 'ordered=False must return the right idxs, but in any order'
Example
>>> # Test cases with axis
>>> arr = (np.random.rand(3, 5, 7) * 100).astype(int)
>>> for axis in range(len(arr.shape)):
>>>     for num in range(0, len(arr) + 1):
>>>         idxs = argmaxima(arr, num, axis=axis)
>>>         idxs2 = argmaxima(arr, num, ordered=False, axis=axis)
>>>         assert idxs.shape[axis] == num
>>>         assert idxs2.shape[axis] == num
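One common way to pick the largest num entries without a full sort is numpy.argpartition followed by a sort of only the selected entries. The sketch below shows that idea for the flattened (axis=None) case. `top_k_indices` is a hypothetical helper, and it is an assumption that argmaxima follows a similar strategy.

```python
import numpy as np


def top_k_indices(arr, num, ordered=True):
    """Hypothetical flat top-k helper (an assumption, not kwarray code)."""
    flat = np.asarray(arr).ravel()
    if num <= 0:
        return np.empty(0, dtype=int)
    if num >= flat.size:
        idxs = np.arange(flat.size)
    else:
        kth = flat.size - num
        # after partitioning, positions kth onward hold the num largest values
        idxs = np.argpartition(flat, kth)[kth:]
    if ordered:
        # sort only the selected values, largest first
        idxs = idxs[np.argsort(flat[idxs])[::-1]]
    return idxs


arr = np.array([5, 1, 9, 3, 7])
top2 = top_k_indices(arr, 2)
top2_any_order = top_k_indices(arr, 2, ordered=False)
```

Skipping the final sort when ordered=False is what makes the unordered variant slightly cheaper.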
- kwarray.util_numpy.argminima(arr, num, axis=None, ordered=True)
Returns the indices of the num smallest values. This can be significantly faster than using argsort.
- Parameters:
arr (NDArray) – input array
num (int) – number of minimum indices to return
axis (int|None) – axis to find minima over. If None this is equivalent to using arr.ravel().
ordered (bool) – if False, returns the minimum elements in an arbitrary order, otherwise they are in ascending order. (Setting this to false is a bit faster).
Example
>>> arr = (np.random.rand(100) * 100).astype(int)
>>> for num in range(0, len(arr) + 1):
>>>     idxs = argminima(arr, num)
>>>     assert np.all(arr[idxs] == np.array(sorted(arr)[:len(idxs)])), 'ordered=True must return in order'
>>>     idxs2 = argminima(arr, num, ordered=False)
>>>     assert sorted(idxs2) == sorted(idxs), 'ordered=False must return the right idxs, but in any order'
Example
>>> # Test cases with axis
>>> from kwarray.util_numpy import *  # NOQA
>>> arr = (np.random.rand(3, 5, 7) * 100).astype(int)
>>> # make a unique array so we can check argmax consistency
>>> arr = np.arange(3 * 5 * 7)
>>> np.random.shuffle(arr)
>>> arr = arr.reshape(3, 5, 7)
>>> for axis in range(len(arr.shape)):
>>>     for num in range(0, len(arr) + 1):
>>>         idxs = argminima(arr, num, axis=axis)
>>>         idxs2 = argminima(arr, num, ordered=False, axis=axis)
>>>         print('idxs = {!r}'.format(idxs))
>>>         print('idxs2 = {!r}'.format(idxs2))
>>>         assert idxs.shape[axis] == num
>>>         assert idxs2.shape[axis] == num
>>>         # Check if argmin agrees with -argmax
>>>         idxs3 = argmaxima(-arr, num, axis=axis)
>>>         assert np.all(idxs3 == idxs)
Example
>>> arr = np.arange(20).reshape(4, 5) % 6
>>> argminima(arr, axis=1, num=2, ordered=False)
>>> argminima(arr, axis=1, num=2, ordered=True)
>>> argmaxima(-arr, axis=1, num=2, ordered=True)
>>> argmaxima(-arr, axis=1, num=2, ordered=False)
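To make the per-axis behaviour concrete, the following plain NumPy sketch finds the two smallest entries of each row for the array used in the last example. It relies on numpy.argpartition and numpy.take_along_axis and is only an illustration of the idea; it is an assumption that argminima works along similar lines.

```python
import numpy as np

arr = np.arange(20).reshape(4, 5) % 6
num = 2

# positions of the num smallest values in each row, in arbitrary order
part = np.argpartition(arr, num - 1, axis=1)[:, :num]

# order those positions by their values to get ascending minima per row
vals = np.take_along_axis(arr, part, axis=1)
order = np.argsort(vals, axis=1)
idxs = np.take_along_axis(part, order, axis=1)
```

Here every row has distinct values, so the ordered result is unambiguous; with ties, the relative order of equal entries may vary.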
- kwarray.util_numpy.unique_rows(arr, ordered=False, return_index=False)
Like unique, but works on rows
- Parameters:
arr (NDArray) – must be a contiguous C style array
ordered (bool) – if true, keeps relative ordering
References
https://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array
Example
>>> import kwarray
>>> from kwarray.util_numpy import *  # NOQA
>>> rng = kwarray.ensure_rng(0)
>>> arr = rng.randint(0, 2, size=(22, 3))
>>> arr_unique = unique_rows(arr)
>>> print('arr_unique = {!r}'.format(arr_unique))
>>> arr_unique, idxs = unique_rows(arr, return_index=True, ordered=True)
>>> assert np.all(arr[idxs] == arr_unique)
>>> print('arr_unique = {!r}'.format(arr_unique))
>>> print('idxs = {!r}'.format(idxs))
>>> arr_unique, idxs = unique_rows(arr, return_index=True, ordered=False)
>>> assert np.all(arr[idxs] == arr_unique)
>>> print('arr_unique = {!r}'.format(arr_unique))
>>> print('idxs = {!r}'.format(idxs))
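A comparable result is available in plain NumPy through numpy.unique with axis=0. The sketch below also shows one way to keep rows in order of first appearance, which is what the ordered flag is described as doing. This is an illustration, not kwarray's implementation.

```python
import numpy as np

arr = np.array([
    [1, 1],
    [0, 1],
    [1, 1],
    [1, 0],
    [0, 1],
])

# unique rows in lexicographic order, plus the index of each first occurrence
uniq, first_idxs = np.unique(arr, axis=0, return_index=True)

# sorting the first-occurrence indices restores the original relative order
ordered_idxs = np.sort(first_idxs)
ordered_uniq = arr[ordered_idxs]
```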
- kwarray.util_numpy.arglexmax(keys, multi=False)
Find the index of the maximum element in a sequence of keys.
- Parameters:
keys (tuple) – a k-tuple of k N-dimensional arrays. Like np.lexsort the last key in the sequence is used for the primary sort order, the second-to-last key for the secondary sort order, and so on.
multi (bool) – if True, returns all indices that share the max value
- Returns:
either the index or list of indices
- Return type:
int | NDArray[Any, Int]
Example
>>> k, N = 100, 100
>>> rng = np.random.RandomState(0)
>>> keys = [(rng.rand(N) * N).astype(int) for _ in range(k)]
>>> multi_idx = arglexmax(keys, multi=True)
>>> idxs = np.lexsort(keys)
>>> assert sorted(idxs[::-1][:len(multi_idx)]) == sorted(multi_idx)
- Benchmark:
>>> import ubelt as ub
>>> k, N = 100, 100
>>> rng = np.random
>>> keys = [(rng.rand(N) * N).astype(int) for _ in range(k)]
>>> for timer in ub.Timerit(100, bestof=10, label='arglexmax'):
>>>     with timer:
>>>         arglexmax(keys)
>>> for timer in ub.Timerit(100, bestof=10, label='lexsort'):
>>>     with timer:
>>>         np.lexsort(keys)[-1]
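The key ordering convention (last key is primary) can be seen on a tiny input. The sketch below narrows the candidates key by key, starting from the primary key, and compares the result with the last entry of numpy.lexsort. `arglexmax_sketch` is a hypothetical helper that always returns every tied index; it is not kwarray's implementation.

```python
import numpy as np


def arglexmax_sketch(keys):
    """Hypothetical helper: all indices attaining the lexicographic maximum."""
    keys = [np.asarray(k) for k in keys]
    cand = np.arange(len(keys[0]))
    for key in reversed(keys):           # the last key is the primary one
        vals = key[cand]
        cand = cand[vals == vals.max()]  # keep only candidates tied for the max
    return cand


secondary = np.array([9, 0, 5, 7])
primary = np.array([1, 3, 3, 2])
keys = (secondary, primary)

winners = arglexmax_sketch(keys)
via_lexsort = np.lexsort(keys)[-1]       # full sort, then take the last index
```

Index 0 has the largest secondary value but loses because its primary value is the smallest, which is the behaviour the parameter description above implies.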
- kwarray.util_numpy.generalized_logistic(x, floor=0, capacity=1, C=1, y_intercept=None, Q=None, growth=1, v=1)
A generalization of the logistic / sigmoid functions that allows for flexible specification of an S-shaped curve.
This is also known as a “Richards curve” [WikiRichardsCurve].
- Parameters:
x (NDArray) – input x coordinates
floor (float) – the lower (left) asymptote. (Also called A in some texts). Defaults to 0.
capacity (float) – the carrying capacity. When C=1, this is the upper (right) asymptote. (Also called K in some texts). Defaults to 1.
C (float) – Has influence on the upper asymptote. Defaults to 1. This is typically not modified.
y_intercept (float | None) – specify where the y intercept is at x=0. Mutually exclusive with Q.
Q (float | None) – related to the value of the function at x=0. Mutually exclusive with y_intercept. Defaults to 1.
growth (float) – the growth rate (also called B in some texts). Defaults to 1.
v (float) – Positive number that influences near which asymptote the growth occurs. Defaults to 1.
- Returns:
the values for each input
- Return type:
NDArray
References
Example
>>> from kwarray.util_numpy import *  # NOQA
>>> # xdoctest: +REQUIRES(module:pandas)
>>> import pandas as pd
>>> import ubelt as ub
>>> x = np.linspace(-3, 3, 30)
>>> basis = {
>>>     # 'y_intercept': [0.1, 0.5, 0.8, -1],
>>>     # 'y_intercept': [0.1, 0.5, 0.8],
>>>     'v': [0.5, 1.0, 2.0],
>>>     'growth': [-1, 0, 2],
>>> }
>>> grid = list(ub.named_product(basis))
>>> datas = []
>>> for params in grid:
>>>     y = generalized_logistic(x, **params)
>>>     data = pd.DataFrame({'x': x, 'y': y})
>>>     key = ub.urepr(params, compact=1)
>>>     data['key'] = key
>>>     for k, v in params.items():
>>>         data[k] = v
>>>     datas.append(data)
>>> all_data = pd.concat(datas).reset_index()
>>> # xdoctest: +REQUIRES(--show)
>>> # xdoctest: +REQUIRES(module:kwplot)
>>> import kwplot
>>> plt = kwplot.autoplt()
>>> sns = kwplot.autosns()
>>> plt.gca().cla()
>>> sns.lineplot(data=all_data, x='x', y='y', hue='growth', size='v')
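For orientation, the textbook form of the Richards curve is Y(x) = A + (K - A) / (C + Q * exp(-B * x)) ** (1 / v). The sketch below evaluates that formula directly, mapping floor to A, capacity to K and growth to B as in the parameter descriptions above. `richards_curve` is a hypothetical helper; it does not model the y_intercept option, and it is an assumption that generalized_logistic matches it in every detail.

```python
import numpy as np


def richards_curve(x, floor=0.0, capacity=1.0, C=1.0, Q=1.0, growth=1.0, v=1.0):
    """Textbook generalized logistic (hypothetical helper, not kwarray code)."""
    x = np.asarray(x, dtype=float)
    denom = (C + Q * np.exp(-growth * x)) ** (1.0 / v)
    return floor + (capacity - floor) / denom


x = np.linspace(-3, 3, 7)   # -3, -2, ..., 3
y = richards_curve(x)       # with the defaults this is the standard sigmoid
```

With all defaults the curve passes through 0.5 at x=0 and approaches floor on the left and capacity on the right, which matches the asymptote descriptions in the parameter list.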
- kwarray.util_numpy.equal_with_nan(a1, a2)
Numpy has array_equal with equal_nan=True, but this is elementwise.
- Parameters:
a1 (ArrayLike) – input array
a2 (ArrayLike) – input array
Example
>>> import kwarray
>>> a1 = np.array([
>>>     [np.nan, 0, np.nan],
>>>     [np.nan, 0, 0],
>>>     [np.nan, 1, 0],
>>>     [np.nan, 1, np.nan],
>>> ])
>>> a2 = np.array([np.nan, 0, np.nan])
>>> flags = kwarray.equal_with_nan(a1, a2)
>>> assert np.array_equal(flags, np.array([
>>>     [ True, False,  True],
>>>     [ True, False, False],
>>>     [ True,  True, False],
>>>     [ True,  True,  True]
>>> ]))