The call graph for NumPy scalar-array addition shows that get_ufunc_arguments accounts for almost 18% of the cumulative time.
[Figure: Call graph for x = np.asarray(1.0).]
Problem
Tracing the execution path under get_ufunc_arguments for

    x = numpy.asarray(1.0); x + x

shows the following chain of calls: PyArray_FromAny, PyArray_FromArray, PyArray_CanCastArrayTo, can_cast_scalar_to.
In PyArray_FromArray, I found that if the newtype argument is NULL, it takes the value of oldtype. So the function ends up checking whether the array can be cast to its own type, which is wasted work.

    oldtype = PyArray_DESCR(arr);
    if (newtype == NULL) {
        newtype = oldtype;
        Py_INCREF(oldtype);
    }
Improvement
It is better to bypass this check when newtype is NULL and flags is 0, as I did below:

    oldtype = PyArray_DESCR(arr);
    if (newtype == NULL) {
        /* Check if object is of array with Null newtype.
         * If so return it directly instead of checking for casting.
         */
        if (flags == 0) {
            Py_INCREF(arr);
            return (PyObject *)arr;
        }
        newtype = oldtype;
        Py_INCREF(oldtype);
    }
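
To make the effect of the early return easier to see in isolation, here is a small self-contained C sketch of the same control flow. It is only a toy model: Descr, Array, can_cast, from_array and the cast_checks counter are hypothetical stand-ins I made up for illustration, not NumPy's types or API. It shows that the fast path never reaches the cast check, and it says nothing about real timings.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not NumPy types. */
typedef struct { int type_num; } Descr;
typedef struct { int refcount; Descr *descr; } Array;

/* Counts how many times the (comparatively expensive) cast check runs. */
static int cast_checks = 0;

/* Stand-in for the casting check: here, only identical types can be cast. */
static int can_cast(const Descr *from, const Descr *to) {
    cast_checks++;
    return from->type_num == to->type_num;
}

/* Toy model of the patched logic: returns a new reference to arr, or NULL. */
static Array *from_array(Array *arr, Descr *newtype, int flags) {
    Descr *oldtype = arr->descr;
    if (newtype == NULL) {
        /* Fast path: no target type and no flags, so hand back arr as is. */
        if (flags == 0) {
            arr->refcount++;
            return arr;
        }
        newtype = oldtype;
    }
    /* Slow path: verify the cast even when the types are identical. */
    if (!can_cast(oldtype, newtype)) {
        return NULL;
    }
    arr->refcount++;
    return arr;
}
```

Calling from_array(&a, NULL, 0) leaves cast_checks at zero, while from_array(&a, NULL, 1) goes through can_cast once even though both types are the same.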
This brings the cumulative time contribution down significantly, from 17.9% to 3.1%.
[Figure: Call graph for x = np.asarray(1.0), after the improvement.]