The call graph for NumPy scalar array addition shows that get_ufunc_arguments accounts for almost 18% of the cumulative time.
[Figure: Callgraph for x = np.asarray(1.0).]
Problem
Tracing the execution path under get_ufunc_arguments for
x = numpy.asarray(1.0); x + x shows that the calls flow through PyArray_FromAny, PyArray_FromArray, PyArray_CanCastArrayTo, and can_cast_scalar_to.
In PyArray_FromArray, I found that if the argument newtype is NULL, it takes the value of oldtype. So the function ends up checking whether the array can be cast to its own type, which is redundant.
    oldtype = PyArray_DESCR(arr);
    if (newtype == NULL) {
        newtype = oldtype;
        Py_INCREF(oldtype);
    }
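To make the redundancy concrete, here is a minimal sketch in plain C. It is a toy model, not NumPy code: ToyDescr, ToyArray, toy_can_cast, and toy_from_array are hypothetical stand-ins, and the cast rule is deliberately simplified. It only mirrors the control flow above and counts how often the cast check compares a descriptor with itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyArray_Descr and PyArrayObject. */
typedef struct { int type_num; int refcnt; } ToyDescr;
typedef struct { ToyDescr *descr; int refcnt; } ToyArray;

static int cast_checks = 0;       /* total number of cast checks performed */
static int same_type_checks = 0;  /* checks that compared a type with itself */

/* Simplified cast rule (an assumption): equal or larger type numbers pass. */
static int toy_can_cast(const ToyArray *arr, const ToyDescr *to)
{
    cast_checks++;
    if (arr->descr == to) {
        same_type_checks++;
    }
    return arr->descr->type_num <= to->type_num;
}

/* Mirrors the original flow: a NULL newtype falls back to oldtype. */
static int toy_from_array(ToyArray *arr, ToyDescr *newtype)
{
    ToyDescr *oldtype;

    assert(arr != NULL);
    oldtype = arr->descr;
    if (newtype == NULL) {
        newtype = oldtype;
        oldtype->refcnt++;  /* plays the role of Py_INCREF(oldtype) */
    }
    /* With newtype == oldtype, this check can never fail in the toy model. */
    return toy_can_cast(arr, newtype);
}
```

Calling toy_from_array(&a, NULL) on any toy array runs one cast check, and that check is always a same-type comparison, which is the wasted work described above.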
Improvement
It is better to bypass this check when newtype is NULL and flags is 0, as I did below:
    oldtype = PyArray_DESCR(arr);
    if (newtype == NULL) {
        /*
         * If newtype is NULL and no flags are requested, return the array
         * directly instead of checking whether it can be cast.
         */
        if (flags == 0) {
            Py_INCREF(arr);
            return (PyObject *)arr;
        }
        newtype = oldtype;
        Py_INCREF(oldtype);
    }
This brings the cumulative time contribution of get_ufunc_arguments down significantly, from 17.9% to 3.1%.
[Figure: Callgraph for x = np.asarray(1.0), after the improvement.]
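The effect of the early return can also be sketched in a toy model. This is an illustration under stated assumptions, not NumPy's implementation: ToyDescr, ToyArray, toy_can_cast, and toy_from_array_fast are hypothetical names, the cast rule is simplified, and the model counts cast checks rather than measuring time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyArray_Descr and PyArrayObject. */
typedef struct { int type_num; int refcnt; } ToyDescr;
typedef struct { ToyDescr *descr; int refcnt; } ToyArray;

static int cast_checks = 0;  /* number of cast checks performed so far */

/* Simplified cast rule (an assumption): equal or larger type numbers pass. */
static int toy_can_cast(const ToyArray *arr, const ToyDescr *to)
{
    cast_checks++;
    return arr->descr->type_num <= to->type_num;
}

/*
 * Returns a new reference to arr, or NULL if the cast is not allowed.
 * The toy consumes one reference to newtype, either handed in by the
 * caller or taken on oldtype, and releases it before returning.
 */
static ToyArray *toy_from_array_fast(ToyArray *arr, ToyDescr *newtype, int flags)
{
    ToyDescr *oldtype;

    assert(arr != NULL);
    oldtype = arr->descr;
    if (newtype == NULL) {
        if (flags == 0) {
            /* Fast path: nothing to convert, so skip the cast check. */
            arr->refcnt++;
            return arr;
        }
        newtype = oldtype;
        oldtype->refcnt++;
    }
    if (!toy_can_cast(arr, newtype)) {
        newtype->refcnt--;
        return NULL;
    }
    newtype->refcnt--;
    arr->refcnt++;  /* the toy simply hands back the same array */
    return arr;
}
```

With newtype == NULL and flags == 0, the call returns the same array and cast_checks stays at 0. With a nonzero flags value, the cast check still runs once, so only the trivial case is short-circuited.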

