UFUNC_CHECK_STATUS is a single macro that both checks and clears the floating-point error flags, so the flags are cleared after every check. The clear operation is fairly expensive and takes a significant amount of time, so it should be avoided when it is not needed.
NumPy detects divide-by-zero, overflow, underflow, and similar errors by clearing the FP error flags before each ufunc loop and then, after the loop, checking whether any of them have been set. The flags are then cleared again. I have changed this so that the clear is skipped when it is not needed, which saves time.
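The clear, run, check, clear sequence described above can be illustrated with a small sketch. This is only an analogy, not NumPy's C code: it assumes that a `decimal.Context` from the Python standard library can stand in for the hardware FP status flags, since its flags are also sticky until cleared. The helper name `divide_all` is made up for the illustration.

```python
from decimal import Context, Decimal, DivisionByZero

def divide_all(ctx, values, divisor):
    """Hypothetical stand-in for one ufunc loop with status checking."""
    ctx.clear_flags()                                   # clear before the loop
    results = [ctx.divide(v, divisor) for v in values]  # the "inner loop"
    had_div_by_zero = bool(ctx.flags[DivisionByZero])   # check after the loop
    ctx.clear_flags()                                   # unconditional second clear
    return results, had_div_by_zero

# Assumption: traps are disabled so errors only set flags instead of raising.
ctx = Context(traps=[])
ok = divide_all(ctx, [Decimal(1), Decimal(2)], Decimal(2))
bad = divide_all(ctx, [Decimal(1)], Decimal(0))
```

Here `ok[1]` is false and `bad[1]` is true. Note that the second clear runs on every call, even when no flag was set, which is the cost the change aims to avoid.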
Improvement
Before each ufunc loop, where PyUFunc_clearfperr() is called, the error flags are now checked first and cleared only if necessary. The result of the check inside the macro is no longer ignored, as it was before. Previously, the time taken by PyUFunc_clearfperr() and PyUFunc_getfperr() combined was around 10%; it has now dropped to around 1% for operations that don't raise any error.
Figure: call graph comparing performance for x = np.asarray([1]); x+x;
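The check-before-clear idea from this section can be sketched in the same spirit. Again this is an analogy under stated assumptions rather than the actual patch: a `decimal.Context` plays the role of the FP status flags, and `check_and_clear` and `WATCHED` are hypothetical names chosen for the example.

```python
from decimal import (Context, Decimal, DivisionByZero,
                     InvalidOperation, Overflow, Underflow)

# Assumption: these four signals mirror the error classes a ufunc watches.
WATCHED = (DivisionByZero, Overflow, Underflow, InvalidOperation)

def check_and_clear(ctx):
    """Return the watched signals that are set; clear only if any is set."""
    raised = [sig for sig in WATCHED if ctx.flags[sig]]
    if raised:
        ctx.clear_flags()  # the costly step, skipped in the common case
    return raised

ctx = Context(traps=[])             # traps disabled: errors only set flags
ctx.add(Decimal(1), Decimal(1))     # an operation that raises no error
clean = check_and_clear(ctx)        # nothing set, so no clear happens
ctx.divide(Decimal(1), Decimal(0))  # sets the DivisionByZero flag
dirty = check_and_clear(ctx)        # flag seen, then cleared
after = check_and_clear(ctx)        # flags are clean again
```

In the common case of an operation with no error, only the cheap check runs, which matches the kind of saving reported above.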
More
The PR for this change is #3739.