In _extract_pyvals, the call

    *errobj = Py_BuildValue("NO", PyBytes_FromString(name), retval);

alone takes about 10% of the time of every operation. It is better to avoid packing the name into a new string object: instead, assign the name pointer to a struct and stash that struct in thread-local storage. Then no allocation is needed for the error object on every call, and that object almost always goes unused anyway.
Implementation
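The error object holds the name of the ufunc and the callback pointer.

    typedef struct {
        PyObject_HEAD
        char *name;
        PyObject *retval;
    } PyErrObject;

As a fast path for consecutive identical operations, it is better to cache this object than to create it anew each time:

    /* Look up the cached error object (PyDict_GetItem returns a borrowed reference). */
    errvalues = (PyErrObject *)PyDict_GetItem(thedict, PyUFunc_PYVALS_ERROR);
    if (errvalues == NULL) {
        /* First call: create the object and store it in the dict. */
        errvalues = PyObject_New(PyErrObject, &PyErrObject_Type);
        PyDict_SetItem(thedict, PyUFunc_PYVALS_ERROR, (PyObject *)errvalues);
        /* The dict now holds its own reference; drop ours so it is not leaked. */
        Py_DECREF(errvalues);
    }
    /* Refresh the cached fields for the current operation. */
    errvalues->name = name;
    errvalues->retval = retval;
    /* Hand a new reference to the caller. */
    *errobj = (PyObject *)errvalues;
    Py_INCREF(errvalues);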
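The caching idea above can be tried in isolation, without the Python headers. The following is only a minimal sketch under stated assumptions, not the code from the PR: `ErrInfo`, `err_info_get`, `err_info_alloc_count` and `err_info_release` are hypothetical names, a plain `malloc` stands in for `PyObject_New`, and a C11 `_Thread_local` pointer stands in for the per-thread dict.

```c
#include <stdlib.h>

/* Hypothetical stand-in for PyErrObject, without PyObject_HEAD. */
typedef struct {
    const char *name;   /* ufunc name: the pointer is borrowed, not copied */
    void *callback;     /* stands in for the retval callback pointer */
} ErrInfo;

/* One cached object per thread (assumes a C11 compiler). */
static _Thread_local ErrInfo *cached = NULL;
static _Thread_local unsigned long n_allocs = 0;

/* Return the per-thread error object, allocating it only on first use. */
ErrInfo *err_info_get(const char *name, void *callback)
{
    if (cached == NULL) {
        cached = malloc(sizeof *cached);
        if (cached == NULL) {
            return NULL;            /* out of memory */
        }
        n_allocs++;
    }
    /* Fast path: overwrite the fields, no allocation and no string copy. */
    cached->name = name;
    cached->callback = callback;
    return cached;
}

/* Number of allocations made by the calling thread so far. */
unsigned long err_info_alloc_count(void)
{
    return n_allocs;
}

/* Free the calling thread's cached object, e.g. before the thread exits. */
void err_info_release(void)
{
    free(cached);
    cached = NULL;
}
```

Calling `err_info_get` repeatedly from one thread returns the same pointer each time, so only the first call allocates. The trade-off in this sketch is that the name pointer must outlive its use, and a caller that keeps the result across a later call will see the fields overwritten.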
Improvement
Figure: callgraph with PyErrObject for x = np.asarray([5.0, 1.0]); x + x
The time spent in _extract_pyvals drops from 9.3% to 2%.
More
The PR for this enhancement is #3686.