Wednesday, September 11, 2013

Improvement in _extract_pyvals

Currently _extract_pyvals builds the error object with *errobj = Py_BuildValue("NO", PyBytes_FromString(name), retval); on every call. That line alone takes about 10% of the time of each operation. It is better to avoid packing the string: instead, assign the pointers to a struct and stash it in thread-local storage. No allocation is then needed for errobj on every call, which matters because the object almost always goes unused.

Implementation

The error object holds the name of the ufunc and the callback pointer.
typedef struct {
    PyObject_HEAD
    char *name;
    PyObject* retval;
} PyErrObject;

For consecutive operations of the same kind it is better to cache the object, which gives a fast path:
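/* Fast path: reuse the error object cached in the thread-local dict. */
errvalues = (PyErrObject *)PyDict_GetItem(thedict, PyUFunc_PYVALS_ERROR);
if (errvalues == NULL) {
    /* First call on this thread: allocate once and cache it. */
    errvalues = PyObject_New(PyErrObject, &PyErrObject_Type);
    if (errvalues == NULL) {
        return -1;
    }
    errvalues->name = name;
    errvalues->retval = retval;
    if (PyDict_SetItem(thedict, PyUFunc_PYVALS_ERROR, (PyObject *)errvalues) < 0) {
        Py_DECREF(errvalues);
        return -1;
    }
    /* The dict now owns a reference; drop ours and keep a borrowed pointer. */
    Py_DECREF(errvalues);
}

/* Refresh the cached object for the current ufunc call. */
errvalues->name = name;
errvalues->retval = retval;
Py_INCREF(errvalues);
*errobj = (PyObject *)errvalues;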
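
The caching idea can also be modelled without the CPython API, which makes the allocation counts easy to see. The following is only a sketch under stated assumptions: err_info, pack_every_call, reuse_cached and alloc_count are hypothetical names made up for illustration, not part of NumPy, and the thread-local slot stands in for the per-thread dict used above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the error object; not NumPy's actual type. */
typedef struct {
    const char *name;   /* ufunc name */
    void *retval;       /* error callback, stored as a plain pointer */
} err_info;

static int alloc_count = 0;   /* counts heap allocations, for illustration */

/* Old style: allocate a fresh object and copy the name on every call. */
static err_info *
pack_every_call(const char *name, void *retval)
{
    err_info *e = malloc(sizeof(*e));
    char *copy;

    if (e == NULL) {
        return NULL;
    }
    copy = malloc(strlen(name) + 1);
    if (copy == NULL) {
        free(e);
        return NULL;
    }
    strcpy(copy, name);
    alloc_count += 2;
    e->name = copy;
    e->retval = retval;
    return e;
}

/* Releases an object created by pack_every_call, including the name copy. */
static void
free_packed(err_info *e)
{
    free((char *)e->name);
    free(e);
}

/* New style: one slot per thread, allocated lazily and then reused.
 * The slot is never freed in this sketch; in the real code the
 * thread-local dict owns the object and releases it. */
static _Thread_local err_info *cached = NULL;

static err_info *
reuse_cached(const char *name, void *retval)
{
    assert(name != NULL);
    if (cached == NULL) {
        cached = malloc(sizeof(*cached));
        if (cached == NULL) {
            return NULL;
        }
        alloc_count += 1;
    }
    /* Only two pointer stores on the fast path; the name is not copied. */
    cached->name = name;
    cached->retval = retval;
    return cached;
}
```

Calling pack_every_call in a loop performs two allocations per call, while reuse_cached allocates once per thread and afterwards only overwrites two pointers. The trade-off is that the cached object stores a borrowed name pointer, so the name must outlive its use.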


Improvement

[Figure: callgraph with PyErrObject]
x = np.asarray([5.0,1.0]); x+x
The time spent in _extract_pyvals drops from 9.3% to 2%.

More

The PR for this enhancement is #3686.
