Monday, September 9, 2013

Scope for improvement in _extract_pyvals

In my last post, I mentioned how two functions, _extract_pyvals and PyUFunc_GetPyValues, together use >12% of the time. But the major culprit is *errobj = Py_BuildValue("NO", PyBytes_FromString(name), retval); in _extract_pyvals. It alone takes 10% of the time, on every operation.

Improvement

Caching Py_BuildValue

The first approach I took was to cache the result of Py_BuildValue in thread-local storage or a dict. With this, the time consumption of _extract_pyvals dropped from 12% to 4%.
errorobj caching with PyThreadState_GetDict
But TLS is a bit unreliable and risky, so @juliantaylor advised that it should be avoided. Even after many fixes, the commit for this didn't manage to pass all test cases.



Replacing Py_BuildValue with PyTuple_Pack

As a quick optimization, Py_BuildValue is replaced with PyTuple_Pack, which does not have to parse a format string. PyInt_AsLong is also replaced with the PyInt_AS_LONG macro, which does no error checking. This improves _extract_pyvals by 3%.
replaced Py_BuildValue by PyTuple_Pack

The optimization has been made in PR #3686.
