In my last post, I mentioned how two functions, `_extract_pyvals` and `PyUFunc_GetPyValues`, together use more than 12% of the time. But the major culprit is `*errobj = Py_BuildValue("NO", PyBytes_FromString(name), retval);` in `_extract_pyvals`. It alone takes 10% of the time, on every operation.
Improvement
Caching Py_BuildValue
The first approach I took was to cache the result of `Py_BuildValue` in thread storage or a dict. With this, the time consumption of `_extract_pyvals` dropped from 12% to 4%.
Figure: errorobj caching with PyThreadState_GetDict
But TLS is a bit unreliable and risky, and it was advised that it should be avoided. Even after many fixes, the commit for this didn't manage to pass all test cases.
Optimization has been made at pr #3686
Replacing Py_BuildValue with PyTuple_Pack
As a quick optimization, `Py_BuildValue` is replaced with `PyTuple_Pack`, and `PyInt_AsLong` is replaced with `PyInt_AS_LONG`, which does not do error checking. This improves `_extract_pyvals` by 3%.
Figure: replaced Py_BuildValue by PyTuple_Pack
Optimization has been made at pr #3686