Vectorised solutions

Why a data scientist should always try the vectorised solution

In data science we often deal with big data. Data manipulations can easily take up a lot of computing time and memory when they are not coded efficiently. To take full advantage of the available computing power, the state-of-the-art way to implement an algorithm is to vectorise its manipulations. A vectorised operation is applied to a whole array or column at once, so the looping happens in optimised compiled code rather than in the Python interpreter, and it can use the parallel (SIMD) instructions of the CPU and, for some operations, multiple CPU cores. The libraries pandas and numpy provide vectorised manipulations. Depending on the task, this can make computations dramatically faster, in some cases on the order of 2500 times. So stop looping over rows and columns. Use the pandas methods!
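
As a minimal sketch of the difference, the example below computes the same result twice: once by looping over the rows and once with a vectorised column expression. The data, the column names (`height_m`, `weight_kg`) and the function names are made up for illustration. Actual speed-ups depend on the data size and the operation.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "height_m": [1.60, 1.75, 1.82],
    "weight_kg": [55.0, 70.0, 90.0],
})


def bmi_loop(frame):
    """Row-by-row version: the loop runs in the Python interpreter."""
    result = []
    for _, row in frame.iterrows():
        result.append(row["weight_kg"] / row["height_m"] ** 2)
    return result


def bmi_vectorised(frame):
    """Vectorised version: one expression applied to whole columns."""
    return frame["weight_kg"] / frame["height_m"] ** 2


# Both approaches give the same values; only the way they are computed differs.
print(np.allclose(bmi_loop(df), bmi_vectorised(df)))
```

On a three-row frame the difference is not noticeable. To see it, time both functions (for example with `timeit`) on a frame with many rows.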

See also the study case https://fennaf.gitbook.io/bfvm22prog1/study-cases/why-we-love-numpy
