June 21, 2018

Numpy array append and performance issues

Issue

Assume you need to add 100 elements to a numpy array. The initial intuition is to loop over the elements and append each one to the array.

However, this would be a grave mistake in terms of the scalability of the code. What happens when the element count grows from 100 to 50000 - what would you expect? Let's find out with the example below.


Comparing numpy append vs list append 

Check out this example (link), which runs through adding elements in two ways:
  1. directly to the numpy array - using the append API provided by numpy
  2. indirectly - first collecting the elements in a Python list and then converting the list to an array using the numpy array API.
We see that option (2) is much more efficient than option (1), and the more elements we add to the numpy array, the bigger the advantage of option (2) becomes. A minimal sketch of the comparison is shown below.
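The original example is only linked above, so here is a minimal sketch of the same comparison. The function names and the timing harness are illustrative, not the original code:

import time
import numpy as np

def build_with_numpy_append(n):
    arr = np.array([])
    for i in range(n):
        arr = np.append(arr, i)   # allocates and copies a new array on every iteration
    return arr

def build_with_list(n):
    lst = []
    for i in range(n):
        lst.append(i)             # amortized constant-time append on a Python list
    return np.array(lst)          # single conversion at the end

for n in (10000, 50000):
    start = time.time()
    build_with_numpy_append(n)
    direct = time.time() - start

    start = time.time()
    build_with_list(n)
    via_list = time.time() - start

    print("n=%d  direct numpy append: %.4fs  list + convert: %.4fs" % (n, direct, via_list))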




Result Analysis

Let's look at the sample results obtained in the run


Num of elements | (A) direct numpy append (sec) | (B) list append + convert (sec) | A slower than B by
10000           | 0.112673                      | 0.001170                        | ~96x
20000           | 0.218867                      | 0.003901                        | ~56x
30000           | 0.721005                      | 0.003376                        | ~213x
40000           | 1.386396                      | 0.004897                        | ~283x
50000           | 2.467025                      | 0.005904                        | ~417x

So we see that as the number of elements to insert increases, the factor by which the direct numpy insert is slower grows drastically (as depicted in the graph below).



Reason

Now, let's look at the API details of numpy append. When we look into the documentation, we find a small note:

Note that append does not occur in-place: a new array is allocated and filled. If axis is None, out is a flattened array.
So, unlike a Python list (which over-allocates its internal buffer so that appends are amortized constant time), numpy append allocates a brand-new array with space for one extra element and copies all the existing contents into it on every call - a very costly operation when repeated in a loop.
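A quick illustration of this copy-on-append behaviour (variable names are just for illustration):

import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)   # returns a brand-new array; 'a' is left untouched
print(a)              # [1 2 3]
print(b)              # [1 2 3 4]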

So when you need to build a numpy array incrementally, always build it first as a Python list and then convert the list to a numpy array.


May 7, 2018

Use of __repr__ in python


Problem Statement


How many times have you been debugging, printing, or logging your Python objects during a critical issue analysis, only to find output that is very obscure? Quite often?


<__main__.Point instance at 0x7f23e62995f0>

Actual Reason

Consider a 'Point' class which takes two coordinates. When you print the object, the Python interpreter falls back to the default representation, which only shows the instance's class and its memory address.


class Point:
    def __init__(self, x_cord, y_cord):
        self.x = x_cord
        self.y = y_cord



if __name__ == "__main__":
    d = Point(3,4)
    print d

So, when you run the simple code provided above, you will see that the output is just the default, rather cryptic description of the class instance - which provides very little information.


<__main__.Point instance at 0x7f23e62995f0>

For some, this might be sufficient. However, many of us need to understand more about the object than just the class name and the instance address.

Workaround

Python provides a beautiful mechanism to work around this: the special method __repr__.

Now, consider the below modified version of the same class definition.


class Point:
    def __init__(self, x_cord, y_cord):
        self.x = x_cord
        self.y = y_cord

    def __repr__(self):
        return 'Point in coordinate form where x = %s, y = %s' % (self.x, self.y)



if __name__ == "__main__":
    d = Point(3,4)
    print d

When you run the above program, the output changes to the text description shown below.



lvtb9:~/workspace/blog_examples/repr_usage$ python class_repr_usage.py
Point in coordinate form where x = 3, y = 4

Wonderful!!!

Conclusion

We might think that these minor tweaks carry little value in day-to-day product development and maintenance.

However, think again - let's say you need to log object details when an exception is thrown in legacy code that is difficult to change. Just by extending '__repr__' in that legacy code, you can easily enrich the logging details and equip yourself for effective debugging. A minimal sketch of this idea follows.
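As a hedged sketch - assuming the Point class from the class_repr_usage.py file above, and a made-up failure purely for illustration - the '%r' format specifier pulls in whatever __repr__ returns:

import logging
from class_repr_usage import Point   # the example module above (assumed file name)

logging.basicConfig(level=logging.ERROR)

point = Point(3, 4)
try:
    raise ValueError("simulated failure")                     # hypothetical error for illustration
except ValueError:
    logging.exception("Failed while processing %r", point)   # %r logs repr(point)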



April 27, 2018

Python name guard and its importance

Why do we need "__main__" guard in python code?

We have all seen Python code protected with the "__main__" guard. Why do we need it? Take a look at the explanation below.

File - grepinfiles.py

import sys

def findpattern(ptrn,txtfl):
    with open(txtfl) as f:
        for line in f:
            if ptrn in line:
                yield line.rstrip('\n')

ptrn,txtfl = sys.argv[1],sys.argv[2]
for matchline in findpattern(ptrn,txtfl):
    print(matchline)


For a sample input file

>>cat /tmp/1.txt
This is a sample code for grep
we do not have any example for egrep


We get the below output

python grepinfiles.py egrep /tmp/1.txt
we do not have any example for egrep


Now, let's use this module as a package in another module.

File: finderror.py


import sys
from grepinfiles import findpattern

txtfl = sys.argv[1]
for line in findpattern('ERROR',txtfl):
    print(line)

When you run this script, we get the output below.


>> python finderror.py /tmp/1.txt
Traceback (most recent call last):
  File "finderror.py", line 2, in <module>
    from grepinfiles import findpattern
  File "/home/user/workspace/blog_examples/python_name_gaurd/grepinfiles.py", line 9, in <module>
    ptrn,txtfl = sys.argv[1],sys.argv[2]
IndexError: list index out of range

Why do we get this error?

Magic variable called "__name__"

Now, let's modify the code slightly and run the same commands.

File: grepinfiles.py


import sys

def findpattern(ptrn,txtfl):
    print("Inside the module",__name__)
    with open(txtfl) as f:
        for line in f:
            if ptrn in line:
                yield line.rstrip('\n')

if __name__ == "__main__":
    ptrn,txtfl = sys.argv[1],sys.argv[2]
    for matchline in findpattern(ptrn,txtfl):
        print(matchline)

and

File: finderror.py


import sys
from grepinfiles import findpattern


if __name__ == "__main__":
    txtfl = sys.argv[1]
    for line in findpattern('ERROR',txtfl):
        print(line)


Now, check the output for the modified source code


>> python grepinfiles.py egrep /tmp/1.txt
('Inside the module', '__main__')
we do not have any example for egrep
>> python finderror.py /tmp/1.txt
('Inside the module', 'grepinfiles')

Explanation

The module that is invoked directly by the Python interpreter has its __name__ variable set to '__main__'.
Any other module/package imported from the main module has __name__ set to the module's own name.

So, when finderror.py was invoked,
  • finderror.py module will have __name__ set to __main__
  • grepinfiles.py module will have __name__ set to 'grepinfiles'
However, when only grepinfiles.py was invoked,
  • grepinfiles.py module will have __name__ set to '__main__'

Conclusion

The name guard is a mechanism that lets a Python module/package run specific code only when it is invoked directly. It also safeguards code that should not be executed when the module is imported by other upstream modules/functions.


March 31, 2018

How to use linux perf tools and save dollars - Part 2

Flame Graph - Introduction

In our previous discussion (I strongly recommend going through the linked post), we saw how the Linux perf tool helps, in the real world, to write well-optimized code, and how loads of CPU cycles could be saved in the linked example.

Now we will see how to visualize the output, which helps us find the bottleneck much faster.

What is flame graph?

Flame graph is a tool that converts the binary data file produced by the perf tool into a "*.svg" file, a much more comfortable visual format. More generically, flame graph is a tool that represents profile data from different profilers - perf, DTrace, etc. - in a visual format.


Ok... Tell me how to use it

Step - 1

Install FlameGraph and generate the perf data dump. For this, we use the example from the previous discussion and profile it with perf. The GIF below shows the steps executed; a command sketch follows it.

Install Flame Graph and generate the perf dump using perf
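The GIF itself is not reproduced here; as a rough sketch (assuming the producer/consumer demo binary from the previous post is built as ./app - a placeholder name), the steps look like this:

# Clone Brendan Gregg's FlameGraph scripts (assumed clone location: ~/FlameGraph)
git clone https://github.com/brendangregg/FlameGraph ~/FlameGraph

# Record a CPU profile with call-graph (stack) sampling; this writes perf.data
perf record -g ./app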

Step - 2

Follow the steps shown in the GIF below to convert the dump into a "*.svg" file that can be opened in any browser and analyzed. A command sketch follows the GIF.

Convert perf data to flame graph
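Again as a rough sketch, assuming the FlameGraph clone above and a perf.data file in the current directory:

# Dump the recorded samples as text
perf script > out.perf

# Fold the stack traces into one line per unique stack
~/FlameGraph/stackcollapse-perf.pl out.perf > out.folded

# Render the interactive SVG; open it in any browser
~/FlameGraph/flamegraph.pl out.folded > flamegraph.svg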

Step - 3

Analysis of the graph

  • In the GIF, you can see that most of the stack in the generated graph (which represents CPU cycles) is related to the usleep call.
  • Also, usleep is triggered by the consumer callback function of the consumer thread.
  • Comparing the total area of the consumer and the producer, we can clearly see that the consumer uses far more CPU than the producer.
This immediately helps us conclude that the consumer is excessively slow, and that the slowness is due to the usleep call in the consumer callback function.

There you go... we get all the required answers!

Conclusion

  • Use the flame graph tool effectively (even in multi-threaded applications) to visualize your application's bottlenecks and fix them.
  • Both the long-term costs and the time spent debugging will be significantly reduced with this great tool.
Keep watching this space - you will get much more information on this tool.
I would be happy to hear from you and help out... please leave your opinion in the comment section.

And a closing thought...



How to find the linux kernel version in your distro

How do you find the linux kernel version you are running?

Run the below command to find out the Linux kernel version.

uname -r

On my distro, I get the response below.

~/ $ uname -r
3.13.0-142-generic
~/ $



Here is a short clip on how to run the command and get the details.
As an added bonus, I have included more details on other options for the same command; a few of them are sketched after the clip.

How to find linux kernel version
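For reference, a few other commonly used uname options (the exact output will differ on your machine):

uname -a   # print everything: kernel name, hostname, release, version, machine, ...
uname -s   # kernel name only (e.g. Linux)
uname -r   # kernel release (the option used above)
uname -m   # machine hardware architecture (e.g. x86_64)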