Why do we need the "__main__" guard in Python code?
We have all seen Python code protected by the "__main__" guard. Why do we need it? Take a look at the explanation below.
File: grepinfiles.py
    import sys

    def findpattern(ptrn,txtfl):
        with open(txtfl) as f:
            for line in f:
                if ptrn in line:
                    yield line.rstrip('\n')

    ptrn,txtfl = sys.argv[1],sys.argv[2]
    for matchline in findpattern(ptrn,txtfl):
        print(matchline)
For a sample input file:
    >> cat /tmp/1.txt
    This is a sample code for grep
    we do no have any example for egrep
We get the output below:
    >> python grepinfiles.py egrep /tmp/1.txt
    we do no have any example for egrep
Now, let's import this module from another module.
File: finderror.py
    import sys
    from grepinfiles import findpattern

    txtfl = sys.argv[1]
    for line in findpattern('ERROR',txtfl):
        print(line)
When we run this script, we get the output below.
    >> python finderror.py /tmp/1.txt
    Traceback (most recent call last):
      File "finderror.py", line 2, in <module>
        from grepinfiles import findpattern
      File "/home/user/workspace/blog_examples/python_name_gaurd/grepinfiles.py", line 9, in <module>
        ptrn,txtfl = sys.argv[1],sys.argv[2]
    IndexError: list index out of range
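Why does this error occur? Importing a module executes all of its top-level code, so the sys.argv lines in grepinfiles.py also ran when finderror.py imported it, and finderror.py was given only one command-line argument.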
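To see for ourselves that an import runs a module's top-level statements, here is a minimal sketch. The module name demo_toplevel, its contents, and the helper import_demo_module are made up for this illustration; they are not part of the grepinfiles example.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module source: two top-level statements and one function.
MODULE_SOURCE = '''
executed = []
executed.append("top-level code ran")

def helper():
    return "helper called"
'''

def import_demo_module():
    """Write the module into a temporary directory and import it."""
    tmpdir = tempfile.mkdtemp()
    with open(os.path.join(tmpdir, "demo_toplevel.py"), "w") as f:
        f.write(MODULE_SOURCE)
    sys.path.insert(0, tmpdir)
    importlib.invalidate_caches()
    try:
        return importlib.import_module("demo_toplevel")
    finally:
        sys.path.remove(tmpdir)

demo = import_demo_module()

# We never called anything in the module, yet its top-level code has run.
print(demo.executed)
```

The list is already filled in right after the import, even though helper was never called. The same thing happened to the sys.argv lines in grepinfiles.py.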
Magic variable called "__name__"
Now, let's modify the code slightly and run the same command.
File: grepinfiles.py
    import sys

    def findpattern(ptrn,txtfl):
        print("Inside the module",__name__)
        with open(txtfl) as f:
            for line in f:
                if ptrn in line:
                    yield line.rstrip('\n')

    # Runs only when this file is executed directly, not when it is imported.
    if __name__ == "__main__":
        ptrn,txtfl = sys.argv[1],sys.argv[2]
        for matchline in findpattern(ptrn,txtfl):
            print(matchline)
and
File: finderror.py
    import sys
    from grepinfiles import findpattern

    if __name__ == "__main__":
        txtfl = sys.argv[1]
        for line in findpattern('ERROR',txtfl):
            print(line)
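Now, check the output for the modified source code. The tuple-style lines come from Python 2, where print is a statement; Python 3 would print Inside the module __main__ instead.
    >> python grepinfiles.py egrep /tmp/1.txt
    ('Inside the module', '__main__')
    we do no have any example for egrep
    >> python finderror.py /tmp/1.txt
    ('Inside the module', 'grepinfiles')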
Explanation
The module that the Python interpreter runs directly (the entry point) has its __name__ variable set to '__main__'.
Any other module that is imported, by the main module or by another module, has __name__ set to its own module name.
So, when finderror.py was invoked,
- finderror.py module will have __name__ set to '__main__'
- grepinfiles.py module will have __name__ set to 'grepinfiles'
However, when only grepinfiles.py was invoked,
- grepinfiles.py module will have __name__ set to '__main__'
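The two cases listed above can be checked with a small experiment. This is only a sketch: name_demo is a made-up module, and runpy.run_path with run_name="__main__" from the standard library is used here as a stand-in for running python name_demo.py from the shell.

```python
import importlib
import os
import runpy
import sys
import tempfile

# Hypothetical one-line module that records the __name__ it was given.
MODULE_SOURCE = "seen_name = __name__\n"

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "name_demo.py")
with open(path, "w") as f:
    f.write(MODULE_SOURCE)

# Case 1: the module is imported by other code.
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
imported = importlib.import_module("name_demo")
sys.path.remove(tmpdir)

# Case 2: the same file is executed as the entry point.
# run_path returns the globals of the executed file as a dict.
script_globals = runpy.run_path(path, run_name="__main__")

print(imported.seen_name)           # name_demo
print(script_globals["seen_name"])  # __main__
```

The file is identical in both cases; only the way it was started changes the value of __name__.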
Conclusion
The name guard lets a Python module run certain code only when it is executed directly as a script. It also protects that code from running as a side effect when the module is imported by other modules.
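One common way to apply the guard, shown here as a sketch rather than the only correct layout, is to move the command-line handling into a function and call it under the guard. The main function, its argv parameter, the usage message and the exit status values are assumptions added for illustration; findpattern is the function from grepinfiles.py without the debug print. The sketch uses Python 3 syntax.

```python
import sys

def findpattern(ptrn, txtfl):
    """Yield each line of the file txtfl that contains ptrn."""
    with open(txtfl) as f:
        for line in f:
            if ptrn in line:
                yield line.rstrip('\n')

def main(argv):
    """Hypothetical entry point: argv is [pattern, filename]."""
    if len(argv) != 2:
        print("usage: grepinfiles.py PATTERN FILE", file=sys.stderr)
        return 2
    ptrn, txtfl = argv
    for matchline in findpattern(ptrn, txtfl):
        print(matchline)
    return 0

if __name__ == "__main__":
    # Only reached when the file is run directly, never on import.
    # A real script would usually pass this status on to sys.exit().
    status = main(sys.argv[1:])
```

With this layout, another module or a test can import the file and call findpattern or main(["egrep", "/tmp/1.txt"]) directly, without touching sys.argv.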