Python Tips and Tricks You Haven't Already Seen, Part 2

A few weeks ago I posted an article (here) about some not-so-commonly-known Python features, and quite a few people seemed to like it, so here comes another round of Python features that you hopefully haven't seen yet.

Naming a Slice Using `slice` Function

Using lots of hardcoded index values can quickly become a maintenance and readability mess. One option would be to use constants for all index values, but we can do better:


#              ID    First Name     Last Name
line_record = "2        John         Smith"

ID = slice(0, 8)
FIRST_NAME = slice(9, 21)
LAST_NAME = slice(22, 27)

name = f"{line_record[FIRST_NAME].strip()} {line_record[LAST_NAME].strip()}"
# name == "John Smith"

In this example we avoid mysterious indices by first naming them with the slice function and then using those names to slice out parts of the string. You can also get more information about a slice object through its attributes .start, .stop and .step.
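To make those attributes a bit more concrete, here is a small sketch. The record string is rebuilt with format specs only so that the column widths are explicit, and the variable names are mine, purely for illustration.

```python
# Rebuild a fixed-width record: 9 characters for the ID column,
# 13 for the first name, and the last name at the end.
record = f"{'2':<9}{'John':<13}{'Smith'}"

FIRST_NAME = slice(9, 21)

# A slice object simply stores the values it was created with.
print(FIRST_NAME.start, FIRST_NAME.stop, FIRST_NAME.step)
# 9 21 None

# indices() resolves the slice against a concrete sequence length.
print(FIRST_NAME.indices(len(record)))
# (9, 21, 1)

# The same named slice works on any sequence, not just strings.
first_name = record[FIRST_NAME].strip()
chars = list(record)[FIRST_NAME]
print(first_name, len(chars))
# John 12
```

Because a slice is an ordinary object, you can keep all of them in one place (a module or a dict) and change the record layout without touching the code that reads it.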

Prompting User for a Password at Runtime

Lots of command-line tools and scripts require a username and password to operate. So, if you happen to write such a program, you might find the getpass module useful:


import getpass

user = getpass.getuser()
password = getpass.getpass()
# Do Stuff...

This very simple module allows you to prompt the user for a password as well as get their username by extracting the current user's login name. Be aware, though, that not every system supports hiding passwords as they are typed. Python will try to warn you about that (with a GetPassWarning), so keep an eye on the warnings printed in the command line.
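If you would rather fail loudly than risk a password being echoed to the screen, one option is to turn that warning into an exception. This is only a sketch: the function name read_password_strict and its ask parameter are my own invention (ask exists so the function can be exercised without a real terminal), not part of getpass.

```python
import getpass
import warnings


def read_password_strict(prompt="Password: ", ask=getpass.getpass):
    """Return the typed password, or raise if input cannot be hidden.

    `ask` is a hypothetical injection point; it defaults to getpass.getpass.
    """
    with warnings.catch_warnings():
        # Escalate "cannot control echo" from a warning to an exception.
        warnings.simplefilter("error", getpass.GetPassWarning)
        return ask(prompt)


# Typical use in a script (interactive, so left commented out here):
# password = read_password_strict()
```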

Find Close Matches of a Word/String

Now for a slightly more obscure feature of the Python standard library. If you ever find yourself in a situation where you need to find words similar to some input string, using something like Levenshtein distance, then Python and difflib have your back.


import difflib
difflib.get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'], n=2)
# returns ['apple', 'ape']

difflib.get_close_matches finds the best "good enough" matches. Here, the first argument is matched against the list of candidates in the second one. We can also supply the optional argument n, which specifies the maximum number of matches to return (it defaults to 3). Another keyword argument, cutoff (a float between 0 and 1, defaults to 0.6), sets the minimum similarity score a candidate needs in order to be returned.
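To get a feel for cutoff, here is a quick sketch that reuses the words from above. The scores come from difflib.SequenceMatcher, which is what get_close_matches uses internally; the cutoff values 0.78 and 0.9 are arbitrary picks of mine to show the effect.

```python
import difflib

words = ['ape', 'apple', 'peach', 'puppy']

# Print the similarity score of each candidate against the input word.
for candidate in words:
    score = difflib.SequenceMatcher(None, candidate, 'appel').ratio()
    print(candidate, round(score, 2))

# A stricter cutoff keeps only the closest candidate.
strict = difflib.get_close_matches('appel', words, cutoff=0.78)
# strict == ['apple']

# If no candidate reaches the cutoff, the result is an empty list.
none_found = difflib.get_close_matches('appel', words, cutoff=0.9)
# none_found == []
```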

Working with IP Addresses

If you have to do some networking in Python, you might find the ipaddress module very useful. One use case would be generating a list of IP addresses from a CIDR (Classless Inter-Domain Routing) block:


import ipaddress
net = ipaddress.ip_network('74.125.227.0/29')  # Works for IPv6 too
# IPv4Network('74.125.227.0/29')

for addr in net:
    print(addr)

# 74.125.227.0
# 74.125.227.1
# 74.125.227.2
# 74.125.227.3
# ...

Another nice feature is checking whether an IP address belongs to a network:


ip = ipaddress.ip_address("74.125.227.3")

ip in net
# True

ip = ipaddress.ip_address("74.125.227.12")
ip in net
# False

There are plenty more interesting features that I will not go over, as you can find those here. Be aware, though, that there is only limited interoperability between the ipaddress module and other network-related modules. For example, you can't use instances of IPv4Network as address strings; they need to be converted using str first.
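As a small taste of those other features, and of the str conversion, here is a sketch using the same network as above. The variable names are just for illustration.

```python
import ipaddress

net = ipaddress.ip_network('74.125.227.0/29')

print(net.num_addresses)      # 8
print(net.netmask)            # 255.255.255.248
print(net.broadcast_address)  # 74.125.227.7

# hosts() skips the network and broadcast addresses.
usable = [str(host) for host in net.hosts()]
# usable[0] == '74.125.227.1', usable[-1] == '74.125.227.6'

# Convert explicitly before handing values to string-based APIs.
cidr = str(net)                      # '74.125.227.0/29'
first_host = str(next(net.hosts()))  # '74.125.227.1'
```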

Debugging Program Crashes in Shell

If you are one of those people who refuse to use an IDE and code in Vim or Emacs, then you have probably been in a situation where a debugger like the one in an IDE would be useful. And you know what? You have one: just run your program with python3.8 -i. The -i flag launches an interactive shell as soon as your program terminates, and from there you can explore all variables and call functions. Neat, but how about an actual debugger (pdb)? Let's use the following program (script.py):


def func():
    return 0 / 0

func()

And run the script with python3.8 -i script.py:


# Script crashes...
Traceback (most recent call last):
  File "script.py", line 4, in <module>
    func()
  File "script.py", line 2, in func
    return 0 / 0
ZeroDivisionError: division by zero
>>> import pdb
>>> pdb.pm()  # Post-mortem debugger
> script.py(2)func()
-> return 0 / 0
(Pdb)

We see where we crashed, now let's set a breakpoint:


def func():
    breakpoint()  # import pdb; pdb.set_trace()
    return 0 / 0

func()

Now run it again:


> script.py(3)func()
-> return 0 / 0
(Pdb)  # we start here
(Pdb) step
ZeroDivisionError: division by zero
> script.py(3)func()
-> return 0 / 0
(Pdb)

Most of the time print statements and tracebacks are enough for debugging, but sometimes you need to start poking around to get a sense of what's happening inside your program. In these cases you can set one or more breakpoints, and when you run the program, execution will stop on the line of the breakpoint so you can examine your program, e.g. list function arguments, evaluate expressions, list variables, or just step through as shown above. The pdb prompt accepts arbitrary Python code, so you can execute literally anything, but you will also need some of the debugger commands, which you can find here.
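The post-mortem debugger can also be started from code instead of from the interactive shell. Below is a minimal sketch of that idea: the helper run_with_postmortem and its debugger parameter are hypothetical names of mine (the parameter only exists so the helper can be tried without an interactive session), while pdb.post_mortem is the real standard library call.

```python
import pdb


def run_with_postmortem(func, debugger=pdb.post_mortem):
    """Call func(); on an exception, open the debugger at the crash site.

    `debugger` is a hypothetical hook that defaults to pdb.post_mortem.
    """
    try:
        return func()
    except Exception as exc:
        # Hand the traceback to the debugger, then let the error propagate.
        debugger(exc.__traceback__)
        raise


def func():
    return 0 / 0


# Interactive, so left commented out here:
# run_with_postmortem(func)
```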

Defining Multiple Constructors in a Class

One feature that is very common in programming languages, but not in Python, is function overloading. Even though you can't overload normal functions, you can still (kinda) overload constructors using class methods:


import datetime

class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

    @classmethod
    def today(cls):
        t = datetime.datetime.now()
        return cls(t.year, t.month, t.day)

d = Date.today()
print(f"{d.day}/{d.month}/{d.year}")
# 14/9/2019

You might be inclined to put all the logic of alternate constructors into __init__ and solve it using *args, **kwargs and a bunch of if statements instead of using class methods. That could work, but it can become hard to read and hard to maintain. I would therefore recommend putting very little logic into __init__ and performing all the operations in separate methods/constructors. This way you will get code that is clean and clear for both the maintainer and the user of the class.
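To show how this scales to more than one alternate constructor, here is a sketch that adds a second one. The from_iso_string method and the LoggedDate subclass are made up for this example; the point is that each constructor stays small and that cls makes them work for subclasses too.

```python
class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

    @classmethod
    def from_iso_string(cls, text):
        """Hypothetical extra constructor: parse a 'YYYY-MM-DD' string."""
        year, month, day = (int(part) for part in text.split("-"))
        return cls(year, month, day)


class LoggedDate(Date):
    """Subclass used only to show that cls follows the calling class."""


d = Date.from_iso_string("2019-09-14")
print(f"{d.day}/{d.month}/{d.year}")
# 14/9/2019

print(type(LoggedDate.from_iso_string("2019-09-14")).__name__)
# LoggedDate
```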

Caching Function Calls Using Decorator

Have you ever written a function that performs expensive I/O operations, or some fairly slow recursive function, that could benefit from caching (memoizing) its results? If you have, then there is an easy solution: lru_cache from functools:


from functools import lru_cache
import requests

@lru_cache(maxsize=32)
def get_with_cache(url):
    try:
        r = requests.get(url)
        return r.text
    except requests.RequestException:  # Connection error, timeout, invalid URL...
        return "Not Found"


for url in ["https://google.com/",
            "https://martinheinz.dev/",
            "https://reddit.com/",
            "https://google.com/",
            "https://dev.to/martinheinz",
            "https://google.com/"]:
    get_with_cache(url)

print(get_with_cache.cache_info())
# CacheInfo(hits=2, misses=4, maxsize=32, currsize=4)

In this example we are doing GET requests that are being cached (up to 32 cached results). You can also see that we can inspect the cache statistics of our function using the cache_info method. The decorator also provides a cache_clear method for invalidating cached results. I also want to point out that this should not be used with functions that have side effects or ones that create mutable objects with each call.
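If you want to try this without hitting the network, the recursive case works just as well. Here is a small sketch with a naive Fibonacci function (my own example, not part of the snippet above) that also shows the cache being invalidated:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache avoids recomputing subproblems."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))
# 832040

info = fib.cache_info()
print(info)
# CacheInfo(hits=28, misses=31, maxsize=None, currsize=31)

# Throw away everything cached so far and reset the statistics.
fib.cache_clear()
after_clear = fib.cache_info()
print(after_clear)
# CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
```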

Find the Most Frequently Occurring Items in an Iterable

Finding the most common items in a list is a pretty common task, which you could do using a for loop and a dictionary (map), but that would be a waste of time as there is the Counter class in the collections module:


from collections import Counter

cheese = ["gouda", "brie", "feta", "cream cheese", "feta", "cheddar",
          "parmesan", "parmesan", "cheddar", "mozzarella", "cheddar", "gouda",
          "parmesan", "camembert", "emmental", "camembert", "parmesan"]

cheese_count = Counter(cheese)
print(cheese_count.most_common(3))
# Prints: [('parmesan', 4), ('cheddar', 3), ('gouda', 2)]

Under the hood, Counter is just a dictionary (a dict subclass) that maps items to their number of occurrences, so you can use it as a normal dict:


print(cheese_count["mozzarella"])
# Prints: 1

cheese_count["mozzarella"] += 1

print(cheese_count["mozzarella"])
# Prints: 2

Besides that, you can also use the update(more_words) method to easily add more elements to the counter. Another cool feature of Counter is that you can use mathematical operations (addition and subtraction) to combine and subtract instances of Counter.
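Here is a short sketch of both update and the arithmetic operators. The stock and sold counters are made-up data, just to have something to add and subtract.

```python
from collections import Counter

stock = Counter(["feta", "feta", "brie"])

# update() adds counts from another iterable to the existing ones.
stock.update(["brie", "gouda"])
# stock: feta=2, brie=2, gouda=1

sold = Counter(feta=1, gouda=3)

# Addition sums the counts of matching keys.
combined = stock + sold
# combined: feta=3, brie=2, gouda=4

# Subtraction keeps only positive results, so gouda (1 - 3) is dropped.
remaining = stock - sold
# remaining: feta=1, brie=2
```

Note that subtraction silently drops zero and negative counts, which is usually what you want for "what is left" questions but is worth knowing about.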

Conclusion

I think that this time most of the tips I shared can be useful pretty much every day if you are working with Python, so I hope they will come in handy. Also, if you have any thoughts on these Python tips and tricks, or if you know of better ways of solving the above problems, then let me know! 🙂
