Python tips for the new Data Science student

As I began learning Python in my current data science boot camp, and the new data types and methods began to pile up, it seemed a good idea to create a cheat sheet for myself. It worked well for a week or two, and then grew to unwieldy proportions as I kept adding to it.

[Image: an Excel spreadsheet with lots of data]

At some point, I had to abandon it, even though it still has some good information. It just isn’t possible to keep track of every method on every data type for every library. It turns out that is what Google is for… and Stack Overflow. So here are the highlights and takeaways that I refer back to most often.

String and Variable Concatenation
Though I occasionally use this for coding challenges, I find it most useful for debugging, when I print each value as I loop through an iterable. These are the three ways I concatenate:
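As a minimal sketch, here are three common ways to combine strings and variables, assuming the three are the + operator, str.format(), and f-strings. The variable names and values are made up for illustration.

```python
# Illustrative values, not taken from the original post
name = "Ada"
count = 3

# 1. The + operator: non-string values must be converted with str() first
msg_plus = "Name: " + name + ", count: " + str(count)

# 2. str.format(): the {} placeholders are filled in order
msg_format = "Name: {}, count: {}".format(name, count)

# 3. f-strings (Python 3.6+): expressions go directly inside the braces
msg_fstring = f"Name: {name}, count: {count}"

print(msg_plus)
print(msg_format)
print(msg_fstring)
# Each line prints: Name: Ada, count: 3
```

For quick debugging inside a loop, the f-string version tends to be the least typing, since it needs no str() conversions.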

List manipulation
You can add to a set using set.add(), but .add() cannot be used with lists. Instead you must use .append() to add one item or .extend() to add multiple items, like this:
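A short sketch of the difference, using made-up variable names. The last part shows a common surprise: passing a list to append() adds it as one nested item rather than merging it.

```python
# Sets use .add()
unique_ids = {1, 2}
unique_ids.add(3)          # {1, 2, 3}

# Lists use .append() for a single item...
numbers = [1, 2]
numbers.append(3)          # [1, 2, 3]

# ...and .extend() for several items at once
numbers.extend([4, 5])     # [1, 2, 3, 4, 5]

# Appending a list adds it as ONE nested element
nested = [1, 2]
nested.append([3, 4])      # [1, 2, [3, 4]]

print(unique_ids)
print(numbers)
print(nested)
```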

List comprehensions
These never cease to amaze me. As a former COBOL programmer (what seems like a lifetime ago), my brain seems to understand multiple nested if statements, yet for some reason list comprehensions have not been very intuitive for me. Even so, their elegance and simplicity make them by far my favorite Python tool.

A basic list comprehension is formatted like this:

squares = [x**2 for x in range(10)]

Adding a conditional to the list comprehension:

even_numbers = [x for x in range(10) if x%2==0]

You can also chain conditions, which behaves as if they were joined with and:

even_numbers_under_5 = [x for x in range(10) if x%2==0 if x<5]

Then you can get into some really fancy looking ones. Try these out!
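A few fancier patterns are sketched here as a starting point. These are illustrative choices (an if/else expression, a nested loop, and a dictionary comprehension), and the variable names are made up.

```python
# if/else inside the expression: every item is kept, but transformed
labels = ["even" if x % 2 == 0 else "odd" for x in range(4)]
# ['even', 'odd', 'even', 'odd']

# Two for clauses: flatten a list of lists (outer loop comes first)
matrix = [[1, 2, 3], [4, 5, 6]]
flat = [value for row in matrix for value in row]
# [1, 2, 3, 4, 5, 6]

# The same idea works for dictionaries, using curly braces
square_lookup = {x: x**2 for x in range(4)}
# {0: 0, 1: 1, 2: 4, 3: 9}

print(labels)
print(flat)
print(square_lookup)
```

Note the position of the condition: a filtering if goes at the end of the comprehension, while an if/else that chooses a value goes at the front.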

Unpacking lists and dictionaries
Unpacking is something that I still haven’t had the occasion to try, but I have come across it a few times. The basic idea is that instead of specifying each item in a list, you can use the * in front of the list name to indicate all items in the list.

num_list = [1,2,3,4,5]
num_list2 = [6,7,8,9,10]
print(*num_list,*num_list2) # 1 2 3 4 5 6 7 8 9 10

All of the items in both lists are unpacked and passed to the print function as separate arguments. Dictionaries can be unpacked too: the ** operator passes each key-value pair to a function as a keyword argument, or copies the pairs into a new dictionary.
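Here is a minimal sketch of dictionary unpacking. The describe function and the dictionaries are hypothetical, made up only to show the two uses of **.

```python
# Hypothetical function, used only for illustration
def describe(name, role):
    return f"{name} is a {role}"

person = {"name": "Ada", "role": "student"}

# ** passes each key-value pair as a keyword argument,
# so this is the same as describe(name="Ada", role="student")
sentence = describe(**person)
print(sentence)  # Ada is a student

# ** also merges dictionaries; later values win on duplicate keys
defaults = {"color": "blue", "size": "M"}
overrides = {"size": "L"}
merged = {**defaults, **overrides}
print(merged)  # {'color': 'blue', 'size': 'L'}

# A single * on a dictionary unpacks only the keys
print(*person)  # name role
```

The dictionary keys must match the function's parameter names, otherwise Python raises a TypeError.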

Conclusion

Hopefully these tips will help if you, like me, are just getting started with Python.

Resources

Stack Overflow
Python documentation
Real Python
Towards Data Science

Data Science Student, Stay-at-Home Mom, Former Management Consultant