Intermediate Python

At the risk of sounding nerdy… Python has started getting fun. There are 4 chapters and 1 case study in this course. Chapter 1 is titled ‘Matplotlib’. From what I understand, matplotlib is a package you import that holds a lot of the visualizations you might need in your work. The instructor of this course called matplotlib “the mother of them all” when it comes to visualization packages, so I’d imagine most plotting work goes through it. Importing the package looks similar to importing numpy:

  • import matplotlib.pyplot as plt (plt seems to be the industry standard)

In this chapter, we only went over line charts, scatter plots and histograms. So, not too many new things, but it’s cool to see the visualizations. I have some lines of code below that go over how you call each of these:

  • import matplotlib.pyplot as plt

  • year = *insert list here*

  • pop = *insert list here*

  • plt.plot(year, pop) -> this draws a line chart

  • plt.scatter(year, pop) -> this draws a scatter plot

    • I’d imagine you wouldn’t call both in one project for the same data, so you’d only have one of the ‘plt.’ calls, but either one plots year against population (pop) in your chart
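To make the calls above concrete, here is a small sketch with made-up numbers. The year and pop values are placeholders I picked for illustration (roughly world population in billions), not the course's data. It draws the line chart and the scatter plot in two separate figures so you can compare them.

```python
import matplotlib.pyplot as plt

# Placeholder data, assumed for illustration only:
# years and approximate world population in billions
year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 7.0]

# Line chart: the points are connected in the order given
plt.figure()
lines = plt.plot(year, pop)

# Scatter plot of the same data: individual points, no connecting line
plt.figure()
points = plt.scatter(year, pop)

# Display the figures
plt.show()
```

The line chart suits data with a natural order like years, while the scatter plot only shows the individual observations.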

Histograms are a bit different from line charts and scatter plots. For histograms (image to the left), your main goal is to see the distribution of your data. The 2 main things you pass to plt.hist() are x (the list of values you want to build the histogram for) and bins. The bins argument tells matplotlib how many bins you want the data divided into. If you don’t specify bins, matplotlib defaults to 10 bins (that is the number of bins, not the width of each one). For example, say you want to split up ages across a population. You can build a histogram of how many people fall in each age bin (grouping of years), so the histogram can be split into 2-year increments, 5-year, 10-year, etc. The visual gives a good example of 5-year increments; keep in mind that ‘bins=5’ means five bins in total, so the width of each bin depends on the range of the data.
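Here is a rough sketch of the two ways to control the bins, using a made-up list of ages (the ages list is my own placeholder, not from the course). Passing a number gives that many equal-width bins, while passing a list of edges lets you fix the width yourself, for example 5-year groups.

```python
import matplotlib.pyplot as plt

# Made-up ages, for illustration only
ages = [1, 3, 4, 6, 7, 8, 11, 12, 14, 16, 17, 19, 21, 23, 24]

# bins=5 asks for five equal-width bins across the range of the data
plt.figure()
counts, edges, _ = plt.hist(ages, bins=5)

# A list of bin edges fixes the width instead (here, 5-year groups)
plt.figure()
counts_5yr, edges_5yr, _ = plt.hist(ages, bins=[0, 5, 10, 15, 20, 25])

# Display the figures
plt.show()
```

plt.hist() also hands back the counts and bin edges it computed, which is handy if you want to check the numbers behind the bars.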

After learning how to create the visualizations, we went over labeling the x and y axes and how to make your visualization make sense. Other things we touched on were ‘ticks’, which are the values you want the x and y axes to show. For example, the example given was working with world population in billions. Instead of having the y-axis show 0, 2, 4, 6, 8, 10, the axis was labeled ‘0, 2B, 4B, 6B…’ to show billions. I have a few examples of labeling and ticks below:

The ‘#’ line items are short descriptions of what the code does. The examples assume matplotlib.pyplot has been imported as plt and that gdp_cap and life_exp are lists of the same length.

  • # Basic scatter plot, log scale

    • plt.scatter(gdp_cap, life_exp)

    • plt.xscale('log')

  • # Strings

    • xlab = 'GDP per Capita [in USD]'

    • ylab = 'Life Expectancy [in years]'

    • title = 'World Development in 2007'

  • # Add axis labels

    • plt.xlabel(xlab)

    • plt.ylabel(ylab)

  • # Add title

    • plt.title(title)

  • # After customizing, display the plot

    • plt.show()

The lines of code above also show that you can store the strings in variables (xlab, ylab, title under # Strings), so in your calls to label the axes or title you can pass the variable instead of typing the string out. I believe that just makes everything cleaner, unsure though.

  • # Scatter plot

    • plt.scatter(gdp_cap, life_exp)

  • # Previous customizations

    • plt.xscale('log')

    • plt.xlabel('GDP per Capita [in USD]')

    • plt.ylabel('Life Expectancy [in years]')

    • plt.title('World Development in 2007')

  • # Definition of tick_val and tick_lab

    • tick_val = [1000, 10000, 100000]

    • tick_lab = ['1k', '10k', '100k']

  • # Adapt the ticks on the x-axis

    • plt.xticks(tick_val, tick_lab)

  • # After customizing, display the plot

    • plt.show()

The code above builds a scatter plot and labels the x-axis ticks at 1,000, 10,000 and 100,000 as ‘1k’, ‘10k’ and ‘100k’.
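The same idea works on the y-axis with plt.yticks(). Below is a minimal sketch of the billions example mentioned earlier, with placeholder population numbers and label text that I made up for illustration.

```python
import matplotlib.pyplot as plt

# Placeholder data, assumed for illustration only:
# years and approximate world population in billions
year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 7.0]

# Line chart with axis labels and a title (label text is my own choice)
plt.plot(year, pop)
plt.xlabel('Year')
plt.ylabel('Population')
plt.title('World Population')

# Tick positions on the y-axis and the text to show at each position
tick_val = [0, 2, 4, 6, 8, 10]
tick_lab = ['0', '2B', '4B', '6B', '8B', '10B']
locs, labels = plt.yticks(tick_val, tick_lab)

# After customizing, display the plot
plt.show()
```

The two lists need to be the same length, since each label is matched to the tick value at the same position.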

That pretty much wraps it up for chapter 1 of Intermediate Python. To be continued…
