Python Lists

Let’s now dive deeper into Python lists. We saw a list in this post; it is basically a way to store data, which makes it one of Python’s data structures (there are more).

Go ahead and create a new project called pyLists. Check the previous post if you are unsure how to create a new project.

Now create the following list, add a print statement, and run it.


a=[1,2,4,r,w,s,t]

print(a)

We get this message!

Oh damn, Visual Studio didn’t like that. It is telling us that ‘r’ is not defined. If we look at our list, it is a combination of numbers and letters. The numbers didn’t cause any errors, it was the letters where it crashed. That’s because when the program runs, it is assuming that ‘r’ is a variable and it cannot find it anywhere (not defined).

To fix this we need to put all the letters in quotation marks (single or double works):


a=[1,2,4,'r','w','s','t']

print(a)

This runs our code without errors.
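If you want to see the difference between a name and a string for yourself, here is a small sketch. The variable names broken, fixed and message are only for illustration; the try/except catches the error so the script keeps running.

```python
# Without quotes, Python treats r as a variable name and tries to look it up.
try:
    broken = [1, 2, 4, r]  # r was never defined, so this raises NameError
except NameError as error:
    message = str(error)
    print("Python complained:", message)

# With quotes, 'r' is a string value, so there is nothing to look up.
fixed = [1, 2, 4, 'r']
print(fixed)
```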

What to do with lists?

Lists have a lot of interesting properties, let’s explore a few of them and by the end of this post we will see how to find an element in a list.

Our current list “a” is very small, so let’s expand it. We could do that by rewriting it and adding more elements, but let’s do it in a different way. Create another list called “b”.


b=['the','badgers','are','in','here']

Then we can simply join the two lists by adding them together. This is called concatenation, and it creates a new list without changing “a” or “b”:


c=a+b

print(c)
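A quick way to convince yourself of what the + did is to check the lengths. This is just a sketch using the same two lists from above:

```python
a = [1, 2, 4, 'r', 'w', 's', 't']
b = ['the', 'badgers', 'are', 'in', 'here']

c = a + b  # builds a new list: the elements of a first, then the elements of b

print(len(a), len(b), len(c))  # 7 5 12
print(a)                       # a itself is unchanged
```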

Now we have a long list but let’s make it even longer by doing this:


a=['r','w','s','t']
b=['the','badgers','are','in','here']
c=a+b
c=10*c
print(c)

 

We have now multiplied our list by 10! The c=10*c statement means the following:

  • It is evaluated from right to left: the “c” on the right side is the current value, which in this case is just our original list a+b.
  • We take this value and repeat it 10 times (for a list, * means repetition, not arithmetic).
  • The resulting 10*c list is then assigned to “c”. This effectively replaces the old “c” with the new, longer list.
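The steps above can be checked with a few prints. This is a small sketch with the same lists; the only thing added is looking at the length before and after:

```python
a = ['r', 'w', 's', 't']
b = ['the', 'badgers', 'are', 'in', 'here']
c = a + b
print(len(c))      # 9

c = 10 * c         # the right side is evaluated first, then the name c is reassigned
print(len(c))      # 90
print(c[0], c[9])  # r r  (the pattern starts over after every 9 elements)
```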

Now that we have our list, let’s add one more element, which we will also use to search for. To add an element to the list we write (put this before the print):


c.append('searchMe')

Python appends this element to the very end of the list, which makes it element number 91. This raises the question: how many elements are in a list? We can find that with the following:


print(len(c))

This built-in Python function gives us the number of elements in a container; in our case we get 91.
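One thing worth knowing here: when we count elements we start at 1, but Python numbers the positions (indexes) of a list starting at 0. A short sketch, building the same 91-element list in one line:

```python
c = 10 * (['r', 'w', 's', 't'] + ['the', 'badgers', 'are', 'in', 'here'])
c.append('searchMe')

print(len(c))          # 91
print(c[-1])           # searchMe  (-1 means "the last element")
print(c[len(c) - 1])   # searchMe  (the last index is 90, not 91)
```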

Let’s assume now that we don’t know if an element is in a list, how do we find out?

For and If Statements

We will use a combination of a For loop and an If statement. Together they allow us to loop over the elements of the list and find the word we are looking for.

Write this in your Visual Studio project:


a=['r','w','s','t']
b=['the','badgers','are','in','here']
c=a+b
c=10*c
c.append('searchMe')
searchWord='searchMe'
counter=1

for element in c:
    if element==searchWord:
        print('{} found at position {}'.format(searchWord,counter))
    counter=counter+1

Line 6

We create a variable called searchWord where we can store the word we are looking for.

Line 7

This is a counter variable. We will use it to keep track of how many times we have looped over the list (to find where in the list our searchWord is).

Lines 9 to 12

We are looping over the c list and we are inspecting each element of it. This in a For loop takes the form of:


FOR Element IN Object:
    doSomething

Line 10 IF statement

The For loop will start with the first element of our list, if this element is equal to our searchWord then the code below the If statement (line 11) will execute, if not, then the For will move to the next on the list.

Line 11

So, if in whatever counter (iteration) of our for loop we see that the current element is equal to our searchWord, this line will be executed:

print('{} found at position {}'.format(searchWord,counter))

You can think of this line as two different statements, both inside the print parenthesis:

  1. ‘{} found at position {}’: This part of the statement tells Python to write a variable, then the text “found at position”, and then another variable. The placeholders for the variables are marked by the curly braces {}.
  2. Then we add the .format method (don’t forget the “.”) and inside its parentheses we write the variables, in the same order we want them to appear in the first part.

Notice that this line belongs to the If statement; we can tell by the indentation.
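To play with the formatting on its own, outside the loop, here is a small sketch. The names first, second and third are made up for this example, and the last variant (an f-string) needs Python 3.6 or newer:

```python
searchWord = 'searchMe'
counter = 91

# Empty placeholders are filled left to right with the arguments of .format()
first = '{} found at position {}'.format(searchWord, counter)

# Numbered placeholders pick an argument by its index, so the order can change
second = 'Position {1} holds {0}'.format(searchWord, counter)

# An f-string puts the variable names directly inside the braces
third = f'{searchWord} found at position {counter}'

print(first)   # searchMe found at position 91
print(second)  # Position 91 holds searchMe
print(third)   # searchMe found at position 91
```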

Line 12

This line does NOT belong to the If statement; it belongs to the For because of the indentation. Every time the For loop runs, its last operation increases our counter by 1 to keep track of our position in the list.

You should see the following when you run your code:

searchMe found at position 91
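As a side note, Python has a few built-in shortcuts for this kind of search. The sketch below is not a replacement for the loop we just wrote (writing it by hand is the point of the exercise), just a look at what else is available: enumerate keeps the counter for us, the in operator answers "is it there?", and the index method returns the position counted from 0.

```python
a = ['r', 'w', 's', 't']
b = ['the', 'badgers', 'are', 'in', 'here']
c = 10 * (a + b)
c.append('searchMe')
searchWord = 'searchMe'

# enumerate hands us the position together with the element;
# start=1 makes it count from 1, like our manual counter did
for position, element in enumerate(c, start=1):
    if element == searchWord:
        print('{} found at position {}'.format(searchWord, position))

print(searchWord in c)      # True
print(c.index(searchWord))  # 90 (index of the first match, counted from 0)
```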

This concludes our lists exploration, eventually we will use lists to store data obtained from a database and we will perform ETL (Extract Transform Load) with them.

Python Charts with Bokeh

So, today we will create a simple chart with a library called Bokeh. It is a very easy to use library.

First create a new project and point the environment to our honeyBadger env (check this post to see how to do this).

Call the project: myFirstBokeh.

Your solution explorer should look like this after changing the environment:

So, how do we create a chart? Well, let’s look at the Bokeh library documentation; they have some neat examples in there:

https://bokeh.pydata.org/en/latest/docs/user_guide/quickstart.html#userguide-quickstart

They have this code in there

from bokeh.plotting import figure, output_file, show

# prepare some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# output to static HTML file
output_file("lines.html")

# create a new plot with a title and axis labels
p = figure(title="simple line example", x_axis_label='x', y_axis_label='y')

# add a line renderer with legend and line thickness
p.line(x, y, legend="Temp.", line_width=2)

# show the results
show(p)

If we run this however, it won’t work! Also notice how it shows up in our Visual Studio:

This is Visual Studio saying “I have no idea what this is that you are asking me to import”.

Let’s analyze this for a second. We are asking Visual Studio to import:

  • figure
  • output_file
  • show

From something called bokeh.plotting

This means we are trying to import into our Python project those 3 functions of the bokeh.plotting module. These 3 do stuff that we need to produce a chart and the good thing is we only have to import them into our project and call them.
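To get a feel for how this kind of import behaves, here is a small sketch that does not need Bokeh at all. It imports two names from math (which always comes with Python) and then uses importlib.util.find_spec to ask whether a package called bokeh can be found in the environment that is running the script. The variable name bokeh_spec is just for illustration.

```python
import importlib.util

# "from module import name" brings specific names into our script.
# math ships with Python, so this import always works.
from math import sqrt, pi

print(sqrt(16))  # 4.0

# find_spec returns None when the package is not installed in the
# environment that is running this script.
bokeh_spec = importlib.util.find_spec("bokeh")
if bokeh_spec is None:
    print("bokeh is not installed in this environment yet")
else:
    print("bokeh is installed, the import should work")
```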

Anaconda Navigator

OK, but how do we make Visual Studio get that library? That’s where our Anaconda Navigator comes in: open it and click on our honeyBadger environment.

 

Once in here click on the dropdown (3) and change it to Not Installed and in the Search Packages box type Bokeh. You should see it coming up like this:

Now click on the checkbox (or right click on the name, then Mark for Installation), click Apply on the bottom right, and then Apply again on the popup.

After it finishes, you can click on the dropdown again to see that it is there, also you can click on the “x” to remove the search and you will see all the modules installed in our honeyBadger environment.

All right, after we install Bokeh using Anaconda we can go back to Visual Studio. If we open the Python Environments window (Tools -> Python -> Python Environments) we will see that Visual Studio is working on integrating Bokeh into its IntelliSense. This basically means that when we type functions from that library, Visual Studio will open a little window (as we type) listing the parts that belong to the libraries we are using (you will see this soon).

You will also notice that in our Solution Explorer if we expand our honeyBadger environment we will have Bokeh in there.

All right, so now hit run and a web browser should open with the chart you programmed.

That was easy, wasn’t it? You can even interact with the chart using the tools on the top right. But let’s analyze the code now, after the import statements.

Data to Plot

# prepare some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

A chart needs data to plot (y) and another set of data for the horizontal axis (x)
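The two lists are matched up by position: the first x goes with the first y, and so on, so they need to have the same length. A small sketch to see the pairs (the name points is made up for this example):

```python
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# zip pairs the elements that sit at the same position in both lists
points = list(zip(x, y))

print(len(x) == len(y))  # True
print(points)            # [(1, 6), (2, 7), (3, 2), (4, 4), (5, 5)]
```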

Plot to HTML File


# output to static HTML file
output_file("lines.html")

Bokeh offers this amazing capability to write the chart to an HTML file, which you could then put on a web server.

A Figure


# create a new plot with a title and axis labels
p = figure(title="simple line example", x_axis_label='x', y_axis_label='y')

A figure is the area where the plot (the lines) will be drawn, think of it as the canvas of a picture. This Canvas has certain elements like the Title and the labels.

Plots Within the Figure


# add a line renderer with legend and line thickness
p.line(x, y, legend="Temp.", line_width=2)

Here we add a line plot to our figure p. We also pass the data we want to plot (x, y) and define the legend text and the line width.
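One caveat, as far as I know: newer Bokeh releases renamed the legend argument of plot methods such as line to legend_label, and the old spelling may be rejected. The sketch below assumes a recent Bokeh version. It is guarded so that it also runs (and simply skips the plot) where Bokeh is not installed. The title and the second data series y2 are made up for illustration.

```python
try:
    from bokeh.plotting import figure
except ImportError:
    figure = None  # Bokeh is not installed in this environment

x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]
y2 = [3, 3, 5, 6, 1]  # made-up second series, for illustration only

if figure is not None:
    p = figure(title="two lines example", x_axis_label='x', y_axis_label='y')
    # legend_label is the newer name for the legend text (assumes a recent Bokeh)
    p.line(x, y, legend_label="Temp.", line_width=2)
    # a figure can hold more than one plot; each call adds another line
    p.line(x, y2, legend_label="Other", line_width=2, line_color="red")
    print("figure built with two line plots")
else:
    print("Bokeh not available, skipping the plot")
```

To actually see the chart you would still call output_file and show(p) as in the original example.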

Render the Figure


# show the results
show(p)

Finally this line creates the chart and opens the HTML file.

Now that you know how to get Bokeh running, try modifying the charts and also run some of the examples on the Bokeh website. Have fun Bokehing!

Simple Python Script with a Function

In the previous post, I detailed how to install everything needed to run Python. Now we will create a simple Python script to have fun with what we created.

Create New Project

Go ahead and open Visual Studio and create a new project.

 

Now let’s use the honeyBadger environment that we created in the previous post

 

Python Script Structure

Let’s talk about the structure of a Python project, basically we can define 3 sections:

#Section 1
import antigravity
#Section 2
def someFunction():
     return stuff
#Section 3
codeThatRunsAndFlies

Section 1: Import Libraries

The first section (import antigravity) tells Python to import libraries that you need in your code, libraries are pieces of code someone already created for a purpose. There are thousands of libraries out there, pretty much you only need to have an idea of what you need to do and there’s a library already created for that purpose. Popular libraries include:

  • Pandas: Data analysis/manipulation.
  • Scikit-Learn: Machine learning library.
  • Matplotlib: This is a numerical plotting library.
  • Nltk: This is a great library that allows you to do a lot of text related analysis, I use it in conjunction with Scikit-Learn to create machine learning algorithms which are text based.
  • PyGame: Library to create 2D games.

In this simpleCode exercise we will not use any additional libraries (Python ships with a standard library that has lots of goodies).
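As a tiny taste of those goodies, here is a sketch that imports two modules that come with Python, so nothing has to be installed. The list of numbers is made up for the example.

```python
import math
import statistics

numbers = [3, 4, 5, 6, 7, 8, 9, 10]  # made-up values, for illustration only

average = statistics.mean(numbers)
print(average)             # 6.5
print(math.ceil(average))  # 7 (rounded up to the next whole number)
```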

Oh and if you want to read more about the “antigravity” library please read this

Section 2: Functions and classes (classes will have their own post)

These are pieces of code that you can create and then call within your main code section (Section 3). If you have a repetitive task (like calculating the percentage for a tip in a restaurant) you could create a function called tip that accepts arguments and returns the desired value. For example:


def tip(someNumber):
     tipReturned=someNumber*.15
     return tipReturned

print(tip(5))

You can run this code and get the result of 0.75, try also changing the 5 to some other number.
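Functions can take more than one argument. As a sketch of where this could go, the variation below adds a second argument called rate (a name made up for this example) with a default value, so the function still works when you pass only the bill:

```python
def tip(someNumber, rate=.15):
    # rate defaults to 15% when the caller does not pass it
    tipReturned = someNumber * rate
    return tipReturned

print(tip(5))       # 0.75 (uses the default 15%)
print(tip(5, .20))  # 1.0  (20% this time)
```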

Indentation!

You might have noticed the indentation in the code, Python requires this (some people hate it, others love it) to work for functions, For loops etc. It basically means the following:

I am defining something here:

     Everything that is below whatever I am defining and has indentation belongs to what I defined

     This belongs as well

     And so does this

This is something different 

When writing a function for example

def func():

Right after you write the colon character and press enter, Visual Studio will do the indentation for you so you don’t have to do it manually.
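Indentation levels can also be nested. The sketch below uses a made-up function called describe to show three levels: code outside the function, code inside the function, and code inside an if that is itself inside the function.

```python
def describe(number):
    # indented once: this belongs to the function
    if number > 5:
        # indented twice: this belongs to the if
        label = 'big'
    else:
        label = 'small'
    # back to one level: still in the function, but no longer in the if/else
    return label

# no indentation: this is outside the function
print(describe(3))  # small
print(describe(8))  # big
```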

Section 3: The code

This is the section where you will make use of everything you defined in sections 1 and 2 (if needed).

Let’s go back to our tip calculator example. What if we need to calculate the tip for several values and not only 1? One way to do it would be something like this (if we have 8 values):


def tip(someNumber):
     tipReturned=someNumber*.15
     return tipReturned

print(tip(3))
print(tip(4))
print(tip(5))
print(tip(6))
print(tip(7))
print(tip(8))
print(tip(9))
print(tip(10))

This is, however, not good practice. There is a lot of unnecessary repetition, and what if we have 100 numbers that we need the calculation for?

One option is to use a Python list, which is basically an ordered collection of values such as numbers or text. We can define our list like this:

def tip(someNumber):
     tipReturned=someNumber*.15
     return tipReturned

list=[3,4,5,6,7,8,9,10]

print(tip(list))

Unfortunately this will fail because our function is designed to calculate the tip for just one number and when we send the function the complete list, Python doesn’t know what to do with this. So, how do we fix this?
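If you are curious about what Python actually says when this fails, the sketch below catches the error and prints its message. I named the list numbers here instead of list, because list is also the name of a Python built-in and reusing it can cause confusing problems later.

```python
def tip(someNumber):
    tipReturned = someNumber * .15
    return tipReturned

numbers = [3, 4, 5, 6, 7, 8, 9, 10]

try:
    tip(numbers)  # a whole list times .15 is not something Python can compute
except TypeError as error:
    message = str(error)
    print('Python complained:', message)
```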

One option is to use a loop. A loop is basically a construct that repeats an operation until a certain condition is met. Let’s try a for loop:


def tip(someNumber):
     tipReturned=someNumber*.15
     return tipReturned

list=[3,4,5,6,7,8,9,10]

for each in list:
     print(tip(each))

This runs! And we get this result

Let’s break down that for statement

for each in list: this means that for each element in the list, do whatever is defined below the statement (in this case we do a print with a function in it). The word “each” could be anything; you can change it to muffin or kitty and it will do the same. To summarize, this is a concise way to perform the same operation over the elements of a list (or array).
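Once the for loop feels comfortable, Python offers an even shorter form called a list comprehension, which builds a new list from the results in one line. This is only a sketch of that alternative; the names bills and tips are made up for the example, and the {:.2f} in the format rounds the display to two decimals.

```python
def tip(someNumber):
    tipReturned = someNumber * .15
    return tipReturned

bills = [3, 4, 5, 6, 7, 8, 9, 10]

# read it as: "give me tip(each) for each element in bills"
tips = [tip(each) for each in bills]

# zip pairs every bill with its tip so we can print them side by side
for bill, amount in zip(bills, tips):
    print('{} -> {:.2f}'.format(bill, amount))
```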

This concludes the simple Python code. In the next post we will learn how to create charts in Python.