Learning Python Programming Language with Interactive Computing Framework

Som Bohora, Department of Pediatrics

OUHSC, Statistical Computing User Group
Biomedical and Behavioral Methodology Core (BBMC)
Tuesday, March 01, 2016

Outline of presentation

  • Python 3 download and installation
  • Key differences between Python 2 and Python 3
  • Interactive computing and dynamic reporting with IPython/Jupyter Notebook
  • Python tutorial
    • Why Python?
    • Variables, expressions, and statements
    • Conditional executions
    • Functions
    • Iteration
    • Standard data types
      • Numbers, Strings
      • Lists, Tuples, and dictionaries
  • Files (working with .txt files)
  • Comparing some Python commands with their R equivalents
  • References and resources

Python 3 & Jupyter Notebook Installation

Python overview

  • Python is a high-level, general-purpose, interpreted, and object-oriented scripting language
  • Developed by Guido van Rossum in the late eighties and early nineties in the Netherlands
  • Derived from many other languages, including C, C++, and other scripting languages
  • It is copyrighted by the Python Software Foundation and distributed under the open-source PSF License, which is compatible with the GNU General Public License (GPL)

  • No type declarations are needed for variables, parameters, functions, or methods

  • Python source files use the .py extension and are run from the command prompt with python filename.py
  • Python uses 0-based index origin
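
A minimal sketch of the last two points in the list above (no type declarations, 0-based indexing). The variable names are made up for illustration only.

```python
# No type declarations: the same name can be rebound to values of different types.
value = 42
first_type = type(value).__name__       # name of the type currently bound to `value`
value = "forty-two"
second_type = type(value).__name__

# 0-based indexing: the first element sits at index 0,
# and the last one at index len(sequence) - 1.
languages = ["C", "C++", "Python"]
first_item = languages[0]
last_item = languages[len(languages) - 1]

print(first_type, second_type, first_item, last_item)
```

Asking for languages[3] here would raise an IndexError, because the valid indices are 0, 1, and 2.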

Python author: Guido van Rossum

Python features

  • Easy-to-learn: few keywords, simple structure, and clear syntax.
  • Easy-to-read: code is clearly structured and easy on the eyes.
  • Easy-to-maintain: source code is fairly easy-to-maintain.
  • A broad standard library: very portable and cross-platform compatible
  • Interactive Mode
  • Portable: run and interface on a wide variety of platforms
  • Extendable: one can add low-level modules to the Python interpreter for efficiency.
  • Databases: provides interfaces to all major commercial databases.
  • Scalable: provides better structure and support for large programs than shell scripting.

Why Python?

  • A great experience on Day 1 (minimal setup, shallow learning curve, syntax close to plain English, errors appear at runtime)
  • Ability to program on the web
  • Ability to program desktop applications
  • Increasing use in data science
  • Marketable professional skills
  • Community support
  • It can be easily integrated with C, C++, Java, and other languages

Python IDE

PyDev with Eclipse, Komodo, Emacs, Vim, TextMate, gedit, IDLE, Notepad++, PyCharm

Key differences in Python 2.x and 3.x

Parameter           Python 2.x                  Python 3.x
print               print 'hello'               print('hello')
division            3/2 is 1; 3//2 is 1         3/2 is 1.5; 3//2 is 1
xrange              xrange()                    range()
user input          raw_input()                 input()
type(range(3))      list                        range

NOTE:

In Python 2.x, put `from __future__ import division` at the top of a file to make `/` behave as it does in 3.x (true division).
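
A short Python 3 sketch of the division and range rows above. The variable names are illustrative only.

```python
# Python 3 behavior for the division and range rows of the comparison.
true_quotient = 3 / 2        # true division always returns a float
floor_quotient = 3 // 2      # floor division drops the fractional part
negative_floor = -3 // 2     # floors toward negative infinity, not toward zero

lazy_numbers = range(3)      # a lazy range object, like xrange in 2.x
range_type = type(lazy_numbers).__name__

print(true_quotient, floor_quotient, negative_floor)
print(range_type, list(lazy_numbers))
```

Wrapping the range in list() produces the actual list when one is needed.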

Variables, expressions and statements

In [66]:
# This is a comment

print('Hello World')

print(type("string"))

# `=` is an assignment statement
def findType(a):
    return type(a)

x, y, z = 1, 3.1415, 'python'
list(map(findType, (x,y,z)))
Hello World
<class 'str'>
Out[66]:
[int, float, str]
In [3]:
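# Variable names -- Python is case-sensitive
# Each line below is a SyntaxError if uncommented:
# 76value = 'big parade'   # a name cannot start with a digit
# more@ = 100              # '@' is not allowed in a name
# class = 'Biostat'        # 'class' is a reserved keyword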
In [4]:
import keyword 
# print(keyword.kwlist)

Contd variables, expressions and statements

In [1]:
x=5 ; print(x+7)

print(x % 2) # % modulus operator

first = 10 ; second = 15 ; print( first+second )
firstString = str( first ); secondString = str( second )
print( type( firstString ) )
print(firstString + secondString)

# User input 
whatCourse = input("what course is this?\n")  # "\n" is newline character
print(whatCourse)
12
1
25
<class 'str'>
1015
what course is this?
biostat I
biostat I

Conditional execution

Boolean expressions

In [2]:
print(5 == 5) ; print(5 == 6) ; type(True)
True
False
Out[2]:
bool

Comparison operators

In [3]:
x = 5; y = 6; x != y; x > y; x >= y; 

x is y; x is not y
Out[3]:
True

Logical operators

In [4]:
x > 0 and x < 10
print(not (x > y))   # Is x not greater than y?
x%2 == 0 or  x%3 == 0
17 and True          # any non-zero number is treated as `True`

5%2
True
Out[4]:
1
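
The comment about non-zero numbers being treated as `True` generalizes to other objects. The sketch below, with made-up sample values, shows which common values count as true or false and that `and`/`or` return one of their operands rather than always a bool.

```python
# Truth value of some common objects: zero, empty containers, and None are false.
samples = [17, 0, "", "text", [], [0], None]
truthiness = [bool(item) for item in samples]

# `and` returns the first false operand, or the last operand if all are true.
and_all_true = 17 and True
and_with_zero = 17 and 0

# `or` returns the first true operand, or the last operand if all are false.
or_fallback = 0 or "fallback"

print(truthiness)
print(and_all_true, and_with_zero, or_fallback)
```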

Conditional execution contd

Conditional execution

In [6]:
# x = -1; y =6
if x < 0:
    print("x is negative")
else:
    pass

if x < y:
    print(x, "is less than", y)
elif x > y :
    print(x, "is greater than", y)
else:
    print("x and y are equal")
5 is less than 6

Standard data types

Numbers and Strings

Lists

  • A list is a sequence of elements (items) of any type, separated by commas
  • [] or list() is used to create lists
  • Lists are mutable

Data types contd

In [50]:
import math
lst = [2,3,4,1, 'spam', 3.24, math.pi]; print(lst)

listWithinList = ['hello', 2.0, 3, [3,6,5]]

empty = []
print(lst, listWithinList, empty)
[2, 3, 4, 1, 'spam', 3.24, 3.141592653589793]
[2, 3, 4, 1, 'spam', 3.24, 3.141592653589793] ['hello', 2.0, 3, [3, 6, 5]] []
In [54]:
# Lists are mutable
listWithinList[0] = "Hi Som"; print(listWithinList)

# `in` operator
"Hi Som" in listWithinList

# looping through a list
for i in lst:
    print(i)
['Hi Som', 2.0, 3, [3, 6, 5]]
2
3
4
1
spam
3.24
3.141592653589793

Data types contd

In [56]:
# Traversing a list
numbers = [3,5,2]
for i in range(len(numbers)):
    numbers[i] = numbers[i] * 2
print(numbers)
[6, 10, 4]
In [38]:
# List operations
a = [1,2,3,5,4,[2,5,1]]; b = [4,5,9]
print(a + b)

print([0]*4)
[1, 2, 3, 5, 4, [2, 5, 1], 4, 5, 9]
[0, 0, 0, 0]

Data types contd

In [55]:
# List slices
print(a[:])
print(a[:3])  # non-inclusive end-point, this gives 0,1,2 elements
print(a[2:])

a[1:3] = [777, 999];  print(a)

# Slice a list within a list
print(a[5][1])

# List methods
c = a[:4]
c.append(888); print(c)
# c.insert(2, 2222222); print(c)  # the first argument is the index to insert at

# sort lists
c.sort(reverse=True); print(c)
sorted(c, reverse=True)
[1, 777, 999, 5, 4, [2, 5, 1]]
[1, 777, 999]
[999, 5, 4, [2, 5, 1]]
[1, 777, 999, 5, 4, [2, 5, 1]]
5
[1, 777, 999, 5, 888]
[999, 888, 777, 5, 1]

Data types contd

In [ ]:
# Delete elements
del c[1]; print(c)       # del c[1:3] for multiple elements
c.remove(999); print(c)

# List and functions
print(len(c)); print(max(c)); print(sum(c))

# Lists and strings
s = 'Python is a magic'
t = list(s); print(t)
split_str = s.split() ; print(split_str); # default separator is any whitespace

delimiter = ' '
print(delimiter.join(split_str))
[999, 777, 5, 1]
[777, 5, 1]
3
777
783
['P', 'y', 't', 'h', 'o', 'n', ' ', 'i', 's', ' ', 'a', ' ', 'm', 'a', 'g', 'i', 'c']
['Python', 'is', 'a', 'magic']
Python is a magic

Tuples

  • A tuple is a sequence of values, like a list; the values can be of any type and are separated by commas
  • () or tuple() may be used to create tuples
  • Tuples are immutable

Tuples contd

In [6]:
t = ('a', 'b', 'c', 'd','e'); print(t)  # equivalent: t = 'a', 'b', 'c', 'd', 'e'
# Parentheses are not required, but it is common to enclose tuples in them

# To create a tuple with a single element
t2 = ('a',)  # t2 = ('a') is  a string

t3 = tuple()
print(t[1:])

# t[0] = 'A'          # Cannot modify it

# You cannot modify the elements of a tuple, but you can replace one tuple with another
t4 = ('A',) + t[1:]
print(t4)

t = tuple('Hello'); print(t)

# Clever application of tuple is: Swap values between two variables
a = 2; b = 3
a, b = b, a
('a', 'b', 'c', 'd', 'e')
('b', 'c', 'd', 'e')
('A', 'b', 'c', 'd', 'e')
('H', 'e', 'l', 'l', 'o')

Tuples contd

In [56]:
# Comparing tuples
print((0, 1, 2) < (0, 3, 4))
print((0, 1, 2000000) < (0, 3, 4))

# Tuple assignment
m = [ 'have', 'fun' ]
x,y = m
(x, y) = m
print(x,y)
True
True
have fun

Dictionaries

  • A dictionary is like a list (whose index positions are integers), but more general (indices can be of any hashable type)
  • Think of it as a mapping between a set of indices (keys) and a set of values; each key maps to one value (a key-value pair)
  • It is similar to a lookup table, associative array, or hash table
  • {} or dict() is used to create dictionaries
  • Dictionaries are mutable
  • Duplicate keys are not allowed
  • Duplicate values are just fine
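
A small sketch of the key and value rules listed above. The dictionary names and contents are hypothetical. Note that Python does not raise an error for a repeated key in a literal; the last value silently wins.

```python
# A repeated key does not create two entries: the later value replaces the earlier one.
prices = {"apple": 1.0, "pear": 2.0, "apple": 3.0}
number_of_keys = len(prices)
apple_price = prices["apple"]

# Duplicate values are fine: two different keys may map to the same value.
prices["plum"] = 2.0
keys_priced_at_two = sorted(key for key, value in prices.items() if value == 2.0)

# Keys can be of any hashable type, not only integers.
lookup = {1: "int key", (0, 1): "tuple key", "a": "str key"}

print(number_of_keys, apple_price, keys_priced_at_two)
print(lookup[(0, 1)])
```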

Dictionaries contd

In [2]:
# Create a dictionary
eng2sp = dict()

# Add an item to dictionary
eng2sp['one'] = 'uno'

eng2sp = {'one':'uno', 'two':'dos', 'three':'tres'}

# Add an item in dictionary
eng2sp['four'] = 'cuatro'

# Delete item from dictionary
del eng2sp['four'] ; print(eng2sp)

print(eng2sp['two']); print(len(eng2sp))

print('one' in eng2sp)  # by default, `in` looks up the keys

# To see whether something appears as value
print('uno' in eng2sp.values())
{'three': 'tres', 'one': 'uno', 'two': 'dos'}
dos
3
True
True

Dictionaries contd

In [6]:
# Looping through dictionaries
for key,value in eng2sp.items():
    print(key, value)

vals1 = eng2sp.keys()       # Similarly, .values() gives the values
# g = list(vals1) # change to list
print(vals1)

# Another dictionary example
d = {'a':10, 'b':1, 'c':22}
t = d.items(); print(t) ; print(sorted(t))   # items() returns a view of (key, value) tuples

for key in d:
    if d[key] > 10:
        print(key, d[key])

# sort by value
l = list()
for key, value in d.items():
    l.append( (value, key) )
l.sort(reverse=True)       # sort once, after the loop has collected every pair
print(l)

# changing value of an element of a list
x = [5, None, 10] ; print(x)
for idx, i in enumerate(x):
    if i == 5:
        x[idx] =1000
print(x)

seasons = ['Spring', "Summer", "Fall", "Winter"]
list(enumerate(seasons))
three tres
one uno
two dos
dict_keys(['three', 'one', 'two'])
dict_items([('c', 22), ('a', 10), ('b', 1)])
[('a', 10), ('b', 1), ('c', 22)]
c 22
[(22, 'c'), (10, 'a'), (1, 'b')]
[5, None, 10]
[1000, None, 10]
Out[6]:
[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
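
As an alternative to building a list of (value, key) tuples by hand, the sort-by-value step can be sketched with the built-in sorted() and a key function. This is one possible approach, not necessarily the one intended in the original cell.

```python
d = {'a': 10, 'b': 1, 'c': 22}

# Sort the (key, value) pairs by the value (element 1 of each pair), largest first.
by_value_desc = sorted(d.items(), key=lambda item: item[1], reverse=True)

# The keys alone, in the same order.
keys_by_value = [key for key, value in by_value_desc]

print(by_value_desc)
print(keys_by_value)
```

Here the pairs stay in (key, value) order, so no swapping of tuple elements is needed.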

Functions

try and except

In [5]:
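def divide(a,b):
    try:
        return True, a/b
    except ZeroDivisionError:      # catch only the error we expect
        return "Non divisible", None
divide(2,0)                        # returns ('Non divisible', None); nothing is printed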

# Writing a function
def addTwo(a,b):
    added = a + b
    return added
print(addTwo(2,3))
5
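
A slightly more defensive variant of the try/except pattern, sketched under the assumption that the caller wants a consistent (ok, value) pair back. The name safe_divide is hypothetical and not part of the original slides.

```python
def safe_divide(a, b):
    """Return (True, quotient) on success, or (False, None) if division fails."""
    try:
        return True, a / b
    except (ZeroDivisionError, TypeError):
        # ZeroDivisionError: b is zero; TypeError: operands that cannot be divided
        return False, None

print(safe_divide(6, 3))
print(safe_divide(2, 0))
print(safe_divide("6", 3))
```

Naming the expected exception types keeps unrelated bugs (for example a misspelled variable) from being silently swallowed.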

Functions contd

Built-in and new functions

In [3]:
text = "Hello world"
print(max(text)); print(min(text)) ;len(text)

# Adding new functions
def print_lines():
    print("Hi, I am Som.")
print_lines()

def repeat_lines():
    print("Here begins new function")
    print_lines(); print_lines()
repeat_lines()
w
 
Hi, I am Som.
Here begins new function
Hi, I am Som.
Hi, I am Som.

Iteration

In [7]:
x = 0
x = x + 1; print(x)

# For loops
for i in range(4):      # end is non-inclusive
    print("The value is:", i)
print("Done")

for letter in "python":
    print("The letter is,", letter)

friends = ['Binod', 'Achyut', 'Bikram']
for friend in friends:
    print(friend, 'has', len(friend), 'letters.')
    
for i in range(0,2):
    for j in "Hi":
        print(i,j)
1
The value is: 0
The value is: 1
The value is: 2
The value is: 3
Done
The letter is, p
The letter is, y
The letter is, t
The letter is, h
The letter is, o
The letter is, n
Binod has 5 letters.
Achyut has 6 letters.
Bikram has 6 letters.
0 H
0 i
1 H
1 i

Iteration contd

In [ ]:
# While loops
n = 5
while (n > 0):                 # while n > 0, display, and reduce by 1
    print(n)
    n = n - 1
print('Done')

# Take user input until they type `done`
while True:
    line = input('> ')
    if line.startswith('#'):       # unlike line[0], this is safe for an empty line
        continue
    if line == 'done' or line == 'Done':
        break
    print(line)
print('Done')

Strings

  • A string is a sequence of characters
  • Characters can be accessed one at a time with a bracket operator
  • Strings are immutable

Strings contd

In [34]:
# Strings are immutable
fruit = 'apple'   # [0] = 'a', [1] = 'p', [2] = 'p', [3] = 'l',[4] = 'e',
print(fruit[1], len(fruit))
  
length = len(fruit)
# print(fruit[length])   # IndexError: valid indices run from 0 to length-1
print(fruit[length-1])

# Traversing through a string
for char in fruit:
    print(char)

# Traversing through a string with index
for idx, val in enumerate(fruit): # getting index of loops
    print(idx, val)
    
# `in` operator
'a' in 'Anaconda'

# String methods
print(type(fruit))
print(dir(fruit))
p 5
e
a
p
p
l
e
0 a
1 p
2 p
3 l
4 e
<class 'str'>
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

Strings contd

In [6]:
# String slices
s = 'Jython in Java'
print(s[0:5]); print(s[6:12]); print(s[:]); print(s[:3]); print(s[3:])

# Strings are immutable
greeting = 'Hello friends!'

# greeting[0] = 'M'; print(greeting)     # DOES NOT WORK

# Slice and concatenate
new_greeting = 'M' + greeting[1:] ; print(new_greeting)

# Looping and counting
program = 'Anaconda'
count = 0
for letter in program:
    if letter == 'a':
        count += 1
print(count)
Jytho
 in Ja
Jython in Java
Jyt
hon in Java
Mello friends!
2

Strings contd

In [7]:
# Change case
print(program.upper()) ; print(program.find('a'))   # finds first occurrence

# remove whitespace at the beginning and end of the string 
print('  here we go   '.strip())
print(program.startswith('b'))  # logical
print(program.upper()[program.upper().startswith('A')-1])   #  get the letter
# print(True - 1)

# Parsing the strings
data = 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
at_position = data.find('@')
space_position = data.find(' ', at_position)
host = data[at_position+1:space_position] ; print(host)

# Format operator `%`
value = 99
print('There are %d pythons.' %value)
print('The value of %s i.e. %g can be rounded to %d.' % ('pi', 3.14, 3))
ANACONDA
2
here we go
False
A
uct.ac.za
There are 99 pythons.
The value of pi i.e. 3.14 can be rounded to 3.
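
The same two messages can also be produced with the str.format() method, shown here as an alternative sketch to the `%` operator. The variable names line1 and line2 are illustrative.

```python
value = 99

# {} placeholders are filled in order; a format spec after ':' works like %g and %d.
line1 = 'There are {} pythons.'.format(value)
line2 = 'The value of {} i.e. {:g} can be rounded to {:d}.'.format('pi', 3.14, 3)

print(line1)
print(line2)
```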
In [67]:
# Making sure that only values greater than 0 are allowed
while True:
    try:
        firstValue = int(input("Enter first number:"))
    except ValueError:
        print("ERROR!!! Please enter a whole number")
    else:
        if firstValue > 0:
            break
        else:
            print("ERROR!!! Please enter value greater than 0")

# Print type of operations
print("Choose operation: 1-add, 2-subtract,3-multiply, 4-divide")

# Storing input type from above list
operation = input("Choose operation from above list (1/2/3/4): ")

# Making sure that only values greater than 0 are allowed
while True:
    try:
        secondValue = int(input("Enter second number:"))
    except ValueError:
        print("ERROR!!! Please enter a whole number")
    else:
        if secondValue > 0:
            break
        else:
            print("ERROR!!! Please enter value greater than 0")

# Conditional execution to perform selected tasks of calculation
if operation == '1':
    result = (firstValue + secondValue)
elif operation == '2':
    result = (firstValue - secondValue)
elif operation == '3':
    result = (firstValue * secondValue)
elif operation == '4':
    result = (firstValue / secondValue)
else:
    result = None              # no valid operation was chosen
    print("Invalid input")
if result is not None:
    print(result)
print("Done calculation!!!")
#quit()
Enter first number:2
Choose operation: 1-add, 2-subtract,3-multiply, 4-divide
Choose operation from above list (1/2/3/4): 2
Enter second number:6
-4
Done calculation!!!
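
The if/elif chain can also be sketched as a dictionary that maps each menu choice to a function from the standard operator module. The names OPERATIONS and calculate are hypothetical, and the interactive input is left out so the sketch can run on its own.

```python
import operator

# Map each menu choice to the function that performs it.
OPERATIONS = {
    '1': operator.add,
    '2': operator.sub,
    '3': operator.mul,
    '4': operator.truediv,
}

def calculate(operation, first, second):
    """Apply the chosen operation, or return None for an unknown choice."""
    func = OPERATIONS.get(operation)
    if func is None:
        return None
    return func(first, second)

print(calculate('2', 2, 6))    # same inputs as the run shown above
print(calculate('9', 2, 6))    # unknown choice
```

Adding a new operation then only requires one more dictionary entry.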

Files

  • Opening, reading, and searching through .txt files

In [46]:
# reading from a text file

print("Opening file")
text_file = open("test.txt", "r")   # open("test.txt").read() would return the contents as one string instead
print(text_file)
#print(dir(text_file))

#print(text_file.read(1))       # reads the first character
#print(text_file.read(5))       # reads the next five characters (continues after what was already read)

# wholeFile = text_file.read()     # reads the rest of the file (continues after what was already read)
# print(wholeFile.split())         # splits on whitespace


# print(text_file.readline())    # reads the first line
# print(text_file.readline(5))    # reads at most 5 characters of the next line
Opening file
<_io.TextIOWrapper name='test.txt' mode='r' encoding='cp1252'>

Contd Files

In [47]:
lines = text_file.readlines()    # reads all lines and results in  a list with newline character
print(lines)

# for i in lines:     # each line ends with a newline ('\n') character
#     i = i.rstrip()
#     print(i)

# Searching through a file
for line in lines:
    line = line.rstrip()                      # rstrip removes trailing whitespace, including '\n'
    if not line.startswith("Man"):
        continue   # if line.startswith('From '):
    words = line.split()
    print(line)                                                            

text_file.close()
['I am a test file.\n', 'Maybe someday, he will promote me to a real file.\n', 'Man, I long to be a real file\n', 'and hang out with all my new real file friends.']
Man, I long to be a real file
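
The open/search/close steps above are often written with a `with` block, which closes the file automatically. The sketch below writes its own throwaway file so that it runs without test.txt; the sample lines and variable names are assumptions for illustration.

```python
import os
import tempfile

sample_lines = ["I am a test file.\n", "Man, I long to be a real file\n"]

# Create a temporary text file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.writelines(sample_lines)
    path = handle.name

matches = []
with open(path, "r") as text_file:        # closed automatically when the block ends
    for line in text_file:                # a file object yields one line at a time
        line = line.rstrip()              # drop the trailing newline
        if line.startswith("Man"):
            matches.append(line)

file_closed = text_file.closed            # True once the with block has finished
os.remove(path)                           # clean up the temporary file

print(matches, file_closed)
```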

Python commands the R way

Note: Base Python is used; NumPy (imported as np) is used where appropriate.

Method                  R                                                 Python 3.x
command line program    R, then print("hi")                               python, then print("hi")
block delimiters        { }                                               offside rule (indentation)
assignment              i = 3; i <- 3; 3 -> i; assign("i", 3)             i = 3
null                    NA or NULL                                        None or np.nan
null test               is.na(x) or is.null(x)                            x is None (or x == None)
conditional expression  if (x > 0) y else -y or ifelse(x > 0, y, -y)      y if x > 0 else -y
True/False              TRUE FALSE T F                                    True False
logical operators       &, |, !                                           and, or, not
integer division        13 %/% 5                                          13 // 5
string concatenation    paste0("one", "two")                              "one" + "two"
case change             tolower("FOO")                                    'FOO'.lower()
# characters            nchar("hi")                                       len("hi")
start index             1                                                 0
concatenation           a = c(1,2); a2 <- append(a, c(2,3))               a = [1, 2]; a + [2, 3]
sequence                seq(0, 100, 10)                                   range(0, 101, 10)
array                   A = array(c(1, 2, 3))                             np.array([1, 2, 3])
matrix                  A = matrix(c(1, 2, 3, 4), nrow=2, byrow=TRUE)     A = np.array([[1, 2], [3, 4]])
dictionary              d = list(n=10, avg=3.6, sd=0.3)                   d = {'n': 10, 'avg': 3.6, 'sd': 0.3}
update                  d$var = d$sd**2                                   d['var'] = d['sd']**2
function                add = function(x, y) {x + y}                      def add(x, y): return x + y
break/continue          break, next                                       break, continue
install package         install.packages("ggplot2")                       pip3 install scipy
data type               class(x)                                          type(x)
undefine variable       rm(x)                                             del x
eval                    eval(parse(text='1 + 1'))                         eval('1 + 1')
help                    help(x) or ?x                                     help(x)
load package            library("dplyr")                                  import math
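
A few rows from the Python column, run together as a quick check. This sketch uses base Python only (no NumPy), and the variable names are illustrative.

```python
quotient = 13 // 5                      # integer division, like 13 %/% 5 in R
joined = "one" + "two"                  # concatenation without a separator, like paste0()
lowered = "FOO".lower()                 # like tolower("FOO")
n_chars = len("hi")                     # like nchar("hi")
sequence = list(range(0, 101, 10))      # like seq(0, 100, 10); the end point 101 is excluded

d = {'n': 10, 'avg': 3.6, 'sd': 0.3}    # like a named list in R
d['var'] = d['sd'] ** 2                 # add a new key computed from an existing one

def add(x, y):
    return x + y                        # unlike R, Python needs an explicit return

evaluated = eval('1 + 1')               # like eval(parse(text='1 + 1'))

print(quotient, joined, lowered, n_chars)
print(sequence)
print(sorted(d), add(2, 3), evaluated)
```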

References and resources

Thanks!

Coming up -- Python for Data Science with pandas, numpy, and scikit-learn