=======================================
Python: complex statements (Solutions)
=======================================
Conditional code: ``if``
---------------------------
#. Solution::
number = int(raw_input("write a number: "))
if number % 2 == 0:
print "even"
else:
print "odd"
We use ``else``, since even and odd are the only two possibilities.
A way to make a third option explicit would be::
if number % 2 == 0:
print "even"
elif number % 2 == 1:
print "odd"
else:
print "impossible!"
but the code in ``else`` will never be executed for any value of ``number``!
Since the two options are mutually exclusive, we can also write::

    if number % 2 == 0:
        print "even"
    if number % 2 == 1:
        print "odd"

Even without the ``else``, one and only one of the two ``if`` blocks is executed.
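As a quick check of this claim, here is a minimal sketch that compares the ``if``/``else`` version with the two-``if`` version over a range of integers. The helper names ``parity_chain`` and ``parity_two_ifs`` are made up for this illustration, and the code uses single-argument ``print(...)`` so that it should run under both Python 2 and Python 3.

```python
def parity_chain(number):
    # if/else version: exactly one branch runs
    if number % 2 == 0:
        return ["even"]
    else:
        return ["odd"]


def parity_two_ifs(number):
    # two independent ifs: both conditions are tested every time
    labels = []
    if number % 2 == 0:
        labels.append("even")
    if number % 2 == 1:
        labels.append("odd")
    return labels


# compare the two versions on negative and positive integers
same = all(parity_chain(n) == parity_two_ifs(n) for n in range(-20, 21))
print(same)
```

Note that in Python ``-3 % 2`` evaluates to ``1``, so the second test also catches negative odd numbers.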
#. Solution::
number = float(raw_input("write rational: "))
if number >= -1 and number <= 1:
print "okay"
We need neither ``elif`` (there is only one condition) nor ``else`` (if the
condition is false, we don't need to do anything).
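The same interval test can also be sketched with a chained comparison, which Python reads as ``-1 <= number and number <= 1``. The function name ``in_unit_interval`` is an assumption introduced only for this example.

```python
def in_unit_interval(number):
    # chained comparison: both bounds are checked in one expression
    return -1 <= number <= 1


for value in (-1.5, -1.0, 0.25, 1.0, 1.5):
    if in_unit_interval(value):
        print("%s okay" % value)
```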
#. Solution::
answer = raw_input("write two numbers separated by a space: ")
words = answer.split()
num1 = int(words[0])
num2 = int(words[1])
if num1 > num2:
print "first"
elif num2 > num1:
print "second"
else:
print "neither"
Alternatively::
answer = raw_input("write two numbers separated by a space: ")
numbers = [int(word) for word in answer.split()]
if numbers[0] > numbers[1]:
print "first"
elif numbers[0] < numbers[1]:
print "second"
else:
print "neither"
#. Solution::
horoscope_of = {
"January": "extreme luck",
"February": "try to be born again",
"March": "kissed by fortune",
"April": "lucky luke",
}
month = raw_input("tell me your birth month: ")
if horoscope_of.has_key(month):
print horoscope_of[month]
else:
print "not available"
#. Solution::
path = raw_input("write your path: ")
lines = open(path, "r").readlines()
if len(lines) == 0:
print "empty"
elif len(lines) < 100:
print "short", len(lines)
elif len(lines) < 1000:
print "average", len(lines)
else:
print "large", len(lines)
Note that we don't need to spell out the conditions in full: in the code
we can shorten ``100 <= len(lines) < 1000`` to ``len(lines) < 1000``.
We can do that because, when ``len(lines)`` is lower than ``100``,
the first ``elif`` is executed and the second ``elif`` is not even considered.
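To see the chain at work on the boundary values, here is a small sketch that moves the same tests into a function. The name ``classify`` is hypothetical, and the line counts are passed in directly instead of being read from a file.

```python
def classify(num_lines):
    # each test is reached only if all the previous ones were false
    if num_lines == 0:
        return "empty"
    elif num_lines < 100:
        return "short"
    elif num_lines < 1000:
        # here we already know that num_lines >= 100
        return "average"
    else:
        return "large"


for n in (0, 1, 99, 100, 999, 1000):
    print("%d -> %s" % (n, classify(n)))
```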
#. Solution::
point1 = [float(word) for word
in raw_input("write three coordinates: ").split()]
point2 = [float(word) for word
in raw_input("write three coordinates: ").split()]
if point1[0] >= 0 and point1[1] >= 0 and point1[2] >= 0 and \
point2[0] >= 0 and point2[1] >= 0 and point2[2] >= 0:
diff_x = point1[0] - point2[0]
diff_y = point1[1] - point2[1]
diff_z = point1[2] - point2[2]
print "the distance is", (diff_x**2 + diff_y**2 + diff_z**2)**0.5
Note that ``print`` is *inside* the ``if``.
#. Solution: we know that ``number`` is an arbitrary integer, chosen by the user::

    if number % 3 == 0:
        print "divisible by 3"
    elif number % 3 != 0:
        print "not divisible by 3"
    else:
        print "dunno"

``if``, ``elif`` and ``else`` form a chain: only one of them is executed.
#. ``if`` is executed if and only if ``number`` is divisible by three.
#. ``elif`` is executed if and only if the previous ``if`` is not executed and
``number`` is *not* divisible by three.
#. ``else`` is executed whenever neither the ``if`` nor the ``elif`` is executed.
Since every number is either divisible by ``3`` or not, there is no other possibility:
``else`` will *never* be executed.
Therefore, the answer is no.
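The argument can be checked empirically with a short sketch that records which branch is taken for many integers. The function ``branch_taken`` is an illustrative helper, not part of the exercise, and a finite range is of course only a sanity check, not a proof.

```python
def branch_taken(number):
    # same chain as in the exercise, returning the branch name
    if number % 3 == 0:
        return "if"
    elif number % 3 != 0:
        return "elif"
    else:
        return "else"


# collect the names of all the branches that are ever reached
branches = set(branch_taken(n) for n in range(-1000, 1001))
print(sorted(branches))
```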
#. Solution: as before, ``number`` is an arbitrary integer. The code is::

    number = int(raw_input("write a number: "))
    if number % 2 == 0:
        print "divisible by 2"
    if number % 3 == 0:
        print "divisible by 3"
    if number % 2 != 0 and number % 3 != 0:
        print "dunno"

Here we don't have a chain of ``if``, ``elif`` and ``else``: we have three independent ``if`` statements.
#. The first ``if`` is executed if and only if ``number`` is divisible by two.
#. The second ``if`` is executed if and only if ``number`` is divisible by three.
#. The third ``if`` is executed if and only if ``number`` is divisible by *neither*
two nor three.
If ``number`` is 6, which is divisible by both two and three, the first two
``if`` are both executed, while the third is not.
If ``number`` is 5, which is divisible by neither two nor three, the first two
``if`` are *not* executed, but the third is.
Therefore, the answer is yes.
(There is no value of ``number`` for which none of the three ``if`` is executed.)
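A minimal sketch of the same reasoning: the helper below (``fired_ifs``, a name introduced only for this example) returns the positions of the ``if`` statements whose condition holds for a given number.

```python
def fired_ifs(number):
    # three independent tests: any subset of them can succeed
    fired = []
    if number % 2 == 0:
        fired.append(1)
    if number % 3 == 0:
        fired.append(2)
    if number % 2 != 0 and number % 3 != 0:
        fired.append(3)
    return fired


for n in (4, 5, 6, 9):
    print("%d -> %s" % (n, fired_ifs(n)))
```

For 6 the list contains two entries, which is what distinguishes independent ``if`` statements from an ``if``/``elif`` chain.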
#. Solution::
answer = raw_input("sum or product?: ")
if answer == "sum":
num1 = int(raw_input("number 1: "))
num2 = int(raw_input("number 2: "))
print "the sum is", num1 + num2
elif answer == "product":
num1 = int(raw_input("num1: "))
num2 = int(raw_input("num2: "))
print "the product is", num1 * num2
Using ``if`` instead of ``elif`` for the second test would not change the behavior of the program, since the two conditions are mutually exclusive.
We can simplify like this::
answer = raw_input("sum or product?: ")
num1 = int(raw_input("number 1: "))
num2 = int(raw_input("number 2: "))
if answer == "sum":
print "the sum is", num1 + num2
elif answer == "product":
print "the product is", num1 * num2
Iterative code: ``for`` and ``while``
--------------------------------------
#. Solutions:
#. Solution::
for number in range(10):
print number
#. Solution::
for number in range(10):
print number**2
#. Solution::
sum_of_squares = 0
for number in range(10):
sum_of_squares = sum_of_squares + number**2
print sum_of_squares
#. Solution::
product = 1 # note that for the product the initial value should be 1!
for number in range(1,10):
product = product * number
print product
#. Solution::
volume_of = {
"A": 67.0, "C": 86.0, "D": 91.0,
"E": 109.0, "F": 135.0, "G": 48.0,
"H": 118.0, "I": 124.0, "K": 135.0,
"L": 124.0, "M": 124.0, "N": 96.0,
"P": 90.0, "Q": 114.0, "R": 148.0,
"S": 73.0, "T": 93.0, "V": 105.0,
"W": 163.0, "Y": 141.0,
}
sum_of_volumes = 0
for volume in volume_of.values():
sum_of_volumes = sum_of_volumes + volume
print sum_of_volumes
#. Solution::
volume_of = {
"A": 67.0, "C": 86.0, "D": 91.0,
"E": 109.0, "F": 135.0, "G": 48.0,
"H": 118.0, "I": 124.0, "K": 135.0,
"L": 124.0, "M": 124.0, "N": 96.0,
"P": 90.0, "Q": 114.0, "R": 148.0,
"S": 73.0, "T": 93.0, "V": 105.0,
"W": 163.0, "Y": 141.0,
}
fasta = """>1BA4:A|PDBID|CHAIN|SEQUENCE
DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVV"""
# Let's extract the sequence
sequence = fasta.split("\n")[1]
sum_of_volumes = 0
# for each character in the sequence ...
for aa in sequence:
volume_of_aa = volume_of[aa]
sum_of_volumes = sum_of_volumes + volume_of_aa
print sum_of_volumes
#. Solution: let's adapt the code from the previous example::

    # we avoid calling the variable "list", which would hide the built-in type
    numbers = [1, 25, 6, 27, 57, 12]
    minimum_so_far = numbers[0]
    for number in numbers[1:]:
        if number < minimum_so_far:
            minimum_so_far = number
    print "the minimum value is:", minimum_so_far

#. Solution: let's combine the example and the previous exercise::

    # the names "list", "max" and "min" would hide Python built-ins
    numbers = [1, 25, 6, 27, 57, 12]
    maximum = numbers[0]
    minimum = numbers[0]
    for number in numbers[1:]:
        if number > maximum:
            maximum = number
        if number < minimum:
            minimum = number
    print "minimum =", minimum, "maximum =", maximum

#. Solution: ``range(0, len(sequence), 3)`` returns ``[0, 3, 6, 9, ...]``,
containing the positions of the first character of all the triplets.
Let's write::

    sequence = "ATGGCGCCCGAACAGGGA"
    # let's start from an empty list
    triplets = []
    for pos_start in range(0, len(sequence), 3):
        # extract the triplet starting at pos_start
        triplet = sequence[pos_start:pos_start+3]
        triplets.append(triplet)
    print triplets

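The slicing idea can also be sketched as a list comprehension. This is an alternative formulation, not the form the exercise asks for; it uses single-argument ``print(...)`` so that it should run under both Python 2 and Python 3.

```python
sequence = "ATGGCGCCCGAACAGGGA"

# one slice of length 3 for every starting position 0, 3, 6, ...
triplets = [sequence[pos:pos + 3] for pos in range(0, len(sequence), 3)]
print(triplets)
```

Joining the triplets back together with ``"".join(triplets)`` gives the original sequence, which is a handy way to check that nothing was lost.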
#. Solution::
text = """>2HMI:A|PDBID|CHAIN|SEQUENCE
PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKI
>2HMI:B|PDBID|CHAIN|SEQUENCE
PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKI
>2HMI:C|PDBID|CHAIN|SEQUENCE
DIQMTQTTSSLSASLGDRVTISCSASQDISSYLNWYQQKPEGTVKLLIYY
>2HMI:D|PDBID|CHAIN|SEQUENCE
QITLKESGPGIVQPSQPFRLTCTFSGFSLSTSGIGVTWIRQPSGKGLEWL
>2HMI:E|PDBID|CHAIN|SEQUENCE
ATGGCGCCCGAACAGGGAC
>2HMI:F|PDBID|CHAIN|SEQUENCE
GTCCCTGTTCGGGCGCCA"""
# first, let's split the text into lines
lines = text.split("\n")
# then, let's create an empty dictionary
sequence_of = {}
# now we can iterate on lines
for line in lines:
if line[0] == ">":
# if the line is a header, we extract the sequence name
name = line.split("|")[0]
else:
# the line contains the sequence, that we add to the dictionary, using the name extracted before as key
sequence_of[name] = line
print sequence_of
#. Solutions:
#. Solution::
while raw_input("write 'STOP': ") != "STOP":
print "you must write 'STOP'..."
#. Solution::
while raw_input("write stop: ").lower() != "stop":
print "you must write 'stop'..."
#. Solutions:
#. Solution: all numbers in ``range(10)``.
#. Solution: the number ``0``. ``break`` immediately interrupts the ``for`` cycle.
#. Solution: all numbers in ``range(10)``. ``continue`` jumps to the next iteration, as Python automatically does when the instructions in the ``for`` cycle are finished. Since ``continue`` in this case is right at the end of the ``for`` cycle, it doesn't have any effect.
#. Solution: the number ``0``. In the first iteration, when ``number`` has value ``0``, first Python executes ``print number``, printing ``0``; then ``if`` is executed, and also the ``break`` inside the ``if``, immediately interrupting the ``for`` cycle.
#. Solution: nothing. In the first iteration, when ``number`` has value ``0``, ``if`` is executed and also the ``break`` inside the ``if``, immediately interrupting the ``for`` cycle. Therefore, ``print`` is never executed.
#. Solution: nothing. Instructions inside the ``while`` are never executed, since the condition is ``False``!
#. Solution: nothing. Instructions inside the ``while`` are never executed, since the condition is ``False``! As a consequence, the line ``condition = True`` is never executed.
#. Solution: ``"the condition is true"`` an infinite number of times. Since the condition is always ``True``, the ``while`` never stops iterating!
#. Solution: ten strings of the form ``"position 0 contains the element 0"``, ``"position 1 contains the element 1"``, and so on.
#. Solution: all the elements of ``lines`` (processed by ``strip()``) occurring before the first empty line: ``"line 1"``, ``"line 2"`` and ``"line 3"``. As soon as ``line`` has value ``""`` (the fourth element of ``lines``) the ``if`` is executed, and ``break`` interrupts the cycle. Note that the fourth row is *not* printed.
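The exercise snippets themselves are not reproduced in this section, so the sketch below rebuilds three of the cases from the descriptions above; the exact shape of each loop is therefore an assumption. Each function collects the values that would be printed instead of printing them directly, which makes the effect of ``break`` and ``continue`` easy to compare.

```python
def break_after_print():
    printed = []
    for number in range(10):
        printed.append(number)  # stands in for "print number"
        if number == 0:
            break               # leaves the loop during the first iteration
    return printed


def break_before_print():
    printed = []
    for number in range(10):
        if number == 0:
            break               # the line below is never reached
        printed.append(number)
    return printed


def continue_at_the_end():
    printed = []
    for number in range(10):
        printed.append(number)
        continue                # last instruction of the body: no effect
    return printed


print(break_after_print())
print(break_before_print())
print(continue_at_the_end())
```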
#. Solution::
numbers = (0, 1, 1, 0, 0, 0, 1, 1, 2, 1, 2)
for i in range(len(numbers)):
number_in_pos_i = numbers[i]
if number_in_pos_i == 2:
print "the position is", i
break
#. Solution::
strings = ("000", "51", "51", "32", "57", "26")
for i in range(len(strings)):
string_in_pos_i = strings[i]
if "2" in string_in_pos_i:
print "position =", i, "value =", string_in_pos_i
break
#. Solution::

    import random

    length = int(raw_input("write the length of the sequence: "))
    alphabet = "AGCT"
    sequence = ""
    for i in range(length):
        # randint(0, 3) returns a random position, both ends included
        index = random.randint(0, 3)
        sequence = sequence + alphabet[index]
    print sequence

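As a possible variant, ``random.choice`` picks an element of a sequence directly, so the explicit index is not needed. The function name ``random_dna`` and the fixed length ``12`` are assumptions made for this sketch, which replaces the user prompt with a plain argument.

```python
import random


def random_dna(length, alphabet="AGCT"):
    # random.choice returns one random character of the alphabet
    return "".join(random.choice(alphabet) for _ in range(length))


sequence = random_dna(12)
print(sequence)
```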
Nested code
-------------
#. Solution::
n = 5
matrix = [range(n) for i in range(n)]
for line in matrix:
for element in line:
print element
#. Solution:
#. All the elements of the matrix.
#. The *sum* of all the elements of the matrix.
#. Again, all the elements of the matrix.
#. Again, all the elements of the matrix.
#. The list of the elements on the diagonal.
#. Solution::
numbers = [8, 3, 2, 9, 7, 1, 8]
for num_1 in numbers:
for num_2 in numbers:
print num_1, num_2
This code is very similar to the clock example!
#. Solution::

    numbers = [8, 3, 2, 9, 7, 1, 8]
    already_printed_pairs = []
    for i in range(len(numbers)):
        for j in range(len(numbers)):
            pair = (numbers[i], numbers[j])
            # check whether we already printed the symmetric pair
            if (pair[1], pair[0]) in already_printed_pairs:
                continue
            # this code will be executed if the pair has not been printed:
            # print the pair and update already_printed_pairs
            print pair
            already_printed_pairs.append(pair)

#. The solution is the same as for the previous exercise.
#. Solution::
numbers = range(10)
for element_1 in numbers:
for element_2 in numbers:
if 2 * element_1 == element_2:
print element_1, element_2
#. Solution::
numbers = [8, 3, 2, 9, 7, 1, 8]
for element_1 in numbers:
for element_2 in numbers:
if element_1 + element_2 == 10:
print element_1, element_2
#. Solution::
numbers = [8, 3, 2, 9, 7, 1, 8]
# first, let's create an empty list
list_of_pairs = []
for element_1 in numbers:
for element_2 in numbers:
if element_1 + element_2 == 10:
# update the list with append()
list_of_pairs.append((element_1, element_2))
# finally, let's print the list
print list_of_pairs
#. Solution::
numbers_1 = [5, 9, 4, 4, 9, 2]
numbers_2 = [7, 9, 6, 2]
# iteration on the *first* list
for i in range(len(numbers_1)):
num_in_pos_i = numbers_1[i]
# iteration on the *second* list
for j in range(len(numbers_2)):
num_in_pos_j = numbers_2[j]
if num_in_pos_i == num_in_pos_j:
print "positions:", i, j, "; repeated value:", num_in_pos_i
#. Solution::

    numbers_1 = [5, 9, 4, 4, 9, 2]
    numbers_2 = [7, 9, 6, 2]
    # first, let's create an empty list
    list_of_triplets = []
    # iteration on the *first* list
    for i in range(len(numbers_1)):
        num_in_pos_i = numbers_1[i]
        # iteration on the *second* list
        for j in range(len(numbers_2)):
            num_in_pos_j = numbers_2[j]
            if num_in_pos_i == num_in_pos_j:
                # instead of printing, we update the list
                list_of_triplets.append((i, j, num_in_pos_i))
    # finally, let's print the list
    print list_of_triplets

#. Solution::
n = 5
matrix = [range(n) for i in range(n)]
# let's initialize with the first element (any other element would be fine as well)
max_element_so_far = matrix[0][0]
# iteration...
for line in matrix:
for element in line:
# we update max_element_so_far when we find a higher element,
if element > max_element_so_far:
max_element_so_far = element
print max_element_so_far
#. Solution::
sequences = [
"ATGGCGCCCGAACAGGGA",
"GTCCCTGTTCGGGCGCCA",
]
# first, let's create an empty list
result = []
# iteration
for sequence in sequences:
# split the current sequence in triplets
triplets = []
for i in range(0, len(sequence), 3):
triplets.append(sequence[i:i+3])
# append (*not* extend()!!!) the obtained triplets
# to the list result
result.append(triplets)
# finally, let's print the list
print result
#. Solution::
numbers = [5, 9, 4, 4, 9, 2]
num_occurrences = {}
for number in numbers:
if not num_occurrences.has_key(number):
num_occurrences[number] = 1
else:
num_occurrences[number] += 1
alternatively::
numbers = [5, 9, 4, 4, 9, 2]
num_occurrences = {}
for number in numbers:
if not num_occurrences.has_key(number):
num_occurrences[number] = 0
num_occurrences[number] += 1
or, using ``count()``::
numbers = [5, 9, 4, 4, 9, 2]
num_occurrences = {}
for number in numbers:
if not num_occurrences.has_key(number):
num_occurrences[number] = numbers.count(number)
Note that in the last variant, the ``if`` line is optional (but not the following "content"!)
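One more variant, sketched here as an aside: the dictionary method ``get()`` returns a default value when the key is missing, so the explicit ``has_key`` test can be dropped.

```python
numbers = [5, 9, 4, 4, 9, 2]

num_occurrences = {}
for number in numbers:
    # get() returns 0 the first time a number is seen
    num_occurrences[number] = num_occurrences.get(number, 0) + 1

# compare with the counts worked out by hand
print(num_occurrences == {5: 1, 9: 2, 4: 2, 2: 1})
```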
#. Solution::

    groups = [["gene1", "gene2"], ["gene3"], [], ["gene4", "gene5"]]
    # let's initialize with the first group
    biggest_group_so_far = groups[0]
    # iteration
    for group in groups[1:]:
        if len(group) > len(biggest_group_so_far):
            biggest_group_so_far = group
    print biggest_group_so_far

#. Solution::
sequences_2HMI = {
"A": "PISPIETVPVKLKPGMDGPKVKQWPLTEEKI",
"B": "PISPIETVPVKLKPGMDGPKVKQWPLTEEKI",
"C": "DIQMTQTTSSLSASLGDRVTISCSASQDISS",
"D": "QITLKESGPGIVQPSQPFRLTCTFSGFSLST",
"E": "ATGGCGCCCGAACAGGGAC",
"F": "GTCCCTGTTCGGGCGCCA",
}
# let's start with an empty dictionary
histograms = {}
for key, sequence in sequences_2HMI.items():
# let's associate this key to an empty dictionary
histograms[key] = {}
for residue in sequence:
if not histograms[key].has_key(residue):
histograms[key][residue] = 1
else:
histograms[key][residue] += 1
# let's print the result
print histograms
# let's print the result more clearly
for key, histogram in histograms.items():
print key
print histogram
print ""
#. Solution::
table = [
"protein domain start end",
"YNL275W PF00955 236 498",
"YHR065C SM00490 335 416",
"YKL053C-A PF05254 5 72",
"YOR349W PANTHER 353 414",
]
# as before, first let's extract column names from the first row
column_names = table[0].split()
# let's start from an empty list
lines_as_dictionaries = []
# now, let's iterate on the other rows
for line in table[1:]:
# let's compile the dictionary for this row
dictionary = {}
words = line.split()
for i in range(len(words)):
# extract the corresponding word
word= words[i]
# extract the corresponding column name
column_name = column_names[i]
# update the dictionary
dictionary[column_name] = word
# having compiled the dictionary for this line,
# we can update the list
lines_as_dictionaries.append(dictionary)
# finished! now let's print the result (one row at a time,
# to make it easier to read)
for row in lines_as_dictionaries:
print row
#. Solution::

    alphabet_lo = "abcdefghijklmnopqrstuvwxyz"
    alphabet_up = alphabet_lo.upper()
    # let's build the dictionary
    lo_to_up = {}
    for i in range(len(alphabet_lo)):
        lo_to_up[alphabet_lo[i]] = alphabet_up[i]
    string = "I am a string"
    # let's convert the string
    converted_chars = []
    for character in string:
        if lo_to_up.has_key(character):
            # convert the alphabetic character
            converted_chars.append(lo_to_up[character])
        else:
            # we don't convert it (e.g., it's not a lowercase letter)
            converted_chars.append(character)
    converted_string = "".join(converted_chars)
    print converted_string

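A more compact sketch of the same lookup-table idea, assuming we are free to use ``zip()`` and ``dict.get()``: the helper name ``to_upper`` is hypothetical, and characters that are not in the table are passed through unchanged.

```python
alphabet_lo = "abcdefghijklmnopqrstuvwxyz"

# pair each lowercase letter with its uppercase version
lo_to_up = dict(zip(alphabet_lo, alphabet_lo.upper()))


def to_upper(string):
    # get(character, character) leaves unknown characters as they are
    return "".join(lo_to_up.get(character, character) for character in string)


converted = to_upper("I am a string")
print(converted)
```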
#. Solution::
lines_1 = open(raw_input("path 1: ")).readlines()
lines_2 = open(raw_input("path 2: ")).readlines()
# we have to be careful, since the two files could be of different lengths!
max_lines = len(lines_1)
if len(lines_2) > max_lines:
max_lines = len(lines_2)
# iteration on the lines of both files
for i in range(max_lines):
# take the i-th line of the first file, if existent,
if i < len(lines_1):
line_1 = lines_1[i].strip()
else:
line_1 = ""
# take the i-th line of the second file, if existent,
if i < len(lines_2):
line_2 = lines_2[i].strip()
else:
line_2 = ""
print line_1 + " " + line_2
#. Solution::

    # let's read the fasta file
    fasta_as_dictionary = {}
    for line in open("data/dna-fasta/fasta.1").readlines():
        # let's remove the newline and any surrounding spaces
        line = line.strip()
        if line[0] == ">":
            header = line
            fasta_as_dictionary[header] = ""
        else:
            fasta_as_dictionary[header] += line
    # let's iterate on header-sequence pairs
    for header, sequence in fasta_as_dictionary.items():
        print "processing", header
        # let's count the number of occurrences of each nucleotide
        count = {}
        for nucleotide in ("A", "C", "G", "T"):
            count[nucleotide] = sequence.count(nucleotide)
        print "nucleotide occurrences:", count
        # calculate the GC-content
        gc_content = (count["G"] + count["C"]) / float(len(sequence))
        print "GC content:", gc_content
        # calculate the AT/GC-ratio
        sum_at = count["A"] + count["T"]
        sum_cg = count["C"] + count["G"]
        at_gc_ratio = float(sum_at) / float(sum_cg)
        print "AT/GC-ratio:", at_gc_ratio

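The per-sequence computation can be sketched as a stand-alone function, which makes it easy to try on a short sequence without reading any file. The function name ``nucleotide_stats`` and the choice of returning ``None`` when a denominator is zero are assumptions of this sketch; the solution above does not handle those cases.

```python
def nucleotide_stats(sequence):
    # count the occurrences of each nucleotide
    count = {}
    for nucleotide in ("A", "C", "G", "T"):
        count[nucleotide] = sequence.count(nucleotide)
    sum_at = count["A"] + count["T"]
    sum_cg = count["C"] + count["G"]
    # GC-content: undefined (None) for an empty sequence
    gc_content = None
    if len(sequence) > 0:
        gc_content = sum_cg / float(len(sequence))
    # AT/GC-ratio: undefined (None) when there is no C or G
    at_gc_ratio = None
    if sum_cg > 0:
        at_gc_ratio = sum_at / float(sum_cg)
    return count, gc_content, at_gc_ratio


count, gc_content, at_gc_ratio = nucleotide_stats("ATGGCGCCCGAACAGGGA")
print(count)
print(gc_content)
print(at_gc_ratio)
```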