Python: complex statements (Solutions)

Conditional code: if

  1. Solution:

    number = int(raw_input("write a number: "))
    
    if number % 2 == 0:
        print "even"
    else:
        print "odd"
    

    We use else, since even and odd are the only two possibilities.

    A way to make a third option explicit would be:

    if number % 2 == 0:
        print "even"
    elif number % 2 == 1:
        print "odd"
    else:
        print "impossible!"
    

    but the code in else will never be executed for any value of number!

    Since the two options are mutually exclusive, we can also write:

    if number % 2 == 0:
        print "even"
    if number % 2 == 1:
        print "odd"
    

    Even without else, exactly one of the two if blocks can be executed.
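
    The equivalence of the three variants above can be checked directly. The following is only a sketch: the function names parity_else, parity_elif and parity_two_ifs are hypothetical wrappers introduced for this check, and print is called with a single argument so that the code behaves the same under Python 2 and Python 3.

```python
def parity_else(number):
    # variant 1: if / else
    if number % 2 == 0:
        return "even"
    else:
        return "odd"


def parity_elif(number):
    # variant 2: if / elif / else with an unreachable else
    if number % 2 == 0:
        return "even"
    elif number % 2 == 1:
        return "odd"
    else:
        return "impossible!"


def parity_two_ifs(number):
    # variant 3: two independent, mutually exclusive ifs
    result = None
    if number % 2 == 0:
        result = "even"
    if number % 2 == 1:
        result = "odd"
    return result


# compare the three variants on a range of integers, negative ones included
all_agree = True
for number in range(-20, 21):
    first = parity_else(number)
    if parity_elif(number) != first or parity_two_ifs(number) != first:
        all_agree = False

print(all_agree)
```

    Negative numbers are included on purpose: in Python the result of % 2 is always 0 or 1, even for negative integers, which is why the else branch of the second variant is unreachable.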

  2. Solution:

    number = float(raw_input("write rational: "))
    
    if number >= -1 and number <= 1:
        print "okay"
    

    We need neither elif (there is only one condition) nor else (if the condition is false, we don’t need to do anything).

  3. Solution:

    answer = raw_input("write two numbers separated by a space: ")
    
    words = answer.split()
    num1 = int(words[0])
    num2 = int(words[1])
    
    if num1 > num2:
        print "first"
    elif num2 > num1:
        print "second"
    else:
        print "neither"
    

    Alternatively:

    answer = raw_input("write two numbers separated by a space: ")
    
    numbers = [int(word) for word in answer.split()]
    
    if numbers[0] > numbers[1]:
        print "first"
    elif numbers[0] < numbers[1]:
        print "second"
    else:
        print "neither"
    
  4. Solution:

    horoscope_of = {
        "January": "extreme luck",
        "February": "try to be born again",
        "March": "kissed by fortune",
        "April": "lucky luke",
    }
    
    month = raw_input("tell me your birth month: ")
    
    if horoscope_of.has_key(month):
        print horoscope_of[month]
    else:
        print "not available"
    
  5. Solution:

    path = raw_input("write your path: ")
    
    lines = open(path, "r").readlines()
    if len(lines) == 0:
        print "empty"
    elif len(lines) < 100:
        print "short", len(lines)
    elif len(lines) < 1000:
        print "average", len(lines)
    else:
        print "large", len(lines)
    

    Note that the conditions do not need to be spelled out in full: in the code we can shorten 100 <= len(lines) < 1000 to len(lines) < 1000. This works because when len(lines) is lower than 100, either the if or the first elif is executed, so the second elif is not even considered.
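
    As a rough check of this reasoning, the shortened chain can be compared with a fully explicit one. This is a sketch under assumptions: size_short_chain and size_explicit are hypothetical helper names, and they take the number of lines directly instead of reading a file.

```python
def size_short_chain(n_lines):
    # same conditions as in the solution: each elif relies on the
    # previous tests having already failed
    if n_lines == 0:
        return "empty"
    elif n_lines < 100:
        return "short"
    elif n_lines < 1000:
        return "average"
    else:
        return "large"


def size_explicit(n_lines):
    # every condition written out in full
    if n_lines == 0:
        return "empty"
    elif 0 < n_lines < 100:
        return "short"
    elif 100 <= n_lines < 1000:
        return "average"
    else:
        return "large"


# collect every line count on which the two versions disagree
mismatches = []
for n_lines in range(0, 2000):
    if size_short_chain(n_lines) != size_explicit(n_lines):
        mismatches.append(n_lines)

print(mismatches)
```

    An empty list of mismatches means the two chains classify every tested length in the same way.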

  6. Solution:

    point1 = [float(word) for word
              in raw_input("write three coordinates: ").split()]
    
    point2 = [float(word) for word
              in raw_input("write three coordinates: ").split()]
    
    if point1[0] >= 0 and point1[1] >= 0 and point1[2] >= 0 and \
       point2[0] >= 0 and point2[1] >= 0 and point2[2] >= 0:
        diff_x = point1[0] - point2[0]
        diff_y = point1[1] - point2[1]
        diff_z = point1[2] - point2[2]
    
        print "the distance is", (diff_x**2 + diff_y**2 +  diff_z**2)**0.5
    

    Note that print is inside the if.

  7. Solution: we know that number is an arbitrary integer, chosen by the user:

    if number % 3 == 0:
        print "divisible by 3"
    elif number % 3 != 0:
        print "not divisible by 3"
    else:
        print "dunno"
    

    if, elif and else form a chain: only one among them is executed.

    1. if is executed if and only if number is divisible by three.
    2. elif is executed if and only if the previous if is not executed and number is not divisible by three.
    3. else is executed only when neither the if nor the elif is executed.

    Since every number is either divisible by 3 or not, there is no other possibility: else will never be executed.

    Therefore, the answer is no.
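
    The same conclusion can be illustrated by counting how often each branch runs. This is only a sketch: the dictionary branch_counts and the tested range are choices made for this illustration, not part of the exercise.

```python
# count how many times each branch of the chain is taken
branch_counts = {"if": 0, "elif": 0, "else": 0}

for number in range(-300, 301):
    if number % 3 == 0:
        branch_counts["if"] += 1
    elif number % 3 != 0:
        branch_counts["elif"] += 1
    else:
        branch_counts["else"] += 1

print(branch_counts["else"])
```

    A finite test is of course not a proof, but it matches the argument above: the if and elif conditions are exact opposites, so nothing is left for else.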

  8. Solution: as before, number is an arbitrary integer. The code is:

    number = int(raw_input("write a number: "))
    if number % 2 == 0:
        print "divisible by 2"
    if number % 3 == 0:
        print "divisible by 3"
    if number % 2 != 0 and number % 3 != 0:
        print "dunno"
    

    Here we don’t have a chain of if, elif and else: we have three independent if statements.

    1. The first if is executed if and only if number is divisible by two.
    2. The second if is executed if and only if number is divisible by three.
    3. The third if is executed if and only if number is divisible by neither two nor three.

    If number is 6, which is divisible by both two and three, the first two if statements are both executed, while the third is not.

    If number is 5, which is divisible by neither two nor three, the first two if statements are not executed, but the third is.

    Therefore, the answer is yes.

    (It is impossible for none of the three if statements to be executed.)
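
    To make the behaviour of independent if statements concrete, the sketch below records which of the three are executed for a given number. The function executed_ifs and the labels "first", "second" and "third" are hypothetical names used only here.

```python
def executed_ifs(number):
    # return the labels of the if statements whose body runs
    executed = []
    if number % 2 == 0:
        executed.append("first")
    if number % 3 == 0:
        executed.append("second")
    if number % 2 != 0 and number % 3 != 0:
        executed.append("third")
    return executed


print(executed_ifs(6))  # ['first', 'second']
print(executed_ifs(5))  # ['third']

# numbers in a test range for which no if at all is executed
never_executed = [n for n in range(-100, 101) if len(executed_ifs(n)) == 0]
print(never_executed)   # []
```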

  9. Solution:

    answer = raw_input("sum or product?: ")
    
    if answer == "sum":
        num1 = int(raw_input("number 1: "))
        num2 = int(raw_input("number 2: "))
        print "the sum is", num1 + num2
    
    elif answer == "product":
        num1 = int(raw_input("num1: "))
        num2 = int(raw_input("num2: "))
        print "the product is", num1 * num2
    

    Replacing elif with a second if would not change the behaviour of the program, since the two conditions are mutually exclusive.

    We can simplify like this:

    answer = raw_input("sum or product?: ")
    num1 = int(raw_input("number 1: "))
    num2 = int(raw_input("number 2: "))
    
    if answer == "sum":
        print "the sum is", num1 + num2
    
    elif answer == "product":
        print "the product is", num1 * num2
    

Iterative code: for and while

  1. Solutions:

    1. Solution:

      for number in range(10):
          print number
      
    2. Solution:

      for number in range(10):
          print number**2
      
    3. Solution:

      sum_of_squares = 0
      for number in range(10):
          sum_of_squares = sum_of_squares + number**2
      print sum_of_squares
      
    4. Solution:

      product = 1 # note that for the product the initial value should be 1!
      for number in range(1,10):
          product = product * number
      print product
      
    5. Solution:

      volume_of = {
          "A":  67.0, "C":  86.0, "D":  91.0,
          "E": 109.0, "F": 135.0, "G":  48.0,
          "H": 118.0, "I": 124.0, "K": 135.0,
          "L": 124.0, "M": 124.0, "N":  96.0,
          "P":  90.0, "Q": 114.0, "R": 148.0,
          "S":  73.0, "T":  93.0, "V": 105.0,
          "W": 163.0, "Y": 141.0,
      }
      
      sum_of_volumes = 0
      for volume in volume_of.values():
          sum_of_volumes = sum_of_volumes + volume
      print sum_of_volumes
      
    6. Solution:

      volume_of = {
          "A":  67.0, "C":  86.0, "D":  91.0,
          "E": 109.0, "F": 135.0, "G":  48.0,
          "H": 118.0, "I": 124.0, "K": 135.0,
          "L": 124.0, "M": 124.0, "N":  96.0,
          "P":  90.0, "Q": 114.0, "R": 148.0,
          "S":  73.0, "T":  93.0, "V": 105.0,
          "W": 163.0, "Y": 141.0,
      }
      
      fasta = """>1BA4:A|PDBID|CHAIN|SEQUENCE
      DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVV"""
      
      # Let's extract the sequence
      sequence = fasta.split("\n")[1]
      
      sum_of_volumes = 0
      
      # for each character in the sequence ...
      for aa in sequence:
          volume_of_aa = volume_of[aa]
          sum_of_volumes = sum_of_volumes + volume_of_aa
      
      print sum_of_volumes
      
    7. Solution: let’s adapt the code from the previous example:

      # we avoid the name list, which would hide Python's built-in list type
      numbers = [1, 25, 6, 27, 57, 12]
      
      minimum_so_far = numbers[0]
      for number in numbers[1:]:
          if number < minimum_so_far:
              minimum_so_far = number
      
      print "the minimum value is:", minimum_so_far
      
    8. Solution: let’s combine the example and the previous exercise:

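      # we avoid the names list, max and min, which would hide Python's built-ins
      numbers = [1, 25, 6, 27, 57, 12]
      
      maximum = numbers[0]
      minimum = numbers[0]
      
      for number in numbers[1:]:
          if number > maximum:
              maximum = number
          if number < minimum:
              minimum = number
      
      print "minimum =", minimum, "maximum =", maximum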
      
    9. Solution: range(0, len(sequence), 3) returns [0, 3, 6, 9, ...], containing the positions of the first character of all the triplets.

      Let’s write:

      sequence = "ATGGCGCCCGAACAGGGA"
      
      # let's start from an empty list
      triplets = []
      
      for pos_start in range(0, len(sequence), 3):
          # use a separate name for the single triplet, so that
          # the list triplets is not overwritten
          triplet = sequence[pos_start:pos_start+3]
          triplets.append(triplet)
      
      print triplets
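
      The same slicing idea can also be written as a list comprehension, a construct already used in the solutions to the if exercises. The following is a sketch, not the intended solution; the second sequence is an assumption added to show what happens when the length is not a multiple of three.

```python
sequence = "ATGGCGCCCGAACAGGGA"

# one slice of length 3 for each starting position 0, 3, 6, ...
triplets = [sequence[start:start + 3]
            for start in range(0, len(sequence), 3)]
print(triplets)

# with 19 characters the last slice is shorter: slicing past the end
# of a string does not raise an error
longer = "ATGGCGCCCGAACAGGGAC"
triplets_longer = [longer[start:start + 3]
                   for start in range(0, len(longer), 3)]
print(triplets_longer[-1])
```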
      
    10. Solution:

      text = """>2HMI:A|PDBID|CHAIN|SEQUENCE
      PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKI
      >2HMI:B|PDBID|CHAIN|SEQUENCE
      PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKI
      >2HMI:C|PDBID|CHAIN|SEQUENCE
      DIQMTQTTSSLSASLGDRVTISCSASQDISSYLNWYQQKPEGTVKLLIYY
      >2HMI:D|PDBID|CHAIN|SEQUENCE
      QITLKESGPGIVQPSQPFRLTCTFSGFSLSTSGIGVTWIRQPSGKGLEWL
      >2HMI:E|PDBID|CHAIN|SEQUENCE
      ATGGCGCCCGAACAGGGAC
      >2HMI:F|PDBID|CHAIN|SEQUENCE
      GTCCCTGTTCGGGCGCCA"""
      
      # first, let's split the text into lines
      lines = text.split("\n")
      
      # then, let's create an empty dictionary
      sequence_of = {}
      
      # now we can iterate on lines
      for line in lines:
      
          if line[0] == ">":
              # if the line is a header, we extract the sequence name
              name = line.split("|")[0]
          else:
              # the line contains the sequence, which we add to the
              # dictionary, using the name extracted before as key
              sequence_of[name] = line
      
      print sequence_of
      
  2. Solutions:

    1. Solution:

      while raw_input("write 'STOP': ") != "STOP":
          print "you must write 'STOP'..."
      
    2. Solution:

      while raw_input("write stop: ").lower() != "stop":
          print "you must write 'stop'..."
      
  3. Solutions:

    1. Solution: all numbers in range(10).
    2. Solution: the number 0. break immediately interrupts the for loop.
    3. Solution: all numbers in range(10). continue jumps to the next iteration, as Python automatically does when the instructions in the body of the for loop are finished. Since continue in this case is right at the end of the loop body, it doesn’t have any effect.
    4. Solution: the number 0. In the first iteration, when number has value 0, Python first executes print number, printing 0; then the if is executed, and with it the break inside the if, which immediately interrupts the for loop.
    5. Solution: nothing. In the first iteration, when number has value 0, the if is executed, and with it the break inside the if, which immediately interrupts the for loop. Therefore, print is never executed.
    6. Solution: nothing. Instructions inside the while are never executed, since the condition is False!
    7. Solution: nothing. Instructions inside the while are never executed, since the condition is False! As a consequence, the line condition = True is never executed.
    8. Solution: "the condition is true" an infinite number of times. Since the condition is always True, the while never stops iterating!
    9. Solution: ten strings of the form "position 0 contains the element 0", "position 1 contains the element 1", and so on.
    10. Solution: all the elements of lines (processed by strip()) occurring before the first empty line: "line 1", "line 2" and "line 3". As soon as line has value "" (the fourth element of lines) the if is executed, and break interrupts the loop. Note that the fourth line is not printed.
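
    The effect of break and continue described in these answers can be reproduced with three small loops. The exercise code is not repeated on this page, so the loops below are assumed, simplified reconstructions; they collect values in lists (hypothetical names) instead of printing them, to make the outcome easy to inspect.

```python
# break placed after the "print": only the first value is recorded
recorded_then_break = []
for number in range(10):
    recorded_then_break.append(number)
    break

# continue as the last instruction of the body: no visible effect
recorded_then_continue = []
for number in range(10):
    recorded_then_continue.append(number)
    continue

# break placed before the "print": nothing is ever recorded
break_then_recorded = []
for number in range(10):
    break
    break_then_recorded.append(number)  # never reached

print(recorded_then_break)
print(recorded_then_continue)
print(break_then_recorded)
```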
  4. Solution:

    numbers = (0, 1, 1, 0, 0, 0, 1, 1, 2, 1, 2)
    
    for i in range(len(numbers)):
        number_in_pos_i = numbers[i]
    
        if number_in_pos_i == 2:
            print "the position is", i
            break
    
  5. Solution:

    strings = ("000", "51", "51", "32", "57", "26")
    
    for i in range(len(strings)):
        string_in_pos_i = strings[i]
    
        if "2" in string_in_pos_i:
            print "position =", i, "value =", string_in_pos_i
            break
    
  6. Solution:

    length = int(raw_input("write the length of the sequence: "))
    import random
    alphabet = "AGCT"
    sequence = ""
    for i in range(length):
        index = random.randint(0, 3)
        sequence = sequence + alphabet[index]
    print sequence
    

Nested code

  1. Solution:

    n = 5
    matrix = [range(n) for i in range(n)]
    
    for line in matrix:
        for element in line:
            print element
    
  2. Solution:

    1. All the elements of the matrix.
    2. The sum of all the elements of the matrix.
    3. Again, all the elements of the matrix.
    4. Again, all the elements of the matrix.
    5. The list of the elements on the diagonal.
  3. Solution:

    numbers = [8, 3, 2, 9, 7, 1, 8]
    
    for num_1 in numbers:
        for num_2 in numbers:
            print num_1, num_2
    

    This code is very similar to the clock example!

  4. Solution:

    numbers = [8, 3, 2, 9, 7, 1, 8]
    
    already_printed_pairs = []
    
    for i in range(len(numbers)):
        for j in range(len(numbers)):
    
            pair = (numbers[i], numbers[j])
    
            # check whether we already printed the symmetric pair
            if (pair[1], pair[0]) in already_printed_pairs:
                continue
    
            # this code will be executed if the pair has not been printed:
            # print the pair and update already_printed_pairs
            print pair
            already_printed_pairs.append(pair)
    
  5. The solution is the same as for the previous exercise.

  6. Solution:

    numbers = range(10)
    
    for element_1 in numbers:
        for element_2 in numbers:
            if 2 * element_1 == element_2:
                print element_1, element_2
    
  7. Solution:

    numbers = [8, 3, 2, 9, 7, 1, 8]
    
    for element_1 in numbers:
        for element_2 in numbers:
            if element_1 + element_2 == 10:
                print element_1, element_2
    
  8. Solution:

    numbers = [8, 3, 2, 9, 7, 1, 8]
    
    # first, let's create an empty list
    list_of_pairs = []
    
    for element_1 in numbers:
        for element_2 in numbers:
            if element_1 + element_2 == 10:
                # update the list with append()
                list_of_pairs.append((element_1, element_2))
    
    # finally, let's print the list
    print list_of_pairs
    
  9. Solution:

    numbers_1 = [5, 9, 4, 4, 9, 2]
    numbers_2 = [7, 9, 6, 2]
    
    # iteration on the *first* list
    for i in range(len(numbers_1)):
        num_in_pos_i = numbers_1[i]
    
        # iteration on the *second* list
        for j in range(len(numbers_2)):
            num_in_pos_j = numbers_2[j]
    
            if num_in_pos_i == num_in_pos_j:
                print "positions:", i, j, "; repeated value:", num_in_pos_i
    
  10. Solution:

    numbers_1 = [5, 9, 4, 4, 9, 2]
    numbers_2 = [7, 9, 6, 2]
    
    # first, let's create an empty list
    list_of_triplets = []
    
    # iteration on the *first* list
    for i in range(len(numbers_1)):
        num_in_pos_i = numbers_1[i]
    
        # iteration on the *second* list
        for j in range(len(numbers_2)):
            num_in_pos_j = numbers_2[j]
    
            if num_in_pos_i == num_in_pos_j:
                # instead of printing, we update the list
                list_of_triplets.append((i, j, num_in_pos_i))
    
    # finally, let's print the list
    print list_of_triplets
    
  11. Solution:

    n = 5
    matrix = [range(n) for i in range(n)]
    
    # let's initialize with the first element (any other element would be fine as well)
    max_element_so_far = matrix[0][0]
    
    # iteration...
    for line in matrix:
        for element in line:
            # we update max_element_so_far when we find a higher element
            if element > max_element_so_far:
                max_element_so_far = element
    
    print max_element_so_far
    
  12. Solution:

    sequences = [
        "ATGGCGCCCGAACAGGGA",
        "GTCCCTGTTCGGGCGCCA",
    ]
    
    # first, let's create an empty list
    result = []
    
    # iteration
    for sequence in sequences:
        # split the current sequence in triplets
        triplets = []
        for i in range(0, len(sequence), 3):
            triplets.append(sequence[i:i+3])
    
        # append (*not* extend()!!!) the obtained triplets
        # to the list result
        result.append(triplets)
    
    # finally, let's print the list
    print result
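
    The comment in the code insists on append() rather than extend(); the difference is easiest to see on a tiny example. This is only an illustrative sketch with made-up triplet lists.

```python
triplets_1 = ["ATG", "GCG"]
triplets_2 = ["GTC", "CCT"]

# append() adds each list as a single element: one sub-list per sequence
nested = []
nested.append(triplets_1)
nested.append(triplets_2)

# extend() adds the elements one by one: the grouping by sequence is lost
flat = []
flat.extend(triplets_1)
flat.extend(triplets_2)

print(nested)  # [['ATG', 'GCG'], ['GTC', 'CCT']]
print(flat)    # ['ATG', 'GCG', 'GTC', 'CCT']
```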
    
  13. Solution:

    numbers = [5, 9, 4, 4, 9, 2]
    
    num_occurrences = {}
    
    for number in numbers:
        if not num_occurrences.has_key(number):
            num_occurrences[number] = 1
        else:
            num_occurrences[number] += 1
    

    alternatively:

    numbers = [5, 9, 4, 4, 9, 2]
    
    num_occurrences = {}
    
    for number in numbers:
        if not num_occurrences.has_key(number):
            num_occurrences[number] = 0
        num_occurrences[number] += 1
    

    or, using count():

    numbers = [5, 9, 4, 4, 9, 2]
    
    num_occurrences = {}
    
    for number in numbers:
        if not num_occurrences.has_key(number):
            num_occurrences[number] = numbers.count(number)
    

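    Note that in the last variant the if line is optional (but the assignment that follows it is not!): without the if, count() is simply recomputed for repeated numbers and the same value is stored again.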
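
    A quick way to convince oneself is to build the dictionary with and without the test and compare the results. This is a sketch: the names with_if and without_if are hypothetical, and the membership test is written with the in operator instead of has_key() so that the example also runs under Python 3.

```python
numbers = [5, 9, 4, 4, 9, 2]

# variant with the test: count() is called once per distinct number
with_if = {}
for number in numbers:
    if number not in with_if:
        with_if[number] = numbers.count(number)

# variant without the test: count() is called again for repeated
# numbers, but it always returns the same value
without_if = {}
for number in numbers:
    without_if[number] = numbers.count(number)

print(with_if == without_if)
```

    The version with the test does less work on lists with many repetitions, but both produce the same dictionary.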

  14. Solution:

    groups = [["gene1", "gene2"], ["gene3"], [], ["gene4", "gene5"]]
    
    # let's initialize with the first group
    biggest_group_so_far = groups[0]
    
    # iteration
    for group in groups[1:]:
        if len(group) > len(biggest_group_so_far):
            biggest_group_so_far = group
    
    print biggest_group_so_far
    
  15. Solution:

    sequences_2HMI = {
        "A": "PISPIETVPVKLKPGMDGPKVKQWPLTEEKI",
        "B": "PISPIETVPVKLKPGMDGPKVKQWPLTEEKI",
        "C": "DIQMTQTTSSLSASLGDRVTISCSASQDISS",
        "D": "QITLKESGPGIVQPSQPFRLTCTFSGFSLST",
        "E": "ATGGCGCCCGAACAGGGAC",
        "F": "GTCCCTGTTCGGGCGCCA",
    }
    
    # let's start with an empty dictionary
    histograms = {}
    
    for key, sequence in sequences_2HMI.items():
    
        # let's associate this key to an empty dictionary
        histograms[key] = {}
    
        for residue in sequence:
            if not histograms[key].has_key(residue):
                histograms[key][residue] = 1
            else:
                histograms[key][residue] += 1
    
    # let's print the result
    print histograms
    
    # let's print the result more clearly
    for key, histogram in histograms.items():
        print key
        print histogram
        print ""
    
  16. Solution:

    table = [
        "protein domain start end",
        "YNL275W PF00955 236 498",
        "YHR065C SM00490 335 416",
        "YKL053C-A PF05254 5 72",
        "YOR349W PANTHER 353 414",
    ]
    
    # as before, first let's extract column names from the first row
    column_names = table[0].split()
    
    # let's start from an empty list
    lines_as_dictionaries = []
    
    # now, let's iterate on the other rows
    for line in table[1:]:
    
        # let's compile the dictionary for this row
        dictionary = {}
        words = line.split()
        for i in range(len(words)):
    
            # extract the corresponding word
            word = words[i]
    
            # extract the corresponding column name
            column_name = column_names[i]
    
            # update the dictionary
            dictionary[column_name] = word
    
        # having compiled the dictionary for this line,
        # we can update the list
        lines_as_dictionaries.append(dictionary)
    
    # finished! now let's print the result (one row at a time,
    # to make it easier to read)
    for row in lines_as_dictionaries:
        print row
    
  17. Solution:

    alphabet_lo = "abcdefghijklmnopqrstuvwxyz"
    alphabet_up = alphabet_lo.upper()
    
    # let's build the dictionary
    lo_to_up = {}
    for i in range(len(alphabet_lo)):
        lo_to_up[alphabet_lo[i]] = alphabet_up[i]
    
    
    string = "I am a string"
    
    # let's convert the string
    converted_chars = []
    for character in string:
        if lo_to_up.has_key(character):
            # convert the alphabetic character
            converted_chars.append(lo_to_up[character])
        else:
            # we don't convert it (e.g., it's not an alphabetic character)
            converted_chars.append(character)
    converted_string = "".join(converted_chars)
    
    print converted_string
    
  18. Solution:

    lines_1 = open(raw_input("path 1: ")).readlines()
    lines_2 = open(raw_input("path 2: ")).readlines()
    
    # we have to be careful, since the two files could be of different lengths!
    max_lines = len(lines_1)
    if len(lines_2) > max_lines:
        max_lines = len(lines_2)
    
    # iteration on the lines of both files
    for i in range(max_lines):
    
        # take the i-th line of the first file, if existent,
        if i < len(lines_1):
            line_1 = lines_1[i].strip()
        else:
            line_1 = ""
    
        # take the i-th line of the second file, if existent,
        if i < len(lines_2):
            line_2 = lines_2[i].strip()
        else:
            line_2 = ""
    
        print line_1 + " " + line_2
    
  19. Solution:

    # let's read the fasta file
    fasta_as_dictionary = {}
    for line in open("data/dna-fasta/fasta.1").readlines():
    
        # let's clean the sequence
        line = line.strip()
    
        if line[0] == ">":
            header = line
            fasta_as_dictionary[header] = ""
    
        else:
            fasta_as_dictionary[header] += line
    
    # let's iterate on header-sequence pairs
    for header, sequence in fasta_as_dictionary.items():
    
        print "processing", header
    
        # let's count the number of occurrences of each nucleotide
        count = {}
        for nucleotide in ("A", "C", "G", "T"):
            count[nucleotide] = sequence.count(nucleotide)
        print "nucleotide occurrences:", count
    
        # calculate gc-content
        gc_content = (count["G"] + count["C"]) / float(len(sequence))
        print "GC content:", gc_content
    
        # calculate the AT/GC-ratio
        sum_at = count["A"] + count["T"]
        sum_cg = count["C"] + count["G"]
        at_gc_ratio = float(sum_at) / float(sum_cg)
        print "AT/GC-ratio:", at_gc_ratio
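
    The two formulas at the end of the solution can be tried on short sequences without reading any file. The sketch below is an illustration under assumptions: the function names gc_content and at_gc_ratio and the test sequences are hypothetical, and returning None for an empty sequence or for a sequence without G and C is a choice made here to avoid a division by zero, not something the exercise requires.

```python
def gc_content(sequence):
    # fraction of nucleotides that are G or C
    if len(sequence) == 0:
        return None
    g_plus_c = sequence.count("G") + sequence.count("C")
    return g_plus_c / float(len(sequence))


def at_gc_ratio(sequence):
    # number of A and T divided by number of G and C
    sum_at = sequence.count("A") + sequence.count("T")
    sum_gc = sequence.count("G") + sequence.count("C")
    if sum_gc == 0:
        return None
    return float(sum_at) / float(sum_gc)


print(gc_content("ATGC"))   # 0.5
print(at_gc_ratio("ATGC"))  # 1.0
print(at_gc_ratio("AATT"))  # None
```

    The explicit float() conversions mirror the solution above: under Python 2 dividing two integers would otherwise truncate the result.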