Python: Lists (Solutions)

Note

In some cases I will use the character \ at the end of a line.

Used in this way, \ tells Python that the command continues on the following line. Without the \, Python would consider the command complete at the end of the line and, if the resulting syntax is wrong, give an error message.

You can ignore these \.
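
As a small illustration of the note above, the following sketch splits one expression over two lines with a trailing \. The names total and numbers are made up for this example, and the code is written so that it should run under both Python 2 and Python 3.

```python
# A long expression split over two lines with a trailing backslash.
total = 1 + 2 + 3 + \
        4 + 5
print(total)

# Inside brackets no backslash is needed: Python keeps reading
# until it finds the closing bracket.
numbers = [1, 2, 3,
           4, 5]
print(sum(numbers))
```

Both print statements show the same value, since the two forms describe the same five numbers.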

Operations

  1. Solution:

    list = []
    print list, len(list)         # check
    
  2. Solution:

    list = range(5)
    print list, len(list)         # check
    print len(list)
    
  3. Solution:

    list = [0] * 100
    print list, len(list)         # check
    
  4. Solution:

    list_1 = range(10)
    list_2 = range(10, 20)
    
    list_complete = list_1 + list_2
    print list_complete
    
    print list_complete == range(20)   # True
    
  5. Solution:

    list = ["I am", "a", "list"]
    print list, len(list)         # check
    
    print len(list[0])
    print len(list[1])
    print len(list[2])
    
  6. Solution:

    list = [0.0, "b", [3], [4, 5]]
    
    print len(list)                # 4
    
    print type(list[0])            # float
    
    print list[1], len(list[1])   # "b", 1
    
    print list[2], len(list[2])   # [3], 1
    
    print list[-1], len(list[-1]) # [4, 5], 2
    
    print "b" in list              # True
    
    print 4 in list                # False
    print 4 in list[-1]            # True
    
  7. Solution: the first is a list of integers, the second is a list of strings, and the third is a string:

    print type(list_1)             # list
    print type(list_2)             # list
    print type(list_3)             # str
    
  8. Solutions:

    # an empty list
    list = []
    print len(list)                # 0
    del list
    
    
    # invalid syntax, Python gives an error message
    list = [}
    
    
    # a list that contains an empty list
    list = [[]]
    print len(list)                # 1
    print len(list[0])             # 0
    del list
    
    
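    # the following doesn't work: after `del list` no list object exists
    # any more (the name list now refers to the built-in type), so
    # Python gives an error message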
    list.append(0)
    
    
    # this works
    list = []
    list.append(0)
    print list                     # [0]
    del list
    
    
    # this doesn't work because we forgot to put commas!
    list = [1 2 3]
    
    
    # this gives an error message because the list has only 3 elements!
    list = range(3)
    print list[3]
    
    
    # Extract the last element
    list = range(3)
    print list[-1]
    del list
    
    
    # Extract the first two elements (list[2], the third,
    # is excluded)
    list = range(3)
    sublist = list[0:2]
    print sublist
    del list
    
    
    # Extract all the elements (list[3], which does not exist,
    # is excluded)
    list = range(3)
    sublist = list[0:3]
    print sublist
    del list
    
    
    # Extract the first two elements (list[-1], the third,
    # is excluded)
    list = range(3)
    sublist = list[0:-1]
    print sublist
    del list
    
    
    # Replace the third element with the string "two"
    list = range(3)
    list[2] = "two"
    print list
    del list
    
    
    # this doesn't work, the list contains only three elements,
    # there is no fourth position, and Python gives an error
    list = range(3)
    list[3] = "three"
    
    
    # replace the third (last) element with the string "three"
    list = range(3)
    list[-1] = "three"
    print list
    del list
    
    
    # the index has to be an integer, Python gives an error message
    list = range(3)
    list[1.2] = "one point two"
    
    
    # substitute the second element of list (i.e. 1)
    # with a list of two strings; this can be done,
    # since lists *can* contain other lists
    list = range(3)
    list[1] = ["protein1", "protein2"]
    print list
    del list
    
  9. Solution:

    matrix = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
    
    first_row = matrix[0]
    print first_row
    
    second_element_first_row = first_row[1]
    # or
    second_element_first_row = matrix[0][1]
    print second_element_first_row
    
    sum_first_row = matrix[0][0] + matrix[0][1] + matrix[0][2]
    print sum_first_row
    
    second_column = [matrix[0][1], matrix[1][1], matrix[2][1]]
    print second_column
    
    diagonal = [matrix[0][0], matrix[1][1], matrix[2][2]]
    print diagonal
    
    three_rows_together = matrix[0] + matrix[1] + matrix[2]
    print three_rows_together
    

Methods

  1. First, let’s create a list. For example, an empty list:

    list = []
    

    next, let’s add the required elements with append():

    list.append(0)
    list.append("text")
    list.append([0, 1, 2, 3])
    
  2. Solution:

    # add one 3 at the end of the list
    list = range(3)
    list.append(3)
    print list
    del list
    
    # add a list with a 3 at the end of the list
    list = range(3)
    list.append([3])
    print list
    del list
    
    # add a 3 (the only element contained in the list [3])
    # at the end of the list
    list = range(3)
    list.extend([3])
    print list
    del list
    
    # doesn't work: extend() extends a list with the content of
    # another list, but here 3 is *not* a list!
    # Python gives an error message
    list = range(3)
    list.extend(3)
    
    # insert a 3 in position 0 (the first); the existing elements
    # are shifted to the right, none of them is replaced
    list = range(3)
    list.insert(0, 3)
    print list
    del list
    
    # insert a 3 at the end of list
    list = range(3)
    list.insert(3, 3)
    print list
    del list
    
    # insert the list [3] at the end of list
    list = range(3)
    list.insert(3, [3])
    print list
    del list
    
    # doesn't work: the first argument of insert() has to be an integer
    # not a list! Python gives an error message
    list = range(3)
    list.insert([3], 3)
    
  3. Solution:

    list = []
    list.append(range(10))
    list.append(range(10, 20))
    print list
    

    Here we use append(), which inserts an element at the end of list. In this example we insert two lists, the results of range(10) and range(10, 20).

    Clearly, len(list) is 2, since we added only 2 elements.

    On the other hand:

    list = []
    list.extend(range(10))
    list.extend(range(10, 20))
    print list
    

    Here we use extend(), which extends a list with the elements of another list. The final list has 20 elements, as we can see with:

    print len(list)
    
  4. Solution:

    list = [0, 0, 0, 0]
    list.remove(0)
    print list
    

    only the first occurrence of 0 is removed!
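
If the goal were instead to drop every occurrence, one possible approach (not part of the exercise) is a list comprehension, a tool covered in a later section. The names zeros, mixed and without_zeros below are made up for this sketch, which should run under both Python 2 and Python 3.

```python
# remove() deletes only the first matching element
zeros = [0, 0, 0, 0]
zeros.remove(0)
print(zeros)                 # three zeros are left

# a list comprehension builds a new list without any 0;
# the original list is left untouched
mixed = [0, 1, 0, 2, 0]
without_zeros = [x for x in mixed if x != 0]
print(without_zeros)
print(mixed)
```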

  5. Solution:

    list = [1, 2, 3, 4, 5]
    
    # invert the order of the elements of list
    list.reverse()
    print list
    
    # order the elements of list
    list.sort()
    print list
    

    After the two operations, list gets back to its initial value.

    On the other hand:

    list = [1, 2, 3, 4, 5]
    list.reverse().sort()
    

    cannot be done, since the result of list.reverse() is None:

    list = [1, 2, 3, 4, 5]
    result = list.reverse()
    print result
    

    since result is not a list, sort() cannot be applied, and Python gives an error message.
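
A possible way around this, sketched here as an aside rather than as the expected solution, is to use operations that return a new list instead of None: slicing with a step of -1 and the built-in sorted(). The variable names are illustrative, and the code should run under both Python 2 and Python 3.

```python
numbers = [3, 1, 2]

# reverse() changes the list in place and returns None
result = numbers.reverse()
print(result)                     # None
print(numbers)                    # [2, 1, 3]

# slicing and sorted() both build new lists, so they can be combined
reversed_then_sorted = sorted(numbers[::-1])
print(reversed_then_sorted)       # [1, 2, 3]
print(numbers)                    # still [2, 1, 3]
```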

  6. Let’s try this:

    list = range(10)
    inverse_list = list.reverse()
    
    print list                         # modified!
    print inverse_list                 # None!
    

    the code doesn’t work: reverse() modifies list in place and returns None! Moreover, this code directly modifies list, and we don’t want that.

    First, let’s create a copy of list; next we can reverse the copy:

    list = range(10)
    inverse_list = list[:]            # *not* inverse_list = list
    inverse_list.reverse()
    print list                         # unchanged
    print inverse_list                 # inverted
    

    On the other hand, this code:

    list = range(10)
    inverse_list = list
    inverse_list.reverse()
    print list                         # modified!
    print inverse_list                # inverted
    

    doesn’t work as we want: inverse_list doesn’t contain a copy of list, but a reference to the same object referred to by list.

    As a consequence, when we invert inverse_list we also invert list.
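
The difference between a second reference and a copy can be checked with the is operator, which tells whether two names refer to the same object. The sketch below uses made-up names (original, alias, duplicate) and should run under both Python 2 and Python 3.

```python
original = list(range(5))

alias = original              # a second name for the same list object
duplicate = original[:]       # a new list with the same elements

print(alias is original)      # True
print(duplicate is original)  # False
print(duplicate == original)  # True

# reversing through the alias also changes original,
# while the duplicate is not affected
alias.reverse()
print(original)               # [4, 3, 2, 1, 0]
print(duplicate)              # [0, 1, 2, 3, 4]
```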

  7. As before:

    motifs = [
        "KSYK",
        "SVALVV",
        "GVTGI",
        "VGSSLAEVLKLPD",
    ]
    sorted_motifs = motifs.sort()
    

    the code doesn’t work: sort() modifies motifs in place and returns None! First, let’s create a copy of motifs; next we can sort the copy:

    sorted_motifs = motifs[:]
    sorted_motifs.sort()
    print motifs                    # unchanged
    print sorted_motifs            # ordered
    

String-List Methods

  1. Solution:

    text = """The Wellcome Trust Sanger Institute
    is a world leader in genome research."""
    
    words = text.split()
    print len(words)
    
  2. Solution:

    table = [
        "protein | database | domain | start | end",
        "YNL275W | Pfam | PF00955 | 236 | 498",
        "YHR065C | SMART | SM00490 | 335 | 416",
        "YKL053C-A | Pfam | PF05254 | 5 | 72",
        "YOR349W | PANTHER | 353 | 414",
    ]
    
    first_row = table[0]
    almost_column_titles = first_row.split("|")
    print almost_column_titles
    # ["protein ", " database ", ...]
    
    # unfortunately, column titles contain superfluous spaces
    # to remove them, we can change the argument of split()
    
    column_titles = first_row.split(" | ")
    print column_titles
    # ["protein", "database", ...]
    

    We could also use strip() together with a list comprehension on almost_column_titles, but it’s not necessary.
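
For completeness, the strip() alternative mentioned above could look roughly like this. It is only a sketch: the header row is copied from the table, and the code should run under both Python 2 and Python 3.

```python
first_row = "protein | database | domain | start | end"

# splitting on "|" alone leaves spaces around each title
almost_column_titles = first_row.split("|")

# strip() removes leading and trailing whitespace from each title
column_titles = [title.strip() for title in almost_column_titles]
print(column_titles)
```

Unlike split(" | "), this version also copes with rows where the number of spaces around | is not always the same.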

  3. Solution:

    words = ["word_1", "word_2", "word_3"]
    
    print " ".join(words)
    
    print ",".join(words)
    
    print " and ".join(words)
    
    print "".join(words)
    
    # a raw string cannot end with a single backslash,
    # so we write an escaped backslash instead
    backslash = "\\"
    print backslash.join(words)
    
  4. Solution:

    verses = [
        "Taci. Su le soglie",
        "del bosco non odo",
        "parole che dici",
        "umane; ma odo",
        "parole piu' nuove",
        "che parlano gocciole e foglie",
        "lontane.",
    ]
    
    poem = "\n".join(verses)
    

List Comprehension

  1. Solutions:

    1. Solution:

      list_plus_three = [number + 3 for number in list]
      
      print list_plus_three             # check
      
    2. Solution:

      odds = [number for number in list
                       if (number % 2 == 1)]
      
    3. Solution:

      opposites = [-number for number in list]
      
    4. Solution:

      inverses = [1.0 / number for number in list
                       if number != 0]
      
    5. Solution:

      first_and_last = [list[0], list[-1]]
      
    6. Solution:

      from_second_to_penultimate = list[1:-1]
      
    7. Solution:

      list_odds = [number for number in list
                       if (number % 2 == 1)]
      number_odds = len(list_odds)
      print number_odds
      

      or:

      number_odds = len([number for number in list
                            if (number % 2 == 1)])
      
    8. Solution:

      list_divided_by_5 = [float(number) / 5
                            for number in list]
      
    9. Solution:

      list_multiples_5_divided = [float(number) / 5.0
                                 for number in list
                                 if (number % 5 == 0)]
      
    10. Solution:

      list_of_strings = [str(number) for number in list]
      
    11. Solution:

      # As before, but iterating on `list_of_strings`
      # rather than directly on `list`
      number_odds = len([string for string in list_of_strings
                            if (int(string) % 2 == 1)])
      
    12. Solution:

      text = " ".join([str(number) for number in list])
      

      Notice that if we forget to write str(number), join() doesn’t work.
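
The failure mentioned above can be observed directly: join() accepts only strings, so a list of integers raises a TypeError. The sketch below uses a made-up list called numbers and a flag join_failed introduced only for illustration; it should run under both Python 2 and Python 3.

```python
numbers = [1, 2, 3]

# joining integers directly raises a TypeError
join_failed = False
try:
    text = " ".join(numbers)
except TypeError as error:
    join_failed = True
    print("join() failed: " + str(error))

# converting each number to a string first makes join() work
text = " ".join([str(number) for number in numbers])
print(text)
```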

  2. Solutions:

    1. # forward
      list_1 = [1, 2, 3]
      list_2 = [str(x) for x in list_1]
      
      # backward
      list_2 = ["1", "2", "3"]
      list_1 = [int(x) for x in list_2]
      
    2. # forward
      list_1 = ["name", "surname", "age"]
      list_2 = [[x] for x in list_1]
      
      # backward
      list_2 = [["name"], ["surname"], ["age"]]
      list_1 = [l[0] for l in list_2]
      
    3. # forward
      list_1 = ["ACTC", "TTTGGG", "CT"]
      list_2 = [[x.lower(), len(x)] for x in list_1]
      
      # backward
      list_2 = [["actc", 4], ["tttggg", 6], ["ct", 2]]
      list_1 = [l[0].upper() for l in list_2]
      
  3. Solution:

    1. [x for x in list]: creates a copy of list.
    2. [y for y in list]: creates a copy of list (the same as before).
    3. [y for x in list]: invalid. (If x represents an element of the list, what is y?)
    4. ["x" for x in list]: creates a list full of strings "x" as long as list. The result will be: ["x", "x", ..., "x"].
    5. [str(x) for x in list]: for each int x in list, x is converted to a string str(x) and included in the resulting list. The result will be: ["0", "1", ..., "9"].
    6. [x for str(x) in list]: invalid: the transformation str(...) is in the wrong place!
    7. [x + 1 for x in list]: for each int x in list, adds 1 with x + 1 and includes the result in the resulting list. The result will be: [1, 2, ..., 10].
    8. [x + 1 for x in list if x == 2]: for each int x in list, checks whether it is equal to 2. If so, x + 1 is included in the resulting list; otherwise it’s excluded. The result will be: [3].
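
The valid comprehensions from the list above can be checked with a short script. This sketch assumes that the list of the exercise is range(10), as the results quoted above suggest, and calls it numbers to avoid hiding the built-in list. It should run under both Python 2 and Python 3.

```python
numbers = list(range(10))    # assumed content of the exercise's list

# same elements as numbers, but a different list object
copy_of_numbers = [x for x in numbers]
print(copy_of_numbers == numbers)    # True
print(copy_of_numbers is numbers)    # False

only_x = ["x" for x in numbers]
as_strings = [str(x) for x in numbers]
plus_one = [x + 1 for x in numbers]
plus_one_if_two = [x + 1 for x in numbers if x == 2]

print(only_x)
print(as_strings)
print(plus_one)
print(plus_one_if_two)               # [3]
```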
  4. Solution:

    clusters = """\
    >Cluster 0
    0 >YLR106C at 100.00%
    >Cluster 50
    0 >YPL082C at 100.00%
    >Cluster 54
    0 >YHL009W-A at 90.80%
    1 >YHL009W-B at 100.00%
    2 >YJL113W at 98.77%
    3 >YJL114W at 97.35%
    >Cluster 52
    0 >YBR208C at 100.00%
    """
    
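    # strip() removes the final newline, so that split() does not
    # produce an empty last row (row.split()[1] would fail on it)
    rows = clusters.strip().split("\n")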
    

    In order to get cluster names, we have to keep only lines starting with ">", and for each of them apply split() in order to get the second element (the name of the cluster):

    cluster_names = [row.split()[1] for row in rows
                    if row.startswith(">")]
    

    In order to get protein names, we have to keep only lines not starting with ">", and for each of them apply split() and keep the second element (also removing ">" from the name of the protein):

    proteins = [row.split()[1].lstrip(">") for row in rows
                if not row.startswith(">")]
    

    In order to get protein-percentage pairs, we have to keep only lines not starting with ">". On each of them, we apply split() and keep the second element (protein name) and the last (percentage):

    protein_percentage_pairs = \
        [[row.split()[1].lstrip(">"), row.split()[-1].rstrip("%")]
         for row in rows if not row.startswith(">")]
    

    Annotated version:

    protein_percentage_pairs = \
        [[row.split()[1].lstrip(">"), row.split()[-1].rstrip("%")]
    #     ^^^^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    #       protein name, as before            percentage
    #    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    #                     protein-percentage pair
         for row in rows if not row.startswith(">")]
    
  5. Solution:

    matrix = [range(0,3), range(3,6), range(6,9)]
    
    
    # extract first row
    first_row = matrix[0]
    
    
    # extract first column
    first_column = [matrix[i][0] for i in range(3)]
    
    
    # invert row order
    upside_down = matrix[:]
    upside_down.reverse()
    # or
    upside_down = [matrix[2-i] for i in range(3)]
    
    
    # invert column order
    palindrome = []
    # append the first line
    palindrome.append([matrix[0][2-i] for i in range(3)])
    # append the second line
    palindrome.append([matrix[1][2-i] for i in range(3)])
    # append the third line
    palindrome.append([matrix[2][2-i] for i in range(3)])
    
    # or in a single step -- it's complicated and you can ignore it!!!
    palindrome = [[row[2-i] for i in range(3)]
                   for row in matrix]
    
    
    # we can re-create matrix with a single list comprehension
    matrix_again = [range(i, i+3) for i in range(9)
                        if i % 3 == 0]
    
  6. Solutions:

    list = range(100)
    
    squares = [number**2 for number in list]
    
    difference_of_squares = \
        [squares[i+1] - squares[i]
         for i in range(len(squares) - 1)]
    
  7. Solutions:

    mouse_genes = ["Fus", "Tdp43", "Sod1", "Ighmbp2", "Srsf2"]
    
    sorted_mouse_genes = mouse_genes[:]
    sorted_mouse_genes.sort()
    print sorted_mouse_genes
    
    human_genes = [gene.upper() for gene in sorted_mouse_genes]