=========================
Python: Lists (Solutions)
=========================

.. note::

   In some cases I will use the character ``\`` at the end of a line. Used in
   this way, ``\`` tells Python that the command continues on the following
   line. If I did not use ``\``, Python might think that the command is
   complete and give an error message if the syntax is wrong.

   You can ignore these ``\``.

Operations
----------

#. Solution::

       list = []
       print list, len(list)               # check

#. Solution::

       list = range(5)
       print list, len(list)               # check
       print len(list)

#. Solution::

       list = [0] * 100
       print list, len(list)               # check

#. Solution::

       list_1 = range(10)
       list_2 = range(10, 20)
       list_complete = list_1 + list_2
       print list_complete
       print list_complete == range(20)    # True

#. Solution::

       list = ["I am", "a", "list"]
       print list, len(list)               # check

       print len(list[0])
       print len(list[1])
       print len(list[2])

#. Solution::

       list = [0.0, "b", [3], [4, 5]]

       print len(list)                     # 4
       print type(list[0])                 # float
       print list[1], len(list[1])         # "b", 1
       print list[2], len(list[2])         # [3], 1
       print list[-1], len(list[-1])       # [4, 5], 2
       print "b" in list                   # True
       print 4 in list                     # False
       print 4 in list[-1]                 # True

#. Solution: the first is a list of integers, the second is a list of
   strings, and the third is a *string*!::

       print type(list_1)                  # list
       print type(list_2)                  # list
       print type(list_3)                  # str

#. Solutions::

       # an empty list
       list = []
       print len(list)                     # 0
       del list

       # invalid syntax, Python gives an error message
       list = [}

       # a list that contains an empty list
       list = [[]]
       print len(list)                     # 1
       print len(list[0])                  # 0
       del list

       # the following doesn't work because the list is not defined!
       list.append(0)

       # this works
       list = []
       list.append(0)
       print list                          # [0]
       del list

       # this doesn't work because we forgot to put commas!
       list = [1 2 3]

       # this gives an error message because the list has only 3 elements!
       list = range(3)
       print list[3]

       # extract the last element
       list = range(3)
       print list[-1]
       del list

       # extract the first two elements (list[2], the third,
       # is excluded)
       list = range(3)
       sublist = list[0:2]
       print list
       del list

       # extract all the elements (list[3], which does not exist,
       # is excluded)
       list = range(3)
       sublist = list[0:3]
       print list
       del list

       # extract the first two elements (list[-1], the third,
       # is excluded)
       list = range(3)
       sublist = list[0:-1]
       print list
       del list

       # put the string "two" in third position
       list = range(3)
       list[2] = "two"
       print list
       del list

       # this doesn't work: the list contains only three elements,
       # there is no fourth position, and Python gives an error message
       list = range(3)
       list[3] = "three"

       # put the string "three" in third position
       list = range(3)
       list[-1] = "three"
       print list
       del list

       # the index has to be an integer, Python gives an error message
       list = range(3)
       list[1.2] = "one point two"

       # replace the second element of list (i.e. 1)
       # with a list of two strings; this can be done,
       # since lists *can* contain other lists
       list = range(3)
       list[1] = ["protein1", "protein2"]
       print list
       del list

#. Solution::

       matrix = [
           [1, 2, 3],
           [4, 5, 6],
           [7, 8, 9],
       ]

       first_row = matrix[0]
       print first_row

       second_element_first_row = first_row[1]
       # or
       second_element_first_row = matrix[0][1]
       print second_element_first_row

       sum_first_row = matrix[0][0] + matrix[0][1] + matrix[0][2]
       print sum_first_row

       second_column = [matrix[0][1], matrix[1][1], matrix[2][1]]
       print second_column

       diagonal = [matrix[0][0], matrix[1][1], matrix[2][2]]
       print diagonal

       three_rows_together = matrix[0] + matrix[1] + matrix[2]
       print three_rows_together

Methods
-------

#. First, let's create a list. For example, an empty list::

       list = []

   next, let's add the required elements with ``append()``::

       list.append(0)
       list.append("text")
       list.append([0, 1, 2, 3])

#.
   Solution::

       # add one 3 at the end of the list
       list = range(3)
       list.append(3)
       print list
       del list

       # add a list containing a 3 at the end of the list
       list = range(3)
       list.append([3])
       print list
       del list

       # add a 3 (the only element contained in the list [3])
       # at the end of the list
       list = range(3)
       list.extend([3])
       print list
       del list

       # doesn't work: extend() extends a list with the content of
       # another list, but here 3 is *not* a list!
       # Python gives an error message
       list = range(3)
       list.extend(3)

       # insert a 3 in position 0 (the first); the existing
       # elements are shifted to the right, none is replaced
       list = range(3)
       list.insert(0, 3)
       print list
       del list

       # insert a 3 at the end of list
       list = range(3)
       list.insert(3, 3)
       print list
       del list

       # insert the list [3] at the end of list
       list = range(3)
       list.insert(3, [3])
       print list
       del list

       # doesn't work: the first argument of insert() has to be an integer,
       # not a list! Python gives an error message
       list = range(3)
       list.insert([3], 3)

#. Solution::

       list = []
       list.append(range(10))
       list.append(range(10, 20))
       print list

   Here we use ``append()``, which inserts an *element* at the end of
   ``list``. In this example we insert two lists, the results of
   ``range(10)`` and ``range(10, 20)``. Clearly, ``len(list)`` is ``2``,
   since we added only 2 elements.

   On the other hand::

       list = []
       list.extend(range(10))
       list.extend(range(10, 20))
       print list

   here we use ``extend()``, which extends a list with the content of
   another list. The final list has 20 elements, as we can see with::

       print len(list)

#. Solution::

       list = [0, 0, 0, 0]
       list.remove(0)
       print list

   only the first occurrence of ``0`` is removed!

#. Solution::

       list = [1, 2, 3, 4, 5]

       # invert the order of the elements of list
       list.reverse()
       print list

       # sort the elements of list
       list.sort()
       print list

   After the two operations, ``list`` gets back to the initial value.
   On the other hand::

       list = [1, 2, 3, 4, 5]
       list.reverse().sort()

   *cannot* be done, since the result of ``list.reverse()`` is ``None``::

       list = [1, 2, 3, 4, 5]
       result = list.reverse()
       print result

   Since ``result`` is not a list, ``sort()`` cannot be applied to it.
   Python gives an error message.

#. Let's try this::

       list = range(10)
       inverse_list = list.reverse()
       print list                          # modified!
       print inverse_list                  # None!

   the code doesn't work: ``reverse()`` modifies ``list`` and returns
   ``None``! Moreover, this code modifies ``list`` directly, and we don't
   want that.

   First, let's create a copy of ``list``; then we can reverse the copy::

       list = range(10)
       inverse_list = list[:]              # *not* inverse_list = list
       inverse_list.reverse()
       print list                          # unchanged
       print inverse_list                  # inverted

   On the other hand, this code::

       list = range(10)
       inverse_list = list
       inverse_list.reverse()
       print list                          # modified!
       print inverse_list                  # inverted

   doesn't work as we want: ``inverse_list`` doesn't contain a copy of
   ``list``, but a reference to the same object that ``list`` refers to.

   As a consequence, when we invert ``inverse_list`` we also invert ``list``.

#. As before::

       motifs = [
           "KSYK",
           "SVALVV",
           "GVTGI",
           "VGSSLAEVLKLPD",
       ]
       sorted_motifs = motifs.sort()

   the code doesn't work: ``sort()`` modifies ``motifs`` and returns
   ``None``!

   First, let's create a copy of ``motifs``; then we can sort the copy::

       sorted_motifs = motifs[:]
       sorted_motifs.sort()
       print motifs                        # unchanged
       print sorted_motifs                 # sorted

String-List Methods
---------------------

#. Solution::

       text = """The Wellcome Trust Sanger Institute
       is a world leader in genome research."""

       words = text.split()
       print len(words)

#. Solution::

       table = [
           "protein | database | domain | start | end",
           "YNL275W | Pfam | PF00955 | 236 | 498",
           "YHR065C | SMART | SM00490 | 335 | 416",
           "YKL053C-A | Pfam | PF05254 | 5 | 72",
           "YOR349W | PANTHER | 353 | 414",
       ]

       first_row = table[0]
       almost_column_titles = first_row.split("|")
       print almost_column_titles  # ["protein ", " database ", ...]
       # unfortunately, the column titles contain superfluous spaces;
       # to remove them, we can change the argument of split()
       column_titles = first_row.split(" | ")
       print column_titles         # ["protein", "database", ...]

   We could also use ``strip()`` together with a *list comprehension* on
   ``almost_column_titles``, but it's not necessary.

#. Solution::

       words = ["word_1", "word_2", "word_3"]

       print " ".join(words)
       print ",".join(words)
       print " and ".join(words)
       print "".join(words)

       # a raw string cannot end with a single backslash,
       # so we escape it instead
       backslash = "\\"
       print backslash.join(words)

#. Solution::

       verses = [
           "Taci. Su le soglie",
           "del bosco non odo",
           "parole che dici",
           "umane; ma odo",
           "parole piu' nuove",
           "che parlano gocciole e foglie",
           "lontane.",
       ]

       poem = "\n".join(verses)

List Comprehension
------------------

#. Solutions:

   #. Solution::

          list_plus_three = [number + 3 for number in list]
          print list_plus_three               # check

   #. Solution::

          odds = [number for number in list if (number % 2 == 1)]

   #. Solution::

          opposites = [-number for number in list]

   #. Solution::

          inverses = [1.0 / number for number in list if number != 0]

   #. Solution::

          first_and_last = [list[0], list[-1]]

   #. Solution::

          from_second_to_penultimate = list[1:-1]

   #. Solution::

          list_odds = [number for number in list if (number % 2 == 1)]
          number_odds = len(list_odds)
          print number_odds

      or::

          number_odds = len([number for number in list
                             if (number % 2 == 1)])

   #. Solution::

          list_divided_by_5 = [float(number) / 5 for number in list]

   #. Solution::

          list_multiples_5_divided = [float(number) / 5.0
                                      for number in list
                                      if (number % 5 == 0)]

   #. Solution::

          list_of_strings = [str(number) for number in list]

   #. Solution::

          # As before, but iterating on `list_of_strings`
          # rather than directly on `list`
          number_odds = len([string for string in list_of_strings
                             if (int(string) % 2 == 1)])

   #. Solution::

          text = " ".join([str(number) for number in list])

      Notice that if we forget to write ``str(number)``, ``join()`` doesn't
      work.

#. Solutions:

   #.
      ::

          # forward
          list_1 = [1, 2, 3]
          list_2 = [str(x) for x in list_1]

          # backward
          list_2 = ["1", "2", "3"]
          list_1 = [int(x) for x in list_2]

   #. ::

          # forward
          list_1 = ["name", "surname", "age"]
          list_2 = [[x] for x in list_1]

          # backward
          list_2 = [["name"], ["surname"], ["age"]]
          list_1 = [l[0] for l in list_2]

   #. ::

          # forward
          list_1 = ["ACTC", "TTTGGG", "CT"]
          list_2 = [[x.lower(), len(x)] for x in list_1]

          # backward
          list_2 = [["actc", 4], ["tttggg", 6], ["ct", 2]]
          list_1 = [l[0].upper() for l in list_2]

#. Solution:

   #. ``[x for x in list]``: creates a copy of ``list``.

   #. ``[y for y in list]``: creates a copy of ``list`` (the same as before).

   #. ``[y for x in list]``: invalid. (If ``x`` represents an element of the
      list, what is ``y``?)

   #. ``["x" for x in list]``: creates a list as long as ``list``, filled
      with the string ``"x"``. The result will be: ``["x", "x", ..., "x"]``.

   #. ``[str(x) for x in list]``: for each int ``x`` in ``list``, ``x`` is
      converted to a string with ``str(x)`` and included in the resulting
      list. The result will be: ``["0", "1", ..., "9"]``.

   #. ``[x for str(x) in list]``: invalid: the transformation ``str(...)`` is
      in the wrong place!

   #. ``[x + 1 for x in list]``: for each int ``x`` in ``list``, computes
      ``x + 1`` and includes the result in the resulting list. The result
      will be: ``[1, 2, ..., 10]``.

   #. ``[x + 1 for x in list if x == 2]``: for each int ``x`` in ``list``,
      checks whether it is equal to ``2``. If so, ``x + 1`` is included in
      the resulting list; otherwise it is excluded. The result will be:
      ``[3]``.

#.
   Solution::

       clusters = """\
       >Cluster 0
       0 >YLR106C at 100.00%
       >Cluster 50
       0 >YPL082C at 100.00%
       >Cluster 54
       0 >YHL009W-A at 90.80%
       1 >YHL009W-B at 100.00%
       2 >YJL113W at 98.77%
       3 >YJL114W at 97.35%
       >Cluster 52
       0 >YBR208C at 100.00%
       """

       # strip() drops the trailing newline, which would otherwise
       # produce an empty last row
       rows = clusters.strip().split("\n")

   In order to get the cluster names, we have to keep only the lines starting
   with ``">"``, and for each of them apply ``split()`` in order to get the
   second element (the name of the cluster)::

       cluster_names = [row.split()[1] for row in rows
                        if row.startswith(">")]

   In order to get the protein names, we have to keep only the lines *not*
   starting with ``">"``, and for each of them apply ``split()`` and keep the
   second element (also removing ``">"`` from the name of the protein)::

       proteins = [row.split()[1].lstrip(">") for row in rows
                   if not row.startswith(">")]

   In order to get the protein-percentage pairs, we have to keep only the
   lines *not* starting with ``">"``. On each of them, we apply ``split()``
   and keep the second element (the protein name) and the last one (the
   percentage)::

       protein_percentage_pairs = \
           [[row.split()[1].lstrip(">"), row.split()[-1].rstrip("%")]
            for row in rows
            if not row.startswith(">")]

   Annotated version::

       protein_percentage_pairs = \
           [[row.split()[1].lstrip(">"), row.split()[-1].rstrip("%")]
           # ^^^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
           #   protein name, as before           percentage
           # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
           #                 protein-percentage pair
            for row in rows
            if not row.startswith(">")]

#.
   Solution::

       matrix = [range(0, 3), range(3, 6), range(6, 9)]

       # extract the first row
       first_row = matrix[0]

       # extract the first column (the first element of each row)
       first_column = [matrix[i][0] for i in range(3)]

       # invert the row order
       upside_down = matrix[:]
       upside_down.reverse()
       # or
       upside_down = [matrix[2-i] for i in range(3)]

       # invert the column order
       palindrome = []
       # append the first row
       palindrome.append([matrix[0][2-i] for i in range(3)])
       # append the second row
       palindrome.append([matrix[1][2-i] for i in range(3)])
       # append the third row
       palindrome.append([matrix[2][2-i] for i in range(3)])

       # or in a single step -- it's complicated and you can ignore it!
       palindrome = [[row[2-i] for i in range(3)]
                     for row in matrix]

       # we can re-create matrix with a single list comprehension
       matrix_again = [range(i, i+3) for i in range(9) if i % 3 == 0]

#. Solutions::

       list = range(100)
       squares = [number**2 for number in list]

       difference_of_squares = \
           [squares[i+1] - squares[i]
            for i in range(len(squares) - 1)]

#. Solutions::

       mouse_genes = ["Fus", "Tdp43", "Sod1", "Ighmbp2", "Srsf2"]
       sorted_mouse_genes = mouse_genes[:]
       sorted_mouse_genes.sort()
       print sorted_mouse_genes

       human_genes = [gene.upper() for gene in sorted_mouse_genes]
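
The last solution relies on ``mouse_genes[:]`` creating a copy before ``sort()`` is called. As a small check, the sketch below wraps the same three steps in a helper. The name ``sorted_upper`` is hypothetical and not part of the exercises. Unlike the solutions above, it calls ``print`` with parentheses and a single argument, so that it should run unchanged under both Python 2 and Python 3.

```python
def sorted_upper(genes):
    """Return a sorted, upper-case copy of `genes` (hypothetical helper)."""
    sorted_genes = genes[:]   # copy the list, so that `genes` is not modified
    sorted_genes.sort()       # sort() works in place and returns None
    return [gene.upper() for gene in sorted_genes]


mouse_genes = ["Fus", "Tdp43", "Sod1", "Ighmbp2", "Srsf2"]
human_genes = sorted_upper(mouse_genes)

print(human_genes)            # sorted and upper-case
print(mouse_genes)            # still in the original order
```

If the copy were replaced with ``sorted_genes = genes``, the second ``print`` would show the sorted list instead, because both names would refer to the same object.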