{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Dictionaries\n", "\n", "Python dictionaries are analogous to Java's HashMap data structure, and like HashMap, they provide fast access to a value associated with a key using a hash function. And this being Python, they're a lot easier to use!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# first, let's create an empty dictionary\n", "eng2sp = {}\n", "# and then let's add a few items to it\n", "eng2sp['one'] = 'uno'\n", "eng2sp['two'] = 'dos'\n", "eng2sp['one']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "eng = ['one', 'two']\n", "sp = ['uno', 'dos']\n", "sp[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also create a dictionary in one step by listing its key-value pairs inside braces:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the order in which the pairs are listed doesn't matter for looking up values or for comparing dictionaries (since Python 3.7, a dictionary does remember insertion order when it is displayed or iterated over)."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}\n", "inventory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `len` function returns the number of key-value pairs:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "len(inventory)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can change the values in a dictionary:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "inventory['pears'] = 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can determine if a dictionary has an entry with a given key:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "'pears' in inventory, 'bananas' in inventory\n", "'kiwi' in inventory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can use integers, strings and tuples as keys:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pair_dictionary = {}\n", "pair_dictionary[('a', 1)] = 1\n", "pair_dictionary[('z', 3)] = 5\n", "pair_dictionary[(1,1)] = 3\n", "pair_dictionary[10] = 5\n", "pair_dictionary" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hash((0,1))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Keys of a dictionary have to be hashable, which for the built-in types means immutable. Therefore we cannot use lists as keys (but tuples of hashable items are fine)."
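] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a rough illustration of that restriction (the name `lookup` is made up for this sketch and is not used elsewhere in the notebook), a tuple is accepted as a key, while a list should raise a `TypeError`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# sketch only: `lookup` is a hypothetical dictionary used for illustration\n", "lookup = {}\n", "lookup[('a', 1)] = 'tuple key'  # a tuple of hashable items is hashable\n", "try:\n", "    lookup[['a', 1]] = 'list key'  # a list is mutable and therefore unhashable\n", "except TypeError as err:\n", "    print('TypeError:', err)\n", "lookup"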
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can iterate over the keys of a dictionary using a for loop:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}\n", "for fruit in inventory:\n", "    print(fruit, inventory[fruit])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dictionaries are extremely useful for efficiently keeping track of things. Let's write a function that computes the number of occurrences of each letter in a sentence:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'f': 1,\n", " 'h': 1,\n", " 'i': 1,\n", " 'n': 2,\n", " 'o': 1,\n", " 'p': 1,\n", " 's': 1,\n", " 't': 1,\n", " 'u': 1,\n", " 'y': 1}" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def letter_count(sentence):\n", "    counts = {}\n", "    for letter in sentence.lower():\n", "        if letter in \" !.?,\":\n", "            continue  # ignore whitespace and punctuation\n", "        if letter not in counts:\n", "            counts[letter] = 0  # first time we see this letter\n", "        counts[letter] += 1\n", "    return counts\n", "\n", "letter_count('Python is fun!')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can modify the above function to use the ``get`` method of a dictionary:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def letter_count(sentence):\n", "    counts = {}\n", "    for letter in sentence.lower():\n", "        if letter in \" !.?,\\n\\t\":\n", "            continue  # ignore whitespace and punctuation\n", "        counts[letter] = counts.get(letter, 0) + 1\n", "    return counts\n", "\n", "letter_count('Python is fun!')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}\n", "# 'tres' is a value, not a key, so get returns the default 0\n", "eng2sp.get('tres', 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The
second argument of the ``get`` method of a dictionary is the value returned when the given key is not in the dictionary, which allowed us to skip the step that handles the case of a letter that is not yet in the dictionary.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercises:\n", "\n", "* Most common substring. Write a function called `most_common_substring(s, length)` that returns the substring of the given length that occurs most often within the input string `s`. For example, on the input `'mississippi', 4`, the return value should be `'issi'`. Hint: use slices to extract substrings of the appropriate length." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 1 }