{"cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# A short introduction to machine learning\n", "\n", "The [Wine Quality Data Set](https://archive.ics.uci.edu/ml/datasets/Wine+Quality) lists the chemical components of wines along with the score given by experts. Can this score be predicted from the chemical components? If we manage to build a function that predicts the score, we may be able to understand how the experts rate the wines."]}, {"cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [{"data": {"text/html": ["
\n", ""], "text/plain": [""]}, "execution_count": 2, "metadata": {}, "output_type": "execute_result"}], "source": ["from jyquickhelper import add_notebook_menu\n", "add_notebook_menu()"]}, {"cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": ["%matplotlib inline"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Data and a first linear regression\n", "\n", "The data can be loaded with the function implemented in the `papierstat` module."]}, {"cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [{"data": {"text/html": ["
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   fixed_acidity  volatile_acidity  citric_acid  residual_sugar  chlorides  free_sulfur_dioxide  total_sulfur_dioxide  density    pH  sulphates  alcohol  quality  color
0            7.4              0.70         0.00             1.9      0.076                 11.0                  34.0   0.9978  3.51       0.56      9.4        5      0
1            7.8              0.88         0.00             2.6      0.098                 25.0                  67.0   0.9968  3.20       0.68      9.8        5      0
2            7.8              0.76         0.04             2.3      0.092                 15.0                  54.0   0.9970  3.26       0.65      9.8        5      0
3           11.2              0.28         0.56             1.9      0.075                 17.0                  60.0   0.9980  3.16       0.58      9.8        6      0
4            7.4              0.70         0.00             1.9      0.076                 11.0                  34.0   0.9978  3.51       0.56      9.4        5      0
\n"], "text/plain": [" fixed_acidity volatile_acidity citric_acid residual_sugar chlorides \\\n", "0 7.4 0.70 0.00 1.9 0.076 \n", "1 7.8 0.88 0.00 2.6 0.098 \n", "2 7.8 0.76 0.04 2.3 0.092 \n", "3 11.2 0.28 0.56 1.9 0.075 \n", "4 7.4 0.70 0.00 1.9 0.076 \n", "\n", " free_sulfur_dioxide total_sulfur_dioxide density pH sulphates \\\n", "0 11.0 34.0 0.9978 3.51 0.56 \n", "1 25.0 67.0 0.9968 3.20 0.68 \n", "2 15.0 54.0 0.9970 3.26 0.65 \n", "3 17.0 60.0 0.9980 3.16 0.58 \n", "4 11.0 34.0 0.9978 3.51 0.56 \n", "\n", " alcohol quality color \n", "0 9.4 5 0 \n", "1 9.8 5 0 \n", "2 9.8 5 0 \n", "3 9.8 6 0 \n", "4 9.4 5 0 "]}, "execution_count": 4, "metadata": {}, "output_type": "execute_result"}], "source": ["from papierstat.datasets import load_wines_dataset\n", "df = load_wines_dataset()\n", "df[\"color2\"] = 0\n", "df.loc[df[\"color\"] == \"white\", \"color2\"] = 1\n", "df[\"color\"] = df[\"color2\"]\n", "df = df.drop('color2', axis=1)\n", "df.head()"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Alternatively, the data can be downloaded from the site and assembled into the same table."]}, {"cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": ["# import pandas\n", "# df_red = pandas.read_csv('winequality-red.csv', sep=';')\n", "# df_red['color'] = 0\n", "# df_white = pandas.read_csv('winequality-white.csv', sep=';')\n", "# df_white['color'] = 1\n", "# df = pandas.concat([df_red, df_white])\n", "# df.shape, df_red.shape, df_white.shape"]}, {"cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [{"data": {"text/html": ["
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
                       count        mean        std      min       25%        50%        75%        max
fixed_acidity         6497.0    7.215307   1.296434  3.80000   6.40000    7.00000    7.70000   15.90000
volatile_acidity      6497.0    0.339666   0.164636  0.08000   0.23000    0.29000    0.40000    1.58000
citric_acid           6497.0    0.318633   0.145318  0.00000   0.25000    0.31000    0.39000    1.66000
residual_sugar        6497.0    5.443235   4.757804  0.60000   1.80000    3.00000    8.10000   65.80000
chlorides             6497.0    0.056034   0.035034  0.00900   0.03800    0.04700    0.06500    0.61100
free_sulfur_dioxide   6497.0   30.525319  17.749400  1.00000  17.00000   29.00000   41.00000  289.00000
total_sulfur_dioxide  6497.0  115.744574  56.521855  6.00000  77.00000  118.00000  156.00000  440.00000
density               6497.0    0.994697   0.002999  0.98711   0.99234    0.99489    0.99699    1.03898
pH                    6497.0    3.218501   0.160787  2.72000   3.11000    3.21000    3.32000    4.01000
sulphates             6497.0    0.531268   0.148806  0.22000   0.43000    0.51000    0.60000    2.00000
alcohol               6497.0   10.491801   1.192712  8.00000   9.50000   10.30000   11.30000   14.90000
quality               6497.0    5.818378   0.873255  3.00000   5.00000    6.00000    6.00000    9.00000
color                 6497.0    0.753886   0.430779  0.00000   1.00000    1.00000    1.00000    1.00000
\n", "
"], "text/plain": [" count mean std min 25% \\\n", "fixed_acidity 6497.0 7.215307 1.296434 3.80000 6.40000 \n", "volatile_acidity 6497.0 0.339666 0.164636 0.08000 0.23000 \n", "citric_acid 6497.0 0.318633 0.145318 0.00000 0.25000 \n", "residual_sugar 6497.0 5.443235 4.757804 0.60000 1.80000 \n", "chlorides 6497.0 0.056034 0.035034 0.00900 0.03800 \n", "free_sulfur_dioxide 6497.0 30.525319 17.749400 1.00000 17.00000 \n", "total_sulfur_dioxide 6497.0 115.744574 56.521855 6.00000 77.00000 \n", "density 6497.0 0.994697 0.002999 0.98711 0.99234 \n", "pH 6497.0 3.218501 0.160787 2.72000 3.11000 \n", "sulphates 6497.0 0.531268 0.148806 0.22000 0.43000 \n", "alcohol 6497.0 10.491801 1.192712 8.00000 9.50000 \n", "quality 6497.0 5.818378 0.873255 3.00000 5.00000 \n", "color 6497.0 0.753886 0.430779 0.00000 1.00000 \n", "\n", " 50% 75% max \n", "fixed_acidity 7.00000 7.70000 15.90000 \n", "volatile_acidity 0.29000 0.40000 1.58000 \n", "citric_acid 0.31000 0.39000 1.66000 \n", "residual_sugar 3.00000 8.10000 65.80000 \n", "chlorides 0.04700 0.06500 0.61100 \n", "free_sulfur_dioxide 29.00000 41.00000 289.00000 \n", "total_sulfur_dioxide 118.00000 156.00000 440.00000 \n", "density 0.99489 0.99699 1.03898 \n", "pH 3.21000 3.32000 4.01000 \n", "sulphates 0.51000 0.60000 2.00000 \n", "alcohol 10.30000 11.30000 14.90000 \n", "quality 6.00000 6.00000 9.00000 \n", "color 1.00000 1.00000 1.00000 "]}, "execution_count": 6, "metadata": {}, "output_type": "execute_result"}], "source": ["df.describe().T"]}, {"cell_type": "markdown", "metadata": {}, "source": ["J'ai tendance \u00e0 utiliser ``df`` partout quitte \u00e0 ce que le premier soit \u00e9cras\u00e9. 
Conservons-le dans une variable \u00e0 part."]}, {"cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": ["df_data = df"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Quelle est la distribution des notes ?"]}, {"cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [{"data": {"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD4CAYAAAAAczaOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAATGElEQVR4nO3df6zd9X3f8eerOGkITsEp6RUFNjPJi0qxxuAK2JCi67EQA1VJpkUCMQJpIkcTTMlqaXEqVWRNI3lSaKfQDM0NXohCc8XyQ1jghnpe7zL+oMFOaYzjRHjEZbaZ3c6OqRPU1tl7f9yvtxu4ts8999x77j2f50M6Oud8vp/v9/t533PO63zP93zP96aqkCS14WeGPQBJ0uIx9CWpIYa+JDXE0Jekhhj6ktSQFcMewNlcfPHFtXr16r7n/9GPfsQFF1wwuAENyajUAdayVI1KLaNSB8yvlt27d/9lVb1jtmlLOvRXr17Nrl27+p5/amqKiYmJwQ1oSEalDrCWpWpUahmVOmB+tST58zNNc/eOJDXE0Jekhhj6ktQQQ1+SGmLoS1JDDH1JaoihL0kNMfQlqSGGviQ1ZEn/IldayvYcOsG9m55a9PUe2Hzboq9To8MtfUlqiKEvSQ0x9CWpIYa+JDXE0Jekhhj6ktQQQ1+SGmLoS1JDDH1JaoihL0kNMfQlqSGGviQ1xNCXpIYY+pLUEENfkhpi6EtSQwx9SWqIoS9JDTH0Jakhhr4kNeScoZ/k8iR/nGRfkr1JPtq1fzLJoSTPd5dbZ8zziST7k3w/yXtmtK/v2vYn2bQwJUmSzmRFD31OARur6ttJ3gbsTrKjm/a7VfWZmZ2TXAncAfwy8IvAf0ny97vJnwPeDRwEnkuyraq+O4hCJEnnds7Qr6pXgFe623+VZB9w6VlmuR2YrKq/Bn6QZD9wXTdtf1W9BJBksutr6EvSIklV9d45WQ18E7gK+HXgXuBVYBfTnwaOJ/k94Nmq+lI3zyPAH3aLWF9VH+7a7waur6r7X7eODcAGgLGxsWsnJyf7rY2TJ0+ycuXKvudfKkalDhitWo4eO8GR1xZ/vWsvvXDgyxyVx2VU6oD51bJu3brdVTU+27Redu8AkGQl8FXgY1X1apKHgU8B1V0/CPwakFlmL2b//uAN7zhVtQXYAjA+Pl4TExO9DvENpqammM/8S8Wo1AGjVctDjz3Bg3t6fgkNzIG7Jga+zFF5XEalDli4Wnp6xiZ5E9OB/1hVfQ2gqo7MmP77wJPd3YPA5TNmvww43N0+U7skaRH0cvROgEeAfVX1OzPaL5nR7X3AC93tbcAdSX42yRXAGuBbwHPAmiRXJHkz01/2bhtMGZKkXvSypX8jcDewJ8nzXdtvAHcmuZrpXTQHgI8AVNXeJI8z/QXtKeC+qvoJQJL7gaeB84CtVbV3gLVIks6hl6N3nmH2/fTbzzLPp4FPz9K+/WzzSZIWlr/IlaSGGPqS1BBDX5IaYuhLUkMMfUlqiKEvSQ0x9CWpIYa+JDXE0Jekhhj6ktQQQ1+SGmLoS1JDDH1JaoihL0kNMfQlqSGGviQ1xNCXpIYY+pLUEENfkhpi6EtSQwx9SWrIimEPQKNh9aaneuq3ce0p7u2xby8ObL5tYM
uSWuCWviQ1xNCXpIYY+pLUEENfkhpi6EtSQwx9SWrIOUM/yeVJ/jjJviR7k3y0a397kh1JXuyuV3XtSfLZJPuTfCfJNTOWdU/X/8Uk9yxcWZKk2fSypX8K2FhVvwTcANyX5EpgE7CzqtYAO7v7ALcAa7rLBuBhmH6TAB4ArgeuAx44/UYhSVoc5wz9qnqlqr7d3f4rYB9wKXA78GjX7VHgvd3t24Ev1rRngYuSXAK8B9hRVceq6jiwA1g/0GokSWeVquq9c7Ia+CZwFfByVV00Y9rxqlqV5Elgc1U907XvBD4OTABvqarf7tp/E3itqj7zunVsYPoTAmNjY9dOTk72XdzJkydZuXJl3/MvFcuhjj2HTvTUb+x8OPLa4Na79tILB7ewOTp67MRAa+nVQtS8HJ5jvRiVOmB+taxbt253VY3PNq3n0zAkWQl8FfhYVb2a5IxdZ2mrs7T/dEPVFmALwPj4eE1MTPQ6xDeYmppiPvMvFcuhjl5PrbBx7Ske3DO4s38cuGtiYMuaq4cee2KgtfRqIWpeDs+xXoxKHbBwtfR09E6SNzEd+I9V1de65iPdbhu666Nd+0Hg8hmzXwYcPku7JGmR9HL0ToBHgH1V9TszJm0DTh+Bcw/wxIz2D3RH8dwAnKiqV4CngZuTrOq+wL25a5MkLZJePpveCNwN7EnyfNf2G8Bm4PEkHwJeBt7fTdsO3ArsB34MfBCgqo4l+RTwXNfvt6rq2ECqkCT15Jyh330he6Yd+DfN0r+A+86wrK3A1rkMUJI0OP4iV5IaYuhLUkMMfUlqiKEvSQ0x9CWpIYa+JDXE0Jekhhj6ktQQQ1+SGmLoS1JDDH1JaoihL0kNMfQlqSGGviQ1xNCXpIYY+pLUEENfkhpi6EtSQwx9SWqIoS9JDTH0Jakhhr4kNcTQl6SGGPqS1BBDX5IaYuhLUkMMfUlqyDlDP8nWJEeTvDCj7ZNJDiV5vrvcOmPaJ5LsT/L9JO+Z0b6+a9ufZNPgS5EknUsvW/pfANbP0v67VXV1d9kOkORK4A7gl7t5/kOS85KcB3wOuAW4Eriz6ytJWkQrztWhqr6ZZHWPy7sdmKyqvwZ+kGQ/cF03bX9VvQSQZLLr+905j1iS1LdU1bk7TYf+k1V1VXf/k8C9wKvALmBjVR1P8nvAs1X1pa7fI8AfdotZX1Uf7trvBq6vqvtnWdcGYAPA2NjYtZOTk30Xd/LkSVauXNn3/EvFcqhjz6ETPfUbOx+OvDa49a699MLBLWyOjh47MdBaerUQNS+H51gvRqUOmF8t69at211V47NNO+eW/hk8DHwKqO76QeDXgMzSt5h9N9Ks7zZVtQXYAjA+Pl4TExN9DhGmpqaYz/xLxXKo495NT/XUb+PaUzy4p9+n3RsduGtiYMuaq4cee2KgtfRqIWpeDs+xXoxKHbBwtfT1jK2qI6dvJ/l94Mnu7kHg8hldLwMOd7fP1C5JWiR9HbKZ5JIZd98HnD6yZxtwR5KfTXIFsAb4FvAcsCbJFUnezPSXvdv6H7YkqR/n3NJP8mVgArg4yUHgAWAiydVM76I5AHwEoKr2Jnmc6S9oTwH3VdVPuuXcDzwNnAdsraq9A69GknRWvRy9c+cszY+cpf+ngU/P0r4d2D6n0UmSBspf5EpSQwx9SWqIoS9JDTH0Jakhhr4kNcTQl6SGGPqS1BBDX5IaYuhLUkMMfUlqiKEvSQ0x9CWpIYa+JDXE0Jekhhj6ktQQQ1+SGmLoS1JDDH1JaoihL0kNMfQlqSGGviQ1xNCXpIYY+pLUEENfkhpi6EtSQ1YMewCS5mb1pqcGvsyNa09xbw/LPbD5toGvW4vLLX1JaoihL0kNOWfoJ9ma5GiSF2a0vT3JjiQvdteruvYk+WyS/Um+k+SaGfPc0/V/Mck9C1OOJOlsetnS/wKw/nVtm4CdVbUG2NndB7gFWNNdNgAPw/SbBPAAcD1wHfDA6TcKSdLiOWfoV9U3gWOva74deLS7/S
jw3hntX6xpzwIXJbkEeA+wo6qOVdVxYAdvfCORJC2wVNW5OyWrgSer6qru/g+r6qIZ049X1aokTwKbq+qZrn0n8HFgAnhLVf121/6bwGtV9ZlZ1rWB6U8JjI2NXTs5Odl3cSdPnmTlypV9z79ULIc69hw60VO/sfPhyGuDW+/aSy8c3MLm6OixEwOtZZh6fVyG+ffuxXJ4rfRqPrWsW7dud1WNzzZt0IdsZpa2Okv7GxurtgBbAMbHx2tiYqLvwUxNTTGf+ZeK5VBHL4f7wfShgQ/uGdzT7sBdEwNb1lw99NgTA61lmHp9XIb59+7Fcnit9Gqhaun36J0j3W4buuujXftB4PIZ/S4DDp+lXZK0iPoN/W3A6SNw7gGemNH+ge4onhuAE1X1CvA0cHOSVd0XuDd3bZKkRXTOz3NJvsz0PvmLkxxk+iiczcDjST4EvAy8v+u+HbgV2A/8GPggQFUdS/Ip4Lmu329V1eu/HJYkLbBzhn5V3XmGSTfN0reA+86wnK3A1jmNTpI0UP4iV5IaYuhLUkMMfUlqiKEvSQ0x9CWpIYa+JDXE0Jekhhj6ktQQQ1+SGmLoS1JDDH1JaoihL0kNMfQlqSGGviQ1xNCXpIYY+pLUEENfkhpi6EtSQwx9SWqIoS9JDTH0Jakhhr4kNcTQl6SGGPqS1BBDX5IaYuhLUkMMfUlqyLxCP8mBJHuSPJ9kV9f29iQ7krzYXa/q2pPks0n2J/lOkmsGUYAkqXeD2NJfV1VXV9V4d38TsLOq1gA7u/sAtwBrussG4OEBrFuSNAcLsXvnduDR7vajwHtntH+xpj0LXJTkkgVYvyTpDFJV/c+c/AA4DhTwH6tqS5IfVtVFM/ocr6pVSZ4ENlfVM137TuDjVbXrdcvcwPQnAcbGxq6dnJzse3wnT55k5cqVfc+/VCyHOvYcOtFTv7Hz4chrg1vv2ksvHNzC5ujosRMDrWWYen1chvn37sVyeK30aj61rFu3bveMvS8/ZcW8RgU3VtXhJL8A7EjyvbP0zSxtb3jHqaotwBaA8fHxmpiY6HtwU1NTzGf+pWI51HHvpqd66rdx7Ske3DPfp93/d+CuiYEta64eeuyJgdYyTL0+LsP8e/diObxWerVQtcxr905VHe6ujwJfB64DjpzebdNdH+26HwQunzH7ZcDh+axfkjQ3fYd+kguSvO30beBm4AVgG3BP1+0e4Inu9jbgA91RPDcAJ6rqlb5HLkmas/l8Nh0Dvp7k9HL+oKq+keQ54PEkHwJeBt7f9d8O3ArsB34MfHAe65Yk9aHv0K+ql4B/MEv7/wZumqW9gPv6XZ8kaf78Ra4kNcTQl6SGGPqS1BBDX5IaYuhLUkMMfUlqiKEvSQ0x9CWpIYa+JDXE0Jekhhj6ktQQQ1+SGmLoS1JDDH1JaoihL0kNGY1/8LnErO7x/8X2auPaUz3/D9oDm28b6LoljRa39CWpIW7pS1ryev30PJdPxb0atU/PbulLUkMMfUlqiKEvSQ0x9CWpIYa+JDXE0Jekhhj6ktQQQ1+SGmLoS1JDDH1Jasiih36S9Um+n2R/kk2LvX5JatminnsnyXnA54B3AweB55Jsq6rvLsT69hw6MfDzcEjScrbYJ1y7DthfVS8BJJkEbgcWJPQlab4Gfar0Xn1h/QULstxU1YIseNaVJf8cWF9VH+7u3w1cX1X3z+izAdjQ3X0n8P15rPJi4C/nMf9SMSp1gLUsVaNSy6jUAfOr5e9W1Ttmm7DYW/qZpe2n3nWqaguwZSArS3ZV1fggljVMo1IHWMtSNSq1jEodsHC1LPYXuQeBy2fcvww4vMhjkKRmLXboPwesSXJFkjcDdwDbFnkMktSsRd29U1WnktwPPA2cB2ytqr0LuMqB7CZaAkalDrCWpWpUahmVOmCBalnUL3IlScPlL3IlqSGGviQ1ZORCP8lbknwryZ8l2Zvk3w57TPOV5Lwkf5rkyWGPZT6SHEiyJ8nzSXYNezz9SnJRkq
8k+V6SfUn+0bDH1I8k7+wei9OXV5N8bNjj6leSf9295l9I8uUkbxn2mPqR5KNdDXsX4vEYuX36SQJcUFUnk7wJeAb4aFU9O+Sh9S3JrwPjwM9V1a8Mezz9SnIAGK+qZf3jmSSPAv+9qj7fHYX21qr64bDHNR/dKVIOMf1jyT8f9njmKsmlTL/Wr6yq15I8Dmyvqi8Md2Rzk+QqYJLpsxf8DfAN4F9W1YuDWsfIbenXtJPd3Td1l2X7zpbkMuA24PPDHosgyc8B7wIeAaiqv1nugd+5CfgfyzHwZ1gBnJ9kBfBWludvgH4JeLaqflxVp4D/BrxvkCsYudCH/7c75HngKLCjqv5k2GOah38P/Bvg/wx7IANQwB8l2d2dbmM5+nvAXwD/qdvl9vkkC3OSlMV1B/DlYQ+iX1V1CPgM8DLwCnCiqv5ouKPqywvAu5L8fJK3Arfy0z9onbeRDP2q+klVXc30L36v6z4yLTtJfgU4WlW7hz2WAbmxqq4BbgHuS/KuYQ+oDyuAa4CHq+ofAj8ClvUpwrtdVL8K/Odhj6VfSVYxffLGK4BfBC5I8i+GO6q5q6p9wL8DdjC9a+fPgFODXMdIhv5p3cfuKWD9kIfSrxuBX+32hU8C/yTJl4Y7pP5V1eHu+ijwdab3Wy43B4GDMz49foXpN4Hl7Bbg21V1ZNgDmYd/Cvygqv6iqv4W+Brwj4c8pr5U1SNVdU1VvQs4Bgxsfz6MYOgneUeSi7rb5zP9ZPjecEfVn6r6RFVdVlWrmf74/V+ratltvQAkuSDJ207fBm5m+qPsslJV/wv4n0ne2TXdxPI/NfidLONdO52XgRuSvLU7mOMmYN+Qx9SXJL/QXf8d4J8x4Mdmsc+yuRguAR7tjkb4GeDxqlrWhzqOiDHg69OvR1YAf1BV3xjukPr2r4DHut0iLwEfHPJ4+tbtN3438JFhj2U+qupPknwF+DbTu0P+lOV7SoavJvl54G+B+6rq+CAXPnKHbEqSzmzkdu9Iks7M0Jekhhj6ktQQQ1+SGmLoS1JDDH1JaoihL0kN+b8I/aKnJnwohQAAAABJRU5ErkJggg==\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["df['quality'].hist();"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Les notes pour les blancs et les rouges."]}, {"cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [{"data": {"image/png": "iVBORw0KGgoAAAANSUhEUgAAAsUAAADSCAYAAACih70SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAdjklEQVR4nO3df7RdZX3n8fdHghWwGn7IHZpQY0fGkcoSaQZpWeO6FYv8cAztKh0YRgKlE2cWOjqyVhtcs8a2th2cVWqBVjopUEOLIKUysJSqLOS2daZQRSk/dRFpJDEpUflhI201+p0/zr54CDe5596ce+7dZ79fa5119n72s/f+PicnT755zrP3TlUhSZIkddkLFjsASZIkabGZFEuSJKnzTIolSZLUeSbFkiRJ6jyTYkmSJHWeSbEkSZI6z6RYkiQNVZLNSd40Q/lkkq2LEZM0G5NiSZIkdZ5JsVolybLFjkGSJI0fk2Itec3PcL+S5D7g20mOSTKV5KkkDyZ5a1/dqSS/1Ld+XpLP9q2fnOTLSZ5O8qEkf7Fb/V9M8nCSJ5N8KsnLR9ZQSRov/ybJQ01/+kdJXrR7hSTrk3wlyT80dX+2b9t5ST6b5LebY/xdklP7th/SHHdbs/3/jKphGk8mxWqLs4HTgcOAm4FPA4cD7wSuS/Kq2Q6Q5DDgJuBi4FDgy8BP9W0/A3gv8HPAy4C/Aq4faiskqTvOAd4M/EvgXwH/fYY6XwH+LfBS4NeAP0lyRN/219Prqw8D/hdwdZI02/4YOBD4cXr/HnxwAdqgDjEpVltcXlVbgGOBFwOXVNV3quozwMfpJc2zOQ14sKo+VlW7gMuBv+/b/nbgf1bVw8323wKOdbRYkubl96pqS1U9AfwmM/TTVfWnVbWtqr5fVR8FHgGO76vy1ar6w6r6HrAROAKYaBLnU4H/XFVPVtV3q+ovFr5JGmcmxWqLLc37jwBbqur7fdu+CqwY4Bg/0nccqqqA/qugXw5c1kzLeAp4AsiAx5YkPdeWvuWv0uuDnyPJuUnu7et3X0NvVHjaswMXVfVMs/hi4Ejgiap6cvhhq6tMitUW1bxvA45M0v/d/VHga83yt+n9nDbtX/QtbwdWTq80P8Gt7Nu+BXh7VS3vex1QVf9vWI2QpA45sm/5R+n1389qfoX7Q+AdwKFVtRx4gN5gxGy2AIckWT6kWCWTYrXO3fQS319Osn+SSeDfATc02+8Ffi7JgUleCVzQt+8ngGOSnNHcxeJCnps0/wFwcZIfB0jy0iRnLmxzJGlsXZhkZZJD6F2v8dHdth9Eb8Dj6wBJzqc3UjyrqtoO/DnwoSQHN/8evGF4oauLTIrVKlX1HeCt9OaSfQP4EHBuVX2pqfJB4DvA4/Tmn13Xt+83gDPpXazxTeBo4PPAPzfbbwY+ANyQ5Fv0RiyevdJZkjQnH6F3UfSjzes3+jdW1UPApcBf0+uzjwH+7xyO/zbgu8CXgB3Au/c9ZHVZetMqpe5ppmBsBc6pqjsXOx5JkrR4HClWpyR5c5LlSX6I3s95Ae5a5LAkSdIiMylW1/wkvftifoPeXOQzquofFzckSZK02Jw+IUmSpM5zpFiSJEmdZ1IsSZKkzls2W4Ukr+K59xb8MeB/ANc25auAzcAvVNWTzQMRLqP3SN1ngPOq6gt7O8
dhhx1Wq1atmnPw3/72tznooIPmvF8bjHPbYLzbZ9vaaz7tu+eee75RVS9boJCWpK712cY9Wm2NG9obe9fi3mO/XVUDv4D96D1y8eX07vW6vilfD3ygWT6N3g21A5wA3D3bcX/iJ36i5uPOO++c135tMM5tqxrv9tm29ppP+4DP1xz60XF4da3PNu7RamvcVe2NvWtx76nfnuv0iZOAr1TVV4E19B6OQPN+RrO8Bri2Oe9dwPIkR8zxPJIkSdLIzDp9YjdnAdc3yxPVe8wiVbU9yeFN+Qp6zySftrUp295/oCTrgHUAExMTTE1NzTEU2Llz57z2a4NxbhuMd/tsW3uNe/skSXs2cFKc5IX0Hq978WxVZyh73n3fqmoDsAFg9erVNTk5OWgoz5qammI++7XBOLcNxrt9tq29xr19kqQ9m8v0iVOBL1TV483649PTIpr3HU35VuDIvv1WAtv2NVBJkiRpocwlKT6bH0ydALgVWNssrwVu6Ss/Nz0nAE9PT7OQJEmSlqKBpk8kORD4GeDtfcWXADcmuQB4DDizKb+N3h0oNtG7Jdv5Q4tWY+P+rz3Nees/MZJzbb7k9JGcR5K070b57wP4b4R+YKCkuKqeAQ7dreyb9O5GsXvdAi4cSnSSJEnSCPhEO0mSJHWeSbEkSZI6z6RYkiRJnWdSLEmSpM4zKZYkSVLnmRRLkiSp80yKJUmS1HkmxZIkSeq8gR7eIWlwPo1JkqT2caRYklooyZFJ7kzycJIHk7yrKT8kye1JHmneD27Kk+TyJJuS3JfkuL5jrW3qP5Jk7WK1SZIWk0mxJLXTLuCiqno1cAJwYZKjgfXAHVV1FHBHsw5wKnBU81oHXAm9JBp4H/B64HjgfdOJtCR1iUmxJLVQVW2vqi80y/8APAysANYAG5tqG4EzmuU1wLXVcxewPMkRwJuB26vqiap6ErgdOGWETZGkJcE5xZLUcklWAa8D7gYmqmo79BLnJIc31VYAW/p229qU7al893OsozfCzMTEBFNTU3OOc+fOnfPab7EZ92hNHAAXHbNrZOcb5mfU1s/cuHsGSoqTLAeuAl4DFPCLwJeBjwKrgM3AL1TVk0kCXAacBjwDnDc9miFJGq4kLwb+DHh3VX2r1wXPXHWGstpL+XMLqjYAGwBWr15dk5OTc451amqK+ey32Ix7tK647hYuvX90Y3abz5kc2rHa+pkbd8+g0ycuAz5ZVf8aeC29n+nmNG9NkjRcSfanlxBfV1Ufa4ofb6ZF0LzvaMq3Akf27b4S2LaXcknqlFmT4iQvAd4AXA1QVd+pqqeY+7w1SdKQNL/KXQ08XFW/07fpVmD6DhJrgVv6ys9t7kJxAvB0M83iU8DJSQ5uLrA7uSmTpE4Z5PeJHwO+DvxRktcC9wDvYu7z1rYPLWpJ0onA24D7k9zblL0XuAS4MckFwGPAmc222+hNa9tEb2rb+QBV9USS9wOfa+r9elU9MZomSNLSMUhSvAw4DnhnVd2d5DJ+MFViJgPNT+vyRRuDGOe2wWgvpBj159jmi0RmM+7fyza1r6o+y8z9LcBJM9Qv4MI9HOsa4JrhRSdJ7TNIUrwV2FpVdzfrN9FLih9PckQzSjzIvLXn6PJFG4MY57bBaC+kGOZFFINo80Uisxn37+W4t0+StGezzimuqr8HtiR5VVN0EvAQc5+3JkmSJC1Jgw5nvRO4LskLgUfpzUV7AXOYtyZJkiQtVQMlxVV1L7B6hk1zmrcmSZIkLUU+5lmSJEmdZ1IsSZKkzjMpliRJUueZFEuSJKnzTIolSZLUeSbFkiRJ6jyTYkmSJHWeSbEkSZI6z6RYkiRJnWdSLEmSpM4zKZYkSVLnmRRLkiSp80yKJUmS1HkmxZIkSeq8gZLiJJuT3J/k3iSfb8oOSXJ7kkea94Ob8iS5PMmmJPclOW4hGyBJkiTtq7mMFP90VR1bVaub9fXAHVV1FHBHsw5wKnBU81oHXDmsYCVJkqSFsC/TJ9YAG5
vljcAZfeXXVs9dwPIkR+zDeSRJkqQFNWhSXMCnk9yTZF1TNlFV2wGa98Ob8hXAlr59tzZlkiRJ0pK0bMB6J1bVtiSHA7cn+dJe6maGsnpepV5yvQ5gYmKCqampAUP5gZ07d85rvzYY57YBTBwAFx2zayTnGvXnOMq2wWjbN+7fy3FvnyRpzwZKiqtqW/O+I8nNwPHA40mOqKrtzfSIHU31rcCRfbuvBLbNcMwNwAaA1atX1+Tk5JyDn5qaYj77tcE4tw3giutu4dL7B/0/2b7ZfM7kSM4zbZRtg9G2b9y/l+PePknSns06fSLJQUl+eHoZOBl4ALgVWNtUWwvc0izfCpzb3IXiBODp6WkWkiRJ0lI0yHDWBHBzkun6H6mqTyb5HHBjkguAx4Azm/q3AacBm4BngPOHHrUkSZI0RLMmxVX1KPDaGcq/CZw0Q3kBFw4lOknSjJJcA7wF2FFVr2nKfhX4T8DXm2rvrarbmm0XAxcA3wP+a1V9qik/BbgM2A+4qqouGWU7JGmp8Il2ktROHwZOmaH8g8095Y/tS4iPBs4CfrzZ50NJ9kuyH/D79O4vfzRwdlNXkjpndFcDSZKGpqr+MsmqAauvAW6oqn8G/i7JJnoXTANsan4RJMkNTd2HhhyuJC15jhRL0nh5R5L7klyT5OCmbE/3j/e+8pLUcKRYksbHlcD76d0b/v3ApcAvsuf7x880MPK8+8pDt+8tb9yj1eZ7vbf1MzfuHpNiSRoTVfX49HKSPwQ+3qzu7f7xs95Xvjl2Z+8tb9yj1eZ7vbf1MzfuHqdPSNKYaB6kNO1n6d1THnr3jz8ryQ8leQVwFPA3wOeAo5K8IskL6V2Md+soY5akpcKRYklqoSTXA5PAYUm2Au8DJpMcS28KxGbg7QBV9WCSG+ldQLcLuLCqvtcc5x3Ap+jdku2aqnpwxE3RPKxa/4lZ61x0zC7OG6DeIDZfcvpQjiMtZSbFktRCVXX2DMVX76X+bwK/OUP5bfQeuiRJneb0CUmSJHWeSbEkSZI6z6RYkiRJnWdSLEmSpM4zKZYkSVLnmRRLkiSp8wZOipPsl+SLST7erL8iyd1JHkny0ebG7zQ3h/9okk3N9lULE7okSZI0HHMZKX4X8HDf+geAD1bVUcCTwAVN+QXAk1X1SuCDTT1JkiRpyRooKU6yEjgduKpZD/BG4KamykbgjGZ5TbNOs/2kpr4kSZK0JA06Uvy7wC8D32/WDwWeqqpdzfpWYEWzvALYAtBsf7qpL0mSJC1Jsz7mOclbgB1VdU+SyeniGarWANv6j7sOWAcwMTHB1NTUIPE+x86dO+e1XxuMc9sAJg6Ai47ZNXvFIRj15zjKtsFo2zfu38txb58kac9mTYqBE4G3JjkNeBHwEnojx8uTLGtGg1cC25r6W4Ejga1JlgEvBZ7Y/aBVtQHYALB69eqanJycc/BTU1PMZ782GOe2AVxx3S1cev8gX799t/mcyZGcZ9oo2wajbd+4fy/HvX2SpD2bdfpEVV1cVSurahVwFvCZqjoHuBP4+abaWuCWZvnWZp1m+2eq6nkjxZIkSdJSsS/3Kf4V4D1JNtGbM3x1U341cGhT/h5g/b6FKEmSJC2sOf3GW1VTwFSz/Chw/Ax1/gk4cwixSZIkSSPhE+0kSZLUeSbFkiRJ6jyTYkmSJHWeSbEkSZI6z6RYkiRJnWdSLEmSpM4zKZYkSVLnmRRLkiSp80yKJUmS1HkmxZIkSeo8k2JJkiR1nkmxJEmSOs+kWJJaKMk1SXYkeaCv7JAktyd5pHk/uClPksuTbEpyX5Lj+vZZ29R/JMnaxWiLJC0FJsWS1E4fBk7ZrWw9cEdVHQXc0awDnAoc1bzWAVdCL4kG3ge8HjgeeN90Ii1JXTNrUpzkRUn+JsnfJnkwya815a9IcnczuvDRJC9syn+oWd/UbF+1sE2QpO6pqr8EntiteA2wsVneCJzRV35t9dwFLE9yBPBm4PaqeqKqngRu5/mJti
R1wrIB6vwz8Maq2plkf+CzSf4ceA/wwaq6IckfABfQG324AHiyql6Z5CzgA8C/X6D4JUk/MFFV2wGqanuSw5vyFcCWvnpbm7I9lT9PknX0RpmZmJhgampqzsHt3LlzXvsttqUY90XH7Jq1zsQBg9UbxCjbP8y4BzHMti3F78ogjLtn1qS4qgrY2azu37wKeCPwH5ryjcCv0kuK1zTLADcBv5ckzXEkSaOXGcpqL+XPL6zaAGwAWL16dU1OTs45iKmpKeaz32JbinGft/4Ts9a56JhdXHr/IGNfs9t8zuRQjjOIK667ZWhxD2KYbVuK35VBGHfPQN+6JPsB9wCvBH4f+ArwVFVN/1euf3Th2ZGHqtqV5GngUOAbux2zs6MOgxjntsFoRwJG/Tm2eZRjNuP+vRyD9j2e5IhmlPgIYEdTvhU4sq/eSmBbUz65W/nUCOKUpCVnoKS4qr4HHJtkOXAz8OqZqjXvA408dHnUYRDj3DYY7UjAKEc4oN2jHLMZ9+/lGLTvVmAtcEnzfktf+TuS3EDvorqnm8T5U8Bv9V1cdzJw8YhjlqQlYU7/clfVU0mmgBPoXaixrBktnh51gB+MSGxNsgx4Kc+/GESStA+SXE9vlPewJFvp3UXiEuDGJBcAjwFnNtVvA04DNgHPAOcDVNUTSd4PfK6p9+tVZX8tqZNmTYqTvAz4bpMQHwC8id7Fc3cCPw/cwPNHJNYCf91s/4zziSVpuKrq7D1sOmmGugVcuIfjXANcM8TQJKmVBhkpPgLY2MwrfgFwY1V9PMlDwA1JfgP4InB1U/9q4I+TbKI3QnzWAsQtSZIkDc0gd5+4D3jdDOWP0rvZ++7l/8QPfrKTJEmSljyfaCdJkqTOG90l8pIkSUvMqgHu+Tyoi47ZNes9pDdfcvrQzqfhcqRYkiRJnWdSLEmSpM4zKZYkSVLnmRRLkiSp80yKJUmS1HkmxZIkSeo8k2JJkiR1nkmxJEmSOs+kWJIkSZ1nUixJkqTOMymWJElS55kUS5IkqfNmTYqTHJnkziQPJ3kwybua8kOS3J7kkeb94KY8SS5PsinJfUmOW+hGSJIkSftikJHiXcBFVfVq4ATgwiRHA+uBO6rqKOCOZh3gVOCo5rUOuHLoUUuSJElDNGtSXFXbq+oLzfI/AA8DK4A1wMam2kbgjGZ5DXBt9dwFLE9yxNAjlyRJkoZk2VwqJ1kFvA64G5ioqu3QS5yTHN5UWwFs6dtta1O2fbdjraM3kszExARTU1NzDn7nzp3z2q8NxrltABMHwEXH7BrJuUb9OY6ybTDa9u144mmuuO6WkZ3vmBUvHdm5YPz/3kmS9mzgpDjJi4E/A95dVd9KsseqM5TV8wqqNgAbAFavXl2Tk5ODhvKsqakp5rNfG4xz2wCuuO4WLr1/Tv8nm7fN50yO5DzTRtk2GG37xrltMP5/7yRJezbQ3SeS7E8vIb6uqj7WFD8+PS2ied/RlG8FjuzbfSWwbTjhSpIkScM3yN0nAlwNPFxVv9O36VZgbbO8Frilr/zc5i4UJwBPT0+zkCRJkpaiQX4HPRF4G3B/knubsvcClwA3JrkAeAw4s9l2G3AasAl4Bjh/qBFLkiRJQzZrUlxVn2XmecIAJ81Qv4AL9zEuSZIkaWR8op0kSZI6z6RYksZMks1J7k9yb5LPN2U+hVSS9sKkWJLG009X1bFVtbpZ9ymkkrQXJsWS1A0+hVSS9mJ0d+GXJI1KAZ9OUsD/bh6W5FNI52kpxj3IUzOH+XTNUbZ/1E8FHaZBYl9q3yVYmt/xQQw7bpNiSRo/J1bVtibxvT3Jl/ZS16eQzmIpxn3e+k/MWueiY3YN7QmU4/zkzGEa5DMf9ZM6B7EUv+ODGHbcTp+QpDFTVdua9x3AzcDx+BRSSdork2JJGiNJDkryw9PLwMnAA/gUUknaq3b+PiFJ2pMJ4OYk0OvjP1JVn0zyOXwKqSTtkUmxJI2RqnoUeO0M5d/Ep5BK0h45fUKSJE
mdZ1IsSZKkzjMpliRJUufNmhQnuSbJjiQP9JUdkuT2JI807wc35UlyeZJNSe5LctxCBi9JkiQNwyAjxR8GTtmtbD1wR1UdBdzRrAOcChzVvNYBVw4nTEmSJGnhzJoUV9VfAk/sVrwG2NgsbwTO6Cu/tnruApZP3yxekiRJWqrmO6d4Yvrm7s374U35CmBLX72tTZkkSZK0ZA37PsWZoaxmrJisozfFgomJCaampuZ8sp07d85rvzYY57YBTBzQe0b8KIz6cxxl22C07RvntsH4/72TJO3ZfJPix5McUVXbm+kRO5ryrcCRffVWAttmOkBVbQA2AKxevbomJyfnHMTU1BTz2a8NxrltAFdcdwuX3j+aZ8dsPmdyJOeZNsq2wWjbN85tg/H/eydJ2rP5Tp+4FVjbLK8FbukrP7e5C8UJwNPT0ywkSZKkpWrWIZ8k1wOTwGFJtgLvAy4BbkxyAfAYcGZT/TbgNGAT8Axw/gLELEmSJA3VrElxVZ29h00nzVC3gAv3NShJkiRplHyinSRJkjrPpFiSJEmdZ1IsSZKkzjMpliRJUueZFEuSJKnzRncXfkmSJI3UqvWfmLXORcfs4rwB6s1m8yWn7/MxFpNJ8RJ1/9eeHsoXdFBt/yJL0u4GSQYGMUjCYB8qtZ/TJyRJktR5JsWSJEnqPJNiSZIkdZ5JsSRJkjrPpFiSJEmdZ1IsSZKkzjMpliRJUuctyH2Kk5wCXAbsB1xVVZcsxHkkjbdh3Wd2UB8+5aCRnm8psd+W1HVDHylOsh/w+8CpwNHA2UmOHvZ5JEnDYb8tSQszUnw8sKmqHgVIcgOwBnho2CfyqW+SNBQj6bdH3WdL0lwsRFK8AtjSt74VeP0CnEeSNBz225L2WdunvKWqhnvA5EzgzVX1S83624Djq+qdu9VbB6xrVl8FfHkepzsM+MY+hLuUjXPbYLzbZ9vaaz7te3lVvWwhghmVQfrtjvfZxj1abY0b2ht71+Kesd9eiJHircCRfesrgW27V6qqDcCGfTlRks9X1ep9OcZSNc5tg/Fun21rr3Fv317M2m93uc827tFqa9zQ3tiNu2chbsn2OeCoJK9I8kLgLODWBTiPJGk47Lcldd7QR4qraleSdwCfondrn2uq6sFhn0eSNBz225K0QPcprqrbgNsW4ti72aef8pa4cW4bjHf7bFt7jXv79mhE/XZbP1/jHq22xg3tjd24WYAL7SRJkqS28THPkiRJ6rxWJsVJXpTkb5L8bZIHk/zaYsc0bEn2S/LFJB9f7FiGKcnmJPcnuTfJ5xc7nmFKsjzJTUm+lOThJD+52DENS5JXNX9m069vJXn3Ysc1LEn+W9OXPJDk+iQvWuyYxknb++w29sdt7Wvb2I+2uX9sa9+X5F1NzA8O87Nu5fSJJAEOqqqdSfYHPgu8q6ruWuTQhibJe4DVwEuq6i2LHc+wJNkMrK6qNt4Pca+SbAT+qqquaq7gP7CqnlrsuIateSTw14DXV9VXFzuefZVkBb0+5Oiq+sckNwK3VdWHFzey8dH2PruN/XFb+9q296Nt6h/b2vcleQ1wA70ncX4H+CTwX6rqkX09ditHiqtnZ7O6f/NqX3a/B0lWAqcDVy12LBpMkpcAbwCuBqiq77SpI5+jk4CvLPUOf46WAQckWQYcyAz3Vtf8tbnPtj8enTHpR9vWP7ax73s1cFdVPVNVu4C/AH52GAduZVIMz/6cdS+wA7i9qu5e7JiG6HeBXwa+v9iBLIACPp3knuYJWePix4CvA3/U/Mx6VZLhPn9y6TgLuH6xgxiWqvoa8NvAY8B24Omq+vTiRjV+Wtxnt7U/bmNfOw79aGv6xxb3fQ8Ab0hyaJIDgdN47sOH5q21SXFVfa+qjqX35KXjm+H01kvyFmBHVd2z2LEskBOr6jjgVODCJG9Y7ICGZBlwHHBlVb0O+DawfnFDGr7m58y3An+62LEMS5KDgTXAK4AfAQ5K8h8XN6rx08Y+u+X9cRv72l
b3o23rH9va91XVw8AHgNvpTZ34W2DXMI7d2qR4WvPTyhRwyiKHMiwnAm9t5oPdALwxyZ8sbkjDU1XbmvcdwM305gSNg63A1r7Rr5vode7j5lTgC1X1+GIHMkRvAv6uqr5eVd8FPgb81CLHNLZa1me3tj9uaV/b9n60bf1ja/u+qrq6qo6rqjcATwD7PJ8YWpoUJ3lZkuXN8gH0/mC/tLhRDUdVXVxVK6tqFb2fYT5TVUv+f26DSHJQkh+eXgZOpvczSOtV1d8DW5K8qik6CXhoEUNaKGfTkp8G5+Ax4IQkBzYXhJ0EPLzIMY2VtvbZbe2P29rXjkE/2rb+sbV9X5LDm/cfBX6OIX3uC/JEuxE4AtjYXOX5AuDGqmrNrXI6bAK4ufd3j2XAR6rqk4sb0lC9E7iu+QntUeD8RY5nqJq5Wz8DvH2xYxmmqro7yU3AF+j9BPdF2vt0p6XKPnu02tzXtrIfbWP/2PK+78+SHAp8F7iwqp4cxkFbeUs2SZIkaZhaOX1CkiRJGiaTYkmSJHWeSbEkSZI6z6RYkiRJnWdSLEmSpM4zKZYkSVLnmRRLkiSp80yKJUmS1Hn/H9dfzsrdRwrNAAAAAElFTkSuQmCC\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["import matplotlib.pyplot as plt\n", "fig, ax = plt.subplots(1, 2, figsize=(12, 3))\n", "df[df['color'] == 0]['quality'].hist(ax=ax[0])\n", "df[df['color'] == 1]['quality'].hist(ax=ax[1])\n", "ax[0].set_title('rouge')\n", "ax[1].set_title('blanc');"]}, {"cell_type": "markdown", "metadata": {}, "source": ["On construit le jeu de donn\u00e9es. D'un c\u00f4t\u00e9, ce qu'on sait - les features X -, d'un autre ce qu'on cherche \u00e0 pr\u00e9dire."]}, {"cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [{"data": {"text/plain": ["Index(['fixed_acidity', 'volatile_acidity', 'citric_acid', 'residual_sugar',\n", " 'chlorides', 'free_sulfur_dioxide', 'total_sulfur_dioxide', 'density',\n", " 'pH', 'sulphates', 'alcohol', 'quality', 'color'],\n", " dtype='object')"]}, "execution_count": 10, "metadata": {}, "output_type": "execute_result"}], "source": ["df.columns"]}, {"cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": ["X = df.drop(\"quality\", axis=1)\n", "y = df['quality']"]}, {"cell_type": "markdown", "metadata": {}, "source": ["On divise en apprentissage / test puisqu'il est de coutume d'apprendre sur des donn\u00e9es et de v\u00e9rifier les pr\u00e9dictions sur un autre."]}, {"cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": ["from sklearn.model_selection import train_test_split\n", "X_train, X_test, y_train, y_test = train_test_split(\n", " X, y, random_state=42)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["On cale un premier mod\u00e8le, une r\u00e9gression lin\u00e9aire."]}, {"cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [{"data": {"text/plain": ["LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)"]}, "execution_count": 13, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.linear_model import 
LinearRegression\n", "clr = LinearRegression()\n", "clr.fit(X_train, y_train)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["On r\u00e9cup\u00e8re les coefficients."]}, {"cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([ 9.55561389e-02, -1.53182004e+00, -9.60658321e-02, 6.51351208e-02,\n", " -3.21323223e-01, 6.06114885e-03, -1.60663994e-03, -1.05342354e+02,\n", " 5.14593092e-01, 7.84057766e-01, 2.32175504e-01, -3.29941606e-01])"]}, "execution_count": 14, "metadata": {}, "output_type": "execute_result"}], "source": ["clr.coef_"]}, {"cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [{"data": {"text/plain": ["105.86698928549437"]}, "execution_count": 15, "metadata": {}, "output_type": "execute_result"}], "source": ["clr.intercept_"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Puis on calcule le coefficient $R^2$."]}, {"cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.26585260463659766"]}, "execution_count": 16, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.metrics import r2_score\n", "pred = clr.predict(X_test)\n", "r2_score(y_test, pred)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Ou l'erreur moyenne en valeur absolue."]}, {"cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.5682450595415709"]}, "execution_count": 17, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.metrics import mean_absolute_error\n", "mean_absolute_error(y_test, clr.predict(X_test))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Le mod\u00e8le se trompe en moyenne d'un demi-point pour la note."]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Arbre de r\u00e9gression\n", "\n", "Voyons ce qu'un arbre de r\u00e9gression peut faire."]}, {"cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": 
[{"data": {"text/plain": ["DecisionTreeRegressor(ccp_alpha=0.0, criterion='mse', max_depth=None,\n", " max_features=None, max_leaf_nodes=None,\n", " min_impurity_decrease=0.0, min_impurity_split=None,\n", " min_samples_leaf=10, min_samples_split=2,\n", " min_weight_fraction_leaf=0.0, presort='deprecated',\n", " random_state=None, splitter='best')"]}, "execution_count": 18, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.tree import DecisionTreeRegressor\n", "dt = DecisionTreeRegressor(min_samples_leaf=10)\n", "dt.fit(X_train, y_train)"]}, {"cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.2423038841196432"]}, "execution_count": 19, "metadata": {}, "output_type": "execute_result"}], "source": ["r2_score(y_test, dt.predict(X_test))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The regression tree shows why it is useful to have both a training set and a test set, since this model can reproduce exactly the data it was fitted on. On the test set, by contrast, its predictive performance is rather poor."]}, {"cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.6288975399079262"]}, "execution_count": 20, "metadata": {}, "output_type": "execute_result"}], "source": ["r2_score(y_train, dt.predict(X_train))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["To avoid this, we tune the parameter *min_samples_leaf*. It means that each prediction of the regression tree is an average of at least *min_samples_leaf* scores taken from the training set. This makes overfitting much less likely."]}, {"cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 49/49 [00:03<00:00, 14.25it/s]\n"]}, {"data": {"text/html": ["
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   minl  r2_train_dt  r2_test_dt  r2_train_reg  r2_test_reg
0     1     1.000000    0.050087       0.30522     0.265853
1     2     0.930993    0.130765       0.30522     0.265853
\n", "
"], "text/plain": [" minl r2_train_dt r2_test_dt r2_train_reg r2_test_reg\n", "0 1 1.000000 0.050087 0.30522 0.265853\n", "1 2 0.930993 0.130765 0.30522 0.265853"]}, "execution_count": 21, "metadata": {}, "output_type": "execute_result"}], "source": ["import pandas\n", "from sklearn.ensemble import RandomForestRegressor\n", "from tqdm import tqdm\n", "res = []\n", "for i in tqdm(range(1, 50)):\n", " dt = DecisionTreeRegressor(min_samples_leaf=i)\n", " reg = LinearRegression()\n", " dt.fit(X_train, y_train)\n", " reg.fit(X_train, y_train)\n", " r = {\n", " 'minl': i,\n", " 'r2_train_dt': r2_score(y_train, dt.predict(X_train)),\n", " 'r2_test_dt': r2_score(y_test, dt.predict(X_test)),\n", " 'r2_train_reg': r2_score(y_train, reg.predict(X_train)),\n", " 'r2_test_reg': r2_score(y_test, reg.predict(X_test)),\n", " }\n", " res.append(r)\n", "df = pandas.DataFrame(res)\n", "df.head(2)"]}, {"cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [{"data": {"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAEGCAYAAAB1iW6ZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXxU5dn/8c+Vyb5DQgKELWBYAkhABGUREUWtFNGKinXhccG2gvantXVpLaL0cXn61Gq1ilr0qVjcFUXrhoKAoKFEgYRdhJCQhADZt5m5f3+cSQiQkAmZySQz1/v1mtfMnDlz5jqT5Dt37nPPfcQYg1JKqc4vyNcFKKWU8gwNdKWU8hMa6Eop5Sc00JVSyk9ooCullJ8I9tULJyYmmn79+vnq5ZVSqlPasGHDQWNMt6Ye81mg9+vXj8zMTF+9vFJKdUoi8mNzj2mXi1JK+QkNdKWU8hMa6Eop5Sd81oeulOrY6urqyM3Npbq62telBKTw8HB69epFSEiI28/RQFdKNSk3N5eYmBj69euHiPi6nIBijKG4uJjc3FxSU1Pdfl6LXS4i8g8RKRSRzc08LiLypIjsFJHvRWRUK+pWSnVQ1dXVJCQkaJj7gIiQkJDQ6v+O3OlDfwm46CSPXwykuS5zgL+3qgKlVIelYe47p/LetxjoxphVwKGTrHIp8H/Gsg6IF5Eera7ETd/nHuHRf29Fp/1VSqljeWKUSwqwr9H9XNeyE4jIHBHJFJHMoqKiU3qxrH1H+PuXu/jP3sOn9HyllPJXngj0pv4vaLL5bIxZZIwZbYwZ3a1bk99cbdHPRvUiNjyYF1f/cErPV0p1TpWVlVxyySUMHjyYoUOHcs8995x0/XfffZfs7OxWv86yZct45JFHTrXMBrNnz+bNN98E4IknnqCysrLN22yJJwI9F+jd6H4vIM8D221SVFgws8b24d+bD7DvkPffIKVUx2CM4c4772Tr1q1s3LiRNWvW
8NFHHzW7/skC3W63N/u86dOnt/hh0VrtFeieGLa4DJgrIkuBsUCJMSbfA9tt1g1n9+OFr37g5bV7+P20dG++lFIKePD9LWTnlXp0m+k9Y/njT4eedJ09e/Zw8cUXM3nyZL7++mveffddAEJDQxk1ahS5ublNPm/t2rUsW7aMlStX8vDDD/PWW29x0003MW7cONasWcP06dMZOHAgDz/8MLW1tSQkJLBkyRKSk5N56aWXyMzM5G9/+xuzZ88mNjaWzMxMDhw4wGOPPcYVV1zR5GsaY5g3bx4rVqwgNTW14Tjfk08+SV5eHpMnTyYxMZEvvviiDe/aybkzbPFfwNfAIBHJFZGbROQXIvIL1yofAruBncDzwK+8Vq1Lz/gIfjK8B699u4/ymuY/aZVSnd+2bdu4/vrr2bhxI3379gXgyJEjvP/++0yZMqXJ54wbN47p06fz+OOPk5WVxYABAxqet3LlSu666y4mTJjAunXr2LhxI1dffTWPPfZYk9vKz89n9erVfPDBBydtub/zzjts27aNTZs28fzzz7N27VoAbr/9dnr27MkXX3zh1TAHN1roxphZLTxugNs8VpGbbpqQyvvf5fH6t/u4cYL7A++VUq3XUkvam/r27ctZZ53VcN9utzNr1ixuv/12+vfv36ptXXXVVQ23c3Nzueqqq8jPz6e2trbZL/DMmDGDoKAg0tPTKSgoaHbbq1atYtasWdhsNnr27Ml5553Xqto8odPO5ZLRO57RfbuweO0POJw6hFEpfxUVFXXM/Tlz5pCWlsavf/3rNm1r3rx5zJ07l02bNvHcc881+yWesLCwhtstDZf29bj9ThvoYLXS9x2q4tPs5j81lVL+4/e//z0lJSU88cQTLa4bExNDWVlZs4+XlJSQkmKNsH755ZfbXNs555zD0qVLcTgc5OfnH9O90lItntKpA33q0O706hLBP3QIo1J+Lzc3l4ULF5Kdnc2oUaPIyMjghRdeaHb9q6++mscff5yRI0eya9euEx6fP38+M2fOZOLEiSQmJra5vssuu4y0tDSGDx/OL3/5SyZNmtTw2Jw5cxoO7nqT+Oobl6NHjzaeOGPRC1/t5uHlObw/dwLDe8V5oDKlFEBOTg5DhgzxdRkBramfgYhsMMaMbmr9Tt1CB7jqzN5EhwXz4urdvi5FKaV8qtNPnxsTHsKVo3vzf1/v4Z6Lh9A9LtzXJSml2tHChQt54403jlk2c+ZM7r//fq+83qZNm7juuuuOWRYWFsb69eu98nqt0em7XAD2Hapk0uNf8ItJA/jtRYM9sk2lAp12ufhewHW5APTuGsnU9O68+s1eqmodvi5HKaV8wi8CHeCmiakcqazjzf80/VVgpZTyd34T6KP7diGjdzwvfLVbv2iklApIfhPoIsKt5/Tnx+JKPt5ywNflKKVUu/ObQAfri0b9EiJ5buUuPaORUn6mveZDB8jKyuLDDz9s1XP69evHwYMHOXLkCM8888wpvW5b+VWg24KEmyf257vcEtb/cLKz5imlOhtPzofeklMJ9Hq+DPROPw79eFec0Yu/fLqd51bu4qz+Cb4uRyn/8NE9cGCTZ7fZfThcfPIzA3lyPnSA2267jaKiIiIjI3n++ecZPHgwb7zxBg8++CA2m424uDg+++wzHnjgAaqqqli9ejX33nvvMbM01isuLmbWrFkUFRUxZsyYhl6Be+65h127dpGRkcEFF1zA448/3pZ3qVX8LtDDQ2zMHtePP3+6nW0HyhjUPcbXJSml2mDbtm0sXrz4mFZv/Xzod9xxR5PPqZ8Pfdq0aQ0npJgyZQrPPvssaWlprF+/nl/96lesWLGCBQsW8PHHH5OSksKRI0cIDQ1lwYIFDSe5aM6DDz7IhAkTeOCBB1i+fDmLFi0C4JFHHmHz5s1kZWV58F1wj98FOsC1Z/XlmS93sWjVbv585Qhfl6NU59dCS9qbPDEfenl5OWvXrmXmzJkNy2pqagAYP348s2fP5sorr+Ty
yy93u65Vq1bx9ttvA3DJJZfQpUsXt5/rLX4Z6F2iQrnqzN68su5HfnPhQHrERfi6JKXUKfLEfOhOp5P4+PgmW83PPvss69evZ/ny5WRkZLSqZe3r+c+P51cHRRu7aUIqBli8Zo+vS1FKecipzoceGxtLampqw5wvxhi+++47AHbt2sXYsWNZsGABiYmJ7Nu3z635y8855xyWLFkCwEcffcThw4dPeN325reB3rtrJJcM78Gr6/dSUlXn63KUUm3U1vnQlyxZwosvvsiIESMYOnQo7733HgB33303w4cPZ9iwYZxzzjmMGDGCyZMnk52dTUZGBq+99lqT2//jH//IqlWrGDVqFJ988gl9+vQBICEhgfHjxzNs2DDuvvtuz78RJ+EXk3M1Z/P+EqY9tZrfXTSYX547wKuvpZS/0cm5fC8gJ+dqzrCUOCaclsjiNT9QY9dJu5RS/s2vAx3g1kn9KSyr4b2Neb4uRSnlBQsXLiQjI+OYy8KFCz22/cWLF5+w/dtuu81j2/ckv+5yAevgx7SnVnOkso7P7pxERKjN66+plD/QLhff0y6X44gID0xLZ/+RKp5ascPX5SillNf4faADjO2fwM9G9eL5r3azs9A3w4mUUsrbAiLQAe77yWAiQ4P5/bubdSZGpZRfCphAT4gO47cXDWLd7kO8m7Xf1+UopZTHBUygA8w6sw8ZveNZuDyHkkr9spFSnUl7zYe+bNkyHnnEd3PXtEVABXpQkPDwjGEcqqjl8U+2+rocpVQreHI+dLvd3uzzpk+f3uKHRUscDt9878UvJ+c6mWEpcdwwrh8vrd3DFWf0JqN3vK9LUqrDe/SbR9l6yLONoMFdB/O7Mb876TqenA/9pptuYty4caxZs4bp06czcOBAHn74YWpra0lISGDJkiUkJyfz0ksvNUydO3v2bGJjY8nMzOTAgQM89thjDdPxHu/LL7/kwQcfpEePHmRlZZGdnc0rr7zCk08+SW1tLWPHjuWZZ57BZrPx4osv8uijj9KzZ0/S0tIICws76VS97gqoFnq9Oy8YSLfoMH7/7iY9obRSHdy2bdu4/vrr2bhxI3379gWOzoc+ZcqUJp9TPx/6448/TlZWFgMGDGh43sqVK7nrrruYMGEC69atY+PGjVx99dU89thjTW4rPz+f1atX88EHH7TYcv/mm28a5pvJycnhtddeY82aNWRlZWGz2ViyZAl5eXk89NBDrFu3jk8//ZStWz33QRlwLXSAmPAQ/jAtnXn/2sgr637khnH9fF2SUh1aSy1pb/LEfOj1Gp95KDc3l6uuuor8/Hxqa2tJTU1t8jkzZswgKCiI9PR0CgoKTrr9MWPGNGzn888/Z8OGDZx55pkAVFVVkZSUxDfffMOkSZPo2rUrADNnzmT79u2t2o/mBGQLHWDa6T2YmJbI/3y8jdzDlb4uRynVDE/Mh97UtubNm8fcuXPZtGkTzz33HNXV1U0+JywsrOF2S0OeG2/fGMMNN9xAVlYWWVlZbNu2jfnz53t12LRbgS4iF4nINhHZKSIn/M8hIn1E5AsR2Sgi34vITzxfqmeJCAtnDMcA/++1LOwOp69LUkq14FTnQ29KSUkJKSkpALz88sseq7HelClTePPNNyksLATg0KFD/Pjjj4wZM4aVK1dy+PBh7HZ7w/lOPaHFQBcRG/A0cDGQDswSkfTjVvs98LoxZiRwNeCbU163Up+ESB6eMYxv9xzmqRU7fV2OUuok2jof+vHmz5/PzJkzmThxIomJiR6vNz09nYcffpipU6dy+umnc8EFF5Cfn09KSgr33XcfY8eO5fzzzyc9PZ24uDiPvGaLk3OJyNnAfGPMha779wIYY/670TrPAbuNMY+61v+zMWbcybbbXpNzuePO17J4N2s/S+eczZjUrr4uR6kOQSfn8p7y8nKio6Ox2+1cdtll3HjjjVx22WUnrOeNyblSgH2N7ue6ljU2H7hWRHKBD4F5TW1IROaISKaIZBYVFbnx
0u1jwYxh9Okaya+XbtQvHCmlvG7+/PlkZGQwbNgwUlNTmTFjhke2684ol6bOgnp8s34W8JIx5s+uFvo/RWSYMeaYjmljzCJgEVgt9FMp2Buiw4L569Uj+dnf13LP29/zzM9HdbiTvyqlmrZw4cKGc4XWmzlzJvfff79XXm/Tpk1cd911xywLCwtj/fr1bm/jf/7nfzxdFuBeoOcCvRvd7wUcf7aIm4CLAIwxX4tIOJAIFHqiyPYwonc8v7lwEI98tJWl3+5j1pg+vi5JKeWG+++/32vh3ZThw4eTlZXVbq/XGu50uXwLpIlIqoiEYh30XHbcOnuBKQAiMgQIBzpOn4qb5kzsz4TTEnnw/S06za5SqtNpMdCNMXZgLvAxkIM1mmWLiCwQkemu1e4CbhGR74B/AbNNJ5yjNihI+N8rRxAZGsy8f2VRXafnIVVKdR5ujUM3xnxojBlojBlgjFnoWvaAMWaZ63a2MWa8MWaEMSbDGPOJN4v2pqTYcP48cwQ5+aU89EHrZ2pTSilfCdhvip7M5MFJ/GLSAJas38ubG5qe/EcppToaDfRm/GbqQM7un8D972xiS16Jr8tRKuC113zoAFlZWXz44Yen9Fxf0kBvRrAtiKeuGUmXyFB+8coGHZ+ulI95cj70lrQ20E82v3p7CsjZFt2VGB3GM9eO4qrnvubXr23kxRvOJChIx6erwHPgT3+iJsez86GHDRlM9/vuO+k6npwPHeC2226jqKiIyMhInn/+eQYPHswbb7zBgw8+iM1mIy4ujs8++4wHHniAqqoqVq9ezb333nvMLI315s+fT15eHnv27CExMZF//vOf3HPPPXz55ZfU1NRw2223ceutt+J0Opk7dy4rV64kNTUVp9PJjTfe2Oy86m2hgd6CUX268MC0dP7w3haeWrGTO85P83VJSgWUbdu2sXjxYp555ugUUfXzod9xxx1NPqd+PvRp06Y1BOeUKVN49tlnSUtLY/369fzqV79ixYoVLFiwgI8//piUlBSOHDlCaGgoCxYsaDjJxcls2LCB1atXExERwaJFi4iLi+Pbb7+lpqaG8ePHM3XqVDZs2MCePXvYtGkThYWFDBkyhBtvvNFzb1AjGuhuuPasvmzce4QnPt/OiN5xnDsoydclKdWuWmpJe5Mn5kMvLy9n7dq1zJw5s2FZTU0NAOPHj2f27NlceeWVXH755a2qbfr06URERADwySef8P333/Pmm28C1myOO3bsYPXq1cycOZOgoCC6d+/O5MmTW/UaraGB7gYRYeFlw8nOL+WOpVl8MG8CvbtG+rospQKCJ+ZDdzqdxMfHN/kNz2effZb169ezfPlyMjIyWvUt0OPnP3/qqae48MILj1ln+fLlbm+vrfSgqJsiQm08e+0ZOI3hxpe+pbCs6cnwlVLec6rzocfGxpKamtow54sxhu+++w6AXbt2MXbsWBYsWEBiYiL79u1rcS71plx44YX8/e9/p67OGkCxfft2KioqmDBhAm+99RZOp5OCggK+/PLLVm23NTTQW6FfYhSLrhtN7uEqrn5uHQdKNNSVai9tnQ99yZIlvPjii4wYMYKhQ4fy3nvvAXD33XczfPhwhg0bxjnnnMOIESOYPHky2dnZZGRk8Nprr7lV380330x6ejqjRo1i2LBh3Hrrrdjtdn72s5/Rq1evhmVjx4712Pznx2txPnRv6UjzobfWt3sO8V+Lv6VrVCiv3jKWXl20+0X5H50P3XPq5z8vLi5mzJgxrFmzhu7du7f4PG/Mh66Oc2a/rrxy81iOVNZy1XPr+LG4wtclKaU6sGnTppGRkcHEiRP5wx/+4FaYnwo9KHqKMnrH8+otZ3Hdi+u58rmvefWWsxjQLdrXZSkVcLw9H/rixYv561//esyy8ePH8/TTT7u9DW/2mzemXS5ttO1AGT9/YR0gLLl5LIO6x/i6JKU8Iicnh8GDB+vJXnzEGMPWrVu1y6U9Deoew9I5Z2MLgqsXfc3m/Trv
i/IP4eHhFBcX0wlnwu70jDEUFxcTHh7equdpC91Dfiyu4Jrn11NaXcdL/zWGM/p28XVJSrVJXV0dubm5VFfraC5fCA8Pp1evXoSEhByz/GQtdA10D9p/pIqfP7+OwrIaXrzhTM4ekODrkpRSfka7XNpJSnwEr996NinxEcxe/A0rt3e6s/AppToxDXQPS4oNZ+kca8TLLS9n8smWA74uSSkVIDTQvSAhOox/3XIW6T1j+eWS/7Dsuzxfl6SUCgAa6F4SFxnCKzePZXTfLtyxdCPPrtyF06mjBZRS3qOB7kXRYcG89F9juHhYdx75aCvX/+MbCkt1xIBSyjs00L0sItTG09eM4r8vH07mj4e46K9f8XlOga/LUkr5IQ30diAizBrThw/mTaR7bDg3vZzJ/GVbqK5z+Lo0pZQf0UBvR6clRfPObeO4cXwqL63dw4yn17C9oHVzLiulVHM00NtZWLCNB36azuL/OpOishou/dsa3sva7+uylFJ+QAPdRyYPSuKjOyYyLCWWO5ZmMX/ZFuocTl+XpZTqxDTQfSgpNpxXbzmroQtm1qJ1OgpGKXXKNNB9LMQWxAM/TefJWSPZklfKT55czfrdxb4uSynVCWmgdxDTR/TkvbnjiQkP5poX1rNo1S5q7DoKRinlPg30DmRgcgzvzR3PlMFJ/OnDrZz93yv47w9z2HNQT3GnlGqZTp/bARljWL3zIK+u38sn2QU4nIaJaYlcM6YP56cnE2LTz2GlApXOh96JFZRW8/q3+/jXN3vJK6kmKSaMWycN4Lqz+hIarMGuVKBp83zoInKRiGwTkZ0ick8z61wpItkiskVEXm1Lweqo5Nhw5k1J46vfnceLN4zmtKRoHvogmwufWMWn2QV6ejClVIMWW+giYgO2AxcAucC3wCxjTHajddKA14HzjDGHRSTJGFN4su1qC/3UGGP4clsRDy/PZldRBeMGJHD/JUMY2jPO16UppdpBW1voY4CdxpjdxphaYClw6XHr3AI8bYw5DNBSmKtTJyJMHpzEv399DgsuHUpOfinTnlrNb9/8jrwjVb4uTynlQ8FurJMC7Gt0PxcYe9w6AwFEZA1gA+YbY/59/IZEZA4wB6BPnz6nUq9yCbEFcf3Z/bg0I4W/rdjBS2v38HpmLiN6xzM1PZnzhyQzMDkaEfF1qUqpduJOl8tM4EJjzM2u+9cBY4wx8xqt8wFQB1wJ9AK+AoYZY440t13tcvGsfYcqeS9rP5/mFPLdPutt7901gvOHJDM1vTtjU7sSFKThrlRnd7IuF3da6LlA70b3ewHHn1MtF1hnjKkDfhCRbUAaVn+7age9u0Yy97w05p6XRmFpNZ9vLeTT7AKWrN/L4jV76NUlgitH92bm6F70iIvwdblKKS9wp4UejHVQdAqwHyukrzHGbGm0zkVYB0pvEJFEYCOQYYxp9jvs2kJvH5W1dj7NLuD1zH2s2VlMkMCkgd246sw+TBmSpGPalepk2tRCN8bYRWQu8DFW//g/jDFbRGQBkGmMWeZ6bKqIZAMO4O6ThblqP5GhwVyakcKlGSnsLa7kjQ37eD1zH794ZQOJ0aFcfWYfrj+7L0mx4b4uVSnVRvrFogBkdzhZtaOIV9fv5fOthQQHCT89vSc3TkhlWIoOf1SqI2trH7ryM8G2IM4bnMx5g5PZc7CCxWt+4I0Nuby9cT9jU7ty04RUpgxJxqYHUZXqVLSFrgAoqapj6Td7eXntHvJKqkmJj+DyUSlcNjKF/t2ifV2eUspF53JRbqtzOPn35gOug6gHcRoY0Tuey0em8NMRPekaFerrEpUKaBro6pQUlFbzXtZ+3v7PfrYeKCM4SDhvcBK3TurPGX27+ro8pQKSBrpqs+y8Ut7ZmMtb/9nPoYpazu6fwLzzTuPsAQn6bVSl2pEGuvKYylo7r67fy6JVuyksq2FUn3jmnZfGuYO6abAr1Q400JXHVdc5eGNDLs9+uYv9R6oYlhLL5SN7Me60BAYm
xeg0A0p5iQa68ppau5N3s/azaNVudhaWA9A1KpSz+ydw1oAExg1IoH9ilLbelfIQDXTVLnIPV/L1rmK+3l3M17uKyS+pBiAlPoIZI3ty2chenJakQyCVagsNdNXujDH8WFzJml0H+XhLAat3FOE0cHqvOGZkpDA9oyeJ0WG+LlOpTkcDXflcYWk1y77L452N+9mSV4otSJhwWiLnDurG+NMSSUvSuduVcocGuupQtheU8e7G/SzflM+PxZUAdIsJY/yABMaflsj40xLpGa9T/CrVFA101WHtO1TJmp0HWbOrmLU7D1JcUQvAoOQYLkhPZurQZIanxGnrXSkXDXTVKTidhm0FZazecZDPcgr4ds8hnAa6x4ZzfnoSU9O7c1b/BEKDdQ53Fbg00FWndKiilhVbC/k0+wArtxdRXeckJiyYyYOTmDo0mUkDuxETHuLrMpVqVxroqtOrrnPw1Y6DfJp9gM9zCimuqCXUFsS40xKYmt6dKUOSSIoJ064Z5fc00JVfcTgN/9l7mE+2HOCT7IKGA6tRoTZ6xEfQMz6CnnHh9IyPoEdcOD3iIkiODSMpJpzYiGANfdWpaaArv2WMYXtBOat3HmT/4SryjlSRV1JF3pFqDpbXnLB+WHAQybHhVsDHhtPddTvZdbt7XDjJseGEh9h8sDdKtUzPWKT8logwqHsMg7rHnPBYdZ2DAyXVFJRWU1BWQ2FpNYVlNdb90mqy80pZkVNIVZ3jhOf2iAunX0IUqd2i6J8YRWpiFP0So+jTNVJPrK06LA105bfCQ2z0cwVxc4wxlNXYKSip5kBpNQWlNeQfqeKH4gr2HKzgo035HK6sa1g/OEjo0zWS/t2i6N8tmv6J1nVaUjRd9OQfysc00FVAExFiw0OIDQ8hLfnEVj7AkcpafjhYwe6iCn44WMGuonJ2F1WwasdBau3OhvW6x4YzpEcMQ3rENlxSE6P03Kyq3WigK9WC+MhQRvYJZWSfLscsdzgN+w9XsetgOTsKysjJLyMnv5SvdhzE7rSOTYWHBNEvIYq+CZH0S4wiNSGKvglWF05yrI7KUZ6lga7UKbIFCX0SIumTEMnkQUkNy2vsDnYUlJOTX8q2A2XsKa5gZ2E5X2wtotZxtEUf4eoSqu+jT020+uz7do2ka1Sohr1qNQ10pTwsLNjGsJQ4hqXEHbPc4TTkHalij6t//oeDlfxwsJwteSX8e8sBHM6jI85CbUF0iwlrGIFTf0mKCSMpNoxuMdYwzC6RIRr8qoEGulLtxBYk9O4aSe+ukUxM63bMY7V2J7mHK/nhYAV7D1VSUGqNyikoq2a7azqEshr7CdsMsQndosPo5gp7K+iPBn58ZAgCWJlvBb8I2ESIiwiha3QoMWE6Nt9faKAr1QGEBgdZo2a6NX8CkIoaO0VlNRSW1VBYVk1h6dHbRWU17DtUyYYfD3PINcGZu4KDhC5RoSREhdIlMpQBSVGc0bcLo/t2pVeXCA37TkQDXalOIiosmKiw4JMOwwSoczg5WF5DUVkNJVV1GAMGa4gmWLedTsORyjoOV9ZyqOLopbiilnc35vHKur2ANa3x6L5dOKNvF9J7xBIVFkxkqI2IUBtRocFEhNoICw7S0O8gNNCV8jMhtiB6xEXQI+7U5pR3OA3bDpSxYe9hNuw5ROaPh/lo84Fm17cFCVGhNmLCQ4gJDyY2PITo8GBiwoNJiAqjZ7w1/UKP+HB6xkXQLSZMh3J6iQa6UuoYtiAhvWcs6T1jue6svgAUlFazq7CcyloHlXUOqmrt1u1aB5W1dsqr7ZRV2ymttlNeU0dBaTU7C60uouO/iRscJMRHhiICR2ceOXpAOEiE4CAh2BZEcJBgc116xIUzvFc8p6fEcXqvOJJiw9vpHek8NNCVUi2qH2XTWsYYSqrqyC+pJt81x05+SRWHKo5++7a+t0Y42h1kdxocTkOdw+m6Nuw7VMnK7TuoHwyUHBvG8JR4BiZHExMeQnSYzdUlFEx0WDBR
YTbiI0PpGhVKbHhgHPjVQFdKeY2I1RqPjwxlSI/YNm+vstZOdl4p3+eWsGm/dVmxtQBnC3MM1v9XkBAVSpeoEOIiQhq6iGLCgq0PBFc3UWx917SfppYAABLASURBVFHE0S6kzjJZmwa6UqrTiAwNZnS/rozu17VhmTGGqjoHFTUOKmrsVLi6g8qr7Q0HfY8/+LvnYCVl1XWU1dgpr7HT0qSzobYgQoODCLYJwUFBhNiEYJsQEhREWIit4b+DqLBgokODiXZ9EHSPC6NHXAQ948PpHhdBdJh3I1cDXSnVqYkIkaFWV0u3mLBWP9/pNFTWOayAr7ZTVl1HaZWd0uo6SqvqKK22btfZDXankzqHwe5wYnd1CVXXOamosXOoopa9xZWU19hdHywnzuIZEx5Mz7gIbp+SxiWn9/DE7h/DrUAXkYuAvwI24AVjzCPNrHcF8AZwpjFGJztXSnV4QUFCdJjV794jruX13VVrd1JQWn3C8YP8kmqiw73Tlm5xqyJiA54GLgBygW9FZJkxJvu49WKA24H13ihUKaU6k9DgoIZvBrcXd2bqHwPsNMbsNsbUAkuBS5tY7yHgMaDag/UppZRykzuBngLsa3Q/17WsgYiMBHobYz442YZEZI6IZIpIZlFRUauLVUop1Tx3Ar2pwZsNx4RFJAj4C3BXSxsyxiwyxow2xozu1q1bS6srpZRqBXcCPRfo3eh+LyCv0f0YYBjwpYjsAc4ClolIkycxVUop5R3uBPq3QJqIpIpIKHA1sKz+QWNMiTEm0RjTzxjTD1gHTNdRLkop1b5aDHRjjB2YC3wM5ACvG2O2iMgCEZnu7QKVUkq5x63BkMaYD4EPj1v2QDPrntv2spRSSrWWO10uSimlOgENdKWU8hMa6Eop5Sc00JVSyk9ooCullJ/QQFdKKT+hga6UUn5CA10ppfyEBrpSSvkJDXSllPITGuhKKeUnNNCVUspPaKArpZSf8M6pp73o0W8eZeuhrb4uQymlTtngroP53ZjfeXy72kJXSik/0ela6N74VFNKKX+gLXSllPITGuhKKeUnOl2Xi1KqBbUVcGAz5H8HjhpInwHxvX1dVcfmdEDlIag+AsbpuhjAWNfGCY466/20uy6OGrDXQlQC9BwJEV18vRca6EqdwF4L1SVHLzX1t0utsKyrgLoqqK2EOtfFXmP90UOjIHDdFxsE2SAo+Njr4AgIi4bQKAiNdl2iQAQqi6Gi2LquPGhdV5dAWCxEJkBUonUdmQCRXaE03wrw/O+geMfR1wb45A/QfxKMuAaG/BRCI73/HlYdhrwscNQ2qjMBwmKs/WuOww7GYdXvdDQKV6f13gSHnfx1ayuhMAcOfA9FW63XB0Aava5AbTmUF0B5EVQUQkXRse/ZqUg4DXqOgpQzrEviaVBTbr0Xx19OmwI9RrTt9Zqgga78m8Nu/aEGhzb9eHWJFYJ5Wa5AzILinS1vV2xWwIREQEikFTQSRENwiFi3wQp4p926GIfrtsP1oVDeKHSaEBZntQDrA7G61KqzsthqTTYW09MKiaGXWdc9RoCzDr5bClmvwjtzYHkMDJ0BI662Hg+LceddPLofjVut1kKwV1v/EeT9B/b/x7o+tLvpbQSFuII92vrgPL7F21KoRiZCbA9rX+uvbSFQsAUObDr2wywkyvrwqq+5fh8w1mPRSRCXAikjISoJopOtVnZQo58jYv1cRcAWal2Cw63fJ1uY9XMvyT2673u+gk2vt/xehsV4JdDFNPxg2tfo0aNNZmZmq5934E9/oiZHx6F3esZpBUGIl1qL9hooy4OyA1Z4ihxtKUuQde20Q1310ecEhx1tJdtCTmxRS/217eStzNYyxhX0DusaYwVfUEgLr2OsDyxnnbWuLeTkr1NdAuWFVovf6bCW2UKs/Q4OP3rBWB8y9YFbf9tpb3lf6t/DsGhXa9xm1Vdfp6POunY6XEEZdDQw5fggxbpu+GB0uIK/9ui1o67R60Yd/fmFRrn2
xQcctVBTZv1+BwUfvdiCXT/XYMLS0+l+332ntHkR2WCMGd3UY9pCV8dyOqxWjqMWuvSz/sX3BOO0fskbujHKjracIxMhqlvrWovNqSmD0v1WCxasbYdGHv33vT40jcMKgejko90dLQWit4i4Pixa++coVs3u1h0eZ11Mf6g6Yv2HYK+2LrXl1ntW38A7pkUaYT0vqNHrHPNBI9Z7HBptrd+e6rtjWv3eeZEt1PovxAc60LvgnlP9VFNuOLgDll4DxbusPtryL2D4lXDBgxDbs/XbK82HLW/D9o9h33orOBDrX83US6w+x+2fwM5PwbEJ4vpY3QHDLreCtrrUCugaV/91TanVImtoTYYdva4ogm+eh33rrG6KM26AMXP0YGBrOB1Qlm8Fd1Si9d+I6lQ6XZdLp2Ovhd1fQO+xEBHv62qat+3f8PYtVmvvyv+DHhmw+i+w9inrD3vinXD2PAhp4d/YqiOQ877Vj/jDV4CBpHRInQSp50DfcSe+D9UlsHU5bH7beq/c+de+KfF94KxfwchrPdPaV6oDOlmXiwa6t1Qegg2LYf0iKD8AiYPg2jet0HGH0+GZFlLuBtjwD2tYVdqFJ7ZYnU746s/wxULocTpcteTYdQ7vgU9+b4V0fF+YfJ/VPVJ/kM9RZ13XVsCOT6yLoxa69rda98OvgMQ09+utPGS16O1VVndPeJwVzmGxEB5r/TtbfxDNXn30WoKg9xhtVSq/p4Henop3wbq/Q9YSazhb/8kw+BJY8ZDVF/nzN6zgbE55EXzwa9j2kTVaYcL/g+7DTq2WnA/grZtdY2hrrGVJ6TDwQivck4bAe7fB1g/g9Kvgp3+1Rm00ZfeX8NE9UJTT/OtFJ8Own1kh3nOUZw8cKqUADXTvc9itwNuw2Oo6CAqG06+Es2+D5KHWOoU58MoV1lCzK//PGod6vJz34f1fW33F6ZdaoV5bboXvxLugz1j3a1q/CD76LaSMglmvWWNfd3xstX73fm21qutHFUx9yOqqaCmAHXbYv4GGURi2+iP4rgNz8X20hayUl2mgt5bTYbVaQ6Og++nWeNXjGWONB/7+ddj0hvXlhIguMPomGHMLxHQ/8Tml+bBkptXK/emTMPLn1vKqw/DR7+D716wDhpc9Z7Weqw5bB/rW/R2qDkHf8TDhTuvDoLnwdTrhswesvu9Bl8DPXjjxiyTVJbBrBexdB4OnQerEtr1fSql2o4HeGof3wNu3WqMl6kUnQ/fh1iV5GBzZa4Vv0VardTrwQuuLGmlTW/4mW3UpvH6d1aI/9z7odQa8N8/61to5d8M5vzlxGFptBWx42QrpsjyITYFBF8Ogn0C/iUe/NFNXDe/+Ara8A2feAhc/qi1mpfyMBro7jIHv/gUf/tZq/V78GMT1sr59Vn8p2mp9KQKsUSunX2X1c0d2bd1r2Wvh/dut1wPrgOllz1rdIy09b8s7kLMMdn5+9MBh2gUw8GLIfNHqTrngIRg3T/uwlfJDGugtqTxkHYjMfs/q1rjs2aZHo9hr4eA2a9RFl35te01jYM1frT7yiXc1fzCyObWVVit/23JryGHlQWsEyGXPWgcmlVJ+SQP9ZHZ9Ae/+EioOwnn3w7jbO183hdMBuZnWt9MST/N1NUopLzpZoLs1H7qIXCQi20Rkp4jc08Tjd4pItoh8LyKfi0jfthbdLja/Df+cYbW4b/7MGiLY2cIcrJr7jNUwVyrAtRjoImIDngYuBtKBWSKSftxqG4HRxpjTgTeBxzxdqMc57PD5AmsUy5yV0DPD1xUppVSbuNNCHwPsNMbsNsbUAkuBSxuvYIz5whhT6bq7Dujl2TK9YNPrcPgHOPfe9pkfWimlvMydQE8B9jW6n+ta1pybgI+aekBE5ohIpohkFhUVuV+lpznssOpxq3U+6GLf1aGUUh7kTqA3NfatySOpInItMBp4vKnHjTGLjDGjjTGju3Xr5n6Vnrb5TWsC/km/06F9Sim/4c70ublA4xmdegF5
x68kIucD9wOTjDE1ninPCxx2WPkYJA+35lhRSik/4U4L/VsgTURSRSQUuBpY1ngFERkJPAdMN8YUer5MD9r8FhzaBZN+q61zpZRfaTHQjTF2YC7wMZADvG6M2SIiC0Rkumu1x4Fo4A0RyRKRZc1szrecDqvvPGmoNYeJUkr5EbfOWGSM+RD48LhlDzS6fb6H6/KOzW9bp1eb+bLrRLBKKeU/AifVnA5Y9Zg1H/iQ6S2vr5RSnUzgBPqWd+DgdmtGQ22dK6X8UGAkm9Np9Z13GwzpM3xdjVJKeUVgBHr2u9bUt9o6V0r5MbcOinZadVWw7hn46i/WnONDL/N1RUop5TX+GejGWOPNP5sPJfusM/tc+KfOOZOiUkq5yf8Cfe96+Pg+2J9pnTJuxjOQeo6vq1JKKa/zn0B32K0TVWx6HaK7w6XPWOf51Fa5UipA+E+g7/jYCvOz58Lk+yA0ytcVKaVUu/KfQN+4BKKT4fwHweY/u6WUUu7yjzF85UVWC/30qzTMlVIByz8CfdPr4LRDxs99XYlSSvlM5w90Y6zulpQzIGmwr6tRSimf6fyBnp8FhVu0da6UCnidP9A3LgFbGAz7ma8rUUopn+rcgW6vgU1vwJBpEBHv62qUUsqnOnegb/sQqo9od4tSStHZA33jEohNgf7n+roSpZTyuc4b6KX5sOtz/Xq/Ukq5dN5A/34pGKd2tyillEvnDPT6sed9zoaEAb6uRimlOoTOGei5mVC8AzKu8XUlSinVYXTOQM96BUIi9QxESinVSOcL9NpK2Pw2pF8KYTG+rkYppTqMzhfoWz+AmlLtblFKqeN0vkAPi4FBl0DfCb6uRCmlOpTON3n4oIuti1JKqWN0vha6UkqpJmmgK6WUn9BAV0opP6GBrpRSfkIDXSml/IQGulJK+QkNdKWU8hMa6Eop5SfEGOObFxYpAn5sYbVE4GA7lNMRBfK+Q2DvfyDvOwT2/ruz732NMd2aesBnge4OEck0xoz2dR2+EMj7DoG9/4G87xDY+9/WfdcuF6WU8hMa6Eop5Sc6eqAv8nUBPhTI+w6Bvf+BvO8Q2Pvfpn3v0H3oSiml3NfRW+hKKaXcpIGulFJ+okMGuohcJCLbRGSniNzj63q8TUT+ISKFIrK50bKuIvKpiOxwXXfxZY3eIiK9ReQLEckRkS0icodreaDsf7iIfCMi37n2/0HX8lQRWe/a/9dEJNTXtXqLiNhEZKOIfOC6HxD7LiJ7RGSTiGSJSKZrWZt+7ztcoIuIDXgauBhIB2aJSLpvq/K6l4CLjlt2D/C5MSYN+Nx13x/ZgbuMMUOAs4DbXD/vQNn/GuA8Y8wIIAO4SETOAh4F/uLa/8PATT6s0dvuAHIa3Q+kfZ9sjMloNPa8Tb/3HS7QgTHATmPMbmNMLbAUuNTHNXmVMWYVcOi4xZcCL7tuvwzMaNei2okxJt8Y8x/X7TKsP+wUAmf/jTGm3HU3xHUxwHnAm67lfrv/ItILuAR4wXVfCJB9b0abfu87YqCnAPsa3c91LQs0ycaYfLBCD0jycT1eJyL9gJHAegJo/11dDllAIfApsAs4Yoyxu1bx57+BJ4DfAk7X/QQCZ98N8ImIbBCROa5lbfq974gniZYmlunYSj8nItHAW8CvjTGlVkMtMBhjHECGiMQD7wBDmlqtfavyPhGZBhQaYzaIyLn1i5tY1e/23WW8MSZPRJKAT0Vka1s32BFb6LlA70b3ewF5PqrFlwpEpAeA67rQx/V4jYiEYIX5EmPM267FAbP/9YwxR4AvsY4lxItIfYPLX/8GxgPTRWQPVtfqeVgt9kDYd4wxea7rQqwP8jG08fe+Iwb6t0Ca60h3KHA1sMzHNfnCMuAG1+0bgPd8WIvXuPpMXwRyjDH/2+ihQNn/bq6WOSISAZyPdRzhC+AK12p+uf/GmHuNMb2MMf2w/s5XGGN+TgDsu4hEiUhM/W1gKrCZNv7ed8hviorIT7A+qW3AP4wxC31ckleJ
yL+Ac7GmziwA/gi8C7wO9AH2AjONMccfOO30RGQC8BWwiaP9qPdh9aMHwv6fjnXwy4bVwHrdGLNARPpjtVq7AhuBa40xNb6r1LtcXS6/McZMC4R9d+3jO667wcCrxpiFIpJAG37vO2SgK6WUar2O2OWilFLqFGigK6WUn9BAV0opP6GBrpRSfkIDXSml/IQGulKNiMj0lmb4FJF+jWfGVKqj6Ihf/VfKZ4wxywjML7IpP6AtdBUwXC3rrSLygohsFpElInK+iKxxzT89RkRmi8jfXOu/JCJPishaEdktIle09BpK+ZIGugo0pwF/BU4HBgPXABOA32B9Q/V4PVyPTwMeaacalTolGugq0PxgjNlkjHECW7BOJmCwph7o18T67xpjnMaYbCC5HetUqtU00FWgaTwniLPRfSdNH1NqvH7gzOmrOiUNdKWU8hMa6Eop5Sd0tkWllPIT2kJXSik/oYGulFJ+QgNdKaX8hAa6Ukr5CQ10pZTyExroSinlJzTQlVLKT/x/L1c4Oo2lggYAAAAASUVORK5CYII=\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["df.plot(x=\"minl\", y=[\"r2_train_dt\", \"r2_test_dt\",\n", " \"r2_train_reg\", \"r2_test_reg\"]);"]}, {"cell_type": "markdown", "metadata": {}, "source": ["On voit que la performance sur la base de test augmente rapidement puis stagne sans jamais rattraper celle de la base d'apprentissage. Elle ne d\u00e9passe pas celle d'un mod\u00e8le lin\u00e9aire ce qui est d\u00e9cevant. Essayons avec une for\u00eat al\u00e9atoire."]}, {"cell_type": "markdown", "metadata": {}, "source": ["## For\u00eat al\u00e9atoire"]}, {"cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 25/25 [00:20<00:00, 1.54it/s]\n"]}, {"data": {"text/html": ["
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>minl</th><th>r2_train_dt</th><th>r2_test_dt</th><th>r2_train_reg</th><th>r2_test_reg</th><th>r2_train_rf</th><th>r2_test_rf</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>1.000000</td><td>0.030211</td><td>0.30522</td><td>0.265853</td><td>0.920318</td><td>0.472317</td></tr>\n",
"    <tr><th>1</th><td>3</td><td>0.864086</td><td>0.130133</td><td>0.30522</td><td>0.265853</td><td>0.836299</td><td>0.455444</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
"], "text/plain": [" minl r2_train_dt r2_test_dt r2_train_reg r2_test_reg r2_train_rf \\\n", "0 1 1.000000 0.030211 0.30522 0.265853 0.920318 \n", "1 3 0.864086 0.130133 0.30522 0.265853 0.836299 \n", "\n", " r2_test_rf \n", "0 0.472317 \n", "1 0.455444 "]}, "execution_count": 23, "metadata": {}, "output_type": "execute_result"}], "source": ["import pandas\n", "from sklearn.ensemble import RandomForestRegressor\n", "from tqdm import tqdm\n", "res = []\n", "for i in tqdm(range(1, 50, 2)):\n", " dt = DecisionTreeRegressor(min_samples_leaf=i)\n", " reg = LinearRegression()\n", " rf = RandomForestRegressor(n_estimators=25, min_samples_leaf=i)\n", " dt.fit(X_train, y_train)\n", " reg.fit(X_train, y_train)\n", " rf.fit(X_train, y_train)\n", " r = {\n", " 'minl': i,\n", " 'r2_train_dt': r2_score(y_train, dt.predict(X_train)),\n", " 'r2_test_dt': r2_score(y_test, dt.predict(X_test)),\n", " 'r2_train_reg': r2_score(y_train, reg.predict(X_train)),\n", " 'r2_test_reg': r2_score(y_test, reg.predict(X_test)),\n", " 'r2_train_rf': r2_score(y_train, rf.predict(X_train)),\n", " 'r2_test_rf': r2_score(y_test, rf.predict(X_test)),\n", " }\n", " res.append(r)\n", "df = pandas.DataFrame(res)\n", "df.head(2)"]}, {"cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [{"data": {"image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXQAAAEGCAYAAAB1iW6ZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nOzdeXxU5b348c8zk9ky2XdIIISdsCQgAmVTQBFbS9WKS1tbrlraKvb2tvVWbetFqr0u7a/WqnWtepWqRa3iihugAUFAAlH2ACGBkH3fZnt+f5xJSCCQAJNMlu/79Tqvs8yZM98TyHeePOc536O01gghhOj9TMEOQAghRGBIQhdCiD5CEroQQvQRktCFEKKPkIQuhBB9REiwPjguLk4PGTIkWB8vhBC90tatW0u11vHtvRa0hD5kyBC2bNkSrI8XQoheSSmVd6rXpMtFCCH6CEnoQgjRR0hCF0KIPiJofehCiJ7N7XZTUFBAY2NjsEPpl+x2OykpKVgslk6/RxK6EKJdBQUFhIeHM2TIEJRSwQ6nX9FaU1ZWRkFBAWlpaZ1+X4ddLkqpfyilipVSX53idaWUelgptV8ptUMpNekM4hZC9FCNjY3ExsZKMg8CpRSxsbFn/NdRZ/rQnwMWnOb1S4ER/mkJ8PczikAI0WNJMg+es/nZd5jQtdafAuWn2eU7wP9pw0YgSik14Iwj6aScgiruf383UvZXCCHaCsQol2Qgv9V6gX/bSZRSS5RSW5RSW0pKSs7qw7blV/D3tblsyas4q/cLIURfFYiE3t7fBe02n7XWT2qtJ2utJ8fHt3vnaocWnTeI6FALT6zLPav3CyF6p/r6er71rW8xevRoxo4dy+23337a/d944w127tx5xp+zatUq7rvvvrMNs8XixYt59dVXAXjooYeor68/52N2JBAJvQAY1Go9BTgagOO2y2E188NvDOGjXcXsL67pqo8RQvQwWmt++ctfsnv3brZt28b69et57733Trn/6RK6x+M55fsWLlzY4ZfFmequhB6IYYurgKVKqZeBqUCV1rowAMc9pR9+I5XH1+Xy1KcHuf+qCV35UUII4O63vmbn0eqAHjN9YAT/8+2xp93n0KFDXHrppcyZM4fPP/+cN954AwCr1cqkSZMoKCho930bNmxg1apVrFu3jnvuuYfXXnuNG2+8kenTp7N+/XoWLlzIyJEjueeee3C5XMTGxrJixQoSExN57rnn2LJlC4888giLFy8mIiKCLVu2cOzYMR544AGuuuqqdj9Ta82tt97KJ598QlpaWst1vocffpijR48yZ84c4uLiWLNmzTn81E6vM8MWXwI+B0YppQqUUjcqpX6qlPqpf5d3gQPAfuAp4OYui9YvNszGoskp/HvbEYqr5aYHIfqyPXv28MMf/pBt27aRmpoKQGVlJW+99Rbz5s1r9z3Tp09n4cKFPPjgg2RnZzNs2LCW961bt45f/epXzJw5k40bN7Jt2zauvfZaHnjggXaPVVhYSFZWFm+//fZpW+7//ve/2bNnDzk5OTz11FNs2LABgJ///OcMHDiQNWvWdGkyh0600LXW13XwugZuCVhEnXTTzKGs2HSY5zYc4r8XjO7ujxeiX+moJd2VUlNTmTZtWsu6x+Phuuuu4+c//zlDhw49o2Ndc801LcsFBQVcc801FBYW4nK5TnkDz+WXX47JZCI9PZ2ioqJTHvvTTz/luuuuw2w2M3DgQObOnXtGsQVCr63lMiTOyYKxSby4MY/aplP3hwkhejen09lmfcmSJYwYMYJf/OIX53SsW2+9laVLl5KTk8MTTzxxypt4bDZby3JHw6WDPW6/1yZ0gCWzh1Ld6OHlLw4HOxQhRDf43e9+R1VVFQ899FCH+4aHh1NTc+qBE1VVVSQnGyOsn3/++XOObfbs2bz88st4vV4KCwvbdK90FEug9OqEPnFwNFPSYvhH1kHcXl+wwxFCdKGCggLuvfd
edu7cyaRJk8jMzOTpp58+5f7XXnstDz74IBMnTiQ39+RhzsuWLWPRokXMmjWLuLi4c47viiuuYMSIEYwfP56f/exnXHDBBS2vLVmypOXibldSwbrjcvLkyToQTyz6eFcRNz6/hYeuyeTyie3ezySEOAu7du1izJgxwQ6jX2vv30AptVVrPbm9/Xt1Cx1gzqgEhieE8cSnB6QcgBCiX+v15XNNJsWSWUP579d2kLW/lFkjzu4OVCFE73TvvfeycuXKNtsWLVrEb3/72y75vJycHK6//vo222w2G5s2beqSzzsTvb7LBaDJ42XW/WsYmRjOizdNDcgxhejvpMsl+PpdlwuALcTMf8xII2t/KV8dqQp2OEIIERR9IqEDfG/qYJxWM099diDYoQghRFD0uoTudnk5tKP0pO2RDgvXTRnM2zsKKajo+iI4QgjR0/S6hL713UO8+/cdHNlzcj30G2amoYB/ZB3q9riEECLYel1Cn7QglciEUD74x9c01LjavDYwysG3Mwby8ubDVNW7gxShEKIrdFc9dIDs7GzefffdM3rPkCFDKC0tpbKykscee+ysPvdc9bqEbrWHMP+msTTVefjouV1oX9tROktmD6Xe5eXFTXlBilAI0RUCWQ+9I2eT0JsFM6H3ynHo8YPCmbloOOte2su2jw4zaX5qy2tjBkQwe2Q8z64/xI0z07BbzEGMVIg+4r3b4VhOYI+ZNB4uPf2TgQJZDx3glltuoaSkhNDQUJ566ilGjx7NypUrufvuuzGbzURGRvLRRx9x11130dDQQFZWFnfccUebKo3NysrKuO666ygpKWHKlCktNzbefvvt5ObmkpmZycUXX8yDDz54Lj+lM9LrWujNxs5OZtjEeDa9cYBjB9oOVfzJ7KGU1jbxxrYjQYpOCBEogaqHvmTJEv72t7+xdetW/vSnP3HzzcajG5YvX87q1avZvn07q1atwmq1snz5cq655hqys7PbTeYAd999NzNnzmTbtm0sXLiQw4eNIoH33Xcfw4YNIzs7u1uTOfTSFjoYZSrnXD+a4sOb+eDpr7n6t+djd1oAmD4slrEDI3jyswNcPXkQJlNwS1oK0et10JLuSoGoh15bW8uGDRtYtGhRy7ampiYAZsyYweLFi7n66qu58sorOx3Xp59+yuuvvw7At771LaKjozv93q7Sa1voALZQC5fcNI66yibWvLC75U8epRRLZg/lQEkdH+8uDnKUQohzEYh66D6fj6ioKLKzs1umXbt2AfD4449zzz33kJ+fT2ZmJmVlZZ0+brDrn5+oVyd0gMS0CKZdMYwD2SV8te54F8u3xg8gOcrBk5+eXDZTCNE7nW099IiICNLS0lpqvmit2b59OwC5ublMnTqV5cuXExcXR35+fqfql8+ePZsVK1YA8N5771FRUXHS53a3Xp/QATLnDSJ1fCxZr+6j5LDxgwwxm7hpVhqbD1WwNe/kMetCiN7lXOuhr1ixgmeeeYaMjAzGjh3Lm2++CcBtt93G+PHjGTduHLNnzyYjI4M5c+awc+dOMjMzeeWVV9o9/v/8z//w6aefMmnSJD744AMGDx4MQGxsLDNmzGDcuHHcdtttgf9BnEafKM4F0FDr4pV7NhNiNXH1nedjtYdQ1+Rh+n2fMG1oDE9c324tGyHEKUhxruDrl8W5ABxhVubfmE51SQPr/rkHrTVOWwjXT0vlg51FHCipDXaIQgjRpfpMQgcYOCKa8y9LY+8XRez+vBCAH00fgsVs4u9rpS9diL7o3nvvJTMzs8107733Buz4zz777EnHv+WWWwJ2/EDqM10uzXw+zaq/ZlN0oIpFd5xPzEAn976zk6c+O8gLN06RB2AI0UnS5RJ8/bbLpZnJpLj4hnQsdjOrn/4Kt8vLr+aPYli8k9tW7qCqQWq8CCH6pj6X0AGckTYu+o90yo/WkfWvfdgtZv5yTSYltU3cverrYIcnhBBdok8mdIDB6bFMWpDKzqyj7NtcxISUKJb
OGc7r247w/leFwQ5PCCECrs8mdICp304jaWgka1bsprK4nqVzhzM+OZI7//0VJTVNwQ5PCCECqk8ndJPZxPybxmIyKVY/9RV4Nf/v6gxqmzzc8foOgnVBWAhx5rqrHvqqVau4777g1a45F306oQOEx9i56D/SKS2oZd2KPQxPCOO/LxnFR7uKWbm1/dKbQoieJ5D10D0ezynft3Dhwg6/LDri9XrP6f1nq9dWWzwTQ8bHMeWyNL546yAJQyK44YI0PtxZxPK3djJ9WCwp0aHBDlGIHu3+L+5nd/nugB5zdMxofjPlN6fdJ5D10G+88UamT5/O+vXrWbhwISNHjuSee+7B5XIRGxvLihUrSExM5LnnnmPLli088sgjLF68mIiICLZs2cKxY8d44IEHuOqqq9r9zLVr13L33XczYMAAsrOz2blzJy+++CIPP/wwLpeLqVOn8thjj2E2m3nmmWe4//77GThwICNGjMBms/HII4+c2w+UftBCbzb50iEMmRDH+pX7OHagij8tykBrza9Xbsfnk64XIXqqQNVDb37funXr+NWvfsXMmTPZuHEj27Zt49prr+WBBx5o91iFhYVkZWXx9ttvd9hy/+KLL1rqzezatYtXXnmF9evXk52djdlsZsWKFRw9epQ//OEPbNy4kQ8//JDduwP3RdkvWugAyqS46D/SWfm/m1n95Fdcfef53PXtdH7zWg7PbTjEDTPTgh2iED1WRy3prhSIeujNWj+soqCggGuuuYbCwkJcLhdpae3ngMsvvxyTyUR6ejpFRUWnPf6UKVNajvPxxx+zdetWzj//fAAaGhpISEjgiy++4IILLiAmJgaARYsWsXfv3jM6j1PpVAtdKbVAKbVHKbVfKXXSV5RSarBSao1SaptSaodS6psBiS7AbI4QLv3JeFxNXt5/MofvZiYzb3QC97+/m/3FUutFiJ4oEPXQ2zvWrbfeytKlS8nJyeGJJ56gsbGx3ffYbLaW5Y4GUrQ+vtaaH/3oRy311/fs2cOyZcu6dDBGhwldKWUGHgUuBdKB65RS6Sfs9jvgX1rricC1QHCekNoJsclhzL1+NMcOVLP+1f3873fHE2o188t/ZeP2+oIdnhDiNM62Hnp7qqqqSE5OBuD5558PWIzN5s2bx6uvvkpxsfGQnfLycvLy8pgyZQrr1q2joqICj8fT8rzTQOhMC30KsF9rfUBr7QJeBr5zwj4aiPAvRwJHAxZhFxgxOZHMiwfz1bojlOVUcO8V49lRUMVja6SAlxA91bnWQz/RsmXLWLRoEbNmzSIuLi7g8aanp3PPPfcwf/58JkyYwMUXX0xhYSHJycnceeedTJ06lYsuuoj09HQiIyMD8pkdFudSSl0FLNBa3+Rfvx6YqrVe2mqfAcAHQDTgBC7SWm9t51hLgCUAgwcPPi8vLy8gJ3E2fF4fqx7ezrHcKq68bRJ//DyXt3cU8u+bZzA+JTA/XCF6MynO1XVqa2sJCwvD4/FwxRVXcMMNN3DFFVectF9XFOdq76F5J34LXAc8p7VOAb4JvKCUOunYWusntdaTtdaT4+ODW/XQZDZxyU1jcYRbeO+JHO6YO4q4MBv/9a9sGt3BGUMqhOgfli1bRmZmJuPGjSMtLY3LL788IMftzCiXAmBQq/UUTu5SuRFYAKC1/lwpZQfigB79hGZHuJVLfzqe1x/8ko3/3MP9V47nR89t5k+r9/C7y068TCCE6InuvffelmeFNlu0aBG//e1vu+TzcnJyuP7669tss9lsbNq0qdPH+NOf/hTosIDOdbmEAHuBecARYDPwPa311632eQ94RWv9nFJqDPAxkKxPc/Cuqod+NnauP8qaF3Yzcf5g3jU1sGLTYV768TSmDY0NdmhCBI10uQRfwLtctNYeYCmwGtiFMZrla6XUcqXUQv9uvwJ+rJTaDrwELD5dMu9p0mcMZOysgWz74DDXDYhjcEwov165nZpGqZ0uhOg9OjUOXWv9rtZ6pNZ6mNb
6Xv+2u7TWq/zLO7XWM7TWGVrrTK31B10ZdFeYdfVIEtMiyPrnXu6ZO4qjlQ38/KVtMpRRCNFr9Jtb/ztitphYsGQ8FquJvDfz+MM301mzp4TbX8uRqoxCiF5BEnorYdE2FiwZR3VpI1E5NfzXvBG89mUB97+/J9ihCSFEhyShn2DgiGhmfHc4h3aUMv6Yl++fP4jH1+XyTNbBYIcmRL/WXfXQAbKzs3n33XfP6r3BJAm9HRPmpnDeglR2rS9k6hEf3xyVyB/e3smb2UeCHZoQ/VYg66F35EwT+unqq3enflNt8UwopZh2+TAi4hys/eceZiWFUpMcza9XbifGaWXWiODeFCVEdzv2xz/StCuw9dBtY0aTdOedp90nkPXQAW655RZKSkoIDQ3lqaeeYvTo0axcuZK7774bs9lMZGQkH330EXfddRcNDQ1kZWVxxx13tKnS2GzZsmUcPXqUQ4cOERcXxwsvvMDtt9/O2rVraWpq4pZbbuEnP/kJPp+PpUuXsm7dOtLS0vD5fNxwww2nrKt+LiShn0b6zIGERdt4/6mvmFNnpinayU9f2MrLS74h5QGE6CZ79uzh2Wef5bHHjtf8a66H/p//+Z/tvqe5Hvpll13WkjjnzZvH448/zogRI9i0aRM333wzn3zyCcuXL2f16tUkJydTWVmJ1Wpl+fLlLQ+5OJ2tW7eSlZWFw+HgySefJDIyks2bN9PU1MSMGTOYP38+W7du5dChQ+Tk5FBcXMyYMWO44YYbAvcDakUSegcGj43lyl+fxzuPbmfeUY0vysriZ7/g1Z9NJy3O2fEBhOgDOmpJd6VA1EOvra1lw4YNLFq0qGVbU5PxoPgZM2awePFirr76aq688sozim3hwoU4HA4APvjgA3bs2MGrr74KGNUc9+3bR1ZWFosWLcJkMpGUlMScOXPO6DPOhCT0TohLCeO7/z2Zdx7bztwjtXwWofjhPzbx2s+mkxBuD3Z4QvRpgaiH7vP5iIqKIjs7+6TXHn/8cTZt2sQ777xDZmZmu/t0JjatNX/729+45JJL2uzzzjvvdPp450ouinZSWLSNK341iUFjYplZaWJEoZfFz3whd5MK0Y3Oth56REQEaWlpLTVftNZs374dgNzcXKZOncry5cuJi4sjPz+/w1rq7bnkkkv4+9//jttt5IS9e/dSV1fHzJkzee211/D5fBQVFbF27dozOu6ZkIR+Bqz2EL5183jGzk7mvAYzww808dPnt9DkkeqMQnS1c62HvmLFCp555hkyMjIYO3Ysb775JgC33XYb48ePZ9y4ccyePZuMjAzmzJnDzp07yczM5JVXXulUfDfddBPp6elMmjSJcePG8ZOf/ASPx8N3v/tdUlJSWrZNnTo1YPXPT9Rhca6u0pOKc50prTXbPjzM56/nUmD2UjclmoeuPw+Tqb1Kw0L0TlKcK3Ca65+XlZUxZcoU1q9fT1JSUofvO9PiXNKHfhaUUkyan0pErIPVz3xF+aZK7rFs5/fXZaCUJHUhRFuXXXYZlZWVuFwufv/733cqmZ8NSejnYPh5CTijzuPVh76k8bMyHg35mqVXjwt2WEL0K11dD/3ZZ5/lr3/9a5ttM2bM4NFHH+30Mbqy37w16XIJgIpjdTx3/2ZMDV4a0sP55c3nYQ0xBzssIc6JdLkEX1c8gk50IDrJyY+XfQNfgp3wnbX8752fkX/szK6QCyHEuZKEHiBhkTZ+cfd0IqfHE13t5aU/fMHarPxghyWE6EckoQeQUoof/HA8U3+cjtekyHlxL888vg2fPCRDCNENJKF3gannDeCmu6dRFmuhMbuCv/w2i7KSumCHJYTo4yShd5H4mFB+/4dZNJ0XhanSzf8t28TWDVJ+V4iz1V310FetWsV99913tmG28fDDDzNmzBi+//3vB+R4HZFhi13IbFL88seTWDUmj60v72Pj/+0hN6eUK28YR4hFRsGI3uOzf+2lNL82oMeMGxTGrKtHdnr/5nro8+b
Nw+VyMW/ePN577z0uvfTSdvd/4403uOyyy0hPTz/pNY/HQ0hI++lv4cKFLFy4sNNxtcfr9WI2m3nsscd47733SEtLO6fjdZa00LvBwpmpfO+O89kbBSXbynjyrs8pOxrYXw4h+qJDhw4xZswYbr75ZmbOnMnw4cOBztdDv+2228jMzCQ3N5cLL7yQO++8kwsuuIC//vWvvPXWW0ydOpWJEydy0UUXUVRUBMBzzz3H0qVLAVi8eDE///nPmT59OkOHDm2ppNietWvXMmfOHL73ve8xfvx4fvrTn3LgwAEWLlzIX/7ylwD/ZNonLfRuMio5kj8sm8Wyx7eQtKeef97zBbOvGcmE2clyd6no8c6kJR1ogaqH3vy+devWAVBRUcHGjRtRSvH000/zwAMP8Oc///mkYxUWFpKVlcXu3btZuHDhaR9M8cUXX/DVV1+1tMjff/991qxZQ1xc3Fmd+5mShN6NIuwW/vTzaTzy7h7y388n66W97M8uYd61o4hKDA12eEL0SIGoh96s9ZOHCgoKuOaaaygsLMTlcp2yW+Tyyy/HZDKRnp7e0oo/lSlTpnRb90p7pMulm5lMip9fNpoFt0xgY5iX/N3lrFi2kTX/3E19tSvY4QnR4wSiHnp7x7r11ltZunQpOTk5PPHEEzQ2Nrb7HpvN1rLc0Z31J8ba3aSFHiRzxyQy4s4ZLH91B/qranyfHmX3xmNMviSVjHmDsNrln0aIEzXXQz9d2dxmHdU0r6qqIjk5GYDnn38+YDEGk7TQg2hQTChP/ngqVy0Zz1sDNbu0my/eOsgLv/+crz49glduSBKixbnWQz/RsmXLWLRoEbNmzeq2Pu6uJsW5eoi6Jg8PfbSX99bmMafJSpJLEZUYyjcuH0ZaZpxcOBXdTopzBZ8U5+qlnLYQfvutdJ7+1Qx2jbbzb2cTRyobeO+JHF5/8EsK91cGO0QhRA8nHbU9zOikCP710+m8urWA+97dxeAqzbwjNbz+py9Jy4jjG1cMIzopuBdehOhJuroe+olycnK4/vrr22yz2Wxs2rSpSz7vTEiXSw9WUefivvd28/rmfOZgJ7PeDF7N8PMSmDA3haS0rnkuoRAgXS49gTyCrg+Jdlq5/6oJLJqcwu/e+IqsozVc5YzAtKOUfZuLSEyLYMLcFIZNSsBslt4zIfo7yQK9wOQhMbx160x++e0xvOat48/2Wg4OtlJR0ciHz+zkhTs3sOXdgzTUyDh2IfqzTiV0pdQCpdQepdR+pVS7Jc6UUlcrpXYqpb5WSv0zsGEKi9nETbOGkvWbufzXpaPIool7dRVZAxSNTjObVh3k+Ts28PH/7aK0QJ6WJER/1GGXi1LKDDwKXAwUAJuVUqu01jtb7TMCuAOYobWuUEoldFXA/V2008rNFw7nx7OG8t5Xx3gm6yD35ZczOC6EK5yR7NtcxO4NhQwcEcWEuSmkZcRjMsmQRyH6g8600KcA+7XWB7TWLuBl4Dsn7PNj4FGtdQWA1ro4sGGKE1nMJhZmDOTNW2bw+s3TmZAezyM15TzsrKcozU5pcR3vP/EVL/7uc778IE/KCoher7vqoQNkZ2fz7rvvnvH7rrvuOiZMmNBt1RVP1JmLoslA64djFgBTT9hnJIBSaj1gBpZprd8/8UBKqSXAEoDBgwefTbyiHZMGRzPpe9EcqWzg/z4/xEubDlODh4sGhTPFo/j89Vw2vnGAwWNjGDU1ibSMOKnHLs7ImueepDjvQECPmZA6lDmLl3R6/0DWQ+9IdnY2W7Zs4Zvf/Gan9vd4PJSWlrJhwwby8vLO+PMCpTMt9Pb+Xj9xrGMIMAK4ELgOeFopFXXSm7R+Ums9WWs9OT4+/kxjFR1IjnJwx6Vj2HjnPJZfMY5cu497G8p5I8mHe7iTorwaPnj6a5797/WseXE3R/dXdlhsSIhgCmQ99NzcXBYsWMB5553HrFmz2L17NwArV65k3LhxZGRkMHv2bFwuF3fddRevvPIKmZmZvPL
KK+1+xrJly1iyZAnz58/nhz/8IfPnz6e4uJjMzEw+++yzrvmBdKAzLfQCYFCr9RTgaDv7bNRau4GDSqk9GAl+c0CiFGck1BrC9dNS+f6UwazbV8LzGw7x0N4STCb49pgYzsPK3s1F7Mw6SkScnVHTBjBqahKR8Y5ghy56qDNpSQdaoOqhz5s3j8cff5wRI0awadMmbr75Zj755BOWL1/O6tWrSU5OprKyEqvVyvLly9myZQuPPPLIaWPbunUrWVlZOBwODh06xGWXXUZ2dnbgTv4MdSahbwZGKKXSgCPAtcD3TtjnDYyW+XNKqTiMLpjA/n0mzpjJpJgzKoE5oxLIL6/n5c2HeWVzAW/UlpOa6OCqxBjsFV42v3OQzW8fZMDwSEZNTWL4eQnYQi3BDl8IIDD10Gtra9mwYQOLFi1q2dbU1ATAjBkzWLx4MVdffTVXXnnlGcW2cOFCHI6e0xDqMKFrrT1KqaXAaoz+8X9orb9WSi0HtmitV/lfm6+U2gl4gdu01mVdGbg4M4NiQrntktH84qKRfLSziBWbDvPnfUcwmxSXnhfHhXYnDftqWLtiD5+9so+0jDhGTUticHoMJrlpSQRRIOqh+3w+oqKi2m09P/7442zatIl33nmHzMzMM2phB7v++Yk6daeo1vpd4N0Ttt3ValkDv/RPogezmE1cOn4Al44fwKHSOl7afJiVWwp4u66EQdEOrr04kbR6EwXZpezfWowj3MLIKUmMmpZEXEqYVH0UQXW29dAjIiJIS0tj5cqVLFq0CK01O3bsICMjg9zcXKZOncrUqVN56623yM/P77CWek8lTa9+bEickzsuHcPnd8zl4esmkhzt4MHNh7hl10G+zAwl7fJUBgyLImdtAf+6dzOv3PMF2z44TF1VU7BDF/3QudZDX7FiBc888wwZGRmMHTuWN998E4DbbruN8ePHM27cOGbPnk1GRgZz5sxh586dp70o2hNJcS7Rxv7iWl764jCvbi2gqsHN8IQwvpeZzDivhcNfllB0sBqlYFB6DKOmJZGWEY/FKkMg+yIpzhV8UpxLnJPhCWH8/rJ0brtkFG9tP8qLmw6z/IM9OCxmvpM5kCsXjMV3qJY9m47x4VWUCUAAACAASURBVDM7sdjNDJ+UwKhpSQwYHiV3pQoRRJLQRbvsFjOLJg9i0eRB5BRU8eLGPN7IPsLLm/PJGBTF968YxGynkwNbitm/tZhdGwqx2MwkDIkgKS2CpKGRJKZF4Ai3BvtURB/X1fXQn332Wf7617+22TZjxgweffTRgBw/kKTLRXRaVYOb178s4MWNeeSW1BHpsLDovBSumZQCRxso3FfJsYPVlBbUon3G/6uIeAdJaREkpkWSNDSC2JQwKfXbS+zatYvRo0fLhfAg0Vqze/fuM+pykYQuzpjWmo0HynlxYx6rvz6Gx6eZOTyO72QOZM7oBCKtIZTk1XDsYBVFB6s5dqCK+iqjlozZYiIhNdxI8GkRDBwRJa34HurgwYOEh4cTGxsrSb2baa0pKyujpqaGtLS0Nq9JQhddpri6kVc25/Py5nyOVDagFGSkRDFvdALzxiQyZkA4ALUVTRw7YCT4ooNVFB+uwecx/u/FDHSSPDKa5FFRRoIPkwTfE7jdbgoKCmhsbAx2KP2S3W4nJSUFi6XtTX6S0EWX01rz9dFqPtldzMe7i9mebzzUekCknbmjE5g3JoHpw+Kw+4uCed0+SvJrOLK3giN7KyncX4nH5QMgNjmM5JFRJI+KZuCIKOxOuWtViGaS0EW3K65pZO3uEj7eXcRn+0qpd3mxW0zMGBbHvDGJzB2dQFKkvWV/r8dHcV4NR/ZUcGRvBcdyq/C4faAgLiWM5BFGC37AsCjsYZLgRf8lCV0EVZPHy6YD5Xyyu5iPdhVRUNEAQPqACOaMjufCUQlMHBRFSKuLpV63j6JD1f4WfAXHcqvxeowWvDPKRmxyGLHJTmKTw4hLCSMqMRRziFxsFX2fJHTRY2i
t2Vdcy8e7ivlkdxFfHq7E69NE2EOYNTKeC0fGc8GoeBLC7W3e53F7jf73Q9WUH6mj9EgtFYV1+LzG/1+TWRGdFOpP9GHEpoQROzAMZ5RVLuiJPkUSuuixqhrcZO0rZe2eYtbuLaGkxigrMHZgBHNGJXDhqHgyT2i9N/N6fVQeq6fsaC1lBXWUHaml7EgttRXHSxPYnCEkDolgwLAoBo6IJCE1ghC5s1X0YpLQRa/QfGF13d4S1u4pbrf1PmtEfJu+9/Y01rkpP1pLaUEdZQU1HDtYTfnROgBMIYqEwREMGB7JwOFRJA2LlIuuoleRhC56pVO13hMjbGSkRJExKIqMlCjGp0QS6Th9Um6sc1OYW0XhfmNETXFeTUt3TcxAJwOGRzFweCQDhkcRHnP6LwwhgkkSuuj1mlvvmw+Vsz2/ku0FVRwsrWt5fWi800jyKZFkDIpizICIliGS7fG4vBQdqqZwvz/JH6jC3egFICzaRmSCg/AYO+GxDiJi7f5lO2HRNqkPL4JKErrok6rq3ew4UtmS4LPzK1ta8RazYnRSBBNSIkmLc5IQYScx3EZihJ2ECBuh1rZljHw+TVlBLYW5lRQdrKa6tJGasgbqql1tnqCrTApnlJWIWAfhrRJ9ZLyDmIFOuSlKdDlJ6KJf0FpzrLqR7flVbC+oZEdBJTvyq6hp8py0b7gthIQII8EnRthJCLcZST/CRlKEndRYJ3FhVnweTU1FIzVljdSUG/PqsgZjvayRusomWv8KOcItxAx0EjMwjJgBTmN5gFP66UXASPlc0S8opRgQ6WBApIMF45IAI8lXN3goqmmkuLqJourGluXimkaKqpvYfKic4pomXP5x7s3CbCGkxTnbToNjGB3nbOmz93p91FU0UVlUT3lhHeVH6ygvrGP3hkLcTd6WYzkjrccTvT/JR8Q5sIdZpOSwCJhe10L3etyYTGaUSfoxReBoralqcFNU3cTRqgbySus4WFrHwbJ6DpbWUlDR0KYlHuu0tiT5IXFOhsU7GZ4QRmqsE4vZhPYZLfvyo8eTfPnROioK64w7YP2USeEItxAaYSU0woYz0mosRxrrof51Z6QNi02GW4o+1uWy/cN3Wf+vFaSOzzSmCRMJj43rggiFOK7R7SW/vN5I8v7pQGkdh0rrKK45Pu49xKRIjQ1leELY8Sk+nGEJTkKtIfh8mpqyBsqP1lFb0UR9tYu6KmNeX+WivqqJ+hp3S/nh1iw2M84om5Hgo4zk74wykr4z0oYz0li22uUP776sT3W5xCQPIi1jEnk52exev65lW+qETFLHT2TQ2PFY7Y4gRyn6GrvFzIjEcEYkhp/0Wm2ThwMltewvPj7tK67lo13FeFsl5uQoB8MSwhgebyT6ISmhJEdHMyDSgbVV2QLt0zTWuamrclFffTzZ11U1tcyLDlVTX9nUprXfzGI3Gwk+ykpYtJ3opFCik5q7eewySqcP63Ut9GZaa0rz88jb/iV5OdkU7Poaj6sJkzmEgSNHkzphIqkTMkkcOhyTSf5UFd3P5fGRV1Z3PNH7k35uSS2NrbtdFCSG20mJdpAc7SA5ykFKdCjJ0Q5jW5Sj3SGYWmtcDR7qmpN9ZZOx7J/XVzVR7b9w28wUoohKCG1J8tED/PPEULmDtpfoU10up+JxuTiyZyd5Odnkbd9G8aFcAOzOMAaPy2DY+dMYMXU6FqstYJ8pxNnw+TRHKhvIL6+noLKBgooGjlQ0cKSynoKKBo5VNeI5ocslLsxGcrSDhHAb8eG2lnl8mDE6Jz7cRlyYFVvIyUm5qcFDxbE6KgrrjfmxeioK66gubXVdQEFErJ3oJCfhsXZCI6w4wv39+f7JEWGVB4L3AP0ioZ+ovrqKwznZHNqxjbwd26gtL8MW6mTMrAsZP/cSEoYM7bLPFuJceH2aoupGI9FX1vuTvZH4S2qaKKlpoqzO1e57Ix2W48k+3EZ0qJVQq5lQqxm7xYzDv+ywmLEqhanOi7fChbuiicb
yJmpLGmisctFUf/JQTzD68R0RVkLD/Rduw41EHxZlwxllIyzamNtCQ6QoWhfplwm9Ne3zUbDrK3I++YC9m9bjdbtJHDqc8XPnM3rGBdhCnd0ShxCB4vb6KK9zUVzdREltIyU1Tf5lI+EX+xN/RZ2LBrf3pBb/6VjMikGRdoaGhzIo1Eqi1UKMOYQwFHYPeOo91Ne4qK9201DtorHOfdIxQiymNgm+9XJYlB1nlPEXgJQ8PnP9PqG31lhby66sNeR8vJqSw4cIsdoY9Y2ZjJs7n+RR6dKqEH2S2+ujwe2lweWf3F7qXV4a/dvq3V4aXV7qXR6Kapo4XF5Pfnk9h8vrqaxvm7BjnFYGxYQyOCaUwTEOUiIcRJtMhPoUNpfG1OjFVeOmtrKJusomaiuaqKtqannkYGu20BAc4VZj6Ga4tWXZEX68y6d5XVr9Bkno7dBaU3RgPzmfrGb3+nW4GhqIGZjC+LnzSZ89l9DIqKDFJkRPUtXgbknuLVOZMT9S2dBmJE8zu8VErNNGbJiVGKeVmFAL8VYLUZgJ02Bzg27w4m3w4K334PEve+o9eBu87URhPGA8PNZOZJy/vo5/HhFnlGHoLwlfEnoH3I2N7NmYRc4nH3B0z05M5hCGTZ5C0rCRRMTFEx6XQHhsHGHRMZjMclFIiGYer4/CqkbK6lyU1Rp9++UnLfvndU1tRvecitIQqiFUK0J9ilCtcPoU4VoR6VPEYCLCq7CccKgQm5nwWDtR8Q4iYh2ExdgwmRXaZ1yI1j6N1sbc5zOGhzZv8/lAezUms8IZbSM82k5YjI2waDuOcEuP+qKQhH4GygryyVnzAbuz1lJXWdHmNWUyERYTayT52HjC4+KJaJ7HxRMRn4gtNDRIkQvR89W7PJTVuqhudKM1+LTG559rrf3bmrfrln28Pk1lvZuCinr/xeIGikrrqatoItSlifSZiPQZCT/an/BDOpHalAlMJhPKZNy16/PolkcdNjNbTIRFG8k9PKZ5fjzhOyOtWB3d99eBJPSz5Gqop6aslOrSEmpKS4x5mX+5rISa0lJ83lajAZQiaehwUidMYkjGRAaMGI05pNfduyVEr+HzaUprm8hvGQlkjAoqKK+npLyBqnoP1Y1umnw+NOCDljn+/KsURNgtRDosRNjNRJjMhPsU4V6F0wsON1hdPkIafZgafah2uoSUCayhFhxhxrUAu9OCPczSMne0WrY7LTijzr6UgyT0LqJ9Puqrq6guLaamtITS/MPk5WRTuG832ufDYncweNwEUidMZMiEiUQlDexRf7oJ0R9orWlwe6lqcBtTvfv4coOb6lbLNY0e4+Kx/2Jxo//icYPbWHZ7NSYNYf4uoAif0R3k0ODQCodWOFE4MeHQYPGCqZ0UO+E7acy6NO2szuecb/1XSi0A/gqYgae11vedYr+rgJXA+Vrr3p2tO0GZTDijonFGRTNg+ChGToPpi75HY10t+V/vIG/HNg5t/5LcLZsAiIhPZMiEiaRmTGTwuAzszrAgn4EQfZ9SilBrCKHWEAZEnltZELfXZ4wMcntpdPmod3uoaTS6kSrqjWsF5XUuSv3z8tomamrdNNS6UC5fS9IPM3mYFaDza63DFrpSygzsBS4GCoDNwHVa650n7BcOvANYgaUdJfS+0ELvrMpjhRzyJ/f8r7fjamhAKRNJI0YycMRoohIHEJmYRGRCIhHxiYRYpHa2EH1Ng8tLeb2LijoXif67e8/GubbQpwD7tdYH/Ad7GfgOsPOE/f4APAD8+qyi7MOikgaQmTSAzPnfxOvxULh/D3k7tpG3fRvbP3gXj7vVXX9KERYTS1RCEpEJSUQmJhrLicZ6aGSUdNsI0Qs5rGaSrUZtnq7SmYSeDOS3Wi8AprbeQSk1ERiktX5bKSUJ/TTMISGkjB5LyuixzLj6B2ifj7qqSqqKjlFVfIxK/7yq+Bh5O76ktqK8zftDbDbiBqWSmDaMhLRhJKYNJ3ZQqrTqhRCdSujtNQd
b+mmUUibgL8DiDg+k1BJgCcDgwYM7F2Efp0wmwqJjCIuOIXl0+kmvu11NVBcXtyT7yqKjlOYdYlfWOrZ/+B4AJrOZ2DZJfhjxg9Ow2OXp9UL0J53pQ/8GsExrfYl//Q4ArfX/+tcjgVyg1v+WJKAcWHi6fvT+1IfeFbTPR1VxEUUHcyk+uJ+ig7kUHcylsaYaAKVMxCSnkJA2jITUNGJSBhEzIIWIhAQpJyxEL3ZOwxaVUiEYF0XnAUcwLop+T2v99Sn2Xwv8Wi6Kdj+tNTVlpRT7k3vxwf0UH8xt021jtliIShxAzMAUYpJTiB6QTMzAFKIHJsuoGyF6gXO6KKq19iillgKrMYYt/kNr/bVSajmwRWu9KrDhirOllDLuWI2LZ/j501q211dXUX60gIqjRyg/WkD50QJK8/PYv2Uj2nf8rrjQyCgj0Q9MISI+AXtYOI7wcOxhxydHWDghNptcmBWiB5Ibi/oxr8dNZdGxlkRfUXiE8iMFlBceaem6aY/ZYjESvDMMR3gE9rAw7GHhmExmtPb5b+HWoDXad3y9Zbt/m1IKZ0xMS/kEo5xCHM6oaOkWEuIU+tQzRUXgmEMsxCYPIjZ50EmvuV1NNNbW0Fhba8xramiorfFvM6aGmhoa62qoLDpGY+4+tM9ntNyVQikTyqRQrZbBv24yoZTC6/VycPuXuBsb2ny2yWwmLCaW8Ni4E2rmxGELdeJxufxT08nLbmPZ3dS87iYsOoa4QanEDUolNmUQVofU2xF9kyR00S6L1YYlxkZ4TFyXfo7Wmqb6OmrKSqnx18oxauYY64X7drN34/q2NXNOQ5lMhFhthFitxmSxcGDrF3hcx5+rGRGfSNygwceT/KBUYpIHydBP0etJQhdBpZTC7gzD7gwjfvCQdvdpHqtfU1aCq77heLJunbj9y+0VQ2seEVSan9cyleXncWj7l/i8RqElZTIRnTSQuEGpxCSnYHWEGsezWLFYrZhbviCsLdtbr4dYbVjtdpRJnsAjgkf60EW/5fW4qSg8aiT4gsOUHs6jrCCPimOFcJa/FxabHavDgcVux2oPxepw+NeNudVux+Lfbg8LIzQiEkd4JI6ICEIjIrE6QuWCszgt6UMXoh3mEEtLt0trPp+3Vd+8C6/b1ap/3oXX5cLtNubN/fTupkbcjQ24Ghpw+efuRmO5tqL8+GsNDbibGk8TUwiO8AgcEZHGFG4kekdEBI6wCEwhZuOahP9aBNByTeL4NtVy/cJiteGMiSUsJha7MyxgXxauxgajW6yslIbqKuOvm9Q06bYKMknoQpzAZDJjtTuw2rum5ob2+XA1NvovLFfTUF1FfXWVMa+ppqG6moYaY1t1SREN1dU01ded8+eGWG3GXcn+BB8WE0tYdCzhscfXnVEx+DweaspLqSktNeZlJdSWlRnPAig35k11J8djMocQn5pG0rDhJA4dQdKwEcSmDJanfHUj6XIRohfwetw01ta2GgbqO/5ItZZl3/GhohhfHO7GRmoryqktL6OmvJTa8jJjqjDmXre7w88GcERE+kcdxR0ffRQTS3hsPPbwcMqP5HPswH6KcvdSdCC35QsoxGojYchQEocNJ2nYSBKHDidmQPJprzX4fF68LrfxF5Hb1bKstfY/XciY2i6b291uDgnpc0Ng5QEXQoiTaK1prK1pSfLNCd8cYvHfFxBHeEwcYTGxhFitnT+uz0dlUSHHcvdRdGCfMT+Yi6fJGGlkdTiIShqIz+v1d2e5j3dnuV0tF6oDxWQ2GxexLRbMFgshFitm/3Lr9RCLBYvNjtNfW8kZHUNYVAzOGGMeqNpIzV/CZ/tFIwldCBFUPp+X8gJ/K/7APqqKizCHWPwjhiz+0UIWf+L1b7Na2yRfpUxon9f/oGcfPp/P+GukZdnbZrvP68Xr8eD1uI0vDren5XqIsc2Nx2281rzsamigrrK83b9crI5Qf5dVDM4of8KPjsUcEoKrsfmayfFrKe6mxlbXUxpbrqm
4Ghu5+Me3MGHegrP6WcpFUSFEUJlMZuIGDyFu8BDGXXhRsMM5La01jXW11FWUU1tRfsK8jLqKCo7s2UVdRRleT9v7I9qOZjKuwzijorHa/SOd7MYoqITUoV0SuyR0IYRoRSmFw1+36MQRUK01d1n5vF6sdgchVmvQ70OQhC6EEGdBKYUjPCLYYbQht7UJIUQf0eta6Pd/cT+7y3cHOwwhhDhro2NG85spvwn4caWFLoQQfUSva6F3xbeaEEL0BdJCF0KIPkISuhBC9BG9rstFiF7H6wFPA7gbweOf3A2gFJitYAox5mYrmC3+yQomC7Q3rtnTBE010FTtn/unxuqTt4XYIHUGDJkB9sjuP/eeSmvwusBVB+56cNWDu874d/F5/JP3+LLX3Xa99WS2giMKHNFg988dUWCLbP/frwtJQhfdz+uGrc/Bzjdh+DzI/D6EJQQ7KiPxNpRDfbnxi+6q8c/rwFV7iuXm9fp2knajsc3XuacttUuZjyd7k8n4PK+r4/eZQsAWYSSrzx8BZYKBEyHtAkibDYOngaVrqkmeMZ8PfG5/0vQnzpZlj/Hv4nMf/yJ0Nxjn5W7wJ+QTtrnrT3i9VcJ2+V9z1YEObM2YkynjS7Q5wbdO+OOvgtTpgf9EqeUiuo3WsOdd+PAuKNsPUalQmWckn5ELYNKPjAQfyOp4tcVQlQ91ZVBfBvWlxryu1EjcrdcbKzt3zBAHWJ3+Kcw/DzW2W+wQ4p8sjrbzELv/df9+YCRnrz+ZNS/73KfY7jE+yxZuJGt7hH85/Pi25uUQu/EXgKcJCjbDgXVw8FM4suV4q3LQ1OMJPnmS8ZfBmfC6obHKP1VCQ2Wr9ROnE15rqj2exAlQDjKFgMVp/LybJ6sTLKGt5qHGPtZQ/3pYq2Wn8XMzW4xjmSzG/0VTSKupnXWvCxoqjPNvqPD/LE5cP+G1i/8AE79/VqfZp4pzHfvjH2naJePQex1XLZQfNH6ZLaEQMwQcMUYrqvaYkXi9bqOLICzRmEJsZ/45Pk+rBFNltMZOpJTxy9r8i9vczdF6m8lstI5PmhsPkOi1tNfommlOMi5/XXOT2fiCsEcZPwNvO10LJ3U5dNDCVerUiVCZjdeV8fDw48umdrY1L5v8k9mYm8zHt5nM9KZ/F9uY0STdeedZvVeKc4ng8TRB5SGoLTESRewwCE+i5ZfP4oDoNIgaAg1lUFMElYeNyREN4YngiPX/YrejdYJqbvmB8Qtui4DwBOMLxGQBc3Orqx//t1dmfxdANETj/wL0f/k1VkL9wbb7n5iIQ+zHl82narW2+kIU3arX/c8+22810c0aqyDrL/D5Y0Yy/sYtMOMXRiuwI5WHYdsK2PYiVK+H0DjIvA4m/hCiU6Fgi9F9cHCdsexzG10IKVOM7oO02ZB8HoR0voa38KsuNP6qsUcZ/b/mXpci+rVe1+UierjmC55r/9fom55wLcz9HUQNOvNj+byQ+wl8+Tzsec9oTYbYjYtjygQDMo3kPfQCGDTN6AsVoo+TLpf+zFUHR748frW/eaSAp9E/EqCx/W0ms//KfAyExrQ/d0QfbwVrbSTdD++Csn2QOhMuuccYWXG2TGYYcbEx1RTB9peMvvYhM4yheI6owPyMhOgjJKF3F62NC4N1JcaIiroSY7JHwpDZ4IwN3Gc11cK+1fD1G7DvQ2Po3Kko0/FRF5bQ4yMxfF5jFEhD+emHyVnDITTa6Jsuz4XYEXDtSzDq0lP3e5+N8ESY+YvAHU+IPkgSeqDUFhv9urVFx5N1bUnbBH7KxKpgwAQYOgeGXgiDv3F8WFtntZfEnQnG0KiRC4xWdUir4Vwh/gRutpw+8WpttNqbk3ubecXx9cYqmPYzOG/xmQ9/E0IEhCT0c+H1wP6PYNsLsPf94zeQmCzgjAdnnDGPH3V82RlvJFpnnDHVHIPcNXBgDXz+KKx/yEi2g6cZCX7YHEgc3/4dZ021xufubE7ijcZwv4k/gLGXG18M5zqmW6n
jY67Pph9cCNFt5KLo2SjLNUZgbH8JagqNJJ1xLYy7yhiFYY86u+6GplrI22Ak99w1ULLL2B4aa9wAMvRC4+6ywu3w9b+NLxNPI4QlQfpCSL/c+CII5I05QogeRS6KBoKrHnatgi9fgLwso+95+MXwzQeNLo1AdDPYwmDkfGMCo/V+YK0x5a6Br18/vm9YknFn5djLjREeQX6WoRAi+CShn47WcHSb0aWS8xo0VRk3wcz9PWR+DyIGdu3nhycZLf+Ma41YSnbD4Y0QP9q4bVuSuBCilU4ldKXUAuCvgBl4Wmt93wmv/xK4CfAAJcANWuu8AMfafbQ2xlJvfgaKcow+7fTvwMTrjeFywUikSkHCGGMSQoh2dJjQlVJm4FHgYqAA2KyUWqW13tlqt23AZK11vVLqZ8ADwDVdEXC32Ph3WH0HDMiAb/3Z6BuXMc9CiB6uMy30KcB+rfUBAKXUy8B3gJaErrVe02r/jcAPAhlkt8rbAB/8DkZfBte8GNix1EII0YU603eQDOS3Wi/wbzuVG4H32ntBKbVEKbVFKbWlpKSk81F2l5pjsHIxxKTB5Y9JMhdC9CqdSejtZbV2xzoqpX4ATAYebO91rfWTWuvJWuvJ8fHxnY+yO3jd8K8fGU95ueZFebqLEKLX6UyXSwHQ+o6SFODoiTsppS4CfgtcoLVuCkx43eiD30P+RvjuM3LhUQjRK3Wmhb4ZGKGUSlNKWYFrgVWtd1BKTQSeABZqrYsDH2YXy3kVNv0dpt1sPBpKCCF6oQ4TutbaAywFVgO7gH9prb9WSi1XSi307/YgEAasVEplK6VWneJwPU/RTlh1q3Gb/MXLgx2NEEKctU6NQ9davwu8e8K2u1otXxTguLpHYxW88gPjGYyLnpOiUkKIXq3/3inq88G/f2Y8pPhHb/sfiyaEEL1X/03o6/8Ce96BBfdB6jeCHY0QQpyz/lkMJPcT+OQeGPddmPrTYEcjhBAB0f8SemU+vHojxI2Cbz8sNw8JIfqM/pXQ3Y3wr+uNm4iuedEoVyuEEH1E/+pDf/83Rjnca16EuOHBjkYIIQKq/7TQv3zBKIk7879gzLeDHY0QQgRc/0joR7PhnV8Zj3Gb87tgRyOEEF2i7yd0dyO8eoPxQOar/gHm/tXLJIToP/p+dsv6f1CeC9e/YSR1IYToo/p2C710H2T9BcYvgmFzgh2NEEJ0qb6b0LWGt/8LLA645I/BjkYIIbpc3+1y2f4SHPoMLnsIwhKCHY0QQnS5vtlCryuD1b+FQVNh0o+CHY0QQnSLvpnQP7wLmqqN1rmpb56iEEKcqO9lu0PrIftF+MZSSEwPdjRCCNFt+lZC9zTB27+AqMFwwW+CHY0QQnSrvnVRdP3DULoXvv8qWEODHY0QQnSrvtNCL8uFTx+E9MthxMXBjkYIIbpd30joWsM7v4QQm/EEIiGE6If6RpdLzqtwYC18808QMSDY0QghRFD0/hZ6QwWsvgOSz4PJNwQ7GiGECJre30L/aBnUl8MPXgeTOdjRCCFE0PTuFvrhTcZDK6b9DAZMCHY0QggRVL03oXvdxpjziBS48I5gRyOEEEHXe7tcPn8EinfCtS/Jw56FEILe2kKvOARr74fRl8HobwY7GiGE6BF6X0LXGt75tXEB9NL7gx2NEEL0GL0voe98A/Z/CHN+C5EpwY5GCCF6jN6X0G3hMOpbMGVJsCMRQogepfddFB1+kTEJIYRoo/e10IUQQrSrUwldKbVAKbVHKbVfKXV7O6/blFKv+F/fpJQaEuhAhRBCnF6HCV0pZQYeBS4F0oHrlFInPgroRqBCaz0c+Asgw0+EEKKbdaaFPgXYr7U+oLV2AS8D3zlhn+8Az/uXXwXmKaVU4MIUQgjRkc4k9GQgv9V6gX9bu/torT1AFRB74oGUUkuUUluUUltKSkrOLmIhhBDt6kxCb6+lrc9iH7TWT2qtJ2utJ8fHx3cmPiGEEJ3UmYReAAxqtZ4CHD3VPkqpECA
SKA9EgEIIITqnMwl9MzBCKZWmlLIC1wKrTthnFfAj//JVwCda65Na6EIIrVOJOQAABKtJREFUIbqO6kzeVUp9E3gIMAP/0Frfq5RaDmzRWq9SStmBF4CJGC3za7XWBzo4ZgmQ18FHxwGlHZ9Gn9Sfzx369/n353OH/n3+nTn3VK11u33WnUrowaKU2qK1nhzsOIKhP5879O/z78/nDv37/M/13OVOUSGE6CMkoQshRB/R0xP6k8EOIIj687lD/z7//nzu0L/P/5zOvUf3oQshhOi8nt5CF0II0UmS0IUQoo/okQm9o3K9fY1S6h9KqWKl1FettsUopT5USu3zz6ODGWNXUUoNUkqtUUrtUkp9rZT6T//2/nL+dqXUF0qp7f7zv9u/Pc1finqfvzS1NdixdhWllFkptU0p9bZ/vV+cu1LqkFIqRymVrZTa4t92Tv/ve1xC72S53r7mOWDBCdtuBz7WWo8APvav90Ue4Fda6zHANOAW/793fzn/JmCu1joDyAQWKKWmYZSg/ov//CswSlT3Vf8J7Gq13p/OfY7WOrPV2PNz+n/f4xI6nSvX26dorT/l5No3rUsSPw//v737eZWyiuM4/v6QLqIESVIkkUu0sI3Uxo0uRKSNF3GhmwzqT2ghQW2CVq2ioqVBGxOE0lwqqCAKLtRALTf9wIXiXUm1adF8WpxzaYgJrw4z83TO5wWXmTNzuPd84TzfezgPz/dwaK6DmhPbD2zfqO9/p1zYL9FP/Lb9R22urz8G9lFKUUPD8UvaBhwAjte26CT2/zDVvB9iQl9Lud4ebLH9AErSAzYveDwzV0+6eh24Rkfx1y2H74EV4DzwE/ColqKGtq+BT4H3gFFtb6Kf2A2ck3Rd0uqp91PN+yEeEr2mUrzRFknPA98A79r+rafzUWz/BbwmaSNwGnh1Urf5jmr2JC0DK7avS9q7+vGErs3FXu22fV/SZuC8pLvT/sIhrtDXUq63Bw8lbQWorysLHs/MSFpPSeYnbH9bP+4m/lW2HwGXKPcSNtZS1NDuNbAbOCjpV8rW6j7Kir2H2LF9v76uUP6R72LKeT/EhL6Wcr09GC9J/Dbw3QLHMjN1z/RL4Efbn4x91Uv8L9aVOZKeBfZT7iNcpJSihkbjt/2+7W22lyjX+QXbR+kgdknPSdqw+h54A7jNlPN+kE+KTirXu+AhzZSkk8BeSunMh8CHwBngFLAduAccsd3coSGS9gCXgVv8s4/6AWUfvYf4d1Jufj1DWWCdsv2RpJcpq9YXgJvAW7b/XNxIZ6tuuRyzvdxD7DXG07W5Dvi6liXfxBTzfpAJPSIintwQt1wiIuIpJKFHRDQiCT0iohFJ6BERjUhCj4hoRBJ6xBhJBx9X4VPS0nhlzIihGOKj/xELY/ssfT7IFg3ICj26UVfWdyUdl3Rb0glJ+yVdqfWnd0l6R9IXtf9Xkj6XdFXSz5IOP+5vRCxSEnr05hXgM2AnsAN4E9gDHKM8ofpvW+v3y8DHcxpjxFNJQo/e/GL7lu0RcIdymIAppQeWJvQ/Y3tk+wdgyxzHGfHEktCjN+M1QUZj7RGT7ymN9++npm/8LyWhR0Q0Igk9IqIRqbYYEdGIrNAjIhqRhB4R0Ygk9IiIRiShR0Q0Igk9IqIRSegREY1IQo+IaMTfnMe9CCHUUy0AAAAASUVORK5CYII=\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["df.plot(x=\"minl\", y=[\"r2_train_dt\", \"r2_test_dt\",\n", " \"r2_train_reg\", \"r2_test_reg\",\n", " \"r2_train_rf\", \"r2_test_rf\"]);"]}, {"cell_type": "markdown", "metadata": {}, "source": ["A l'inverse de l'arbre de r\u00e9gression, la for\u00eat al\u00e9atoire est meilleure lorsque ce param\u00e8tre est petit. Une for\u00eat est une moyenne de mod\u00e8le, chacun appris sur un sous-\u00e9chantillon du jeu de donn\u00e9es initiale. M\u00eame si un arbre apprend par coeur, il est peu probable que son voisin ait appris le m\u00eame sous-\u00e9chantillon. En faisant la moyenne, on fait un compromis."]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Validation crois\u00e9e\n", "\n", "Il reste \u00e0 v\u00e9rifier que le mod\u00e8le est robuste. C'est l'objet de la validation crois\u00e9e qui d\u00e9coupe le jeu de donn\u00e9es en 5 parties, apprend sur 4, teste une 1 puis recommence 5 fois en faisant varier la partie qui sert \u00e0 tester."]}, {"cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", "[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 5.4s finished\n"]}, {"data": {"text/plain": ["array([0.05037733, 0.24594631, 0.25811598, 0.348578 , 0.2462281 ])"]}, "execution_count": 25, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.model_selection import cross_val_score\n", "cross_val_score(\n", " RandomForestRegressor(n_estimators=25), X, y, cv=5,\n", " verbose=1)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Ce r\u00e9sultat doit vous interrompre car les performances sont loin d'\u00eatre stables. Deux options : soit le mod\u00e8le n'est pas robuste, soit la m\u00e9thodologie est fausse quelque part. 
Since the problem is fairly simple, the second explanation is the more likely one: the dataset is sorted, with the red wines first and the white wines after. Cross-validation may therefore fit a model mostly on wines of one color and evaluate it on wines of the other, which clearly does not work. It also means that white and red wines are very different, and that the color is probably redundant with the other features. Let's shuffle the data at random."]}, {"cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": ["from sklearn.utils import shuffle\n", "X2, y2 = shuffle(X, y)"]}, {"cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", "[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 5.6s finished\n"]}, {"data": {"text/plain": ["array([0.47975777, 0.50951094, 0.49514404, 0.51110336, 0.51584857])"]}, "execution_count": 27, "metadata": {}, "output_type": "execute_result"}], "source": ["cross_val_score(\n", "    RandomForestRegressor(n_estimators=25), X2, y2, cv=5,\n", "    verbose=1)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Much better. 
A similar check can be done without shuffling beforehand, by passing a `ShuffleSplit` object as `cv` so that each train/test split is drawn at random."]}, {"cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", "[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 6.6s finished\n"]}, {"data": {"text/plain": ["array([0.53754932, 0.54227221, 0.5442236 , 0.57726314, 0.53393994])"]}, "execution_count": 28, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.model_selection import ShuffleSplit\n", "cross_val_score(\n", "    RandomForestRegressor(n_estimators=25), X, y, cv=ShuffleSplit(5),\n", "    verbose=1)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Pipeline\n", "\n", "A model can be fitted on the output of a PCA, but every intermediate step must then be remembered and reapplied before predicting with the final model."]}, {"cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [{"data": {"text/plain": ["PCA(copy=True, iterated_power='auto', n_components=6, random_state=None,\n", " svd_solver='auto', tol=0.0, whiten=False)"]}, "execution_count": 29, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.decomposition import PCA\n", "pca = PCA(6)\n", "pca.fit(X_train, y_train)"]}, {"cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [{"data": {"text/plain": ["RandomForestRegressor(bootstrap=True, ccp_alpha=0.0, criterion='mse',\n", " max_depth=None, max_features='auto', max_leaf_nodes=None,\n", " max_samples=None, min_impurity_decrease=0.0,\n", " min_impurity_split=None, min_samples_leaf=1,\n", " min_samples_split=2, min_weight_fraction_leaf=0.0,\n", " n_estimators=100, n_jobs=None, oob_score=False,\n", " random_state=None, verbose=0, warm_start=False)"]}, "execution_count": 30, "metadata": {}, "output_type": "execute_result"}], "source": ["rf = RandomForestRegressor(n_estimators=100)\n", "X_train_pca = 
pca.transform(X_train)\n", "rf.fit(X_train_pca, y_train)"]}, {"cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.421429956568139"]}, "execution_count": 31, "metadata": {}, "output_type": "execute_result"}], "source": ["X_test_pca = pca.transform(X_test)\n", "pred = rf.predict(X_test_pca)\n", "r2_score(y_test, pred)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Alternatively, we can use a *pipeline*, which chains the preprocessing steps and the predictive model into a single sequence that behaves as one model."]}, {"cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": ["from sklearn.pipeline import Pipeline\n", "pipe = Pipeline([\n", "    ('acp', PCA(n_components=6)),\n", "    ('rf', RandomForestRegressor(n_estimators=100))\n", "])\n", "pipe.fit(X_train, y_train);"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Grid search\n", "\n", "With a pipeline, searching for the model's best hyperparameters becomes straightforward. Each hyperparameter is addressed as `<step name>__<parameter>`; the step name `acp` is the French acronym for PCA."]}, {"cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Fitting 3 folds for each of 12 candidates, totalling 36 fits\n"]}, {"name": "stderr", "output_type": "stream", "text": ["[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", "[Parallel(n_jobs=1)]: Done 36 out of 36 | elapsed: 44.4s finished\n"]}, {"data": {"text/plain": ["GridSearchCV(cv=ShuffleSplit(n_splits=3, random_state=None, test_size=None, train_size=None),\n", " error_score=nan,\n", " estimator=Pipeline(memory=None,\n", " steps=[('acp',\n", " PCA(copy=True, iterated_power='auto',\n", " n_components=6, random_state=None,\n", " svd_solver='auto', tol=0.0,\n", " whiten=False)),\n", " ('rf',\n", " RandomForestRegressor(bootstrap=True,\n", " ccp_alpha=0.0,\n", " criterion='mse',\n", " max_depth=None,\n", " m...\n", 
" min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " n_estimators=100,\n", " n_jobs=None,\n", " oob_score=False,\n", " random_state=None,\n", " verbose=0,\n", " warm_start=False))],\n", " verbose=False),\n", " iid='deprecated', n_jobs=None,\n", " param_grid={'acp__n_components': [1, 4, 7, 10],\n", " 'rf__n_estimators': [10, 20, 50]},\n", " pre_dispatch='2*n_jobs', refit=True, return_train_score=False,\n", " scoring=None, verbose=1)"]}, "execution_count": 33, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.model_selection import GridSearchCV\n", "param_grid = {'acp__n_components': list(range(1, 11, 3)),\n", " 'rf__n_estimators': [10, 20, 50]}\n", "grid = GridSearchCV(pipe, param_grid=param_grid, verbose=1,\n", " cv=ShuffleSplit(3))\n", "grid.fit(X, y)"]}, {"cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [{"data": {"text/plain": ["{'acp__n_components': 10, 'rf__n_estimators': 50}"]}, "execution_count": 34, "metadata": {}, "output_type": "execute_result"}], "source": ["grid.best_params_"]}, {"cell_type": "code", "execution_count": 34, "metadata": {"scrolled": false}, "outputs": [{"data": {"text/plain": ["array([7.1 , 5.06, 7. , ..., 6.74, 4.12, 5.98])"]}, "execution_count": 35, "metadata": {}, "output_type": "execute_result"}], "source": ["grid.predict(X_test)"]}, {"cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.9275290318700775"]}, "execution_count": 36, "metadata": {}, "output_type": "execute_result"}], "source": ["r2_score(y_test, grid.predict(X_test))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Ce nombre para\u00eet beaucoup trop beau pour \u00eatre vrai. 
It most likely means that the test data were used during the search: the grid was fitted on `X` and `y` rather than on the training set alone."]}, {"cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.49487646056265816"]}, "execution_count": 37, "metadata": {}, "output_type": "execute_result"}], "source": ["grid.best_score_"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Much more plausible: `best_score_` is the mean cross-validated score of the best hyperparameter combination."]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Saving and restoring\n", "\n", "The simplest way to keep models in Python is to serialize them: the in-memory object is written to disk and restored later."]}, {"cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": ["import pickle\n", "\n", "with open('piperf.pickle', 'wb') as f:\n", "    pickle.dump(grid, f)"]}, {"cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [{"data": {"text/plain": ["['piperf.pickle']"]}, "execution_count": 39, "metadata": {}, "output_type": "execute_result"}], "source": ["import glob\n", "glob.glob('*.pickle')"]}, {"cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": ["with open(\"piperf.pickle\", 'rb') as f:\n", "    grid2 = pickle.load(f)"]}, {"cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([7.1 , 5.06, 7. , ..., 6.74, 4.12, 5.98])"]}, "execution_count": 41, "metadata": {}, "output_type": "execute_result"}], "source": ["grid2.predict(X_test)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Predicting the color\n", "\n", "The failure of the first cross-validation was a sign that the color is easy to predict. 
Let's check."]}, {"cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": ["Xc = df_data.drop(['quality', 'color'], axis=1)\n", "yc = df_data[\"color\"]"]}, {"cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": ["Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc)"]}, {"cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": ["from sklearn.linear_model import LogisticRegression\n", "log = LogisticRegression(solver='lbfgs', max_iter=1500)\n", "log.fit(Xc_train, yc_train);"]}, {"cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.04459922717947637"]}, "execution_count": 45, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.metrics import log_loss\n", "log_loss(yc_test, log.predict_proba(Xc_test))"]}, {"cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 391, 14],\n", " [ 9, 1211]], dtype=int64)"]}, "execution_count": 46, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.metrics import confusion_matrix\n", "confusion_matrix(yc_test, log.predict(Xc_test))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The confusion matrix speaks for itself: only 23 of the 1,625 test wines are misclassified."]}, {"cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": []}, {"cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": []}], "metadata": {"kernelspec": {"display_name": "Python 3", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2"}}, "nbformat": 4, "nbformat_minor": 2}