Estimating the parameters of a mixture of SIRD models

The virus mutates and some variants are more contagious. In this example, we simulate a population in which two variants circulate, using the CovidSIRDMixture model. The idea is then to fit a simpler model on this population to understand how the parameter \(\beta\) of that simpler model would evolve.

Simulating the data

import warnings
from pprint import pprint
import numpy
from matplotlib.cbook.deprecation import MatplotlibDeprecationWarning
import matplotlib.pyplot as plt
import pandas
from aftercovid.models import EpidemicRegressor, CovidSIRDMixture

model = CovidSIRDMixture()
model

Out:

somewhereaftercovid_39_std/aftercovid/examples/plot_covid_mixture.py:24: MatplotlibDeprecationWarning: The module matplotlib.cbook.deprecation is considered internal and it will be made private in the future.
  from matplotlib.cbook.deprecation import MatplotlibDeprecationWarning

CovidSIRDMixture

Quantities

  • S: individuals not yet infected
  • I1: number of individuals infected or contagious with the first variant
  • I2: number of individuals infected or contagious with the second variant
  • R: recovered individuals
  • D: deceased individuals

Constants

  • N: population

Parameters

  • beta1: transmission rate in the population
  • beta2: second transmission rate in the population
  • mu: 1/mu is the average duration until recovery
  • nu: 1/nu is the average duration until death

Equations

  • \(\frac{dD}{dt} = \nu \left(I_1 + I_2\right)\)
  • \(\frac{dI_1}{dt} = \frac{I_1 S \beta_1}{N} - \frac{I_1 I_2 S \beta_1 \beta_2}{N^2} - I_1 \mu - I_1 \nu\)
  • \(\frac{dI_2}{dt} = \frac{I_2 S \beta_2}{N} - I_2 \mu - I_2 \nu\)
  • \(\frac{dR}{dt} = \mu \left(I_1 + I_2\right)\)
  • \(\frac{dS}{dt} = \frac{I_1 I_2 S \beta_1 \beta_2}{N^2} - \frac{I_1 S \beta_1}{N} - \frac{I_2 S \beta_2}{N}\)
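To make these dynamics concrete, here is a minimal explicit Euler integration of the five equations, written in plain Python with no dependency on aftercovid. The one-day step and N = 10000 are assumptions for this sketch; the library's own integrator may proceed differently.

```python
def sird_mixture_step(S, I1, I2, R, D, beta1, beta2, mu, nu, N, dt=1.0):
    """One explicit Euler step of the mixture equations above."""
    mix = I1 * I2 * S * beta1 * beta2 / N ** 2  # competition between variants
    dS = mix - I1 * S * beta1 / N - I2 * S * beta2 / N
    dI1 = -mix + I1 * S * beta1 / N - I1 * (mu + nu)
    dI2 = I2 * S * beta2 / N - I2 * (mu + nu)
    dR = mu * (I1 + I2)
    dD = nu * (I1 + I2)
    return (S + dt * dS, I1 + dt * dI1, I2 + dt * dI2,
            R + dt * dR, D + dt * dD)


# 90 days from the same starting point as in the example below
state = (9100.0, 80.0, 20.0, 0.0, 0.0)
for _ in range(90):
    state = sird_mixture_step(*state, beta1=0.15, beta2=0.25,
                              mu=0.06, nu=0.04, N=10000)
# S + I1 + I2 + R + D is conserved, and the more contagious variant
# (beta2 > beta1) ends up dominating: I2 > I1 at the end
```

Note that the equations conserve the total population by construction (the five derivatives sum to zero), which is a useful sanity check on any simulation of them.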


Updating the coefficients.

model['beta1'] = 0.15
model['beta2'] = 0.25
model["mu"] = 0.06
model["nu"] = 0.04
pprint(model.P)

Out:

[('beta1', 0.15, 'taux de transmission dans la population'),
 ('beta2', 0.25, 'second taux de transmission dans la population'),
 ('mu', 0.06, '1/. : durée moyenne jusque la guérison'),
 ('nu', 0.04, "1/. : durée moyenne jusqu'au décès")]

Starting point

pprint(model.Q)

Out:

[('S', 9990.0, 'personnes non contaminées'),
 ('I1',
  8.0,
  'nombre de personnes malades ou contaminantes pour le premier variant'),
 ('I2',
  2.0,
  'nombre de personnes malades ou contaminantes pour le second variant'),
 ('R', 0.0, 'personnes guéries (recovered)'),
 ('D', 0.0, 'personnes décédées')]

We start from a somewhat larger initial state, because the estimation is not very reliable at the beginning of an epidemic, as the example Estimation des paramètres d’un modèle SIRD shows.

model.update(S=9100, I1=80, I2=20)
pprint(model.Q)

Out:

[('S', 9100.0, 'personnes non contaminées'),
 ('I1',
  80.0,
  'nombre de personnes malades ou contaminantes pour le premier variant'),
 ('I2',
  20.0,
  'nombre de personnes malades ou contaminantes pour le second variant'),
 ('R', 0.0, 'personnes guéries (recovered)'),
 ('D', 0.0, 'personnes décédées')]

Simulation

X, y = model.iterate2array(90, derivatives=True)
data = {_[0]: x for _, x in zip(model.Q, X.T)}
data.update({('d' + _[0]): c for _, c in zip(model.Q, y.T)})
df = pandas.DataFrame(data)
df.tail()
S I1 I2 R D dS dI1 dI2 dR dD
85 1658.859497 15.487518 399.228607 4275.854492 2850.569824 -16.938131 -1.167224 -23.366255 24.882967 16.588644
86 1641.921387 14.320294 375.862335 4300.737793 2867.158447 -15.777788 -1.082651 -22.157824 23.410957 15.607306
87 1626.143555 13.237642 353.704529 4324.148438 2882.765625 -14.699398 -1.003725 -20.991093 22.016529 14.677687
88 1611.444214 12.233917 332.713409 4346.165039 2897.443359 -13.696982 -0.930137 -19.867615 20.696840 13.797894
89 1597.747192 11.303781 312.845795 4366.861816 2911.241211 -12.765003 -0.861588 -18.788368 19.448975 12.965983
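A quick sanity check on these rows: since the equations conserve the population, each line of the table should sum to the initial total S + I1 + I2 + R + D = 9100 + 80 + 20 = 9200.

```python
# last row of the table above (day 89)
row = {'S': 1597.747192, 'I1': 11.303781, 'I2': 312.845795,
       'R': 4366.861816, 'D': 2911.241211}
total = sum(row.values())
print(round(total, 2))  # -> 9200.0
```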


Visualisation

df.drop('S', axis=1).plot(title="Simulation SIRDMixture")
[figure: Simulation SIRDMixture]

Estimation

The module implements the EpidemicRegressor class, which replicates the scikit-learn API. I1 and I2 must be summed, since the simple SIRD model has a single infected compartment.

X2 = numpy.empty((X.shape[0], 4), dtype=X.dtype)
X2[:, 0] = X[:, 0]
X2[:, 1] = X[:, 1] + X[:, 2]
X2[:, 2] = X[:, 3]
X2[:, 3] = X[:, 4]

y2 = numpy.empty((y.shape[0], 4), dtype=X.dtype)
y2[:, 0] = y[:, 0]
y2[:, 1] = y[:, 1] + y[:, 2]
y2[:, 2] = y[:, 3]
y2[:, 3] = y[:, 4]

X, y = X2, y2

m = EpidemicRegressor('SIRD', verbose=True, learning_rate_init=1e-3,
                      max_iter=15, early_th=1)
m.fit(X, y)
pprint(m.model_.P)

Out:

0/15: loss: 7.289e+06 lr=1.13e-09
1/15: loss: 1.537e+06 lr=1.13e-09
2/15: loss: 4.064e+05 lr=1.13e-09
3/15: loss: 1.286e+05 lr=1.13e-09
4/15: loss: 4.238e+04 lr=1.13e-09
5/15: loss: 1.611e+04 lr=1.13e-09
6/15: loss: 7313 lr=1.13e-09
7/15: loss: 5092 lr=1.13e-09
8/15: loss: 4356 lr=1.13e-09
9/15: loss: 4253 lr=1.13e-09
10/15: loss: 4213 lr=1.13e-09
11/15: loss: 4242 lr=1.13e-09
12/15: loss: 4266 lr=1.13e-09
13/15: loss: 4221 lr=1.13e-09
14/15: loss: 4208 lr=1.13e-09
15/15: loss: 4216 lr=1.13e-09
[('beta', 0.21373683083252895, 'taux de transmission dans la population'),
 ('mu', 0.05959126247389176, '1/. : durée moyenne jusque la guérison'),
 ('nu', 0.03959126260267731, "1/. : durée moyenne jusqu'au décès")]
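With the usual SIRD formula \(R_0 = \beta / (\mu + \nu)\) (an individual infects at rate \(\beta\) during an average infectious period of \(1/(\mu+\nu)\) days), the fitted coefficients give a reproduction number between the two variants' own values, \(0.15/0.1 = 1.5\) and \(0.25/0.1 = 2.5\):

```python
beta, mu, nu = 0.2137, 0.0596, 0.0396  # fitted values above, rounded
r0 = beta / (mu + nu)
print(round(r0, 2))  # -> 2.15
```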

The population's reaction is not constant throughout an epidemic: its behaviour may change as the disease spreads. We therefore estimate the model coefficients on a sliding window.

def find_best_model(Xt, yt, lrs, th):
    best_est, best_loss = None, None
    for lr in lrs:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", RuntimeWarning)
            m = EpidemicRegressor(
                'SIRDc',
                learning_rate_init=lr,
                max_iter=500,
                early_th=1)
            m.fit(Xt, yt)
            loss = m.score(Xt, yt)
            if numpy.isnan(loss):
                continue
        if best_est is None or best_loss > loss:
            best_est = m
            best_loss = loss
        if best_loss < th:
            return best_est, best_loss
    return best_est, best_loss

We estimate the model coefficients day by day, each time over the previous 10 days (a sliding 10-day window).

coefs = []
for k in range(0, X.shape[0] - 9):
    end = min(k + 10, X.shape[0])
    Xt, yt = X[k:end], y[k:end]
    m, loss = find_best_model(Xt, yt, [1e-2, 1e-3], 10)
    loss = m.score(Xt, yt)
    print("k={} iter={} loss={:1.3f} coef={}".format(
        k, m.iter_, loss, m.model_._val_p))
    obs = dict(k=k, loss=loss, it=m.iter_, R0=m.model_.R0())
    obs.update({k: v for k, v in zip(m.model_.param_names, m.model_._val_p)})
    coefs.append(obs)

Out:

k=0 iter=110 loss=1.059 coef=[ 0.16914272  0.06107888  0.03922973 -0.03476577  0.          0.01382251]
k=1 iter=104 loss=1.010 coef=[ 0.16947373  0.06096531  0.04019742 -0.02168555  0.          0.01277832]
k=2 iter=87 loss=1.047 coef=[ 0.16982287  0.06053838  0.04157132 -0.00365807  0.          0.00281632]
k=3 iter=90 loss=1.699 coef=[ 0.173809    0.06144818  0.03807857 -0.03795552  0.          0.03655072]
k=4 iter=73 loss=1.480 coef=[ 0.1730155   0.05961993  0.04056164 -0.025299    0.          0.00104157]
k=5 iter=69 loss=1.746 coef=[ 0.17475486  0.06014877  0.04004698 -0.02642347  0.          0.00660412]
k=6 iter=64 loss=2.075 coef=[ 0.17705007  0.05885736  0.03921902 -0.0371571   0.          0.00231814]
k=7 iter=58 loss=2.457 coef=[ 0.17910118  0.06048205  0.03848627 -0.03073262  0.          0.02658902]
k=8 iter=47 loss=2.542 coef=[ 0.18022569  0.05958984  0.03894912 -0.0269172   0.          0.00767131]
k=9 iter=42 loss=3.847 coef=[ 0.18334175  0.05806602  0.03780021 -0.0541317   0.          0.00557352]
k=10 iter=39 loss=3.453 coef=[ 0.18460396  0.06017804  0.03930057 -0.01659643  0.          0.01633495]
k=11 iter=36 loss=4.335 coef=[ 0.18572756  0.05857751  0.03834809 -0.04329236  0.          0.00423948]
k=12 iter=34 loss=4.509 coef=[ 0.18762256  0.05939336  0.03838838 -0.02913562  0.          0.01722275]
k=13 iter=27 loss=4.235 coef=[ 0.18747346  0.06131825  0.0398541  -0.00091056  0.          0.02276878]
k=14 iter=28 loss=6.101 coef=[ 0.1908859   0.05818339  0.03793455 -0.04174065  0.          0.00431861]
k=15 iter=19 loss=5.803 coef=[ 0.19127557  0.05955157  0.03923351 -0.02242464  0.          0.00542906]
k=16 iter=21 loss=5.753 coef=[ 1.93513506e-01  5.96464239e-02  3.96153109e-02 -9.25576924e-03
  3.47709603e-05  5.29725776e-04]
k=17 iter=14 loss=8.348 coef=[ 1.95503679e-01  6.00140419e-02  3.76964716e-02 -2.57959556e-02
  8.73333424e-05  4.31909339e-02]
k=18 iter=16 loss=7.828 coef=[ 1.97014732e-01  6.06584806e-02  3.84417710e-02 -1.51995111e-02
  2.73726883e-05  3.83190028e-02]
k=19 iter=14 loss=7.687 coef=[ 0.19665662  0.06087182  0.03986953 -0.00117897  0.00028452  0.01526039]
k=20 iter=118 loss=8.722 coef=[ 0.20103198  0.06191527  0.03938572 -0.00141306  0.          0.04004642]
k=21 iter=95 loss=9.979 coef=[ 2.03040231e-01  6.00247835e-02  3.90913799e-02 -1.96995925e-02
  8.39900991e-07  1.45168830e-02]
k=22 iter=14 loss=9.645 coef=[ 0.20092359  0.0604967   0.03865549 -0.00478381  0.00207409  0.03294865]
k=23 iter=14 loss=10.987 coef=[ 0.20449777  0.05938598  0.03847792 -0.02202142  0.0011395   0.01577144]
k=24 iter=68 loss=13.558 coef=[ 0.20645235  0.05940806  0.03870703 -0.03323812  0.          0.01138196]
k=25 iter=41 loss=9.511 coef=[ 0.20593824  0.05973506  0.03969748 -0.0085143   0.00223832  0.00134106]
k=26 iter=37 loss=10.395 coef=[ 2.07842938e-01  5.97450557e-02  3.96773675e-02 -1.47546373e-02
  1.41208834e-04  1.24490795e-03]
k=27 iter=14 loss=14.160 coef=[ 0.21049429  0.05889326  0.03868654 -0.02417884  0.00188262  0.00309267]
k=28 iter=26 loss=10.887 coef=[ 0.21021969  0.05920342  0.03907415 -0.01760655  0.00132278  0.00262502]
k=29 iter=45 loss=13.647 coef=[ 0.21161024  0.0591627   0.0389828  -0.02840141  0.          0.00308339]
k=30 iter=58 loss=17.938 coef=[ 0.21276825  0.06276089  0.03900258 -0.01113953  0.          0.06615869]
k=31 iter=40 loss=13.489 coef=[ 0.21373573  0.06108927  0.03888641 -0.01351446  0.          0.0379311 ]
k=32 iter=39 loss=10.034 coef=[ 0.21374876  0.05972912  0.03944355 -0.01582065  0.          0.00482376]
k=33 iter=46 loss=16.853 coef=[ 0.21478163  0.06044689  0.03890089 -0.02310407  0.          0.02676801]
k=34 iter=85 loss=10.169 coef=[ 2.15292170e-01  6.08632059e-02  3.90358367e-02 -1.49863903e-03
  2.99693113e-05  3.05314925e-02]
k=35 iter=32 loss=9.370 coef=[ 0.21609759  0.05959421  0.03938347 -0.01841362  0.          0.00354949]
k=36 iter=17 loss=10.565 coef=[ 0.21826646  0.06014015  0.03950772 -0.01410564  0.00035765  0.01055974]
k=37 iter=35 loss=7.986 coef=[ 0.21733075  0.05979801  0.03937352 -0.01582762  0.          0.0071754 ]
k=38 iter=63 loss=8.729 coef=[ 0.21678592  0.0602616   0.03900896 -0.00766525  0.          0.02117895]
k=39 iter=25 loss=19.343 coef=[ 0.21928406  0.06038788  0.03939661 -0.02761133  0.          0.01724505]
k=40 iter=31 loss=5.854 coef=[ 0.21898938  0.0597618   0.03957733 -0.01527857  0.          0.00309354]
k=41 iter=27 loss=3.235 coef=[ 0.2190856   0.06005614  0.03997067 -0.00357261  0.          0.0014252 ]
k=42 iter=37 loss=9.122 coef=[ 2.20954385e-01  6.00172958e-02  3.89957648e-02 -1.72000213e-02
  7.78719833e-05  1.75590712e-02]
k=43 iter=8 loss=4.767 coef=[ 0.21885328  0.05946795  0.03922893 -0.00334712  0.00210546  0.00471232]
k=44 iter=33 loss=3.680 coef=[ 0.22070812  0.06085672  0.0394409  -0.00067373  0.0007384   0.02416486]
k=45 iter=76 loss=3.296 coef=[ 0.22065076  0.05948004  0.03933465 -0.01277711  0.00072714  0.00267557]
k=46 iter=41 loss=5.250 coef=[ 2.22252740e-01  5.97919777e-02  3.93012444e-02 -2.02544896e-02
  4.86985739e-06  8.31770067e-03]
k=47 iter=98 loss=3.308 coef=[ 0.22188061  0.05972756  0.03955278 -0.01659482  0.00343663  0.00411645]
k=48 iter=56 loss=8.219 coef=[ 0.22244612  0.06065243  0.03927177 -0.01864659  0.00276574  0.02460759]
k=49 iter=67 loss=3.711 coef=[ 0.22219893  0.06010201  0.03948539 -0.01528402  0.0047253   0.01210612]
k=50 iter=97 loss=1.599 coef=[ 0.22167667  0.06005235  0.03952954 -0.00841225  0.00544736  0.01070124]
k=51 iter=111 loss=1.291 coef=[ 0.22080354  0.05923094  0.03919226 -0.0142203   0.01075969  0.00425567]
k=52 iter=115 loss=1.240 coef=[ 0.22062159  0.05888222  0.0391302  -0.01907684  0.01456495  0.00064807]
k=53 iter=93 loss=1.367 coef=[ 0.22181334  0.05971533  0.03924529 -0.01233386  0.01100504  0.01169993]
k=54 iter=85 loss=2.048 coef=[ 0.22241826  0.06029065  0.03907731 -0.00800079  0.00966213  0.02392032]
k=55 iter=137 loss=1.218 coef=[ 0.22059037  0.06032301  0.03898394 -0.00279907  0.01716799  0.02879514]
k=56 iter=24 loss=2.079 coef=[ 0.22459907  0.06079658  0.03932538 -0.00671788  0.0023656   0.02572049]
k=57 iter=125 loss=1.431 coef=[ 0.22217719  0.06074922  0.03901603 -0.00141435  0.01371891  0.03432932]
k=58 iter=101 loss=1.240 coef=[ 0.22329811  0.05965272  0.03922602 -0.02035966  0.00921155  0.01036695]
k=59 iter=76 loss=1.473 coef=[ 0.22437641  0.05990879  0.03928351 -0.02094756  0.00636534  0.01277785]
k=60 iter=99 loss=1.697 coef=[ 0.22449795  0.05997189  0.03916783 -0.02286079  0.00804191  0.01643944]
k=61 iter=16 loss=0.626 coef=[ 2.25588970e-01  6.04165062e-02  3.96788597e-02 -9.61692204e-03
  1.53811964e-04  1.20728766e-02]
k=62 iter=40 loss=2.098 coef=[ 0.22684421  0.06084754  0.03934285 -0.01772781  0.0015767   0.02613916]
k=63 iter=42 loss=2.305 coef=[ 0.22725646  0.06050973  0.03925942 -0.02697024  0.00159094  0.021856  ]
k=64 iter=34 loss=2.462 coef=[ 0.22801663  0.06053059  0.03923822 -0.03046656  0.00055414  0.02223078]
k=65 iter=32 loss=1.579 coef=[ 2.27884001e-01  6.05298879e-02  3.93963443e-02 -2.51160221e-02
  1.12314403e-04  1.92822060e-02]
k=66 iter=47 loss=2.815 coef=[ 0.22885219  0.06074297  0.03910275 -0.0352568   0.00075727  0.02838009]
k=67 iter=45 loss=2.130 coef=[ 0.22919865  0.06080311  0.03922997 -0.0311114   0.00025152  0.0269489 ]
k=68 iter=50 loss=2.987 coef=[ 0.23041256  0.06211032  0.03902089 -0.01698834  0.00037711  0.05326218]
k=69 iter=55 loss=2.147 coef=[ 2.30570128e-01  6.10267469e-02  3.92009313e-02 -3.36294349e-02
  1.19888309e-04  3.12623369e-02]
k=70 iter=60 loss=0.828 coef=[ 0.229659    0.06065226  0.03956549 -0.02285491  0.          0.01836063]
k=71 iter=67 loss=0.905 coef=[ 0.23021589  0.06083164  0.03949669 -0.02302599  0.          0.02258533]
k=72 iter=78 loss=0.986 coef=[ 0.23116608  0.06080507  0.03948923 -0.02765347  0.          0.02227547]
k=73 iter=96 loss=2.297 coef=[ 2.33379417e-01  6.12257858e-02  3.90154445e-02 -4.76740690e-02
  1.00522885e-04  3.80626269e-02]
k=74 iter=107 loss=2.620 coef=[ 2.34998336e-01  6.18057136e-02  3.89427524e-02 -4.52189528e-02
  7.47378734e-05  4.94493356e-02]
k=75 iter=108 loss=2.082 coef=[ 2.35681515e-01  6.24371618e-02  3.91306179e-02 -2.63238643e-02
  3.83989544e-07  5.67534363e-02]
k=76 iter=120 loss=0.832 coef=[ 0.23371607  0.06098458  0.03959661 -0.02906851  0.          0.02341687]
k=77 iter=137 loss=0.743 coef=[ 0.23427768  0.06104425  0.03968816 -0.02535863  0.          0.02280683]
k=78 iter=154 loss=0.964 coef=[ 0.23609188  0.0611679   0.03960577 -0.03365469  0.          0.02634778]
k=79 iter=150 loss=0.626 coef=[ 0.23542907  0.06099408  0.03999775 -0.01633556  0.          0.01659586]
k=80 iter=202 loss=1.319 coef=[ 0.23975247  0.06163158  0.0394909  -0.04178003  0.          0.03623702]

Summary

dfcoef = pandas.DataFrame(coefs).set_index('k')
dfcoef
loss it R0 beta mu nu a b c
k
0 1.059427 110 1.686223 0.169143 0.061079 0.039230 -0.034766 0.0 0.013823
1 1.010059 104 1.675259 0.169474 0.060965 0.040197 -0.021686 0.0 0.012778
2 1.046702 87 1.663141 0.169823 0.060538 0.041571 -0.003658 0.0 0.002816
3 1.699448 90 1.746355 0.173809 0.061448 0.038079 -0.037956 0.0 0.036551
4 1.480194 73 1.727019 0.173015 0.059620 0.040562 -0.025299 0.0 0.001042
... ... ... ... ... ... ... ... ... ...
76 0.832273 120 2.323656 0.233716 0.060985 0.039597 -0.029069 0.0 0.023417
77 0.743286 137 2.325743 0.234278 0.061044 0.039688 -0.025359 0.0 0.022807
78 0.963892 154 2.342793 0.236092 0.061168 0.039606 -0.033655 0.0 0.026348
79 0.625956 150 2.331169 0.235429 0.060994 0.039998 -0.016336 0.0 0.016596
80 1.319360 202 2.370912 0.239752 0.061632 0.039491 -0.041780 0.0 0.036237

81 rows × 9 columns
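One hypothetical reading of the upward drift of beta in this table (from about 0.17 to 0.24): the single fitted beta behaves roughly like an average of beta1 and beta2 weighted by the sizes of I1 and I2, so it moves toward beta2 as the more contagious variant takes over. A sketch with approximate values taken from the simulation above:

```python
beta1, beta2 = 0.15, 0.25
I1, I2 = 11.3, 312.8  # approximate values on day 89 of the simulation
eff_beta = (beta1 * I1 + beta2 * I2) / (I1 + I2)
print(round(eff_beta, 3))  # -> 0.247
```

This is close to the last fitted values of beta in the table, which is consistent with the interpretation, though the fit also absorbs the competition term that the simple SIRD model lacks.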



Let's visualize the results.

with warnings.catch_warnings():
    warnings.simplefilter("ignore", MatplotlibDeprecationWarning)
    fig, ax = plt.subplots(2, 3, figsize=(14, 6))
    dfcoef[["mu", "nu"]].plot(ax=ax[0, 0], logy=True)
    dfcoef[["beta"]].plot(ax=ax[0, 1], logy=True)
    dfcoef[["loss"]].plot(ax=ax[1, 0], logy=True)
    dfcoef[["R0"]].plot(ax=ax[0, 2])
    df.plot(ax=ax[1, 1], logy=True)
    fig.suptitle('Estimation de R0 tout au long de la simulation', fontsize=12)
[figure: Estimation de R0 tout au long de la simulation]

Total running time of the script: ( 1 minutes 50.730 seconds)

Gallery generated by Sphinx-Gallery