Scoring Forecast models¶

This notebook explains how the Scorer class can be used to compute the following metrics:

MAE (mean absolute error);
MSE (mean squared error);
CRPS (Continuous Ranked Probability Score);
Log Score.
Interval Score
Weighted interval score (wis)

The CRPS and Log Score parameterize each prediction as a log-normal distribution. Details about how the log-normal parameters are computed are available here.

To see the plots below as interactive charts use alt.renderers.enable("default")

In [1]:

Copied!

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

In [2]:

Copied!

import os
from dotenv import load_dotenv
load_dotenv()

api_key = os.getenv("API_KEY")
import os
from dotenv import load_dotenv
load_dotenv()

api_key = os.getenv("API_KEY")

In [3]:

Copied!

import altair as alt 
alt.renderers.enable("png")
import altair as alt 
alt.renderers.enable("png")

Out[3]:

RendererRegistry.enable('png')

In [4]:

Copied!





import numpy as np
import pandas as pd
from mosqlient import get_infodengue
from mosqlient.scoring import Scorer
import numpy as np
import pandas as pd
from mosqlient import get_infodengue
from mosqlient.scoring import Scorer

To show the use of the class to compare the score of new predictions with the predictions registered on the platform it will be used the baseline model.

In [5]:

Copied!

from datetime import date
from mosqlient.forecast import Arima
from datetime import date
from mosqlient.forecast import Arima

It will compare the predictions of id 77 and 78, which refer to the geocode 3304557, between 2022-01-02 and 2023-06-25.

So initially, we will get the data that will be used to train the model and generate the new predictions:

In [6]:

Copied!





disease = 'dengue'
geocode = 3304557 
end_date = date.today().strftime('%Y-%m-%d')

df = get_infodengue(api_key=api_key,
                    disease = 'dengue',
                    start_date = '2010-01-01', 
                    end_date = '2023-12-31', 
                    geocode = 3304557)

df['data_iniSE'] = pd.to_datetime(df['data_iniSE'])

df.set_index('data_iniSE', inplace = True )
    
df = df[['casos']].rename(columns = {'casos':'y'})

df = df.resample('W-SUN').sum()

df.head()
disease = 'dengue'
geocode = 3304557 
end_date = date.today().strftime('%Y-%m-%d')

df = get_infodengue(api_key=api_key,
                    disease = 'dengue',
                    start_date = '2010-01-01', 
                    end_date = '2023-12-31', 
                    geocode = 3304557)

df['data_iniSE'] = pd.to_datetime(df['data_iniSE'])

df.set_index('data_iniSE', inplace = True )
    
df = df[['casos']].rename(columns = {'casos':'y'})

df = df.resample('W-SUN').sum()

df.head()

100%|██████████| 2/2 [00:01<00:00,  1.78requests/s]

Out[6]:

	y
data_iniSE
2010-01-03	30
2010-01-10	44
2010-01-17	46
2010-01-24	47
2010-01-31	68

Create the model:

In [7]:

Copied!

m_arima = Arima(df = df)

m_arima
m_arima = Arima(df = df)

m_arima

Out[7]:

<mosqlient.forecast.baseline.Arima at 0x7937cfcd9a90>

Train the model:

In [8]:

Copied!

model = m_arima.train(train_ini_date='2010-01-01', train_end_date = '2021-12-31')
model = m_arima.train(train_ini_date='2010-01-01', train_end_date = '2021-12-31')

Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=-292.056, Time=0.15 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=-263.166, Time=0.05 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=-297.770, Time=0.03 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=-289.846, Time=0.06 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=-265.069, Time=0.03 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=-302.929, Time=0.06 sec
 ARIMA(3,1,0)(0,0,0)[0] intercept   : AIC=-302.144, Time=0.06 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=-309.026, Time=0.17 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=-301.338, Time=0.08 sec
 ARIMA(3,1,1)(0,0,0)[0] intercept   : AIC=-307.027, Time=0.25 sec
 ARIMA(1,1,2)(0,0,0)[0] intercept   : AIC=-305.684, Time=0.19 sec
 ARIMA(3,1,2)(0,0,0)[0] intercept   : AIC=-306.429, Time=0.32 sec
 ARIMA(2,1,1)(0,0,0)[0]             : AIC=-310.932, Time=0.05 sec
 ARIMA(1,1,1)(0,0,0)[0]             : AIC=-303.191, Time=0.05 sec
 ARIMA(2,1,0)(0,0,0)[0]             : AIC=-304.794, Time=0.09 sec
 ARIMA(3,1,1)(0,0,0)[0]             : AIC=-308.934, Time=0.08 sec
 ARIMA(2,1,2)(0,0,0)[0]             : AIC=inf, Time=0.19 sec
 ARIMA(1,1,0)(0,0,0)[0]             : AIC=-299.603, Time=0.02 sec
 ARIMA(1,1,2)(0,0,0)[0]             : AIC=-307.585, Time=0.09 sec
 ARIMA(3,1,0)(0,0,0)[0]             : AIC=-304.014, Time=0.03 sec
 ARIMA(3,1,2)(0,0,0)[0]             : AIC=-308.337, Time=0.11 sec

Best model:  ARIMA(2,1,1)(0,0,0)[0]          
Total fit time: 2.174 seconds

Generate the out-of-sample predictions:

In [9]:

Copied!

df_out = m_arima.predict_out_of_sample(horizon = 4, end_date = '2023-06-25', plot = False)

df_out.head()
df_out = m_arima.predict_out_of_sample(horizon = 4, end_date = '2023-06-25', plot = False)

df_out.head()

Out[9]:

	lower_95	upper_95	lower_90	upper_90	lower_80	upper_80	lower_50	upper_50	pred	date	data
0	2.380135	5.611532	2.541974	5.220175	2.744047	4.806410	3.123123	4.194429	3.614649	2022-01-02	21.0
1	2.039724	5.982174	2.213044	5.457454	2.433656	4.915421	2.859450	4.139010	3.433217	2022-01-09	14.0
2	1.717638	6.491374	1.897142	5.785903	2.130666	5.076974	2.596284	4.099097	3.252114	2022-01-16	22.0
3	1.490893	7.007500	1.671502	6.119112	1.910814	5.247761	2.401294	4.084722	3.118683	2022-01-23	19.0
0	36.296197	129.720212	39.933812	116.222528	44.643670	102.575269	53.967930	83.592642	66.974354	2022-01-30	31.0

To compare the model with other predictions in the platform, it is necessary to provide the ids of the predictions, the dataframe of the new prediction with the columns date, pred, lower and upper, and the dataframe with the true values to be compared (df_true). This dataframe must contain the columns date and casos.

Redefine the data to compare the predictions:

In [10]:

Copied!

data = df.reset_index()

data = data.rename(columns = {'data_iniSE': 'date', 'y':'casos'})

data.head()
data = df.reset_index()

data = data.rename(columns = {'data_iniSE': 'date', 'y':'casos'})

data.head()

Out[10]:

	date	casos
0	2010-01-03	30.0
1	2010-01-10	44.0
2	2010-01-17	46.0
3	2010-01-24	47.0
4	2010-01-31	68.0

import mosqlient as mosq t = mosq.get_prediction_by_id(api_key=api_key, id=1087) t.to_dataframe()

In [11]:

Copied!





%%time
score = Scorer(
    api_key=api_key,
    df_true=data,
    ids=[1087],
    pred=df_out,
)
%%time
score = Scorer(
    api_key=api_key,
    df_true=data,
    ids=[1087],
    pred=df_out,
)

CPU times: user 625 ms, sys: 8.02 ms, total: 633 ms
Wall time: 4.06 s

The class above can be initialized with just the ids or pred parameter filled.

To see the MAE error for your model (key = pred) and the other use:

In [12]:

Copied!

score.mae
score.mae

Out[12]:

{'pred': 129.92568967185332, '1087': 615.9091087085636}

To plot a bar chart:

In [13]:

Copied!

score.plot_mae()
score.plot_mae()

Out[13]:

No description has been provided for this image

To see the MSE error for your model (key = pred) and the other use:

In [14]:

Copied!

score.mse
score.mse

Out[14]:

{'pred': 41097.5227557785, '1087': 734525.5302135687}

To plot a bar chart:

In [15]:

Copied!

score.plot_mse()
score.plot_mse()

Out[15]:

To see the CRPS score for your model (key = pred) and the other, use the code below. The first dict contains the score by each point, and the second one shows the mean of the score.

The CRPS score is computed using the formula below when dist='log_normal', then: $$ \operatorname{CRPS}(\operatorname{LN}(\mu_i, \sigma_i), y_i) = y_i \left[ 2\Phi(y_i) - 1 \right] - 2\exp\left(\mu_i + \frac{\sigma_i^2}{2}\right) \left[ \Phi(\omega_i - \sigma_i) + \Phi\left(\frac{\sigma_i} {\sqrt{2}}\right) \right], $$ where $\omega_i = \frac{\operatorname{log}y_i - \mu_i}{\sigma_i}$, in which $y_i$ refers to the cases observed in week $i$, $\mu_i$ is the forecast for week $i$ and $\sigma_i$ is the standard deviation of the forecast in week $i$.

To access the CRPS score use:

In [16]:

Copied!

score.crps
score.crps

Out[16]:

({'pred': date
  2022-10-09     12.824913
  2022-10-16     17.968062
  2022-10-23     18.934474
  2022-10-30     23.340983
  2022-11-06     64.025511
  2022-11-13     26.025794
  2022-11-20     11.524928
  2022-11-27     12.545239
  2022-12-04     37.302721
  2022-12-11     27.401781
  2022-12-18     34.742208
  2022-12-25     28.648640
  2023-01-01     53.749019
  2023-01-08     75.912627
  2023-01-15     74.952830
  2023-01-22     71.067227
  2023-01-29     75.661439
  2023-02-05     67.915206
  2023-02-12    128.651877
  2023-02-19    115.924450
  2023-02-26    385.320567
  2023-03-05    429.230153
  2023-03-12    439.599199
  2023-03-19    469.786637
  2023-03-26    194.497343
  2023-04-02    291.302903
  2023-04-09    340.958328
  2023-04-16    464.517550
  2023-04-23    240.054338
  2023-04-30    280.421580
  2023-05-07    372.901456
  2023-05-14    475.294996
  2023-05-21    158.646668
  2023-05-28    194.717756
  2023-06-04    257.840314
  2023-06-11    306.453122
  2023-06-18    278.878464
  2023-06-25    148.616863
  dtype: float64,
  '1087': date
  2022-10-09     122.928540
  2022-10-16     157.128887
  2022-10-23     105.394784
  2022-10-30      16.183432
  2022-11-06      60.799867
  2022-11-13      27.547187
  2022-11-20      90.372713
  2022-11-27     141.283674
  2022-12-04     142.760249
  2022-12-11     118.622987
  2022-12-18      76.834504
  2022-12-25      37.037587
  2023-01-01       3.640307
  2023-01-08     139.797746
  2023-01-15     187.328836
  2023-01-22      97.553246
  2023-01-29     101.146750
  2023-02-05      17.767231
  2023-02-12     158.715423
  2023-02-19      52.664425
  2023-02-26     371.737816
  2023-03-05     458.311154
  2023-03-12     694.875100
  2023-03-19     964.551203
  2023-03-26    1280.654963
  2023-04-02    1077.862762
  2023-04-09    1497.116676
  2023-04-16    1279.683129
  2023-04-23    1369.273390
  2023-04-30    1740.582840
  2023-05-07    1655.467209
  2023-05-14    1585.906214
  2023-05-21    1356.085433
  2023-05-28    1218.369286
  2023-06-04    1138.610388
  2023-06-11    1282.560595
  2023-06-18    1158.720704
  2023-06-25     927.512307
  dtype: float64},
 {'pred': np.float64(176.53047804024365),
  '1087': np.float64(602.9839354033297)})

To plot the score you can use:

In [17]:

Copied!

score.plot_crps()
score.plot_crps()

Out[17]:

To see the Log score for your model (key = pred) and the other, use the code below. The first dict contains the score by each point, and the second one shows the mean of the score.

The Log score is computed using the formula below when dist='log_normal', then: $$ \operatorname{LogS}(\operatorname{LN}(\mu_i, \sigma_i), y_i) = \log y_i + \log \sigma_i + \frac{1}{2} \log (2\pi) \\ + \frac{(\log y_i - \mu_i)^2}{2\sigma_i^2}. \nonumber $$

To access the score use:

In [18]:

Copied!

score.log_score
score.log_score

Out[18]:

({'pred': date
  2022-10-09   -4.726129
  2022-10-16   -5.253850
  2022-10-23   -5.267267
  2022-10-30   -5.371019
  2022-11-06   -7.092913
  2022-11-13   -5.418675
  2022-11-20   -4.837875
  2022-11-27   -4.879481
  2022-12-04   -5.305424
  2022-12-11   -5.145771
  2022-12-18   -5.313283
  2022-12-25   -5.605567
  2023-01-01   -5.736034
  2023-01-08   -5.981004
  2023-01-15   -6.106720
  2023-01-22   -6.365071
  2023-01-29   -6.175830
  2023-02-05   -6.540963
  2023-02-12   -6.594165
  2023-02-19   -6.893570
  2023-02-26   -7.549482
  2023-03-05   -7.639048
  2023-03-12   -7.809768
  2023-03-19   -8.072596
  2023-03-26   -7.535772
  2023-04-02   -7.723401
  2023-04-09   -8.117515
  2023-04-16   -8.137527
  2023-04-23   -7.581368
  2023-04-30   -7.982317
  2023-05-07   -8.140518
  2023-05-14   -8.265051
  2023-05-21   -7.453745
  2023-05-28   -7.573456
  2023-06-04   -7.726165
  2023-06-11   -7.958195
  2023-06-18   -7.798377
  2023-06-25   -7.337817
  dtype: float64,
  '1087': date
  2022-10-09   -100.000000
  2022-10-16    -19.593705
  2022-10-23    -17.156070
  2022-10-30     -4.833135
  2022-11-06     -8.321957
  2022-11-13     -5.236594
  2022-11-20    -19.428080
  2022-11-27    -44.565292
  2022-12-04    -25.792618
  2022-12-11    -44.683243
  2022-12-18    -22.798509
  2022-12-25     -6.336284
  2023-01-01     -3.640273
  2023-01-08    -37.031695
  2023-01-15    -48.397196
  2023-01-22    -11.755754
  2023-01-29    -12.795067
  2023-02-05     -4.965794
  2023-02-12    -27.163728
  2023-02-19     -6.837575
  2023-02-26    -54.529135
  2023-03-05    -37.206181
  2023-03-12    -71.426071
  2023-03-19   -100.000000
  2023-03-26   -100.000000
  2023-04-02    -90.025644
  2023-04-09   -100.000000
  2023-04-16   -100.000000
  2023-04-23   -100.000000
  2023-04-30   -100.000000
  2023-05-07   -100.000000
  2023-05-14   -100.000000
  2023-05-21   -100.000000
  2023-05-28    -85.910048
  2023-06-04   -100.000000
  2023-06-11   -100.000000
  2023-06-18   -100.000000
  2023-06-25   -100.000000
  dtype: float64},
 {'pred': np.float64(-6.7108613693534975),
  '1087': np.float64(-55.53762230565999)})

To plot this score:

In [19]:

Copied!

score.plot_log_score()
score.plot_log_score()

Out[19]:

To see the Interval score for your model (key = pred) and the other, use the code below. The first dict contains the score by each point, and the second one shows the mean of the score.

The Interval score is computed using the formula below:

$$ S^{int}_\alpha(l_i, u_i; y_i) = u_i - l_i + \cfrac{2}{\alpha}(l_i - y_i)I\{y_i < l_i\} + \cfrac{2}{\alpha}(y_i - u_i)I\{y_i > u_i\} $$

where $I$ is the indicator function, $\alpha$ the significance level of the interval, $u_i$ the upper value of the interval at week $i$ and $l_i$ the lower value.

To access the score use:

In [20]:

Copied!

score.interval_score
score.interval_score

Out[20]:

({'pred': date
  2022-10-09      923.241265
  2022-10-16     1878.008815
  2022-10-23     1644.530902
  2022-10-30     1625.697892
  2022-11-06     2417.187072
  2022-11-13     1732.150236
  2022-11-20     1160.923642
  2022-11-27     1096.433919
  2022-12-04      220.439307
  2022-12-11      903.874574
  2022-12-18     1001.645377
  2022-12-25     2077.079350
  2023-01-01     1242.102005
  2023-01-08     1374.393674
  2023-01-15     2344.654274
  2023-01-22     3966.181015
  2023-01-29     2528.934852
  2023-02-05     5967.902669
  2023-02-12     3586.822465
  2023-02-19     6772.323339
  2023-02-26     3588.503469
  2023-03-05     6825.030504
  2023-03-12    12246.957079
  2023-03-19    18727.662898
  2023-03-26    15923.606674
  2023-04-02    15437.223105
  2023-04-09    24869.754193
  2023-04-16    20874.818525
  2023-04-23    14448.943543
  2023-04-30    24366.355724
  2023-05-07    24632.452339
  2023-05-14    24922.179653
  2023-05-21    16793.401373
  2023-05-28    15984.733347
  2023-06-04    16142.772736
  2023-06-11    19718.037105
  2023-06-18    16975.379835
  2023-06-25    13622.130894
  dtype: float64,
  '1087': date
  2022-10-09     2460.000000
  2022-10-16     3200.000000
  2022-10-23     2453.018211
  2022-10-30     1141.655159
  2022-11-06     2145.199097
  2022-11-13       87.534089
  2022-11-20     1138.189569
  2022-11-27     1860.855910
  2022-12-04     1552.824911
  2022-12-11     1430.841336
  2022-12-18     1093.866721
  2022-12-25      269.742994
  2023-01-01      757.446619
  2023-01-08     2072.897684
  2023-01-15     2996.128771
  2023-01-22     1251.874514
  2023-01-29     1268.268822
  2023-02-05     1858.860909
  2023-02-12     2227.175310
  2023-02-19     2396.444435
  2023-02-26     9084.735014
  2023-03-05    10638.192606
  2023-03-12    15369.965961
  2023-03-19    20825.364979
  2023-03-26    27023.554078
  2023-04-02    22944.582973
  2023-04-09    31359.046605
  2023-04-16    26687.779439
  2023-04-23    28422.474098
  2023-04-30    36043.345325
  2023-05-07    34044.664859
  2023-05-14    32285.035904
  2023-05-21    27835.751033
  2023-05-28    25252.096935
  2023-06-04    23443.350206
  2023-06-11    26204.903227
  2023-06-18    23735.375054
  2023-06-25    19142.480954
  dtype: float64},
 {'pred': np.float64(9225.380780004474),
  '1087': np.float64(12473.829587120174)})

To visualise a plot with the score use:

In [21]:

Copied!

score.plot_interval_score()
score.plot_interval_score()

Out[21]:

The weighted interval score (WIS) is compute using the equation below: $$ \text{WIS}(F, y) = \frac{1}{K + 1/2} \left( w_0|y - m| + \sum_{k=1}^K [w_K S^{int}_{\alpha_k} (l_K, u_K; y) ]\right), $$

by default, ( w_k = \frac{\alpha_k}{2} ) and ( w_0 = \frac{1}{2} ). In this equation, ( K ) denotes the number of intervals, and ( l_k ) and ( u_k ) represent the lower and upper bounds of the ( k )-th confidence interval, respectively. The implementation defines the ( \alpha_k ) values based on the names of the prediction columns.

In [22]:

Copied!

score.wis
score.wis

Out[22]:

({'pred': date
  2022-10-09     10.619696
  2022-10-16     15.243582
  2022-10-23     16.451777
  2022-10-30     20.004967
  2022-11-06     43.851187
  2022-11-13     16.347491
  2022-11-20     10.354658
  2022-11-27     11.300675
  2022-12-04     22.053271
  2022-12-11     19.142186
  2022-12-18     24.507445
  2022-12-25     24.421750
  2023-01-01     34.964983
  2023-01-08     49.578230
  2023-01-15     55.280560
  2023-01-22     61.302048
  2023-01-29     51.917378
  2023-02-05     58.195013
  2023-02-12     93.456592
  2023-02-19    100.924584
  2023-02-26    238.706284
  2023-03-05    283.653105
  2023-03-12    333.226960
  2023-03-19    405.351061
  2023-03-26    164.606193
  2023-04-02    242.191037
  2023-04-09    307.261889
  2023-04-16    407.471574
  2023-04-23    197.486172
  2023-04-30    251.763673
  2023-05-07    321.990235
  2023-05-14    421.634808
  2023-05-21    140.915998
  2023-05-28    166.933728
  2023-06-04    223.211572
  2023-06-11    277.237545
  2023-06-18    168.646092
  2023-06-25    113.664595
  dtype: float64,
  '1087': date
  2022-10-09     122.936667
  2022-10-16     156.346565
  2022-10-23     100.220575
  2022-10-30      11.503594
  2022-11-06      50.691970
  2022-11-13      19.850031
  2022-11-20      75.531717
  2022-11-27     117.499088
  2022-12-04     110.764907
  2022-12-11      93.798040
  2022-12-18      67.302732
  2022-12-25      28.410710
  2023-01-01       2.893301
  2023-01-08     122.987177
  2023-01-15     170.780828
  2023-01-22      84.101707
  2023-01-29      85.466473
  2023-02-05      14.533089
  2023-02-12     136.613243
  2023-02-19      38.907105
  2023-02-26     359.637551
  2023-03-05     438.716546
  2023-03-12     678.179959
  2023-03-19     948.085210
  2023-03-26    1264.948317
  2023-04-02    1060.228668
  2023-04-09    1482.110349
  2023-04-16    1269.453166
  2023-04-23    1359.683300
  2023-04-30    1731.608802
  2023-05-07    1648.215017
  2023-05-14    1581.431792
  2023-05-21    1350.354312
  2023-05-28    1208.123399
  2023-06-04    1132.963164
  2023-06-11    1277.512309
  2023-06-18    1153.506827
  2023-06-25     922.194125
  dtype: float64},
 {'pred': np.float64(142.25975248568696),
  '1087': np.float64(591.5287455391691)})

To visualise a plot with the score use:

In [23]:

Copied!

score.plot_wis()
score.plot_wis()

Out[23]:

To generate a table with a summary of the scores, you can use the following:

In [24]:

Copied!

score.summary
score.summary

Out[24]:

	mae	mse	crps	log_score	interval_score	wis
id
pred	129.925690	41097.522756	176.530478	-6.710861	9225.380780	142.259752
1087	615.909109	734525.530214	602.983935	-55.537622	12473.829587	591.528746

To generate a plot of the predictions, you can use the following:

In [25]:

Copied!

score.plot_predictions()
score.plot_predictions()

Out[25]:

The class will select the bigger range of dates that contains information about each prediction dataframe. If you want to see the model's performance in a lower range, use the method below. The new range of dates provided must be between score.min_date and score.max_date. Otherwise, it will return an error.

In [26]:

Copied!

score.set_date_range(start_date = '2023-01-01', end_date='2023-06-01' )
score.set_date_range(start_date = '2023-01-01', end_date='2023-06-01' )

In this case:

In [27]:

Copied!

score.summary
score.summary

Out[27]:

	mae	mse	crps	log_score	interval_score	wis
id
pred	170.317371	5.630798e+04	245.503825	-7.271587	12155.678942	199.491459
1087	802.048697	1.025145e+06	786.776870	-63.258371	16483.894858	774.411787

In [28]:

Copied!

score.plot_log_score()
score.plot_log_score()

Out[28]:

In [29]:

Copied!

score.plot_predictions()
score.plot_predictions()

Out[29]: