WEBVTT
00:00:00.070 --> 00:00:01.780
The following
content is provided
00:00:01.780 --> 00:00:04.019
under a Creative
Commons License.
00:00:04.019 --> 00:00:06.870
Your support will help MIT
OpenCourseWare continue
00:00:06.870 --> 00:00:10.730
to offer high quality
educational resources for free.
00:00:10.730 --> 00:00:13.330
To make a donation or
view additional materials
00:00:13.330 --> 00:00:17.240
from hundreds of MIT courses,
visit MIT OpenCourseWare
00:00:17.240 --> 00:00:17.865
at ocw.mit.edu.
00:00:28.370 --> 00:00:30.420
SPEAKER 1: Welcome to
this Vensim tutorial.
00:00:30.420 --> 00:00:33.770
As a group of students engaged
in the system dynamics seminar
00:00:33.770 --> 00:00:36.350
at MIT Sloan, we
will present how
00:00:36.350 --> 00:00:38.900
to estimate model parameters
and the confidence interval
00:00:38.900 --> 00:00:45.220
in a dynamic model using Maximum
Likelihood Estimation, MLE,
00:00:45.220 --> 00:00:48.900
Likelihood Ratio, LR method.
00:00:48.900 --> 00:00:52.455
First, the basics of MLE
are described, as well as
00:00:52.455 --> 00:00:55.570
the advantages and
underlying assumptions.
00:00:55.570 --> 00:00:58.420
Next, the different methods
available for finding
00:00:58.420 --> 00:01:00.940
confidence intervals of
the estimated parameters
00:01:00.940 --> 00:01:02.620
are discussed.
00:01:02.620 --> 00:01:06.820
Then, a step-by-step guide
to a parameter estimation
00:01:06.820 --> 00:01:09.900
using MLE-- an assessment
of the uncertainty
00:01:09.900 --> 00:01:14.080
around parameter estimates using
Univariant Likelihood Ratio
00:01:14.080 --> 00:01:17.960
Method in Vensim-- is provided.
00:01:17.960 --> 00:01:21.690
This video is presented to you
by William, George, Sergey,
00:01:21.690 --> 00:01:25.560
and Jim under the guidance of
Professor Hazhir Rahmandad.
00:01:30.340 --> 00:01:33.370
The literature, Struben,
Sterman, and Keith,
00:01:33.370 --> 00:01:36.335
highlight that estimating model
parameters and the uncertainty
00:01:36.335 --> 00:01:39.230
of these parameters are central
to good dynamic modeling
00:01:39.230 --> 00:01:40.530
practice.
00:01:40.530 --> 00:01:42.710
Models must be grounded
in data if modelers
00:01:42.710 --> 00:01:45.900
are to provide reliable
advice to policymakers.
00:01:45.900 --> 00:01:48.305
Ideally, one should
estimate model parameters
00:01:48.305 --> 00:01:52.300
using data that are independent
of the model behavior.
00:01:52.300 --> 00:01:54.940
Often, however, such
direct estimation
00:01:54.940 --> 00:01:57.690
from independent
data is not possible.
00:01:57.690 --> 00:02:00.030
In practice, modelers
must frequently
00:02:00.030 --> 00:02:01.860
estimate at least
some model parameters
00:02:01.860 --> 00:02:05.219
using historical data itself,
finding the set of parameters
00:02:05.219 --> 00:02:07.510
that minimize the difference
between the historical and
00:02:07.510 --> 00:02:09.889
simulated time series.
00:02:09.889 --> 00:02:12.360
Only then can estimations
of model parameters
00:02:12.360 --> 00:02:14.150
and uncertainties
of these estimations
00:02:14.150 --> 00:02:17.230
serve to test model
hypotheses and quantify
00:02:17.230 --> 00:02:21.120
important uncertainties, which
are crucial for decision-making
00:02:21.120 --> 00:02:23.150
based on modeling outcome.
00:02:23.150 --> 00:02:26.035
Therefore, when modeling
a system in Vensim,
00:02:26.035 --> 00:02:28.740
if the purpose of this model
involves numerical testing
00:02:28.740 --> 00:02:31.640
or projection, robust
statistical parameter
00:02:31.640 --> 00:02:33.930
estimation is necessary.
00:02:33.930 --> 00:02:36.550
Confidence intervals also
serve as an important tool
00:02:36.550 --> 00:02:38.575
for decision-making based
on modeling outcomes.
00:02:43.130 --> 00:02:46.070
Various approaches are
available to estimate parameters
00:02:46.070 --> 00:02:49.760
in confidence intervals in
dynamic model, including,
00:02:49.760 --> 00:02:52.680
for estimation, the
generalized methods of moments
00:02:52.680 --> 00:02:56.740
in maximum likelihood, and
for confidence intervals,
00:02:56.740 --> 00:03:00.340
likelihood-based methods
and bootstrapping.
00:03:00.340 --> 00:03:03.240
Maximum Likelihood Estimation
is becoming increasingly
00:03:03.240 --> 00:03:05.310
important for
nonlinear models when
00:03:05.310 --> 00:03:07.310
estimating nonlinear
parameters that
00:03:07.310 --> 00:03:10.690
consist of non-normal,
autocorrelated errors,
00:03:10.690 --> 00:03:12.466
and heteroscedasticity.
00:03:12.466 --> 00:03:15.550
It is simpler to understand
the construct, yet
00:03:15.550 --> 00:03:17.830
at the same time,
requires relatively little
00:03:17.830 --> 00:03:20.460
computational power.
00:03:20.460 --> 00:03:23.360
MLE is best suitable for
using historical data
00:03:23.360 --> 00:03:25.830
to generate parameter
estimation and confidence
00:03:25.830 --> 00:03:28.710
intervals as long as
errors of estimation
00:03:28.710 --> 00:03:31.720
are independent and
identically distributed.
00:03:31.720 --> 00:03:33.860
When using MLE in
complex systems
00:03:33.860 --> 00:03:36.150
where these assumptions
are violated,
00:03:36.150 --> 00:03:38.370
and/or the analytical
likelihood function
00:03:38.370 --> 00:03:40.890
might be difficult
to find, one should
00:03:40.890 --> 00:03:43.740
use more advanced methods.
00:03:43.740 --> 00:03:47.490
This tutorial will not
address these cases.
00:03:47.490 --> 00:03:50.380
As the demonstration will
show, the average laptop
00:03:50.380 --> 00:03:52.970
in use in 2014 is capable
of running the analysis
00:03:52.970 --> 00:03:54.310
in a few minutes or less.
00:03:59.580 --> 00:04:03.024
This tutorial is based on the
following four references--
00:04:03.024 --> 00:04:05.065
"Bootstrapping for confidence
interval estimation
00:04:05.065 --> 00:04:08.430
and hypothesis testing for
parameters of system dynamics
00:04:08.430 --> 00:04:12.830
models" by Dogan, "A behavioral
approach to feedback loop
00:04:12.830 --> 00:04:17.720
dominance analysis" by
Ford, "Modeling Managerial
00:04:17.720 --> 00:04:22.440
Behavior," also known as
"The Beer Game" by Sterman,
00:04:22.440 --> 00:04:26.820
and a soon-to-be published text
by Struben, Sterman, and Keith,
00:04:26.820 --> 00:04:29.580
"Parameter and Confidence
Interval Estimation in Dynamic
00:04:29.580 --> 00:04:34.470
Models-- Maximum Likelihood
and Bootstrapping Methods."
00:04:34.470 --> 00:04:37.880
More background and explanation
on the theory of MLE
00:04:37.880 --> 00:04:39.830
can be found in these works.
00:04:39.830 --> 00:04:43.325
This tutorial will focus on the
application of MLE in Vensim.
00:04:47.420 --> 00:04:50.750
The literature, Struben,
Sterman, and Keith,
00:04:50.750 --> 00:04:52.560
stresses that
modelers must not only
00:04:52.560 --> 00:04:55.770
estimate parameter values,
but also the uncertainty
00:04:55.770 --> 00:04:58.100
in the estimates so they
and others can determine
00:04:58.100 --> 00:05:00.690
how much confidence to
place in the estimates
00:05:00.690 --> 00:05:04.240
and select appropriate ranges
for sensitivity analysis
00:05:04.240 --> 00:05:08.420
in order to assess the
robustness of the conclusions.
00:05:08.420 --> 00:05:09.960
Estimating confidence
intervals can
00:05:09.960 --> 00:05:11.376
be thought of as
finding the shape
00:05:11.376 --> 00:05:14.470
of the inverted bowl, Figure 2.
00:05:14.470 --> 00:05:17.450
If for a given data set,
the likelihood function
00:05:17.450 --> 00:05:19.645
for a set of parameters
falls off very steeply
00:05:19.645 --> 00:05:23.382
for even small departures
from the best estimate,
00:05:23.382 --> 00:05:25.590
then one can have confidence
that the true parameters
00:05:25.590 --> 00:05:27.685
are close to the
estimated value.
00:05:27.685 --> 00:05:31.690
As always, assuming the
model is correctly specified,
00:05:31.690 --> 00:05:35.510
another maintained
hypothesis are satisfied.
00:05:35.510 --> 00:05:38.130
If the likelihood
falls off only slowly,
00:05:38.130 --> 00:05:40.540
other values of the parameters
are nearly as likely
00:05:40.540 --> 00:05:43.530
as the best estimates and one
cannot have much confidence
00:05:43.530 --> 00:05:46.080
in the estimated values.
00:05:46.080 --> 00:05:48.185
MLE methods provides
two major approaches
00:05:48.185 --> 00:05:50.495
to constructing confidence
intervals or confidence
00:05:50.495 --> 00:05:52.210
regions.
00:05:52.210 --> 00:05:55.634
The first is the
asymptotic method, AM,
00:05:55.634 --> 00:05:57.550
which assumes that the
likelihood function can
00:05:57.550 --> 00:06:01.400
be approximated by a parabola
around the estimated parameter.
00:06:01.400 --> 00:06:05.140
An assumption that is valid
for a very large sample.
00:06:05.140 --> 00:06:09.690
The second is the
likelihood ratio, LR method.
00:06:09.690 --> 00:06:12.390
The LR is the ratio of
the likelihood for a given
00:06:12.390 --> 00:06:16.556
set of parameter values to the
likelihood for the MLE values.
00:06:16.556 --> 00:06:19.760
The LR method involves searching
the actual likelihood surface
00:06:19.760 --> 00:06:21.880
to find values of the
likelihood function
00:06:21.880 --> 00:06:25.080
that yield a particular
value for the LR.
00:06:25.080 --> 00:06:27.960
That value is derived for
the probability distribution
00:06:27.960 --> 00:06:29.980
of the LR and the
confidence level
00:06:29.980 --> 00:06:34.220
desired, such as 95% chance that
the true parameter value lies
00:06:34.220 --> 00:06:35.550
within the confidence interval.
00:06:40.580 --> 00:06:43.720
This tutorial will use the
Univariate Likelihood Ratio
00:06:43.720 --> 00:06:47.262
for determining the MLE
competence interval in Vensim.
00:06:47.262 --> 00:06:49.220
The estimated parameter
and competence interval
00:06:49.220 --> 00:06:52.850
mean that for a specific
percentage of probability--
00:06:52.850 --> 00:06:56.410
usually 95 or 99% that
the real parameter falls
00:06:56.410 --> 00:06:59.540
within the confidence interval
with the designated percent
00:06:59.540 --> 00:07:01.050
possibility.
00:07:01.050 --> 00:07:03.140
This is consistent with
general applications
00:07:03.140 --> 00:07:06.680
of statistics and probability.
00:07:06.680 --> 00:07:08.660
The LR our method of
confidence interval
00:07:08.660 --> 00:07:11.100
estimation compared to the
likelihood for the estimated
00:07:11.100 --> 00:07:15.010
parameter, theta hat, with
that of an alternative set,
00:07:15.010 --> 00:07:16.320
theta star.
00:07:16.320 --> 00:07:19.130
The likelihood ratio is
determined in equation one
00:07:19.130 --> 00:07:26.490
as R equals L theta hat
divided by L theta star.
00:07:26.490 --> 00:07:28.630
Asymptotically, the
likelihood ratio
00:07:28.630 --> 00:07:32.770
falls at chi square
distribution, equation two.
00:07:32.770 --> 00:07:35.530
This is valuable, because the
univariate method requires
00:07:35.530 --> 00:07:39.320
no new optimizations once
the MLE has been found.
00:07:39.320 --> 00:07:41.570
The critical parameter
value for all parameters
00:07:41.570 --> 00:07:45.470
is then simply using
equation three.
00:07:45.470 --> 00:07:48.930
A disadvantage of univariate
confidence interval estimates,
00:07:48.930 --> 00:07:52.720
however, is that the parameter
space is not fully explored.
00:07:52.720 --> 00:07:56.225
Hence, the effect of interaction
between the parameters on LL
00:07:56.225 --> 00:08:01.640
is ignored,
00:08:01.640 --> 00:08:04.150
The tutorial now switches
to a real simulation
00:08:04.150 --> 00:08:06.640
using Vensim 6.1.
00:08:06.640 --> 00:08:08.750
It will show how the
theory just described
00:08:08.750 --> 00:08:12.310
can be applied to estimating
parameters of decision-making
00:08:12.310 --> 00:08:15.262
in the Beer Distribution Game.
00:08:15.262 --> 00:08:15.970
SPEAKER 2: Hello.
00:08:15.970 --> 00:08:18.600
This tutorial is now going
to explore analytics,
00:08:18.600 --> 00:08:20.850
by estimating the parameters
for a well-analyzed model
00:08:20.850 --> 00:08:23.600
of decision-making used in
the Beer Distribution Game.
00:08:23.600 --> 00:08:26.100
The model is described in the
paper, "Modeling Managerial
00:08:26.100 --> 00:08:28.530
Behavior-- Misperception
of Feedback in a Dynamic
00:08:28.530 --> 00:08:32.390
Decision-Making Experiment" by
Professor Sterman from 1989.
00:08:32.390 --> 00:08:34.610
Participants in the beer
game choose how much beer
00:08:34.610 --> 00:08:37.385
to order each period in
a simulated supply chain.
00:08:37.385 --> 00:08:39.760
The challenge is to estimate
the parameters of a proposed
00:08:39.760 --> 00:08:41.415
decision rule for
participant orders.
00:08:44.096 --> 00:08:45.470
The screen shows
the simple model
00:08:45.470 --> 00:08:47.407
of beer game decision-making.
00:08:47.407 --> 00:08:49.740
The ordering decision rule
proposed by Professor Sterman
00:08:49.740 --> 00:08:51.370
is the following.
00:08:51.370 --> 00:08:54.650
The orders placed every week
are given by the maximum of zero
00:08:54.650 --> 00:08:57.180
and the sum of expected
customer orders-- the orders
00:08:57.180 --> 00:08:58.710
that participants
expect to receive
00:08:58.710 --> 00:09:01.590
next period from their
immediate customer and inventory
00:09:01.590 --> 00:09:03.142
discrepancy.
00:09:03.142 --> 00:09:06.140
The inventory discrepancy
is the difference
00:09:06.140 --> 00:09:08.660
between total desired
inventory and some
00:09:08.660 --> 00:09:11.040
of actual on-hand
inventory and supply
00:09:11.040 --> 00:09:13.420
line of on-order inventory.
00:09:13.420 --> 00:09:15.400
There are four
parameters that are
00:09:15.400 --> 00:09:18.690
used in the model-- [INAUDIBLE],
the weight on incoming orders
00:09:18.690 --> 00:09:22.000
in demand forecasting,
S prime, the desired
00:09:22.000 --> 00:09:25.550
on-hand and on-order
inventory, the fraction
00:09:25.550 --> 00:09:28.750
of the gap between desired and
actual on-hand and on-order
00:09:28.750 --> 00:09:31.195
inventory ordered each
week, and the fraction
00:09:31.195 --> 00:09:34.600
of the supply line the
subject accounts for.
00:09:34.600 --> 00:09:36.680
As modelers, we don't
know what the parameters
00:09:36.680 --> 00:09:37.960
of the actual game are.
00:09:37.960 --> 00:09:41.200
But we have the data for actual
orders placed by participants.
00:09:41.200 --> 00:09:43.450
And this data is read
from the Excel spreadsheet
00:09:43.450 --> 00:09:47.194
into the variable,
actual orders.
00:09:47.194 --> 00:09:48.610
Let's start from
some guesstimates
00:09:48.610 --> 00:09:50.790
and assign the following values.
00:09:50.790 --> 00:09:55.440
[INAUDIBLE], fraction
of discrepancy
00:09:55.440 --> 00:09:58.840
between desired and
actual inventory,
00:09:58.840 --> 00:10:04.190
and supply line fraction
are all equal to 0.5.
00:10:04.190 --> 00:10:09.390
S prime, the total desired
inventory, is 20 cases of beer.
00:10:09.390 --> 00:10:12.870
With these parameters, the
model runs and generates
00:10:12.870 --> 00:10:14.580
some alias of the
order placed, which
00:10:14.580 --> 00:10:16.287
can be seem on this graph.
00:10:16.287 --> 00:10:18.120
Let's compare them
against the actual orders
00:10:18.120 --> 00:10:19.860
observed in the beer game.
00:10:19.860 --> 00:10:21.830
We can see that although
the trend is generally
00:10:21.830 --> 00:10:24.330
correct at the high level, the
shape is totally different.
00:10:27.067 --> 00:10:29.650
This necessitates the question,
how can we effectively measure
00:10:29.650 --> 00:10:30.520
the fit of the data?
00:10:30.520 --> 00:10:32.880
The typical way is to calculate
the sum of squared errors,
00:10:32.880 --> 00:10:35.040
which are differences between
simulated and actual data
00:10:35.040 --> 00:10:35.550
points.
00:10:35.550 --> 00:10:37.050
Square, to make
sure negative values
00:10:37.050 --> 00:10:38.990
don't reduce the total sum.
00:10:38.990 --> 00:10:40.730
The basic statistics
of any variable
00:10:40.730 --> 00:10:42.740
can be found by
using statistics to
00:10:42.740 --> 00:10:45.194
from either object
output or bench tool.
00:10:54.086 --> 00:10:56.220
For this tutorial, I
have already defined it.
00:10:56.220 --> 00:10:59.010
And in this case, we can see
[INAUDIBLE] of sum of squares
00:10:59.010 --> 00:11:02.130
of residuals is 1700,
with a mean of about 35.
00:11:02.130 --> 00:11:04.070
This is pretty far
from a good fit,
00:11:04.070 --> 00:11:05.932
and we can confirm
it officially.
00:11:05.932 --> 00:11:07.640
Now let's see how to
run the optimization
00:11:07.640 --> 00:11:10.098
to find the parameters that
will bring the values of orders
00:11:10.098 --> 00:11:12.360
placed as close to the
actual orders as possible.
00:11:12.360 --> 00:11:13.830
The optimization
control panel is
00:11:13.830 --> 00:11:17.200
invoked by using optimized
tool on the toolbar.
00:11:17.200 --> 00:11:21.600
When you first open, you have
to specify the file name here.
00:11:21.600 --> 00:11:24.140
Also, it is necessary to specify
what type of optimization
00:11:24.140 --> 00:11:25.549
you're going to do.
00:11:25.549 --> 00:11:27.090
There are two types
that can be used.
00:11:27.090 --> 00:11:29.070
They differ in the way they
interpret and calculate
00:11:29.070 --> 00:11:29.820
the payoff value.
00:11:29.820 --> 00:11:33.096
A payoff is a single number
that summarize the simulation.
00:11:33.096 --> 00:11:34.470
In the case of
the simulation, it
00:11:34.470 --> 00:11:35.900
will be a measure
of fit, which can
00:11:35.900 --> 00:11:37.775
be a sum of squared
errors between the actual
00:11:37.775 --> 00:11:39.390
and the simulated
values, or it can
00:11:39.390 --> 00:11:41.470
be a true electrical function.
00:11:41.470 --> 00:11:43.500
If you are only interested
in finding the best
00:11:43.500 --> 00:11:45.790
fit without worrying about
confidence intervals,
00:11:45.790 --> 00:11:47.580
you can use the
calibration mode.
00:11:47.580 --> 00:11:49.690
For calibration, it is
not necessary to define
00:11:49.690 --> 00:11:50.960
the payoff functional mode.
00:11:50.960 --> 00:11:53.562
Instead, choose the model
variable and the data variable
00:11:53.562 --> 00:11:55.520
with which the model
variable will be compared.
00:11:58.490 --> 00:12:00.130
In Vensim, then
at each time step,
00:12:00.130 --> 00:12:02.380
the difference between the
data and the model variable
00:12:02.380 --> 00:12:04.490
is multiplied by the
weight specified.
00:12:04.490 --> 00:12:06.482
And this product
is then squared.
00:12:06.482 --> 00:12:08.065
This number, which
is always positive,
00:12:08.065 --> 00:12:09.960
is then subtracted
from the payoff,
00:12:09.960 --> 00:12:11.920
so that the payoff
is always negative.
00:12:11.920 --> 00:12:13.420
Maximizing the
payoff means getting
00:12:13.420 --> 00:12:15.394
it to be as close
to zero as possible.
00:12:15.394 --> 00:12:17.560
However, this is not a true
log-likelihood function,
00:12:17.560 --> 00:12:20.184
so the results cannot be used to
find the confidence intervals,
00:12:20.184 --> 00:12:20.940
which we need.
00:12:20.940 --> 00:12:23.120
Therefore, we're going
to use the policy mode.
00:12:23.120 --> 00:12:25.340
For policy mode,
the payoff function
00:12:25.340 --> 00:12:26.980
must be specified explicitly.
00:12:26.980 --> 00:12:28.890
At each time step, the
value of the variable
00:12:28.890 --> 00:12:33.410
or presenting a payoff function
is multiplied by the weight.
00:12:33.410 --> 00:12:36.615
Then it is multiplied by time
step and added to the payoff.
00:12:36.615 --> 00:12:39.460
The optimizer is designed
to maximize the payoff.
00:12:39.460 --> 00:12:41.430
So variables, for
which more is better,
00:12:41.430 --> 00:12:42.860
should be given
positive weights,
00:12:42.860 --> 00:12:44.440
and those, for which
less is better,
00:12:44.440 --> 00:12:45.815
should be given
negative weights.
00:12:48.990 --> 00:12:52.816
Here, [INAUDIBLE] are specified
in the sum of squared errors.
00:12:52.816 --> 00:12:55.190
Residuals are calculated as
the difference between orders
00:12:55.190 --> 00:12:57.750
placed and actual orders
at each time step.
00:12:57.750 --> 00:12:59.390
This squared value
is then accumulated
00:12:59.390 --> 00:13:01.720
in the stock sum
of residual square.
00:13:01.720 --> 00:13:04.924
The tricky part is the variable,
total sum of residual square.
00:13:04.924 --> 00:13:07.090
Please note that because
Vensim, in the policy mode,
00:13:07.090 --> 00:13:08.923
multiplies the values
of the payoff function
00:13:08.923 --> 00:13:10.826
by the time step,
the payoff value
00:13:10.826 --> 00:13:12.950
is the weighted combination
of the different payoff
00:13:12.950 --> 00:13:15.290
elements integrated
over the simulation.
00:13:15.290 --> 00:13:16.770
This is not what
we're looking for.
00:13:16.770 --> 00:13:19.350
We are interested in the final
value of a variable instead
00:13:19.350 --> 00:13:22.180
of the integrated value, because
the final value is exactly
00:13:22.180 --> 00:13:25.170
the total sum of squared
errors the way we defined it.
00:13:25.170 --> 00:13:28.175
Here's an example to
illustrate the concept.
00:13:28.175 --> 00:13:31.600
In this line is a
stock variable level.
00:13:31.600 --> 00:13:33.783
It can be seen that, if we
just use the stock of sum
00:13:33.783 --> 00:13:35.574
of residual squared in
the payoff function,
00:13:35.574 --> 00:13:41.458
we get the value, which is equal
to the area under the curve,
00:13:41.458 --> 00:13:44.090
instead of the final
value that we need.
00:13:51.260 --> 00:13:53.600
[INAUDIBLE] look at
only the final value.
00:13:53.600 --> 00:13:55.870
In a model, it is necessary
to introduce new variable
00:13:55.870 --> 00:13:58.328
for the payoff function and
use the following equation that
00:13:58.328 --> 00:14:00.220
makes its value to be
zero at each time step,
00:14:00.220 --> 00:14:01.440
except for the final step.
00:14:07.094 --> 00:14:09.510
Add the current value of the
residual squared to the stock
00:14:09.510 --> 00:14:12.440
level to account for the value
of the current time step.
00:14:12.440 --> 00:14:14.810
This effectually ignores
all the intermediate values
00:14:14.810 --> 00:14:17.460
and looks only at
the final value.
00:14:17.460 --> 00:14:19.020
So far, this is
nothing different
00:14:19.020 --> 00:14:21.180
from what Vensim was doing
in the calibration mode.
00:14:21.180 --> 00:14:23.597
We simply replicated what it
would be doing automatically.
00:14:23.597 --> 00:14:25.888
In order to get the true
likelihood function-- assuming
00:14:25.888 --> 00:14:27.500
normal distribution
of errors-- we
00:14:27.500 --> 00:14:29.166
need to divide the
sum of squared errors
00:14:29.166 --> 00:14:30.250
by two variances.
00:14:30.250 --> 00:14:32.750
This is what you can see in the
true low likelihood function
00:14:32.750 --> 00:14:34.940
variable.
00:14:34.940 --> 00:14:36.360
So far, in this
demonstration, we
00:14:36.360 --> 00:14:38.360
don't know what the
standard deviation of errors
00:14:38.360 --> 00:14:40.830
is going to be, because we
haven't optimized anything yet,
00:14:40.830 --> 00:14:43.925
so we're going to leave
this variable as 1.
00:14:43.925 --> 00:14:45.800
We are going to change
it to the actual value
00:14:45.800 --> 00:14:48.340
of the standard deviation of
errors later by the confidence
00:14:48.340 --> 00:14:49.640
interval.
00:14:49.640 --> 00:14:54.674
Now let's change the
run name and go back
00:14:54.674 --> 00:14:55.590
to Optimization Setup.
00:14:58.360 --> 00:15:03.569
And let's use our log-likelihood
function as the payoff element.
00:15:03.569 --> 00:15:05.110
Because this is
policy mode, we don't
00:15:05.110 --> 00:15:07.856
need to specify anything
in Compare To field.
00:15:07.856 --> 00:15:10.210
For the weight, because
this is policy mode,
00:15:10.210 --> 00:15:12.670
it is necessary to
assign a negative value.
00:15:12.670 --> 00:15:14.170
This tells Vensim
that we're looking
00:15:14.170 --> 00:15:15.510
to minimize the payoff function.
00:15:15.510 --> 00:15:16.880
If there are multiple
payoff functions,
00:15:16.880 --> 00:15:18.750
the weight value would
have been important.
00:15:18.750 --> 00:15:20.170
For one function,
it doesn't matter.
00:15:20.170 --> 00:15:21.240
But we don't want
to change the values
00:15:21.240 --> 00:15:23.698
of the log-likelihood function
to calculate true confidence
00:15:23.698 --> 00:15:24.660
internal values.
00:15:24.660 --> 00:15:27.360
Therefore, the log-likelihood
function will not be scaled,
00:15:27.360 --> 00:15:29.162
and the weight is
going to be minus 1.
00:15:32.778 --> 00:15:35.420
Next screen, the
Optimization Control Panel,
00:15:35.420 --> 00:15:37.760
we have to specify
the file name again.
00:15:37.760 --> 00:15:41.447
Then, make sure to
use multiple starts.
00:15:41.447 --> 00:15:43.030
This means the
analysis is less likely
00:15:43.030 --> 00:15:44.680
going to be stuck
at a local optimum
00:15:44.680 --> 00:15:46.480
if the surface is not complete.
00:15:46.480 --> 00:15:48.899
If multiple starts is
random, starting points
00:15:48.899 --> 00:15:51.190
for any optimizations are
picked randomly and uniformly
00:15:51.190 --> 00:15:53.040
over the range of
each parameter.
00:15:53.040 --> 00:15:56.085
Next field, optimizer, tells
Vensim to use probable method
00:15:56.085 --> 00:15:59.277
to find local minimum
function at each start.
00:15:59.277 --> 00:16:01.110
Technically, you could
ignore this parameter
00:16:01.110 --> 00:16:02.693
and just rely on
random starts to find
00:16:02.693 --> 00:16:04.250
the values of a payoff function.
00:16:04.250 --> 00:16:06.080
However, using multiple
starts together
00:16:06.080 --> 00:16:07.750
with optimization
of each time step
00:16:07.750 --> 00:16:09.875
produces better and
faster convergence.
00:16:09.875 --> 00:16:12.599
The other values are left
at the default levels.
00:16:12.599 --> 00:16:14.890
You can read more about these
parameter in Vensim help,
00:16:14.890 --> 00:16:16.880
but generally, they
control the optimization
00:16:16.880 --> 00:16:18.760
and are important for
complicated models
00:16:18.760 --> 00:16:20.830
where simulation
time is significant.
00:16:20.830 --> 00:16:23.607
Let's add now the parameters
that Vensim will vary in order
00:16:23.607 --> 00:16:24.940
to minimize the payoff function.
00:16:50.350 --> 00:16:56.180
Next, make sure Payoff Report
is selected and hit Finish.
00:16:56.180 --> 00:16:59.160
For random or other random
values of multiple start,
00:16:59.160 --> 00:17:01.030
the optimizer will
never stop unless you
00:17:01.030 --> 00:17:02.739
click on the Stop
button to interrupt it.
00:17:02.739 --> 00:17:04.238
If you don't choose
multiple starts,
00:17:04.238 --> 00:17:05.780
the next few steps
are not necessary
00:17:05.780 --> 00:17:07.500
as Vensim would be able
to complete optimization
00:17:07.500 --> 00:17:09.333
and sensitivity analysis,
which is performed
00:17:09.333 --> 00:17:10.930
in the final step
of optimization.
00:17:10.930 --> 00:17:12.849
However, as I mentioned,
the optimization
00:17:12.849 --> 00:17:14.520
can be stuck at a
local optimum, thus
00:17:14.520 --> 00:17:16.349
producing some optimal results.
00:17:16.349 --> 00:17:18.260
So it is recommended that you
use multiple starts unless you
00:17:18.260 --> 00:17:19.801
know the shape of
the payoff function
00:17:19.801 --> 00:17:22.920
and are sure that local optimum
is equal to the global optimum.
00:17:22.920 --> 00:17:24.829
So having chosen
random multiple starts,
00:17:24.829 --> 00:17:27.060
the question is, when to
stop the optimization.
00:17:27.060 --> 00:17:28.810
This requires some
experiments and depends
00:17:28.810 --> 00:17:30.550
on the shape of your
payoff function.
00:17:30.550 --> 00:17:32.410
In this case, the
values of the payoff
00:17:32.410 --> 00:17:34.360
have been changing quite
fast in the beginning
00:17:34.360 --> 00:17:36.600
and are now at the same
level for quite some time.
00:17:36.600 --> 00:17:39.100
So it's a good time to attempt
to interrupt the optimization
00:17:39.100 --> 00:17:40.443
and see if you like the results.
00:17:43.890 --> 00:17:46.350
As you can see, the shape
of the orders placed
00:17:46.350 --> 00:17:48.820
is much closer to the
actual order values.
00:17:48.820 --> 00:17:51.440
The statistics shows
a much better fit,
00:17:51.440 --> 00:17:54.050
with the total sum of squared
error significantly lower
00:17:54.050 --> 00:18:00.719
at the level of just 279,
with a mean of 5.813.
00:18:00.719 --> 00:18:02.760
The values of the rest of
the optimized parameter
00:18:02.760 --> 00:18:04.718
can be looked up in the
file with the same name
00:18:04.718 --> 00:18:06.742
as the run name and
the extension out.
00:18:06.742 --> 00:18:08.950
We can open this file from
Vensim using the Edit File
00:18:08.950 --> 00:18:10.050
command from File menu.
00:18:15.810 --> 00:18:17.950
So far, we only know the
optimal values, but not
00:18:17.950 --> 00:18:20.264
the confidence intervals.
00:18:20.264 --> 00:18:22.680
We need now to tell Vensim to
use the optimal values found
00:18:22.680 --> 00:18:25.709
and calculate the confidence
intervals around them.
00:18:25.709 --> 00:18:28.250
We are going to do it by using
the output of the optimization
00:18:28.250 --> 00:18:30.365
and changing
optimization parameters.
00:18:30.365 --> 00:18:33.040
To do this, first, we have to
modify the optimization control
00:18:33.040 --> 00:18:39.770
parameters and turn
of multiple starts,
00:18:39.770 --> 00:18:47.560
and optimize them, and save
it as the optimization control
00:18:47.560 --> 00:18:48.840
type file, BOC.
00:18:59.969 --> 00:19:02.260
Also, we need to change the
value of standard deviation
00:19:02.260 --> 00:19:05.409
to the actual value
to make sure we're
00:19:05.409 --> 00:19:07.825
looking at the true values of
the log-likelihood function.
00:19:07.825 --> 00:19:10.324
This value can be found in the
statistics [INAUDIBLE] again.
00:19:10.324 --> 00:19:12.470
We use it to change the
variable in the model.
00:19:27.820 --> 00:19:29.570
Open the Optimization
Control Panel again,
00:19:29.570 --> 00:19:33.750
and leave the payoff
function as it is.
00:19:33.750 --> 00:19:35.510
But on the Optimization
Control Screen,
00:19:35.510 --> 00:19:37.140
open the file that
was just created
00:19:37.140 --> 00:19:40.015
and check that multiple starts
and optimizer are turned off.
00:19:44.620 --> 00:19:49.510
In addition, specify [INAUDIBLE]
by choosing payoff value
00:19:49.510 --> 00:19:51.840
and entering the value by
which Vensim will change
00:19:51.840 --> 00:19:54.480
the optimal likelihood function
in order to find the constants
00:19:54.480 --> 00:19:55.630
controls.
00:19:55.630 --> 00:19:57.340
Because for likelihood
ratio method,
00:19:57.340 --> 00:19:58.840
the likelihood ratio
is approximated
00:19:58.840 --> 00:20:00.330
by the chi-square
distribution, it
00:20:00.330 --> 00:20:02.705
is necessary to find the value
of chi-square distribution
00:20:02.705 --> 00:20:05.520
[INAUDIBLE] for 95th percentile
for one degree of freedom,
00:20:05.520 --> 00:20:07.480
because we are doing
the univariate analysis.
00:20:07.480 --> 00:20:10.470
A disadvantage of univariate
confidence interval estimation
00:20:10.470 --> 00:20:13.085
is that the parameter space
is not fully explored.
00:20:13.085 --> 00:20:15.210
Hence, the effect of
interaction between parameters
00:20:15.210 --> 00:20:17.600
and log-likelihood
function is ignored.
00:20:17.600 --> 00:20:19.650
The value of chi-squared,
in this case,
00:20:19.650 --> 00:20:25.930
is approximately
3.84146, which we're
00:20:25.930 --> 00:20:28.565
going to use in the
sensitivity field.
00:20:28.565 --> 00:20:30.320
Now, the [INAUDIBLE] button.
00:20:30.320 --> 00:20:33.245
And as you see,
this time, we don't
00:20:33.245 --> 00:20:35.370
have to wait as Vensim
doesn't do any optimization.
00:20:35.370 --> 00:20:36.865
The results,
allocated in the file
00:20:36.865 --> 00:20:46.350
that has the name of
the run-- and the word
00:20:46.350 --> 00:20:49.010
sensitive with
the extension tab.
00:20:49.010 --> 00:20:51.602
Let's open this file and
see our estimated parameters
00:20:51.602 --> 00:20:52.810
and the confidence intervals.
00:20:56.262 --> 00:20:58.470
Since we are using too much
more likelihood function,
00:20:58.470 --> 00:21:00.636
the confidence bounds are
not necessarily symmetric.
00:21:04.959 --> 00:21:06.250
This is a very simple tutorial.
00:21:06.250 --> 00:21:09.280
I hope you are now more familiar
with [INAUDIBLE] for estimating
00:21:09.280 --> 00:21:11.760
parameters and confidence
intervals for your models.
00:21:11.760 --> 00:21:12.420
Thank you.
00:21:15.650 --> 00:21:18.170
SPEAKER 3: A summary of the
seven key steps involved
00:21:18.170 --> 00:21:22.630
in determining the MLE in Vensim
is contained on the slide.
00:21:22.630 --> 00:21:25.200
It can be printed and
serve as a useful reference
00:21:25.200 --> 00:21:27.330
and check for those
wishing to use the method.
00:21:31.650 --> 00:21:34.342
MLE has its advantages
and disadvantages
00:21:34.342 --> 00:21:37.560
in estimating parameters and
their confidence intervals
00:21:37.560 --> 00:21:40.020
when performing
dynamic modeling.
00:21:40.020 --> 00:21:41.870
In conclusion,
when the condition
00:21:41.870 --> 00:21:45.030
of errors being independent
and identically distributed
00:21:45.030 --> 00:21:48.580
are met, then MLE is a simple
and straightforward method
00:21:48.580 --> 00:21:50.770
for estimating
parameters in determining
00:21:50.770 --> 00:21:53.700
their statistical fit in system
dynamics models developed
00:21:53.700 --> 00:21:55.550
in Vensim.