Example
We use optimx to solve Poisson pseudo-maximum likelihood (PPML) estimation. If \(y_i\) is a continuous random variable, it obviously does not follow a Poisson distribution, whose support consists of non-negative integers. However, if the conditional mean model \[E[y_i | x_i] = \exp( x_i' \beta)\] is satisfied, Poisson regression still delivers a consistent estimator of the parameter \(\beta\) even if \(y_i\) does not follow a conditional Poisson distribution.
If \(Z\) follows a Poisson distribution with mean \(\lambda\), its probability mass function is \[ \Pr(Z = k) = \frac{\mathrm{e}^{-\lambda} \lambda^k}{k!}, \mbox{ for } k=0,1,2,\ldots, \] so that, with \(\lambda = \exp(x'\beta)\), \[ \log \Pr(Y = y | x) = -\exp(x'\beta) + y\cdot x'\beta - \log (y!). \] Since the last term does not involve the parameter, the log-likelihood function is \[ \ell(\beta) = \log \Pr( \mathbf{y} | \mathbf{x};\beta ) = -\sum_{i=1}^n \exp(x_i'\beta) + \sum_{i=1}^n y_i x_i'\beta. \] In addition, it is easy to write down the gradient \[ s(\beta) =\frac{\partial \ell(\beta)}{\partial \beta} = -\sum_{i=1}^n \exp(x_i'\beta)x_i + \sum_{i=1}^n y_i x_i, \] and to verify that the Hessian \[ H(\beta) = \frac{\partial^2 \ell(\beta)}{\partial \beta \partial \beta'} = -\sum_{i=1}^n \exp(x_i'\beta)x_i x_i' \] is negative definite. Therefore, \(\ell(\beta)\) is strictly concave in \(\beta\).
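As a quick numerical sanity check of the concavity claim, here is a minimal sketch on simulated data (the names sim.n, sim.X, sim.y, and loglik.sim are ours, not part of the example below): evaluate the Hessian with numDeriv and confirm that its eigenvalues are negative.
library(numDeriv)
set.seed(1)
sim.n = 100
sim.X = cbind(1, rnorm(sim.n))                    # intercept and one regressor
sim.y = rpois(sim.n, exp(sim.X %*% c(0.5, -0.2))) # Poisson data with known beta
loglik.sim = function(b) sum(-exp(sim.X %*% b) + sim.y * (sim.X %*% b))
H = hessian(loglik.sim, c(0, 0))  # numerical Hessian at an arbitrary point
eigen(H)$values                   # both eigenvalues should be negative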
In operations research, the convention is to formulate an optimization problem as minimization, even though economics features utility maximization alongside cost minimization, and statistics features maximum likelihood estimation alongside least squares minimization. To follow this convention, here we formulate the negative log-likelihood.
# Poisson negative log-likelihood
poisson.loglik = function(b) {
  b = as.matrix(b)
  lambda = exp(X %*% b)                  # conditional mean exp(x_i' beta)
  ell = -sum(-lambda + y * log(lambda))  # negative log-likelihood, log(y!) dropped
  return(ell)
}
To implement optimization in R, it is recommended to write the criterion as a function of the parameter. Data can be fed inside or outside of the function. If the data are provided as additional arguments, these arguments must be explicit. (In contrast, in Matlab the parameter must be the sole argument of the function to be optimized, and data can only be injected through a nested function.)
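To illustrate the explicit-argument route, here is a sketch in which the data enter as named arguments; poisson.loglik2 and the simulated inputs X0 and y0 are hypothetical names, not part of the example below. optimx forwards extra named arguments to the criterion.
# a variant of the criterion that receives the data explicitly
poisson.loglik2 = function(b, X, y) {
  lambda = exp(X %*% as.matrix(b))
  -sum(-lambda + y * log(lambda))
}
set.seed(2)
X0 = cbind(1, rnorm(50))
y0 = rpois(50, exp(X0 %*% c(1, -0.1)))
# extra named arguments (here X and y) are passed on to poisson.loglik2
optimx::optimx(c(0, 1), poisson.loglik2, method = "BFGS", X = X0, y = y0)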
# Implement both BFGS and Nelder-Mead for comparison.
library(AER)
library(numDeriv)
library(optimx)

## prepare the data
data("RecreationDemand")
y = RecreationDemand$trips
X = with(RecreationDemand, cbind(1, income))

## estimation
b.init = c(0, 1)  # initial value
b.hat = optimx(b.init, poisson.loglik,
               method = c("BFGS", "Nelder-Mead"),
               control = list(reltol = 1e-7,
                              abstol = 1e-7))
print(b.hat)
## p1 p2 value fevals gevals niter convcode kkt1
## BFGS 1.177408 -0.09994070 261.1141 99 21 NA 0 TRUE
## Nelder-Mead 1.167261 -0.09703975 261.1317 53 NA NA 0 FALSE
## kkt2 xtime
## BFGS TRUE 0.012
## Nelder-Mead TRUE 0.001
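In the output above, convcode = 0 signals successful convergence, and kkt1 and kkt2 report optimx's checks of the first-order condition (gradient close to zero) and the second-order condition (positive definite Hessian of the minimized criterion) at the reported solution.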
Given the conditional mean model, nonlinear least squares (NLS) is also consistent in theory. NLS minimizes \[ \sum_{i=1}^n (y_i - \exp(x_i' \beta))^2. \] A natural question is: why do we prefer PPML to NLS? My argument is that PPML's criterion is globally convex in the linear index, while the NLS criterion is not. This implies that the numerical optimization of PPML is easier and more robust than that of NLS. I leave the derivation of the non-convexity of NLS as an exercise.
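For a side-by-side comparison, the NLS criterion can be coded in the same style and handed to the same optimizers. This is a sketch: nls.obj is our name, and it reuses X, y, and b.init from the estimation chunk above.
# NLS criterion for the same conditional mean model
nls.obj = function(b) sum((y - exp(X %*% as.matrix(b)))^2)
# same initial value and methods as the PPML run above
optimx(b.init, nls.obj, method = c("BFGS", "Nelder-Mead"))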
In practice, no algorithm suits all problems. Simulation, in which the true parameter is known, is helpful for checking the accuracy of one's optimization routine before applying it to an empirical problem, where the true parameter is unknown; a small simulation check is sketched after the contour example below. A contour plot is a useful tool for visualizing the function surface/manifold in low dimensions.
Example
x.grid = seq(0, 1.8, 0.02)
x.length = length(x.grid)
y.grid = seq(-.5, .2, 0.01)
y.length = length(y.grid)

# evaluate the negative log-likelihood over the grid
z.contour = matrix(0, nrow = x.length, ncol = y.length)
for (i in 1:x.length) {
  for (j in 1:y.length) {
    z.contour[i, j] = poisson.loglik(c(x.grid[i], y.grid[j]))
  }
}
contour(x.grid, y.grid, z.contour, 20)  # draw 20 contour levels
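Here is the simulation check mentioned above, as a minimal sketch: the names n.sim, X.sim, y.sim, beta0, and nll.sim are ours. We simulate Poisson data with a known parameter, minimize the same kind of negative log-likelihood, and confirm the estimate lands near the truth.
set.seed(3)
n.sim = 5000
X.sim = cbind(1, rnorm(n.sim))
beta0 = c(1.2, -0.1)                       # known true parameter
y.sim = rpois(n.sim, exp(X.sim %*% beta0))
nll.sim = function(b) sum(exp(X.sim %*% b)) - sum(y.sim * (X.sim %*% b))
optimx(c(0, 0), nll.sim, method = "BFGS")  # estimates should be close to beta0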
For problems that demand more accuracy, third-party standalone solvers can be invoked via interfaces to R. For example, we can access NLopt through the package nloptr. However, standalone solvers usually have to be compiled and configured, and these steps are often not as straightforward as installing commercial Windows software. NLopt offers an extensive list of algorithms.
Example
We first carry out the Nelder-Mead algorithm in NLopt.
library(nloptr)
## optimization with nloptr
opts = list("algorithm" = "NLOPT_LN_NELDERMEAD",
            "xtol_rel" = 1.0e-7,
            "maxeval" = 500)
res_NM = nloptr(x0 = b.init,
                eval_f = poisson.loglik,
                opts = opts)
print(res_NM)
##
## Call:
## nloptr(x0 = b.init, eval_f = poisson.loglik, opts = opts)
##
##
## Minimization using NLopt version 2.7.1
##
## NLopt solver status: 4 ( NLOPT_XTOL_REACHED: Optimization stopped because
## xtol_rel or xtol_abs (above) was reached. )
##
## Number of Iterations....: 113
## Termination conditions: xtol_rel: 1e-07 maxeval: 500
## Number of inequality constraints: 0
## Number of equality constraints: 0
## Optimal value of objective function: 261.114078295329
## Optimal value of controls: 1.177397 -0.09993981
# "SLSQP" is indeed the BFGS algorithm in NLopt,
# though "BFGS" doesn't appear in the name
opts = list("algorithm"="NLOPT_LD_SLSQP","xtol_rel"=1.0e-7)
To invoke a gradient-based algorithm in NLopt, we must code up the gradient \(s(\beta)\), as in the function poisson.loglik.grad() below.
poisson.loglik.grad = function(b) {
  b = as.matrix(b)
  lambda = exp(X %*% b)
  # gradient of the negative log-likelihood
  ell = -colSums(-as.vector(lambda) * X + y * X)
  return(ell)
}
We compare the analytical gradient with the numerical gradient to make sure the function is correct.
# check the numerical gradient and the analytical gradient
b = c(0,.5)
grad(poisson.loglik, b)
## [1] 6542.46 45825.40
poisson.loglik.grad(b)
## income
## 6542.46 45825.40
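A compact way to automate this check is to look at the largest elementwise discrepancy between the two gradients; this one-liner is a sketch reusing the numDeriv::grad call from above.
max(abs(grad(poisson.loglik, b) - poisson.loglik.grad(b)))  # should be near zero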
With the gradient function, we are ready for the gradient-based SLSQP algorithm.
res_BFGS = nloptr(x0 = b.init,
                  eval_f = poisson.loglik,
                  eval_grad_f = poisson.loglik.grad,
                  opts = opts)
print( res_BFGS )
##
## Call:
##
## nloptr(x0 = b.init, eval_f = poisson.loglik, eval_grad_f = poisson.loglik.grad,
## opts = opts)
##
##
## Minimization using NLopt version 2.7.1
##
## NLopt solver status: 4 ( NLOPT_XTOL_REACHED: Optimization stopped because
## xtol_rel or xtol_abs (above) was reached. )
##
## Number of Iterations....: 41
## Termination conditions: xtol_rel: 1e-07
## Number of inequality constraints: 0
## Number of equality constraints: 0
## Optimal value of objective function: 261.114078295325
## Optimal value of controls: 1.177397 -0.09993984
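To pull the estimates out programmatically, the list that nloptr returns includes the fields solution and objective:
# compare the two nloptr runs side by side
cbind(NelderMead = res_NM$solution, SLSQP = res_BFGS$solution)
c(NelderMead = res_NM$objective, SLSQP = res_BFGS$objective)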