I'm trying to test a simple hidden Markov model with two states and
Gaussian emissions. I've written the model attached below. It uses
some test data, generated with transition probability matrix [[0.9,
0.1], [0.3, 0.7]], mu = [1., -1.], and sigma = [0.25, 0.75], that I've
placed in my public Dropbox folder
(http://dl.dropbox.com/u/647515/PyMC/test_data.txt).
Unfortunately, the model doesn't run, crashing with the following
error message:
/home/tsmall/HMM/NormalMixture/NormalMixtureModel.py in states_logp(value, Ptrans)
     53
     54     for i in range(1, len(value)):
---> 55         logp = logp + pymc.categorical_like(value[i], P[value[i-1]])
     56
     57     return logp

IndexError: index out of bounds
What's happening is that the variable "states", which contains the
hidden states of the Markov chain, gets a small number of errant state
indices. With this two state example, the state indices are 0 and 1,
but states ends up with a handful (or sometimes only one) of indices
>1, which causes the error. The mystery is that if I draw random
chains using states.random(), I never see errant indices. I'm
wondering if it's possible that somewhere under the hood PyMC is
modifying the state indices and if one of the PyMC pundits can take a
look?
I also wanted to comment that I find the PyMC implementation of the
Dirichlet distribution a little awkward to use because, for dimension
k, you have to drop the k-th entry. Or, equivalently,
pymc.rdirichlet([2., 2.], size=1) returns a one-element array rather
than a two-element array. Of course, I realize that since the
elements have to sum to 1, including all elements is unnecessary, but
the PyMC behavior is different from the behavior of
np.random.dirichlet() or the Dirichlet implementation in the GNU
Scientific Library.
Thanks,
Todd
import numpy as np
import pymc
def unconditionalProbability(Ptrans):
    """Compute the unconditional (stationary) probability of the
    states of a Markov chain."""
    m = Ptrans.shape[0]
    # Rebuild the full m x m transition matrix from the truncated
    # (m x m-1) form by appending the implied last column.
    P = np.column_stack((Ptrans, 1. - Ptrans.sum(axis=1)))
    I = np.eye(m)
    U = np.ones((m, m))
    u = np.ones(m)
    # Solve pi (I - P + U) = u, which enforces both pi P = pi
    # and sum(pi) = 1.
    return np.linalg.solve((I - P + U).T, u)
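
# Quick illustrative check: the test data above was generated with
# [[0.9, 0.1], [0.3, 0.7]], whose truncated form is [[0.9], [0.3]];
# the stationary distribution then works out to [0.75, 0.25]:
#     unconditionalProbability(np.array([[0.9], [0.3]]))  # -> [ 0.75, 0.25]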
data = np.loadtxt('test_data.txt',
                  dtype=np.dtype([('state', np.uint8),
                                  ('emission', np.float)]),
                  delimiter=',',
                  skiprows=1)
# Two state model for simplicity.
N_states = 2
N_chain = len(data)
# Transition probability stochastic
theta = np.ones(N_states) + 1.
def Ptrans_logp(value, theta):
    logp = 0.
    for i in range(value.shape[0]):
        logp = logp + pymc.dirichlet_like(value[i], theta)
    return logp

def Ptrans_random(theta):
    return pymc.rdirichlet(theta, size=len(theta))

Ptrans = pymc.Stochastic(logp=Ptrans_logp,
                         doc='Transition matrix',
                         name='Ptrans',
                         parents={'theta': theta},
                         random=Ptrans_random)
# Hidden states stochastic
def states_logp(value, Ptrans):
    P = np.column_stack((Ptrans, 1. - Ptrans.sum(axis=1)))
    Pinit = unconditionalProbability(Ptrans)
    logp = pymc.categorical_like(value[0], Pinit)
    for i in range(1, len(value)):
        logp = logp + pymc.categorical_like(value[i], P[value[i-1]])
    return logp

def states_random(Ptrans, N_chain=N_chain):
    P = np.column_stack((Ptrans, 1. - Ptrans.sum(axis=1)))
    Pinit = unconditionalProbability(Ptrans)
    states = np.empty(N_chain, dtype=np.uint8)
    states[0] = pymc.rcategorical(Pinit)
    for i in range(1, N_chain):
        states[i] = pymc.rcategorical(P[states[i-1]])
    return states

states = pymc.Stochastic(logp=states_logp,
                         doc='Hidden states',
                         name='states',
                         parents={'Ptrans': Ptrans},
                         random=states_random,
                         dtype=np.uint8)
# Gaussian emission parameters
mu = pymc.Normal('mu', 0., 1.e-6, value=np.random.randn(N_states))
sigma = pymc.Uniform('sigma', 0., 100., value=np.random.rand(N_states))

# Observed emissions, indexed by the hidden state at each time step.
y = pymc.Normal('y', mu[states], 1./sigma[states]**2,
                value=data['emission'], observed=True)
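For reference, here is how I am invoking the sampler (a minimal
sketch; the iteration counts are arbitrary):

M = pymc.MCMC([Ptrans, states, mu, sigma, y])
M.sample(iter=10000, burn=5000)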
Will have a look. Is it possible that your model is changing
dimension?
In case you are interested, here is a simple HMM model for disease
dynamics that I coded recently:
http://files.me.com/fonnesbeck/knvirc
>
> I also wanted to comment that I find the PyMC implementation of the
> Dirichlet distribution a little awkward to use because, for dimension
> k, you have to drop the k-th entry. Or, equivalently,
> pymc.rdirichlet([2., 2.], size=1) returns a one-element array rather
> than a two-element array. Of course, I realize that since the
> elements have to sum to 1, including all elements is unnecessary, but
> the PyMC behavior is different from the behavior of
> np.random.dirichlet() or the Dirichlet implementation in the GNU
> Scientific Library.
We had this discussion internally a while back. The problem is that it
is too easy to end up with values that do not sum to one, so we made
Dirichlet behave the same way that a binomial would -- you only
specify p for a binomial, and not q=1-p. I began the argument on the
same side you are on, and eventually came around to see it the other
way. Ideally, we should allow for both.
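
In the meantime, here is a minimal sketch of how to recover the full
vector under the current convention (the name p_full and the alpha
values are just illustrative):

import numpy as np
import pymc

alpha = np.array([2., 2.])
# pymc.Dirichlet stores only the first k-1 elements of the simplex...
p_free = pymc.Dirichlet('p_free', theta=alpha)

# ...so a small Deterministic can append the implied last element
# whenever the complete k-vector is needed downstream.
@pymc.deterministic
def p_full(p=p_free):
    return np.append(p, 1. - p.sum())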
On Feb 6, 9:54 am, Chris Fonnesbeck <[email protected]>
wrote:
> On Feb 5, 4:16 pm, Todd <[email protected]> wrote:
>
> > What's happening is that the variable "states", which contains the
> > hidden states of the Markov chain, gets a small number of errant state
> > indices. With this two state example, the state indices are 0 and 1,
> > but states ends up with a handful (or sometimes only one) of indices >1,
> > which causes the error. The mystery is that if I draw random
> > chains using states.random(), I never see errant indices. I'm
> > wondering if it's possible that somewhere under the hood PyMC is
> > modifying the state indices and if one of the PyMC pundits can take a
> > look?
>
> Will have a look. Is it possible that your model is changing
> dimension?
Thanks for taking a look! The model is certainly not supposed to be
changing dimension...
>
> In case you are interested, here is a simple HMM model for disease
> dynamics that I coded recently:
>
> http://files.me.com/fonnesbeck/knvirc
>
I'm confused by your model, which I imagine is due to some lack of
understanding on my part. How are you actually sampling the hidden
state path? The stochastic z doesn't have a random method, and, when
I run the model, the value of z never changes. I would like to be
able to sample the hidden path in order, for example, to estimate the
state probabilities at a particular time. To be as clear as possible,
here's a WinBUGS/JAGS version of the model I'd like to implement in
PyMC:
var Ptrans[2, 2]
model {
    # Transition probability
    Ptrans[1,] ~ ddirch(alpha[])
    Ptrans[2,] ~ ddirch(alpha[])

    # States
    Pinit[1] <- 0.5
    Pinit[2] <- 0.5
    state[1] ~ dcat(Pinit[])
    for (t in 2:N) {
        state[t] ~ dcat(Ptrans[state[t-1],])
    }

    # Parameters
    mu[1] ~ dnorm(0., 1.e-6)
    mu[2] ~ dnorm(0., 1.e-6)
    tau[1] ~ dgamma(0.001, 0.001)
    tau[2] ~ dgamma(0.001, 0.001)
    sigma[1] <- 1./sqrt(tau[1])
    sigma[2] <- 1./sqrt(tau[2])

    # Observations
    for (t in 1:N) {
        y[t] ~ dnorm(mu[state[t]], tau[state[t]])
    }
}
>
> > I also wanted to comment that I find the PyMC implementation of the
> > Dirichlet distribution a little awkward to use because, for dimension
> > k, you have to drop the k-th entry. Or, equivalently,
> > pymc.rdirichlet([2., 2.], size=1) returns a one-element array rather
> > than a two-element array. Of course, I realize that since the
> > elements have to sum to 1, including all elements is unnecessary, but
> > the PyMC behavior is different from the behavior of
> > np.random.dirichlet() or the Dirichlet implementation in the GNU
> > Scientific Library.
>
> We had this discussion internally a while back. The problem is that it
> is too easy to end up with values that do not sum to one, so we made
> Dirichlet behave the same way that a binomial would -- you only
> specify p for a binomial, and not q=1-p. I began the argument on the
> same side you are on, and eventually came around to see it the other
> way. Ideally, we should allow for both.
Thanks for letting me know what you guys were thinking. I can
appreciate your point of view.
Todd
Strange -- let me have another look at my own model, then. In the
meantime, here is a version of your model that appears to update. The
problem was with values for the categorical likelihood being proposed
that were out of its range.
http://snipplr.com/view/28065/hidden-markov-model/
Chris
On Feb 9, 1:02 pm, Chris Fonnesbeck <[email protected]>
wrote:
Hi Chris,
Thanks for the revised version of the code. I think, though, that the
question remains why values outside of the range of the categorical
likelihood are appearing. As I mentioned in my original post, if I
call states.random() a zillion times on its own, I never see errant
values. It's only when I'm sampling a model that the errant values
occur.
Aside from this problem, does the model I've written otherwise look
like a correct PyMC implementation of a hidden Markov model?
Todd
>
> Thanks for the revised version of the code. I think, though, that the
> question remains why values outside of the range of the categorical
> likelihood are appearing. As I mentioned in my original post, if I
> call states.random() a zillion times on its own, I never see errant
> values. It's only when I'm sampling a model that the errant values
> occur.
>
This is because there is no connection between the random method of a
Stochastic and the proposal of values for that Stochastic. Values are
proposed according to the StepMethod assigned to the Stochastic. So,
for a (say) Metropolis sampler, proposals are based on a random walk
centered on the current value of the variable. Hence, it is easy to
get a value that, when used as an index, is outside the desired range.
So, the value checking line that I added to the model fixes the
problem. It sounds like you were expecting an independence sampler
based on the random method?
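
To make that concrete, here is a minimal sketch of the kind of guard I
mean, applied to your states_logp (the exact check in the snippet I
linked may differ):

def states_logp(value, Ptrans):
    # Reject out-of-range proposals up front: a random-walk Metropolis
    # step can propose indices outside {0, ..., N_states - 1}, which
    # would otherwise crash the categorical likelihood with an
    # IndexError. Returning -inf signals zero probability (raising
    # pymc.ZeroProbability would work as well).
    if value.min() < 0 or value.max() >= N_states:
        return -np.inf
    P = np.column_stack((Ptrans, 1. - Ptrans.sum(axis=1)))
    Pinit = unconditionalProbability(Ptrans)
    logp = pymc.categorical_like(value[0], Pinit)
    for i in range(1, len(value)):
        logp = logp + pymc.categorical_like(value[i], P[value[i-1]])
    return logp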
> Aside from this problem, does the model I've written otherwise look
> like a correct PyMC implementation of a hidden Markov model?
It appears so, though I have not yet looked at the BUGS code you sent.
cf
On Feb 9, 3:06 pm, Chris Fonnesbeck <[email protected]>
wrote:
> On Feb 10, 11:35 am, Todd <[email protected]> wrote:
>
> > Thanks for the revised version of the code. I think, though, that the
> > question remains why values outside of the range of the categorical
> > likelihood are appearing. As I mentioned in my original post, if I
> > call states.random() a zillion times on its own, I never see errant
> > values. It's only when I'm sampling a model that the errant values
> > occur.
>
> This is because there is no connection between the random method of a
> Stochastic and the proposal of values for that Stochastic. Values are
> proposed according to the StepMethod assigned to the Stochastic. So,
> for a (say) Metropolis sampler, proposals are based on a random walk
> centered on the current value of the variable. Hence, it is easy to
> get a value that, when used as an index, is outside the desired range.
> So, the value checking line that I added to the model fixes the
> problem. It sounds like you were expecting an independence sampler
> based on the random method?
>
Hi Chris,
Argh, yes, that's my (elementary) conceptual mistake.
Thanks again for your help!
Todd