Estimating Tsallis q
From InterSciWiki
THE PURPOSE HERE IS TO FIND A MORE DIRECT Maximum Likelihood Estimation for q-exponential distributions, following the procedures that have been outlined to us by Tsallis and Borges for fitting a straight line (by MLE!) to x and
. Help will be greatly appreciated.
See embedded distributions approach.
http://en.wikipedia.org/wiki/Tsallis_entropy
Sandbox has the older page
[edit] q exponential and q logarithm
The solution to
, y(0)=1, is the q-exponential function (Tsallis 2004:5-6). A constant such as λ may be inserted before x and will carry over before x in equations 1 and 2)
if (1 + (1 - q)x) > 0); otherwise
, where
,
The inverse of the q-exponential is the q-logarithmic function (Tsallis 2004:6):
where (ln1 x = ln x)
With an optional constant κ
where
[edit] Probability distribution and Cumulative probability distribution (CDF)
The probability distribution for
(Tsallis 2004:6-8) normalized with
is
The cumulative probability distribution for the q-exponential function in (1) (Tsallis 2004:8) is also a q-exponential function
, with
, where for
is not normalizable. Thus, q cannot be fitted for a range at or above 2. (Note that κ does not enter into this equation).
[edit] Comparison to the Shalizi MLE by Pareto II
See Cosma Shalizi, 2007 Maximum Likelihood Estimation for q-Exponential (Tsallis) Distributions. http://www.cscs.umich.edu/~crshalizi/research/tsallis-MLE These are classically known as Pareto II distributions. (Examples). Rather than using Shalizi's method, however, we procede to estimate q and κ directly, as follows (the two MLE approaches will be compared).
As related to Tsallis's exposition, above, Shalizi took as his cumulative distribution
,
Substituting to obtain Pareto II parameters, ,θ,σ, we may take his equations (8) and (10) for the MLE estimation of
in terms of
[edit] Possibility of a more direct solution
Is it possible that
is a solution for q?
Or that
is a solution for q?
Let
, so (equation 5) becomes (following Shalizi 2007:1, and his eq. 3)
, with the probability density
, where f(x) is the density with a total integral of 1.
Following Shalizi 2007:1, and his eq. 4), the function whose integral is
is
THIS FIGURE IS FOR THE CUMULATIVE DISTRIBUTION
(note that κ does not enter into this equation)
[edit] Deriving the MLE
For maximum likelihood we need the correct probability density function. Following Shalizi (2007:1-2, and his eq. 5), under the q-exponential model with parameter η, the log-probability density of a sequence of independent identically-distributed samples is
Then (following Shalizi 2007:2, and his eq. 7): (see final Note 1 below)
(following Shalizi 2007:2, and his eq. 8):
To find the MLE (maximum
, take the first derivative of the log-likelihood
with respect to the parameter η and set it equal to zero to solve for the maximum.
, where in the simplest case xmin = η, but we may also take the value of xmin that maximizes fit according to the Kolmogorov-Smirnov (K-S) bootstrap method.
The solution for
, rearranging terms, is simply
.
Let us denote by
the MLE estimates by this method (W) and by Shalizi's (S) method. Because
are both derived from the same equation (W: equation 10 above) if we set xmin = η, comparison of parameter transformations show:
.
Here, both
are upper bounded by 2, but the problem is with xmin. So it seems we might need a different R program to compute the MLE than Shalizi's. It will be easier, however, to modify Clauset's qpva.m program and check the validity of this new MLE. This new formula should be
[edit] Testing the MLE 1
Cosma's 2007 equation 4, page 1, used in Aarom's qpva.m RESULTS WILL DIFFER EACH TIME BECAUSE WIDTHS OF THE INTERVALS BETWEEN x VALUES WILL VARY RANDOMLY
Not Normalized to a probability because theta > sigma
xmin = 0; theta = 5; sigma = 1.3; n = 10^2; x = (1+(1-rand(n,1)/sigma)).^(-theta - 1); figure(1); loglog(sort(x),(n:-1:1)./n);
Normalize to a probability
xmin = 1; theta = 2.1; sigma = 1.1; n = 10^2; x = (theta/sigma)*(1+(1-rand(n,1)/sigma)).^(-theta - 1); figure(1); loglog(sort(x),(n:-1:1)./n); q=1+1/theta kappa=theta/sigma
(note inversion of kappa by Cosma in his equation 1 and 2)
Now try
[edit] x not cumulative
n = 10^2; q=1.5; kappa=1.9 sigma=1.9 x = [(kappa*(1-rand(n,1)/sigma)).^(1-q)-1]/(1-q); figure(1); loglog(sort(x),(n:-1:1)./n);
n = 10^2; q=1.5; kappa=1.9 sigma=1.9 x = [(kappa*(1-rand(n,1)/sigma)).^(1-q)-1]/(1-q); figure(1); plot(sort(x),(n:-1:1)./n);
[edit] straight line?
Cumulative
q=1.47623; kappa=1.9; sigma=1.909; n = 10^2; x = sort(1-rand(n,1)); for i=1: n xx(i)=sum(x(i)); end %xx=cumsum(x) lnqx = [(kappa*xx/sigma).^(1-q)-1]/(1-q); figure(1); plot(1-x,lnqx);
Not cumulative
q=1.47623; kappa=1 sigma=1.909 n = 10^2; x = sort(1-rand(n,1)); lnqx = [(kappa*x/sigma).^(1-q)-1]/(1-q); figure(1); plot(x,lnqx);
lnqx=((x(n,1)/sigma)^(1-theta)-1)/(1-theta) figure(2); loglog(sort(lnqx),(n:-1:1)./n);
[edit] try new MLE
Then try (see if theta is returned)
,
Or from Cosma's Equation 4 should this be
xmin = 0; theta = 2; sigma = 3; n = 10^2; x = (xmin+(1-rand(n,1)/sigma)).^(-theta-1); figure(1); loglog(sort(x),(n:-1:1)./n);
q=0.8; n = 10^2; k=2 sigma=3 x = [(k*(1-rand(n,1)/sigma)).^(1-q)-1]/(1-q); figure(1); loglog(sort(x),(n:-1:1)./n);
[edit] Testing the MLE 2 ???
Hi Laurent,
The best way to test whether your MLE is accurate is to try it out on synthetic data, for which you know the correct q-exponential parameters. The qpva.m code includes a way of generating synthetic data drawn from the Pareto II, formulated as by Cosma's article, so potentially you could use that code to generate synthetic data for your MLE -- just remember that you'll need Eq 2 in Cosma's article to convert the Pareto II parameters into q-exponential parameters.
Sincerely, Aaron
On Jan 15, 2008, at 2:51 PM, Laurent Tambayong wrote:
> > Hello Aaron, > > > > We need to test our derivation. http://intersci.ss.uci.edu/wiki/index.php/Estimating_Tsallis_q > > > > Is that possible to adjust QPVA to be based on equation 10 (and its corresponding q underneath) ? Currently that code is based on eq 11, 12, and 13 from Cosma. Thanks.
> > Line 174 of qpva.m reads > > w2 = (xmin+sigma).*(1-rand(n2,1)).^(-1/theta) - sigma; > > which generates n2 data points drawn from the Pareto II distribution with parameters theta and sigma, with lower-cutoff xmin. Note that xmin=0 corresponds to the "full" distribution with "no" lower cutoff. Laurent asked
I think I don't get this correctly. I was expecting to get an output in the form of data array, plus theta and sigma, given xmin. However, I don't know what I need exactly to type in MATLAB's command window to generate such output. Please bear with me. I just started to use MATLAB last time I was at SFI. Thanks
Aaron Clauset wrote: Okay, no problem. Put the following into matlab: Then, "x" will be an array of length "n" of values distributed by a Pareto II with parameters xmin = 0, theta = 2, and sigma =3. You can visualize the distribution by the last two commands. (Note by DRW: these commands dont call an external functions, they just generate a function, and the tail will differ in each trial: why differ? Why does the tail vary so much? (a) from the body? (b) from trial to trial?)
xmin = 0; theta = 2; sigma = 3; n = 10^5; x = (xmin+sigma).*(1-rand(n,1)).^(-1/theta) - sigma; figure(1); loglog(sort(x),(n:-1:1)./n);
Or from Cosma's Equation 4 should this be
x = (xmin+(1-rand(n,1)/sigma).^(-theta-1);
-Aaron. Then try this very slow calculation to get the K-S probability p
x = (xmin+sigma).*(1-rand(n,1)).^(-1/theta) - sigma; [p, gof] = qpva(x, xmin); figure(2); loglog(sort(x),(n:-1:1)./n);
[edit] Comparison to the Shalizi MLE by Pareto II
See Cosma Shalizi, 2007 Maximum Likelihood Estimation for q-Exponential (Tsallis) Distributions. http://www.cscs.umich.edu/~crshalizi/research/tsallis-MLE These are classically known as Pareto II distributions. (Examples). Rather than using Shalizi's method, however, we procede to estimate q and κ directly, as follows (the two MLE approaches will be compared).
As noted above, in Tsallis's exposition:
The cumulative probability distribution for the q-exponential function in (1) (Tsallis 2004:8) is also a q-exponential function
, with
, where for
is not normalizable. Thus, q cannot be fitted for a range at or above 2. (Note that κ does not enter into this equation).
Shalizi took as his cumulative distribution
,
Substituting to obtain Pareto II parameters, ,θ,σ, we may take his equations (8) and (10) for the MLE estimation of
in terms of
[edit] Another set of questions
(Probably obsolete)
If the MLE defined above is done successfully, our question is: WILL THIS CONVERGE TO THE Shalizi Pareto II MLE?
We have a matlab program from Aaron Clauset that solves the 2-parameter Shalizi Pareto II MLE formula for
and
(and also finds xmin for the best fitting q distribution). In Shalizi's MLE, to convert to κ and q to the Pareto II, we have
κ = θ / σ (equation 11) and
q = 1 + 1 / θ (equation 11)
Then
(equation 12)
and
is estimated by the transcendental:
(equation 13)
[edit] References
Tsallis, Constantino. 2004. Nonextensive Statistical Mechanics: Construction and Physical Interpretation. Chapter 1, in Murray Gell-Mann and Constantino Tsallis, Nonextensive Entropy: Interdisciplinary Applications pp. 1-53.
Tsallis, Constantino. 2001. Nonextensive statistical mechanics and thermodynamics: Historical back-ground and present status. In Nonextensive Statistical Mechanics and Its Applications, eds. S. Abe e Y. Okamoto (Series Lecture Notes in Physics, Springer-Verlag, Heidelberg, 2001), pp.~3--98 [ISBN 3-540-41208-5]. The appendix gives properties of q-exponentials and q-logarithms.
Beck, C. (2002) Non-additivity of Tsallis entropies and fluctuations of temperature Europhys. Lett., 57 (3), p. 329
Straub, John E., and Ioan Andricioaei 1999 Computational methods inspired by Tsallis statistics: Monte Carlo and molecular dynamics algorithms for the simulation of classical and quantum systems Brazilian Journal of Physics vol.29 n.1
Some properties of q-logarithm and q-exponential functions in Tsallis statistics Takuya Yamano 2002 Physica A: Statistical Mechanics and its Applications 305(3-4): 486-496.
[edit] See also
http://www.scholarpedia.org/article/Kolmogorov-Sinai_entropy Dynamical entropy
Tsallis’ entropy maximization procedure revisited
[edit] Note 1
DRW: My previous equation 9 was incorrect. I had set n / η=1. Natasa commented as follows:
To find the MLE (maximum
, take the first derivative of the log-likelihood
with respect to the parameter η and set it equal to zero.
- (Comment by NK: from (equation 8) you get the following equation for the derivative that has to equate to 0:
- (I don't know how to nicely (analytically) get η out of that. I think the easiest way would be to use a kind of iterative algorithm to get to the solution (so something similar to that of the MLE estimation, that Shalizi did). --NK
- (Comment by NK: from (equation 8) you get the following equation for the derivative that has to equate to 0:
[edit] Note 2: Eliminated - Curve fitting: estimating q with the q-logarithmic function
To estimate q it is sufficient to fit a straight line for the THE CUMULATIVE Z(k) against k, where in this illustration q ~ 1.7. This can be done by maximizing r2 for Z(k) ~ k (letter k not Greek κ), where κ is just the slope ( -slope more precisely) for that value of q. We can now compare maximizing r2 with the MLE solution above, and derive κ as the (-slope) in the figure.


