Estimating Tsallis q
THE PURPOSE HERE IS TO FIND A MORE DIRECT Maximum Likelihood Estimation for q-exponential distributions, following the procedures that have been outlined to us by Tsallis and Borges for fitting a straight line (by MLE!) to x and
. Help will be greatly appreciated.
See embedded distributions approach.
http://en.wikipedia.org/wiki/Tsallis_entropy
Sandbox has the older page
q exponential and q logarithm
The solution to
, y(0)=1, is the q-exponential function (Tsallis 2004:5-6). A constant such as
may be inserted before x and will carry over before x in equations 1 and 2)
if (1 + (1 - q)x) > 0); otherwise
, where
,
The inverse of the q-exponential is the q-logarithmic function (Tsallis 2004:6):
where (ln1 x = ln x)
With an optional constant
where
Probability distribution and Cumulative probability distribution (CDF)
The probability distribution for
(Tsallis 2004:6-8) normalized with
is
The cumulative probability distribution for the q-exponential function in (1) (Tsallis 2004:8) is also a q-exponential function
, with
, where for
is not normalizable. Thus, q cannot be fitted for a range at or above 2. (Note that
does not enter into this equation).
Comparison to the Shalizi MLE by Pareto II
See Cosma Shalizi, 2007 Maximum Likelihood Estimation for q-Exponential (Tsallis) Distributions. http://www.cscs.umich.edu/~crshalizi/research/tsallis-MLE These are classically known as Pareto II distributions. (Examples). Rather than using Shalizi's method, however, we procede to estimate q and
directly, as follows (the two MLE approaches will be compared).
As related to Tsallis's exposition, above, Shalizi took as his cumulative distribution
,
Substituting to obtain Pareto II parameters,
, we may take his equations (8) and (10) for the MLE estimation of
in terms of
Possibility of a more direct solution
Is it possible that
is a solution for q?
Or that
is a solution for q?
Let
, so (equation 5) becomes (following Shalizi 2007:1, and his eq. 3)
, with the probability density
, where
is the density with a total integral of 1.
Following Shalizi 2007:1, and his eq. 4), the function whose integral is
is
THIS FIGURE IS FOR THE CUMULATIVE DISTRIBUTION
(note that
does not enter into this equation)
Deriving the MLE
For maximum likelihood we need the correct probability density function. Following Shalizi (2007:1-2, and his eq. 5), under the q-exponential model with parameter
, the log-probability density of a sequence of independent identically-distributed samples is
Then (following Shalizi 2007:2, and his eq. 7): (see final Note 1 below)
(following Shalizi 2007:2, and his eq. 8):
To find the MLE (maximum
, take the first derivative of the log-likelihood
with respect to the parameter
and set it equal to zero to solve for the maximum.
, where in the simplest case
, but we may also take the value of
that maximizes fit according to the Kolmogorov-Smirnov (K-S) bootstrap method.
The solution for
, rearranging terms, is simply
.
Let us denote by
the MLE estimates by this method (W) and by Shalizi's (S) method. Because
are both derived from the same equation (W: equation 10 above) if we set
, comparison of parameter transformations show:
.
Here, both
are upper bounded by 2, but the problem is with
. So it seems we might need a different R program to compute the MLE than Shalizi's. It will be easier, however, to modify Clauset's qpva.m program and check the validity of this new MLE. This new formula should be
Testing the MLE 1
Cosma's 2007 equation 4, page 1, used in Aarom's qpva.m RESULTS WILL DIFFER EACH TIME BECAUSE WIDTHS OF THE INTERVALS BETWEEN x VALUES WILL VARY RANDOMLY
Not Normalized to a probability because theta > sigma
xmin = 0; theta = 5; sigma = 1.3; n = 10^2; x = (1+(1-rand(n,1)/sigma)).^(-theta - 1); figure(1); loglog(sort(x),(n:-1:1)./n);
Normalize to a probability
xmin = 1; theta = 2.1; sigma = 1.1; n = 10^2; x = (theta/sigma)*(1+(1-rand(n,1)/sigma)).^(-theta - 1); figure(1); loglog(sort(x),(n:-1:1)./n); q=1+1/theta kappa=theta/sigma
(note inversion of kappa by Cosma in his equation 1 and 2)
Now try
x not cumulative
n = 10^2; q=1.5; kappa=1.9 sigma=1.9 x = [(kappa*(1-rand(n,1)/sigma)).^(1-q)-1]/(1-q); figure(1); loglog(sort(x),(n:-1:1)./n);
n = 10^2; q=1.5; kappa=1.9 sigma=1.9 x = [(kappa*(1-rand(n,1)/sigma)).^(1-q)-1]/(1-q); figure(1); plot(sort(x),(n:-1:1)./n);
straight line?
Cumulative
q=1.47623; kappa=1.9; sigma=1.909; n = 10^2; x = sort(1-rand(n,1)); for i=1: n xx(i)=sum(x(i)); end %xx=cumsum(x) lnqx = [(kappa*xx/sigma).^(1-q)-1]/(1-q); figure(1); plot(1-x,lnqx);
Not cumulative
q=1.47623; kappa=1 sigma=1.909 n = 10^2; x = sort(1-rand(n,1)); lnqx = [(kappa*x/sigma).^(1-q)-1]/(1-q); figure(1); plot(x,lnqx);
lnqx=((x(n,1)/sigma)^(1-theta)-1)/(1-theta) figure(2); loglog(sort(lnqx),(n:-1:1)./n);
try new MLE
Then try (see if theta is returned)
,
Or from Cosma's Equation 4 should this be
xmin = 0; theta = 2; sigma = 3; n = 10^2; x = (xmin+(1-rand(n,1)/sigma)).^(-theta-1); figure(1); loglog(sort(x),(n:-1:1)./n);
q=0.8; n = 10^2; k=2 sigma=3 x = [(k*(1-rand(n,1)/sigma)).^(1-q)-1]/(1-q); figure(1); loglog(sort(x),(n:-1:1)./n);
Testing the MLE 2 ???
Hi Laurent,
The best way to test whether your MLE is accurate is to try it out on synthetic data, for which you know the correct q-exponential parameters. The qpva.m code includes a way of generating synthetic data drawn from the Pareto II, formulated as by Cosma's article, so potentially you could use that code to generate synthetic data for your MLE -- just remember that you'll need Eq 2 in Cosma's article to convert the Pareto II parameters into q-exponential parameters.
Sincerely, Aaron
On Jan 15, 2008, at 2:51 PM, Laurent Tambayong wrote:
> > Hello Aaron, > > > > We need to test our derivation. http://intersci.ss.uci.edu/wiki/index.php/Estimating_Tsallis_q > > > > Is that possible to adjust QPVA to be based on equation 10 (and its corresponding q underneath) ? Currently that code is based on eq 11, 12, and 13 from Cosma. Thanks.
> > Line 174 of qpva.m reads > > w2 = (xmin+sigma).*(1-rand(n2,1)).^(-1/theta) - sigma; > > which generates n2 data points drawn from the Pareto II distribution with parameters theta and sigma, with lower-cutoff xmin. Note that xmin=0 corresponds to the "full" distribution with "no" lower cutoff. Laurent asked
I think I don't get this correctly. I was expecting to get an output in the form of data array, plus theta and sigma, given xmin. However, I don't know what I need exactly to type in MATLAB's command window to generate such output. Please bear with me. I just started to use MATLAB last time I was at SFI. Thanks
Aaron Clauset wrote: Okay, no problem. Put the following into matlab: Then, "x" will be an array of length "n" of values distributed by a Pareto II with parameters xmin = 0, theta = 2, and sigma =3. You can visualize the distribution by the last two commands. (Note by DRW: these commands dont call an external functions, they just generate a function, and the tail will differ in each trial: why differ? Why does the tail vary so much? (a) from the body? (b) from trial to trial?)
xmin = 0; theta = 2; sigma = 3; n = 10^5; x = (xmin+sigma).*(1-rand(n,1)).^(-1/theta) - sigma; figure(1); loglog(sort(x),(n:-1:1)./n);
Or from Cosma's Equation 4 should this be
x = (xmin+(1-rand(n,1)/sigma).^(-theta-1);
-Aaron. Then try this very slow calculation to get the K-S probability p
x = (xmin+sigma).*(1-rand(n,1)).^(-1/theta) - sigma; [p, gof] = qpva(x, xmin); figure(2); loglog(sort(x),(n:-1:1)./n);
Comparison to the Shalizi MLE by Pareto II
See Cosma Shalizi, 2007 Maximum Likelihood Estimation for q-Exponential (Tsallis) Distributions. http://www.cscs.umich.edu/~crshalizi/research/tsallis-MLE These are classically known as Pareto II distributions. (Examples). Rather than using Shalizi's method, however, we procede to estimate q and
directly, as follows (the two MLE approaches will be compared).
As noted above, in Tsallis's exposition:
The cumulative probability distribution for the q-exponential function in (1) (Tsallis 2004:8) is also a q-exponential function
, with
, where for
is not normalizable. Thus, q cannot be fitted for a range at or above 2. (Note that
does not enter into this equation).
Shalizi took as his cumulative distribution
,
Substituting to obtain Pareto II parameters,
, we may take his equations (8) and (10) for the MLE estimation of
in terms of
Another set of questions
(Probably obsolete)
If the MLE defined above is done successfully, our question is: WILL THIS CONVERGE TO THE Shalizi Pareto II MLE?
We have a matlab program from Aaron Clauset that solves the 2-parameter Shalizi Pareto II MLE formula for
and
(and also finds
for the best fitting q distribution). In Shalizi's MLE, to convert to
and q to the Pareto II, we have
(equation 11) and
(equation 11)
Then
(equation 12)
and
is estimated by the transcendental:
(equation 13)
References
Tsallis, Constantino. 2004. Nonextensive Statistical Mechanics: Construction and Physical Interpretation. Chapter 1, in Murray Gell-Mann and Constantino Tsallis, Nonextensive Entropy: Interdisciplinary Applications pp. 1-53.
Tsallis, Constantino. 2001. Nonextensive statistical mechanics and thermodynamics: Historical back-ground and present status. In Nonextensive Statistical Mechanics and Its Applications, eds. S. Abe e Y. Okamoto (Series Lecture Notes in Physics, Springer-Verlag, Heidelberg, 2001), pp.~3--98 [ISBN 3-540-41208-5]. The appendix gives properties of q-exponentials and q-logarithms.
Beck, C. (2002) Non-additivity of Tsallis entropies and fluctuations of temperature Europhys. Lett., 57 (3), p. 329
Straub, John E., and Ioan Andricioaei 1999 Computational methods inspired by Tsallis statistics: Monte Carlo and molecular dynamics algorithms for the simulation of classical and quantum systems Brazilian Journal of Physics vol.29 n.1
Some properties of q-logarithm and q-exponential functions in Tsallis statistics Takuya Yamano 2002 Physica A: Statistical Mechanics and its Applications 305(3-4): 486-496.
See also
http://www.scholarpedia.org/article/Kolmogorov-Sinai_entropy Dynamical entropy
Tsallis’ entropy maximization procedure revisited
Note 1
DRW: My previous equation 9 was incorrect. I had set
=1. Natasa commented as follows:
To find the MLE (maximum
, take the first derivative of the log-likelihood
with respect to the parameter
and set it equal to zero.
- (Comment by NK: from (equation 8) you get the following equation for the derivative that has to equate to 0:
- (I don't know how to nicely (analytically) get
out of that. I think the easiest way would be to use a kind of iterative algorithm to get to the solution (so something similar to that of the MLE estimation, that Shalizi did). --NK
- (Comment by NK: from (equation 8) you get the following equation for the derivative that has to equate to 0:
Note 2: Eliminated - Curve fitting: estimating q with the q-logarithmic function
To estimate q it is sufficient to fit a straight line for the THE CUMULATIVE Z(k) against k, where in this illustration q ~ 1.7. This can be done by maximizing
for Z(k) ~ k (letter k not Greek
), where
is just the slope ( -slope more precisely) for that value of q. We can now compare maximizing
with the MLE solution above, and derive
as the (-slope) in the figure.

