Tallis Matlab fit

From InterSciWiki

Jump to: navigation, search

Doug and Laurent,

I think I've made the necessary corrections (which were somewhat non-trivial). Assuming xmin=min(x)=1000, here's a MLE fit against the data in the previous email, showing pretty good visual agreement.

If I don't try to re-estimate xmin at all, i.e., I leave it fixed at xmin=min(x)=1000, then I estimate a p-value of p=0.27 \pm 0.01. The code that does that calculation is below, but I've also included it as a special case in the qpva.m code (accessibly by also passing qpva.m the argument 'fixed').

If we use the full-blown semi-parametric bootstrap version, in which we first estimate xmin using the KS method (this yields xmin-hat=75000, with ntail=100 or so), then I estimate a p-value of p=0.78. (The increase is, I think, partly attributable to the large decrease in ntail between this and the previous method.)

In either case, the fit looks entirely plausible.

-Aaron

Subject: Re: if we modify plpva.m question (fwd)

I think these changes would be correct. As Cosma mentions, solving equation (10) for sigma can be done numerically -- it shouldn't be too difficult to incorporate such a procedure into the estimation part of the routine.

The other thing that needs to be changed is the cdf used in the calculation of the KS statistic (currently, it's a power-law with lower cutoff, but should be a q-exponential with lower cutoff).

-A

On Dec 19, 2007, at 9:42 AM, Doug White wrote:

the second equation (10) below has sigma hat in the formula, so needs the solution of the equation. I think that is a bit beyond our capabilities in matlab at this point. If this works, could you modify the code? Then we are on our way to finishing an article: fitting by hand, our results look good.

Aaron: if we modify your plpva.m code as follows

> Task: modify plpva.m --> qpva.m
> For plaw:       alpha = 1 + nz ./ sum( log(z./xmin) );
> Substitute    theta=eqn (8) Cosma p.2
>               sigma=eqn(10) Cosma p.2
>               q=1+1/theta
>               kappa=sigma/theta
> eq (8&10):http://arxiv.org/PS_cache/math/pdf/0701/0701854v2.pdf
> 
> For plaw:     q2 = xmin*(1-rand(n2,1)).^(-1/(alpha-1));
> Substitute    q2=  xmin*(1-(1-q)*(1-rand(n2,1)/kappa) ^(1/(q-1));
> Then run in matlab
> Will this find the best K-S fit for an xqmin and q entropy parameters?

Doug


On Dec 21, 2007, at 7:00 PM, Laurent Tambayong wrote:

> Aaron,

> If we adjust xmin < 1000, will the test still be reliable? There are some data, especially emails data , that have data <1000. Thanks and have a good weekend.

--Laurent

It should still be relatively reliable, but the further down you push min(x) with discrete data, while using a continuous distribution to fit it, the more inaccurate the MLE will become. (We discuss this briefly in the paper.) You may be able to get down as far as min(x) = 100 before the discrete vs. continuous issue starts introducing more bias than you already see from the noise in your data.

-A

Personal tools