# Tsallis q historical cities and city-sizes

Figure 9.2: Rank-size city data (semi-log) for Eurasia (Europe, China, and Mid-Asia), from Tertius Chandler's Four Thousand Years of Urban Growth (1987). From White, Tambayong, and Kejžar, Oscillatory dynamics of city-size distributions in world historical systems, 2007.

## Introduction

The Tsallis q historical cities and city-sizes project, undertaken by Doug White and Laurent Tambayong with the help of Cosma Shalizi and Aaron Clauset, builds on an earlier paper with Nataša Kejžar (in press), Oscillatory dynamics of city-size distributions in world historical systems, and an even earlier paper on a generative feedback or social-circles network model with Nataša, Constantino Tsallis, Doyne Farmer, and Scott White. The latter paper provided a simulation of the ideas that motivate the current project and the previous paper with Nataša: namely, that the Tsallis entropy model will fit city-size distributions better than the Pareto I and other common distributions because long-range inter-city correlations operate through trade networks and interregional competition and conflict.

The work took new shape in the Tsallis q distribution project started at SFI in September 2007 with Aaron Clauset and Cosma Shalizi.

## Statistical Methods

Lessons learned at Santa Fe with the Tsallis q distribution project:

1. Cosma's initial [q-fitting software] normalizes over the empirical distribution whose lower limit is specified by xmin, but we instead require an xqmin that sets the lower limit to which the Pareto II is normalized. xqmin may, in theory, be smaller than the xmin lower bound of the data. An example is the city data, where we know that the minimum population size is 5,000 people. So, in the case where the theoretical xqmin < xmin, the R program needs one or two options:
    1. The fitted distribution must run to xqmin even though the fitting is done only for the tail starting with xmin. This creates a normalization into the (0,1) probability distribution in which 1 corresponds to a Y0 lower bound of the unnormalized distribution, and it is easy to solve for Y0.
    2. The R program can vary xqmin from an initial minimum so as to maximize fit.

    As in (2.), if xqmin is initially set equal to xmin of the data, we need a procedure to find the optimal xqmin > xmin. Another way to do this is to use Aaron's Matlab programs, which take a Kolmogorov-Smirnov approach to fitting q-exponentials, varying the xqmin threshold so as to maximize fit. That is, apply the Kolmogorov-Smirnov test to each fit of the Pareto II, varying the minxq cutoff. In most cases from the Clauset et al. 2007 study, the K-S values form a U-shaped distribution of fit, with a single minimum or minimum region, as the minxq cutoff for the Pareto II varies. Shalizi's likelihood ratio tests can then be used for comparison of models. The challenge here will be to compare fit between a Pareto I with one cutoff, minxp, and a Pareto II with another lower cutoff, minxq.
2. When using Cosma's R software for likelihood ratio comparisons with other distributions (Pareto I and II, for example), we have the difficulty of comparing the fits of two distributions which have different lower bounds to data that may also have different lower bounds, so there are four parameters total, three needed as input:
    1. minx, the lower cutoff for the data series
    2. minxp > minx, the lower cutoff for the Pareto I fit
    3. minxq, the lower cutoff for the Pareto II fit, which may be greater or less than minx and defines a region with a calculated minimum:
    4. qxmin ≥ minx, which represents the lower bound for fitting the Pareto II to the data.
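The K-S cutoff-selection idea above can be sketched as follows (after Clauset et al. 2007): scan candidate lower cutoffs, fit the tail above each cutoff by maximum likelihood, and keep the cutoff whose fitted tail lies closest to the data in K-S distance. A minimal Python sketch for the Pareto I case; the Pareto II case is the same scan with a numerical MLE in place of the closed form, and all function names here are illustrative, not from the project's actual code:

```python
import math

def fit_alpha(tail, xmin):
    # continuous Pareto I MLE: alpha = 1 + n / sum(log(x / xmin))
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)

def ks_distance(tail, xmin, alpha):
    # max gap between the fitted Pareto I CDF and the empirical CDF of the tail
    xs = sorted(tail)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        model = 1.0 - (x / xmin) ** (1.0 - alpha)
        d = max(d, abs(model - i / n), abs(model - (i + 1) / n))
    return d

def best_xmin(data, min_tail=50):
    # scan candidate cutoffs, keeping the one that minimizes the K-S distance
    best = None
    for xmin in sorted(set(data)):
        tail = [x for x in data if x >= xmin]
        if len(tail) < min_tail:
            break  # tails only shrink as xmin grows
        alpha = fit_alpha(tail, xmin)
        d = ks_distance(tail, xmin, alpha)
        if best is None or d < best[0]:
            best = (d, xmin, alpha)
    return best  # (ks, xmin, alpha)
```

This is where the expected U-shape shows up: too low a cutoff includes non-power-law body and raises the K-S distance; too high a cutoff leaves a noisy tail.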

## The Empirical Research

The empirical steps:

1. Extract the sizes of each of Chandler's "largest cities" for each regional urban network and each of Chandler's historical periods. (done)
2. For each ordered list of sizes, fit Pareto I and II, extracting:
    - the Pareto I $\alpha$ and minxp parameters
    - the Pareto II $q$, $\kappa$, and qxmin parameters
    - error bounds, etc., for each fit
3. The Pareto II fit will require optimization on the qxmin parameter. The Y0 parameter for Pareto II (in its Tsallis distribution transform) can thus be recovered as well.
4. Plot the empirical distribution in the range qxmin < minxp to maxx.
5. Add to the plot the curves for Pareto I and II over their ranges.
6. Use a modified likelihood ratio (LR) comparison to evaluate the fit probabilities for Pareto I and II. These steps provide time series for $q$, $\kappa$, Y0, and $\alpha$.
7. For 1 < q ≤ 2, check the convergence of $\alpha$ ~ 1/(q-1).
8. Estimate 50-year changes where not given, to create a uniform time series.
9. Test two-equation time-lagged effects using multiple regression.
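The relation in step 7 can be checked numerically: for $1 < q < 2$ the q-exponential $[1-(1-q)x/\kappa]^{1/(1-q)}$ is asymptotically a power law whose log-log slope approaches $-1/(q-1)$, so the fitted Pareto I $\alpha$ and Pareto II $q$ should agree deep in the tail. A quick illustrative sketch (not project code):

```python
import math

def q_exp(x, q, kappa):
    # Tsallis q-exponential e_q(-x/kappa); for q > 1 this is a Pareto II (Lomax) form
    return (1.0 + (q - 1.0) * x / kappa) ** (-1.0 / (q - 1.0))

def local_loglog_slope(f, x, h=1e-4):
    # numerical d(log f) / d(log x) at x
    return (math.log(f(x * (1 + h))) - math.log(f(x))) / math.log(1 + h)

q, kappa = 1.4, 2.0
slope = local_loglog_slope(lambda x: q_exp(x, q, kappa), 1e6)
# far in the tail the slope approaches -1/(q-1) = -2.5
```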

The project fits multiple equations for dynamical modeling of interactions among historical variables measuring change over 50-year periods for Chinese and European cities, 900 CE to 2000.
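A generic two-equation, time-lagged setup of this kind can be sketched as a first-order vector autoregression fit by ordinary least squares, one equation per variable with both variables lagged one period as predictors. This is a minimal illustration under that assumption, not the project's actual specification:

```python
def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting for a small linear system
    n = len(A)
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(n):
            if r != i:
                f = M[r][i] / M[i][i]
                for c in range(i, n + 1):
                    M[r][c] -= f * M[i][c]
    return [M[i][n] / M[i][i] for i in range(n)]

def var1_fit(y1, y2):
    # regress each series on a constant and both series lagged one period
    T = len(y1)
    X = [[1.0, y1[t - 1], y2[t - 1]] for t in range(1, T)]
    out = []
    for y in (y1[1:], y2[1:]):
        XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
        Xty = [sum(r[i] * yv for r, yv in zip(X, y)) for i in range(3)]
        out.append(solve(XtX, Xty))
    return out  # [[const1, a11, a12], [const2, a21, a22]]
```

The off-diagonal coefficients (a12, a21) are the cross-lag effects of interest: the influence of each region's past values on the other's current ones.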

As suggested by David Krakauer, we might also find relevant time-series aspects of q-entropy as theorized by Tsallis for extensive processes in which long-range interactions such as trade and warfare decay as $r^{-\alpha}$ with distance $r$. For greater accuracy we are using maximum likelihood estimation (MLE) of parameters for historically changing city-size distributions. This is not likely, however, since it would require some basic properties of the data that preserve extensivity.

These are links to some of the relevant papers by Tsallis on this latter topic:

On the extensivity of the entropy $S_q$, the q-generalized central limit theorem and the q-triplet. Constantino Tsallis. arXiv:cond-mat/0510125, 2005. http://arxiv.org/PS_cache/cond-mat/pdf/0510/0510125v1.pdf

Dynamical scenario for nonextensive statistical mechanics. Constantino Tsallis (cited in the 2006 q-triplet paper); a nicer version is available.

## An alternate approach to generalized entropy

Generalized generalized entropy (Stefan Thurner and Rudolf Hanel) and a reanalysis of the Chandler data for China.

## Back to the drawing board

I am thinking that we may want to use our two R routines (Cosma's and Kejžar's), organized as iterative programs that loop through all the data, to do comparisons:

1. Cosma MLE
2. Cosma MLE with small-sample correction (adjusted to FLATTEN the estimates by sample size, i.e., DETREND)
3. Use upper cutoff maximizing the K-S test
4. Nataša (Borges) with crude default
5. Nataša (Borges) convergent only
6. Use upper cutoff maximizing the K-S test

Note that her routines run through data like these (Patterns.R):

• "v0.5110_5000" <-

c(2617.4, 1026.9, 526.2, 288, 173.6, 111.4, 72.3, 48.1, 33.7, 24, 17.7, 12.9, 10.6, 7.3, 6.3, 4.7, 3, 2.4, 2.5, 2.1, 1.7, 1.1, 0.8, 0.7, 0.7, 0.5, 0.7, 0.6, 0, 0.4, 0.4, 0.2, 0.1, 0, 0, 0.4, 0.1, 0.1, 0.3, 0, 0, 0, 0, 0, 0.1)

• "v0.5111_5000" <-

c(2527.1, 1033.5, 515.7, 295.6, 172.4, 114, 80.7, 57.9, 39.7, 30, 23, 15.7, 15.2, 10.7, 9.7, 6.9, 6.5, 5.6, 3.6, 3.5, 2.8, 2, 2.2, 2.2, 2, 1.1, 1.6, 0.8, 1.2, 0.8, 0.6, 0.5, 0.8, 0.5, 0.7, 0.7, 0.5, 0.8, 0.2, 0.7, 0.8, 0, 0.1, 0.6, 0.2, 0.1, 0.1, 0.3, 0.3, 0.1, 0.2, 0.1, 0.5, 0, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1, 0.3, 0, 0.2, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.2, 0.2, 0.2, 0.1, 0, 0.1, 0.1, 0, 0.1, 0, 0, 0, 0.1, 0, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0, 0.2, 0.2, 0.1, 0, 0.1, 0, 0, 0, 0.1, 0, 0, 0.1, 0, 0, 0.3, 0, 0, 0, 0, 0, 0, 0, 0, 0.1, 0.1, 0, 0, 0, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.1, 0, 0, 0, 0, 0, 0, 0, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.1)

• "v0.5120_5000" <-

c(2487.7, 1022.5, 542.7, 312.1, 195.6, 122.8, 83.7, 56.4, 38.8, 29.2, 22.2, 16.9, 14.3, 9, 9.7, 5.4, 4.8, 3.7, 3.7, 2.6, 1.9, 1.9, 2.3, 1.3, 1.1, 0.9, 0.3, 0.8, 0.5, 0.7, 0.8, 0.4, 0.8, 0.3, 0.7, 0.2, 0.3, 0.1, 0.1, 0.1, 0.1, 0, 0.2, 0.1, 0.2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.1)

• "v0.5121_5000" <-

and return arrays like these (array2.R)

• "t_kappa" <-

structure(c(1.57, 1.77, 2, 2.25, 2.5, 1.7, 1.9, 2.1, 2.42, 2.75, 1.85, 2.05, 2.2, 2.6, 3, 1.91, 2.12, 2.35, 2.8, 3.2, 2.1, 2.3, 2.5, 2.95, 3.4, 0, 0, 0, 0, 0, 1.45, 1.5, 1.8, 2.2, 2.55, 1.6, 1.8, 2, 2.35, 2.8, 1.8, 2.2, 2.3, 2.7, 3, 2.6, 2.6, 2.7, 3.2, 3.1, 9, 6.5, 3.6, 3.3, 3.2, 0, 0, 0, 0, 0, 1.3, 1.5, 1.6, 2.1, 2.6, 1.5, 1.7, 1.8, 2.3, 2.8, 2.4, 2.4, 2.6, 2.8, 3, 5.4, 5, 4.5, 3.8, 3, 14, 9, 8, 6.5, 5, 0, 0, 0, 0, 0), .Dim = as.integer(c(5, 6, 3)))

• "t_q" <-

structure(c(1.07, 1.11, 1.16, 1.38, 1.6, 1.09, 1.13, 1.2, 1.4, 1.6, 1.11, 1.16, 1.25, 1.47, 1.7, 1.17, 1.22, 1.31, 1.5, 1.85, 1.23, 1.29, 1.38, 1.7, 2, 0, 0, 0, 0, 0, 1.11, 1.18, 1.23, 1.45, 1.75, 1.15, 1.22, 1.25, 1.5, 1.85, 1.21, 1.26, 1.36, 1.6, 2, 1.25, 1.36, 1.45, 1.85, 2.45, 1.23, 1.35, 1.6, 2.2, 3, 0, 0, 0, 0, 0, 1.16, 1.23, 1.3, 1.5, 1.8, 1.21, 1.26, 1.38, 1.6, 2.1, 1.2, 1.29, 1.38, 1.74, 2.1, 1.15, 1.25, 1.42, 2, 2.9, 1.1, 1.25, 1.48, 2.2, 4, 0, 0, 0, 0, 0), .Dim = as.integer(c(5, 6, 3)))

• "t_delta" <-

structure(c(0, -0.3, -0.6, -1.1, -1.4, 0, -0.3, -0.6, -1.1, -1.4, 0, -0.3, -0.6, -1.1, -1.4, 0, -0.3, -0.6, -1.1, -1.4, 0, -0.3, -0.5, -1.1, -1.4, 0, 0, 0, 0, 0, 0, -0.3, -0.6, -1.1, -1.5, 0, -0.4, -0.6, -1.1, -1.5, 0, -0.4, -0.6, -1.1, -1.5, 0, -0.3, -0.6, -1.1, -1.5, 0, -0.3, -0.6, -1.1, -1.5, 0, 0, 0, 0, 0, 0, -0.3, -0.6, -1.1, -1.5, 0, -0.3, -0.6, -1.1, -1.5, 0, -0.3, -0.6, -1.1, -1.5, 0, -0.3, -0.6, -1.1, -1.5, 0, -0.3, -0.6, -1.05, -1.5, 0, 0, 0, 0, 0), .Dim = as.integer(c(5, 6, 3)))

• "t_p0" <-

structure(c(0.82, 0.8, 0.8, 0.85, 0.9, 0.7, 0.72, 0.75, 0.8, 0.85, 0.63, 0.68, 0.7, 0.75, 0.8, 0.55, 0.6, 0.63, 0.78, 0.75, 0.45, 0.5, 0.55, 0.67, 0.7, 0, 0, 0, 0, 0, 0.86, 0.85, 0.85, 0.86, 0.87, 0.7, 0.72, 0.78, 0.78, 0.82, 0.55, 0.58, 0.6, 0.72, 0.8, 0.33, 0.46, 0.5, 0.65, 0.72, 0.09, 0.35, 0.4, 0.55, 0.65, 0, 0, 0, 0, 0, 0.9, 0.9, 0.9, 0.88, 0.85, 0.7, 0.72, 0.75, 0.77, 0.8, 0.42, 0.48, 0.57, 0.7, 0.8, 0.17, 0.25, 0.36, 0.52, 0.7, 0.06, 0.14, 0.25, 0.4, 0.5, 0, 0, 0, 0, 0), .Dim = as.integer(c(5, 6, 3)))

Although those are for degree distributions (I need to find those for city sizes).

Here is the subroutine for those inputs/outputs (estimation.params.1.R):

```r
# 1. fit the q-exponential curve to the data (get the estimation of parameters)
proba1 <- function(dat, kappa_q = 2, q = 1.2, p0 = 1, delta = 0) {
  x <- 1:length(dat)
  y <- dat[dat > 0] / sum(dat[dat > 0])
  x <- x[dat > 0]
  #
  temp <- nls(y ~ ((x)^delta) * p0 * (1 - (1 - q) * x / kappa_q)^(1 / (1 - q)),
              start = list(p0 = p0, kappa_q = kappa_q, q = q, delta = delta))
  #
  p0 <- summary(temp)$param[1]
  kappa_q <- summary(temp)$param[2]
  q <- summary(temp)$param[3]
  delta <- summary(temp)$param[4]
  return(list(nls_model = temp, p0 = p0, kappa_q = kappa_q, q = q, delta = delta))
}

ln_q <- function(z, q) {
  return((z^(1 - q) - 1) / (1 - q))
}
```
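For readers without R, the same curve fit can be roughly approximated in Python by a brute-force grid search over $(q, \kappa)$, with $\delta$ fixed at 0 and $p_0$ solved in closed form as the optimal linear scale. This is a crude stand-in for nls(), not a replacement; all names are illustrative:

```python
def q_model(x, p0, kappa, q):
    # y = p0 * [1 - (1-q) x / kappa]^(1/(1-q)): the delta = 0 case of proba1's model
    base = 1.0 - (1.0 - q) * x / kappa
    return p0 * base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def fit_q_grid(xs, ys, qs, kappas):
    # brute-force least squares; for fixed (q, kappa) the optimal p0 is
    # the least-squares scale sum(shape*y) / sum(shape^2)
    best = None
    for q in qs:
        for kappa in kappas:
            shape = [q_model(x, 1.0, kappa, q) for x in xs]
            denom = sum(s * s for s in shape)
            if denom == 0.0:
                continue
            p0 = sum(s * y for s, y in zip(shape, ys)) / denom
            sse = sum((y - p0 * s) ** 2 for y, s in zip(ys, shape))
            if best is None or sse < best[0]:
                best = (sse, p0, kappa, q)
    return best  # (sse, p0, kappa, q)
```

Unlike nls(), a grid search cannot diverge, which sidesteps the convergence caveat below at the cost of resolution limited to the grid.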


Here are her notes on those routines:

1. CALCULATE Q FOR NETWORK DEGREE DISTRIBUTION
2. File "estimation_params_1.R" contains a routine to compute the best fit of a q-exponential curve on a dataset. (Caution: only nice curves are well fitted; otherwise a totally dummy solution can be found, or it might not even converge; it also depends on the initial parameter values.)
3. File "neparam_tests.R" contains the procedures that I used when calculating the goodness of fit for the q-exponentials. I used the Kolmogorov-Smirnov nonparametric test.

(My reminder: Nataša, we have found that q-exponential fits require a lower cutoff.)

Here is her K-S test (neparam_tests.R): see file on wiki
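The wiki file is not reproduced here, but the idea of such a test can be sketched: compute the one-sample Kolmogorov-Smirnov distance between the data and the fitted q-exponential CDF. A minimal Python sketch, with the survival function written for $1 < q < 2$ (the names are mine, not Nataša's):

```python
def qexp_sf(x, q, kappa):
    # survival function of a density proportional to e_q(-x/kappa), for 1 < q < 2
    return (1.0 + (q - 1.0) * x / kappa) ** (-(2.0 - q) / (q - 1.0))

def ks_stat(sample, q, kappa):
    # one-sample Kolmogorov-Smirnov distance between data and the fitted model
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - qexp_sf(x, q, kappa)
        d = max(d, abs(cdf - i / n), abs(cdf - (i + 1) / n))
    return d
```

A good fit yields a small statistic; a badly mis-specified $q$ yields a visibly larger one, which is what makes the cutoff scan above workable.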

SO NEEDED HERE ARE HER ROUTINES FOR CITIES -- DRW 10/16/2007