Cosma Shalizi

From InterSciWiki

Cosma Rohilla Shalizi is an associate professor in the Department of Statistics at Carnegie Mellon University in Pittsburgh. Shalizi is co-author of the CSSR algorithm, which exploits entropy properties to efficiently extract Markov models from time-series data without assuming a parametric form for the model.


Blind Construction of Optimal Nonlinear Recursive Predictors for Discrete Sequences.


notebooks: Cosma Shalizi CMU, Statistics, Google Scholar Bio, CV, Home

V. Fowler and Christakis

Fortunately, one consequence of this recent outbreak of drama is a very long and thoughtful message from Tom Snijders to the SOCNET mailing list. Since there is a public archive, I do not think it is out of line to quote parts of it, though I would recommend anyone interested in the subject to (as the saying goes) read the whole thing:
What struck me most in the paper by Lyons ... are the following two points. The argument for social influence proposed by Christakis and Fowler (C&F) that earlier I used to find most impressive, i.e., the greater effect of incoming than of outgoing ties, was countered: the difference is not significant and there are other interpretations of such a difference, if it exists; and the model used for analysis is itself not coherent. This implies that C&F's claims of having found evidence for social influence on several outcome variables, which they already had toned down to some extent after earlier criticism, have to be still further attenuated. However, they do deserve a lot of credit for having put this topic on the agenda in an imaginative and innovative way. Science advances through trial and error and through discussion. Bravo for the imagination and braveness of Nick Christakis and James Fowler.
...Our everyday experience is that social influence is a strong and basic aspect of our social life. Economists have found it necessary to find proof of this through experimental means, arguing (Manski) that other proofs are impossible. Sociologists tend to take its existence for granted and are inclined to study the "how" rather than the "whether". The arguments for the confoundedness of influence and homophilous selection of social influence (Shalizi & Thomas Section 2.1) seem irrefutable. Studying social influence experimentally, so that homophily can be ruled out by design, therefore is very important and Sinan Aral has listed in his message a couple of great contributions made by him and others in this domain. However, I believe that we should not restrict ourselves here to experiments. Humans (but I do not wish to exclude animals or corporate actors) are purposive, wish to influence and to be influenced, and much of what we do is related to achieve positions in networks that enable us to influence and to be influenced in ways that seem desirable to us. Selecting our ties to others, changing our behaviour, and attempting to have an influence on what others do, all are inseparable parts of our daily life, and also of our attempts to be who we wish to be. This cannot be studied by experimental assignment of ties or of exchanges alone: such a restriction would amount to throwing away the child (purposeful selection of ties) with the bathwater (strict requirements of causal inference).
The logical consequence of this is that we are stuck with imperfect methods. Lyons argues as though only perfect methods are acceptable, and while applauding such lofty ideals I still believe that we should accept imperfection, in life as in science. Progress is made by discussion and improvement of imperfections, not by their eradication.
A weakness and limitation of the methods used by C&F for analysing social influence in the Framingham data was that, to say it briefly, these were methods and not generative models. Their methods had the aim to be sensitive to outcomes that would be unlikely if there were no influence at all (a sensitivity refuted by Lyons), but they did not propose credible models expressing the operation of influence and that could be used, e.g., to simulate influence processes. The telltale sign that their methods did not use generative models is that in their models for analysis the egos are independent, after conditioning on current and lagged covariates; whereas the definition of social influence is that individuals are not independent....
Snijders goes on, very properly, to talk about the models he and his collaborators have been developing for quite a few years now (e.g.), which can separate influence from homophily under certain assumptions, and to aptly cite Fisher's dictum that the way to get causal conclusions from observational studies is to "Make your theories elaborate" --- not give up. Lyons's counsels of perfection and despair are "words of a knight riding in shining armour high above the fray, not of somebody who honours the muddy boots of the practical researcher". (Again, if this sounds interesting, read the full message.) I agree with pretty much everything Snijders says, but feel like adding a few extra points.
It is of course legitimate to make modeling assumptions, but one then needs to support those assumptions with considerations other than their convenience to the modeler. I see far too many papers where people say "we assume such and such", get results, and don't try to check whether their assumptions have any basis in reality (or, if not, how far astray that might be taking them). Of course the support for assumptions may be partial or imperfect, might have to derive in some measure from different data sources or even from analogy, etc., through all the usual complications of actual science. But if the assumptions are important enough to make, then it seems to me they are important enough to try to check. (And no, being a Bayesian doesn't get you out of this.)
As we say in our paper, I suspect that much more could be done with the partial-identification or bounds approach Manski advocates. The bounds approach also seems more scientifically satisfying than many sensitivity analyses, which make almost as many restrictive and unchecked assumptions as the original models. Often it seems that this is all that scientists or policy-makers would actually want anyway, and so the fact that we cannot get complete identification would not be so very bad. I wish people smarter than myself would attack this for social influence.
It would be very regrettable if people came away from this thinking that social network studies are somehow especially problematic. On the one hand, as shown in Sec. 3 of our paper, when social influence and homophily are both present, individual-level causal inference which ignores the network is itself confounded, perhaps massively. (I've been worrying about this for a while.) But the combination of social influence and homophily would seem to be the default condition for actual social assemblages, while individual-level studies from (e.g.) survey data have become the default mode of doing social science.
On the other and more positive side, we have it seems to me lots of examples of successfully pursuing scientific, causal knowledge in fields where experimentation is even harder than in sociology, such as astronomy and geology. Perhaps explaining the clustering of behavior in social networks is fundamentally harder than explaining the clustering of earthquakes, but we're even more at the mercy of observation in seismology than sociology.


  • Shalizi, Cosma R., Andrew C. Thomas. 2011. Homophily and contagion are generically confounded in observational social network studies. Sociological Methods & Research 40(2):211-239.
  • Abstract: The authors consider processes on social networks that can potentially involve three factors: homophily, or the formation of social ties due to matching individual traits; social contagion, also known as social influence; and the causal effect of an individual's covariates on his or her behavior or other measurable responses. The authors show that generically, all of these are confounded with each other. Distinguishing them from one another requires strong assumptions on the parametrization of the social process or on the adequacy of the covariates used (or both). In particular the authors demonstrate, with simple examples, that asymmetries in regression coefficients cannot identify causal effects and that very simple models of imitation (a form of social contagion) can produce substantial correlations between an individual’s enduring traits and his or her choices, even when there is no intrinsic affinity between them. The authors also suggest some possible constructive responses to these results.
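The confounding described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' own model: the function names, the Gaussian latent trait, the tie-acceptance rule and the noise level are all assumptions chosen for illustration. Ties form by homophily on an unobserved trait, behavior depends only on that trait (there is no contagion anywhere in the code), and yet a lagged regression of ego's behavior on alter's earlier behavior returns a clearly positive coefficient.

```python
import math
import random


def simulate_ties(n_nodes, n_ties, homophily_width, rng):
    """Return (trait, ties). Hypothetical tie rule: a random pair is accepted
    with probability exp(-(x_i - x_j)^2 / (2 w^2)); w=None means no homophily."""
    trait = [rng.gauss(0.0, 1.0) for _ in range(n_nodes)]
    ties = []
    while len(ties) < n_ties:
        i, j = rng.randrange(n_nodes), rng.randrange(n_nodes)
        if i == j:
            continue
        if homophily_width is None:
            accept = 1.0
        else:
            gap = trait[i] - trait[j]
            accept = math.exp(-(gap * gap) / (2.0 * homophily_width ** 2))
        if rng.random() < accept:
            ties.append((i, j))
    return trait, ties


def neighbor_coefficient(trait, ties, noise_sd, rng):
    """OLS coefficient on alter's lagged behavior in
    y_i(t) ~ y_i(t-1) + y_j(t-1), with behavior = latent trait + noise."""
    # Behavior at both times depends only on ego's own latent trait: no contagion.
    y_prev = [x + rng.gauss(0.0, noise_sd) for x in trait]
    y_now = [x + rng.gauss(0.0, noise_sd) for x in trait]
    rows = [(y_prev[i], y_prev[j], y_now[i]) for i, j in ties]
    m = len(rows)
    means = [sum(r[k] for r in rows) / m for k in range(3)]
    s11 = s22 = s12 = s1y = s2y = 0.0
    for a, b, y in rows:
        a -= means[0]
        b -= means[1]
        y -= means[2]
        s11 += a * a
        s22 += b * b
        s12 += a * b
        s1y += a * y
        s2y += b * y
    # Solve the 2x2 normal equations for the second coefficient.
    det = s11 * s22 - s12 * s12
    return (s11 * s2y - s12 * s1y) / det


if __name__ == "__main__":
    rng = random.Random(1)
    trait, ties = simulate_ties(5000, 20000, 0.5, rng)
    print("with homophily:   ", neighbor_coefficient(trait, ties, 1.0, rng))
    trait, ties = simulate_ties(5000, 20000, None, rng)
    print("without homophily:", neighbor_coefficient(trait, ties, 1.0, rng))
```

Under these assumptions the "influence" coefficient is well above zero in the homophilous network and close to zero when ties form at random, even though the data-generating process is identical apart from tie formation. Controlling for ego's own lagged behavior does not help, because that behavior is only a noisy proxy for the latent trait.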
Partial identification
review of Manski
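As a reminder of what the bounds approach looks like in its simplest setting, here is a minimal sketch of worst-case bounds on a proportion when some binary outcomes are unobserved. It is not an analysis of social influence, and the function name and the toy data are assumptions made for illustration. Without any assumption about why outcomes are missing, the data only say that the true proportion lies between the value obtained if every missing outcome were 0 and the value obtained if every missing outcome were 1.

```python
def worst_case_bounds(outcomes):
    """Worst-case bounds on P(Y = 1) for binary outcomes, where None marks
    a missing outcome and no assumption is made about the missingness."""
    n = len(outcomes)
    observed = [y for y in outcomes if y is not None]
    n_missing = n - len(observed)
    successes = sum(observed)
    lower = successes / n                # every missing outcome equals 0
    upper = (successes + n_missing) / n  # every missing outcome equals 1
    return lower, upper


if __name__ == "__main__":
    # Toy data: 7 observed outcomes (5 ones, 2 zeros) and 3 missing.
    data = [1, 1, 0, 1, None, None, 0, 1, None, 1]
    print(worst_case_bounds(data))
```

The width of the interval equals the fraction of missing outcomes, so the bounds tighten only as more is observed or as further assumptions are added and defended.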


Cosma gave a talk at our 2008 event, the May 16th complexity videoconference, on "Statistical methods for complex systems." Abstract: A summary of the tools people should use to study complex systems, covering statistical learning and data-mining, time series analysis, cellular automata, agent-based models, evaluation techniques and simulation, information theory and complexity measures.

1:30-3:00. UCI: 3030 Anteater I&R Bldg; UCLA: 285 Powell Library; UCSD: 260 Galbraith Hall

See: "Methods and Techniques of Complex Systems Science: An Overview", chapter 1 (pp. 33-114) in Thomas S. Deisboeck and J. Yasha Kresh (eds.), Complex Systems Science in Biomedicine (NY: Springer, 2006).

Optimal Nonlinear Prediction of Random Fields on Networks. Published 2003, Discrete Mathematics and Theoretical Computer Science, pp. 11-31.

Under our Tools and Methods for our Probability distributions we have a link to his Maximum Likelihood Estimation for q-Exponential (Tsallis) Distributions as it applies, for example, to city size distributions.

Notebook on Feedback Networks

Under Cosma's Social Networks notebook is a useful comment on

  • Douglas R. White, Natasa Kejzar, Constantino Tsallis, Doyne Farmer and Scott White, "A generative model for feedback networks", cond-mat/0508028 = Physical Review E 73 (2006): 016119 [Cosma: I find the growth model here very interesting, because it breaks with the now-usual "preferential attachment" mechanism. I think this model would repay very careful attention, both dynamically (could one map this onto preferential attachment in some meaningful way?) and statistically (what is the limiting degree distribution, and how does it vary with the growth parameters?).]

For our Network_tools he has provided the basis for a maximum likelihood estimation (MLE) procedure for Degree distributions that Mark Handcock may be able to program within the dnet package for R.

Cosma has a notebook entry for complexity and an interesting if not yet productive dialog with Tsallis. Tsallis replied to Buchanan's New Scientist article, August 2005, reviewing q-entropy, but the editors eliminated 90% of the original letter (see below), here reprinted: Original 2005 Letter to the Editor of The New Scientist about the Buchanan review. The Letter to the Editor of The New Scientist magazine about the Buchanan review, as shortened for issue 2518, 24 September 2005, page 25.

Scott White's Review of Buchanan

The literature on nonextensive physics (see below)

Tsallis Response

29 July 2007 (PDT)

Thanks for all this information. A few remarks:

1) I certainly enjoyed reading Scott's fair viewpoint!

2) Letters to the Editor of New Scientist cannot be more than about 200 words (I was not aware of that when I wrote the full version). This is why only the short version came out.

3) The most convenient link to the literature of nonextensive statistics is the regularly updated bibliography (it is exactly the same one that you and Scott quote, except that this one is always regularly updated)

4) One of the most impressive experimental verifications of the predictions of q-statistics concerns cold atoms in dissipative optical lattices. Eric Lutz made an analytical prediction in 2003, and it was verified in 2006 by a London team: see PRL here attached.

5) I conjectured in 1999 (Brazilian Journal of Physics 29, 1; see Figure 4) (i) that a longstanding quasi-stationary state (QSS) was expected in LONG-range interacting Hamiltonian systems (one of the core problems of statistical mechanics), and (ii) that this QSS should be described by q-statistics instead of Boltzmann-Gibbs statistics. Point (i) was quickly verified by many groups around the world.

But I had to wait (hearing of course lots of skeptical remarks by colleagues!) for 9 looooooong years in order to see (ii). It is now done since about one month: see the last figure of the 2007 attachment by Pluchino, Rapisarda and myself. Instead of the celebrated Maxwellian (Gaussian) distribution of velocities (valid for SHORT-range interactions), one sees a q-Gaussian!



drw, see also:

MLE for q

Shalizi, Cosma. 2007 Maximum Likelihood Estimation for q-Exponential (Tsallis) Distributions.

J.-F. Bercher, C. Vignat. 2008. A new look at q-exponential distributions via excess statistics. Physica A 387(22):5422-5432.

Abstract: Q-exponential distributions play an important role in nonextensive statistics. They appear as the canonical distributions, i.e. the maximum generalized q-entropy distributions under mean constraint. Their relevance is also independently justified by their appearance in the theory of superstatistics introduced by Beck and Cohen. In this paper, we provide a third and independent rationale for these distributions. We indicate that q-exponentials are stable by a statistical normalization operation, and that Pickands’ extreme values theorem plays the role of a CLT-like theorem in this context. This suggests that q-exponentials can arise in many contexts if the system at hand or the measurement device introduces some threshold. Moreover we give an asymptotic connection between excess distributions and maximum q-entropy. We also highlight the role of Generalized Pareto Distributions in many applications and present several methods for the practical estimation of q-exponential parameters.
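The maximum-likelihood idea behind these references can be sketched in a few lines. This is an illustrative sketch, not the code from either paper. It assumes the reparametrization in which the q-exponential has survival function P(X > x) = (1 + x/sigma)^(-theta), with theta = -1/(1-q) and sigma = theta*kappa (a Pareto type II form), and the function names, grid range and search settings are hypothetical choices. For fixed sigma the MLE of theta has the closed form n / sum(log(1 + x_i/sigma)), so only a one-dimensional search over sigma is needed.

```python
import math
import random


def sample_q_exponential(n, theta, sigma, rng):
    """Draw n values with survival function P(X > x) = (1 + x/sigma)**(-theta)."""
    # Inverse-transform sampling; 1 - rng.random() lies in (0, 1].
    return [sigma * ((1.0 - rng.random()) ** (-1.0 / theta) - 1.0)
            for _ in range(n)]


def profile_log_likelihood(xs, sigma):
    """Log-likelihood at sigma with theta set to its conditional MLE."""
    n = len(xs)
    s = sum(math.log1p(x / sigma) for x in xs)
    theta = n / s
    loglik = n * math.log(theta) - n * math.log(sigma) - (theta + 1.0) * s
    return loglik, theta


def fit_q_exponential(xs, grid_points=40, refine_steps=30):
    """Return (theta, sigma, q, kappa) maximizing the profile likelihood."""
    mean = sum(xs) / len(xs)
    # Coarse grid over log(sigma): three decades either side of the sample mean.
    log_grid = [math.log(mean) + math.log(10.0) * (-3.0 + 6.0 * k / (grid_points - 1))
                for k in range(grid_points)]
    values = [profile_log_likelihood(xs, math.exp(t))[0] for t in log_grid]
    best = max(range(grid_points), key=values.__getitem__)
    lo = log_grid[max(best - 1, 0)]
    hi = log_grid[min(best + 1, grid_points - 1)]
    # Golden-section search inside the bracket around the best grid point.
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(refine_steps):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        fa = profile_log_likelihood(xs, math.exp(a))[0]
        fb = profile_log_likelihood(xs, math.exp(b))[0]
        if fa < fb:
            lo = a
        else:
            hi = b
    sigma = math.exp((lo + hi) / 2.0)
    theta = profile_log_likelihood(xs, sigma)[1]
    # Map back to the (q, kappa) parametrization assumed in the lead-in.
    return theta, sigma, 1.0 + 1.0 / theta, sigma / theta


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_q_exponential(20000, theta=3.0, sigma=2.0, rng=rng)
    print(fit_q_exponential(data))
```

On synthetic data drawn from the assumed form, the estimates should land near the generating values (theta = 3, sigma = 2, hence q near 4/3). The sketch handles only continuous data with the distribution starting at zero; the discrete case and data above a threshold would need their own likelihoods.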


A q-Exponential Regression Model

See Malacarne at Social-circles_network_feedback_model#References

Abstract: Usually, the studies of distributions of city populations have been reduced to power laws. In such analyses, a common practice is to consider cities with more than one hundred thousand inhabitants. Here, we argue that the distribution of cities for all ranges of populations can be well described by using a q-exponential distribution. This function, which reproduces the Zipf-Mandelbrot law, is related to the generalized nonextensive statistical mechanics and satisfies an anomalous decay equation.
... In a comparative study, the q-exponential and Weibull distributions are employed to investigate frequency ...
  • Patriota, Alexandre G. 2012. A q-exponential regression model.

Bhatt, R., and Kishor Barman. 2012. Global Dynamics of Online Group Conversations.

= google Constantino_Tsallis#Articles_relevant_to_White.2C_Tsallis.2C_et_al

Tutorial in R for Discrete Distributions

Using the new discrete estimator and producing sampling distribution plots

R programming

Programming in R - John M. Chambers