# Nihat Ay

## References

J. Rauh, N. Ay, 2011. Robustness and Conditional Independence Ideals. arXiv

Nihat Ay and David C. Krakauer. 2007. Geometric robustness theory and biological networks. Theory in Biosciences 125 (2): 93-121.

Nihat Ay, Jessica C. Flack, and David C. Krakauer. 2007. Robustness and Complexity Co-constructed in Multimodal Signaling Networks. Philosophical Transactions of the Royal Society. London. 362: 441-447. http://www.jstor.org/stable/20209856

Nihat Ay and Daniel Polani. 2006. Information Flows in Causal Networks. Santa Fe Institute Working Paper.

Wolfgang Löhr and Nihat Ay. 2008. On the Generative Nature of Prediction. Santa Fe Institute Working Paper.

Nihat Ay. 2008. A Refinement of the Common Cause Principle. Santa Fe Institute Working Paper.

Causal Graphs Laboratory paper: http://www.santafe.edu/education/schools/complex-systems-summer-schools/2011-program-info/

Prokopenko, M.; Ay, Nihat; Obst, Olivier; Polani, D. 2010. Phase transitions in least-effort communications. Journal of Statistical Mechanics: Theory and Experiment, November 2010.

## Main References for Causal Graphs

- Lauritzen, Steffen. 1996. Graphical Models.
- Søren Højsgaard, David Edwards, and Steffen Lauritzen. 2012. Graphical Models with R (Use R!).

- Ch.3
- Markovian parents - Unidirectional Graph
- p36 Main theorem
- p46
- p48 3.25 D-separation (note Union...)
- superscript m, as in G^{m}, designates the "moral" graph: parents have to "marry" (see Lauritzen slides below, about 85% of the way through)
- ancestral graph... do all parents (some operation) => D-separation
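The notes above on the moral graph, the ancestral graph, and D-separation can be tied together in a minimal sketch. It assumes the standard moralization criterion: A and B are d-separated by S in a DAG exactly when S separates them in the moral graph of the smallest ancestral set containing A, B, and S. The function names and the `parents` dictionary format are illustrative assumptions, not taken from Lauritzen's text.

```python
from itertools import combinations


def ancestral_set(parents, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    seen = set()
    stack = list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def moral_graph(parents, keep):
    """Undirected moral graph of the DAG restricted to the node set `keep`."""
    adj = {v: set() for v in keep}
    for child in keep:
        pa = [p for p in parents.get(child, ()) if p in keep]
        # Drop directions: connect each parent to its child.
        for p in pa:
            adj[child].add(p)
            adj[p].add(child)
        # "Marry" the parents: connect every pair of parents of a common child.
        for p, q in combinations(pa, 2):
            adj[p].add(q)
            adj[q].add(p)
    return adj


def d_separated(parents, a, b, s):
    """True if `s` separates `a` from `b` in the moralized ancestral graph."""
    a, b, s = set(a), set(b), set(s)
    adj = moral_graph(parents, ancestral_set(parents, a | b | s))
    frontier = list(a - s)
    reached = set(frontier)
    while frontier:  # graph search that is never allowed to enter s
        v = frontier.pop()
        for w in adj[v]:
            if w not in s and w not in reached:
                reached.add(w)
                frontier.append(w)
    return not (reached & b)


# Hypothetical example graphs: a collider x -> z <- y and a chain x -> m -> y.
collider = {"z": ["x", "y"]}
chain = {"m": ["x"], "y": ["m"]}
```

In the collider, x and y are separated by the empty set but not by {z}, because moralization marries x and y once z is in the ancestral set. In the chain the pattern is reversed.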

- p51 Interaction theory = D-sep in Markov graph
- ...=> Global => Local => Pairwise --- related to Factorization F

- product Str p35 Density
- Phi the interaction => physics literature
- interaction graphs => Interaction potential

Cowell, Robert G., Philip Dawid, Steffen L. Lauritzen, and D. J. Spiegelhalter. 1999. Probabilistic Networks and Expert Systems: Exact Computational Methods for Bayesian Networks (Information Science and Statistics). Berlin: Springer-Verlag.

- index: see interaction potential p.28 3.1 p.86-89, 6.2 - 6.2.1 6.333 Pi used by physicists (product) refers to interaction
- for the energy thing: see books on graphical models. E.g., Winkler (p61) is one of many.

- Lauritzen, Steffen L. and Spiegelhalter, D. J. (1988). Local computation with probabilities on graphical structures and their application to expert systems (with discussion). J. R. Statist. Soc. B, 50, 157-224. Pearl. Berlin: Springer-Verlag.

## Supplementary

Gerhard Winkler. 1995. Image Analysis, Random Fields, and Dynamic Monte Carlo Methods. Berlin: Springer-Verlag.

- p61 Corollary 3.3.2 Gibbs field potential - physical language - he makes explicit the physics (physical) connection
- normalized vacuum potential
- Size = the interaction potential
- Part 2, p62: last paragraph. Gibbs fields and Dynamic Monte Carlo Methods. Gibbs Sampler chapter. Ising model

Jordan, Michael, ed. 2001. Learning in Graphical Models. Cambridge, MA: MIT Press.

- his chapter, et al.: "An Introduction to variational methods for Graphical Models."
- Summary

Jordan, Michael (UC Berkeley) - has a new book coming out.

## History and overview

Lauritzen, Steffen. 2011. Graphical Models. PPT in PDF. Graduate Lectures, Oxford.

- slides, two-thirds of the way through:
- A particularly successful development is associated with BUGS (Gilks et al., 1994) (WinBUGS, OpenBUGS).
- It enables a Bayesian analyst to focus on substantive modelling, whereas the technical model specification and computational side is taken care of automatically,
- exploiting modularity, factorization, and MCMC methodology, including the Gibbs and Metropolis–Hastings samplers.
- Conforming with the Bayesian paradigm, parameters and observations are explicitly represented in the model as nodes in the graph, all being observables.

Linear regression (next slide)

    model {
      for (i in 1:N) {
        Y[i] ~ dnorm(mu[i], tau)
        mu[i] <- alpha + beta * (x[i] - xbar)
      }
      tau ~ dgamma(0.001, 0.001)
      sigma <- 1 / sqrt(tau)
      alpha ~ dnorm(0.0, 1.0E-6)
      beta ~ dnorm(0.0, 1.0E-6)
    }

Data and BUGS model for pumps

- The number of failures Xi is assumed to follow a Poisson distribution with parameter θi ti, i = 1, ..., 10,
- where θi is the failure rate for pump i and ti is the length of operation time of the pump (in 1000s of hours). The data are shown below.

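| Pump | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|------|---|---|---|---|---|---|---|---|---|----|
| ti | 94.5 | 15.7 | 62.9 | 126 | 5.24 | 31.4 | 1.05 | 1.05 | 2.01 | 10.5 |
| xi | 5 | 1 | 5 | 14 | 3 | 19 | 1 | 1 | 4 | 22 |

- A gamma prior distribution is adopted for the failure rates: θi ∼ Γ(α, β), i = 1, ..., 10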

BUGS program for pumps

With suitable priors the program becomes

    model {
      for (i in 1:N) {
        theta[i] ~ dgamma(alpha, beta)
        lambda[i] <- theta[i] * t[i]
        x[i] ~ dpois(lambda[i])
      }
      alpha ~ dexp(1)
      beta ~ dgamma(0.1, 1.0)
    }
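To make the Gibbs sampling step behind the pumps model concrete, here is a rough Python sketch. It is not what BUGS does internally. It relies on the gamma-Poisson conjugacy of the model: given alpha and beta, each theta[i] has full conditional Gamma(alpha + x[i], beta + t[i]), and given the thetas, beta has full conditional Gamma(0.1 + N * alpha, 1.0 + sum(theta)), both in shape and rate form. As a simplifying assumption, alpha is held fixed at a hypothetical value instead of being sampled from its non-conjugate conditional. The function name `gibbs_pumps` is illustrative.

```python
import random

# Pump data as given in the table above.
t = [94.5, 15.7, 62.9, 126, 5.24, 31.4, 1.05, 1.05, 2.01, 10.5]
x = [5, 1, 5, 14, 3, 19, 1, 1, 4, 22]


def gibbs_pumps(t, x, alpha=1.0, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for theta and beta with alpha held fixed (an assumption).

    Returns the posterior mean failure rate of each pump, averaged over the
    iterations that follow burn-in.
    """
    rng = random.Random(seed)
    n = len(x)
    beta = 1.0  # arbitrary starting value
    sums = [0.0] * n
    kept = 0
    for it in range(n_iter):
        # theta[i] | alpha, beta, x ~ Gamma(shape = alpha + x[i], rate = beta + t[i]).
        # random.gammavariate takes (shape, scale), so pass scale = 1 / rate.
        theta = [rng.gammavariate(alpha + x[i], 1.0 / (beta + t[i]))
                 for i in range(n)]
        # beta | alpha, theta ~ Gamma(shape = 0.1 + n * alpha, rate = 1.0 + sum(theta)).
        beta = rng.gammavariate(0.1 + n * alpha, 1.0 / (1.0 + sum(theta)))
        if it >= burn_in:
            kept += 1
            for i in range(n):
                sums[i] += theta[i]
    return [s / kept for s in sums]


posterior_means = gibbs_pumps(t, x)
```

Each entry of `posterior_means` is pulled from the raw rate x[i] / t[i] toward the common prior, which is the hierarchical shrinkage the model is meant to express.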

Moral graph example

- The DAG is used for modular specification of the model, and the moral graph for local computation.
- This is a huge conceptual extension of the so-called Bayesian hierarchical model:
- the distinctions prior/likelihood and parameter/random variable are less well defined.
- If founder nodes in the network are considered fixed and unknown, there is no reason not to consider models in the Fisherian paradigm.

## SEM

In Pearl 2009 p27 1.4.1 Structural Equations

- 1.40 Noise: pa stands for *parents* that directly determine the value of X, and U represents errors ("disturbances") due to omitted factors
- 1.41 Linear --> Fit data assuming interaction. p28 price, demand, income, wages; p29 do operator; Predictions, Interventions, Counterfactuals as types of queries
- p1 use of probabilities, p2 eliminating paradoxes

- Energy: Gibbs, W. (1902). Elementary Principles of Statistical Mechanics. New Haven, Connecticut: Yale University Press. (ref in Lauritzen 2011)

- Bartlett 1935
- Darroch 1980. "Papers setting the scene include Darroch et al. (1980), Wermuth and Lauritzen (1983), and Lauritzen and Wermuth (1989)." (ref in Lauritzen 2011)
- Darroch, J. N., Steffen L. Lauritzen, and T. P. Speed (1980). Markov fields and log-linear interaction models for contingency tables. The Annals of Statistics 8, 522–539.

- Wright 1921, 1923: extended from the Gaussian case by Pearl 2009:27-38, "Structural equations" and functional causal models.
- Wold, Herman O. A. 1954. Causality and Econometrics. Econometrica 22: 162-177.
- Nihat says that the SEM is deterministic, since the functional form $x_i = f(pa_i, u_i)$ in SEM (and modified in Hal White's papers) can be integrated and turned into a probabilistic form, similar to regression: $x_i = \sum_{k \neq i} a_{ik} x_k + u_i$, $i = 1, \ldots, n$.

- as in Lauritzen: Lauritzen doesn't believe the deterministic form is needed, although Pearl provides for it (actually both), as in 2009 p27: 1.40 and 1.41.
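The relation between the deterministic and the probabilistic reading of Pearl's 1.40 and 1.41 can be illustrated with a small simulation. This is only a sketch under stated assumptions: a hypothetical three-variable chain, standard normal disturbances, and made-up coefficients. Each equation is a deterministic function of parents and disturbance, the randomness enters only through the u_i, and the do operator is modelled by replacing an equation with a constant.

```python
import random


def simulate_linear_sem(a, order, n_samples, do=None, seed=0):
    """Sample from x_i = sum_k a[i][k] * x_k + u_i with u_i ~ N(0, 1).

    `a` maps each variable to a dict of parent coefficients, `order` is a
    topological order of the variables, and `do` maps intervened variables
    to the constants they are set to.
    """
    rng = random.Random(seed)
    do = do or {}
    samples = []
    for _ in range(n_samples):
        xs = {}
        for i in order:
            if i in do:
                # Intervention: the structural equation is replaced by a constant.
                xs[i] = do[i]
            else:
                u = rng.gauss(0.0, 1.0)  # disturbance from omitted factors
                xs[i] = sum(coef * xs[k] for k, coef in a.get(i, {}).items()) + u
        samples.append(xs)
    return samples


# Hypothetical chain x1 -> x2 -> x3 with illustrative coefficients.
coefficients = {"x2": {"x1": 2.0}, "x3": {"x2": -1.0}}
order = ["x1", "x2", "x3"]

observed = simulate_linear_sem(coefficients, order, 5000)
intervened = simulate_linear_sem(coefficients, order, 5000, do={"x2": 1.0})
mean_x3_do = sum(s["x3"] for s in intervened) / len(intervened)
```

Under the intervention do(x2 = 1.0), x3 no longer depends on x1, and its sample mean should lie close to -1.0, the coefficient times the value x2 was set to.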

## Ancestries

Title: **Information-Theoretic Inference of Common Ancestors**
Authors: Steudel, B; Ay, N
Author Full Names: Steudel, Bastian; Ay, Nihat
Source: ENTROPY, 17 (4):2304-2327; 10.3390/e17042304 APR 2015
Language: English
SFI Taxonomy: Information Theory, Machine Learning and Statistics
Abstract: A directed acyclic graph (DAG) partially represents the conditional independence structure among observations of a system if the local Markov condition holds, that is if every variable is independent of its non-descendants given its parents. In general, there is a whole class of DAGs that represents a given set of conditional independence relations. We are interested in properties of this class that can be derived from observations of a subsystem only. To this end, we prove an information-theoretic inequality that allows for the inference of common ancestors of observed parts in any DAG representing some unknown larger system. More explicitly, we show that a large amount of dependence in terms of mutual information among the observations **implies the existence of a common ancestor that distributes this information. Within the causal interpretation of DAGs, our result can be seen as a quantitative extension of Reichenbach's principle of common cause to more than two variables. Our conclusions are valid also for non-probabilistic observations, such as binary strings, since we state the proof for an axiomatized notion of "mutual information" that includes the stochastic as well as the algorithmic version.**
ISSN: 1099-4300

Title: **Information Geometry on Complexity and Stochastic Interaction**
Authors: Ay, N
Author Full Names: Ay, Nihat
Source: ENTROPY, 17 (4):2432-2458; 10.3390/e17042432 APR 2015
Language: English
SFI Taxonomy: Information Theory, Machine Learning and Statistics
Abstract: Interdependencies of stochastically interacting units are usually quantified by the Kullback-Leibler divergence of a stationary joint probability distribution on the set of all configurations from the corresponding factorized distribution. This is a spatial approach which does not describe the intrinsically temporal aspects of interaction. In the present paper, the setting is extended to a dynamical version where temporal interdependencies are also captured by using information geometry of Markov chain manifolds.
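The "spatial" measure mentioned in the abstract above, the Kullback-Leibler divergence of a joint distribution from the product of its marginals, can be computed directly for small discrete systems. The sketch below covers only that stationary quantity, not the dynamical Markov chain extension developed in the paper. The function name and the dictionary representation of the joint distribution are assumptions made for illustration.

```python
from math import log


def multi_information(joint):
    """KL divergence (in nats) of `joint` from the product of its marginals.

    `joint` maps configuration tuples, e.g. (0, 1), to probabilities.
    """
    n = len(next(iter(joint)))
    # Marginal distribution of each unit.
    marginals = [dict() for _ in range(n)]
    for config, p in joint.items():
        for i, v in enumerate(config):
            marginals[i][v] = marginals[i].get(v, 0.0) + p
    total = 0.0
    for config, p in joint.items():
        if p > 0:
            q = 1.0  # probability of config under the factorized distribution
            for i, v in enumerate(config):
                q *= marginals[i][v]
            total += p * log(p / q)
    return total


# Two perfectly correlated binary units versus two independent ones.
correlated = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
```

For the correlated pair the divergence equals log 2, and for the independent pair it is zero, matching the reading of this quantity as the amount of interdependence among the units.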

Alex Lancaster notified us of this new paper:

Title: Science and technology consortia in U.S. biomedical research: a paradigm shift in response to unsustainable academic growth.
Authors: Balch C, Arias-Pulido H, Banerjee S, Lancaster AK, Clark KB, Perilstein M, Hawkins B, Rhodes J, Sliz P, Wilkins J, Chittenden TW.
Source: Bioessays. 2015 Feb;37(2):119-22. doi: 10.1002/bies.201400167. Epub 2014 Nov 12.
SFI Taxonomy: Medicine (Theoretical)
Abstract: Science and technology consortia provide a viable solution for the recent unsustainable academic growth in biomedical research.