# Software: Kinship simulation

n=source("http://intersci.ss.uci.edu/wiki/pub/PajekR-KinSimGF.R") #open and copy to R console source("http://intersci.ss.uci.edu/wiki/pub/KinSim1.R") #copy this line to R console, repeatedly

- Author(s) Doug White, Tolga Oztan: Generation permutation code

## Contents

## New developments with a new simulator in R

Tolga:> Alright. I am now writing the code with three options given to the user. I'll pull the H and HH vectors from the adjacency matrix as well.

Doug>> Once you finish we'll write Vlado and ask him to write for us a macro that will give you the G and F data directly as well as the x-coords and the generational vector, all sent over into an R file that replace the file for the current toy dataset. Then, with no matrix manipulation on your code a new flat file for the simulated genealogy will be produced. If you want to be fancy since this is compact output you can set a user option for n replications of simulated data.

Then we can use the earlier "marriage census" tools we did several years ago and census the n simulated datasets. We can automate the comparison of the actual genealogy as a census in comparison to the statistical distribution of the simulated datasets.

Fancier still we can set our simulator to excluded certain type of marriage e.g., no FaBrDa etc marriages and others because these cover some of the "proscriptive rules" of the actual genealogy.

And so forth.

Doug to Vlado and Andrej> You will recall http://intersci.ss.uci.edu/wiki/pub/PajekR-KinSimGF.R which one or both of your wrote for us to develop a Kinship simulator in R, i.e., you gave us the matrix n (here 13x13) for a p-graph, the generations as an integer variables (y axis) and the x value of the graph. I wrote out by hand the function G (n=13 vector) that takes males to parents, and our simulators computes F for females to parents.

http://intersci.ss.uci.edu/wiki/index.php/Software:_Kinship_simulation was a graphic for a prototype program that we have now almost finished with our R code: it holds generations constant and takes all couples in each generation, permutes G or F or both to simulate a random reordering of the parents.

WHAT I WOULD LOVE NOW is if you could build a macro for us that computes generations, the x vector for east-west position of nodes, and then computes G and F for each node rather than the matrix n, and outputs into an R script.

You'll see in http://intersci.ss.uci.edu/wiki/pub/PajekR-KinSimGF.R that I added G by hand so we could do R script permutations of the parents of males, then later add the same for F.

Using vectors G and F and not the matrix as in n1 in script you wrote for us earlier, we will be able to do permutations for large pgraph networks.

We're going to use the "test for motifs" option in Pajek to do the marriage-type censuses according to what we did in 2005 with Klaus Hamberger -- but now for both the actual genealogy and simulations according to my 1999 article. Keeping the number of motifs fairly small, we should be able to so multiple simulations and compare their frequency distributions of motifs with the actual genealogy.

## Introduction

This is an R routine that emulates the monte carlo/metropolitan simulation of Douglas R. White. 1999 Controlled Simulation of Marriage Systems. Journal of Artificial Societies and Social Simulation 2(3). As in the graph, nodes are marriages, red dotted line run from wife and/or daughter to parents, black solid lines from husband and/or son to parents. The goal of the simulation is to (1) randomize marriages in each generation (2) create a distribution of randomized marriages (3) compare the distributions of actual marriages to those of these randomized distributions as a test of the null hypothesis. A further possibility is to (4) add a probabilistic preferential bias for a certain type of marriage to the simulation, (5) generate a distribution ("bootstrap") from the preferential bias parameter (6) test the goodness of fit of the observed data to the bootstrap distribution. The TIPP Kinship and computing project in anthropology and history, with software developed by Douglas R. White and Klaus Hamberger further supported by the French Agence National de la recherche (ANR), PIs Michael Houseman and Cyril Grange, provides the software for computing the rates and frequencies of different kinds of marriage from small to extremely large genealogical datasets.

- 2004 Ring Cohesion Theory in Marriage and Social Networks. Mathématiques et sciences humaines 43(168):5-28 http://www.ehess.fr/revue-msh/recherche.php?numero=168 http://eclectic.ss.uci.edu/download/MarriageNetTools.htm
- 2004 Matrimonial Ring Structures (Klaus Hamberger, Michael Houseman, Isabelle Daillant, Douglas R. White and Laurent Barry). Mathématiques et sciences humaines 43(168):83-121. TIPP Kinship and computing replaces: http://eclectic.ss.uci.edu/download/MarriageNetTools.htm
- InterSci wiki R Paper Examples: Using R to replicate a published study#Simulation

## Export Pajek to R

Given a P-graph-format kinship network in Pajek, the /Network/Partition/Depth/Generation option computes a partition of generations. The /Network/Partition/Depth/Genealogical option also computes a partition of generations but for large networks often gives two extra generations beyond the minimum number required. Generations can also be computed from birth or marriage dates and grouped into decades or other temporal intervals. The simulation requires some basis for assuming that it includes those who could have made other marriage choices within this temporal frame.

Both the network and the generation partition are exported to R. Read the *.paj into RECENT VERSION OF Pajek (N=13 nodes), /Partion/to Vector and export into R by Tools/R/Send to R/All Networks and Vectors - v1 is the generation vector, and adjacency matrix n1 and net are the square value matrices.

To replicate the 2D visualization of the pajek screen, use Net/Vector/Get Coordinate for the x and y coordinates.

## R

Each married person has a one or more marriage links that are top-wired to the same parental node, and that connects to a spouse below.

The R routine allows three options

- Rewire all women's marriages
- Rewire all men's marriages
- Rewire woman's and men's marriages

Egos in the category selected retain their parental link and are disconnected from their spouse links, then randomly rewired to a detached spouse link in the generation of ego's marriage.

- iterating through parental generations 1-N
- For each ego in the category selected (male/female/either), the parent generation, parent node, child index, and marriage node is saved in an array.
- For this parent generation, each child of each parent node is rewired to a randomly selected empty marriage node in the saved array. That node is filled before the next rewiring.

End

## A test file for pajek

As R: http://intersci.ss.uci.edu/wiki/pub/PajekR-testSim.R

KinSim.paj:

*Network KinSim *Vertices 13 1 2 3 4 5 6 7 8 9 10 11 12 13 *Arcs 13 5 1 c Black p Solid 1 5 1 c Black p Solid 1 6 2 c Red p Dots 2 6 1 c Black p Solid 2 7 2 c Red p Dots 3 7 1 c Black p Solid 3 8 2 c Red p Dots 4 8 1 c Black p Solid 4 5 2 c Red p Dots 5 9 1 c Black p Solid 5 10 2 c Red p Dots 6 10 1 c Black p Solid 6 11 2 c Red p Dots 7 11 1 c Black p Solid 7 12 2 c Red p Dots 8 12 1 c Black p Solid 8 9 2 c Red p Dots *Partition (partition into generations) 3 1 1 1 1 2 2 2 2 3 3 3 3

## Load Pajek data into R

RIGHT CLICK and unzip for KinSim3gen.paj RIGHT CLICK and unzip to get all Pajek files

----------------------------------------------------------------------- The following networks/matrices read: n1 : KinSim3gen (13)

The following vectors read:: v1 : From partition 1 (13) v2 : x-coordinate of N1 (13) v3 : z-coordinate of N1 (13)

Use objects() to get list of available objects Use comment(?) to get information about selected object Use savevector(v?,'???.vec') to save vector to Pajek input file Use savematrix(n?,'???.net') to save matrix to Pajek input file (.MAT) savematrix(n?,'???.net',2) to request a 2-mode matrix (.MAT) Use savenetwork(n?,'???.net') to save matrix to Pajek input file (.NET) savenetwork(n?,'???.net',2) to request a 2-mode network (.NET) Use v?<-loadvector('???.vec') to load vector(s) from Pajek input file Use n?<-loadmatrix('???.mat') to load matrix from Pajek input file -----------------------------------------------------------------------

savenetwork, savevector, savematrix are programs... to pass networks back to Pajek

Read the *.paj into RECENT VERSION OF Pajek (N=13 nodes), /Partion/to Vector and export into R by Tools/R/Send to R/All Networks and Vectors - v1 is the generation vector, and adjacency matrix n1 and net are the square value matrices. See Export Pajek to R for v2, v3 coordinates.

The commands below are in SNA, by Carter Butts.

Carter Butts SNA ManualSyntax of multiple embedded for loops | Saved:Looped version | Turkish Nomads

## Copy and run

#Step 1 - copy and paste R code from the following to your R console http://intersci.ss.uci.edu/wiki/pub/PajekR-KinSimGF.R # testSim.R #PajekR.r from Pajek MUST have v2=the x dim vector #A. CUT AND PASTE DATA FROM THE NEXT LINE library(sna) net=source("http://intersci.ss.uci.edu/wiki/pub/PajekR-KinSimGF.R") #C:/Program Files/pajek/PAJEK//PajekR-testSim.R") #PajekR.r gplot(n1, coord = cbind(4*v2,v1),gmode = "digraph", vertex.col=v1, edge.col=c(1,2), vertex.cex=2, label = c(1:dim(n2)[2])) #mode = coord source("http://intersci.ss.uci.edu/wiki/pub/KinSim1.R") dim(n2) #B. CUT AND PASTE CODE n2=n1 gplot(n1, coord = cbind(4*v2,v1),gmode = "digraph", vertex.col=v1, edge.col=c(1,2), vertex.cex=2, label = c(1:dim(n2)[2])) #mode = coord #skip these #gplot(n1, coord = cbind(v1,1-v1),gmode = "digraph", vertex.col=v1, vertex.cex=2) #gplot(n1, mode = "fruchtermanreingold",gmode = "digraph", vertex.col=v1) #mode = #gplot(n1, mode = "mds",gmode = "digraph", vertex.col=v1) #mode = "fruchtermanreingold" #gplot3d(n1, mode = "mds",gmode = "digraph", vertex.col=v1) #mode = "fruchtermanreingold" #NOW DO SIMULATION is.matrix(n1) is.matrix(n2) dim(n2) lensqrt=length(n1)^.5 library(gtools) m=max(v1) #WITH REPETITION START HERE n2=n1 m3=n2 for (igen in 1:2) {#0 m3 for each generation WORKS FOR 1,2 BUT NOT BOTH AND ARROWS IN 2:2 (upper are backwards; lower 1 generation back to original) n2 <- m3 numnodes=0 #create permutation F1=0 #female row i save FOR THIS GENERATION for(i in 1:lensqrt) {#1 i = CHILD for(j in 1:lensqrt) {#2 j = MOTHER if (v1[i]==igen & n2[i,j]==2) {#3 numnodes=numnodes+1 F1[numnodes] <- i }#3 put CHILD row i in the list to permute PARENT }#2 }#1 numnodes=0 #save overlaps to avoid in WiPar and HuPar lists in a0 a0=0 #child row i save FOR THIS GENERATION for(i in 1:lensqrt) {#1 i = CHILD if (v1[i]==igen) {#2 for(j in 1:lensqrt) # j = WiPar if (n2[i,j]==2) {numnodes=numnodes+1 #3 for(k in 1:lensqrt) # k = HuPar if (n2[i,k]==1) a0[numnodes] <- k }#3 put CHILD row i in the list to permute PARENT }#2 }#1 a0 gplot(n1, coord = cbind(4*v2,v1),gmode = "digraph", vertex.col=v1, edge.col=c(1,2), vertex.cex=2, label = c(1:dim(n2)[2])) #mode = coord

#ADD NEW PARENTAL LINKS numnodes=0 FJ=0 #female col j save FOR THIS GENERATION for(i in 1:lensqrt) {#1 if (v1[i]==igen) {#2 for(j in 1:lensqrt) if (n2[i,j]==2) {#3 numnodes=numnodes+1 FJ[numnodes] <- j }#3 #FJ[j] <- numnodes }#3 #WRONG DIRECTION?? #3 put PARENT col j in list with same order as CHILD i }#2 }#1 gplot(m2, coord = cbind(4*v2,v1),gmode = "digraph", vertex.col=v1, edge.col=c(1,2), vertex.cex=2, label = c(1:dim(n2)[2])) #mode = coord numnodes #PERMUTATION FINISHED HERE #?("if") ?("for") help - not much use m2=n2 num=length(F1) f1=0 f2=0 fj=0 f1 <- F1[1:num] #to permute i list fj <- FJ[1:num] #matching j list f1 #rows to permute

f2 <- permute(f1) #permute to permute list f2 #these are the permuted-i locations #Permutation for Sibling marriage HuPar=WiPar avoided nSib=0 for(i in 1:num) if (a0[i]==f2[i]) nSib=nSib+1 #( #Permutation with avoidance #f2 <- permute(f1) ########permute to permute list #} #Permutation with avoidance #for(k in 1:Len1) { #F3=0 #F3=f2 #f2 <- permute(f1) ########permute to permute list

#permutation created for columns numnodes #CREATE NEW NETWORK #delete the old ties of type 2 for (i in 1:lensqrt) {#1 if (v1[i]==igen) {#2 for (j in 1:lensqrt) if (n2[i,j]==2) m2[i,j] <- 0 }#2 }#1 #create permuted tie as type 2 #3 for (i in 1:num) m2[f2[i],fj[i]] <- 2 #3 #OT THE RIGHT ROW AND COLUMN? #for (i in 1:num) m2[fj[i],f2[i]] <- 2 #3 #IS THIS RIGHT FOR ROW AND COLUMN? #http://rss.acs.unt.edu/Rdoc/library/sna/html/gplot.html m3=m2 }#0

gplot(m3, coord = cbind(4*v2,v1),gmode = "digraph", vertex.col=v1, edge.col=c(1,2), vertex.cex=2, label = c(1:dim(n2)[2])) #mode = coord

F1 f1 f2 a0 FJ fj matrix (m3,13,13)

#if successful you can repeat with source("http://intersci.ss.uci.edu/wiki/pub/KinSim1.R")

## Matrix

matrix(n1,13,13)

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [1,] 0 0 0 0 1 2 0 0 0 0 0 0 0 [2,] 0 0 0 0 0 1 2 0 0 0 0 0 0 [3,] 0 0 0 0 0 0 1 2 0 0 0 0 0 [4,] 0 0 0 0 2 0 0 1 0 0 0 0 0 [5,] 0 0 0 0 0 0 0 0 1 2 0 0 0 [6,] 0 0 0 0 0 0 0 0 0 1 2 0 0 [7,] 0 0 0 0 0 0 0 0 0 0 1 2 0 [8,] 0 0 0 0 0 0 0 0 2 0 0 1 0 [9,] 0 0 0 0 0 0 0 0 0 0 0 0 0 10,] 0 0 0 0 0 0 0 0 0 0 0 0 0 11,] 0 0 0 0 0 0 0 0 0 0 0 0 0 12,] 0 0 0 0 0 0 0 0 0 0 0 0 0 13,] 0 0 0 0 1 0 0 0 0 0 0 0 0

## Still errors

#A DEBUG PROBLEM HERE WITH THE DIRECTION OF ARROWS SOMETIMES RECIPROCAL #OCCASIONALLY ONE SPOUSE GETS LOST (Either generation) #IS THE NUMBERING OFF BY ONE?

# gplot(m2, coord = cbind(4*v2,4-v1),gmode = "digraph", vertex.col=v1, vertex.cex=2) #mode = coord

## Objects created in R

n1 1 2 3 4 5 6 7 8 9 10 11 12 1 0 0 0 0 1 2 0 0 0 0 0 0 2 0 0 0 0 0 1 2 0 0 0 0 0 3 0 0 0 0 0 0 1 2 0 0 0 0 4 0 0 0 0 2 0 0 1 0 0 0 0 5 0 0 0 0 0 0 0 0 1 2 0 0 6 0 0 0 0 0 0 0 0 0 1 2 0 7 0 0 0 0 0 0 0 0 0 0 1 2 8 0 0 0 0 0 0 0 0 2 0 0 1 9 0 0 0 0 0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 0 0 0 0 0 0 0 0 v1 [1] 1 1 1 1 2 2 2 2 3 3 3 3 v2 [1] 0.6658 0.4858 0.3058 0.1258 0.7172 0.5372 0.3572 0.1772 0.7674 0.5874 [11] 0.4074 0.2274 0.8229

## Some functions

n2=as.numeric(n1) n2 # [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 #[38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 0 0 0 #[75] 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 #[112] 2 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 #[149] 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 length(n2) #[1] 169

Output n1, i=0,71 has a square matrix in link list, including a zero diagonal n2=as.numeric(n1)

n1[3,7] #is row 3 (child) column 7 (parent) for (i in 1:13) {for (j in 1:13) { { print (i,j, n1[i,1])} } }

### Afterwards

http://vlado.fmf.uni-lj.si/pub/networks/pajek/howto/HowToR.htm

### Compilation drawing on gfortran?

http://r.research.att.com/tools/

## Permutation code in R

## OBSOLETE FROM HERE- testpit - to write the R program

The place to start is with a program created by Pajek when files are written to R, such as:

http://intersci.ss.uci.edu/wiki/pub/PajekR.r #(this one has only 12 nodes)

### Old Fortran code

Although its old Fortran spagetti code, http://intersci.ss.uci.edu/wiki/pub/PGRAPH-KinSim.txt is a reminder that it was highly efficient for large-scale simulations to use two functions:

### Simulation code in progress

- F(i)=returns the parental node of a female
- G(i)=returns the parental node of a male

let i=0,N-1 be the nodes in the p-graph. And g=1,Gen be the generation partitions (v1), 1 the top ancestors AND DONT FORGET TO EXPORT THE X COORD

net=source("C:/Program Files/pajek/PAJEK/PajekR.r") library(sna) gplot(n1, coord = cbind(4*v2,4-v1),gmode = "digraph", vertex.col=v1, vertex.cex=2) #mode = coord # n2= # matrix of links between marriages 1=Gmale 2=Ffemale %REPLACE the value 2 elements in n2 with those of permute(f1)

is.matrix(n1) [1] TRUE is.matrix(n2) [1] TRUE dim(n2) [1] 13 13

n2=n1 m2=n2 F1=0 F2=0

ro=0 co=0 numnodes=0 for (i in 1:(length(n2)-1)) {if (n2[i]==2) #if a 2 here at i {numnodes=numnodes+1}; {F1[numnodes] <- i}; #then put i in the permutation list #{F2[numnodes] <- co[i]}; #remember the column location } numnodes dimsq=length(n2)^.5 f1 <- F1[1:numnodes] f1 library(gtools) f2 <- permute(f1) #permute the permutation list f2 #these are the permuted-i locations #http://www.math.montana.edu/Rweb/Rhelp/00Index.html m2=n2 # make the array copy numnodes=0 for (i in 1:(length(n2)-1)) {if (n2[i]==2) #if a 2 here at i (again) m2[i] <- 0; #then zero this location in the array copy (this works) for (i in 1:length(n2)) {ro[i+1] <- (i+0)%/%(dimsq+0)}; for (i in 1:length(n2)+0) {co[i+1] <- 1+(i%%dimsq)} #modulo } ro co

numnodes

numnodes=0 for (i in 1:(length(n2)-1)) {if (n2[i]==2) #if a 2 here at i (again) {numnodes=numnodes+1; #this works m2[f2[numnodes]+1] <- 2} # put a 2 in a permuted-i location (DOESNT WORK) } #m2[f2[numnodes]+1] <- 2} is INTENDED to put 2s back with COLUMNS permuted, ROWs the same #I see my error -- it IS putting them back, right where they were before #I need now to transfer the vector code into a row column location dimsq #right=13 in this case i=14 co[i] ro[i]

numnodes m3=as.numeric(m2) n3=as.numeric(n2) n3 m3 #check f2 f3=0 for (i in 1:numnodes) {f3[i]=f2[i]} f1 f2 f3

- AND ADD A FOR LOOP TO DO THIS SEPARATELY FOR EACH GENERATION

- This was pseudocode en route toward R code for the permutation problem

write "error: no more than one parental couple" N=length(n1) #169 matrix entries 13x13 n=length(n2) #169 matrix entries (numeric) nr=n^.5 #13 row/col length = number of marriages = 13 in this example ro=1 #initialize row vector co=1 #initialize col vector for (i in 1:length(n2)) {ro[i]<-i%/%13} for (i in 1:length(n2)) {co[i]<-i%%13} ro #check co #check

Permute female marriages: R base functions - http://www.faculty.ucr.edu/~tgirke/Documents/R_BioCond/R_Programming.html

Gen=max(v1) n=length(n2) F=1 #initialize parental node of a female #for (g in 1:Gen) { #Get F,G links for (i in 1:length(n2)) {F[i]=0} # check0 for (i in 1:length(n2)) {if (n2[i]==2) F[i]<-co[i]} F1=NULL #initialize F array location for a generation G F1=NA #initialize F array location for a generation G F2=NULL #initialize F array location for a generation G F2=NA #initialize F array parental node in generation G m2=n2 numnodes=0 for (i in 1:(length(n2)-1)) (if (n2[i]==2) numnodes=numnodes+1; F1[numnodes] <- i; F2[numnodes] <- co[i]; m2[i]==n2[i]; m2[n2[i]]==0; } numnodes f1 <- F1[1:numnodes] f1 library(gtools) f2 <- permute(f1) f2

G<-c(5,6,7,8,9,10,11,12,0,0,0,0,5) Gran<-G z<-which(Gran!=0) #not equal to zero v<-sample(Gran[z]) #permute those Gran[z]<-v #put permuted back in Gran

You need not do this for F. You can permute G alone, F alone, or F and G. Should be able to mark certain kinds of relatives as unmarriageable as well, save their array of numbers, convert to 0s,

Will it work to permute the remainder, put the unmarriageables back where they were?

x = c(1,2,0,0,3,0) idxs = which(x > 0) permutation = sample(x[idxs]) x[idxs] = permutation