Bayes factor

Identifying well-supported rates using Bayes factors 

This tutorial describes how to identify rates that are frequently invoked to explain the diffusion process and how to visualize them in Google Earth. The tutorial uses the results of the analysis described in BSSVS, in particular the rate indicators written to the H5N1_HA_discrete_rateMatrix.log file. To analyze this file, we need a tool recently added to the BEAST code (RateIndicatorBF); a recent build that contains this tool can be downloaded at beast.jar. For the Google Earth visualization, we also have to associate each rate with two locations and their latitudes and longitudes. We will use the same tab-delimited file prepared in Tree summary. For the H5N1 example, it should look like this:

Fujian	25.917	118.283
Guangdong	22.87	113.483
Guangxi	23.6417	108.1
Hebei	39.3583	116.6417
Henan	33.875	113.5
HongKong	22.3	114.167
Hunan	27.383	111.517

Importantly, the order of the locations must be the same as in the generalDataType of the BEAST XML file:

	<!-- The general data type for locations                                   -->
	<generalDataType id="geography">
		<location idref="Fujian"/> 
		<location idref="Guangdong"/> 
		<location idref="Guangxi"/> 
		<location idref="Hebei"/> 
		<location idref="Henan"/> 
		<location idref="HongKong"/> 
		<location idref="Hunan"/>
	</generalDataType>
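The order requirement above can be checked before running the analysis. The following is a minimal sketch, assuming the locations file is tab-delimited with the name in the first column and that the XML uses the element and attribute names shown in the excerpt; the helper names (`names_from_locations_file`, `names_from_data_type`, `first_mismatch`) are hypothetical and not part of BEAST.

```python
import xml.etree.ElementTree as ET


def names_from_locations_file(text):
    """Return the location name (first tab-separated field) of each non-empty line."""
    return [line.split("\t")[0].strip() for line in text.splitlines() if line.strip()]


def names_from_data_type(xml_text, data_type_id="geography"):
    """Return the idref of each child of the generalDataType with the given id."""
    root = ET.fromstring(xml_text)
    for element in root.iter("generalDataType"):
        if element.get("id") == data_type_id:
            return [child.get("idref") for child in element]
    raise ValueError(f"no generalDataType with id {data_type_id!r}")


def first_mismatch(file_names, xml_names):
    """Return the first index where the two orders differ, or None if identical."""
    for i, (a, b) in enumerate(zip(file_names, xml_names)):
        if a != b:
            return i
    if len(file_names) != len(xml_names):
        return min(len(file_names), len(xml_names))
    return None


# Sample data copied from the tutorial (in practice, read both files from disk).
locations_text = (
    "Fujian\t25.917\t118.283\n"
    "Guangdong\t22.87\t113.483\n"
    "Guangxi\t23.6417\t108.1\n"
    "Hebei\t39.3583\t116.6417\n"
    "Henan\t33.875\t113.5\n"
    "HongKong\t22.3\t114.167\n"
    "Hunan\t27.383\t111.517\n"
)
data_type_xml = """
<generalDataType id="geography">
    <location idref="Fujian"/>
    <location idref="Guangdong"/>
    <location idref="Guangxi"/>
    <location idref="Hebei"/>
    <location idref="Henan"/>
    <location idref="HongKong"/>
    <location idref="Hunan"/>
</generalDataType>
"""

file_names = names_from_locations_file(locations_text)
xml_names = names_from_data_type(data_type_xml)
mismatch = first_mismatch(file_names, xml_names)
if mismatch is None:
    print("Location order matches the generalDataType.")
else:
    print(f"Order differs at position {mismatch}.")
```

For a full BEAST XML file, the same function can be given the whole document, since it searches for the generalDataType element with the matching id.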

If a KML output for Google Earth visualization is not required, the same file can be restricted to the location names (without coordinates). If beast.jar is in the same folder as the H5N1_HA_discrete_rateMatrix.log file, the commands below can be used as shown; otherwise, the full path to beast.jar has to be specified. To list the options of the RateIndicatorBF analysis tool, run beast.jar with the help option:

java -cp beast.jar -help

There are many options you can explore, but some are not yet fully implemented. For our purposes, we can use:

java -cp beast.jar -burnin 200 -pmean 0.693 -poffset 6 -bfcutoff 3 -kml true -locationsfile locationCoordinates H5N1_HA_discrete_rateMatrix.log BFtest.out

where "-burnin 200" specifies 200 trees to be discarded as burn-in, "-pmean" specifies the mean of the (truncated) Poisson prior [the default is 0.693], "-poffset" specifies the offset of the truncated Poisson prior [the default is the number of locations - 1], "-bfcutoff" specifies the Bayes factor above which we consider rates to be well-supported, "-kml true" requests a KML output file, and "-locationsfile" specifies the file in which we saved the locations and their coordinates. The last two arguments specify the input and output files.

This should generate a text file (BFtest.out) and a KML output file. The text file lists the rates yielding a Bayes factor above the specified cut-off:

Indicator cutoff (for BF = 3.0) = 0.5839295061943813
mean Poisson Prior = 0.693
Poisson Prior offset = 6
I=0.8989450305385897	BF=19.015301372764394: between Fujian (long: 118.283; lat: 25.917) and Guangdong (long: 113.483; lat: 22.87)
I=0.7290394225430317	BF=5.751387858124344: between Guangdong (long: 113.483; lat: 22.87) and Guangxi (long: 108.1; lat: 23.6417)
I=0.7157134925041644	BF=5.381591249719858: between Fujian (long: 118.283; lat: 25.917) and Hebei (long: 116.6417; lat: 39.3583)
I=0.6835091615769017	BF=4.616479904377707: between Henan (long: 113.5; lat: 33.875) and Hunan (long: 111.517; lat: 27.383)
I=0.6274292059966685	BF=3.5998439546800576: between Guangxi (long: 108.1; lat: 23.6417) and HongKong (long: 114.167; lat: 22.3)
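The tutorial does not state how the BF values are derived from the indicators. The sketch below assumes the usual posterior-odds over prior-odds form, with the prior probability of including a rate taken as (pmean + poffset) divided by the number of pairwise rates, K(K-1)/2 for K locations. This is an assumption rather than a description of the RateIndicatorBF code, but with 7 locations (21 rates) it reproduces the cut-off and BF values in the listing above. The function names are hypothetical.

```python
def prior_inclusion_probability(n_locations, pmean=0.693, poffset=None):
    """Prior probability that a single pairwise rate is switched on.

    Assumes a symmetric rate matrix with K*(K-1)/2 rates and a prior
    expectation of (pmean + poffset) non-zero rates.
    """
    if poffset is None:
        poffset = n_locations - 1  # default offset named in the tutorial
    n_rates = n_locations * (n_locations - 1) // 2
    return (pmean + poffset) / n_rates


def bayes_factor(indicator_mean, prior_prob):
    """Posterior odds of inclusion divided by prior odds of inclusion."""
    posterior_odds = indicator_mean / (1.0 - indicator_mean)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


def indicator_cutoff(bf_cutoff, prior_prob):
    """Posterior indicator mean at which the Bayes factor equals bf_cutoff."""
    odds = bf_cutoff * prior_prob / (1.0 - prior_prob)
    return odds / (1.0 + odds)


# Values from the H5N1 example: 7 locations, -pmean 0.693, -poffset 6, -bfcutoff 3.
q = prior_inclusion_probability(7, pmean=0.693, poffset=6)
cutoff = indicator_cutoff(3.0, q)
bf_fujian_guangdong = bayes_factor(0.8989450305385897, q)

print(f"prior inclusion probability = {q:.4f}")
print(f"indicator cutoff for BF = 3 = {cutoff:.4f}")
print(f"BF for Fujian-Guangdong     = {bf_fujian_guangdong:.3f}")
```

Under these assumptions the cut-off comes out at about 0.5839 and the Fujian-Guangdong Bayes factor at about 19.015, in agreement with the first and fourth lines of the output file.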

Open the KML output file (default: KMLrates.kml) in Google Earth. The rate mapping should look like this: