Setting up a phylogeographic Bayesian stochastic search variable selection (BSSVS) procedure

This tutorial describes how to modify the XML file of the standard phylogeographic model to set up BSSVS [1]. The XML elements required for the standard model are covered in the tutorial describing the standard Discrete Phylogeographic Analysis. The example XML file for this analysis can be found here [2]; it considers influenza A H5N1 diffusion among K = 7 locations.

Uncomment the two bitFlip operators to sample the indicators.

	<operators id="operators" optimizationSchedule="log">
		.
		.
		.
		<!-- discrete state model operators -->
		<scaleOperator scaleFactor="0.75" weight="10">
			<parameter idref="geoSiteModel.mu"/>
		</scaleOperator>
		<scaleOperator scaleFactor="0.75" weight="30" scaleAllIndependently="true" autoOptimize="true">
			<parameter idref="rates"/>
		</scaleOperator>

		<!-- BSSVS operators	-->
		<bitFlipOperator weight="30">
			<parameter idref="indicators"/>
		</bitFlipOperator>
		<bitFlipInSubstitutionModelOperator scaleFactor="0.75" weight="30" autoOptimize="true">
			<svsGeneralSubstitutionModel idref="originModel"/>
			<parameter idref="geoSiteModel.mu"/>
		</bitFlipInSubstitutionModelOperator>
	</operators>

Uncomment the truncated poissonPrior. To put 50% of the prior probability on the minimum number of rates needed to keep all locations connected, set mean = 0.693 (ln 2) and offset = K - 1.

			<prior id="prior">
				.
				.
				.
				<!-- BSSVS truncated Poisson prior	-->
				<poissonPrior mean="0.693" offset="6.0">
					<statistic idref="nonZeroRates"/>
				</poissonPrior>				
			</prior>
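The choice of mean = ln 2 can be checked numerically. The following is a minimal Python sketch, not part of BEAST: it assumes the prior acts as a Poisson distribution on (number of non-zero rates - offset) and ignores the upper truncation at the total number of rates. The function name offset_poisson_pmf is hypothetical.

```python
import math

def offset_poisson_pmf(n, mean, offset):
    """P(N = n) when N - offset follows a Poisson(mean) distribution.

    Assumption: upper truncation at the total number of rates is ignored.
    """
    k = n - offset
    if k < 0:
        return 0.0  # fewer rates than the offset has zero prior probability
    return math.exp(-mean) * mean ** k / math.factorial(k)

K = 7            # number of locations in the example
mean = 0.693     # approximately ln 2
offset = K - 1   # minimum number of rates that keeps all locations connected

# Prior probability of each number of non-zero rates, starting at the minimum
for n in range(offset, offset + 4):
    print(n, offset_poisson_pmf(n, mean, offset))
```

Under these assumptions the probability of exactly K - 1 = 6 non-zero rates is exp(-0.693), which is approximately 0.5.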

The indicators are logged together with the rates, the nonZeroRates statistic and geoSiteModel.mu to a separate log file:

		<log id="rateMatrixLog" logEvery="10000" fileName="H5N1_HA_discrete_rateMatrix.log">
			<parameter idref="rates"/>
			<parameter idref="indicators"/>
			<sumStatistic idref="nonZeroRates"/>
			<parameter idref="geoSiteModel.mu"/>
		</log>

This completes the BSSVS extension of the standard discrete model. Comparing the prior probability of each rate indicator (determined by the truncated Poisson prior) with its posterior expectation shows to what extent the data require that rate to adequately explain the diffusion process.
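One common way to summarize this comparison is a Bayes factor per rate: the posterior odds of the indicator being 1 divided by its prior odds. The Python sketch below is illustrative only and rests on several assumptions: the log file is tab-delimited with '#' comment lines, the indicator columns start with the prefix "indicators", and the prior inclusion probability is approximated as (mean + K - 1) divided by the number of rates, ignoring truncation. The function names and the burnin parameter are hypothetical.

```python
import csv
import math

def prior_inclusion_probability(n_locations, n_rates, poisson_mean=math.log(2)):
    """Approximate prior probability that a single indicator equals 1.

    Assumption: expected number of non-zero rates under the offset Poisson
    prior (poisson_mean + n_locations - 1) divided by the number of rates.
    This must be below 1 for the prior odds to be defined.
    """
    return (poisson_mean + n_locations - 1) / n_rates

def indicator_bayes_factors(log_lines, n_locations, prefix="indicators", burnin=0.1):
    """Return {column name: Bayes factor} for every indicator column."""
    # Skip blank lines and '#' comment lines, then parse the tab-delimited table
    data = (line for line in log_lines if line.strip() and not line.startswith("#"))
    rows = list(csv.DictReader(data, delimiter="\t"))
    rows = rows[int(len(rows) * burnin):]  # discard the initial burn-in fraction
    columns = [name for name in rows[0] if name.startswith(prefix)]

    q = prior_inclusion_probability(n_locations, len(columns))
    prior_odds = q / (1 - q)

    factors = {}
    for name in columns:
        # Posterior expectation of the indicator = fraction of samples equal to 1
        p = sum(float(row[name]) for row in rows) / len(rows)
        factors[name] = math.inf if p == 1.0 else (p / (1 - p)) / prior_odds
    return factors

# Possible usage with the log file written by the example XML:
# with open("H5N1_HA_discrete_rateMatrix.log") as handle:
#     print(indicator_bayes_factors(handle, n_locations=7))
```

An indicator that is 1 in every retained sample yields an infinite Bayes factor here, so longer runs or a finite cap may be preferable when reporting results.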



Philippe Lemey, 2009