This function takes the output of a PLINK 1.7 assoc command or of a BOLT-LMM or SAIGE association test, produces a publication-ready Manhattan plot, and saves the result as a .fig file for manual tweaking and as a high-resolution .png. The function has been tested with the .assoc, .assoc.fisher, and .assoc.logistic file extensions in PLINK. Because the function reads the column names from the header line of the output file, the header line must be included. For very large GWASs on imputed data, where the output file can reach several GB in size, I recommend removing SNPs with p>0.1 or p>0.01 before passing the file to ManhattanPlot. This function will be useful to bioinformaticians from a MATLAB background who work with GWAS data.
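The pre-filtering step recommended above can be done outside MATLAB before the file is handed to ManhattanPlot. The following is a minimal Python sketch, not part of this submission: the function name filter_assoc and its parameters are hypothetical, and it assumes a whitespace-delimited file whose header contains a column named P. The header and every retained row are copied through unchanged so that the original column layout is preserved.

```python
import math


def filter_assoc(in_path, out_path, p_column="P", threshold=0.01):
    """Stream an association file line by line, keeping the header and
    only the rows whose p-value is at or below `threshold`.

    Hypothetical helper: assumes whitespace-delimited columns and a
    header line that contains `p_column`. Returns the number of rows kept.
    """
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        header_line = src.readline()
        p_index = header_line.split().index(p_column)
        dst.write(header_line)  # the header line is always retained
        for line in src:
            fields = line.split()
            if len(fields) <= p_index:
                continue  # blank or truncated row
            try:
                p = float(fields[p_index])
            except ValueError:
                continue  # non-numeric p-value such as "NA"
            if math.isnan(p) or p > threshold:
                continue
            dst.write(line)  # copy the row exactly as it appeared
            kept += 1
    return kept
```

For example, filter_assoc("results.assoc", "results.filtered.assoc", threshold=0.01) would write a smaller file containing only SNPs with p<=0.01; both file names here are placeholders. Reading one line at a time keeps memory use low even when the input is several GB.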
The 'sex' argument can be set to 0 or 1 to specify whether the sex chromosome should be plotted, and 'sig' specifies the genome-wide significance threshold at which the horizontal line is drawn. Columns are identified using the 'format' argument, which takes the options PLINK, BOLT-LMM, and SAIGE and looks for the standard column names used by the chosen program.
By default, this function looks for the PLINK column headers CHR, BP, and P. If you are using a program other than PLINK, BOLT-LMM, or SAIGE, the column headers can be renamed to CHR, BP, and P and the function should still work.
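Renaming the column headers for output from another program only requires rewriting the first line of the file. The following Python sketch shows one way to do that; the helper rename_header is hypothetical, and the example column names CHROM, POS, and PVAL are placeholders rather than the headers of any particular tool. Only whole header tokens are replaced, and the original spacing between columns is kept.

```python
import re
import shutil


def rename_header(in_path, out_path, mapping):
    """Copy a whitespace-delimited file, replacing header names per `mapping`.

    Hypothetical helper: `mapping` maps existing header names to the names
    the plotting function expects, e.g. {"CHROM": "CHR", "POS": "BP", "PVAL": "P"}.
    """
    with open(in_path) as src, open(out_path, "w") as dst:
        header = src.readline()
        # Split on whitespace but keep the separators so spacing is preserved.
        tokens = re.split(r"(\s+)", header)
        dst.write("".join(mapping.get(tok, tok) for tok in tokens))
        # The data rows are copied through unchanged.
        shutil.copyfileobj(src, dst)
```

The resulting file can then be read using the default PLINK column names.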
'PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner' available at http://zzz.bwh.harvard.edu/plink/index.shtml.
'The BOLT-LMM algorithm computes statistics for testing association between phenotype and genotypes using a linear mixed model (LMM). By default, BOLT-LMM assumes a Bayesian mixture-of-normals prior for the random effect attributed to SNPs other than the one being tested' available at https://data.broadinstitute.org/alkesgroup/BOLT-LMM/downloads/
'SAIGE is an R package that implements the Scalable and Accurate Implementation of Generalized mixed model that uses the saddlepoint approximation (SPA) (Imhof, J. P., 1961; Kuonen, D., 1999; Dey, R. et al., 2017) and large scale optimization techniques to calibrate case-control ratios in logistic mixed model score tests (Chen, H. et al., 2016) in large-scale GWAS' available at https://github.com/weizhouUMICH/SAIGE
Harry Green (2020). Manhattan Plots for visualisation of GWAS results (https://www.mathworks.com/matlabcentral/fileexchange/69549-manhattan-plots-for-visualisation-of-gwas-results), MATLAB Central File Exchange. Retrieved .
Bugfix in labelling section
Added functionality for labelling SNPs on the Manhattan Plot and control of title displayed on figure
Added some more options:
Title decided to change itself back
Updated to take name-value pairs as optional inputs, making it easier to use SAIGE or BOLT-LMM outputs. Also added options for controlling the genome-wide significance threshold. Renamed the function to ManhattanPlot, as it is no longer specific to PLINK
Didn't realise the title could be different to the filename. No changes made to code, but function renamed