Hclust2 is a handy tool for plotting heatmaps, with several options for producing high-quality, publication-ready figures.
$ conda install -c bioconda hclust2
or
$ pip install hclust2
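Once installed, the tool is driven entirely by command-line flags. The sketch below assembles an `hclust2.py` invocation from a handful of the flags listed in the usage text further down. The helper name `build_hclust2_command` and the file names are hypothetical, chosen only for illustration; the default values mirror the ones stated in the help text.

```python
import shlex

def build_hclust2_command(input_file, output_file, ftop=None, log_scale=False,
                          f_dist_f="correlation", s_dist_f="euclidean", dpi=150):
    """Assemble an hclust2.py argument list (hypothetical helper, not part of hclust2)."""
    cmd = ["hclust2.py", "-i", input_file, "-o", output_file,
           "--f_dist_f", f_dist_f,   # distance function for features
           "--s_dist_f", s_dist_f,   # distance function for samples
           "--dpi", str(dpi)]        # image resolution
    if ftop is not None:
        cmd += ["--ftop", str(ftop)]  # keep only the top N features
    if log_scale:
        cmd.append("-l")              # log-scale the values
    return cmd

# Hypothetical file names, for illustration only.
cmd = build_hclust2_command("abundance_table.txt", "heatmap.png",
                            ftop=25, log_scale=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to actually run the tool, assuming `hclust2.py` is on the `PATH`.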
Below is the heatmap produced by Hclust2 on the MetaPhlAn2 abundance profiles of HMP and HMP1-phase2 samples (microbial species and samples are hierarchically clustered).
#!python
usage: hclust2.py [-h] [-i [INPUT_FILE]] [-o [OUTPUT_FILE]]
[--legend_file [LEGEND_FILE]] [-t INPUT_TYPE] [--sep SEP]
[--out_table OUT_TABLE] [--fname_row FNAME_ROW]
[--sname_row SNAME_ROW] [--metadata_rows METADATA_ROWS]
[--skip_rows SKIP_ROWS] [--sperc SPERC] [--fperc FPERC]
[--stop STOP] [--ftop FTOP] [--def_na DEF_NA]
[--f_dist_f F_DIST_F] [--s_dist_f S_DIST_F]
[--load_dist_matrix_f LOAD_DIST_MATRIX_F]
[--load_dist_matrix_s LOAD_DIST_MATRIX_S]
[--load_pickled_dist_matrix_f LOAD_PICKLED_DIST_MATRIX_F]
[--load_pickled_dist_matrix_s LOAD_PICKLED_DIST_MATRIX_S]
[--save_pickled_dist_matrix_f SAVE_PICKLED_DIST_MATRIX_F]
[--save_pickled_dist_matrix_s SAVE_PICKLED_DIST_MATRIX_S]
[--no_fclustering] [--no_sclustering] [--flinkage FLINKAGE]
[--slinkage SLINKAGE] [--dpi DPI] [-l] [--title TITLE] [-s]
[--no_slabels] [--minv MINV] [--maxv MAXV] [--no_flabels]
[--max_slabel_len MAX_SLABEL_LEN]
[--max_flabel_len MAX_FLABEL_LEN]
[--flabel_size FLABEL_SIZE] [--slabel_size SLABEL_SIZE]
[--fdend_width FDEND_WIDTH] [--sdend_height SDEND_HEIGHT]
[--metadata_height METADATA_HEIGHT]
[--metadata_separation METADATA_SEPARATION]
[--image_size IMAGE_SIZE]
[--cell_aspect_ratio CELL_ASPECT_RATIO]
[-c {Accent,Blues,BrBG,BuGn,BuPu,Dark2,GnBu,Greens,Greys,OrRd,Oranges,PRGn,Paired,Pastel1,Pastel2,PiYG,PuBu,PuBuGn,PuOr,PuRd,Purples,RdBu,RdGy,RdPu,RdYlBu,RdYlGn,Reds,Set1,Set2,Set3,Spectral,YlGn,YlGnBu,YlOrBr,YlOrRd,afmhot,autumn,binary,bone,brg,bwr,cool,copper,flag,gist_earth,gist_gray,gist_heat,gist_ncar,gist_rainbow,gist_stern,gist_yarg,gnuplot,gnuplot2,gray,hot,hsv,jet,ocean,pink,prism,rainbow,seismic,spectral,spring,summer,terrain,winter,bbcyr,bbcry,bcry}]
[--bottom_c BOTTOM_C] [--top_c TOP_C] [--nan_c NAN_C]
optional arguments:
-h, --help show this help message and exit
-i [INPUT_FILE], --inp [INPUT_FILE], --in [INPUT_FILE]
The input matrix
-o [OUTPUT_FILE], --out [OUTPUT_FILE]
The output image file [image on screen if not
specified]
--legend_file [LEGEND_FILE]
The output file for the legend of the provided
metadata
-t INPUT_TYPE, --input_type INPUT_TYPE
The input type can be a data matrix or distance matrix
[default data_matrix]
Input data matrix parameters:
--sep SEP
--out_table OUT_TABLE
Write processed data matrix to file
--fname_row FNAME_ROW
row number containing the names of the features
[default 0, specify -1 if no names are present in the
matrix]
--sname_row SNAME_ROW
column number containing the names of the samples
[default 0, specify -1 if no names are present in the
matrix]
--metadata_rows METADATA_ROWS
Row numbers to use as metadata [default None, meaning
no metadata]
--skip_rows SKIP_ROWS
Row numbers to skip (0-indexed, comma separated) from
the input file [default None, meaning no rows skipped]
--sperc SPERC Percentile of sample value distribution for sample
selection
--fperc FPERC Percentile of feature value distribution for feature
selection
--stop STOP Number of top samples to select (ordering based on
percentile specified by --sperc)
--ftop FTOP Number of top features to select (ordering based on
percentile specified by --fperc)
--def_na DEF_NA Set the default value for missing values [default None
which means no replacement]
Distance parameters:
--f_dist_f F_DIST_F Distance function for features [default correlation]
--s_dist_f S_DIST_F Distance function for sample [default euclidean]
--load_dist_matrix_f LOAD_DIST_MATRIX_F
Load the distance matrix to be used for features
[default None].
--load_dist_matrix_s LOAD_DIST_MATRIX_S
Load the distance matrix to be used for samples
[default None].
--load_pickled_dist_matrix_f LOAD_PICKLED_DIST_MATRIX_F
Load the distance matrix to be used for features as
previously saved as pickle file using hclust2 itself
[default None].
--load_pickled_dist_matrix_s LOAD_PICKLED_DIST_MATRIX_S
Load the distance matrix to be used for samples as
previously saved as pickle file using hclust2 itself
[default None].
--save_pickled_dist_matrix_f SAVE_PICKLED_DIST_MATRIX_F
Save the distance matrix for features to file [default
None].
--save_pickled_dist_matrix_s SAVE_PICKLED_DIST_MATRIX_S
Save the distance matrix for samples to file [default
None].
Clustering parameters:
--no_fclustering avoid clustering features
--no_sclustering avoid clustering samples
--flinkage FLINKAGE Linkage method for feature clustering [default
average]
--slinkage SLINKAGE Linkage method for sample clustering [default average]
Heatmap options:
--dpi DPI Image resolution in dpi [default 150]
-l, --log_scale Log scale
--title TITLE Title of the plot
-s, --sqrt_scale Square root scale
--no_slabels Do not show sample labels
--minv MINV Minimum value to display in the color map [default
None meaning automatic]
--maxv MAXV Maximum value to display in the color map [default
None meaning automatic]
--no_flabels Do not show feature labels
--max_slabel_len MAX_SLABEL_LEN
Max number of chars to report for sample labels
[default 15]
--max_flabel_len MAX_FLABEL_LEN
Max number of chars to report for feature labels
[default 15]
--flabel_size FLABEL_SIZE
Feature label font size [default 10]
--slabel_size SLABEL_SIZE
Sample label font size [default 10]
--fdend_width FDEND_WIDTH
Width of the feature dendrogram [default 1 meaning
100% of default heatmap width]
--sdend_height SDEND_HEIGHT
Height of the sample dendrogram [default 1 meaning
100% of default heatmap height]
--metadata_height METADATA_HEIGHT
Height of the metadata panel [default 0.05 meaning 5%
of default heatmap height]
--metadata_separation METADATA_SEPARATION
Distance between the metadata and data panels.
[default 0.001 meaning 0.1% of default heatmap height]
--image_size IMAGE_SIZE
Size of the largest between width and height size for
--cell_aspect_ratio CELL_ASPECT_RATIO
Aspect ratio between width and height for the cells of
the heatmap [default 1.0]
-c {Accent,Blues,BrBG,BuGn,BuPu,Dark2,GnBu,Greens,Greys,OrRd,Oranges,PRGn,Paired,Pastel1,Pastel2,PiYG,PuBu,PuBuGn,PuOr,PuRd,Purples,RdBu,RdGy,RdPu,RdYlBu,RdYlGn,Reds,Set1,Set2,Set3,Spectral,YlGn,YlGnBu,YlOrBr,YlOrRd,afmhot,autumn,binary,bone,brg,bwr,cool,copper,flag,gist_earth,gist_gray,gist_heat,gist_ncar,gist_rainbow,gist_stern,gist_yarg,gnuplot,gnuplot2,gray,hot,hsv,jet,ocean,pink,prism,rainbow,seismic,spectral,spring,summer,terrain,winter,bbcyr,bbcry,bcry}, --colormap {Accent,Blues,BrBG,BuGn,BuPu,Dark2,GnBu,Greens,Greys,OrRd,Oranges,PRGn,Paired,Pastel1,Pastel2,PiYG,PuBu,PuBuGn,PuOr,PuRd,Purples,RdBu,RdGy,RdPu,RdYlBu,RdYlGn,Reds,Set1,Set2,Set3,Spectral,YlGn,YlGnBu,YlOrBr,YlOrRd,afmhot,autumn,binary,bone,brg,bwr,cool,copper,flag,gist_earth,gist_gray,gist_heat,gist_ncar,gist_rainbow,gist_stern,gist_yarg,gnuplot,gnuplot2,gray,hot,hsv,jet,ocean,pink,prism,rainbow,seismic,spectral,spring,summer,terrain,winter,bbcyr,bbcry,bcry}
--bottom_c BOTTOM_C Color to use for cells below the minimum value of the
scale [default None meaning bottom color of the scale]
--top_c TOP_C Color to use for cells above the maximum value of the
scale [default None meaning top color of the scale]
--nan_c NAN_C Color to use for nan cells [default None]
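To make the `--ftop`/`--fperc` and `--f_dist_f`/`--s_dist_f` options more concrete, here is a minimal pure-Python sketch of what they describe: ranking features by a percentile of their values and keeping the top N, then measuring distances between the rows that remain. It is an illustration under stated assumptions, not hclust2's actual code: the nearest-rank percentile rule, the reading of "correlation" as 1 minus the Pearson correlation, the function names, and the toy numbers are all assumptions.

```python
import math

def percentile(values, perc):
    """Nearest-rank percentile (an assumption; hclust2 may interpolate differently)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(perc / 100.0 * len(ordered)))
    return ordered[rank - 1]

def select_top_features(matrix, ftop, fperc):
    """Return the names of the ftop features with the highest fperc-th percentile."""
    ranked = sorted(matrix, key=lambda name: percentile(matrix[name], fperc),
                    reverse=True)
    return ranked[:ftop]

def correlation_distance(x, y):
    """1 - Pearson correlation; undefined (division by zero) for constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

# Toy abundance table: feature name -> values across four samples (made-up numbers).
table = {
    "species_A": [0.0, 0.1, 0.0, 0.2],
    "species_B": [5.0, 4.0, 6.0, 5.5],
    "species_C": [1.0, 0.0, 9.0, 0.5],
}

# Analogue of "--ftop 2 --fperc 90": keep the two features with the largest
# 90th-percentile value.
top = select_top_features(table, ftop=2, fperc=90)

# Distances between the two retained feature rows.
d_corr = correlation_distance(table[top[0]], table[top[1]])
d_eucl = math.dist(table[top[0]], table[top[1]])  # Euclidean distance
print(top, round(d_corr, 3), round(d_eucl, 3))
```

Ranking by a high percentile rather than the mean favors features that are abundant in at least some samples, which is often what a heatmap of the "top" features is meant to show.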