mtviz

start

to start, follow this link.

overview

mtviz is a tool for drawing publication-ready pictures of mitochondrial genomes. on the right side you can see an example output for cucumaria miniata. more examples are here.

currently the most important feature of mtviz is the 'fitting' of the gene names into the available space. the problem is that the available space is often smaller than the space needed by the name of the gene (especially for trna genes). therefore the following is done:
  1. the name of the gene is rotated by 90° if the height of the gene is smaller than its width
  2. the fontsize is reduced until the name fits or a minimal fontsize (specified by the user) is reached
  3. the space is enlarged until the name fits. the additional space is taken from genes that have more space available than their names need. it is taken in equal portions in order to disturb the display as little as possible and keep the drawing accurate
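
steps 2 and 3 above can be sketched in python as follows. this is only a rough illustration and not the code used by mtviz: the text width model (`CHAR_ASPECT`), the function names, and the way the equal portions are computed are assumptions made for this example, and the rotation of step 1 is left out.

```python
CHAR_ASPECT = 0.6  # assumption: average glyph width as a fraction of the font size


def needed_width(name, fontsize):
    """Rough estimate of the space a name needs at a given font size."""
    return len(name) * fontsize * CHAR_ASPECT


def fit_fontsize(name, width, fontsize, min_fontsize):
    """Step 2: shrink the font until the name fits or the minimum is reached."""
    size = fontsize
    while size > min_fontsize and needed_width(name, size) > width:
        size -= 1
    return size


def redistribute(widths, needs, eps=1e-9):
    """Step 3: enlarge genes that are too small, taking space in equal
    portions from genes that have more space than their name needs."""
    widths = [float(w) for w in widths]
    shortfalls = [max(0.0, n - w) for w, n in zip(widths, needs)]
    deficit = sum(shortfalls)
    if deficit <= eps:
        return widths  # every name already fits

    remaining = deficit
    while remaining > eps:
        donors = [i for i, (w, n) in enumerate(zip(widths, needs)) if w - n > eps]
        if not donors:
            break  # no spare space left anywhere
        share = remaining / len(donors)  # equal portion per donor
        for i in donors:
            give = min(share, widths[i] - needs[i])  # never take more than is spare
            widths[i] -= give
            remaining -= give

    # hand the collected space to the genes that were too small
    collected = deficit - remaining
    for i, shortfall in enumerate(shortfalls):
        widths[i] += shortfall * collected / deficit
    return widths
```

for example, `redistribute([10, 2, 10], [4, 6, 4])` takes 2 units from each of the two large genes and gives them to the small one, so the total size of the genome in the drawing stays the same.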

contact

authors: matthias bernt, daniel merkle, martin middendorf
(parallel computing and complex systems group, university of leipzig).
contact: bernt (at) informatik (dot) uni-leipzig (dot) de

(very short) tutorial

  1. upload a genbank file (like NC_005929.gb)
  2. change properties of the features
  3. adjust drawing properties
  4. download the file or image shown on the right

description

the webpage consists mainly of four areas. at the top is a file upload form for a genbank file. below it are three columns (two columns if there is no data). the leftmost column shows the data to plot, i.e. the annotation of the genome. the second column shows the current drawing options. if there is data to plot, the last column shows a preview of the plot and a link for downloading the image.

data input and modification

with the form at the top of the page the user can load a genbank file from the local hard disk. just select a file with the browse button (this text may appear in your system language, e.g. for german users it could be 'Durchsuchen') and confirm the selection with the load button. afterwards the accession number, the name of the organism, its circularity, and (most important) the list of features found in the genbank file should be shown in the feature table below. the feature table has six columns. to extract the data the parser from biopython is used. in my experience this works very well, i.e. it worked for all data sets tested so far (the data sets were taken from ncbi). note that features of the type 'source' or 'cds' are ignored while reading the genbank file.

all the data elements shown can be modified. e.g. it is always useful to change the names of the trna genes to their one letter abbreviation. if you insert data in the last line of the list (the empty one), a feature is added to the list. to confirm the changes use the update button. this will also redraw the image on the right or regenerate the file.
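
reading a genbank file with biopython and skipping the ignored feature types could look roughly like the sketch below. it is not the code of mtviz: the function names and the layout of the returned dictionary are assumptions made for this example, and the type names are compared case-insensitively because genbank files write the type as 'CDS'.

```python
from types import SimpleNamespace

IGNORED_TYPES = {"source", "cds"}  # feature types skipped while reading


def keep_features(features):
    """Drop features whose type is ignored (compared case-insensitively)."""
    return [f for f in features if f.type.lower() not in IGNORED_TYPES]


def load_genbank(path):
    """Read one genbank record and return basic information plus its features."""
    from Bio import SeqIO  # requires biopython

    record = SeqIO.read(path, "genbank")
    info = {
        "accession": record.id,  # may include a version suffix
        "organism": record.annotations.get("organism", ""),
        "circular": record.annotations.get("topology") == "circular",
        "size": len(record.seq),
    }
    return info, keep_features(record.features)


# stand-in objects with a .type attribute, so the filter can be tried
# without biopython or a genbank file
demo = [SimpleNamespace(type=t) for t in ("source", "CDS", "tRNA", "rRNA")]
kept_types = [f.type for f in keep_features(demo)]
```

with a real file, `info, features = load_genbank("NC_005929.gb")` would return the values for the header fields and the features for the feature table.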

output format

you can choose between different output formats (eps, ps, pdf, png, jpg) simply by choosing an entry of the format drop down list. the eps format should be the most suitable for your publications. sometimes it can be useful to specify a background color; this can be done with the background option. the colors can be chosen with the color selector on the right of the input field (...). if you need colors other than those available you have to specify them hex-encoded (#rrggbb).
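
as an illustration of the hex encoding, the following small python sketch converts a #rrggbb string into its red, green and blue components (0 to 255). the function name is made up for this example.

```python
import re

_HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def parse_hex_color(text):
    """Convert '#rrggbb' into a (red, green, blue) tuple of integers."""
    if not _HEX_COLOR.fullmatch(text):
        raise ValueError(f"not a #rrggbb color: {text!r}")
    # two hex digits per channel, starting after the '#'
    return tuple(int(text[i:i + 2], 16) for i in (1, 3, 5))
```

e.g. `parse_hex_color("#ff8000")` gives `(255, 128, 0)`, an orange.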

drawing options

the picture consists mainly of two circles with user specified inner and outer radius. it is possible and advisable to surround the picture with a margin of a certain width. all lines (the circles, and the lines representing the gene ends) are drawn with the width given by line width. to show the strandedness of the genes the inner or outer circle is drawn thicker; how much thicker is specified with sign width.

the names of the genes are drawn with a certain font. if possible the names are drawn in the size specified by fontsize, otherwise the size can be reduced down to min fontsize (if the name still does not fit, then the name is just printed and will probably overlap with the adjacent genes). it is also possible to switch off these optimizations by toggling the optimize checkbox. then all genes are drawn with exact scaling - the price is that the gene names will overlap in many cases. another possibility to draw the genes exactly scaled is to set the minimal fontsize to 1 - the disadvantage of this strategy is that it will be hard to recognize some of the names.

normally the drawing of the genome is started at the topmost point of the circle, i.e. the 0 position is located there. if you want the drawing to start at a certain gene, then select that gene for the rotate option.
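
the effect of the rotate option can be illustrated with a small python sketch that maps a position in the genome to an angle on the circle. this is not the code of mtviz: the function names are made up, and it is an assumption of this example that the genome is drawn clockwise.

```python
import math


def position_to_angle(pos, genome_size, rotate_start=0):
    """Angle in degrees, measured clockwise from the topmost point.

    rotate_start is the position (e.g. the start of the selected gene)
    that should be placed at the topmost point of the circle.
    """
    return ((pos - rotate_start) % genome_size) * 360.0 / genome_size


def point_on_circle(angle_deg, radius):
    """x/y coordinates for an angle measured clockwise from the top."""
    a = math.radians(angle_deg)
    return radius * math.sin(a), radius * math.cos(a)
```

for a genome of 16000 nt, position 4000 is normally drawn at 90 degrees (the rightmost point). with `rotate_start=4000` it moves to the topmost point and position 0 ends up at 270 degrees.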

in the feature types section you can set some options which apply to all genes of a type. with uas and ovl you can set options for unassigned (i.e. unannotated) and overlapping regions. it is possible to select a color for each feature type. with the checkbox on the right of each line you can specify whether the strandedness of genes of a certain type should be marked. e.g. it is useful to draw unassigned regions without a sign.

the appearance of the scale can be controlled with the following options. the scale checkbox toggles whether a scale is drawn or not. with every you can specify the frequency of the scale marks. the length specifies how long the lines of the scale should be. with fontsize and scale color it is possible to change the fontsize and color of the scale.
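
as a small illustration of the every option, the following python sketch lists the positions at which scale marks would be placed. it assumes that every is given in nucleotides and that the first mark is at position 0; both are assumptions of this example, and the function name is made up.

```python
def scale_marks(genome_size, every):
    """Positions of the scale marks, one mark each 'every' nucleotides."""
    if every <= 0:
        raise ValueError("every must be positive")
    return list(range(0, genome_size, every))
```

e.g. `scale_marks(16000, 5000)` gives marks at 0, 5000, 10000 and 15000.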

last but not least it is possible to add a legend to the drawing, like the accession number, the organism name or the size of the genome. currently it is possible to specify text for the top left corner, the center, and the bottom left corner. information from the genbank file can be inserted in a convenient way: a "%a" is replaced by the accession number, "%o" by the organism, and "%s" by the size. it is also possible to write text over multiple lines; to start a new line insert "\n". e.g. the default value for the center legend, "%a %s nt\n%o", will result in two lines - where the 1st consists of the accession number, the size and the two letters 'nt', and the 2nd line shows the name of the species.
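
the replacement of the placeholders can be sketched in python as follows. this is only an illustration, not the code of mtviz: the function name is made up, and the accession number, organism and size in the example are invented values.

```python
import re


def expand_legend(template, accession, organism, size):
    """Replace %a, %o, %s and the two characters '\\n' in a legend text."""
    values = {
        "%a": accession,
        "%o": organism,
        "%s": str(size),
        "\\n": "\n",  # backslash followed by 'n' starts a new line
    }
    # a single pass, so that inserted text is never expanded a second time
    return re.sub(r"%[aos]|\\n", lambda m: values[m.group(0)], template)


# invented example values
legend = expand_legend(r"%a %s nt\n%o", "XX_000000", "example organism", 16000)
```

here `legend` consists of the two lines "XX_000000 16000 nt" and "example organism".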

planned features