JoinMap ® 5
Software for the calculation of genetic linkage maps in experimental populations of diploid species
JoinMap provides high quality tools that allow detailed study of the experimental data and the generation of publication-ready map charts. The intuitive MS-Windows ® user interface of JoinMap invites a closer exploration of the data. For instance, you can perform several diagnostic tests, both before and after the actual map calculation, and you can remove potentially erroneous loci and individuals from the map calculations with a simple mouse click.
New features introduced with version 5
Many features are not yet available in the current early release edition of JoinMap v5. They will be made available in future releases over the coming months. These are the new features available in the current early release edition of JoinMap v5:
The v5 executable program is a 64-bit MS-Windows application. This means that it can access more than 4 GB of memory (i.e. RAM), which is the limit for 32-bit applications. The program requires a 64-bit version of the MS-Windows operating system and a computer with more than 4 GB of RAM; 16 GB of memory is recommended. In practice, access to more memory means that computations can take place in memory without having to store intermediate results on the hard drive, resulting in higher speeds.
- Database driven
The various data within the JoinMap v5 projects are stored in databases. Tables, tree views and plain-text views retrieve only the currently visible part of the data from the databases. In contrast, previous versions of JoinMap always retrieved the entire data from file, which can make the program very slow if the files become extremely large (which is especially the case when dealing with pairwise data). Using a database system greatly improves the responsiveness of the user interface with large datasets; with small datasets the interface is slightly less responsive than in previous JoinMap versions due to the database system overhead. The database system used is the embedded SQL database engine SQLite, which does not require any database server installation or maintenance.
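The idea of retrieving only the visible part of a table can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not JoinMap's implementation: the table name `pairwise`, its columns and the function `fetch_visible_rows` are hypothetical, since the actual project database schema is not described here.

```python
import sqlite3

def fetch_visible_rows(conn, first_row, row_count):
    """Return only the rows a table view would currently display."""
    cur = conn.execute(
        "SELECT locus_a, locus_b, rec_freq FROM pairwise "
        "ORDER BY id LIMIT ? OFFSET ?",
        (row_count, first_row),
    )
    return cur.fetchall()

# Build a small in-memory demo database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pairwise ("
    "id INTEGER PRIMARY KEY, locus_a TEXT, locus_b TEXT, rec_freq REAL)"
)
rows = [(f"M{i}", f"M{i + 1}", 0.01 * (i % 50)) for i in range(1000)]
conn.executemany(
    "INSERT INTO pairwise (locus_a, locus_b, rec_freq) VALUES (?, ?, ?)", rows
)

# A view showing 25 rows, scrolled down to row 200, reads just those 25 rows.
visible = fetch_visible_rows(conn, first_row=200, row_count=25)
```

With this approach the cost of displaying a table depends on the size of the visible window rather than on the total number of records.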
Some calculations were enhanced to be able to run in parallel. Modern CPUs often have multiple computation cores. JoinMap v5 makes use of all available cores, so the more cores the computer has, the faster the computations: the speed scales linearly with the number of cores, except for a small amount of overhead. Converting an algorithm to a parallel approach is difficult and sometimes even impossible. In JoinMap v5 the following algorithms run in parallel: the calculation of (a) the locus similarities and (b) the individual similarities in population nodes, (c) the determination of the groupings in population nodes, (d) the computation of the recombination frequencies in group nodes. A parallel algorithm is only really useful if all the data are accessible in memory (i.e. RAM). In practice, this may be problematic for the determination of the groupings of very large datasets, because the amount of pairwise information scales quadratically with the number of markers. Therefore, in v5 these computations dynamically switch to regular serial (instead of parallel) computations using temporary files on the hard drive if it turns out that insufficient memory is available. It is good to realize that the speed of the hard drive will be the limiting factor in dealing with large datasets: the enormous amount of results of computations on all pairs of markers must be stored in huge database files.
- Identical loci
The production of a reliable high resolution linkage map does not only require many markers, it also requires a population of sufficient size containing the necessary segregation information. Unfortunately, the latter condition is not always met. In such instances, there will be a large amount of redundancy in the marker data, so that many markers will be identical. In most computations JoinMap v5 determines which markers have an identical segregation pattern and performs the requested computation only for the representative marker of each set of identical markers. Subsequently, it presents the same results for the identical markers next to their representative. Detecting and removing identical markers in the population nodes, as advised in previous versions of JoinMap, is therefore not necessary in v5.
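The principle of computing once per set of identical markers can be sketched as follows. This is only an illustration under simplifying assumptions: markers count as identical when their genotype strings match exactly (missing data are not treated specially), and the function names and the example statistic are hypothetical, not part of JoinMap.

```python
from collections import defaultdict

def group_identical(markers):
    """Map each distinct segregation pattern to the marker names sharing it."""
    groups = defaultdict(list)
    for name, genotypes in markers.items():
        groups[tuple(genotypes)].append(name)
    return groups

def compute_once_per_pattern(markers, statistic):
    """Evaluate `statistic` once per distinct pattern and share the result."""
    results = {}
    for pattern, names in group_identical(markers).items():
        value = statistic(pattern)  # computed for the representative only
        for name in names:
            results[name] = value   # identical markers get the same result
    return results

def heterozygote_fraction(pattern):
    """Example statistic: fraction of individuals scored 'h'."""
    return pattern.count("h") / len(pattern)

# Four markers scored on six individuals; M1, M2 and M4 are identical.
markers = {
    "M1": "aahbhb",
    "M2": "aahbhb",
    "M3": "ahhbha",
    "M4": "aahbhb",
}
groups = group_identical(markers)
results = compute_once_per_pattern(markers, heterozygote_fraction)
```

Here the statistic is evaluated twice instead of four times; with highly redundant marker data the saving grows accordingly.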
- Multipoint recombination frequency estimation
For all population types, except type CP, the time consuming Gibbs sampling (which is a so-called Monte Carlo Expectation-Maximization (EM) algorithm) to determine the multipoint recombination frequencies in the maximum likelihood mapping procedure is replaced by a much faster true EM algorithm. For population type CP this could not be achieved due to the complexity of the likelihood.
- Batch computations
Identical computations can often be done in batches in v5. For this, nodes of the same type in the navigation panel must be specially selected by right-clicking. When subsequently the calculations for a certain selected tabsheet are requested, the same calculations will be done for all specially selected nodes. This also applies to the calculation of maps with group nodes.
Any JoinMap project may grow to the point where the navigation tree contains very many nodes. At some point, certain nodes may be regarded as redundant, at least for the time being. JoinMap v5 offers the possibility to move entire tree branches to an archive node. There, the data remain available for viewing and even computations, while the more essential part of the navigation tree remains clearly arranged. Archived branches can be returned to the regular project tree if needed.
The Dataset node functionality has been renewed to accommodate thousands of loci. For instance, a set of marker data of 50,000 markers for 100 individuals copied from an MS-Excel spreadsheet can be pasted into a JoinMap Dataset tabsheet in less than half a minute. Within the Dataset tabsheet, copying, cutting and pasting can be done with the regular key combinations and with the corresponding toolbar buttons. Marking regions for copying and cutting is not done the standard MS-Windows way, but more easily by marking the top-left and the bottom-right cells by right-clicking or pressing the keyboard space bar: the first right-click on a cell sets both marks; any subsequent right-click replaces the second mark. The marked region will be highlighted. The marking can be cancelled by pressing the Esc key, by double-clicking anywhere in the grid, or by right-clicking the cell where you started the marking. If no region is marked, copying and cutting will act upon the currently selected cell. Checking the coding with the Dataset menu function Check for Coding Errors generates a report that is available under the blue i-button of the toolbar. If there are coding errors, it will also indicate the first and the last error detected.
- Project reconstruction
A major effort was made to make the program fault tolerant with respect to project files. In the hopefully very rare event that a project cannot be opened properly or appears corrupted, you can have JoinMap attempt to reconstruct the project database (Project.sqlite in the project directory). To do this, rename the original project database to e.g. Project.sqlite.bak and then open the project in JoinMap. The program will attempt to reconstruct the project database as well as possible. If the reconstructed project appears better than the one from the original project database, you may decide to continue with the reconstructed version; otherwise, close the project and replace the new project database with the .bak version in order to continue with the original version.
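The two file operations described above can also be scripted. The sketch below only renames files in the project directory and assumes the project is closed in JoinMap while it runs; the function names are hypothetical, and the reconstruction itself is still done by opening the project in JoinMap.

```python
from pathlib import Path

DB_NAME = "Project.sqlite"
BACKUP_NAME = "Project.sqlite.bak"

def set_aside_project_database(project_dir):
    """Rename Project.sqlite to Project.sqlite.bak so JoinMap rebuilds it."""
    project_dir = Path(project_dir)
    original = project_dir / DB_NAME
    backup = project_dir / BACKUP_NAME
    if not original.exists():
        raise FileNotFoundError(original)
    if backup.exists():
        # Refuse to overwrite an earlier backup of the original database.
        raise FileExistsError(backup)
    original.rename(backup)
    return backup

def restore_project_database(project_dir):
    """Replace the reconstructed database with the .bak version."""
    project_dir = Path(project_dir)
    original = project_dir / DB_NAME
    backup = project_dir / BACKUP_NAME
    if not backup.exists():
        raise FileNotFoundError(backup)
    backup.replace(original)  # overwrites the reconstructed database
    return original
```

Call `set_aside_project_database` before opening the project for reconstruction, and `restore_project_database` only if the reconstructed version turns out to be worse than the original.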
With large datasets, computations, administrative work and database actions may take a while. To give feedback to the user, messages are shown on the status bar and progress is often indicated with a growing progress bar. The program maintains a history of the last 25 messages that were shown on the status bar (as long as the program is active); this message history can be accessed by right-clicking on the status bar. The duration of some database actions cannot be predicted, so the standard progress bar growing to 100% cannot be used. To show that the program really is busy, the progress bar area will then show sequences of '>' symbols.
New feature introduced with version 4.1
The only, but very important, enhancement of version 4.1 of JoinMap is the ability to use the multipoint maximum likelihood mapping algorithm on populations of type CP, i.e. the full-sib family of an outbreeding species. The new method computes dense maps for CP populations very quickly. For instance, a linkage group of about 250 good quality SNP markers (a mix of <hkxhk>, <lmxll> and <nnxnp> segregation types) is estimated in about 8 minutes (on a regular PC). The method is described in a paper that was accepted for publication:
Projects of JoinMap version 4 can be opened and will automatically be converted into projects of version 4.1.
New features introduced with version 4
With this edition JoinMap is taking another big step in linkage analysis software! Many new features were added, some improving the user interface, others supplying more powerful methods, for instance:
- data management: copy and paste your marker data from MS-Excel into JoinMap; easily check for coding errors;
- new population types: advanced intermated families and advanced backcross families, of any given generation;
- more criteria to study the linkage group formation: linkage LOD, independence test P-value, recombination frequency;
- use existing maps (of multiple groups) or existing groupings to create the linkage groups of a new population; this is very handy when employing markers with known map positions in new populations, and also when expanding your map with an additional set of markers;
- use the so-called strongest cross link information to verify assignments of markers to groups;
- very fast computation of high density maps with the new mapping algorithm according to Jansen et al., 2001, TAG 102: 1113-1122, based on Monte Carlo Maximum Likelihood: the algorithm needs only a couple of minutes for a linkage group of 100 markers;
(the new mapping algorithm and the regression mapping algorithm of JoinMap 3.0 are available side by side)
(for the outbreeder full-sib family (CP) the new mapping algorithm is limited to pseudo-testcross analyses, i.e. a map for each of the two parental meioses separately)
- get an idea of plausible map positions of markers;
- graphical genotyping;
- bar and XY charts;
- print preview.
- an intuitive MS-Windows user interface, which adds a lot of practical functionality
- all analyses are based upon just a single input file in plain text format with a flexible layout
- also imports MAPMAKER raw data format (data types: f2 intercross, f2 backcross, ri self)
- experimental population types: BC1, F2, RIL, F1-derived and F2-derived DH, outbreeder full-sib family
- powerful determination of linkage groups
- automatic determination of linkage phases for outbreeder full-sib family
- several diagnostics, before and after the actual map calculation:
- test segregation distortion
- check similarity of loci
- check similarity of individuals
- calculate genotype probabilities conditional on map and flanking genotypes to discover double recombinations
- test heterogeneity of recombination estimates between different populations
- combine ('join') data derived from several sources into an integrated map
- map charts, with many adjustable features and exportable to MS-Word ® and MS-PowerPoint ®
- copying of results to clipboard for additional use in MS-Excel ®
- print or export results, e.g. export maps for use in MapQTL ®
- no limits to the number of loci, linkage groups, etcetera, apart from the physical memory (RAM) of the computer
- manual in Acrobat ® Reader PDF file format
- easy-to-use InstallShield ® installer
Limits
Because JoinMap v5 is 64-bit software and uses a database system, it is suitable for working with thousands of loci. The software does not have a built-in fixed limit on the maximum number of loci, with the exception of the Dataset tabsheet, which can accommodate a maximum of one million loci (although that number was not tested). The software was tested to work fine with a dataset of 50,000 loci. Please note, however, that the production of a reliable high resolution linkage map does require a population of sufficient size containing the necessary segregation information.
With this version of JoinMap, the speed of the hard disk drive will now become a limiting factor in dealing with large datasets. For instance, a dataset of 50,000 loci in an F2 population will produce a table of ~1.2 billion records of pairwise data that must be stored, resulting in a database file of over 30 GB in size, which will require some time to write.
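The order of magnitude of these figures can be checked with a short calculation. The sketch assumes one record per unordered pair of loci and takes 1 GB as 10^9 bytes; the resulting bytes-per-record figure is only a rough derived estimate, not a documented property of the database.

```python
def pair_count(n_loci):
    """Number of unordered pairs among n_loci loci: n * (n - 1) / 2."""
    return n_loci * (n_loci - 1) // 2

n_loci = 50_000
pairs = pair_count(n_loci)           # 1,249,975,000, i.e. over a billion records

file_size_bytes = 30e9               # "over 30 GB", taking 1 GB = 10**9 bytes
bytes_per_record = file_size_bytes / pairs   # roughly 24 bytes per pairwise record

# Doubling the number of loci roughly quadruples the number of pairs.
growth_factor = pair_count(2 * n_loci) / pairs
```

Because the number of pairs grows quadratically, a dataset of 100,000 loci would need roughly four times the storage of the 50,000-locus example.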
Impression
Get an impression of the software with the slide show:
JoinMap ® 4 slide show (size: ~0.9 MB).
Version information
JoinMap 5 is 64-bit software for the 64-bit MS-Windows platforms 7, 8, 8.1 and 10. Other MS-Windows platforms are not supported. Previous versions are no longer available.
The original version of JoinMap was published in 1993 in The Plant Journal by Piet Stam. Version 2.0 of JoinMap was presented at the Plant Genome III Conference, January 1995, San Diego, California, USA.
- Stam, 1993. Construction of integrated genetic linkage maps by means of a new computer package: JoinMap. The Plant Journal 3: 739-744.
- Stam, 1995. JoinMap 2.0 deals with all types of plant mapping populations. Plant Genome III Abstracts.