ConvertMS documentation

 

This software converts mass spectrometry files between mass-to-charge (m/z) versus intensity tables (e.g. Bruker export .DAT files) and the XML format used by the SELDI software (Ciphergen/Bio-Rad), in either direction. Here, '.DAT' refers to any file type that contains tabulated data (e.g. CSV).

 

First, select the file(s) you want to convert (tick the 'convert all files in folder' box to select all files). If you want to convert .DAT to XML, you also need to select the .XML template. After selecting the process, press the 'Go for it' button to start the conversion (or 'Quit' to end the program).

 

Your .DAT files need to be tabulated with 2 columns, one for m/z and the other for intensities. Headers will be stripped away if they are present, and only the first 2 columns will be used.

 

You can also strip away sample-specific data from the .XML file(s) by selecting 'Anonymise XML'.

 

The XML files to be converted have to be in the format used by the SELDI software.

 

The output files will be created in a new folder called 'DAT2XML', 'XML2DAT' or 'XML2XML' (automatically created by the software) located in the folder of the original file(s).
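
As an illustration of the folder layout described above, the following minimal Python sketch builds the output path for one converted file. The function name output_path and the output file extensions are assumptions made for this example; the text above only specifies the folder names and their location.

```python
from pathlib import Path

# Hypothetical mapping from conversion mode to output extension.
# The folder names come from the documentation; the extensions are assumed.
ASSUMED_SUFFIX = {"DAT2XML": ".xml", "XML2DAT": ".dat", "XML2XML": ".xml"}


def output_path(source_file, mode):
    """Return the path of the converted file inside the mode-named subfolder.

    The subfolder sits next to the original file, as described above.
    """
    if mode not in ASSUMED_SUFFIX:
        raise ValueError("mode must be 'DAT2XML', 'XML2DAT' or 'XML2XML'")
    source = Path(source_file)
    return source.parent / mode / (source.stem + ASSUMED_SUFFIX[mode])


print(output_path("spectra/run1.dat", "DAT2XML").as_posix())
```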

 

 

DAT to XML conversion

 

File settings:

 

1. Select the data density (number of data points) in your output file. This value is translated into a multiplier that is applied across all spectra to be converted. You can also type in a number (maximum is 999.999).

 

2. Specify whether you want to allow an m/z error tolerance (which is used in the linear regression analysis that estimates intensity values at any point in a given spectrum).

 

3. Tick the 'unordered m/z values' box if your table is not strictly in ascending m/z order (otherwise the software will not check the order and may therefore miss values).

 

4. Tick the 'first column is m/z, second is intensity' box (unticked means the first column is intensity and the second is m/z) and select the column separator used in your source file (separating column 1 from column 2).
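
The table-reading rules (headers stripped, only the first 2 columns used, selectable column order and separator) can be sketched in Python as follows. This is not the ConvertMS implementation: the function name, its parameters, and the choice to treat any non-numeric row as a header are assumptions. Sorting is shown as one plausible way to handle unordered m/z values; the documentation does not state how the software handles them internally.

```python
def parse_two_column_table(text, separator="\t", mz_first=True, sort_by_mz=False):
    """Return a list of (mz, intensity) pairs read from tabulated text."""
    points = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split(separator)]
        if len(fields) < 2:
            continue  # blank or single-column line: nothing to read
        try:
            # Only the first two columns are used; extra columns are ignored.
            first, second = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # non-numeric row, assumed here to be a header
        points.append((first, second) if mz_first else (second, first))
    if sort_by_mz:
        points.sort(key=lambda point: point[0])  # ascending m/z
    return points


sample = "m/z\tintensity\n1000.5\t12.0\n1000.0\t10.0\t99\n"
print(parse_two_column_table(sample, sort_by_mz=True))
```

With the sample above, the header row and the third column are dropped and the two data rows are returned in ascending m/z order.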

 

 Maths settings:

 

5. The lower section deals with the settings for the mathematics behind the conversion. UID is the unique identifier, which (as the name states) needs to be unique for each individual spectrum (otherwise the SELDI software won't be able to read the file properly). U, a, t(0) and b are used in the formula:

 

                  mz = U * ( a *( N * tofdeltaT - t(0) + tofT(0) ) ^2 + b )

 

6. The multiplier is applied to the intensity values (stretching the y-axis to allow for more accurate linear regression output values), and the offset is the m/z data point where the spectra start (i.e. the detector-blank value).
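
The formula in item 5 can be written as a small Python function, shown here as a sketch. The parameter names are chosen for this example. The documentation does not define N, tofdeltaT or tofT(0); the comments assume that N is the data point index, tofdeltaT the time step between data points and tofT(0) the time of the first data point. The numbers in the usage line are arbitrary and serve only to check the arithmetic.

```python
def mz_from_index(n, U, a, b, t0, tof_delta_t, tof_t0):
    """Evaluate mz = U * (a * (N * tofdeltaT - t(0) + tofT(0))^2 + b).

    n           -- N in the formula (assumed: data point index)
    tof_delta_t -- tofdeltaT (assumed: time step between data points)
    tof_t0      -- tofT(0) (assumed: time of the first data point)
    U, a, b, t0 -- the U, a, b and t(0) values from the Maths settings
    """
    t = n * tof_delta_t - t0 + tof_t0
    return U * (a * t ** 2 + b)


# Arbitrary illustrative values, not real calibration constants.
print(mz_from_index(4, U=1.0, a=2.0, b=3.0, t0=1.0, tof_delta_t=0.5, tof_t0=0.0))
```

Because m/z grows with the square of the time term, equally spaced data points in time are not equally spaced in m/z.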

 

 

XML to DAT conversion

 

First, select whether your output file(s) should have the same file name as the original file, or whether the file name should be taken from the XML data (the spectrum name). Then tick the 'transform' box to shift the data from the standard output (x-axis values based on the quadratic formula) to a linear output (with identical x-axis differences between each data point).

 

If you choose to transform the data, you can select whether the new data point spacing is based on the smallest m/z difference in the original spectrum, or whether you want to raise this data density by up to 10 times.
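
To illustrate the transform option, the Python sketch below resamples a spectrum onto an evenly spaced m/z grid whose step is the smallest m/z difference in the original spectrum, optionally divided by a density factor of up to 10. The function name and the use of straight-line interpolation between neighbouring points are assumptions; the documentation does not state how ConvertMS estimates the new intensity values.

```python
def resample_linear(mz, intensity, density=1):
    """Resample a spectrum with ascending m/z onto an evenly spaced grid.

    Returns a list of (mz, intensity) pairs.
    """
    if len(mz) < 2 or len(mz) != len(intensity):
        raise ValueError("need two equally long lists with at least 2 points")
    if not 1 <= density <= 10:
        raise ValueError("density factor must be between 1 and 10")
    # Grid step: smallest m/z difference, refined by the density factor.
    step = min(b - a for a, b in zip(mz, mz[1:])) / density
    if step <= 0:
        raise ValueError("m/z values must be strictly ascending")
    n_points = int((mz[-1] - mz[0]) / step + 1e-9) + 1
    resampled = []
    j = 0  # index of the left neighbour in the original spectrum
    for i in range(n_points):
        x = mz[0] + i * step
        while j < len(mz) - 2 and mz[j + 1] < x:
            j += 1
        x0, x1 = mz[j], mz[j + 1]
        y0, y1 = intensity[j], intensity[j + 1]
        # Straight-line interpolation between the two neighbouring points.
        resampled.append((x, y0 + (y1 - y0) * (x - x0) / (x1 - x0)))
    return resampled


print(resample_linear([1.0, 2.0, 4.0], [0.0, 10.0, 30.0]))
```

In this example the smallest m/z difference is 1.0, so the gap between 2.0 and 4.0 receives one additional, interpolated data point at 3.0.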

 

Additionally, you can include/exclude data before the detector-blank value, and you can select whether you want the intensity multiplier value (embedded within the XML file) applied to the intensity values.

 

 

Source code is available upon request by sending us an email.