

Generic instructions

These instructions don’t contain information specific to any GNU/Linux distribution. You might also look at the Ubuntu instructions.

You will need about 4GB of disk space to hold the HapMap data. The bulk of it, about 3.5GB, will reside in the /var directory.

You can check your current disk usage by typing:

df -h
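
Since most of the data ends up in /var, it can help to check that filesystem specifically. The following is a minimal sketch, not part of the original instructions: has_free_gb is a hypothetical helper name, and the 4GB threshold is simply the total quoted above, used as a conservative figure for /var.

```shell
# Succeeds when the filesystem holding directory $1 has at least $2 gigabytes free.
# has_free_gb is a hypothetical helper name, not part of any package.
has_free_gb() {
    # df -P forces the portable one-line-per-filesystem format and -k reports
    # 1024-byte blocks; the fourth column of the second line is the free space.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "${avail_kb:-0}" -ge $(( $2 * 1024 * 1024 )) ]
}

if has_free_gb /var 4; then
    echo "/var has at least 4GB free"
else
    echo "/var has less than 4GB free"
fi
```

The same helper can be pointed at any other directory, for example the one you download the archives into.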

Downloading the data

cd ~/Desktop
mkdir -p hapmap
cd hapmap
wget http://downloads.sourceforge.net/genedc/genepi-2007-06-20.sql.bz2
wget http://downloads.sourceforge.net/genedc/legend-2006-07.csv.bz2
bunzip2 legend-2006-07.csv.bz2
Note: don’t uncompress genepi-2007-06-20.sql.bz2. It is a large file, and you don’t need it uncompressed anyway.
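
A large download can be cut short without any obvious sign, so you may want to verify the compressed dump before loading it. The sketch below relies on the standard bzip2 -t option, which tests an archive without writing the uncompressed data to disk. check_archive is a hypothetical helper name introduced only for this example.

```shell
# check_archive is a hypothetical helper name.
# It succeeds when bzip2 can read the whole archive without errors.
check_archive() {
    # -t tests integrity only; no uncompressed file is created.
    bzip2 -t "$1" 2>/dev/null
}

if check_archive genepi-2007-06-20.sql.bz2; then
    echo "genepi-2007-06-20.sql.bz2 looks intact"
else
    echo "genepi-2007-06-20.sql.bz2 is missing or corrupt, download it again"
fi
```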

Preparing database with Hapmap.org data

Please note that these instructions assume you’re working on your local machine. If you want to set up an institution-wide PostgreSQL installation, ask your network administrator how to secure it.

  1. Install PostgreSQL (preferably 8.2 or later)
  2. Create a PostgreSQL user with privileges to create databases
  3. Create a database named genepi.
  4. Load the data into the genepi database. You can do that from the command line, without uncompressing genepi-2007-06-20.sql.bz2, with the following command:
bzcat genepi-2007-06-20.sql.bz2 | psql genepi
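
Steps 2 and 3 of the list have no commands attached. One possible way to carry them out is sketched here, using the standard PostgreSQL tools createuser and createdb. It assumes that sudo is available and that a system account named postgres is the database superuser, which is common but not universal. create_genepi_db is a hypothetical helper name.

```shell
# create_genepi_db is a hypothetical helper wrapping the standard PostgreSQL
# command-line tools createuser and createdb.
create_genepi_db() {
    # Step 2: a database role named after your login, allowed to create databases.
    # Assumption: the "postgres" system account is the database superuser.
    sudo -u postgres createuser --createdb "$(id -un)" || return 1
    # Step 3: the genepi database, created (and owned) by that role.
    createdb genepi
}

# Usage (run once, before loading the data):
#   create_genepi_db
```

If your installation uses a different superuser account, adjust the sudo -u argument accordingly.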

If you can’t load the data directly from the compressed file, uncompress genepi-2007-06-20.sql.bz2 first (it will expand to about 3.4GB!) and use the following command:

psql -f genepi-2007-06-20.sql genepi

You might see messages like the following; they are harmless.

ERROR: role "maciej" does not exist

The last command may take up to a few hours to complete, depending on your computer’s hard disk speed and processing power. It is best to leave the computer alone until it finishes. This is a one-time operation: the data will remain on the computer unless you decide to import a new release of HapMap data.

Converting your data to hapmixmap format

Install required libraries.

  1. C++ compiler (genedc is tested with GCC)
  2. Flex
  3. Bison
  4. Boost (filesystem and program options)
  5. Psycopg 1.x (Windows version)
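
Before compiling, a quick check for the command-line tools in this list can save a failed build. This is a minimal sketch: missing_tools is a hypothetical helper name, and g++ is assumed to be the command name of the GCC C++ compiler. Boost and Psycopg are libraries rather than commands, so this check does not cover them.

```shell
# missing_tools is a hypothetical helper: it prints every command name from
# its arguments that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools g++ flex bison)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "Not installed:" $missing
else
    echo "C++ compiler, Flex and Bison found"
fi
```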

Downloading and compiling genedc.

  1. Download genedc tar archive
  2. Compile it with the usual ./configure and make
tar xfz genedc-X.Y.Z.tar.gz
cd genedc-X.Y.Z
./configure
make
./genedc --help

To convert your genotypes into hapmixmap format, type:

./genedc -l ../legend-2006-07.csv -i your-file

It will take about 20 seconds.

Preparing training data

From the genedc-X.Y.Z directory:

src/hapmap/cgs-hapmap -g user_genotypes.txt -o training

It will take about 30 seconds to complete the first time. Subsequent runs will take about 5 seconds.

Last modified on Tue Feb 26 07:53:16 +0000 2008