/var/sites/help-site.com/auto/tmp/CPAN/9677/Bio-Affymetrix-0.5/lib/Bio/Affymetrix/CDF.pm


NAME

Bio::Affymetrix::CDF - parse Affymetrix CDF files


SYNOPSIS

use Bio::Affymetrix::CDF;

# Parse the CDF file

my $cdf=new Bio::Affymetrix::CDF({"probemode"=>0});

$cdf->parse_from_file("foo.cdf");

# Find some fun facts about this chip type

print $cdf->rows().",".$cdf->cols()."\n";

print $cdf->version()."\n";

# Print out all of the probeset names on this chip type

foreach my $i (keys %{$cdf->probesets}) { print $cdf->probesets->{$i}->name()."\n"; }


DESCRIPTION

The Affymetrix microarray system produces files in a variety of formats. If this means nothing to you, these modules are probably not for you :). This module parses CDF files. Use this module if you want to find out about the design of an Affymetrix GeneChip, or you need the object for another one of the modules in this package.

All of the Bio::Affymetrix modules parse a file entirely into memory, so you need enough memory to hold these objects. For some applications, parsing as a stream may be more appropriate; hopefully the source of these modules gives enough clues to make this an easy task. This module in particular takes a lot of memory (about 150 MB) if probe information is also stored. Memory usage is not too onerous (about 15 MB) if probe-level information is omitted. You can control this by setting probemode=>1 or probemode=>0 in the constructor.

You can also use these modules to write CDF files (using the write_to_filehandle method). See COMPATIBILITY for some important caveats.

HINTS

You fill the object with data using the parse_from_filehandle, parse_from_string or parse_from_file routines. You can then get/set various statistics using methods on the object.

The key method is probesets. This returns a reference to a hash of Bio::Affymetrix::CDF::Probeset objects. The keys of this hash are unit numbers, not probeset names, so if you are looking for a specific probeset you will have to search for it yourself. Each Bio::Affymetrix::CDF::Probeset object contains information about one probeset.
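Because the hash is keyed on unit number, looking up a probeset by name means walking the values. The following is a minimal sketch of one way to do that by building a name index once. It relies only on each value having a name() method, as used in the SYNOPSIS. The helper index_probesets_by_name and the My::StubProbeset package are hypothetical names invented for this illustration; the stub stands in for Bio::Affymetrix::CDF::Probeset so the sketch runs without a CDF file.

```perl
use strict;
use warnings;

# Build a name => probeset index from the hashref returned by probesets().
# Only the name() method of each value is used.
sub index_probesets_by_name {
    my ($probesets) = @_;
    my %by_name;
    foreach my $unit (keys %{$probesets}) {
        my $ps = $probesets->{$unit};
        $by_name{ $ps->name() } = $ps;
    }
    return \%by_name;
}

# Hypothetical stand-in for Bio::Affymetrix::CDF::Probeset (not part of
# the module); real code would pass $cdf->probesets() to the helper.
package My::StubProbeset {
    sub new  { my ($class, $name) = @_; return bless { name => $name }, $class; }
    sub name { return $_[0]->{name}; }
}

my $probesets = {
    1 => My::StubProbeset->new("probeset_a"),
    2 => My::StubProbeset->new("probeset_b"),
};
my $by_name = index_probesets_by_name($probesets);
print "found\n" if exists $by_name->{"probeset_b"};
```

Building the index costs one pass over the hash; if you only need a single lookup, a plain loop that stops at the first match would do just as well.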


NOTES

REFERENCE

These modules were written using the official Affymetrix documentation, which can be found at http://www.affymetrix.com/support/developer/AffxFileFormats.ZIP

COMPATIBILITY

This module can parse the CDF files used with the Affymetrix software MAS 5 and GCOS. These files have QC information in them (such as the information of the location of the QC probesets), which is not parsed.

This module can also write CDF files. The support is currently pretty limited. The software can only write MAS5 files (not XDA format files), and will only write files that have previously been parsed in; it cannot create CDF files from scratch. So if you have any way of making Affymetrix chips, you will just have to look elsewhere :). These limitations are caused by not parsing the QC information.


TODO

Parsing QC information? Rearrange probe information to make it more usable?


COPYRIGHT

Copyright (C) 2005 by Nick James, David J Craigon, NASC, The University of Nottingham

This module is free software. You can copy or redistribute it under the same terms as Perl itself.

Affymetrix is a registered trademark of Affymetrix Inc., Santa Clara, California, USA.


AUTHORS




Nick James (nick at arabidopsis.info)

David J Craigon (david at arabidopsis.info)

Nottingham Arabidopsis Stock Centre (http://arabidopsis.info), University of Nottingham.


METHODS

new


  Arg [1]    : hashref of parameters (optional)

  Example    : my $cdf=new Bio::Affymetrix::CDF({probemode=>1});

  Description: constructor for CDF object. Turn probemode on and off (default off) by supplying named parameters as a hash reference

  Returntype : new Bio::Affymetrix::CDF object

  Exceptions : none

  Caller     : general

original_format


  Arg [0]    : none

  Example    : my $format=$cdf->original_format()

  Description: Returns the format of the CDF file parsed. Currently MAS5 or XDA.

  Returntype : string

  Exceptions : none

  Caller     : general

name


  Arg [1]    : string $name (optional)

  Example    : my $name=$cdf->name()

  Description: Get/set the name of this chip type (e.g. ATH1-121501). Only supplied by MAS5 version files.

  Returntype : string

  Exceptions : none

  Caller     : general

sub name {
    my $self=shift;
    if (my $q=shift) {
        $self->{"NAME"}=$q;
    }
    return $self->{"NAME"};
}

resequencing_reference_sequence


  Arg [1]    : string $refseq (optional)

  Example    : my $refseq=$cdf->resequencing_reference_sequence()

  Description: Get/set the name of the resequencing reference sequence. Only available in GCOS format files.

  Returntype : string

  Exceptions : none

  Caller     : general

sub resequencing_reference_sequence {
    my $self=shift;
    # Note: this reads and writes the same "NAME" key as name()
    if (my $q=shift) {
        $self->{"NAME"}=$q;
    }
    return $self->{"NAME"};
}

rows


  Arg [1]    : integer $rows (optional)

  Example    : my $rows=$cdf->rows()

  Description: Get/set the number of rows in this chip

  Returntype : integer

  Exceptions : none

  Caller     : general

sub rows {
    my $self=shift;
    if (my $q=shift) {
        $self->{"ROWS"}=$q;
    }
    return $self->{"ROWS"};
}

cols


  Arg [1]    : integer $cols (optional)

  Example    : my $cols=$cdf->cols()

  Description: Get/set the number of columns in this chip

  Returntype : integer

  Exceptions : none

  Caller     : general

sub cols {
    my $self=shift;
    if (my $q=shift) {
        $self->{"COLS"}=$q;
    }
    return $self->{"COLS"};
}

probesets


  Arg [1]    : hashref $probesets (optional)

  Example    : my %probesets=%{$cdf->probesets()}

  Description: Get/set the probesets on the array

  Returntype : a reference to a hash of Bio::Affymetrix::CDF::Probeset objects (q.v.), keyed on unit number

  Exceptions : none

  Caller     : general

sub probesets {
    my $self=shift;
    if (my $q=shift) {
        $self->{"PROBESETS"}=$q;
    }
    return $self->{"PROBESETS"};
}

probe_grid


  Arg [1]    : arrayref $probelist (optional)

  Example    : my $probe=$cdf->probe_grid()->[500][500]; # Return probe at 500,500

  Description: Get/set the grid of probes making up this array. Only available if probemode is on. Returns a reference to a two-dimensional array of Bio::Affymetrix::CDF::Probe objects.

  Returntype : reference to two-dimensional array of Bio::Affymetrix::CDF::Probe objects

  Exceptions : croaks if the object is not in probemode

  Caller     : general

sub probe_grid {
    my $self=shift;
    if (!$self->{"_PROBEMODE"}) {
        croak "probe_grid is not available when not in probemode";
    }
    if (my $q=shift) {
        $self->{"PROBEGRID"}=$q;
    }
    return $self->{"PROBEGRID"};
}
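Since probe_grid croaks outside probemode, code that may receive either kind of object can trap the exception with eval. The sketch below is one possible way to do that. The helper probe_at and the My::StubGridCDF package are hypothetical names made up for this illustration; the stub only imitates the croak behaviour shown in the source above so the example runs without a CDF file. The index names $i and $j are deliberately neutral, because the documentation does not say which index is the row and which the column.

```perl
use strict;
use warnings;

# Return the entry at [$i][$j] of the probe grid, or undef when the grid
# is unavailable (probe_grid died) or the indices are out of range.
sub probe_at {
    my ($cdf, $i, $j) = @_;
    my $grid = eval { $cdf->probe_grid() };
    return undef unless ref $grid eq 'ARRAY';
    my $row = $grid->[$i];              # plain read, does not autovivify
    return undef unless ref $row eq 'ARRAY';
    return $row->[$j];
}

# Hypothetical stand-in for a CDF object (not part of the module).
package My::StubGridCDF {
    sub new {
        my ($class, %args) = @_;
        return bless { %args }, $class;
    }
    sub probe_grid {
        my ($self) = @_;
        die "probe_grid is not available when not in probemode\n"
            unless $self->{probemode};
        return $self->{grid};
    }
}

my $with = My::StubGridCDF->new(
    probemode => 1,
    grid      => [ [ "p00", "p01" ], [ "p10", "p11" ] ],
);
my $without = My::StubGridCDF->new(probemode => 0);

print probe_at($with, 1, 0), "\n";      # prints p10
my $probe = probe_at($without, 1, 0);
print defined $probe ? "probe\n" : "no grid\n";   # prints no grid
```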

# These are all named "original_" because they aren't calculated; they are what a parsed file claims

original_number_of_probes


  Arg [0]    : none

  Example    : my $number_of_probes=$cdf->original_number_of_probes()

  Description: Get the number of probesets on the array, as listed originally in the file. If you want a current count, a better way is my $q=scalar(keys %{$cdf->probesets()}); since probesets returns a hash reference. Should really be called original_number_of_probesets.

  Returntype : integer

  Exceptions : none

  Caller     : general

sub original_number_of_probes {
    my $self=shift;
    return $self->{"NUMBEROFUNITS"};
}

original_max_unit


  Arg [0]    : none

  Example    : my $max_unit=$cdf->original_max_unit()

  Description: Get the max unit number in the CDF file. Fairly useless. Only available in MAS5 files.

  Returntype : integer

  Exceptions : none

  Caller     : general

sub original_max_unit {
    my $self=shift;
    return $self->{"MAXUNIT"};
}

original_num_qc_units


  Arg [0]    : none

  Example    : my $num_qc_units=$cdf->original_num_qc_units()

  Description: Get the number of QC units in the CDF file. This is the only piece of QC information obtainable using this software.

  Returntype : integer

  Exceptions : none

  Caller     : general

sub original_num_qc_units {
    my $self=shift;
    return $self->{"NUMQCUNITS"};
}

original_file_name


  Arg [0]    :  none

  Example    :  my $cdf_file_name=$cdf->original_file_name();

  Description:  If this object was created using parse_from_file, the original filename. Otherwise undef.

  Returntype :  string

  Exceptions :  none

  Caller     :  general

parse_from_string


  Arg [1]    :  string

  Example    :  $cdf->parse_from_string($cdf_file_in_a_string);

  Description:  Parse a CDF file from a buffer in memory

  Returntype :  none

  Exceptions :  none

  Caller     :  general

parse_from_file


  Arg [1]    :  string

  Example    :  $cdf->parse_from_file($cdf_filename);

  Description:  Parse a CDF file from a file

  Returntype :  none

  Exceptions :  dies if cannot open file

  Caller     :  general

parse_from_filehandle


  Arg [1]    :  reference to filehandle

  Example    :  $cdf->parse_from_filehandle(\*STDIN);

  Description:  Parse a CDF file from a filehandle

  Returntype :  none

  Exceptions :  none

  Caller     :  general

write_to_file


  Arg [1]    :  string $filename

  Arg [2]    :  string $format

  Arg [3]    :  string $version

  Example    :  $cdf->write_to_file($cdf_filename);

  Description:  Writes a CDF file to a file. See write_to_filehandle for descriptions of format and version

  Returntype :  none

  Exceptions :  dies if cannot open file

  Caller     :  general

write_to_filehandle


  Arg [1]    :  filehandle $filehandle

  Arg [2]    :  string $format

  Arg [3]    :  string $version

  Example    :  $cdf->write_to_filehandle(\*STDOUT);

  Description:  Writes a CDF file to a filehandle. Takes arguments of

  the filehandle, the desired format, and the desired version of that

  format.

  Currently, format defaults to MAS5, and version defaults to

  GC3.0. These are the only formats the software is capable of

  producing currently. Also, this software cannot write files that

  were read in using the GCOS file format. The original CDF file must

  have been parsed in probe mode.

  Returntype :  none

  Exceptions :  dies if cannot open file

  Caller     :  general
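
The limits described above can be checked before writing. The following is a rough sketch, not the module's own logic: it refuses to write unless original_format() reports MAS5, then calls write_to_filehandle with the format and version left at their defaults. The helper write_cdf_checked and the My::StubWritableCDF package are hypothetical names invented here; the stub replaces a real parsed CDF object so the example runs on its own. The probemode requirement is not checked, because no public accessor for it is documented.

```perl
use strict;
use warnings;

# Write $cdf to $fh, refusing up front when the parsed file was not MAS5.
# Only original_format() and write_to_filehandle() are called on $cdf.
sub write_cdf_checked {
    my ($cdf, $fh) = @_;
    my $format = $cdf->original_format() // "unknown";
    die "cannot write a CDF that was parsed from $format format\n"
        unless $format eq "MAS5";
    $cdf->write_to_filehandle($fh);     # format and version use defaults
    return 1;
}

# Hypothetical stand-in for a parsed CDF object (not part of the module);
# its write_to_filehandle emits placeholder text, not a real CDF file.
package My::StubWritableCDF {
    sub new { my ($class, $format) = @_; return bless { format => $format }, $class; }
    sub original_format     { return $_[0]->{format}; }
    sub write_to_filehandle { my ($self, $fh) = @_; print {$fh} "stub output\n"; return; }
}

# In-memory filehandle, so nothing is written to disk in this sketch.
open(my $fh, ">", \my $buffer) or die "cannot open in-memory handle: $!";
write_cdf_checked(My::StubWritableCDF->new("MAS5"), $fh);
close($fh);
print $buffer;

my $ok = eval { write_cdf_checked(My::StubWritableCDF->new("XDA"), \*STDOUT); 1 };
print "refused: $@" unless $ok;
```

With a real object you would pass a handle opened on the destination file instead of the in-memory buffer.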