%
% NOTE -- ONLY EDIT THE .Rnw FILE!!! The .tex file is
% likely to be overwritten.
%
% \VignetteIndexEntry{Lab 1}
%\VignetteDepends{Biobase}
%\VignetteKeywords{Microarray}
\documentclass[12pt]{article}
\usepackage{amsmath,pstricks} \usepackage[authoryear,round]{natbib} \usepackage{hyperref}
\newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rpackage}[1]{{\textit{#1}}}
\textwidth=6.2in
\textheight=8.5in
%\parskip=.3cm
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-.3in
\newcommand{\scscst}{\scriptscriptstyle} \newcommand{\scst}{\scriptstyle}
\title{Lab 1: Bioconductor Basics} \bibliographystyle{plainnat}
\begin{document}
\maketitle
In this laboratory we will introduce some of the basic interactions with Bioconductor.
<<>>=
library(Biobase)
library(annotate)
library(golubEsets)
@
The package \texttt{golubEsets} contains three data sets that were obtained from the web and lightly processed. They represent the data analysed in \citet{Golub99} to perform class prediction using microarray data. The data were collected on the Affymetrix Hu6800 chip, which contains probes for 7129 genes.
An \texttt{exprSet} consists of three parts: the gene expression matrix (optionally with a matrix of standard errors for those estimates), the related experimental metadata (who did what, when, and to what), and the phenotypic data. Here phenotype is interpreted quite broadly: it covers any physical characteristic of the sample.
<<>>=
data(golubTrain)
golubTrain
golubTrain[,1:10]
golubTrain[1:100,]
@
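Notice that subsetting is arranged so that the \textit{rows} correspond to genes and the \textit{columns} correspond to samples. Thus \texttt{golubTrain[,1:10]} selects the first ten samples and \texttt{golubTrain[1:100,]} selects the first 100 genes.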
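The row and column convention can be illustrated without any Bioconductor package. The chunk below is only a sketch on a plain matrix; the object \Robject{toyExprs} and its gene and sample names are invented for illustration and are not part of \texttt{golubEsets}.

<<>>=
## Toy expression matrix: 5 hypothetical genes (rows)
## by 4 hypothetical samples (columns)
toyExprs <- matrix(seq_len(20), nrow = 5, ncol = 4,
                   dimnames = list(paste0("gene", 1:5),
                                   paste0("sample", 1:4)))
## First two samples, all genes
firstSamples <- toyExprs[, 1:2]
## First three genes, all samples
firstGenes <- toyExprs[1:3, ]
dim(firstSamples)
dim(firstGenes)
@

The first index always selects genes and the second selects samples, which is the same convention used above for \texttt{golubTrain}.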
The phenotypic data are stored in a separate, but linked, data frame. You can obtain it and interact with it using specific methods.
<<>>=
pD <- phenoData(golubTrain)
pD
pd <- pData(pD)
pd
@
An object of class \texttt{phenoData} combines a data frame containing the various data elements with a list that explains what each variable represents. This information is usually relegated to a help page, but we felt that it was important to keep it more closely associated with the data.
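The idea of pairing a data frame with a list of variable descriptions can be sketched in base R. This is only an illustration of the concept, not the \Rpackage{Biobase} implementation; the names \Robject{toyPheno} and \Robject{toyLabels}, the variable \texttt{Batch}, and all values are invented.

<<>>=
## Hypothetical sample annotations: one row per sample
toyPheno <- data.frame(ALL.AML = c("ALL", "ALL", "AML"),
                       Batch = c(1, 2, 1),
                       row.names = paste0("sample", 1:3))
## Hypothetical descriptions, one per variable, kept with the data
toyLabels <- list(ALL.AML = "leukemia type (made-up values)",
                  Batch = "processing batch (made-up values)")
## Every variable should have a matching description
stopifnot(identical(names(toyPheno), names(toyLabels)))
## Extract one variable and tabulate it
typeCounts <- table(toyPheno[["ALL.AML"]])
typeCounts
@

Keeping the descriptions in the same object means they travel with the data when it is saved, subsetted, or passed to another function.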
The \verb+$+ operator extracts particular variables from an object of class \texttt{phenoData}. It can also be applied directly to an \texttt{exprSet}.
<<>>=
table(pD$ALL.AML)
## a different data set
data(golubTest)
table(golubTest$ALL.AML)
@
%$
The S4 methods package has introduced substantial new capabilities into R. To obtain the manual page for an S4 class, use the syntax \texttt{class?exprSet}. Please do that now and we will look at the help page.
Almost all R functions have a set of runnable examples, shown at the bottom of the manual page. You can either scroll down to them and cut and paste them into your session, or use the R function \Rfunction{example} to run them. Try \texttt{example(exprSet)}.
To see which packages are currently attached in your R session you can use \Rfunction{search}. You can list the objects in any attached package with, for example, \texttt{objects("package:ts")}. This will list all the objects in the time series package \Rpackage{ts}. Another useful command is \Rfunction{find}, which tells you which package contains the definition of a function.
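As a brief sketch of these commands, the chunk below uses the \Rpackage{stats} package, which is attached by default in current versions of R. The choice of \Rpackage{stats} and of the function \Rfunction{median} is ours, purely for illustration.

<<>>=
## Attached packages and environments, in search order
attached <- search()
"package:stats" %in% attached
## Names of the objects in an attached package
statsObjects <- objects("package:stats")
"median" %in% statsObjects
## Which attached package defines a given function
medianHome <- find("median")
medianHome
@

Note that \Rfunction{objects} and \Rfunction{find} only look at packages that are already on the search path, so a package must be attached with \Rfunction{library} before it can be inspected this way.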
\end{document}