\name{nullp}
\Rdversion{1.1}
\alias{nullp}
%- Also NEED an '\alias' for EACH other topic documented here.
\title{
Probability Weighting Function
}
\description{
Calculates a Probability Weighting Function (PWF) for a set of genes, based on a given set of bias data (usually gene length) and each gene's status as differentially expressed (DE) or not.
}
\usage{
nullp(DEgenes, genome, id, bias.data=NULL,plot.fit=TRUE)
}
%- maybe also 'usage' for other objects documented here.
\arguments{
  \item{DEgenes}{
A named binary vector where 1 represents DE, 0 represents not DE, and the names are gene IDs.
}
  \item{genome}{
A string identifying the genome that the genes in \code{DEgenes} refer to. For a list of supported organisms run \code{\link{supportedGenomes}}.
}
  \item{id}{
A string identifying the gene identifier used in the names of \code{DEgenes}. For a list of supported gene IDs run \code{\link{supportedGeneIDs}}.
}
  \item{bias.data}{
A numeric vector containing the data on which the DE may depend. Usually this is the median transcript length of each gene in bp. If set to \code{NULL}, \code{nullp} will attempt to fetch the lengths using \code{\link{getlength}}.
}
  \item{plot.fit}{
Logical. If \code{TRUE}, the fitted PWF is plotted.
}
}
\details{
It is essential that the entire analysis pipeline, from summarizing raw reads through to using \code{goseq}, be done in just one gene identifier format. If your data is in a format that is not supported, you will need to obtain the gene lengths yourself and supply them to the \code{nullp} function using the \code{bias.data} argument. Converting to a supported format from another format should be avoided whenever possible, as this will almost always result in data loss.

\code{NA}s are allowed in the \code{bias.data} vector if you do not have information about a certain gene. Setting a gene to \code{NA} is preferable to removing it from the analysis.

If \code{bias.data} is left as \code{NULL}, \code{nullp} attempts to use \code{\link{getlength}} to fetch the length of each gene.
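
The following is a minimal sketch of preparing \code{bias.data} by hand, assuming the gene lengths come from your own annotation. The gene IDs, the lengths and the names \code{de_genes}, \code{known_lengths} and \code{bias_data} are all hypothetical. It assumes \code{bias.data} must be in the same order as \code{DEgenes}, and shows a gene with no known length becoming \code{NA} rather than being dropped. The call to \code{nullp} is left as a comment because it needs a realistically sized gene set.

\preformatted{
# Hypothetical data: DE status for five genes, named by gene ID.
de_genes <- c(geneA = 1, geneB = 0, geneC = 0, geneD = 1, geneE = 0)

# Hypothetical lengths (bp) from the user's own annotation; geneD is unknown.
known_lengths <- c(geneA = 2100, geneB = 950, geneC = 4300, geneE = 1200)

# Align the lengths to the order of de_genes. Indexing by a name that is
# absent from known_lengths yields NA, so geneD is kept with a missing length.
bias_data <- unname(known_lengths[names(de_genes)])

# One bias value (or NA) per gene, in the same order as de_genes.
stopifnot(length(bias_data) == length(de_genes))
print(bias_data)

# The vector would then be supplied through the bias.data argument, e.g.
# pwf <- nullp(de_genes, genome, id, bias.data = bias_data)
}
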

It is recommended that you review the fit produced by the \code{nullp} function before proceeding, by leaving \code{plot.fit} as \code{TRUE}.
}
\value{
A numeric vector containing the value of the probability weighting function for each gene. This is usually passed to the function \code{goseq} via the \code{pwf} argument.
}
\references{
Young, M. D., Wakefield, M. J., Smyth, G. K., Oshlack, A. (2010)
\emph{Gene ontology analysis for RNA-seq: accounting for selection bias}
Genome Biology, 11(2), R14.
}
\author{
Matthew D. Young \email{myoung@wehi.edu.au}
}
%\note{
%%  ~~further notes~~
%}

%% ~Make other sections like Warning with \section{Warning }{....} ~

\seealso{
\code{\link{supportedGenomes}}, \code{\link{supportedGeneIDs}}, \code{\link{goseq}}, \code{\link{getlength}}
}
\examples{
data(prostate)
pwf <- nullp(genes, 'hg19', 'ensGene')
}
% Add one or more standard keywords, see file 'KEYWORDS' in the
% R documentation directory.
%\keyword{ ~kwd1 }
%\keyword{ ~kwd2 }% __ONLY ONE__ keyword per line