TCGA-Assembler is an open-source, freely
available tool that automatically downloads, assembles, and
processes public data from The Cancer Genome Atlas (TCGA). It facilitates
downstream data analysis by relieving investigators of the burden
of data preparation. TCGA-Assembler includes two modules. Module A
acquires public TCGA data from the TCGA Data Coordinating Center and
assembles individual data files into locally stored data
tables. Module B performs various manipulations on the data tables to
prepare them for downstream analysis.
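To make the Module A idea of assembling individual data files into one data table concrete, here is a minimal sketch. It is written in Python purely for illustration and does not call TCGA-Assembler or reproduce its code. The function name assemble_table, the sample IDs, and the values are all hypothetical, and each "file" is assumed to have already been parsed into a feature-to-value mapping.

```python
def assemble_table(per_sample_files):
    """Merge per-sample {feature: value} mappings into one table.

    per_sample_files maps a sample ID to a dict of feature -> value,
    standing in for the parsed contents of one downloaded data file.
    Returns (header, rows). Each row holds one feature followed by its
    value in every sample, with None where a sample lacks the feature.
    """
    samples = sorted(per_sample_files)

    # Collect features in first-seen order so the row order is stable.
    features = []
    seen = set()
    for sample in samples:
        for feature in per_sample_files[sample]:
            if feature not in seen:
                seen.add(feature)
                features.append(feature)

    header = ["feature"] + samples
    rows = [
        [feature] + [per_sample_files[s].get(feature) for s in samples]
        for feature in features
    ]
    return header, rows


if __name__ == "__main__":
    # Toy input: hypothetical sample IDs and made-up values.
    toy_files = {
        "sample_A": {"TP53": 1.0, "EGFR": 3.5},
        "sample_B": {"TP53": 2.0},
    }
    header, rows = assemble_table(toy_files)
    print(header)
    for row in rows:
        print(row)
```

Missing measurements are kept as None instead of being dropped, so every row has the same number of columns and the table can be written out or filtered later.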
TCGA-Assembler can now retrieve and process all microarray gene expression data from more than 10 cancer types in TCGA. Please use the newly updated "DownloadRNASeqData()" and "ProcessRNASeqData()" functions to acquire and process microarray data. The user manual has also been updated to reflect this change.
TCGA-Assembler can now retrieve not only the clinical information of patients, but also the biospecimen information of patient samples. A new function, "DownloadBiospecimenData()", has been added to Module A for this purpose.
Fixed a user-reported bug in the "DownloadRPPAData()" function that could be triggered by additional columns in the RPPA antibody annotation file. Typical antibody annotation files include only three columns, but the annotation files of a few cancer types have more.
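As an illustration of the kind of problem described above, the following sketch reads an annotation table while tolerating extra columns. It is not the fix applied in "DownloadRPPAData()", which is part of an R package. It is a hypothetical Python example in which the function name read_annotation, the tab delimiter, and the choice to keep only the first three columns are all assumptions.

```python
import csv
import io


def read_annotation(text, n_columns=3, delimiter="\t"):
    """Parse delimited annotation text, keeping the first n_columns fields.

    Rows with extra trailing columns are truncated, not rejected.
    Rows with fewer than n_columns fields raise ValueError, because
    silently padding them could hide a malformed file.
    """
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for line_number, fields in enumerate(reader, start=1):
        if not fields:
            continue  # skip blank lines
        if len(fields) < n_columns:
            raise ValueError(
                f"line {line_number}: expected at least {n_columns} "
                f"columns, found {len(fields)}"
            )
        rows.append(fields[:n_columns])
    return rows


if __name__ == "__main__":
    # Placeholder content: the second row carries two extra columns.
    sample_text = "a\tb\tc\nd\te\tf\textra\tmore\n"
    for row in read_annotation(sample_text):
        print(row)
```

Truncating to a fixed column count keeps downstream code independent of how many optional columns a particular file happens to carry.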
Updated the directory traverse result file in the package. It is now DirectoryTraverseResult_Jul-08-2014 and includes the URLs of all TCGA data files, accurate as of July 8, 2014. With the updated file, you can easily download the most current TCGA data files. You can also update the file yourself using the "TraverseAllDirectories()" function.
Downloading version 1.0.3 requires a simple registration that collects basic information about TCGA-Assembler users.
TCGA just changed its naming rule
for clinical data! This affects the function
"DownloadClinicalData()" in our package! We are in
contact with the TCGA team and are working to release a
mini-update version soon.
TCGA-Assembler Acquires and Processes Large Amounts of TCGA Data.
Below is a summary of the public TCGA data that can be acquired and processed by TCGA-Assembler. Entries are the numbers of patient samples measured by different assay platforms and the numbers of patients with de-identified clinical information (accurate as of August 2013). The numbers will gradually increase as new data are still being produced.