HuBMAP Datasets — datasets • HuBMAPR

datasets() returns details of the available datasets, ordered by last modified date.

*_columns() returns a tibble or named character vector describing the content of the tibble returned by samples(), datasets(), donors(), collections(), or publications().

dataset_detail() takes a unique dataset_id and returns details about the specified dataset as a tibble.

dataset_derived() takes a unique dataset_id and returns the details of its derived (support) datasets, or NULL if the dataset has none. Support datasets are normally Image Pyramid datasets, with image files available to download via a Globus Collection. See files_globus_url() for download details.
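Because dataset_derived() can return either a tibble or NULL (both cases appear in the Examples), downstream code may want to handle the two uniformly. The following is a minimal sketch; count_derived() is a hypothetical helper, not part of HuBMAPR.

```r
# Hypothetical helper (not part of HuBMAPR): count derived datasets while
# tolerating the NULL returned when a dataset has none.
count_derived <- function(derived) {
  if (is.null(derived)) {
    return(0L)
  }
  nrow(derived)
}

# A NULL result is treated as "no derived datasets".
count_derived(NULL)

# In practice, assuming HuBMAPR is loaded and `uuid` holds a dataset UUID:
# count_derived(dataset_derived(uuid))
```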

dataset_metadata() takes a unique dataset_id and returns the metadata of the dataset.
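The Examples output shows the metadata as a two-column table of Key and Value. Assuming that layout, a single field can be looked up by key. This is a sketch only: metadata_value() is a hypothetical helper, and the data frame below is a stand-in built from rows shown in the Examples output rather than a live call.

```r
# Hypothetical helper (not part of HuBMAPR): look up one metadata value by key.
# Assumes the input has character columns named `Key` and `Value`.
metadata_value <- function(metadata, key) {
  metadata$Value[metadata$Key == key]
}

# Stand-in for the tibble returned by dataset_metadata(), using rows that
# appear in the Examples output.
metadata <- data.frame(
  Key = c("acquisition_instrument_model",
          "acquisition_instrument_vendor",
          "dataset_type"),
  Value = c("BZ-X710", "Keyence", "Histology"),
  stringsAsFactors = FALSE
)

metadata_value(metadata, "dataset_type")

# In practice, assuming HuBMAPR is loaded:
# metadata_value(dataset_metadata(uuid), "dataset_type")
```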

dataset_contributors() takes a unique dataset_id and returns the contributors of the dataset. For questions about this dataset, reach out to the individuals listed as contacts, either via the email address listed in the table or via the contact information on their ORCID profile page.
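To find whom to contact, the contributors table can be narrowed to rows flagged as contacts. The sketch below assumes the is_contact column holds the strings "Yes" and "No", as the Examples output suggests. contact_rows() is a hypothetical helper, and the data frame uses placeholder names rather than real contributors.

```r
# Hypothetical helper (not part of HuBMAPR): keep only contributors flagged
# as contacts. `%in%` treats a missing flag as "not a contact".
contact_rows <- function(contributors) {
  contributors[contributors$is_contact %in% "Yes", , drop = FALSE]
}

# Placeholder stand-in for the tibble returned by dataset_contributors().
contributors <- data.frame(
  display_name = c("Person A", "Person B", "Person C"),
  is_contact = c("Yes", "No", NA),
  stringsAsFactors = FALSE
)

contact_rows(contributors)$display_name

# In practice, assuming HuBMAPR is loaded:
# contact_rows(dataset_contributors(uuid))
```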

Usage

datasets()

datasets_default_columns(as = c("tibble", "character"))

dataset_detail(uuid)

dataset_derived(uuid)

dataset_metadata(uuid)

dataset_contributors(uuid)

Arguments

as: character(1) return format. One of "tibble" (default), or "character".
uuid: character(1) corresponding to the HuBMAP Dataset UUID string. This is expected to be a 32-digit hex number.
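A UUID can be checked against the expected 32-digit hex format before it is passed to any of the functions above. This is a minimal sketch: is_valid_uuid() is a hypothetical helper, not part of HuBMAPR, and accepting upper-case hex digits is an assumption.

```r
# Hypothetical helper (not part of HuBMAPR): TRUE when `uuid` is a single
# string of exactly 32 hexadecimal digits.
is_valid_uuid <- function(uuid) {
  is.character(uuid) &&
    length(uuid) == 1L &&
    !is.na(uuid) &&
    grepl("^[0-9a-fA-F]{32}$", uuid)
}

# UUID taken from the Examples section.
is_valid_uuid("7754aa5ebde628b5e92705e33e74a4ef")

# Too short, so the check fails.
is_valid_uuid("7754aa5e")
```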

Value

*_columns() returns a named list containing the column names used in the tibble returned by samples(), datasets(), donors(), collections(), or publications(). When as = "tibble", the return value is a tibble with paths as elements and abbreviations as names.
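The signature as = c("tibble", "character") follows a common R idiom in which match.arg() picks the first choice when the argument is omitted and rejects anything outside the listed choices. The sketch below illustrates that idiom only. format_columns() is a hypothetical function, whether HuBMAPR uses match.arg() internally is an assumption, and a base data frame stands in for a tibble to keep the example dependency-free.

```r
# Hypothetical function (not part of HuBMAPR) illustrating the `as` idiom.
format_columns <- function(columns, as = c("tibble", "character")) {
  # With `as` omitted, match.arg() selects the first choice, "tibble".
  # A value outside the listed choices raises an error.
  as <- match.arg(as)
  if (as == "tibble") {
    # A base data frame stands in for a tibble here.
    data.frame(columns = columns, stringsAsFactors = FALSE)
  } else {
    columns
  }
}

format_columns(c("uuid", "hubmap_id"))
format_columns(c("uuid", "hubmap_id"), as = "character")
```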

Details

Additional details are provided on the HuBMAP consortium webpage, https://software.docs.hubmapconsortium.org/apis

Examples

datasets()
#> # A tibble: 5,493 × 14
#>    uuid        hubmap_id dataset_type dataset_type_additio…¹ organ analyte_class
#>    <chr>       <chr>     <chr>        <chr>                  <chr> <chr>        
#>  1 6e5d5a3cbc… HBM986.Q… CyTOF        ""                     Blood Protein      
#>  2 63005d98f9… HBM938.W… CyTOF        ""                     Bone… Protein      
#>  3 e137385cfe… HBM444.R… CyTOF        ""                     Bone… Protein      
#>  4 c585d35f67… HBM696.B… CyTOF        ""                     Bone… Protein      
#>  5 9465cf8e0f… HBM422.W… CyTOF        ""                     Bone… Protein      
#>  6 335c871203… HBM576.V… CyTOF        ""                     Bone… Protein      
#>  7 630144bf74… HBM547.N… CyTOF        ""                     NA    Protein      
#>  8 a211b3570e… HBM843.Q… CyTOF        ""                     Blood Protein      
#>  9 607baa57ec… HBM628.F… CyTOF        ""                     Blood Protein      
#> 10 2825809b6a… HBM727.D… CyTOF        ""                     Blood Protein      
#> # ℹ 5,483 more rows
#> # ℹ abbreviated name: ¹dataset_type_additional_information
#> # ℹ 8 more variables: sample_category <chr>, status <chr>,
#> #   dataset_processing_category <chr>, pipeline <chr>, registered_by <chr>,
#> #   donor_hubmap_id <chr>, group_name <chr>, last_modified_timestamp <chr>
datasets_default_columns()
#> # A tibble: 14 × 1
#>    columns                            
#>    <chr>                              
#>  1 uuid                               
#>  2 hubmap_id                          
#>  3 group_name                         
#>  4 dataset_type_additional_information
#>  5 dataset_type                       
#>  6 organ                              
#>  7 analyte_class                      
#>  8 dataset_processing_category        
#>  9 sample_category                    
#> 10 registered_by                      
#> 11 status                             
#> 12 pipeline                           
#> 13 last_modified_timestamp            
#> 14 donor_hubmap_id                    

uuid <- "7754aa5ebde628b5e92705e33e74a4ef"
dataset_detail(uuid)
#> # A tibble: 1 × 31
#>   ancestor_ids ancestors  contains_human_genetic_sequen…¹ created_by_user_disp…²
#>   <list>       <list>     <lgl>                           <chr>                 
#> 1 <chr [4]>    <list [4]> FALSE                           HuBMAP Process        
#> # ℹ abbreviated names: ¹contains_human_genetic_sequences,
#> #   ²created_by_user_displayname
#> # ℹ 27 more variables: created_by_user_email <chr>, created_timestamp <dbl>,
#> #   creation_action <chr>, data_access_level <chr>, data_types <list>,
#> #   dataset_info <chr>, dataset_type <chr>, descendant_ids <list>,
#> #   descendants <list>, display_subtype <chr>, donor <list>, entity_type <chr>,
#> #   files <list>, group_name <chr>, group_uuid <chr>, hubmap_id <chr>, …
# no derived dataset
uuid <- "3acdb3ed962b2087fbe325514b098101"
dataset_derived(uuid)
#> NULL

# with derived dataset
uuid <- "2c77b1cdf33dbed3dbfb74e4b578300e"
dataset_derived(uuid)
#> # A tibble: 1 × 6
#>   uuid           hubmap_id data_types dataset_type status last_modified_timest…¹
#>   <chr>          <chr>     <chr>      <chr>        <chr>  <chr>                 
#> 1 9e7b040f23124… ""        ""         RNAseq [Sal… ""     NA                    
#> # ℹ abbreviated name: ¹last_modified_timestamp
uuid <- "564167adbbb2fdd64c24e7ea409c23f1"
dataset_metadata(uuid)
#> New names:
#> • `` -> `...1`
#> # A tibble: 26 × 2
#>    Key                              Value                    
#>    <chr>                            <chr>                    
#>  1 acquisition_instrument_model     "BZ-X710"                
#>  2 acquisition_instrument_vendor    "Keyence"                
#>  3 analyte_class                    "Nucleic acid + protein" 
#>  4 contributors_path                "extras/contributors.tsv"
#>  5 data_path                        "."                      
#>  6 dataset_type                     "Histology"              
#>  7 intended_tile_overlap_percentage ""                       
#>  8 is_batch_staining_done           "No"                     
#>  9 is_image_preprocessing_required  "No"                     
#> 10 is_staining_automated            "No"                     
#> # ℹ 16 more rows

uuid <- "564167adbbb2fdd64c24e7ea409c23f1"
dataset_contributors(uuid)
#> # A tibble: 3 × 11
#>   affiliation display_name    email            first_name is_contact is_operator
#>   <chr>       <chr>           <chr>            <chr>      <chr>      <chr>      
#> 1 Stanford    John Hickey     john.hickey@duk… John       Yes        Yes        
#> 2 Stanford    Chiara Caraccio caraccio@stanfo… Chiara     Yes        Yes        
#> 3 Stanford    Garry Nolan     gnolan@stanford… Garry      Yes        No         
#> # ℹ 5 more variables: is_principal_investigator <chr>, last_name <chr>,
#> #   metadata_schema_id <chr>, middle_name_or_initial <chr>, orcid <chr>