HuBMAPR

The HuBMAP data portal (https://portal.hubmapconsortium.org/) provides an open, global biomolecular atlas of the human body at the cellular level. The HuBMAPR package provides an alternative interface for exploring the data from R.

The HuBMAP Consortium offers several APIs. To meet its main objectives, the HuBMAPR package integrates three of them:

  • Search API: used to search for relevant data; its queries follow the Elasticsearch API.

  • Entity API: used in the bulk_data_transfer() function to retrieve Globus URLs.

  • Ontology API: used in the organ() function to provide the abbreviation and corresponding full name of each organ.

Each API serves a distinct purpose and has its own query capabilities. Using the httr2 and rjsoncons packages, HuBMAPR builds, modifies, and executes requests against these APIs and returns the responses as a tibble or a character vector. The HuBMAPR functions then tidy these outputs so that the final results mirror the information shown on the HuBMAP Data Portal as closely as possible.
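As a rough illustration of the request-building step described above, the sketch below assembles a Search API request with httr2 without sending it. The endpoint URL and the entity_type field name are assumptions that do not come from this README, and the code is not how HuBMAPR itself constructs its requests.

```r
# Query body in Elasticsearch query DSL: ask for a single Dataset record.
# Assumption: the search index exposes a field named `entity_type`.
query_body <- list(
  size  = 1,
  query = list(match = list(entity_type = "Dataset"))
)

# Assumed public endpoint of the HuBMAP Search API (not stated in this README)
search_url <- "https://search.api.hubmapconsortium.org/v3/search"

if (requireNamespace("httr2", quietly = TRUE)) {
  # Build a POST request carrying the query as a JSON body
  req <- httr2::request(search_url) |>
    httr2::req_body_json(query_body)

  # Inspect the request without sending it. Calling httr2::req_perform(req)
  # and then httr2::resp_body_json() would execute it and parse the reply.
  print(req)
}
```

Keeping the query as a plain R list makes it easy to inspect or modify before anything is sent over the network.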

HuBMAP data uses three different identifiers:

  • HuBMAP ID, e.g. HBM399.VCTL.353

  • Universally Unique Identifier (UUID), e.g. 7036a70229eff1a51af965454dddbe7d

  • Digital Object Identifier (DOI), e.g. 10.35079/HBM399.VCTL.353

The HuBMAPR package reports both the UUID, a 32-digit hexadecimal number, and the more human-readable HuBMAP ID in the results it retrieves. For precision and for compatibility with software implementations and data storage, the UUID serves as the primary identifier for retrieving data across functions, and each UUID maps uniquely to its corresponding HuBMAP ID.
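The difference between the identifier forms can be checked with a few lines of base R. This is a minimal sketch using the example identifiers above; the helper names is_uuid() and is_hubmap_id() are hypothetical, and the HuBMAP ID pattern is inferred from the single example shown here rather than from an official specification.

```r
# Example identifiers taken from the list above
uuid      <- "7036a70229eff1a51af965454dddbe7d"
hubmap_id <- "HBM399.VCTL.353"
doi       <- "10.35079/HBM399.VCTL.353"

# A UUID as used here: exactly 32 hexadecimal characters, no hyphens
is_uuid <- function(x) grepl("^[0-9a-f]{32}$", x)

# Assumed HuBMAP ID shape, inferred only from the example:
# "HBM", three digits, four capital letters, three digits
is_hubmap_id <- function(x) {
  grepl("^HBM[0-9]{3}\\.[A-Z]{4}\\.[0-9]{3}$", x)
}

is_uuid(uuid)            # TRUE
is_hubmap_id(hubmap_id)  # TRUE
is_uuid(hubmap_id)       # FALSE

# In this example the DOI is a prefix followed by the HuBMAP ID
sub("^10\\.35079/", "", doi) == hubmap_id  # TRUE
```

A check like this can catch a HuBMAP ID passed by mistake to a function that expects a UUID.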

Function names follow a systematic nomenclature: the entity category is the prefix, followed by a concise description of the functionality. Most functions are grouped by entity category, which makes it easier to select the right function for retrieving information about a given UUID from a specific entity category. The structure of these functions is largely consistent across entity categories, with some exceptions for collection and publication.
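The naming scheme can be made concrete by splitting the single-record function names listed in the Use section into prefix and functionality. The sketch below works on the names as plain strings in base R and does not call HuBMAPR.

```r
# Single-record function names as listed in the Use section
fns <- c(
  "dataset_detail", "dataset_derived", "dataset_metadata",
  "dataset_contributors",
  "sample_detail", "sample_derived", "sample_metadata",
  "donor_detail", "donor_derived", "donor_metadata",
  "collection_detail", "collection_data", "collection_contributors",
  "collection_contacts", "collection_information",
  "publication_detail", "publication_data", "publication_authors",
  "publication_information"
)

# Entity category is everything before the first underscore
prefix <- sub("_.*$", "", fns)

# Functionality is everything after the first underscore
functionality <- sub("^[^_]+_", "", fns)

# Group functionalities by entity category
by_entity <- split(functionality, prefix)
by_entity

# Functionality shared by every entity category
Reduce(intersect, by_entity)  # "detail"
```

The grouping shows the exceptions mentioned above: collection and publication offer data and information where dataset, sample, and donor offer derived and metadata.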

Installation

HuBMAPR is an R package available in Bioconductor version ≥ 3.20 and requires R version ≥ 4.4.0. You can install HuBMAPR from Bioconductor by running the following commands in an R session:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("HuBMAPR")

## Check Bioconductor installation
BiocManager::valid()

Alternatively, you can install the development version from GitHub:

## Requires the remotes package
BiocManager::install("christinehou11/HuBMAPR")
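To confirm that a session meets the version requirements stated at the top of this section, a quick check along the following lines can help. It is a sketch only and assumes BiocManager may or may not be installed yet.

```r
# Minimum R version stated for HuBMAPR
r_ok <- getRversion() >= "4.4.0"

# BiocManager::version() reports the Bioconductor version in use;
# the check is FALSE when BiocManager is not installed.
bioc_ok <- requireNamespace("BiocManager", quietly = TRUE) &&
  BiocManager::version() >= "3.20"

c(R = r_ok, Bioconductor = bioc_ok)
```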

Use

Entity Category:

  • Dataset

  • Sample

  • Donor

  • Collection

  • Publication

Available records for [Entity Category]:

  • datasets()

  • samples()

  • donors()

  • collections()

  • publications()

The default columns from [Entity Category]():

  • datasets_default_columns(as = c("tibble", "character"))

  • samples_default_columns(as = c("tibble", "character"))

  • donors_default_columns(as = c("tibble", "character"))

  • collections_default_columns(as = c("tibble", "character"))

  • publications_default_columns(as = c("tibble", "character"))

Single Record Information for [Entity Category] record:

  • Dataset

    • dataset_detail(dataset_uuid)

    • dataset_derived(dataset_uuid)

    • dataset_metadata(dataset_uuid)

    • dataset_contributors(dataset_uuid)

  • Sample

    • sample_detail(sample_uuid)

    • sample_derived(sample_uuid, entity_type = c("Dataset", "Sample"))

    • sample_metadata(sample_uuid)

  • Donor

    • donor_detail(donor_uuid)

    • donor_derived(donor_uuid, entity_type = c("Dataset", "Sample"))

    • donor_metadata(donor_uuid)

  • Collection

    • collection_detail(collection_uuid)

    • collection_data(collection_uuid)

    • collection_contributors(collection_uuid)

    • collection_contacts(collection_uuid)

    • collection_information(collection_uuid)

  • Publication

    • publication_detail(publication_uuid)

    • publication_data(publication_uuid)

    • publication_authors(publication_uuid)

    • publication_information(publication_uuid)

Provenance of a dataset/sample/donor:

  • uuid_provenance(dataset/sample/donor uuid)

Additional information about organ abbreviation and its full name:

  • organ()

Retrieve data files from a single dataset record:

  • bulk_data_transfer(dataset_uuid)
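The functions listed above can be chained in a short session. The sketch below assumes HuBMAPR is installed and the HuBMAP APIs are reachable; the column name uuid in the tibble returned by datasets() is an assumption based on the identifier discussion earlier, so check the actual column names before relying on it.

```r
# Runs only when HuBMAPR is installed; requires network access
if (requireNamespace("HuBMAPR", quietly = TRUE)) {
  library(HuBMAPR)

  # All available dataset records
  datasets_df <- datasets()

  # Names of the default columns, as a character vector
  print(datasets_default_columns(as = "character"))

  # Assumption: the UUID is stored in a column named `uuid`
  dataset_uuid <- datasets_df$uuid[1]

  # Single-record information and provenance for that dataset
  print(dataset_detail(dataset_uuid))
  print(uuid_provenance(dataset_uuid))

  # Organ abbreviations and their full names
  print(organ())
}
```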

View the article Explore Human BioMolecular Atlas Program Data Portal for detailed examples.