browseMetadata

What is the browseMetadata package?
- Browse metadata
- Map metadata
Getting started with browseMetadata
- Installation and set-up
- Demo (using the R Studio IDE)
  - browseMetadata()
  - mapMetadata()
Using a custom metadata input
Using a custom domain list input
Using a custom lookup table input
Tips and future steps
License
Citation
Contributing
Acknowledgements

What is the `browseMetadata` package?

The browseMetadata package allows researchers to explore publicly available metadata from the Health Data Research Gateway and the connected Metadata Catalogue. This tool helps researchers plan projects by interacting with metadata prior to gaining full access to health datasets. Learn more about health metadata here.

At the early stages of a project, researchers can use this tool to browse datasets and categorise variables.

Browse metadata

What datasets are available? Which datasets fit my research?

The tool summarises datasets and their tables, and displays how many variables within each table have descriptions.

Map metadata

Which variables align with my research domains?
(e.g. socioeconomic, childhood adverse events, diagnoses, culture and community)

After browsing, users can categorise each variable into predefined research domains. To speed up this manual process, the function automatically categorises frequently used variables (e.g. ID, Sex, Age). The function also accounts for variables that appear across multiple tables and allows users to copy their categorisations to ensure consistency. The output files can be used in later analyses to filter and visualise variables by category.

Getting started with `browseMetadata`

Installation and set-up

Run in the R console:

install.packages("devtools")
devtools::install_github("aim-rsf/browseMetadata")

Load the library:

library(browseMetadata)

Set your working directory to an empty folder:

setwd("/Users/your-username/test-browseMetadata")
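If the folder does not exist yet, it can be created from R first. The following is a minimal sketch using only base R; the folder location is a placeholder and `make_empty_workdir()` is a hypothetical helper, not part of browseMetadata.

```r
# Hypothetical helper (not part of browseMetadata): create a folder if needed
# and confirm it is empty before it is used as the working directory.
make_empty_workdir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  # all.files = TRUE with no.. = TRUE also catches hidden files
  contents <- list.files(path, all.files = TRUE, no.. = TRUE)
  if (length(contents) > 0) {
    stop("Folder is not empty: ", path)
  }
  normalizePath(path)
}

# Placeholder location; replace with your own path.
work_dir <- make_empty_workdir(file.path(tempdir(), "test-browseMetadata"))
setwd(work_dir)
```

Starting from an empty folder makes it easier to tell which files were produced by the package.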

Demo (using the `R Studio` IDE)

For a longer, more detailed demo, see the Getting Started page on the package website.

There are four main functions you can interact with: browseMetadata(), mapMetadata(), mapMetadata_compare_outputs(), and mapMetadata_convert_outputs(). For more information on any function, type ?function_name. For example: ?browseMetadata.

`browseMetadata()`

This function is easy to run and doesn't require user interaction. Run it in demo mode using the demo JSON file located in the inst/inputs directory:

browseMetadata()

Upon success, you should see:

ℹ Three outputs have been saved to your output directory.
ℹ Open the two HTML files in your browser for full-screen viewing.

The output files are saved to your working directory. You can change the save location by adjusting the output_dir argument. Examples of outputs are available in inst/outputs.
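As a rough illustration of the `output_dir` argument mentioned above, the sketch below sends the demo outputs to a separate folder. The folder name is a placeholder, and the call is guarded so the snippet still runs if the package is not installed.

```r
# Placeholder folder; browseMetadata() is expected to save its outputs here
# when output_dir is supplied.
out_dir <- file.path(tempdir(), "browse-outputs")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Only call the package if it is installed, so the sketch runs either way.
if (requireNamespace("browseMetadata", quietly = TRUE)) {
  browseMetadata::browseMetadata(output_dir = out_dir)
  # List whatever files were written to the chosen folder.
  print(list.files(out_dir))
}
```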

`mapMetadata()`

Use the outputs from browseMetadata() as a reference when running mapMetadata().

To run the mapping function in demo mode, use:

mapMetadata()

In demo mode, the function processes only the first 20 variables from selected tables. Follow the on-screen instructions and categorise variables into research domains, using the Plot tab as your reference. The demo uses simplified domains for ease of use; in a real scenario, you can define more specific domains.

Upon completion, your categorisations, session log, and a summary plot will be saved in your output directory.

Using a custom metadata input (recommended)

You can run mapMetadata() and browseMetadata() using a custom JSON file instead of the demo input:

new_json_file <- "path/your_new_json.json"
demo_domains_file <- system.file("inputs/domain_list_demo.csv", package = "browseMetadata")

browseMetadata(json_file = new_json_file)
mapMetadata(json_file = new_json_file, domain_file = demo_domains_file)

Currently, the recommended way to retrieve these metadata JSON files is to download them from the Metadata Catalogue. Navigate to the Data Model page of interest and use the drop-down button to select the JSON format to download.
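Before passing a downloaded file as `json_file`, it can help to run a few basic checks. The sketch below is one possible approach, not part of the package: `check_json_path()` is a hypothetical helper, the example file contents are made up, and the syntax check only runs if the `jsonlite` package happens to be installed.

```r
# Hypothetical helper (not part of browseMetadata): basic sanity checks on a
# downloaded metadata file before passing it as json_file.
check_json_path <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  if (tolower(tools::file_ext(path)) != "json") {
    stop("Expected a .json file: ", path)
  }
  # Optional syntax check, only if jsonlite happens to be installed.
  if (requireNamespace("jsonlite", quietly = TRUE)) {
    txt <- paste(readLines(path, warn = FALSE), collapse = "\n")
    if (!jsonlite::validate(txt)) {
      stop("File is not valid JSON: ", path)
    }
  }
  invisible(path)
}

# Stand-in file for illustration; a real download will have a richer structure.
example_json <- file.path(tempdir(), "your_new_json.json")
writeLines('{"label": "Example"}', example_json)
check_json_path(example_json)
```

These checks only confirm that the file exists and parses as JSON; they do not confirm that it has the structure the package expects.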

Using a custom domain list input (recommended)

You can replace the default demo domains with research-specific domains. Codes 0, 1, 2 and 3 are automatically added to the start of any domain list you supply, so do not include them in your own file.
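The sketch below previews how a custom list might be numbered and writes it to a file. Two things here are assumptions rather than documented behaviour: that custom domains are numbered from 4 onwards (because codes 0 to 3 come first), and that the file holds one domain per row with no header. Compare with the demo file (`domain_list_demo.csv`) before relying on this layout.

```r
# Illustrative research domains (taken from the examples earlier in this README).
my_domains <- c("Socioeconomic", "Childhood adverse events",
                "Diagnoses", "Culture and community")

# Assumption: because codes 0-3 are added to the start automatically,
# user-supplied domains are numbered from 4 onwards.
n_reserved <- 4
preview <- data.frame(
  code = seq_along(my_domains) + n_reserved - 1,
  domain = my_domains
)
print(preview)

# Assumption: one domain per row, no header. Check the demo file's layout
# before using this file as domain_file.
domain_file <- file.path(tempdir(), "my_domain_list.csv")
writeLines(my_domains, domain_file)
```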

Using a custom lookup table input (advanced)

The lookup table governs the automatic categorisations. If you modify the default lookup file, ensure that all domain codes in the lookup file are also included in your domain file for valid outputs.
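The consistency requirement above can be checked with a set difference. This is a minimal sketch on made-up data: `missing_lookup_codes()` is a hypothetical helper, and the column names and code values are assumptions that should be checked against the default lookup file.

```r
# Hypothetical helper (not part of browseMetadata): list lookup codes that are
# missing from the set of codes available in the domain list.
missing_lookup_codes <- function(lookup_codes, domain_codes) {
  setdiff(unique(lookup_codes), domain_codes)
}

# Made-up example: a lookup table mapping variable names to domain codes.
# The column names and values are assumptions for illustration only.
lookup <- data.frame(
  variable = c("ID", "Sex", "Age", "Income"),
  domain_code = c(2, 3, 3, 9)
)

# Assumed set of valid codes: the four automatic codes (0-3) plus four
# custom domains numbered 4-7.
domain_codes <- 0:7

# Code 9 has no matching domain, so it is reported as missing.
missing_lookup_codes(lookup$domain_code, domain_codes)
```

An empty result would mean every code in the lookup table has a matching entry in the domain list.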

Tips and future steps

You can process a subset of variables in one session and complete the rest later.
If you're processing multiple tables, save all outputs in the same directory to enable table copying. This feature will speed up categorisation and ensure consistency.
You can compare categorisations across researchers using the mapMetadata_compare_outputs() function.
Use the output file from the mapMetadata() function as input for subsequent analysis to filter and visualise variables by research domain.
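As a rough idea of what the follow-on analysis in the last tip could look like, the sketch below filters and counts variables by domain using base R. The data frame is a made-up stand-in for a `mapMetadata()` output file; its column names, table names, and domain codes are all assumptions, so check them against your own output.

```r
# Made-up stand-in for a mapMetadata() output file. The column names and the
# domain codes here are assumptions; check them against your own output.
mapped <- data.frame(
  table = c("GP_EVENTS", "GP_EVENTS", "HOSPITAL", "HOSPITAL"),
  variable = c("event_date", "diagnosis_code", "admit_date", "postcode_area"),
  domain_code = c(5, 6, 5, 4),
  stringsAsFactors = FALSE
)

# Keep only variables assigned to one domain of interest.
domain_of_interest <- 5
selected <- mapped[mapped$domain_code == domain_of_interest, ]
print(selected)

# Count variables per domain code.
counts <- table(mapped$domain_code)
print(counts)
```

With a real output file, the data frame would come from reading the saved file (for example with `read.csv()`), and `counts` could be passed to `barplot()` for a quick visual summary.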

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
For more information, refer to the GNU General Public License.

Citation

To cite browseMetadata in publications:

Stickland R (2024). browseMetadata: Browse and categorise metadata for datasets. R package version 1.2.2.

A BibTeX entry for LaTeX users:

  @Manual{,
    title = {browseMetadata: Browse and categorise health metadata},
    author = {Rachael Stickland},
    year = {2024},
    note = {R package version 1.2.2},
    doi = {10.5281/zenodo.10581499},
  }

Contributing

We welcome contributions to browseMetadata. Please read our Contribution Guidelines for details on how to contribute.

Report Issues: Found a bug? Have a feature request? Report it on GitHub Issues.
Submit Pull Requests: Follow our Contribution Guidelines for pull requests.
Feedback: Share your thoughts by opening an issue.

Contributors ✨

Thanks go to these wonderful people (emoji key):

Rachael Stickland
🖋 📖 🚧 🤔 📆 👀

Batool Almarzouq
📓 👀 🤔 📆

Mahwish Mohammad
📓 👀 🤔

Daniel Delbarre
🤔 📓

NidaZiaS
🤔

This project follows the all-contributors specification. Contributions of any kind are welcome!

Acknowledgements ✨

Thanks to the MELD-B research project, the SAIL Databank team, and the Health Data Research Innovation Gateway for ideas, feedback, and hosting open metadata.

This project is funded by the NIHR Artificial Intelligence for Multiple Long-Term Conditions (AIM) programme (NIHR202647). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.
