# Visualizations
## Technologies and Tools
The code in the repository uses the following technologies and tools:
- Programming Language: R
- Frameworks/Libraries: tidyverse (dplyr, tidyr, ggplot2), janitor, kableExtra, DiagrammeR, DiagrammeRsvg, rsvg, base64enc
- Tools: RStudio, Git, GitHub, GitHub Actions, Docker, act, actionlint, REDCap (implied)
## Functionality
The VisualizationLibrary-main repository is an R package designed to create standardized data visualizations for METRC studies. It focuses on generating tables and figures commonly used in study reports, ensuring consistency across different REDCap projects.
Significant Projects/Components:
- Standard Tables: Functions to generate tables for enrollment status, baseline characteristics, injury characteristics, follow-up visit status, adverse events, protocol deviations, and more.
- Standard Figures: Functions to create CONSORT diagrams, cumulative enrollment plots, and other visualizations relevant to study progress and outcomes.
- Data-Matrix Driven Approach: Utilizes standardized variable names and interfaces ("constructs") defined from a data matrix, enabling consistency and cross-study analysis.
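The data-matrix idea can be illustrated with a minimal sketch. It assumes a construct map is simply a named vector pairing standardized names with one project's own column names; the column names, the `construct_map_a` object, and the `to_constructs()` helper are all hypothetical and are not taken from the package.

```r
library(dplyr)

# Hypothetical raw export from one REDCap project; column names are invented.
raw_project_a <- data.frame(
  record_id  = c("A-001", "A-002", "A-003"),
  site       = c("Site 1", "Site 1", "Site 2"),
  consent_yn = c(1, 0, 1)
)

# Hypothetical construct map: names are the standardized constructs,
# values are this project's own column names.
construct_map_a <- c(
  study_id     = "record_id",
  facilitycode = "site",
  enrolled     = "consent_yn"
)

# Rename project-specific columns to the standardized constructs so that
# downstream table and figure functions can rely on a single interface.
to_constructs <- function(raw, construct_map) {
  raw %>% rename(all_of(construct_map))
}

analytic <- to_constructs(raw_project_a, construct_map_a) %>%
  mutate(enrolled = as.logical(enrolled))  # 1/0 flag becomes TRUE/FALSE
```

With a second project, only the map would change; any function written against `study_id`, `facilitycode`, and `enrolled` could then be reused unchanged.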
## Relevant Skills
The code demonstrates several advanced skills relevant to a software developer's resume:
- R Programming Expertise: The developer exhibits a strong understanding of R syntax, data manipulation techniques (using tidyverse), and visualization libraries (ggplot2, kableExtra).
- Data Visualization: The code uses ggplot2, kableExtra, and DiagrammeR to produce report-ready tables and figures.
- Package Development: The developer demonstrates the ability to create and structure an R package, including documentation, dependencies, and proper use of namespaces.
- Version Control and CI/CD: The repository utilizes Git and GitHub for version control and employs GitHub Actions for continuous integration and deployment, showcasing familiarity with modern development workflows.
- Problem-Solving Skills: The code reduces complex study data to the summaries a report needs and presents them clearly and concisely.
Examples:
- Data Manipulation: The `closed_baseline_characteristics_percent` function uses dplyr verbs like `filter`, `select`, `group_by`, and `summarize` to efficiently manipulate and aggregate data for creating percentage tables.
- Data Visualization: The `dsmb_consort_diagram` function uses `grViz` from the DiagrammeR package to generate a CONSORT diagram, showcasing the ability to create complex visualizations programmatically.
- Package Development: The use of roxygen2 comments to generate documentation and the structured organization of the package demonstrate understanding of R package development best practices.
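As a rough illustration of the percentage-table pattern described above, the following sketch chains the same dplyr verbs. It is not the body of `closed_baseline_characteristics_percent`; the data, the `treatment_arm` and `sex` columns, and the rounding are assumptions made for the example.

```r
library(dplyr)

# Hypothetical analytic data; column names and values are illustrative only.
analytic <- data.frame(
  enrolled      = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  treatment_arm = c("A", "A", "B", "B", "A"),
  sex           = c("Female", "Male", "Female", "Female", "Male")
)

percent_table <- analytic %>%
  filter(enrolled) %>%                          # keep enrolled participants only
  select(treatment_arm, sex) %>%                # keep the columns being tabulated
  group_by(treatment_arm, sex) %>%
  summarize(n = n(), .groups = "drop") %>%      # count per arm and category
  group_by(treatment_arm) %>%
  mutate(percent = round(100 * n / sum(n), 1)) %>%  # share within each arm
  ungroup()
```

The resulting data frame has one row per arm and category, with percentages that sum to 100 within each arm; it could then be passed to a table formatter such as `kableExtra::kbl()`.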
## Example Code
Here are some code snippets illustrating the use of technologies and skills:
```R
library(dplyr)
library(ggplot2)

# Example of data manipulation with dplyr
# (`df` is assumed to have a logical `enrolled` column and an `injury_type` column)
df_final <- df %>%
  filter(enrolled) %>%
  group_by(injury_type) %>%
  summarize(Total = n())

# Example of visualization with ggplot2
# (here `df` is assumed to have one row per site, with `facilitycode` and `EnrolledPatients`)
# geom_col() is equivalent to geom_bar(stat = "identity");
# `linewidth` replaces the deprecated `size` for bar outlines in ggplot2 >= 3.4.0
g <- ggplot(df, aes(x = facilitycode, y = EnrolledPatients)) +
  geom_col(fill = "blue3", color = "black", linewidth = 0.5, width = 0.8) +
  labs(title = "Number of patients enrolled by site", x = "Site", y = "Number enrolled") +
  theme_minimal()

# Example of roxygen2 documentation (function body elided)
#' @title Number of Subjects Screened, Eligible, Enrolled and Not Enrolled
#' @description This function visualizes the enrollment totals for each site
#' @param analytic This is the analytic data set ...
enrollment_status_by_site <- function(analytic) { ... }
```
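The snippets above do not cover the DiagrammeR side of the package, so here is a minimal sketch of a CONSORT-style flow built with `grViz`. It is not the `dsmb_consort_diagram` implementation; the counts, node names, and layout are assumptions chosen for illustration.

```r
library(DiagrammeR)

# Hypothetical counts; in practice these would be computed from the analytic data.
n_screened <- 120
n_eligible <- 90
n_enrolled <- 75

# Build the Graphviz DOT source as a string, filling in the counts.
dot <- sprintf('
digraph consort {
  node [shape = box, fontname = Helvetica]
  screened [label = "Screened (n = %d)"]
  eligible [label = "Eligible (n = %d)"]
  enrolled [label = "Enrolled (n = %d)"]
  screened -> eligible -> enrolled
}', n_screened, n_eligible, n_enrolled)

# grViz() renders the DOT source as an htmlwidget.
diagram <- grViz(dot)
```

Printing `diagram` in RStudio or an R Markdown document displays the flow chart. The dependency list includes DiagrammeRsvg and rsvg, which are commonly used to export such a widget to SVG or PNG, though whether the package exports its diagrams this way is an assumption.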
## Notable Achievements
- Development of a reusable R package for METRC studies, promoting standardization and efficiency in data visualization.
- Implementation of a data-matrix driven approach, ensuring consistency and facilitating cross-study analysis.
- Contribution to the open-source community by making the VisualizationLibrary package publicly available on GitHub.
- Demonstrated expertise in R programming, data visualization, and package development.