Harmonisation Template For All Cohorts

Author

My Name

Published

March 10, 2025

Preface

This book documents the data harmonisation step and is generated using Quarto. To learn more about Quarto books, visit https://quarto.org/docs/books.

File Structure

Here is the file structure of the project used to generate the document.

harmonisation/                            # Root of the project template.
|
├── .quarto/ (not in repository)          # Folder to keep intermediate files/folders 
|                                         # generated when Quarto renders the files.
|
├── archive/                              # Folder to keep previous books and harmonised data.
|   |
│   ├── reports/                          # Folder to keep previous versions of
|   |   |                                 # data harmonisation documentation.
|   |   |
|   |   ├── {some_date}_batch/            # Folder to keep {some_date} version of
|   |   |                                 # data harmonisation documentation.
|   |   |
|   |   └── Flowchart.xlsx                # Flowchart sheet to record version control.
|   |
|   └── harmonised/                       # Folder to keep previous version of harmonised data.
|       |
|       ├── {some_date}_batch/            # Folder to keep {some_date} version of
|       |                                 # harmonised data.
|       |
|       └── Flowchart.xlsx                # Flowchart sheet to record version control.
|
├── codes/                                # Folder to keep R/Quarto scripts 
|   |                                     # to run data harmonisation.
|   |
│   ├── {cohort name}/                    # Folder to keep Quarto scripts to run
|   |   |                                 # data cleaning, harmonisation 
|   |   |                                 # and output them for each cohort.
|   |   |
|   |   └── preprocessed_data/            # Folder to keep preprocessed data.
|   |
│   ├── harmonisation_summary/            # Folder to keep Quarto scripts to create
|   |                                     # data harmonisation summary report.
|   |
│   ├── output/                           # Folder to keep harmonised data.
|   |                                     
|   ├── cohort_harmonisation_script.R     # R script to render each {cohort name}/
|   |                                     # folder into HTML, PDF and Word documents.
|   |
|   └── harmonisation_summary_script.R    # R script to render the harmonisation_summary/
|                                         # folder into a Word document.
│  
├── data-raw/                             # Folder to keep cohort raw data (.csv, .xlsx, etc.)
|   |
│   ├── {cohort name}/                    # Folder to keep cohort raw data.
|   |   |
|   |   ├── {data_dictionary}             # Data dictionary file that corresponds to the
|   |   |                                 # cohort raw data. It can be provided by the
|   |   |                                 # collaborator or by us.
|   |   |
|   |   └── Flowchart.xlsx                # Flowchart sheet to record version control.
|   |
|   ├── data-dictionary/                  # Folder to keep data dictionary 
|   |   |                                 # used for harmonising data.
|   |   |
|   |   └── Flowchart.xlsx                # Flowchart sheet to record version control.
|   |
|   └── data-input/                       # Folder to keep data input file 
|       |                                 # for collaborators to fill in.
|       |
|       └── Flowchart.xlsx                # Flowchart sheet to record version control.
|  
├── docs/                                 # Folder to keep R functions documentation 
|                                         # generated using pkgdown:::build_site_external().
|  
├── inst/                                 # Folder to keep arbitrary additional files 
|   |                                     # to include in the project.
|   |  
|   └── WORDLIST                          # File generated by spelling::update_wordlist()
|  
├── man/                                  # Folder to keep R functions documentation
|   |                                     # generated using devtools::document().
|   |
│   ├── {fun-demo}.Rd                     # Documentation of the demo R function.
|   |
│   └── harmonisation-template.Rd         # High-level documentation.
|  
├── R/                                    # Folder to keep R functions.
|   |
│   ├── {fun-demo}.R                      # Script with R functions.
|   |
│   └── harmonisation-package.R           # Dummy R file for high-level documentation.
│  
├── renv/ (not in repository)             # Folder to keep all packages 
|                                         # installed in the renv environment.
| 
├── reports/                              # Folder to keep the most recent data harmonisation
|                                         # documentation.
|
├── templates/                            # Folder to keep template files needed to generate
|   |                                     # data harmonisation documentation efficiently.
|   |
|   ├── quarto-yaml/                      # Folder to keep template files to generate 
|   |   |                                 # data harmonisation documentation structure 
|   |   |                                 # in Quarto. 
|   |   |
│   |   ├── _quarto_{cohort name}.yml     # Quarto book template for the data harmonisation
|   |   |                                 # documentation of {cohort name}.
|   |   |
|   |   └── _quarto_summary.yml           # Quarto book template for the data harmonisation summary.
|   |
|   └── index-qmd/                        # Folder to keep template files to generate
|       |                                 # the preface page of the data harmonisation 
|       |                                 # documentation.
|       |
|       ├── _index_report.qmd             # Preface template for each cohort data harmonisation
|       |                                 # report. 
|       |
|       └── _index_summary.qmd            # Preface template for data harmonisation 
|                                         # summary report. 
|        
├── tests/                                # Folder to keep unit test files.
|                                         # The files are used by the R package testthat.
|
├── .Rbuildignore                         # List of files/folders to be ignored while 
│                                         # checking/installing the package.
|
├── .Renviron (not in repository)         # File to set environment variables.
|
├── .Rprofile (not in repository)         # R code to be run when R starts up.
|                                         # It is run after the .Renviron file is sourced.
|
├── .Rhistory (not in repository)         # File containing R command history.
|
├── .gitignore                            # List of files/folders to be ignored while 
│                                         # using the git workflow.
|
├── .lintr                                # Configuration for linting
|                                         # R projects and packages using lintr.
|        
├── .renvignore                           # List of files/folders to be ignored when 
│                                         # renv is doing its snapshot.
|
├── DESCRIPTION[*]                        # Overall metadata of the project.
|
├── LICENSE                               # Content of the MIT license generated via
|                                         # usethis::use_mit_license().
|
├── LICENSE.md                            # Content of the MIT license generated via
|                                         # usethis::use_mit_license().
|
├── NAMESPACE                             # List of functions exported to users or imported
|                                         # from other R packages. It is generated
|                                         # by devtools::document().
│        
├── README.md                             # GitHub README markdown file generated by Quarto.
|
├── README.qmd                            # GitHub README Quarto file used to generate README.md.
|        
├── _pkgdown.yml                          # Configuration for R package documentation
|                                         # using pkgdown:::build_site_external().
|        
├── _quarto.yml                           # Configuration for Quarto book generation.
|                                         # It is also the project configuration file.
|
├── csl_file.csl                          # Citation Style Language (CSL) file to ensure
|                                         # citations follow the style of The Lancet.
|        
├── custom-reference.docx                 # Microsoft Word template used when rendering the
|                                         # data harmonisation documentation to Word.
|
├── harmonisation_template.Rproj          # RStudio project file.
|        
├── index.qmd                             # Preface page of Quarto book content.
|        
├── references.bib                        # BibTeX file for Quarto book.
|      
└── renv.lock                             # Metadata of the installed R packages, generated
                                          # using renv::snapshot().

[*] Files marked with [*] are created automatically, but the user needs to add some information manually.
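
To illustrate how several of the root-level files in the tree above relate to each other, a minimal `_quarto.yml` for a Quarto book might look like the sketch below. This is an assumption for illustration only, not the project's actual configuration: the title and author are the placeholders from this page, the chapter list is hypothetical, and only the file names (`index.qmd`, `references.bib`, `csl_file.csl`, `custom-reference.docx`) are taken from the tree.

```yaml
# Hypothetical sketch of _quarto.yml; the values are placeholders,
# not the project's real settings.
project:
  type: book                     # Render the project as a Quarto book.

book:
  title: "Harmonisation Template For All Cohorts"
  author: "My Name"
  chapters:
    - index.qmd                  # Preface page of the book.

bibliography: references.bib     # BibTeX file listed in the tree.
csl: csl_file.csl                # Citation style file listed in the tree.

format:
  docx:
    reference-doc: custom-reference.docx   # Word template listed in the tree.
```

With a configuration along these lines, rendering to Word would pick up the citation style and the Word template automatically, which is presumably why those files sit at the project root next to `_quarto.yml`.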