Analyze & Collaborate

Create reproducible analyses, document workflows, collaborate effectively, and ensure computational reproducibility

Build Reproducible Analysis Workflows

Analysis and collaboration transform data into knowledge—but only if your work can be understood, verified, and built upon. This phase covers practices for creating transparent, reproducible analyses and collaborating effectively with your research team.

What you’ll find here: Approaches to computational reproducibility, tools for literate programming, strategies for code documentation, and platforms for collaborative workflows.

  • Download Analysis Checklist
  • View Code Review Guide


Learn More: Reproducible Analyses
  • Introduction to R (3h) - Statistical programming fundamentals
  • Version Control with Git (1.5h) - Git within RStudio
  • Collaborative GitHub (1h) - Team workflows
  • Writing Readable Code (45 min) - Clean code practices
  • Introduction to Quarto (2h) - Literate programming
  • R Package Management with renv (1h) - Manage dependencies
  • Git Branching and Merging (1h) - Advanced Git workflows (optional)

Core Analysis Activities

Reproducible analysis requires attention to multiple aspects of your workflow.

Computational Workflows

Creating traceable analyses:

  • Scripted workflows (not point-and-click)
  • Automated pipelines
  • Environment management
  • Dependency tracking

Code Documentation

Making code understandable:

  • Inline comments explaining decisions
  • Function documentation
  • Workflow documentation (README)
  • Analysis narratives

Collaboration

Working effectively with teams:

  • Shared code repositories
  • Code review processes
  • Collaborative writing platforms
  • Communication tools

Quality Assurance

Ensuring analysis quality:

  • Code testing and validation
  • Peer code review
  • Reproducibility checks
  • Statistical consulting

Reproducible Analysis Frameworks

Different research contexts require different reproducibility approaches.

  • Computational Reproducibility
  • Code Documentation
  • Literate Programming
  • Version Control & Collaboration
  • Statistical Analysis & Quality

Computational Reproducibility

Computational reproducibility ensures others can rerun your analysis and obtain the same results.

Core principles:

Scripted Workflows

  • Avoid manual data manipulation
  • Use scripts instead of point-and-click
  • Document all analysis steps

Environment Control

  • Track software versions
  • Manage package dependencies
  • Use containers when appropriate

Organized Structure

  • Consistent project organization
  • Relative (not absolute) paths
  • Clear file naming

Random Seeds

  • Set seeds for random processes
  • Document stochastic procedures
  • Enable exact replication
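
A minimal sketch of these principles in R (file names and folder layout are illustrative assumptions; the here package is one way to build project-relative paths):

library(here)  # builds paths relative to the project root

# Set a seed so stochastic steps (sampling, bootstrapping) replicate exactly
set.seed(42)

# Use project-relative paths, not absolute ones like "C:/Users/me/..."
raw <- read.csv(here("data", "raw_data.csv"))  # hypothetical input file

# Record software and package versions alongside the results
writeLines(capture.output(sessionInfo()), here("output", "session_info.txt"))
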
Inline Comments

Comments explain why you made decisions, not just what the code does.

What to comment:

  • Rationale for methodological choices
  • Explanation of complex algorithms
  • Context for non-obvious code
  • Known limitations or workarounds
  • Citations to methods or papers

What not to comment:

  • Obvious code
  • Outdated information
  • Sensitive information
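
For example (a hypothetical snippet; the exclusion rule and the preregistration reference are only illustrative):

# Good: explains why, with a pointer to the rationale
# Exclude reaction times < 200 ms as anticipatory responses
# (criterion specified in the preregistration)
rt_clean <- subset(rt_data, rt >= 200)

# Bad: restates what the code already says
# subset the data
rt_clean <- subset(rt_data, rt >= 200)
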
README Files

README files are essential for understanding how to use your code repository.

Required elements:

  • Project title and description
  • How to run the analysis: Step-by-step instructions
  • Software requirements: Language versions, dependencies
  • Input data: Where to obtain or how to access
  • Output: What files are produced
  • Project structure: Explanation of directories

Example:

# Analysis of Treatment Effects

## Requirements
- R version 4.3+
- Packages listed in renv.lock

## Running the Analysis
1. Install dependencies: `renv::restore()`
2. Run scripts in order: 01_preprocessing.R, 02_analysis.R

Literate Programming

Literate programming combines code, results, and narrative text in a single document.

Learn More: Literate Programming

Introduction to Quarto (2.5h) - Create reproducible documents combining code, results, and prose.

Common tools:

Computational notebooks:

  • Jupyter Notebooks (Python, R, Julia)
  • R Markdown / Quarto
  • Observable (JavaScript)

Benefits:

  • Results auto-update with code changes
  • Combine analysis and reporting
  • Output to multiple formats
  • Facilitate reproducibility

What to include in analysis notebooks:

  1. Introduction: Research question
  2. Methods: Statistical procedures
  3. Data loading: Source and preprocessing
  4. Main analysis: Tests and models
  5. Results: Tables and figures
  6. Session info: Software versions
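
A minimal Quarto notebook following this structure might look like the sketch below (the file name and the analysis are placeholders):

---
title: "Treatment Effects: Main Analysis"
format: html
---

## Introduction
Does the treatment improve outcomes relative to control?

## Data loading
```{r}
set.seed(42)
dat <- read.csv("data/clean_data.csv")  # hypothetical preprocessed file
```

## Main analysis
```{r}
t.test(outcome ~ group, data = dat)
```

## Session info
```{r}
sessionInfo()
```
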
Git Best Practices

Version control is essential for tracking changes to analysis code.

Learn More: Version Control
  • Version Control with Git (2h)
  • Collaborative GitHub Workflows (1h)

Key practices:

  • Commit frequently with logical changesets
  • Write meaningful commit messages
  • Never commit large data files
  • Never commit sensitive information
  • Use branches for experimental analyses
  • Tag releases for publications

Good commit message:

Add power analysis for sample size justification

- Implemented simulation-based power calculation
- Tested effect sizes (d = 0.3, 0.5, 0.8)

Collaboration & Review

Collaboration workflows:

  • Centralized workflow: Simple, one main branch
  • Feature branch workflow: Each analysis on separate branch
  • Pull request review: Review before merging

Code review checklist:

What reviewers check:

  1. Functionality and correctness
  2. Reproducibility
  3. Readability and organization
  4. Best practices

Analysis Planning

Plan analyses before looking at outcome data to avoid biases.

Analysis plans should specify:

  • Statistical tests to be used
  • Variables and transformations
  • Handling of missing data
  • Outlier criteria and treatment
  • Multiple comparisons corrections

Exploratory vs. confirmatory:

  • Confirmatory: Tests specified in preregistration
  • Exploratory: Additional analyses not prespecified
  • Clearly label which is which
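
One way to make this labeling explicit in an analysis script (a sketch; variable names are hypothetical, and the t-test stands in for whatever was preregistered):

## Confirmatory analysis (preregistered) ----
# Primary hypothesis: treatment improves the outcome
t.test(outcome ~ group, data = dat)

## Exploratory analysis (not prespecified) ----
# Post-hoc check of an age interaction; interpret with caution
summary(lm(outcome ~ group * age, data = dat))
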
Quality Assurance

Code quality practices:

  • Follow style guides (Tidyverse for R, PEP 8 for Python)
  • Use automated styling and linting tools (styler, black, lintr)
  • Consistent naming conventions
  • Descriptive variable names
  • Limit line length (80-100 characters)
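
For instance, assuming the styler and lintr packages are installed, a script can be auto-formatted and then checked like this (the file name is a placeholder):

library(styler)
library(lintr)

# Reformat the script in place following the tidyverse style guide
style_file("01_preprocessing.R")

# Flag remaining style and potential correctness issues
lint("01_preprocessing.R")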

Statistical consulting:

Consider consulting for complex designs, advanced methods, or results interpretation.

Our Recommendation: Statistical Consulting at LMU

StaBLab (LMU Statistical Consulting Unit) provides expert guidance on statistical analyses.

  • Contact: kontakt@stablab.stat.uni-muenchen.de
  • Services: Analysis planning, method selection, power analysis

Research Integrity & Avoiding Bias

Questionable research practices (QRPs) can inflate false positive rates and reduce reproducibility.

Common QRPs to avoid:

  • p-hacking: Running multiple analyses and reporting only significant ones
  • HARKing: Hypothesizing After Results are Known
  • Selective reporting: Omitting null results or failed experiments
  • Optional stopping: Stopping data collection when results become significant

Prevention strategies:

  • Preregister hypotheses and analysis plans
  • Distinguish confirmatory from exploratory analyses
  • Report all conducted analyses
  • Use Registered Reports format

Resources:

  • p-Hacking interactive demo - See how easy it is to find “significant” results
  • Big Little Lies Shiny App - Simulation of p-hacking strategies by Angelika Stefan & Felix Schönbrodt

Note: For detailed guidance on avoiding bias, see the PRO Initiative guidelines for making analyses public.


Tools & Resources

Statistical Software

Open-source tools for transparent, reproducible statistical analysis:

R / RStudio

Statistical programming environment

Tutorial available

Free, open-source language for statistics and data science with an extensive package ecosystem. RStudio provides an integrated development environment.

Python

General-purpose programming with data science libraries


Powerful libraries like NumPy, pandas, scikit-learn. Popular in machine learning and computational research.

JASP

GUI for statistical analysis


Free, open-source alternative to SPSS. Bayesian and frequentist analyses. Produces reproducible output.

Learn More: R Programming
  • R for Data Science - Free online book by Hadley Wickham
  • Tidy Data - Foundational paper on data organization
  • Swirl - Interactive R learning within RStudio
  • RStudio Cheat Sheets - Quick reference guides

Computational Notebooks

Tools for combining code, output, and narrative in reproducible documents:

Quarto

Reproducible documents combining code and narrative

Tutorial available

Create reproducible manuscripts, presentations, and reports. Supports R, Python, Julia. Renders to multiple formats including HTML, PDF, and Word.

Jupyter Notebooks

Interactive computational notebooks


Web-based notebooks combining code, output, and markdown. JupyterLab provides full IDE. Share via JupyterHub or export formats.

Observable

JavaScript notebooks for data visualization


Reactive notebooks with powerful visualization libraries. Ideal for interactive data exploration and communication.

Collaborative Platforms

GitHub

Code hosting and collaboration

Tutorial available

Public and private repositories. Pull request workflows. GitHub Actions for automation. Issue tracking and project boards.

LRZ GitLab

Institutional Git hosting

Supported at LMU

Private repositories for LMU research. CI/CD pipelines. Integrated issue tracking and code review.

OSF

Research project management


Combines storage, version control, and collaboration. Preregistration support. DOIs for projects and components.

Environment Management

Tools to ensure your code runs the same way everywhere:

renv

R package management

Tutorial available

Isolate package dependencies per project. Create reproducible R environments. Works with RStudio projects.
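
A typical renv workflow, sketched (run inside an RStudio project):

# Once per project: create a project-local package library
renv::init()

# After installing or updating packages: record exact versions in renv.lock
renv::snapshot()

# On a collaborator's machine (or later): reinstall the recorded versions
renv::restore()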

Conda

Python environment manager


Manage Python packages and environments. Cross-platform. Includes scientific computing packages.

Docker

Containerization for full reproducibility


Package entire computational environment. Ensures exact software versions. Ideal for complex workflows.

Cloud Computing Environments

Avoid the “works on my machine” problem by running code in cloud environments:

Posit Cloud

RStudio in the browser


Run R and RStudio entirely in the cloud. Share projects with collaborators. Free tier available for teaching and small projects.

Code Ocean

Computational reproducibility platform


Create reproducible “compute capsules” with code, data, and environment. Supports R, Python, Julia, and more. DOIs for computational workflows.

Google Colab

Free Jupyter notebooks with GPU


Run Python notebooks in Google’s cloud. Free GPU/TPU access for machine learning. Easy sharing via Google Drive.

Writing & Communication

Tools for collaborative writing and team communication:

Overleaf

Collaborative LaTeX editor in the browser


Real-time collaboration, track changes, comments, and version history. Free tier available. Many journal templates included.

PaperHive

Collaborative annotation and discussion


Annotate and discuss research papers collaboratively. Works with PDFs and supports public or private discussions.

LMU Chat (Matrix)

Decentralized, secure team messaging

Supported at LMU

Open-source, federated messaging protocol. LMU provides Matrix hosting for secure, GDPR-compliant team communication with channels, direct messages, and file sharing.


Our Recommendation: Computational Resources at LMU

LRZ Compute Cloud provides computational resources for data-intensive analyses.

  • Services: Virtual machines, high-performance computing, storage
  • Access: Available to LMU researchers
  • Support: LRZ Service Desk

Analysis & Collaboration Checklist

Before Starting Analysis:

  • Specify statistical tests, variables, and the handling of missing data and outliers in an analysis plan
  • Preregister confirmatory hypotheses and analyses
  • Set up a version-controlled project with a consistent structure and relative paths

During Analysis:

  • Use scripted workflows instead of point-and-click; set seeds for random processes
  • Comment the rationale behind decisions, not just what the code does
  • Commit frequently with meaningful messages and track dependencies (e.g., with renv)

Before Sharing:

  • Write a README with requirements and step-by-step run instructions
  • Have a collaborator review the code and rerun the analysis from scratch
  • Label confirmatory vs. exploratory analyses and report all analyses conducted
  • Remove sensitive information and large data files from the repository
