
Plan & Design

Open Research Cycle: 1. Plan & Design · 2. Collect & Manage · 3. Analyze & Collaborate · 4. Preserve & Share
Set the foundation for open & reliable research

Steps in this phase: Explore & Reuse · Check Legal Frameworks · Write Data Management Plan · Design Study

Team Checkpoint: Study Plan Presentation & Preregistration Submission

Explore and Reuse

Any resource that inspires you or that you want to reuse and/or adapt must minimally be cited, using its DOI (or otherwise a URL with author, date, and time of access), and you must follow the license and/or usage agreement provided by the authors. A research output (e.g. data, code) without a license cannot be reused even if it appears publicly online.

You can explore and reuse several kinds of resources: articles, preregistrations, data, and code.

Articles

  • To come up with a well-founded research question, review the existing literature using open bibliographic databases like OpenAlex (see the sketch after this list).
  • Keep track of your bibliography with an open-source reference manager like Zotero. It can be integrated within RStudio to write reproducible manuscripts seamlessly, formatting your bibliography in any citation style you want (see 3. Analyze & Collaborate).
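For instance, here is a minimal sketch of querying the public OpenAlex REST API from R; the search term and the jsonlite-based approach are illustrative, not the only way in:

    # Query OpenAlex for works matching a topic and list a few titles.
    # Assumes only base R plus the jsonlite package.
    library(jsonlite)
    res <- fromJSON("https://api.openalex.org/works?search=preregistration")
    head(res$results$display_name)  # titles of the first matching works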

LEARN MORE

OSC Tutorial

Introduction to Zotero

An open-source reference manager that integrates with Word, RStudio, and Google Docs. (1h)

TOOLS & RESOURCES


OpenAlex

All the world’s research, connected and open

Preregistrations

A preregistration typically consists of a hypothesis and predictions, a plan for data collection (when relevant), and a plan for data analysis, which researchers upload before starting their project, often to increase the rigor of confirmatory research (see pre-analysis planning).


Preregistrations therefore give you insight into projects that are not (yet) published, whether currently ongoing or abandoned. Projects that are left unpublished typically have a note attached to their preregistration. Some registries are discipline-specific while others are discipline-agnostic:

  • https://osf.io/registries
  • https://aspredicted.org/
  • https://preclinicaltrials.eu/
  • https://www.animalstudyregistry.org/
  • https://clinicaltrials.gov/

TOOLS & RESOURCES

OSF

Registry of preregistrations. Widely used across fields.

Data

To find existing datasets, search for discipline-specific repositories on re3data, a central registry of research data repositories, or explore subject-agnostic repositories such as DataCite or Zenodo (which often also hold the corresponding analysis code). These platforms either give you access to existing data or provide metadata and explanations on how to request access. Metadata are data about your data, such as author, date, measurement device, unit of measurement, context of data collection, etc.


Found a dataset to reuse? Make sure you know where the data comes from and how it was collected and processed, and reflect on whether any of that poses problems for your research question. Finally, check the data source's requirements and resources; for instance, access to some datasets requires submitting a preregistration (see Study Design & Analysis plan). Either way, to avoid introducing confirmation or hindsight bias into your data analyses, do not start by plotting the data itself; instead, familiarize yourself with the metadata to plan your analyses (see Study Design & Analysis plan). Specifically, there may be a data dictionary (also called a "codebook") or other extensive documentation describing the included variables.
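In practice, a data dictionary can be as simple as one row per variable. A minimal sketch in R, with entirely hypothetical variable names:

    # Hypothetical data dictionary for a reused dataset (all names made up).
    data_dictionary <- data.frame(
      variable    = c("pid", "age", "rt_ms", "condition"),
      description = c("Participant identifier",
                      "Age at time of testing",
                      "Response time",
                      "Experimental condition (control/treatment)"),
      type        = c("character", "integer", "numeric", "factor"),
      unit        = c(NA, "years", "milliseconds", NA)
    )
    print(data_dictionary)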

TOOLS & RESOURCES

re3data

Registry of data repositories. Search across disciplines to find domain-specific options.

Code

You may find code available for reuse archived on Zenodo or actively developed on GitHub. Learn how to work with GitHub in more detail in 3. Analyze & Collaborate or start learning Git version control now!


Important

Code publicly visible on GitHub without a license cannot be legally reused. Ask the authors to add an open license to their repository to allow reuse (they must add a file called LICENSE.txt with e.g. the Apache 2.0 license text - see our code publishing tutorial to learn more about licenses).

TOOLS & RESOURCES

GitHub

Version control platform for code collaboration with issue tracking.

Legal Requirements

  • LMU guidelines
  • Funders
  • Ethics

LMU guidelines

The LMU Guidelines for Safeguarding Good Scientific Practice (in German) are legally binding for all academics, researchers, research support staff, teachers, and students at LMU Munich. Only the original German text prevails, but here is an English summary of the aspects relevant to this guide:

Appropriate level of documentation and standards to allow reproduction:

  • Reproducible methods must be used. (§11)
  • When research software is developed, its source code must be documented. (§12)

Appropriate level of documentation and standards to allow replication:

  • All information relevant to the production of a research result must be documented comprehensively to enable replication. (§7 and §12)
  • If specific professional recommendations exist for review and evaluation, the results must be documented in accordance with these respective specifications. (§12)
  • Individual results that do not support the hypothesis must also be documented; a selection of results is not permitted. (§12)

Public access to research results:

  • Apart from specific exceptions, all findings should be made public. For this, they must be described in a detailed and comprehensible manner which includes making available the research data, materials and information on which the results are based, as well as the methods used and the software employed (including appropriately licensed self-written software) according to the FAIR principles*. (§13)
  • Data, materials, and software made publicly accessible must be appropriately archived, usually for a period of 10 years. (§17)

Note

The FAIR principles are defined as:

  • Findable: metadata should be deposited in a searchable repository and be assigned a permanent identifier.

  • Accessible: the data is either open, accessible upon some authentication process, or closed but with open metadata.

  • Interoperable: the data is described with a standard terminology (so the dataset can be merged with other ones) and saved in a stable file format.

  • Reusable: the data is richly documented (e.g. with a data dictionary) and accompanied by a data usage license.

See section 2. Collect & Manage to learn how to implement the FAIR principles in your research.

In later sections, you will acquire skills in FAIR data management and reproducible workflows that will enable you to comply with these guidelines.

Funders

Funders may have additional open and reproducible science requirements on top of those in the LMU guidelines. For instance, some calls request a Research Data Management plan before the second payment is made, and some specify the extent and timing of data sharing and provide funds for this activity. Check all requirements in the call's information sheets.


Contact the LMU Research Funding Unit to review your grant proposal and assess whether it meets your funders' open science requirements.

Ethics

Data collection and analyses conducted on animals and humans typically require the approval of your faculty's ethics committee (see e.g. instructions for the LMU Faculty of Medicine and instructions for the LMU Department of Psychology) to make sure your research will be conducted responsibly and that participants' data will be protected.


Your ethics proposal will typically include information on:

  • Data storage and retention – outlining how data will be securely stored, backed up, and retained over time. This information can be extracted from a more detailed Research Data Management plan (see RDM plans).
  • Risks if the data were leaked – identifying potential consequences for participants or the research project if confidentiality is breached.
  • Data anonymization – describing procedures to remove or obscure personally identifiable information to protect participant privacy (see Data anonymization for options).
  • Language of the informed consent forms – ensuring that participants clearly understand the purpose, procedures, and any potential risks of the study. Conditions for sharing their data should be clearly explained here.
  • Power analysis to justify sample size – providing a statistical rationale for the number of participants, which supports the validity and ethical justification of the study. This, and more detailed information on the statistical plan, can be extracted from your pre-analysis plan (see pre-analysis planning).

Research Data Management Plans

A Data Management Plan (DMP) documents how you will handle research data throughout your project. Writing a DMP prompts you to think through and document decisions you might otherwise leave implicit.

Your DMP will ask:

  • What data will you collect or generate (types, formats, volume, sources)?
  • How will you describe it (metadata standards, documentation practices)?
  • How will you organize files (naming conventions, folder structure, versioning)?
  • Where will you store it (locations, backups, access controls)?
  • How will you ensure quality (validation checks, error-handling)?
  • How will you share outputs (repositories, licenses, embargo periods)?
  • What constraints apply (consent, anonymization, GDPR, data use agreements)?

The specific questions vary by discipline, data type, and funder requirements. DMP tools like RDMO guide you through the relevant questions with funder-specific templates. Your DMP is a living document: start with what you know, and refine the details as your project develops (see Data Management).

LEARN MORE

LMU supported

FAIR Data Management

Includes a chapter on writing DMPs (2h total)

TOOLS & RESOURCES


RIOjournal DMPs

Example DMPs by discipline.

LMU supported

RDMO

Funder-compliant templates (DFG, ERC, Horizon Europe)

Study Design & Analysis Plan
  • Pre-analysis planning
  • Simulation of Data
  • Power analyses

Pre-analysis planning

Why should you write your statistical analysis plan before collecting data?


Humans are prone to cognitive biases such as confirmation bias (seeking information that supports existing beliefs) and hindsight bias (believing outcomes were predictable after the fact). In research, these biases can distort findings, especially when researchers make analytic decisions after seeing results. Although statistical testing typically accepts a 5% false positive rate, “researcher degrees of freedom” — choices about data collection, exclusions, transformations, sample size, covariates, etc. — can dramatically inflate false positives when decisions are made post hoc. Practices like increasing sample size until reaching statistical significance, selectively removing outliers, or trying multiple analytic strategies increase the likelihood of false-positive results, often unintentionally.
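How large can this inflation get? Here is a minimal sketch in R, assuming pure-noise data and a hypothetical optional-stopping rule (test after every 10 added observations per group, stop as soon as p < .05):

    # Optional stopping on pure-noise data: the false positive rate
    # climbs well above the nominal 5%.
    set.seed(1)
    stopped_significant <- function(n_start = 20, n_max = 100, step = 10) {
      x <- rnorm(n_start)
      y <- rnorm(n_start)
      repeat {
        if (t.test(x, y)$p.value < 0.05) return(TRUE)  # stop at "significance"
        if (length(x) >= n_max) return(FALSE)          # give up at n_max
        x <- c(x, rnorm(step))                         # collect more data...
        y <- c(y, rnorm(step))                         # ...and test again
      }
    }
    mean(replicate(2000, stopped_significant()))
    # typically lands around 0.15 here, roughly triple the nominal rate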


The core problem is that analyses guided by observed outcomes allow biases to influence decisions, making many reported effects unreliable. A key remedy is transparency and preregistration: specifying hypotheses, methods, and analysis plans before data collection or analysis. Preregistration constrains bias in confirmatory testing while still allowing exploratory analyses, clearly distinguishing robust hypothesis tests from hypothesis-generating work. This improves credibility, limits false positives, and often leads to better study design through early methodological feedback.


You will benefit from using a preregistration template for:

  • experimental studies (i.e. with a manipulated variable): it will define your confirmatory analysis and strengthen your claims

  • observational or exploratory studies: it will help you move along the exploratory-confirmatory continuum

  • qualitative studies: it will provide a way to document e.g. your positionality towards a subject in the course of a project.

Several preregistration templates exist. While the standard OSF preregistration template is the most commonly used, some are tailored to specific fields or specific methods (e.g. systematic reviews, qualitative work, secondary data analysis).

Your preregistration will define your study’s:

  • Hypothesis and predictions
  • Data collection procedures
  • Sample size and stopping rule
  • Variables (manipulated, measured, indices)
  • Statistical method (model, dependent and independent variables, covariates, transformations)
  • Data exclusion criteria
  • How to deal with missing data

A great way to create your statistical plan, especially for early career researchers who are still learning statistics and need feedback from supervisors, collaborators, or statisticians on their design, is to simulate data and write out the statistical test you would use to analyze them (see Simulation of data and Power analyses).


Once completed, you can submit your preregistration on e.g. the Open Science Framework (OSF) before collecting or analyzing existing data. The plan can be embargoed (i.e. kept private for a predetermined amount of time, up to a maximum of 4 years on the OSF) to prevent scooping, then made public upon publication. This process improves transparency and invites valuable early feedback from collaborators. An even stronger approach is submitting preregistrations directly to journals (then called "Registered Reports"), enabling peer review at a stage where methodological adjustments are still possible.

Registered Reports are a publication format, now adopted by over 300 journals (see participating journals), where studies are peer-reviewed before data collection. Reviewers evaluate the hypotheses, methods, and planned analyses, allowing methodological improvements. If the plan is approved, the journal grants in-principle acceptance, meaning publication is guaranteed provided researchers follow the protocol.


After completing the study, authors add results and discussion sections, clearly separating preregistered confirmatory analyses from exploratory ones. Final review focuses on adherence to the approved plan and the validity of conclusions, not on whether results are significant. This model shifts incentives toward asking important questions and using rigorous methods rather than chasing striking outcomes.

LEARN MORE

OSC Tutorial

TBA preregistration tutorial


TOOLS & RESOURCES

OSF

Flexible templates, embargoes, file storage. Widely used across fields.

COS

The Center for Open Science, which hosts the OSF and maintains preregistration templates and resources.

Simulation of Data

In our context, a computer simulation is the generation of artificial data to build up an understanding of real data and of the statistical models we use to analyze them. You can simulate data to:


  • Test your statistical intuition or demonstrate mathematical properties you cannot easily anticipate.
    Example: Check whether there are more than 5% significant effects for a variable in a model when supposedly random data are generated.

  • Understand sampling theory and probability distributions or test whether you understand the underlying processes of your system.
    Example: See whether simulated data drawn from specific distributions is comparable to real data.

  • Perform power analyses.
    Example: Assess whether the sample size (within a simulation repetition) is high enough to detect a simulated effect in more than 80% of the cases. (see Power analyses)

  • Prepare a pre-analysis plan.
    Example: To be confident about the (confirmatory) statistical analyses you want to commit to before data collection (e.g. through a preregistration or registered report), provide a simulated dataset to a statistician or mentor; they can then make concrete suggestions on the most appropriate statistical test for your data. The code analyzing the simulated data can be submitted along with your preregistration or registered report, so that reviewers understand exactly which analyses you intend to perform. Once you have your real data, you can simply plug them into this code and obtain the results of your confirmatory analyses immediately.


Generating an artificial dataset in R (see our simulation tutorial) is much easier than researchers often believe, and it is helpful even when you need to make assumptions about variable distributions or when the parameter space is not well known.
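As an illustration, here is a minimal sketch assuming two groups and a standardized effect of d = 0.5 (all names and values below are illustrative):

    # Simulate a dataset under assumed parameters, then write the
    # confirmatory analysis against it. Once the real data arrive,
    # the same analysis code can be run on them unchanged.
    set.seed(42)
    n_per_group <- 50   # planned sample size per group
    effect      <- 0.5  # assumed standardized mean difference (Cohen's d)

    sim_data <- data.frame(
      group   = rep(c("control", "treatment"), each = n_per_group),
      outcome = c(rnorm(n_per_group, mean = 0,      sd = 1),
                  rnorm(n_per_group, mean = effect, sd = 1))
    )

    t.test(outcome ~ group, data = sim_data)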

Power analyses

Power analysis is relevant whether you are designing a project from scratch or running an analysis on already existing data. There are two main types of power analyses:

A priori power analysis

To calculate the smallest sample size required to detect the smallest effect of interest, simulate data following our advanced tutorial. For a very basic power calculation, you can use a simple R function if you know three out of four of these parameters (see the sketch after this list):

  • required sample size n (usually the one missing)
  • desired power (default 0.80)
  • the alpha level (default 0.05)
  • the expected effect size (has to be estimated or extracted from the literature, in the form of d, f, etc.)
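As a sketch, base R's power.t.test() solves for whichever of these parameters you leave out; a simulation-based version repeats the simulate-and-test cycle instead. The effect size of d = 0.5 below is an assumption for illustration:

    # Analytic a priori power analysis: with sd = 1, delta equals Cohen's d.
    # Omitting n makes the function solve for it (about 64 per group here).
    power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)

    # Simulation-based equivalent: estimated power is the proportion of
    # significant results across many simulated experiments.
    set.seed(7)
    p_values <- replicate(2000, {
      x <- rnorm(64, mean = 0,   sd = 1)
      y <- rnorm(64, mean = 0.5, sd = 1)
      t.test(x, y)$p.value
    })
    mean(p_values < 0.05)  # should land near 0.80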

To get support with pre-analysis planning, you can book a consultation with the LMU statistical consulting unit (kontakt@stablab.stat.uni-muenchen.de) and visit the StaBLab website for more information.

Post-hoc power analysis

Sometimes, you may not be able to control the sample size for your project (e.g., because data collection has ended and you only have a certain number of good quality samples). In this situation, you can compute a so-called “post-hoc power”. Beware: This power computation comes in two flavors - one is legitimate, and one is flawed and not defensible.

The legitimate post-hoc power is computed with your actual n and the same effect size that you plugged into your a priori power analysis. This analysis gives you the achieved power to detect your assumed effect.

The flawed version of post-hoc power is called "observed power": if an analysis yields a non-significant result, some researchers calculate the post-hoc power but plug in the observed effect size. "Observed power", however, is just a one-to-one function of the p-value (a non-significant p-value returns a low power of less than 50%, and a just-significant p-value of .05 always yields a power of exactly 50%). Observed power adds no new information beyond the p-value and is essentially meaningless. Do not compute this type of post-hoc power!
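To see why, here is a quick sketch assuming a two-sided z-test at alpha = .05 (illustrative only): "observed power" can be computed from the p-value alone, and p = .05 maps to exactly 50%.

    # "Observed power" is a deterministic function of the p-value.
    p      <- c(0.20, 0.05, 0.01)
    z_obs  <- qnorm(1 - p / 2)       # |z| implied by each two-sided p-value
    z_crit <- qnorm(1 - 0.05 / 2)    # critical value at alpha = .05
    observed_power <- pnorm(z_obs - z_crit) + pnorm(-z_obs - z_crit)
    round(observed_power, 2)         # 0.25 0.50 0.73 -- no new information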

Plan & Design Checklist

Complete this checklist before presenting your full study plan to your research group and, if applicable, submitting your ethics proposal and/or preregistration. Not all items are relevant to every field of research or study type.

  • Background Information
  • Study Design
  • Data Management Planning
  • Project Management
  • Before Data Collection
