# Research vocabulary

> In the age of AI, knowing the words is half the rigour.

Names for the moves and measures of a real study: design, sampling, validity, statistics, sources, reasoning, applied methods.

Source: https://www.juanmnl.com/notes/research-vocabulary.html

## Study Design

How a study is structured, which shapes how much you can trust its conclusions.

### Hypothesis

A specific, testable prediction the study sets out to confirm or refute.

Say: "State the hypothesis you're testing."

### Null hypothesis

The default assumption of "no effect/difference," which the study tries to reject.

Say: "What's the null hypothesis here?"

### Control vs experimental group

The group that gets no intervention vs the one that does, compared to isolate the effect.

Say: "Was there a proper control group?"

### Randomized controlled trial / RCT

Participants are randomly assigned to groups. The strongest single-study design for causal claims.

Say: "Is this from an RCT or just observational?"
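
Random assignment is simple to sketch. The example below is a minimal illustration, not a trial protocol: the function name `assign_groups`, the participant IDs, and the fixed seed are all assumptions made for reproducibility.

```python
import random

def assign_groups(participants, seed=0):
    """Randomly split participants into control and treatment groups."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}

# Hypothetical participant IDs.
participants = [f"p{i}" for i in range(10)]
groups = assign_groups(participants)
```

Because chance decides who lands where, known and unknown traits tend to balance out across the two groups.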

### Observational study

Researchers observe without intervening. Can show correlation, rarely causation.

Say: "Flag that this is observational, not causal."

### Longitudinal vs cross-sectional

Longitudinal tracks the same subjects over time; cross-sectional is one point in time.

Say: "Was it longitudinal or a single snapshot?"

### Qualitative vs quantitative

Qual explores meaning (interviews, themes); quant measures (numbers, stats).

Say: "Is the evidence qualitative or quantitative?"

### Blinding

Hiding who's in which group from subjects (single) or also researchers (double) to cut bias.

Say: "Was the study double-blind?"


## Sampling

Who was studied, and whether they represent who you care about.

### Population

The whole group you want conclusions about (e.g. all adults in the US).

Say: "What population does this generalize to?"

### Sample & sample size (n)

The subset actually studied. Bigger, well-chosen samples give more reliable estimates.

Say: "What was the sample size?"

### Random sampling

Selecting subjects by chance, so every member of the population has a known chance of inclusion and the sample tends to mirror the population.

Say: "Was the sample randomly drawn?"
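
A simple random sample can be drawn with Python's standard library. This is a sketch under assumptions: the population is a made-up list of IDs, and the seed is fixed only for reproducibility.

```python
import random

# Hypothetical population: 1,000 member IDs.
population = list(range(1, 1001))

rng = random.Random(42)  # fixed seed only so the sketch is reproducible
# random.sample draws without replacement; each member is equally likely.
sample = rng.sample(population, k=50)
```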

### Representative sample

A sample whose makeup matches the population on key traits.

Say: "Is the sample representative or skewed?"

### Selection / sampling bias

When the sample systematically over/under-represents some group, distorting results.

Say: "Could selection bias explain this?"

### Generalizability / external validity

How well findings extend from the sample to the wider world.

Say: "How generalizable is this finding?"


## Variables & Validity

What's measured, what's manipulated, and what might be quietly distorting things.

### Independent variable

The factor the researcher manipulates to see its effect.

Say: "What's the independent variable?"

### Dependent variable

The outcome measured to detect the effect of the independent variable.

Say: "What's the dependent (outcome) variable?"

### Confound / lurking variable

An unaccounted-for variable that affects both the supposed cause and the effect, faking a relationship.

Say: "What confounds could explain this?"

### Correlation

Two variables move together. Says nothing about cause on its own.

Say: "Is this just a correlation?"
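
One common way to put a number on "move together" is Pearson's r, which runs from -1 to 1. The sketch below computes it by hand; `pearson_r` and the data are illustrative assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up data: hours studied vs test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)  # close to 1: strong positive correlation
```

A high r here still says nothing about whether studying caused the scores.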

### Causation

One variable actually produces the change in another. Needs strong design to claim.

Say: "Does the evidence support causation, not just correlation?"

### Validity

Whether the study measures what it claims to measure (the right target).

Say: "Is the measure valid for this question?"

### Reliability

Whether the measurement is consistent and repeatable.

Say: "Is the measure reliable across trials?"


## Statistics

The numbers behind the claim. These let you judge whether a result is real or noise.

### Mean / median / mode

Three "centers." Median resists outliers; mean doesn't.

Say: "Report the median, not just the mean."
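
A quick illustration of why the median resists outliers, using made-up salary figures (in thousands) with one extreme value:

```python
from statistics import mean, median, mode

# Made-up salaries in thousands; 400 is the outlier.
salaries = [40, 42, 45, 45, 48, 50, 400]

avg = mean(salaries)    # about 95.7, dragged up by the outlier
mid = median(salaries)  # 45, unaffected by how extreme the outlier is
top = mode(salaries)    # 45, the most frequent value
```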

### Distribution

The shape of how values spread. The bell curve is the normal distribution.

Say: "What's the distribution shape? Is it skewed?"

### Standard deviation

How spread out values are. Small = clustered; large = dispersed.

Say: "What's the standard deviation?"
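
Two made-up datasets with the same mean but very different spread show what the standard deviation captures. This sketch uses the sample standard deviation (`statistics.stdev`).

```python
from statistics import mean, stdev

# Both lists average 50; only the spread differs.
clustered = [49, 50, 50, 51]
dispersed = [10, 40, 60, 90]

sd_clustered = stdev(clustered)  # under 1
sd_dispersed = stdev(dispersed)  # over 30
```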

### p-value

Roughly: the probability of seeing a result at least this extreme if the null were true. Low = the data are hard to explain by chance alone. Not the probability that the null is true, and not "importance."

Say: "Is the result statistically significant (p-value)?"
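
One way to make the definition concrete is a permutation test: shuffle the group labels many times and count how often chance alone yields a difference at least as large as the observed one. This is a sketch, not the only way to get a p-value; `permutation_p_value` and the data are assumptions.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_gap(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    observed = mean_gap(pooled)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel subjects at random, as the null implies
        if mean_gap(pooled) >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / n_permutations

# Made-up outcome scores.
control = [12, 15, 11, 14, 13, 12]
treatment = [16, 18, 15, 17, 19, 16]
p = permutation_p_value(control, treatment)
```

With these numbers, very few shuffles match the observed gap, so the p-value is small.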

### Statistical significance

A result with a p-value below a preset threshold (commonly 0.05), so unlikely to be chance alone. Not the same as large or meaningful.

Say: "Significant, but is the effect big enough to matter?"

### Effect size

How big the difference actually is: the practical magnitude, beyond significance.

Say: "What's the effect size, not just the p-value?"
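
Cohen's d is one widely used effect size for a difference between two means: the gap divided by the pooled standard deviation. A minimal sketch, with `cohens_d` and the data as illustrative assumptions:

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Made-up scores: means of 4 and 3, each group with a variance of 4.
d = cohens_d([2, 4, 6], [1, 3, 5])  # 0.5
```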

### Confidence interval

A range of plausible values for the true quantity at a stated confidence level (e.g. 95%). Wide = uncertain; narrow = precise.

Say: "Give the confidence interval, not just the point estimate."
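
A rough sketch of a confidence interval for a mean, assuming a normal approximation (for small samples a t-based interval is usually more appropriate). The function name and data are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    center = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return center - half_width, center + half_width

# Made-up measurements with a mean of 11.
values = [10, 12, 9, 11, 13, 10, 12, 11]
low, high = mean_confidence_interval(values)
```

More data or less spread narrows the interval; asking for higher confidence widens it.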

### Margin of error

The plus/minus around a poll or estimate that comes from sampling error, usually quoted at 95% confidence.

Say: "What's the margin of error on that figure?"
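
For a poll proportion, a common approximation of the margin of error is z × sqrt(p(1 − p) / n). The sketch below assumes a simple random sample and a 95% level; the poll figures are made up.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(proportion, n, confidence=0.95):
    """Approximate margin of error for a proportion from a simple random sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(proportion * (1 - proportion) / n)

# Hypothetical poll: 50% support among 1,000 respondents.
moe = margin_of_error(0.5, 1000)  # about 0.031, i.e. roughly ±3 points
```

Quadrupling the sample size only halves the margin, which is why polls rarely go far beyond a few thousand respondents.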

### Regression

A model fitting a line/curve to data to estimate relationships and predict.

Say: "Did they control for other variables in the regression?"
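
The simplest case, fitting a straight line by ordinary least squares, can be sketched by hand. This covers one predictor only; "controlling for other variables" needs multiple regression, which this sketch does not attempt. `fit_line` and the data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data that roughly follows y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = fit_line(xs, ys)  # about 1.97 and 0.09
predicted_at_6 = slope * 6 + intercept
```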


## Sources & Literature

Where knowledge comes from and how vetted it is.

### Primary vs secondary source

Primary = the original work; secondary = summaries/coverage. Cite primary for claims.

Say: "Find the primary source for that stat."

### Peer review

Expert vetting before publication. The standard quality bar for scientific work.

Say: "Is it peer-reviewed?"

### Preprint

Early, unvetted release. Useful but flag it as not yet reviewed.

Say: "Note if it's only a preprint."

### Meta-analysis

A study that statistically combines many studies into a pooled, more precise estimate.

Say: "Is there a meta-analysis on this?"
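
One common pooling approach is fixed-effect inverse-variance weighting, where more precise studies count for more. The sketch below assumes that model (random-effects models are another option); the study estimates are made up.

```python
from math import sqrt

def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    return pooled, pooled_se

# Two hypothetical studies with equal precision.
pooled, pooled_se = pool_fixed_effect([0.2, 0.4], [0.1, 0.1])
# pooled is 0.3; pooled_se is about 0.071, smaller than either study's 0.1
```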

### Systematic review

A rigorous, transparent survey of all studies on a question.

Say: "Summarize the systematic reviews."

### Citation & impact

References to a work; high citation counts hint at influence (not always quality).

Say: "How well-cited is this paper?"


## Reasoning & Bias

The thinking traps and the tools to avoid them.

### Deductive vs inductive

Deductive applies a rule to a case; inductive infers a pattern from cases.

Say: "Is this reasoning deductive or inductive?"

### Falsifiability

A claim is scientific only if some possible evidence could refute it.

Say: "Is this claim falsifiable?"

### Base rate

The underlying rate a result should be judged against. Ignoring it = base-rate fallacy.

Say: "What's the base rate for context?"
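
A worked example of why the base rate matters, using Bayes' theorem on a hypothetical screening test. All three input numbers are made up for illustration.

```python
def probability_given_positive(base_rate, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Hypothetical: 1% of people have the condition, the test catches 90% of
# real cases, and it wrongly flags 5% of healthy people.
ppv = probability_given_positive(0.01, 0.90, 0.05)  # about 0.154
```

Even with a fairly accurate test, only about 15% of positives are real here, because healthy people vastly outnumber affected ones.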

### Confirmation bias

The tendency to notice and favor evidence that fits what you already believe, and to discount what doesn't.

Say: "Check this for confirmation bias."

### Publication bias

The literature skews toward exciting/positive findings; null results vanish.

Say: "Could publication bias inflate this effect?"

### Replication

Independent repetition that gets the same result. The real test of a finding.

Say: "Has this been replicated?"

### Steelman

Engaging the best form of an opposing argument, not a strawman.

Say: "Steelman the counter-argument first."

### Cherry-picking

Highlighting supportive data while ignoring the rest.

Say: "Is this cherry-picked, or the full picture?"


## Applied / UX Methods

Research techniques you'll meet in product, design, and market work.

### A/B test

Randomly splitting traffic between two versions to see which performs better. An online RCT.

Say: "Was the difference A/B tested for significance?"
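
A two-proportion z-test is one standard way to check whether a gap in conversion rates is bigger than chance would explain. This is a sketch under a normal approximation; the function name and traffic numbers are assumptions.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # shared rate under the null
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical test: A converts 100 of 1,000 visitors, B converts 130 of 1,000.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
# z is roughly 2.10 and p_value roughly 0.035
```

Significance alone does not settle the decision: a 3-point lift still has to be worth shipping.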

### Usability test

Watching real users attempt tasks to find friction in a design.

Say: "Run a usability test on this flow."

### Survey

Standardized questions to many people. Watch wording and sampling bias.

Say: "Design a survey that avoids leading questions."

### Interview

In-depth one-on-one conversation. Rich qualitative insight, small n.

Say: "Draft an interview guide for user research."

### Cohort analysis

Tracking a group who share a start point (e.g. signup month) over time.

Say: "Break retention out by cohort."
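
A minimal sketch of a retention table by cohort. The record layout (a `cohort` label plus a set of `active_offsets`, the months since signup in which the user was active) is an assumption made for illustration, as is the data.

```python
from collections import defaultdict

def retention_by_cohort(users, max_offset):
    """Share of each cohort active at 0..max_offset months after signup."""
    cohorts = defaultdict(list)
    for user in users:
        cohorts[user["cohort"]].append(user["active_offsets"])
    return {
        cohort: [
            sum(offset in active for active in members) / len(members)
            for offset in range(max_offset + 1)
        ]
        for cohort, members in cohorts.items()
    }

# Made-up users grouped by signup month.
users = [
    {"cohort": "2024-01", "active_offsets": {0, 1, 2}},
    {"cohort": "2024-01", "active_offsets": {0, 1}},
    {"cohort": "2024-01", "active_offsets": {0}},
    {"cohort": "2024-01", "active_offsets": {0, 2}},
    {"cohort": "2024-02", "active_offsets": {0, 1}},
    {"cohort": "2024-02", "active_offsets": {0}},
]
table = retention_by_cohort(users, max_offset=2)
# table["2024-01"] is [1.0, 0.5, 0.5]; table["2024-02"] is [1.0, 0.5, 0.0]
```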

### Triangulation

Combining multiple methods/sources so they corroborate each other.

Say: "Triangulate survey data with interviews."
