Documentation is where most experiments fall apart. We've learned that a consistent template forces honesty and captures the details that matter. Here's the exact structure we use for every SKY Labs experiment — and you can use it too.
Hypothesis
A clear, testable statement of what you're changing and what you expect to happen. Include the reasoning behind your hypothesis.
Example
"If we move the email signup form from the sidebar to an inline form within the first 500 words, we expect a 15-25% increase in conversions because readers are more engaged at that point in the content."
Tip: Write your hypothesis before collecting any data. Never change it mid-experiment.
Success Criteria
What would count as success? What would count as failure? Define specific, measurable thresholds before you run the experiment.
Example
"Success = 10% increase in conversion rate over 14 days. Failure = decrease of 5% or more. Inconclusive = change between -5% and +10%."
Tip: Include both quantitative thresholds and qualitative signals (e.g., "if user complaints increase, stop early").
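Thresholds like the ones in the example are easy to encode up front, so the final call is mechanical rather than a judgment made after seeing the data. A minimal Python sketch using the example's thresholds (the function name and defaults are illustrative, not part of the template):

```python
# Illustrative helper (not from the template): classifies an experiment
# using the relative-change thresholds from the example above.
def classify_outcome(baseline_rate, test_rate,
                     success_lift=0.10, failure_drop=-0.05):
    """Success = +10% or better, failure = -5% or worse, else inconclusive."""
    relative_change = (test_rate - baseline_rate) / baseline_rate
    if relative_change >= success_lift:
        return "success"
    if relative_change <= failure_drop:
        return "failure"
    return "inconclusive"

print(classify_outcome(0.0202, 0.0224))  # prints "success" (+10.9% relative)
```

Writing the classifier before the experiment starts is the code equivalent of the tip above: the thresholds are frozen and can't drift once results come in.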
Method
Detail exactly what you changed. Include technical implementation, timing, and any control conditions.
Example
"Moved the email signup form from the sidebar to an inline placement after the 2nd paragraph. Implemented via WordPress hooks. Control group: original sidebar placement. Test duration: 14 days. All other site elements unchanged."
Tip: Document everything. Months later, you won't remember the exact setup. Future you will thank you.
Baseline Data
Capture at least 7 days of baseline data (ideally 14) before making any changes. This establishes what "normal" looks like.
Example
"Baseline period: March 1-14. Average daily visitors: 47. Average conversion rate: 2.1%. Bounce rate: 58%."
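A baseline summary like the example can be computed from daily logs in a few lines. A sketch with made-up daily numbers (chosen only to land near the example's figures; replace with your own analytics export):

```python
# Made-up daily baseline logs for illustration.
daily_visitors = [45, 52, 44, 47, 50, 46, 45]
daily_conversions = [1, 1, 1, 1, 1, 1, 1]

avg_visitors = sum(daily_visitors) / len(daily_visitors)
conversion_rate = sum(daily_conversions) / sum(daily_visitors)

print(f"Average daily visitors: {avg_visitors:.0f}")       # 47
print(f"Baseline conversion rate: {conversion_rate:.1%}")  # 2.1%
```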
Experiment Log
Log what happened during the experiment. Include dates, any external factors, and observations.
Example
"Day 1-3: Implementation complete. No issues. Day 4: Server outage for 2 hours (noted in analytics). Day 7: First signs of trend emerging. Day 10-14: Pattern consistent."
Tip: Note anything unusual — holidays, technical issues, algorithm updates. Context matters.
Results
Show the actual numbers. Don't cherry-pick. Include both the metrics that support your hypothesis and those that contradict it.
Example
"Test period (14 days): 847 visitors, 19 conversions (2.24%). Baseline (14 days): 892 visitors, 18 conversions (2.02%). Difference: +0.22 percentage points (+11% relative)."
Tip: Show raw numbers, not just percentages. A "100% increase" from 2 to 4 conversions is very different from 200 to 400.
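The tip above is easy to operationalize: store the raw counts and derive both the absolute and relative views from them. A Python sketch using the example's numbers (variable names are mine):

```python
# Raw counts from the example above; keep these, derive percentages from them.
baseline_conversions, baseline_visitors = 18, 892
test_conversions, test_visitors = 19, 847

baseline_pct = round(100 * baseline_conversions / baseline_visitors, 2)  # 2.02
test_pct = round(100 * test_conversions / test_visitors, 2)              # 2.24
abs_diff_pp = round(test_pct - baseline_pct, 2)                          # 0.22
rel_diff = (test_pct - baseline_pct) / baseline_pct                      # ~0.11

print(f"{abs_diff_pp:+.2f} pp ({rel_diff:+.0%} relative)")  # +0.22 pp (+11% relative)
```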
Analysis
Interpret the results. What patterns did you see? What surprised you? Be honest about what the data suggests — and what it doesn't.
Example
"Conversion rate increased consistently after Day 4 and held through Day 14. The improvement was small (0.22pp) but consistent. No negative impact on bounce rate or time on page."
Takeaways
What did you learn? Even failed experiments teach you something about your users, your methodology, or your assumptions.
Example
"Inline forms after 2nd paragraph show consistent, small improvements. This placement seems to capture users when they're most engaged. Worth implementing across all content pages."
Limitations
What can you NOT conclude from this experiment? Be explicit about sample size, external factors, and any limitations.
Example
"This experiment ran with under 900 visitors in each 14-day window. Results are directional, not statistically significant. Findings may not generalize to higher-traffic sites or different niches. Server outage on Day 4 may have affected early data."
Tip: This section is critical for transparency. If you don't include limitations, someone else will point them out.
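To back the "directional, not statistically significant" call with an actual number, a standard two-proportion z-test can be run on the raw counts. This test is my addition, not something the template prescribes; a standard-library-only Python sketch on the example's figures:

```python
from math import sqrt
from statistics import NormalDist

# Raw counts from the example results above.
baseline_conv, baseline_n = 18, 892
test_conv, test_n = 19, 847

p1 = baseline_conv / baseline_n
p2 = test_conv / test_n
p_pool = (baseline_conv + test_conv) / (baseline_n + test_n)
se = sqrt(p_pool * (1 - p_pool) * (1 / baseline_n + 1 / test_n))
z = (p2 - p1) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided

print(f"z = {z:.2f}, p = {p_value:.2f}")  # p is well above 0.05: not significant
```

With roughly 850 visitors per arm and a ~2% conversion rate, an effect this small is well inside the noise, which is exactly what the limitations section says.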
Next Steps
What will you do based on this experiment? Implement the change? Run a follow-up test? Abandon the idea entirely?
Example
"Implement inline form placement across all content pages. Run follow-up test in 3 months to validate long-term impact. Consider testing different positions (after 1st paragraph vs after 2nd)."
Documentation Principles
No Cherry-Picking
Include all data, even when it contradicts your hypothesis. Honest documentation means showing the full picture.
Be Specific
"Improved" is vague. "Increased from 2.1% to 2.4% over 14 days" is specific. Use numbers, dates, and precise language.
Document Failures
Failed experiments are often more valuable than successful ones. Document them with the same rigor.
Call Out Limitations
Be explicit about what your data can and cannot tell you. Small sample? No statistical significance? Say it.
Pre-Documentation Checklist
Hypothesis written and reviewed before starting
Success criteria defined with specific thresholds
Baseline data collected (minimum 7 days)
Only one variable identified for change
Experiment duration defined
External factors noted (holidays, platform updates, etc.)
Documentation template ready
Want to use this template? Copy the structure above for your own experiments. We've found it works for everything from small A/B tests to multi-week research projects. The key is consistency — use the same structure every time so you can compare experiments over time.