How to Research a Product for Ad Copy

A methodology for scraping a brand's website to build a facts-only product catalog — what the product is, what's in it, how it's sold — before any strategy or messaging work begins.

Last updated 2026-04-17

The process crawls a brand's website, extracts factual product data from every available product page, and compiles the findings into a structured reference document.

What this outputs: Facts only. What the product is, what's in it, how it's sold, how it's used — as stated by the brand. No interpretation of what those facts mean for the customer.

What this does not output: Benefits, value props, emotional framing, review language, or any claim the brand hasn't made themselves. If a piece of information answers the question "so what?" for the customer, it doesn't belong here.


PHASE 1: MAP THE PRODUCT CATALOG

Step 1 — Find all product pages

Start by mapping the site. Try these in order:

  1. [domain]/sitemap.xml — scan for product page URLs
  2. [domain]/shop, /products, /collections, /store, /catalog
  3. Homepage navigation — follow any shop or products links

Use web_fetch to retrieve the first of these locations that resolves, then extract all product URLs from it. Look for URL patterns like /products/, /product/, /p/, /item/, or /collections/[name]/products/.
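The pattern matching described above can be sketched in Python. This is a minimal illustration, not part of the original methodology: it assumes the sitemap has already been fetched as an XML string, and the function name `product_urls_from_sitemap` and the exact regular expression are hypothetical choices. URLs carrying query strings are deliberately ignored to keep the sketch short.

```python
import re
import xml.etree.ElementTree as ET

# URL patterns named in Step 1; the regex itself is an illustrative assumption.
PRODUCT_URL_PATTERN = re.compile(r"/(products?|p|item)/[^/?#]+/?$")


def product_urls_from_sitemap(sitemap_xml: str) -> list:
    """Return de-duplicated <loc> URLs that look like individual product pages."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for element in root.iter():
        # Sitemap tags are namespaced, so compare on the local tag name only.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            url = element.text.strip()
            if PRODUCT_URL_PATTERN.search(url) and url not in urls:
                urls.append(url)
    return urls
```

Category pages such as /collections/[name] fall through the filter because they contain no product segment, while /collections/[name]/products/[handle] is kept.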

Step 2 — Compile the full URL list

From the sitemap or shop index, extract every URL that is an individual product page (not category pages, blog posts, or account pages).

Log the full list before scraping. If there are more than 30 products, pause and ask the user:

"I found [N] products. Do you want full coverage, or should I scope this to a specific product line? If full coverage, this may take a few minutes."
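The scoping check can be expressed as a small helper. This is a sketch only: the threshold of 30 and the wording of the question come from the step above, while the names `SCOPE_THRESHOLD` and `scope_prompt` are hypothetical.

```python
from typing import Optional

SCOPE_THRESHOLD = 30  # from Step 2: pause when there are more than 30 products


def scope_prompt(product_urls: list) -> Optional[str]:
    """Return the scoping question when the catalog is large, else None."""
    count = len(product_urls)
    if count <= SCOPE_THRESHOLD:
        return None
    return (
        f"I found {count} products. Do you want full coverage, or should I "
        "scope this to a specific product line? If full coverage, this may "
        "take a few minutes."
    )
```

A return value of None means scraping can proceed without asking.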


PHASE 2: SCRAPE EACH PRODUCT PAGE

For each product URL, use web_fetch and extract only what is explicitly stated on the page. Do not infer, editorialize, or add context.

Fields to Extract Per Product

Identity

Format & Physical Description

Ingredients / Materials / Components

Usage

Intended Use / Who It's For

Certifications & Third-Party Validations

Claims

Pricing & Purchase Structure
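
One way to hold the extracted fields is a record whose defaults make absent data explicit rather than silently blank. The sketch below is an assumption layered on the methodology: the class name `ProductRecord`, its attribute names, and the `missing_fields` helper are hypothetical, and only a subset of the field groups above is modeled.

```python
from dataclasses import dataclass, field, fields

# Placeholder wording borrowed from the ingredients field of the output template.
NOT_LISTED = "Not listed on page"


@dataclass
class ProductRecord:
    url: str
    name: str
    product_type: str = NOT_LISTED
    product_format: str = NOT_LISTED
    ingredients: str = NOT_LISTED  # verbatim list as stated on the page
    how_to_use: str = NOT_LISTED
    certifications: list = field(default_factory=list)
    claims: list = field(default_factory=list)  # verbatim quotes only

    def missing_fields(self) -> list:
        """Names of fields the page did not state, for the gap report."""
        return [
            f.name
            for f in fields(self)
            if getattr(self, f.name) in (NOT_LISTED, [])
        ]
```

Because nothing is inferred, a field that the page does not state keeps its placeholder and shows up in `missing_fields()`.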


PHASE 3: BUILD THE CATALOG DOCUMENT

Compile all scraped data into the output document below.

Output Format

# Product Catalog: [Brand Name]
*Generated: [Date] | Source: [domain] | Products documented: [N]*

---

## How to Use This Document

Facts only. Every field reflects what the brand states on their product pages — nothing has been interpreted, reframed, or editorialized. Use as the raw input layer for benefit mapping, messaging angle development, and hook writing. Do not treat any field here as a benefit or value prop — that work happens downstream.

---

## Product Index

| # | Product Name | Type | Variants | Price Range |
|---|---|---|---|---|
| 1 | [Name] | [Type] | [e.g., 3 sizes] | [$X–$Y] |

---

## Full Product Entries

---

### [Product Name]

**URL:** [product page URL]
**Product Line:** [if applicable, otherwise omit]
**Type:** [e.g., facial serum, dog shampoo, protein powder, silicone brush]
**Format:** [e.g., liquid, capsule, spray, solid bar]

**Sizes / Variants:**
| Variant | Price | Subscription Price |
|---|---|---|
| [e.g., 1 oz] | $[X] | $[Y] (save Z%) |

**What's Included:**
- [Exactly what comes in the package]

**Ingredients / Materials:**
> [Full ingredient list — verbatim from page, or "Not listed on page" if absent]

**Key Ingredients Called Out by Brand:**
- [Ingredient name]: [concentration or descriptor as stated — e.g., "2% Salicylic Acid"]

**How to Use (as stated by brand):**
> [Verbatim or close paraphrase]

**Frequency:** [e.g., twice daily, as needed]

**Who It's For (brand-stated only):**
- [Any explicit use-case or audience language from the page]

**Certifications:**
- [List all — or "None listed" if absent]

**Claims:**
- "[Verbatim claim from page]"
- "[Any additional verbatim claims]"

**Regulatory Notes:**
- [Any disclaimer language — verbatim]

**Purchase Options:**
- One-time: $[X]
- Subscribe & Save: $[Y] ([Z]% discount)
- Available in bundle: [Bundle name] — $[price] *(includes: [what's in the bundle])*

---

[Repeat for each product]

---

## Catalog Notes

| Field | Value |
|---|---|
| Total products documented | [N] |
| Pages successfully scraped | [N] |
| Pages that failed / were inaccessible | [list URLs] |
| Products with incomplete ingredient data | [list] |
| Products with no pricing listed | [list] |
| Low-confidence entries | [anything inferred rather than directly stated] |
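
The Product Index table in the output format above can be assembled mechanically from values that were already extracted. The following is a minimal sketch under assumptions: the function name `render_product_index`, the dictionary keys, and the "Not listed" fallback are hypothetical and not part of the template.

```python
def render_product_index(products: list) -> str:
    """Build the Product Index table from already-extracted values."""
    lines = [
        "| # | Product Name | Type | Variants | Price Range |",
        "|---|---|---|---|---|",
    ]
    for number, product in enumerate(products, start=1):
        cells = [str(number)] + [
            # Escape pipes so a value cannot break the table layout.
            str(product.get(key, "Not listed")).replace("|", "\\|")
            for key in ("name", "type", "variants", "price_range")
        ]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

Values are written exactly as given, so any wording in the table still traces back to the product page.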

PHASE 4: DELIVER & CONFIRM

  1. Save as product-catalog-[brandname].md
  2. Present using present_files
  3. State coverage: "I documented [N] products across [N] pages."
  4. Flag gaps clearly: any pages that failed, products missing ingredients, or fields left blank
  5. Ask one closing question:

"Does this look complete? If there are products missing, paste the URLs and I'll add them. Once confirmed, this becomes the factual reference layer for all creative work on this brand."
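
The file name and coverage statement from the delivery steps can be generated consistently. This is an illustrative sketch: the slug rule (lowercase, non-alphanumeric characters removed) is an assumption, since the methodology specifies only the product-catalog-[brandname].md pattern, and both function names are hypothetical. Brand names written in non-ASCII characters would need a different rule.

```python
import re


def catalog_filename(brand_name: str) -> str:
    """Build product-catalog-[brandname].md from a display name."""
    # Assumed slug rule: lowercase, with anything non-alphanumeric removed.
    slug = re.sub(r"[^a-z0-9]+", "", brand_name.lower())
    return f"product-catalog-{slug}.md"


def coverage_statement(documented: int, pages: int, failed_urls: list) -> str:
    """Compose the coverage sentence and append any failed pages."""
    statement = f"I documented {documented} products across {pages} pages."
    if failed_urls:
        statement += " Pages that failed: " + ", ".join(failed_urls)
    return statement
```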


Downstream Handoff Note

When passing the catalog to Creative Strategy Engine, Hook Writing, or Creative Mechanics, reference this document explicitly:

"Using product-catalog-[brandname].md as the factual source layer. All benefit framing and messaging angles should derive from what's documented there — not from assumptions."

This ensures every benefit claim in hooks and copy traces back to a verifiable product fact.
