WordPress & WooCommerce
Content sync
Once the plugin is connected to your workspace, you need to push your WordPress content into Pitchbar so the agent can retrieve from it. The plugin syncs in two complementary modes: a manual bulk sync you trigger from the Settings page, and automatic deltas that fire on every WP hook for the post / product types you opted in.
Both modes are idempotent on external_id — re-running
them is free.
Content sources
| WP entity | Pitchbar Source type | External ID format |
|---|---|---|
| Posts, pages, custom post types | wordpress | wp:{post_id} |
| WooCommerce products | woocommerce_products | wc:{product_id} |
| WC coupons | Embedded in woocommerce_products source config['coupons'] | — |
One Source row per (agent, site host). The two types coexist on the same agent — posts answer FAQ-style questions, products answer "do you have X" style shopping intent. Pitchbar auto-creates each Source on the first sync.
Bulk sync (manual)
Open Settings → Pitchbar. Two buttons appear once the plugin is configured:
- Sync posts now — always visible. Pages through every
publish-status post of the post types you enabled, 50 per HTTP batch. - Sync products now — visible only when WooCommerce is active. Pages through
simple,variable,grouped, andexternalproducts, 50 per batch.
Click either button. The status line shows progress and final counts: "Sync complete. (124 posts, 87 queued, 37 skipped)". Skipped means the content hash matched the previous sync — no re-embedding needed, no LLM cost.
Resumable sync (large sites)
Shared hosting commonly enforces a 30-second PHP execution cap.
Pitchbar's syncers respect this: each pass enforces a
TIME_BUDGET_SECONDS = 20 wall-clock guard. When the
budget is exhausted AND more pages remain to process, the syncer:
- Persists a resume marker as a WP transient:
pitchbar_post_sync_resumefor posts,pitchbar_product_sync_resumefor products. Shape:{ "page": 7, "at": 1715472000 }. - Schedules a WP-Cron continuation 30 seconds out on the same hook (
pitchbar_run_full_sync_event/pitchbar_run_product_sync_event). - Returns immediately with
more: true,next_page: 7in the response so the admin UI knows to show "large site detected, the rest is continuing in the background."
Re-running "Sync now" while a resume marker exists picks up where the last pass left off. Successful completion (no more pages) clears the resume marker.
While a chunked sync is mid-flight, every WP admin page shows a soft notice: "Pitchbar is finishing a large-site sync in the background (posts). The widget already works; new content shows up once this completes."
Delta sync (automatic)
On every save_post, wp_trash_post, or
before_delete_post for an opted-in post type with
publish status, the plugin fires a single delta call
to Pitchbar:
| Endpoint | Trigger | Body shape |
|---|---|---|
POST /api/v1/wp/posts/changed |
save_post / wp_trash_post / before_delete_post on an opted-in post type. |
One post payload + action: "upsert" or "delete". |
POST /api/v1/wp/products/changed |
woocommerce_new_product, woocommerce_update_product, woocommerce_delete_product, woocommerce_trash_product. |
One product payload + action: "upsert" or "delete". |
Bulk sync uses the parallel /posts/sync and
/products/sync endpoints with up to 50 entities per
batch. Both endpoints authenticate exactly the same way
(bearer + HMAC).
Content normalization
The plugin's PostContentExtractor resolves visible
HTML for every post before sending it to Pitchbar:
- Page-builder detection — if the post is owned by Elementor, Beaver Builder, Oxygen, or Bricks, the builder's native renderer is called instead of
post_content. See Page builders. - Postdata priming —
$GLOBALS['post']is set +setup_postdata()is called so third-partythe_contentfilters (Yoast, Jetpack, Divi, embeds) see a valid$postglobal. - Block expansion —
do_blocks()resolves every Gutenberg block. - Filter chain —
apply_filters('the_content', …)runs every theme/plugin hook (lazy-loading, image replacement, related posts, etc.). - Shortcode resolution —
do_shortcode()resolves any remaining shortcodes. - Filter override — the final HTML is run through your own
pitchbar_post_content_htmlfilter, which lets you strip navigation chrome, force a custom template, or short-circuit entirely. See "Override the synced HTML" below. - Cleanup —
$GLOBALS['post']is restored andwp_reset_postdata()is called inside afinallyblock, so a throwing filter callback can't leak the loop state.
Content hash & skip-on-unchanged
Every sync (bulk or delta) sends a content_hash
field. Pitchbar compares it to the existing Document's stored
hash; on match the server returns immediately without re-chunking
or re-embedding. Re-running "Sync now" against a stable site is
effectively free — only the diff costs.
The hash is SHA-256 of the normalized concatenation:
sha256( title + "\n" + content_html + "\n" + excerpt + "\n" + taxonomy_term_names )
Whitespace is collapsed to single spaces via preg_replace('/\s+/u', ' ', …)
before hashing so trivial reflow doesn't trigger a re-index.
What gets indexed per post
- Post title (plain text, entity-decoded, tags stripped)
- Excerpt — explicit post excerpt if set, otherwise the first 40 words of stripped content
- Taxonomy term names — every term across every taxonomy the post type registers (categories, tags, custom taxonomies)
- Body HTML — fully expanded as described above, server-stripped of tags into chunks
- Permalink, post type, modified timestamp, detected language (from WP locale)
What gets indexed per WooCommerce product
- Name, SKU, permalink, primary image URL (medium size)
- Short description + long description (Gutenberg blocks expanded, shortcodes resolved, the_content filter applied)
- Price, regular_price, sale_price, currency, on_sale flag, stock_status
- Category names (via
product_cattaxonomy) - Attribute name + value pairs flattened to
"color: blue, red"strings so the embedding captures both - SHA-256 content_hash + ISO 8601 modified_at
Coupon sync
Coupons ride along with the product sync. After every successful
product sync completes (resumed-to-completion, not mid-flight),
the plugin's CouponSyncer runs a snapshot:
- Enumerates up to 50 publish-status entries of the
shop_couponcustom post type viaget_posts()— this works on every WooCommerce version since coupons shipped in WC 2.0. (Previously the plugin usedwc_get_coupons(), which is not public WC API on every release.) - Hydrates each match through
new WC_Coupon($id)(a stable WC class). - Filters out coupons that are expired (
get_date_expires()< now) or already over their usage limit. - POSTs the result to
POST /api/v1/wp/coupons/sync. Pitchbar persists them on the source'sconfig['coupons']array.
For each coupon the plugin sends: code (uppercased),
label (humanized: "10% off", "5 off your order", etc.),
discount, expires_at. See
WooCommerce
deep links for how the LLM uses these.
Auto-switch to the ecommerce vertical
On the first successful product upsert against an agent whose
site_type is generic, Pitchbar
automatically switches the agent to
site_type = "ecommerce". This enables the
EcommercePreset system-prompt fragment and the
<product/> inline-block emission rules so the
LLM can recommend products as rich cards instead of plain text.
The switch fires only when site_type is
generic. Agents already on saas,
documentation, or explicit ecommerce are
never overwritten.
Delete semantics
before_delete_post / woocommerce_delete_product
fire a single delete delta. Pitchbar tears down the
Document, its chunks, and the vector points associated with it.
Deleting an entity Pitchbar never saw is a silent no-op
(HTTP 200 with deleted: false) so repeated cleanup is
safe.
Override the synced HTML
Apply your own filter to mutate the HTML the plugin sends to Pitchbar — strip the site header, force an Elementor template, redact a section, etc.
add_filter('pitchbar_post_content_html', function ($html, $post, $builder) {
// $builder is the detected page-builder slug, or null for
// plain Gutenberg posts. e.g. 'elementor', 'divi', 'bricks'.
if ($post->post_type === 'product') {
return $html;
}
// Strip any leftover navigation chrome our theme injects via
// the_content. Pitchbar already does its own tag-stripping,
// but doing it here reduces the chunk-overhead.
$html = preg_replace('#<nav[^>]*>.*?</nav>#is', '', $html);
return $html;
}, 10, 3);
The filter receives the fully-rendered HTML before hashing
and POSTing. Returning '' effectively disables sync
for that post (Pitchbar will see empty content and decline to
embed anything).
Programmatic sync (from PHP)
You can trigger a sync from your own code — useful for migrations or testing:
// Posts
$result = (new \Pitchbar\Sync\PostSyncer)->runFullSync();
// $result['ok'], $result['posts'], $result['more'], $result['next_page']
// Products (requires WooCommerce)
$result = (new \Pitchbar\Sync\ProductSyncer)->runFullSync();
// Coupons
$result = (new \Pitchbar\Sync\CouponSyncer)->run();
None of these throw — they return result arrays with an
ok key, an errors array, and (for the
chunked syncers) more + next_page
fields for resume tracking.