Technical SEO rescue for Expreso (WordPress + CloudPanel + NGINX + S3/CloudFront)

Customer: Expreso.press
Tech stack: WordPress, CloudPanel, NGINX, S3/CloudFront

The typical problem at a large media outlet: volume + legacy

Expreso showed the classic problem areas of a site with years of content:

  • 404s and other 4XX errors from old URLs, URLs invented by bots, or URLs generated by incorrect patterns.
  • Broken images (missing assets) that triggered the “Broken images” and “Page has broken image” audit warnings.
  • Mixed content (HTTPS pages loading HTTP resources).
  • Duplicate meta tags (multiple meta descriptions) caused by overlapping components.
  • Intermittent sitemaps (occasional 503s caused by load, proxy, or cache).
  • External links that redirect (http → https, shorteners).

At a media outlet, these details multiply quickly: without a maintenance system, the site can keep publishing non-stop, but every week the “noise” grows and the reports become harder to control.

Strategy: attack in layers (server → CDN → WordPress)

The logic was simple: instead of chasing bugs one by one, we built a cleaning and prevention system.

1) NGINX: global hygiene to reduce noise and protect resources

The first layer was to filter what reaches the site:

  • Typical scanner requests (.env, wp-config.php, admin.php, shells, fake paths).
  • Repetitive routes that bloat logs and audits (e.g. mraid.js variants, pagespeed, etc.).
  • “Hygienic” behavior for 4XX responses without breaking legitimate endpoints.

The rules were organized into snippets so as not to put the vhost at risk, and an appropriate response was applied to each case:

  • 444 / 410 / 204 depending on the case, avoiding passing junk traffic to PHP.
  • Clear exceptions where the site must respond (e.g. valid .well-known or real routes).

Impact: less wasted load, fewer false errors, and a more stable base for crawling.
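
Rules like these can be dry-run against paths taken from an access log before they go into an NGINX snippet. The following is a minimal sketch in Python, not the Expreso configuration: the patterns, the `/legacy-feed/` path, and the status assigned to each case are hypothetical and only illustrate "first match wins, exceptions first".

```python
import re

# Hypothetical rules for illustration; the first matching rule wins.
# A status of None means "pass the request through to the application".
RULES = [
    (re.compile(r"^/\.well-known/"), None),                    # legitimate endpoint, listed first
    (re.compile(r"/\.env$|/wp-config\.php|^/admin\.php$"), 444),  # scanner probes (444 = NGINX closes the connection)
    (re.compile(r"^/legacy-feed/"), 410),                      # hypothetical retired path
    (re.compile(r"/mraid\.js$"), 204),                         # ad-script noise
]

def classify(path):
    """Return the status a junk request would receive, or None to pass it through."""
    for pattern, status in RULES:
        if pattern.search(path):
            return status
    return None
```

Anchoring `admin.php` to the site root keeps `/wp-admin/admin.php` untouched, which is the kind of legitimate endpoint the exceptions are meant to protect.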

2) CDN (S3/CloudFront): resolve “Broken images” on a massive scale

This is where the real volume was.

At a media outlet, many historical images are lost or deleted over time. Recovering them is not feasible, but it is possible to prevent the consequences:

  • users seeing broken images,
  • crawlers finding thousands of 404s,
  • the site staying permanently “dirty” in audits.

Practical solution applied in Expreso: placeholders in S3 + invalidation in CloudFront.

Flow:

  1. Extract from the report (CSV) the image URLs that return 4XX on cdn.expreso.press.
  2. Upload a minimal placeholder (1px or lightweight image) with the correct Content-Type.
  3. Invalidate CloudFront to clear the cached 404s.
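
The first and last steps of this flow can be sketched in a few lines. This is an illustration under assumptions, not the script used for Expreso: the CSV column names `url` and `status` are hypothetical (audit tools name them differently), and `invalidation_batch` only builds the payload that CloudFront's CreateInvalidation call expects.

```python
import csv
import io
from urllib.parse import urlsplit

CDN_HOST = "cdn.expreso.press"

def broken_image_urls(csv_text, url_col="url", status_col="status"):
    """Return unique CDN URLs whose reported status is 4XX, in report order."""
    seen, urls = set(), []
    for row in csv.DictReader(io.StringIO(csv_text)):
        url, status = row[url_col].strip(), row[status_col].strip()
        if urlsplit(url).hostname != CDN_HOST:
            continue  # only assets served from the CDN host
        if not (status.isdigit() and 400 <= int(status) < 500):
            continue  # only 4XX responses
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls

def invalidation_batch(urls, caller_reference):
    """Build the InvalidationBatch payload for a CloudFront invalidation."""
    paths = [urlsplit(u).path for u in urls]
    return {
        "Paths": {"Quantity": len(paths), "Items": paths},
        "CallerReference": caller_reference,  # must be unique per request
    }
```

With boto3, the resulting dictionary would be passed as `InvalidationBatch` to `create_invalidation` on a CloudFront client, together with the distribution ID.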

Key learning: the naive AWS CLI approach (one head-object call per URL) can become very slow or hang on large lists. For Expreso, we migrated to a “FAST” method built on boto3 (the AWS SDK for Python) with parallelism, timeouts, and visible progress in the logs.
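
A minimal sketch of that kind of parallel check, assuming a thread pool and short client timeouts. It is not the Expreso script: the function names, worker count, and timeout values are illustrative, and boto3 is imported lazily so the checking logic can be exercised with a stand-in client.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def make_s3_client(workers=32):
    """Create an S3 client with short timeouts and a pool sized for the workers."""
    import boto3
    from botocore.config import Config
    cfg = Config(connect_timeout=5, read_timeout=10,
                 retries={"max_attempts": 3}, max_pool_connections=workers)
    return boto3.client("s3", config=cfg)

def is_missing(client, bucket, key):
    """Return True if head_object reports that the key does not exist."""
    try:
        client.head_object(Bucket=bucket, Key=key)
        return False
    except Exception as exc:
        # botocore's ClientError exposes the error code in exc.response
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey", "NotFound"):
            return True
        raise  # timeouts, permission errors, etc. should not be hidden

def find_missing(client, bucket, keys, workers=32, log_every=500):
    """Check keys in parallel and return the sorted list of missing ones."""
    missing = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(is_missing, client, bucket, k): k for k in keys}
        for done, fut in enumerate(as_completed(futures), start=1):
            if fut.result():
                missing.append(futures[fut])
            if done % log_every == 0:
                print(f"checked {done}/{len(futures)}")  # visible progress
    return sorted(missing)
```

Each missing key can then receive the placeholder through `put_object` with an explicit `ContentType`, followed by a single CloudFront invalidation for the whole batch.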

Important detail:

  • Some reports list paths that already include wp-content/uploads/..., so the system must avoid producing duplicated keys such as wp-content/uploads/wp-content/uploads/...
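
One way to guard against that duplication is to normalize every report entry before it is used as a key. A minimal sketch, assuming the bucket keys mirror the URL path under wp-content/uploads/ (the function name `to_s3_key` is hypothetical):

```python
from urllib.parse import unquote, urlsplit

UPLOADS_PREFIX = "wp-content/uploads/"

def to_s3_key(value, prefix=UPLOADS_PREFIX):
    """Map a report entry (full URL or bare path) to a key with the prefix applied exactly once."""
    # Keep only the path, decode percent-escapes, and drop the leading slash.
    path = unquote(urlsplit(value).path).lstrip("/")
    # Collapse an already duplicated prefix.
    while path.startswith(prefix + prefix):
        path = path[len(prefix):]
    # Add the prefix only when the entry came without it.
    if not path.startswith(prefix):
        path = prefix + path
    return path
```

The same function accepts full CDN URLs, paths that already carry the prefix, and bare year/month paths, so mixed report formats end up as the same key.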

3) Mixed content: cut the “HTTP inside HTTPS”

Mixed content doesn't always break the page, but it does leave:

  • security alerts
  • negative reports
  • technical inconsistencies

The work focused on standardizing:

  • resources and embeds served over HTTPS
  • correction of legacy patterns (http://...) where applicable
  • an operating rule: nothing “insecure” in linked resources
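
To see where a page still breaks that rule, the rendered HTML can be scanned for subresources loaded over http://. A minimal sketch using only the standard library; the tag list is an assumption and does not cover every way a page can load a resource (CSS `url()` references, `srcset`, and inline scripts are ignored):

```python
from html.parser import HTMLParser

# Tags that load subresources. Plain <a href> links are navigation, not mixed content.
RESOURCE_TAGS = {"img", "script", "iframe", "link", "source", "video", "audio", "embed"}

class MixedContentFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.insecure = []  # (tag, url) pairs loaded over plain HTTP

    def handle_starttag(self, tag, attrs):
        if tag not in RESOURCE_TAGS:
            return
        attrs = dict(attrs)
        # Only stylesheet links load a resource; canonical and similar links do not.
        if tag == "link" and "stylesheet" not in (attrs.get("rel") or "").lower():
            return
        value = attrs.get("href" if tag == "link" else "src") or ""
        if value.lower().startswith("http://"):
            self.insecure.append((tag, value))

def find_mixed_content(html):
    """Return the list of (tag, url) subresources referenced over http://."""
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure
```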

4) External links with redirection (External 3XX)

In audits, it is common to find:

  • external links on http:// redirecting to https://
  • shorteners (bit.ly, tinyl...) that add jumps
  • domains that changed their canonical host or redirect to unexpected paths

In Expreso the work was separated into:

  • Auto: same host (safe normalization: http→https / www→non-www).
  • Review: shorteners, odd redirect chains, redirects to login/paywall pages, etc.

This makes it possible to fix in batch what is safe and leave the rest for manual review, without taking risks.
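
The Auto/Review split can be expressed as a small decision function over pairs of (linked URL, final URL after redirects). A minimal sketch under assumptions: the function name and the shortener set are illustrative, and "safe" is taken to mean same host (ignoring a leading www.), same path and query, and an https destination.

```python
from urllib.parse import urlsplit

SHORTENERS = {"bit.ly"}  # extend with the shorteners the audit surfaces

def _host(url):
    """Lowercased hostname without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def classify_redirect(source, final):
    """Return "auto" when rewriting source to final is a safe normalization, else "review"."""
    if _host(source) in SHORTENERS:
        return "review"
    src, dst = urlsplit(source), urlsplit(final)
    same_host = _host(source) == _host(final)
    same_target = (src.path or "/", src.query) == (dst.path or "/", dst.query)
    if same_host and same_target and dst.scheme == "https":
        return "auto"
    return "review"
```

Anything that changes path, query, or host falls through to "review", so a redirect to a login or paywall page is never rewritten automatically.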

Result: moving from “firefighting” to operating with control

Once the layers (server + CDN + normalization) have been applied, the site is:

  • cleaner for crawlers,
  • with fewer visible breaks,
  • and backed by a clear maintenance routine so that the problems do not pile up again.

Future operation (what keeps Expreso healthy)

Simple technical rules for content editors (without touching editorial content)

  • Always paste links as https
  • Avoid shorteners: paste the final URL
  • Images only from the Media Library (no hotlinking)
  • Embeds/iframes always over https

Weekly routine for admins (20-30 minutes)

  • Run an audit (Ahrefs or similar) and address the issues in this order:
    1. Broken images
    2. Links to broken pages
    3. Mixed content
    4. Multiple meta descriptions
    5. Timeouts / 5XX
  • Run batch fixes (scripts + invalidation) when applicable.
  • Validate sitemaps (if they are intermittent, the fix belongs to performance/proxy/cache, not to “content”).
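
Between audits, item 4 can be spot-checked on a single page without waiting for the next crawl. A minimal sketch using only the standard library (the class and function names are hypothetical); it counts `<meta name="description">` tags and ignores Open Graph tags, which use `property` instead of `name`:

```python
from html.parser import HTMLParser

class MetaDescriptionCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.descriptions = []  # content of every meta description found

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.descriptions.append(attrs.get("content") or "")

def meta_descriptions(html):
    """Return the content of every <meta name="description"> tag in the page."""
    counter = MetaDescriptionCounter()
    counter.feed(html)
    return counter.descriptions
```

A page where the returned list has more than one entry is a candidate for the overlapping-components problem described at the start.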

Conclusion

In an environment such as Expreso, technical SEO is not solved with an isolated adjustment: it is solved with system + routine.

When noise is controlled, delivery is stabilized, and broken assets are corrected at scale, the site becomes more crawlable, more reliable, and much easier to maintain week to week.
