Stop Facebook attribution parameters from breaking my website

Problem Description

The affected application returned a "404: The requested URL was not found on this server" message when logged-in Facebook users followed a link to it. The cause is that the links Facebook presents to its users carry an extra fbclid URL parameter:

Example: https://domain.tld/adventcalendar/?fbclid=IwAR0QwiqUUrAZqv66g2y4SINDYjMZlGSZXEi6NhMXSLJqdfzoVGiWxMgfP1c

Why is Facebook appending ?fbclid=${id} to links?

Dave Clark offers a succinct explanation taken from Quora:

Why are Facebook appending parameters beginning with “fbclid” when linking to other sites?

This is part of an update to the way that Facebook Pixel works.

Up to now, Pixel has correlated the information about how and where you clicked on a link by using a third-party cookie: Facebook has set up a cookie, which it therefore owns, and code imbedded on the target (advertiser’s) page has been able to read this cookie and then send data to Facebook enabling clicks and conversions to be counted up and analysed. This helps the advertiser know how effective, or not, each of their adverts has been with different groups of users.

Increasingly, third-party cookies are becoming unreliable, as they are increasingly being blocked in response to (quite justified) privacy and tracking concerns. So, from October 24th 2018 onwards, Pixel is offering a “first-party” option. As you’ve noticed, they have already instrumented the outgoing links from their site/app with “fbclid” (Facebook click identifier) parameters. If the target (advertiser’s) page contains Pixel code with the first-party option active, then Pixel will use that parameter to correlate the data, creating its own cookie (which is then owned by the advertiser’s domain rather than by Facebook) in response.

You can read about cookie settings for the Facebook pixel.

You may notice something similar when you click on links from Google ads, which add a “gclid” (Google click identifier) parameter onto the outgoing links for exactly the same reasons.

Solution

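Since my application is a Christmas calendar that only needs to be available for 24 days, I opted for a simple solution: redirect every request that carries fbclid to the same URL without it.

Add this to the .htaccess file in the application's directory (here /adventcalendar/). It can also go into the Apache vhost config, but there the RewriteRule pattern matches the full URL path, including /adventcalendar/, so the substitution would need to be adjusted to avoid doubling that prefix: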

RewriteEngine On
RewriteCond %{QUERY_STRING} ^(.*)&?fbclid=[^&]+&?(.*)$ [NC]
RewriteRule ^(.*)$ /adventcalendar/$1?%1%2 [R=302,L]

Rule Explanation

  • Line 1:
    • turns on the rewrite engine for this context (the mod_rewrite module itself must already be loaded)
  • Line 2:
    • matches all requests whose query string contains fbclid= (case-insensitively, because of [NC])
    • defines 2 capture groups (.*) that will be referred to as %1 (the part of the query string before fbclid) and %2 (the part after it)
  • Line 3:
    • rewrites all requests that match the condition
    • tells the client’s user agent that there is a temporary redirect (302) and stops processing further rules ([L])
    • defines the target URL as ${domain}/adventcalendar/${pathMatchedByRewriteRule}?${queryBeforeFbclid}${queryAfterFbclid}
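
One limitation of the condition above is that it also matches parameter names that merely end in fbclid (for example a hypothetical notfbclid=1). A slightly stricter variant is sketched below. It is not part of the original fix and assumes the same setup (an .htaccess file inside /adventcalendar/); it anchors fbclid to the start of the query string or to a preceding &:

```apache
RewriteEngine On
# (.*&)?        -> %1: everything before fbclid, including its trailing "&" (may be empty)
# fbclid=[^&]*  -> the parameter itself, up to the next "&" or the end of the query string
# (?:&(.*))?    -> %2: everything after fbclid, without the leading "&" (may be empty)
RewriteCond %{QUERY_STRING} ^(.*&)?fbclid=[^&]*(?:&(.*))?$ [NC]
# Redirect to the same path and keep only the remaining parameters, e.g.
#   ?a=1&fbclid=xyz&b=2  ->  ?a=1&b=2
#   ?fbclid=xyz          ->  no query string (a substitution ending in "?" clears it)
RewriteRule ^(.*)$ /adventcalendar/$1?%1%2 [R=302,L]
```

As with the original rule, a trailing & can remain when fbclid was the last of several parameters (?a=1&fbclid=xyz becomes ?a=1&), which most applications tolerate. Keeping R=302 while testing avoids browsers caching a wrong redirect.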

☝️ Difference between $1 vs %1 in .htaccess

%1 refers to a pattern matched in a RewriteCond condition, while $1 refers to a pattern matched inside a RewriteRule.

More generically, use %n to refer to the numbered matches from RewriteCond condition regex patterns, and use $n to refer to numbered matches from RewriteRule regex patterns.
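
To make the distinction concrete, here is a small illustrative rule that uses both kinds of backreference at once. The docs/ path and the lang parameter are hypothetical and chosen only for this example, which assumes an .htaccess file in the document root:

```apache
RewriteEngine On
# %1 is filled by the RewriteCond pattern: the two-letter value of "lang" in the query string
RewriteCond %{QUERY_STRING} ^lang=([a-z]{2})$
# $1 is filled by the RewriteRule pattern: whatever follows "docs/" in the URL path
# The trailing "?" drops the original query string from the redirect target
RewriteRule ^docs/(.*)$ /%1/docs/$1? [R=302,L]
```

With this rule, a request for /docs/intro?lang=de would be redirected to /de/docs/intro: "de" arrives via %1 and "intro" via $1.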

Source:
https://stackoverflow.com/questions/6654834/difference-between-1-vs-1-in-htaccess