Marianne Sweeny: The Threshold of Duplicate Content

At content26, our standard party line has always been that unique content is worth the comparatively small increase in cost and time. But because we don’t generally have access to sales figures, page view rankings, and similar data, we’re hard-pressed to back that up.

Enter Marianne Sweeny, a search information architect. Marianne agreed to answer some questions about how duplicate content performs in search and explain how businesses can evaluate their need for unique content.

We can talk about problems with duplicate content and best practices for multichannel content all day long. But Marianne can talk about information architecture, search optimization, and the nitty-gritty of what happens with that duplicate content all day long.

“Duplicate Content Is Flotsam and Jetsam”

content26: What’s your opinion on SEO and duplicate content? Google says there’s no penalty, but do you think there’s more to the story?

Marianne Sweeny: Google does not overtly punish the site. The downside of duplicate content is found in at least two areas:

  1. Google assigns a crawl budget to each site, as the web is so vast and growing larger every day. If a site has a great deal of content duplication, that crawl budget is being squandered on pages that are found, crawled, compared, and discarded because the search engine believes that it already has a copy. Why would it want to waste resources storing two copies when one will suffice?
  2. If one version is a near-duplicate update, Google will still discard what it considers to be the least valuable version. This could be the updated version.

Duplicate content is flotsam and jetsam. It is the floating island of plastic pollution in the ocean and should be removed.

content26: How can companies best determine if creating (and paying for) unique content is worth it, or if duplicate content is good enough? What kinds of trade-offs are there?

Marianne Sweeny: The search engines are becoming more content-centric because good content equals good user experience. Old, thin, stale, duplicated content is not quality content, and users signal their displeasure by leaving the page having taken no action, not allowing the page to load, or returning to the results for another selection.

If a business does not care about their visibility in organic search results or the ability of their website to convert or monetize its visitors, there is no reason they should care about duplicate content. I would like to ask these carefree site owners why they have a website in the first place.

content26: Is there such a thing as semi-unique content? Can companies change, say, the headers and bullet points of a description and hope it affects SEO in a measurable way?

Marianne Sweeny: The search engines have a threshold that they use to determine if something is or is not a duplicate. There is no semi-state that I know of. If a page exceeds the threshold, it is found to be a duplicate and the search engine then decides, based on its internal criteria, which version is retained and which one is discarded. So, changes as slight as the ones you mention would not positively influence the determination of sameness.

content26: In other words, what makes content duplicate or unique? Is there a progression between the two, or is it more a black-and-white matter?

Marianne Sweeny: It is a black-and-white matter that is determined by a set of algorithms that compares word placement and density. There are a few good tools out there that will do the same. The one that I use is the Similar Page Checker on the Webconfs site.
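For readers who want to see what a threshold test on "word placement and density" might look like, here is a minimal sketch. It is not how any search engine actually works: the real algorithms and cutoffs are not public. The sketch assumes a common textbook approach (overlapping word sequences, or "shingles," compared with Jaccard similarity), and the function names and the 0.8 threshold are illustrative choices of ours.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word sequences) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_duplicate(text_a, text_b, threshold=0.8, k=3):
    """Black-and-white verdict: True if similarity meets the assumed threshold.

    The 0.8 default is an arbitrary illustration, not a known search engine value.
    """
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold
```

Calling `is_duplicate(page_a_text, page_b_text)` returns a plain yes or no, which mirrors the point of the answer above: once the similarity score crosses the cutoff, the two pages are treated as the same, and small edits to headers or bullets barely move the score when the body text is unchanged.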

content26: Do you have examples of brands who handle multichannel content well?

Marianne Sweeny: REI set the standard and that standard was developed by Samantha Starmer, a pioneer in the area of cross-channel content. Jonathan Colman continued Samantha’s good work at REI. Considering the vastness of its resources, Microsoft does an excellent job with content curation. And, being a Ford owner myself, I am going to call them out for great content alignment across channels.

content26: Is there a way to have good multichannel content without creating unique content for major retail channels?

Marianne Sweeny: Absolutely. Content does not have to be unique across the channels. Only the web requires uniqueness due to the sophistication of web search processing and the resources required to index a multi-trillion page web. What is critical is that the messaging be consistent across channels. What I find online has to match what I find on the table at the REI store in Seattle.

content26: Are you able to quantify, at all, the changes that brands might see from using unique content versus duplicate content on their major retail sites?

Marianne Sweeny: Google Webmaster Tools calls out duplicated metadata. So, seeing those errors diminish will have a positive impact. GWT also shows how many pages are indexed versus how many were found. So eliminating the duplicates will have a positive influence there. And, last but not least, site owners can do their own experiments on duplicate content:

  1. Select a set of duplicate pages
  2. Note the current placement for each page in results for selected keyword phrases
  3. Remove the more recent version of the duplicated set
  4. Request removal of that page from the index through a Google Webmaster account
  5. Track the placement in search results for the remaining page
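Steps 2 and 5 above amount to recording positions before and after, then comparing them. A minimal sketch of that bookkeeping follows. The function name and the dictionary layout are our own assumptions, and the positions would need to be collected by hand or from whatever rank-tracking tool you already use; nothing here queries Google.

```python
def rank_changes(before, after):
    """Compare search positions recorded before and after removing a duplicate.

    `before` and `after` map a keyword phrase to the page's position in results
    (1 = top result). Returns a dict mapping each phrase to the number of places
    gained (positive), lost (negative), or None if the page no longer appears.
    """
    changes = {}
    for phrase, old_position in before.items():
        new_position = after.get(phrase)
        if new_position is None:
            changes[phrase] = None  # page not found in the later results
        else:
            # Lower position numbers are better, so old minus new gives places gained.
            changes[phrase] = old_position - new_position
    return changes
```

A positive value for a phrase means the remaining page moved up after its duplicate was removed. Because rankings shift for many reasons, it is worth recording positions more than once before drawing conclusions.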


Marianne Sweeny

Marianne Sweeny joined Portent, Inc. as Sr. Search Strategist in 2012, where she focuses on the user experience factors of SEO for client sites. As Director of Search Services at Ascentium, she designed a search practice that brought a strategic approach to search optimization. Prior to joining Ascentium, Marianne was a Web Producer for the Microsoft Enterprise Server websites, where she applied her search optimization and information architecture skills to improving customer engagement. In 2004, she cofounded Microsoft Information Architects, a 300-member, cross-discipline community that continues to evangelize the best practices of IA company-wide.
