What is duplicate content?

Duplicate content is a matter of degree – if all of your content is exactly the same as content found elsewhere (e.g. an affiliate feed) then it can be treated as duplicate content and cause your web page (or website) not to rank.

However, if your content includes only a small amount of material from elsewhere and you have added value to the original source (e.g. a recipe list quoted within an article or other original content) then it may not be considered duplicate.

Matt Cutts explains the difference in the video below.

Best footprint tool – and it’s free!

I find myself looking up footprints almost every day and having to remember which spreadsheet, bookmark or Word document to look in. So it was a delight to stumble upon Chase the Footprint – http://chasethefootprint.com/.


At first it looks like a very simple tool, hardly worth a mention, but when you dig deeper you find its treasures.

There are common top-level categories like guest blogging, directories, forums and blog commenting. But each contains sub-categories with all the footprint terms you could ever imagine. You still have to use some imagination of your own, such as putting keywords in quotes or adding negative phrases to pinpoint prospects.

When you hit return you are immediately taken to Google search with the footprint filled in. After that it is a question of sifting through the results to find good prospects.
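As a rough sketch of what the tool is doing behind the scenes, a footprint, a quoted keyword and any negative phrases can be combined into a Google search URL like this (the footprint and keyword below are made-up examples, not ones from the tool):

```python
from urllib.parse import quote_plus

def footprint_query(footprint, keyword, negatives=()):
    """Combine a footprint with a quoted keyword and optional
    negative phrases, then build the Google search URL."""
    parts = [footprint, f'"{keyword}"']
    parts += [f"-{neg}" for neg in negatives]
    query = " ".join(parts)
    return "https://www.google.com/search?q=" + quote_plus(query)

url = footprint_query('inurl:guestbook "powered by"', "gardening", negatives=("spam",))
```

The quotes around the keyword and the leading minus on negatives mirror the "use your imagination" advice above.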

How to get sales territory data in GA

GA provides very useful geographic data in its demographics report category. You can find traffic, behaviour and conversion reports by continent, country and city, based on IP addresses. However, what if you want to produce data for specific sales target territories like:

  • North East England
  • North West England
  • Yorkshire and the Humber England
  • East Midlands England
  • West Midlands England
  • East of England
  • South East England
  • London
  • South West England
  • Scotland
  • Wales

The process I used is a long one but involves:

  1. Download data for the top 500 city visits over a period of up to six months.
  2. Make a comprehensive list of the cities and their respective territories – a good job for juniors or interns.
  3. Create a custom report for each sales territory using the Explorer type.
  4. Decide the metrics, dimensions and drill-downs you want as the report's fields.
  5. Choose Filter and enter the towns as a regex (regular expression), with the pipe symbol (|) signifying “or”.

Now when you run the report, only the designated towns for the region will show and be counted. Phew!
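The pipe-separated filter in step 5 can be generated from your town list rather than typed by hand. A minimal sketch (the town names here are illustrative examples, not a complete territory list):

```python
import re

# Towns assigned to one hypothetical territory (the output of step 2)
north_east = ["Newcastle upon Tyne", "Sunderland", "Durham", "Middlesbrough"]

# Join with the pipe symbol (|) to signify "or", escaping anything
# a regex would treat as a special character
territory_regex = "|".join(re.escape(town) for town in north_east)
```

Pasting the resulting string into the custom report's filter field saves retyping it for each territory.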


Using Date Ranges in Advanced search

There are instances where you want to scrape a list of URLs for link building but would like to stay relatively current. For instance, you scrape forums for your keywords but end up with very old threads that get you instantly banned when trying to post. In such cases you want to restrict results either to the last x months or to a particular date range.

Restrict to last x months

To restrict results to the last x months you put the search operator date:x in your query, where x is the number of months to go back. So for instance date:3 returns results indexed within the last three months only.

Dates within a range

It's a little trickier to restrict results to a particular date range. Basically you enter your date range in normal format and convert it to Julian format. There is a tool for this conversion at http://jwebnet.net/advancedgooglesearch.html#advDate.

You then add this to your advanced search using the daterange: operator. As an example, “keyword daterange:2455928-2456293” returns results for your keyword indexed in 2012 (1/1/12–31/12/12).
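The Julian day numbers in that example can also be produced in code rather than via the converter tool. A small sketch: Python's `date.toordinal()` counts days in the proleptic Gregorian calendar, and adding a fixed offset gives the Julian day number the daterange: operator expects.

```python
from datetime import date

# Offset from Python's proleptic Gregorian ordinal to a Julian day number
JDN_OFFSET = 1721425

def to_julian(d):
    """Convert a calendar date to a Julian day number."""
    return d.toordinal() + JDN_OFFSET

def daterange_operator(start, end):
    """Build the daterange: operator string for a start/end date."""
    return f"daterange:{to_julian(start)}-{to_julian(end)}"

# The 2012 example from the text
op = daterange_operator(date(2012, 1, 1), date(2012, 12, 31))
```

Running this reproduces daterange:2455928-2456293, matching the example above.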

Analytics Regex Testing

The trouble with setting a regex in Analytics is that you never quite know it's working until you start receiving data. This can take 24 hours, after which you may need to change the regex again, remembering your settings each time. Another issue is that there are only so many goals you can set per profile, so too much trial and error may make you run out.

Analytics Content Advanced Search

One useful way to test a regex is in Analytics itself. If you paste the term into the advanced search box in the Content reports, it should show the data you want to isolate. If the appropriate pages are shown then you can be sure the regex term will give the required data.
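You can also pre-test a goal regex offline before touching Analytics at all, by running it against a handful of page paths exported from the Content report. A minimal sketch, with a made-up goal regex and sample paths:

```python
import re

# Hypothetical goal regex meant to match checkout "thank you" pages
goal_regex = r"^/checkout/(thank-?you|confirmation)/?$"

# Sample page paths as they might appear in a Content report export
sample_paths = [
    "/checkout/thankyou",
    "/checkout/thank-you/",
    "/checkout/basket",
    "/blog/thank-you-notes",
]

pattern = re.compile(goal_regex)
matched = [p for p in sample_paths if pattern.search(p)]
```

If the matched list contains exactly the pages you expect, the regex is safe to paste into a goal without burning one of the profile's limited slots on a trial run.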

AdWords Search Terms Reports

The search terms report is useful for letting you know what keywords searchers used to trigger your advert, but it did not tell you which bidding keyword was matched. Google's new report adds a “Keyword” column that allows you to see the exact bid keyword that matched the searcher's query in order to trigger your advert. It is part of a customisation you can make by adding new columns under the search terms report.

Seeing the matched keyword gives you the information to make certain decisions, like adjusting the keyword, adding negative keywords to the campaign/ad group, adjusting ads/bids or even the relevant landing pages.

Further reading:

Adwords blog

Search Engine Land


Anchor Text Variation Pre & Post Penguin

A lot of link builders have been struggling with the correct ratio of anchor text after the Penguin update. Below is a suggestion by Matt Cullen, whom many SEOs have a lot of respect for:

“Pre-Panda/Penguin, there was a pretty specific formula that would
work the majority of the time for ranking a website in Google.
Obviously, this was dependent on the competition too, but the key
strategies went something like this…

1. Get a bunch of links – a lot of them

2. Get links with your keyword in them

The formula looked like this:

– 60% of your links should be your main keyword

– 15% of links are your main keyword + other words (eg: read about,
check out, etc)

– 15% of your links generic text (eg: click here, visit site)

– 10% of your links as the text (eg: your link URL used as the text)

And like a lot of people have experienced, when Google’s
Penguin update came around, it affected this “formula”.

BUT there’s one very good reason that some of our sites
didn’t get hit at all, or recovered extremely fast.  And it has
absolutely NOTHING to do with where we were getting the links from,
but more so with “how” we were linking… i.e. the link
density and anchor text we were using.

We quickly realized that the old way of doing things needed to be
changed.  Slamming your sites with a bunch of keyword rich anchor
text is finished – it won’t work.

But the reason we’ve now rebounded is because of the following
two things:

1. It wasn’t the links that were the problem

Getting a ton of links still works.  Do I need to repeat that?

Most people froze and just stopped building links altogether
fearing that if they continued, their sites would get hit even harder.

They were wrong.  You see I figured that if you suddenly stopped
business as usual, that it would be a sure-fire sign to Google that
you were up to something and then they would hit you even harder.

They were looking for a reaction on your part.  If something
suddenly changes, you’re likely gaming the system.  If your
links are in fact natural, well then why would anything change?

It wouldn’t…

So we got that part right and now we’re almost back to
pre-Penguin traffic levels and rankings.

But there was a second thing…

2. Anchor text links have CHANGED…

Our formula for the amount of keyword anchor text we used was now
defunct.  We didn’t change it right away (keeping with what I
outlined above), but we did slowly phase in a new formula.

The new formula goes a little something like this (obviously
adjusting as needed, based on our competition too):

20% – your main keyword that you’re wanting to rank for

25% – related LSI keywords (these are keywords related to your main
keyword.  Think of these as the “related” searches that
Google shows you as you’re typing in their search field).

35% – your URL as the anchor text, as well as other combinations of
“branded keywords”.

20% – generic keywords

Google is now looking for a more diverse linking profile – a more
natural one.”
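To see where an existing backlink profile stands against a formula like this, the anchor categories can simply be counted and turned into percentages. A toy sketch (the anchors and their category labels are illustrative, not real data):

```python
from collections import Counter

# A toy backlink profile: (anchor text, category) pairs you have
# already classified by hand or with a lookup table
anchors = [
    ("blue widgets", "keyword"),
    ("buy blue widgets online", "lsi"),
    ("example.com", "url"),
    ("Example Brand", "url"),      # branded anchors counted with URL anchors
    ("click here", "generic"),
]

counts = Counter(category for _, category in anchors)
total = sum(counts.values())
shares = {cat: round(100 * n / total) for cat, n in counts.items()}
```

Comparing the resulting shares against the 20/25/35/20 targets above shows which category of link to build next.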

Checking Server Status and Backlinks with Scrapebox

Scrapebox has a couple of useful addons for checking the status of your pages and for checking your backlinks on sites and webpages that previously linked to you.

You may want to check the status of many pages on a website to know whether they are alive or dead, so you can decide what to do with them – e.g. redirect them to a live page.

Create a list of Indexed Pages

The first thing you want to do is create a list of webpages for the site. You can do this by using the Scrapebox sitemap plugin or by using the site:yourdomain.com command to get a list of webpages indexed in Google, Yahoo and Bing. Once you have this list you need to remove duplicates from the different search engines so that you have a clean list to work with.
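The de-duplication step can be done outside Scrapebox too. A minimal sketch that merges engine results and keeps the first occurrence of each URL, using a crude normalisation (ignore trailing slash and letter case) – the example URLs are made up:

```python
def dedupe_urls(urls):
    """Drop duplicate URLs from a merged list, keeping first occurrences.
    Normalisation is deliberately crude: trailing slash and case ignored."""
    seen = set()
    clean = []
    for url in urls:
        key = url.rstrip("/").lower()
        if key not in seen:
            seen.add(key)
            clean.append(url)
    return clean

merged = [
    "http://example.com/page-1",
    "http://example.com/page-1/",   # same page reported by another engine
    "http://example.com/page-2",
]
```

Note that lowercasing the whole URL can merge paths that differ only by case, which is usually what you want for this kind of list but is technically lossy.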

Check Server Status

From Scrapebox, install the Alive Check addon and run it against the list of indexed URLs. You can tell it which status code to treat as alive, such as 200 Success, or reverse-check for 404 pages. You can also choose whether to follow a 301 permanent redirect or count it as dead. The result marks each URL with a comment like Alive, Dead or Failed. In the case of Failed it is useful to recheck a few times, in case the site's server is rejecting your requests as a precaution against too many requests.
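The Alive/Dead/Failed labels boil down to how you classify the HTTP status code (or the absence of one, when the request itself fails). A minimal sketch of that decision logic, not the addon's actual code:

```python
def classify_status(code, ok_codes=(200,), redirect_is_alive=False):
    """Classify an HTTP status code the way an alive check would.
    None stands for a request that failed outright (timeout, DNS error)."""
    if code is None:
        return "Failed"
    if code in ok_codes:
        return "Alive"
    if redirect_is_alive and code in (301, 302):
        return "Alive"
    return "Dead"

# Reverse check: treat 404 as the "alive" condition to surface dead pages
reverse = classify_status(404, ok_codes=(404,))
```

Setting ok_codes to (404,) mirrors the addon's reverse-check option, and redirect_is_alive mirrors the choice of following a 301 rather than counting it as dead.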

Check Backlinks using Scrapebox

Another useful addon lets you check whether your links are still up on a particular webpage. The Link Checker plugin works by checking a batch of webpages or blogs for mentions of your site. You can also use it to check whether the links are nofollow or otherwise. My experience with the addon is that you need to re-run the server status error list a few times, as it times out before finding an actual link. However, the link-not-found report was accurate.
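What the link checker does per page can be sketched as scanning the HTML for an anchor tag pointing at your domain and noting whether it carries rel="nofollow". A crude regex version, not the plugin's code, with a made-up page snippet:

```python
import re

def find_link(html, domain):
    """Look for an <a> tag whose href contains the domain and report
    whether it was found and whether the tag carries "nofollow"."""
    for match in re.finditer(r'<a\b[^>]*href=["\']([^"\']+)["\'][^>]*>', html, re.I):
        tag, href = match.group(0), match.group(1)
        if domain in href:
            return {"found": True, "nofollow": "nofollow" in tag.lower()}
    return {"found": False, "nofollow": False}

page = '<p><a href="http://example.com/page" rel="nofollow">my site</a></p>'
```

A regex scan like this is fine for spot-checks; a proper HTML parser would be more robust for pages with unusual markup.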

Below is the instruction video for both addons: