Me & the Boss


Tuesday, September 4, 2007

Fwd: Are You Duplicating Your Own Content?



---------- Forwarded message ----------
From: DupeFree Pro Updates <updates@dupefreepro.com>
Date: Sep 4, 2007 5:53 AM
Subject: Are You Duplicating Your Own Content?
To: phil <phil.pal@gmail.com>


Hi Phil,

It sounds crazy, but you might be creating duplicates of your own
content without even knowing it.

If the Search Engines see duplicates of your pages, they
automatically choose which page to rank and shove the rest into
their supplemental index (the black hole of Search Engine traffic).

It's important you understand how to avoid duplicating your own
content so that you can stay in control of which of your pages rank
in the Search Engines.

The two main possible causes of this are:

1) Duplicate Domain URLs
2) Internal Duplicates


--------------------------------------------------
Duplicate Domain URLs
--------------------------------------------------

The Search Engines view all of the following URLs as *separate*
pages even though they all actually point to the same page...

http://yourdomain.com
http://yourdomain.com/
http://yourdomain.com/index.html
http://www.yourdomain.com
http://www.yourdomain.com/
http://www.yourdomain.com/index.html

If you (or others) are linking to your site using a variety of
these different URLs, you'll not only be diluting PR (PageRank)
across your site, but you'll also stand a chance of having your
content labelled as duplicate.

At the time of writing, Google is known to be aware of this issue
and is working to solve it. However, I urge you not to leave it to
fate; take control of the situation as soon as you can.

Fortunately, the workaround is very simple and only involves a
small piece of code placed into the .htaccess file on your
webserver (this works on Apache servers only).

Jason Katzenback has created a video tutorial on the PortalFeeder
Blog showing the code you need and how to use it. Check it out here:

http://portalfeeder.com/blog/?p=57
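
For reference, the kind of .htaccess rules involved looks roughly
like the sketch below. This is only an illustration, not the exact
code from the tutorial above: it assumes an Apache server with
mod_rewrite enabled and uses www.yourdomain.com as a placeholder,
so test any redirects carefully on your own site.

# Send every request on the non-www host to the www version
# with a permanent (301) redirect
RewriteEngine On
RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [R=301,L]

# Send requests for the root index.html to the plain root URL
RewriteRule ^index\.html$ http://www.yourdomain.com/ [R=301,L]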


--------------------------------------------------
Internal Duplicates
--------------------------------------------------

Are you 100% certain you do not have duplicates of your own content
within your own sites?

If you are using one of the popular free content management systems
(e.g. WordPress), your site might already be suffering from this.

For example, WordPress, the popular Blog management system,
automatically creates archive and category pages on your Blog. With
its default settings, these archive and category pages contain
duplicates of the exact same posts that appear elsewhere on your
Blog.

When Google finds the multiple versions of your post, its bot
tries to determine which page to rank and places all the rest into
the supplemental index.

This might not sound like a major problem because, one way or
another, your content is still getting ranked, but if the choice is
left up to the Search Engine bots you may not get the page *you*
want to rank.

Some content management systems create other kinds of internal
duplicates, such as different formats of the same page (e.g. PDF,
text, Word doc).

Perform the following Google search if you want to see how many
pages your website has in the supplemental index:

site:www.yourdomain.com ***-view

(make sure you replace yourdomain.com with your actual domain name)

Any pages listed by this search will be pages of your website that
Google has chosen to move to its supplemental index. (You'll see
the green text 'Supplemental Result' under each result.)

Pages in the supplemental index are known to get hardly any
traffic, if any at all, until they move out of the supplemental
index, and many report that moving them out is hard to achieve.

The workaround for this issue is to tell the Search Engine spiders
to ignore specific locations on your website. This enables you to
control which pages will be indexed and ranked.

You can do this by adding the following code to a 'robots.txt' file
at the root of your website:

User-agent: *
Disallow: /example/directory/
Disallow: /another/example/directory/
Disallow: /one/more/example/directory/

The first line, 'User-agent: *', causes the statements that follow
to apply to all search engine bots that read the robots.txt file.

The 'Disallow: /.../' lines are where you list each directory
location on your webserver that you want the Search Engine bots to
ignore (i.e. NOT index).

So in the example above, we are telling all search engine bots
*not* to index any webpage or indexable file located in the three
stated directory locations on our website.

Doing this correctly can really help you control which pages are
chosen by the Search Engines to rank in their results.
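
To make this concrete for the WordPress example mentioned earlier,
a robots.txt along the lines of the sketch below would keep the
bots out of the duplicate archive views. The paths shown are only
assumptions for a typical permalink setup; check which URLs your
own install actually generates before using anything like this.

User-agent: *
# Illustrative WordPress archive locations; verify these against
# your own permalink structure before relying on them
Disallow: /category/
Disallow: /2007/
Disallow: /feed/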

If you're not sure how all this works, please do take the time to
understand robots.txt properly before you implement it. Search
Google for info on robots.txt, and also check out the Wikipedia
page below:

http://en.wikipedia.org/wiki/Robots.txt


If you are putting in all the effort required to make sure your
content is unique, you really don't want to fall at this last
hurdle.

If you weren't aware of these potential pitfalls before, I hope
you'll take the simple action necessary to ensure you don't fall
victim to self-imposed duplicate content.

Talk soon,

Michael & Steven Grzywacz
DupeFree Pro




