
How to optimize your Chinese URLs for Baidu (2/2)

26 May

(You can find the first post on how to optimize your Chinese URLs for Baidu here.)

In this post you will learn SEO tips for generating search-engine-friendly URLs, especially for Chinese content: URL rewriting, URL aliasing and transliteration of Chinese titles. Then we will put these tips into practice with Drupal.

1- URL rewriting

First of all, I hope you have all already activated URL rewriting on your website. It is one of the most basic things to do when you start thinking about search engine optimization.
 

What does it mean?
Unless you create all your web pages by hand, the framework or CMS that generates your pages will usually produce URLs like:

http://www.example.com?q=node/1
Your aim is to clean up the mess in this kind of URL and make it more search engine friendly.
 

There are two places where you can apply the rewriting:

  1. let the web server manage it: for example, with Apache, in your .htaccess file
  2. manage it in your code, by using aliases for your pages

These two ways can be combined. In Drupal, for example, the "?q=" is removed by Apache and the aliases are handled by the Drupal PHP code.
And so
http://www.example.com?q=node/1
becomes, as if by magic:
http://www.example.com/my-article-title
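
To make the two steps concrete, here is a minimal Python sketch of how a request could be resolved. It is only an illustration, not Drupal's actual code: the ALIASES table and the internal_path() function are hypothetical names, and the "clean" branch stands in for what the web server rewrite would do.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical alias table: public alias -> internal path.
ALIASES = {"my-article-title": "node/1"}


def internal_path(url):
    """Return the internal path that a request URL resolves to."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if "q" in query:
        # Unrewritten form, e.g. http://www.example.com?q=node/1
        path = query["q"][0]
    else:
        # Clean form: the web server rewrite would hand the path over as "q".
        path = parts.path.strip("/")
    # Alias step, handled in application code.
    return ALIASES.get(path, path)


print(internal_path("http://www.example.com?q=node/1"))          # node/1
print(internal_path("http://www.example.com/my-article-title"))  # node/1
```

Both URLs end up on the same internal page; only the second one is pleasant to read and carries keywords.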

2- URL aliasing

Of course you can create and manage your aliases manually, but for a large website the best thing to do is to automate this.
For example, you can build a system that takes your page titles and creates an alias from each of them.
This is a good idea because, most of the time, the important keywords are already in the title of your page, so putting them in the URL will improve your SEO.
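
As a rough sketch of such a system, the Python below builds an alias table from page titles. The function names are made up for the example, and the counter added on collisions is my own assumption about how duplicates might be handled. The alias rule is deliberately naive: it keeps only ASCII letters and digits, so accented and Chinese titles still need the treatment described further down.

```python
import re


def naive_alias(title):
    """Lowercase the title and join its ASCII words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def build_aliases(pages):
    """Map each generated alias to its internal path.

    `pages` maps internal paths to titles, e.g. {"node/1": "My Article Title"}.
    """
    aliases = {}
    for path, title in pages.items():
        # Fall back to the internal path if the title yields nothing usable.
        base = naive_alias(title) or path.replace("/", "-")
        alias, n = base, 1
        # Assumption: on a collision, append a counter to keep aliases unique.
        while alias in aliases:
            n += 1
            alias = f"{base}-{n}"
        aliases[alias] = path
    return aliases


print(build_aliases({"node/1": "My Article Title", "node/2": "My article title!"}))
# {'my-article-title': 'node/1', 'my-article-title-2': 'node/2'}
```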

But don't forget, there is a pretty annoying limit in the current web standards: a URL may only use characters from the ASCII character set. So when you build aliases for your URLs, you have to deal with all special characters, accents, spaces and of course... Chinese characters!

We saw last time that webmasters use URL encoding functions to deal with this issue. In PHP, for example, you can use the urlencode() function: http://cn.php.net/urlencode.
It simply returns a string in which all non-alphanumeric characters except "-", "_" and "." have been replaced with a percent (%) sign followed by two hex digits, and spaces with a plus (+) sign, as it is said in the web bible.
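
To see what this does to a title, here is a small Python illustration. It uses quote_plus() from Python's standard library as a stand-in for PHP's urlencode(); the two behave closely but not identically (Python leaves "~" untouched, for instance), so treat it as an approximation.

```python
from urllib.parse import quote_plus

# Each non-ASCII character is first encoded as UTF-8, then every byte
# becomes a %XX escape; spaces become "+".
chinese = quote_plus("汽车")
french = quote_plus("café au lait")

print(chinese)  # %E6%B1%BD%E8%BD%A6
print(french)   # caf%C3%A9+au+lait
```

The result is a valid URL, but neither a reader nor a search engine can pick out any keyword from it.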

But this is not optimal. You will have to adapt this function so that special characters like "é", "à" and "ç" are not encoded but simply replaced by e, a and c, and so that spaces are replaced by "-". That way, search engines can still understand and distinguish your words and keywords.
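
One possible way to do this, sketched in Python with the standard unicodedata module (slugify is a name chosen for the example, not an existing function): decompose each accented letter into a base letter plus an accent mark, drop the marks, then turn everything that is not a letter or digit into a hyphen.

```python
import re
import unicodedata


def slugify(title):
    """Turn a title into a lowercase, ASCII-only, hyphen-separated alias."""
    # NFKD splits "é" into "e" followed by a combining accent mark.
    decomposed = unicodedata.normalize("NFKD", title)
    # Dropping non-ASCII code points removes the accent marks
    # (and, as a side effect, anything with no ASCII base letter).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace every run of other characters (spaces, punctuation) with "-".
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


print(slugify("Café à la façon française"))  # cafe-a-la-facon-francaise
```

Note that this sketch simply throws Chinese characters away (slugify("汽车") returns an empty string), so it does not solve the problem for Chinese titles.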

3- Transliteration for Chinese characters

Let's look at the case of Chinese characters now.
The most search-engine-friendly solution is to automatically convert your titles into pinyin, because web search engines are able to:

  • recognize pinyin
  • identify pinyin words
  • link pinyin words and Chinese characters

Say you are talking about cars on a page of a Chinese website and the page's title is 汽车.
If you use the pinyin transliteration of this title in your URL (http://example.com/qiche), search engines will be able to recognize from the URL that the page is about cars (qiche is the pinyin for 汽车, "car").
 

And so you will improve your rankings ;o)
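
The idea can be sketched in a few lines of Python. The PINYIN table below is a tiny hand-written sample made up for this example; a real system would rely on a full character table. Joining the syllables of consecutive characters without a separator (qiche rather than qi-che) is also an assumption, chosen to match the URL above.

```python
import re

# Tiny illustrative table; a real one covers thousands of characters.
PINYIN = {"汽": "qi", "车": "che", "新": "xin", "闻": "wen"}


def pinyin_alias(title):
    """Build an ASCII alias by replacing known Chinese characters with pinyin."""
    out = []
    for ch in title:
        if ch in PINYIN:
            out.append(PINYIN[ch])
        elif ch.isascii() and ch.isalnum():
            out.append(ch.lower())
        else:
            # Spaces, punctuation and unknown characters act as separators.
            out.append("-")
    return re.sub(r"-+", "-", "".join(out)).strip("-")


print(pinyin_alias("汽车"))             # qiche
print(pinyin_alias("汽车 新闻 2010"))   # qiche-xinwen-2010
```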

4- Using Drupal with transliteration

With the powerful web framework Drupal, you can activate all these tips almost without touching a line of code (we know, that's amazing). Here is the list of what you will need to activate:

  • Clean URLs: simply activate this to remove the "?q=" from the URL
  • Path: adds the possibility to rename URLs using aliases
  • Pathauto: provides a mechanism for modules to automatically generate aliases for the content they manage (using page titles, for example)
  • Transliteration: provides a central service for transliteration

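Then, in your pathauto.inc file, simply add these lines of code at line 170 (the exact line may differ with your version of Pathauto):

if (module_exists('transliteration')) {
  // Convert the generated alias to ASCII, turning Chinese characters into pinyin.
  $output = transliteration_get($output);
}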

You should now have dramatically improved the search engine optimization of your URLs for Chinese content!

If you have any comments, good or bad, please write them below.

Cheers