Get the full source code of a URL using PhantomJS. Useful for AJAX websites that need to give the Google crawler correct information.

hugsbrugs/php-phantomjs-get-html

php phantomjs get html

As explained in the Google Webmaster AJAX Crawling Guidelines, AJAX-based websites are often not crawled correctly by search engines.

If you work hard to build a cool AJAX website that never appears in search engine results, well, it's quite frustrating. Unless you use the power of PhantomJS, a browser simulator (to keep it simple), which lets us take a snapshot of ALL the HTML of a web page, including what is built from AJAX requests!

This very simple piece of code is inspired by https://github.com/microweber/screen, a PHP tool that creates website screenshots thanks to PhantomJS.

Test Usage

  • Upload to your web server (be careful: the phantomjs binary in the bin folder is for 64-bit Linux; replace it if needed with a build from http://phantomjs.org/download.html )
  • Make the binary executable: chmod +x /YOUR_PATH/bin/phantomjs
  • Make your folder writable (the script creates a jobs folder on the fly)
  • Open index.php in your browser and test it with a URL of your choice
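
The checks in the list above can be automated before opening index.php. The following is a minimal sketch, not part of the repository: the function name `check_phantomjs_setup` is hypothetical, and it assumes the binary lives in `bin/phantomjs` next to the script, as the list describes.

```php
<?php
// Hypothetical helper (not part of this repository): reports setup problems
// for the PhantomJS binary and the folder that will hold the jobs directory.
function check_phantomjs_setup(string $binPath, string $baseDir): array
{
    $problems = [];

    if (!is_file($binPath)) {
        $problems[] = "PhantomJS binary not found at $binPath";
    } elseif (!is_executable($binPath)) {
        $problems[] = "PhantomJS binary is not executable (run chmod +x $binPath)";
    }

    // The jobs folder is created on the fly, so its parent must be writable.
    if (!is_dir($baseDir) || !is_writable($baseDir)) {
        $problems[] = "Folder $baseDir is not writable";
    }

    return $problems;
}

// Example: check the layout described above, relative to this file.
foreach (check_phantomjs_setup(__DIR__ . '/bin/phantomjs', __DIR__) as $problem) {
    echo $problem, PHP_EOL;
}
```

An empty result means both conditions hold; it does not prove that the binary matches your server's architecture.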

Real World Usage

1. Add something similar to this wherever you handle URLs server side:

    if (strpos($Uri, "_escaped_fragment_") !== FALSE)
    {
        # REMOVE "?_escaped_fragment_=" FROM URI
        $Uri = str_replace("?_escaped_fragment_=", "", $Uri);

        # INCLUDE
        include_once "get_code_source.php";

        # DO THE MAGIC (RENDER THE PAGE WITH PHANTOMJS)
        $Html = get_code_source($Uri);

        # DISPLAY PAGE
        echo $Html;

        exit();
    }


2. In the left menu of Google Webmaster Tools, go to Crawl -> Fetch as Google, fill in an AJAX-built URL and click the Fetch and render button.
3. Then, in the table, click the URL you just submitted and open the Fetching tab to see what Google sees as your page's source code.
4. Finally, Google should start indexing your pages! HOORAY ;)
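
The snippet in step 1 simply strips the `?_escaped_fragment_=` marker from the URI. Under Google's AJAX crawling scheme, a crawler URL such as `http://example.com/?_escaped_fragment_=key=value` corresponds to the `#!` URL `http://example.com/#!key=value`, with the fragment percent-encoded. If your site uses `#!` URLs, a conversion along these lines may be closer to what PhantomJS needs to load. This is a sketch under that assumption; the function name `escaped_fragment_to_hashbang` is hypothetical and not part of this repository.

```php
<?php
// Hypothetical helper: maps the crawler's "_escaped_fragment_" URL back to
// the "#!" URL that a browser (or PhantomJS) would load.
function escaped_fragment_to_hashbang(string $uri): string
{
    $marker = '_escaped_fragment_=';
    $pos = strpos($uri, $marker);
    if ($pos === false) {
        return $uri; // Not a crawler request, leave the URI untouched.
    }

    // Everything before the marker, minus the trailing "?" or "&".
    $base = rtrim(substr($uri, 0, $pos), '?&');

    // The crawler percent-encodes the fragment, so decode it.
    $fragment = rawurldecode(substr($uri, $pos + strlen($marker)));

    return $base . '#!' . $fragment;
}

echo escaped_fragment_to_hashbang('http://example.com/?_escaped_fragment_=/about'), PHP_EOL;
```

The result can then be passed to `get_code_source()` in place of the stripped URI.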

Troubleshooting

You might have trouble getting PhantomJS to work on some shared hosting.
Contact your hosting provider to find out whether you can run it.
If some people manage to make it work on OVH shared hosting, please feel free to share your workaround!

Similar programs to achieve ajax SEO
