Last week I used PhantomJS to pull information from an SNS community. It is really cool and easy to use, but if we overlook the two points below, we may waste a lot of working time and end up with nothing.

phantom document

Usage:

.1.create a webpage object.

var page = require ("webpage").create();

.2.open the page with URL

page.open("url", function (status) { /* Function_loading */ });

.3.loading page

We can divide Function_loading into two parts.

Part one checks whether the page opened, using status, which is the string "success" or "fail".

Part two runs once the page has loaded successfully, when the page object can be used with evaluate.

for example:

var page_body_information = page.evaluate(function(){

var body_info = document.body.innerHTML;

return body_info;

});

.4.destroy the object.

phantom.exit();
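Putting the four steps together, a minimal sketch might look like the one below. The helper name fetchBody and the example.com URL are placeholders of mine, not part of PhantomJS, and the typeof phantom guard is only there so the file does nothing when it is loaded outside PhantomJS.

```javascript
// Hypothetical helper (not a PhantomJS API): opens `url` in `page` and
// passes the body HTML to `done`, or null if the page could not be opened.
function fetchBody(page, url, done) {
  page.open(url, function (status) {
    // Part one: status is the string "success" or "fail".
    if (status !== "success") {
      done(null);
      return;
    }
    // Part two: the page is loaded, so evaluate() can read the DOM
    // and hand the result back through its return value.
    var html = page.evaluate(function () {
      return document.body.innerHTML;
    });
    done(html);
  });
}

// Only runs inside PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== "undefined") {
  var page = require("webpage").create();
  // Placeholder URL, assumed for illustration.
  fetchBody(page, "http://example.com/", function (html) {
    console.log(html === null ? "failed to open page" : html);
    phantom.exit();
  });
}
```

Calling phantom.exit() inside the callback, rather than at the end of the script, keeps PhantomJS alive until the page has actually been read.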

It seems quite easy to use, but after trying it (actually, with Kevin's help) I found two points that should be remembered.

Point 1: Ordinary JS commands like "console.log()" don't seem to work inside the page.evaluate() function; nothing shows up in the terminal. So if we want to get something out of this function, "return something" is a good way.

Point 2: With Kevin's help, I noticed that another object may exist, and that it causes the problem in point 1.

The page has to be loaded completely, and then page.evaluate() hands the function over to another object. console.log() works in page.open() but not in page.evaluate(), so we guess that there are two objects. This fits the PhantomJS documentation, which describes page.evaluate() as running the function in the context of the web page.

page.open("URL", function (status) {

console.log("it works");

page.evaluate(function () {

console.log("It doesn't work");

});

});
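If the output from inside page.evaluate() is needed after all, PhantomJS lets the outer script listen for it through the page.onConsoleMessage callback. Here is a minimal sketch, assuming a helper name (relayConsole), a "[page] " prefix and a placeholder URL that are all mine:

```javascript
// Hypothetical helper: forwards console messages raised inside the web page
// to `log`, tagged with a prefix so the two sources can be told apart.
function relayConsole(page, log) {
  // onConsoleMessage is the PhantomJS callback for page-side console output.
  page.onConsoleMessage = function (msg) {
    log("[page] " + msg);
  };
}

// Only runs inside PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== "undefined") {
  var page = require("webpage").create();
  relayConsole(page, function (line) { console.log(line); });
  // Placeholder URL, assumed for illustration.
  page.open("http://example.com/", function (status) {
    if (status === "success") {
      page.evaluate(function () {
        console.log("sent from inside evaluate()");
      });
    }
    phantom.exit();
  });
}
```

Returning a value is still the simplest way to get data out; the callback is mostly useful for debugging what happens on the page side.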

Webpage loading logic

.1 : As mentioned above, the webpage object can't be used before loading finishes, so we use setTimeout to wait until it can be.

.2 : setTimeout needs a delay, but we don't know which delay is suitable, so a recursive function that checks again and again is needed.

.3 : How do we control this recursive function? A limit on the number of runs (running_time_limit) may be a good idea. In practice, find out how many runs it takes before the webpage object works, then set the limit somewhat higher than that.
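The three points above can be sketched as a small polling helper. Everything here is an assumption for illustration: the name pollUntil, the readiness test (an element with id "content"), the limit of 20 runs and the 250 ms interval. The schedule parameter exists only so the loop can be driven without real timers when trying it out.

```javascript
// Hypothetical helper: calls check() until it returns true or runLimit
// runs have been made, waiting intervalMs between runs.
function pollUntil(check, onReady, onGiveUp, runLimit, intervalMs, schedule) {
  // Default scheduler is setTimeout; a different one can be injected.
  schedule = schedule || function (fn, ms) { setTimeout(fn, ms); };
  var runs = 0;
  function attempt() {
    runs += 1;
    if (check()) {
      onReady(runs);            // the page is usable
    } else if (runs >= runLimit) {
      onGiveUp(runs);           // running_time_limit reached
    } else {
      schedule(attempt, intervalMs); // recurse after the delay
    }
  }
  attempt();
}

// Only runs inside PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== "undefined") {
  var page = require("webpage").create();
  // Placeholder URL, assumed for illustration.
  page.open("http://example.com/", function (status) {
    pollUntil(
      function () {
        // Assumed readiness test: an element with id "content" exists.
        return page.evaluate(function () {
          return document.getElementById("content") !== null;
        });
      },
      function (runs) { console.log("ready after " + runs + " runs"); phantom.exit(); },
      function (runs) { console.log("gave up after " + runs + " runs"); phantom.exit(1); },
      20,  // running_time_limit (assumed value)
      250  // milliseconds between runs (assumed value)
    );
  });
}
```

Logging the run count on success is what makes it possible to find out how many runs are really needed and then choose a limit above that.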




Published

29 April 2012

Tags