Cut desired HTML part from DOM object

I am trying to extract one specific element, identified by its CSS class, from my DOM object. I use the simplehtmldom library.
1) The library:
simplehtmldom.sourceforge.net
2) Because my localhost doesn't support fopen for some reason, I use the cURL library to get the HTML. Source:
http://simplehtmldom.sourceforge.net/manual_faq.htm
3) My script currently looks like this. It gives me the HTML source of the website I want.
<?php
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, "http://hokejbal.cz/1-liga/tabulky/");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($curl);
curl_close($curl);
print $result;
str_get_dom;
$ret = $html->find('.standings tablesort tablesorter tablesorter-default');
?>
4) Now I want just one part of the page, namely this table:
<table class="standings tablesort tablesorter tablesorter-default">
I found it using the Google Chrome developer tools.
Unfortunately, when I run the script, I get the whole HTML page, not just the desired part. What am I doing wrong?

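A space in a selector means "descendant of", so '.standings tablesort tablesorter tablesorter-default' looks for nested <tablesort> and <tablesorter> elements. To match a single element that carries all four classes, chain them without spaces: '.standings.tablesort.tablesorter.tablesorter-default'
Update: Try the code below. It selects on table.standings only, because the tablesorter classes may be added by JavaScript in the browser and so may be missing from the HTML that the server returns. Note also that print $result in your script is what outputs the whole page.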
<?php
$html = file_get_html('http://hokejbal.cz/1-liga/tabulky/');
$ret = $html->find('table.standings', 0);
print $ret;
?>
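
Since the question says fopen is unavailable, file_get_html() may fail in the same way, and the string returned by cURL has to be parsed instead. The sketch below shows that idea using PHP's built-in DOMDocument and DOMXPath rather than simplehtmldom, purely so that it runs without extra files. The sample markup and the function name extract_table_by_class are invented for illustration; in the real script, $result would be the string returned by curl_exec().

```php
<?php
// Stand-in for the page source returned by curl_exec(); this markup is invented.
$result = '<html><body><h1>Tabulky</h1>'
    . '<table class="standings tablesort"><tr><td>Team A</td><td>12</td></tr></table>'
    . '<p>footer</p></body></html>';

/**
 * Return the outer HTML of the first <table> carrying the given class,
 * or null if there is none. (Hypothetical helper, not part of any library.)
 * $class is assumed to be a plain class name without quote characters.
 */
function extract_table_by_class(string $html, string $class): ?string
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word inside the space-separated class attribute.
    $query = sprintf(
        '//table[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $doc->saveHTML($nodes->item(0));
}

$table = extract_table_by_class($result, 'standings');
print $table;
```

With simplehtmldom itself, the equivalent step would be to pass the cURL string to str_get_html($result) and then call find('table.standings', 0) on the returned object.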

Related

PHP Phantomjs: How to handle Page Transition? (Web Scraping)

Hello everyone.
This question may be a duplicate of "Using PhantomJs, how to get and handle the new page?", but that one has no exact answer.
My question is as follows.
I want to scrape a sequence of 3 pages:
1st page: enter a unique id, click next. If the id is valid, go to the 2nd page.
2nd page: click a link (which contains the id).
3rd page: download the PDF file.
So my aim is to download the PDF file for a given unique id automatically.
The main point is: how do I handle page transitions in PhantomJS with PHP?
My test code is:
// Use the composer autoloader.
require_once 'vendor/autoload.php';
// Setup mink drivers.
$goutteDriver = new \Behat\Mink\Driver\GoutteDriver();
$phantomjsDriver = new \Behat\Mink\Driver\Selenium2Driver('phantomJS');
// Setup mink sessions.
$goutteSession = new \Behat\Mink\Session($goutteDriver);
$phantomjsSession = new \Behat\Mink\Session($phantomjsDriver);
// Setup mink session manager.
$mink = new \Behat\Mink\Mink();
// Register sessions.
$mink->registerSession('goutte', $goutteSession);
$mink->registerSession('phantomjs', $phantomjsSession);
// Set PhantomJS as the default session.
$mink->setDefaultSessionName('phantomjs');
// Visit mink website with phantomjs driver.
$mink->getSession('phantomjs')->visit('https://testurl.com');
// Get the phantomjs session.
$session = $mink->getSession('phantomjs');
// Get the page document.
$page = $session->getPage();
echo $session->getCurrentUrl(), PHP_EOL;
// $page->find('css', '#guides')->clickLink("Drivers");
// echo $session->getCurrentUrl(), PHP_EOL;
// Fill in the id field and submit the form.
$input = $page->find('css', '#id');
$input->setValue("1234567890");
echo $input->getValue(), PHP_EOL;
$page->find('css', '#validar')->Click();
echo $page->find('css', '#validar')->getValue(), PHP_EOL;
$session->executeScript('document.getElementById("validar").click()');
//$session->reload();
sleep(5);
// $mink->getSession('phantomjs')->visit('https://testurl.com/next');
$page = $session->getPage();
echo $session->getCurrentUrl(), PHP_EOL;
sleep(5);
$mink->getSession('phantomjs')->visit('https://testurl.com/next2?id=1234567890');
$page = $session->getPage();
echo $session->getCurrentUrl(), PHP_EOL;
// Stop browser sessions.
$mink->stopSessions();
So:
How do I handle page transitions?
How do I download the PDF file properly?

Yii2: refresh() not working after sending response content

I have an ActiveForm that collects some data. When I click the send button, the controller runs the model (in this case a CSV file generator), but the refresh does not work. When I remove the generate() call, the page refreshes.
After some testing, it seems that fputcsv() stops the script, so nothing after it runs.
controller
public function actionIndex()
{
$model = new Export();
if ($model->load(Yii::$app->request->post()) && $model->validate()) {
$fields = Yii::$app->request->post('Export');
\backend\models\Export::generate();//this prevents the refresh
Yii::$app->session->setFlash();
return $this->refresh();
} else {
return $this->render('index' , ['model' => $model]);
}
}
model
static public function generate()
{
header('Content-Encoding: UTF-8');
header('Content-Type: text/csv; charset=UTF-8');
header('Content-Disposition: attachment; filename="sample.csv"');
header("Pragma: no-cache");
header("Expires: 0");
$data = [array comes here];
$fp = fopen('php://output', 'w') or die("Unable to open file!");
fputs($fp, $bom =( chr(0xEF) . chr(0xBB) . chr(0xBF) ));
foreach ( $data as $line ) {
fputcsv($fp , $line , ';' );
}
stream_get_contents($fp);
fclose($fp);
}

Controller::refresh() uses a Location header to reload the page. Since headers must precede the content, you cannot add a new header after content has been sent. Your Export::generate() method sends content, so no header can be added after it, and therefore $this->refresh() does not work.
Prior to Yii 2.0.14 there was a bug: the framework simply ignored the fact that you were trying to send a header after content had been sent. If you upgrade Yii, you should get a "nice" exception in this case.
If you're trying to display a nice page after the file has been downloaded, your approach is incorrect. You can't really return a file and then redirect to a different page. You should first display a nice HTML page and, from inside it, redirect the user to the download page (for example by using <meta http-equiv="refresh" content="0; url=http://example.com/" /> in the head, or by creating a hidden form and submitting it with JavaScript). After downloading the file the user stays on this nice page, so from a UX perspective everything should be OK.
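
The "page first, download second" approach described above can be sketched in plain PHP as follows. This is a minimal illustration, not Yii code: the function name render_download_page, the URL /export/download and the message text are all assumptions made up for the example.

```php
<?php
/**
 * Build a small HTML page that shows a message and then sends the browser
 * to the download URL through a meta refresh. (Hypothetical helper.)
 */
function render_download_page(string $downloadUrl, string $message): string
{
    // Escape both values, since they end up inside HTML attributes and text.
    $url = htmlspecialchars($downloadUrl, ENT_QUOTES, 'UTF-8');
    $text = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . '<html><head>'
        . '<meta http-equiv="refresh" content="0; url=' . $url . '" />'
        . '<title>Export</title></head>'
        . '<body><p>' . $text . '</p>'
        // Fallback link in case the automatic refresh is blocked.
        . '<p><a href="' . $url . '">Start the download manually</a></p>'
        . '</body></html>';
}

// The download URL is an invented example of a second action that only sends the CSV.
$page = render_download_page('/export/download?format=csv&sep=;', 'Your export is being prepared.');
echo $page;
```

In a Yii2 application, the first action would render a view containing this markup (where a flash message can also be shown), and a second action would send only the file, for example with Yii::$app->response->sendContentAsFile(). Because the response is a download, the browser keeps displaying the first page.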

Riot Api - Json

I'd love to start working with JSON for the Riot API, but I don't know how to start. I have written something like this, but it doesn't show anything, just a white page.
<html>
<head>
<title>JSON example</title>
<script language="javascript" >
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://euw.api.pvp.net/api/lol/euw/v2.5/league/by-summoner/31827832?api_key=myapikey');
// Set so curl_exec returns the result instead of outputting it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Get the response and close the channel.
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response, true);
foreach($json as $elem){
echo $elem[0]['name'];
echo $elem[0]['tier'];
}
</script>
</head>
<body>
</body>
</html>

Please take a look at these guides I wrote for an introduction to the Riot API and to using and understanding JSON.
While you can use many languages, such as PHP, I teach it in JavaScript/Ajax/jQuery, as that knowledge can be applied to other languages pretty easily, especially PHP, since the syntax of the two looks fairly similar.
[Tutorial] Beginners introduction to Riot API and JSON, using Javascript and Ajax
I discuss what the API is and how to use it, as well as how to secure your key. I also cover JSON and how to access and understand it in a program.
Let me know if you have any questions.
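
For readers who want to stay with PHP: the code in the question sits inside a <script> tag, so the browser treats it as JavaScript and never runs it as PHP, which would explain the white page. PHP has to be placed in a <?php ... ?> block and executed on the server. The sketch below shows only the decoding step. The sample payload is invented to match what the question's loop expects and is not a real API response.

```php
<?php
// Invented sample, shaped like the question's loop expects; NOT real Riot API output.
$response = '{"31827832":[{"name":"Example League","tier":"GOLD"}]}';

// Decode into nested associative arrays.
$json = json_decode($response, true);
if (!is_array($json)) {
    exit('Could not decode the response: ' . json_last_error_msg());
}

$lines = [];
foreach ($json as $summonerId => $leagues) {
    // Each top-level key maps to a list of league entries.
    foreach ($leagues as $league) {
        $lines[] = $summonerId . ': ' . $league['name'] . ' (' . $league['tier'] . ')';
    }
}

echo implode("\n", $lines), "\n";
```

In the real script, $response would come from curl_exec(), and it is worth checking that value for false before decoding it.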

AS3 ALIVEPDF saving via method.remote (PHP) no longer working

The SWF is located on a web server. I am calling the function using this code in AS3...
myPDF.save(Method.REMOTE, "http://www.example.com/generator/createpdf.php",
Download.ATTACHMENT, "line.pdf");
Here is my PHP script located on the server...
$method = $_GET['method'];
$name = $_GET['name'];
if ( isset ( $GLOBALS["HTTP_RAW_POST_DATA"] )) {
// get bytearray
$pdf = $GLOBALS["HTTP_RAW_POST_DATA"];
// add headers for download dialog-box
header('Content-Type: application/pdf');
header('Content-Length: '.strlen($pdf));
header('Content-disposition:'.$method.'; filename="'.$name.'"');
echo $pdf;
} else echo 'An error occurred.';
It used to work, but stopped a while back. Any help would be greatly appreciated.

1) This stopped working for me as well, until I added the following:
if(!$HTTP_RAW_POST_DATA){
$HTTP_RAW_POST_DATA = file_get_contents('php://input');
}
2) I also patched /src/org/alivepdf/pdf/PDF.as::save() as described in a linked post.
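
For background, $HTTP_RAW_POST_DATA was deprecated in PHP 5.6 and removed in PHP 7.0, which is a likely reason the original script stopped working after a server upgrade. A minimal sketch of the server side that reads from php://input directly is shown below. The helper names read_request_body and build_pdf_headers are hypothetical, and the whitelist assumes that the method parameter is either "inline" or "attachment".

```php
<?php
/**
 * Read the raw request body. The source is a parameter only so that the
 * function can be tried against a file; on a server, keep the default.
 */
function read_request_body(string $source = 'php://input'): string
{
    $body = file_get_contents($source);
    return $body === false ? '' : $body;
}

/**
 * Build the download headers, refusing unexpected values taken from the query string.
 */
function build_pdf_headers(string $method, string $name, int $length): array
{
    // Only two dispositions make sense here; fall back to a download dialog.
    $disposition = in_array($method, ['inline', 'attachment'], true) ? $method : 'attachment';
    // Strip directories and unusual characters from the suggested file name.
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($name));

    return [
        'Content-Type: application/pdf',
        'Content-Length: ' . $length,
        'Content-Disposition: ' . $disposition . '; filename="' . $safeName . '"',
    ];
}

// Demonstration with a temporary file standing in for the posted byte array.
$tmp = tempnam(sys_get_temp_dir(), 'pdf');
file_put_contents($tmp, '%PDF-1.4 demo bytes');
$pdf = read_request_body($tmp);
unlink($tmp);

$headers = build_pdf_headers('attachment', 'line.pdf', strlen($pdf));
print_r($headers);
```

In createpdf.php, the same helpers would be called as read_request_body() with no argument, each returned header would be passed to header(), and the script would then echo $pdf.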

How come Smart search is so fast in facebook

I am wondering how Facebook has implemented the search functionality on the home page. As soon as I type 'a', the dropdown appears with the list of friends, and it is very, very fast.
I saw in Firebug that it sends an Ajax request to one of its files.
I want to implement the same functionality in one of my web apps, but even though my table has just 4 records, the dropdown takes a while to load.
What I have done is:
send an Ajax request with my search parameter
execute the SQL query
build the HTML
return it so that it replaces the div

Facebook has very expensive servers using a very expensive CDN (Akamai) and uses server-side caching like memcached.
If you can predict with reasonable accuracy what the user might search for (e.g. a known list of friends and friends-of-friends) and pre-cache it on the server, you can do this quickly. If you deliver that list with the web page in the first place and cache it on the client, it will be lightning fast (once the page is loaded, anyway).
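
The pre-caching idea can be sketched as a prefix filter over a list that is already in memory, instead of one SQL query per keystroke. This is a minimal illustration only: the function name suggest and the sample names are invented, and the plain array stands in for whatever cache (memcached, session, or a list embedded in the page) actually holds the data.

```php
<?php
/**
 * Return up to $limit names that start with $prefix, ignoring case.
 * (Hypothetical helper; the list is assumed to be loaded from a cache.)
 */
function suggest(array $names, string $prefix, int $limit = 5): array
{
    if ($prefix === '') {
        return [];
    }
    $matches = [];
    foreach ($names as $name) {
        // stripos() returning 0 means the name begins with the prefix.
        if (stripos($name, $prefix) === 0) {
            $matches[] = $name;
            if (count($matches) >= $limit) {
                break;
            }
        }
    }
    return $matches;
}

// Invented sample data standing in for a cached friends list.
$friends = ['Alice', 'Adam', 'Bob', 'anna'];
$hits = suggest($friends, 'a');

// Returning JSON keeps the response small; the browser can build the dropdown itself.
echo json_encode($hits), "\n";
```

The same list could instead be emitted once into the page with json_encode() and filtered in the browser, which removes the round trip entirely.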

Try the following PHP code. It queries the Fast Facebook Search site and echoes the results. I hope it is helpful; feel free to tweak it :)
<?php
function facebook_search_api($args, $referer = 'YOUR SITE ADDRESS', $endpoint = 'web')
{
$url = "http://www.FastFacebookSearch.com/".$endpoint;
if ( !array_key_exists('v', $args) )
$args['v'] = '1.0';
//$args['key']="ABQIAAAArMTuM-CBxyWL0PYBLc7SuhT2yXp_ZAY8_ufC3CFXhHIE1NvwkxT-uD75NXlWUsDRBw-8aVAlQ29oCg";
//$args['userip']=$_SERVER['REMOTE_ADDR'];
$args['rsz']='8';
$url .= '?'.http_build_query($args, '', '&');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, $referer);
$body = curl_exec($ch);
curl_close($ch);
//decode and return the response
return json_decode($body,true);
}
// PHP has already URL-decoded $_GET values, so urldecode() is not needed here.
$query_temp = isset($_GET['q']) ? $_GET['q'] : "none";
$search_type = isset($_GET['search_engine']) ? $_GET['search_engine'] : "";
// Escape user input before echoing it back into the page.
echo htmlspecialchars("$search_type Search Results for: $query_temp")."<br />-----<br />";
$query = $search_type.$query_temp;
// The first request is only used to learn how many result pages exist.
$first = facebook_search_api(array('q' => $query));
$pages = isset($first['responseData']['cursor']['pages']) ? $first['responseData']['cursor']['pages'] : array();
$nres = 0;
for ($i = 0; $i < count($pages); $i++)
{
// Fetch one page of results, starting at that page's offset.
$res = facebook_search_api(array('q' => $query, 'start' => $pages[$i]['start']));
$results = isset($res['responseData']['results']) ? $res['responseData']['results'] : array();
for ($j = 0; $j < count($results); $j++)
{
$nres++;
$url = htmlspecialchars(urldecode($results[$j]['url']));
echo "<a href=\"$url\"><big>".urldecode($results[$j]['title'])."</big></a><br />";
echo "<font color=green><small>$url</small></font><br>";
echo urldecode($results[$j]['content'])."<br><br>";
}
}
echo "<br />---<br />Total number of results: $nres";
?>

We Keep Coding
