Craigslist, CURL, Simple PHP DOM Issues - html

I am logging into Craigslist with CURL to scrape the status of my posted listings. The problem I encounter is the transfer of HTML from CURL $output to file_get_html. While Craigslist statuses are actually nested inside TR elements, I just wanted to test the most basic functions to see if things were getting passed through (i.e. link scraping). They are not.
For example, this doesn't work:
$cookie_file_path = getcwd()."/cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://accounts.craigslist.org/login?LoginType=L&step=confirmation&originalURI=%2Flogin&rt=&rp=&inputEmailHandle='.$email.'&inputPassword='.$password.'&submit=Log%20In');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://www.craigslist.org');
$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
$output = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);
echo $output;
//
include_once('simple_html_dom.php');
$html = file_get_html($output);
//find all links
foreach($html->find('a') as $element)
echo $element->href . '<br>';
I know the expression works because it returns links if I put in 'http://google.com', or something or other.

This is how it should be done
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'http://www.sitename.com');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);
$str = curl_exec($curl);
curl_close($curl);
$html= str_get_html($str);

Shouldn't you be using str_get_html instead of file_get_html?
Since $output is a string!
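
To add to the answers above: file_get_html() treats its argument as a file path or URL and reads from it, so handing it the markup returned by curl_exec() cannot work; str_get_html() is the call that parses a string. If you just want to confirm that links come through before wiring Simple HTML DOM back in, here is a minimal sketch that swaps in PHP's built-in DOMDocument instead. extract_links() and the sample markup are made up for illustration; they are not part of Simple HTML DOM or of Craigslist's pages.

```php
<?php
// Sketch: pull every href out of an HTML *string* with PHP's built-in DOM extension.
// extract_links() is a hypothetical helper, not a Simple HTML DOM function.
function extract_links(string $html): array
{
    if ($html === '') {
        return []; // DOMDocument::loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Stand-in for the string that curl_exec() returns with CURLOPT_RETURNTRANSFER set.
$output = '<html><body><a href="/post/1">One</a> <a href="/post/2">Two</a><a name="x">no href</a></body></html>';
print_r(extract_links($output));
```

If this prints links for the string you got from cURL, the transfer itself is fine and only the file_get_html() call needs to change.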

Related

Mayan EDMS API - Cannot add a document to a cabinet

Using Mayan EDMS, I am unable to add a document to a cabinet.
I am using PHP to send, create and upload a document, which works well.
Thereafter, trying to add the document to the cabinet results in the following error
{"detail":"Not found."}
I am at a loss on how to proceed as the API documentation is not clear about the request body and fails on the execute in the swagger documentation.
// Add document to cabinet
$cabinet = $params['cabinet'];
$data = array (
'document' => [$document_id], // values must be a list []
);
$post_data = json_encode($data);
$request_url = 'http://example.com/api/v4/cabinets/' . $cabinet . '/documents/add/';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $request_url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
$headers = [];
$headers[] = 'Accept: application/json';
$headers[] = 'Content-Type: application/json';
$headers[] = 'Content-Length: ' . strlen($post_data);
$headers[] = 'Authorization: Basic YWRtaW46QndLcFNaQnJURDI3MzN5';
$result = curl_exec($ch);
// $result returns : {"detail":"Not found."}
curl_close($ch);
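
One thing visible in the snippet above: the $headers array is built but never handed to cURL, so the Content-Type and Authorization headers are not actually sent. Whether that explains the "Not found." response is an assumption on my part (a wrong cabinet ID or endpoint path could produce the same message), but it is worth ruling out first. A minimal sketch, where build_cabinet_add_request() and send_request() are hypothetical helpers and the payload shape and URL are copied from the question, not checked against the Mayan documentation:

```php
<?php
// Hypothetical helper: gathers everything the request needs in one place.
function build_cabinet_add_request(string $base_url, string $cabinet, int $document_id, string $basic_auth): array
{
    return [
        'url' => rtrim($base_url, '/') . '/api/v4/cabinets/' . $cabinet . '/documents/add/',
        'body' => json_encode(['document' => [$document_id]]), // payload shape taken from the question
        'headers' => [
            'Accept: application/json',
            'Content-Type: application/json',
            'Authorization: Basic ' . $basic_auth,
        ],
    ];
}

// Hypothetical helper: sends the request and returns the status code and raw body.
function send_request(array $request): array
{
    $ch = curl_init($request['url']);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $request['body']);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $request['headers']); // the call missing from the question
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return ['status' => $status, 'body' => $response];
}

// Placeholder values; substitute your own host, cabinet ID, document ID and credentials.
$request = build_cabinet_add_request('http://example.com', '3', 42, base64_encode('user:password'));
print_r($request);
// $result = send_request($request); // uncomment to run against a real server
```

Logging the status code alongside the body also helps tell a missing cabinet from an authentication problem.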

How to convert HTML to PDF with ConvertAPI in PHP without a physical HTML file

I wish to convert an HTML document, which is stored as a string ($html), into a PDF using ConvertAPI via cURL (i.e. no physical file).
I don't understand how I need to post the $html to the API. I was looking at the example on the ConvertAPI webpage, but I can't seem to make sense of it.
The example is pasted below.
$html = '<html file contents>' ;
$parameters = array(
'Secret' => 'X?X?X?X?X?X?X',
);
function convert_api($src_format, $dst_format, $files, $parameters) {
$parameters = array_change_key_case($parameters);
$auth_param = array_key_exists('secret', $parameters) ? 'secret='.$parameters['secret'] : 'token='.$parameters['token'];
$curl = curl_init();
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER , false);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_URL, "https://v2.convertapi.com/{$src_format}/to/{$dst_format}?{$auth_param}");
if (is_array($files)) {
foreach ($files as $index=>$file) {
$parameters["files[$index]"] = file_exists($file) ? new CurlFile($file) : $file;
}
} else {
$parameters['file'] = file_exists($files) ? new CurlFile($files) : $files;
}
curl_setopt($curl, CURLOPT_POSTFIELDS, $parameters);
$response = curl_exec($curl);
$httpcode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
$error = curl_error($curl);
curl_close($curl);
if ($response && $httpcode >= 200 && $httpcode <= 299) {
return json_decode($response);
} else {
throw new Exception($error . $response, $httpcode);
}
}
thank you
Try this short example:
$secret = 'XXXXXXXXXX';
$curl = curl_init();
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_BINARYTRANSFER, true);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_HTTPHEADER, array('Content-Type: application/octet-stream', 'Accept: application/octet-stream', 'Content-Disposition: attachment; filename="file.html"'));
curl_setopt($curl, CURLOPT_URL, "https://v2.convertapi.com/html/to/pdf?secret=".$secret);
curl_setopt($curl, CURLOPT_POSTFIELDS, '<!doctype html><html lang=en><head><meta charset=utf-8><title>Conversion test</title></head><body>This is html body</body></html>');
$result = curl_exec($curl);
if (curl_getinfo($curl, CURLINFO_HTTP_CODE) == 200) {
file_put_contents("result.pdf", $result);
} else {
print("Server returned error:\n".$result."\n");
}
More examples can be found at: https://repl.it/#ConvertAPI

Instagram Latest post API suddenly stopped working

My Instagram feed was working fine, but it has stopped working, and I am trying to find a solution. At first I assumed the issue was related to SSL, so I even got an SSL certificate for my website, but that doesn't seem to help. Any help or suggestion is appreciated. The output used to show the latest image posted on Instagram and now shows nothing.
<div class="col-xs-12 col-sm-6 col-md-5 no-padding">
<div class="insta_img">
<?php
$instagram_id = get_field('instagram_id', 'option') ? get_field('instagram_id', 'option') : '';
$access_token = get_field('instagram_access_token', 'option') ? get_field('instagram_access_token', 'option') : '';
$url = "https://www.instagram.com/{$instagram_id}/?__a=1";
curl_close($ch);
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$json = curl_exec($ch);
curl_close($ch);
$data = json_decode($json);
$data = $data->user->media->nodes;
$text = $data[0]->caption;
$created_time = $data[0]->date ;
$images = $data[0]->thumbnail_src;
?>
<img src="<?php echo $images; ?>" alt="instagram"/>
<div class="insta_content"><?php echo $text; ?>
<dt><?php echo date('d/m/Y', $created_time); ?></dt>
</div>
</div>
You're not correctly parsing the response. For example, in that response there is no user->media->nodes. You will need to check the response you get from the endpoint, and then parse it correctly to get the images:
$access_token = get_field('instagram_access_token', 'option') ? get_field('instagram_access_token', 'option') : '';
$url = "https://api.instagram.com/v1/users/self/media/recent?access_token={$access_token}";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$json = curl_exec($ch);
curl_close($ch);
$data = json_decode($json);
$data = $data->data;
$text = $data[0]->caption->text;
$created_time = $data[0]->created_time;
$images = $data[0]->images->standard_resolution->url;
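
Because the response shape has already changed once, it may be worth checking it before indexing into it, so the template shows nothing rather than throwing notices when the endpoint returns an error. A minimal sketch, assuming the field names used in the snippet above (data, caption->text, created_time, images->standard_resolution->url); latest_post() is a hypothetical helper and the sample JSON is made up for illustration:

```php
<?php
// Hypothetical helper: returns the newest post as an array, or null when the
// response is not shaped the way the snippet above expects.
function latest_post(string $json): ?array
{
    $data = json_decode($json);
    if (!is_object($data) || !isset($data->data) || !is_array($data->data) || count($data->data) === 0) {
        return null; // invalid JSON, an error payload, or an empty feed
    }

    $post = $data->data[0];
    return [
        'text' => $post->caption->text ?? '', // caption can be null for posts without text
        'created_time' => (int) ($post->created_time ?? 0),
        'image' => $post->images->standard_resolution->url ?? '',
    ];
}

// Made-up sample responses, only to exercise the helper.
$ok = '{"data":[{"caption":{"text":"Hello"},"created_time":"1523000000","images":{"standard_resolution":{"url":"https://example.com/a.jpg"}}}]}';
$error = '{"meta":{"code":400,"error_message":"example error"}}';

var_dump(latest_post($ok));
var_dump(latest_post($error));
```

In the template, only render the img tag when the helper returns a non-null result.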
Instagram disabled the following capabilities (i.e., deprecated the following APIs) on 6 April 2018:
Follower List - to read the list of followers and followed-by users
Relationships - to follow and unfollow accounts on a user’s behalf
Commenting on Public Content - to post and delete comments on a user’s behalf on public media
Likes - to like and unlike media on a user’s behalf
Subscriptions - to receive notifications when media is posted
Users Information - to search for and view users' public content
Some information on Public Content returned through hashtag and location search - Name, Bio, Comments, Commenters, Follower Count, Following Count, Post Count, and Profile Picture
The access token is working fine, but the posts are not visible on the website. The code that I am using is as follows:
$access_token = get_field('instagram_access_token', 'option') ? get_field('instagram_access_token', 'option') : '';
$url = "https://api.instagram.com/v1/users/self/?access_token={$access_token}";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$json = curl_exec($ch);
curl_close($ch);
$data = json_decode($json);
$data = $data->user->media->nodes;
$text = $data[0]->caption;
$created_time = $data[0]->date ;
$images = $data[0]->thumbnail_src;

GET json with authentication API key

I'm making a script to be run by a cronjob.
It's supposed to fetch some JSON orders and process them.
My script at the moment looks like this:
$json_string = '/admin/orders/7109.json';
$real_url = "https://my-store.myshopify.com{$json_string}";
$user = 'my-user';
$pass = 'my-pass';
$ch = curl_init($real_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
curl_setopt($ch, CURLOPT_USERPWD, $user . ':' . $pass);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
$result = json_decode(curl_exec($ch),true);
curl_close($ch);
file_put_contents(__DIR__ . '/../debug/tracker_test.txt', print_r($result,true));
I'm getting this written into the debug file even though my credentials are correct:
Array
(
[errors] => [API] Invalid API key or access token (unrecognized login or wrong password)
)
Am I missing something ?
Edit: In the private apps section of Shopify it gives an example url format:
https://apikey:password@hostname/admin/resource.json
So now the script looks like this:
$json_url = 'https://my-api-key:my-api-pass@my-store.myshopify.com/admin/orders.json';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $real_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
$result = json_decode(curl_exec($ch), true);
curl_close($ch);
but I'm still getting the same error.
Turns out I had a syntax error.
I forgot to update the CURLOPT_URL with the new url variable.
It ended up looking like this:
$json_url = 'https://my-api-key:my-api-pass@my-store.myshopify.com/admin/orders.json';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $json_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
$result = json_decode(curl_exec($ch), true);
curl_close($ch);
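
A small addition: when credentials are embedded in the URL (the user:password@host form), any reserved character in the key or password, such as @, / or :, needs percent-encoding or the URL will be split in the wrong place. A minimal sketch of building that URL safely; build_private_app_url() is a hypothetical helper, and the key, password and store name are placeholders:

```php
<?php
// Hypothetical helper: builds https://key:password@host/path with the
// credentials percent-encoded so reserved characters cannot break the URL.
function build_private_app_url(string $api_key, string $password, string $host, string $path): string
{
    return 'https://' . rawurlencode($api_key) . ':' . rawurlencode($password)
        . '@' . $host . '/' . ltrim($path, '/');
}

// Placeholder credentials; the password deliberately contains reserved characters.
$json_url = build_private_app_url('my-api-key', 'p@ss/word', 'my-store.myshopify.com', '/admin/orders.json');
echo $json_url, "\n";
```

The resulting string can be passed to CURLOPT_URL exactly like $json_url in the final script above.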

Accessing Shoothill API in R/ R Studio

Hi, I am trying to access the Shoothill Flood Alert API in R using RStudio.
http://www.shoothill.com/floodapi/
I'm not entirely sure how I login with the API key I have and then call the API.
I have had success with calling an API using a different API, e.g.
library(jsonlite)
jsondata <- fromJSON("http://api.wunderground.com/api/c86b0e891d592775/geolookup/conditions/q/IA/Cedar_Rapids.json")#access api
names(jsondata)
summary(jsondata)
Help on accessing the Shoothill Flood Alert API would be much appreciated!
This technically isn't a full R answer but the example they give on the API provider page is 100% doable in R with the RCurl package:
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
$apiKey = '<Your API Key>';
$url = 'https://apifa.shoothill.com/Account/APILogin/';
$postinfo = "apikey=".$apiKey."&persist=false";
$cookie_file_path = dirname(__FILE__) . "/cookie.txt"; // dirname() has no trailing slash, so add one
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIE, "cookiename=1");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postinfo);
$out = curl_exec($ch);
curl_setopt($ch, CURLOPT_HTTPGET, 1);
curl_setopt($ch, CURLOPT_URL, "https://apifa.shoothill.com/API/Floods");
curl_setopt($ch, CURLOPT_HTTPHEADER,array('Content-Type: application/json'));
$html = curl_exec($ch);
curl_close($ch);
echo $html;
return $html;
?>
OmegaHat has a really detailed explanation of how to use RCurl, and you should be able to translate the above pretty well after going through it.