How to programmatically download website sources? - json

I need to download a data feed from this website:
http://www.oddsportal.com/soccer/argentina/copa-argentina/rosario-central-racing-club-hnmq7gEQ/
Using Chrome's developer tools I was able to find this link:
http://fb.oddsportal.com/feed/match/1-1-hnmq7gEQ-1-2-yj45f.dat
which contains everything I need. The question is how to get to the second link programmatically (preferably in Java) when I know the first.
Thanks in advance for any useful help.

This is quite similar to this issue. You can use that to get a String with all the sources. Then you just search the string to find what you're looking for. It can look like this.
First start ChromeDriver and navigate to the page you wish to scrape.
WebDriver driver = new ChromeDriver();
driver.get("http://www.oddsportal.com/soccer/argentina/copa-argentina/rosario-central-racing-club-hnmq7gEQ/");
Then download the sources into a string
String scriptToExecute = "var performance = window.performance || window.mozPerformance || window.msPerformance || window.webkitPerformance || {}; var network = performance.getEntries() || {}; return network;";
String netData = ((JavascriptExecutor) driver).executeScript(scriptToExecute).toString();
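And finally search the string for the desired link. Start the search for ".dat" at the position of the host name, so that an earlier ".dat" elsewhere in the string cannot be matched by mistake:
int start = netData.indexOf("fb.oddsportal");
int end = netData.indexOf(".dat", start) + 4;
netData = netData.substring(start, end);
System.out.println(netData);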
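As a rough sketch of an alternative to searching one long string: the resource names can be filtered with a regular expression, so the result does not depend on where the first ".dat" happens to appear. The helper name findFeedUrls and the sample data are hypothetical, and the pattern assumes that feed URLs look like the one in the question. In the browser, the input could come from performance.getEntriesByType("resource").map(e => e.name), run through executeScript.

```javascript
// Hypothetical helper: picks feed URLs out of a list of resource names.
// Assumption: feed URLs follow the pattern seen in the question
// (host fb.oddsportal.com, a path under /feed/, extension .dat).
function findFeedUrls(resourceNames) {
  const feedPattern = /^https?:\/\/fb\.oddsportal\.com\/feed\/.+\.dat(\?.*)?$/;
  return resourceNames.filter((name) => feedPattern.test(name));
}

// Made-up sample standing in for the names of the page's resource entries.
const sampleNames = [
  "http://www.example.com/static/site.js",
  "http://fb.oddsportal.com/feed/match/1-1-hnmq7gEQ-1-2-yj45f.dat",
];
console.log(findFeedUrls(sampleNames));
```

Returning a list rather than a single string also covers pages that load more than one feed file.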

You can use a library such as jsoup in Java to scrape a page.
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
Once you have this you can query the links on that page and collect them:
Elements links = doc.select("a[href]");
Then run through this collection and follow the links.
for (Element link : links) {
    // A new variable name is needed here: redeclaring doc would not compile.
    Document linkedDoc = Jsoup.connect(link.attr("abs:href")).get();
}

Related

mxGraph -Save functionality not working locally

I downloaded the open-source mxGraph code from https://github.com/jgraph/mxgraph, but the Save functionality does not work when I run the application locally. Is there any way to get saving to work locally? Is any configuration required for that? Please help.
After clicking the Save button I get the error message below.
Here are code snippets for saving locally and for loading the saved file.
Code to export the XML of the current graph object:
let encoder = new mxCodec();
let result = encoder.encode(graph.getModel());
let xml = mxUtils.getXml(result);
//workaround for the xml export, do not include the <mxGraphModel> tags
xml = xml.substring(xml.indexOf("<mxGraphModel>")+"<mxGraphModel>".length, xml.indexOf("</mxGraphModel>"));
Code to load the XML and regenerate the saved state of the graph:
let doc = mxUtils.parseXml(xml);
let codec = new mxCodec(doc);
codec.decode(doc.documentElement, graph.getModel());
let elt = doc.documentElement.firstChild;
let cells = [];
while (elt != null)
{
    let cell = codec.decode(elt);
    if (cell != undefined) {
        // skip cells that list themselves as their own parent
        if (cell.id != undefined && cell.parent != undefined && (cell.id == cell.parent)) {
            elt = elt.nextSibling;
            continue;
        }
        cells.push(cell);
    }
    elt = elt.nextSibling;
}
graph.addCells(cells);
You can save locally by using the mxCodec class found in the IO package. Check out the example code here.
I'm not sure how to tie it into that specific button, but find the function that is called when you click save and add/replace it with the three lines needed to encode as xml.
As for how to get that XML saved as a file, I'm not sure. Perhaps you'll find that code when you modify the save button functionality. The easy way would be to create a div and replace its innerHTML with the XML data, then just copy it and save it yourself.
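One possible way to get the XML string into a file from the browser, sketched under assumptions: the helper names xmlToDataUri and downloadXml and the file name are hypothetical, and the approach relies on the browser supporting the download attribute on links. It is independent of mxGraph and only needs the XML string produced by the export code.

```javascript
// Hypothetical helper: turns an XML string into a data URI.
function xmlToDataUri(xml) {
  return "data:application/xml;charset=utf-8," + encodeURIComponent(xml);
}

// Hypothetical helper: offers the XML as a file download.
// `doc` is the page's document object; `filename` is an arbitrary name.
function downloadXml(doc, xml, filename) {
  const link = doc.createElement("a");
  link.href = xmlToDataUri(xml);
  link.download = filename; // asks the browser to save instead of navigating
  doc.body.appendChild(link);
  link.click();
  doc.body.removeChild(link);
}

// In a page this could be called as: downloadXml(document, xml, "graph.xml");
```

Passing the document in as a parameter keeps the helper easy to try outside a browser with a stand-in object.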

C# Get full html document from site

I tried to use GetStringAsync
using (var client = new HttpClient())
{
    var html = await client.GetStringAsync(url);
    richTextBox1.Text = html.ToString();
}
and DownloadString
System.Net.WebClient wc = new System.Net.WebClient();
string webData = wc.DownloadString(url);
richTextBox1.Text = webData;
But neither gives me the full HTML document that Google Chrome shows with F12. How can I get the full HTML of a URL using C#?
I need this URL: http://poeplanner.com/ but the result doesn't contain even a single table, while Chrome's F12 view does.
My guess is that the markup you don't see is added by JavaScript after the page loads, so you need to use a browser engine to get it as well.
Such a program runs the JavaScript too, and you can ask it for the final HTML.
If I'm right, try PhantomJS.
Related question on PhantomJS

Flickr API image unavailable windows phone

I'm trying to use Flickr's API to import images into a Windows Phone app and display them on the phone's panoramic display.
I'm new to Flickr's API and am stuck at the moment.
I've tried the following call:
// original string flickString = "https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=cc9babb2754c1d29837bea480c97013e&text=game+of+thrones&format=json&nojsoncallback=1&api_sig=bb86a60e9e42f31950bf53d25fc45f08";
string flickString = "https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=cc9babb2754c1d29837bea480c97013e&text=game+of+thrones&extras=url_sq%2C+url_t%2C+url_s%2C+url_q%2C+url_m%2C+url_n%2C+url_z%2C+url_c%2C+url_l%2C+url_o+&format=json&nojsoncallback=1&api_sig=9e74e094d8c6a7496fc66e070f5c0898";
var baseUrl = string.Format(flickString, flickrAPIKey);
string flickrResult = await client.GetStringAsync(baseUrl);
FlickrData flickrApiData = JsonConvert.DeserializeObject<FlickrData>(flickrResult);
if(flickrApiData.stat == "ok")
{
    foreach (Photo data in flickrApiData.photos.photo)
    {
        // To retrieve one photo
        // http://farm{farmid}.staticflickr.com/{server-id}/{id}_{secret}{size}.jpeg
        //string photoUrl = "http://farm{0}.staticflickr.com/{1}/{2}_{3}_o.jpeg";
        //string photoUrl = "http://farm{0}.staticflickr.com/{1}/{2}_{3}_b.jpeg";
        //string photoUrl = "http://farm{0}.staticflickr.com/{1}/{2}_{3}_n.jpg";
        string photoUrl = "http://farm{0}.staticflickr.com/{1}/{2}_{3}";
        string baseFlickrUrl = string.Format(photoUrl,
            data.farm,
            data.server,
            data.id,
            data.secret);
        flickr1Image.Source = new BitmapImage(new Uri(baseFlickrUrl));
        break;
    }
}
}
When I deploy and run the app, I get an image with a message saying that this image is unavailable, every time. I've tried changing the search terms etc. and still get the same message, which makes me wonder whether I missed something when setting up my Flickr account earlier. It's very frustrating - help please.
Thanks to card_master for his help so far
I'm also integrating with Flickr; I'm creating a website that uses their API.
I'm using FlickrNet, an open-source .NET library written in C# that you can use to call the Flickr services.
The benefit of using it in a mobile application is that you can take advantage of its caching, which lets you store images in the phone's storage. This won't work in a web application, though.
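One thing that may be worth checking (a guess, not a confirmed diagnosis): the active photoUrl format string in the question has no file extension, whereas Flickr's documented static photo URLs end in an optional size suffix followed by .jpg. A minimal sketch of building such a URL, written in JavaScript for brevity; buildPhotoUrl and the sample values are hypothetical, and the field names mirror the farm, server, id and secret fields used in the question:

```javascript
// Hypothetical helper: builds a static photo URL from the fields of one
// photo entry. sizeSuffix is optional, e.g. "m" or "b"; when it is
// omitted the URL has no size suffix.
function buildPhotoUrl(photo, sizeSuffix) {
  const size = sizeSuffix ? "_" + sizeSuffix : "";
  return "https://farm" + photo.farm + ".staticflickr.com/" +
    photo.server + "/" + photo.id + "_" + photo.secret + size + ".jpg";
}

// Made-up sample values, not a real photo.
console.log(buildPhotoUrl({ farm: 1, server: "2", id: "3", secret: "abc" }, "m"));
```

Since the search call in the question already requests extras such as url_m and url_s, the URLs returned in the response could be used directly instead, assuming the FlickrData classes expose those fields.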

how to modify url string of html extension pages

I am asking a very basic question. My problem is that I want to write a URL like the one below:
http://example.com/index.html/sometext
But when I write the URL as index.html/ followed by more text, it results in "page not found". If the URL only goes up to index.html, it works. Is there any way to do this with HTML pages? Please help. Thanks.
The URL you provided is not valid. The .html extension would be the end of the URL... with the format you've specified, it implies that another folder exists under index.html, which would never be possible. However, if you want to add parameters to the URL, you can add them like this:
http://example.com/index.html?text=sometext
You can then capture that data in your code.
EDIT
To answer the second question of how to pick up the URL parameters, you can use the method shown in this post:
http://www.jquerybyexample.net/2012/06/get-url-parameters-using-jquery.html
Basically, create a function as follows...
function GetURLParameter(sParam)
{
    var sPageURL = window.location.search.substring(1);
    var sURLVariables = sPageURL.split('&');
    for (var i = 0; i < sURLVariables.length; i++)
    {
        var sParameterName = sURLVariables[i].split('=');
        if (sParameterName[0] == sParam)
        {
            return sParameterName[1];
        }
    }
}
And you can use it like this:
var text = GetURLParameter('text');
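In current browsers the same lookup can also be done with the built-in URLSearchParams class, which handles percent-decoding as well. A minimal sketch; the helper name getUrlParameter and the sample query strings are only for illustration:

```javascript
// Hypothetical helper: reads one parameter from a query string such as
// window.location.search ("?text=sometext").
function getUrlParameter(search, name) {
  // URLSearchParams ignores a leading "?" and decodes percent-escapes.
  // get() returns null when the parameter is absent.
  return new URLSearchParams(search).get(name);
}

console.log(getUrlParameter("?text=sometext&lang=en", "text")); // prints "sometext"
```

In a page, the first argument would normally be window.location.search.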
You either want to use a query string like http://example.com/index.html?someVar=sometext
Or you will want to enable mod_rewrite to transform incoming URLs into something that your backend technology understands.
There are lots of frameworks that enable you to use those fancy URLs like http://example.com/index/someText, for example Laravel for PHP, ASP.NET MVC, or ...

Get HTML from Frame using WebBrowser control - unauthorizedaccessexception

I'm looking for a free tool or DLLs that I can use to write my own code in .NET to process some web requests.
Let's say I have a URL with some query string parameters similar to http://www.example.com?param=1. When I use it in a browser, several redirects occur and eventually HTML is rendered that has a frameset, and a frame's inner HTML contains a table with the data that I need. I want to store this data in an external file in CSV format. Obviously the data is different depending on the query string parameter param. Let's say I want to run the application and generate 1000 CSV files for param values from 1 to 1000.
I have good knowledge of .NET, JavaScript and HTML, but the main problem is how to get the final HTML in the server code.
What I tried is I created a new Form Application, added a webbrowser control and used code like this:
private void FormMain_Shown(object sender, EventArgs e)
{
    var param = 1; //test
    var url = string.Format(Constants.URL_PATTERN, param);
    WebBrowserMain.Navigated += WebBrowserMain_Navigated;
    WebBrowserMain.Navigate(url);
}
void WebBrowserMain_Navigated(object sender, WebBrowserNavigatedEventArgs e)
{
    if (e.Url.OriginalString == Constants.FINAL_URL)
    {
        var document = WebBrowserMain.Document.Window.Frames[0].Document;
    }
}
But unfortunately I receive an UnauthorizedAccessException, probably because the frame and the document are in different domains. Does anybody have an idea of how to work around this, or maybe a completely different approach to implementing functionality like this?
Thanks to Noseratio's comments I managed to do that with the WebBrowser control. Here are some major points that might help others who have similar questions:
1) The DocumentCompleted event should be used. In the Navigated event the body of the document is null.
2) The following answer helped a lot: WebBrowserControl: UnauthorizedAccessException when accessing property of a Frame
3) I was not aware of IHTMLWindow2 and similar interfaces. For them to work correctly I added references to the following COM libraries: Microsoft Internet Controls (SHDocVw), Microsoft HTML Object Library (MSHTML).
4) I grabbed the html of the frame with the following code:
void WebBrowserMain_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (e.Url.OriginalString == Constants.FINAL_URL)
    {
        try
        {
            var doc = (IHTMLDocument2) WebBrowserMain.Document.DomDocument;
            var frame = (IHTMLWindow2) doc.frames.item(0);
            // CrossFrameIE comes from the answer referenced in point 2
            var document = CrossFrameIE.GetDocumentFromWindow(frame);
            var html = document.body.outerHTML;
            var dataParser = new DataParser(html);
            //my logic here
        }
        catch (Exception)
        {
            // handle or log the failure here (not shown in the original snippet)
        }
    }
}
5) For working with the HTML, I used the HTML Agility Pack, which has pretty good XPath search support.