Scraping data after filling out form? - html

I'm doing a little project for my class and I'm just a beginner, so please forgive me if I mix up some of my terminology.
Basically, I'm creating an interactive journey planner for my city's public transit system. Unfortunately, they haven't made all the data I need publicly available. So instead of putting all my time into gathering the data for personal use, I've opted to do some screen scraping - letting their servers calculate the journey info from a START and STOP variable and then displaying the selected info on my page.
So is it possible to fill out a form's fields remotely, and then scrape the data on the page that subsequently loads? And if so, what would be the quickest, most convenient way? This happens to be a case where the data can't be manipulated via the URL, so it has to access the data by filling out the form first.
The website in question:
http://jp.translink.com.au/travel-information/journey-planner

Here is what you can do:
1.) Send a POST request to the journey planner with form data like the following (be aware that CORS may block this from a browser; in that case you could send the request server-side, for example with cURL via PHP):
Start:Wickham Tce, Spring Hill
End:Upper Edward St, Spring Hill
SearchDate:10/05/2013 12:00:00 AM
TimeSearchMode:LeaveAfter
SearchHour:7
SearchMinute:40
TimeMeridiem:AM
TransportModes:Bus
TransportModes:Train
TransportModes:Ferry
MaximumWalkingDistance:1500
WalkingSpeed:Normal
ServiceTypes:Regular
ServiceTypes:Express
ServiceTypes:NightLink
FareTypes:Standard
FareTypes:Prepaid
FareTypes:Free
2.) The response will point you to a new location. This seems to be a REST link; the important part for you is the id at the end. Request that page, parse the HTML, and look for the div with the HTML id option-summaries; inside it, the divs travel-option-1 to travel-option-n hold the details. You will have to inspect the markup carefully to find out which information is stored where and how you can use it.
In order to find such things you should learn how to use Firebug or Chrome's development tools.
This is one way to solve your problem. It is probably not the best, but still better than "screen-scraping" anything. It will, however, demand a lot of skill and effort. Furthermore, if the data provider changes the page even slightly, your solution will stop working. They might also cut off your access through CORS or by other means (blocking your IP, etc.).
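The two steps above can be sketched roughly in Python. This is only a sketch under assumptions: the field names are the ones listed above, SAMPLE_HTML is made up to mirror the option-summaries / travel-option-n layout described in step 2, and fetch_results is a hypothetical helper that has not been run against the live site.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PLANNER_URL = "http://jp.translink.com.au/travel-information/journey-planner"

# A subset of the form fields listed above. A list of pairs (not a dict)
# keeps repeated keys such as TransportModes.
FORM_FIELDS = [
    ("Start", "Wickham Tce, Spring Hill"),
    ("End", "Upper Edward St, Spring Hill"),
    ("SearchDate", "10/05/2013 12:00:00 AM"),
    ("TimeSearchMode", "LeaveAfter"),
    ("SearchHour", "7"),
    ("SearchMinute", "40"),
    ("TimeMeridiem", "AM"),
    ("TransportModes", "Bus"),
    ("TransportModes", "Train"),
    ("TransportModes", "Ferry"),
]


def encode_form(fields):
    """URL-encode (name, value) pairs as a POST body."""
    return urlencode(fields).encode("ascii")


def fetch_results(url=PLANNER_URL, fields=FORM_FIELDS):
    """Hypothetical helper: POST the form and return the resulting HTML.

    urlopen normally follows a 302/303 redirect to the results page itself.
    """
    request = Request(url, data=encode_form(fields))
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


class TravelOptionParser(HTMLParser):
    """Collect the text found inside each div whose id starts with 'travel-option-'."""

    def __init__(self):
        super().__init__()
        self.options = {}     # div id -> list of text fragments
        self._current = None  # id of the option div we are inside, if any
        self._depth = 0       # nesting depth of divs inside that option

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._current is None:
            div_id = dict(attrs).get("id") or ""
            if div_id.startswith("travel-option-"):
                self._current = div_id
                self._depth = 1
                self.options[div_id] = []
        else:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self._current is not None:
            self._depth -= 1
            if self._depth == 0:
                self._current = None

    def handle_data(self, data):
        if self._current is not None and data.strip():
            self.options[self._current].append(data.strip())


# Made-up markup, only to exercise the parser without touching the network.
SAMPLE_HTML = """
<div id="option-summaries">
  <div id="travel-option-1"><span>Bus 330</span><div>7:45 AM</div></div>
  <div id="travel-option-2"><span>Train</span></div>
</div>
"""


def parse_options(html):
    parser = TravelOptionParser()
    parser.feed(html)
    return parser.options
```

Calling parse_options(fetch_results()) would chain the two steps; the real page will almost certainly need more specific handling than "collect all text", so treat the parser as a starting point.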

Related

Getting specific data from video surveillance web-interface in Zabbix

guys! I'm looking for a solution or some ideas on how to solve my task.
There is a video surveillance camera (vendor: Hikvision) with an accessible web-interface.
In the web-interface, there is a field Device Name containing data I need to retrieve by means of the Zabbix server and then use for renaming discovered hosts.
Since Hikvision cameras support SNMP, I've tried the SNMP agent in Zabbix. It turned out that the Hikvision MIB doesn't contain data from that field.
Also exploring web-interface through Developer tools in Google Chrome I stumbled upon the string Request URL: http://10.90.187.16/ISAPI/System/deviceInfo which gives such response in XML format:
<DeviceInfo xmlns="http://www.hikvision.com/ver20/XMLSchema" version="2.0">
<deviceName>1.5.1.1</deviceName>
<deviceID>566eec0b-6580-11b3-81a1-1868cb48861f</deviceID>
<deviceDescription>IPCamera</deviceDescription>
<deviceLocation>hangzhou</deviceLocation>
<systemContact>Hikvision.China</systemContact>
<model>DS-2CD2155FWD-IS</model>
<serialNumber>DS-2CD2155FWD-IS20170417AAWR749464587</serialNumber>
<macAddress>18:68:cb:48:86:1f</macAddress>
<firmwareVersion>V5.4.5</firmwareVersion>
<firmwareReleasedDate>build 170124</firmwareReleasedDate>
<encoderVersion>V7.3</encoderVersion>
<encoderReleasedDate>build 170123</encoderReleasedDate>
<bootVersion>V1.3.4</bootVersion>
<bootReleasedDate>100316</bootReleasedDate>
<hardwareVersion>0x0</hardwareVersion>
<deviceType>IPCamera</deviceType>
<telecontrolID>88</telecontrolID>
<supportBeep>false</supportBeep>
<supportVideoLoss>false</supportVideoLoss>
</DeviceInfo>
The tag <deviceName>1.5.1.1</deviceName> contains the required data, and the question now is how to put two and two together by means of Zabbix.
Digging into the Zabbix documentation, I've found an article about creating an item based on the HTTP agent with an XML request. Unfortunately, there aren't any examples of how to do it exactly.
Has somebody had such experience? Any clues will be helpful.
You can create an HTTP agent item, set its type of information to Text, and point it to http://10.90.187.16/ISAPI/System/deviceInfo (don't forget the authentication, if required!). Zabbix will then retrieve the full XML.
To get the desired value you have to create a dependent item, point it to the previous item and set up a preprocessing step.
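Create a single XML XPath preprocessing rule with the parameter string(/DeviceInfo/deviceName) to get the 1.5.1.1 value. XPath is case-sensitive, so the element name must match the XML exactly (deviceName, not DeviceName). If the expression returns nothing because of the default xmlns declared on DeviceInfo, a namespace-agnostic form such as string(//*[local-name()='deviceName']) should work.
If you want to get the firmware version, create another dependent item and set the XPath to string(/DeviceInfo/firmwareVersion), and so on for every element you need.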
If you want a single value you can use a single item, adding the preprocessing rule to the http agent item. I use my solution for flexibility, maybe one day I'll need another XML element or maybe a firmware update will add some element to the page.
Dependent items are more flexible, but of course the full XML uses more storage in the database for stuff you don't need right now: it's a tradeoff, either way works!
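If you want to check the extraction outside Zabbix before wiring it into a preprocessing step, something like the following can help. It is plain Python, not Zabbix configuration: SAMPLE_XML is a trimmed copy of the response from the question, and device_field is just an illustrative helper. It also shows that element names are case-sensitive and live in the Hikvision namespace.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the /ISAPI/System/deviceInfo response from the question.
SAMPLE_XML = """<DeviceInfo xmlns="http://www.hikvision.com/ver20/XMLSchema" version="2.0">
<deviceName>1.5.1.1</deviceName>
<model>DS-2CD2155FWD-IS</model>
<firmwareVersion>V5.4.5</firmwareVersion>
</DeviceInfo>"""

# The document declares a default namespace, so every element belongs to it.
NAMESPACES = {"hik": "http://www.hikvision.com/ver20/XMLSchema"}


def device_field(xml_text, field):
    """Return the text of a direct child of DeviceInfo, or None if it is absent."""
    root = ET.fromstring(xml_text)
    node = root.find(f"hik:{field}", NAMESPACES)
    return node.text if node is not None else None


print(device_field(SAMPLE_XML, "deviceName"))       # exact element name
print(device_field(SAMPLE_XML, "firmwareVersion"))
print(device_field(SAMPLE_XML, "DeviceName"))       # wrong case: no match
```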

How can I generate a file like this for Bing Heat Map data?

I am working on a fairly simple Heat Map application where the longitude and latitude of the points will be stored in a SQL Server database. I have been looking at an example that uses an array of objects as follows (eliminated a lot of data for brevity):
/* Sample data to demonstrate Bing Maps Heatmap */
/* http://alastair.wordpress.com */
var CrimeData = [
new Microsoft.Maps.Location(52.67280, 0.94392),
new Microsoft.Maps.Location(52.62423, 1.29493),
new Microsoft.Maps.Location(52.62187, 1.29080),
new Microsoft.Maps.Location(52.58962, 1.72228),
new Microsoft.Maps.Location(52.69915, 0.24332),
new Microsoft.Maps.Location(52.51161, 0.99350),
new Microsoft.Maps.Location(52.59573, 1.17067),
new Microsoft.Maps.Location(52.94351, 0.49153),
new Microsoft.Maps.Location(52.64585, 1.73145),
new Microsoft.Maps.Location(52.75424, 1.30079),
new Microsoft.Maps.Location(52.63566, 1.27176),
new Microsoft.Maps.Location(52.63882, 1.23121)
];
What I want to do is present the user with a list of some sort that displays all the data sets that exist in the database (they each have a name associated with them) and then allow the user to check all or only a select few. I will then need to generate an array like the above to create the heat map. Any ideas on a good approach to this?
What you are trying to achieve is more a general web development question than something specific to Bing Maps.
To summarize, there are multiple ways to do this, but it really depends on what you are able to do and what you need in the interface.
What process/technology?
First, you need to determine what process you want to follow to display the data and it will set the technology that you will use. The questions that you need to ask yourself are:
Do you want to be able to change the data sets dynamically without refreshing the whole page?
If yes, it means that you will have to use asynchronous data loading through a dedicated web service in order to avoid loading all the information at the initial load of the page.
Do you have lots of data to load?
If so, that is another argument for asynchronous loading, so that you avoid loading all the data up front.
If not, loading every element into multiple arrays might be the simplest solution.
Implementation
If you now want to create a web service to load the data asynchronously, you can take a look at the following websites:
http://www.asp.net/get-started
http://www.stefanprodan.com/2011/04/async-operations-with-jquery-ajax-and-asp-net-mvc/
There are probably other useful websites, and you should be able to find them. If needed, add a comment and I'm sure the community will help you.
If you want to generate the data directly in the script, it can be quite simple, as you can compose the JavaScript directly in your dynamically created HTML page (in your ASP.NET markup code or whatever technology you're using).
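As a rough illustration of the "generate the data directly in the script" option, here is a server-side sketch. It is written in Python purely for illustration (the same idea carries over to ASP.NET), and the data-set names, the DATASETS dictionary and both helper functions are made up; in practice the points would come from a query against your SQL Server tables.

```python
# Hypothetical data sets keyed by name; each value is a list of (lat, lon) pairs.
DATASETS = {
    "burglary": [(52.67280, 0.94392), (52.62423, 1.29493)],
    "vehicle": [(52.58962, 1.72228)],
}


def merge_selected(datasets, selected):
    """Concatenate the points of every data set the user ticked."""
    points = []
    for name in selected:
        points.extend(datasets.get(name, []))
    return points


def to_location_array(var_name, points):
    """Render the points as a JavaScript array of Microsoft.Maps.Location objects.

    var_name should be a trusted, hard-coded identifier, not user input.
    """
    items = ",\n".join(
        f"    new Microsoft.Maps.Location({lat:.5f}, {lon:.5f})"
        for lat, lon in points
    )
    return f"var {var_name} = [\n{items}\n];"


# Example: the user ticked both data sets in the list.
print(to_location_array("CrimeData", merge_selected(DATASETS, ["burglary", "vehicle"])))
```

The printed text can be emitted into a script tag of the generated page; for the asynchronous route you would return the merged points as JSON instead and build the Location objects on the client.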

The prefix "atom" for element "atom:cc" is not bound exception

I am trying to fetch the contacts of users who have an account in the Google Apps Marketplace. While fetching the contacts I get the following error:
com.google.gdata.util.ParseException: The prefix "atom" for element "atom:cc" is not bound.
at com.google.gdata.util.XmlParser.parse(XmlParser.java:695)
at com.google.gdata.util.XmlParser.parse(XmlParser.java:568)
at com.google.gdata.data.BaseFeed.parseAtom(BaseFeed.java:793)
at com.google.gdata.wireformats.input.AtomDataParser.parse(AtomDataParser.java:68)
at com.google.gdata.wireformats.input.AtomDataParser.parse(AtomDataParser.java:39)
at com.google.gdata.wireformats.input.CharacterParser.parse(CharacterParser.java:)
at com.google.gdata.wireformats.input.XmlInputParser.parse(XmlInputParser.java:52)...
I am using the Java client library to fetch the contacts. Can you please let me know whether there is an issue in the Java client library? This issue has been there for a long time and I badly need to find a solution. What should I do to make it work? Any help would be appreciated.
Thanks,
VijayRaj
I have the same problem with the .NET client that you have with the Java client.
After I contacted Google support, they told me that the arbitrary XML data stored in a contact's Property element cannot be parsed by my version of GData.
There is a time-intensive workaround, deleting and recreating the contacts, but that's probably not what you are looking for; it wasn't for me either.
After switching to the Python implementation, everything works fine now.
Check out this issue report: Issue 361

Tweet counter for identi.ca

Is there a way to retrieve the number of times a certain URL was "dented" (shared on identi.ca, status.net and/or the like)?
For twitter there are several services that give this information.
Twitter itself: http://urls.api.twitter.com/1/urls/count.json?url=http://example.com&callback=twttr.receiveCount
Tweetmeme: http://api.tweetmeme.com/url_info.jsonc?url=http://example.com
Topsy: http://otter.topsy.com/stats.js?url=http://example.com&callback=?
I don't need the fancy extra information that Tweetmeme or Topsy deliver, only the count.
I am aware that this is problematic given the "distributed" nature of status.net: it will only give a count from one single silo, e.g. identi.ca. However, for me, for now, that would be enough.
Is there such an endpoint that gives me such JSON?
I don't think so. There's a file table in StatusNet databases that holds references to dented URLs (so it wouldn't be hard to count them if you had access to database or could write a plugin -- i.e., you wouldn't have to parse all notices, just lookup the file table), but it's not exposed through the API.
The list of possible API calls for StatusNet is here: http://status.net/wiki/TwitterCompatibleAPI
In addition, there's a proposed Google Summer of Code project on this subject: Social Analytics plugin

What is the difference between the BU and ZK OK codes in SAP macro

I am trying to post an invoice to SAP using the F-47 transaction, and I am using SHDB to record the transaction and learn how it works. I see there that sometimes the BU and ZK BDC OK codes are used. I would like to understand the difference between them, but could not find any official documentation. Could you please explain the difference between the two?
I found the meaning of some of the status codes. I post it here, so I can remember:
/00 Enter
/AB Go to overview
=ZK Go to additional information
=ENTE Enter (don't know exactly what is difference between /00)
=PI select cursor location
=STER Go to taxes
=DELZ delete cursor
=GO continue
=BU post (save)
/EEND end processing
=Yes select "yes" from message box
=BP park (save)
=ENTR Enter (don't know exactly what is difference between =ENTE or /00)
=AE save when changing document
=BK change document header (parking or posting parked document)
=P+ next page
=BL delete parked document
A BDC_OKCODE indicates which action will be executed on a screen (things like save, back, exit, etc.). The BU code is used for the save function (as in the MM01 transaction). Sorry, but I cannot recall which function ZK maps to. The difference between them is simply that they map to different functions. You can find out which function code each button uses via System->Status->GUI status.
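To make that a bit more concrete: in a batch-input recording, each screen is a block of BDCDATA rows (fields PROGRAM, DYNPRO, DYNBEGIN, FNAM, FVAL), and the OK code is just the value of the BDC_OKCODE field for that screen. Below is a rough illustration in Python rather than ABAP; the program name, screen number and field name are placeholders, not the real F-47 values, and the screen helper is made up for this example.

```python
from typing import NamedTuple


class BdcRow(NamedTuple):
    """One row of a BDCDATA table; unused columns stay empty."""
    program: str = ""
    dynpro: str = ""
    dynbegin: str = ""
    fnam: str = ""
    fval: str = ""


def screen(program, dynpro, okcode, fields=()):
    """Build the rows for one screen: header row, field values, then the OK code."""
    rows = [BdcRow(program=program, dynpro=dynpro, dynbegin="X")]
    rows += [BdcRow(fnam=name, fval=value) for name, value in fields]
    rows.append(BdcRow(fnam="BDC_OKCODE", fval=okcode))
    return rows


# Placeholder program/screen/field names, for illustration only.
recording = (
    screen("ZPROGRAM", "0100", "=ZK", [("ZFIELD-AMOUNT", "100.00")])  # go to additional information
    + screen("ZPROGRAM", "0200", "=BU")                               # post (save)
)

for row in recording:
    print(row)
```

Seen this way, =ZK and =BU differ only in which function the screen executes when the recording "presses" them.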
By the way, BTCI transactions are not fully robust: minor changes in the GUI flow will break your program, and error handling and analysis are tedious. Have you looked at more suitable posting methods, such as the BAPI_* function modules? With the help of LSMW you can browse the different input methods and use them standalone later. Or you can use transaction BAPI directly.