We have a web application that creates a web page. In one section of the page, a graph is diplayed. The graph is created by calling graphing program with an "img src=..." tag in the HTML body. The graphing program takes a number of arguments about the height, width, legends, etc., and the data to be graphed. The only way we have found so far to pass the arguments to the graphing program is to use the GET method. This works, but in some cases the size of the query string passed to the grapher is approaching the 2058 (or whatever) character limit for URLs in Internet Explorer. I've included an example of the tag below. If the length is too long, the query string is truncated and either the program bombs or even worse, displays a graph that is not correct (depending on where the truncation occurs).
The POST method with an auto submit does not work for our purposes, because we want the image inserted on the page where the grapher is invoked. We don't want the graph displayed on a separate web page, which is what the POST method does with the URL in the "action=" attribute.
Does anyone know a way around this problem, or do we just have to stick with the GET method and inform users to stay away from Internet Explorer when they're using our application?
Thanks!
One solution is to have the page put data into the session, then have the img generation script pull from that session information. For example page stores $_SESSION['tempdata12345'] and creates an img src="myimage.php?data=tempdata12345". Then myimage.php pulls from the session information.
One solution is to have the web application that generates the entire page to pre-emptively
call the actual graphing program with all the necessary parameters.
Perhaps store the generated image in a /tmp folder.
Then have the web application create the web page and send it to the browser with a "img src=..." tag that, instead of referring to the graphing program, refers to the pre-generated image.
Related
I'm creating a VCL Application with Delpi 10.3 and want to support some web functionality by having the user enter the ISBN of a book into a TEdit component and from there passing/sending this value to a search field on this website: https://isbnsearch.org after which the website looks up the ISBN and displays the Author of the book. I want to somehow access the information (i.e Author) presented by the search result and again use it in my application.
This is my GUI, for a better idea of what I want to accomplish:
What code can I use for this? Any other feasible suggestions or approaches are acceptable.
When performing a search on that website, it simply loads a page with a specific URL query string...
https://isbnsearch.org/search?s=suess
The above example is when I search for "suess", so you can easily concatenate a search URL.
You can use any HTTP component, such as TIdHTTP, to load this search page, then use an HTML parser to scrape the page and read what you need. Much, much easier than trying to read through the TWebBrowser.
In the end, you won't actually display the HTML (I mean you can if you want to), but the idea is to read the data and display it in your own format.
On that specific page, start by locating the ul element with id searchresults. Then, each li element contains individual results. Unfortunately, this website uses pagination, and only shows 10 results per page. To do this, call this page again with another parameter &p=2 for the 2nd page, &p=3 for the 3rd page, and so on.
On the other hand, that is the worst way to acquire such information. What you should be doing is using a proper API which gives you machine-friendly data. The service you are referencing doesn't appear to have an option, but here's an example of one which does:
https://openlibrary.org/dev/docs/api/books - this also appears to provide you MUCH more information than the one you're using.
When extracting data you can use CSS/xpaths. But is there a similar or reliable method of doing this in the page source.
www.amazon.com/Best-Sellers-Electronics-Televisions/zgbs/electronics/172659
You could get the page source and then parse using Regex but probably not be reliable if for instance the tv did not load on the page. I have looked up various solutions but I have yet to find one that mentions getting every tv at start of each line (1, 4, 7 etc,, in source) or using a reliable method e.g Css/xpaths in source of a page.
What would is the golden standard of reliable method of doing what I am after?
To get the page source you can use CURL if the page is rendered entirely on server side (most pages won't be), or headless chrome to get the actual DOM that will render in the browser (https://developers.google.com/web/updates/2017/04/headless-chrome).
For scraping the content, I've used cheerio (https://github.com/cheeriojs/cheerio) which will allow you to read in HTML to an object and then scrape your data off that using jQuery expressions. (Headless chrome allows you to execute JS on the pages you visit, so you don't necessarily need cheerio).
In your specific example you could get the TV on each line by combining the right class selectors to get the divs containing TV's, and using attribute selector with 'margin-left=0px' which would get first item on each line. That is obviously very much bound to structure of the page and will likely be broken by smallest of changes in the page source. (And not really any different from using xpaths. Still better than regex though)
With certain elements loading / not loading on the page (if that was what you meant by TV not being there), no golden solutions that I know of, except allowing sufficient time for the page to load and handling your scraper failing gracefully.
I am writing a program for managing an inventory. It serves up html based on records from a postresql database, or writes to the database using html forms.
Different functions (adding records, searching, etc.) are accessible using <a></a> tags or form submits, which in turn call functions using http.HandleFunc(), functions then generate queries, parse results and render these to html templates.
The search function renders query results to an html table. To keep the search results page ideally usable and uncluttered I intent to provide only the most relevant information there. However, since there are many more details stored in the database, I need a way to access that information too. In order to do that I wanted to have each table row clickable, displaying the details of the selected record in a status area at the bottom or side of the page for instance.
I could try to follow the pattern that works for running the other functions, that is use <a></a> tags and http.HandleFunc() to render new content but this isn't exactly what I want for a couple of reasons.
First: There should be no need to navigate away from the search result page to view the additional details; there are not so many details that a single record's full data should not be able to be rendered on the same page as the search results.
Second: I want the whole row clickable, not merely the text within a table cell, which is what the <a></a> tags get me.
Using the id returned from the database in an attribute, as in <div id="search-result-row-id-{{.ID}}"></div> I am able to work with individual records but I have yet to find a way to then capture a click in Go.
Before I run off and write this in javascript, does anyone know of a way to do this strictly in Go? I am not particularly adverse to using the tried-and-true js methods but I am curious to see if it could be done without it.
does anyone know of a way to do this strictly in Go?
As others have indicated in the comments, no, Go cannot capture the event in the browser.
For that you will need to use some JavaScript to send to the server (where Go runs) the web request for more information.
You could also push all the required information to the browser when you first serve the page and hide/show it based on CSS/JavaScript event but again, that's just regular web development and nothing to do with Go.
I have an SSRS 2008 report with a field that contains and is configured to render as HTML. Some of the text in this field may contain IMG tags, and the IMG tag is not among the tags SSRS natively supports within its HTML rendering extension.
I am trying to find a way to write a custom handler to hook into the processing of this field that will let me look at the raw HTML before the SSRS handler processes it, in the hopes of grabbing IMG tags, extracting the SRC URL and getting the raw bytes of an image to insert on the fly in a way SSRS will accept, yet retaining the HTML SSRS will render.
From what I've read and seen so far, if a field is marked to render as HTML, the SSRS processor grabs it and parses it entirely before any handler could modify it, meaning the IMG tag is (would be) discarded before I could do anything with it (or even know it was present). The only option I see is to turn off the HTML rendering entirely, thus losing the benefit of the tags SSRS can recognize.
EDIT: Per Jamie's response below, I'm beginning to think the "2nd half" of this issue may prove harder than I realized: Is it even possible to programmatically add an Image to an SSRS Report at runtime (obviously through code/custom assembly)? That is, I'd like to write some code that might look something like this (pseudocode)
'Conceptual Pseudocode I'd like to be able to write
'for dynamic addition of Image element in SSRS report
'Is this even possible?? Is there a documented Report
'object model??
Public Function AddImage(imageBytes() as Byte) as Image
Dim newImage as New Image()
newImage.SetBytes(imageBytes)
Report.Add(newImage)
return newImage
End Function
I'm hoping I'm just overlooking something simple that prevents me from grabbing the raw, unprocessed HTML, and someone else might be able to point me in the right direction on how to grab it.
EDIT: I have created and implemented this solution within the SSRS development environment and it works. WOOHOO :) It did require some hoop-jumping with creating a Single-Threaded Apartment thread to host the WebBrowser control, and to create a message pump, but it does work! **
As I was literally typing up the message to a co-worker that this issue was a non-starter, I did have a bit of an inspiration on a way to solve this problem. I know this post hasn't generated a great deal of response, but just in case someone else finds themselves in a similar problem, I'm going to share what I've implemented in a "petri dish" scenario that, provided I get all the code permission issues resolved, should allow me a decent solution to this problem.
With SSRS inability to handle an IMG tag insurmountable, I actually thought of an idea that took the HTML rendering away from SSRS entirely. To do this, I created custom code that hands off the HTML rendering to a WebBrowser control, then copies the rendered result as an image. It does the following:
Instantiates a WebBrowser control of a given width and height.
Sets the DocumentText property of that control to the HTML from TinyMCE
Waits for the DocumentText to completely render.
Creates a bitmap equal to the size of the control.
Uses the undocumented and presumably unsupported DrawToBitmap method of the WebBrowser to draw the rendered HTML to a bitmap.
Copies the Bitmap to an Image
Saves the Image as a .png file
Returns the path to the .png as the result of the function.
In SSRS, I plan to replace the erstwhile HTML text field with an external Image control that will then call the above method and render the image file. I may alter that to simply draw the image to the SSRS Image control directly, but that's a final detail I'll resolve later. I think this basic design is going to work. Its a little kludgey, but I think it will work.
I have some permissions issues to work out with the code that SSRS will allow me to call at runtime, but I'm confident I'll get those sorted out (even if I end up moving the code to a separate assembly). Once this is tested and working, I plan to mark this as the answer.
Thanks to those who offered suggestions.
I've done something similar with success: We had an HTML "Comment" field that was collected on a web form. For a particular report we wanted to truncate this field to the first 1000 characters or so, but preserve valid HTML.
So I created a C# .dll & class with a public function:
public static string TruncateHtml(string html, int characters)
{
...
}
(I used the HtmlAgilityPack for most of the HTML parsing, and to create and close off my new HTML string, while I kept track of the content length.)
Then I could call that code with the fully qualified path to the function in an SSRS expression:
=ReportHtmlHandler.HtmlTruncate.TruncateHtml(Fields!Comment.Value, 1000)
I could have added a calculated field to my dataset with this, but I was only using this value for one field, so I kept it at the field expression level.
All of this code gets called well before the HTML is processed or rendered by SSRS. I'm sure that any original IMG tag will be in the string.
This approach might work for you, possibly create a ExtractImg function which could be set as the source of an img on the report. I think some of the tricky bits for your requirement will be to handle multiple images as well as embedding the extracted img. But you might be able to do this simply with a external reference to an image. I haven't done much with external images in SSRS.
An MSDN blog entry on calling a custom dll from SSRS: http://support.microsoft.com/kb/920769
I have graphs in an html page. The graphs are generated by a call to a cgi-bin program in an IMG tag:
<IMG src="http://myserver.com/cgi-bin/StatBarChart.cgi?data=1,2,&data=3,5,1&legend=EC,ER">
Currently, the data for the graphs is passed as GET args (in the URL itself.)
Everything’s working OK, but te GET arguments are too long. I want to pass the data via POSTDATA. All the books I have (and discussions on the web that I’ve found) talk about using POSTDATA in forms that include a Submit button. I just want the graphs to appear as part of the page, without a Submit. Can this be done? Can it be done in HTML4, or does it require javascript?
I would require javascript, as you would have to get the resource yourself and set it to the img tag. This is not possible in html4.
Also, I don't see the problem with a long url. Your user will never see it (unless he looks in the sourcecode, which I don't consider as simple "user" anymore) so there is no problem with that either.