Firefox using old dataset for D3 visualisation

I have a web app consisting of a front-end HTML page where the user enters some search parameters, a PHP processing script which takes those parameters and uses an online Web API to retrieve the relevant data, and a second HTML page where the data is displayed in a dynamic D3 bar chart. The PHP script writes the data to a JSON file, data.json, which is imported via $.getJSON in the second HTML page.
This works fine in Chrome and IE but not in Firefox. If I clear the browser history and run a search, everything works fine. Subsequent searches, however, do not show the new data, but the data from that original search after the history deletion, even though the data.json file is updating correctly.
This makes me think that Firefox is, for some reason, caching the initial data.json response and reusing it each time the page is called.
I haven't included any code because this seems to be more about the behaviour of Firefox than a problem with the code. It did seem to start happening after I styled the site with Bootstrap/Bootswatch, but I don't see why that would have any effect.
Any ideas why this is happening, please?!
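One way to test (and work around) the caching theory is to serve data.json through a small PHP wrapper that sends no-cache headers; a minimal sketch, assuming the app can point $.getJSON at a PHP URL instead of the static file:
<?php
// data.php: serve the generated JSON with headers that forbid caching,
// so every $.getJSON call hits the server instead of the browser cache.
// Assumes data.json sits next to this script; adjust the path if not.
header('Content-Type: application/json');
header('Cache-Control: no-cache, no-store, must-revalidate'); // HTTP/1.1
header('Pragma: no-cache');                                   // HTTP/1.0
header('Expires: 0');                                         // proxies
readfile(__DIR__ . '/data.json');
The second page would then call $.getJSON('data.php', ...) rather than $.getJSON('data.json', ...).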


Is there a way to scrape data from a website that is not available in the page's source?

What are the things I'll have to include in my code to point me in the right direction?
For example, this website: bestfightodds.com.
Open your browser's debugger on the Network tab and observe what requests are made when the site loads dynamic content (when you click). You'll see it's getting all the data from an API, for example: https://www.bestfightodds.com/api?f=ggd&b=3&m=16001&p=2
You can download all the data by changing the parameters in this URL.
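For instance, a minimal PHP sketch that pulls one of these endpoints (the parameter names are copied from the example URL above; their meanings are guesses):
<?php
// Fetch the same endpoint the page calls, varying parameters as needed.
$base   = 'https://www.bestfightodds.com/api';
$params = http_build_query(array('f' => 'ggd', 'b' => 3, 'm' => 16001, 'p' => 2));
$raw    = file_get_contents($base . '?' . $params);
var_dump(substr($raw, 0, 200)); // peek at the (still encoded) response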
Usually that's enough, but here it's trickier, as the data returned by the server is somehow encoded and not easily readable. You'd have to debug the site's JavaScript to find the function used to decode this data before you can parse it.

New MySQL query on each page refresh

I'm trying to guarantee that fresh JSON is sent to my page every time a user clicks refresh. Currently, if the JSON is updated, the web page will not reflect the change until Apache is restarted.
I have tried the following approaches:
Creating a nocache function and calling the decorator in the page function
Putting no-cache headers in my HTML
Using Command+Shift+R in Chrome on macOS for a "hard" refresh
No good... I'm beginning to think I'm misunderstanding something. Can someone point out the error of my ways? I copied and pasted the code presented in those links; the first link even speaks about JSON specifically. I can show my exact code if desired, but like I said: copy and paste.
Maybe it's not even a caching issue, I'm not sure, but I'm open to any ideas!
EDIT:
I know now that my no-cache headers ARE being passed to the HTML. The issue is that Flask isn't querying MySQL for updated data every time the page is loaded, only when Apache is restarted. So even if fresh data is in the MySQL DB, it will not be displayed to the user unless Apache is restarted.
I finally found another post on Stack Overflow regarding my question.
Turns out I needed to make my DB connection and build the JSON in the same function. Before, I was fetching the data from the DB in one function and then referencing the result in a different function to create the JSON and pass it to the HTML; presumably that query ran only once, when the module loaded, so its result never refreshed until Apache restarted. Now everything is inline; see HERE.

Drupal 7 (VERY) Custom Preview

I have a Drupal site that is being used strictly as a CMS producing JSON feeds (using Services and services_views), which are consumed by a separate site. What I would like to do, and I have a working proof of concept of this, is allow a "live preview" on the real site by intercepting the node form preview/submit, encoding the node as JSON, and loading a special page on the live site that consumes that JSON and displays the page accordingly.
The problem is that this JSONized node is different from the JSON being produced by my view (using services_views). My end goal is to produce JSON that is identical for both previewed and non-previewed objects, without having to maintain separate output methods. (I could easily hand-customize the preview JSON, but then whenever my view for the public API changes I'd have to make the same changes to the preview JSON; I'm trying to avoid this.)
I'm looking for feedback on this approach. Is what I'm attempting even possible? The ideas I've been able to come up with so far are:
being able to (conditionally) drive my view with data from a non-database source
sneakily inserting data into the view object during one of the stages of execution? Kludgy but I'm not above that :)
saving a "clone" node (or revision?) of the node being previewed and let the view use that to display the preview JSON?
Maybe this is the wrong approach altogether and there's something better? (Trying to intercept and format the services output in my module... maybe avoid services_views altogether?)
If anyone can offer some advice, insight or opinions on how to best proceed here, I'd be really grateful.
In a custom module, you could set up a page that grabs the JSON output from the view page:
$JSON = file_get_contents($url);
That way the preview stays bound to the view, even if the view changes.
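A fuller sketch of that idea as a Drupal 7 page callback (the module name and view path below are placeholders):
<?php
// mymodule.module (hypothetical): expose a page that proxies the JSON
// produced by the services_views view, so preview and live output match.
function mymodule_menu() {
  $items['mymodule/preview'] = array(
    'page callback'    => 'mymodule_preview_page',
    'access arguments' => array('access content'),
    'type'             => MENU_CALLBACK,
  );
  return $items;
}

function mymodule_preview_page() {
  // 'api/view-path' stands in for whatever path the view exposes.
  $url  = url('api/view-path', array('absolute' => TRUE));
  $json = file_get_contents($url);
  drupal_json_output(json_decode($json));
}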
First of all, I think what you're trying to achieve is not an easy task, so good luck.
I think you could intercept the node submission data, create a node programmatically, render that node, and then export the rendered node to JSON. Immediately after you get the JSON, delete the node, because the programmatically created node is only for preview.
This could be more CPU demanding, but consider that previewing content exactly as it will finally look is difficult.
The RSS feeds your site reads could be filtered with some parameter to exclude programmatically created (preview) nodes, even though these nodes will only exist for a very short time.
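A rough sketch of that flow in Drupal 7 (the content type and form values here are assumptions):
<?php
// Build a throwaway node from the submitted form values, let the normal
// pipeline render it, then delete it again.
module_load_include('inc', 'node', 'node.pages'); // for node_object_prepare()

$node = (object) array(
  'type'     => 'article',                      // assumed content type
  'title'    => $form_state['values']['title'], // assumed form structure
  'language' => LANGUAGE_NONE,
  'uid'      => $GLOBALS['user']->uid,
  'status'   => 0,                              // unpublished, so feeds can filter it out
);
node_object_prepare($node);
node_save($node);                               // temporary preview node

$json = drupal_json_encode(node_view($node));   // or feed $node->nid through the view itself
node_delete($node->nid);                        // the preview node only lives for this request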

file_get_contents( [file direct link] ) not working in PHP anymore

I have an enterprise Box account, and I was tasked with creating a crawler that scans an account on Box and saves all of the meta information (including a direct link) in a local database. This works fine.
In PHP I have also built a function that downloads the documents (via the direct link obtained from the API) and extracts readable text from them. This was working perfectly a week ago; yesterday, however, it stopped working completely. I'm using the file_get_contents() function to download the file, and currently it retrieves only the document's file size rather than the document itself, which I find strange. I have tried cURL and I get the same result; it seems Box is responding to my direct file requests with the file size instead of the actual file.
The files are ALL open access, so anyone with a direct link can download them without logging in. I have also tried running this code on another server at a different hosting company and I get exactly the same result. I have tested my code by accessing other files from other locations (not Box) and it works fine.
It's important to note that this was working fine just a week ago, but now it doesn't work at all. Nothing changed on my end in between (that I know of). Anyone have an idea?
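For debugging, it may help to fetch the link with cURL and dump the response headers, to see what Box is actually sending back; a sketch (the URL is a placeholder):
<?php
// Fetch the direct link and keep the headers, to see whether the server
// now redirects, blocks the request, or answers with something unexpected.
$url = 'https://app.box.com/your/direct/link'; // placeholder
$ch  = curl_init($url);
curl_setopt_array($ch, array(
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_FOLLOWLOCATION => true,          // follow any redirect chain
  CURLOPT_HEADER         => true,          // keep headers in the output
  CURLOPT_USERAGENT      => 'Mozilla/5.0', // some hosts reject blank agents
));
$response = curl_exec($ch);
echo substr($response, 0, 1000);           // status line and headers first
curl_close($ch);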

Searching for recently created folders yields no result

I am able to successfully create a folder within the root of my Box account via the v2 API. However, if I immediately issue a search request for it, I get no results. If I wait for some period of time (maybe 20 mins) and then issue the search request, the folder I created is returned.
Is there some caching going on on the Box side? If so, is there a way to invalidate the cache via the API, or some workaround for this?
Thanks!
What is going on is background processing of your file on the back end. Just as a new website won't show up in a Google search until Google has had time to "learn" about it, Box's search engine has to process the file and add a text version of its contents to the search index. Exactly how long that takes depends on a lot of variables, including the size and format of the file.
You'll see pretty much the same behavior if you upload a large document to Box and then try to preview it immediately: Box goes off and does some magic to convert your file to a previewable format. In the case of the preview, though, the Box website gives you a little bit of feedback, saying "Generating preview"; the search bar doesn't tell you it is "adding new files to the search index."
This is mostly because it is more important for Box to take your file, make sure we store it safely, and let you know that Box has it. A few milliseconds later we start processing your file for full-text search and all the other processing that we do.
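If you can't wait for the index, one workaround is to list the parent folder's items instead of searching, since the items list reflects new folders immediately. A sketch against the v2 API, assuming you already have an OAuth access token in $accessToken and the folder name used below:
<?php
// List the root folder's children; unlike search, this does not depend
// on the search index, so a just-created folder shows up right away.
$ch = curl_init('https://api.box.com/2.0/folders/0/items?fields=id,name');
curl_setopt_array($ch, array(
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_HTTPHEADER     => array('Authorization: Bearer ' . $accessToken),
));
$items = json_decode(curl_exec($ch), true);
curl_close($ch);

foreach ($items['entries'] as $entry) {
  if ($entry['type'] === 'folder' && $entry['name'] === 'New Folder') { // assumed name
    // found the folder without waiting for the search index
  }
}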