I have a demo website built with Jekyll that compares the outputs of different APIs for benchmarking purposes. The website has a total of three pages, and each page has a submission form that takes a text string, runs it against a few APIs, does some preprocessing on the results, and then shows them in a table underneath. The table is cumulative, and the results of each new request are added as the top row.
One issue I am facing is that when you switch to a different page within the website and then return to the previous one, the table of results is completely wiped out and you start with a clean slate. My question is: using Jekyll, is there a way to preserve the state of each page as you switch between them within a given session?
Jekyll is a static site generator, so it's not possible to preserve page state on the server.
But you can use the browser's local storage; see the docs: https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage
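For example, here is a minimal sketch in TypeScript (plain browser JavaScript works the same way), assuming the results table is a <table id="results-table"> with a <tbody> and that one storage entry per page path is enough - both the id and the key scheme are placeholders for whatever your pages actually use:

// Minimal sketch: persist the results table's rows per page in localStorage.
// "results-table" and the key scheme are assumptions; adjust them to your markup.
const storageKey = `results:${location.pathname}`; // one entry per page

function getTableBody(): HTMLTableSectionElement | null {
  return document.querySelector<HTMLTableSectionElement>('#results-table tbody');
}

// Restore any rows saved earlier in this browser.
function restoreRows(): void {
  const body = getTableBody();
  const saved = localStorage.getItem(storageKey);
  if (body && saved) {
    body.innerHTML = saved;
  }
}

// Call this after prepending a new result row to the table.
function saveRows(): void {
  const body = getTableBody();
  if (body) {
    localStorage.setItem(storageKey, body.innerHTML);
  }
}

document.addEventListener('DOMContentLoaded', restoreRows);

If you only want the tables to survive navigation within a single browser session (and be cleared when the tab is closed), sessionStorage offers the same API.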
If I make changes to a template in MediaWiki, the effects are not shown on the pages using the template until I edit and re-save each page. How do I reflect the changes in all the other pages?
Earlier:
[[Trivia::{{{Trivia|}}}]]
Suppose I apply a style to an element in the template:
[[Trivia::{{{Trivia|}}}]]
Then the changes are not shown on the pages using this template.
When you change a template transcluded on many pages like this, MediaWiki has to update all the affected pages. However, it doesn't do this by default, as in the default configuration, MediaWiki is only run when someone requests a page. If MediaWiki has to update hundreds (or in Wikipedia's case, millions) of pages when someone requests a page, then that person's request would take a very long time to complete.
Instead, MediaWiki has a concept of "jobs". Jobs can be triggered by various actions. For example, after deleting a page, all links to that page must turn from blue to red; and after changing a category in a template, all pages that transclude that template must switch to using the new category. By default, MediaWiki runs one of these jobs per page view. This is ok for some sites, but depending on your usage patterns, it might not update all the pages fast enough for you.
To work around this problem, MediaWiki introduced a "job queue". By using the job queue, you can run jobs as batch processes in the background, so jobs are completed quickly and requests aren't slowed down by having to process them. See Manual:Job queue on mediawiki.org for how to set the job queue up.
Another solution would be to create a cron job that runs the runJobs.php script from the maintenance directory, say, once an hour (or more or less frequently, according to your needs).
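For example, a crontab entry along these lines would process up to 1000 queued jobs every hour (the MediaWiki install path and the php binary are assumptions - adjust them to your installation):

0 * * * * php /var/www/mediawiki/maintenance/runJobs.php --maxjobs 1000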
I am using the following MySQL query to measure view counts:
UPDATE content SET views=views+1 WHERE id='$id'
For example, if I want to check how many times a single page has been viewed, I just put this query at the top of the page code. Unfortunately, I always get a count about 5-10x higher than the figure in Google Analytics.
If I understand correctly, one refresh should increase the value in my database by 1. Doesn't "Views" in Google Analytics work the same way?
If, for example, Google Analytics tells me that a single page has been viewed 100 times while my database says 450 times, how could such a simple query generate an additional 350 views? And I don't mean visits or unique visits, just regular views.
Is it possible that Google Analytics interprets this data in a slightly different way and my database result is correct?
There are quite a few reasons why this could be occurring. The most common culprits are bots and spiders. As soon as you use a third-party API like Google Analytics or Facebook's API, you'll get their bots making hits to your page.
You need to examine each request in more detail. The user agent is a good place to start, although I do recommend researching this area further - discriminating between human and bot traffic is quite a deep subject.
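As a rough illustration of the idea (in TypeScript, and definitely not a complete bot filter - the pattern list and the incrementViews callback are placeholders), you could skip the counter whenever the user agent looks automated:

// Rough sketch: skip the view counter for obvious bots and spiders.
// BOT_PATTERNS is only a starting point, not an exhaustive filter.
const BOT_PATTERNS = /bot|crawler|spider|crawling|facebookexternalhit|slurp/i;

function looksLikeBot(userAgent: string | undefined): boolean {
  return !userAgent || BOT_PATTERNS.test(userAgent);
}

// Only run the UPDATE when the request does not look automated.
// incrementViews is a placeholder for however you execute the SQL above.
function countView(userAgent: string | undefined, incrementViews: () => void): void {
  if (!looksLikeBot(userAgent)) {
    incrementViews();
  }
}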
In Google Analytics, the data is reported by the user's browser. For example:
A user views a page on your domain; the browser is then responsible for communicating the pageview to Google. If something fails along the way, the data will not be included in the reports.
In the other case, the SQL system you have is log-based analytics: the data is collected by your own system, which reduces data-collection failures.
Seen that way, it means some data can be missed because of slow connections, users that don't execute JavaScript (ad blockers or bots), or an HTML page that is not rendered properly***.
Now, 5x more is a huge discrepancy; in my experience it should be nearer 8-25% (tested at the transaction level; for pageviews it can be more).
What I recommend is:
Save the device, the browser information, the IP, and any other metadata that can be useful, and don't forget the timestamp. That way you can isolate the problem: maybe it's robots or users with ad blockers, or in the worst case your tracking code is not properly implemented (located in the footer, for example).
*** I added this because I once had a huge discrepancy, but it was a server error: the HTML was not rendered properly and the user was shown an empty response. MySQL was not fast enough to save the information and render the HTML. I noticed it when a crawl test (via Screaming Frog) showed a lot of 5xx errors (a WordPress blog with no cache).
I am trying to write a script that will check our website every day for the total number of web pages we have. How can I do this using an API like Google Analytics? Using JSON would be nice. Here is what it might look like; maybe someone can help, please?
{
  "startDate": "{date.startOfMonth.format()}",
  "endDate": "{date.today}",
  "dimensions": ["query", "page"]
}
As nyuen has pointed out you cannot count every page in your web presence with Google Analytics. GA will only register pages that a) have GA tracking code and b) have executed this tracking code at least once in your selected timeframe. Usually that's most of the pages, but you can't be sure.
What you can do is issue a query against the page path dimension and at least one metric - pageviews would be the obvious choice. That's not because you actually need the number of pageviews for your purpose, but because a query without at least one metric will not work. Send the query via the API or the Query Explorer and then simply count the number of rows in the result set. Since the page path is unique, the number of results is the number of distinct pages with pageviews in the selected timeframe, which is the closest you will get with GA.
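As one way to express that query (here in the Analytics Reporting API v4 request format - the view ID and date range are placeholders), the body below asks for ga:pagePath with ga:pageviews as the required metric; the number of rows in the response is your page count:

{
  "reportRequests": [
    {
      "viewId": "XXXXXXXX",
      "dateRanges": [{ "startDate": "30daysAgo", "endDate": "today" }],
      "dimensions": [{ "name": "ga:pagePath" }],
      "metrics": [{ "expression": "ga:pageviews" }]
    }
  ]
}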
But there are actually tools for what you are trying to do, so you might want to start with those - for example, you might have your script make a system call (assuming a Linux system) to wget with the --spider option, which will create a list of files on a given domain. This does not require tracking code (it works by following links in the pages' source code). There is also web spider software like Screaming Frog on Windows (it doesn't really work in a script, but Windows has task scheduling tools that let you start programs at pre-defined times), which does not only do the counting but also returns information about the health of your site (dead links etc.).
Or, since this seems to be your server, you might write a script that traverses the file system and makes a list of the files it encounters there (will not work if your pages are dynamically generated, since this counts only physical files).
Or you write a script that parses your server logs and extracts calls to content files (this will work only for files that have actually been viewed).
So there are a number of better alternatives to using Google Analytics for this purpose; you might want to look into one of them first.
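If you do go the log-parsing route, a small sketch in TypeScript (Node) could look like the following - the log path, the common/combined log format, and the .html filter are all assumptions to adjust for your server:

// Rough sketch: count distinct page paths in an access log.
// Assumes a common/combined-format log; the path and ".html" filter are placeholders.
import { readFileSync } from 'fs';

const log = readFileSync('/var/log/nginx/access.log', 'utf8');
const pages = new Set<string>();

for (const line of log.split('\n')) {
  // e.g.: 203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /about.html HTTP/1.1" 200 512
  const match = line.match(/"GET ([^ ?"]+)[^"]*" 200 /);
  if (match && match[1].endsWith('.html')) {
    pages.add(match[1]);
  }
}

console.log(`Distinct pages served: ${pages.size}`);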
I want to know if I should have one HTML file per URL (home, register, login, contact - I have more than 50) or whether I should consolidate them into, say, 5 files and retrieve them through ?id=1,2,3,4,5,6 etc.
I want to know which method is more convenient. As I understand it, the second method would have to load the whole file, which would be slower than loading a single page's file.
But loading a single file per page will require more requests to and from the server, and the HTML files as a whole will be heavier, since I have to write a head and include all the files in each one of them.
In my past experience, I make sure that any component with distinct functionality is placed in its own file. I would consider distinct functionality to be the examples that you listed above (home, register, login, contact, etc.). On the other hand, if you are managing blog posts (or something similar), I would definitely use GET requests (i.e. ?page=1,2,3).
I have also maintained websites that have about 50-100 different pages, but they used a content management system. If you feel overwhelmed, this could also be a possibility to explore.
If you choose not to use a CMS, I would recommend using partial files. A good example of a partial would be a header or footer. By using partials, you no longer need to replicate the same code on multiple pages (say goodbye to creating 50 navbars).
There is a website that my company uses that updates information about 3 specific things throughout the day. We use the information from 1 of them, and what we want to do is pull this information as it is added to their site and add it to a page of our own so it's easier to view. Is this even possible? Can anyone point me in the direction of setting this up? It is all text that we want to pull.
Pick a language (e.g. Perl). Find an HTTP library for it (e.g. LWP). Fetch the page and run it through an HTML parser (e.g. HTML::TreeBuilder). Pull out the bits you want and shove them into a template (e.g. TT), then dump the result to a file. Stick the program in cron or Windows Scheduler.
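And if Perl isn't your thing, the same pipeline can be sketched in TypeScript on Node - fetch the page, parse it with an HTML parser such as cheerio, render a tiny template, and write the file; the URL, the CSS selector, and the output path below are all placeholders for your own setup:

// Same pipeline, sketched in TypeScript/Node: fetch -> parse -> template -> file.
// The URL, selector, and output path are placeholders; run it from cron or Windows Scheduler.
import { writeFileSync } from 'fs';
import * as cheerio from 'cheerio'; // HTML parser (npm install cheerio)

async function main(): Promise<void> {
  const response = await fetch('https://example.com/updates'); // global fetch in Node 18+
  const html = await response.text();

  const $ = cheerio.load(html);
  // Pull out the bits you want; '#updates .item' is a placeholder selector.
  const items = $('#updates .item')
    .map((_, el) => $(el).text().trim())
    .get();

  // Shove them into a (very small) template and dump the result to a file your site serves.
  const page = `<html><body><h1>Latest updates</h1><ul>${items
    .map(item => `<li>${item}</li>`)
    .join('')}</ul></body></html>`;

  writeFileSync('/var/www/html/updates.html', page);
}

main().catch(console.error);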