What is the best way to Paginate Forge/BIM360 Docs files lists? - autodesk-forge

I am currently in the process of implementing pagination, sort, and search functionality in the project files/plans/sheets views of a BIM 360 Docs integration.
Since I couldn't find any best practices regarding these features, I thought I would reach out so that I don't get stuck reinventing the wheel.
Background:
Most of the implementation uses the https://github.com/Autodesk-Forge/forge-api-dotnet-client/ SDK.
Based on what I have seen, pagination in the Autodesk API is very basic and does not play well with filtered views. Please correct me if I am wrong, but it looks like there is no way to get the number of items in a view and/or calculate the total number of pages in the result set.
If one uses filtering to limit the types of items returned by the API (e.g. documents, sheets, project files), the API applies pagination first and filters second. That causes holes in the returned result sets: e.g. one might request page 1 sized at 5 items and get only 3 items back, then request a similarly sized page 2 and get no items back, while page 3 yields 2 items.
The above-mentioned issues force us to use dynamic lazy-loading pagination, similar to how it is currently done in the BIM 360 Docs UI.
Question:
Is there a different, better way to paginate? Or do we have to lazy-load results while scrolling, never knowing how many records the next page will return?

Unfortunately, as far as I know, pagination is not currently available in the Forge MD API for BIM 360. Apologies for any inconvenience caused.
However, it was logged as request id FDM-1769 a few days ago, and I saw your name on the request list, so I think it will be supported in the future. In the meantime, a workaround is to fetch all the data from the API and then paginate on the client side via JavaScript.
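To illustrate that workaround, here is a minimal client-side sketch in TypeScript. fetchAllItems is a hypothetical helper standing in for whatever loop you use to drain the API; the point is only that filtering happens before slicing, so pages have no holes and the total page count is knowable.

// Slice a fully fetched, already filtered array into pages (1-based).
function paginate<T>(items: T[], page: number, pageSize: number): T[] {
  return items.slice((page - 1) * pageSize, page * pageSize);
}

// With everything in memory, the total page count is finally computable.
function pageCount(totalItems: number, pageSize: number): number {
  return Math.ceil(totalItems / pageSize);
}

// Usage (fetchAllItems is a placeholder for your own fetch-everything loop):
// const all = await fetchAllItems(projectId, folderId);
// const sheets = all.filter(i => i.type === "sheets");
// const page1 = paginate(sheets, 1, 5);
// const totalPages = pageCount(sheets.length, 5);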

Related

'Search' endpoint seems unstable

When using the Forge Data Management API endpoint
projects/:project_id/folders/:folder_id/search we have two problems.
It seems that we sometimes have to wait for several minutes (hours?)
after a model is uploaded until it can be found by the search.
We often get error 429 "Too Many Requests" even though we only make very few calls (fewer than 10 within an hour).
These issues make the endpoint hard to use in production code. Is there anything we can do to improve the success rate? Is Autodesk going to improve the endpoint?
This question is related to How to find cloud Item id of a Revit model?
We are aware of those issues and are working on improving things in those areas. However, I cannot say exactly when those improvements will become available.
In the meantime, depending on your workflow, there are two things that could be of help:
Use webhooks in order to be notified about new files being added to BIM 360/ACC
Use the folder/contents endpoint to find the file you need; it also supports filtering, just like the folder/search endpoint. You would have to iterate through subfolders, though, if you wanted to look for items in them as well (see the sketch below). Newly added files should show up here straight away.
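For the subfolder iteration, here is a rough TypeScript sketch against GET /data/v1/projects/:project_id/folders/:folder_id/contents (the token and IDs are placeholders, and for brevity it ignores the links.next paging of large folders):

// Recursively collect non-folder items from a folder and all its subfolders.
async function collectItems(projectId: string, folderId: string, token: string): Promise<any[]> {
  const url = `https://developer.api.autodesk.com/data/v1/projects/${projectId}/folders/${folderId}/contents`;
  const resp = await fetch(url, { headers: { Authorization: `Bearer ${token}` } });
  if (!resp.ok) throw new Error(`folder/contents failed: ${resp.status}`);
  const body = await resp.json();

  const items: any[] = [];
  for (const entry of body.data ?? []) {
    if (entry.type === "folders") {
      // Descend; newly added files should show up here right away.
      items.push(...await collectItems(projectId, entry.id, token));
    } else {
      items.push(entry);
    }
  }
  return items;
}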

GoogleBetterAds - violatingSites.list - google-apis-explorer

I can get a list of summaries of violating sites, using the following link:
https://developers.google.com/ad-experience-report/[...]/violatingSites/list
My questions:
Is this list exhaustive?
If not, is it possible to get an exhaustive list (or not) and how?
Is it possible to know how these websites are pulled (the share of websites analysed, etc)?
- Is this list exhaustive?
What is the size of your actual API response?
If the response keeps getting longer and longer, with new data added on each new request, you can assume you have the exhaustive list (with a possible update latency).
If the response always has the same size but contains different data, i.e. old data no longer appears and has been replaced by new data, it is not exhaustive.
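One way to check this empirically is to fetch the list periodically and compare the results over time. A minimal TypeScript sketch, assuming the v1 violatingSites endpoint of the Ad Experience Report API and an API key (YOUR_API_KEY is a placeholder):

// Fetch the current violating-sites list; run periodically and compare.
// A list that only ever grows suggests an exhaustive, append-only dataset;
// a fixed-size window of changing entries does not.
async function fetchViolatingSites(apiKey: string): Promise<string[]> {
  const url = `https://adexperiencereport.googleapis.com/v1/violatingSites?key=${apiKey}`;
  const resp = await fetch(url);
  if (!resp.ok) throw new Error(`violatingSites.list failed: ${resp.status}`);
  const body = await resp.json();
  // Each summary entry names the site in its reviewedSite field.
  return (body.violatingSites ?? []).map((s: any) => s.reviewedSite);
}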
- If not, is it possible to get an exhaustive list (or not) and how?
I have no idea at the moment; the total number of websites could be in the billions ...
- Is it possible to know how these websites are pulled (the share of websites analysed, etc)?
I have no idea on this one either; I think it is either a confidential process, or it is described in the general conditions and, subtly, in the documentation...

Counting views of any element on website

I am using the following MySQL query to measure view counts:
UPDATE content SET views=views+1 WHERE id='$id'
For example, if I want to check how many times a single page has been viewed, I just put it at the top of the page code. Unfortunately, I always receive a number about 5-10x bigger than the results in Google Analytics.
If I am correct, one refresh should increase the value in my database by +1. Doesn't "Views" in Google Analytics work the same way?
If, e.g., Google Analytics reports that a single page has been viewed 100 times while my database says it was, e.g., 450 times, how could such a simple query generate an additional 350 views? And I don't mean visits or unique visits, just regular views.
Is it possible that Google Analytics interprets such data in a slightly different way and my database result is correct?
There are quite a few reasons why this could be occurring. The most usual culprit is bots and spiders. As soon as you use a third-party API like Google Analytics, or Facebook's API, you'll get their bots making hits to your page.
You need to examine each request in more detail. The user agent is a good place to start, although I do recommend researching this area further - discriminating between human and bot traffic is quite a deep subject.
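As a starting point for user-agent inspection, here is a deliberately naive TypeScript sketch; the token list is illustrative only, and real bot detection needs far more than this:

// Skip the view-count increment when the User-Agent matches common crawler tokens.
const BOT_PATTERN = /bot|crawler|spider|crawling|facebookexternalhit|slurp/i;

function isLikelyBot(userAgent: string | undefined): boolean {
  if (!userAgent) return true; // a missing User-Agent is suspicious by itself
  return BOT_PATTERN.test(userAgent);
}

// Around the existing counter (pseudocode):
// if (!isLikelyBot(req.headers["user-agent"])) {
//   db.query("UPDATE content SET views = views + 1 WHERE id = ?", [id]);
// }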
In Google Analytics the data is provided by the user. For example:
A user views a page on your domain; it is then up to the user's browser to communicate the pageview to Google. If something fails along the way, the data will not be included in the reports.
Your SQL system, on the other hand, is log-based analytics: the data is collected by your own system, which reduces data collection failures.
Seen that way, some data can be missed with slow connections and with users that don't execute JavaScript (ad blockers or bots), or when the HTML page is not properly rendered***.
Now, 5x more is a huge discrepancy; in my experience it should be nearer 8-25% (tested at the transaction level; for pageviews it may be more).
What I recommend is:
Save the device, the browser information, the IP, and any other metadata that can be useful, and don't forget the timestamp. That way you can isolate the problem: maybe it is robots or users with an ad blocker, or in the worst case your code is not properly implemented (located in the footer, for example). A sketch of such logging follows below.
*** I added this because I once had a huge discrepancy, but it was a server error: the HTML code was not properly rendered, showing the user an empty HTTP response. MySQL was not fast enough to save the information and render the HTML. I noticed it when a load test (via Screaming Frog) showed a lot of 5xx errors. (A WordPress blog with no cache.)
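To illustrate that recommendation, here is a minimal TypeScript/Express sketch that logs one row per view with metadata instead of only bumping a counter. The view_log table, its columns, and the route are made up for the example:

import express from "express";
import mysql from "mysql2/promise";

const app = express();
const pool = mysql.createPool({ host: "localhost", user: "app", database: "site" });

app.get("/page/:id", async (req, res) => {
  // One row per view; the parameterized query also avoids the SQL
  // injection risk of interpolating '$id' directly into the statement.
  await pool.query(
    "INSERT INTO view_log (content_id, ip, user_agent, viewed_at) VALUES (?, ?, ?, NOW())",
    [req.params.id, req.ip, req.headers["user-agent"] ?? ""]
  );
  res.send("...page content...");
});

app.listen(3000);

// Later, GROUP BY user_agent or ip to see which "views" are bots or ad-block users.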

How to use Wikipedia API to get page statistics for all pages in a Category?

I am looking to identify the most popular pages in a Wikipedia category (for example, which graph algorithms had the highest page views in the last year?). However, there seems to be little up-to-date information on Wikipedia APIs, especially for obtaining statistics.
For example, the StackOverflow post on How to use Wikipedia API to get the page view statistics of a particular page in Wikipedia? contains answers that no longer seem to work.
I have dug around a bit, but I am unable to find any usable APIs, other than a really nice website where I could potentially do this manually by typing page titles one by one (max. ten pages at a time): https://tools.wmflabs.org/pageviews/. Would appreciate any help. Thanks!
You can use a MediaWiki API call like this to get the titles in the category: https://en.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Physics
Then you can use this to get page view statistics for each page: https://wikimedia.org/api/rest_v1/#!/Pageviews_data/get_metrics_pageviews_per_article_project_access_agent_article_granularity_start_end
(careful of the rate limit)
E.g. for the last year, article "Physics" (part of the Physics category): https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia.org/all-access/all-agents/Physics/daily/20151104/20161104
If you're dealing with large categories, it may be best to start downloading statistics from https://dumps.wikimedia.org/other/pageviews/2016/2016-11/ to avoid making so many REST API calls.
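Putting the two calls together, a rough TypeScript sketch that sums daily views per category member (the category and date range are examples; there is no paging or throttling here, so large categories will hit the rate limit):

// 1) List category members via the MediaWiki API, then 2) sum each
// article's daily pageviews via the Wikimedia REST API.
async function categoryViews(category: string, start: string, end: string) {
  const listUrl = "https://en.wikipedia.org/w/api.php" +
    `?action=query&list=categorymembers&cmtitle=Category:${category}` +
    "&cmlimit=500&format=json&origin=*";
  const members = (await (await fetch(listUrl)).json()).query.categorymembers;

  const totals: Record<string, number> = {};
  for (const m of members) {
    const article = encodeURIComponent(m.title.replace(/ /g, "_"));
    const pvUrl = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article" +
      `/en.wikipedia.org/all-access/all-agents/${article}/daily/${start}/${end}`;
    const resp = await fetch(pvUrl);
    if (!resp.ok) continue; // e.g. no pageview data for that title
    const data = await resp.json();
    totals[m.title] = data.items.reduce((sum: number, it: any) => sum + it.views, 0);
  }
  return totals; // sort by value to find the most popular pages
}

// Example: categoryViews("Physics", "20151104", "20161104")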
TreeViews is a tool designed to do exactly this. Getting good data is going to be hard if your category contains thousands of pages, in which case you'd be better off doing the calculations yourself, as Krenair suggests.

Get all files in box account

I need to fetch a list of all the files in a user's box account, such that the list of files can then be displayed in a table view (iOS).
I have successfully implemented this by recursively calling /folders/{folder id}/items on all the folders in my user's Box account.
However, while this works, it's kind of dirty, seeing as a request is made for each of the user's folders, which could be quite a large number.
Is there any way to get a list of all the available files (it's no issue if folders are included, I can ignore those manually)?
I tried implementing this using search, but I couldn't identify a value for the query parameter that returned everything.
Any help would be appreciated.
Help me, Obi-Wan Kenobi. You're my only hope.
What you are looking for (a recursive call through a Box account) is not available. We have enterprise customers with bajillions of files and millions of folders. Recursively asking for everything would take too long.
What we generally recommend is that you ask for as little as you can, and that you use multiple threads and anticipate what you'll need just a little bit, so that you can deliver a high-performance user-interface to your end-users.
For example, ?fields=item_collection is expensive to retrieve and can add a lot to a payload. It can double, or even 10x, the time it takes to get a payload back from the Box API. Most UIs don't need to show all the items inside every folder, so they are better off asking for only the fields they actually need via ?fields=.
You can make your application responsive to the user if you make the smallest possible call. Of course there is a balance. Mobile networks have high latency, and sometimes that next API call to show some extra thing is slow. But for a folder tree, you can get high performance by retrieving only the current level, displaying that, and then starting to fetch one-level down while the user is looking at the first level.
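Here is a rough TypeScript sketch of that pattern against GET /2.0/folders/{id}/items (the field list, cache, and token handling are simplifications for the example):

// Fetch one folder level with a minimal field list, then warm the cache
// for each subfolder in the background while the user reads this level.
const cache = new Map<string, any[]>();

async function fetchLevel(folderId: string, token: string): Promise<any[]> {
  const url = `https://api.box.com/2.0/folders/${folderId}/items?fields=id,name,type`;
  const resp = await fetch(url, { headers: { Authorization: `Bearer ${token}` } });
  const body = await resp.json();
  return body.entries;
}

async function showFolder(folderId: string, token: string): Promise<any[]> {
  const entries = cache.get(folderId) ?? await fetchLevel(folderId, token);
  cache.set(folderId, entries);
  // Anticipate the next click: prefetch one level down, ignoring failures.
  for (const e of entries.filter((e: any) => e.type === "folder")) {
    fetchLevel(e.id, token).then(items => cache.set(e.id, items)).catch(() => {});
  }
  return entries;
}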
Same goes for displaying thumbnails. If a user drills into a folder and starts looking at thumbnails for pictures, there's a good chance they'll want to see other thumbnails in that same folder. Your app should anticipate that, and start to pull one or two extras down in the background. Yes, it means more API calls, but your users will give your app a higher rating for being fast.