Google BigQuery JSON API: pageToken has no effect

I'm trying to implement the JSON API (v2) of BigQuery. In my code I get the same behaviour as on the documentation page for tabledata-list.
My table has about 11,000 rows. In the documentation page I fill in the following parameters:
ProjectId = X
DatasetId = Y
TableId = Z
MaxResults = 10000 #I want to paginate my results
This returns 10,000 rows and a pageToken. So I send the same request again, this time setting the page token so that I get the next page of results.
That returns the same 10,000 rows as before. I expected this to do pagination as described on this page:
All collection.list methods return paginated results under certain circumstances. The number of results per page is controlled by the maxResults property
A page is a subset of the total number of rows. If your results are more than one page of data, the result data will have a nextPageToken property. To retrieve the next page of results, make another list call and include the token value as a URL parameter named pageToken.
Where do I go wrong?
EDIT:
My colleague pointed out that on the other documentation pages the result contains a nextPageToken, whereas this response contains a pageToken. The difference is that pageToken refers to the current page, while nextPageToken refers to the next page.
However, the documentation states that it should return a nextPageToken (except when there is no more data), and here there is more data: len(table) > len(result).

On the same page it's mentioned that there is a difference for TableData.List() call
The bigquery.tabledata.list method, which is used to page through
table data, uses a row offset value or a page token.
So for TableData.List() you must use the row offset value to paginate, and in order to access previous pages you can use your hashes from your session. It is designed this way because, with large volumes of data, the next set of results cannot be pre-cached from the worker pool.
You can help improve the documentation by using the "Feedback on this document" link at the top right of each page; feel free to use it to suggest improvements.
Also you can submit issues to https://code.google.com/p/google-bigquery/issues/list
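A minimal sketch of the row-offset approach described above. It assumes a hypothetical `fetch_rows(start_index, max_results)` callable that wraps the tabledata.list request (passing the offset as the `startIndex` parameter) and returns the list of rows for that window. The helper name and the stop-on-empty-page rule are illustrative assumptions, not part of the BigQuery client.

```python
from typing import Callable, Iterator, List


def iter_rows_by_offset(
    fetch_rows: Callable[[int, int], List[dict]],
    page_size: int = 10000,
) -> Iterator[dict]:
    """Yield every row by advancing a row offset one page at a time.

    `fetch_rows` is a hypothetical wrapper around tabledata.list that
    takes (start_index, max_results) and returns that page's rows.
    """
    start_index = 0
    while True:
        rows = fetch_rows(start_index, page_size)
        if not rows:
            # An empty page means the offset is past the last row.
            break
        yield from rows
        # Advance by the number of rows actually returned, since a page
        # may hold fewer rows than requested.
        start_index += len(rows)
```

Advancing by `len(rows)` rather than by `page_size` keeps the offset correct even if the server returns a shorter page than requested.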

Unfortunately, the field returned for TableData.List() that contains the logical "next page token" is literally named "pageToken", rather than "nextPageToken".
Other APIs, like Datasets.List(), return a field literally named "nextPageToken" which contains the logical "next page token".
It's a case of inconsistent naming, but hopefully this helps clear up some confusion.
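To illustrate the naming point, here is a hedged sketch of a token-based loop for TableData.List(). It assumes a hypothetical `fetch_page(page_token)` callable that performs the request and returns the decoded JSON response as a dict; the token for the next page is read from the response's `pageToken` field, as described above, and the loop ends when that field is absent.

```python
from typing import Callable, Iterator, Optional


def iter_rows_by_token(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[dict]:
    """Yield rows from every page of a tabledata.list-style response.

    `fetch_page` is a hypothetical callable: it receives the token to
    send as the `pageToken` URL parameter (None for the first request)
    and returns the parsed JSON response.
    """
    token: Optional[str] = None
    while True:
        response = fetch_page(token)
        yield from response.get("rows", [])
        # For tabledata.list the "next page" token is named "pageToken".
        token = response.get("pageToken")
        if not token:
            break
```

For APIs such as Datasets.List(), the same loop would read `nextPageToken` from the response instead.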

Related

DataTable - get row_ids of displayed data on page

I use pagination (10 rows per page). I would like to get via a callback the row ids of rows which are displayed on the current page.
I found the 'derived_viewport_row_ids' attribute promising, but it results in a list of Nones.
from dash import Input, Output, no_update
@app.callback(Output('main-table','style_data_conditional'), Input('main-table','derived_viewport_row_ids'), prevent_initial_call=True)
def gradient_style(page_row_ids):
    # Print the row ids of the rows displayed on the current page.
    print(page_row_ids)
    return no_update
Does anybody have an idea how to access this information?

Unable to retrieve unlimited number of record via API

We are struggling to mine all time records for this year via API.
We have tried to include the :dont_limit_result GET variable and set it to 1, however it did not help us.
The version that we use is ACTIVE COLLAB 5.11.0, the URL we are hitting: projects?dont_limit_result=1&page=$page
Please give me some advice on how to proceed.
Most API responses are paginated, and pagination can't be turned off using a GET switch. Instead, you should check the following headers:
X-Angie-PaginationCurrentPage - indicates current page
X-Angie-PaginationItemsPerPage - indicates number of items per page
X-Angie-PaginationTotalItems - indicates number of items in the entire data set.
and walk through the pages until you reach the end of the data set.
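The page walk can be sketched roughly as follows. This assumes a hypothetical `fetch_page(page)` callable that requests one page (for example via the `page` GET variable used in the question) and returns a tuple of the decoded items and the response headers as a mapping keyed exactly by the header names listed above. The function name and the way the last page is computed are illustrative assumptions.

```python
import math
from typing import Callable, Iterator, List, Mapping, Tuple


def iter_all_items(
    fetch_page: Callable[[int], Tuple[List[dict], Mapping[str, str]]],
) -> Iterator[dict]:
    """Yield items from every page, using the pagination headers.

    `fetch_page` is a hypothetical callable returning (items, headers)
    for the given 1-based page number.
    """
    page = 1
    while True:
        items, headers = fetch_page(page)
        yield from items
        per_page = int(headers["X-Angie-PaginationItemsPerPage"])
        total = int(headers["X-Angie-PaginationTotalItems"])
        # Number of pages needed to cover the whole data set.
        last_page = math.ceil(total / per_page) if per_page else page
        if page >= last_page:
            break
        page += 1
```

HTTP header names are case-insensitive, so a real client should look them up through its library's case-insensitive header object rather than a plain dict.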
Another option is to give project's filter a try. Here's an example request that will return all projects:
curl -H "X-Angie-AuthApiToken: YOUR-API-TOKEN" "http://your.activecollab.com/api/v1/reports/run?type=ProjectsFilter"
This one will return all active projects:
curl -H "X-Angie-AuthApiToken: YOUR-API-TOKEN" "http://your.activecollab.com/api/v1/reports/run?type=ProjectsFilter&completed_on_filter=is_not_set"
I'm using the PHP API wrapper 3.0. How do I get the headers back to know whether there are more pages, and what is the correct form of the query to get further pages?
For example my basic query is:
$timeRecords = $client->get('projects/22/time-records')->getJson();
to get time records - but this only returns 100 and there are more!
Thanks,
P

Valid to return different json-response depending on list or retrieve?

I am currently designing a REST API and am a little stuck on performance matters for 2 of the use cases in the system:
List all campaigns (api/campaigns) - needs to return the campaign data needed for listing and paging campaigns. It may return up to 1000 records, so retrieving and returning detailed data would take ages. The needed data can be returned in a single DB call.
Retrieve campaign item (api/campaigns/id) - needs to return all data about the campaign and may take up to a second to run. Multiple DB calls are needed to get all campaign data for a single campaign.
My question is: Is it valid to return different JSON responses to those 2 calls (if well documented) even though they concern the same resource? I am thinking that the list response would be a subset of the retrieve response. The reason for this is to save DB calls, bandwidth and parsing.
Thanks in advance!
I think it's both fine and expected for /campaigns and /campaigns/{id} to return different information. I would suggest using query parameters to limit the amount of information you need to return. For instance, only return a URI to each player unless you see a ?expand=players query parameter, in which case you return detailed player information.
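A small sketch of the `?expand=` idea, under stated assumptions: the field names (`id`, `name`, `players`) and the URI layout (`/api/campaigns/...`, `/api/players/...`) are hypothetical and only illustrate returning links by default and full objects when expansion is requested.

```python
from typing import Optional, Set


def parse_expand(query_value: Optional[str]) -> Set[str]:
    """Turn a raw `expand` query value such as "players,stats" into a set."""
    return {part.strip() for part in (query_value or "").split(",") if part.strip()}


def serialize_campaign(campaign: dict, expand: Set[str]) -> dict:
    """Build the JSON body for one campaign.

    Related players are returned as URIs unless "players" was requested
    via the expand parameter, in which case the full objects are included.
    """
    body = {
        "id": campaign["id"],
        "name": campaign["name"],
        "href": f"/api/campaigns/{campaign['id']}",
    }
    if "players" in expand:
        body["players"] = campaign["players"]
    else:
        body["players"] = [f"/api/players/{p['id']}" for p in campaign["players"]]
    return body
```

The list endpoint could call `serialize_campaign(c, set())` for each campaign, while the detail endpoint passes `parse_expand(...)` of the incoming query value.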

Clarification on maxResults and nextPageToken using Google Drive API v2

I just wanted clarification with regard to the Files: list feature of the Google Drive API here:
https://developers.google.com/drive/v2/reference/files/list
What is the maximum value that can be specified with maxResults? I assume this value determines the number of results on the next page of results?
Also, is the nextPageToken simply part of the query string that's required to be passed with nextLink to get the next page of results?
Thanks!
The maxResults query parameter can be used to limit (or increase) the number of items returned in a list request. There is a default value and a hard limit that is set by our server.
Unfortunately, we don't usually document those numbers as they can easily change; instead we recommend that developers look for a nextPageToken and/or nextLink in the resulting collection to know whether or not all items have been returned.
The nextPageToken attribute is to be used as the pageToken query parameter on a list request. If you are using the nextLink from the resulting collection, you do not need to specify the pageToken query parameter as it should already be included.
maxResults cannot be larger than 1000 according to this page: https://developers.google.com/drive/v2/reference/files/list
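As a rough sketch of how the token is carried over, the helper below builds the query parameters for the next list request from a decoded response. The function name is hypothetical, and the cap of 1000 is taken from the statement above rather than from a guaranteed server limit.

```python
from typing import Optional


def next_list_params(response: dict, max_results: int = 100) -> Optional[dict]:
    """Return query parameters for the next page, or None when done.

    The response's nextPageToken is sent back as the pageToken query
    parameter; no token means all items have been returned.
    """
    token = response.get("nextPageToken")
    if not token:
        return None
    # Assumed upper bound of 1000, as stated in the answer above.
    return {"maxResults": min(max_results, 1000), "pageToken": token}
```

When following `nextLink` instead, this step is unnecessary because the link already contains the page token.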

Box API: Get_managed_users returning all users

Using the Box 1.0 REST API, I am trying to work with the functions in SOAP UI.
The API doc for get_managed_users with user_id=12345 (internal id retrieved with get_user_id call correctly) is returning all the users. The docs say that would be the case if you do not specify a user_id value. But my full command is: (Token and API key changed to protect the clueless)
https://www.box.com/api/1.0/rest?user_id=27360&auth_token=blahbalhblah1234&action=get_managed_users&api_key=someKeyYouShouldNotSee
Now I could work with the complete result list, but that won't scale as we get thousands of users into the system.
I can make a call with edit_managed_user using the same user_id value, and the change is reflected in the UI and in the next get_managed_users call. So I assume I do have the correct user_id value.
I tried testuser@gmail.com as the user_id value as well, and got the entire list back. This leads me to believe that I am somehow sending user_id wrong, but I just do not see it.
Any hints? Why, with what seems like a valid user_id value is it acting like it is absent or incorrect?
Most likely you have either called this method with an invalid user_id, or one that is not in your set of managed users. Can you double check that the user comes back in your list of already managed users?