page and page_size parameters are ignored for get_groups, get_group_folders, and get_group_users - box-api

I'm working on an application that uses the Box v1 "enterprise" APIs for user and group management (the v2 API doesn't have these methods yet). Specifically, I'm enumerating groups and their associated folders and users using get_groups, get_group_folders, and get_group_users.
I have a large number of groups and folders in my organization, and I'm unable to page through the results; I only get 20 items at a time from each of these APIs. I've tried variations on the page and page_size parameters listed in the API docs, but they don't seem to do anything.
Specifically, each of these three requests gives me the same 20 groups back:
https://www.box.net/api/1.0/rest?api_key=XXX&auth_token=YYY&action=get_groups
https://www.box.net/api/1.0/rest?api_key=XXX&auth_token=YYY&action=get_groups&page=2
https://www.box.net/api/1.0/rest?api_key=XXX&auth_token=YYY&action=get_groups&page_size=50
The same goes for get_group_folders and get_group_users.

Optional parameters need to be wrapped in params[]. For example, to change page_size, your request would be:
http://box.net/api/1.0/rest?action=get_groups&api_key=API_KEY&auth_token=AUTH_TOKEN&params[page_size]=VALUE
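As a rough sketch of the same idea in Python: the helper below wraps every optional argument in params[...] before building the query string. The helper name build_group_request is made up for illustration, and applying the same wrapping to page (not only page_size) is an assumption based on the answer above, not something confirmed in the thread.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.box.net/api/1.0/rest"


def build_group_request(action, api_key, auth_token, **optional):
    """Return a v1 REST URL with optional parameters wrapped in params[...]."""
    query = {"action": action, "api_key": api_key, "auth_token": auth_token}
    for name, value in optional.items():
        # Hypothetical generalisation: every optional parameter gets the wrapper.
        query["params[%s]" % name] = value
    # safe="[]" keeps the brackets literal instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(query, safe="[]")


url = build_group_request("get_groups", "API_KEY", "AUTH_TOKEN", page=2, page_size=50)
print(url)
```

The same helper would apply to get_group_folders and get_group_users by changing the action argument.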

Related

What is the reason why the OneNote APIs won't return all the pages in a notebook?

I have been reading around here and I see multiple messages about the /pages endpoint not working as expected.
It seems that the OneNote APIs (MS Graph or Office365) are not returning all the pages that the user can see. In particular, recent pages are not shown as available.
This message is for those of you who work for Microsoft and keep an eye on this forum. If you have any explanation or workaround for this, we would like to hear about it.
If this is work in progress, we would also like to know when the APIs can be considered stable and reliable enough for production use.
Update:
Permissions or scopes
scopes=[
"Notes.Read",
"Notes.Read.All",
"Notes.ReadWrite",
]
This is for a device authorization flow; the device is acting as a Microsoft Online account. The app is registered in Azure as a personal app, but the enterprise one behaves the same way.
The authorization process is described here
What type of app/authentication flow should I select to read my cloud OneNote content using a Python script and a personal Microsoft account?
After that I am using this endpoint to get the notebooks
https://graph.microsoft.com/v1.0/users/user-id/onenote/notebooks
From the returned JSON I pick the notebook I want to read and access the link stored in notebook['sectionsUrl']. This call returns a sections JSON.
From this I pick the section I want and access the link stored in section['pagesUrl'].
Each call returns the expected info except the last one, where I get an arbitrarily low number of pages in the section I want to explore. There is nothing wrong with the format of the info; it is just incomplete or not up to date.
Not sure if this is related, but when I try to access the pages in a section from MS Graph Explorer I see the same behavior (not all the pages are reported). This is a shared notebook and I am using the owner account for all of the above, so it should not be a permission problem.
from msal import PublicClientApplication
import requests

# client_id and scopes are defined elsewhere (scopes as listed above)
authority = "https://login.microsoftonline.com/consumers"
app = PublicClientApplication(client_id=client_id, authority=authority)
flow = app.initiate_device_flow(scopes=scopes)
# there is an interactive part here that I automated using selenium: you
# are supposed to use a link to enter a code and then authorize the
# device; code not shown
result = app.acquire_token_by_device_flow(flow)
token = result['access_token']
headers = {'Authorization': 'Bearer ' + token}
endpoint = "https://graph.microsoft.com/v1.0/users/c5af8759-4785-4abf-9434-xxxxxxxxx/onenote/notebooks"
notebooks = requests.get(endpoint, headers=headers).json()
for notebook in notebooks['value']:
    print(notebook['displayName'])
    print(notebook['sectionsUrl'])
    print(notebook['sectionGroupsUrl'])
# I pick a certain notebook (selection not shown), then fetch its sections
sections = requests.get(notebook['sectionsUrl'], headers=headers).json()['value']
section = [section for section in sections if section['displayName'] == "Test"][0]
# list the pages of the chosen section
endpoint = section['pagesUrl']
pages = requests.get(endpoint, headers=headers).json()
for page in pages['value']:
    print(page['title'])
Update2
If I use this endpoint
https://graph.microsoft.com/v1.0/users/user-id/onenote/sections/section-id/pages
I would expect to get the complete list of pages for that section.
That is not working
After reading the docs again and again, my understanding is that the approach is to
call https://graph.microsoft.com/v1.0/users/user-id/onenote/pages with $filter or search, etc.
Is this correct?
Also, I vaguely remember there is a way to search for a section and have it expanded so that the search returns the children too.
Am I close to understanding this?
Thank you
MM
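One thing that may be worth ruling out: Microsoft Graph collection responses can be split across several responses, with an @odata.nextLink property pointing at the next one. Whether that explains the missing pages here is not established in this thread. The sketch below follows those links until none is left; iter_all_pages and get_json are illustrative names, not part of msal or requests.

```python
def iter_all_pages(first_url, get_json):
    """Yield every item of a Graph collection, following @odata.nextLink.

    get_json is any callable that takes a URL and returns the decoded JSON
    body, e.g. lambda url: requests.get(url, headers=headers).json()
    """
    url = first_url
    while url:
        body = get_json(url)
        for item in body.get("value", []):
            yield item
        # The property is absent on the last response, which ends the loop.
        url = body.get("@odata.nextLink")
```

With the variables from the question, list(iter_all_pages(section['pagesUrl'], lambda url: requests.get(url, headers=headers).json())) would collect the pages of the chosen section across all responses.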

Amazon: product advertising api pagination top sellers

Is this a limitation of the amazon API?
I would like to pull data similar to this page: amazon.com/Best-Sellers-Home-Improvement-Pumps-Plumbing-Equipment/zgbs/hi/13749581/ref=zg_bs_nav_hi_1_hi
STACKOVERFLOW BREAKS THIS LINK!
I am using:
operation: 'BrowseNodeLookup',
response_group: "BrowseNodeInfo,TopSellers"
The TopSeller response group only returns 10 items and does not respond to ItemPage.
Is there a way to do item lookup without a query using a browse node and sorting by popularity?
The AWS documentation on the BrowseNodeLookup API and the TopSellers response group indicates that it only includes the top 10, and there is no mention of pagination.
The TopSellers response group returns the ASINs and titles of the 10 best sellers within a specified browse node.
However, the results from TopSellers are basically equivalent to the results of an ItemSearch with Sort set to salesrank. Therefore, you can solve pagination requirements as follows:
On initial load (such as a user loading a web page or opening a particular view in a mobile application), issue BrowseNodeLookup and retrieve TopSellers. Populate some portion of the UI with information from the browse node and some other portion of the UI with the TopSellers results.
If the user never goes past the first page, then do nothing more. (There is no need to spend time on an additional service call.)
As the user navigates to subsequent pages, issue ItemSearch with Sort set to salesrank and ItemPage set to the page number. Use these results to update the portion of the web page/view in your application that was previously populated from the browse node TopSellers.
Note that you will still only be able to retrieve up to 10 pages worth of results. This is an ItemSearch API limitation.
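The strategy above can be sketched as a small function that picks the request parameters for a given page. This is only an outline under stated assumptions: request signing and the HTTP call are left out, the function name is made up, and the SearchIndex value used in the example ("Tools") is a guess for this browse node rather than something taken from the answer.

```python
MAX_ITEM_PAGE = 10  # ItemSearch page limit mentioned in the answer


def top_seller_request_params(browse_node_id, search_index, page):
    """Return the (unsigned) request parameters for one page of best sellers."""
    if page < 1 or page > MAX_ITEM_PAGE:
        raise ValueError("page must be between 1 and %d" % MAX_ITEM_PAGE)
    if page == 1:
        # First page: browse node info plus its top 10 sellers in one call.
        return {
            "Operation": "BrowseNodeLookup",
            "BrowseNodeId": browse_node_id,
            "ResponseGroup": "BrowseNodeInfo,TopSellers",
        }
    # Later pages: fall back to ItemSearch sorted by sales rank.
    return {
        "Operation": "ItemSearch",
        "SearchIndex": search_index,  # assumed value, depends on the category
        "BrowseNode": browse_node_id,
        "Sort": "salesrank",
        "ItemPage": str(page),
    }


first = top_seller_request_params("13749581", "Tools", 1)
third = top_seller_request_params("13749581", "Tools", 3)
```

Keeping page 1 on BrowseNodeLookup avoids a second service call for users who never go past the first page.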

How to restrict fields returned by stackexchange api, and turn off paging?

I'd like to have a list of just the current titles for all questions in one of the smaller (less than 10,000 questions) stackexchange site. I tried the interactive utility here: https://api.stackexchange.com/docs/questions and it both reports the result as a json at the bottom, and produces the requesting url at the top. For example:
https://api.stackexchange.com/2.2/questions?order=desc&sort=activity&tagged=apples&site=cooking
returns this JSON in my browser:
{"items":[{"tags":["apples","crumble"],"owner":{ ...
...
...],"has_more":true,"quota_max":300,"quota_remaining":252}
What is quota? It was 10,000 on one search on one site, but suddenly it's only 300 here.
I won't be doing this very often. What I'd like is the quickest way to edit that (or a similar) URL so I can get a list of all of the titles on a small site. I don't understand how to use paging, and I don't need any of the other fields. I don't care if I get them, but I'm thinking that if I exclude them I can get more at once.
If I need to script it, python (2.7) is my preferred (only) language.
quota_max is the number of requests your application is allowed per day. 300 is the default for an unregistered application. This used to be mentioned directly on the page describing throttles, but seems to have been removed. Here is historical information describing the default.
To increase this to 10,000, you need to register an application and then authenticate by passing an access token in your script.
To get all titles on a site, you can use a Python library to help:
StackAPI. The answer below will use this library. DISCLAIMER: I wrote this library
Py-StackExchange
SEAPI
StackPy
Assuming you have registered your application and authenticated we can proceed.
First, install StackAPI (documentation):
pip install stackapi
This code will then grab the 10,000 most recent questions (max_pages * page_size) for the site hardwarerecs. Each page costs you one API hit, so the more items per page, the fewer API calls.
from stackapi import StackAPI
SITE = StackAPI('hardwarerecs')
SITE.page_size = 100
SITE.max_pages = 100
# Filter to only get question title and link
filter = '!BHMIbze0EQ*ved8LyoO6rNjkuLgHPR'
questions = SITE.fetch('questions', filter=filter)
In the questions variable is a dictionary that looks very similar to the API output, except that the library did all the paging for you. Your data is in questions['data'] and, in this case, contains a list of dictionaries that look like this:
[
...
{u'link': u'http://hardwarerecs.stackexchange.com/questions/29/sound-board-to-replace-a-gl2200-in-a-house-of-worship-foh-setting',
u'title': u'Sound board to replace a GL2200 in a house-of-worship FOH setting?'},
{ u'link': u'http://hardwarerecs.stackexchange.com/questions/31/passive-gps-tracker-logger',
u'title': u'Passive GPS tracker/logger'}
...
]
This result set is limited to only the title and the link because of the filter we applied. You can find the appropriate filter by adjusting what fields you want in the web UI and copying the filter field.
The hardwarerecs parameter that is passed when creating the SITE object is the first part of the site's domain URL. Alternatively, you can find it by looking at the api_site_parameter for your site in the /sites endpoint.
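For reference, the paging that the library takes care of can be sketched by hand. This is a minimal outline, not StackAPI's actual implementation: it requests page after page with the page and pagesize parameters and stops when has_more is false, sleeping whenever the response carries a backoff field. The function name and the injected get_json callable are illustrative.

```python
import time

QUESTIONS_URL = "https://api.stackexchange.com/2.2/questions"


def fetch_all_questions(get_json, site, pagesize=100, max_pages=100, sleep=time.sleep):
    """Collect question items page by page until has_more is false.

    get_json is any callable taking (url, params) and returning decoded JSON,
    e.g. lambda url, params: requests.get(url, params=params).json()
    """
    items = []
    for page in range(1, max_pages + 1):
        params = {"site": site, "page": page, "pagesize": pagesize,
                  "order": "desc", "sort": "activity"}
        body = get_json(QUESTIONS_URL, params)
        items.extend(body.get("items", []))
        if "backoff" in body:
            # The API asks clients to wait this many seconds before the next call.
            sleep(body["backoff"])
        if not body.get("has_more"):
            break
    return items
```

Each loop iteration is one API hit against the daily quota, which is why a larger pagesize is cheaper.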

How to get Domain Authority from http://moz.com using GET request?

I want to get the domain authority value from "moz.com" (I didn't find other sources).
Sometimes the page does not load properly, and the response from moz.com lacks the DOM elements I parse. The page probably uses JavaScript to show the values. It also has a restriction: a site cannot be analyzed more than 3 times a day (I need to visit it at most once a day).
require 'rest-client'
require 'nokogiri'
link_url = "http://google.com"
api_url = "http://moz.com/researchtools/ose/links?site="
response = RestClient.get(api_url + link_url.split("?").first)
value = Nokogiri::HTML(response).css('.url-metrics-authority span.large').first.text.strip #previously there was Nokogiri::HTML(response).css('.metrics-authority').first.text.strip
pp value
From the console that works fine, but when I run it as a Ruby script, it fails.
Can I somehow wait for js to execute or are there any other sources to get domain authority?
You can get the Domain Authority for any website/URL by using the free URL Metrics API provided by Moz. You will need an AccessId and a Secret key to consume the Mozscape APIs. I would suggest building a wrapper API around the Moz API to get the Domain Authority, so that you can consume the wrapper from JavaScript.
I am Russ Jones and consult for Moz. I also helped architect the latest version of Domain Authority.
The appropriate documentation for collecting Domain Authority is here
Getting an API Key is free and allows for 2,500 lookups per month at no faster than 1 every 10 seconds. Paid access starts at $250/mo and includes 120,000 rows per month with significantly fewer restrictions.
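If you script lookups against the free tier, the "no faster than 1 every 10 seconds" limit can be respected with a small client-side throttle. The class below is a generic sketch and not part of any Moz client library; the clock and sleep arguments are injectable only to make it easy to test.

```python
import time


class MinIntervalThrottle:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval=10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling throttle.wait() immediately before each API request is enough; the first call returns without sleeping.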

Tweet counter for identi.ca

Is there a way to retrieve the number of times a certain URL was "dented" (shared on identi.ca, status.net and/or the like)?
For twitter there are several services that give this information.
Twitter itself: http://urls.api.twitter.com/1/urls/count.json?url=http://example.com&callback=twttr.receiveCount
Tweetmeme: http://api.tweetmeme.com/url_info.jsonc?url=http://example.com
Topsy: http://otter.topsy.com/stats.js?url=http://example.com&callback=?
I don't need the fancy extra information that Tweetmeme or Topsy deliver, only the count.
I am aware that this is problematic given the "distributed" nature of status.net: it will only give a count from one single silo, e.g. identi.ca. However, for me, for now, that would be enough.
Is there such an endpoint that gives me such JSON?
I don't think so. There's a file table in StatusNet databases that holds references to dented URLs (so it wouldn't be hard to count them if you had access to database or could write a plugin -- i.e., you wouldn't have to parse all notices, just lookup the file table), but it's not exposed through the API.
The list of possible API calls for StatusNet is here: http://status.net/wiki/TwitterCompatibleAPI
In addition, there's a proposed Google Summer of Code project on this subject: Social Analytics plugin