Facebook Graph API v2.10 page likes - json

Disclosure: I am not a programmer.
For the past year or so I have been using the Facebook Graph API to pull Facebook page "likes" into a spreadsheet so that I can track how many likes my page gets versus other pages of similar businesses. It is rudimentary, but it became cumbersome to visit every page each week to get the total page "likes", so this is my solution.
I was using this formula:
=importjson(concatenate("https://graph.facebook.com/",APINames!$B11,"?access_token=",$B$1),"/likes","noHeaders")
I referenced this post:
Get Facebook page like count for OpenGraph v2.10
It states that a different URL should be used to retrieve page likes:
https://graph.facebook.com/v2.10/<page-id>?access_token=<access-token>&fields=fan_count
When I input the new URL into my formula, I still receive a reference error.
=importjson(concatenate("https://graph.facebook.com/v2.10/",APINames!$B3,"?access_token=",$B$1),"&fields=fan_count","noHeaders")
If anyone could point me in the right direction I would be very grateful. I have spent over an hour scouring the web for information, as well as reading the changelog for v2.10. I fear going back to the manual process!!

I'm not 100% sure, but you should try this:
=importjson(concatenate("https://graph.facebook.com/v2.10/",APINames!$B3,"?access_token=",$B$1,"&fields=fan_count"),"/fan_count", "noHeaders")
I think the brackets are in the wrong place: you need to concatenate the URL, the page ID (in APINames), the access_token query parameter, the access token from cell B1, and the fields query parameter.
At least with this change I get a fan_count. Interesting use case, BTW.
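If it helps to sanity-check the endpoint outside of Sheets, here is a minimal Python sketch of the same request; the page ID and access token are placeholders you would fill in yourself:

import requests

# Placeholders - substitute your own page ID and access token.
PAGE_ID = "your-page-id"
ACCESS_TOKEN = "your-access-token"

url = "https://graph.facebook.com/v2.10/" + PAGE_ID
params = {"access_token": ACCESS_TOKEN, "fields": "fan_count"}

resp = requests.get(url, params=params)
resp.raise_for_status()

# The response body is JSON like {"fan_count": 1234, "id": "..."}
print(resp.json()["fan_count"])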

Related

How does one scrape multiple pages with Beautiful Soup for a website that requires a login?

Currently, I'm working with Beautiful Soup to extract information from websites. I've managed to scrape data from multiple pages of a certain apartment rental website with it - let's call it "Website A". Now, I'd like to obtain data from another rental website ("Website B"). I tried to follow a similar procedure, but it failed because Website B requires a login.
I did manage to scrape the first page of apartments on Website B by means of Adelin's answer. Their approach is based on the use of Curl Trillworks (link). In principle, this approach could work for Website B as well. However, one would then need to manually repeat the procedure for the 800 or so pages on which the apartments are listed, and afterwards do the same for each of the 15 apartments per page.
This is too much work for me, so I am trying to automate the process. For instance, I tried adapting this to my situation, but I haven't succeeded so far: the dictionary I get is empty. I also tried making a new header for each page by putting a new referer in the original header each time, and then putting these referers in the header dictionary. However, this failed - probably because Website B recognized that I was using the same cookie every time I sent a request (the same one I used for the original apartment page of Website B).
So my question is: suppose one had a list of pages on Website B that all have the same format (www.websiteB.com/PageNumber/). How would one quickly/automatically obtain a header for each page by means of one's own login credentials for the website, with which one can get an appropriate response?
I could share the code I have so far, but I'm somewhat hesitant: this is a large commercial website, and I suspect they wouldn't be particularly happy with me sharing code that allows their website to be scraped and that names the website as well.
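In case it helps frame the question: the usual pattern here is to log in once with a persistent session and reuse its cookies across all pages, rather than crafting a fresh header per page. A minimal sketch, assuming hypothetical login and page URLs and form field names (Website B is not named):

import requests
from bs4 import BeautifulSoup

# All URLs and form field names here are hypothetical placeholders.
LOGIN_URL = "https://www.websiteB.com/login"
PAGE_URL = "https://www.websiteB.com/{page}/"

session = requests.Session()  # cookies persist across requests

# Log in once; the session keeps the auth cookies it receives.
session.post(LOGIN_URL, data={"username": "me", "password": "secret"})

for page_number in range(1, 801):  # the ~800 listing pages
    resp = session.get(PAGE_URL.format(page=page_number))
    soup = BeautifulSoup(resp.text, "html.parser")
    # ...extract the ~15 apartments listed on this page...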

Would it be possible to scrape data from Airbnb directly into a Google Sheet?

I'm trying to build a super simple Google Sheets dashboard comparing, in real time, the D+7 and D+30 prices of specific listings/rooms that are on both Airbnb and Booking.com.
On the Booking.com side, it was super easy: I just created a formula concatenating the URL with the check-in/check-out dates, number of guests, and trip duration as parameters, and using the =IMPORTXML function with the proper class, I was able to automatically retrieve the price.
It is more difficult on Airbnb, as the price is dynamic (see here: https://www.airbnb.com/rooms/25961741). When I use what I think is the proper class, I get an "N/A Error, Imported content is empty" in Google Sheets.
I also tried using the Airbnb API with REGEX functions to extract the price, but the price set in the listing info is a default price, and does not reflect reality:
"price":1160,"price_formatted":"$1160"
https://api.airbnb.com/v2/listings/25961741?client_id=d306zoyjsyarp7ifhu67rjxn52tv0t20&_format=v1_legacy_for_p3&number_of_guests=1
Do you know if there is any other possible way to access this dynamic price and have it automatically parsed into a spreadsheet? It seems that the data I'm looking for is within meta tags in the HTML code, and I don't know if it's possible to scrape it into Google Sheets using =IMPORT functions.
Maybe with a script?
Thanks a lot!
I'm curious whether you were able to pull directly from the Airbnb API; what if you tried pulling from the site's own search service instead? Have a look at this URL:
https://www.airbnb.com/api/v2/explore_tabs?version=1.3.9&satori_version=1.0.7&_format=for_explore_search_web&experiences_per_grid=20&items_per_grid=18&guidebooks_per_grid=20&auto_ib=false&fetch_filters=true&has_zero_guest_treatment=false&is_guided_search=true&is_new_cards_experiment=true&luxury_pre_launch=false&query_understanding_enabled=true&show_groupings=true&supports_for_you_v3=true&timezone_offset=-240&client_session_id=8e7179a2-44ab-4cf3-8fb8-5cfcece2145d&metadata_only=false&is_standard_search=true&refinement_paths%5B%5D=%2Fhomes&selected_tab_id=home_tab&checkin=2018-09-15&checkout=2018-09-27&adults=1&children=0&infants=0&click_referer=t%3ASEE_ALL%7Csid%3A61218f59-cb20-41c0-80a1-55c51dc4f521%7Cst%3ALANDING_PAGE_MARQUEE&allow_override%5B%5D=&price_min=16&federated_search_session_id=5a07b98f-78b2-4cf9-a671-cd229548aab3&screen_size=medium&query=Paris%2C%20France&_intents=p1&key=d306zoyjsyarp7ifhu67rjxn52tv0t20&currency=USD&locale=en
This is a GET request to Airbnb's live page search. I don't know much about Airbnb, but I can see from the listings portion of the JSON feed that it has a few pricing fields that differ from the API results you provided. I'm not sure exactly what you need to pull, but this may lead you in the right direction; check the 'listings' array and see if there's something you can use.
Keep in mind that if you are looking to automate scraping this data, you would want to generate new search sessions; but first, see whether this is the type of data you're looking for.
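To make that concrete, here is a rough Python sketch of fetching the endpoint and walking the listings. The JSON structure (explore_tabs / sections / listings / pricing_quote) is an assumption based on what the feed appeared to return, so verify it against a live response:

import requests

# A trimmed-down version of the explore_tabs URL above; the key is the
# same public client key from the question's API example.
url = "https://www.airbnb.com/api/v2/explore_tabs"
params = {
    "_format": "for_explore_search_web",
    "query": "Paris, France",
    "checkin": "2018-09-15",
    "checkout": "2018-09-27",
    "adults": "1",
    "key": "d306zoyjsyarp7ifhu67rjxn52tv0t20",
    "currency": "USD",
    "locale": "en",
}

data = requests.get(url, params=params).json()

# Hypothetical traversal - adjust to the structure you actually get back.
for tab in data.get("explore_tabs", []):
    for section in tab.get("sections", []):
        for item in section.get("listings", []):
            listing = item.get("listing", {})
            pricing = item.get("pricing_quote", {})
            print(listing.get("id"), listing.get("name"), pricing.get("rate"))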
Another option is Google CSE's API; I've pulled data from the page headers of sites as they appear in Google, based on their Schema.org tags. But that may be delayed data, and it appears you need real time. The best route would be to research the example above, or try to make use of Airbnb's native API (they provide its functionality for a reason, right? There must be a way to get what you need).
Hope my answer helped lead you in the right direction!

How do I make a URL that is specific to some set of data without an HTML file?

For example, Facebook. I have a list of different teams that shows overall data for each team, but the goal is that the user will click on their team and be sent to a page with detailed information about that team. My client wants the user's team name to be within the URL so they can save the page as a favorite.
How do I do this without making an HTML file for every single team that gets created?
I'm using Django for the back end.
If you're using Django, then there is certainly no sense in making HTML for each team. You should make a template and populate it with the data you're getting from your database (models.py), according to the URL (urls.py) and the appropriate view (views.py).
This is fairly basic Django usage, covered extensively in the official tutorials and the Django documentation. Read them and use them, because there is no shortcut. And last but not least, enjoy: such good tutorials and docs you won't see every day.
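A minimal sketch of that pattern, assuming a hypothetical Team model with a slug field (all names here are illustrative, not from the question):

# urls.py - the team's slug becomes part of the bookmarkable URL.
from django.urls import path
from . import views

urlpatterns = [
    path("teams/<slug:team_slug>/", views.team_detail, name="team-detail"),
]

# views.py - one view and one template serve every team.
from django.shortcuts import get_object_or_404, render
from .models import Team

def team_detail(request, team_slug):
    team = get_object_or_404(Team, slug=team_slug)
    return render(request, "teams/detail.html", {"team": team})

Every team then gets a saveable URL like /teams/red-sox/ without a separate HTML file per team.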

Adding Tab to Page

I am trying to add a tab to a page I am an admin of.
I use this URL to do that:
http://www.facebook.com/dialog/pagetab?app_id=&next=.
Facebook shows a list of all the pages I am an admin of, and that drop-down has no specific sorting order.
Now my problem is that I have multiple pages with the same page name. They of course have different URLs. I tried changing the names of the pages, but due to the high number of likes I can't change the names.
The only option I am left with is trial and error, and I have to do it for more than 30 apps.
So you understand my pain point.
Please advise any alternative.
Thanks
Pankaj
I would recommend writing down the page IDs and making some sort of system for yourself to remember which page is which (perhaps only the last few digits).
In any case, there is a way for you to add a tab application directly to a page without ever seeing that "Add Page Tab" dialog. You can do it all through the API. This means you'll need your page's access token, so head over to the Graph API Explorer, make sure you click the "Get Access Token" button, and mark the manage_pages permission.
You need to query /me/accounts to get a list of all the pages you administer.
You'll see a list with the page ID, name, category... I hope you will be able to identify your page more easily here. Once you have identified it, you'll need to get the access_token for that page. Keep a record of it - we'll need it in a few minutes. You'll also need the page ID.
Modify the following URL to include the parameters we got previously -
https://graph.facebook.com/PAGE_ID/tabs?app_id=TAB_APP_ID&method=post&access_token=PAGE_ACCESS_TOKEN
Navigate to that URL and if all goes well, you'll get a simple true message indicating that the action was successful.
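The same flow as a Python sketch, using an actual POST instead of the method=post trick in the URL above (the requests library and all token/ID values are placeholders, not values from the question):

import requests

USER_TOKEN = "your-user-access-token"  # from the Graph API Explorer
TAB_APP_ID = "your-tab-app-id"

# Step 1: list the pages you administer, with their page access tokens.
accounts = requests.get(
    "https://graph.facebook.com/me/accounts",
    params={"access_token": USER_TOKEN},
).json()

for page in accounts["data"]:
    print(page["id"], page["name"], page["category"])

# Step 2: once you've spotted the right page, install the tab on it.
page = accounts["data"][0]  # the page you identified above
resp = requests.post(
    "https://graph.facebook.com/" + page["id"] + "/tabs",
    params={"app_id": TAB_APP_ID, "access_token": page["access_token"]},
)
print(resp.json())  # true on success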

Partial GET request for Google Calendar html download

Hi, I'm working on an Arduino project and I'd like to display the next event from a Google Calendar on a small display. I want to know if there's a way to limit the size of the response from Google. Right now, when I make the request, I get my full calendar's data, which significantly slows down the time it takes to get the event. I tried using a GET request with a Range header (bytes 1000-3000), but that doesn't seem to work. Does anyone know any workarounds for this without going through OAuth?
You want the maxResults parameter, and you may also like to limit the fields returned using the fields parameter. Check the events > list docs for details.
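For a public calendar this can be done with a plain API-key request, no OAuth. A minimal sketch against the Calendar API v3 events.list endpoint (the calendar ID and key are placeholders); on the Arduino you would build the same URL by hand:

import requests

CALENDAR_ID = "your-calendar-id@group.calendar.google.com"
API_KEY = "your-api-key"

resp = requests.get(
    "https://www.googleapis.com/calendar/v3/calendars/"
    + CALENDAR_ID + "/events",
    params={
        "key": API_KEY,
        "maxResults": "1",                  # just the next event
        "orderBy": "startTime",
        "singleEvents": "true",             # required for orderBy=startTime
        "timeMin": "2018-01-01T00:00:00Z",  # placeholder: use "now" here
        "fields": "items(summary,start)",   # trim the payload further
    },
)
print(resp.json())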