Delicious API - All bookmarks for a given website? - json

Is there a way for me to get at a list of the URLs on my website which people have bookmarked on the delicious.com website? Their documentation appears to make no reference to wildcard searches or anything other than individual URLs. Any suggestions?

Delicious.com provides feeds to query information about URLs, but the function only accepts an md5 hash of the URL you want to look up. It looks like it's impossible to perform a single query with a wildcard search via the currently exposed public API. A hack would be to create a list of valid URLs of your site and then query Delicious for each of these URLs, keeping in mind the 1-second inter-request delay as well as other restrictions.
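For what it's worth, a rough Python sketch of that workaround might look like this. It assumes the JSON feeds endpoint takes the form feeds.delicious.com/v2/json/url/{md5-of-url} and returns an array of recent bookmarks; the page list is something you would generate yourself, e.g. from your sitemap.

import hashlib
import time
import requests

# URLs on your own site that you want to check (build this from your sitemap).
page_urls = [
    "http://www.example.com/",
    "http://www.example.com/about",
]

for url in page_urls:
    url_hash = hashlib.md5(url.encode("utf-8")).hexdigest()
    # Assumed feed format: a JSON array of recent bookmarks for this exact URL.
    resp = requests.get("http://feeds.delicious.com/v2/json/url/" + url_hash)
    bookmarks = resp.json() if resp.ok else []
    print(url, "->", len(bookmarks), "recent bookmarks")
    time.sleep(1)  # respect the 1-second inter-request delay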

Related

How to track clicks on Links using GTM

I have an ad campaign and I don't know where users are coming to my website from.
How can I know which of these links users clicked:
www.example.com/twitter
www.example.com/whatsapp
www.example.com/linkdIn
www.example.com/<this will be the source name>
I want to know which link users came from using GTM.
All links must open on the landing page.
Thanks.
You don't really need GTM to track click source. The GA script translates certain query parameters to traffic source dimensions automatically. Those query parameters are called UTM parameters. Here's the documentation on how they're mapped to GA data.
You can use the url builder tool to generate a url if you find it difficult to figure out the proper syntax.
Basically, you just generate a link to your landing, embedding there the information about the source and then you post the link on the said source. And you carefully do that for all sources.
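For example, a link posted on Twitter might look like www.example.com/twitter?utm_source=twitter&utm_medium=social&utm_campaign=my_campaign (utm_source, utm_medium and utm_campaign are the standard parameter names GA recognizes; the values here are only placeholders).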
Sure, GA also tracks the referrer, but TLS will erase the query params of the referrer, so it may be much more awkward to use for determining the source. GA already tries to parse referrers to determine the source automatically when no UTM params are set; UTMs override the automatic referrer logic.
Finally, GTM. GTM is powerful. It allows you to do more than that. For example, it's able to override the above described logic and set the source, medium, keyword, even referrer, using JS. Ultimately, mostly because of GTM's ability to deploy custom JS, it is possible to override any field in tracking and add extra fields.

How to separate the original and translated review comment from the Google Maps API

I have a lot of google access tokens in my DB (more than 1k). These tokens belong to google-my-business owners that have been authenticated to my app via google and gave permissions to use their tokens. I want to get all google reviews from all these accounts and save them to my DB.
But when I get reviews, I get them with an automatically generated translation as a single string, and I want to split the original and the translation and save them separately.
Reviews have this format.
(Translated by Google) Awesome!
(Original) Круто!
Usually, I would just split this string by the "(Translated by Google)" substring. But this doesn't work, because the substring is actually different and depends on the user's account language settings. If the user has set their language to Russian, this substring will look like "(Переведено Google)".
Is there any way to split the original comment from the translated one, considering the different language settings?
P.S. This question is not a duplicate because other ones don't have this language problem.
I found a solution that I really don't like and it's overkill, but it's the only thing that came to mind.
First I get the user's language settings by calling this endpoint https://developers.google.com/my-business/reference/rest/v4/accounts.
Then I translate the '(Translated by Google)' string into this language via the Google Translate API and split the review comment by that string. It works, but if you can suggest something better, that would be awesome.
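For reference, a rough Python sketch of the splitting step itself, assuming the review keeps the "(Translated by Google) ... (Original) ..." layout and that you have already obtained both markers in the account's language (the ones passed below are simply the English ones):

def split_review(comment, translated_marker, original_marker):
    # Returns (translated_text, original_text); if the comment was never
    # translated, the translated part is None.
    if translated_marker not in comment:
        return None, comment.strip()
    body = comment.split(translated_marker, 1)[1]
    translated, _, original = body.partition(original_marker)
    return translated.strip(), original.strip()

translated, original = split_review(
    "(Translated by Google) Awesome!\n(Original) Круто!",
    "(Translated by Google)",
    "(Original)",
)
# translated == "Awesome!", original == "Круто!"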

Would it be possible to scrape data from Airbnb directly into a Google Sheet?

I'm trying to build a super simple Google Sheet dashboard comparing the prices at D+7 and D+30 in real-time of specific listings/rooms that are both on Airbnb and Booking.com.
On the Booking.com side, it was super easy: I just created a formula concatenating the URL with the check-in/check-out dates, number of guests and trip duration as parameters, and using the =IMPORTXML function with the proper class I was able to automatically retrieve the price.
It is more difficult on Airbnb, as the price is dynamic (see here: https://www.airbnb.com/rooms/25961741). When I use what I think is the proper class, I get an "N/A Error, Imported content is empty" in Google Sheets.
I also tried using the Airbnb API with REGEX functions to extract the price, but the price set in the listing info is a default price, and does not reflect reality:
"price":1160,"price_formatted":"$1160"
https://api.airbnb.com/v2/listings/25961741?client_id=d306zoyjsyarp7ifhu67rjxn52tv0t20&_format=v1_legacy_for_p3&number_of_guests=1
Do you know if there is any other possible way to access this dynamic price and have it automatically parsed into a spreadsheet? It seems that the data I'm looking for is within meta tags in the HTML code, and I don't know if it's possible to scrape it into Google Sheets using =IMPORT functions.
Maybe with a script?
Thanks a lot!
I'm curious whether you were unable to pull directly from the ABNB API; what if you tried pulling straight from the site's own search service? Have a look at this URL:
https://www.airbnb.com/api/v2/explore_tabs?version=1.3.9&satori_version=1.0.7&_format=for_explore_search_web&experiences_per_grid=20&items_per_grid=18&guidebooks_per_grid=20&auto_ib=false&fetch_filters=true&has_zero_guest_treatment=false&is_guided_search=true&is_new_cards_experiment=true&luxury_pre_launch=false&query_understanding_enabled=true&show_groupings=true&supports_for_you_v3=true&timezone_offset=-240&client_session_id=8e7179a2-44ab-4cf3-8fb8-5cfcece2145d&metadata_only=false&is_standard_search=true&refinement_paths%5B%5D=%2Fhomes&selected_tab_id=home_tab&checkin=2018-09-15&checkout=2018-09-27&adults=1&children=0&infants=0&click_referer=t%3ASEE_ALL%7Csid%3A61218f59-cb20-41c0-80a1-55c51dc4f521%7Cst%3ALANDING_PAGE_MARQUEE&allow_override%5B%5D=&price_min=16&federated_search_session_id=5a07b98f-78b2-4cf9-a671-cd229548aab3&screen_size=medium&query=Paris%2C%20France&_intents=p1&key=d306zoyjsyarp7ifhu67rjxn52tv0t20&currency=USD&locale=en
This is a GET request to ABNB's live page search; now I don't know much about ABNB but I can see from the listings portion of the JSON feed it does have a few pricing factors that differ from the API results you provided; I'm not sure what you need to pull exactly but this may lead you in the right direction; check the 'Listings' array and see if there's something you can possibly use.
Keep in mind if you are looking to automate scraping this data you would want to generate new search sessions; but first you want to see if this is the type of data you're looking for.
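If it helps, here is a rough Python sketch of hitting that endpoint and walking the listings. The parameters are a trimmed-down version of the query string above, and the field names read from the response (explore_tabs, sections, listings, pricing_quote, price_string) are assumptions based on what that feed returned at the time, not a documented contract:

import requests

params = {
    "_format": "for_explore_search_web",
    "key": "d306zoyjsyarp7ifhu67rjxn52tv0t20",  # public web client key from the example URL
    "query": "Paris, France",
    "checkin": "2018-09-15",
    "checkout": "2018-09-27",
    "adults": 1,
    "currency": "USD",
    "locale": "en",
    "items_per_grid": 18,
}
resp = requests.get("https://www.airbnb.com/api/v2/explore_tabs", params=params, timeout=30)
resp.raise_for_status()
data = resp.json()

# Walk the sections until one carries listings, then print whatever pricing
# information each listing exposes.
tabs = data.get("explore_tabs") or [{}]
for section in tabs[0].get("sections", []):
    for item in section.get("listings", []):
        listing = item.get("listing", {})
        pricing = item.get("pricing_quote", {})
        print(listing.get("id"), listing.get("name"), pricing.get("price_string"))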
Another option, Google CSE's API; I've pulled data in the page headers of sites as they appear in Google based on Schema.org tags; but this may be delayed data and it appears you need real-time; the best route would be to research the above example or try to make use of ABNB's native API (they provide its functionality for a reason, right? There must be a way to get what you need).
Hope my answer helped lead you in the right direction!

Text Form Twitter API?

I'm making a Twitter account statistics program that reads tweets, retweet counts, and favorite counts. I could attempt to read the user's Twitter account URL line by line and parse the information from there, but I was wondering if there was a public API or part of Twitter that just spits out the raw data without formatting it all pretty for web browsers? Not only would this be more efficient in the program, but would also be much neater.
It seems as though API 1.1 uses JSON to fetch data, but I need to make a developer account and create unique identifiers in order to access such data. Is it worth it? Is there some sort of alternative that would be faster and easier?
All API calls to Twitter now require OAuth authentication, so there is unfortunately no way around signing up for a developer account and creating an app. It's not even possible to use a service that makes the requests on your behalf, as this is re-syndication which is forbidden by Twitter's API terms, so you need to make the calls yourself.
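For reference, once you do have developer credentials, a minimal authenticated call from Python might look like the sketch below. requests_oauthlib is just one convenient option; the four credential strings are placeholders you get from the developer portal, and the endpoint shown is the v1.1 user timeline mentioned in the question, which returns retweet_count and favorite_count for each tweet as JSON.

from requests_oauthlib import OAuth1Session

twitter = OAuth1Session(
    client_key="CONSUMER_KEY",
    client_secret="CONSUMER_SECRET",
    resource_owner_key="ACCESS_TOKEN",
    resource_owner_secret="ACCESS_TOKEN_SECRET",
)

resp = twitter.get(
    "https://api.twitter.com/1.1/statuses/user_timeline.json",
    params={"screen_name": "some_account", "count": 200},
)
for tweet in resp.json():
    print(tweet["retweet_count"], tweet["favorite_count"], tweet["text"][:50])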

How to prevent crawlers from following links?

I'm building a site that will allow sellers to:
list their products on my site
have each product link back to the seller's site
be charged for each link clicked
What I need to do now is to somehow make sure that I am only logging actual human users following the links to the sellers site. If it's a bot crawling the site, I shouldn't be charging the sellers for that.
Is there a way for me to tell bots not to follow a certain link? I don't think it's nofollow, as that is not intended to block access to content.
The way to tell a bot not to follow a link is precisely to add rel=nofollow to your <a> tag.
Assuming you are also logging locally before forwarding to the external url you could also check the user agent string.
In fact, if you are going to ask people to pay based on number of referrals it might be an idea to log IP address and user agent against each paid for click in case your stats are ever questioned.
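As an illustration only, a click-through endpoint along those lines might look roughly like this in Python with Flask (the framework choice, the bot signature list and the URL mapping are all assumptions made for the sketch):

import sqlite3
import time
from flask import Flask, request, redirect

app = Flask(__name__)

SELLER_URLS = {1: "https://seller-one.example.com/product"}  # placeholder mapping
BOT_SIGNATURES = ("bot", "crawler", "spider", "slurp", "curl")  # illustrative, not exhaustive

def looks_like_bot(user_agent):
    ua = (user_agent or "").lower()
    return any(sig in ua for sig in BOT_SIGNATURES)

with sqlite3.connect("clicks.db") as db:
    db.execute("CREATE TABLE IF NOT EXISTS clicks (product_id INTEGER, ip TEXT, user_agent TEXT, ts REAL)")

@app.route("/out/<int:product_id>")
def outbound(product_id):
    user_agent = request.headers.get("User-Agent", "")
    # Log IP and user agent for every click you intend to bill, skipping obvious bots.
    if not looks_like_bot(user_agent):
        with sqlite3.connect("clicks.db") as db:
            db.execute(
                "INSERT INTO clicks (product_id, ip, user_agent, ts) VALUES (?, ?, ?, ?)",
                (product_id, request.remote_addr, user_agent, time.time()),
            )
    return redirect(SELLER_URLS.get(product_id, "https://www.example.com/"))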
You just add a robots.txt file, e.g. like this one.
You can find more info about robots.txt files on the net, e.g. in Wikipedia.
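A minimal example might look like this (the /out/ path is only a placeholder for wherever your paid outbound links live; well-behaved crawlers will skip anything under it):

User-agent: *
Disallow: /out/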
Typically you can identify them by the user agent string. You can find a list here; I can't say it's perfect, but it's a good base to extend: PHP/MySQL - an array filter for bots
Robots.txt is another way, more about it here