Asynchronous location lookup on GAE/webapp2 - json

I have the following GAE datastore model (models/location.py) which I want to populate:
from google.appengine.ext import db
class Location(db.Model):
    name = db.StringProperty(required=True)
    country = db.StringProperty(required=False)
    address = db.PostalAddressProperty(required=False)
    coordinates = db.GeoPtProperty(required=False)
    description = db.TextProperty(required=False)
To do so I've created a class LocationCreateHandler and a function _geocode (handlers/location.py):
from google.appengine.ext import db
from google.appengine.api import urlfetch
from webapp2_extras import json
import urllib
from handlers import BaseHandler
from models.location import Location
import logging
class LocationCreateHandler(BaseHandler):
    def post(self):
        name = self.request.get("name")
        country = self.request.get("country")
        address = self.request.get("address")
        coordinates = _geocode(self, address)
        description = self.request.get("description")
        newLocation = Location(name=name, country=country, address=address, coordinates=coordinates, description=description)
        newLocation.put()
        return self.redirect("/location/create")
    def get(self):
        self.render_response("location/create.html")
def _geocode(self, address):
    try:
        logging.info("Geocode address: %s", address)
        parameter = {'address': address.encode('utf-8'), 'sensor': 'false'}
        payload = urllib.urlencode(parameter)
        url = ('https://maps.googleapis.com/maps/api/geocode/json?%s' % payload)
        logging.info("Geocode URL: %s", url)
        result = urlfetch.fetch(url)
        jsondata = json.decode(result.content)
        location = jsondata['results'][0]['geometry']['location']
        coordinates = '%s,%s' % (location['lat'], location['lng'])
        logging.info("Geocode coordinates: %s", coordinates)
        return coordinates
    except:
        return "0.0,0.0"
How would I make this asynchronous? At the moment the user would have to wait until the geocode lookup has finished.
Once I get this working I also plan to use _geocode() after updating a Location record.
I still have to figure out the part of _geocode after "result =": there seems to be a bug there too, as I always receive 0.0,0.0.
-Luca.

Seems like a good use for task queues. When the POST arrives, kick off a task, passing it all the parameters for the Location entity. You can then finish the request and return to the client right away. The task can call _geocode, then create the Location entity with all the data.
Or, if you need to create the Location object in the request handler for some reason, you can do so in the POST handler, and pass the new entity's key to the task. When the task completes, it can fetch the entity and update it with the coordinates.
Also, to help determine why your urlfetch isn't working, here's a way to log the exception while still catching it:
import logging
try:
    ...
except Exception:
    # logging.exception logs the message at ERROR level and appends the current traceback
    logging.exception("Geocode lookup failed")
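It may also help to check the "status" field that the Geocoding API includes in its JSON responses, since an empty "results" list makes jsondata['results'][0] raise an IndexError that the bare except then hides. A minimal sketch using only the standard library; parse_geocode_response and the two sample bodies are made up for illustration:

```python
import json

def parse_geocode_response(body):
    """Return "lat,lng" from a Geocoding API JSON body, or None if there is no match."""
    data = json.loads(body)
    # The API reports problems (for example ZERO_RESULTS or REQUEST_DENIED) in "status"
    if data.get("status") != "OK" or not data.get("results"):
        return None
    location = data["results"][0]["geometry"]["location"]
    return "%s,%s" % (location["lat"], location["lng"])

# Hypothetical sample bodies, shaped like the API's responses
ok_body = '{"status": "OK", "results": [{"geometry": {"location": {"lat": 52.5, "lng": 13.4}}}]}'
empty_body = '{"status": "ZERO_RESULTS", "results": []}'

print(parse_geocode_response(ok_body))
print(parse_geocode_response(empty_body))
```

Logging the status value before giving up should show whether the request was rejected or the address simply was not found.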

Thank you Jamie
I have now implemented a GeocodeWorker (workers/googleapis.py) using task queues:
from handlers import BaseHandler
import urllib
import logging
from google.appengine.api import urlfetch
from google.appengine.ext import db
from webapp2_extras import json
from models.location import Location
class GeocodeWorker(BaseHandler):
    def post(self):
        address = self.request.get('address')
        logging.info("Geocode address: %s", address)
        parameter = {'address': address.encode('utf-8'), 'sensor': 'false'}
        url = ('https://maps.googleapis.com/maps/api/geocode/json?%s' % urllib.urlencode(parameter))
        logging.info("Geocode URL: %s", url)
        result = urlfetch.fetch(url)
        JSONData = json.decode(result.content)
        location = JSONData['results'][0]['geometry']['location']
        coordinates = '%s,%s' % (location['lat'], location['lng'])
        logging.info("Geocode coordinates: %s", coordinates)
        key = self.request.get('key')
        logging.info("Geocode key: %s", key)
        existingLocation = Location.get(db.Key(key))
        existingLocation.coordinates = coordinates
        existingLocation.put()
I also modified my location handler (handlers/location.py) to call the worker:
from google.appengine.api import taskqueue
from webapp2 import uri_for
class LocationCreateHandler(BaseHandler):
    def get(self):
        self.render_response('location/create.html')
    def post(self):
        name = self.request.get('name')
        country = self.request.get('country')
        address = self.request.get('address')
        description = self.request.get('description')
        newLocation = Location(name=name, country=country, address=address, description=description)
        key = newLocation.put()
        params = {'key': key, 'address': address}
        taskqueue.add(url=uri_for('googleapis-geocode'), queue_name='googleapis', name=('googleapis-geocode-%s' % key), params=params)
        return self.redirect('/location/create')
I also created a queue.yaml file, defining the googleapis queue.
I also removed the try/except part: the task queue automatically retries a failed task and gives up after a defined time.
Do you see any areas for improvement?
-Luca.

Related

Django/Django Channels - weird looking json response with double \ between each field

Hello, I'm trying to build a real-time friend request notification system and I'm getting a weird-looking JSON response. I'm new to backend development and Django (first-year software engineering student). I'm wondering whether this is normal, since I haven't seen anything like it, and whether there's a way to fix it. I've worked on a chat app before, but that was all text messages, so I got confused when it came to Django models. I have tried several approaches I found, but only this one works. I think it might be because I call json.dumps twice, but if I remove either call it won't work. Thank you.
When a user sends a friend request, this is what I get back from the WebSocket (with a double \ before each field):
Here's the code:
//views.py
class SendRequestView(views.APIView):
    permission_class = (permissions.IsAuthenticated,)
    def post(self, request, *args, **kwargs):
        receiver_username = self.kwargs['receiver_username']
        if receiver_username is not None:
            receiver = get_object_or_404(User, username=receiver_username)
            request = ConnectRequest.objects.create(sender=self.request.user, receiver=receiver)
            notification = ConnectNotification.objects.create(type='connect request', receiver=receiver, initiated_by=self.request.user)
            channel_layer = get_channel_layer()
            channel = f'notifications_{receiver.username}'
            async_to_sync(channel_layer.group_send)(
                channel, {
                    'type': 'notify',
                    'notification': json.dumps(ConnectNotificationSerializer(notification).data, cls=DjangoJSONEncoder),
                }
            )
            data = {
                'status': True,
                'message': 'Success',
            }
            return JsonResponse(data)
// consumer.py
class ConnectNotificationConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        user = self.scope['user']
        group_layer = f'notifications_{user.username}'
        await self.accept()
        await self.channel_layer.group_add(group_layer, self.channel_name)
    async def disconnect(self, close_code):
        user = self.scope['user']
        group_layer = f'notifications_{user.username}'
        await self.channel_layer.group_discard(group_layer, self.channel_name)
    async def notify(self, event):
        notification = event['notification']
        await self.send(text_data=json.dumps({
            'notification': notification
        })
        )
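The double-encoding suspicion can be checked without Django. In the sketch below, the payload is a made-up stand-in for the serializer output: encoding an already encoded string produces escaped quotes (the backslashes seen on the socket), while encoding the dict once does not.

```python
import json

# Hypothetical stand-in for ConnectNotificationSerializer(notification).data
payload = {"type": "connect request", "receiver": "alice"}

# Encoded once: the notification stays a nested JSON object
once = json.dumps({"notification": payload})

# Encoded twice: the inner json.dumps result is a string, so the outer call escapes its quotes
twice = json.dumps({"notification": json.dumps(payload)})

print(once)
print(twice)

# A client decoding the double-encoded message gets a string, not an object
decoded_twice = json.loads(twice)["notification"]
print(type(decoded_twice).__name__)
```

If that matches what arrives on the socket, one option is to decode on the client a second time, or to encode only once on the server. Whether the channel layer accepts the unencoded serializer data depends on the field types involved, so that part would need checking against your models.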

How to use background tasks with Starlette when there's no background object?

I'm hoping to avoid any use of Celery at the moment. In Starlette's docs they give two ways to add background tasks:
Via Graphene: https://www.starlette.io/graphql/
class Query(graphene.ObjectType):
    user_agent = graphene.String()
    def resolve_user_agent(self, info):
        """
        Return the User-Agent of the incoming request.
        """
        user_agent = request.headers.get("User-Agent", "<unknown>")
        background = info.context["background"]
        background.add_task(log_user_agent, user_agent=user_agent)
        return user_agent
Via a JSON response: https://www.starlette.io/background/
async def signup(request):
    data = await request.json()
    username = data['username']
    email = data['email']
    task = BackgroundTask(send_welcome_email, to_address=email)
    message = {'status': 'Signup successful'}
    return JSONResponse(message, background=task)
Does anyone know of a way to add tasks to Starlette's background with Ariadne? I am unable to return a JSONResponse from my resolver, and I do not have access to an info.context["background"]. The only thing attached to my context is my request object.
Solved!
Starlette Middleware:
class BackgroundTaskMiddleware(BaseHTTPMiddleware):
    async def dispatch(
        self, request: Request, call_next: RequestResponseEndpoint
    ) -> Response:
        request.state.background = None
        response = await call_next(request)
        if request.state.background:
            response.background = request.state.background
        return response
Ariadne Resolver:
@query.field("getUser")
@check_authentication
async def resolve_get_user(user, obj, info):
    task = BackgroundTasks()
    task.add_task(test_func)
    task.add_task(testing_func_two, "I work now")
    request = info.context["request"]
    request.state.background = task
    return True
async def test_func():
    await asyncio.sleep(10)
    print("once!!")
async def testing_func_two(message: str):
    print(message)
The functions still execute synchronously, but because they're background tasks I'm not too worried.
More discussion here.
The answer above, which is marked as the solution, does not work for me, because BackgroundTask does not work properly with a middleware that subclasses BaseHTTPMiddleware; see:
https://github.com/encode/starlette/issues/919
In my case the task is not run in the background: the response waits for it to complete. I am not using Ariadne either, but the following should let you run a task in the background.
Edit:
This worked for me.
executor = ProcessPoolExecutor()
main.executor.submit(
    bg_process_funcs,
    export_file_format,
    export_headers,
    data,
    alert_type,
    floor_subtitle,
    date_subtitle,
    pref_datetime,
    pref_timezone,
    export_file_name,
    export_limit,
)
executor.shutdown()
logger.info("Process Pool Shutdown")
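One detail that may be worth checking in the snippet above: Executor.shutdown() waits for pending work by default (wait=True), so calling it right after submit() can hold up the caller until the job finishes. A minimal sketch of the non-blocking variant. It uses ThreadPoolExecutor instead of ProcessPoolExecutor only so that it runs anywhere without a __main__ guard, and slow_export is a made-up stand-in for bg_process_funcs:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_export(file_name):
    """Hypothetical stand-in for a long-running export job."""
    time.sleep(0.5)
    return "exported %s" % file_name

executor = ThreadPoolExecutor(max_workers=1)
started = time.monotonic()
future = executor.submit(slow_export, "report.csv")  # returns immediately
executor.shutdown(wait=False)  # do not wait; the submitted job still runs to completion
handoff_seconds = time.monotonic() - started

# A request handler could return its response here.
# The result is fetched only to show that the job did finish.
result = future.result()
print(result)
```

With the default shutdown(), the hand-off would take about as long as the job itself.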

How to load data from a Flask URL and console log the data in React?

I want to load a json file that I get from a url generated in flask.
After d3.json(url, function) I'm trying to console log the JSON, but nothing happens and I don't know what's wrong. Maybe someone can help.
This is basically my code:
Component where I want to display a graph (Display.js):
import React, { Component } from 'react';
import "./Display.css";
import * as d3 from "d3";
export default class Display extends Component {
  componentWillReceiveProps() {
    // The URL depends on the user's input; inputDataFromParent is the value passed down from the parent
    const url = "http://localhost:5000/company?company_name=" + this.props.inputDataFromParent
    d3.json(url, function (data) {
      console.log(data)
    })
  }
}
When I type in an input (company name) that doesn't exist in the database, I get an error: "Uncaught (in promise) SyntaxError: Unexpected token N in JSON at position 0" at (index):1
When I type in an input that does exist, nothing happens in the console.
Here's my main.py:
import flask
from pandas import DataFrame
from models import company_search
from flask import request
from models import subsidiaries
app=flask.Flask("__main__")
@app.route("/company")
def result():
    if request.method == 'GET':
        company_name = request.args.get('company_name', None)
        if company_name:
            return subsidiaries(company_name)
        return "No place information is given"
app.run(debug=True)
and this is models.py (neo4j is used as the database):
def subsidiaries(eingabe):
    if regex_eingabe_kontrolle(eingabe):
        namelistdf = graph.run("MATCH (c:Company)-[rel:Relation]->(d:Company) WHERE rel.relation_group='OWNERSHIP' AND rel.percent_share >= 50 AND c.company_name= $eingabe RETURN c, rel, d",eingabe=eingabe).to_data_frame()
        if namelistdf.empty:
            return "No company with this name exists"
        namelistjson = namelistdf.to_json(orient="records",date_unit="s",default_handler=str)
        return namelistjson
    else:
        return "Please enter a valid company name"
I get the data from a neo4j database.
Logging to the console is not the goal in itself: I want to generate a graph from the data, and the console log is only there to test whether the data is right. But it seems that the data isn't passed to the d3.json(url, function (data) callback correctly.
Your problem occurs in your main file when you return No place information is given. Since this is not in JSON format, JavaScript fails to parse it causing the error to be thrown. To fix this, you can change it to:
import json
@app.route("/company")
def result():
    if request.method == 'GET':
        company_name = request.args.get('company_name', None)
        if company_name:
            return subsidiaries(company_name)
        return json.dumps("No place information is given")
If you do the same for No company with this name exists and Please enter a valid company name, you won't get any more JSON deserialization errors. However, D3 may still throw errors because it does not know what to do with a string.
You may want to return HTTP status codes along with your error messages so you can control the error. To do this in Flask, you can return the text along with the status code. For example: return json.dumps("Error message"), 400 will return the JSON string Error message with the status code 400. Since d3.json makes an underlying call to the JavaScript fetch function, you should be able to access the status code by using data.status.
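The parse error from the question can be reproduced with any JSON parser. A small sketch with Python's json module, showing that the bare message is rejected at its first character while the json.dumps version round-trips (the variable names are only for illustration):

```python
import json

message = "No place information is given"

# The bare text is not valid JSON, so parsing fails at the first character ("N")
try:
    json.loads(message)
    bare_is_valid = True
except json.JSONDecodeError as error:
    bare_is_valid = False
    error_position = error.pos

# json.dumps wraps the text in double quotes, which makes it a valid JSON string
encoded = json.dumps(message)
decoded = json.loads(encoded)

print(bare_is_valid)
print(encoded)
print(decoded)
```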

Building a proxy rotator with specific URL and script

I am struggling to build a proxy rotator with existing code structured for a different url.
The URLs I want are provided in the code example below. I am trying to have the provided script call the desired URLs and get ALL the 'IP:PORT' (current script limits to ten) when proxy type is "HTTPS".
It can be done with XPath or bs4, though I am more familiar with bs4.
I understand the logic, but I am failing on how to structure this.
To start, I've tried stripping strings and calling specific td elements, but it's not working.
#URLs I want
url_list = ['http://spys.one/free-proxy-list/US/','http://spys.one/free-proxy-list/US/1/']
#code I have
from lxml.html import fromstring
import requests
from itertools import cycle
import traceback
def get_proxies():
    url = 'https://free-proxy-list.net/'
    response = requests.get(url)
    parser = fromstring(response.text)
    proxies = set()
    for i in parser.xpath('//tbody/tr')[:10]:
        if i.xpath('.//td[7][contains(text(),"yes")]'):
            proxy = ":".join([i.xpath('.//td[1]/text()')[0], i.xpath('.//td[2]/text()')[0]])
            proxies.add(proxy)
    return proxies
proxies = get_proxies()
proxy_pool = cycle(proxies)
proxy = next(proxy_pool)
response = requests.get(url,proxies={"http": proxy, "https": proxy})
I hope to learn how to restructure the provided code for the two desired URLs so that it returns all IP:PORT pairs where the proxy type is HTTPS.
One way is to issue port specific POST requests in a loop. You could amend to add to one final list. The endpoint is already https specific.
import re
import requests
def get_proxies(number, port, p):
    r = requests.post('http://spys.one/en/https-ssl-proxy/', data = {'xpp': 5, 'xf4': number})
    proxies = [':'.join([str(i),port]) for i in p.findall(r.text)]
    return proxies
ports = ['3128', '8080', '80']
p = re.compile(r'spy14>(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})<script')
proxies = []
for number, port in enumerate(ports,1):
    proxies+=get_proxies(number, port, p)
print(proxies)
For country specific:
import re
import requests
from bs4 import BeautifulSoup as bs
def get_proxies(number, port, p, country):
    r = requests.post('http://spys.one/en/https-ssl-proxy/', data = {'xpp': 5, 'xf4': number})
    soup = bs(r.content, 'lxml')
    proxies = [':'.join([p.findall(i.text)[0], port]) for i in soup.select('table table tr:has(.spy14:contains("' + country + '")) td:has(script) .spy14')]
    return proxies
ports = ['3128', '8080', '80']
p = re.compile(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})document')
proxies = []
for number, port in enumerate(ports,1):
    proxies+=get_proxies(number, port, p, 'United States')
print(proxies)
For the one you said is already written I will refer to my original answer of:
from bs4 import BeautifulSoup as bs
import requests
def get_proxies():
    r = requests.get('https://free-proxy-list.net/')
    soup = bs(r.content, 'lxml')
    proxies = {tr.td.text + ':' + tr.td.next_sibling.text for tr in soup.select('tr:has(.hx:contains(yes))')}
    return proxies
get_proxies()
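Whichever get_proxies variant is used, the rotation step from the question stays the same. A minimal sketch without network access: the addresses are placeholders from the documentation range, and next_proxy_config is a hypothetical helper that builds the dict requests expects for its proxies argument.

```python
from itertools import cycle

# Placeholder proxies (TEST-NET addresses), standing in for the output of get_proxies()
proxies = ["192.0.2.10:3128", "192.0.2.11:8080", "192.0.2.12:80"]
proxy_pool = cycle(proxies)

def next_proxy_config(pool):
    """Take the next proxy from the pool and build a requests-style proxies dict."""
    proxy = next(pool)
    return {"http": proxy, "https": proxy}

# Four draws from three proxies: the fourth wraps around to the first proxy again
configs = [next_proxy_config(proxy_pool) for _ in range(4)]
for config in configs:
    print(config["https"])
```

Each dict can then be passed as requests.get(url, proxies=config). Wrapping that call in try/except and moving on to the next proxy on failure is a common addition, since free proxies tend to drop out often.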

Tastypie how to get custom Json Response after injecting Post data

I want to return a custom JSON response after data is POSTed to my Tastypie API for a Django model.
class MyModelResource(ModelResource):
    my_field = ""
    class Meta:
        queryset = MyModel.objects.all()
        resource_name = 'nick_name'
        authentication = ApiKeyAuthentication()
        authorization = DjangoAuthorization()
    def hydrate(self, bundle):
        # read the submitted data from bundle.data['title']
        # and set it on the object via bundle.obj.title
        #bundle.data['my_field'] ="1234"
        bundle.obj.my_field = bundle.data['my_field']
        self.my_field = bundle.data['my_field']
        return bundle
    def wrap_view(self, view):
        """
        Wraps views to return custom error codes instead of generic 500's
        """
        @csrf_exempt
        def wrapper(request, *args, **kwargs):
            try:
                callback = getattr(self, view)
                response = callback(request, *args, **kwargs)
                if request.is_ajax():
                    patch_cache_control(response, no_cache=True)
                lst_dic = []
                mon_dic = dict(success=True, my_field=self.my_field)
                # response is a HttpResponse object, so follow Django's instructions
                # to change it to your needs before you return it.
                # https://docs.djangoproject.com/en/dev/ref/request-response/
                lst_dic.append(mon_dic)
                response = HttpResponse(simplejson.dumps(lst_dic), content_type='application/json')
                return response
            except (BadRequest, fields.ApiFieldError) as e:
                return HttpBadRequest({'success':False,'code': 666, 'message':e.args[0]})
            except ValidationError as e:
                # Or do some JSON wrapping around the standard 500
                return HttpBadRequest({'success':False,'code': 777, 'message':', '.join(e.messages)})
            except Exception as e:
                # Rather than re-raising, we're going to things similar to
                # what Django does. The difference is returning a serialized
                # error message.
                return self._handle_500(request, e)
        return wrapper
My problem is that I can't get the value of self.my_field to put into mon_dic: I always get a data object, not the value.
Thanks for the help.
EDIT: I added my_field as a class-level variable and then took the value from the bundle, and that was it ;)
Maybe I am not understanding what you want to do here, but wrap_view is for handling custom error responses. If all you want to do is return the data that was posted, you can set always_return_data to True in your Meta:
class Meta:
    always_return_data = True
Or if you want to control what data gets sent back, you can use the dehydrate method:
def dehydrate(self, bundle):
    bundle.data['custom_field'] = "Whatever you want"
    return bundle