dask.delayed KeyError with distributed scheduler - ctypes

I have a function interpolate_to_particles written in C and wrapped with ctypes. I want to use dask.delayed to make a series of calls to this function.
The code runs successfully without dask
# Interpolate w/o dask
result = interpolate_to_particles(arg1, arg2, arg3)
and with the distributed scheduler in single-threaded mode
# Interpolate w/ dask
import dask
from dask.distributed import Client
client = Client()
result = dask.delayed(interpolate_to_particles)(arg1, arg2, arg3)
result_c = result.compute(scheduler='single-threaded')
but if I instead call
result_c = result.compute()
I get the following KeyError:
Traceback (most recent call last):
  File "/path/to/lib/python3.6/site-packages/distributed/worker.py", line 3287, in dumps_function
    result = cache_dumps[func]
  File "/path/to/lib/python3.6/site-packages/distributed/utils.py", line 1518, in __getitem__
    value = super().__getitem__(key)
  File "/path/to/lib/python3.6/collections/__init__.py", line 991, in __getitem__
    raise KeyError(key)
KeyError: <function interpolate_to_particles at 0x1228ce510>
The worker logs accessed from the dask dashboard do not provide any information. Actually, I do not see any information that the workers have done anything besides starting up.
Any ideas on what could be occurring, or suggested tools that I can use to further debug? Thanks!

Given your comments it sounds like your function does not serialize well. To test this, you might try pickling the function in one process, and try unpickling it in another.
>>> import pickle
>>> print(pickle.dumps(interpolate_to_particles))
b'some bytes printed out here'
And then in another process
>>> import pickle
>>> interpolate_to_particles = pickle.loads(b'the same bytes you had before')
If this doesn't work then you'll know that that's your problem. I would encourage you to look up "how to make sure that ctypes functions are serializable" or something similar, or ask another question with that smaller scope here on Stack Overflow.
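The check described above can be wrapped in a small helper. This is only a minimal sketch: try_pickle is a hypothetical name, not part of dask, and it exercises the standard pickle module in the current process only. A function that passes here may still fail to load in a worker process that cannot import the module it lives in.

```python
import pickle


def try_pickle(obj):
    """Return (True, None) if obj survives a pickle round trip, else (False, error)."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception as exc:  # pickle can raise several different exception types
        return False, exc
    return True, None
```

Calling try_pickle(interpolate_to_particles) in the session where the function is defined shows whether serialization fails and, if so, with which exception.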

Related

Is cython compatible with typing.NamedTuple?

I have the following code in file temp.py
from typing import NamedTuple
class C(NamedTuple):
    a: int
    b: int
c = C(1, 2)
I compile it using the command:
cythonize -3 -i temp.py
and run it using the command
python3 -c 'import temp'
I get the following exception:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "temp.py", line 7, in init temp
    c = C(1, 2)
TypeError: __new__() takes 1 positional argument but 3 were given
Version of python: 3.6.15
Version of cython: 0.29.14
Is there anything wrong in the above code/build steps ?
It'll work in the current Cython 3 alpha version (and later). It won't work in Cython 0.29.x (you're using a pretty outdated version of this, but that won't affect this feature).
It requires classes to have an __annotations__ dictionary, which is a feature that was added in the Cython 3 alpha releases.
You won't get much/any speed advantage from compiling this in Cython though - it'll still generate a normal Python class. But it will work.
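To see why the __annotations__ dictionary matters, here is a small sketch in plain, uncompiled Python. It only illustrates what the interpreter provides to NamedTuple; it says nothing about how any particular Cython release behaves.

```python
from typing import NamedTuple


class C(NamedTuple):
    a: int
    b: int


# NamedTuple derives the tuple fields from the class-level annotations,
# so a class body without an annotations dictionary yields no fields.
print(C.__annotations__)
print(C._fields)
print(C(1, 2))
```

If the annotations are missing, the generated class has no fields, which is consistent with the "__new__() takes 1 positional argument" error in the question.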
In short, no, it is not compatible. Edit: not currently compatible.
Named tuples are Python magic (the classes are created at runtime) that Cython doesn't know about, so you have to run that code through the interpreter at runtime, using exec.
# temp.pyx
temp_global = {}
exec("""
from typing import NamedTuple
class C(NamedTuple):
    a: int
    b: int
""",temp_global)
C = temp_global['C']
c = C(1,2)
print(c)
To test it:
import pyximport
pyximport.install()
import temp
This ends up being plain Python code that is executed whenever you import your binary: the entire file is passed to exec on import, so it's not really "Cython code". You can just write it as a Python .py file and avoid Cython, or implement your "Cython class" without relying on Python magic (no named tuples or dynamic code that is created at runtime).

Reading binary files in Cython

I am attempting to read a binary file in Cython. Previously this was working in Python, but I am looking to speed up the process. This code below was written as a familiarisation and logic check before writing the complete module. Once this section is complete the code will be expanded to read in multiple 400 Mb files and process.
A function was created that opens the file, reads in a number of data points and returns them as an array.
from libc.stdlib cimport malloc, free
from libc.stdio cimport fopen, fclose, FILE, fscanf, fread
def readin_binary(filename, int number_of_points):
    """
    Test reading in a file and returning data
    """
    header_bytes = <unsigned char*>malloc(number_of_points)
    filename_byte_string = filename.encode("UTF-8")
    cdef FILE *in_binary_file
    in_binary_file = fopen(filename_byte_string, 'rb')
    if in_binary_file is NULL:
        print("file not found")
    else:
        print("Read file {}".format(filename))
        fread(&header_bytes, 1, number_of_points, in_binary_file)
        fclose(in_binary_file)
    return header_bytes
print(hDVS.readin_binary(filename, 10))
The code compiles.
When the code is run the following error occurs:
Python has stopped working error
I've been playing with this for a few days now. I think there is a simple error but I can not see it. Any ideas?

simplejson json AttributeError: "module" object has no attribute "dump"

I am new at programming and I have a question regarding the following error.
I am using Python 2.7, and I have the following script to create a simple graph (example taken from Python Crash Course by Eric Matthes):
import matplotlib.pyplot as plt
squares = [1,4,9,16,25]
plt.plot(squares, linewidth = 5)
# Set chart title and label axes.
plt.title("Square Numbers", fontsize = 24)
plt.xlabel("Value", fontsize = 14)
plt.ylabel("Square of Value", fontsize = 14)
# Set size of tick labels
plt.tick_params(axis = "both", labelsize = 14)
plt.show()
When I ran this script in Windows PowerShell I got the following error:
Traceback (most recent call last):
  File "mpl_squares.py", line 1, in <module>
    import matplotlib.pyplot as plt
  File "C:\Users\Roger\Anaconda2\lib\site-packages\matplotlib\__init__.py", line 134, in <module>
    from ._version import get_versions
  File "C:\Users\Roger\Anaconda2\lib\site-packages\matplotlib\_version.py", line 7, in <module>
    import json
  File "C:\Users\Roger\Desktop\lpthw\json.py", line 7, in <module>
AttributeError: "module" object has no attribute "dump"
In another script I had the same problem when importing this module; I then found a solution by replacing the line "import json" with "import simplejson", and it worked well.
Here is the solution I found back then:
json is simplejson, added to the stdlib. But since json was added in 2.6, simplejson has the advantage of working on more Python versions (2.4+).
simplejson is also updated more frequently than Python, so if you need (or want) the latest version, it's best to use simplejson itself, if possible.
A good practice, in my opinion, is to use one or the other as a fallback.
try:
    import simplejson as json
except ImportError:
    import json
I then checked the module that the error points to, "_version.py".
This is the content of that file:
# This file was generated by 'versioneer.py' (0.15) from
# revision-control system data, or from the parent directory name of an
# unpacked source archive. Distribution tarballs contain a pre-generated copy
# of this file.
import json
import sys
version_json = '''
{
"dirty": false,
"error": null,
"full-revisionid": "26382a72ea234ee0efd40543c8ae4a30cffc4f0d",
"version": "1.5.3"
}
''' # END VERSION_JSON
def get_versions():
    return json.loads(version_json)
Question:
Do you think I should fix the _version.py module by replacing import json with import simplejson, and updating the function in the module accordingly?
I am thinking of a workaround to fix the problem, but I don't want to modify anything in _version.py if it would make things worse. Thank you very much for your comments and suggestions.
Best Regards
It seems like your C:\Users\Roger\Desktop\lpthw\json.py gets imported instead of Python's built-in json module.
Did you somehow add that folder (C:\Users\Roger\Desktop\lpthw) to your PYTHONPATH, e.g. with sys.path.append() or the PYTHONPATH variable? Read more about how Python finds modules.
The reason why the fix with simplejson works is that it is not overridden by some other module of the same name.
Try renaming C:\Users\Roger\Desktop\lpthw\json.py to something like C:\Users\Roger\Desktop\lpthw\myjson.py and also try to figure out how that lpthw folder made it into your PYTHONPATH.
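To check which file a module name resolves to before renaming anything, a short sketch like the following may help. The helper name module_origin is hypothetical. Note that the directory of the script being run is normally placed first on sys.path, so a json.py sitting next to the script can shadow the standard library even when PYTHONPATH was never changed.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a module name would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Entries earlier in sys.path are searched first. The first one is normally
# the directory of the running script (or '' in an interactive session).
print(sys.path[:3])
print(module_origin("json"))
```

If the printed origin points into the lpthw folder rather than the Python installation, the local json.py is the one being imported.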

Receive JSON data from link in whattomine without scraping HTML

Explanation
This link is where you are sent to after entering in your hardware stats (hashrate, power, power cost, etc.). On the top bar (below the blue Twitter follow button) is a link to a JSON file created after the page loads with the hardware stats information entered; clicking on that JSON link redirects you to another URL (https://whattomine.com/asic.json).
Goal
My goal is to access that JSON file directly after manipulating the values in the URL string via the terminal. For example, if I would like to change hashrate from 100 to 150 in this portion of the URL:
[sha256_hr]=100& ---> [sha256_hr]=150&
After the URL manipulations (like the one above, but not limited to it), I would like to receive the JSON output so that I can pick out the desired data.
My Code
Advisory: I started Python programming ~June 2017, please forgive.
import json
import pandas as pd
import urllib2
import requests
hashrate_ghs = float(raw_input('Hash Rate (TH/s): '))
power_W = float(raw_input('Power of Miner (W): '))
electric_cost = float(raw_input('Cost of Power ($/kWh): '))
hashrate_ths = hashrate_ghs * 1000
initial_request = ('https://whattomine.com/asic?utf8=%E2%9C%93&sha256f=true&factor[sha256_hr]={0}&factor[sha256_p]={1}&factor[cost]={2}&sort=Profitability24&volume=0&revenue=24h&factor[exchanges][]=&factor[exchanges][]=bittrex&dataset=Main&commit=Calculate'.format(hashrate_ths, power_W, electric_cost))
data_stream_mine = urllib2.Request(initial_request)
json_data = requests.get('https://whattomine.com/asic.json')
print json_data
Error from My Code
I am getting an HTTPS handshake error. This is where my Python freshness is second most blatantly visible:
Traceback (most recent call last):
  File "calc_1.py", line 16, in <module>
    s.get('https://whattomine.com/asic.json')
  File "/Library/Python/2.7/site-packages/requests/sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Python/2.7/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/adapters.py", line 506, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='whattomine.com', port=443): Max retries exceeded with url: /asic.json (Caused by SSLError(SSLError(1, u'[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:590)'),))
Thank you for your help and time!
Please let me know if any changes or more information are needed for this question.
This is just a comment. The following approach would suffice (Python 3).
import requests
initial_request = 'http://whattomine.com/asic.json?utf8=1&dataset=Main&commit=Calculate'
json_data = requests.get(initial_request)
print(json_data.json())
The key point is this: put .json in your initial_request and that will be enough.
You may add all your parameters, as you did, in the query part after the ? sign.
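As a sketch of that idea, the query string can be assembled with the standard library instead of a hand-written format string. The parameter names below are copied from the URL in the question, and the base URL from the answer above. Whether the site still accepts these parameters is an assumption, and build_asic_json_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Base URL as used in the answer above; the .json suffix requests JSON output.
BASE_URL = "http://whattomine.com/asic.json"


def build_asic_json_url(hashrate, power, cost):
    """Build the JSON endpoint URL with the hardware stats in the query part."""
    params = [
        ("sha256f", "true"),
        ("factor[sha256_hr]", hashrate),
        ("factor[sha256_p]", power),
        ("factor[cost]", cost),
        ("dataset", "Main"),
        ("commit", "Calculate"),
    ]
    # urlencode percent-encodes the square brackets in the parameter names.
    return BASE_URL + "?" + urlencode(params)
```

The resulting string can then be passed to requests.get(...) and decoded with .json(), as in the snippet above.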
It looks like a few others faced similar problems.
For some it seemed to be a pyOpenSSL version issue, which uninstalling and reinstalling fixed. Another, older answer on SO suggests the following.

Cause of jinja2.exceptions.TemplateNotFound even when using just jinja2

The cause of the problem is obvious after the fact, but I'd like to share the not-too-obvious cause here.
When running code such as
import jinja2
templateLoader = jinja2.FileSystemLoader(searchpath=".")
templateEnv = jinja2.Environment(loader=templateLoader,
                                 trim_blocks=True,
                                 lstrip_blocks=True)
htmlTemplateFile = 'file.jinja.html'
htmlTemplate = templateEnv.get_template(htmlTemplateFile)
if you get this problem:
Traceback (most recent call last):
  ...
  File "file.py", line xyz, in some_func
    htmlTemplate = templateEnv.get_template(htmlTemplateFile)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/jinja2/environment.py", line 812, in get_template
    return self._load_template(name, self.make_globals(globals))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/jinja2/environment.py", line 774, in _load_template
    cache_key = self.loader.get_source(self, name)[1]
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/jinja2/loaders.py", line 187, in get_source
    raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: file.jinja.html
you may find that online discussions attribute this issue to the interaction of jinja2 with Flask, GAE, Pyramid, or SQL, and it may indeed be that your templates are not in a "template" folder. But this problem can also arise from the interaction of jinja2 and the os module.
The culprit is changing the current directory by, for instance,
import os
os.chdir(someDir)
If templateEnv.get_template(...) is called past this point, jinja2 will look for the templates in the "current" dir, even if that has changed.
Since module os provides os.chdir but not os.pushdir/os.popdir, one has to either simulate the latter pair or avoid chdir altogether.
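One way to simulate that pair is a small context manager that restores the previous directory on exit. This is a minimal sketch, and the name pushd is my own choice rather than anything provided by os or jinja2.

```python
import contextlib
import os


@contextlib.contextmanager
def pushd(new_dir):
    """Temporarily change the working directory, restoring it afterwards."""
    previous_dir = os.getcwd()
    os.chdir(new_dir)
    try:
        yield
    finally:
        # Runs even if the body raises, so later relative template
        # lookups see the original working directory again.
        os.chdir(previous_dir)
```

Code that needs someDir can then run inside "with pushd(someDir):", and a later templateEnv.get_template(...) call sees the original directory. On Python 3.11 and later, contextlib.chdir offers the same behavior. Passing an absolute searchpath to FileSystemLoader should also sidestep the problem, since the lookup then no longer depends on the current directory.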