Using dict for setting plotly Dash Slider marks results in syntax error

I'm building a dashboard based on plotly Dash. One of the dcc components is a Slider. If I want to show marks on the slider within a certain range of numbers, this works just fine:
dcc.Slider(min=-10, max=20, step=1, value=0, marks={i: str(i) for i in range(-10, 20)})
However, the Dash documentation prefers dict notation. But if I do this:
dcc.Slider(min=-10, max=20, step=1, value=0, marks=dict(i = str(i) for i in range(-10,20)))
then I receive a SyntaxError.
How can I implement this functionality using dict notation?

dict() can take an iterable of key-value pairs. Keyword arguments, on the other hand, must be literal identifier names, so keys cannot be generated in a loop that way; that is why your version is a syntax error. You can instead pass a list comprehension of pairs to dict():
dcc.Slider(min=-10, max=20, step=1, value=0, marks=dict([(i,str(i)) for i in range(-10,20)]))
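A generator expression of (key, value) pairs also works, without building an intermediate list:
dcc.Slider(min=-10, max=20, step=1, value=0, marks=dict((i, str(i)) for i in range(-10, 20)))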
However, I would still prefer the dict comprehension from your first example, for its slightly better readability and performance.

TCL Dict to JSON

I am trying to convert a dict into JSON format and not seeing any easy method in the Tcllib JSON package. Say I have defined a dict as follows:
set countryDict [dict create USA {population 300 capital DC} Canada {population 30 capital Ottawa}]
I want to convert this to json format as shown below:
{
  "USA": {
    "population": 300,
    "captial": "DC"
  },
  "Canada": {
    "population": 30,
    "captial": "Ottawa"
  }
}
("population" is number and capital is string). I am using TclLib json package (https://wiki.tcl-lang.org/page/Tcllib+JSON) . Any help would be much appreciated.
There are two problems with the “go straight there” approach that you appear to be hoping for:
Tcl's type system is extremely different from JSON's; in Tcl, every value is a (subtype of) string, but JSON expects objects, arrays, numbers and strings to be wholly different things.
The capital becomes captial. For bonus fun. (Hopefully that's just a typo on your part, but we'll cope.)
I'd advise using rl_json for this; it's a much more capable package that treats JSON as a fundamental type. (It's even better at it when it comes to querying into the JSON structure.)
package require rl_json

set result {{}};   # Literal empty JSON object
dict for {countryID data} $countryDict {
    rl_json::json set result $countryID [rl_json::json template {{
        "population": "~N:population",
        "captial": "~S:capital"
    }} $data]
    # Yes, that was {{ … }}; the outer braces are for Tcl and the inner ones for the JSON object
}
puts [rl_json::json pretty $result]
That produces almost exactly the output you asked for, except with different indentation. $result is the “production” version of the output that you can work with for further processing, but which has no excess whitespace at all (which is a great choice when you're dealing with documents over 100MB long).
Notes:
The initial JSON object could have been done like this:
set result "{}"
That would have worked just as well (and compiled to the same Tcl bytecode).
json set puts an item into an object or array; that's exactly what we want here (inside a dict for that iterates over the input data).
json template takes an optional dictionary for mapping substitution names in the template to values; that's perfect for your use case. Otherwise we'd have had to do dict with data {} to map the contents of the dictionary into variables, and that's less than perfect when the input data isn't strictly controlled.
The template argument to json template is itself JSON. The ~N: prefix in a leaf string value says “replace this with a number from the substitution called…”, and ~S: says “replace this with a string from the substitution called…”. There are others.

Language translation using TorchText (PyTorch)

I have recently started with ML/DL using PyTorch. The following PyTorch tutorial explains how to train a simple model for translating from German to English:
https://pytorch.org/tutorials/beginner/torchtext_translation_tutorial.html
However, I am confused about how to use the model to run inference on custom input. My understanding so far:
1) We will need to save the "vocab" for both German (input) and English (output) [using torch.save()] so that they can be used later for running predictions.
2) At the time of running inference on a German paragraph, we will first need to convert the German text to a tensor using the German vocab.
3) That tensor will be passed to the model's forward method for translation.
4) The model will return a tensor for the destination language, i.e. English in the current example.
5) We will use the English vocab saved in the first step to convert this tensor back to English text.
Questions:
1) If the above understanding is correct, can these steps be treated as a generic approach for running inference on any language-translation model, given the source and destination languages and their vocab files? Or can we use the vocab provided by third-party libraries like spaCy?
2) How do we convert the output tensor returned from model back to target language? I couldn't find any example on how to do that. The above blog explains how to convert the input text to tensor using source-language vocab.
I could easily find various examples and detailed explanation for image/vision models but not much for text.
Yes, globally what you are saying is correct, and of course you can use any vocab, e.g. one provided by spaCy. To convert a tensor into natural text, one of the most common techniques is to keep both a dict that maps indexes to words and another dict that maps words to indexes. The code below builds both:
from collections import defaultdict

tok2idx = defaultdict(lambda: 0)  # unknown tokens map to index 0
idx2tok = {}
index = 1  # start at 1 so index 0 stays reserved for unknown tokens
for seq in sequences:
    for tok in seq:
        if tok not in tok2idx:
            tok2idx[tok] = index
            idx2tok[index] = tok
            index += 1
Here sequences is a list of all the sequences (i.e. sentences) in your dataset. You can easily adapt the code if you only have a list of words or tokens, by keeping only the inner loop.
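For your second question, here is a minimal sketch of going back from an output tensor to text, assuming the model's output has already been reduced to token indices (e.g. via argmax over the vocabulary dimension); output_indices below is a made-up example:
import torch

# Hypothetical model output: a 1-D tensor of token indices
output_indices = torch.tensor([5, 12, 7, 3])

# Look each index up in idx2tok; fall back to "<unk>" for indices we never saw
translated = " ".join(idx2tok.get(int(i), "<unk>") for i in output_indices)
print(translated)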

Do web2py json returns have extraneous whitespace, if so, how to remove

Just to check: the default JSON view, which converts Python objects to JSON, seems to include whitespace between items, i.e.
"field": [[110468, "Octopus_vulgaris", "common octopus"...
rather than
"field":[[110468,"Octopus_vulgaris","common octopus"...
Is that right? If so, is there an easy way to output the JSON without the extra spaces, and is this for any reason (other than readability) a bad idea.
I'm trying to make some API calls return the fastest and most concise JSON representation, so any other tips gratefully accepted. For example, I see the view calls from gluon.serializers import json - does that get re-imported every time the view is used, or is Python clever enough to import it only once? I'm hoping the latter.
The generic.json view calls gluon.serializers.json, which ultimately calls json.dumps from the Python standard library. By default, json.dumps uses (', ', ': ') as separators, i.e., it inserts a space after each comma and colon. If you want no spaces, you will not be able to use the generic.json view as is. You can instead do:
import json
output = json.dumps(input, separators=(',', ':'))
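For example, using the data from your question to show the effect (the comments show each call's output):
import json

data = {"field": [[110468, "Octopus_vulgaris", "common octopus"]]}
print(json.dumps(data))                         # {"field": [[110468, "Octopus_vulgaris", "common octopus"]]}
print(json.dumps(data, separators=(',', ':')))  # {"field":[[110468,"Octopus_vulgaris","common octopus"]]}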
If input includes some data that are not JSON serializable and you want to take advantage of the special data type conversions implemented in gluon.serializers.json (i.e., datetime objects and various web2py specific objects), you can do the following:
import json
from gluon.serializers import custom_json
output = json.dumps(input, separators=(',', ':'), default=custom_json)
Using the above, you can either edit the generic.json view, create your own custom JSON view, or simply return the JSON directly from the controller.
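For instance, a minimal sketch of the last option, returning compact JSON straight from a controller action (the api action and species table here are hypothetical; db and response are the usual web2py globals available in a controller):
import json
from gluon.serializers import custom_json

def api():
    # select rows and serialize them compactly, handling dates etc. via custom_json
    rows = db(db.species).select().as_list()
    response.headers['Content-Type'] = 'application/json'
    return json.dumps(rows, separators=(',', ':'), default=custom_json)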
Also, no need to worry about re-importing modules in Python -- the interpreter executes a module only on first import and caches it in sys.modules, so subsequent imports just reuse the cached module.

Parsing large JSON file with Scala and JSON4S

I'm working with Scala in IntelliJ IDEA 15 and trying to parse a large JSON file of Twitter records and count the total number of hashtags. I am very new to Scala and the idea of functional programming. Each line in the JSON file is a JSON object (representing a tweet). Each line in the file starts like so:
{"in_reply_to_status_id":null,"text":"To my followers sorry..
{"in_reply_to_status_id":null,"text":"#victory","in_reply_to_screen_name"..
{"in_reply_to_status_id":null,"text":"I'm so full I can't move"..
I am most interested in a property called "entities", which contains a property called "hashtags" with a list of hashtags. Here is an example:
"entities":{"hashtags":[{"text":"thewayiseeit","indices":[0,13]}],"user_mentions":[],"urls":[]},
I've browsed the various Scala frameworks for parsing JSON and have decided to use json4s. I have the following code in my Scala script.
import org.json4s.native.JsonMethods._
var json: String = ""
for (line <- io.Source.fromFile("twitter38.json").getLines) json += line
val data = parse(json)
My logic here is that I am trying to read each line from twitter38.json into a string and then parse the entire string with parse(). The parse function is throwing an error claiming:
"Type mismatch, expected: Nothing, found: String."
I have seen examples that use parse() on strings holding JSON objects, such as:
val jsontest =
  """{
    |"name" : "bob",
    |"age" : "50",
    |"gender" : "male"
    |}
  """.stripMargin
val data = parse(jsontest)
but I have received the same error. I am coming from an object-oriented programming background; is there something fundamentally wrong with the way I am approaching this problem?
You have most likely incorrectly imported dependencies into your IntelliJ project, or modules into your file. Make sure you have the following import:
import org.json4s.native.JsonMethods._
Even if you correctly import this module, parse(json) will not work for you, because your concatenated String is not valid JSON. It will look like this:
"""{"in_reply_...":"someValue1"}{"in_reply_...":"someValues2"}"""
but should look as follows to be a valid json that can be parsed:
"""{{"in_reply_...":"someValue1"},{"in_reply_...":"someValues2"}}"""
i.e. you need starting and ending brackets for the JSON, and a comma between each tweet. Please read the json4s documentation for more information.
Although this question is almost six years old, I think it deserves another try.
The JSON format is subject to a few common misunderstandings, especially about how documents are stored and how they are read back.
A JSON document is stored either as a single object holding all the fields, or as an array of multiple objects, possibly all of the same shape. The second form is important because arrays, in JSON as in almost every programming language, are delimited by square brackets with values separated by commas (note that here I used a person object as my single value):
[
  {"name":"John","surname":"Doe"},
  {"name":"Jane","surname":"Doe"}
]
Also note that everything except brackets, numbers and booleans is enclosed in quotes when written to the file.
However, there is another form, not part of the official standard but preferred for transferring datasets easily (often called newline-delimited JSON), where every object, or document in NoSQL/Mongo terminology, is stored on its own line, like this:
{"name":"John","surname":"Doe"}
{"name":"Jane","surname":"Doe"}
So, for the question: the OP has a document written in this second form but tries to parse it with an approach written for the first form. The following code makes a few simple changes to bridge the gap, and the file must be read with that in mind:
var json: String = "["  // open a JSON array
for (line <- io.Source.fromFile("twitter38.json").getLines) json += line + ","
json = json.splitAt(json.length() - 1)._1  // drop the trailing comma
json += "]"  // close the array
val data = parse(json)
PS: although @sbrannon has the right idea, the example given mistakenly uses curly braces instead of square brackets to surround the data.
EDIT: I have added json = json.splitAt(json.length() - 1)._1 because the code otherwise ends with a trailing comma, which would cause a parse error per the JSON format definition.

read_csv converters for unknown columns

I'm trying to read a CSV file that holds several values in every cell, and I want to encode them into a single int to be stored in a pandas cell (e.g. (1, 1) -> 771). For that I would like to use the converters parameter of the read_csv function. The problem is that I don't know the names of the columns beforehand, and the value passed to converters should be a dict with the column names as keys. In fact I want to convert all columns with the same converter function. It would be nicer to write:
read_csv(fhand, converter=my_encoding_function)
than:
read_csv(fhand, converters={'col1': my_encoding_function,
                            'col2': my_encoding_function,
                            'col3': my_encoding_function})
Is something like that possible? Right now to solve the issue I'm doing:
dataframe = read_csv(fhand)
enc_func = numpy.vectorize(encoder.encode_genotype)
dataframe = dataframe.apply(enc_func, axis=1)
But I guess that this approach might be less efficient.
By the way, I have similar doubts about the formatters parameter used by the to_string method.
You can pass integers (0, 1, 2) instead of the names. From the docstring:
converters : dict, optional
    Dict of functions for converting values in certain columns. Keys can
    either be integers or column labels.
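If the column names really aren't known beforehand, one workaround (a sketch, assuming the file has a header row; my_encoding_function is a hypothetical stand-in for your real encoder) is to read just the header first and then build the converters dict from it:
import pandas as pd

def my_encoding_function(cell):
    # Hypothetical placeholder for the real encoding logic,
    # e.g. packing "(1, 1)" into the single int 771
    return hash(cell)

# Read only the header row to discover the column names...
columns = pd.read_csv("data.csv", nrows=0).columns
# ...then map every column to the same converter
dataframe = pd.read_csv("data.csv",
                        converters={col: my_encoding_function for col in columns})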