How do I do conditional coloring based on a value in the Vega-Lite API - vega-lite

I am trying to conditionally color the text of a heat map in the same style as the black and white text on this page: Condition
I am looking specifically at the conditional color encoding:
"encoding": {
"text": {"field": "num_cars", "type": "quantitative"},
"color": {
"condition": {"test": "datum['num_cars'] < 40", "value": "black"},
"value": "white"
}
}
I can't seem to get something similar to work in the Vega-Lite API. My latest version looks like this:
vl.data(weatherData)
  .transform(
    vl.calculate("monthAbbrevFormat(month(datum.date))").as("month"),
    vl.calculate("date(datum.date)").as("day"),
    vl.aggregate([{op: "average", field: "temp_max", as: "avg_temp"}])
      .groupby(["month", "day"])
  )
  .encode(
    vl.y().fieldO("month").sort(["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]),
    vl.x().fieldO("day")
  )
  .layer(
    vl.markRect({tooltip: true, clip: true})
      .encode(
        vl.color().average("avg_temp").scale({scheme: "redyellowblue", reverse: true})
      ),
    vl.markText({tooltip: true, clip: true})
      .encode(
        vl.text().average("avg_temp").format(".1f"),
        vl.color().condition({test: "datum['avg_temp'] > 26", value: "white"}).value("black")
      )
  )
  .width(1000)
  .height(400)
  .render()
When I convert that to JSON, I get
"color":{
"condition":{"test":"datum['avg_temp'] > 26", "value":"white"},
"value":"black"
}
which looks the same to my eyes. However, the text resolutely stays black.
I've put the output JSON into the Vega editor, and it also doesn't work there, so the problem isn't limited to my problems with the JavaScript API. It would be great if someone could point out where my logic is failing (and also fill me in on the correct syntax in the API, as the documentation is sadly lacking in examples).

I figured out my problem. In an earlier version I was computing the average in the encoding. I switched to pre-computing it in the transform, but didn't update the encoding for the text. So it was still
  .encode(
    vl.text().average("avg_temp").format(".1f"),
    vl.color().condition({test: "datum['avg_temp'] > 26", value: "white"}).value("black")
  )
Averaging again in the encoding didn't change the values, but it meant the text mark saw the re-aggregated data, where datum['avg_temp'] wasn't available.
If there is a way to compute the average as part of the test, that would make this a lot cleaner, but I couldn't find a way to do that.
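As a rough illustration of the fix, here is a sketch in Python that builds just the layered part of the equivalent Vega-Lite JSON spec. It assumes the average has already been computed in the transform as avg_temp, so both layers read that field directly with no second aggregate. The helper name heatmap_layers and its threshold parameter are made up for this example.

```python
import json


def heatmap_layers(field="avg_temp", threshold=26):
    """Build the rect and text layers of the heat map.

    Hypothetical helper: `field` must already exist on every row
    (pre-computed in the transform), so neither layer aggregates again.
    """
    rect = {
        "mark": {"type": "rect", "tooltip": True, "clip": True},
        "encoding": {
            "color": {
                "field": field,
                "type": "quantitative",
                "scale": {"scheme": "redyellowblue", "reverse": True},
            }
        },
    }
    text = {
        "mark": {"type": "text", "tooltip": True, "clip": True},
        "encoding": {
            # Plain field reference: no "aggregate" key here, so the
            # datum seen by the test below still carries `field`.
            "text": {"field": field, "type": "quantitative", "format": ".1f"},
            "color": {
                "condition": {
                    "test": f"datum['{field}'] > {threshold}",
                    "value": "white",
                },
                "value": "black",
            },
        },
    }
    return [rect, text]


if __name__ == "__main__":
    # Print the fragment so it can be pasted into a full spec.
    print(json.dumps({"layer": heatmap_layers()}, indent=2))
```

The point of the sketch is only that the field named in the test and the field named in the text encoding are the same un-aggregated field.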

Related

Vega-lite: Mark text overlaps top axis

How can I fix the text overlap on the top axis in the example?
vega-lite example
This is what I ended up with.
Now, you will probably not like how I got there. In the editor I went to the Compiled Vega tab and edited the low-level Vega spec. The fix was:
Add a signal with the extent transform:
{"type": "extent", "field": "value", "signal": "y_domain"}
Derive two more signals from it:
{"name": "y_domain_min", "update": "y_domain[0]"},
{"name": "y_domain_max", "update": "y_domain[1] * 1.25"}
Note the multiplier.
Use the new signals for the domain definition.
{
"name": "y",
"type": "linear",
"domain": [{"signal": "y_domain_min"}, {"signal": "y_domain_max"}],
...
If a fix is possible directly in Vega-lite, I'd like to know.
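One possible route that stays in Vega-Lite, sketched here in Python under the assumption that the data values are known before the spec is built, is to compute the padded domain up front and pass it as an explicit scale domain on the y encoding. The padded_domain helper and the sample values are illustrative only; the 1.25 default mirrors the multiplier above, and the field name value comes from the extent transform above.

```python
def padded_domain(values, headroom=1.25):
    """Return [min, max * headroom], mirroring y_domain_min / y_domain_max.

    Note: multiplying only adds headroom when the maximum is positive.
    """
    lo, hi = min(values), max(values)
    return [lo, hi * headroom]


# Hypothetical data values standing in for the chart's "value" field.
values = [2, 4, 8]

# Encoding fragment with an explicit scale domain.
y_encoding = {
    "field": "value",
    "type": "quantitative",
    "scale": {"domain": padded_domain(values)},
}

if __name__ == "__main__":
    print(y_encoding["scale"]["domain"])
```

The trade-off is that the domain is fixed when the spec is generated, whereas the signal-based Vega version recomputes it from the data.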

Json to Data Frame and Excel with Python

I would like to ask for some help with the conversion of a nested JSON into a pandas DataFrame.
I have read the quite brilliant input from a couple of years ago, but that is outdated now. :(
Flatten double nested JSON
So here is a sample of my input data (mind that classes might contain up to 10 class name and confidence pairs):
[
{
"classifier_id": "my_classifier_id",
"url": "https://api.eu-de.natural-language-classifier...",
"text": "for sales? aligning obsolete incentive\u00a0 system to what is the standard today: 100% reference salary, 100% of ref sal if you hit 100% of your quota",
"top_class": "conditions",
"classes": [
{
"class_name": "conditions",
"confidence": 0.9074866214536228
},
{
"class_name": "temperature",
"confidence": 0.09251337854637723
}
]
},
{
"classifier_id": "my_classifier_id",
"url": "https://api.eu-de.natural-language-classifier...",
"text": "Complete integration of incentives.\u00a0 People act inline with how they are compensated as the general rule. \u00a0 If we get that right then this model can genuinely change the face of IBM to the client.",
"top_class": "conditions",
"classes": [
{
"class_name": "conditions",
"confidence": 0.9683663322166756
},
{
"class_name": "temperature",
"confidence": 0.0316336677833244
}
]
},
{
"classifier_id": "my_classifier_id",
"url": "https://api.eu-de.natural-language-classifier.watson...",
"text": "Enablement, operational support on the most basic things",
"top_class": "temperature",
"classes": [
{
"class_name": "temperature",
"confidence": 0.8174158442711534
},
{
"class_name": "conditions",
"confidence": 0.1825841557288465
}
]
}
]
What I have tried thus far in Python:
data_df = pd.read_json(r'C:\Users\...\Documents\Python NLP\WATSON NLC\OUTPUT JSON\nlc_data_full.json')
When using this, the classes still remain in a JSON-like form:
[{'class_name': 'conditions', 'confidence': 0.907486621453622}, {'class_name': 'temperature', 'confidence': 0.092513378546377}]
[{'class_name': 'conditions', 'confidence': 0.9683663322166751}, {'class_name': 'temperature', 'confidence': 0.031633667783324}]
[{'class_name': 'temperature', 'confidence': 0.8174158442711531}, {'class_name': 'conditions', 'confidence': 0.182584155728846}]
I would love to get a format that can be worked on in excel. Thank you for looking into this.
Well I think I managed to figure out what everyone already knew anyways. LOL
So the magic is in the pd.json_normalize function. With the parameters it takes, it can flatten multi-nested JSON with relative ease.
Also the pandas site has been a good friend as always: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html
I am calling my dataset: nlc_data = [ .......
Here is a super lightweight solution for cases that do not have such intricate nesting: normie_2 = pd.json_normalize(nlc_data, max_level=0)
This one works for multi nested json files:
result = pd.json_normalize(nlc_data, 'classes', ['text', 'top_class'])
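For anyone who wants to see what that call is doing, here is a dependency-free sketch in plain Python of the same idea: one output row per entry in classes, with text and top_class repeated on each row. It is not the pandas implementation, just an illustration. The function name flatten_classes and the single shortened sample record are made up for the example, and the CSV text produced at the end is one format Excel can open.

```python
import csv
import io


def flatten_classes(records, record_path="classes", meta=("text", "top_class")):
    """Return one flat dict per nested class entry, carrying the meta fields along."""
    rows = []
    for record in records:
        for entry in record.get(record_path, []):
            row = dict(entry)  # copies class_name and confidence
            for key in meta:
                row[key] = record.get(key)
            rows.append(row)
    return rows


# Shortened sample in the same shape as the input above.
nlc_data = [
    {
        "text": "Enablement, operational support on the most basic things",
        "top_class": "temperature",
        "classes": [
            {"class_name": "temperature", "confidence": 0.8174158442711534},
            {"class_name": "conditions", "confidence": 0.1825841557288465},
        ],
    },
]

rows = flatten_classes(nlc_data)

# Write the rows as CSV into an in-memory buffer (swap in a real file as needed).
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["class_name", "confidence", "text", "top_class"]
)
writer.writeheader()
writer.writerows(rows)

if __name__ == "__main__":
    print(buffer.getvalue())
```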
Well I guess I got a lot smarter today. Bear with me ... I just might have another awesome question tomorrow.
Bye, Levi

How can I retrieve the perimeter and specific geometric properties using the Model Derivative API?

I have followed the Postman tutorial for the model derivative API, specifically for extracting metadata. I used a .dxf file, since I want to know if it is possible to retrieve perimeter, length/width properties based off the file.
I received a 200 response and it gave me a massive list of objects w/ their respective objectid's. Basically I got back a ton of these:
{
"objectid": 253,
"name": "Line [108]",
"externalId": "108",
"properties": {
"3D Visualization ": {
"Material": "ByLayer"
},
"General": {
"Color": "ByLayer",
"Handle": "108",
"Layer": "color#000000ff",
"Linetype": "BYLAYER",
"Linetype scale": "1.000",
"Lineweight": "ByLayer",
"Name ": "Line",
"Plot style": "ByColor",
"Thickness": "0.000 mm",
"Transparency": "ByLayer"
},
"Geometry": {
"Angle": "192.931 deg",
"Length": "0.088 mm"
}
}
}
The .dxf file I tested was as simple as possible and it looks like this image:
How can I retrieve the perimeter of this image? Is it possible to retrieve other specific geometric properties that I specify?
How can I know what part of the .dxf file each objectid is referring to?
Although it looks simple, the polyline (?) is probably being tessellated, resulting in a large number of small lines. Have you tried the original DWG file? Can you try that with viewer.autodesk.com?
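If the outline really is split into many small Line objects, one rough way to approximate the perimeter is to add up their Geometry "Length" values. The sketch below assumes you have already pulled the list of objects out of the properties response, that every object on the outline (and nothing else) reports a Length, and that all lengths use the same unit. The names parse_length and total_length and the sample objects are hypothetical.

```python
def parse_length(text):
    """Parse a value such as '0.088 mm' into (0.088, 'mm')."""
    number, _, unit = text.partition(" ")
    return float(number), unit


def total_length(objects):
    """Sum Geometry.Length over all objects that report one."""
    total = 0.0
    units = set()
    for obj in objects:
        length = obj.get("properties", {}).get("Geometry", {}).get("Length")
        if length is None:
            continue  # objects without a Length (e.g. text) are skipped
        value, unit = parse_length(length)
        total += value
        units.add(unit)
    if len(units) > 1:
        raise ValueError(f"mixed units: {sorted(units)}")
    return total, (units.pop() if units else "")


# Hypothetical objects in the same shape as the response shown above.
sample_objects = [
    {"objectid": 1, "properties": {"Geometry": {"Angle": "0.000 deg", "Length": "1.5 mm"}}},
    {"objectid": 2, "properties": {"Geometry": {"Angle": "90.000 deg", "Length": "2.5 mm"}}},
    {"objectid": 3, "properties": {"General": {"Layer": "0"}}},
]

if __name__ == "__main__":
    print(total_length(sample_objects))
```

This only gives a sensible perimeter when the summed objects are exactly the segments of the outline, so filtering by layer or name first may be needed.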

Vega-lite default bar width strange

I'm seeing the following oddly styled chart. I understand I can explicitly change the padding etc., but the default vega-lite layout is usually pretty good. I'm confused what I'm doing that's leading to this sub-normal behavior. Thanks! Here is the code in the vega-lite editor
I understand that I can also change x's type to ordinal to improve the styling, though I still don't understand why I see this difference. I need the type to be quantitative so I get the min/max brush bound, as opposed to the set.
Also, I do not actually know how to set the bar width manually, even after reading the documentation here https://vega.github.io/vega-lite/docs/scale.html. If anyone has a working example, that would be great.
Thanks.
As @marcprux mentioned, there is pre-binned support so you don't have to repeat the bin transform here. However, currently the prebinned support requires both bin_start and bin_end.
For now you could modify the spec to derive a new bin_end field and use it with x2.
{
"data": ...
"transform": [{
"calculate": "datum.ShareWomen_bin+0.1",
"as": "ShareWomen_bin_end"
}],
"mark": "bar",
"encoding": {
"x": {"bin": {"binned": true, "step": 0.1}, "field": "ShareWomen_bin", "type": "quantitative", "title": "ShareWomen_bin"},
"x2": {"field": "ShareWomen_bin_end"},
"y": {"field": "count", "type": "quantitative"}
}
}
like this spec.
I can see that we shouldn't require deriving bin_end and thus have created an issue to track this feature request: https://github.com/vega/vega-lite/issues/6086.
Btw, the quantitative scale only affects the bar position.
To set the bar size directly, you can use the size property in a mark definition:
mark: {type: "bar", size: 5}
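The bin_end derivation from the spec above could also be done before the data reaches Vega-Lite. Here is a minimal sketch in Python, assuming the rows are plain dicts and the bin step is 0.1 as in the calculate transform. The add_bin_end helper and the sample rows are made up for illustration, and the rounding is only there to hide floating-point noise.

```python
def add_bin_end(rows, field="ShareWomen_bin", step=0.1):
    """Return copies of the rows with a `<field>_end` value added."""
    out = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        new_row[f"{field}_end"] = round(row[field] + step, 10)
        out.append(new_row)
    return out


# Hypothetical pre-binned rows.
rows = [
    {"ShareWomen_bin": 0.8, "count": 40},
    {"ShareWomen_bin": 0.9, "count": 70},
]

binned = add_bin_end(rows)

if __name__ == "__main__":
    for row in binned:
        print(row)
```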
Since you declare "x" as a quantitative field, there's no assumption that the points along the axis are evenly distributed. E.g., you could add in some data points in between the others:
{"ShareWomen_bin": 0.83, "count": 40, "is_overview": true},
{"ShareWomen_bin": 0.87, "count": 70, "is_overview": true},
and you would see them rendered in between the other bars:
As you mention, you can specify that the bars should be encoded as ordinal values. Another solution is to leave it as quantitative, but specify that it is binned, in which case the bars will also be rendered as if they were ordinal:
"x": {"field": "ShareWomen_bin", "type": "quantitative", "bin": true},
Since it appears that your data is already binned, you should read about how vega-lite supports pre-binned data: https://vega.github.io/vega-lite/docs/bin.html#binned

Autodesk Forge - Extract geometry data from a 2D Cad Drawing using the modelderivative API

I am trying to extract data from a 2d Cad drawing. Essentially I would like to find the x/y coordinates of every element. However, the data does not show this information.
I am using the modelderivative/v2/designdata/{{urn}}/metadata/{{guid}}/properties endpoint to extract the data itself.
Here is an example of the output this gives
{
"objectid": 3308,
"name": "Text [67AC]",
"externalId": "67AC",
"properties": {
"AnnotationScaling": {
"Annotative": "No"
},
"General": {
"Color": "ByLayer",
"Handle": "67ac",
"Layer": "IMAGE-HYPERLINKS",
"Linetype": "ByLayer",
"Linetype scale": "1.000",
"Lineweight": "ByLayer",
"Name ": "Text",
"Plot style": "ByColor",
"Thickness": "0.000",
"Transparency": "ByLayer"
},
"Hyperlinks": {
"Description": ".\\R0010020.JPG",
"Name": ".\\R0010020.JPG"
},
"Misc": {
"Backward": "No",
"Upside down": "No"
},
"Text": {
"Contents": "R0010020.JPG",
"Height": "0.050",
"Justify": "Left",
"Obliquing": "0.000 deg",
"Rotation": "111.348 deg",
"Style": "Standard",
"Width factor": "1.000"
}
}
},
As you can see, there is no key 'Geometry'
Can anyone point me in the right direction on how I can extract the object positioning data for a 2d Cad drawing? Could it be that the drawing itself needs to implicitly set this information?
Here is an example of what I'm seeing in the Cad drawing itself.
Cad Output
There is no mention of the correct keys "Position X", "Position Y" in the modelderivative output above. Can anyone explain why this might be? Am I exporting it incorrectly? Or does Forge remove this information?
I am using PHP and getting the data server-side.
I exported another test model and found the following was generated
"Geometry": {
"Area": "1131855.821",
"Circumference": "3771.382 mm",
"Diameter": "1200.468 mm",
"Radius": "600.234 mm"
}
But there are no X/Y/Z coordinates in this data.
You can parse the individual primitives of your 2D drawing on the client side based on this blog post: https://forge.autodesk.com/blog/working-2d-and-3d-scenes-and-geometry-forge-viewer.
Parsing the drawing geometry on the server-side would be a bit more involved, since the file format used in Forge Viewer is not publicly documented. You could use tools like https://github.com/Autodesk-Forge/forge.commandline-nodejs, but I'm not sure if there are alternatives for PHP.