How do you fix the rendered text in a hconcat pyramid chart? - vega-lite

I am trying to create a concatenated pyramid chart, but the text in the middle does not render properly. If I change the text mark's field to a numeric one, the rendering problem goes away. This is the example I followed and modified: Population Pyramid
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"spacing": 0,
"hconcat": [
{
"transform": [
{ "filter": { "field": "sentiment", "equal": "negative" } }
],
"encoding": {
"y": { "field": "type", "title": null, "axis": null },
"x": {
"field": "sentiment",
"aggregate": "count",
"axis": null,
"sort": "descending"
}
},
"layer": [
{ "mark": "bar", "encoding": { "color": { "field": "channel" } } }
]
},
{
"width": 100,
"view": { "stroke": null },
"mark": { "type": "text", "align": "center" },
"encoding": {
"y": { "field": "type", "axis": null },
"text": { "field": "type" }
}
},
{
"mark": "bar",
"transform": [
{ "filter": { "field": "sentiment", "equal": "positive" } }
],
"encoding": {
"color": { "field": "channel" },
"y": { "field": "type", "axis": null },
"x": { "field": "sentiment", "aggregate": "count", "axis": null }
}
}
],
"config": { "view": { "stroke": null }, "axis": { "grid": false } },
"data": {
"values": [
{
"id": 1,
"type": "shops",
"channel": "line man",
"sentiment": "negative"
}
]
}
}

Since you have not done any aggregation in your text chart, each text mark is drawn multiple times – once per corresponding row in the data. This stacking of multiple text marks is what makes it appear as if it's rendered poorly.
To ensure that each text mark is only drawn once, you'll need to aggregate the data. There are a few ways to do this, but the easiest here is to use the argmin or argmax of an associated numerical column:
"encoding": {
"y": {"field": "type", "axis": null},
"text": {"field": "type", "aggregate": {"argmin": "id"}}
}
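As one of the other ways to aggregate, the sketch below collapses the data to one row per type with an aggregate transform before drawing the text, so each label is drawn exactly once. It is only an illustration of the idea: the rows with id 2 and 3, the "grab" channel, and the output field name n are made up for the example and do not come from the question.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"id": 1, "type": "shops", "channel": "line man", "sentiment": "negative"},
      {"id": 2, "type": "shops", "channel": "grab", "sentiment": "positive"},
      {"id": 3, "type": "riders", "channel": "line man", "sentiment": "positive"}
    ]
  },
  "transform": [
    {"aggregate": [{"op": "count", "as": "n"}], "groupby": ["type"]}
  ],
  "width": 100,
  "view": {"stroke": null},
  "mark": {"type": "text", "align": "center"},
  "encoding": {
    "y": {"field": "type", "type": "nominal", "axis": null},
    "text": {"field": "type", "type": "nominal"}
  }
}
```

After the transform the view sees two rows (one for "shops", one for "riders"), so no text marks are stacked on top of each other. Inside the hconcat, the same transform would go in the middle view only, since the bar views still need the unaggregated rows.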

Related

Cannot impute missing values

I have a chart with a gap in the line, produced by this Vega-Lite spec:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
{
"timestamp": "2011-04-01T17:06:21.000Z",
"value": 0.44777528325189986
},
{
"timestamp": "2011-04-02T17:06:21.000Z",
"value": 0.44390285331388984
},
{
"timestamp": "2011-04-03T17:06:21.000Z",
"value": 0.44813958999449255
},
{
"timestamp": "2011-04-04T17:06:21.000Z",
"value": 0.4440416510172272
},
{
"timestamp": "2011-04-05T17:06:21.000Z",
"missing": "NO value KEY HERE!"
},
{
"timestamp": "2011-04-06T17:06:21.000Z",
"value": 0.3797480270068858
},
{
"timestamp": "2011-04-07T17:06:21.000Z",
"value": 0.31955288375970203
},
{
"timestamp": "2011-04-08T17:06:21.000Z",
"value": 0.3171368880067786
},
{
"timestamp": "2011-04-10T17:06:21.000Z",
"value": 0.30021395605134893
},
{
"timestamp": "2011-04-11T17:06:21.000Z",
"value": 0.3130485242947531
}
]
},
"encoding": {"y": {"field": "timestamp", "type": "temporal", "sort": "ascending"}},
"layer": [
{
"mark": {"type": "line", "interpolate": "cardinal"},
"encoding": {
"x": {
"field": "value",
"sort": null,
"type": "quantitative",
"axis": {"orient": "top"},
"impute": {"keyvals": ["value"], "method": "mean", "frame": [-5, 5]}
}
}
}
]
}
But I thought the impute line would cause it to fill that gap in the data:
"impute": {"keyvals": ["value"], "method": "mean", "frame": [-5, 5]}
I have tried many permutations of this, including:
1. Changing keyvals to ["timestamp"]
2. Moving the impute line inside the "encoding": {"y": ... definition
3. Same as #2, but also switching keyvals to ["value"]
None of those seem to work.
Update
Also tried an impute in transform, and that doesn't work either:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
...
]
},
"transform": [
{
"impute": "value",
"key": "timestamp",
"frame": [-1, 1],
"method": "mean"
}
],
"encoding": {"y": {"field": "timestamp", "type": "temporal", "sort": "ascending"}},
"layer": [
{
"mark": {"type": "line", "interpolate": "cardinal"},
"encoding": {
"x": {
"field": "value",
"sort": null,
"type": "quantitative",
"axis": {"orient": "top"}
}
}
}
]
}
Update 2
Here's something that almost feels like progress, but doesn't behave how I would expect. This is the exact same data with the "transform" : [ "impute" : { ... approach, but now it's displaying imputed_value_value (which by the way is never mentioned in the docs) instead of value:
It does successfully impute, but it imputes (averages) everything, when I only want it to impute places with missing data. Is this how impute is supposed to work?
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
...
]
},
"transform": [
{
"impute": "value",
"key": "timestamp",
"frame": [-5, 5],
"method": "mean"
}
],
"encoding": {"y": {"field": "timestamp", "type": "temporal", "sort": "ascending"}},
"layer": [
{
"mark": {"type": "line", "interpolate": "cardinal"},
"encoding": {
"x": {
"field": "imputed_value_value",
"sort": null,
"type": "quantitative",
"axis": {"orient": "top"}
}
}
}
]
}

Is it possible to apply the same condition as color encoding for legend

My source code is as follows:
"transform": [
{
"window": [
{
"op": "rank", "field": "Value", "as": "_rank"
}
],
"sort": [
{
"field": "Value",
"order": "descending"
}
]
}
],
"encoding": {
"color": {
"field": "_rank",
"condition": {
"test": "datum._rank>5",
"value": "grey"
}
},
"x": {
"field": "Week",
"type": "nominal",
"axis": {
"labelAngle": 0
}
},
"y": {
"field": "Value",
"type": "quantitative",
"axis": {
"grid": false
}
}
},
"layer": [
{
"mark": {
"type": "bar",
"tooltip": true
}
},
{
"mark": {
"type": "text",
"align": "center",
"baseline": "middle",
"dx": 0,
"dy": -5,
"tooltip": true
},
"encoding": {
"text": {
"field": "Value"
}
}
}
]
I put a condition on the color encoding so that the top 5 values are shown in different colors and any value outside the top 5 is grey.
"color": {
"field": "_rank",
"condition": {
"test": "datum._rank>5",
"value": "grey"
}
}
This works for the bars, but the legend is not generated with the same condition.
Is it possible to extend the same top-5 logic to the legend's colors as well? That is, every legend entry with a rank greater than 5 should be grey, and every other entry should keep the color it has in the chart (this part is already generated correctly).
The legend colors will reflect the color scale that you specify, and not reflect conditions.
The easiest way to do what you want is likely by setting the range for your color scheme; for example:
{
"data": {"url": "data/cars.json"},
"mark": "point",
"encoding": {
"x": {"field": "Horsepower", "type": "quantitative"},
"y": {"field": "Miles_per_Gallon", "type": "quantitative"},
"color": {
"field": "Origin", "type": "nominal",
"scale": {"range": ["purple", "#ff0000", "teal"]}
}
}
}
You'll have to modify the specified colors based on how many color categories you have in your data.
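Applied to the ranked bar chart from the question, the same idea could look like the sketch below. It assumes seven rows with distinct values, so the ranks run from 1 to 7; the Week and Value data and the five non-grey colors are placeholders chosen for the example. The scale maps ranks 1 to 5 to distinct colors and ranks 6 and 7 to grey, and because the legend is drawn from the scale, it shows the same colors as the bars.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"Week": "W1", "Value": 40}, {"Week": "W2", "Value": 75},
      {"Week": "W3", "Value": 20}, {"Week": "W4", "Value": 90},
      {"Week": "W5", "Value": 55}, {"Week": "W6", "Value": 10},
      {"Week": "W7", "Value": 65}
    ]
  },
  "transform": [
    {
      "window": [{"op": "rank", "as": "_rank"}],
      "sort": [{"field": "Value", "order": "descending"}]
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Week", "type": "nominal", "axis": {"labelAngle": 0}},
    "y": {"field": "Value", "type": "quantitative"},
    "color": {
      "field": "_rank",
      "type": "ordinal",
      "scale": {
        "domain": [1, 2, 3, 4, 5, 6, 7],
        "range": ["#4c78a8", "#f58518", "#e45756", "#72b7b2", "#54a24b", "grey", "grey"]
      }
    }
  }
}
```

With this approach the condition on the color channel is no longer needed, but the domain and range lists have to be kept in step with the number of ranks in the data.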

How to filter data in vega-lite?

I have the following code for a line plot, and I am not sure how to use the filter transform. The mark and encoding are inside a layer so that I can use the tooltip for the plot.
{
"$schema": "https://vega.github.io/schema/vega-lite/v2.4.json",
"title": "Dashboard",
"data": {
"url" : {
"%context%": true,
"index": "paytrans",
"body": {
"size":10000,
"_source": ["Metrics","Value","ModelName"]
}
},
"format": {"property": "hits.hits"}
},
"layer": [
{
"mark": {
"type": "line",
"point": true
},
"encoding": {
"x": {"field": "_source.ModelName",
"type": "ordinal",
"title":"Models",
"axis": {
"labelAngle": 0
}
},
"y": {"field": "_source.Value", "type": "quantitative", "title":"Metric Score",
"scale": { "domain": [0.0, 1.0] }},
"color": {"field": "_source.Metrics", "type": "nominal", "title":"Metrics"},
"tooltip": [
{"field": "_source.Metrics", "type": "nominal", "title":"Metric"},
{"field": "_source.Value", "type": "quantitative", "title":"Value"}
]
}
}
]
}
If I add
"transform": [
{
"filter": "datum.Value <= 0.5"
}
],
it does not work. How do I filter on the Value field?
It appears that you don't have a field named Value; you have a field named _source.Value. So the correct way to filter would be:
"transform": [
{
"filter": "datum._source.Value <= 0.5"
}
],
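For placement, the transform is a sibling of data at the top level of the spec, so it applies to the rows before any mark sees them. The sketch below shows this with a reduced single-view version of the chart. The inline rows stand in for the Elasticsearch hits and are invented for the example, and the v5 schema URL is an assumption; the question itself targets v2.4.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"_source": {"ModelName": "A", "Metrics": "precision", "Value": 0.42}},
      {"_source": {"ModelName": "B", "Metrics": "precision", "Value": 0.81}},
      {"_source": {"ModelName": "C", "Metrics": "precision", "Value": 0.35}}
    ]
  },
  "transform": [
    {"filter": "datum._source.Value <= 0.5"}
  ],
  "mark": {"type": "line", "point": true},
  "encoding": {
    "x": {"field": "_source.ModelName", "type": "ordinal", "title": "Models"},
    "y": {"field": "_source.Value", "type": "quantitative", "title": "Metric Score"}
  }
}
```

Here only models A and C remain after the filter, because B has a Value above 0.5.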

Vega lite select N number of objects (count)

I just started using Vega lite and was wondering how to cut out everything after my 10th object (I have thousands of rows and am just interested in the top 10).
This is what I have so far:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"url": "https://raw.githubusercontent.com/DanStein91/Info-vis/master/anage.csv",
"format": {
"type": "csv"
}
},
"transform": [
{
"filter": {
"field": "Female_maturity_(days)",
"gt": 0
}
}
],
"title": {
"text": "",
"anchor": "middle"
},
"mark": "bar",
"encoding": {
"y": {
"field": "Common_name",
"type": "nominal",
"sort": {
"op": "mean",
"field": "Female_maturity_(days)",
"order": "descending"
}
},
"x": {
"field": "Female_maturity_(days)",
"type": "quantitative"
}
},
"config": {}
}
You can follow the Filtering Top K Items example from the documentation. The result looks something like this (view in vega editor):
{
"data": {
"url": "https://raw.githubusercontent.com/DanStein91/Info-vis/master/anage.csv",
"format": {"type": "csv", "parse": {"Female_maturity_(days)": "number"}}
},
"transform": [
{
"window": [{"op": "rank", "as": "rank"}],
"sort": [{"field": "Female_maturity_(days)", "order": "descending"}]
},
{"filter": "datum.rank <= 10"}
],
"mark": "bar",
"encoding": {
"y": {
"field": "Common_name",
"type": "nominal",
"sort": {
"op": "mean",
"field": "Female_maturity_(days)",
"order": "descending"
}
},
"x": {"field": "Female_maturity_(days)", "type": "quantitative"}
},
"title": {"text": "", "anchor": "middle"}
}
One note: when doing transforms on CSV data (as opposed to JSON data), it's important to use format.parse to specify the desired data type for the columns: by default, CSV columns are interpreted as strings, which can cause sorting-based operations to behave in unexpected ways.

Is there a way to add a line on top of a histogram?

My best attempt so far: Direct link to Vega-editor
I created two layers with the same data, removed the padding for the 'bar' layer, and added a step interpolation for the 'line' layer, but I can't find a way to make the line start at the vertical axis and end at the right edge of the chart.
The spec (sorry, I removed some lines because StackOverflow won't let me post it if the text-to-code ratio is too low):
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"datasets": ...,
"width": 1130,
"height": 438,
"layer": [
{
"mark": {
"type": "bar",
"opacity": 0.7
},
"encoding": {
"x": {
"scale": {
"padding": 0
},
"field": "Continent",
"type": "nominal"
},
"y": {
"field": "Population",
"type": "quantitative"
}
},
"data": {
"name": "bar"
}
},
{
"mark": {
"type": "line",
"interpolate": "step",
"strokeWidth": 3
},
"encoding": {
"x": {
"axis": {},
"field": "Continent",
"type": "nominal"
},
"y": {
"axis": {},
"field": "Population",
"type": "quantitative"
}
},
"data": {
"name": "line"
}
}
]
}