Dynamic scale domain in vega-lite - vega-lite

I would like to define my x-axis:
minimum value should be now()
maximum value should be automatically determined (just as if the domain of the scale would have not been defined)
"encoding": {
"y": {
"field": "Reference",
"type": "nominal",
},
"x":{
"field": "Date",
"type": "temporal",
"scale": {"domain": [now(), 1618000000000]}}
I also tried to use an expression to set-up now(), to no success:
"scale": {"domain": ["expr":"now()", 1618000000000]

You were quite close with the second attempt; you just need to put braces around the expression statement:
"scale": {"domain": [{"expr": "now()"}, "2021-05-01T00:00:00"]}
Here's a full example (open in editor):
{
"data": {
"values": [
{"date": "2021-03-01T00:00:00", "value": 1},
{"date": "2021-04-01T00:00:00", "value": 3},
{"date": "2021-05-01T00:00:00", "value": 2}
]
},
"mark": "line",
"encoding": {
"x": {
"field": "date",
"type": "temporal",
"scale": {"domain": [{"expr": "now()"}, "2021-05-01T00:00:00"]}
},
"y": {"field": "value", "type": "quantitative"}
}
}
If instead of setting the domain limits, you just want to ensure that now() appears as part of the domain, you can use a domain unionWith statement:
"scale": {"domain": {"unionWith": [{"expr": "now()"}]}}
This will create an automatically calculated domain that contains the current date.

Related

Vega-lite: skip invalid values instead of treating as 0 for aggregate sum

When doing an aggregate sum on a column and charting it using Vega-Lite, is it possible to skip invalid values instead of treating them as 0 when doing the addition? When there is missing/invalid data, I want to show it as such, rather than as 0.
For example, this graph is what I expect when aggregating on date to get the sums for x and y.
Whereas in this example, the y value for both rows where date=2022-01-20 are NaN, so I would want there to be no data point for the sum of column y and show it as missing data, instead of as 0.
Is there a way to do that? I’ve looked through the documentation but may have missed something. I've tried using filter like so, but that filters out an entire row, rather than just the invalid value of a particular column for the row when doing the sum.
I’m thinking something like pandas GroupBy.sum(min_coun=1), so that if there isn't at least 1 non-NaN value, then the result will be presented as NaN.
OK, try this which removes NaN and null but leaves zero.
Editor.
Or this which removes a load of useless transforms.
Editor
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "Google's stock price over time.",
"data": {
"values": [
{"date": "2022-01-20", "g": "apples", "x": "NaN", "y": "NaN"},
{"date": "2022-01-20", "g": "oranges", "x": "10", "y": "20"},
{"date": "2022-01-21", "g": "oranges", "x": "30", "y": "NaN"},
{"date": "2022-01-21", "g": "grapes", "x": "40", "y": "20"},
{"date": "2022-01-22", "g": "apples", "x": "NaN", "y": "NaN"},
{"date": "2022-01-22", "g": "grapes", "x": "10", "y": "NaN"}
]
},
"transform": [
{"calculate": "parseFloat(datum['x'])", "as": "x"},
{"calculate": "parseFloat(datum['y'])", "as": "y"},
{"fold": ["x", "y"]},
**{"filter": {"field": "value", "valid": true}},**
{
"aggregate": [{"op": "sum", "field": "value", "as": "value"}],
"groupby": ["date", "key"]
}
],
"encoding": {"x": {"field": "date", "type": "temporal"}},
"layer": [
{
"encoding": {
"y": {"field": "value", "type": "quantitative"},
"color": {"field": "key", "type": "nominal"}
},
"mark": "line"
},
{
"encoding": {
"y": {"field": "value", "type": "quantitative"},
"color": {"field": "key", "type": "nominal"}
},
"mark": {"type": "point", "tooltip": {"content": "encoding"}}
}
]
}

Set domainMin to 6 months before max date in data

I have the following Vega-Lite chart:
Open the Chart in the Vega Editor
Currently, I have the scale set as follows:
"scale": {"domainMin": "2021-06-01"}
However, what I really want is for the domainMin to be automatically calculated to be 6 months before the latest date in the notification_date field in the data.
I've looked at aggregate and expressions, but it's not exactly clear.
How can I get the maximum value of notification_date and subtract 6 months from it, and use that in "domainMin"?
Edit: To clarify, I don't want to filter the data. I want the user to be able to zoom out or pan to see the data outside the initial 6-month window. I get exactly what I want with "scale": {"domainMin": "2021-06-01"}, but this becomes out-of-date very quickly.
I have tried giving params and expr to domainMin, but I was unable to use the data fields in expr through datum.
The 2nd approach I tried will work for you, in this you will need to make use of joinaggregate/calculate/filter transforms. You will manually gather the max year and max months and then use it to filter your data.
Below is the modified config or refer the editor url:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"width": "container",
"height": "container",
"config": {
"group": {"fill": "#e5e5e5"},
"arc": {"fill": "#2b2c39"},
"area": {"fill": "#2b2c39"},
"line": {"stroke": "#2b2c39"},
"path": {"stroke": "#2b2c39"},
"rect": {"fill": "#2b2c39"},
"shape": {"stroke": "#2b2c39"},
"symbol": {"fill": "#2b2c39"},
"range": {
"category": [
"#2283a2",
"#003e6a",
"#a1ce5e",
"#FDBE13",
"#F2727E",
"#EA3F3F",
"#25A9E0",
"#F97A08",
"#41BFB8",
"#518DCA",
"#9460A8",
"#6F7D84",
"#D1DCA5"
]
}
},
"title": "South Western Sydney Cumulative and Daily COVID-19 Cases by LGA",
"data": {
"url": "https://davidwales.github.io/nsw-covid-19-data/confirmed_cases_table1_location.csv"
},
"transform": [
{
"filter": {
"and": [
{"field": "lhd_2010_name", "equal": "South Western Sydney"},
{"not": {"field": "lga_name19", "equal": "Penrith"}}
]
}
},
{"calculate": "utcyear(datum.notification_date)", "as": "yearNumber"},
{"calculate": "utcmonth(datum.notification_date)", "as": "monthNumber"},
{
"window": [
{"op": "count", "field": "notification_date", "as": "cumulative_count"}
],
"frame": [null, 0]
},
{
"joinaggregate": [
{"field": "monthNumber", "op": "max", "as": "max_month_count"},
{"field": "yearNumber", "op": "max", "as": "max_year"}
]
},
{"calculate": "abs(datum.max_month_count-6)", "as": "min_month_count"},
{
"filter": "datum.min_month_count < datum.monthNumber && datum.max_year === datum.yearNumber"
}
],
"layer": [
{
"selection": {
"date": {"type": "interval", "bind": "scales", "encodings": ["x"]}
},
"mark": {"type": "bar", "tooltip": true},
"encoding": {
"x": {
"timeUnit": "yearmonthdate",
"field": "notification_date",
"type": "temporal",
"title": "Date"
},
"color": {
"field": "lga_name19",
"type": "nominal",
"title": "LGA",
"legend": {"orient": "top", "columns": 4}
},
"y": {
"aggregate": "count",
"field": "lga_name19",
"type": "quantitative",
"title": "Cases",
"axis": {"title": "Daily Cases by SWS LGA"}
}
}
},
{
"mark": "line",
"encoding": {
"x": {
"timeUnit": "yearmonthdate",
"field": "notification_date",
"title": "Date",
"type": "temporal"
},
"y": {
"aggregate": "max",
"field": "cumulative_count",
"type": "quantitative",
"axis": {"title": "Cumulative Cases"}
}
}
}
],
"resolve": {"scale": {"y": "independent"}}
}
A bit simpler approach to filter approximately the last six months of data might look like this:
"transform": [
...,
{"joinaggregate": [{"op": "max", "field": "notification_date", "as": "last_date"}]},
{"filter": "datum.notification_date > datum.last_date - 6 * 30 * 24 * 60 * 60 * 1000"}
]
It makes use of the fact that dates are stored as millisecond time-stamps, and has the benefit that it will work across year boundaries.
This is not quite an answer to the question you asked, but if your data is reasonably up-to-date (that is, the most recent data point is close to the current date), you can do something like this:
"scale": { "domainMin": { "expr": "timeOffset('month', now(), -6)" } }

Vega-Lite: Is it possible to have images as labels in a line chart?

I want to have images as my axis labels. I tried to use the image mark, but that did not work and that is kind of expected. Label expression is also something I tried, but that did not work if I want it to be an image. What else could I tried or is it possible at all?
Line chart example
Using the same technique in another answer, an image axis can be added via an extra layer:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
{"x": 0.5, "y": 0.5, "img": "data/ffox.png"},
{"x": 1.5, "y": 1.5, "img": "data/gimp.png"},
{"x": 2.5, "y": 2.5, "img": "data/7zip.png"}
]
},
"layer": [{
"mark": {"type": "image", "width": 50, "height": 50},
"encoding": {
"x": {"field": "x", "type": "quantitative"},
"y": {"field": "y", "type": "quantitative"},
"url": {"field": "img", "type": "nominal"}
}
}, {
"transform": [{"calculate": "-.2", "as": "axis"}],
"mark": {"type": "image", "width": 25, "height": 25},
"encoding": {
"x": {"field": "x", "type": "quantitative", "axis":{"labelExpr": "", "title": null}},
"y": {"field": "axis", "type": "quantitative", "scale": {"domain": [0, 2.5]}},
"url": {"field": "img", "type": "nominal"}
}
}],
"resolve": {"scale": {"y": "shared"}}
}
Vega Editor
===== 2021-07-20 =====
Your BarChart's x encoding uses the x field which is not evenly distributed, so it is misaligned.
If you do have the a field shown in your editor, simplest way is to replace the encoding as "x": {"field": "a", "type": "ordinal", "axis": null}
Vega Editor
Even the a field was not there, you may wanna add such a field for aligning, or even ordering, the image axis.
The last resort I can think of is window transform which is an overkill, but adds no extra value as well:
Vega Editor

How to use zero=false in vega-lite when also using a color encoding?

I am trying to figure out how to not have my y-axis start at zero? It works in general for me, but if I add the color encoding (see below) it is not working anymore and instead I get to see the zero.
{
"data": {"name": "d"},
"mark": {"type": "bar"},
"encoding": {
"color": {"type": "nominal", "field": "group"},
"x": {"type": "nominal", "field": "model"},
"y": {
"type": "quantitative",
"field": "inf_f1",
"scale": {"zero": false}
}
},
"$schema": "https://vega.github.io/schema/vega-lite/v4.0.2.json",
"datasets": {
"d": [
{
"model": "lr-bow",
"inf_f1": 0.7991841662090597,
"group" : "A"
},
{
"model": "fcn-bow",
"inf_f1": 0.8220151833558302,
"group" : "B"
}
]
}
}
The reason the scale includes zero is that bars are stacked by default, and each bar has an implicit stacked zero-height bar for the group that does not appear, but does affect the automatically chosen axis limits. You can address this by setting stack to "none" in the y encoding (view in editor):
{
"data": {"name": "d"},
"mark": {"type": "bar"},
"encoding": {
"color": {"type": "nominal", "field": "group"},
"x": {"type": "nominal", "field": "model"},
"y": {
"type": "quantitative",
"field": "inf_f1",
"stack": "none",
"scale": {"zero": false}
}
},
"datasets": {
"d": [
{"model": "lr-bow", "inf_f1": 0.7991841662090597, "group": "A"},
{"model": "fcn-bow", "inf_f1": 0.8220151833558302, "group": "B"}
]
}
}

how to control the order of groups in a vega-lite stacked bar chart

Given a stacked bar chart such as in this example: https://vega.github.io/editor/?#/examples/vega-lite/stacked_bar_weather
I want to control the order of the items in the aggregation so that, for example, 'fog' comes at the bottom, with 'sun' next etc. Is that possible?
The reason for this is I have one type that is much larger than the others. I want this to appear at the bottom, then control the domain to 'cut off' most of that section.
Thanks
You can control the stack order via the order encoding: see https://vega.github.io/vega-lite/docs/stack.html#sorting-stack-order
Unfortunately, this only allows sorting by field value, rather than by an explicit order as you want here. The workaround is to use a calculate transform to turn your explicit order into a field (view in editor):
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {"url": "data/seattle-weather.csv"},
"transform": [
{
"calculate": "indexof(['sun', 'fog', 'drizzle', 'rain', 'snow'], datum.weather)",
"as": "order"
}
],
"mark": "bar",
"encoding": {
"x": {
"timeUnit": "month",
"field": "date",
"type": "ordinal",
"axis": {"title": "Month of the year"}
},
"y": {"aggregate": "count", "type": "quantitative"},
"color": {
"field": "weather",
"type": "nominal",
"scale": {
"domain": ["sun", "fog", "drizzle", "rain", "snow"],
"range": ["#e7ba52", "#c7c7c7", "#aec7e8", "#1f77b4", "#9467bd"]
},
"legend": {"title": "Weather type"}
},
"order": {"field": "order", "type": "ordinal"}
}
}