Waterfall chart with subtotals - vega-lite

I need to create a waterfall chart with Vega, but it must include subtotals. After each subtotal bar, the following bars should continue as usual, starting from the subtotal amount.
This is feasible in Excel and Tableau, but I haven't managed to do it with Vega.
I need to create something like this:
Example
Any idea whether the Vega waterfall chart supports this?

Here is an example of a Vega waterfall chart with "subtotal" bars of the kind you described.
There is a waterfall chart in the Vega-Lite gallery (but not in the Vega one), so we start from the Vega spec that the online editor generates from the Vega-Lite version.
The first step is to add records to the input data for where the subtotals will appear. In this example, we added "Qtr_1", "Qtr_2" and "Qtr_3" to represent quarterly subtotals:
"data": [
{
"name": "source_0",
"values": [
{"label": "Begin", "amount": 4000},
{"label": "Jan", "amount": 1707},
{"label": "Feb", "amount": -1425},
{"label": "Mar", "amount": -1030},
{"label": "Qtr_1", "amount": 0},
{"label": "Apr", "amount": 1812},
{"label": "May", "amount": -1067},
{"label": "Jun", "amount": -1481},
{"label": "Qtr_2", "amount": 0},
{"label": "Jul", "amount": 1228},
{"label": "Aug", "amount": 1176},
{"label": "Sep", "amount": 1146},
{"label": "Qtr_3", "amount": 0},
{"label": "Oct", "amount": 1205},
{"label": "Nov", "amount": -1388},
{"label": "Dec", "amount": 1492},
{"label": "End", "amount": 0}
]
},
Note that there are existing bars for "Begin" and "End". We can modify the code referencing these bars to render bars for quarterly subtotals.
For example, code such as
"fill": [
{
"test": "datum.label === 'Begin' || datum.label === 'End'",
"value": "#725a30"
},
has been changed to:
"fill": [
{
"test": "datum.label === 'Begin' || datum.label === 'End' || substring(datum.label, 0, 4) === 'Qtr_'",
"value": "#725a30"
},
In several places the code has been changed to use signal expressions for greater flexibility. For example,
"y": {"scale": "y", "field": "previous_sum"},
has been changed to
"y": {"signal": "scale('y', substring(datum.label, 0, 4) === 'Qtr_' ? 0 : datum['previous_sum'])"},
Here is the resulting waterfall chart rendered in the Vega online editor. To find the code changes, search for "Qtr_" in the code.
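The same idea can also be sketched directly in Vega-Lite with a window transform. This is a reduced illustration, not the code behind the linked chart: the data is shortened, and the field name is_total and the second bar color are assumptions made for the example. Total bars ("Begin", "End" and anything starting with "Qtr_") start at 0 and rise to the running sum; every other bar starts at the previous running sum.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"label": "Begin", "amount": 4000},
      {"label": "Jan", "amount": 1707},
      {"label": "Feb", "amount": -1425},
      {"label": "Mar", "amount": -1030},
      {"label": "Qtr_1", "amount": 0},
      {"label": "Apr", "amount": 1812},
      {"label": "End", "amount": 0}
    ]
  },
  "transform": [
    {"window": [{"op": "sum", "field": "amount", "as": "sum"}]},
    {
      "calculate": "datum.label === 'Begin' || datum.label === 'End' || substring(datum.label, 0, 4) === 'Qtr_'",
      "as": "is_total"
    },
    {
      "calculate": "datum.is_total ? 0 : datum.sum - datum.amount",
      "as": "previous_sum"
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "label", "type": "ordinal", "sort": null},
    "y": {"field": "previous_sum", "type": "quantitative", "title": "Amount"},
    "y2": {"field": "sum"},
    "color": {
      "condition": {"test": "datum.is_total", "value": "#725a30"},
      "value": "#4C78A8"
    }
  }
}
```

Because the subtotal records carry an amount of 0, they do not change the running sum; they only add a bar that shows it.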

Related

VegaLite Split Slices and aggregate by ranges

I'm trying to create a similar dashboard using VegaLite:
My example is in this link
Is there a way to configure the ranges in the dashboard and show it in a similar way as in the screenshot?
I need to divide the pie chart into two ranges:
0<=x<1
x>=1
You could use a bin transform or, if you just want two discrete categories, a calculated field works fine and can also be used in the legend.
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "A simple pie chart with embedded data.",
"data": {
"values": [
{"username": "client1", "value": 4},
{"username": "client2", "value": 0.6},
{"username": "client3", "value": 0},
{"username": "client4", "value": 3},
{"username": "client5", "value": 7},
{"username": "client6", "value": 8}
]
},
"transform": [
{"calculate": "datum.value >= 3 ? '>=3' : '<3'", "as": "binned"}
],
"mark": "arc",
"encoding": {
"theta": {"field": "value", "type": "quantitative", "aggregate": "count" },
"color": {"field": "binned", "type": "nominal"}
}
}
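The example above splits at 3, while the question asked for the ranges 0<=x<1 and x>=1. Adapting it should only require changing the calculate expression. A sketch with shortened data, where the output field name range and the label strings are illustrative choices:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"username": "client1", "value": 4},
      {"username": "client2", "value": 0.6},
      {"username": "client3", "value": 0}
    ]
  },
  "transform": [
    {"calculate": "datum.value >= 1 ? 'x >= 1' : '0 <= x < 1'", "as": "range"}
  ],
  "mark": "arc",
  "encoding": {
    "theta": {"aggregate": "count", "type": "quantitative"},
    "color": {"field": "range", "type": "nominal"}
  }
}
```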

Is it possible to turn the clear property of select to "false" for bar graphs?

I have a concatenated graph with two sets of data being displayed. By selecting bars on the left graph you can change the data shown on the right graph. I would like to prevent the user from being able to clear the selection and am trying to set {"select": {"clear": false}} to achieve this, however it is not working. Whenever I click outside of the graph or if I double click on the bars, the selection clears.
I have tried this using an example bar graph in the vega-lite examples page and that is also not working. So I am just wondering if this is not possible for certain types of graphs or if there is a specific way of doing it. I have attached the code for the bar graph that does not work with the clear: false property and for a heatmap that does work.
Not working bar graph:
Vega-lite link to graph: https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JAKyUAVhDYA7EABoQAEzjQATjRyZ289AEEABEyTrDsU5kMB3OjEMwaZGMOc7ZZQ3IdtSZpLLKhhBwgnBQ7p7eUMJQANaUhgAUAJKyEDg06nBBTACehgAqSExhSAwA5BCGNLKYcOpIEbrBmHlhlACUiipImCiooMRIggxqaADaoAMg+j1MaABMABwAvgrTaCAAQvNoYmLrm+gAwnuoACwAzEcgMwAi5wCcAIy3MwCi58tvG3dbADFzmIbn8ZgBxc4vJ7vLYACW+AHZYegUsDFqsALrrEA4UxIBAQSagWQEuBbRzOVxYHohMIRNCgNo4cnoHBsWqYHoxOCmLYAMxGIRAq1upIQrJAdPCXKU0oZbI5dRFmKUyHUcUZIGZkpM6h6-JogkEWwAxBcTojlvplrTMOo2HFdYImpqlFAGOoZPrFZyGiKlHBZFA2MpamQtQAPLWG0LKLYoJQ6rZsdRh0kmnF5GM0ONbBZJvIsrYARwYAR0-R0pADIENxoA8nioHRs4MQCHAnRdFq8Y0EFt5bKQMNRqzfiORmM0AAGSigqX2x1wADqNGU9C1nbDOjkxNx+IH6EpLicNKUo+nqAxfz7BIpZ+pw8v46xF6nrJnovdckNEfb0AjKyoAmIEAAKSDKOmZBpLI-qoHOGKikAA
Code:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "A bar chart with highlighting on hover and selecting on click. (Inspired by Tableau's interaction style.)",
"data": {
"values": [
{"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
{"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
{"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
]
},
"params": [
{
"name": "highlight",
"select": {"type": "point", "clear": "false"}
},
{"name": "select", "select": "point"}
],
"mark": {
"type": "bar",
"fill": "#4C78A8",
"stroke": "black",
"cursor": "pointer"
},
"encoding": {
"x": {"field": "a", "type": "ordinal"},
"y": {"field": "b", "type": "quantitative"},
"fillOpacity": {
"condition": {"param": "select", "value": 1},
"value": 0.3
},
"strokeWidth": {
"condition": [
{
"param": "highlight",
"value": 2
},
{
"param": "highlight",
"value": 1
}
],
"value": 0
}
},
"config": {
"scale": {
"bandPaddingInner": 0.2
}
}
}
Working heatmap graph:
Vega-lite link:https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JAKyUAVhDYA7EABoQAEySYUqUMSSCGcCGgDaoJFEwMdaEAEFFIHACc4ymmedXbSqGwazMaAEYAZgBfBRMzC0EPO0dnV0x3dAAhO29ff1QABjCI80t0T3snFzdlKwBhNJ8-NBzwkFN86JTYkoSkm2qMutzGyIKQVKU40sTy1q8azID6vKirYeL4ssru2tQANj6mhfQqkfbVwvXMuf7mtcOV8cXT3obdwYPlsc6X9I2AThCAXTD7EgHEgEAZUMYQLIQXArDAaGQYMIEf4lBA4II4GY0KBMABPHAw9A4Ng0WpeDFAtAAMx0aJCfyUyAcAGtsSA8QSrE4sajMA42My4AB1GjKehoABMALgsm8LlkZDZuLZVJo6ImFwWSg5hMhbAQpMsAIAHiq1YINaMOuVtfjdbJ9YbogDVYIWqBVeqrJ8Uey7VYAI4WPx0NQ0UggAEQPkCwmgbyyFyYdjyTSA4EIWHwxHZ32IHB46m0uBKbS6XVMQSmVkAst6NCyBhugFsHCmOjKtMJpMptltjNZhFIrB2OuEgK1nT17KUMQthzKOAONndui9tP9kGDnPI0dT8eT8u9AEJ1WKtPQHRxkBMJCJgAKSGU8rIAElZLIl3UlLeH0+XwA8gwiTLtktZqgA7my0ZwDgaAACz1CAwIKteGZwResCILquLooIbAQWQTgypW9b0koSDGjQYKgMo+pIKSRaCHS9JAA
Code:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
{"actual": "A", "predicted": "A", "count": 13},
{"actual": "A", "predicted": "B", "count": 0},
{"actual": "A", "predicted": "C", "count": 0},
{"actual": "B", "predicted": "A", "count": 0},
{"actual": "B", "predicted": "B", "count": 10},
{"actual": "B", "predicted": "C", "count": 6},
{"actual": "C", "predicted": "A", "count": 0},
{"actual": "C", "predicted": "B", "count": 0},
{"actual": "C", "predicted": "C", "count": 9}
]
},
"params": [
{"name": "highlight", "select": {"type": "point", "clear": false}}
],
"mark": {"type": "rect", "strokeWidth": 2},
"encoding": {
"y": {"field": "actual", "type": "nominal"},
"x": {"field": "predicted", "type": "nominal"},
"fill": {"field": "count", "type": "quantitative"},
"stroke": {
"condition": {"param": "highlight", "empty": false, "value": "black"},
"value": null
},
"opacity": {"condition": {"param": "highlight", "value": 1}, "value": 0.5},
"order": {"condition": {"param": "highlight", "value": 1}, "value": 0}
},
"config": {
"scale": {"bandPaddingInner": 0, "bandPaddingOuter": 0},
"view": {"step": 40},
"range": {"ramp": {"scheme": "yellowgreenblue"}},
"axis": {"domain": false}
}
}
The difference between the heatmap and the bar chart is not quite what you assume:
the heatmap's rects fill the entire grid area
the bar chart leaves some of the grid area outside the bars
As a result, clicks outside the grid do not clear the selection in either chart, as you want. However, a click on the empty area inside the bar chart's grid emits an event with nothing selected, which clears the existing selection.
You may not know that select also has a toggle property. When it is true, any newly clicked point is added to the selection, so clicking on the empty area adds nothing and preserves your previous selection. A side effect, however, is that multi-selection is allowed.
"params": [
{"name": "highlight", "select": {"type": "point", "toggle": "true"}},
{"name": "select", "select": {"type": "point", "toggle": "true"}}
],
Last but not least, according to the documentation, the clear property specifies the event that clears the selection. I'm not sure what happens when it is set to false.
The clear property identifies which events must fire to empty a selection of all selected values
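A minimal sketch of the toggle suggestion applied to the bar chart, reduced to a single selection parameter. It also writes clear as the boolean false: my reading of the selection documentation is that clear accepts an event stream or false (to disable clearing), which would mean the quoted "false" in the question is treated as an event name rather than a boolean. This has not been verified against the linked editor examples, so treat it as a starting point.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43}
    ]
  },
  "params": [
    {
      "name": "select",
      "select": {"type": "point", "toggle": "true", "clear": false}
    }
  ],
  "mark": {"type": "bar", "cursor": "pointer"},
  "encoding": {
    "x": {"field": "a", "type": "ordinal"},
    "y": {"field": "b", "type": "quantitative"},
    "fillOpacity": {
      "condition": {"param": "select", "value": 1},
      "value": 0.3
    }
  }
}
```

Every selection parameter in the spec needs its own clear setting; in the question's bar chart the second parameter ("select") has none, so it keeps the default clearing behavior.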

Using inline CSV data in Vega charts

I'm trying to use inline CSV data with Vega charts, using the values property of the Vega data specification. The Vega documentation says that this is possible, but doesn't give an example. I have tried to change the bar chart example from the examples gallery to use inline CSV data instead of JSON, but without success.
I replaced the data section from the example code with my own code. The original snippet looks like this:
"data": [
{
"name": "table",
"values": [
{"category": "A", "amount": 28},
{"category": "B", "amount": 55},
{"category": "C", "amount": 43},
{"category": "D", "amount": 91},
{"category": "E", "amount": 81},
{"category": "F", "amount": 53},
{"category": "G", "amount": 19},
{"category": "H", "amount": 87}
]
} ]
I replaced it with this one:
"data": [
{
"name": "table",
"format": "csv",
"values": {"category", "amount"
"A", "28"
"B", "55"
"C", "43"
"E", "91"
"E", "81"
"F", "53"
"G", "19"
"H", "87"}
} ]
I used the Vega online editor, but got only error messages about unexpected tokens in the JSON. I also tried the following variation:
"data": [
{
"name": "table",
"format": "csv",
"values": "category, amount
A, 28
B, 55
C, 43
E, 91
E, 81
F, 53
G, 19
H, 87"
} ]
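But this led to the same error messages. What is the correct syntax here?
As the documentation describes, the CSV has to be passed as a single JSON string with \n as the line separator (a JSON string cannot contain raw line breaks), and format has to be an object. In Vega-Lite it looks something like this: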
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "A simple bar chart with embedded data.",
"data": {"values": "a,b\nA,50\nB,30\nC,60", "format": {"type": "csv"}},
"mark": "bar",
"encoding": {
"x": {"field": "a", "type": "nominal", "axis": {"labelAngle": 0}},
"y": {"field": "b", "type": "quantitative"}
}
}
An example here
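Since the question is about Vega rather than Vega-Lite, here is a sketch of the equivalent Vega data block, assuming the rest of the gallery bar chart spec (scales, axes, marks) stays unchanged. The explicit parse entry is a cautious choice so that amount is read as a number rather than a string.

```json
{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "data": [
    {
      "name": "table",
      "values": "category,amount\nA,28\nB,55\nC,43\nD,91\nE,81\nF,53\nG,19\nH,87",
      "format": {"type": "csv", "parse": {"amount": "number"}}
    }
  ]
}
```

Note that the CSV string has no spaces after the commas. A header such as "category, amount" would produce a field named " amount" with a leading space, which would no longer match the field references elsewhere in the spec.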

Dynamically Change Y-Axis Field in Encoding Based on Selection Vega-Lite

How can I dynamically change a data field encoded for the y-axis based upon a selection? I am trying to build a visualization to display event count data over the 24 hours of a day, and I want the user to be able to select different timezones (e.g. EST, CST, MST, or PST).
To do this, I have built out a single selection where I specify all the options I list above in the parentheses, with EST being set as my default. I want to create a condition where, when I choose another option besides EST, I see the visualization dynamically update. I've explored creating other hour fields specifically for those timeframes, or adding in condition logic to try to account for these dynamic changes, but I have not arrived at a good solution yet. Can anyone help out?
Here is an example of what a few lines of my data look like:
"data": {
"values": [
{
"title_column":"example",
"Type": "Technology",
"Events": "100",
"Hour": "0",
"Date": "9/1/20",
"Time Period": "Last Time"
},
{
"title_column":"example",
"Type": "Technology",
"Events": "110",
"Hour": "1",
"Date": "9/1/20",
"Time Period": "Last Time"
},
and the visualization looks like this when it is put together, with it dynamically updating based on the selection:
And when my code is static, it looks like this:
"layer":[
{"mark":{
"type":"bar",
"point":true,
"color":"#FFC94E",
"height":15
},
"selection": {
"timezone": {
"type": "single",
"init": {"changer": "EST"},
"bind": {
"changer": {"input": "select",
"options": ["EST","CST (-1 Hour)","MST (-2 Hours)","PST (-3 Hours)"]}
}
}
},
"encoding":
{
"x":{"field":"Events",
"type":"quantitative",
"aggregate":"sum",
"axis":null},
"y": {"field":"Hour",
"type":"ordinal",
"axis":{
"labelSeparation":1,
"labelPadding":4,
"title":null
}
}
}}]
}
However, focusing in particular on the y encoding of the bottom part of the code, I would ideally like to make that dynamic. I'm thinking I could create calculations for each of the timezones and then write a condition that works like the following below, but I have not been able to get this to work. Any help is greatly appreciated!
"y": {
"condition": {
"selection": {"timezone" : "EST"},
"datum": "datum.Hour"
}
"condition": {
"selection": {"timezone" : "CST (-1 Hour)"},
"datum": "datum.Hour_CST"
}
...
}
Here is the link to my code:
vega editor.
Selections can only filter on column values, not column names. Fortunately, you can convert column names to column values by using a Fold Transform.
To accomplish what you want, I'd suggest the following:
Use a series of Calculate Transforms to calculate new columns containing the values you want to show.
Use a Fold Transform to stack these values into a single column with an associated key column.
Link the selection binding to the key column created in the fold transform.
Use a Filter Transform to filter the values based on the selection
Finally, add a row encoding so that the selected column is labeled on the axis.
Put together, it looks like this (open in vega editor):
{
"width": 300,
"data": {
"values": [...]
},
"transform": [
{"filter": {"field": "Time Period", "equal": "Last Time"}},
{"calculate": "datum.Hour - 0", "as": "EST"},
{"calculate": "datum.Hour - 1", "as": "CST (-1 Hour)"},
{"calculate": "datum.Hour - 2", "as": "MST (-2 Hours)"},
{"calculate": "datum.Hour - 3", "as": "PST (-3 Hours)"},
{
"fold": ["EST", "CST (-1 Hour)", "MST (-2 Hours)", "PST (-3 Hours)"],
"as": ["Zone", "Hour"]
},
{"filter": {"selection": "timezone"}}
],
"selection": {
"timezone": {
"type": "single",
"init": {"Zone": "EST"},
"bind": {
"Zone": {
"name": "timezone",
"input": "select",
"options": [
"EST",
"CST (-1 Hour)",
"MST (-2 Hours)",
"PST (-3 Hours)"
]
}
}
}
},
"mark": {"type": "bar", "point": true, "color": "#FFC94E", "height": 15},
"encoding": {
"x": {
"field": "Events",
"type": "quantitative",
"aggregate": "sum",
"axis": null
},
"y": {
"field": "Hour",
"type": "ordinal",
"axis": {"labelSeparation": 1, "labelPadding": 4, "title": null}
},
"row": {
"field": "Zone",
"type": "nominal",
"title": null
}
}
}

Vega-Lite: the bar chart is too thin when the x channel is a fieldT; how can I align it better and adjust the width or padding?

I am making some bar charts with Vega-Lite, using vega-lite-api. The raw data comes with a field called "month" with values like "2020/09", "2020/08", ... "2019/06", ...
fieldT parses it nicely, and I can apply a brush to select narrower time ranges. But then the bars don't look good: they always seem to have a fixed width, too thin, with too much space between them.
In this visual it would make more sense to center each bar on its month, because the y-axis data is aggregated over the whole month, not for a single date (the first day of the month).
So how can I make each bar span from the beginning to the end of its month, leaving just a small gap (say 5px, as with the fieldO version below)?
If I change the x channel to fieldO (ordinal values) instead, the width is closer to what I want, and it adapts well when the brush selection changes; but the month labels are left as is, which is not so good.
It sounds like the feature you're looking for is the Time Unit. If you apply a timeUnit to a temporal encoding, it will cause the visual representation of the feature to fill the given timespan.
For example, here is some data similar to yours that uses a raw temporal encoding (view in editor):
{
"mark": "bar",
"encoding": {
"x": {"field": "month", "type": "temporal"},
"y": {"field": "value", "type": "quantitative"}
},
"data": {
"values": [
{"month": "2019/01", "value": 1}, {"month": "2019/02", "value": 2}, {"month": "2019/03", "value": 1},
{"month": "2019/04", "value": 4}, {"month": "2019/05", "value": 7}, {"month": "2019/06", "value": 3},
{"month": "2019/07", "value": 4}, {"month": "2019/08", "value": 6}, {"month": "2019/09", "value": 8},
{"month": "2019/10", "value": 10}, {"month": "2019/11", "value": 7}, {"month": "2019/12", "value": 5},
{"month": "2020/01", "value": 6}, {"month": "2020/02", "value": 9}, {"month": "2020/03", "value": 8},
{"month": "2020/04", "value": 10}, {"month": "2020/05", "value": 11}, {"month": "2020/06", "value": 9},
{"month": "2020/07", "value": 14}, {"month": "2020/08", "value": 15}, {"month": "2020/09", "value": 13},
{"month": "2020/10", "value": 10}, {"month": "2020/11", "value": 16}, {"month": "2020/12", "value": 18}
]
},
"width": 500
}
You can apply a yearmonth timeUnit to the x encoding like this:
"x": {"field": "month", "type": "temporal", "timeUnit": "yearmonth"},
If you do so, the result looks like this (view in editor):