Dynamically Change Y-Axis Field in Encoding Based on Selection in Vega-Lite

How can I dynamically change a data field encoded for the y-axis based upon a selection? I am trying to build a visualization to display event count data over the 24 hours of a day, and I want the user to be able to select different timezones (e.g. EST, CST, MST, or PST).
To do this, I have built a single selection that lists all the options above, with EST set as the default. I want to create a condition so that when I choose an option other than EST, the visualization updates dynamically. I've explored creating separate hour fields for each timezone, and adding condition logic to account for these changes, but I have not arrived at a good solution yet. Can anyone help out?
Here is an example of what a few lines of my data look like:
"data": {
"values": [
{
"title_column":"example",
"Type": "Technology",
"Events": "100",
"Hour": "0",
"Date": "9/1/20",
"Time Period": "Last Time"
},
{
"title_column":"example",
"Type": "Technology",
"Events": "110",
"Hour": "1",
"Date": "9/1/20",
"Time Period": "Last Time"
},
and the visualization looks like this when it is put together, with it dynamically updating based on the selection:
And when my code is static, it looks like this:
"layer":[
{"mark":{
"type":"bar",
"point":true,
"color":"#FFC94E",
"height":15
},
"selection": {
"timezone": {
"type": "single",
"init": {"changer": "EST"},
"bind": {
"changer": {"input": "select",
"options": ["EST","CST (-1 Hour)","MST (-2 Hours)","PST (-3 Hours)"]}
}
}
},
"encoding":
{
"x":{"field":"Events",
"type":"quantitative",
"aggregate":"sum",
"axis":null},
"y": {"field":"Hour",
"type":"ordinal",
"axis":{
"labelSeparation":1,
"labelPadding":4,
"title":null
}
}
}}]
}
However, I would ideally like to make the y encoding at the bottom of that code dynamic. I'm thinking I could create a calculated field for each timezone and then write a condition that works like the one below, but I have not been able to get this to work. Any help is greatly appreciated!
"y": {
"condition": {
"selection": {"timezone" : "EST"},
"datum": "datum.Hour"
}
"condition": {
"selection": {"timezone" : "CST (-1 Hour)"},
"datum": "datum.Hour_CST"
}
...
}
Here is the link to my code:
vega editor.

Selections can only filter on column values, not column names. Fortunately, you can convert column names to column values by using a Fold Transform.
To accomplish what you want, I'd suggest the following:
1. Use a series of Calculate Transforms to calculate new columns containing the values you want to show.
2. Use a Fold Transform to stack these values into a single column with an associated key column.
3. Link the selection binding to the key column created in the fold transform.
4. Use a Filter Transform to filter the values based on the selection.
5. Finally, add a row encoding so that the selected column is labeled on the axis.
Put together, it looks like this (open in vega editor):
{
"width": 300,
"data": {
"values": [...]
},
"transform": [
{"filter": {"field": "Time Period", "equal": "Last Time"}},
{"calculate": "datum.Hour - 0", "as": "EST"},
{"calculate": "datum.Hour - 1", "as": "CST (-1 Hour)"},
{"calculate": "datum.Hour - 2", "as": "MST (-2 Hours)"},
{"calculate": "datum.Hour - 3", "as": "PST (-3 Hours)"},
{
"fold": ["EST", "CST (-1 Hour)", "MST (-2 Hours)", "PST (-3 Hours)"],
"as": ["Zone", "Hour"]
},
{"filter": {"selection": "timezone"}}
],
"selection": {
"timezone": {
"type": "single",
"init": {"Zone": "EST"},
"bind": {
"Zone": {
"name": "timezone",
"input": "select",
"options": [
"EST",
"CST (-1 Hour)",
"MST (-2 Hours)",
"PST (-3 Hours)"
]
}
}
}
},
"mark": {"type": "bar", "point": true, "color": "#FFC94E", "height": 15},
"encoding": {
"x": {
"field": "Events",
"type": "quantitative",
"aggregate": "sum",
"axis": null
},
"y": {
"field": "Hour",
"type": "ordinal",
"axis": {"labelSeparation": 1, "labelPadding": 4, "title": null}
},
"row": {
"field": "Zone",
"type": "nominal",
"title": null
}
}
}
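One thing to watch in a spec like this: subtracting the offset can produce negative hours (hour 0 in EST becomes -1 in CST). Below is a minimal sketch of one way to wrap the result back into the 0-23 range. It assumes that wrapping onto the previous day is the behavior you want and that Hour always parses as a number; only the calculate transforms change, and the rest of the spec stays as it is.

```json
{
  "transform": [
    {"calculate": "toNumber(datum.Hour)", "as": "EST"},
    {"calculate": "(toNumber(datum.Hour) - 1 + 24) % 24", "as": "CST (-1 Hour)"},
    {"calculate": "(toNumber(datum.Hour) - 2 + 24) % 24", "as": "MST (-2 Hours)"},
    {"calculate": "(toNumber(datum.Hour) - 3 + 24) % 24", "as": "PST (-3 Hours)"}
  ]
}
```

Adding 24 before taking the remainder keeps the left-hand side non-negative, so hour 0 shifted by one hour becomes 23 rather than -1.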

Related

Force vega-lite to show label when number is 0

I'm still very much a beginner in vega-lite but I'm trying to create a stacked bar chart with different sales channels. Sometimes a sales channel has a 0 and doesn't show up, how can I still show the label?
{
"layer": [
{
"mark": {
"type": "bar",
"cornerRadius": 50,
"color": "#90C290",
"tooltip": true
},
"encoding": {
"x": {
"field": "Number of customers"
}
}
},
{
"mark": {
"type": "text",
"tooltip": true,
"align": "left",
"baseline": "middle",
"x": 10,
"color": "white"
},
"encoding": {
"text": {
"field": "Number of customers",
"type": "text"
}
}
}
],
"encoding": {
"y": {
"field": "Sales channel",
"type": "nominal",
"sort": "descending",
"title": null
},
"x": {
"type": "quantitative",
"title": null,
"axis": null
}
}
}
I tried the code above and looked through the documentation, but couldn't find exactly what I was looking for.
I added sample data to your spec, and there are a few changes I would make.
Around line 29 you have "type": "text" which should be "type": "quantitative".
I think your problem is that the text color is white and the background color is white, so the text is there but you can't see it. A simple fix would be to set the text color to black, or change the background color to something other than white (add "background": "lightgray", before "layer").
It's also possible you don't see the channel at all depending on how you are passing data from Power BI. Check the data tab in the Deneb window to make sure the Channel is there.
If the channel is not there, you'll have to adjust something on the Power BI side. A good practice is to put the data in a table in Power BI first so you know what you are sending into Deneb. If you use an aggregation like SUM on the data field in Power BI, nulls will drop out, but zeros should stay. If you use "Don't summarize" then nulls or errors (text in a number field) will pass in as nulls to Deneb, but you may need to add an "aggregate": "sum" to your encoding.
In any case, here's the spec the way I would write it.
{
"data": {"name": "dataset"},
"layer": [
{
"mark": {
"type": "bar",
"cornerRadius": 50,
"color": "#90C290",
"tooltip": true
}
},
{
"mark": {
"type": "text",
"tooltip": true,
"align": "left",
"baseline": "middle",
"dx": 5,
"color": "black"
},
"encoding": {
"text": {
"field": "Number of customers",
"type": "quantitative"
}
}
}
],
"encoding": {
"y": {
"field": "Sales channel",
"type": "nominal",
"sort": "descending",
"title": null
},
"x": {
"field": "Number of customers",
"type": "quantitative",
"title": null,
"axis": null
}
}
}
Link to sample data in Vega Editor
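If you do end up using "Don't summarize" on the Power BI side, the aggregation mentioned above could be added to the shared x encoding roughly as follows. This is a sketch only, reusing the field name from the question; the text layer's "text" encoding would presumably need the same "aggregate" so the label matches the bar.

```json
{
  "encoding": {
    "x": {
      "field": "Number of customers",
      "type": "quantitative",
      "aggregate": "sum",
      "title": null,
      "axis": null
    }
  }
}
```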

visualize the duration of events

I want to visualize durations of events as a bar, my input value is a decimal value where the integer part represents days and the decimal part a fraction of a day. I can convert the input value to any value needed.
An event can span multiple days.
The code below contains data for two events, the duration of event a is 36 hours and the duration of event b is 12 hours. Of course, it's possible that an event can be over after just some minutes or take 3hours 14minutes 24seconds.
I want the x-axis to have ticks every 30 minutes. For the sample data the axis needs to cover 36 hours, and an axis label could look like 0d 0:00.
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"height": "container",
"width": "container",
"data": {
"values": [
{
"event": "a",
"durationdecimal": 1.5
},
{
"event": "b",
"durationdecimal": 0.5
}
]
},
"mark": {"type": "bar"},
"encoding": {
"x": {
"field": "durationdecimal",
"type": "temporal",
"axis": {"grid": false},
"timeUnit": "utchoursminutes"
},
"y": {"field": "event", "type": "nominal", "title": null},
"tooltip": [{"field": "durationdecimal"}]
}
}
I appreciate any help.
I don't think your durationdecimal should be temporal, as there is no date/month/year provided. I recreated your sample using the quantitative type and converted the labels using labelExpr and a few expressions. This covers most of your requirements; the only remaining part is the ticks every 30 minutes.
The config is below, or you can open it in the editor:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"height": "container",
"width": "container",
"data": {
"values": [
{"event": "a", "durationdecimal": 1.5},
{"event": "c", "durationdecimal": 2.1},
{"event": "b", "durationdecimal": 0.5}
]
},
"mark": {"type": "bar"},
"transform": [
{
"calculate": "split(toString(datum.durationdecimal),'.')[0] + 'd ' + (split(toString(datum.durationdecimal),'.')[1] ? floor(('0.'+split(toString(datum.durationdecimal),'.')[1])*24) + ':00': '0:00')",
"as": "x_dateLabelTooltip"
}
],
"encoding": {
"x": {
"field": "durationdecimal",
"type": "quantitative",
"axis": {
"grid": false,
"labelExpr": "split(toString(datum.label),'.')[0] + 'd ' + (split(toString(datum.label),'.')[1] ? floor(('0.'+split(toString(datum.label),'.')[1])*24) + ':00': '0:00')"
}
},
"y": {"field": "event", "type": "nominal", "title": null},
"tooltip": [{"field": "x_dateLabelTooltip"}]
}
}
Let me know if this works for you.
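For the remaining 30-minute ticks, one possible approach is to convert the duration to hours and pass explicit tick positions to the axis. The sketch below assumes Vega-Lite v5, where axis "values" can take an expression reference, and hard-codes the 0-36 hour range of the sample data; durationhours is a name introduced here for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: duration in hours, one tick per half hour; the 0-36 h range is hard-coded for the sample data.",
  "height": "container",
  "width": "container",
  "data": {
    "values": [
      {"event": "a", "durationdecimal": 1.5},
      {"event": "b", "durationdecimal": 0.5}
    ]
  },
  "transform": [
    {"calculate": "datum.durationdecimal * 24", "as": "durationhours"}
  ],
  "mark": {"type": "bar"},
  "encoding": {
    "x": {
      "field": "durationhours",
      "type": "quantitative",
      "scale": {"domain": [0, 36]},
      "axis": {
        "grid": false,
        "title": null,
        "values": {"expr": "sequence(0, 36.5, 0.5)"},
        "labelExpr": "floor(datum.value / 24) + 'd ' + floor(datum.value % 24) + ':' + (datum.value % 1 ? '30' : '00')",
        "labelOverlap": true
      }
    },
    "y": {"field": "event", "type": "nominal", "title": null}
  }
}
```

Working in hours keeps every tick an exact multiple of 0.5, so the label expression can split days, hours and the half hour without rounding. With 73 ticks, many labels will be hidden by "labelOverlap" unless the chart is wide.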

Vega transform to select the first n rows

Is there a Vega/Vega-Lite transform which I can use to select the first n rows in data set?
Suppose I get a dataset from a URL such as:
Person
Height
Jeremy
6.2
Alice
6.0
Walter
5.8
Amy
5.6
Joe
5.5
and I want to create a bar chart showing the height of only the three tallest people. Assume that we know for certain that the dataset from the URL is already sorted. Assume that we cannot change the data as returned by the URL.
I want to do something like this:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"url": "heights.csv"
},
"transform": [
{"head": 3}
],
"mark": "bar",
"encoding": {
"x": {"field": "Person", "type": "nominal"},
"y": {"field": "Height", "type": "quantitative"}
}
}
except that the head transform does not actually exist. Is there something else I can do to get the same effect?
The Vega-Lite documentation has an example along these lines in filtering top-k items.
Your case is a bit more specialized: you do not want to order based on rank, but rather based on the original ordering of the data. You can do this using a count-based window transform followed by an appropriate filter. For example (view in editor):
{
"data": {
"values": [
{"Person": "Jeremy", "Height": 6.2},
{"Person": "Alice", "Height": 6.0},
{"Person": "Walter", "Height": 5.8},
{"Person": "Amy", "Height": 5.6},
{"Person": "Joe", "Height": 5.5}
]
},
"transform": [
{"window": [{"op": "count", "as": "count"}]},
{"filter": "datum.count <= 3"}
],
"mark": "bar",
"encoding": {
"x": {"field": "Height", "type": "quantitative"},
"y": {"field": "Person", "type": "nominal", "sort": null}
}
}
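A variation on the same idea, as a sketch: the row_number window operation numbers rows in their original order, and the cutoff can live in a parameter so it is easy to change. The parameter name n and the field name row are made up for this example, and the "params" block assumes Vega-Lite v5.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: keep the first n rows in their original order; n and row are hypothetical names.",
  "params": [{"name": "n", "value": 3}],
  "data": {
    "values": [
      {"Person": "Jeremy", "Height": 6.2},
      {"Person": "Alice", "Height": 6.0},
      {"Person": "Walter", "Height": 5.8},
      {"Person": "Amy", "Height": 5.6},
      {"Person": "Joe", "Height": 5.5}
    ]
  },
  "transform": [
    {"window": [{"op": "row_number", "as": "row"}]},
    {"filter": "datum.row <= n"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Height", "type": "quantitative"},
    "y": {"field": "Person", "type": "nominal", "sort": null}
  }
}
```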

issues transforming string to date vega lite

I am trying to display some data in a line graph. However, my "Harvest_Year" data, which is a date in years, like 2017 or 2018, is being displayed as what I believe is a string.
I imported my data from a .csv file, and the following are the steps I took to change the string to a date format. I tried:
"Harvest_Year": "year"
But that did not work, as it made all my values null. So I thought I would first parse it as an int and then transform it into a year. However, while all my years are displayed correctly in the Vega-Lite data table, the line graph shows only 1970, which I am sure is not in the dataset, and it displays only that single year.
Whereas in the image below, you can see I have all the years in my data:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"url": "https://raw.githubusercontent.com/DanStein91/Info-vis/master/CoffeeRobN.csv",
"format": {
"type": "csv",
"parse": {
"Number_of_Bags": "number",
"Bag_weight": "number",
"Harvest_Year": "number"
}
}
},
"transform": [
{
"timeUnit": "year",
"field": "Harvest_Year",
"as": "Year"
},
{
"calculate": "datum.Number_of_Bags * datum.Bag_Weight ",
"as": "Total_Export"
}
],
"width": 300,
"height": 200,
"mark": "line",
"encoding": {
"y": {
"field": "Total_Export",
"type": "quantitative"
},
"x": {
"field": "Harvest_Year",
"type": "temporal"
}
},
"config": {}
}
When you tell vega-lite to interpret numbers as dates, it treats them as unix timestamps, i.e. milliseconds after January 1 1970. Each of your resulting dates is in the year 1970, which leads to the chart you are seeing.
Your dates appear to be in a non-standard format (e.g. "2017.0" means the year 2017) so you'll have to use vega expressions to manually parse them into date objects. Here is an example of this (view in editor):
{
"data": {
"url": "https://raw.githubusercontent.com/DanStein91/Info-vis/master/CoffeeRobN.csv",
"format": {
"type": "csv",
"parse": {
"Number_of_Bags": "number",
"Bag_weight": "number",
"Harvest_Year": "number"
}
}
},
"transform": [
{"filter": "isValid(datum.Harvest_Year)"},
{"calculate": "datetime(datum.Harvest_Year, 1)", "as": "Harvest_Year"},
{
"calculate": "datum.Number_of_Bags * datum.Bag_Weight ",
"as": "Total_Export"
}
],
"mark": "point",
"encoding": {
"y": {"field": "Total_Export", "type": "quantitative"},
"x": {"field": "Harvest_Year", "type": "ordinal", "timeUnit": "year"}
},
"width": 300,
"height": 200
}
Another option is to avoid datetime and timeUnit logic altogether (since your data does not actually contain any dates), and just use the year numbers directly in your encoding; e.g.
{
"data": {
"url": "https://raw.githubusercontent.com/DanStein91/Info-vis/master/CoffeeRobN.csv",
"format": {
"type": "csv",
"parse": {
"Number_of_Bags": "number",
"Bag_weight": "number",
"Harvest_Year": "number"
}
}
},
"transform": [
{"filter": "isValid(datum.Harvest_Year)"},
{
"calculate": "datum.Number_of_Bags * datum.Bag_Weight ",
"as": "Total_Export"
}
],
"mark": "point",
"encoding": {
"y": {"field": "Total_Export", "type": "quantitative"},
"x": {"field": "Harvest_Year", "type": "ordinal"}
},
"width": 300,
"height": 200
}

Labeled bar chart with fill encoding does not respect sorting

I am trying to make a sorted bar chart with labels and a fill encoding. But when I add the fill encoding, it breaks the sort. From the GitHub issues it seems like there are ways to get around this, but I can't seem to find a solution.
With the spec below, which does not use the fill encoding, the sorting works as expected.
{
"$schema": "https://vega.github.io/schema/vega-lite/v3.json",
"data": {
"values": [
{
"a": "A",
"b": 28,
"color": "black"
},
{
"a": "B",
"b": 55,
"color": "grey"
},
{
"a": "C",
"b": 43,
"color": "red"
}
]
},
"encoding": {
"y": {
"field": "a",
"type": "ordinal",
"sort": {
"encoding": "x",
"order": "descending"
}
},
"x": {
"field": "b",
"type": "quantitative"
}
},
"layer": [
{
"mark": "bar"
},
{
"mark": {
"type": "text",
"align": "left",
"baseline": "middle",
"dx": 3
},
"encoding": {
"text": {
"field": "b",
"type": "quantitative"
}
}
}
]
}
When you add the following fill encoding to the top-level encoding object, it breaks the sort and produces the warning below:
"fill": {
"field": "color",
"type": "ordinal",
"scale": null
}
[Warning] Domains that should be unioned has conflicting sort properties. Sort will be set to true.
Full vega-editor here
Is there a workaround for this?
It appears to relate to these issues (maybe): #2536, #5408.
Yep, the underlying issue is https://github.com/vega/vega-lite/issues/5048. In this particular case, adding color to one layer adds a stack transform to one part of the dataflow but not the other, so we cannot merge it. This is a great test case. Can you add this example to a new GitHub issue so we can try to resolve it?
You can manually fix this example by disabling stacking on the x encoding:
"stack": null
See this spec.
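For reference, here is a sketch of where that property would sit in the top-level encoding of the question's spec, with the y encoding and its sort left unchanged:

```json
{
  "encoding": {
    "x": {
      "field": "b",
      "type": "quantitative",
      "stack": null
    },
    "fill": {
      "field": "color",
      "type": "ordinal",
      "scale": null
    }
  }
}
```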