Static Ranges for X Axis and Counting - vega-lite

I'm trying to manually define ranges for the x axis in Vega-Lite and failing. I've read through the documentation on span, band, and scale, but I can't find any examples.
For the values listed, I just want to count how many of each field fall into each predefined range.
The y axis is just that count for each of the manually defined x-axis ranges.
For example, the ATR bar would have a value of 4 in the 1 to 15 range and 0 in the 15 to 30 range.
ETO would have a value of 3 in the 1 to 15 range and 1 for the 15 to 30.
Any help would be appreciated.
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
{"ATR": 1, "ETO": 10, "ITE": 20, "RTI": 15},
{"ATR": 3, "ETO": 4, "ITE": 8, "RTI": 4},
{"ATR": 8, "ETO": 23, "ITE": 1, "RTI": 6},
{"ATR": 7, "ETO": 9, "ITE": 4, "RTI": 11}
]
},
"transform": [
{"fold": ["ATR", "ETO", "ITE", "RTI"]}
],
"mark": "bar",
"encoding": {
"x": {
"type": "quantitative",
"scale": {
"domain": [1,15],[15,30]
}
},
"y": {"field": "value", "aggregate": "count", "type": "quantitative"},
"color": {"field": "key", "type": "nominal"}
}
}
I've tried plugging in various items from the Vega-Lite documentation while working in the Vega Editor, but I can't quite get it right. If I add the "value" field to the x axis, it just counts how many of each identifier have that exact value.
Edit: Update using Bin as suggested
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
{"ATR": 1, "ETO": 10, "ITE": 20, "RTI": 15},
{"ATR": 3, "ETO": 4, "ITE": 8, "RTI": 4},
{"ATR": 8, "ETO": 23, "ITE": 1, "RTI": 6},
{"ATR": 7, "ETO": 9, "ITE": 4, "RTI": 11}
]
},
"transform": [
{"fold": ["ATR", "ETO", "ITE", "RTI"]} ,
{
"bin":{"binned": true, "steps": [15]},
"field": "value"
}
],
"mark": "bar",
"encoding": {
"x": {
"field": "value",
"type": "ordinal"
},
"y": {"aggregate": "count"},
"xOffset":{"field": "key"},
"color": {"field": "key", "type": "nominal"}
}
}

Do you mean like this?
If so, use a label expression.
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
{"ATR": 1, "ETO": 10, "ITE": 20, "RTI": 15},
{"ATR": 3, "ETO": 4, "ITE": 8, "RTI": 4},
{"ATR": 8, "ETO": 23, "ITE": 1, "RTI": 6},
{"ATR": 7, "ETO": 9, "ITE": 4, "RTI": 11}
]
},
"transform": [
{"fold": ["ATR", "ETO", "ITE", "RTI"]},
{"bin": {"binned": true, "steps": [15]}, "field": "value"}
],
"mark": "bar",
"encoding": {
"x": {"field": "value", "type": "ordinal", "axis": {
"labelExpr": "datum.label==15?'0-15':'15-30'",
"labelAngle":0
}},
"y": {"aggregate": "count"},
"xOffset": {"field": "key"},
"color": {"field": "key", "type": "nominal"}
}
}
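If you would rather spell out the ranges yourself than rely on binning, one possible sketch is to label each folded value with a calculate transform and count per label. The field name "range" and the choice to put the boundary value 15 into the first range are assumptions, not something the question specifies:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: 'range' is a made-up field name; the boundary value 15 is assumed to belong to the first range.",
  "data": {
    "values": [
      {"ATR": 1, "ETO": 10, "ITE": 20, "RTI": 15},
      {"ATR": 3, "ETO": 4, "ITE": 8, "RTI": 4},
      {"ATR": 8, "ETO": 23, "ITE": 1, "RTI": 6},
      {"ATR": 7, "ETO": 9, "ITE": 4, "RTI": 11}
    ]
  },
  "transform": [
    {"fold": ["ATR", "ETO", "ITE", "RTI"]},
    {"calculate": "datum.value <= 15 ? '1-15' : '15-30'", "as": "range"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "range", "type": "nominal", "axis": {"labelAngle": 0}},
    "y": {"aggregate": "count"},
    "xOffset": {"field": "key"},
    "color": {"field": "key", "type": "nominal"}
  }
}
```

With the sample data this gives ATR a count of 4 in the first range and ETO counts of 3 and 1, matching the numbers in the question. A field with no values in a range (ATR in 15-30) simply gets no bar there.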

Related

Add additional tick to allow some space between the data points and the borders

See the attached screenshot
The goal is to leave some space between the data points and the border of the chart. This is especially useful for user interactions such as brushing, since it lets users start a brush from the right side.
I think the fallback plan is to manually compute the scale domain (min, max + some padding), but I wanted to see whether there is already a prebuilt option available.
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"description": "A scatterplot showing horsepower and miles per gallons for various cars.",
"data": {
"values": [
{"Horsepower": 10, "Miles_per_Gallon": 100},
{"Horsepower": 10, "Miles_per_Gallon": 120},
{"Horsepower": 8, "Miles_per_Gallon": 77},
{"Horsepower": 6, "Miles_per_Gallon": 80},
{"Horsepower": 4, "Miles_per_Gallon": 20},
{"Horsepower": 2, "Miles_per_Gallon": 60},
{"Horsepower": 0, "Miles_per_Gallon": 150}
]
},
"mark": "point",
"encoding": {
"x": {"field": "Horsepower", "type": "quantitative"},
"y": {"field": "Miles_per_Gallon", "type": "quantitative"}
}
}
I don't know of a built-in configuration to add padding to the automatically-determined scale domain, but here's a hack that lets you achieve this by plotting a transparent point at a specified position past the maximum:
{
"data": {
"values": [
{"Horsepower": 10, "Miles_per_Gallon": 100},
{"Horsepower": 10, "Miles_per_Gallon": 120},
{"Horsepower": 8, "Miles_per_Gallon": 77},
{"Horsepower": 6, "Miles_per_Gallon": 80},
{"Horsepower": 4, "Miles_per_Gallon": 20},
{"Horsepower": 2, "Miles_per_Gallon": 60},
{"Horsepower": 0, "Miles_per_Gallon": 150}
]
},
"layer": [
{
"mark": {"type": "point", "opacity": 0},
"transform": [
{
"aggregate": [
{"field": "Horsepower", "op": "max", "as": "max_Horsepower"}
]
},
{"calculate": "datum.max_Horsepower + 2", "as": "max_Horsepower"}
],
"encoding": {"x": {"field": "max_Horsepower", "type": "quantitative"}}
},
{
"mark": "point",
"encoding": {
"x": {"field": "Horsepower", "type": "quantitative"},
"y": {"field": "Miles_per_Gallon", "type": "quantitative"}
}
}
]
}
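For comparison, the fallback mentioned in the question (setting the scale domain by hand) could look roughly like this. The upper bound of 12 is an assumed value, the data maximum of 10 plus the same 2 units of padding used above, and it has to be updated by hand if the data changes:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch: fixed x domain; the upper bound 12 is an assumed value (max Horsepower 10 plus 2).",
  "data": {
    "values": [
      {"Horsepower": 10, "Miles_per_Gallon": 100},
      {"Horsepower": 10, "Miles_per_Gallon": 120},
      {"Horsepower": 8, "Miles_per_Gallon": 77},
      {"Horsepower": 6, "Miles_per_Gallon": 80},
      {"Horsepower": 4, "Miles_per_Gallon": 20},
      {"Horsepower": 2, "Miles_per_Gallon": 60},
      {"Horsepower": 0, "Miles_per_Gallon": 150}
    ]
  },
  "mark": "point",
  "encoding": {
    "x": {
      "field": "Horsepower",
      "type": "quantitative",
      "scale": {"domain": [0, 12]}
    },
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"}
  }
}
```

The transparent-point layer has the advantage of following the data automatically, while the explicit domain is simpler when the maximum is known in advance.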

Is it possible to achieve this aggregation using vega/vega-lite?

I have a data list of this format
[
{"id": 100, "y": 28, "c":0},
{"id": 1, "y": 20, "c":1},
{"id": 2, "y": 43, "c":0},
{"id": 3, "y": 35, "c":1},
{"id": 4, "y": 81, "c":0},
{"id": 5, "y": 10, "c":1},
{"id": 6, "y": 19, "c":0},
{"id": 7, "y": 15, "c":1},
{"id": 8, "y": 52, "c":0},
{"id": 9, "y": 48, "c":1}
]
My goal is to sum the c and y fields over all documents except id=100, then subtract those sums from the c and y values of the document with id=100, and display the result as a text mark.
I've tried the following :
{
$schema: https://vega.github.io/schema/vega/v3.0.json
title: Sum amount Per id
data: [
{
"name": "table",
"values": [
{"id": 100, "y": 28, "c":0},
{"id": 1, "y": 20, "c":1},
{"id": 2, "y": 43, "c":0},
{"id": 3, "y": 35, "c":1},
{"id": 4, "y": 81, "c":0},
{"id": 5, "y": 10, "c":1},
{"id": 6, "y": 19, "c":0},
{"id": 7, "y": 15, "c":1},
{"id": 8, "y": 52, "c":0},
{"id": 9, "y": 48, "c":1}
]
transform: [
{
type: aggregate
ops: ["sum","sum"]
fields: ["c", "y"]
as: ["sumc","sumy"]
}
]
}
]
marks: [
{
type: text
from: {data: "table"}
encode: {
update: {
text: {signal: "datum.sumc"}
align: {value: "center"}
baseline: {value: "middle"}
xc: {signal: "width/4"}
yc: {signal: "height/2"}
fontSize: {signal: "min(width/10, height)/1.3"}
}
}
}
{
type: text
from: {data: "table"}
encode: {
update: {
text: {signal: "datum.sumy"}
align: {value: "center"}
baseline: {value: "middle"}
xc: {signal: "width*3/4"}
yc: {signal: "height/2"}
fontSize: {signal: "min(width/10, height)/1.3"}
}
}
}
]
}
Please help me with how to achieve the subtraction from id=100
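I was able to solve this with Vega's joinaggregate transform: it adds the aggregate values as additional columns on every row of the data set, and a filter then keeps the single row (id=100) with the desired values. Because the joined sums include the id=100 row itself, the text expressions subtract that row's own value back out: c - (sumc - c) = 2*c - sumc, and likewise for y.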
{
"$schema": "https://vega.github.io/schema/vega/v3.0.json",
"title": "Sum amount Per id",
"data": [
{
"name": "table",
"values": [
{"id": 100, "y": 2800, "c": 1000},
{"id": 1, "y": 20, "c": 1},
{"id": 2, "y": 43, "c": 0},
{"id": 3, "y": 35, "c": 1},
{"id": 4, "y": 81, "c": 0},
{"id": 5, "y": 10, "c": 1},
{"id": 6, "y": 19, "c": 0},
{"id": 7, "y": 15, "c": 1},
{"id": 8, "y": 52, "c": 0},
{"id": 9, "y": 48, "c": 1}
],
"transform": [
{
"type": "joinaggregate",
"ops": ["sum", "sum"],
"fields": ["c", "y"],
"as": ["sumc", "sumy"]
},
{
"type": "filter",
"expr": "datum.id==100"
}
]
}
],
"marks": [
{
"type": "text",
"from": {"data": "table"},
"encode": {
"update": {
"text": {"signal": "-datum.sumc+datum.c*2"},
"align": {"value": "center"},
"baseline": {"value": "middle"},
"xc": {"signal": "width/4"},
"yc": {"signal": "height/2"},
"fontSize": {"signal": "min(width/10, height)/1.3"}
}
}
},
{
"type": "text",
"from": {"data": "table"},
"encode": {
"update": {
"text": {"signal": "datum.y-datum.sumy+datum.y"},
"align": {"value": "center"},
"baseline": {"value": "middle"},
"xc": {"signal": "width*3/4"},
"yc": {"signal": "height/2"},
"fontSize": {"signal": "min(width/10, height)/1.3"}
}
}
}
]
}
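Since the question also asks about Vega-Lite, here is a rough equivalent there, as a sketch only. The output field names "diffc" and "diffy" are made up for illustration, and only the y result is drawn to keep the spec short:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: 'diffc' and 'diffy' are hypothetical field names; only diffy is displayed.",
  "data": {
    "values": [
      {"id": 100, "y": 2800, "c": 1000},
      {"id": 1, "y": 20, "c": 1},
      {"id": 2, "y": 43, "c": 0},
      {"id": 3, "y": 35, "c": 1},
      {"id": 4, "y": 81, "c": 0},
      {"id": 5, "y": 10, "c": 1},
      {"id": 6, "y": 19, "c": 0},
      {"id": 7, "y": 15, "c": 1},
      {"id": 8, "y": 52, "c": 0},
      {"id": 9, "y": 48, "c": 1}
    ]
  },
  "transform": [
    {
      "joinaggregate": [
        {"op": "sum", "field": "c", "as": "sumc"},
        {"op": "sum", "field": "y", "as": "sumy"}
      ]
    },
    {"filter": "datum.id == 100"},
    {"calculate": "2 * datum.c - datum.sumc", "as": "diffc"},
    {"calculate": "2 * datum.y - datum.sumy", "as": "diffy"}
  ],
  "mark": {"type": "text", "fontSize": 40},
  "encoding": {"text": {"field": "diffy", "type": "quantitative"}}
}
```

With this data the other rows sum to 323 for y, so sumy is 3123 and diffy evaluates to 2 * 2800 - 3123 = 2477.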

How do I create a legend for a layered line plot

Basically, what I have is a line graph layered from several line graphs. Since each graph has only one line, no legend is generated automatically, so what is the best way to get a legend for the chart? I have been considering transforming my dataset. This is weekly death totals from the CDC from 2019 to June 2020. The CSV is arranged so that each date for each state has a record, with each disease type as its own column and integers as the column values. So there isn't one field to chart, there are many, hence the layering. Any insights into how to solve this problem would be much appreciated! Here is my work so far:
https://observablehq.com/#justin-krohn/covid-excess-deaths
You can create a legend for a layered chart by setting the color encoding for each layer to a datum specifying what label you would like it to have. For example (vega editor):
{
"data": {
"values": [
{"x": 1, "y1": 1, "y2": 2},
{"x": 2, "y1": 3, "y2": 1},
{"x": 3, "y1": 2, "y2": 4},
{"x": 4, "y1": 4, "y2": 3},
{"x": 5, "y1": 3, "y2": 5}
]
},
"encoding": {"x": {"field": "x", "type": "quantitative"}},
"layer": [
{
"mark": "line",
"encoding": {
"y": {"field": "y1", "type": "quantitative"},
"color": {"datum": "y1"}
}
},
{
"mark": "line",
"encoding": {
"y": {"field": "y2", "type": "quantitative"},
"color": {"datum": "y2"}
}
}
]
}
Alternatively, you can use a Fold Transform to pivot your data so that instead of manual layers, you can plot the multiple lines with a simple color encoding. For example (vega editor):
{
"data": {
"values": [
{"x": 1, "y1": 1, "y2": 2},
{"x": 2, "y1": 3, "y2": 1},
{"x": 3, "y1": 2, "y2": 4},
{"x": 4, "y1": 4, "y2": 3},
{"x": 5, "y1": 3, "y2": 5}
]
},
"transform": [{"fold": ["y1", "y2"], "as": ["name", "y"]}],
"mark": "line",
"encoding": {
"x": {"field": "x", "type": "quantitative"},
"y": {"field": "y", "type": "quantitative"},
"color": {"field": "name", "type": "nominal"}
}
}

Force order of painting, in non-stacked area chart

While working with Vega Lite, I'm unable to force the same order for painting non-stacked areas.
For example, see what happens here:
In the rows picture above, I expect light-blue to always be behind dark-blue. However this is not true for the 3rd, 4th, and 6th row.
I've tried several combinations of order, sort, and zrank — with no success. Any idea on how to force this?
See the sample full viz in the editor — I can't get India, Israel, or Japan to display the dark-blue area on top of the light-blue.
I don't think there is currently a way to control the z order of area charts from the chart spec, but you can control it via the order of the data source: the colors are stacked in the order that they appear. For example, here in row 0, color 1 comes first, and in row 1, color 0 comes first:
{
"data": {
"values": [
{"x": 0, "y": 2, "row": 0, "color": 1},
{"x": 1, "y": 1, "row": 0, "color": 1},
{"x": 0, "y": 1, "row": 0, "color": 0},
{"x": 1, "y": 2, "row": 0, "color": 0},
{"x": 0, "y": 2, "row": 1, "color": 0},
{"x": 1, "y": 1, "row": 1, "color": 0},
{"x": 0, "y": 1, "row": 1, "color": 1},
{"x": 1, "y": 2, "row": 1, "color": 1}
]
},
"mark": "area",
"encoding": {
"x": {"field": "x", "type": "temporal"},
"y": {"field": "y", "type": "quantitative", "stack": false},
"color": {"field": "color", "type": "ordinal"},
"row": {"field": "row", "type": "ordinal"}
},
"height": 50
}
If you rearrange the rows so that color 0 appears before color 1 in both cases, the stack order on the chart will be consistent:
{
"data": {
"values": [
{"x": 0, "y": 1, "row": 0, "color": 0},
{"x": 1, "y": 2, "row": 0, "color": 0},
{"x": 0, "y": 2, "row": 0, "color": 1},
{"x": 1, "y": 1, "row": 0, "color": 1},
{"x": 0, "y": 2, "row": 1, "color": 0},
{"x": 1, "y": 1, "row": 1, "color": 0},
{"x": 0, "y": 1, "row": 1, "color": 1},
{"x": 1, "y": 2, "row": 1, "color": 1}
]
},
"mark": "area",
"encoding": {
"x": {"field": "x", "type": "temporal"},
"y": {"field": "y", "type": "quantitative", "stack": false},
"color": {"field": "color", "type": "ordinal"},
"row": {"field": "row", "type": "ordinal"}
},
"height": 50
}
If you re-order the rows of your input data by year (so all 2019 entries come before all 2020 entries), the stack order should be the same in each panel.

Why are tooltip values rounded?

For some reason, my tooltip values are rounded to the nearest integer.
Any help is appreciated.
Here is the link to chart in the VL editor (version 3.0.0-rc14).
(vega editor link)
{
"width": 300,
"height": 300,
"config": {
"title": {"fontSize": 15},
"numberFormat": ".0f",
"style": {
"bar": {"size": 20},
"guide-title": {"value": "asdf", "fontSize": 15},
"guide-label": {"fontSize": 15}
},
"scale": {"bandPaddingInner": 0.5, "bandPaddingOuter": 0.5},
"legend": {"symbolSize": 100, "titleFontSize": 15, "labelFontSize": 15},
"axis": {"titleFontSize": 15, "labelFontSize": 15, "labelLimit": 1000}
},
"data": {"name": "data-dba50c8bae540866b10e6763560b8ec9"},
"mark": "circle",
"encoding": {
"tooltip": [
{"type": "quantitative", "field": "expressiveness"},
{"type": "quantitative", "field": "customization"}
],
"x": {"type": "quantitative", "field": "expressiveness"},
"y": {"type": "quantitative", "field": "customization"}
},
"$schema": "https://vega.github.io/schema/vega-lite/v2.6.0.json",
"datasets": {
"data-dba50c8bae540866b10e6763560b8ec9": [
{"library": "A", "expressiveness": 0, "customization": 1},
{"library": "B", "expressiveness": 0.4, "customization": 0.7},
{"library": "C", "expressiveness": 1, "customization": 0.7},
{"library": "D", "expressiveness": 0.6, "customization": 0.7},
{"library": "E", "expressiveness": 0, "customization": 1}
]
}
}
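Because you set "numberFormat": ".0f" in the config, and that's applied to the tooltip too. The ".0f" specifier means fixed-point notation with zero decimal places, so 0.4 and 0.7 are displayed as 0 and 1.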
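If you want to keep the integer format for axes and legends, one option is to set an explicit "format" on the tooltip fields, which takes precedence over the config default. A minimal sketch, using a subset of the data from the question; the ".2f" format and the v5 schema are assumptions chosen for illustration:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: the per-field tooltip format '.2f' is an assumed choice; config.numberFormat is kept at '.0f'.",
  "config": {"numberFormat": ".0f"},
  "data": {
    "values": [
      {"library": "A", "expressiveness": 0, "customization": 1},
      {"library": "B", "expressiveness": 0.4, "customization": 0.7},
      {"library": "C", "expressiveness": 1, "customization": 0.7}
    ]
  },
  "mark": "circle",
  "encoding": {
    "x": {"field": "expressiveness", "type": "quantitative"},
    "y": {"field": "customization", "type": "quantitative"},
    "tooltip": [
      {"field": "expressiveness", "type": "quantitative", "format": ".2f"},
      {"field": "customization", "type": "quantitative", "format": ".2f"}
    ]
  }
}
```

Alternatively, removing "numberFormat" from the config (or changing it to something like ".2f") affects the tooltips and every other formatted number in the chart at once.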