Why am I seeing an empty density plot, despite no errors in the editor log? - vega-lite

As a learning exercise I decided to try plotting a density plot of continuously-compounded daily returns of the Nasdaq 100 index for calendar year 2020. I am unable to get vega-lite to produce any visualization, and yet there are no errors in the online editor. I'm just inexplicably given an empty plot.
Because of the embedded data, the plot spec is some 2500 lines long, so I've saved it as a gist: https://gist.github.com/nathanvy/2c080ee0b7e93b11e544c5275d31f2b1
What am I missing?

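The density transform writes its output to fields named value and density, so the x encoding must reference value instead of logreturn: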
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"width": 300,
"height": 300,
"title": "Nasdaq 100 (NDX) Log Returns, 2020",
"mark": "area",
"transform": [{"density": "logreturn"}],
"encoding": {
"x": {
"field": "value",
"title": "Logarithmic Daily Return",
"type": "quantitative"
},
"y": {
"field": "density",
"title": "Probability of Return",
"type": "quantitative"
}
}
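To see why the field name matters, here is a rough Python sketch of what a density transform produces. It is a simplification: the fixed bandwidth, the step count and the sample numbers are assumptions for illustration, not Vega's actual defaults or real NDX data. The point is only that the output rows carry value and density, and the original logreturn field is gone.

```python
import math

def density(values, bandwidth, steps=50):
    """Gaussian kernel density estimate, returned as Vega-style records."""
    lo, hi = min(values), max(values)
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))
    records = []
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps  # evenly spaced sample point
        d = norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        records.append({"value": x, "density": d})
    return records

# Made-up log returns, purely illustrative.
log_returns = [-0.021, -0.004, 0.003, 0.008, 0.015]
rows = density(log_returns, bandwidth=0.01)
print(sorted(rows[0].keys()))  # ['density', 'value']
```

An encoding that still points at logreturn finds no such field in these rows, which is consistent with an empty plot and no error.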

Related

Fill missing data with 0 in specified date range with Vega-Lite

I have to render a line chart that consumes data from an API, which only returns values for days that have some data. For days with no data, it does not return an entry with 0, as would be expected.
This means that the chart doesn't show the zero values, which is an issue.
I can't modify this API, so my question is whether there is a way to tell vega-lite to render data within a date range and, if there is no data for some day, show it as 0.
I guess I could transform the data before sending it to my react-vega component, but if this can be done by vega-lite, that would be much better.
You can use impute, although you have to supply the dates converted to numbers (epoch milliseconds) in keyvals; I have raised a bug here.
The spec below imputes a zero value for 2012-01-05:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "Using utc scale with local time input.",
"data": {
"values": [
{"date": "2012-01-01", "price": 150},
{"date": "2012-01-02", "price": 100},
{"date": "2012-01-03", "price": 170},
{"date": "2012-01-04", "price": 165},
{"date": "2012-01-06", "price": 200}
]
},
"transform": [
{
"impute": "price",
"key": "date",
"value": 0,
"keyvals": [
1325376000000,
1325462400000,
1325548800000,
1325635200000,
1325721600000,
1325808000000
]
},
{"timeUnit": "day", "field": "date", "as": "dateTU"}
],
"mark": "line",
"encoding": {
"x": {"field": "date", "timeUnit": "date"},
"y": {"field": "price", "type": "quantitative"}
}
}
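Writing the keyvals by hand gets tedious for longer ranges. A minimal Python sketch for generating them, assuming the dates are interpreted as UTC midnights (the function name daily_keyvals is hypothetical):

```python
from datetime import date, datetime, timedelta, timezone

def daily_keyvals(start, end):
    """Return UTC-midnight epoch milliseconds for each day from start to end, inclusive."""
    day = date.fromisoformat(start)
    last = date.fromisoformat(end)
    out = []
    while day <= last:
        midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
        out.append(int(midnight.timestamp()) * 1000)  # seconds -> milliseconds
        day += timedelta(days=1)
    return out

keyvals = daily_keyvals("2012-01-01", "2012-01-06")
print(keyvals[0])  # 1325376000000
```

The resulting list can be pasted into the keyvals array of the impute transform, or injected when the spec is built in code.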

Vega-Lite v4 - Pivot example (and my code) throwing errors

So I am working with vega-lite-v4 (that's the version our business's Airtable extension uses), and the answer to my previous post was that I need to use the pivot transform.
But any time I try to use it as explained in the v4 documentation (https://vega.github.io/vega-lite-v4/docs/pivot.html), it throws an error as if the pivoted field does not exist.
I've used the following test data:
Airtable test data
With the following test code:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"title": "Table 2",
"transform": [{
"pivot": "type",
"value": "calls",
"groupby": ["Month"]
}],
"mark": "bar",
"encoding": {
"x": {"field": "Month", "type": "nominal"},
"y": {"field": "Total", "type": "quantitative"}
}
}
And I still get the same error:
Total is not a valid transform name, inline data name, or field name from the Table 2 table:
"y": {
"field": "Total",
------------^
"type": "quantitative"
}
Even when I copy and paste the examples from the above documentation into the widget, it comes up with this error, as if pivot isn't creating these fields.
Can anyone help me figure out why this isn't working, or what to use instead?
EDIT:
So, a weird solution/workaround I found is to calculate the field as itself:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"title": "Table 2",
"transform": [{
"pivot": "type",
"value": "calls",
"groupby": ["Month"]
},
{"calculate" : "datum.Total", "as" : "newTotal"}
],
"mark": "bar",
"encoding": {
"x": {"field": "Month", "type": "nominal"},
"y": {"field": "newTotal", "type": "quantitative"}
}
}
This makes the graph behave completely as normal. I can use this for now, but it means I have to hard-code each field name with a calculate transform. Does this help anyone understand what's going on with this transform?
First of all, Vega and Vega-lite field names are case-sensitive, so "Month" is not the same as "month".
In your first code sample, "month" is incorrect and should be "Month":
"x": {"field": "month", "type": "nominal"},
but in the second code sample that was changed to "Month" which is correct:
"x": {"field": "Month", "type": "nominal"},
Try just correcting field name "Month" in the first code sample without calculating "newTotal":
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"title": "Table 2",
"transform": [{
"pivot": "type",
"value": "calls",
"groupby": ["Month"]
}],
"mark": "bar",
"encoding": {
"x": {"field": "Month", "type": "nominal"},
"y": {"field": "Total", "type": "quantitative"}
}
}
[EDIT: added following]
Here is a working example using your example data with pivot transform and rendered as bar chart by Vega-lite v5.2.0 with no errors.
Try using Vega-lite v5.2.0 instead of v4.
View in Vega on-line editor
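For reference, here is a small Python sketch of what the pivot transform is expected to produce: one row per groupby key, with one new column per distinct value of the pivoted field, summed within each group. The rows below are hypothetical, since the Airtable test data is not reproduced in the post.

```python
from collections import defaultdict

def pivot(rows, pivot_field, value_field, groupby):
    """Turn each distinct pivot_field value into a column, summing value_field per group."""
    grouped = defaultdict(lambda: defaultdict(float))
    for row in rows:
        key = tuple(row[g] for g in groupby)
        grouped[key][row[pivot_field]] += row[value_field]
    out = []
    for key, columns in grouped.items():
        record = dict(zip(groupby, key))
        record.update(columns)  # new field names come from the data values
        out.append(record)
    return out

# Hypothetical long-format rows, for illustration only.
rows = [
    {"Month": "Jan", "type": "Total", "calls": 10},
    {"Month": "Jan", "type": "Missed", "calls": 2},
    {"Month": "Feb", "type": "Total", "calls": 7},
    {"Month": "Feb", "type": "Missed", "calls": 1},
]
wide = pivot(rows, "type", "calls", ["Month"])
print(wide[0])  # {'Month': 'Jan', 'Total': 10.0, 'Missed': 2.0}
```

Because the new field names are taken from the data at run time, a tool that validates field names against the original table before running the transforms would not know about Total, which may be what the Airtable widget is doing here.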

Multiple Datasets Within Vega Lite

I'm trying to build a visualization for histograms of numerical data using Vega Lite. Right now I am prototyping the visualization using a very simple mock dataset (Also available here):
{
"data": {
"fill": [
{"count": 30000, "level": "filled"},
{"count": 50000, "level": "missing"}
],
"histogram": [
{"bin_end": 20, "bin_start": 0, "count": 1000},
{"bin_end": 30, "bin_start": 20, "count": 20000}
]
},
"metadata": {}
}
The data format above is predetermined, and unfortunately I am not able to change it as it comes from an API. I'm trying to use the histogram section of the data to plot, well, a histogram, and the fill section of the data to plot a simple bar chart. Something like this:
I understand that I can use the "property" option to access nested data like this, as documented in this section of the Vega documentation, and this works as long as I am only plotting one of the charts, as shown by the examples below:
Example 1 in Vega Editor: Histogram only
Example 2 in Vega Editor: Barplot only
However, when I try to put both of them together it simply does not work. I get the weird chart below, where it seems that the data for the barplot is completely absent.
Link to vega editor for weird chart
And when inspecting the data using Vega Editor built in Data Viewer it seems that only the histogram data is being read.
Furthermore, this behavior seems to be order dependent, as switching the order of the charts in the HConcat block changes which chart gets messed up:
Inverted Chart
Am I missing something here? Is this some sort of limitation of Vega-Lite?
You're missing the name property so it looks like the data was simply overwritten by whatever was retrieved last. Here you go.
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.2.0.json",
"config": {"view": {"continuousHeight": 300, "continuousWidth": 400}},
"hconcat": [
{
"data": {"name": "a",
"format": {"type": "json", "property": "data.histogram"},
"url": "https://gist.githubusercontent.com/hemagso/f7b4381be43b34ece4d8aa78c936c7d5/raw/0bae0177b8a2a5d33e23c0d164d4439d248aa9ff/mock,json"
},
"encoding": {
"x": {
"bin": {"binned": true},
"field": "bin_start",
"scale": {"type": "linear"}
},
"x2": {"field": "bin_end"},
"y": {"field": "count", "type": "quantitative"}
},
"mark": "bar"
},
{
"data": {"name": "b",
"format": {"type": "json", "property": "data.fill"},
"url": "https://gist.githubusercontent.com/hemagso/f7b4381be43b34ece4d8aa78c936c7d5/raw/0bae0177b8a2a5d33e23c0d164d4439d248aa9ff/mock,json"
},
"encoding": {
"color": {"field": "level", "type": "nominal"},
"x": {"field": "level", "type": "nominal"},
"y": {"field": "count", "type": "quantitative"}
},
"mark": "bar"
}
]
}
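As a sanity check on the two property paths, this Python sketch mimics how a dotted path such as data.histogram selects a nested array from the same document. The helper name extract_property is an assumption for illustration; it is not a Vega function.

```python
import json

def extract_property(document, path):
    """Follow a dotted path such as 'data.fill' into a parsed JSON document."""
    node = document
    for key in path.split("."):
        node = node[key]
    return node

raw = """
{
  "data": {
    "fill": [
      {"count": 30000, "level": "filled"},
      {"count": 50000, "level": "missing"}
    ],
    "histogram": [
      {"bin_end": 20, "bin_start": 0, "count": 1000},
      {"bin_end": 30, "bin_start": 20, "count": 20000}
    ]
  },
  "metadata": {}
}
"""
doc = json.loads(raw)
histogram = extract_property(doc, "data.histogram")  # rows for the first chart
fill = extract_property(doc, "data.fill")            # rows for the second chart
print(len(histogram), len(fill))  # 2 2
```

Each chart needs its own extracted array, which is why the two data sources must be kept distinct even though they share a URL.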

Vega transform to select the first n rows

Is there a Vega/Vega-Lite transform which I can use to select the first n rows in data set?
Suppose I get a dataset from a URL such as:
Person  Height
Jeremy  6.2
Alice   6.0
Walter  5.8
Amy     5.6
Joe     5.5
and I want to create a bar chart showing the height of only the three tallest people. Assume that we know for certain that the dataset from the URL is already sorted. Assume that we cannot change the data as returned by the URL.
I want to do something like this:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"url": "heights.csv"
},
"transform": [
{"head": 3}
],
"mark": "bar",
"encoding": {
"x": {"field": "Person", "type": "nominal"},
"y": {"field": "Height", "type": "quantitative"}
}
}
However, the head transform does not actually exist. Is there something else I can do to get the same effect?
The Vega-Lite documentation has an example along these lines in filtering top-k items.
Your case is a bit more specialized: you do not want to order based on rank, but rather based on the original ordering of the data. You can do this using a count-based window transform followed by an appropriate filter. For example (view in editor):
{
"data": {
"values": [
{"Person": "Jeremy", "Height": 6.2},
{"Person": "Alice", "Height": 6.0},
{"Person": "Walter", "Height": 5.8},
{"Person": "Amy", "Height": 5.6},
{"Person": "Joe", "Height": 5.5}
]
},
"transform": [
{"window": [{"op": "count", "as": "count"}]},
{"filter": "datum.count <= 3"}
],
"mark": "bar",
"encoding": {
"x": {"field": "Height", "type": "quantitative"},
"y": {"field": "Person", "type": "nominal", "sort": null}
}
}
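The same idea, a running count followed by a filter, can be sketched in Python to show what the two transforms do to the rows. The function name head is hypothetical, mirroring the transform the question wished for.

```python
def head(rows, n):
    """Keep the first n rows by attaching a running count and filtering on it."""
    kept = []
    for count, row in enumerate(rows, start=1):  # running count, like the window op
        if count <= n:                           # like the filter "datum.count <= 3"
            kept.append({**row, "count": count})
    return kept

people = [
    {"Person": "Jeremy", "Height": 6.2},
    {"Person": "Alice", "Height": 6.0},
    {"Person": "Walter", "Height": 5.8},
    {"Person": "Amy", "Height": 5.6},
    {"Person": "Joe", "Height": 5.5},
]
top3 = head(people, 3)
print([r["Person"] for r in top3])  # ['Jeremy', 'Alice', 'Walter']
```

This relies on the incoming order being meaningful, as the question states; nothing here sorts by Height.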

How to use "multiply" or "custom fn" aggregation in Vega Lite?

In the example below, the mean aggregation is used. How can I calculate the aggregation as the product of all the elements?
And is it possible to use a custom JS function, like const myfn = (list) => list.length? (I know there's a built-in count; it's just to illustrate the idea.)
Playground
{
"data": {"url": "data/cars.json"},
"mark": "bar",
"encoding": {
"x": {"field": "Cylinders", "type": "ordinal"},
"y": {"aggregate": "mean", "field": "Acceleration", "type": "quantitative"}
}
}
Unfortunately, product is not one of the built-in aggregations in Vega-Lite, and by design the schema does not support injecting arbitrary JavaScript functions (it supports a limited Vega Expression syntax). Unless you preprocess your data before injecting it into the Vega-Lite specification, you're limited to building your custom computation from the operations available there.
For your specific question, since the log of a product equals the sum of logs, one way you could compute the product within the specification is via a series of transforms like this (playground):
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {"url": "data/cars.json"},
"transform": [
{"calculate": "log(datum.Acceleration)", "as": "logA"},
{"aggregate": [{"op": "sum", "field": "logA", "as": "log_prod_A"}], "groupby": ["Cylinders"]},
{"calculate": "exp(datum.log_prod_A)", "as": "prod_A"}
],
"mark": "bar",
"encoding": {
"x": {"field": "Cylinders", "type": "ordinal"},
"y": {"field": "prod_A", "type": "quantitative", "title": "prod(A)"}
}
}
A single bar dominates because there are many more entries with 4 Cylinders than with other numbers.
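The log-sum trick can be checked outside Vega-Lite with a few lines of Python. This is a sketch with made-up rows rather than the cars dataset, and it assumes every value is strictly positive, since the logarithm is undefined otherwise.

```python
import math
from collections import defaultdict

def product_by_group(rows, group_field, value_field):
    """Per-group product computed as exp(sum(log(x))), mirroring the three transforms."""
    log_sums = defaultdict(float)
    for row in rows:
        log_sums[row[group_field]] += math.log(row[value_field])  # calculate + aggregate sum
    return {group: math.exp(total) for group, total in log_sums.items()}  # final calculate

# Made-up rows for illustration; values must be > 0.
cars = [
    {"Cylinders": 4, "Acceleration": 2.0},
    {"Cylinders": 4, "Acceleration": 3.0},
    {"Cylinders": 8, "Acceleration": 5.0},
]
products = product_by_group(cars, "Cylinders", "Acceleration")
print(math.isclose(products[4], 6.0))  # True
```

The result matches a direct product up to floating-point rounding, and it grows with the number of rows in a group, which is why groups with many entries dominate the chart.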