In the example below, the mean aggregation is used. How can I compute an aggregation that is the product of all the elements?
Also, is it possible to use a custom JS function, such as const myfn = (list) => list.length? (I know there's a built-in count; this is just to illustrate the idea.)
Playground
{
  "data": {"url": "data/cars.json"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "Cylinders", "type": "ordinal"},
    "y": {"aggregate": "mean", "field": "Acceleration", "type": "quantitative"}
  }
}
Unfortunately, product is not one of the built-in aggregations in Vega-Lite, and by design the schema does not support injecting arbitrary JavaScript functions (it supports only a limited Vega expression syntax). Unless you preprocess your data before passing it to the Vega-Lite specification, you're limited to building your custom computation from the operations available there.
For your specific question, since the log of a product equals the sum of the logs (as long as all values are positive), one way you could compute the product within the specification is via a series of transforms like this (playground):
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {"url": "data/cars.json"},
  "transform": [
    {"calculate": "log(datum.Acceleration)", "as": "logA"},
    {"aggregate": [{"op": "sum", "field": "logA", "as": "log_prod_A"}], "groupby": ["Cylinders"]},
    {"calculate": "exp(datum.log_prod_A)", "as": "prod_A"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Cylinders", "type": "ordinal"},
    "y": {"field": "prod_A", "type": "quantitative", "title": "prod(A)"}
  }
}
A single bar dominates because there are many more entries with 4 Cylinders than with other numbers.
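If preprocessing outside the specification is an option, the same log/sum/exp idea can be checked in a few lines of Python. This is only a sketch: the rows are made-up stand-ins for data/cars.json, and product_by_group is a hypothetical helper name, not part of Vega-Lite.

```python
import math
from collections import defaultdict

# Hypothetical sample rows standing in for data/cars.json (values are made up).
rows = [
    {"Cylinders": 4, "Acceleration": 2.0},
    {"Cylinders": 4, "Acceleration": 3.0},
    {"Cylinders": 6, "Acceleration": 1.5},
    {"Cylinders": 6, "Acceleration": 4.0},
]


def product_by_group(rows, key, field):
    """Mirror the three transforms: log, then grouped sum, then exp."""
    log_sums = defaultdict(float)
    for row in rows:
        # math.log raises ValueError for values <= 0, so inputs must be positive.
        log_sums[row[key]] += math.log(row[field])
    return {group: math.exp(total) for group, total in log_sums.items()}


prod_A = product_by_group(rows, "Cylinders", "Acceleration")

# Direct products for comparison with the log-based result.
direct = {
    group: math.prod(r["Acceleration"] for r in rows if r["Cylinders"] == group)
    for group in prod_A
}
```

The log-based result matches the direct product only up to floating-point rounding, so compare with math.isclose rather than ==.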
I am working with Vega-Lite v4 (that's the version our business's Airtable extension uses), and the answer to my previous post was that I need to use the pivot transform.
But any time I try to use it as explained in the v4 documentation (https://vega.github.io/vega-lite-v4/docs/pivot.html), it throws an error as if the pivoted field does not exist.
I've used the following test data:
Airtable test data
With the following test code:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "title": "Table 2",
  "transform": [{
    "pivot": "type",
    "value": "calls",
    "groupby": ["Month"]
  }],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Month", "type": "nominal"},
    "y": {"field": "Total", "type": "quantitative"}
  }
}
And I still get the same error:
Total is not a valid transform name, inline data name, or field name from the Table 2 table:
"y": {
"field": "Total",
------------^
"type": "quantitative"
}
Even when I copy and paste the examples from the above documentation into the widget, I get this error, as if pivot isn't creating these fields.
Can anyone help me figure out why this isn't working, or what to use instead?
EDIT:
So, a weird solution/workaround I found is to calculate the field as itself:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "title": "Table 2",
  "transform": [
    {
      "pivot": "type",
      "value": "calls",
      "groupby": ["Month"]
    },
    {"calculate": "datum.Total", "as": "newTotal"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Month", "type": "nominal"},
    "y": {"field": "newTotal", "type": "quantitative"}
  }
}
This makes the graph behave completely normally. I can use this for now, but it means I have to hard-code each field name with a calculate transform. Does this help anyone understand what's going on with this transform?
First of all, Vega and Vega-Lite field names are case-sensitive, so "Month" is not the same as "month".
In your first code sample, "month" is incorrect and should be "Month":
"x": {"field": "month", "type": "nominal"},
but in the second code sample that was changed to "Month" which is correct:
"x": {"field": "Month", "type": "nominal"},
Try just correcting the field name to "Month" in the first code sample, without calculating "newTotal":
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "title": "Table 2",
  "transform": [{
    "pivot": "type",
    "value": "calls",
    "groupby": ["Month"]
  }],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Month", "type": "nominal"},
    "y": {"field": "Total", "type": "quantitative"}
  }
}
[EDIT: added following]
Here is a working example using your example data with the pivot transform, rendered as a bar chart by Vega-Lite v5.2.0 with no errors.
Try using Vega-Lite v5.2.0 instead of v4.
View in Vega on-line editor
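To make it clearer which fields the pivot transform is expected to produce, here is a rough Python sketch of the same reshaping. The rows are hypothetical (the actual Airtable data is not shown in the post), the pivot function is an illustrative helper rather than Vega-Lite code, and filling absent cells with 0 is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical long-format rows; the real Airtable data is not shown in the post.
rows = [
    {"Month": "Jan", "type": "Total", "calls": 10},
    {"Month": "Jan", "type": "Missed", "calls": 2},
    {"Month": "Feb", "type": "Total", "calls": 7},
    {"Month": "Feb", "type": "Missed", "calls": 1},
    {"Month": "Feb", "type": "Total", "calls": 3},
]


def pivot(rows, pivot_field, value_field, groupby):
    """Turn each distinct pivot_field value into a column, summing value_field."""
    columns = sorted({row[pivot_field] for row in rows})
    # Every group starts with all columns at 0 (an assumption of this sketch).
    groups = defaultdict(lambda: dict.fromkeys(columns, 0))
    for row in rows:
        groups[row[groupby]][row[pivot_field]] += row[value_field]
    return [{groupby: key, **values} for key, values in groups.items()]


wide = pivot(rows, "type", "calls", "Month")
```

Each distinct value of "type" becomes a field name in the output rows, which is why an encoding can only refer to "Total" if that exact, case-sensitive string occurs in the data.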
I'm trying to build a visualization for histograms of numerical data using Vega-Lite. Right now I am prototyping the visualization using a very simple mock dataset (also available here):
{
  "data": {
    "fill": [
      {"count": 30000, "level": "filled"},
      {"count": 50000, "level": "missing"}
    ],
    "histogram": [
      {"bin_end": 20, "bin_start": 0, "count": 1000},
      {"bin_end": 30, "bin_start": 20, "count": 20000}
    ]
  },
  "metadata": {}
}
The data format above is predetermined, and unfortunately I am not able to change it because it comes from an API. I'm trying to use the histogram section of the data to plot, well, a histogram, and the fill section of the data to plot a simple bar chart. Something like this:
I understand that I can use the "property" option to access nested data like this, as documented in this section of the Vega documentation, and this works as long as I am only plotting one of the charts, as shown by the examples below:
Example 1 in Vega Editor: Histogram only
Example 2 in Vega Editor: Barplot only
However, when I try to put both of them together it simply does not work. I get the weird chart below, where it seems that the data for the barplot is completely absent.
Link to vega editor for weird chart
And when inspecting the data using the Vega Editor's built-in Data Viewer, it seems that only the histogram data is being read.
Furthermore, this behavior seems to be order dependent, as switching the order of the charts in the HConcat block changes which chart gets messed up:
Inverted Chart
Am I missing something here? Is this some sort of limitation of Vega-Lite?
You're missing the name property, so it looks like the data was simply overwritten by whatever was retrieved last. Here you go:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.2.0.json",
  "config": {"view": {"continuousHeight": 300, "continuousWidth": 400}},
  "hconcat": [
    {
      "data": {
        "name": "a",
        "format": {"type": "json", "property": "data.histogram"},
        "url": "https://gist.githubusercontent.com/hemagso/f7b4381be43b34ece4d8aa78c936c7d5/raw/0bae0177b8a2a5d33e23c0d164d4439d248aa9ff/mock,json"
      },
      "encoding": {
        "x": {
          "bin": {"binned": true},
          "field": "bin_start",
          "scale": {"type": "linear"}
        },
        "x2": {"field": "bin_end"},
        "y": {"field": "count", "type": "quantitative"}
      },
      "mark": "bar"
    },
    {
      "data": {
        "name": "b",
        "format": {"type": "json", "property": "data.fill"},
        "url": "https://gist.githubusercontent.com/hemagso/f7b4381be43b34ece4d8aa78c936c7d5/raw/0bae0177b8a2a5d33e23c0d164d4439d248aa9ff/mock,json"
      },
      "encoding": {
        "color": {"field": "level", "type": "nominal"},
        "x": {"field": "level", "type": "nominal"},
        "y": {"field": "count", "type": "quantitative"}
      },
      "mark": "bar"
    }
  ]
}
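As a rough illustration of what the two "property" paths are meant to select from the same payload, here is a small Python sketch. extract_property is a hypothetical helper that only mimics the dotted-path lookup; it is not how Vega-Lite loads data internally.

```python
import json

# The mock payload from the question, inlined so the sketch is self-contained.
payload = json.loads("""
{
  "data": {
    "fill": [
      {"count": 30000, "level": "filled"},
      {"count": 50000, "level": "missing"}
    ],
    "histogram": [
      {"bin_end": 20, "bin_start": 0, "count": 1000},
      {"bin_end": 30, "bin_start": 20, "count": 20000}
    ]
  },
  "metadata": {}
}
""")


def extract_property(document, dotted_path):
    """Walk a dotted path such as 'data.histogram' through nested dicts."""
    node = document
    for key in dotted_path.split("."):
        node = node[key]
    return node


# One entry per named dataset, mirroring "name": "a" and "name": "b" above.
datasets = {
    "a": extract_property(payload, "data.histogram"),
    "b": extract_property(payload, "data.fill"),
}
```

Keeping the two extracted arrays under distinct names is the same idea as giving each chart's data its own name in the specification.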
Is there a Vega/Vega-Lite transform which I can use to select the first n rows in a data set?
Suppose I get a dataset from a URL such as:
Person   Height
Jeremy   6.2
Alice    6.0
Walter   5.8
Amy      5.6
Joe      5.5
and I want to create a bar chart showing the height of only the three tallest people. Assume that we know for certain that the dataset from the URL is already sorted. Assume that we cannot change the data as returned by the URL.
I want to do something like this:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "url": "heights.csv"
  },
  "transform": [
    {"head": 3}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Person", "type": "nominal"},
    "y": {"field": "Height", "type": "quantitative"}
  }
}
except that the head transform does not actually exist. Is there something else I can do to get the same effect?
The Vega-Lite documentation has an example along these lines in filtering top-k items.
Your case is a bit more specialized: you do not want to order based on rank, but rather based on the original ordering of the data. You can do this using a count-based window transform followed by an appropriate filter. For example (view in editor):
{
  "data": {
    "values": [
      {"Person": "Jeremy", "Height": 6.2},
      {"Person": "Alice", "Height": 6.0},
      {"Person": "Walter", "Height": 5.8},
      {"Person": "Amy", "Height": 5.6},
      {"Person": "Joe", "Height": 5.5}
    ]
  },
  "transform": [
    {"window": [{"op": "count", "as": "count"}]},
    {"filter": "datum.count <= 3"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Height", "type": "quantitative"},
    "y": {"field": "Person", "type": "nominal", "sort": null}
  }
}
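The logic of the two transforms (a running count, then a filter on that count) can be sketched in Python as follows. The head function is a hypothetical helper written for illustration; it simply mimics what the window and filter steps do to the rows.

```python
rows = [
    {"Person": "Jeremy", "Height": 6.2},
    {"Person": "Alice", "Height": 6.0},
    {"Person": "Walter", "Height": 5.8},
    {"Person": "Amy", "Height": 5.6},
    {"Person": "Joe", "Height": 5.5},
]


def head(rows, n):
    """Keep the first n rows by attaching a running count and filtering on it."""
    kept = []
    for count, row in enumerate(rows, start=1):
        numbered = {**row, "count": count}  # like the window transform's "as": "count"
        if numbered["count"] <= n:          # like the filter "datum.count <= 3"
            kept.append(numbered)
    return kept


top3 = head(rows, 3)
```

Because the count follows the incoming row order, this keeps the first three rows as delivered rather than the three largest values, which is what the question asks for given pre-sorted data.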
Given a stacked bar chart such as in this example: https://vega.github.io/editor/?#/examples/vega-lite/stacked_bar_weather
I want to control the order of the items in the aggregation so that, for example, 'fog' comes at the bottom, with 'sun' next etc. Is that possible?
The reason for this is I have one type that is much larger than the others. I want this to appear at the bottom, then control the domain to 'cut off' most of that section.
Thanks
You can control the stack order via the order encoding: see https://vega.github.io/vega-lite/docs/stack.html#sorting-stack-order
Unfortunately, this only allows sorting by field value, rather than by an explicit order as you want here. The workaround is to use a calculate transform to turn your explicit order into a field (view in editor):
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {"url": "data/seattle-weather.csv"},
  "transform": [
    {
      "calculate": "indexof(['sun', 'fog', 'drizzle', 'rain', 'snow'], datum.weather)",
      "as": "order"
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {
      "timeUnit": "month",
      "field": "date",
      "type": "ordinal",
      "axis": {"title": "Month of the year"}
    },
    "y": {"aggregate": "count", "type": "quantitative"},
    "color": {
      "field": "weather",
      "type": "nominal",
      "scale": {
        "domain": ["sun", "fog", "drizzle", "rain", "snow"],
        "range": ["#e7ba52", "#c7c7c7", "#aec7e8", "#1f77b4", "#9467bd"]
      },
      "legend": {"title": "Weather type"}
    },
    "order": {"field": "order", "type": "ordinal"}
  }
}
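The calculate step above turns each weather value into its position in the explicit list, and the order encoding then sorts by that number. A minimal Python sketch of the same mapping follows; index_of and STACK_ORDER are names invented for this example, and index_of returns -1 for values missing from the list to mirror the behaviour of an array indexOf lookup.

```python
# Explicit stack order, copied from the calculate expression in the spec.
STACK_ORDER = ["sun", "fog", "drizzle", "rain", "snow"]


def index_of(sequence, value):
    """Return the position of value in sequence, or -1 if it is absent."""
    try:
        return sequence.index(value)
    except ValueError:
        return -1


# Hypothetical weather values in arbitrary input order.
observed = ["rain", "sun", "snow", "fog"]

# Sorting by the derived index reproduces the explicit order.
ordered = sorted(observed, key=lambda weather: index_of(STACK_ORDER, weather))
```

To put a different category at the bottom of the stack, reorder the list itself; the derived index follows automatically.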
I am using the example named "Line Chart with Point Markers" as a reference, but I do not see any other example or any clues about conditional points, or points selected by symbol.
The illustration shows a typical case (see also SPC) where I need dots only on the blue central line.
You can do this by layering filtered versions of the dataset. Modifying the example you linked to, it might look something like this (vega editor):
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Stock prices of 5 Tech Companies over Time.",
  "data": {"url": "data/stocks.csv"},
  "encoding": {
    "x": {"timeUnit": "year", "field": "date", "type": "temporal"},
    "y": {"aggregate": "mean", "field": "price", "type": "quantitative"},
    "color": {"field": "symbol", "type": "nominal"}
  },
  "layer": [
    {
      "mark": {"type": "line", "point": true},
      "transform": [{"filter": "datum.symbol == 'GOOG'"}]
    },
    {
      "mark": {"type": "line"},
      "transform": [{"filter": "datum.symbol != 'GOOG'"}]
    }
  ]
}