How to draw a line in a line chart in Vega-Lite?

I have a curve plotted from an index:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.4.json",
  "title": {
    "text": "Receiver Operating Characteristics - Area Under Curve",
    "anchor": "middle",
    "fontSize": 16,
    "frame": "group",
    "offset": 4
  },
  "data": {
    "url": {
      "%context%": true,
      "index": "roccurve_index2",
      "body": {
        "size": 10000,
        "_source": ["lr_fpr", "lr_tpr"]
      }
    },
    "format": {"property": "hits.hits"}
  },
  "mark": {
    "type": "line",
    "point": true
  },
  "encoding": {
    "x": {"field": "_source.lr_fpr", "type": "quantitative", "title": "False Positive Rate"},
    "y": {"field": "_source.lr_tpr", "type": "quantitative", "title": "True Positive Rate"}
  }
}
The plot shows the ROC curve as a single line.
Now I need to draw a baseline for the base model: a straight line between 0 and 1.
Is this possible? I would also like the baseline to be dashed, with a legend showing the names Base Model and RF Model.

Yes, it is possible using Layered Views.
I'll use the Line Chart example to modify and add another line that's also dashed.
Original chart:
https://vega.github.io/editor/#/examples/vega-lite/line
Here's the modified chart, I used explicit values for the straight line:
https://vega.github.io/editor/#/gist/152fbe5f986ba78e422bb3430628f010/spec.json
Layer Solution
With a layered view, you can draw multiple lines in the same chart, sharing the same x and y axes:
"layer": [
  {
    //mark #1
  },
  {
    //mark #2
  }
]
Dashed Line
This can be achieved using the strokeDash property. See this example: Line chart with varying stroke dash
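As a rough sketch of how the two pieces could fit together, the spec below layers a solid curve over a dashed diagonal. The inline data points, the field names fpr, tpr and model, and the dash pattern [6, 4] are illustrative assumptions, not values from the question; in the original setup the first layer would read from the index instead. Each layer adds a constant model column with a calculate transform so that the shared color scale can produce a legend with both names.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "layer": [
    {
      "data": {
        "values": [
          {"fpr": 0.0, "tpr": 0.0},
          {"fpr": 0.1, "tpr": 0.6},
          {"fpr": 0.4, "tpr": 0.9},
          {"fpr": 1.0, "tpr": 1.0}
        ]
      },
      "transform": [{"calculate": "'RF Model'", "as": "model"}],
      "mark": {"type": "line", "point": true},
      "encoding": {
        "x": {"field": "fpr", "type": "quantitative", "title": "False Positive Rate"},
        "y": {"field": "tpr", "type": "quantitative", "title": "True Positive Rate"},
        "color": {"field": "model", "type": "nominal"}
      }
    },
    {
      "data": {"values": [{"fpr": 0, "tpr": 0}, {"fpr": 1, "tpr": 1}]},
      "transform": [{"calculate": "'Base Model'", "as": "model"}],
      "mark": {"type": "line", "strokeDash": [6, 4]},
      "encoding": {
        "x": {"field": "fpr", "type": "quantitative"},
        "y": {"field": "tpr", "type": "quantitative"},
        "color": {"field": "model", "type": "nominal"}
      }
    }
  ]
}
```

Because strokeDash is set as a mark property here rather than as an encoding, the legend will most likely distinguish the two lines by color only.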

Related

Vega-Lite Calculated Scale domainMax

I'm trying to calculate a value for domainMax on the Y-axis scale. I tried the following example where I want the Y-axis domainMax to be one greater than the maximum value in the dataset field named "value". The example produces the error 'Unrecognized signal name: "domMax"'. How can I get it to work?
{
  "data": {
    "values": [
      {"date": "2021-03-01T00:00:00", "value": 1},
      {"date": "2021-04-01T00:00:00", "value": 3},
      {"date": "2021-05-01T00:00:00", "value": 2}
    ]
  },
  "transform": [
    {"calculate": "max(datum.value)+1", "as": "domMax"}
  ],
  "mark": "line",
  "encoding": {
    "x": {
      "field": "date",
      "type": "temporal"
    },
    "y": {
      "field": "value",
      "type": "quantitative",
      "scale": {"domainMax": {"expr": "domMax"}}
    }
  }
}
This transform
"transform": [
  {"calculate": "max(datum.value)+1", "as": "domMax"}
]
adds a new column to your data set - it does not create a new signal. You can check that in the editor. Go to the DataViewer tab and select data_0 from the drop down. Can you see the new domMax column?
Signals are a different thing entirely - have a look here in the documentation. Note that the link points to Vega, not Vega-Lite. (Vega-Lite specifications are compiled to Vega.)
Vega-Lite does not let you declare signals; you declare parameters instead. Here is another example using the domMax parameter. Vega-Lite parameters are translated to Vega signals.
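A minimal sketch of what declaring such a parameter might look like, reusing the data from the question. The fixed value 4 is an assumption chosen by hand (the largest value in the sample data plus one); it is not derived from the data.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "params": [{"name": "domMax", "value": 4}],
  "data": {
    "values": [
      {"date": "2021-03-01T00:00:00", "value": 1},
      {"date": "2021-04-01T00:00:00", "value": 3},
      {"date": "2021-05-01T00:00:00", "value": 2}
    ]
  },
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {
      "field": "value",
      "type": "quantitative",
      "scale": {"domainMax": {"expr": "domMax"}}
    }
  }
}
```

Here the expression "domMax" resolves because the parameter exists as a signal in the compiled Vega spec, which is what the calculate transform could not provide.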
It looks like you are trying to derive the value of your parameter/signal from the data. I am not sure you can do that in Vega-Lite.
On the other hand it's very easy in Vega. For example you could use the extent transform:
https://vega.github.io/vega/docs/transforms/extent/
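The following Vega (not Vega-Lite) spec is a sketch of that idea under a few assumptions: the data set name table, the signal name valueExtent, and the chart size are made up for illustration. The extent transform writes the [min, max] of the value field into a signal, and the y scale reads the maximum from it and adds one.

```json
{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "width": 300,
  "height": 200,
  "data": [
    {
      "name": "table",
      "values": [
        {"date": "2021-03-01T00:00:00", "value": 1},
        {"date": "2021-04-01T00:00:00", "value": 3},
        {"date": "2021-05-01T00:00:00", "value": 2}
      ],
      "transform": [
        {"type": "formula", "expr": "toDate(datum.date)", "as": "date"},
        {"type": "extent", "field": "value", "signal": "valueExtent"}
      ]
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "time",
      "range": "width",
      "domain": {"data": "table", "field": "date"}
    },
    {
      "name": "y",
      "type": "linear",
      "range": "height",
      "nice": false,
      "domain": {"data": "table", "field": "value"},
      "domainMax": {"signal": "valueExtent[1] + 1"}
    }
  ],
  "axes": [
    {"orient": "bottom", "scale": "x"},
    {"orient": "left", "scale": "y"}
  ],
  "marks": [
    {
      "type": "line",
      "from": {"data": "table"},
      "encode": {
        "enter": {
          "x": {"scale": "x", "field": "date"},
          "y": {"scale": "y", "field": "value"}
        }
      }
    }
  ]
}
```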
Side comment - while Vega specifications are more verbose you can sometimes find their primitives simpler and a good way to understand how the visualisation works. (You can see compiled Vega in the editor.)
I tried to get a custom domain based on the data but hit the same limitations as you did.
In my case, I update the data from the outside, a bit like the streaming example. I compute the domain bounds outside the chart and update them in the visualization through params. This is quite easy because Vega-Lite params are exposed as Vega signals.
This is the gist of the layout:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"params": [
{
"name": "lowBound",
"value": -10
},
{
"name": "highBound",
"value": 100
}
],
../..
"vconcat": [
{
"name": "detailed",
../..
"layer": [
{
../..
"mark": "line",
"encoding": {
"y": {
"field": "value",
"title": "Temperature",
"type": "quantitative",
"scale": {
"domainMin": {
"expr": "lowBound"
},
"domainMax": {
"expr": "highBound"
}
}
},
...
The lowBound and highBound params are changed dynamically through Vega signals. I change them with the regular JS API.
You can add a param to pan and zoom in case your hard-coded values are less than ideal.
"params": [{"name": "grid", "select": "interval", "bind": "scales"}],
Open the Chart in the Vega Editor
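For reference, a self-contained sketch of the pan and zoom suggestion, applied to the data from the question. The hard-coded domainMax of 4 is an assumption (the largest sample value plus one); the grid param binds an interval selection to the scales so the view can be dragged and zoomed if that value turns out to be a poor fit.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"date": "2021-03-01T00:00:00", "value": 1},
      {"date": "2021-04-01T00:00:00", "value": 3},
      {"date": "2021-05-01T00:00:00", "value": 2}
    ]
  },
  "params": [{"name": "grid", "select": "interval", "bind": "scales"}],
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {
      "field": "value",
      "type": "quantitative",
      "scale": {"domainMax": 4}
    }
  }
}
```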

How to draw vertical lines based on a different quantity than that for the x channel in charts by row channel?

I'd like to have multiple time series drawn by row channel ("field": "PLATFORM"), x channel: ("field": "estimating-date-time"), and y channel ("field": "eta-variance").
Besides the time series lines, I'd like to draw a vertical line at x = arrival-time, which is another field, conditioned on the value of "PLATFORM".
The following is a working example of the charts except the desirable vertical line in each chart:
vega-lite for multiple time series
Below is the desired effect with manual illustration:
My question is how to add the vertical line for each chart to the specifications?
The challenge for me is that the field "arrival-time", which supplies the value for the vertical line, is not the same as the field on the chart's x channel, "estimating-date-time". I've only found examples that draw such a line using a value from the same field as the x channel.
You can do this by nesting a layer specification within a facet operator; something like this (open in editor):
{
  "facet": {"row": {"field": "PLATFORM"}},
  "spec": {
    "height": 80,
    "width": 300,
    "layer": [
      {
        "mark": "line",
        "encoding": {
          "x": {"field": "estimating-date-time", "type": "temporal"},
          "y": {"field": "ETA-variance", "type": "quantitative"}
        }
      },
      {
        "mark": "rule",
        "encoding": {"x": {"field": "arrival-time", "type": "temporal"}}
      }
    ]
  },
  "data": {...}
}
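To try the pattern without the original data, here is a sketch with made-up inline values. The platform names, timestamps and variance numbers are invented for illustration; only the field names follow the spec above. Every row of a platform carries the same arrival-time, so the rule layer draws one overlapping rule per row at that position.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"PLATFORM": "A", "estimating-date-time": "2021-01-01T08:00:00", "ETA-variance": 12, "arrival-time": "2021-01-01T10:00:00"},
      {"PLATFORM": "A", "estimating-date-time": "2021-01-01T09:00:00", "ETA-variance": 5, "arrival-time": "2021-01-01T10:00:00"},
      {"PLATFORM": "A", "estimating-date-time": "2021-01-01T11:00:00", "ETA-variance": 1, "arrival-time": "2021-01-01T10:00:00"},
      {"PLATFORM": "B", "estimating-date-time": "2021-01-01T08:00:00", "ETA-variance": 8, "arrival-time": "2021-01-01T09:30:00"},
      {"PLATFORM": "B", "estimating-date-time": "2021-01-01T09:00:00", "ETA-variance": 3, "arrival-time": "2021-01-01T09:30:00"},
      {"PLATFORM": "B", "estimating-date-time": "2021-01-01T11:00:00", "ETA-variance": 0, "arrival-time": "2021-01-01T09:30:00"}
    ]
  },
  "facet": {"row": {"field": "PLATFORM", "type": "nominal"}},
  "spec": {
    "height": 80,
    "width": 300,
    "layer": [
      {
        "mark": "line",
        "encoding": {
          "x": {"field": "estimating-date-time", "type": "temporal"},
          "y": {"field": "ETA-variance", "type": "quantitative"}
        }
      },
      {
        "mark": {"type": "rule", "color": "red"},
        "encoding": {"x": {"field": "arrival-time", "type": "temporal"}}
      }
    ]
  }
}
```

Because both layers share the x scale within each facet, the rule lands at the arrival time on the same temporal axis as the line.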

Data-layout, layers and legends in vega-lite

I have a very simple situation, and I believe my solution is too complicated and there's a good chance I'm missing something. Say I have measures of time, positions (x,y,z), angles (roll, pitch, yaw) and speed. I want a simple visualization like I currently have where the speed plot can be used as "brush" to zoom dynamically into the first two graphs.
A small example of my plot in the vega-editor can be found here.
1. Can I use a different data-layout?
Right now, each point is an object
{
"pitch": -0.006149084584096612,
"roll": 0.0007914191778949736,
"speed": 4.747345444390669,
"time": 0.519741,
"x": -0.01731604791076788,
"y": 0.020068310429957575,
"yaw": 0.0038123065311157552,
"z": -0.016005977140476142
}
With many data-points, this is a lot of memory just for repeating column names. Much better would be to have the data in the form
{
"time": [t1, t2, t3, ...],
"x": [...],
...
}
but vega's "row first" representation doesn't allow for that. I already asked on Slack where someone suggested to use Fold and Pivot, but I'm not sure how to implement this. Is it possible to use data that are stored as arrays? I'm creating the data myself from a C++ program and I'm free to export a different representation easily. The only question is how do I make vega-lite understand?
2. Layers and legends.
If I had time-series data with an "indicator column", I could create plots that combine several graphs easily. Unfortunately, I don't have that and the only solution I found is to use layers. With this, I have to set the colours for different graphs explicitly (instead of using schemes) and I don't get a legend.
If layers are really the only option here to combine, e.g., x, y, z into one "Movement" plot, how can I get a legend for this plot that tells me red -> x, green -> y, and blue -> z?
The answer is "yes" to both of your questions.
The key to the first question is to pass the data in a dense format and use the Flatten Transform to expand it.
The key to the second question is to use a Fold Transform to turn multiple columns into an indicator plus a value.
Here is a demonstration of this for a single chart (open in editor):
{
  "data": {
    "values": [
      {
        "time": [1, 2, 3, 4],
        "x": [5, 4, 5, 2],
        "y": [2, 3, 2, 4],
        "z": [1, 2, 1, 0]
      }
    ]
  },
  "transform": [
    {"flatten": ["time", "x", "y", "z"]},
    {"fold": ["x", "y", "z"], "as": ["column", "value"]}
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "time", "type": "quantitative"},
    "y": {"field": "value", "type": "quantitative"},
    "color": {"field": "column", "type": "nominal"}
  }
}
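The same transforms might be extended to the multi-chart layout described in the question, roughly as sketched below. The speed values and the param name brush are assumptions added for illustration. The flatten transform sits at the top level so both charts share the expanded rows, the fold is applied only in the chart that needs the indicator column, and the speed chart defines an interval selection that sets the x domain of the first chart.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {
        "time": [1, 2, 3, 4],
        "x": [5, 4, 5, 2],
        "y": [2, 3, 2, 4],
        "z": [1, 2, 1, 0],
        "speed": [4.7, 4.9, 5.2, 5.0]
      }
    ]
  },
  "transform": [{"flatten": ["time", "x", "y", "z", "speed"]}],
  "vconcat": [
    {
      "transform": [{"fold": ["x", "y", "z"], "as": ["column", "value"]}],
      "mark": "line",
      "encoding": {
        "x": {
          "field": "time",
          "type": "quantitative",
          "scale": {"domain": {"param": "brush"}}
        },
        "y": {"field": "value", "type": "quantitative"},
        "color": {"field": "column", "type": "nominal"}
      }
    },
    {
      "params": [
        {"name": "brush", "select": {"type": "interval", "encodings": ["x"]}}
      ],
      "mark": "line",
      "encoding": {
        "x": {"field": "time", "type": "quantitative"},
        "y": {"field": "speed", "type": "quantitative"}
      }
    }
  ]
}
```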

How to build pre-calculated histogram in Vega-Lite?

Vega-Lite can bin and aggregate by itself, but I have a complex calculation and build the histogram separately.
The resulting data is as follows:
bins = [1, 2, 3, 4] // 4 edges
// |1-2|2-3|3-4| // 3 bars
counts = [1, 2, 1]
The problem is how to display the bar edges properly: there are 3 bars but 4 edges.
You can specify bin start and endpoints using the x and x2 encodings. It's also helpful to specify bin='binned' which tells Vega-Lite that the data is pre-binned & triggers the same display defaults used when a bin operation appears in the specification. For example (editor link):
{
  "data": {
    "values": [
      {"bin1": 1, "bin2": 2, "counts": 1},
      {"bin1": 2, "bin2": 3, "counts": 2},
      {"bin1": 3, "bin2": 4, "counts": 1}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "bin1", "type": "quantitative", "bin": "binned"},
    "x2": {"field": "bin2"},
    "y": {"field": "counts", "type": "quantitative"}
  }
}
For more information, see Using Vega-Lite with Binned data.

Is it possible to use facets and repeat operator for histograms?

I want to combine facet operators (row, column) with the repeat operator to create 'small multiple' charts that display different data variables. This works for some types of charts (e.g. simple bar charts) but not others (e.g. histograms). For example, below I have modified the 'Horizontally repeated charts' example (https://vega.github.io/vega-lite/examples/repeat_histogram.html).
{
  "$schema": "https://vega.github.io/schema/vega-lite/v3.json",
  "repeat": {"column": ["Horsepower", "Miles_per_Gallon", "Acceleration"]},
  "spec": {
    "data": {"url": "data/cars.json"},
    "mark": "bar",
    "encoding": {
      "row": {"field": "Origin", "type": "nominal"},
      "x": {
        "field": {"repeat": "column"},
        "bin": true,
        "type": "quantitative"
      },
      "y": {"aggregate": "count", "type": "quantitative"}
    }
  }
}
I expect three rows, with each row showing histograms of cars from a different country. However, this code results in the error:
'Error: Undefined data set name: "scale_child_Miles_per_Gallon_child_main"'
I'm reasonably sure that this worked with Vega-Lite v2. Is there some reason that the aggregate / bin operator can't work with a combination of facets and repeats?