vega-lite bar chart stacked colored values - vega-lite

"data": {"values": [{
"key": "test1",
"doc_count": 14,
"misc": {
"min": 5,
"max": 8,
"avg": 6.5
}
},
{
"key": "test2",
"doc_count": 14,
"misc": {
"min": 2,
"max": 8,
"avg": 4.5
}
}]}
Given this data, I need to draw a stacked bar chart with three colors per bar, one each for min, avg, and max.
So far I cannot find a solution, because this is already an aggregation coming from Elastic, and every stacked bar chart example I have seen uses the color scale for the values of a single field, whereas I need the same for three separate fields.
Is this possible with this source data?

You may want to use the fold transform to convert the data into long format.
{"fold": ["misc.min", "misc.max", "misc.avg"]}
should work. If it doesn't work because the data is nested, you can use calculate to flatten each field first (e.g., {"calculate": "datum.misc.min", "as": "min"}) and then fold the flattened fields.
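As a rough illustration of the calculate-then-fold approach, here is a Python sketch that builds such a spec as a dictionary and also mimics the reshaping in plain Python so the resulting long format is visible. The helper name fold_misc and the output field names "stat" and "value" are illustrative choices, not part of the original answer, and the helper is a simplification that keeps only the key field.

```python
import json

# Sample rows copied from the question.
rows = [
    {"key": "test1", "doc_count": 14, "misc": {"min": 5, "max": 8, "avg": 6.5}},
    {"key": "test2", "doc_count": 14, "misc": {"min": 2, "max": 8, "avg": 4.5}},
]

STATS = ["min", "avg", "max"]


def fold_misc(rows, stats=STATS):
    """Mimic calculate + fold: one output row per (input row, statistic).

    Simplified: only "key" is carried over, whereas Vega-Lite's fold
    keeps all original fields on each output row.
    """
    long_rows = []
    for row in rows:
        for stat in stats:
            long_rows.append(
                {"key": row["key"], "stat": stat, "value": row["misc"][stat]}
            )
    return long_rows


# The same reshaping expressed as a Vega-Lite spec.
# "stat" and "value" are assumed names for the folded columns.
spec = {
    "data": {"values": rows},
    "transform": (
        # Flatten each nested field to a top-level field first.
        [{"calculate": f"datum.misc.{s}", "as": s} for s in STATS]
        # Then fold the flattened fields into (stat, value) pairs.
        + [{"fold": STATS, "as": ["stat", "value"]}]
    ),
    "mark": "bar",
    "encoding": {
        "x": {"field": "key", "type": "nominal"},
        "y": {"field": "value", "type": "quantitative"},
        "color": {"field": "stat", "type": "nominal"},
    },
}

long_rows = fold_misc(rows)
print(json.dumps(spec, indent=2))
```

The printed JSON can be pasted into the Vega editor. Each bar then gets one colored segment per statistic, because the color channel is driven by the single folded "stat" field.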

Related

Vega lite: How to change value labels

I could not find a way to change the labels of the X axis.
I have this data, and I need to show bars whose titles are taken from the label field. Currently the bars are aggregated by id, and the goal is to label each bar with its label even if the texts are the same. I'm confused by the examples and don't know whether I should create a layer or some signals. Could someone help me? IMHO this should be trivial, but I did not find anything useful.
{
"id": 15971,
"label": "Click Next to continue",
"result": "Success",
"value": 2
},
{
"id": 15972,
"label": "Click Next to continue",
"result": "No data",
"value": 0
},
There's not really any way to do this in the Vega-Lite grammar, because what you want to do is semantically inconsistent: if you group values by one column, there is no guarantee that the values in another column will match within that grouping.
One option you have is to use labelExpr to define the desired label manually, e.g.
"x": {
"field": "id",
"type": "nominal",
"axis": {"title": "x", "labelExpr": "'Click Next to continue'"},
"scale": {"type": "point"}
},

Data-layout, layers and legends in vega-lite

I have a very simple situation, and I believe my solution is too complicated and there's a good chance I'm missing something. Say I have measures of time, positions (x,y,z), angles (roll, pitch, yaw) and speed. I want a simple visualization like I currently have where the speed plot can be used as "brush" to zoom dynamically into the first two graphs.
A small example of my plot in the vega-editor can be found here.
1. Can I use a different data-layout?
Right now, each point is an object
{
"pitch": -0.006149084584096612,
"roll": 0.0007914191778949736,
"speed": 4.747345444390669,
"time": 0.519741,
"x": -0.01731604791076788,
"y": 0.020068310429957575,
"yaw": 0.0038123065311157552,
"z": -0.016005977140476142
}
With many data-points, this is a lot of memory just for repeating column names. Much better would be to have the data in the form
{
"time": [t1, t2, t3, ...],
"x": [...],
...
}
but vega's "row first" representation doesn't allow for that. I already asked on Slack where someone suggested to use Fold and Pivot, but I'm not sure how to implement this. Is it possible to use data that are stored as arrays? I'm creating the data myself from a C++ program and I'm free to export a different representation easily. The only question is how do I make vega-lite understand?
2. Layers and legends.
If I had time-series data with an "indicator column", I could create plots that combine several graphs easily. Unfortunately, I don't have that and the only solution I found is to use layers. With this, I have to set the colours for different graphs explicitly (instead of using schemes) and I don't get a legend.
If layers are really the only option here to combine, e.g., x, y, z into one "Movement" plot, how can I get a legend for this plot that tells me red -> x, green -> y, and blue -> z?
The answer is "yes" to both of your questions.
The key to the first question is to pass the data in a dense format and use the Flatten Transform to expand it.
The key to the second question is to use a Fold Transform to turn multiple columns into an indicator plus a value.
Here is a demonstration of this for a single chart (open in editor):
{
"data": {
"values": [
{
"time": [1, 2, 3, 4],
"x": [5, 4, 5, 2],
"y": [2, 3, 2, 4],
"z": [1, 2, 1, 0]
}
]
},
"transform": [
{"flatten": ["time", "x", "y", "z"]},
{"fold": ["x", "y", "z"], "as": ["column", "value"]}
],
"mark": "line",
"encoding": {
"x": {"field": "time", "type": "quantitative"},
"y": {"field": "value", "type": "quantitative"},
"color": {"field": "column", "type": "nominal"}
}
}
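To make the two transforms concrete, the following Python sketch mimics what flatten and fold do to the dense record above. The functions flatten and fold here are hypothetical stand-ins written for illustration, not Vega-Lite APIs, and they simplify one detail: zip() stops at the shortest array, whereas Vega-Lite pads shorter arrays with null.

```python
def flatten(record, fields):
    """Mimic flatten: one row per index across the parallel arrays."""
    columns = [record[name] for name in fields]
    return [dict(zip(fields, values)) for values in zip(*columns)]


def fold(rows, fields, as_=("key", "value")):
    """Mimic fold: one row per (input row, folded field).

    The original fields are kept on every output row, and the folded
    field's name and value are added under the names given in as_.
    """
    key_name, value_name = as_
    out = []
    for row in rows:
        for name in fields:
            out.append({**row, key_name: name, value_name: row[name]})
    return out


# Dense, column-oriented record from the example spec.
dense = {
    "time": [1, 2, 3, 4],
    "x": [5, 4, 5, 2],
    "y": [2, 3, 2, 4],
    "z": [1, 2, 1, 0],
}

wide_rows = flatten(dense, ["time", "x", "y", "z"])
long_rows = fold(wide_rows, ["x", "y", "z"], as_=("column", "value"))

print(wide_rows[0])
print(long_rows[:3])
```

After the fold step, the "column" field acts as the indicator column, which is why a single color encoding is enough to produce a legend.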

Query All Elements in Nested JSON Array PostgreSQL

I am trying to write a SQL query to retrieve DNS answer information so that I can visualize it in Grafana with the help of TimescaleDB. Right now, I am struggling to get Postgres to return more than one array element at a time. The structure of the JSON that I am trying to query looks like this:
{
"Z": 0,
"AA": 0,
"ID": 56559,
"QR": 1,
"RA": 1,
"RD": 1,
"TC": 0,
"RCode": 0,
"OpCode": 0,
"answer": [
{
"ttl": 19046,
"name": "i.stack.imgur.com",
"type": 5,
"class": 1,
"rdata": "i.stack.imgur.com.cdn.cloudflare.net"
},
{
"ttl": 220,
"name": "i.stack.imgur.com.cdn.cloudflare.net",
"type": 1,
"class": 1,
"rdata": "104.16.30.34"
},
{
"ttl": 220,
"name": "i.stack.imgur.com.cdn.cloudflare.net",
"type": 1,
"class": 1,
"rdata": "104.16.31.34"
},
{
"ttl": 220,
"name": "i.stack.imgur.com.cdn.cloudflare.net",
"type": 1,
"class": 1,
"rdata": "104.16.0.35"
}
],
"ANCount": 13,
"ARCount": 0,
"QDCount": 1,
"question": [
{
"name": "i.stack.imgur.com",
"qtype": 1,
"qclass": 1
}
]
}
There can be any number of answers, including zero, so I would like to figure out a way to query all answers. For example, I am trying to retrieve the ttl field from every element of the answer array. I can query a specific index, but I have trouble querying all occurrences.
This works for querying a single index:
SELECT (data->'answer'->>0)::json->'ttl'
FROM dns;
When I looked around, I found this as a potential solution for querying all indices within the array, but it did not seem to work and told me "cannot extract elements from a scalar":
SELECT answer->>'ttl' ttl
FROM dns, jsonb_array_elements(data->'answer') answer, jsonb_array_elements(answer->'ttl') ttl
Using jsonb_array_elements() will give you a row for every object in the answer array. You can then dereference that object:
select a.obj->>'ttl' as ttl, a.obj->>'name' as name, a.obj->>'rdata' as rdata
from dns d
cross join lateral jsonb_array_elements(data->'answer') as a(obj)

Work around to display vega-lite graph based on period time column?

I am asking this question as a follow-up to :
Original Question
The basic requirement is very simple:
To display sport competition results on a graph, based on human readable time period.
For example top 8 Men's 800m , from Rio 2016.
Rank Name Time
1 David Lekuta Rudisha 1:42.15
2 Taoufik Makhloufi 1:42.61
3 Clayton Murphy 1:42.93
4 Pierre-Ambroise Bosse 1:43.41
5 Ferguson Cheruiyot Rotich 1:43.55
6 Marcin Lewandowski 1:44.20
7 Alfred Kipketer 1:46.02
8 Boris Berian 1:46.15
There were some issues, such as:
The zero-point for a timestamp is not well defined, so a bar mark is not a good fit for temporal data.
I would appreciate any workaround for displaying time-period results that solves this problem.
Thanks
Yoav
Vega-Lite does not have any native data type to represent time periods; it only has a data type representing timestamps. When using bar marks for timestamps, the zero-point is context-dependent, and so Vega-Lite will not try to infer it for you.
For your data, I would probably approach it as follows:
1. Use a parse argument in your data to specify the expected format of your timestamps, as in the original question.
2. Use a timeUnit transform to manually compute a relevant zero-point for your data: here a yearmonthdate timeUnit works well because it strips away hours, minutes, and seconds.
3. Use a y2 encoding in your bar to specify this zero-point for your bar mark.
Put together, the result might look something like this (Vega Editor):
{
"data": {
"values": [
{"Rank": 1, "Name": "David Lekuta Rudisha", "Time": "1:42.15"},
{"Rank": 2, "Name": "Taoufik Makhloufi", "Time": "1:42.61"},
{"Rank": 3, "Name": "Clayton Murphy", "Time": "1:42.93"},
{"Rank": 4, "Name": "Pierre-Ambroise Bosse", "Time": "1:43.41"},
{"Rank": 5, "Name": "Ferguson Cheruiyot Rotich", "Time": "1:43.55"},
{"Rank": 6, "Name": "Marcin Lewandowski", "Time": "1:44.20"},
{"Rank": 7, "Name": "Alfred Kipketer", "Time": "1:46.02"},
{"Rank": 8, "Name": "Boris Berian", "Time": "1:46.15"}
],
"format": {"parse": {"Time": "date:'%M:%S.%L'"}}
},
"transform": [
{"timeUnit": "yearmonthdate", "field": "Time", "as": "zeropoint"}
],
"mark": "bar",
"encoding": {
"x": {"field": "Name", "type": "nominal"},
"y": {
"field": "Time",
"timeUnit": "minutessecondsmilliseconds",
"type": "temporal",
"title": "Time"
},
"y2": {"field": "zeropoint", "timeUnit": "minutessecondsmilliseconds"}
},
"$schema": "https://vega.github.io/schema/vega-lite/v4.0.0.json"
}

How to build pre-calculated histogram in Vega-Lite?

Vega-Lite can bin and aggregate by itself, but I have a complex calculation, so I build the histogram separately.
The resulting data is as follows:
bins = [1, 2, 3, 4] // 4 edges
// |1-2|2-3|3-4| // 3 bars
counts = [1, 2, 1]
The problem is how to display the bar edges properly: there are 3 bars but 4 edges.
You can specify bin start and endpoints using the x and x2 encodings. It's also helpful to specify "bin": "binned", which tells Vega-Lite that the data is pre-binned and triggers the same display defaults used when a bin operation appears in the specification. For example (editor link):
{
"data": {
"values": [
{"bin1": 1, "bin2": 2, "counts": 1},
{"bin1": 2, "bin2": 3, "counts": 2},
{"bin1": 3, "bin2": 4, "counts": 1}
]
},
"mark": "bar",
"encoding": {
"x": {"field": "bin1", "type": "quantitative", "bin": "binned"},
"x2": {"field": "bin2"},
"y": {"field": "counts", "type": "quantitative"}
}
}
For more information, see Using Vega-Lite with Binned data.
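The question's separate bins and counts arrays still need to be turned into the per-bar rows used in the spec above. A minimal Python sketch of that preprocessing step follows. The helper name to_binned_rows is hypothetical, and the field names bin1, bin2, and counts are simply taken from the example spec.

```python
def to_binned_rows(edges, counts):
    """Pair consecutive bin edges with their counts.

    N + 1 edges describe N bars, so each bar gets a start edge (bin1),
    an end edge (bin2), and its count.
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("expected exactly one more edge than counts")
    return [
        {"bin1": start, "bin2": end, "counts": count}
        for start, end, count in zip(edges, edges[1:], counts)
    ]


# Values from the question: 4 edges, 3 bars.
bins = [1, 2, 3, 4]
counts = [1, 2, 1]

rows = to_binned_rows(bins, counts)
print(rows)
```

The resulting list can be used directly as the "values" array of the spec, which resolves the mismatch between 3 bars and 4 edges.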