I'm trying to embed this Vega-Lite diagram in my Antora docs: https://vega.github.io/vega-lite/examples/line_overlay.html
My AsciiDoc file looks like this:
... some text ...
== Attachments
[vegalite, rendered-vega-image, svg]
----
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"description": "Stock prices of 5 Tech Companies over Time.",
"data": {"url": "data/stocks.csv"},
"mark": {
"type": "line",
"point": true
},
"encoding": {
"x": {"timeUnit": "year", "field": "date"},
"y": {"aggregate":"mean", "field": "price", "type": "quantitative"},
"color": {"field": "symbol", "type": "nominal"}
}
}
----
... some more text ...
The problem is that the CSV file containing my data is not found. I tried every path I could imagine ... relative and absolute filesystem paths as well as relative and absolute HTTP URLs. Still, I always get this message:
[DONE] build ui bundle
Skipping vegalite block. No such file: http://localhost:8080/vegalite/svg/eNpVkEFywyAMRfc5BcN0mdibdpNtj9D0AAQrWA0gArKnnozvXoHbpl0Zv9F_X3DfKaWfih0hGH1UemRO5dj3MzjTOeRxOndI_TbQ6MEjQz8_dx-Fot7X_ADFZkyMAsTxxmSvKmW0UBRd1Is6gR3VK4VkIlY2Q1YnDNB95w3X8ruesq-Cu2E2dgwQuQyY1z5QRKaM0R08OVe_GRJlllNfal3pbJn12nTB5GvVyVn-eElQpR4jtDphiTCyQM4TCNliEC0NInxEP9tSLIu-S3-VLGCy3it9QfBDBbI6bLUyv7R541yWdxJ-1AFM_DffXqWSn7Vuk4mMbBjnh8iSp9xkv8GyhDP5v8lIAaPxeq032K1fIi6OZQ==
[DONE] build docs
The "Skipping vegalite block" message should not appear.
I tried Vega-Lite with data defined directly inside the JSON block. That snippet works, so Vega-Lite in general works (I'm using a Kroki server for diagram generation).
[vegalite, rendered-vega-image, svg]
----
{
"description": "A simple bar chart with embedded data.",
"data": {
"values": [
{"a": "A","b": 28}, {"a": "B","b": 55}, {"a": "C","b": 43},
{"a": "D","b": 91}, {"a": "E","b": 81}, {"a": "F","b": 53},
{"a": "G","b": 19}, {"a": "H","b": 87}, {"a": "I","b": 52}
]
},
"mark": "bar",
"encoding": {
"x": {"field": "a", "type": "ordinal"},
"y": {"field": "b", "type": "quantitative"}
}
}
----
Defining my data directly inside the diagram definition is not a solution for me. Right now I'm just trying to make it work with the demo CSV file, but afterwards I will switch to my own autogenerated, large CSV file. Both my real CSV and the demo file are located at docs/modules/technical-docs/assets/attachments/monitoring-logging-reporting/stocks.csv.
My general setup consists of several projects:
A project containing the Antora playbook and a shell script to generate my docs on my localhost
A project containing the AsciiDoc files, the CSV files and a bunch of other docs and source code
Several other projects containing AsciiDoc docs which are not relevant for the problem at hand
Anyone got any thoughts? Thanks for your help.
Kroki provides security levels that restrict access to files on the file system and on the network. Each level includes the restrictions enabled in the prior security level:
UNSAFE: disables any security features.
SAFE: assumes that the request sanitization done by the diagram library's secure mode is sufficient.
SECURE: prevents attempts to read files from the file system or from the network.
By default, Kroki is running in SECURE mode.
As a result, "data": {"url": "data/stocks.csv"} will be removed/ignored.
If you are using Asciidoctor Kroki, the preprocessor should resolve the path, read the content, and replace data.url with the actual values. See: https://github.com/Mogztter/asciidoctor-kroki
Alternatively, since you are running a local instance of Kroki, you can set the KROKI_SAFE_MODE environment variable to unsafe.
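For example, if you run Kroki with Docker (assuming the usual yuzutech/kroki image; adjust the port mapping to match your localhost:8080 setup):
docker run -d -p 8080:8000 -e KROKI_SAFE_MODE=unsafe yuzutech/kroki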
Related
I'm trying to calculate a value for domainMax on the Y-axis scale. I tried the following example where I want the Y-axis domainMax to be one greater than the maximum value in the dataset field named "value". The example produces the error 'Unrecognized signal name: "domMax"'. How can I get it to work?
{
"data": {
"values": [
{"date": "2021-03-01T00:00:00", "value": 1},
{"date": "2021-04-01T00:00:00", "value": 3},
{"date": "2021-05-01T00:00:00", "value": 2}
]
},
"transform": [
{ "calculate": "max(datum.value)+1","as": "domMax"}
],
"mark": "line",
"encoding": {
"x": {
"field": "date",
"type": "temporal"
},
"y": {"field": "value", "type": "quantitative",
"scale": {"domainMax": {"expr": "domMax"}}
}
}
}
This transform
"transform": [
{ "calculate": "max(datum.value)+1","as": "domMax"}
]
adds a new column to your data set; it does not create a new signal. You can check that in the editor: go to the Data Viewer tab and select data_0 from the drop-down. Can you see the new domMax column?
Signals are a different thing entirely - have a look here in the documentation. Note that the link points to Vega, not Vega-Lite. (Vega-Lite specifications are compiled to Vega.)
Vega-Lite does not let you declare signals; you declare parameters instead. Here is another example using the domMax parameter. Vega-Lite parameters are translated to Vega signals.
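For illustration, a minimal sketch along those lines (my own, reusing the data from your question; note the value is hard-coded):
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "params": [
    {"name": "domMax", "value": 4}
  ],
  "data": {
    "values": [
      {"date": "2021-03-01T00:00:00", "value": 1},
      {"date": "2021-04-01T00:00:00", "value": 3},
      {"date": "2021-05-01T00:00:00", "value": 2}
    ]
  },
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {
      "field": "value",
      "type": "quantitative",
      "scale": {"domainMax": {"expr": "domMax"}}
    }
  }
}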
It looks like you are trying to derive the value of your parameter/signal from the data. I am not sure you can do that in Vega-Lite.
On the other hand it's very easy in Vega. For example you could use the extent transform:
https://vega.github.io/vega/docs/transforms/extent/
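For illustration, a rough sketch (mine, not taken from the docs) of an extent transform inside a Vega data definition, exposing the computed [min, max] as a signal:
"data": [
  {
    "name": "table",
    "values": [{"value": 1}, {"value": 3}, {"value": 2}],
    "transform": [
      {"type": "extent", "field": "value", "signal": "y_domain"}
    ]
  }
]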
Side comment: while Vega specifications are more verbose, you can sometimes find their primitives simpler and a good way to understand how the visualisation works. (You can see the compiled Vega in the editor.)
I tried to get a custom domain based on the data but hit the same limitations as you did.
In my case, I update the data from the outside, a bit like the streaming example. I compute the domain bounds from the outside and modify them in the visualization with params. This is quite easy, since Vega-Lite params are exposed as Vega signals.
This is the gist of the layout:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"params": [
{
"name": "lowBound",
"value": -10
},
{
"name": "highBound",
"value": 100
}
],
../..
"vconcat": [
{
"name": "detailed",
../..
"layer": [
{
../..
"mark": "line",
"encoding": {
"y": {
"field": "value",
"title": "Temperature",
"type": "quantitative",
"scale": {
"domainMin": {
"expr": "lowBound"
},
"domainMax": {
"expr": "highBound"
}
}
},
...
The lowBound and highBound values are dynamically changed through Vega signals; I change them with the regular JS API.
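A rough sketch of that outside update, assuming vega-embed is loaded, spec holds the Vega-Lite spec above, and #vis is the target element (all names are illustrative):
// sample data; in practice this comes from wherever the streaming data originates
const data = [
  { date: "2021-03-01", value: 1 },
  { date: "2021-04-01", value: 3 },
  { date: "2021-05-01", value: 2 }
];

vegaEmbed("#vis", spec).then(({ view }) => {
  // compute the domain bounds outside of the visualization
  const values = data.map(d => d.value);
  // Vega-Lite params are exposed as Vega signals, so the View API can update them
  view
    .signal("lowBound", Math.min(...values) - 1)
    .signal("highBound", Math.max(...values) + 1)
    .run();
});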
You can add a param to pan and zoom in case your hard coded values are less than ideal.
"params": [{"name": "grid", "select": "interval", "bind": "scales"}],
Open the Chart in the Vega Editor
How can I fix the text overlap on the top axis in the example?
vega-lite example
This is what I ended up with.
Now, you will probably not like how I got there. In the editor I went to the Compiled Vega tab and edited the low-level Vega spec. The fix was:
Add a signal with the extent transform:
{"type": "extent", "field": "value", "signal": "y_domain"}
Derive two more signals from it:
{"name": "y_domain_min", "update": "y_domain[0]"},
{"name": "y_domain_max", "update": "y_domain[1] * 1.25"}
Note the multiplier.
Use the new signals for the domain definition.
{
"name": "y",
"type": "linear",
"domain": [{"signal": "y_domain_min"}, {"signal": "y_domain_max"}],
...
If a fix is possible directly in Vega-Lite, I'd like to know.
Google Cloud BigQuery - Load Data via JSON file
I am trying to load data into BigQuery (JSON Newline Delimited) from a JSON file.
I'm getting stuck trying to figure out what my "Schema" is / which one I should be using.
The JSON file is a file of products.
What I have tried so far...
NOTE: This is JUST for ONE product (of many); the same pattern then repeats for all the other products:
[{"sku": INTEGER,"name": "STRING", "type": "STRING", "price": FLOAT, "upc": "INTEGER", "category": [{"id": "STRING", "name": "STRING"}, {"id": "STRING", "name": "STRING"}, {"id": "STRING", "name": "STRING"}, {"id": "STRING", "name": "STRING"}], "shipping": FLOAT, "description": "STRING", "manufacturer": "STRING", "model":"STRING", "url": "STRING","image": "STRING"}]
NOTE: the "image" key is a URL to the image
UNLESS THERE IS ANOTHER WAY...
Is there a way to load the JSON file into BigQuery and have it "auto-generate" the table and dataset?
If you are using the CLI tool, then this is the schema for your data:
[{"name": "sku", "type": "INT64", "mode": "NULLABLE"},
{"name": "name", "type": "STRING", "mode": "NULLABLE"},
{"name": "type", "type": "STRING", "mode": "NULLABLE"},
{"name": "price", "type": "FLOAT", "mode": "NULLABLE"},
{"name": "upc", "type": "STRING", "mode": "NULLABLE"},
{"fields":
[{"name": "id", "type": "STRING", "mode": "NULLABLE"}, {"name": "name", "type": "STRING", "mode": "NULLABLE"}],
"name": "category", "type": "RECORD", "mode": "REPEATED"},
{"name": "shipping", "type": "FLOAT", "mode": "NULLABLE"},
{"name": "description", "type": "STRING", "mode": "NULLABLE"},
{"name": "manufacturer", "type": "STRING", "mode": "NULLABLE"},
{"name": "model", "type": "STRING", "mode": "NULLABLE"},
{"name": "url", "type": "STRING", "mode": "NULLABLE"},
{"name": "image", "type": "STRING", "mode": "NULLABLE"}]
You can save it in a file (such as "schema.json") and then run the command:
bq load --source_format=NEWLINE_DELIMITED_JSON dataset_id.test_table path/to/json_data path/to/schema.json
Where path/to/json_data is the path to your data. It can be either a path on your local machine (such as /documents/so/jsondata.json) or a path in Google Cloud Storage (such as gs://analyzes/json_data.json).
The schema must be on your local machine or specified inline on the command line; either way, in this operation it has to be specified.
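For example, with hypothetical local file names, the full invocation could look like:
bq load --source_format=NEWLINE_DELIMITED_JSON mydataset.products ./products.json ./schema.json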
Now, you mentioned in the comments to my first answer a type of operation where BigQuery does not require schemas.
You can indeed do so, but only for federated tables, that is, tables created using an external file as reference (these files are usually in GCS or Google Drive).
To do so, you would first have to have your JSON data in GCS, for instance, and then create the table in BQ. Using the CLI, this command creates the federated table using the JSON data from GCS as source:
bq mk --external_table_definition=#NEWLINE_DELIMITED_JSON=gs://bucket_name/jsondata.json dataset_id.table_test
This command does not have the schema specified, and BQ does its best to infer it from the data (I tested with your data and it worked just fine, but I could only use legacy SQL afterwards).
Keep in mind, though, that this process is not guaranteed to work every time, and you should use federated tables only if they meet the requirements of your project; otherwise it's easier and faster to load the data into BQ and run queries from there. In the second reference I suggested, you can read more about when it's best to use federated tables.
SOLUTION:
First, Will's solution is correct, and it will/should work 99% of the time.
NOTE: the dataset (JSON file) has around 5,000+ products and roughly 15,000 lines (depending on how you have it formatted).
However, this particular dataset, in conjunction with BigQuery (the way it wants the data), for some reason (I'm not sure of the exact reason) would not work as expected. I did narrow it down to what I think was causing the error: the "category": [{"id": 1234, "name": "ipod"}, etc.] section of each product. BigQuery seems to be pretty fussy with nested JSON data; you need to do it just right, and via the command line ONLY (no Web UI).
As soon as I deleted the "category" section (along with the ids/names) from the dataset and schema, I was able to load the data just fine.
This was, of course, only a very small sample of the dataset, as I wasn't going to sift through 5,000 products deleting each "category" section.
SOLUTION - use a CSV file (recommend being somewhat familiar with MS Excel):
STEPS I DID:
NOTE: Don't do a right-click "Open With" (in Windows) on the file.
NOTE: I imported the whole (5,000 products) data.json file as delimited text.
I opened Excel (v2016 in my case) FIRST (not the file) and created a BLANK spreadsheet.
Click on the "Data" tab on the ribbon at the top and select the option "From Text".
Change the view/file type to "All Files (*.*)" so you can see your JSON file.
Make sure "Delimited" is selected, and "437: OEM United States" (unless you need it to be something else), then click Next.
UN-select "Tab", select "Comma", and click Next.
Then you're going to want to select each column (inside the wizard) and change the "Column data format" to "Text", etc. When you're done, click Finish.
It didn't come in as perfect as I wanted (usually never does).
GETTING RID OF UNNECESSARY CHARACTERS ([ { : " ") AND TEXT:
Did a "Find and Replace" (Ctrl+F for Windows), Replace All for
each of the characters/text I didn't want or need. To do a "Delete",
just leave the "Replace" BLANK (so it replaces the text/character
with nothing).
Then, I filtered and sorted my data. This allowed me to find the
columns I didn't want, which was "catagory", "ID", and "NAME" (that
corresponds to ID, NOT NAME of product).
Once you get your data how you want it, do a "Save As". Change "Save as type:" to "CSV UTF-8 (Comma delimited) (*.csv)", and name your file anything, e.g. myfile.csv.
Then you can just use the Google BigQuery Web UI (or the command line if you want). For "Location / File Upload" you'll need to store the file on either Cloud Storage or Google Drive if it is too big; otherwise, just upload it from your local computer.
ALSO!!!! DON'T FORGET!!!
Under Options, select "Comma", and put a "1" in "Header rows to skip" (this is the row with the names at the top of each of your columns; you don't want to import the column names as data. They are just there to help keep things straight and sorted).
I'm trying to find an automatic way to generate a dmg file with my app inside, plus a nice-looking background and application icon. I have found an app which generates this, called appdmg:
https://www.npmjs.com/package/appdmg
My issue is that the JSON file is not understood and appdmg keeps complaining about a syntax issue.
Every example I found matches my syntax...
{
"title": "myApp",
"icon": "icon.ico",
"background": "banner.png",
"icon-size": 80,
"contents": [
{ "x": 192, "y": 344, "type": "link", "path": "/Applications" },
{ "x": 448, "y": 344, "type": "file", "path": “connect.app” }
]
}
Any idea, or another easy CLI tool? I need to plug the tool into a build server which generates the app automatically.
Thanks
Curly quotes are your syntax error. "connect.app" vs. “connect.app”
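With straight quotes, the last contents entry becomes:
{ "x": 448, "y": 344, "type": "file", "path": "connect.app" }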
I'm writing my first Avro schema, which uses JSON as the schema language. I know you cannot put comments into plain JSON, but I'm wondering whether the Avro tooling allows comments, e.g. perhaps it strips them (like a preprocessor) before parsing the JSON.
Edit: I'm using the C++ Avro toolchain.
Yes, but it is limited. In the schema, Avro data types 'record', 'enum', and 'fixed' allow for a 'doc' field that contains an arbitrary documentation string. For example:
{"type": "record", "name": "test.Weather",
"doc": "A weather reading.",
"fields": [
{"name": "station", "type": "string", "order": "ignore"},
{"name": "time", "type": "long"},
{"name": "temp", "type": "int"}
]
}
From the official Avro spec:
doc: a JSON string providing documentation to the user of this schema (optional).
https://avro.apache.org/docs/current/spec.html#schema_record
An example:
https://github.com/apache/avro/blob/33d495840c896b693b7f37b5ec786ac1acacd3b4/share/test/schemas/weather.avsc#L2
Yes, you can use C-style comments in an Avro JSON schema: /* something */ or // something. The Avro tools ignore these expressions during parsing.
EDIT: This only works with the Java API.
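For example, a small sketch of a schema with such comments (again, only the Java parser accepts them), reusing the weather record from the earlier answer:
/* A weather reading (this comment is only tolerated by the Java parser) */
{
  "type": "record",
  "name": "test.Weather",
  "fields": [
    {"name": "station", "type": "string"}, // reporting station id
    {"name": "temp", "type": "int"}
  ]
}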
According to the current (1.9.2) Avro specification, it's allowed to put in extra attributes that are not defined; they are treated as metadata.
This allows you to add comments like this:
{
"type": "record",
"name": "test",
"comment": "This is a comment",
"//": "This is also a comment",
"TODO": "As per this comment we should remember to fix this schema" ,
"fields" : [
{
"name": "a", "type": "long"
},
{
"name": "b", "type": "string"
}
]
}
No, it can't be done in the C++ nor the C# version (as of 1.7.5). If you look at the code, they just shove the JSON into the JSON parser without any comment preprocessing - bizarre programming style. Documentation and language support appear to be pretty sloppy...