How to create complex JSON config maps in q? - json

Is there a good way in q to input somewhat large complicated nested dictionaries which represent/will be converted to json? I'm trying to control the echarts javascript library which basically just renders charts based on json config options. What I'm doing now is:
opt.title.text:"my chart"
opt.xAxis.data:til 100
opt.series.data:100?5
opt.series.type:`line
toClient[opt] /serializes and sends to browser
but is there an obvious way to get rid of the intermediate assignment? Is making a function to take key-path/value pairs and turn them into a dictionary the way to go or is there a better way to go about this?
Or is this something that should be avoided in q, and instead just manually set write q to set specific options and handle the json object map in the javascript client?

Not sure if this is really what you are looking for, but you can create the nested dictionary structure directly if that's what you're after?
q)`title`xAxis`series!(enlist[`text]!enlist"my chart";enlist[`data]!enlist til 100;`data`type!(100?5;`line))
title | (,`text)!,"my chart"
xAxis | (,`data)!,0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 ..
series| `data`type!(0 1 1 3 3 3 2 2 4 1 3 3 1 4 0 4 4 4 2 4 3 3 4 0 4 0 0 1 0..

Related

Pandas: flattening repeating/wrapped columns in csv file

It often happens that data will be given to you with wrapped columns. Consider, for example:
CCY Decimals CCY Decimals CCY Decimals
AUD/CAD 5 EUR/CZK 4 GBP/NOK 5
AUD/CHF 5 EUR/DKK 5 GBP/NZD 5
AUD/DKK 5 EUR/GBP 5 GBP/PLN 5
AUD/JPY 3 EUR/HKD 5 GBP/SEK 5
AUD/NOK 5 EUR/HUF 3 GBP/SGD 5
...
Which should be parsed as a dataframe of two columns (CCY and Decimals), not six. My question is, what is the most idiomatic way of achieving this?
I would have wanted something like the following:
data = pd.read_csv("file.csv")
data.groupby(axis=1,by=data.columns.map(lambda s: s.replace("\..",""))).\
apply(lambda df : df.values.flatten())
When reading the csv file we end up with columns CCY,Decimals,CCY.1,Decimals.1 .. etc. The groupby operation returns a collection of data frames:
<pandas.core.groupby.DataFrameGroupBy object at 0x3a52b10>
Which we would then flatten using numpy functionality. So we would are converting DataFrames with repeating columns into Series, and then merging these into a result DF.
However, this doesn't work. I've tried passing the different keys arguments to groupBy, but it always complains about being unable to reindex non-unique columns.
There are a number of existing questions that deal with flattening groups of columns (e.g. "Flattening" output of group.nth in Pandas), but I can't find any that do this for repeating columns.
To use groupby, I'd do:
>>> groups = df.groupby(axis=1,by=lambda x: x.rsplit(".",1)[0])
>>> pd.DataFrame({k: v.values.flat for k,v in groups})
CCY Decimals
0 AUD/CAD 5
1 EUR/CZK 4
2 GBP/NOK 5
3 AUD/CHF 5
4 EUR/DKK 5
5 GBP/NZD 5
6 AUD/DKK 5
7 EUR/GBP 5
8 GBP/PLN 5
9 AUD/JPY 3
10 EUR/HKD 5
11 GBP/SEK 5
12 AUD/NOK 5
13 EUR/HUF 3
14 GBP/SGD 5
[15 rows x 2 columns]
and then sort.

How to apply a formula for removing data noise in R?

I am working on NGSim Traffic data, having 18 columns and 1180598 rows in a text file. I want to smooth the position data, in the column 'Local Y'. I know there are built-in functions for data smoothing in R but none of them seem to match with the formula I am required to apply. The data in text file looks something like this:
Index VehicleID Total_Frames Local Y
1 2 5 35.381
2 2 5 39.381
3 2 5 43.381
4 2 5 47.38
5 2 5 51.381
6 4 8 504.828
7 4 8 508.325
8 4 8 512.841
9 4 8 516.338
10 4 8 520.854
11 4 8 524.592
12 4 8 528.682
13 4 8 532.901
14 5 7 39.154
15 5 7 43.153
16 5 7 47.154
17 5 7 51.154
18 5 7 55.153
19 5 7 59.154
20 5 7 63.154
The above data columns are just example taken out of original file. Here you can see 3 vehicles, with vehicle IDs = 2, 4 and 5 but in fact there are 2169 vehicles with different IDS. The column Total_Frames tell us how many times vehicle Id of each vehicle is repeated in the first column, for example in the table above, vehicle ID 2 is repeated 5 times, hence '5' in Total_Frames column. Following is the formula I am required to apply to remove data noise (smoothing) from column 'Local Y':
Smoothed Position Value = (1/(Summation of [EXP^-abs(i-k)/delta] from k=i-D to i+D)) * ( (Summation of (Local Y) *[EXP^-abs(i-k)/delta] from k=i-D to i+D))
where,
i = index #
delta = 5
D = 15
I have tried using the built-in functions, which I know of, but they don't smooth the data as required. My question is: Is there any built-in function in R which can do the data smoothing in the way of given formula or which could take this formula as an argument? I need to apply the formula to every value in Local Y which has 15 values before and 15 values after them (i-D and i+D) for same vehicle Id. Can anyone give me any idea how to approach the problem? Thanks in advance.
You can place your formula in a function and then use the apply function of R to apply it to the elements in your "Local Y" column of the dataframe

Add together grouped rows into one value

I've got an issue where I've been presented data in this format from SQL, and have directly imported that into SSRS 2008.
I do have access to the stored procedure for this report, however I don't want to change it as a few other reports rely on it.
Project HoursSpent Cost
1 5 45
1 8 10
1 7 25
1 5 25
2 1 15
2 3 10
2 5 15
2 6 10
3 6 10
3 4 5
3 4 10
3 2 5
I've been struggling all morning to understand how/when I should be implementing the SUM() function with this.
I have tried already to SUM() the rows, but it still outputs the above result.
Should I be adding any extra groups?
Ideally, I need to have the following output:
Project HoursSpent Cost
1 25 105
2 15 40
3 16 30
EDIT: Here is my current structure:
"LineName" is a group for each project
You have to add a row group on "Project" since you want to sum up the data per each project.

MySQL query nested/sub?

OK this is the data I am working with:
category_child_id category_parent_id
1 0
2 0
3 1
4 1
5 3
6 3
7 4
8 0
9 8
10 8
11 0
12 11
13 11
14 0
15 14
16 14
17 14
18 0
19 18
20 18
21 18
0 19
It's basically categories with sub categories etc etc.
If I
SELECT category_child_id FROM category_xref WHERE category_parent_id = 1
it returns 3 & 4 which is correct. However there are no products in this category only in the category below so the results I actually want are 5 & 6 as well. However this is not always the same so it does need to be a query.
So basically I need to run a query to get all connected(nested) categories from the table. I've tried many ways with failed results so any help would be great.
If you are using PostgreSQL you could have used "WITH RECURSIVE" but MySQL does not support dynamic, recursive queries. You have to do it with your accessing language (e.g. PHP, Java).
There you can iterate over your recordset and perform a query for each returned child row until no more child rows are returned.
As TRD says, there are extensions to SQL supporting recursive searching of self-joins (Oracle uses 'CONNECT BY') however not only are these not available in MySQL, they're also not the most efficient nor flexible solution.
Have a google for adjacency list model - I did and found this.

How to build vtkPolyData based on the information within a txt file

I have a txt file which contains a set of 3 Dimensional data points and I would like to create a vtkPolyData based on those points.
In the file, I have the number of points on the first line, in my case they are 6 x 6. And after that the actual coordinates of each point. The content of the file is like this.
6 6
1 1 3
2 1 3.4
3 1 3.6
4 1 3.6
5 1 3.4
6 1 3
1 2 3
2 2 3.8
3 2 4.2
4 2 4.2
5 2 3.8
6 2 3
1 3 3
2 3 3
3 3 3
4 3 3
5 3 3
6 3 3
1 4 3
2 4 3
3 4 3
4 4 3
5 4 3
6 4 3
1 5 3
2 5 3.8
3 5 4.2
4 5 4.2
5 5 3.8
6 5 3
1 6 3
2 6 3.4
3 6 3.6
4 6 3.6
5 6 3.4
6 6 3
How can I build a vtkPolyData structure with a txt file with this data?
It looks to me like you have a regularly gridded series of points, right? If so, vtkImageData might be a better choice. You can always use a geometry filter afterwards to convert to polydata if you really need it that way.
Create a vtkImageData instance.
Set its dimensions to (6, 6, 1) (the third dimension is ignored).
Set its data type to an appropriate type (float or double, I guess).
Call AllocateScalars();
If in C++,
call GetScalarPointer() and cast it to the data type set in 3.
This pointer will point to an array of size 36. You can just fill each point as you would normally.
If in another language (TCL/Python/Java), call SetScalarComponentFromFloat on the image data, with the arguments (x, y, 0, 0, value). The first 0 is the 3rd dimension and the second is for the first component.
This will give you a grid, and it'll be far more memory efficient than a polydata.
If you want to visualize only the points, use a vtkDataSetMapper, and setup the actor's property with SetRepresentationToPoints(), setting an appropriate point size. That will do a simple job of visualization.
Are these examples useful? In particular, this does generation of points and polygons, so it should be possible to adapt. The core seems to be (with lots left out):
# ...
vtkPolyData shell
vtkFloatPoints points
vtkCellArray strips
# Generate points...
loop {
...
points InsertPoint $k $x0 $x1 $x2
}
shell SetPoints points
points Delete
# Generate triangles/polygons...
loop {
strips InsertNextCell $NP2
# ...
strips InsertCellPoint [expr $kb +$ke ]
# ...
strips InsertCellPoint [expr $kb +$ke ]
}
shell SetStrips strips
strips Delete
# ...