After performing a factor analysis the loadings object looks like this:
Loadings:
    Factor1 Factor2
IV1  0.844  -0.512
IV2          0.997
IV3         -0.235
IV4         -0.144
IV5  0.997

               Factor1 Factor2
SS loadings      1.719   1.333
Proportion Var   0.344   0.267
Cumulative Var   0.344   0.610
I can target the factors themselves using print(fit$loadings[,1:2]) to get the following.
Factor1 Factor2
IV1 0.84352949 -0.512090197
IV2 0.01805673 0.997351400
IV3 0.05877499 -0.234710743
IV4 0.09088599 -0.144251843
IV5 0.99746785 0.008877643
I would like to create a json string that would look something like the following.
"loadings": {
"Factor1": {
"IV1": 0.84352949, "IV2":0.01805673, "IV3":0.05877499, "IV4": 0.09088599, "IV5": 0.99746785
},
"Factor2": {
"IV1": -0.512090197, "IV2": 0.997351400, "IV3": -0.234710743, "IV4": -0.144251843, "IV5": 0.008877643
}
}
I have tried accessing the individual properties using unclass(), hoping that I could then loop through and put them into a string, but have not had any luck (using loads <- loadings(fit) and then names(unclass(loads)), names shows up as "null").
Just seconding @GSee's comment (+1) and @dickoa's answer (+1) with a closer example:
Creating some demo data for reproducible example (you should also provide one in all your Qs):
> fit <- princomp(~ ., data = USArrests, scale = FALSE)
Load RJSONIO/rjson packages:
> library(RJSONIO)
Transform your data to fit your needs:
> res <- list(loadings = apply(fit$loadings, 2, list))
Return JSON:
> cat(toJSON(res))
{
"loadings": {
"Comp.1": [
{
"Murder": -0.041704,
"Assault": -0.99522,
"UrbanPop": -0.046336,
"Rape": -0.075156
}
],
"Comp.2": [
{
"Murder": 0.044822,
"Assault": 0.05876,
"UrbanPop": -0.97686,
"Rape": -0.20072
}
],
"Comp.3": [
{
"Murder": 0.079891,
"Assault": -0.06757,
"UrbanPop": -0.20055,
"Rape": 0.97408
}
],
"Comp.4": [
{
"Murder": 0.99492,
"Assault": -0.038938,
"UrbanPop": 0.058169,
"Rape": -0.072325
}
]
}
}>
You can do something along these lines
require(RJSONIO) ## or require(rjson)
pca <- prcomp(~ ., data = USArrests, scale = FALSE)
export <- list(loadings = split(pca$rotation, rownames(pca$rotation)))
cat(toJSON(export))
## {
## "loadings": {
## "Assault": [ 0.99522, -0.05876, -0.06757, 0.038938 ],
## "Murder": [ 0.041704, -0.044822, 0.079891, -0.99492 ],
## "Rape": [ 0.075156, 0.20072, 0.97408, 0.072325 ],
## "UrbanPop": [ 0.046336, 0.97686, -0.20055, -0.058169 ]
## }
## }
If you want to export it to a file:
cat(toJSON(export), file = "loadings.json")
If it doesn't really suit your needs, just modify the data structure (the export object) to get the output you want.
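Not R, but in case it helps to see the target structure spelled out, here is a minimal Python sketch of the regrouping the question asks for: a variable-by-factor table turned into {factor: {variable: value}}. The numbers are copied from the question; the helper name loadings_by_factor and the rows layout are made up for illustration.

```python
import json

# Loadings copied from the question: one row per variable, one column per factor.
factors = ["Factor1", "Factor2"]
rows = {
    "IV1": [0.84352949, -0.512090197],
    "IV2": [0.01805673, 0.997351400],
    "IV3": [0.05877499, -0.234710743],
    "IV4": [0.09088599, -0.144251843],
    "IV5": [0.99746785, 0.008877643],
}


def loadings_by_factor(rows, factors):
    """Regroup a variable-by-factor table into {factor: {variable: value}}.

    Hypothetical helper, not part of any library.
    """
    return {
        factor: {variable: values[j] for variable, values in rows.items()}
        for j, factor in enumerate(factors)
    }


payload = {"loadings": loadings_by_factor(rows, factors)}
print(json.dumps(payload, indent=2))
```

The same idea applies to the R export object above: whatever nested list you build is what the JSON serializer will mirror.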
I'm trying to insert an Amazon SNS notification (eventType = Open) into ClickHouse. The JSON schema is complex, so I don't know how to create my table (with a nested inside a nested ...).
{
"eventType":"Open",
"mail":{
"commonHeaders":{
"from":[
"sender@example.com"
],
"messageId":"EXAMPLE7c191be45-e9aedb9a-02f9-4d12-a87d-dd0099a07f8a-000000",
"subject":"Message sent from Amazon SES",
"to":[
"recipient@example.com"
]
},
"destination":[
"recipient@example.com"
],
"headers ":[
{
"name":"X-SES-CONFIGURATION-SET",
"value":"ConfigSet"
},
{
"name":"X-SES-MESSAGE-TAGS",
"value":"myCustomTag1=myCustomValue1, myCustomTag2=myCustomValue2"
},
{
"name":"From",
"value":"sender@example.com"
},
{
"name":"To",
"value":"recipient@example.com"
},
{
"name":"Subject",
"value":"Message sent from Amazon SES"
},
{
"name":"MIME-Version",
"value":"1.0"
},
{
"name":"Content-Type",
"value":"multipart/alternative; boundary=\"XBoundary\""
}
],
"headersTruncated":false,
"messageId":"EXAMPLE7c191be45-e9aedb9a-02f9-4d12-a87d-dd0099a07f8a-000000",
"sendingAccountId":"123456789012",
"source":"sender@example.com",
"tags":{
"myCustomTag1":[
"myCustomValue1"
],
"myCustomTag2":[
"myCustomValue2"
],
"ses:caller-identity":[
"ses-user"
],
"ses:configuration-set":[
"ConfigSet"
],
"ses:from-domain":[
"example.com"
],
"ses:source-ip":[
"192.0.2.0"
]
},
"timestamp":"2017-08-09T21:59:49.927Z"
},
"open":{
"ipAddress":"192.0.2.1",
"timestamp":"2017-08-09T22:00:19.652Z",
"userAgent":"Mozilla/5.0 (iPhone; CPU iPhone OS 10_3_3 like Mac OS X) AppleWebKit/603.3.8 (KHTML, like Gecko) Mobile/14G60"
}
}
I tried INSERT INTO Open FORMAT JSONEachRow and INSERT INTO Open FORMAT JSONCompact, but neither works.
Thank you.
You should transform your JSON into a simpler form without nesting and use JSONEachRow.
Or insert the data into ClickHouse as JSONAsString and transform it using JSONExtract:
create table i(J String) Engine=Null;
create table f(a String, i Int64, f Float64) Engine=MergeTree order by a;
create materialized view vv to f
as select (JSONExtract(J, 'Tuple(String,Tuple(Int64,Float64))') as x),
x.1 as a,
x.2.1 as i,
x.2.2 as f
from i;
echo '{"s": "val1", "b2": {"i": 42, "f": 0.1}}' |clickhouse-client -q "insert into i format JSONAsString"
select * from f
┌─a────┬──i─┬───f─┐
│ val1 │ 42 │ 0.1 │
└──────┴────┴─────┘
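For the first option (flattening before the insert), here is a rough Python sketch of what the transformation could look like. It is only one way to do it: the flatten helper, the underscore-joined column names, and the choice to keep arrays as JSON strings are all my own assumptions, and the target table would need columns with matching names. The sample event is a trimmed copy of the one in the question.

```python
import json


def flatten(value, prefix="", sep="_"):
    """Flatten nested dicts into a single level (hypothetical helper).

    Nested keys are joined with `sep`; lists are kept as JSON strings,
    which is an assumption made here to keep every column scalar.
    """
    flat = {}
    for key, item in value.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(item, dict):
            flat.update(flatten(item, name, sep))
        elif isinstance(item, list):
            flat[name] = json.dumps(item)
        else:
            flat[name] = item
    return flat


# Trimmed copy of the event from the question.
event = {
    "eventType": "Open",
    "mail": {
        "headersTruncated": False,
        "sendingAccountId": "123456789012",
        "tags": {"myCustomTag1": ["myCustomValue1"]},
        "timestamp": "2017-08-09T21:59:49.927Z",
    },
    "open": {
        "ipAddress": "192.0.2.1",
        "timestamp": "2017-08-09T22:00:19.652Z",
    },
}

row = flatten(event)
# JSONEachRow expects one JSON object per line.
print(json.dumps(row))
```

The printed line can then be piped to clickhouse-client with an INSERT ... FORMAT JSONEachRow statement, in the same way as the echo example above.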
There is some similarity between my question and "How to measure common coverage for Polymer components + .js files?". However, the accepted answer there is "split to .js files and include it to components" in order to use wct-istanbul, whereas all my web components and tests are in .html files (the JavaScript is inside each .html file).
My direct question is: can I still use wct-istanbul to check how much of my code is covered by tests? If so, what is wrong in the configuration described below? If not, is wct-istanbub planned to replace wct-istanbul for Polymer projects?
package.json
"polyserve": "^0.18.0",
"web-component-tester": "^6.0.0",
"web-component-tester-istanbul": "^0.10.0",
...
wct.conf.js
var path = require('path');
var ret = {
'suites': ['test'],
'webserver': {
'pathMappings': []
},
'plugins': {
'local': {
'browsers': ['chrome']
},
'sauce': {
'disabled': true
},
"istanbul": {
"dir": "./coverage",
"reporters": ["text-summary", "lcov"],
"include": [
"/*.html"
],
"exclude": [
],
thresholds: {
global: {
statements: 100
}
}
}
}
};
var mapping = {};
var rootPath = (__dirname).split(path.sep).slice(-1)[0];
mapping['/components/' + rootPath + '/bower_components'] = 'bower_components';
ret.webserver.pathMappings.push(mapping);
module.exports = ret;
Well, I tried WCT-istanbub (https://github.com/Bubbit/wct-istanbub), which seems to be a temporary workaround (see "Code coverage of Polymer Application with WCT"), and it works.
wct.conf.js
"istanbub": {
"dir": "./coverage",
"reporters": ["text-summary", "lcov"],
"include": [
"**/*.html"
],
"exclude": [
"**/test/**",
"*/*.js"
],
thresholds: {
global: {
statements: 100
}
}
}
...
and the result is
...
chrome 66 RESPONSE quit()
chrome 66 BrowserRunner complete
Test run ended with great success
chrome 66 (2/0/0)
=============================== Coverage summary ===============================
Statements : 21.18% ( 2011/9495 )
Branches : 15.15% ( 933/6160 )
Functions : 18.08% ( 367/2030 )
Lines : 21.14% ( 2001/9464 )
================================================================================
Coverage for statements (21.18%) does not meet configured threshold (100%)
Error: Coverage failed
Let's say I have the following document in a MongoDB database:
{
"assist_leaders" : {
"Steve Nash" : {
"team" : "Phoenix Suns",
"position" : "PG",
"draft_data" : {
"class" : 1996,
"pick" : 15,
"selected_by" : "Phoenix Suns",
"college" : "Santa Clara"
}
},
"LeBron James" : {
"team" : "Cleveland Cavaliers",
"position" : "SF",
"draft_data" : {
"class" : 2003,
"pick" : 1,
"selected_by" : "Cleveland Cavaliers",
"college" : "None"
}
},
}
}
I'm trying to collect a few values under "draft_data" for each player in an ORDERED list. The list needs to look like the following for this particular document:
[ [1996, 15, "Phoenix Suns"], [2003, 1, "Cleveland Cavaliers"] ]
That is, each nested list must contain the values corresponding to the "class", "pick", and "selected_by" keys, in that order. I also need the "Steve Nash" data to come before the "LeBron James" data.
How can I achieve this using pymongo? Note that the structure of the data is not set in stone so I can change this if that makes the code simpler.
I'd extract the data and turn it into a list in Python, once you've retrieved the document from MongoDB:
for doc in db.collection.find():
    for name, info in doc['assist_leaders'].items():
        draft_data = info['draft_data']
        lst = [draft_data['class'], draft_data['pick'], draft_data['selected_by']]
        print(name, lst)
List comprehension is the way to go here (Note: don't forget .iteritems() in Python2 or .items() in Python3 or you'll get a ValueError: too many values to unpack).
import pymongo
import numpy as np
client = pymongo.MongoClient()
db = client[database_name]
dataList = [v for i in ["Steve Nash", "LeBron James"]
            for key in ["class", "pick", "selected_by"]
            for document in db.collection_name.find({"assist_leaders": {"$exists": 1}})
            for k, v in document["assist_leaders"][i]["draft_data"].items()  # .iteritems() in Python 2
            if k == key]
print(dataList)
# [1996, 15, "Phoenix Suns", 2003, 1, "Cleveland Cavaliers"]
matrix = np.reshape(dataList, [2,3])  # note: NumPy coerces the mixed values to strings
print(matrix)
# [['1996' '15' 'Phoenix Suns']
#  ['2003' '1' 'Cleveland Cavaliers']]
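If the numbers need to stay numbers, a plain nested list works without NumPy. Below is a minimal sketch that makes both orderings explicit (players and fields) instead of relying on dictionary order. The doc literal stands in for a document returned by pymongo, and draft_rows is a made-up helper name.

```python
# Stand-in for a document fetched with pymongo (copied from the question).
doc = {
    "assist_leaders": {
        "Steve Nash": {
            "team": "Phoenix Suns",
            "position": "PG",
            "draft_data": {"class": 1996, "pick": 15,
                           "selected_by": "Phoenix Suns", "college": "Santa Clara"},
        },
        "LeBron James": {
            "team": "Cleveland Cavaliers",
            "position": "SF",
            "draft_data": {"class": 2003, "pick": 1,
                           "selected_by": "Cleveland Cavaliers", "college": "None"},
        },
    }
}

PLAYERS = ["Steve Nash", "LeBron James"]   # required player order
FIELDS = ["class", "pick", "selected_by"]  # required field order


def draft_rows(document, players, fields):
    """Return one list per player, with values in the order given by `fields`."""
    leaders = document["assist_leaders"]
    return [[leaders[player]["draft_data"][field] for field in fields]
            for player in players]


rows = draft_rows(doc, PLAYERS, FIELDS)
print(rows)
# [[1996, 15, 'Phoenix Suns'], [2003, 1, 'Cleveland Cavaliers']]
```

Indexing by key like this also avoids the inner loop over every key/value pair just to pick out three fields.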
I have a recursive structure of S4 objects that can be represented (in a simplified version) by these 2 classes:
cl2 <-
setClass("cl2",
representation(
id = "numeric",
date="Date"),
prototype = list(
date=Sys.Date(),
id=sample(1:100,1)
)
)
cl1 <-
setClass("cl1",
representation(
date="Date",
cl2 = "cl2"
),
prototype = list(
date=Sys.Date()
)
)
I would like to save/load objects of type cl1. I opted for the JSON format (suitable for unstructured objects). The problem is with dates: they are coerced to numeric. Is there an option or solution to get dates in the right format when I serialize the object? Note that the objects can contain other objects (recursive structure), so I would like all dates to be in the right format.
cat(RJSONIO::toJSON(cl1(),pretty=TRUE))
{
"date" : 16861,
"cl2" : {
"id" : 90,
"date" : 16861
}
}
One solution would be to replace dates with character strings. But then I would lose the validation mechanism of S4 objects and would have to implement date validation for all objects. Thanks in advance for any help.
The expected output would look like:
{
"date" :"2016-03-01",
"cl2" : {
"id" : 76,
"date" : "2016-03-01"
}
}
Reading the documentation of toJSON I found an interesting parameter:
force unclass/skip objects of classes with no defined JSON mapping
So I tried it, and I think this would match your need, as you can simply ignore the class entry:
> s <- jsonlite::toJSON(cl1(),force=TRUE,auto_unbox=TRUE,pretty=TRUE)
> s
{
"date": "2016-03-01",
"cl2": {
"date": "2016-03-01",
"id": 67,
"class": "cl2"
},
"class": "cl1"
}
Drawback: this is still not loadable "as-is" into S4 objects with fromJSON, as it gives a named list back. Analyzing the list recursively to recreate S4 objects is doable, but you'll have to create the necessary as implementations to turn a named list into your classes. For your example:
setAs('list', 'cl2',
function(from, to) {
new(to, id=from[['id']], date=as.Date(from[['date']]))
})
setAs('list','cl1',
function(from, to) {
new(to, date=as.Date(from[['date']]), cl2=as(from[['cl2']],'cl2'))
})
With a dummy input from previous output:
input <- '
{
"date": "2016-03-05",
"cl2": {
"date": "2016-02-01",
"id": 83,
"class": "cl2"
},
"class": "cl1"
}'
This gives:
> as(fromJSON(input),'cl1')
An object of class "cl1"
Slot "date":
[1] "2016-03-05"
Slot "cl2":
An object of class "cl2"
Slot "id":
[1] 83
Slot "date":
[1] "2016-02-01"
I'll let you adapt this to your real use case, probably using fromJSON(input,FALSE) to get a 'pure' list to coerce, with lapply for example, if you have multiple instances of your cl1 class in the JSON input.
One option is to use the jsonlite package to serialize. Indeed, jsonlite::toJSON respects dates and serializes them in a well-formatted form. The problem is that jsonlite::toJSON is not defined for S4 objects. My solution is to coerce the object to a list and then serialize it:
## S4 method to coerce any S4 object to a list
setMethod("as.list",signature(x="ANY"),
function(x) {
Map(
function(y) if (isS4(slot(x,y))) as.list(slot(x,y)) else slot(x,y)
,slotNames(class(x)))
})
## coercion
jsonlite::toJSON(as.list(cl1()),pretty=TRUE,auto_unbox=TRUE)
{
"date": "2016-03-01",
"cl2": {
"id": 24,
"date": "2016-03-01"
}
}
Update
In as.list I replaced lapply with Map to create a named list.
For the recursive reading of S4 classes from JSON you can use a similar approach:
library(RJSONIO)
createParser <- function(className) {
setAs("list", className, function(from, to) {
to <- new(to)
for (n in names(from)) {
if (isS4(slot(to, n))) {
c <- class(slot(to, n))[[1]]
o <- as(from[[n]], c)
slot(to, n) = o
} else {
slot(to, n) = from[[n]]
}
}
to
})
}
Name <- setClass("Name", slots=c("first"="character", "last"="character"))
createParser("Name")
Customer <- setClass("Customer", slots=c("name"="Name", "age"="numeric"))
createParser("Customer")
Case <- setClass("Case", slots=c("customer"="Customer"))
createParser("Case")
c1 <- Case(customer=Customer(name=Name(first="Mika", last="R"), age=100))
j <- RJSONIO::toJSON(c1)
l <- RJSONIO::fromJSON(j, simplify = FALSE)
as(l, "Case")
I am trying to create a ragged list in R that corresponds to the D3 tree structure of flare.json. My data is in a data.frame:
path <- data.frame(P1=c("direct","direct","organic","direct"),
P2=c("direct","direct","end","end"),
P3=c("direct","organic","",""),
P4=c("end","end","",""), size=c(5,12,23,45))
path
       P1     P2      P3  P4 size
1  direct direct  direct end    5
2  direct direct organic end   12
3 organic    end              23
4  direct    end              45
but it could also be a list or reshaped if necessary:
path <- list()
path[[1]] <- list(name=c("direct","direct","direct","end"),size=5)
path[[2]] <- list(name=c("direct","direct","organic","end"), size=12)
path[[3]] <- list(name=c("organic", "end"), size=23)
path[[4]] <- list(name=c("direct", "end"), size=45)
The desired output is:
rl <- list()
rl <- list(name="root", children=list())
rl$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[[1]]$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[[1]]$children[[1]]$children[1] <- list(list(name="end", size=5))
rl$children[[1]]$children[[1]]$children[2] <- list(list(name="organic", children=list()))
rl$children[[1]]$children[[1]]$children[[2]]$children[1] <- list(list(name="end", size=12))
rl$children[[1]]$children[2] <- list(list(name="end", size=23))
rl$children[2] = list(list(name="organic", children=list()))
rl$children[[2]]$children[1] <- list(list(name="end", size=45))
So when I print to json it's:
require(RJSONIO)
cat(toJSON(rl, pretty=T))
{
"name" : "root",
"children" : [
{
"name" : "direct",
"children" : [
{
"name" : "direct",
"children" : [
{
"name" : "direct",
"children" : [
{
"name" : "end",
"size" : 5
}
]
},
{
"name" : "organic",
"children" : [
{
"name" : "end",
"size" : 12
}
]
}
]
},
{
"name" : "end",
"size" : 23
}
]
},
{
"name" : "organic",
"children" : [
{
"name" : "end",
"size" : 45
}
]
}
]
}
I am having a hard time wrapping my head around the recursive steps that are necessary to create this list structure in R. In JS I can pretty easily move around the nodes and at each node determine whether to add a new node or keep moving down the tree by using push as needed, eg: new = {"name": node, "children": []}; or new = {"name": node, "size": size}; as in this example. I tried to split the data.frame as in this example:
makeList<-function(x){
if(ncol(x)>2){
listSplit<-split(x,x[1],drop=T)
lapply(names(listSplit),function(y){list(name=y,children=makeList(listSplit[[y]]))})
} else {
lapply(seq(nrow(x[1])),function(y){list(name=x[,1][y],size=x[,2][y])})
}
}
jsonOut<-toJSON(list(name="root",children=makeList(path)))
but it gives me an error
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?
The function given in the linked Q&A is essentially what you need, however it was failing on your data set because of the null values for some rows in the later columns. Instead of just blindly repeating the recursion until you run out of columns, you need to check for your "end" value, and use that to switch to making leaves:
makeList<-function(x){
listSplit<-split(x[-1],x[1], drop=TRUE);
lapply(names(listSplit),function(y){
if (y == "end") {
l <- list();
rows = listSplit[[y]];
for(i in 1:nrow(rows) ) {
l <- c(l, list(name=y, size=rows[i,"size"] ) );
}
l;
}
else {
list(name=y,children=makeList(listSplit[[y]]))
}
});
}
I believe this does what you want, though it has some limitations. In particular, it is assumed that every branch in your network is unique (i.e. there can't be two rows in your data frame that are equal for every column other than size):
df.split <- function(p.df) {
p.lst.tmp <- unname(split(p.df, p.df[, 1]))
p.lst <- lapply(
p.lst.tmp,
function(x) {
if(ncol(x) == 2L && nrow(x) == 1L) {
return(list(name=x[1, 1], size=unname(x[, 2])))
} else if (isTRUE(is.na(unname(x[ ,2])))) {
return(list(name=x[1, 1], size=unname(x[, ncol(x)])))
}
list(name=x[1, 1], children=df.split(x[, -1, drop=F]))
}
)
p.lst
}
all.equal(rl, df.split(path)[[1]])
# [1] TRUE
Though note you had the organic size switched, so I had to fix your rl to get this result (rl has it as 45, but your path as 23). Also, I modified your path data.frame slightly:
path <- data.frame(
root=rep("root", 4),
P1=c("direct","direct","organic","direct"),
P2=c("direct","direct","end","end"),
P3=c("direct","organic",NA,NA),
P4=c("end","end",NA,NA),
size=c(5,12,23,45),
stringsAsFactors=F
)
WARNING: I haven't tested this with other structures, so it's possible it will hit corner cases that you'll need to debug.
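For comparison, the "walk down and push as needed" approach the question describes for JS can be sketched iteratively. This is a Python sketch rather than R, under the assumption that each path is a list of names ending in a leaf plus a size (the list form given in the question); build_tree is a made-up name. It follows the path data, so "organic > end" gets 23 and "direct > end" gets 45.

```python
import json

# Paths in the list form from the question: (names, size).
paths = [
    (["direct", "direct", "direct", "end"], 5),
    (["direct", "direct", "organic", "end"], 12),
    (["organic", "end"], 23),
    (["direct", "end"], 45),
]


def build_tree(paths, root_name="root"):
    """Build a flare.json-style tree by walking each path from the root.

    Inner nodes are reused when a child with the same name already exists;
    the last name of each path becomes a leaf carrying the size.
    """
    root = {"name": root_name, "children": []}
    for names, size in paths:
        node = root
        for name in names[:-1]:
            # Look for an existing inner node (one that has "children").
            child = next(
                (c for c in node["children"]
                 if c["name"] == name and "children" in c),
                None,
            )
            if child is None:
                child = {"name": name, "children": []}
                node["children"].append(child)
            node = child
        node["children"].append({"name": names[-1], "size": size})
    return root


tree = build_tree(paths)
print(json.dumps(tree, indent=2))
```

Unlike the split-based versions, this does not need padding columns or NA handling, because each path carries its own length.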