How do I extract keys from a JSON object?

My splunk instance queries a database once an hour for data about products, and gets a JSON string back that is structured like this:
{"counts":
{"green":413,
"red":257,
"total":670,
"product_list":
{ "urn:product:1":{
"name":"M & Ms" ,
"total":332 ,
"green":293 ,
"red":39 } ,
"urn:product:2":{
"name":"Christmas Ornaments" ,
"total":2 ,
"green":0 ,
"red":2 } ,
"urn:product:3":{
"name":"Traffic Lights" ,
"total":1 ,
"green":0 ,
"red":1 } ,
"urn:product:4":{
"name":"Stop Signs" ,
"total":2 ,
"green":0 ,
"red":2 },
...
}
}
}
I have a query that alerts when the counts.green drops by 10% over 24 hours:
index=database_catalog source=RedGreenData | head 1
| spath path=counts.green output=green_now
| table green_now
| join host
[| search index=database_catalog source=RedGreenData latest=-1d | head 1 | spath path=counts.green output=green_yesterday
| table green_yesterday]
| where green_yesterday > 0
| eval delta=(green_yesterday - green_now)/green_yesterday * 100
| where delta > 10
While I'm an experienced developer in C, C++, Java, SQL, JavaScript, and several others, I'm fairly new to Splunk's Search Processing Language, and references and tutorials seem pretty light, at least the ones I've found.
My next story is to at least expose all the individual products, and identify which ones have a 10% drop over 24 hours.
I thought a reasonable learning exercise would be to extract the names of all the products, and eventually turn that into a table with name, product code (e.g. urn:product:4), green count today, green count 24 hours ago, and then filter that on a 10% drop for all products where yesterday's count is positive. And I'm stuck: all the references to {} that I've found deal with a JSON array [], not a JSON object with keys and values.
I'd love to get a table out that looks something like this:
ID            | Name                | Green | Red | Total
urn:product:1 | M & Ms              | 293   | 39  | 332
urn:product:2 | Christmas Ornaments | 0     | 2   | 2
urn:product:3 | Traffic Lights      | 0     | 1   | 1
urn:product:4 | Stop Signs          | 0     | 2   | 2
How do I do that?

I think this produces the output you want:
| spath
| table counts.product_list.*
| transpose
| rex field=column "counts.product_list.(?<ID>[^.]*).(?<fieldname>.*)"
| fields - column
| xyseries ID fieldname "row 1"
| table ID name green red total
This uses transpose to get the field names as data, rex to extract the ID and the field name, and xyseries to pivot the data into the output shape.
Here is a run-anywhere example using your source data:
| makeresults
| eval _raw="
{\"counts\":
{\"green\":413,
\"red\":257,
\"total\":670,
\"product_list\":
{ \"urn:product:1\":{
\"name\":\"M & Ms\" ,
\"total\":332 ,
\"green\":293 ,
\"red\":39 } ,
\"urn:product:2\":{
\"name\":\"Christmas Ornaments\" ,
\"total\":2 ,
\"green\":0 ,
\"red\":2 } ,
\"urn:product:3\":{
\"name\":\"Traffic Lights\" ,
\"total\":1 ,
\"green\":0 ,
\"red\":1 } ,
\"urn:product:4\":{
\"name\":\"Stop Signs\" ,
\"total\":2 ,
\"green\":0 ,
\"red\":2 },
}
}
}"
| spath
| table counts.product_list.*
| transpose
| rex field=column "counts.product_list.(?<ID>[^.]*).(?<fieldname>.*)"
| fields - column
| xyseries ID fieldname "row 1"
| table ID name green red total
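To finish the larger story (flagging products whose green count dropped more than 10% over 24 hours), here is an untested sketch that layers the asker's original head/join pattern on top of the transform above; the field names green_now and green_yesterday are illustrative, not built-ins:
index=database_catalog source=RedGreenData | head 1
| spath
| table counts.product_list.*
| transpose
| rex field=column "counts.product_list.(?<ID>[^.]*).(?<fieldname>.*)"
| fields - column
| xyseries ID fieldname "row 1"
| rename green AS green_now
| join ID
    [ search index=database_catalog source=RedGreenData latest=-1d | head 1
    | spath
    | table counts.product_list.*
    | transpose
    | rex field=column "counts.product_list.(?<ID>[^.]*).(?<fieldname>.*)"
    | fields - column
    | xyseries ID fieldname "row 1"
    | rename green AS green_yesterday
    | fields ID green_yesterday ]
| where green_yesterday > 0
| eval delta=(green_yesterday - green_now)/green_yesterday * 100
| where delta > 10
| table ID name green_now green_yesterday delta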

Related

Getting multiple rows grouped by column

I have an SQL table with this structure:
id | type | name
1 | type1 | name1
2 | type1 | name2
3 | type2 | name3
4 | type2 | name4
I want to get all the names grouped by the type like this
"type1" : [name1,name2]
"type2" : [name3,name4]
I am using Laravel Eloquent. I tried keyBy('type'), but it gives only one row of each type.
How can I get all the names of one type?
It seems like you're looking for the GROUP_CONCAT aggregate function:
SELECT type, CONCAT('[', GROUP_CONCAT(name), ']')
FROM mytable
GROUP BY type
As far as I know, there is no direct way to solve this problem.
I had the same problem and solved it using a foreach loop:
$data = DB::connection('myConn')
    ->table('myTable')
    ->get()
    ->toArray();

$res = [];
foreach ($data as $entry) {
    if (!isset($res[$entry->type])) {
        $res[$entry->type] = [];
    }
    array_push($res[$entry->type], $entry->name);
}
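If you are on MySQL 5.7.22 or later, JSON_ARRAYAGG builds a real JSON array and gets even closer to the exact output shown in the question; a hedged alternative to GROUP_CONCAT (mytable stands in for your table, as in the first answer):
SELECT type, JSON_ARRAYAGG(name) AS names
FROM mytable
GROUP BY type;
-- yields rows like: type1 | ["name1", "name2"]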

Showing all components in inventory with total stock

I have a table which displays a list of all works orders with part numbers and quantities booked in those works orders as stock awaiting polishing.
What I want is to list all part numbers and display the total stock per part.
The current output is:
Works Order | Part Number | Stock Awaiting Polishing (+ other columns)
1           | B01         | 5
2           | B012        | 12
3           | B012        | 43
4           | B014        | 32
What I want to Display is:
Part Number | Stock Awaiting Polishing
B01         | 5
B012        | 55
B014        | 32
This may be easy but I'm still learning, can I get some help here?
SELECT
data.WORKS_ORDER1 AS Works_Order,
data.PART_NO1 AS Part_Number,
data.Part_Prim_Desc AS Part_Description,
data.Part_Secd_Desc AS Customer,
data.Qty_Painted,
data.Qty_Processed,
data.Available_Stock AS Stock_Awaiting_Polishing
FROM (
SELECT
wip.WO.WO_No AS [WORKS_ORDER1],
wip.WO.Part_No AS [PART_NO1],
wip.Ops.WO_No AS [WORKS_ORDER2],
production.Part.Part_No AS [PART_NO2],
production.Part.Part_Prim_Desc,
production.Part.Part_Secd_Desc,
wip.WO.Qty_Inc_Scrap AS Qty_Painted,
wip.Ops.Qty_Complete + wip.Ops.Qty_Rejected
+ wip.Ops.Qty_Scrapped AS Qty_Processed,
wip.WO.Qty_Inc_Scrap - (wip.Ops.Qty_Complete
+ wip.Ops.Qty_Rejected + wip.Ops.Qty_Scrapped) AS [Available_Stock]
FROM wip.Ops
INNER JOIN wip.WO ON wip.Ops.WO_No = wip.WO.WO_No
INNER JOIN production.Part ON wip.WO.Part_No = production.Part.Part_No
WHERE wip.WO.WO_Complete = 0 AND wip.WO.No_of_Ops_Completed = 1
AND wip.Ops.Op_No = 20 AND wip.Ops.WC_Code = 'VPO' AND wip.Ops.Completion_Ind_YN = 'N'
GROUP BY wip.WO.WO_No, wip.WO.Part_No,
wip.Ops.WO_No,
production.Part.Part_No,
production.Part.Part_Prim_Desc,
production.Part.Part_Secd_Desc,
wip.WO.Qty_Inc_Scrap,
wip.Ops.Qty_Complete,
wip.Ops.Qty_Rejected,
wip.Ops.Qty_Scrapped
) data
WHERE data.Available_Stock > 0
I'm starting my journey with SQL and the above code most likely could be simplified a lot, but it's all I've got and it works the way I want.
All help appreciated!
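No answer appears above, so here is one straightforward approach, sketched under the assumption that the query can be reused verbatim: wrap it in a CTE and aggregate per part number.
WITH polishing AS (
    -- paste the entire query from the question here, unchanged
    SELECT ...
)
SELECT
    Part_Number,
    SUM(Stock_Awaiting_Polishing) AS Stock_Awaiting_Polishing
FROM polishing
GROUP BY Part_Number
ORDER BY Part_Number;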

Loading quoted numbers into a Snowflake table from CSV with COPY INTO <TABLE>

I have a problem loading CSV data into a Snowflake table. Fields are wrapped in double quote marks, and hence there is a problem importing them into the table.
I know that COPY INTO has the CSV-specific option FIELD_OPTIONALLY_ENCLOSED_BY = '"', but it's not working at all.
Here are some pieces of the table definition and copy command:
CREATE TABLE ...
(
GamePlayId NUMBER NOT NULL,
etc...
....);
COPY INTO ...
FROM ...csv.gz'
FILE_FORMAT = (TYPE = CSV
STRIP_NULL_VALUES = TRUE
FIELD_DELIMITER = ','
SKIP_HEADER = 1
error_on_column_count_mismatch=false
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
)
ON_ERROR = "ABORT_STATEMENT"
;
The CSV file looks like this:
"3922000","14733370","57256","2","3","2","2","2019-05-23 14:14:44",",00000000",",00000000",",00000000",",00000000","1000,00000000","1000,00000000","1317,50400000","1166,50000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000"
I get an error:
Numeric value '"3922000"' is not recognized
I'm pretty sure it's because the NUMBER value is interpreted as a string when Snowflake reads the "" marks, but since I use
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
it shouldn't even be there... Does anyone have a solution to this?
Maybe something is incorrect with your file? I was just able to run the following without issue.
1. create the test table:
CREATE OR REPLACE TABLE
dbNameHere.schemaNameHere.stacko_58322339 (
num1 NUMBER,
num2 NUMBER,
num3 NUMBER);
2. create the test file, contents as follows:
1,2,3
"3922000","14733370","57256"
3,"2",1
4,5,"6"
3. create stage and put file in stage
4. run the following copy command
COPY INTO dbNameHere.schemaNameHere.STACKO_58322339
FROM @stageNameHere/stacko_58322339.csv.gz
FILE_FORMAT = (TYPE = CSV
STRIP_NULL_VALUES = TRUE
FIELD_DELIMITER = ','
SKIP_HEADER = 0
ERROR_ON_COLUMN_COUNT_MISMATCH=FALSE
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
)
ON_ERROR = "CONTINUE";
5. results
+-----------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
| file | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
|-----------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
| stageNameHere/stacko_58322339.csv.gz | LOADED | 4 | 4 | 4 | 0 | NULL | NULL | NULL | NULL |
+-----------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
1 Row(s) produced. Time Elapsed: 2.436s
6. view the records
>SELECT * FROM dbNameHere.schemaNameHere.stacko_58322339;
+---------+----------+-------+
| NUM1 | NUM2 | NUM3 |
|---------+----------+-------|
| 1 | 2 | 3 |
| 3922000 | 14733370 | 57256 |
| 3 | 2 | 1 |
| 4 | 5 | 6 |
+---------+----------+-------+
Can you try a similar test to this?
EDIT: A quick look at your data shows that many of your numeric fields appear to start with commas, so something is definitely amiss with the data.
Assuming your numbers are European-formatted (, for the decimal mark and . for the thousands separator), reading the numeric formatting help it seems Snowflake does not support this as input. I'd open a feature request.
But if you read the column in as text and then use REPLACE, like:
SELECT '100,1234'::text as A
,REPLACE(A,',','.') as B
,TRY_TO_DECIMAL(b, 20,10 ) as C;
gives:
A         B         C
100,1234  100.1234  100.1234000000
Safer would be to strip the thousands separators first, like:
SELECT '1.100,1234'::text as A
,REPLACE(A,'.') as B
,REPLACE(B,',','.') as C
,TRY_TO_DECIMAL(C, 20,10 ) as D;
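If the cleanup belongs at load time instead, Snowflake's COPY can also select from the staged file and transform columns on the way in. This is only a hedged sketch: the stage, file, and column positions are placeholders, and it assumes REPLACE and TRY_TO_DECIMAL are among the functions permitted in COPY transformations (only a subset of functions is supported there).
COPY INTO myTable
FROM (
    SELECT
        TRY_TO_DECIMAL(REPLACE(REPLACE(t.$1, '.'), ',', '.'), 20, 10),
        TRY_TO_DECIMAL(REPLACE(REPLACE(t.$2, '.'), ',', '.'), 20, 10)
        -- ...one expression per column
    FROM @myStage/myfile.csv.gz t
)
FILE_FORMAT = (TYPE = CSV
               FIELD_DELIMITER = ','
               SKIP_HEADER = 1
               FIELD_OPTIONALLY_ENCLOSED_BY = '"');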

Two things to do in MySQL IF()

I have a problem concerning the IF() function in MySQL.
I would like to return a string and change the value of a variable. Somewhat like:
IF(@firstRow=1, "Dear" AND @firstRow:=0, "dear")
This outputs only '0' instead of 'Dear'...
I would be very thankful for some input on ways I could solve this problem!
Louis :)
AND is a boolean operator, not a "also do this other thing" operator.
"Dear" AND 0 returns 0 because 0 is treated as false in MySQL and <anything> AND false will return false.
Also because the integer/boolean value of "Dear" is 0 as well. Using a string in a numeric context just reads initial digits in the string, if any, and ignores the rest.
It's not clear what your problem is, but I guess you want to capitalize the word "dear" if the row is the first one in the result set.
Instead of being too clever by half trying to fit the side-effect into your expression, do yourself a favor and break it out into a separate column:
mysql> SELECT IF(@firstRow=1, 'Dear', 'dear'), @firstRow:=0 AS _ignoreThis
    -> FROM (SELECT @firstRow:=1) AS _init
    -> CROSS JOIN
    -> mytable;
+---------------------------------+-------------+
| IF(@firstRow=1, 'Dear', 'dear') | _ignoreThis |
+---------------------------------+-------------+
| Dear | 0 |
| dear | 0 |
| dear | 0 |
+---------------------------------+-------------+
But if you really want to make your code as confusing and unreadable as possible, you can do something like this:
SELECT IF(@firstRow=1, CONCAT('Dear', IF(@firstRow:=0, '', '')), 'dear')
FROM (SELECT @firstRow:=1) AS _init
CROSS JOIN
...
But remember this important metric of code quality: WTFs per minute.
Use a CASE expression instead of IF(), as the syntax is far easier to follow, e.g.
select
case when @firstRow = 1 then 'Dear' else 'dear' end AS Salutation
, @firstRow := 0
from (
select 1 n union all
select 2 n union all
select 3
) d
cross join (SELECT @firstRow:=1) var
+---+------------+----------------+
|   | Salutation | @firstRow := 0 |
+---+------------+----------------+
| 1 | Dear | 0 |
| 2 | dear | 0 |
| 3 | dear | 0 |
+---+------------+----------------+
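One more note: assigning user variables inside expressions is deprecated as of MySQL 8.0.13, so on 8.0+ a window function sidesteps the variable entirely. A hedged sketch (the ordering column id is assumed; use whatever defines "first row" for you):
SELECT
    CASE WHEN ROW_NUMBER() OVER (ORDER BY id) = 1
         THEN 'Dear' ELSE 'dear'
    END AS Salutation
FROM mytable;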

spark df.write quote all fields but not null values

I am trying to create a CSV from values stored in the table:
| col1 | col2 | col3 |
| "one" | null | "one" |
| "two" | "two" | "two" |
hive > select * from table where col2 is null;
one null one
I am getting the CSV using the code below:
df.repartition(1)
.write.option("header",true)
.option("delimiter", ",")
.option("quoteAll", true)
.option("nullValue", "")
.csv(S3Destination)
The CSV I get:
"col1","col2","col3"
"one","","one"
"two","two","two"
Expected CSV (with no double quotes for null values):
"col1","col2","col3"
"one",,"one"
"two","two","two"
Any help is appreciated to know if the dataframe writer has options to do this.
You can take a UDF approach and apply it on the column (using withColumn on the repartitioned dataframe above) wherever a double-quoted empty string is possible; see the sample code below.
sqlContext.udf().register(
    "convertToEmptyWithOutQuotes",
    (String abc) -> (abc.trim().length() > 0 ? abc : abc.replace("\"", " ")),
    DataTypes.StringType);
String has a replace method, which does the job.
val a = Array("'x'","","z")
println(a.mkString(",").replace("\"", " "))
will produce 'x',,z
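If you would rather stay within the DataFrame writer, another hedged sketch is to pre-quote the non-null values yourself and suppress the writer's own quoting. This assumes the Scala API; the "\u0000" quote character is an unofficial trick to disable quoting, not a documented no-quote switch, and note that the header row will come out unquoted and embedded quotes/delimiters are not escaped here.
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// Wrap every non-null value in literal double quotes; nulls stay null,
// so they are written as nothing at all (no quotes, no text).
def quoteNonNull(df: DataFrame): DataFrame =
  df.columns.foldLeft(df) { (acc, c) =>
    acc.withColumn(c, when(col(c).isNotNull, concat(lit("\""), col(c), lit("\""))))
  }

quoteNonNull(df).repartition(1)
  .write.option("header", true)
  .option("delimiter", ",")
  .option("quote", "\u0000") // suppress the writer's quoting so our quotes survive
  .option("nullValue", "")
  .csv(S3Destination)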