Suppose I have a product log collection in which every change made to my products is recorded, e.g.:
+-----------+--------+---------+
| productId | status | comment |
+-----------+--------+---------+
| 1         | 0      | ....    |
| 2         | 0      | ....    |
| 1         | 1      | ....    |
| 2         | 1      | ....    |
| 1         | 2      | ....    |
| 3         | 0      | ....    |
+-----------+--------+---------+
I want to get all products whose status has reached 1 but has never become 2. In SQL the query would look something like:
select productId from productLog as PL1
where
status = 1
and productId not in (
select productId from productLog as PL2 where
PL1.productId = PL2.productId and PL2.status = 2
)
group by productId
I'm using the native PHP MongoDB driver.
Well, since the subquery join here simply matches the same key against itself, then:
Setup
db.status.insert([
{ "productId": 1, "status": 0 },
{ "productId": 2, "status": 0 },
{ "productId": 1, "status": 1 },
{ "productId": 2, "status": 1 },
{ "productId": 1, "status": 2 },
{ "productId": 3, "status": 0 }
])
Then use .aggregate():
db.status.aggregate([
{ "$match": {
"status": { "$ne": 2 }
}},
{ "$group": {
"_id": "$productId"
}}
])
Or using mapReduce:
db.status.mapReduce(
function() {
if ( this.status != 2 ) {
emit( this.productId, null )
}
},
function(key,values) {
return null;
},
{ "out": { "inline": 1 } }
);
But again the SQL here was as simple as:
select productId
from productLog
where status <> 2
group by productId
without the superfluous join on exactly the same key value.
The aggregate query above doesn't meet the requirements of the question: its result includes productId=1, whereas the result of the SQL in the question doesn't, because the sample data contains a record with status=2 whose productId is 1.
So, assuming the sample documents above were inserted into db.productLog, you can use the code below to get the results:
// First: subquery collecting the productIds that have a record with status=2:
var productsWithStatus2 = db.productLog.find({"status":2}).map(function(rec) { return rec.productId; });
// Second: final query for productIds that have status=1 and no record with status=2:
db.productLog.aggregate([ {"$match":{status:1, productId:{$nin:productsWithStatus2}}},{"$group": {"_id": "$productId"}}]);
// Alternative for the final query:
//db.productLog.distinct("productId",{status:1, productId:{$nin:productsWithStatus2}});
// Alternative for the final query, returning the matching documents with product and status detail:
//db.productLog.find({status:1, productId:{$nin:productsWithStatus2}});
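The rule from the question can also be checked without a database. The following is a minimal in-memory sketch in plain JavaScript, not driver code. It assumes that "status is 1 but hasn't become 2" means the product has a status-1 record and no status-2 record. The helper name productsAtStatus1Not2 and the singlePipeline constant are hypothetical; the pipeline is only a guess at how the same rule might be phrased as one aggregation, and it is not executed here.

```javascript
// Sample documents from the question (comment field omitted).
const productLog = [
  { productId: 1, status: 0 },
  { productId: 2, status: 0 },
  { productId: 1, status: 1 },
  { productId: 2, status: 1 },
  { productId: 1, status: 2 },
  { productId: 3, status: 0 },
];

// Hypothetical helper: productIds that have a status-1 entry and no status-2 entry.
function productsAtStatus1Not2(logs) {
  // Collect the set of statuses seen for each product.
  const statusesByProduct = new Map();
  for (const { productId, status } of logs) {
    if (!statusesByProduct.has(productId)) {
      statusesByProduct.set(productId, new Set());
    }
    statusesByProduct.get(productId).add(status);
  }
  // Keep products that reached status 1 and never reached status 2.
  return [...statusesByProduct]
    .filter(([, statuses]) => statuses.has(1) && !statuses.has(2))
    .map(([productId]) => productId);
}

// Assumption: the same rule as a single aggregation pipeline (not run here).
const singlePipeline = [
  { $group: { _id: "$productId", statuses: { $addToSet: "$status" } } },
  { $match: { statuses: { $all: [1], $nin: [2] } } },
];

console.log(productsAtStatus1Not2(productLog)); // only product 2 qualifies
```

With the sample data, product 1 is excluded because it reached status 2, and product 3 is excluded because it never reached status 1.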
I have a JSON variable that looks like this (the real one is more complex):
DECLARE @myJson VARCHAR(3000) = '{
"CustomerId": "123456",
"Orders": [{
"OrderId": "852",
"OrderManifests": [{
"ShippedProductId": 884,
"ProductId": 884
}, {
"ShippedProductId": 951,
"ProductId": 2564
}
]
}, {
"OrderId": "5681",
"OrderManifests": [{
"ShippedProductId": 198,
"ProductId": 4681
}, {
"ShippedProductId": 8188,
"ProductId": 8188
}, {
"ShippedProductId": 144,
"ProductId": 8487
}
]
}
]
}'
In the end, I need to know if any of the ShippedProductId values match their corresponding ProductId (in the same JSON object).
I started by trying to get a list of all the OrderManifests. But while this will get me the array of orders:
SELECT JSON_QUERY(@myJson, '$.Orders')
I can't seem to find a way to get a list of all the OrderManifests across all the entries in the Orders array. This does not work:
SELECT JSON_QUERY(@myJson, '$.Orders.OrderManifests')
Is there a way to do a Select Many kind of query to get all the OrderManifests in the Orders array?
Use OPENJSON and CROSS APPLY to drill down into your objects.
This should do it for you:
SELECT j.CustomerId,o.OrderId, m.ShippedProductId, m.ProductId
FROM OPENJSON(@myJson)
WITH (
CustomerId NVARCHAR(1000),
Orders NVARCHAR(MAX) AS JSON
) j
CROSS APPLY OPENJSON(j.Orders)
WITH (
OrderId NVARCHAR(1000),
OrderManifests NVARCHAR(MAX) AS JSON
) o
CROSS APPLY OPENJSON(o.OrderManifests)
WITH (
ShippedProductId INT,
ProductId int
) m
WHERE m.ShippedProductId = m.ProductId;
This query returns:
CustomerId | OrderId | ShippedProductId | ProductId
-----------+---------+------------------+----------
123456     | 852     | 884              | 884
123456     | 5681    | 8188             | 8188
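For comparison, the same "select many" drill-down can be sketched outside SQL Server. This is a minimal illustration in plain JavaScript, assuming the JSON has already been parsed into an object; the function name matchingManifests is hypothetical. Each flatMap level plays the role of one CROSS APPLY OPENJSON step.

```javascript
// The JSON document from the question, already parsed.
const customerDoc = {
  CustomerId: "123456",
  Orders: [
    {
      OrderId: "852",
      OrderManifests: [
        { ShippedProductId: 884, ProductId: 884 },
        { ShippedProductId: 951, ProductId: 2564 },
      ],
    },
    {
      OrderId: "5681",
      OrderManifests: [
        { ShippedProductId: 198, ProductId: 4681 },
        { ShippedProductId: 8188, ProductId: 8188 },
        { ShippedProductId: 144, ProductId: 8487 },
      ],
    },
  ],
};

// Hypothetical helper: one flat row per manifest whose two ids match.
function matchingManifests(doc) {
  return doc.Orders.flatMap((order) =>
    order.OrderManifests
      .filter((m) => m.ShippedProductId === m.ProductId)
      .map((m) => ({
        CustomerId: doc.CustomerId,
        OrderId: order.OrderId,
        ShippedProductId: m.ShippedProductId,
        ProductId: m.ProductId,
      }))
  );
}

console.log(matchingManifests(customerDoc));
```

The result has two rows, for orders 852 and 5681, matching the query output above.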
I have the sample API response below:
{
"items": [
{
"id":11,
"name": "SMITH",
"prefix": "SAM",
"code": "SSO"
},
{
"id":10,
"name": "James",
"prefix": "JAM",
"code": "BBC"
}
]
}
Per the above response, my test expects that whenever I hit the API, id 11 belongs to SMITH and id 10 belongs to James.
So I thought I would store this in a table and assert it against the actual response:
* table person
| id | name |
| 11 | SMITH |
| 10 | James |
| 9 | RIO |
Now, how would I match them one by one? That is, take the first id and name from the API response and match them against the first id and name in the table.
Please share a convenient way of doing this in Karate.
There are a few possible ways, here is one:
* def lookup = { 11: 'SMITH', 10: 'James' }
* def items =
"""
[
{
"id":11,
"name":"SMITH",
"prefix":"SAM",
"code":"SSO"
},
{
"id":10,
"name":"James",
"prefix":"JAM",
"code":"BBC"
}
]
"""
* match each items contains { name: "#(lookup[_$.id+''])" }
And you already know how to use a table instead of JSON.
Please read the docs and other Stack Overflow answers to get more ideas.
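The lookup idea behind that match expression can be sketched in plain JavaScript. This is not Karate syntax, only an illustration of the check being performed; the function name findMismatches is hypothetical, and the lookup is built from the table in the question.

```javascript
// Expected names keyed by id, taken from the table in the question.
const expectedNames = { 11: "SMITH", 10: "James", 9: "RIO" };

// Items from the sample API response.
const responseItems = [
  { id: 11, name: "SMITH", prefix: "SAM", code: "SSO" },
  { id: 10, name: "James", prefix: "JAM", code: "BBC" },
];

// Hypothetical helper: return the items whose name differs from the lookup.
function findMismatches(items, lookup) {
  // Object keys are strings, so the numeric id is converted explicitly,
  // which is what the id+'' expression does in the match above.
  return items.filter((item) => lookup[String(item.id)] !== item.name);
}

console.log(findMismatches(responseItems, expectedNames)); // empty array when all names match
```

An empty result means every item in the response carries the name expected for its id.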
Summary: I'll start with a JSON schema to describe the expectation. Notice that roles is a nested array of objects; I'm looking for a "smart query" that can fetch it in one single query.
{
"id": 1,
"first": "John",
"roles": [ // Expectation -> array of objects
{
"id": 1,
"name": "admin"
},
{
"id": 2,
"name": "accounts"
}
]
}
user
+----+-------+
| id | first |
+----+-------+
| 1 | John |
| 2 | Jane |
+----+-------+
role
+----+----------+
| id | name |
+----+----------+
| 1 | admin |
| 2 | accounts |
| 3 | sales |
+----+----------+
user_role
+---------+---------+
| user_id | role_id |
+---------+---------+
| 1 | 1 |
| 1 | 2 |
| 2 | 2 |
| 2 | 3 |
+---------+---------+
Attempt 01
In a naive approach I'd run two SQL queries in my Node.js code, with the help of multipleStatements:true in the connection string.
User.getUser = function(id) {
const sql = "SELECT id, first FROM user WHERE id = ?; \
SELECT role_id AS id, role.name from user_role \
INNER JOIN role ON user_role.role_id = role.id WHERE user_id = ?";
db.query(sql, [id, id], function(error, result){
const data = result[0][0]; // first query result
data.roles = result[1]; // second query result, join in code.
console.log(data);
});
};
Problem: The above code produces the expected JSON schema, but it takes two queries. I was able to narrow it down to the smallest possible unit of code because of multiple statements, but I don't have that luxury in other languages such as Java or C#, where I'd have to create two functions and two SQL queries. So I'm looking for a single-query solution.
Attempt 02
In an earlier attempt, with the help of the SO community, I was able to get close using a single query, but it can only produce an array of strings (not an array of objects).
User.getUser = function(id) {
const sql = "SELECT user.id, user.first, GROUP_CONCAT(role.name) AS roles FROM user \
INNER JOIN user_role ON user.id = user_role.user_id \
INNER JOIN role ON user_role.role_id = role.id \
WHERE user.id = ? \
GROUP BY user.id";
db.query(sql, id, function (error, result) {
const data = {
id: result[0].id, first: result[0].first,
roles: result[0].roles.split(",") // manual split to create array
};
console.log(data);
});
};
Attempt 02 Result
{
"id": 1,
"first": "John",
"roles": [ // array of string
"admin",
"accounts"
]
}
Producing an array of objects is such a common requirement that I suspect there must be something in SQL that I'm not aware of. Is there a better way to achieve this with a single query?
Or let me know if there is no such solution and production code out there really does this with two queries.
Attempt 03
Use role.id instead of role.name in GROUP_CONCAT(role.id); that way you get hold of the ids and can then use another subquery to get the associated role names. Just thinking...
SQL (doesn't work, but just to throw something out there for some thought):
SELECT
user.id, user.first,
GROUP_CONCAT(role.id) AS role_ids,
(SELECT id, name FROM role WHERE id IN role_ids) AS roles
FROM user
INNER JOIN user_role ON user.id = user_role.user_id
INNER JOIN role ON user_role.role_id = role.id
WHERE user.id = 1
GROUP BY user.id;
Edit
Based on Amit's answer, I've learned that SQL Server has such a solution using FOR JSON AUTO. Yes, this is what I'm looking for in MySQL.
To articulate precisely.
When you join tables, columns in the first table are generated as
properties of the root object. Columns in the second table are
generated as properties of a nested object.
Use this join query.
FOR JSON AUTO will return JSON for your query result:
SELECT U.UserID, U.Name, Roles.RoleID, Roles.RoleName
FROM [dbo].[User] as U
INNER JOIN [dbo].UserRole as UR ON UR.UserID=U.UserID
INNER JOIN [dbo].RoleMaster as Roles ON Roles.RoleID=UR.RoleMasterID
FOR JSON AUTO
The output of the above query is:
[
{
"UserID": 1,
"Name": "XYZ",
"Roles": [
{
"RoleID": 1,
"RoleName": "Admin"
}
]
},
{
"UserID": 2,
"Name": "PQR",
"Roles": [
{
"RoleID": 1,
"RoleName": "Admin"
},
{
"RoleID": 2,
"RoleName": "User"
}
]
},
{
"UserID": 3,
"Name": "ABC",
"Roles": [
{
"RoleID": 1,
"RoleName": "Admin"
}
]
}
]
Though this is an old question, I thought this might help others looking into the same issue. The script below should output the JSON schema you have been looking for.
SELECT roles, user.* from `user_table` AS user
INNER JOIN `roles_table` AS roles
ON user.id=roles.id
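Another option, if the database cannot build the nested JSON itself, is to run one JOIN that returns one row per user/role pair and fold those rows into the nested shape in application code. The following is a minimal sketch of the folding step only, with no database access. It assumes the joined rows expose the columns id, first, role_id and role_name (hypothetical aliases), and the function name nestRoles is also hypothetical.

```javascript
// Flat rows as a single three-table JOIN might return them
// (column aliases are assumptions, data taken from the question's tables).
const joinedRows = [
  { id: 1, first: "John", role_id: 1, role_name: "admin" },
  { id: 1, first: "John", role_id: 2, role_name: "accounts" },
  { id: 2, first: "Jane", role_id: 2, role_name: "accounts" },
  { id: 2, first: "Jane", role_id: 3, role_name: "sales" },
];

// Hypothetical helper: group flat rows into users with a nested roles array.
function nestRoles(rows) {
  const usersById = new Map();
  for (const row of rows) {
    // Create the user entry the first time its id is seen.
    if (!usersById.has(row.id)) {
      usersById.set(row.id, { id: row.id, first: row.first, roles: [] });
    }
    // Append this row's role as an object, not a string.
    usersById.get(row.id).roles.push({ id: row.role_id, name: row.role_name });
  }
  return [...usersById.values()];
}

console.log(JSON.stringify(nestRoles(joinedRows), null, 2));
```

This keeps the work to one query and one function, and the same folding step translates directly to Java or C#.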
Given this simple JSON file:
{
"EVT": {
"EVT_ID": "12345",
"LINES": {
"LINE": {
"LINE_NUM" : 1,
"AMT" : 100,
"EVT_DT" : "2018-01-01"
},
"LINE": {
"LINE_NUM" : 2,
"AMT" : 150,
"EVT_DT" : "2018-01-02"
}
}
}
}
We need to load that into a hive table. The ultimate goal is to flatten the json, something like this:
+--------+----------+-----+------------+
| EVT_ID | Line_Num | Amt | Evt_Dt |
+--------+----------+-----+------------+
| 12345 | 1 | 100 | 2018-01-01 |
| 12345 | 2 | 150 | 2018-01-02 |
+--------+----------+-----+------------+
Here's my current DDL for the table:
create table foo.bar (
`EVT` struct<
`EVT_ID`:string,
`LINES`:struct<
LINE: struct<`LINE_NUM`: int,`AMT`:int,`EVT_DT`:string>
>
>)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
It seems like the second "line" is overwriting the first. A simple select * from the table returns:
{"evt_id":"12345","lines":{"line":{"line_num":2,"amt":150,"evt_dt":"2018-01-02"}}}
What am I doing wrong?
The JSON and the table definition are both wrong. Repeating elements form an array, so LINES should be array<struct>, not struct<struct> (note the square brackets):
{
"EVT": {
"EVT_ID": "12345",
"LINES": [
{
"LINE_NUM" : 1,
"AMT" : 100,
"EVT_DT" : "2018-01-01"
},
{
"LINE_NUM" : 2,
"AMT" : 150,
"EVT_DT" : "2018-01-02"
}
]
}
}
You also do not need the "LINE": key, because each entry is just an array element.
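The overwriting effect of a repeated key is not specific to Hive. As an illustration only, here is how JavaScript's JSON.parse treats the two shapes. This does not show how the Hive JsonSerDe is implemented; it only demonstrates that a repeated object key cannot hold two values, while an array can. The function name flattenEvent is hypothetical.

```javascript
// Original shape: the key "LINE" is repeated inside one object.
const duplicateKeyJson =
  '{"EVT": {"EVT_ID": "12345", "LINES": {' +
  '"LINE": {"LINE_NUM": 1, "AMT": 100, "EVT_DT": "2018-01-01"},' +
  '"LINE": {"LINE_NUM": 2, "AMT": 150, "EVT_DT": "2018-01-02"}}}}';

// JSON.parse keeps only the last value for a repeated key.
const parsedDuplicate = JSON.parse(duplicateKeyJson);
console.log(parsedDuplicate.EVT.LINES.LINE.LINE_NUM); // 2, the first LINE is lost

// Corrected shape: LINES is an array, so both entries survive.
const arrayJson =
  '{"EVT": {"EVT_ID": "12345", "LINES": [' +
  '{"LINE_NUM": 1, "AMT": 100, "EVT_DT": "2018-01-01"},' +
  '{"LINE_NUM": 2, "AMT": 150, "EVT_DT": "2018-01-02"}]}}';

// Hypothetical helper: one flat row per array element, repeating EVT_ID.
function flattenEvent(doc) {
  return doc.EVT.LINES.map((line) => ({ EVT_ID: doc.EVT.EVT_ID, ...line }));
}

console.log(flattenEvent(JSON.parse(arrayJson)));
```

The flattened result has two rows, one per line, which is the target layout described in the question.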
I would like to convert the MySQL query below to a MongoDB query.
SELECT substring(o.schedule_datetime,1,4) 'Year',
SUM(IF(o.order_status in ('SUCCESS','#SUCCESS'),1,0)) 'SUCCESS'
FROM (
select group_concat(distinct ifnull(os.order_status,'') order by os.order_status
separator '#') 'order_status',schedule_datetime
from order_summary os group by order_number
)o group by 1 desc;
For example, I have this sample table:
id order_number product_number order_status schedule_datetime
1 001 001.1 SUCCESS 20180103
2 001 001.2 SUCCESS 20180102
3 111 111.1 SUCCESS 20171225
4 111 111.2 SUCCESS 20171224
5 222 222.1 INPROGRESS 20171122
6 222 222.2 SUCCESS 20171121
Using the above MySQL query for order status SUCCESS, I get this output:
Year SUCCESS
2018 1
2017 1
I have used a separator (#) to combine multiple statuses into a string and get the desired result by status. To get INPROGRESS, I just change the SUM function as shown below:
SUM(IF(o.order_status in ('INPROGRESS','INPROGRESS#SUCCESS', '#INPROGRESS','#INPROGRESS#SUCCESS'),1,0)) 'INPROGRESS'
I have tried to write the MongoDB query, but got stuck on how to combine the sum and if condition, as well as group_concat with a separator as I used in the MySQL query.
db.order_summary.aggregate([
{ "$project" :
{ "orderDate" : 1 , "subOrderDate" : { "$substr" : [ "$order_date" , 0 , 4]},
"order_number":"$order_number"
},
} ,
{ "$group":{
"_id": { "order_number" : "$order_number", "Year": "$subOrderDate", "order_status":{"$addToSet":{"$ifNull":["$order_status",'']}}}
}
},
{ "$group": {
"_id": "$_id.Year", "count": { "$sum": 1 }
}
},
{ "$sort" : { "_id" : -1}}
])
Any help will be much appreciated, thanks.
There is no GROUP_CONCAT kind of functionality in MongoDB.
In version 3.4 you can compare arrays for matching values in the last group with the $in operator.
The first $group gets all the distinct order statuses for each combination of order number and order status.
$sort sorts the order statuses.
The second $group pushes all the sorted status values by order number.
The final $group compares the statuses for each year against the input list of statuses and outputs the total count of all matches.
db.order_summary.aggregate([{"$project":{
"schedule_datetime":1,
"order_number":1,
"order_status":{"$ifNull":["$order_status",""]}
}},
{"$group":{
"_id":{
"order_number":"$order_number",
"order_status":"$order_status"
},
"schedule_datetime":{"$first": "$schedule_datetime"}
}},
{"$sort":{"_id.order_status": 1}},
{"$group":{
"_id":{
"order_number":"$_id.order_number"
},
"schedule_datetime":{"$first": "$schedule_datetime"},
"order_status":{"$push": "$_id.order_status"}
}},
{"$group":{
"_id":{"$substr":["$schedule_datetime",0,4]},
"count":{
"$sum":{
"$cond": [
{"$in": ["$order_status",[["SUCCESS"], ["","SUCCESS"]]]},
1,
0]
}
}
}},
{"$sort":{"_id":-1}}])