MySQL intersect on strings? - mysql

I do have two tables:
Quest
- (int) id
- (text) characters
User
- (int) id
- (text) characters
Entries look like this:
Quest
id | characters
1 | abcdefgh
2 | mkorti
3 | afoxi
4 | bac
User
id | characters
1 | abcd
Now I want to select the easiest Quest for a User. The easiest quest is the one with the most intersections between quest.characters and user.characters. So in this example the list would look like this (for user.id = 1):
questid | easiness
4 | 100
1 | 50
3 | 20
2 | 0
The easiness simply shows what percentage of the quest's characters was matched. Is it possible with MySQL to make intersections of columns like this? What's the performance like? In fact I do have relations as well (quest -> characters and user -> characters), but I guess that approach is not very performant, as there are a few thousand quests and also a few thousand characters.
Update #1
Okay, relational still seems to be the way to go. Now my tables look like this:
CREATE TABLE IF NOT EXISTS `quest` (
`questid` int(10) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`questid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;
CREATE TABLE IF NOT EXISTS `questcharacters` (
`questid` int(10) unsigned NOT NULL,
`characterid` int(10) unsigned NOT NULL,
PRIMARY KEY (`questid`,`characterid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `single_character` (
`characterid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`single_char` varchar(10) NOT NULL,
PRIMARY KEY (`characterid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `user` (
`userid` int(10) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`userid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `usercharacters` (
`userid` int(10) unsigned NOT NULL,
`characterid` int(10) unsigned NOT NULL,
PRIMARY KEY (`userid`,`characterid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
PS: In case you wonder why single_char is VARCHAR(10): I will store multi-byte values and I don't know how MySQL handles them for CHAR(1), so I was generous there.
Update #2
My query for now is:
SELECT usercharacters.userid, questcharacters.questid
FROM `usercharacters`
LEFT OUTER JOIN questcharacters ON usercharacters.characterid = usercharacters.characterid
GROUP BY questcharacters.questid, usercharacters.userid;
But how to calculate the easiness/overlapping characters? On which field do I have to apply COUNT()?
Update #3
Okay, seems like I got it working with this query (uses a subselect):
SELECT usercharacters.userid as uid, questcharacters.questid as qid, (SELECT COUNT(questcharacters.characterid) FROM questcharacters LEFT OUTER JOIN usercharacters ON questcharacters.characterid = usercharacters.characterid WHERE questcharacters.questid = qid) as questcount
FROM `usercharacters`
LEFT OUTER JOIN questcharacters ON usercharacters.characterid = usercharacters.characterid
GROUP BY questcharacters.questid, usercharacters.userid;
Update #4
SELECT usercharacters.userid as uid, questcharacters.questid as qid, (SELECT COUNT(questcharacters.characterid) FROM questcharacters LEFT OUTER JOIN usercharacters ON questcharacters.characterid = usercharacters.characterid WHERE questcharacters.questid = qid) as user_knows, (SELECT COUNT(questcharacters.characterid) FROM questcharacters WHERE questcharacters.questid = qid) as total_characters
FROM `usercharacters`
LEFT OUTER JOIN questcharacters ON usercharacters.characterid = usercharacters.characterid
GROUP BY questcharacters.questid, usercharacters.userid
ORDER BY total_characters / user_knows DESC;
The only thing missing now is selecting the easiness (as in the ORDER BY clause). Does anyone know how to do this?

So this is my final and working solution:
SELECT usercharacters.userid AS uid,
questcharacters.questid AS qid,
(SELECT Count(questcharacters.characterid)
FROM questcharacters
LEFT OUTER JOIN usercharacters
ON questcharacters.characterid =
usercharacters.characterid
WHERE questcharacters.questid = qid) AS user_knows,
(SELECT Count(questcharacters.characterid)
FROM questcharacters
WHERE questcharacters.questid = qid) AS total_characters,
(SELECT ( Count(questcharacters.characterid) / (SELECT
Count(questcharacters.characterid)
FROM questcharacters
WHERE
questcharacters.questid = qid) )
FROM questcharacters
LEFT OUTER JOIN usercharacters
ON questcharacters.characterid =
usercharacters.characterid
WHERE questcharacters.questid = qid) AS ratio
FROM `usercharacters`
LEFT OUTER JOIN questcharacters
ON usercharacters.characterid = usercharacters.characterid
GROUP BY questcharacters.questid,
usercharacters.userid
ORDER BY ratio DESC;
Do I really need that many sub-selects?

If you actually have questcharacters and usercharacters tables, then that is the best way to go:
SELECT qc.questid,
COUNT(*) AS NumCharacters,
COUNT(uc.characterid) AS NumMatches,
COUNT(uc.characterid) / COUNT(*) AS Easiness
FROM questcharacters qc
LEFT OUTER JOIN usercharacters uc
ON uc.characterid = qc.characterid
AND uc.userid = 1
GROUP BY qc.questid
ORDER BY Easiness DESC
LIMIT 1
If you have them only as strings -- the SQL is not pretty. You have to do a cross join and lots of string manipulation. The best approach is to have things more normalized in the form of a relational database (one row per list element), rather than having lists embedded in strings.
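As a rough illustration of the normalized approach, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database (used only as a stand-in for MySQL) and computes the easiness per quest. The simplified two-column tables and the `ch` column are assumptions made for brevity; they are not the asker's schema. The `100.0 *` factor is there because SQLite would otherwise do integer division. By the question's definition, quest 3 (afoxi) shares only the letter a with abcd, which gives 20 percent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE questcharacters (questid INTEGER, ch TEXT, PRIMARY KEY (questid, ch));
CREATE TABLE usercharacters  (userid  INTEGER, ch TEXT, PRIMARY KEY (userid, ch));
""")

# Sample strings from the question, split into one row per character.
quests = {1: "abcdefgh", 2: "mkorti", 3: "afoxi", 4: "bac"}
users = {1: "abcd"}
conn.executemany("INSERT INTO questcharacters VALUES (?, ?)",
                 [(qid, c) for qid, chars in quests.items() for c in chars])
conn.executemany("INSERT INTO usercharacters VALUES (?, ?)",
                 [(uid, c) for uid, chars in users.items() for c in chars])

# One pass: every quest character is kept by the LEFT JOIN, and
# COUNT(uc.ch) counts only those the given user actually knows.
easiness = conn.execute("""
    SELECT qc.questid,
           100.0 * COUNT(uc.ch) / COUNT(*) AS easiness
    FROM questcharacters qc
    LEFT JOIN usercharacters uc
           ON uc.ch = qc.ch AND uc.userid = ?
    GROUP BY qc.questid
    ORDER BY easiness DESC, qc.questid
""", (1,)).fetchall()

for questid, pct in easiness:
    print(questid, pct)
```

Putting the user filter in the ON clause rather than in WHERE is what keeps quests with zero matching characters in the result.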

Related

mySql - search rows without reference in second table

I have 2 tables.
How do I search for all rows in the first table that have no reference in the second table?
The connection field is: res_srvs.id = inv_supp2srv.srvID
So, I want to get all rows of table "res_srvs" that have no srvID in table "inv_supp2srv".
TABLE: res_srvs
id int(11)
clientID int(6)
resNum int(9)
net decimal(7,2)
tax decimal(7,2)
from_date date
TABLE: inv_supp2srv
clientID int(6)
invNum int(10)
srvID int(11)
amount decimal(7,2)
valid tinyint(1)
This is what I tried:
SELECT srv.net , srv.tax , srv.net+srv.tax AS amount, srv.id AS srv_id
FROM res_srvs AS srv , inv_supp2srv AS i2s
WHERE srv.clientID = 1
AND srv.from_date >= '2020-03-01'
AND i2s.clientID = 1
AND i2s.srvID = srv.id
AND (NOT EXISTS
(
SELECT *
FROM inv_supp2srv AS i2s
WHERE i2s.srvID = srv.id
)
)
What you want is a left outer join with exclusion:
SELECT r.*
FROM res_srvs r
LEFT JOIN inv_supp2srv i
ON r.id = i.srvID
WHERE i.srvID IS NULL
-- AND (your other WHERE conditions go here)
;
You can LEFT JOIN the second table and keep only the rows where the joined value is NULL:
SELECT srv.net , srv.tax , srv.net+srv.tax AS amount, srv.id AS srv_id
FROM res_srvs AS srv
LEFT JOIN inv_supp2srv AS i2s ON i2s.srvID = srv.id
WHERE
srv.clientID = 1
AND srv.from_date >= '2020-03-01'
-- AND i2s.clientID = 1 (not relevant here: it would discard the unmatched rows)
AND i2s.srvID IS NULL;
Another approach is to use a NOT EXISTS condition:
SELECT srv.net , srv.tax , srv.net+srv.tax AS amount, srv.id AS srv_id
FROM res_srvs AS srv
WHERE
srv.clientID = 1
AND srv.from_date >= '2020-03-01'
AND NOT EXISTS (
SELECT srvID FROM inv_supp2srv AS i2s WHERE i2s.srvID = srv.id
);
I want to get all table res_srvs rows that have no srvID in table inv_supp2srv.
It looks like you are overcomplicating this. I don't see the point for the join between the tables in the outer query - it attempts to match the tables, which contradicts the not exists condition.
I think you just want:
select r.*
from res_srvs r
where
r.from_date >= '2020-03-01'
and r.clientID = 1
and not exists (
select 1
from inv_supp2srv i
where i.srvID = r.id and i.clientID = r.clientID
)
I am unsure whether you want clientID in the correlation clause or not - your query makes it look like it is the case, so I added it.
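To see that the LEFT JOIN ... IS NULL form and the NOT EXISTS form pick the same rows, here is a small sketch using an in-memory SQLite database as a stand-in for MySQL. The tables are trimmed to the columns the queries touch, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE res_srvs (id INTEGER PRIMARY KEY, clientID INTEGER, from_date TEXT);
CREATE TABLE inv_supp2srv (clientID INTEGER, invNum INTEGER, srvID INTEGER);
-- made-up sample rows: services 1 and 3 are invoiced, 2 and 4 are not
INSERT INTO res_srvs VALUES (1, 1, '2020-03-05'), (2, 1, '2020-03-10'),
                            (3, 1, '2020-04-01'), (4, 1, '2020-02-01');
INSERT INTO inv_supp2srv VALUES (1, 100, 1), (1, 101, 3), (1, 102, 3);
""")

# Anti-join, variant 1: LEFT JOIN and keep only the unmatched rows.
left_join_ids = [row[0] for row in conn.execute("""
    SELECT srv.id
    FROM res_srvs AS srv
    LEFT JOIN inv_supp2srv AS i2s ON i2s.srvID = srv.id
    WHERE srv.clientID = 1
      AND srv.from_date >= '2020-03-01'
      AND i2s.srvID IS NULL
    ORDER BY srv.id
""")]

# Anti-join, variant 2: correlated NOT EXISTS.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT srv.id
    FROM res_srvs AS srv
    WHERE srv.clientID = 1
      AND srv.from_date >= '2020-03-01'
      AND NOT EXISTS (SELECT 1 FROM inv_supp2srv AS i2s WHERE i2s.srvID = srv.id)
    ORDER BY srv.id
""")]

print(left_join_ids, not_exists_ids)
```

Service 4 has no invoice either, but it falls outside the date filter, so both variants return only service 2.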

MySQL: Update with join in subquery

I have a table with products and a table with reviews of those products. The products table holds both parent and child products. Each parent product should get all reviews from its child products. I did:
DROP TABLE IF EXISTS products;
CREATE TABLE products (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`parent` int(10) unsigned DEFAULT NULL,
`review` decimal(3,2) DEFAULT NULL,
PRIMARY KEY(id)
);
DROP TABLE IF EXISTS reviews;
CREATE TABLE reviews (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`product` int(10) unsigned NOT NULL,
`review` decimal(3,2) DEFAULT NULL,
PRIMARY KEY(id)
);
INSERT INTO products SET id=1, parent=null;
INSERT INTO products SET id=2, parent=1;
INSERT INTO products SET id=3, parent=1;
INSERT INTO reviews SET product=2, review=5;
INSERT INTO reviews SET product=3, review=5;
INSERT INTO reviews SET product=3, review=4;
INSERT INTO products SET id=4, parent=null;
INSERT INTO products SET id=5, parent=4;
INSERT INTO reviews SET product=5, review=4;
INSERT INTO reviews SET product=5, review=2;
UPDATE products
SET products.review=
(SELECT SUM(reviews.review)/COUNT(reviews.review) FROM reviews
LEFT JOIN products p ON p.parent = products.id
)
WHERE products.parent IS NULL;
But, to my surprise, that gives me an error:
ERROR 1054 (42S22): Unknown column 'products.id' in 'on clause'
Any suggestions on how to do it correctly? The idea is that product 1 should get a review of 14/3 = 4.67 and product 4 should get a review of 6/2 = 3.
The outer products table is not visible in the subquery's ON clause. Use the following syntax instead:
UPDATE products pp
LEFT JOIN (
SELECT pc.parent, SUM(r.review)/COUNT(r.review) as 'rev'
FROM reviews r
LEFT JOIN products pc on r.product = pc.id
GROUP BY pc.parent
) pcc ON pcc.parent = pp.id
SET pp.review=pcc.rev
WHERE pp.parent IS NULL;
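The core of this approach is the grouped subquery that yields one average per parent. A minimal sketch of just that part, run against the question's sample data in an in-memory SQLite database (a stand-in for MySQL), could look like this. AVG is used instead of SUM/COUNT because SQLite would apply integer division to integer values; over non-NULL reviews the two are equivalent. The REAL column type is likewise a SQLite-side simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, parent INTEGER, review REAL);
CREATE TABLE reviews  (id INTEGER PRIMARY KEY, product INTEGER NOT NULL, review REAL);
INSERT INTO products (id, parent) VALUES (1, NULL), (2, 1), (3, 1), (4, NULL), (5, 4);
INSERT INTO reviews (product, review) VALUES (2, 5), (3, 5), (3, 4), (5, 4), (5, 2);
""")

# One row per parent: the average over all reviews of its child products.
per_parent = conn.execute("""
    SELECT pc.parent, AVG(r.review) AS rev
    FROM reviews r
    JOIN products pc ON pc.id = r.product
    GROUP BY pc.parent
    ORDER BY pc.parent
""").fetchall()

for parent, rev in per_parent:
    print(parent, round(rev, 2))
```

Joining this result back to the parent rows by id is what the UPDATE then does.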
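Since you've declared p as an alias for the products table, you need to use it throughout the subquery. So, in your LEFT JOIN clause just use p.id instead of products.id. Note that this only silences the error: the subquery is no longer correlated with the row being updated, so every parent product receives the same average over all reviews.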
UPDATE products
SET products.review=
(SELECT SUM(reviews.review)/COUNT(reviews.review) FROM reviews
LEFT JOIN products p ON p.parent = p.id
)
WHERE products.parent IS NULL;
At its heart, you appear to be looking for this value:
SELECT SUM(r.review)/(SELECT COUNT(*) FROM products) n FROM reviews r;
+----------+
| n |
+----------+
| 4.666667 |
+----------+
So, something like...
UPDATE products x
JOIN (SELECT SUM(r.review)/(SELECT COUNT(*) FROM products) n FROM reviews r) y
SET x.review = y.n
WHERE x.review IS NULL;

Using MySQL COUNT(1), COUNT(2) ...etc using JOIN

I have 4 tables:
Table talks
table talks_fan
table talks_follow
table talks_comments
What I'm trying to achieve is counting all comments, fans, and followers for every single talk.
All tables have a talk_id column; it is the primary key only in the talks table.
I came up with this so far:
SELECT
g.*,
COUNT( m.talk_id ) AS num_of_comments,
COUNT( f.talk_id ) AS num_of_followers
FROM
talks AS g
LEFT JOIN talks_comments AS m
USING ( talk_id )
LEFT JOIN talks_follow AS f
USING ( talk_id )
WHERE g.privacy = 'public'
GROUP BY g.talk_id
ORDER BY g.created_date DESC
LIMIT 30;
I also tried using this method
SELECT
t.*,
COUNT(b.talk_id) AS comments,
COUNT(bt.talk_id) AS followers
FROM
talks t
LEFT JOIN talks_follow bt
ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b
ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
Both give me the same number for comments and followers. Why?
Update: Create Statements
CREATE TABLE IF NOT EXISTS `talks` (
`talk_id` bigint(20) NOT NULL AUTO_INCREMENT,
`user_id` mediumint(9) NOT NULL,
`title` varchar(255) NOT NULL,
`content` text NOT NULL,
`created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`privacy` enum('public','private') NOT NULL DEFAULT 'private',
PRIMARY KEY (`talk_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=7 ;
CREATE TABLE IF NOT EXISTS `talks_comments` (
`comment_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` mediumint(9) NOT NULL,
`comment` text NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`comment_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;
CREATE TABLE IF NOT EXISTS `talks_fan` (
`fan_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` bigint(20) NOT NULL,
`created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`fan_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=4 ;
CREATE TABLE IF NOT EXISTS `talks_follow` (
`follow_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` mediumint(9) NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`follow_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
The final query that works
SELECT t.* , COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
LEFT JOIN talks_fan c ON c.talk_id = t.talk_id
WHERE t.privacy = 'public'
GROUP BY t.talk_id
ORDER BY t.created_date DESC
LIMIT 30
EDIT: Final answer to the whole issue...
I have modified the query and written some PHP (CodeIgniter) code to solve my issue, upon the recommendation of @Bill Karwin:
$sql="
SELECT t.*,
COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans,
GROUP_CONCAT( DISTINCT c.user_id ) AS list_of_fans,
GROUP_CONCAT( DISTINCT bt.user_id ) AS list_of_follower
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
LEFT JOIN talks_fan c ON c.talk_id = t.talk_id
WHERE t.privacy = 'public'
GROUP BY t.talk_id
ORDER BY t.created_date DESC
LIMIT 30
";
$query = $this->db->query($sql);
if($query->num_rows() > 0)
{
$results = array();
foreach($query->result_array() AS $talk){
$fan_user_id = explode(",", $talk['list_of_fans']);
foreach($fan_user_id AS $user){
if($user == 1 /* this supposed to be user id or session*/){
$talk['list_of_fans'] = 'yes';
}
}
$follower_user_id = explode(",", $talk['list_of_follower']);
foreach($follower_user_id AS $user){
if($user == 1 /* this supposed to be user id or session*/){
$talk['list_of_follower'] = 'yes';
}
}
$results[] = array(
'talk_id' => $talk['talk_id'],
'user_id' => $talk['user_id'],
'title' => $talk['title'],
'created_date' => $talk['created_date'],
'comments' => $talk['comments'],
'followers' => $talk['followers'],
'fans' => $talk['fans'],
'list_of_fans' => $talk['list_of_fans'],
'list_of_follower' => $talk['list_of_follower']
);
}
}
I still believe it could be optimized in the DB so that I can just use the result.
I'm thinking that if there are 1000 followers and 2000 fans for every single talk, the result will take much longer to load, and even longer if you multiply those numbers by 10. Or am I mistaken here?
EDIT: adding benchmark for the query test...
I used the CodeIgniter profiler to measure how long the query takes to finish executing.
I also added data to the tables gradually.
The results are as follows.
Testing the DB after inserting data into it
Query Results time
table Talks
---------------
Table Rows: 50 rows
Time: 0.0173 seconds
Table Rows: 644 rows
Time: 0.0535 seconds
Table Rows: 1250 rows
Time: 0.0856 seconds
Adding data to other tables
--------------------------
Talks = 1250 rows
talks_follow = 4115
talks_fan = 10 rows
Time: 2.656 seconds
Adding data to other tables
--------------------------
Talks = 1250 rows
talks_follow = 4115
talks_fan = 10 rows
talks_comments = 3650 rows
Time: 10.156 seconds
After replacing LEFT JOIN with STRAIGHT_JOIN
Time: 6.675 seconds
It seems that it's extremely heavy on the DB.
Now I face another dilemma: how to improve its performance.
Edited: using @leonardo_assumpcao's suggestion
After rebuilding the DB with indexes on a few fields, as @leonardo_assumpcao suggested:
Adding data to other tables
--------------------------
Talks = 6000 Rows
talks_follow = 10000 Rows
talks_fan = 10000 Rows
talks_comments = 10000 Rows
Time: 17.940 seconds
Is this normal for a DB with this much data?
I can say this is (at least) one of the coolest select statements I improved today.
SELECT STRAIGHT_JOIN
t.* ,
COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans
FROM
(
SELECT * FROM talks
WHERE privacy = 'public'
ORDER BY created_date DESC
LIMIT 0, 30
) AS t
LEFT JOIN talks_follow bt ON (bt.talk_id = t.talk_id)
LEFT JOIN talks_comments b ON (b.talk_id = t.talk_id)
LEFT JOIN talks_fan c ON (c.talk_id = t.talk_id)
GROUP BY t.talk_id ;
But it seems to me that your problem resides in your tables. A first step toward efficient queries is to index every field involved in your joins.
I've made some modifications to the tables you showed above; you can see the code here (updated).
Quite interesting, isn't it? Since we're here, take your EER model as well:
First try it on a MySQL test database. Hopefully it will solve your performance troubles.
(Forgive my english, it's my second language)
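To illustrate the indexing advice, here is a small sketch that adds an index on the join column of one child table and asks the database which access path it picks. It uses an in-memory SQLite database as a stand-in; the index name idx_comments_talk and the generated rows are made up. In MySQL the same CREATE INDEX statement should work, and EXPLAIN plays the role of SQLite's EXPLAIN QUERY PLAN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE talks_comments (comment_id INTEGER PRIMARY KEY, talk_id INTEGER NOT NULL);
-- index the column used in the join / lookup
CREATE INDEX idx_comments_talk ON talks_comments (talk_id);
""")
conn.executemany("INSERT INTO talks_comments (talk_id) VALUES (?)",
                 [(i % 50,) for i in range(1000)])

# Ask the planner how it would count the comments of one talk.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM talks_comments WHERE talk_id = ?", (7,)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the description
print(plan_text)
```

If the printed plan mentions the index name, the lookup no longer needs a full table scan. The same check can be repeated for talks_follow and talks_fan.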
You can force this into one query like so:
SELECT COUNT(*) num, 'talks' item FROM talks
UNION
SELECT COUNT(*) num, 'talks_fan' item FROM talks_fan
UNION
SELECT COUNT(*) num, 'talks_follow' item FROM talks_follow
UNION
SELECT COUNT(*) num, 'talks_comment' item FROM talks_comments
This will give you a four-row result set with one row per table. Each row is the count in a particular table.
If you must get it all into a single row you can do a pivot like so.
SELECT
SUM( CASE item WHEN 'talks' THEN num ELSE 0 END ) AS 'talks',
SUM( CASE item WHEN 'talks_fan' THEN num ELSE 0 END ) AS 'talks_fan',
SUM( CASE item WHEN 'talks_follow' THEN num ELSE 0 END ) AS 'talks_follow',
SUM( CASE item WHEN 'talks_comment' THEN num ELSE 0 END ) AS 'talks_comment'
FROM
( SELECT COUNT(*) num, 'talks' item FROM talks
UNION
SELECT COUNT(*) num, 'talks_fan' item FROM talks_fan
UNION
SELECT COUNT(*) num, 'talks_follow' item FROM talks_follow
UNION
SELECT COUNT(*) num, 'talks_comment' item FROM talks_comments
) counts
(This doesn't take into account your WHERE g.privacy = clause because I don't understand that. But you could add a WHERE clause to one of the four queries in the UNION to handle that.)
Notice that this truly is four queries on four separate tables coerced into a single query.
And, by the way, there is no difference in value between COUNT(*) and COUNT(id) when id is the primary key of the table. COUNT(id) doesn't count the rows for which the id is NULL, but if id is the primary key, then it is NOT NULL. But COUNT(*) is faster, so use it.
Edit: if you need the number of fan, follow, and comment rows for each distinct talk, do this. It's the same idea of doing a union and a pivot, but with an extra parameter.
SELECT
talk_id,
SUM( CASE item WHEN 'talks_fan' THEN num ELSE 0 END ) AS 'talks_fan',
SUM( CASE item WHEN 'talks_follow' THEN num ELSE 0 END ) AS 'talks_follow',
SUM( CASE item WHEN 'talks_comment' THEN num ELSE 0 END ) AS 'talks_comment'
FROM
(
SELECT talk_id, COUNT(*) num, 'talks_fan' item
FROM talks_fan
GROUP BY talk_id
UNION
SELECT talk_id, COUNT(*) num, 'talks_follow' item
FROM talks_follow
GROUP BY talk_id
UNION
SELECT talk_id, COUNT(*) num, 'talks_comment' item
FROM talks_comments
GROUP BY talk_id
) counts
GROUP BY talk_id
After doing this for (too) many years, I've discovered that the best way to describe a query you need is to say to yourself "I need a result set with one row for each xxx, with columns for yyy, zzz, and qqq."
The reason the counts are the same is that it's counting rows after the joins have combined the tables. By joining to multiple tables, you're creating a Cartesian product.
Basically, you're counting not only how many comments per talk, but how many comments * followers per talk. Then you count the followers as how many followers * comments per talk. Thus the counts are the same, and they're all way too high.
Here's a simpler way to write a query to count each distinct comment, follower, etc. only once:
SELECT t.*,
COUNT(DISTINCT b.comment_id) AS comments,
COUNT(DISTINCT bt.follow_id) AS followers
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
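The inflation caused by joining two child tables at once is easy to reproduce. The following sketch uses an in-memory SQLite database (a stand-in for MySQL) with one talk, three comments, and two followers; the tables are cut down to the key columns and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE talks (talk_id INTEGER PRIMARY KEY);
CREATE TABLE talks_comments (comment_id INTEGER PRIMARY KEY, talk_id INTEGER);
CREATE TABLE talks_follow (follow_id INTEGER PRIMARY KEY, talk_id INTEGER);
INSERT INTO talks VALUES (1);
INSERT INTO talks_comments (talk_id) VALUES (1), (1), (1);   -- 3 comments
INSERT INTO talks_follow (talk_id) VALUES (1), (1);          -- 2 followers
""")

joins = """
    FROM talks t
    LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
    LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
    GROUP BY t.talk_id
"""

# Plain COUNT sees 3 * 2 joined rows for both columns.
naive = conn.execute(
    "SELECT COUNT(b.talk_id), COUNT(bt.talk_id)" + joins).fetchone()

# COUNT(DISTINCT key) counts each comment and each follower once.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT b.comment_id), COUNT(DISTINCT bt.follow_id)" + joins).fetchone()

print(naive, distinct)
```

Both plain counts come out as the product of the two child counts, while the DISTINCT variant recovers the real numbers.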
Re your comment: I wouldn't fetch all the followers in the same query. You could do it this way:
SELECT t.*,
COUNT(DISTINCT b.comment_id) AS comments,
COUNT(DISTINCT bt.follow_id) AS followers,
GROUP_CONCAT(DISTINCT bt.follower_name) AS list_of_followers
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
But what you'd get back is a single string with the follower names separated by commas. Now you have to write application code to split the string on commas, you have to worry if some follower names actually contain commas already, and so on.
I'd do a second query, fetching the followers for a given talk. It's likely you want to display the followers only for a specific talk anyway.
SELECT follower_name
FROM talks_follow
WHERE talk_id = ?

MySQL query -- retrieve data from two different years

I cannot seem to get this MySQL query right. My table contains yearly inventory data for retail stores. Here's the table schema:
CREATE TABLE IF NOT EXISTS `inventory_data` (
inventory_id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
store_id smallint unsigned NOT NULL,
inventory_year smallint unsigned NOT NULL,
shortage_dollars decimal(10,2) unsigned NOT NULL
)
engine=INNODB;
Every store is assigned to a district, which is stored in this table (some non-relevant fields removed):
CREATE TABLE IF NOT EXISTS `stores` (
store_id smallint unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
district_id smallint unsigned not null
)
engine=INNODB;
I want to be able to retrieve the shortage dollar amounts for two given years for all the stores within a given district. Inventory data for each store is only added to the inventory_data table when the inventory is completed, so not all stores within a district will all be represented all the time.
This query works to return inventory data for all stores within a given district for a given year (ex: stores in district 1 for 2012):
SELECT stores.store_id, inventory_data.shortage_dollars
FROM stores
LEFT JOIN inventory_data ON (stores.store_id = inventory_data.store_id)
AND inventory_data.inventory_year = 2012
WHERE stores.district_id = 1
But, I need to be able to get data for stores within a district for two years, such that the data looks something close to this:
store_id | yr2011 | yr2012
For the specific result format that you need, you may try the following query:
SELECT `s`.`store_id`, `i`.`shortage_dollars` AS `yr2011`, `i1`.`shortage_dollars` AS `yr2012`
FROM `stores` `s`
LEFT JOIN `inventory_data` `i` ON `s`.`store_id` = `i`.`store_id`
AND `i`.`inventory_year` = 2011
LEFT JOIN `inventory_data` `i1` ON `s`.`store_id` = `i1`.`store_id`
AND `i1`.`inventory_year` = 2012
WHERE `s`.`district_id` = 1
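A quick way to check the double LEFT JOIN is to run it on a few made-up rows. The sketch below uses an in-memory SQLite database as a stand-in for MySQL and assumes at most one inventory row per store and year; store 2 deliberately has no 2011 inventory.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stores (store_id INTEGER PRIMARY KEY, district_id INTEGER NOT NULL);
CREATE TABLE inventory_data (
    inventory_id INTEGER PRIMARY KEY,
    store_id INTEGER NOT NULL,
    inventory_year INTEGER NOT NULL,
    shortage_dollars REAL NOT NULL
);
-- made-up sample rows
INSERT INTO stores VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO inventory_data (store_id, inventory_year, shortage_dollars) VALUES
    (1, 2011, 100.0), (1, 2012, 150.0), (2, 2012, 75.5), (3, 2011, 10.0);
""")

# Join the inventory table once per year; the year filter stays in ON
# so stores without data for that year still appear (with NULL).
pivot = conn.execute("""
    SELECT s.store_id,
           i.shortage_dollars  AS yr2011,
           i1.shortage_dollars AS yr2012
    FROM stores s
    LEFT JOIN inventory_data i
           ON s.store_id = i.store_id AND i.inventory_year = 2011
    LEFT JOIN inventory_data i1
           ON s.store_id = i1.store_id AND i1.inventory_year = 2012
    WHERE s.district_id = 1
    ORDER BY s.store_id
""").fetchall()

for row in pivot:
    print(row)
```

Store 2 shows up with None for 2011, which matches the requirement that stores with incomplete inventories are still listed.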
Alternatively, you may try the following simpler query, which returns one row per store and year:
SELECT `s`.`store_id`, `i`.`inventory_year`, `i`.`shortage_dollars`
FROM `stores` `s`
LEFT JOIN `inventory_data` `i` ON `s`.`store_id` = `i`.`store_id`
AND `i`.`inventory_year` IN (2011, 2012)
WHERE `s`.`district_id` = 1
ORDER BY `s`.`store_id`, `i`.`inventory_year`
Hope it helps!
SELECT
stores.store_id,
inventory_data.inventory_year,
inventory_data.shortage_dollars
FROM
(SELECT * FROM stores WHERE district_id = 1) stores
LEFT JOIN
(SELECT * FROM inventory_data
WHERE inventory_year IN (2011,2012)) inventory_data
USING (store_id)
;
or
SELECT
stores.store_id,
GROUP_CONCAT(inventory_data.shortage_dollars) dollars_per_year
FROM
(SELECT * FROM stores WHERE district_id = 1) stores
LEFT JOIN
(SELECT * FROM inventory_data
WHERE inventory_year IN (2011,2012)) inventory_data
USING (store_id)
GROUP BY stores.store_id;

Mysql query to check if all sub_items of a combo_item are active

I am trying to write a query that looks through all combo_items and only returns the ones where all sub_items that it references have Active=1.
I think I should be able to count how many sub_items there are in a combo_item total and then compare it to how many are Active, but I am failing pretty hard at figuring out how to do that...
My table definitions:
CREATE TABLE `combo_items` (
`c_id` int(11) NOT NULL,
`Label` varchar(20) NOT NULL,
PRIMARY KEY (`c_id`)
)
CREATE TABLE `sub_items` (
`s_id` int(11) NOT NULL,
`Label` varchar(20) NOT NULL,
`Active` int(1) NOT NULL,
PRIMARY KEY (`s_id`)
)
CREATE TABLE `combo_refs` (
`r_id` int(11) NOT NULL,
`c_id` int(11) NOT NULL,
`s_id` int(11) NOT NULL,
PRIMARY KEY (`r_id`)
)
So for each combo_item, there are at least 2 rows in the combo_refs table linking to multiple sub_items. My brain is about to make bigbadaboom :(
I would usually just join the three tables and then, per combo-item, sum up the total number of sub-items and the number of active sub-items:
SELECT ci.c_id, ci.Label, SUM(1) AS total_sub_items, SUM(si.Active) AS active_sub_items
FROM combo_items AS ci
INNER JOIN combo_refs AS cr ON cr.c_id = ci.c_id
INNER JOIN sub_items AS si ON si.s_id = cr.s_id
GROUP BY ci.c_id
Of course, instead of using SUM(1) you could just say COUNT(ci.c_id), but I wanted an analog of SUM(si.Active).
The approach proposed assumes Active to be 1 (active) or 0 (not active).
To get only those combo-items whose sub-items are all active, compare the two sums in a HAVING clause. (A plain WHERE si.Active = 1 would not do: it also returns combo-items that have only some active sub-items.)
SELECT ci.c_id, ci.Label
FROM combo_items AS ci
INNER JOIN combo_refs AS cr ON cr.c_id = ci.c_id
INNER JOIN sub_items AS si ON si.s_id = cr.s_id
GROUP BY ci.c_id
HAVING SUM(si.Active) = SUM(1)
By the way, INNER JOIN ensures that there is at least one sub-item per combo-item at all.
(I have not tested it.)
See this answer:
MySQL: Selecting foreign keys with fields matching all the same fields of another table
Select ...
From combo_items As C
Where Exists (
Select 1
From sub_items As S1
Join combo_refs As CR1
On CR1.s_id = S1.s_id
Where CR1.c_id = C.c_id
)
And Not Exists (
Select 1
From sub_items As S2
Join combo_refs As CR2
On CR2.s_id = S2.s_id
Where CR2.c_id = C.c_id
And S2.Active = 0
)
The first subquery ensures that at least one sub_item exists. The second ensures that none of the sub_items are inactive.
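For comparison, here is a small sketch that runs the EXISTS / NOT EXISTS form next to a grouped variant that compares the number of active sub-items with the total. It uses an in-memory SQLite database as a stand-in for MySQL, the sample rows are made up, and the grouped variant assumes Active is always 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE combo_items (c_id INTEGER PRIMARY KEY, Label TEXT NOT NULL);
CREATE TABLE sub_items (s_id INTEGER PRIMARY KEY, Label TEXT NOT NULL, Active INTEGER NOT NULL);
CREATE TABLE combo_refs (r_id INTEGER PRIMARY KEY, c_id INTEGER NOT NULL, s_id INTEGER NOT NULL);
-- made-up sample rows
INSERT INTO combo_items VALUES (1, 'all active'), (2, 'one inactive'), (3, 'no refs');
INSERT INTO sub_items VALUES (10, 'a', 1), (11, 'b', 1), (12, 'c', 0);
INSERT INTO combo_refs VALUES (1, 1, 10), (2, 1, 11), (3, 2, 10), (4, 2, 12);
""")

# Variant 1: at least one sub-item exists and none of them is inactive.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT c.c_id
    FROM combo_items AS c
    WHERE EXISTS (
        SELECT 1 FROM sub_items AS s1
        JOIN combo_refs AS cr1 ON cr1.s_id = s1.s_id
        WHERE cr1.c_id = c.c_id)
      AND NOT EXISTS (
        SELECT 1 FROM sub_items AS s2
        JOIN combo_refs AS cr2 ON cr2.s_id = s2.s_id
        WHERE cr2.c_id = c.c_id AND s2.Active = 0)
    ORDER BY c.c_id
""")]

# Variant 2: the number of active sub-items equals the total per combo.
having_ids = [row[0] for row in conn.execute("""
    SELECT ci.c_id
    FROM combo_items AS ci
    JOIN combo_refs AS cr ON cr.c_id = ci.c_id
    JOIN sub_items AS si ON si.s_id = cr.s_id
    GROUP BY ci.c_id
    HAVING SUM(si.Active) = COUNT(*)
    ORDER BY ci.c_id
""")]

print(not_exists_ids, having_ids)
```

Combo 2 is rejected because one of its sub-items is inactive, and combo 3 is rejected because it has no sub-items at all.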