I am writing a simple application that organizes my media (pictures, music, videos...). Each media item can be associated with zero or more tags.
My goal is to have a UI where I can search my media (for example, show images and videos with a tag like %hol%, returning both photos tagged holidays and photos tagged hollywood).
Here's my database:
Table medias
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| path | varchar(400) | NO | UNI | NULL | |
| type | varchar(5) | NO | | NULL | |
| libelle | varchar(200) | NO | | NULL | |
| ratings | int(2) | NO | | NULL | |
+---------+--------------+------+-----+---------+----------------+
Table tags
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| libelle | varchar(200) | NO | UNI | NULL | |
+---------+--------------+------+-----+---------+----------------+
Table medias_tags
+----------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+-------+
| id_media | int(11) | NO | PRI | NULL | |
| id_tag | int(11) | NO | PRI | NULL | |
+----------+---------+------+-----+---------+-------+
As I have many media items, I had to limit the results. So in my front end, I built a pagination system and query my media according to the page I am on (for example, if I am on page 3, I put LIMIT 20 OFFSET 60 in my SQL statement).
Now I'm trying to filter my media. I have a search bar, and if I type 'hol', I want to get 20 media items with a tag like '%hol%' (holidays, hollywood...).
Getting the filtered media works, but I don't know how to get exactly 20 media items.
Here's my SQL query without filtering:
SELECT
medias.id, medias.path, medias.type, medias.libelle as libelle, medias.ratings, tags.libelle as tag
FROM (select * from medias LIMIT ? OFFSET ?) medias
left outer join medias_tags on medias.id = medias_tags.id_media
left outer join tags on tags.id = medias_tags.id_tag
And here's my filtering sql query:
SELECT
medias.id, medias.path, medias.type, medias.libelle as libelle, medias.ratings, tags.libelle as tag
FROM medias
left outer join medias_tags on medias.id = medias_tags.id_media
left outer join tags on tags.id = medias_tags.id_tag
WHERE tags.libelle LIKE ? [OR tags.libelle LIKE ? ...]
(The last parameters are my tags.)
Both queries work well, but I can't find a way to limit my filtered result. Here's a sample of my filtering query result:
+----+-------------+-------+-------------------+---------+------------+
| id | path | type | libelle | ratings | tag |
+----+-------------+-------+-------------------+---------+------------+
| 11 | mock/02.jpg | PHOTO | 02.jpg | 0 | dark |
| 1 | mock/03.jpg | PHOTO | Purple | 5 | wallpapper |
| 3 | mock/01.jpg | PHOTO | Wave | 5 | wave |
| 3 | mock/01.jpg | PHOTO | Wave | 5 | wallpapper |
+----+-------------+-------+-------------------+---------+------------+
How can I limit my filtering result to return only n distinct media ids? Is there a pure SQL solution? Maybe with stored procedures?
Thanks!
EDIT:
Here's the result I'd like with limit = 7:
+----+-------------+-------+-------------------+---------+------------+
| id | path | type | libelle | ratings | tag |
+----+-------------+-------+-------------------+---------+------------+
| 11 | mock/02.jpg | PHOTO | 02.jpg | 0 | dark |
| 7 | mock/01.jpg | PHOTO | NEWLY ADDED MEDIA | 8 | wallpapper |
| 2 | mock/02.jpg | PHOTO | Night | 5 | wallpapper |
| 2 | mock/02.jpg | PHOTO | Night | 5 | dark |
| 1 | mock/03.jpg | PHOTO | Purple | 5 | wallpapper |
| 4 | mock/03.jpg | PHOTO | Purple 2 | 5 | wallpapper |
| 5 | mock/03.jpg | PHOTO | Purple 3 EDITED | 8 | wallpapper |
| 3 | mock/01.jpg | PHOTO | Wave | 5 | wave |
| 3 | mock/01.jpg | PHOTO | Wave | 5 | wallpapper |
+----+-------------+-------+-------------------+---------+------------+
I have 9 rows, but only 7 distinct media ids. Every media item has a tag like '%a%'.
EDIT 2: someone posted an answer but deleted it. Their idea was to concatenate the tags, which would be a nice solution too.
Something like this:
+----+-------------+-------+-------------------+---------+------------+
| id | path | type | libelle | ratings | tag |
+----+-------------+-------+-------------------+---------+------------+
| 11 | mock/02.jpg | PHOTO | 02.jpg | 0 | dark |
| 7 | mock/01.jpg | PHOTO | NEWLY ADDED MEDIA | 8 | wallpapper |
| 2 | mock/02.jpg | PHOTO | Night | 5 | wallpapper, dark |
| 1 | mock/03.jpg | PHOTO | Purple | 5 | wallpapper |
| 4 | mock/03.jpg | PHOTO | Purple 2 | 5 | wallpapper |
| 5 | mock/03.jpg | PHOTO | Purple 3 EDITED | 8 | wallpapper |
| 3 | mock/01.jpg | PHOTO | Wave | 5 | wave, wallpapper |
+----+-------------+-------+-------------------+---------+------------+
But I have no idea how to write this SQL query...
Use GROUP_CONCAT to build a tag string per media and outer join to this result. Then apply your LIMIT clause as desired:
select
medias.id,
medias.path,
medias.type,
medias.libelle,
medias.ratings,
mtags.tags
from medias
left outer join
(
select id_media, group_concat(tags.libelle order by tags.libelle) as tags
from medias_tags
join tags on tags.id = medias_tags.id_tag
group by id_media
) mtags on mtags.id_media = medias.id
order by medias.id
limit 20 offset 60;
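The query above paginates all media. To combine it with the tag filter from the question, one option is to select the page of matching media ids first and only then aggregate the tags. The following is a minimal sketch of that idea, not a definitive implementation: it uses Python's built-in sqlite3 module with an in-memory copy of a trimmed-down schema, the sample rows are invented for illustration, and the function name `search_page` is hypothetical. SQLite's `group_concat(x, sep)` stands in for MySQL's `GROUP_CONCAT`, so the tag order inside each string is not guaranteed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE medias (id INTEGER PRIMARY KEY, path TEXT, libelle TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, libelle TEXT UNIQUE);
CREATE TABLE medias_tags (id_media INTEGER, id_tag INTEGER,
                          PRIMARY KEY (id_media, id_tag));
-- Invented sample data, loosely based on the question's result tables.
INSERT INTO medias VALUES (1,'mock/03.jpg','Purple'),(2,'mock/02.jpg','Night'),
                          (3,'mock/01.jpg','Wave'),(4,'mock/04.jpg','Untagged');
INSERT INTO tags VALUES (1,'wallpapper'),(2,'dark'),(3,'wave');
INSERT INTO medias_tags VALUES (1,1),(2,1),(2,2),(3,3),(3,1);
""")

def search_page(pattern, limit, offset):
    """Return at most `limit` distinct media whose tags match `pattern`."""
    sql = """
    SELECT m.id, m.libelle, group_concat(t.libelle, ', ') AS tags
    FROM (
        -- Step 1: page over distinct matching media ids, not over joined rows.
        SELECT DISTINCT mt.id_media
        FROM medias_tags mt
        JOIN tags t ON t.id = mt.id_tag
        WHERE t.libelle LIKE ?
        ORDER BY mt.id_media
        LIMIT ? OFFSET ?
    ) page
    -- Step 2: fetch every tag of the selected media and concatenate them.
    JOIN medias m ON m.id = page.id_media
    JOIN medias_tags mt ON mt.id_media = m.id
    JOIN tags t ON t.id = mt.id_tag
    GROUP BY m.id, m.libelle
    ORDER BY m.id
    """
    return conn.execute(sql, (pattern, limit, offset)).fetchall()

first_page = search_page("%a%", 2, 0)
second_page = search_page("%a%", 2, 2)
```

Because the LIMIT sits in the inner query, a media item with several matching tags still counts once. The outer joins list all tags of each selected media, not only the ones that matched the search.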
Is this what you are expecting?
SELECT
medias.id, medias.path, medias.type, medias.libelle as libelle, medias.ratings, tags.libelle as tag
FROM medias
left outer join medias_tags on medias.id = medias_tags.id_media
left outer join tags on tags.id = medias_tags.id_tag
WHERE tags.libelle LIKE ? [OR tags.libelle LIKE ? ...]
order by medias.id
limit 0,10
Here LIMIT returns the first 10 rows. You can use a stored procedure to pass the two LIMIT parameters and fetch the filtered result.
Related
In attempting to pull a large series of columns (~15-20) from several joined tables, I put together 2 views that would pull the necessary information. In my local DB (only ~1k posts rows), joining these views worked fine; however, when I created those same views on our production DB (~30k posts rows) and attempted to join the views, I realized that this solution wouldn't scale beyond a test dataset.
I attempted to migrate those 2 views (categories data—like categories.title—and creators' data—like users.display_name) into a CTE post_data which, in theory, would act as a keyed version of those views, and allow me to get all post data for the eligible posts.
I have put together a sample DBFiddle with some test data to explain the table structure. The actual data has many more columns, but this is representative of the joins necessary to build the query.
table : posts
+-----+-----------+------------+------------------------------------------+----------------------------------------+
| id | parent_id | created_by | message | attachments |
+-----+-----------+------------+------------------------------------------+----------------------------------------+
| 8 | NULL | 8 | laptop for sale | [{"media_id": 1380}] |
| 9 | NULL | 4 | NEW lamp shade up for grabs | [{"media_id": 1442}, {"link_id": 103}] |
| 10 | 1 | 7 | Oooh I could be interested | |
| 11 | 1 | 7 | DMing you now! I've been looking for one | |
+-----+-----------+------------+------------------------------------------+----------------------------------------+
table : users
+----+------------------+---------------------------+
| id | display_name | created_at |
+----+------------------+---------------------------+
| 1 | John Appleseed | 2018-02-20T00:00:00+00:00 |
| 2 | Massimo Jenkins | 2018-05-14T00:00:00+00:00 |
| 3 | Johanna Marionna | 2018-06-05T00:00:00+00:00 |
| 4 | Jackson Creek | 2018-11-15T00:00:00+00:00 |
| 5 | Joe Schmoe | 2019-01-09T00:00:00+00:00 |
| 6 | John Johnson | 2019-02-14T00:00:00+00:00 |
| 7 | Donna Madison | 2019-05-14T00:00:00+00:00 |
| 8 | Jenna Kaplan | 2019-06-23T00:00:00+00:00 |
+----+------------------+---------------------------+
table : categories
+----+------------+------------+-------------------------------------------------------+
| id | created_by | title | description |
+----+------------+------------+-------------------------------------------------------+
| 1 | 2 | Technology | Anything tech; Consumer, business or education tools! |
| 2 | 2 | Home Goods | Anything for the home |
+----+------------+------------+-------------------------------------------------------+
table : categories_posts
+---------+-------------+
| post_id | category_id |
+---------+-------------+
| 8 | 1 |
| 9 | 1 |
| 10 | 1 |
| 11 | 1 |
+---------+-------------+
table : users_categories
+---------+-------------+
| user_id | category_id |
+---------+-------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
+---------+-------------+
table : posts_removed
+---------+----------------------+------------+
| post_id | removed_at | removed_by |
+---------+----------------------+------------+
| 10 | 2019-01-22 09:08:14 | 7 |
+---------+----------------------+------------+
In the below query, eligible posts are determined in the base SELECT; then, the post_data CTE is joined to the result set (limited to 25 rows) and all columns from the CTE are returned.
WITH post_data AS (
SELECT posts.id,
posts.parent_id,
posts.created_by,
posts.attachments,
categories_posts.category_id,
categories.title,
categories.created_by AS category_created_by,
creator.display_name AS creator_display_name,
creator.created_at AS creator_created_at
/* ... And a whole bunch of other fields from posts, categories_posts, users */
FROM posts
LEFT OUTER JOIN categories_posts
ON categories_posts.post_id = posts.id
LEFT OUTER JOIN categories
ON categories.id = categories_posts.category_id
LEFT OUTER JOIN users creator
ON creator.id = posts.created_by
/* ... And a whole bunch of other joins to facilitate the selected fields */
)
SELECT post_data.*
FROM posts
/* Set up the criteria for the posts selected before getting their data from the CTE */
LEFT OUTER JOIN posts_removed removed ON removed.post_id = posts.id
LEFT OUTER JOIN users user_me ON user_me.id = "1"
LEFT OUTER JOIN users_followed ON users_followed.user_id = posts.created_by
AND users_followed.followed_by = user_me.id
LEFT OUTER JOIN categories_posts ON categories_posts.post_id = posts.id
LEFT OUTER JOIN users_categories ON users_categories.category_id = categories_posts.category_id
LEFT OUTER JOIN posts_removed pp_removed ON pp_removed.post_id = posts.parent_id
/* Join our post_data on the post's ID */
JOIN post_data ON post_data.id = posts.id
WHERE
(
(
users_categories.user_id = user_me.id AND users_categories.left_at IS NULL
) OR categories_posts.category_id IS NULL
) AND (
posts.created_by = user_me.id
OR users_followed.followed_by = user_me.id
OR categories_posts.category_id IS NOT NULL
) AND removed.removed_at IS NULL
AND pp_removed.removed_at IS NULL
AND (post_data.id = posts.id OR post_data.id = posts.parent_id)
ORDER BY posts.id DESC
LIMIT 25
In theory, I thought this would work by selecting the rows based on the base select criteria, then doing an index scan for the CTE based on the Post ID; however, it seems that the query optimizer chooses instead to do a full table scan of the posts table.
The EXPLAIN SELECT gave me this information:
+----+-------------+------------------------+--------+-------------------------------+-------------+---------+---------------------------------------------+--------+----------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | extra |
+----+-------------+------------------------+--------+-------------------------------+-------------+---------+---------------------------------------------+--------+----------+----------------------------------------------------+
| 1 | PRIMARY | posts | ALL | PRIMARY,parent_id,created_by | | | | 33870 | 100 | Using temporary; Using filesort |
| 1 | PRIMARY | removed | eq_ref | PRIMARY | PRIMARY | 8 | posts.id | 1 | 19 | Using where |
| 1 | PRIMARY | user_me | const | PRIMARY | PRIMARY | 8 | const | 1 | 100 | Using where; Using index |
| 1 | PRIMARY | categories_posts | eq_ref | PRIMARY | PRIMARY | 8 | api.posts.id | 1 | 100 | |
| 1 | PRIMARY | categories | eq_ref | PRIMARY | PRIMARY | 8 | categories_posts.category_id | 1 | 100 | Using index |
| 1 | PRIMARY | users_categories | eq_ref | user_id_2,user_id,category_id | user_id_2 | 16 | user_me.id,api.categories_posts.category_id | 1 | 100 | Using where |
| 1 | PRIMARY | users_followed | eq_ref | user_id,followed_by | user_id | 16 | posts.created_by,api.user_me.id | 1 | 100 | Using where; Using index |
| 1 | PRIMARY | pp_removed | eq_ref | PRIMARY | PRIMARY | 8 | api.posts.parent_id | 1 | 19 | Using where |
| 1 | PRIMARY | <derived2> | ALL | | | | | 493911 | 19 | Using where; Using join buffer (Block Nested Loop) |
| 2 | DERIVED | posts | ALL | | | | | 33870 | 100 | Using temporary |
| 2 | DERIVED | categories_posts | eq_ref | PRIMARY | PRIMARY | 8 | api.posts.id | 1 | 100 | |
| 2 | DERIVED | categories | eq_ref | PRIMARY | PRIMARY | 8 | api.categories_posts.category_id | 1 | 100 | |
| 2 | DERIVED | posts_votes | ref | post_id | post_id | 8 | api.posts.id | 1 | 100 | Using index |
| 2 | DERIVED | pp | eq_ref | PRIMARY | PRIMARY | 8 | api.posts.parent_id | 1 | 100 | |
| 2 | DERIVED | pp_removed | eq_ref | PRIMARY | PRIMARY | 8 | api.pp.id | 1 | 100 | Using index |
| 2 | DERIVED | removed | eq_ref | PRIMARY | PRIMARY | 8 | api.posts.id | 1 | 100 | Using index |
| 2 | DERIVED | creator | eq_ref | PRIMARY | PRIMARY | 8 | api.posts.created_by | 1 | 100 | |
| 2 | DERIVED | usernames | ref | user_id | user_id | 8 | api.creator.id | 1 | 100 | |
| 2 | DERIVED | verifications | ALL | | | | | 4 | 100 | Using where; Using join buffer (Block Nested Loop) |
| 2 | DERIVED | categories_identifiers | ref | category_id | category_id | 8 | api.categories.id | 1 | 100 | |
+----+-------------+------------------------+--------+-------------------------------+-------------+---------+---------------------------------------------+--------+----------+----------------------------------------------------+
Beyond this, I tried refactoring my query to force key usage in the posts table, such as using FORCE INDEX(PRIMARY) in the select, and moving the CTE to be the base query and adding a filter WHERE id IN ({the original base query}), but it seems the optimizer still does a full table scan.
In case it's helpful to decode what's happening in the query plan:
At time of writing, there are 33,387 posts rows
The query plan shows a full table scan which returns 33,870 rows
The query plan also shows the derived table (<derived2>) as having 493,911 rows
My core questions are:
Am I correct when I say that subqueries should only be executed once per result row from the base select query? If so, then the CTE should also use the JOIN on posts.id and likely use the table index?
Why does the query plan show that it selects 33,870 rows when there are only 33,387? And where do the 493,911 rows come from?
How do you prevent a full table scan in this case?
Give this a try... Do the LIMIT 25 before JOINing to the WITH:
SELECT * FROM
( SELECT ... FROM posts
JOIN categories_posts ...
ORDER BY posts.id DESC
LIMIT 25 ) AS x
JOIN post_data
ON post_data.id IN (x.id, x.parent_id)
ORDER BY x.id DESC
+--------------+
| paintings |
+--------------+
| id | title |
+----+---------+
| 1 | muzelf1 |
| 2 | muzelf2 |
| 3 | muzelf3 |
+----+---------+
+----------------------------------------+
| tags |
+----------------------------------------+
| id | type | name |
+----+-----------------+-----------------+
| 1 | painting_medium | oil_painting |
| 2 | painting_style | impressionistic |
| 3 | painting_medium | mixed_media |
| 4 | painting_medium | watercolours |
| 5 | painting_style | mixed_media |
| 6 | painting_style | photorealistic |
+----+-----------------+-----------------+
+---------------------------+
| paintings_tags |
+---------------------------+
| id | painting_id | tag_id |
+----+-------------+--------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 4 |
| 4 | 3 | 2 |
| 5 | 3 | 1 |
+----+-------------+--------+
CREATE TABLE paintings (
id integer AUTO_INCREMENT PRIMARY KEY,
title text
);
INSERT INTO paintings(id,title) VALUES
(1,'muzelf1'),(2,'muzelf2'),(3,'muzelf3');
CREATE TABLE tags (
id integer AUTO_INCREMENT PRIMARY KEY,
name text,
type text
);
INSERT INTO tags(id,name,type) VALUES
(1,'oil_painting','painting_medium')
,(2,'impressionistic','painting_style')
,(3,'mixed_media','painting_medium')
,(4,'watercolours','painting_medium')
,(5,'mixed_media','painting_style')
,(6,'photorealistic','painting_style');
CREATE TABLE paintings_tags (
id integer AUTO_INCREMENT PRIMARY KEY,
painting_id integer,
tag_id integer
);
INSERT INTO paintings_tags(id,painting_id,tag_id) VALUES
(1,1,1)
,(2,1,2)
,(3,2,4)
,(4,3,2)
,(5,3,1);
Find all the paintings with [{tags.type="painting_medium", tags.name="oil_painting"},{tags.type="painting_style", tags.name="impressionistic"}].
+-----------------------------------+
| Expected Output                   |
+-----------------------------------+
| id | painting_title | painting_id |
+----+----------------+-------------+
| 1  | muzelf1        | 1           |
| 2  | muzelf3        | 3           |
+----+----------------+-------------+
Here is something I tried using the Bookshelf ORM and the Knex query builder.
Painting.query(function (qb) {
qb.innerJoin('painting_tags','paintings.id','painting_tags.painting_id')
.innerJoin('tags','painting_tags.tag_id','tags.id')
.where(qb => {
tagFilters.forEach(filter => {
qb.where('tags.type',filter.type).andWhere('tags.name',filter.name)
})
});
});
The above only works if the tag filters array has only one element. But I need it to work for all the filters in the array.
What would a raw query look like for the above? And how can I convert the same to work using ORM and query builder?
Here's one idea:
SELECT p.id painting_id
, p.title
, MAX(CASE WHEN t.type = 'painting_medium' THEN t.name END) medium
, MAX(CASE WHEN t.type = 'painting_style' THEN t.name END) style
FROM paintings p
JOIN paintings_tags pt
ON pt.painting_id = p.id
JOIN tags t
ON t.id = pt.tag_id
GROUP
BY p.id;
+-------------+---------+--------------+-----------------+
| painting_id | title | medium | style |
+-------------+---------+--------------+-----------------+
| 1 | muzelf1 | oil_painting | impressionistic |
| 2 | muzelf2 | watercolours | NULL |
| 3 | muzelf3 | oil_painting | impressionistic |
+-------------+---------+--------------+-----------------+
You could filter this as a subquery (or using HAVING) but, unless the data set was vast, I would be inclined to do the filtering in a bit of JavaScript.
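For the HAVING route mentioned above, a common pattern is to keep only the tag rows that match any of the filters and then require that a painting matched as many distinct tags as there are filters. The sketch below is one possible version, run against the question's sample data through Python's sqlite3 module. It assumes the filters are distinct and that each (type, name) pair occurs only once in tags; the function name `paintings_matching_all` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE paintings (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE paintings_tags (id INTEGER PRIMARY KEY, painting_id INTEGER, tag_id INTEGER);
INSERT INTO paintings VALUES (1,'muzelf1'),(2,'muzelf2'),(3,'muzelf3');
INSERT INTO tags VALUES
  (1,'oil_painting','painting_medium'),(2,'impressionistic','painting_style'),
  (3,'mixed_media','painting_medium'),(4,'watercolours','painting_medium'),
  (5,'mixed_media','painting_style'),(6,'photorealistic','painting_style');
INSERT INTO paintings_tags VALUES (1,1,1),(2,1,2),(3,2,4),(4,3,2),(5,3,1);
""")

def paintings_matching_all(filters):
    """Return (id, title) of paintings that carry every tag in `filters`."""
    if not filters:
        return []
    # One "(type = ? AND name = ?)" group per filter, combined with OR.
    conditions = " OR ".join("(t.type = ? AND t.name = ?)" for _ in filters)
    params = [value for f in filters for value in (f["type"], f["name"])]
    sql = f"""
    SELECT p.id, p.title
    FROM paintings p
    JOIN paintings_tags pt ON pt.painting_id = p.id
    JOIN tags t ON t.id = pt.tag_id
    WHERE {conditions}
    GROUP BY p.id, p.title
    -- A painting qualifies only if it matched every filter.
    HAVING COUNT(DISTINCT t.id) = ?
    ORDER BY p.id
    """
    return conn.execute(sql, params + [len(filters)]).fetchall()

tag_filters = [
    {"type": "painting_medium", "name": "oil_painting"},
    {"type": "painting_style", "name": "impressionistic"},
]
matches = paintings_matching_all(tag_filters)
```

Only the placeholders are generated dynamically; the filter values themselves are always passed as bound parameters.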
I have 4 tables from which I want to aggregate data using MySQL 5.7.
Projects
+------------+--------+------------------+
| project_id | org_id | name |
+------------+--------+------------------+
| 1 | 1 | Big Project |
| 2 | 1 | Internal Project |
+------------+--------+------------------+
Tasks
+-----------+--------+----------------+------------+
| task_id | org_id | name | project_id |
+-----------+--------+----------------+------------+
| 1 | 1 | Check Work | 1 |
| 2 | 1 | Fix Code | 1 |
| 3 | 1 | Rebuild Office | 2 |
+-----------+--------+----------------+------------+
Resources
+-------------+--------+-------------+-----------+
| resource_id | org_id | first_name | last_name |
+-------------+--------+-------------+-----------+
| 1 | 1 | Alice | Black |
| 2 | 1 | Bob | Smith |
| 3 | 1 | Charlie | White |
+-------------+--------+-------------+-----------+
Task_Details
+-------------+--------+---------+-------------+
| resource_id | org_id | task_id | total_hours |
+-------------+--------+---------+-------------+
| 1 | 1 | 1 | 12 |
| 2 | 1 | 1 | 4 |
| 3 | 1 | 1 | 8 |
| 2 | 1 | 2 | 4 |
| 3 | 1 | 2 | 4 |
| 1 | 1 | 3 | 16 |
+-------------+--------+---------+-------------+
I want to SUM the total_hours, GROUPing by task and project, while still showing the total_hours each employee has individually spent on a task. The output I'm looking for would be something like this
Desired Output
+------------------+----------------+------------+-----------+-------------+
| project_name | task_name | first_name | last_name | total_hours |
+------------------+----------------+------------+-----------+-------------+
| Big Project | Check Work | Alice | Green | 12 |
| Big Project | Check Work | Bob | Smith | 4 |
| Big Project | Check Work | Charlie | Brown | 8 |
| Big Project | Check Work | NULL | NULL | 24 |
| Big Project | Fix Code | Bob | Smith | 4 |
| Big Project | Fix Code | Charlie | Brown | 4 |
| Big Project | Fix Code | NULL | NULL | 8 |
| Big Project | NULL | NULL | NULL | 32 |
| Internal Project | Rebuild Office | Alice | Green | 16 |
| Internal Project | Rebuild Office | NULL | NULL | 16 |
| Internal Project | NULL | NULL | NULL | 16 |
+------------------+----------------+------------+-----------+-------------+
I've managed to create a query that JOINs the relevant tables together, and even managed to GROUP them by project_id, task_id and resource_id. However, adding a WITH ROLLUP modifier to the end of my query causes it to fail, even though it works without one.
This is my current query:
SELECT
t1.project_name,
t1.task_name,
t2.first_name,
t2.last_name,
SUM(t1.task_hours)
FROM (
SELECT
Projects.project_id,
Projects.name AS project_name,
Tasks.task_id,
Tasks.name AS task_name,
Resources.resource_id,
Task_Details.total_hours AS task_hours
FROM
Projects
RIGHT OUTER JOIN
Tasks
ON
Projects.org_id = Tasks.org_id AND
Projects.project_id = Tasks.project_id
LEFT OUTER JOIN
Task_Details
ON
Task_Details.org_id = Tasks.org_id AND
Task_Details.task_id = Tasks.task_id
LEFT OUTER JOIN
Resources
ON
Resources.org_id = Task_Details.org_id AND
Resources.resource_id = Task_Details.resource_id
WHERE
Projects.org_id = 1
) AS t1
JOIN (
SELECT
resource_id,
first_name,
last_name
FROM
Resources
WHERE
org_id = 1
) AS t2
ON
t2.resource_id = t1.resource_id
GROUP BY
t1.project_id,
t1.task_id,
t1.resource_id;
How can I modify my query such that WITH ROLLUP works?
My SQLFiddle is here, but note that it is for MySQL 5.6 rather than 5.7.
IMHO, the problem with your query is this: you select some columns which are not in the GROUP BY. That causes some nonsensical values in the columns first_name, last_name, project_name and task_name. However, the sum column is probably correct, isn't it?
This works for me:
SELECT p.name as project_name,
s1.task_name,
first_name,
last_name,
s1.total_hours
FROM (
SELECT
t.project_id,
t.name as task_name,
h.resource_id,
sum(h.total_hours) as total_hours
FROM Task_Details as h
JOIN Tasks as t ON (t.task_id=h.task_id)
GROUP BY t.project_id, t.name, h.resource_id WITH ROLLUP
) AS s1
LEFT JOIN Resources AS r ON (s1.resource_id=r.resource_id)
JOIN Projects AS p ON (p.project_id=s1.project_id);
The nested SELECT does the interesting work: it sums up the total_hours for every resource_id, every task_name and every project_id. The outer SELECT then collects the name of every resource and project.
OUTPUT:
+------------------+----------------+------------+-----------+-------------+
| project_name | task_name | first_name | last_name | total_hours |
+------------------+----------------+------------+-----------+-------------+
| Big Project | NULL | NULL | NULL | 32 |
| Big Project | Check Work | Alice | Green | 12 |
| Big Project | Check Work | NULL | NULL | 24 |
| Big Project | Check Work | Bob | Smith | 4 |
| Big Project | Check Work | Charlie | Brown | 8 |
| Big Project | Fix Code | Bob | Smith | 4 |
| Big Project | Fix Code | Charlie | Brown | 4 |
| Big Project | Fix Code | NULL | NULL | 8 |
| Internal Project | NULL | NULL | NULL | 16 |
| Internal Project | Rebuild Office | NULL | NULL | 16 |
| Internal Project | Rebuild Office | Alice | Green | 16 |
+------------------+----------------+------------+-----------+-------------+
Hope this helps.
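To make it clearer what WITH ROLLUP contributes in the nested SELECT, here is a rough sketch that builds the same subtotal rows by hand with UNION ALL. This is a swapped-in technique, not the answer's method: it runs on SQLite through Python's sqlite3 module, and SQLite has no WITH ROLLUP. The tables are trimmed to the columns the aggregation needs. MySQL's ROLLUP additionally emits a grand-total row with NULL in every grouping column, which this sketch leaves out (in the answer's query that row is discarded by the inner join to Projects anyway).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tasks (task_id INTEGER PRIMARY KEY, name TEXT, project_id INTEGER);
CREATE TABLE Task_Details (resource_id INTEGER, task_id INTEGER, total_hours INTEGER);
INSERT INTO Tasks VALUES (1,'Check Work',1),(2,'Fix Code',1),(3,'Rebuild Office',2);
INSERT INTO Task_Details VALUES (1,1,12),(2,1,4),(3,1,8),(2,2,4),(3,2,4),(1,3,16);
""")

ROLLUP_SQL = """
-- Level 1: hours per project, task and resource (the detail rows).
SELECT t.project_id, t.name AS task_name, h.resource_id,
       SUM(h.total_hours) AS total_hours
FROM Task_Details h JOIN Tasks t ON t.task_id = h.task_id
GROUP BY t.project_id, t.name, h.resource_id
UNION ALL
-- Level 2: subtotal per project and task (resource_id rolled up to NULL).
SELECT t.project_id, t.name, NULL, SUM(h.total_hours)
FROM Task_Details h JOIN Tasks t ON t.task_id = h.task_id
GROUP BY t.project_id, t.name
UNION ALL
-- Level 3: subtotal per project (task and resource rolled up to NULL).
SELECT t.project_id, NULL, NULL, SUM(h.total_hours)
FROM Task_Details h JOIN Tasks t ON t.task_id = h.task_id
GROUP BY t.project_id
"""

rollup_rows = conn.execute(ROLLUP_SQL).fetchall()
```

Each NULL in a grouping column marks the level at which that row was summed, which is why the outer query uses a LEFT JOIN to Resources: the subtotal rows have no resource_id to match.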
I have a search section for looking up products which has a navigation bar for filtering purposes that shows the total results of each product feature. For example:
TOTAL RESULTS 60
New (32)
Used (28)
Particular (10)
Company (50)
In mysql I have the following queries (one per feature)
Type
SELECT a.id_type, whois.name as whoisName, COUNT(a.id_type) as countWhois
FROM (published a
INNER JOIN types whois ON whois.id = a.id_type)
GROUP BY id_type
+---------+------------+------------+
| id_type | whoisName | countWhois |
+---------+------------+------------+
| 0 | Company | 50 |
| 1 | Particular | 10 |
+---------+------------+------------+
Condition
SELECT a.id_condition, cond.name as conditionName, COUNT(a.id_condition) as countCondition
FROM (published a
INNER JOIN conditions cond ON cond.id = a.id_condition)
GROUP BY id_condition
+--------------+---------------+----------------+
| id_condition | conditionName | countCondition |
+--------------+---------------+----------------+
| 0 | New | 32 |
| 1 | Used | 28 |
+--------------+---------------+----------------+
I want to combine the two queries into a single one but can't figure out how. I was thinking of something like this:
+---------+------------+------------+--------------+---------------+----------------+
| id_type | whoisName | countWhois | id_condition | conditionName | countCondition |
+---------+------------+------------+--------------+---------------+----------------+
| 0 | Company | 50 | NULL | NULL | NULL |
| 1 | Particular | 10 | NULL | NULL | NULL |
| NULL | NULL | NULL | 0 | New | 32 |
| NULL | NULL | NULL | 1 | Used | 28 |
+---------+------------+------------+--------------+---------------+----------------+
Is this possible?
Thanks and sorry if my English is bad, it's not my native language.
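One way to get a result of that shape is UNION ALL, padding each half with NULL columns so that both SELECTs return the same six columns. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module, the table and column names follow the queries in the question, and the few published rows are invented, so the counts differ from the ones shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE types (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE conditions (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE published (id INTEGER PRIMARY KEY, id_type INTEGER, id_condition INTEGER);
INSERT INTO types VALUES (0,'Company'),(1,'Particular');
INSERT INTO conditions VALUES (0,'New'),(1,'Used');
-- Invented sample rows: 3 Company, 2 Particular; 3 New, 2 Used.
INSERT INTO published (id_type, id_condition) VALUES (0,0),(0,0),(0,1),(1,1),(1,0);
""")

FACETS_SQL = """
-- First half: counts per type, condition columns padded with NULL.
SELECT a.id_type, whois.name AS whoisName, COUNT(*) AS countWhois,
       NULL AS id_condition, NULL AS conditionName, NULL AS countCondition
FROM published a
JOIN types whois ON whois.id = a.id_type
GROUP BY a.id_type, whois.name
UNION ALL
-- Second half: counts per condition, type columns padded with NULL.
SELECT NULL, NULL, NULL, a.id_condition, cond.name, COUNT(*)
FROM published a
JOIN conditions cond ON cond.id = a.id_condition
GROUP BY a.id_condition, cond.name
"""

facet_rows = conn.execute(FACETS_SQL).fetchall()
```

The column names of the combined result come from the first SELECT, so the aliases only need to be written there.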
I thought I understood how left outer joins work, but I have a situation that is not working, and I'm not 100% sure if the way I have my query structured is incorrect, or if it's a data issue.
For background, I have the following MySQL table structures:
mysql> describe achievement;
+-------------+----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------------------+------+-----+---------+-------+
| id | varchar(64) | NO | PRI | NULL | |
| game_id | varchar(10) | NO | PRI | NULL | |
| name | varchar(64) | NO | | NULL | |
| description | varchar(255) | NO | | NULL | |
| image_url | varchar(255) | NO | | NULL | |
| gamerscore | smallint(5) unsigned | NO | | 0 | |
| hidden | tinyint(1) | NO | | 0 | |
| base_hidden | tinyint(1) | NO | | 0 | |
+-------------+----------------------+------+-----+---------+-------+
8 rows in set (0.00 sec)
and
mysql> describe gamer_achievement;
+----------------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+---------------------+------+-----+---------+-------+
| game_id | varchar(10) | NO | PRI | NULL | |
| achievement_id | varchar(64) | NO | PRI | NULL | |
| gamer_id | varchar(36) | NO | PRI | NULL | |
| earned_epoch | bigint(20) unsigned | NO | | 0 | |
| offline | tinyint(1) | NO | | 0 | |
+----------------+---------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
As for the data, this is what I have populated here (only pertinent columns included for brevity):
+----+------------+------------------------------+
| id | game_id | name |
+----+------------+------------------------------+
| 1 | 1480656849 | Cluster Buster |
| 2 | 1480656849 | Star Gazer |
| 3 | 1480656849 | Flower Child |
| 4 | 1480656849 | Oyster-meister |
| 5 | 1480656849 | Big Cheese of the South Seas |
| 6 | 1480656849 | Hexic Addict |
| 7 | 1480656849 | Collapse Master |
| 8 | 1480656849 | Survivalist |
| 9 | 1480656849 | Tick-Tock Doc |
| 10 | 1480656849 | Marathon Mogul |
| 11 | 1480656849 | Millionaire Extraordinaire |
| 12 | 1480656849 | Grand Pearl Pooh-Bah |
+----+------------+------------------------------+
12 rows in set (0.00 sec)
and
+----------------+------------+--------------+---------+
| achievement_id | game_id | earned_epoch | offline |
+----------------+------------+--------------+---------+
| 1 | 1480656849 | 0 | 1 |
| 2 | 1480656849 | 0 | 1 |
| 3 | 1480656849 | 0 | 1 |
| 4 | 1480656849 | 1149789371 | 0 |
| 7 | 1480656849 | 1149800406 | 0 |
| 8 | 1480656849 | 0 | 1 |
| 9 | 1480656849 | 1149794790 | 0 |
| 10 | 1480656849 | 1149792417 | 0 |
+----------------+------------+--------------+---------+
8 rows in set (0.02 sec)
In this particular case, the achievement table is the "master" table and will contain the information that I always want to see. The gamer_achievement table only contains information for achievements that are actually earned. For any particular game for any particular gamer, there can be any number of rows in the gamer_achievement table - including none if no achievements have been earned for that game. For example, in the sample data above, achievements with ids 5, 6, 11, and 12 have not been earned.
What I currently have written is
select a.id,
a.name,
ga.earned_epoch,
ga.offline
from achievement a
LEFT OUTER JOIN gamer_achievement ga
ON (a.id = ga.achievement_id and a.game_id = ga.game_id)
where ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
and a.game_id = '1480656849'
order by convert (a.id, unsigned)
but this only returns the full information for those achievements that have actually been earned. The unearned achievements are not being shown with NULL values in the columns from the right-side table (gamer_achievement), as I would expect from this type of query. This is what I am expecting to see:
+----+-------------------------------+--------------+---------+
| id | name | earned_epoch | offline |
+----+-------------------------------+--------------+---------+
| 1 | Cluster Buster | 0 | 1 |
| 2 | Star Gazer | 0 | 1 |
| 3 | Flower Child | 0 | 1 |
| 4 | Oyster-meister | 1149789371 | 0 |
| 5 | Big Cheese of the South Seas | NULL | NULL |
| 6 | Hexic Addict | NULL | NULL |
| 7 | Collapse Master | 1149800406 | 0 |
| 8 | Survivalist | 0 | 1 |
| 9 | Tick-Tock Doc | 1149794790 | 0 |
| 10 | Marathon Mogul | 1149792417 | 0 |
| 11 | Millionaire Extraordinaire | NULL | NULL |
| 12 | Grand Pearl Pooh-Bah | NULL | NULL |
+----+-------------------------------+--------------+---------+
12 rows in set (0.00 sec)
What am I missing here? From what I understand, the basic query LOOKS right to me, but I'm obviously missing some piece of critical information.
Many have answered, but I'll add some clarification. When writing a LEFT JOIN, I list the table I want everything from first (the left side, hence reading from left to right), then left join to the other table (the right side) on whatever criteria relate them. If there are additional criteria against the right-side table, those conditions stay in the join condition. Moving them into the WHERE clause effectively turns the join into an INNER JOIN (rows must always match), which is not what you want. I also try to always write left table alias.field = right table alias.field to keep the correlation clear. Then apply the WHERE clause to the criteria you want from the first table, something like:
select
a.id,
a.name,
ga.earned_epoch,
ga.offline
from
achievement a
LEFT OUTER JOIN gamer_achievement ga
ON a.id = ga.achievement_id
AND a.game_id = ga.game_id
AND ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
where
a.game_id = '1480656849'
order by
convert (a.id, unsigned)
Notice the direct relation between "a" and "ga" by the common ID and game ID values, but then tacked on the specific gamer. The where clause only cares at the outer level of achievement based on the specific game.
In the WHERE clause you discard some rows that the LEFT JOIN would have filled with NULL values. You want to put the condition ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024' inside the JOIN clause.
Another option is:
LEFT OUTER JOIN (SELECT * FROM gamer_achievement
WHERE gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
) ga
Remember that the join is performed first, producing NULL values where the join condition cannot be met; only then is the WHERE filter applied.
WHERE clauses filter results from the entire result set. If you want to apply a filter only to the JOIN, then you can add the expression to the ON clause.
In the following query, I've moved the filter expression that applies to the joined table (ga.gamer_id =) from the WHERE clause to the ON clause. This prevents the expression from filtering out rows where gamer_achievement values are NULL.
SELECT a.id,
a.name,
ga.earned_epoch,
ga.offline
FROM achievement a
LEFT OUTER JOIN gamer_achievement ga
ON ga.achievement_id = a.id
AND ga.game_id = a.game_id
AND ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
WHERE
a.game_id = '1480656849'
ORDER BY CONVERT(a.id, UNSIGNED)
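The difference between the two placements can be seen on a tiny data set. The sketch below uses Python's sqlite3 module with two achievements, only one of them earned. The gamer id 'gamer-1' is a placeholder invented for the example, and CAST(... AS INTEGER) stands in for MySQL's CONVERT(..., UNSIGNED).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE achievement (id TEXT, game_id TEXT, name TEXT,
                          PRIMARY KEY (id, game_id));
CREATE TABLE gamer_achievement (game_id TEXT, achievement_id TEXT, gamer_id TEXT,
                                earned_epoch INTEGER,
                                PRIMARY KEY (game_id, achievement_id, gamer_id));
INSERT INTO achievement VALUES
  ('4','1480656849','Oyster-meister'),
  ('5','1480656849','Big Cheese of the South Seas');
-- Only achievement 4 has been earned by the (hypothetical) gamer.
INSERT INTO gamer_achievement VALUES ('1480656849','4','gamer-1',1149789371);
""")

BASE_SQL = """
SELECT a.id, a.name, ga.earned_epoch
FROM achievement a
LEFT OUTER JOIN gamer_achievement ga
  ON ga.achievement_id = a.id
 AND ga.game_id = a.game_id {on_extra}
WHERE a.game_id = '1480656849' {where_extra}
ORDER BY CAST(a.id AS INTEGER)
"""

# Gamer filter in WHERE: rows where ga.gamer_id is NULL are discarded.
filter_in_where = conn.execute(
    BASE_SQL.format(on_extra="", where_extra="AND ga.gamer_id = 'gamer-1'")
).fetchall()

# Gamer filter in ON: unearned achievements survive with NULL columns.
filter_in_on = conn.execute(
    BASE_SQL.format(on_extra="AND ga.gamer_id = 'gamer-1'", where_extra="")
).fetchall()
```

With the filter in WHERE, only the earned achievement comes back. With the filter in ON, both achievements are returned and the unearned one has NULL (None in Python) for earned_epoch.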
It's because of this line:
where ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
If the gamer hasn't earned the achievement, the ga.gamer_id value will be NULL and not qualify for the WHERE condition.
My guess is that the WHERE clause is filtering out your desired results. Moving the gamer_id condition into the LEFT JOIN may work:
select a.id,
a.name,
ga.earned_epoch,
ga.offline
from achievement a
LEFT OUTER JOIN gamer_achievement ga
ON (a.id = ga.achievement_id and
a.game_id = ga.game_id and
ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024')
where a.game_id = '1480656849'
order by convert (a.id, unsigned)