Using MySQL COUNT(1), COUNT(2) ...etc using JOIN

I have 4 tables:
Table talks
table talks_fan
table talks_follow
table talks_comments
What I'm trying to achieve is counting all comments, fans, and followers for every single talk.
All four tables have a talk_id column, but it is the primary key only in the talks table.
This is what I have come up with so far:
SELECT
g.*,
COUNT( m.talk_id ) AS num_of_comments,
COUNT( f.talk_id ) AS num_of_followers
FROM
talks AS g
LEFT JOIN talks_comments AS m
USING ( talk_id )
LEFT JOIN talks_follow AS f
USING ( talk_id )
WHERE g.privacy = 'public'
GROUP BY g.talk_id
ORDER BY g.created_date DESC
LIMIT 30;
I also tried using this method
SELECT
t.*,
COUNT(b.talk_id) AS comments,
COUNT(bt.talk_id) AS followers
FROM
talks t
LEFT JOIN talks_follow bt
ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b
ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
Both give me the same results ....?!
Update: Create Statements
CREATE TABLE IF NOT EXISTS `talks` (
`talk_id` bigint(20) NOT NULL AUTO_INCREMENT,
`user_id` mediumint(9) NOT NULL,
`title` varchar(255) NOT NULL,
`content` text NOT NULL,
`created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`privacy` enum('public','private') NOT NULL DEFAULT 'private',
PRIMARY KEY (`talk_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=7 ;
CREATE TABLE IF NOT EXISTS `talks_comments` (
`comment_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` mediumint(9) NOT NULL,
`comment` text NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`comment_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;
CREATE TABLE IF NOT EXISTS `talks_fan` (
`fan_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` bigint(20) NOT NULL,
`created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`fan_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=4 ;
CREATE TABLE IF NOT EXISTS `talks_follow` (
`follow_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` mediumint(9) NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`follow_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
The final query that works
SELECT t.* , COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
LEFT JOIN talks_fan c ON c.talk_id = t.talk_id
WHERE t.privacy = 'public'
GROUP BY t.talk_id
ORDER BY t.created_date DESC
LIMIT 30
EDIT: Final answer to the whole issue...
I have modified the query and written some PHP (CodeIgniter) code to solve my issue, upon the recommendation of @Bill Karwin:
$sql="
SELECT t.*,
COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans,
GROUP_CONCAT( DISTINCT c.user_id ) AS list_of_fans,
GROUP_CONCAT( DISTINCT bt.user_id ) AS list_of_follower
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
LEFT JOIN talks_fan c ON c.talk_id = t.talk_id
WHERE t.privacy = 'public'
GROUP BY t.talk_id
ORDER BY t.created_date DESC
LIMIT 30
";
$query = $this->db->query($sql);
if($query->num_rows() > 0)
{
$results = array();
foreach($query->result_array() AS $talk){
$fan_user_id = explode(",", $talk['list_of_fans']);
foreach($fan_user_id AS $user){
if($user == 1 /* this supposed to be user id or session*/){
$talk['list_of_fans'] = 'yes';
}
}
$follower_user_id = explode(",", $talk['list_of_follower']);
foreach($follower_user_id AS $user){
if($user == 1 /* this supposed to be user id or session*/){
$talk['list_of_follower'] = 'yes';
}
}
$results[] = array(
'talk_id' => $talk['talk_id'],
'user_id' => $talk['user_id'],
'title' => $talk['title'],
'created_date' => $talk['created_date'],
'comments' => $talk['comments'],
'followers' => $talk['followers'],
'fans' => $talk['fans'],
'list_of_fans' => $talk['list_of_fans'],
'list_of_follower' => $talk['list_of_follower']
);
}
}
I still believe it could be optimized in the DB so that I can just use the result directly.
I'm thinking that if every single talk has 1,000 followers and 2,000 fans, the result will take much longer to load, and longer still if you multiply those numbers by 10. Or am I mistaken here?
EDIT: adding a benchmark for the query test...
I used the CodeIgniter profiler to measure how long the query takes to finish executing.
That being said, I also started adding data to the tables gradually.
The results are as follows.
Testing the DB after inserting data into it
Query Results time
table Talks
---------------
table data 50 rows.
Time: 0.0173 seconds
Table Rows: 644 rows
Time: 0.0535 seconds
Table Rows: 1250 rows
Time: 0.0856 seconds
Adding data to other tables
--------------------------
Talks = 1250 rows
talks_follow = 4115
talks_fan = 10 rows
Time: 2.656 seconds
Adding data to other tables
--------------------------
Talks = 1250 rows
talks_follow = 4115
talks_fan = 10 rows
talks_comments = 3650 rows
Time: 10.156 seconds
After replacing LEFT JOIN with STRAIGHT_JOIN
Time: 6.675 seconds
It seems that it's extremely heavy on the DB.
Now I'm facing another dilemma: how to enhance its performance.
Edited: after rebuilding the DB using @leonardo_assumpcao's suggestion
to index a few fields:
Adding data to other tables
--------------------------
Talks = 6000 Rows
talks_follow = 10000 Rows
talks_fan = 10000 Rows
talks_comments = 10000 Rows
Time: 17.940 seconds
Is this normal for a DB holding this much data?

I can say this is (at least) one of the coolest select statements I improved today.
SELECT STRAIGHT_JOIN
t.* ,
COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans
FROM
(
SELECT * FROM talks
WHERE privacy = 'public'
ORDER BY created_date DESC
LIMIT 0, 30
) AS t
LEFT JOIN talks_follow bt ON (bt.talk_id = t.talk_id)
LEFT JOIN talks_comments b ON (b.talk_id = t.talk_id)
LEFT JOIN talks_fan c ON (c.talk_id = t.talk_id)
GROUP BY t.talk_id ;
But it seems to me that your problem lies in your tables. A first step toward efficient queries is to index every field involved in your joins.
I've made some modifications to the tables you showed above; you can see the code here (updated).
Quite interesting, isn't it? Since we're here, take your ER model as well:
First try it on a MySQL test database. Hopefully it will solve your performance troubles.
(Forgive my english, it's my second language)

You can force this into one query like so:
SELECT COUNT(*) num, 'talks' item FROM talks
UNION
SELECT COUNT(*) num, 'talks_fan' item FROM talks_fan
UNION
SELECT COUNT(*) num, 'talks_follow' item FROM talks_follow
UNION
SELECT COUNT(*) num, 'talks_comment' item FROM talks_comments
This will give you a four-row result set with one row per table. Each row is the count in a particular table.
If you must get it all into a single row you can do a pivot like so.
SELECT
SUM( CASE item WHEN 'talks' THEN num ELSE 0 END ) AS 'talks',
SUM( CASE item WHEN 'talks_fan' THEN num ELSE 0 END ) AS 'talks_fan',
SUM( CASE item WHEN 'talks_follow' THEN num ELSE 0 END ) AS 'talks_follow',
SUM( CASE item WHEN 'talks_comment' THEN num ELSE 0 END ) AS 'talks_comment'
FROM
( SELECT COUNT(*) num, 'talks' item FROM talks
UNION
SELECT COUNT(*) num, 'talks_fan' item FROM talks_fan
UNION
SELECT COUNT(*) num, 'talks_follow' item FROM talks_follow
UNION
SELECT COUNT(*) num, 'talks_comment' item FROM talks_comments
) counts
(This doesn't take into account your WHERE g.privacy = clause because I don't understand that. But you could add a WHERE clause to one of the four queries in the UNION to handle that.)
Notice that this truly is four queries on four separate tables coerced into a single query.
And, by the way, there is no difference in value between COUNT(*) and COUNT(id) when id is the primary key of the table. COUNT(id) doesn't count the rows for which id is NULL, but a primary key is always NOT NULL. COUNT(*) is at least as fast and states the intent more clearly, so use it.
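The difference between COUNT(*) and COUNT(column) only shows up on a nullable column. Here is a minimal sketch of that, using Python's built-in sqlite3 module as a stand-in for MySQL; the demo table and its note column are hypothetical and not part of the question's schema.

```python
import sqlite3

# SQLite stands in for MySQL; "demo" and "note" are made-up names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo (id INTEGER PRIMARY KEY, note TEXT);  -- note is nullable
    INSERT INTO demo (note) VALUES ('a'), (NULL), ('c');
""")

# COUNT(*) counts rows; COUNT(col) counts rows where col IS NOT NULL.
count_star, count_id, count_note = conn.execute(
    "SELECT COUNT(*), COUNT(id), COUNT(note) FROM demo"
).fetchone()

print(count_star, count_id, count_note)  # 3 3 2
```

COUNT(id) matches COUNT(*) because the primary key can never be NULL, while COUNT(note) skips the row with the NULL note.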
Edit if you need the number of fan, follow, and comment rows for each distinct talk, do this. It's the same idea of doing a union and a pivot, but with an extra parameter.
SELECT
talk_id,
SUM( CASE item WHEN 'talks_fan' THEN num ELSE 0 END ) AS 'talks_fan',
SUM( CASE item WHEN 'talks_follow' THEN num ELSE 0 END ) AS 'talks_follow',
SUM( CASE item WHEN 'talks_comment' THEN num ELSE 0 END ) AS 'talks_comment'
FROM
(
SELECT talk_id, COUNT(*) num, 'talks_fan' item
FROM talks_fan
GROUP BY talk_id
UNION
SELECT talk_id, COUNT(*) num, 'talks_follow' item
FROM talks_follow
GROUP BY talk_id
UNION
SELECT talk_id, COUNT(*) num, 'talks_comment' item
FROM talks_comments
GROUP BY talk_id
) counts
GROUP BY talk_id
After doing this for (too) many years, I've discovered that the best way to describe a query you need is to say to yourself "I need a result set with one row for each xxx, with columns for yyy, zzz, and qqq."

The reason the counts are the same is that it's counting rows after the joins have combined the tables. By joining to multiple tables, you're creating a Cartesian product.
Basically, you're counting not only how many comments per talk, but how many comments * followers per talk. Then you count the followers as how many followers * comments per talk. Thus the counts are the same, and they're all way too high.
Here's a simpler way to write a query to count each distinct comment, follower, etc. only once:
SELECT t.*,
COUNT(DISTINCT b.comment_id) AS comments,
COUNT(DISTINCT bt.follow_id) AS followers
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
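The inflation described above is easy to reproduce on a tiny data set. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made for runnability; the table and column names follow the question) with one talk that has 3 comments and 2 followers.

```python
import sqlite3

# SQLite stands in for MySQL here; names follow the question's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE talks          (talk_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE talks_comments (comment_id INTEGER PRIMARY KEY, talk_id INTEGER);
    CREATE TABLE talks_follow   (follow_id INTEGER PRIMARY KEY, talk_id INTEGER);
    INSERT INTO talks VALUES (1, 'demo');
    INSERT INTO talks_comments (talk_id) VALUES (1), (1), (1);  -- 3 comments
    INSERT INTO talks_follow   (talk_id) VALUES (1), (1);       -- 2 followers
""")

JOINS = """
    FROM talks t
    LEFT JOIN talks_follow bt  ON bt.talk_id = t.talk_id
    LEFT JOIN talks_comments b ON b.talk_id  = t.talk_id
    GROUP BY t.talk_id
"""

# Plain COUNT sees every comment/follower combination (3 * 2 joined rows).
inflated = conn.execute(
    "SELECT COUNT(b.talk_id), COUNT(bt.talk_id)" + JOINS
).fetchone()

# COUNT(DISTINCT key) collapses the repeated rows back to the real totals.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT b.comment_id), COUNT(DISTINCT bt.follow_id)" + JOINS
).fetchone()

print(inflated)  # (6, 6)
print(distinct)  # (3, 2)
```

Both plain counts come out as 6 because the join produces 3 × 2 rows for the talk, and every one of those rows has a non-NULL talk_id on both sides.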
Re your comment: I wouldn't fetch all the followers in the same query. You could do it this way:
SELECT t.*,
COUNT(DISTINCT b.comment_id) AS comments,
COUNT(DISTINCT bt.follow_id) AS followers,
GROUP_CONCAT(DISTINCT bt.follower_name) AS list_of_followers
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
But what you'd get back is a single string with the follower names separated by commas. Now you have to write application code to split the string on commas, you have to worry if some follower names actually contain commas already, and so on.
I'd do a second query, fetching the followers for a given talk. It's likely you want to display the followers only for a specific talk anyway.
SELECT follower_name
FROM talks_follow
WHERE talk_id = ?

Related

mySql - search rows without reference in second table

I have 2 tables.
How do I search for all rows in the first table that have no reference in the second table?
The connecting fields are: res_srvs.id = inv_supp2srv.srvID
So, I want to get all rows of table "res_srvs" whose id does not appear as srvID in table "inv_supp2srv".
TABLE: res_srvs
Column Type
id int(11)
clientID int(6)
resNum int(9)
net decimal(7,2)
tax decimal(7,2)
from_date date
TABLE: inv_supp2srv
Column Type
clientID int(6)
invNum int(10)
srvID int(11)
amount decimal(7,2)
valid tinyint(1)
This is what i tried:
SELECT srv.net , srv.tax , srv.net+srv.tax AS amount, srv.id AS srv_id
FROM res_srvs AS srv , inv_supp2srv AS i2s
WHERE srv.clientID = 1
AND srv.from_date >= '2020-03-01'
AND i2s.clientID = 1
AND i2s.srvID = srv.id
AND (NOT EXISTS
(
SELECT *
FROM inv_supp2srv AS i2s
WHERE i2s.srvID = srv.id
)
)

What you want is a left outer join with exclusion:
SELECT r.*
FROM res_srvs r
LEFT JOIN inv_supp2srv i
ON r.id = i.srvID
WHERE i.srvID IS NULL
-- AND ... (your other WHERE conditions go here)
;

You can LEFT JOIN the second table and filter on a NULL joined value, like this:
SELECT srv.net , srv.tax , srv.net+srv.tax AS amount, srv.id AS srv_id
FROM res_srvs AS srv
LEFT JOIN inv_supp2srv AS i2s ON i2s.srvID = srv.id
WHERE
srv.clientID = 1
AND srv.from_date >= '2020-03-01'
-- AND i2s.clientID = 1 not relevant condition
AND i2s.srvID IS NULL;
Another approach is using NOT EXISTS condition:
SELECT srv.net , srv.tax , srv.net+srv.tax AS amount, srv.id AS srv_id
FROM res_srvs AS srv
WHERE
srv.clientID = 1
AND srv.from_date >= '2020-03-01'
AND NOT EXISTS (
SELECT srvID FROM inv_supp2srv AS i2s WHERE i2s.srvID = srv.id
);
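As a quick check that the two forms above select the same rows, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The tables are cut down to the columns the anti-join needs, and the three sample services are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE res_srvs     (id INTEGER PRIMARY KEY, clientID INTEGER);
    CREATE TABLE inv_supp2srv (srvID INTEGER, clientID INTEGER);
    INSERT INTO res_srvs VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO inv_supp2srv VALUES (2, 1);   -- only service 2 is referenced
""")

# Form 1: outer join, then keep the rows that found no partner.
left_join = conn.execute("""
    SELECT srv.id
    FROM res_srvs AS srv
    LEFT JOIN inv_supp2srv AS i2s ON i2s.srvID = srv.id
    WHERE srv.clientID = 1 AND i2s.srvID IS NULL
    ORDER BY srv.id
""").fetchall()

# Form 2: correlated NOT EXISTS.
not_exists = conn.execute("""
    SELECT srv.id
    FROM res_srvs AS srv
    WHERE srv.clientID = 1
      AND NOT EXISTS (SELECT 1 FROM inv_supp2srv AS i2s WHERE i2s.srvID = srv.id)
    ORDER BY srv.id
""").fetchall()

print(left_join)   # [(1,), (3,)]
print(not_exists)  # [(1,), (3,)]
```

Which form runs faster depends on the engine, version and indexes, so it is worth checking both with EXPLAIN on the real tables.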

I want to get all table res_srvs rows that have no srvID in table inv_supp2srv.
It looks like you are overcomplicating this. I don't see the point for the join between the tables in the outer query - it attempts to match the tables, which contradicts the not exists condition.
I think you just want:
select r.*
from res_srvs r
where
r.from_date >= '2020-03-01'
and r.clientID = 1
and not exists (
select 1
from inv_supp2srv i
where i.srvID = r.id and i.clientID = r.clientID
)
I am unsure whether you want clientID in the correlation clause or not - your query makes it look like it is the case, so I added it.

MySQL group by multiple columns issue

The query below collects temporary overview data for every user into a memory table. Basically, the user sees a count of items by keyword.
The problem is that it calculates the total count of items by keyword_id.
What I need is to calculate item_count by both keyword_id and item_type (Item.type).
SELECT
`Item`.`user_id` AS `user_id` ,
`ItemKeyword`.`keywordID` AS `keyword_id` ,
`Keyword`.`title` AS `keyword_title`,
count(`ItemKeyword`.`ItemID`) AS `ico_count`
FROM
(
(
`ItemKeyword`
JOIN `Item` ON(
(
`Item`.`id` = `ItemKeyword`.`ItemID`
)
)
)
JOIN `Keyword` ON(
(
`Keyword`.`id` = `ItemKeyword`.`keywordID`
)
)
)
GROUP BY
`Item`.`user_id` ,
`ItemKeyword`.`keywordID`;
Details
For example, the result currently looks like the one below.
Basically, item_count is the total across all item types. What I need is to separate the result below
user_id keyword_id keyword_title item_count
1 9645 surveillance 20
Into something like this:
user_id keyword_id keyword_title item_count item_type
1 9645 surveillance 18 1
1 9645 surveillance 2 2
Here item_count is calculated by both keyword_id and item_type.
I can't figure out how to include item_type in this query as well.
Any suggestions?

I do not understand your love of brackets (parentheses). Why so many? In my opinion you lose readability. That was just a side note.
If you need an extra grouping level, you need to modify the query like this:
SELECT
`Item`.`user_id` AS `user_id` ,
`ItemKeyword`.`keywordID` AS `keyword_id` ,
`Keyword`.`title` AS `keyword_title`,
count(`ItemKeyword`.`ItemID`) AS `ico_count`,
`Item`.`type` AS `item_type`
FROM `ItemKeyword` JOIN `Item` ON
`Item`.`id` = `ItemKeyword`.`ItemID`
JOIN `Keyword` ON
`Keyword`.`id` = `ItemKeyword`.`keywordID`
GROUP BY
`Item`.`user_id` ,
`Item`.`type` ,
`ItemKeyword`.`keywordID`,
`Keyword`.`title`;
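To see the extra grouping level at work, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The three sample items are invented, and the count column is aliased item_count to match the desired output in the question.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item        (id INTEGER PRIMARY KEY, user_id INTEGER, type INTEGER);
    CREATE TABLE Keyword     (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE ItemKeyword (ItemID INTEGER, keywordID INTEGER);
    INSERT INTO Keyword VALUES (9645, 'surveillance');
    INSERT INTO Item VALUES (1, 1, 1), (2, 1, 1), (3, 1, 2);  -- two of type 1, one of type 2
    INSERT INTO ItemKeyword VALUES (1, 9645), (2, 9645), (3, 9645);
""")

# Adding Item.type to GROUP BY splits each keyword's count per item type.
rows = conn.execute("""
    SELECT Item.user_id, ItemKeyword.keywordID, Keyword.title,
           COUNT(ItemKeyword.ItemID) AS item_count, Item.type AS item_type
    FROM ItemKeyword
    JOIN Item    ON Item.id    = ItemKeyword.ItemID
    JOIN Keyword ON Keyword.id = ItemKeyword.keywordID
    GROUP BY Item.user_id, Item.type, ItemKeyword.keywordID, Keyword.title
    ORDER BY Item.type
""").fetchall()

for row in rows:
    print(row)
# (1, 9645, 'surveillance', 2, 1)
# (1, 9645, 'surveillance', 1, 2)
```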

How to optimize this SELECT?

I have one-to-many tables Payment and PaymentFlows to keep track of payment workflows.
Different managers are interested in certain workflows only, so whenever a payment reaches a certain workflow, a list is provided to them.
For example,
Payment 1 - A) Apply
B) Checked
C) Approved by Manager
D) Approved by CFO
E) Cheque issued
Payment 2 - A) Apply
B) Checked
C) Approved by Manager
Payment 3 - A) Apply
B) Checked
C) Approved by Manager
Payment 4 - A) Apply
B) Checked
To show all payments at workflow C, what I did is:
class Payment < ActiveRecord::Base
def self.search_by_workflow(flow_code)
# Bind flow_code as a query parameter instead of leaving the bare name in the SQL string.
self.find_by_sql(["SELECT * FROM payments P INNER JOIN (
SELECT payment_id FROM (
SELECT * FROM (
SELECT * FROM payment_flows F
ORDER BY F.payment_flow_id DESC
) latest GROUP BY payment_id
) flows WHERE flows.code = ?
) IDs ON IDs.payment_id = P.payment_id ORDER BY P.payment_id DESC LIMIT 100;", flow_code])
end
end
so:
@payments = Payment.search_by_workflow('Approved by Manager')
returns: Payment 2 and 3
However, the performance is not very good (5 to 7 seconds for 15,000 payments and 55,000 workflows).
How can I improve the performance?
UPDATE (with table structures):
CREATE TABLE `payments` (
`payment_id` int(11) NOT NULL,
`payment_type_code` varchar(50) default 'PETTY_CASH',
`status` varchar(16) NOT NULL default '?',
PRIMARY KEY (`payment_id`),
KEY `status` (`status`),
KEY `payment_type_code` (`payment_type_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `payment_flows` (
`payment_flow_id` int(11) NOT NULL,
`payment_id` int(11) default NULL,
`code` varchar(64) default NULL,
`status` varchar(255) NOT NULL default 'new',
PRIMARY KEY (`payment_flow_id`),
KEY `payment_id` (`payment_id`),
KEY `code` (`code`),
KEY `status` (`status`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
UPDATE (with named_scope):
named_scope :by_workflows, lambda { |workflows| { :conditions => [ "EXISTS (
SELECT 'FLOW'
FROM payment_flows pf
WHERE pf.payment_id = payments.payment_id
AND pf.proc_code IN (:flows)
AND NOT EXISTS (
SELECT 'OTHER'
FROM payment_flows pfother
WHERE pfother.payment_id = pf.payment_id
AND pfother.payment_flow_id > pf.payment_flow_id
)
)", { :flows => workflows } ]}
}
for convenience, e.g.:
Payment.by_workflows(['Approved by Manager', 'Approved by CFO']).count

Try this:
SELECT * FROM payments p
WHERE EXISTS(
SELECT 'FLOW'
FROM payment_flows pf
WHERE pf.payment_id = p.payment_id
AND pf.code = flow_code
AND NOT EXISTS(
SELECT 'OTHER'
FROM payment_flows pf2
WHERE pf2.payment_id = pf.payment_id
AND pf2.payment_flow_id > pf.payment_flow_id
)
)
Note: in the query, flow_code is a variable holding the code you want to search for.
I've added an outer EXISTS condition that checks for the presence of flow_code, and a nested NOT EXISTS condition that checks that the same payment has no later flow row after the one with flow_code.
Let me know whether this performs better.

It looks like you are defining the "latest" payment_flows row for a given payment to be the row with the largest value of payment_flow_id.
For better performance, if you can, replace a couple of your indexes on payment_flows
by ADDING these indexes
... ON payment_flows(code,payment_id,payment_flow_id)
... ON payment_flows(payment_id,payment_flow_id)
and DROPPING these (now redundant) indexes
... ON payment_flows(code)
... ON payment_flows(payment_id)
I would suggest this query:
SELECT p.*
FROM payments p
JOIN ( SELECT c.payment_id
, MAX(c.payment_flow_id) AS flow_id
FROM payment_flows c
WHERE c.code = :flow_code /* <-- query parameter */
GROUP BY c.payment_id
ORDER BY c.code DESC, c.payment_id DESC
) d
ON d.payment_id = p.payment_id
LEFT
JOIN payment_flows n
ON n.payment_id = d.payment_id
AND n.payment_flow_id > d.flow_id
WHERE n.payment_id IS NULL
ORDER BY d.payment_id DESC
LIMIT 100
The inline view query "d" gets the latest payment_flow_id (if any) for the specified code (:flow_code), so it returns only the payments that are at least that far in the processing flow.
The query uses an anti-join pattern to exclude rows that have a payment_flow_id that is "later" than the one for the specified code.
The anti-join is an outer join, to return all rows from the left side, along with matching rows from the right side, with a condition in the WHERE clause that excludes all rows that had a matching row. (Note the inequality comparison, only rows that had a "later" payment_flow_id values would be a match.)
There's no guarantee that this will be faster.
But with the suggested index improvements, it should get you nice-looking EXPLAIN output. (EXPLAIN gives you a pretty good handle on the access plan that the query will use.)
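The anti-join pattern can be tried out on the four example payments from the question. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, shortens the workflow codes to the letters A to E, and refers to the inline view's MAX(...) column by its alias flow_id. The helper payments_at is a hypothetical name used only for this demo.

```python
import sqlite3

# SQLite stands in for MySQL; codes A..E abbreviate the question's workflow steps.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments      (payment_id INTEGER PRIMARY KEY);
    CREATE TABLE payment_flows (payment_flow_id INTEGER PRIMARY KEY,
                                payment_id INTEGER, code TEXT);
""")
history = {1: "ABCDE", 2: "ABC", 3: "ABC", 4: "AB"}
for payment_id, codes in history.items():
    conn.execute("INSERT INTO payments VALUES (?)", (payment_id,))
    for code in codes:  # payment_flow_id grows in insertion order
        conn.execute(
            "INSERT INTO payment_flows (payment_id, code) VALUES (?, ?)",
            (payment_id, code),
        )

QUERY = """
    SELECT p.payment_id
    FROM payments p
    JOIN (SELECT c.payment_id, MAX(c.payment_flow_id) AS flow_id
          FROM payment_flows c
          WHERE c.code = ?
          GROUP BY c.payment_id) d
      ON d.payment_id = p.payment_id
    LEFT JOIN payment_flows n
      ON n.payment_id = d.payment_id
     AND n.payment_flow_id > d.flow_id      -- any later step for this payment?
    WHERE n.payment_id IS NULL              -- keep payments with no later step
    ORDER BY p.payment_id DESC
    LIMIT 100
"""

def payments_at(code):
    """Return ids of payments whose latest flow row has the given code."""
    return [row[0] for row in conn.execute(QUERY, (code,))]

print(payments_at("C"))  # [3, 2]
print(payments_at("B"))  # [4]
```

Payment 1 also passed through step C, but it is excluded because it has later flow rows (D and E).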

How to write correct sql with left join on some tables?

Good day.
The table structures, and the error I get when executing the query, are on SQLFiddle.
I have some SQL queries.
First query:
SELECT
n.Type AS Type,
n.UserIdn AS UserIdn,
u.Username AS Username,
n.NewsIdn AS NewsIdn,
n.Header AS Header,
n.Text AS Text,
n.Tags AS Tags,
n.ImageLink AS ImageLink,
n.VideoLink AS VideoLink,
n.DateCreate AS DateCreate
FROM News n
LEFT JOIN Users u ON n.UserIdn = u.UserIdn
SECOND QUERY:
SELECT
IFNULL(SUM(Type = 'up'),0) AS Uplikes,
IFNULL(SUM(Type = 'down'),0) AS Downlikes,
(IFNULL(SUM(Type = 'up'),0) - IFNULL(SUM(Type = 'down'),0)) AS SumLikes
FROM Likes
WHERE NewsIdn=NewsIdn -- only for example: in the main SQL, NewsIdn = the NewsIdn value from the row of table News
ORDER BY UpLikes DESC
And the third query:
SELECT
count(*) as Favorit
FROM Favorites
WHERE NewsIdn=NewsIdn -- only for example: in the main SQL, NewsIdn = the NewsIdn value from the row of table News
I would like to combine these queries: display all rows from the News table together with the number of Uplikes, DownLikes and Favorit for each NewsIdn value (i.e. for each row of News), ordered by Uplikes DESC.
Please tell me how to do this.
P.S.: I would like the following values in the result:
TYPE USERIDN USERNAME NEWSIDN HEADER TEXT TAGS IMAGELINK VIDEOLINK DATECREATE UPLIKES DOWNLIKES SUMLIKES FAVORIT
image 346412 test 260806 test 1388152519.jpg December, 27 2013 08:55:27+0000 2 0 2 2
image 108546 test2 905554 test2 1231231111111111111111111 123. 123 1388153493.jpg December, 27 2013 09:11:41+0000 1 0 1 0
text 108546 test2 270085 test3 123 .123 December, 27 2013 09:13:30+0000 1 0 1 0
image 108546 test2 764955 test4 1388192300.jpg December. 27 2013 19:58:22+0000 0 1 -1 0

First, your table structures have all the "Idn" columns as varchar(30). It appears those are actually ID keys to the other tables, and they should be integers for better indexing and join performance.
Second, this type of process, especially when web-based, is a perfect case for DENORMALIZING the values for likes, dislikes, and favorites by keeping those counters as columns directly on the record (e.g. in the News table). When a person likes, dislikes, or marks an item as a favorite, stamp it right away and be done with it. If you need a one-time bulk SQL update to initialize the counts, do so, but also put triggers on the tables to keep the counts updated automatically. This way, you just query the table directly and order by whatever you need, and you are not required to query all like/dislike records joined to all news to see which is best. Having an index on the News table will be your best bet.
Now, that said, with your existing table structures you can do it via pre-aggregate queries, joining them as aliases in the SQL FROM clause... something like
SELECT
N.Type,
N.UserIdn,
U.UserName,
N.NewsIdn,
N.Header,
N.Text,
N.Tags,
N.ImageLink,
N.VideoLink,
N.DateCreate,
COALESCE( SumL.UpLikes, 0 ) as Uplikes,
COALESCE( SumL.DownLikes, 0 ) as DownLikes,
COALESCE( SumL.NetLikes, 0 ) as NetLikes,
COALESCE( Fav.FavCount, 0 ) as FavCount
from
News N
JOIN Users U
ON N.UserIdn = U.UserIdn
LEFT JOIN ( select
L.NewsIdn,
SUM( L.Type = 'up' ) as UpLikes,
SUM( L.Type = 'down' ) as DownLikes,
SUM( ( L.Type = 'up' ) - ( L.Type = 'down' )) as NetLikes
from
Likes L
group by
L.NewsIdn ) SumL
ON N.NewsIdn = SumL.NewsIdn
LEFT JOIN ( select
F.NewsIdn,
COUNT(*) as FavCount
from
Favorites F
group by
F.NewsIdn ) Fav
ON N.NewsIdn = Fav.NewsIdn
order by
SumL.UpLikes DESC
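A cut-down version of the pre-aggregate approach can be run as follows. This sketch uses Python's built-in sqlite3 module as a stand-in for MySQL, keeps only the columns needed for the counts, and uses integer NewsIdn values and invented sample rows for brevity.

```python
import sqlite3

# SQLite stands in for MySQL; the tables are simplified and the rows made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE News      (NewsIdn INTEGER PRIMARY KEY, Header TEXT);
    CREATE TABLE Likes     (id INTEGER PRIMARY KEY, NewsIdn INTEGER, Type TEXT);
    CREATE TABLE Favorites (id INTEGER PRIMARY KEY, NewsIdn INTEGER);
    INSERT INTO News VALUES (1, 'first'), (2, 'second');
    INSERT INTO Likes (NewsIdn, Type) VALUES (1, 'up'), (1, 'up'), (1, 'down');
    INSERT INTO Favorites (NewsIdn) VALUES (1), (1);
""")

# Each derived table is aggregated to one row per NewsIdn BEFORE the join,
# so likes and favorites cannot multiply each other.
rows = conn.execute("""
    SELECT N.NewsIdn,
           COALESCE(SumL.UpLikes, 0)   AS Uplikes,
           COALESCE(SumL.DownLikes, 0) AS DownLikes,
           COALESCE(Fav.FavCount, 0)   AS FavCount
    FROM News N
    LEFT JOIN (SELECT NewsIdn,
                      SUM(Type = 'up')   AS UpLikes,
                      SUM(Type = 'down') AS DownLikes
               FROM Likes GROUP BY NewsIdn) SumL ON SumL.NewsIdn = N.NewsIdn
    LEFT JOIN (SELECT NewsIdn, COUNT(*) AS FavCount
               FROM Favorites GROUP BY NewsIdn) Fav ON Fav.NewsIdn = N.NewsIdn
    ORDER BY Uplikes DESC, N.NewsIdn
""").fetchall()

print(rows)  # [(1, 2, 1, 2), (2, 0, 0, 0)]
```

The news row without any likes or favorites still appears, with zeros supplied by COALESCE.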
Again, I do not understand why you would have an auto-increment numeric ID column for the News table and then ANOTHER value for it, NewsIdn, as a varchar. I would just have this and your other tables reference the News.ID column directly... why have two columns representing the same thing? And obviously, each table you aggregate over (Likes, Favorites) should have indexes on the columns you join or aggregate on (hence the NewsIdn column, UserIdn, etc.).
And a final reminder: this type of query ALWAYS runs aggregates against your ENTIRE TABLE of likes and favorites EVERY TIME, so I suggest going with denormalized columns to hold the counts, updated whenever someone likes or favorites an item. You can always go back to the raw tables if you ever want to show or update a particular person's like/dislike/favorite status.
You'll have to read up on triggers, as each database has its own syntax for them.
As for table structures, this is a SIMPLIFIED version of what I would have (I removed many other columns from your SQLFiddle sample)
CREATE TABLE IF NOT EXISTS `News` (
id int(11) NOT NULL AUTO_INCREMENT,
UserID integer NOT NULL,
... other fields
`DateCreate` datetime NOT NULL,
PRIMARY KEY ( id ),
KEY ( UserID )
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
extra key on the User ID in case you wanted all news activity created by a specific user.
CREATE TABLE IF NOT EXISTS `Users` (
id int(11) NOT NULL AUTO_INCREMENT,
other fields...
PRIMARY KEY ( id ),
KEY ( LastName, Name )
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
additional key in case you want to do a search by a user's name
CREATE TABLE IF NOT EXISTS `Likes` (
id int(11) NOT NULL AUTO_INCREMENT,
UserId integer NOT NULL,
NewsID integer NOT NULL,
`Type` enum('up','down') NOT NULL,
`IsFavorite` enum('yes','no') NOT NULL,
`DateCreate` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY ( UserID ),
KEY ( NewsID, IsFavorite )
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=6 ;
The additional keys here are for joining and/or aggregates. I've also added a flag column for being a favorite. This could remove the need for a Favorites table, since it holds the same basic content as Likes. So someone could just LIKE/DISLIKE a given news item, but ALSO flag it as a FAVORITE that the end user wants to be able to reference quickly.
Now, how do these table structures simplify querying? Each table has its own "id" column, and any OTHER table uses tableNameID (UserID, NewsID, LikesID or whatever) to refer to it, and that is the join.
select ...
from
News N
Join Users U
on N.UserID = U.ID
Join Likes L
on N.ID = L.NewsID
Integer columns are easier and more commonly identifiable by others when writing queries... Does this make a little more sense?

SELECT
n.Type AS Type,
n.UserIdn AS UserIdn,
u.Username AS Username,
n.NewsIdn AS NewsIdn,
n.Header AS Header,
n.Text AS Text,
n.Tags AS Tags,
n.ImageLink AS ImageLink,
n.VideoLink AS VideoLink,
n.DateCreate AS DateCreate,
IFNULL(SUM(Likes.Type = 'up'),0) AS Uplikes,
IFNULL(SUM(Likes.Type = 'down'),0) AS Downlikes,
(IFNULL(SUM(Likes.Type = 'up'),0) - IFNULL(SUM(Likes.Type = 'down'),0)) AS SumLikes,
COUNT(DISTINCT Favorites.id) as Favorit
FROM News n
LEFT JOIN Users u ON n.UserIdn = u.UserIdn
LEFT JOIN Likes ON Likes.NewsIdn = n.NewsIdn
LEFT JOIN Favorites ON n.NewsIdn=Favorites.NewsIdn
GROUP BY n.NewsIdn

Using JOIN and SUM returns unwanted null row when WHERE condition is not met

Please consider the following query:
Select all payments of a user and UNION the results with the user's invoices.
SELECT `id`,
`amount` AS `value`,
'PAYMENT' AS `transaction_type`
FROM `payment`
WHERE `user_id` = $user_id
UNION ALL
SELECT `i`.`id`,
(-1) * SUM(`ii`.`unit_price` * `ii`.`quantity`) AS `value`,
'INVOICE' AS `transaction_type`
FROM `invoice` `i`
JOIN `invoiceitem` `ii` ON `ii`.`invoice_id` = `i`.`id`
WHERE `user_id` = $user_id AND `type` = 'invoice'
The problem is that for users that have no payment and no invoice, an unwanted row is returned like this:
id | value | transaction_type
=================================
NULL | 0 | NULL
But for users that have some data, the result is completely expected.
IMPORTANT EDIT
After some more research, I found that the problem seems to come from the second subquery below:
SELECT i.id,
(-1) * SUM(ii.unit_price * ii.quantity) AS `value`,
'INVOICE' AS `trans_type`
FROM invoice i
JOIN invoiceitem ii ON ii.invoice_id = i.id
WHERE user_id = 4 AND type = 'invoice'
which returns the following:
id | value | transaction_type
=================================
NULL | NULL | INVOICE
Of course, the user with user_id = 4 does not have any invoices yet. But for another user who has some invoices, the result is OK.

This row is created by the aggregate function SUM. In order to prevent this, use a valid GROUP BY clause, probably GROUP BY user_id
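The effect can be reproduced in a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample invoice belongs to a hypothetical user 7, and the grouping column used here is i.id (an assumption; pick whichever column defines one output row in your data).

```python
import sqlite3

# SQLite stands in for MySQL; user 7 and the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice     (id INTEGER PRIMARY KEY, user_id INTEGER, type TEXT);
    CREATE TABLE invoiceitem (invoice_id INTEGER, unit_price REAL, quantity INTEGER);
    INSERT INTO invoice VALUES (1, 7, 'invoice');
    INSERT INTO invoiceitem VALUES (1, 10.0, 2), (1, 5.0, 1);
""")

BASE = """
    SELECT i.id,
           (-1) * SUM(ii.unit_price * ii.quantity) AS value,
           'INVOICE' AS trans_type
    FROM invoice i
    JOIN invoiceitem ii ON ii.invoice_id = i.id
    WHERE i.user_id = ? AND i.type = 'invoice'
"""

# Without GROUP BY, an aggregate query always yields exactly one row,
# even when no input rows match (user 4 has no invoices).
no_group = conn.execute(BASE, (4,)).fetchall()

# With GROUP BY, no matching rows means no groups and therefore no output rows.
grouped = conn.execute(BASE + " GROUP BY i.id", (4,)).fetchall()
with_data = conn.execute(BASE + " GROUP BY i.id", (7,)).fetchall()

print(no_group)   # [(None, None, 'INVOICE')]
print(grouped)    # []
print(with_data)  # [(1, -25.0, 'INVOICE')]
```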

It's impossible to say with any certainty without understanding the complete table descriptions, but based on the update to your question, you need to eliminate rows that have NULL values for the column i.id:
SELECT i.id
, (-1) * SUM(ii.unit_price * ii.quantity) AS `value`
, 'INVOICE' AS `trans_type`
FROM invoice i
JOIN invoiceitem ii
ON ii.invoice_id = i.id
WHERE user_id = 4
AND type = 'invoice'
AND i.id IS NOT NULL
I'm guessing that there is a logical defect in your data model or there might be some other column you should use. I can speculate that this invoice row could be a cancelled order, but it is clear that a row exists where the id column is null, which is why it appears in the result.

To avoid such NULLs, use a LEFT JOIN instead of an INNER JOIN. That is, replace the following line of your SQL:
JOIN `invoiceitem` `ii` ON `ii`.`invoice_id` = `i`.`id`
with this one:
LEFT OUTER JOIN `invoiceitem` `ii` ON `ii`.`invoice_id` = `i`.`id`
