MySQL query, COUNT and SUM with two joined tables - mysql

I need a little help with a MySQL query.
I have two tables. The first is a list of backlinks with an is_homepage (bool) flag. The second is a list of the domains for all of the backlinks, with a link_found (bool) flag and a url_count column that holds the number of rows in the backlinks table associated with each domain.
Note that the domain_id column is the foreign key to the domains table's id column. Here's some sample data.
backlinks
id  domain_id  is_homepage  page_href
1   1          1            http://ablog.wordpress.com/
2   1          0            http://ablog.wordpress.com/contact/
3   1          0            http://ablog.wordpress.com/archives/
4   2          1            http://www.somewhere.org/
5   2          0            http://www.somewhere.org/page=3
6   3          1            http://www.great-fun-site.com/
7   3          0            http://www.great-fun-site.com/index.html
8   4          0            http://red.blgspot.com/page=7
9   4          0            http://blue.blgspot.com/page=9
domains
id  url_count  link_found  domain_name
1   3          1           wordpress.com
2   2          0           somewhere.org
3   2          1           great-fun-site.com
4   2          1           blgspot.com
The results I'm looking to get from the above data would be: count = 2, total = 5.
I'm trying to get the number of matching rows in the domains table (count) and the sum of their url_count values (total), WHERE link_found is 1 and at least one of the domain's links in the backlinks table has is_homepage set to 1.
Here's the query I'm trying to work with.
SELECT SUM(1) AS count, SUM(`url_count`) total
FROM `domains` AS domain
LEFT JOIN `backlinks` AS link ON link.domain_id = domain.id
WHERE domain.id IN (
SELECT DISTINCT(bl.domain_id)
FROM `backlinks` AS bl
WHERE bl.tablekey_id = 11
AND bl.is_homepage = 1
)
AND domain.link_found = 1
AND link.is_homepage = 1
GROUP BY `domain`.`id`
The problem with this query is that it returns a row for each entry in the domains table. I think I might need one more sub query to add up the returned results but I'm not sure if that's correct. Does anyone see what I'm doing wrong? Thank you!
EDIT:
The problem I'm having is that if there is more than one homepage for a domain in the backlinks table, that domain is counted multiple times. I need to count each domain only once.

Well, you shouldn't have to do a GROUP BY, as you are not selecting anything other than aggregated fields. I'm no MySQL expert, but this should work:
SELECT COUNT(d.id) AS count, SUM(d.url_count) AS total
FROM domains AS d
INNER JOIN backlinks AS b
ON b.domain_id = d.id
WHERE d.link_found = 1 AND b.is_homepage = 1

The reason you're getting a row for each entry in the domains table is that you're grouping by domain.id. If you want grand totals only, just leave off the GROUP BY piece.
I think a fairly simple query will do the trick:
SELECT COUNT(*), SUM(domains.URL_Count)
FROM domains
WHERE domains.link_found = 1 AND domains.id IN (
SELECT domain_id FROM backlinks WHERE is_homepage = 1)
There's a working SQLFiddle here.
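To see why the IN version counts each domain only once, here is a small sketch that loads the sample data from the question and runs both the plain join and the IN query. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the extra backlink row with id 10 is hypothetical, added to simulate a domain with two homepage links.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption); both queries are plain SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domains (id INTEGER PRIMARY KEY, url_count INTEGER,
                      link_found INTEGER, domain_name TEXT);
CREATE TABLE backlinks (id INTEGER PRIMARY KEY, domain_id INTEGER,
                        is_homepage INTEGER, page_href TEXT);
INSERT INTO domains VALUES
  (1, 3, 1, 'wordpress.com'),
  (2, 2, 0, 'somewhere.org'),
  (3, 2, 1, 'great-fun-site.com'),
  (4, 2, 1, 'blgspot.com');
INSERT INTO backlinks VALUES
  (1, 1, 1, 'http://ablog.wordpress.com/'),
  (2, 1, 0, 'http://ablog.wordpress.com/contact/'),
  (3, 1, 0, 'http://ablog.wordpress.com/archives/'),
  (4, 2, 1, 'http://www.somewhere.org/'),
  (5, 2, 0, 'http://www.somewhere.org/page=3'),
  (6, 3, 1, 'http://www.great-fun-site.com/'),
  (7, 3, 0, 'http://www.great-fun-site.com/index.html'),
  (8, 4, 0, 'http://red.blgspot.com/page=7'),
  (9, 4, 0, 'http://blue.blgspot.com/page=9');
""")

# Plain join: one output row per (domain, homepage link) pair.
JOIN_QUERY = """
SELECT COUNT(d.id), SUM(d.url_count)
FROM domains AS d
INNER JOIN backlinks AS b ON b.domain_id = d.id
WHERE d.link_found = 1 AND b.is_homepage = 1
"""

# IN subquery: each domain row is tested once, however many homepages match.
IN_QUERY = """
SELECT COUNT(*), SUM(url_count)
FROM domains
WHERE link_found = 1
  AND id IN (SELECT domain_id FROM backlinks WHERE is_homepage = 1)
"""

in_before = conn.execute(IN_QUERY).fetchone()
join_before = conn.execute(JOIN_QUERY).fetchone()

# Hypothetical extra row: a second homepage link for domain 1.
conn.execute(
    "INSERT INTO backlinks VALUES (10, 1, 1, 'http://another.wordpress.com/')"
)

in_after = conn.execute(IN_QUERY).fetchone()
join_after = conn.execute(JOIN_QUERY).fetchone()

print("IN query:  ", in_before, "->", in_after)
print("JOIN query:", join_before, "->", join_after)
```

On the original sample data both queries agree, but once a domain has a second homepage link the join counts that domain twice, while the IN query still returns count = 2 and total = 5.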

Thanks for the help. Sorry it was so hard to explain; I need a MySQL fiddle :)
If anyone's interested, here's what I ended up with:
SELECT SUM(1) AS count, SUM(total) AS total
FROM
(
SELECT SUM(`url_count`) total
FROM `domains` AS domain
LEFT JOIN `backlinks` AS link ON link.domain_id = domain.id
WHERE domain.id IN (
SELECT DISTINCT(bl.domain_id)
FROM `backlinks` AS bl
WHERE bl.tablekey_id = 11
AND bl.is_homepage = 1
)
AND domain.link_found = 1
AND link.is_homepage = 1
GROUP BY `domain`.`id`
) AS result

Related

How to do counts across 2 tables

I have searched for hours for a solution to this MySQL use case. I have found many examples, but none of them had composite primary keys. I want to do a count across 2 tables and also calculate the difference. Here are the two separate queries:
primary key testId, tpId
SELECT Count(*) AS count,
testid,
tpid
FROM cicdexpecteddocument e
WHERE e.testid = 8
GROUP BY e.tpid;
3 8 756abdaa-31c0-11ea-9c52-0245f4ff0412
3 8 7ea2b31b-31c0-11ea-9c52-0245f4ff0412
1 8 c25780cb-31c0-11ea-9c52-0245f4ff0412
2 8 c9f70ed9-31c0-11ea-9c52-0245f4ff0412
primary key testId, tpId, executionId
SELECT Count(*) AS count,
testid,
tpid
FROM cicdactualdocument a
WHERE a.testid = 8
AND a.executionid =
'execution-d0c5e270-50f2-472e-a609-ac2c381e0a5f-2020.01.09'
GROUP BY tpid;
2 8 7ea2b31b-31c0-11ea-9c52-0245f4ff0412
2 8 c25780cb-31c0-11ea-9c52-0245f4ff0412
2 8 c9f70ed9-31c0-11ea-9c52-0245f4ff0412
I would like to end up with something like
3 3 8 756abdaa-31c0-11ea-9c52-0245f4ff0412
3 2 1 8 7ea2b31b-31c0-11ea-9c52-0245f4ff0412
1 2 -1 8 c25780cb-31c0-11ea-9c52-0245f4ff0412
2 2 0 8 c9f70ed9-31c0-11ea-9c52-0245f4ff0412
Any guidance is appreciated. Thank you in advance
Here's the code. See below for an explanation.
SELECT count_t1, count_t2,
IFNULL(count_t1, 0) - IFNULL(count_t2, 0) AS diff,
t1.testid, t1.tpid
FROM (
SELECT COUNT(*) AS count_t1,
testid,
tpid
FROM cicdexpecteddocument e
WHERE e.testid = 8
GROUP BY e.tpid
) AS t1
LEFT OUTER JOIN
(
SELECT COUNT(*) AS count_t2,
testid,
tpid
FROM cicdactualdocument a
WHERE a.testid = 8 AND a.executionid =
'execution-d0c5e270-50f2-472e-a609-ac2c381e0a5f-2020.01.09'
GROUP BY tpid
) AS t2 ON t1.testid = t2.testid AND t1.tpid = t2.tpid
ORDER BY tpid
Dissecting this:
Skip past the SELECT columns and notice how the two queries that you specified for each table are contained in parentheses followed by "AS t1" (or t2): those are subqueries. After the 2nd subquery (2nd to last line) is where I specified the join condition.
Next is how the "diff" column is calculated. It uses the IFNULL() function, which returns the value specified in the 2nd parameter if the first value is NULL. That allows the database to do the calculation even when one side of the LEFT JOIN has no matching row and its count is NULL.
Note: I just put this together quick and dirty, but I'm not making any assumptions about speed here. If you have a couple hundred rows, no big deal. But if you're dealing with thousands of rows in each table, you may need to work on optimizing this query.
Hope that helps!
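For anyone who wants to try the pattern without the real tables, here is a minimal sketch using Python's sqlite3 module in place of MySQL (an assumption). The two-column and three-column schemas, the short tpid values such as 'tp-a', and the execution ids 'exec-0' and 'exec-1' are all made up for illustration; only the shape of the query follows the answer above.

```python
import sqlite3

# SQLite stands in for MySQL (an assumption); the schema is simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cicdexpecteddocument (testid INTEGER, tpid TEXT);
CREATE TABLE cicdactualdocument (testid INTEGER, tpid TEXT, executionid TEXT);
""")

# Hypothetical data: expected counts are 3, 3, 1, 2 for tp-a..tp-d.
expected_rows = (
    [(8, "tp-a")] * 3 + [(8, "tp-b")] * 3 + [(8, "tp-c")] + [(8, "tp-d")] * 2
)
# Actual counts for exec-1 are 2, 2, 2 for tp-b..tp-d; tp-a only ran in exec-0.
actual_rows = (
    [(8, "tp-b", "exec-1")] * 2
    + [(8, "tp-c", "exec-1")] * 2
    + [(8, "tp-d", "exec-1")] * 2
    + [(8, "tp-a", "exec-0")]
)
conn.executemany("INSERT INTO cicdexpecteddocument VALUES (?, ?)", expected_rows)
conn.executemany("INSERT INTO cicdactualdocument VALUES (?, ?, ?)", actual_rows)

# Aggregate each table on its own, then LEFT JOIN the two summaries.
DIFF_QUERY = """
SELECT t1.count_t1, t2.count_t2,
       IFNULL(t1.count_t1, 0) - IFNULL(t2.count_t2, 0) AS diff,
       t1.testid, t1.tpid
FROM (
    SELECT COUNT(*) AS count_t1, testid, tpid
    FROM cicdexpecteddocument
    WHERE testid = ?
    GROUP BY testid, tpid
) AS t1
LEFT JOIN (
    SELECT COUNT(*) AS count_t2, testid, tpid
    FROM cicdactualdocument
    WHERE testid = ? AND executionid = ?
    GROUP BY testid, tpid
) AS t2 ON t1.testid = t2.testid AND t1.tpid = t2.tpid
ORDER BY t1.tpid
"""

diff_rows = conn.execute(DIFF_QUERY, (8, 8, "exec-1")).fetchall()
for row in diff_rows:
    print(row)
```

The row for 'tp-a' comes back with a NULL actual count (None in Python) because the LEFT JOIN finds no match for that execution, and IFNULL turns that NULL into 0 for the subtraction.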

How to select a row from table1 if the row id isn't present in table2 more than x times

accounts as a1 | team_logs as tl1
--------------------------------------------------------
id Name counter | id team_id user_id account_id
1 Account 1 2 | 1 1 100 1
2 Account 2 2 | 2 2 200 1
3 Account 3 0 | 3 3 300 2
... | 4 2 200 2
This is an account review app. Based on the 2 tables above, I need a query that outputs 1 account from the a1 table based on the tl1 records, as follows:
1. A team member requests an account, and once an account is assigned to him, a log entry is made in tl1 recording that the account_id is assigned to him.
2. An account can be assigned to a team only once.
3. An account can be assigned to x teams (in the above example we have only 3 teams).
4. A record can be reviewed x times (in the example above it can be reviewed 3 times).
I had a project where I had only 3 teams and each teams logs were stored in its own table, and I had this query which worked:
Example for Team1
SELECT `a1`.*
FROM `accounts` AS `a1`
LEFT JOIN `team1_logs` AS `tl1` ON tl1.account_id = a1.id
WHERE (tl1.account_id IS NULL)
AND (a1.counter < '3')
ORDER BY RAND()
LIMIT 1
a1 has a counter column whose value is the number of times a row was shown to teams. Now my project can house x teams; we made the teams dynamic, so making a log table for each team isn't an option.
So, given the above tables, suppose I want each account to be reviewed (assigned to a team member) 3 times:
Account 1 can be reviewed 1 more time by any team that isn't 1 or 2
Account 2 can be reviewed 1 more time by any team that isn't 2 or 3
What would my new query need to look like if I want to get the next available record, based on criteria 1-4 above?
The data in table 2 is more than enough; you don't need to know any other data to write the needed query.
team_id is a query input (since we need to output an account to the team member).
Answer
Assuming that I am a team member of team 1
SELECT DISTINCT a.*
FROM accounts AS a
LEFT JOIN (
SELECT account_id, team_id FROM team_logs) AS tl1 ON a.id = tl1.account_id
WHERE a.id NOT IN (
SELECT account_id FROM team_logs WHERE team_id =1)
AND a.counter < 3
ORDER BY a.id ASC
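Here is a small runnable sketch of the same idea against the sample data from the question. It uses Python's sqlite3 module instead of MySQL (an assumption), and it swaps the NOT IN for a NOT EXISTS, which is a variation on the query above rather than the same statement. The function name available_accounts and the max_reviews parameter are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL (an assumption); data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, counter INTEGER);
CREATE TABLE team_logs (id INTEGER PRIMARY KEY, team_id INTEGER,
                        user_id INTEGER, account_id INTEGER);
INSERT INTO accounts VALUES
  (1, 'Account 1', 2), (2, 'Account 2', 2), (3, 'Account 3', 0);
INSERT INTO team_logs VALUES
  (1, 1, 100, 1), (2, 2, 200, 1), (3, 3, 300, 2), (4, 2, 200, 2);
""")


def available_accounts(conn, team_id, max_reviews=3):
    """Accounts below the review limit that this team has not been assigned.

    Hypothetical helper: NOT EXISTS is used in place of NOT IN, so a NULL
    account_id in team_logs could not silently empty the result.
    """
    query = """
    SELECT a.id, a.name
    FROM accounts AS a
    WHERE a.counter < ?
      AND NOT EXISTS (
          SELECT 1 FROM team_logs AS tl
          WHERE tl.account_id = a.id AND tl.team_id = ?
      )
    ORDER BY a.id
    """
    return conn.execute(query, (max_reviews, team_id)).fetchall()


for team in (1, 2, 3):
    print(team, available_accounts(conn, team))
```

To hand out a single account, a LIMIT 1 can be added to the query; the ordering (by id here, or random as in the original per-team query) is a design choice left to the application.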
If you just want to see which teams are not allowed to review the account again, join with a subquery that uses GROUP_CONCAT to get the list of teams that have reviewed it.
SELECT a.*, 3 - counter AS remaining_reviews, IFNULL(tl.already_reviewed, '') AS already_reviewed
FROM accounts AS a
LEFT JOIN (
SELECT account_id, GROUP_CONCAT(team_id ORDER BY team_id) AS already_reviewed
FROM team_logs
GROUP BY account_id) AS tl ON a.id = tl.account_id
WHERE a.counter < 3

Select sum of zero if no records in second table?

I did some research and learned about the COALESCE(sum(num), 0) function. The issue is the example I found only related to using one table.
I am calculating a sum from a second table, and if there are no records for an item in the second table, I still want it to show up in my query and have a sum of zero.
SELECT note.user, note.product, note.noteID, note.note, COALESCE(sum(noteTable.Score), 0) as points
FROM note, noteTable
WHERE note.user <> 3 AND note.noteID = noteTable.noteID
I am only receiving results if there is an entry in the second table, noteTable. If there are no scores added for a note, I still want it to show up in the result with a points value of zero.
Table Examples:
Note
user | product | noteID |note
3 1 1 Great
3 2 2 Awesome
4 1 3 Sweet
NoteTable
noteID | score
1 5
The query should show me this:
user | noteID | sum(points)
3 1 5
3 2 0
4 3 0
But I am only getting this:
user | noteID | sum(points)
3 1 5
http://sqlfiddle.com/#!9/aae812/2
SELECT
note.user,
note.product,
note.noteID, note.note,
COALESCE(sum(noteTable.Score),0) as points
FROM note
LEFT JOIN noteTable
ON note.noteID = noteTable.noteID
WHERE note.user <> 3
and I guess you should add:
GROUP BY note.noteid
if you expect to get a SUM for every note. So you want to get more than 1 record back.
First, learn to use proper JOIN syntax and table aliases. The answer to your question is SUM() along with COALESCE():
SELECT n.user, n.product, n.noteID, n.note,
COALESCE(sum(nt.Score), 0) as points
FROM note n LEFT JOIN
noteTable nt
ON n.noteID = nt.noteID
WHERE n.user <> 3
GROUP BY n.user, n.product, n.noteID, n.note;
You also need a GROUP BY.
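A quick way to check the difference between the comma join and the LEFT JOIN is to run both on the sample tables from the question. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption) and leaves out the `user <> 3` filter, since the expected output in the question includes user 3.

```python
import sqlite3

# SQLite stands in for MySQL (an assumption); data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE note (user INTEGER, product INTEGER, noteID INTEGER, note TEXT);
CREATE TABLE noteTable (noteID INTEGER, score INTEGER);
INSERT INTO note VALUES (3, 1, 1, 'Great'), (3, 2, 2, 'Awesome'), (4, 1, 3, 'Sweet');
INSERT INTO noteTable VALUES (1, 5);
""")

# Comma join with the match in WHERE: behaves like an INNER JOIN,
# so notes without any score rows disappear.
INNER_QUERY = """
SELECT n.user, n.noteID, COALESCE(SUM(nt.score), 0) AS points
FROM note AS n, noteTable AS nt
WHERE n.noteID = nt.noteID
GROUP BY n.user, n.noteID
ORDER BY n.noteID
"""

# LEFT JOIN keeps every note; SUM over no rows is NULL, COALESCE makes it 0.
LEFT_QUERY = """
SELECT n.user, n.noteID, COALESCE(SUM(nt.score), 0) AS points
FROM note AS n
LEFT JOIN noteTable AS nt ON n.noteID = nt.noteID
GROUP BY n.user, n.noteID
ORDER BY n.noteID
"""

inner_rows = conn.execute(INNER_QUERY).fetchall()
left_rows = conn.execute(LEFT_QUERY).fetchall()
print("comma join:", inner_rows)
print("left join: ", left_rows)
```

Note that a filter on the right-hand table belongs in the ON clause; putting it in WHERE would discard the NULL-extended rows and turn the LEFT JOIN back into an inner join.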

MySQL order by points from 2nd table

I have 3 MySQL tables: items (which in this case are lodging properties; the data is simplified below), amenities that the properties might offer, and amenities_index, which lists an item id and an amenity id for each amenity offered. The end user can select any number of amenities, and I want to return the results ordered by the number of amenities that match what they are looking for. So, if they search for 3 different amenities, I want the items that offer all 3 listed first, then those that offer 2, then 1, and finally the rest of the items. I have a query that I think returns the results in the correct order, but I was hoping to also return a point value based on the matches, and that's where I'm running into trouble. My SQL skills are a bit lacking when it comes to more complex queries.
Here is an example query I have that returns the results in the correct order:
SELECT * FROM items
ORDER BY
(
SELECT count(*) AS points
FROM `amenities_index`
WHERE
(amenity_id = 1 || amenity_id = 2)
AND amenities_index.item_id = items.id
) DESC
And here is what the tables are structured like. Any help is appreciated.
items table
id name
1 location 1
2 location 2
3 location 3
4 location 4
amenities table
id name
1 fireplace
2 television
3 handicapped accessible
4 kitchenette
5 phone
amenities_index
item_id amenity_id
1 2
1 3
1 5
2 1
2 2
2 6
3 2
3 3
3 4
3 5
You want to move your expression into the select clause:
SELECT i.*,
(SELECT count(*) AS points
FROM `amenities_index` ai
WHERE amenity_id in (1, 2) AND
ai.item_id = i.id
) as points
FROM items i
ORDER BY points desc;
You can also do this as a join query with aggregation:
SELECT i.*, ai.points
FROM items i join
(select ai.item_id, count(*) as points
from amenities_index ai
where amenity_id in (1, 2)
group by ai.item_id
) ai
on ai.item_id = i.id
ORDER BY ai.points desc;
In most databases, I would prefer this version over the first one. However, MySQL would allow the first in a view but not the second, so it has some strange limitations under some circumstances.
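One thing to watch with the join form is that an inner join drops items with no matching amenities, while the question asks for "the rest of the items" at the end of the list. The sketch below compares the correlated subquery with a LEFT JOIN variant that keeps those items at 0 points. It runs on Python's sqlite3 module as a stand-in for MySQL (an assumption), and the alias names m and matches are made up for the example.

```python
import sqlite3

# SQLite stands in for MySQL (an assumption); data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE amenities_index (item_id INTEGER, amenity_id INTEGER);
INSERT INTO items VALUES
  (1, 'location 1'), (2, 'location 2'), (3, 'location 3'), (4, 'location 4');
INSERT INTO amenities_index VALUES
  (1, 2), (1, 3), (1, 5), (2, 1), (2, 2), (2, 6),
  (3, 2), (3, 3), (3, 4), (3, 5);
""")

# Correlated subquery in the SELECT list: evaluated once per item,
# so items with no matches still appear with 0 points.
CORRELATED = """
SELECT i.id, i.name,
       (SELECT COUNT(*) FROM amenities_index AS ai
        WHERE ai.amenity_id IN (1, 2) AND ai.item_id = i.id) AS points
FROM items AS i
ORDER BY points DESC, i.id
"""

# Derived table grouped by item, LEFT JOINed so unmatched items survive;
# COALESCE converts their NULL match count to 0.
LEFT_JOINED = """
SELECT i.id, i.name, COALESCE(m.matches, 0) AS points
FROM items AS i
LEFT JOIN (
    SELECT item_id, COUNT(*) AS matches
    FROM amenities_index
    WHERE amenity_id IN (1, 2)
    GROUP BY item_id
) AS m ON m.item_id = i.id
ORDER BY points DESC, i.id
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
left_joined_rows = conn.execute(LEFT_JOINED).fetchall()
for row in left_joined_rows:
    print(row)
```

The secondary sort on i.id is only there to make the order of ties deterministic.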

MySQL join with a subquery

I have three tables and am trying to get info from two and then perform a calculation on the third and display all the results in one query.
The (simplified) tables are:
table: employee_work
employee_id name
1 Joe
2 Bob
3 Jane
4 Michelle
table: carryover
employee_id days
1 5
2 10
3 3
table: timeoff
employee_id time_off_type days
1 Carryover 2
1 Leave 3
1 Carryover 1
2 Sick 4
2 Carryover 4
3 Leave 1
4 Sickness 4
The results I would like are:
employee_id, carryover.days, timeoff.days
1 5 3
2 10 4
3 3 0
However when I run the query, whilst I get the correct values in columns 1 and 2, I get the same number repeated in the third column for all entries.
Here is my query:
Select
employee_work.employee_id,
carryover.carryover,
(SELECT SUM(days) FROM timeoff WHERE timeoff.time_off_type = 'Carryover'
AND timeoff.start_date>='2013-01-01') AS taken
From
carryover Left Join
employee_work On employee_work.employee_id = carryover.employee_id Left Join
timeoff On employee_work.employee_id = timeoff.employee_id Left Join
Where
carryover.carryover > 0
Group By
employee_work.employee_id
I have tried to group by in the sub query but I then get told "Subquery returns more than one row" - how can I ensure that the sub query is respecting the join so it only looks at each employee at a time so I get my desired results?
The answer to your question is to use a correlated subquery. You don't need to mention the timeoff table twice in this case:
Select
employee_work.employee_id,
carryover.carryover,
(SELECT SUM(days)
FROM timeoff
WHERE timeoff.time_off_type = 'Carryover' and
timeoff.start_date>='2013-01-01' and
timeoff.employee_id = employee_work.employee_id
) AS taken
From
carryover Left Join
employee_work On employee_work.employee_id = carryover.employee_id
Where
carryover.carryover > 0
Group By
employee_work.employee_id;
An alternative structure is to do the grouping for all employees in the from clause. You can also remove the employee_work table, because it does not seem to be being used. (You can use carryover.employee_id for the id.)
Select co.employee_id, co.carryover, et.taken
From carryover co Left Join
(SELECT employee_id, SUM(days) as taken
FROM timeoff
WHERE timeoff.time_off_type = 'Carryover' and
timeoff.start_date>='2013-01-01'
GROUP BY employee_id
) et
on co.employee_id = et.employee_id
Where co.carryover > 0;
I don't think the outer GROUP BY is necessary here. If it is, then you should probably have an aggregation function in the original query.
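To make the difference concrete, here is a sketch that runs an uncorrelated subquery (the cause of the repeated number in the question) next to a grouped derived table on the sample data. It uses Python's sqlite3 module in place of MySQL (an assumption). The sample tables have no start_date column, so the date filter is left out, and the carryover amount is read from the days column shown in the sample rather than a carryover column. The COALESCE is an addition so that employees with no carryover time off show 0 instead of NULL.

```python
import sqlite3

# SQLite stands in for MySQL (an assumption); data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE carryover (employee_id INTEGER, days INTEGER);
CREATE TABLE timeoff (employee_id INTEGER, time_off_type TEXT, days INTEGER);
INSERT INTO carryover VALUES (1, 5), (2, 10), (3, 3);
INSERT INTO timeoff VALUES
  (1, 'Carryover', 2), (1, 'Leave', 3), (1, 'Carryover', 1),
  (2, 'Sick', 4), (2, 'Carryover', 4), (3, 'Leave', 1), (4, 'Sickness', 4);
""")

# Uncorrelated subquery: nothing ties it to the outer row, so every
# employee gets the grand total of all carryover days taken.
UNCORRELATED = """
SELECT co.employee_id, co.days,
       (SELECT SUM(days) FROM timeoff
        WHERE time_off_type = 'Carryover') AS taken
FROM carryover AS co
ORDER BY co.employee_id
"""

# Derived table grouped per employee, then LEFT JOINed to carryover.
GROUPED = """
SELECT co.employee_id, co.days, COALESCE(et.taken, 0) AS taken
FROM carryover AS co
LEFT JOIN (
    SELECT employee_id, SUM(days) AS taken
    FROM timeoff
    WHERE time_off_type = 'Carryover'
    GROUP BY employee_id
) AS et ON co.employee_id = et.employee_id
WHERE co.days > 0
ORDER BY co.employee_id
"""

uncorrelated_rows = conn.execute(UNCORRELATED).fetchall()
grouped_rows = conn.execute(GROUPED).fetchall()
print("uncorrelated:", uncorrelated_rows)
print("grouped:     ", grouped_rows)
```

The grouped version reproduces the result table the question asks for, with one row per employee in carryover.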