Why aren't these two result sets adding up? - mysql

I have a "master" table (cited_papers) of 144,194 rows, and a "sample" table (publication) that contains a sample of 7,977 of these rows. I am trying to get the rows (by their unique id field) that are in the master table but not the sample table:
SELECT DISTINCT c.*
FROM alex_WOS.cited_papers as c LEFT JOIN alex_WOS.publication USING (id)
WHERE alex_WOS.publication.id IS NULL
This works, but the result count I get is 141,019. Why aren't these counts adding up? (141,019 + 7977 != 144,194) I did a SELECT DISTINCT to count the rows in both the master and sample tables, so I am certain there are no duplicates in either of those tables.

The distinct may be throwing things off. Run the queries below to verify your numbers.
Verify the number of "master" rows:
SELECT count(*) FROM alex_WOS.cited_papers
Verify the number of "sample" rows:
SELECT count(*) FROM alex_WOS.publication
Verify the number of "master" rows not in "sample" table:
SELECT count(*) FROM alex_WOS.cited_papers c LEFT JOIN alex_WOS.publication p USING(id) WHERE p.id IS NULL
These numbers should add up...

As JNevill suggested, my sample table was not a subset of master after all. I'm a dummy...
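With the numbers from the question, 144,194 - 141,019 = 3,175 master rows found a match, which would leave 7,977 - 3,175 = 4,802 sample ids that do not exist in master (assuming ids are unique in both tables). The sketch below reproduces the effect on a small scale. It uses Python's built-in sqlite3 module rather than MySQL, and the rows are made up for illustration; only the table names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cited_papers (id INTEGER PRIMARY KEY);
    CREATE TABLE publication  (id INTEGER PRIMARY KEY);
    -- Hypothetical data: ids 1..10 in master; the sample has 3 real ids and 2 strays.
    INSERT INTO cited_papers (id) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
    INSERT INTO publication  (id) VALUES (2),(5),(9),(101),(102);
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

master = scalar("SELECT COUNT(*) FROM cited_papers")
sample = scalar("SELECT COUNT(*) FROM publication")

# Anti-join: master rows with no partner in the sample.
not_in_sample = scalar("""
    SELECT COUNT(*)
    FROM cited_papers c LEFT JOIN publication p ON c.id = p.id
    WHERE p.id IS NULL
""")

# Reverse anti-join: sample rows with no partner in master.
strays = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM publication p LEFT JOIN cited_papers c ON p.id = c.id
    WHERE c.id IS NULL
    ORDER BY p.id
""")]

print(master, sample, not_in_sample, strays)
# The counts only add up once the strays are subtracted from the sample.
print(master == not_in_sample + (sample - len(strays)))
```

Running the reverse anti-join against the real tables is a quick way to check whether the sample really is a subset of master.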

Related

WHERE clause with INNER JOIN and Sub Query

I have a query that does what I want, but I would like to understand why it works. My database has multiple tables. I need to take the id from a table called product and use it to retrieve data from several other tables, where the product id is a foreign key. The query below works fine (I was just experimenting and got lucky).
SELECT ponsfdp.*, product.pName, product.pImage, product.productSizes FROM product
INNER JOIN priceOnSizesForDigitalPrinting AS ponsfdp ON ponsfdp.pId_fk =
(SELECT pId FROM product WHERE pName LIKE "%booklet%")
WHERE pName LIKE "%booklet%";
But when I tried this query,
SELECT ponsfdp.*, product.pName, product.pImage, product.productSizes FROM product
INNER JOIN priceOnSizesForDigitalPrinting AS ponsfdp ON ponsfdp.pId_fk =
(SELECT pId FROM product WHERE pName LIKE "%booklet%");
It returns all the data, including rows with null fields. Can someone explain how this works? I expected both queries to return the same data: the second query uses a subquery that returns only one id, while the first query has a WHERE clause that selects the same id by name. Why does the first query return only very specific rows, while the second returns everything, including null columns? I would like an explanation of both queries.
Your first query also produces all the rows that your second query returns. But when you add the final filter:
WHERE pName LIKE "%booklet%"
it keeps only those rows where pName is like 'booklet'. You can think of the output of your second query as a single table, with your logic working like this:
SELECT * FROM (
SELECT ponsfdp.*, product.pName, product.pImage, product.productSizes
FROM product
INNER JOIN priceOnSizesForDigitalPrinting AS ponsfdp
ON ponsfdp.pId_fk = (SELECT pId FROM product WHERE pName LIKE "%booklet%")
)A
WHERE pName LIKE "%booklet%"
Hope this gives you at least some insight into your query.
I don't see any need for a subquery here. You should be using the where condition to select rows from your FROM table, then use the ON clause of your join to find the right record(s) in your joined table for each row of the FROM table:
SELECT ponsfdp.*, product.pName, product.pImage, product.productSizes
FROM product
INNER JOIN priceOnSizesForDigitalPrinting AS ponsfdp
ON ponsfdp.pId_fk = pId
WHERE pName LIKE "%booklet%";
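To see why the two versions in the question differ, note that the ON condition with the subquery never refers to the outer product row, so every product row is paired with every price row of the booklet. The sketch below shows this with Python's built-in sqlite3 module. The sample rows and the price column are hypothetical, and SQLite stands in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (pId INTEGER PRIMARY KEY, pName TEXT);
    CREATE TABLE priceOnSizesForDigitalPrinting (pId_fk INTEGER, price REAL);
    -- Hypothetical rows: one booklet with two prices, one flyer with one price.
    INSERT INTO product VALUES (1, 'A5 booklet'), (2, 'flyer'), (3, 'poster');
    INSERT INTO priceOnSizesForDigitalPrinting VALUES (1, 10.0), (1, 12.5), (2, 3.0);
""")

# The ON condition compares the price row against a constant (the booklet id),
# not against the product row it is being joined to.
SUBQUERY_JOIN = """
    SELECT product.pName, ponsfdp.price
    FROM product
    INNER JOIN priceOnSizesForDigitalPrinting AS ponsfdp
      ON ponsfdp.pId_fk = (SELECT pId FROM product WHERE pName LIKE '%booklet%')
"""

without_where = conn.execute(SUBQUERY_JOIN).fetchall()
with_where = conn.execute(
    SUBQUERY_JOIN + " WHERE product.pName LIKE '%booklet%'"
).fetchall()

# Joining on the product row itself, then filtering with WHERE.
direct_join = conn.execute("""
    SELECT product.pName, ponsfdp.price
    FROM product
    INNER JOIN priceOnSizesForDigitalPrinting AS ponsfdp
      ON ponsfdp.pId_fk = product.pId
    WHERE product.pName LIKE '%booklet%'
""").fetchall()

print(len(without_where))  # 3 products x 2 booklet prices
print(sorted(with_where) == sorted(direct_join))
```

In this toy data the subquery version without WHERE attaches the booklet prices to the flyer and the poster as well, which is the kind of surplus the asker saw.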

Different row count while creating a table or view in Impala

Different row count when trying to create a table and view in Impala
I am trying to run a query in Impala that has a left outer join with another table. The query is as below:
SELECT COUNT (*)
FROM (
SELECT A.*,
B.ORDERED_DATE,
B.PROMISE_DATE,
B.REQUEST_DATE,
B.SCHEDULE_SHIP_DATE,
A.SCHEDULED_START_DATE,
A.SCHEDULED_COMPLETION_DATE,
A.DATE_RELEASED,
A.DATE_COMPLETED,
B.ORDERED_DATE_DT,
B.PROMISE_DATE_DT,
B.REQUEST_DATE_DT,
B.ORDERED_QUANTITY,
a.DEMAND_SOURCE_LINE_NUMBER,
B.FLOW_STATUS_CODE,
A.ORDER_NUMBER
FROM TABLE A
LEFT OUTER JOIN TABLE B
ON (A.DEMAND_SOURCE_LINE_ID) = (B.LINE_ID)
) AAAAA
Demand_source_line_id can be null here.
The row count is always different depending on whether I use count(*) or count(1). The inner select also gives a different row count than the outer one. And if I create a view from this query, the record count differs from when I create a table from the same query.
Can someone help me?
The expected result is 3585 records. I am getting only 299 with count(*) and 662 with count(1); demand source line id is not null for 662 records.
As you mentioned, Demand_source_line_id can be null, and you are using it in the ON condition. A comparison with NULL is never true, so those rows never match a row in B. You will not get the expected output, and the count will be affected as well.
You could try the COALESCE function in the ON condition, e.g. coalesce(A.DEMAND_SOURCE_LINE_ID, -1) = coalesce(B.LINE_ID, -1).
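The effect of NULL keys and of the COALESCE suggestion can be sketched on a toy data set. This uses Python's built-in sqlite3 module as a stand-in, so it illustrates standard SQL NULL comparison rather than anything verified on Impala. The table names a and b and all rows are hypothetical, and the -1 sentinel assumes that -1 is never a real id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (order_number INTEGER, demand_source_line_id INTEGER);
    CREATE TABLE b (line_id INTEGER, flow_status_code TEXT);
    -- Hypothetical rows: two of the four rows in a have a NULL join key.
    INSERT INTO a VALUES (1, 10), (2, 20), (3, NULL), (4, NULL);
    INSERT INTO b VALUES (10, 'CLOSED'), (NULL, 'ORPHAN');
""")

def join_counts(on_clause):
    """Return (rows produced by the left join, rows that found a match in b)."""
    sql = f"""
        SELECT COUNT(*), COUNT(b.flow_status_code)
        FROM a LEFT OUTER JOIN b ON {on_clause}
    """
    return conn.execute(sql).fetchone()

# NULL = NULL is not true, so the NULL-keyed rows of a stay unmatched.
plain = join_counts("a.demand_source_line_id = b.line_id")

# With the sentinel, NULL keys on both sides compare equal and do match.
coalesced = join_counts(
    "COALESCE(a.demand_source_line_id, -1) = COALESCE(b.line_id, -1)"
)

print(plain, coalesced)
```

Note that if b held several rows with a NULL line_id, the COALESCE version would pair each NULL-keyed row of a with all of them, so the total row count could grow.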

mysql - how to include zeros in the count in only one single query

I only have one table to count, I am not using any join. Is this possible?
Select engagement_type as name, COUNT(engagement_type) as y
From events
group By engagement_type
order By engagement_type
But the result only has one row, with its count, for each engagement_type that occurs. I want to show the count for all accounts, including those without any engagement_type. Like this:
Will appreciate your answers! Thanks!
If there is a lookup-table, say EngagementTypes, where all possible values of engagement types are stored, then you can query this table to get the full list of all types and do a LEFT JOIN to events table in order to get the corresponding count:
Select t1.engagement_type as name, COUNT(t2.engagement_type) as y
From EngagementTypes AS t1
left join events as t2 on t1.engagement_type = t2.engagement_type
group By t1.engagement_type
order By t1.engagement_type
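A small runnable sketch of the lookup-table approach follows, using Python's built-in sqlite3 module instead of MySQL. The EngagementTypes table is the hypothetical lookup table from the answer, and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EngagementTypes (engagement_type TEXT PRIMARY KEY);
    CREATE TABLE events (id INTEGER PRIMARY KEY, engagement_type TEXT);
    -- Hypothetical rows: no event has the type 'meeting'.
    INSERT INTO EngagementTypes VALUES ('call'), ('email'), ('meeting');
    INSERT INTO events (engagement_type) VALUES ('call'), ('call'), ('email');
""")

# Grouping the events table alone can only report types that occur in it.
events_only = conn.execute("""
    SELECT engagement_type AS name, COUNT(engagement_type) AS y
    FROM events
    GROUP BY engagement_type
    ORDER BY engagement_type
""").fetchall()

# Starting from the lookup table keeps every type; COUNT over the events
# column skips the NULLs produced by unmatched rows, which yields 0.
with_zeros = conn.execute("""
    SELECT t1.engagement_type AS name, COUNT(t2.engagement_type) AS y
    FROM EngagementTypes AS t1
    LEFT JOIN events AS t2 ON t1.engagement_type = t2.engagement_type
    GROUP BY t1.engagement_type
    ORDER BY t1.engagement_type
""").fetchall()

print(events_only)
print(with_zeros)
```

Counting t2.engagement_type rather than using COUNT(*) matters here: COUNT(*) would report 1 for a type with no events, because the left join still produces one row for it.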

include null and zero in count() from related table

I would like to list in table (staging) the number of related records from table (studies).
So far this statement works well but returns only the rows where there are >0 related records:
SELECT staging.*,
COUNT(studies.PMID) AS refcount
FROM studies
LEFT JOIN staging
ON studies.rs_number = staging.rs
GROUP BY staging.idstaging;
How can I adjust this statement to list ALL rows in table (staging) including where there are zero or null related records from table (studies)?
Thank you
You have the tables in the wrong order in the LEFT JOIN:
SELECT staging.*, COUNT(studies.PMID) AS refcount
FROM staging LEFT JOIN
studies
ON studies.rs_number = staging.rs
GROUP BY staging.idstaging;
LEFT JOIN keeps everything in the first ("left") table and all matching rows in the second. If you want to keep everything in the staging table, then put it first.
And in case anyone wants to complain about the use of staging.* with GROUP BY: this particular usage is (presumably) ANSI compliant, because staging.idstaging is (presumably) a unique id in that table.
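The effect of the table order can be checked on a toy data set. The sketch below runs both orders with Python's built-in sqlite3 module. The rows are hypothetical, only the column names come from the question, and SQLite stands in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging (idstaging INTEGER PRIMARY KEY, rs TEXT);
    CREATE TABLE studies (PMID INTEGER PRIMARY KEY, rs_number TEXT);
    -- Hypothetical rows: staging row 3 has no related studies.
    INSERT INTO staging VALUES (1, 'rs1'), (2, 'rs2'), (3, 'rs3');
    INSERT INTO studies VALUES (100, 'rs1'), (101, 'rs1'), (102, 'rs2');
""")

REFCOUNT = """
    SELECT staging.idstaging, COUNT(studies.PMID) AS refcount
    FROM {left} LEFT JOIN {right}
      ON studies.rs_number = staging.rs
    GROUP BY staging.idstaging
    ORDER BY staging.idstaging
"""

# studies on the left: only staging rows that some study points at survive.
studies_first = conn.execute(
    REFCOUNT.format(left="studies", right="staging")
).fetchall()

# staging on the left: every staging row is kept, unmatched ones count 0.
staging_first = conn.execute(
    REFCOUNT.format(left="staging", right="studies")
).fetchall()

print(studies_first)
print(staging_first)
```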

SQL Select - Some Rows Won't Display

I have two tables. One of them, named files, contains a list of all files. The second table, called payments, contains a list of payments for some files.
Payments:
id | fileid | {...}
1  | 2      |
2  | 3      |
3  | 2      |
Files:
id | {...}
1  |
2  |
3  |
I want to select all files and join the payments table in order to sort by the number of payments for each file.
In this case, the first row will be file #2, because it appears most often in the payments table.
I tried to do this, but not all of the rows are shown!
I think it happens because not all of the files are in the payments table. So in this case, I think that it won't display the first row.
Thanks, and sorry for my English
P.S: I use mysql engine
** UPDATE **
My Code:
SELECT `id`,`name`,`size`,`downloads`,`upload_date`,`server_ip`,COUNT(`uploadid`) AS numProfits
FROM `uploads`
JOIN `profits`
ON `uploads`.`id` = `profits`.`uploadid`
WHERE `uploads`.`userid` = 1
AND `removed` = 0
ORDER BY numProfits
As others have noted, you need to use LEFT JOIN. This tells MySQL that rows from the table on the left should be included even if no corresponding rows exist in the table on the right.
You should also use GROUP BY to specify how the COUNT is grouped.
So the SQL should be something like:
SELECT Files.ID, count(Payments.FileID) as numpays FROM
Files
LEFT OUTER JOIN
Payments
ON Files.id=Payments.FileID
GROUP BY files.ID
ORDER BY numpays desc
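The query above can be tried against the sample rows from the question. The sketch below does so with Python's built-in sqlite3 module, which stands in for MySQL here. The tables contain only the id columns shown in the question, and the tie-breaking sort on files.id is an addition for a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, fileid INTEGER);
    -- Sample rows from the question: file 1 has no payments.
    INSERT INTO files (id) VALUES (1), (2), (3);
    INSERT INTO payments (id, fileid) VALUES (1, 2), (2, 3), (3, 2);
""")

COUNT_PER_FILE = """
    SELECT files.id, COUNT(payments.fileid) AS numpays
    FROM files
    {join_type} JOIN payments ON files.id = payments.fileid
    GROUP BY files.id
    ORDER BY numpays DESC, files.id
"""

# LEFT OUTER JOIN keeps file 1 with a count of 0.
left_join = conn.execute(COUNT_PER_FILE.format(join_type="LEFT OUTER")).fetchall()

# INNER JOIN drops files that have no payments at all.
inner_join = conn.execute(COUNT_PER_FILE.format(join_type="INNER")).fetchall()

# Without GROUP BY the aggregate collapses everything into a single row.
ungrouped = conn.execute("""
    SELECT COUNT(payments.fileid)
    FROM files JOIN payments ON files.id = payments.fileid
""").fetchall()

print(left_join)
print(inner_join)
print(ungrouped)
```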
Try this:
select B.fileid, A.{}.....
from files A
inner join
(select count(*), fileid, .....
from payments
group by fileid) B
on A.id = B.fileid
I hope this helps. I'm assuming that all ids in the files table are unique. With this answer, you can add an ORDER BY clause as you wish. I've left the select list to you, so you can select whatever data you want to fetch.
As far as your problem is described, I think this should work. If there are any problems, do post a comment.
Try LEFT JOIN - in MySQL, the default JOIN is actually an INNER JOIN. In an INNER JOIN, you will only get results back that are in both sides of the join.
See: Difference in MySQL JOIN vs LEFT JOIN
And, as noted in the comments, you may need a GROUP BY with your COUNT as well, to prevent it from just counting all the rows that come back.