mysql Get missing values - mysql

I know this is a common question, but I need something more. I'm having trouble getting the values that have not been inserted into one table.
Ok here are my tables:
name: importantDates; cols: id, date
name: inserts; cols: id, date, employe_id
My question: how do I get the missing values for each employee? Let's say I need the missing inserts for the employee with id=213.
So far I have written the query below, but it doesn't work yet: if there is an insert for one worker on a given day, that day is eliminated for all workers.
code:
SELECT i.date
FROM importantDates i
LEFT OUTER JOIN inserts s
ON i.date = DATE(s.date)
WHERE i.date BETWEEN '2013-1-1'
AND '2013-2-23'
AND s.date IS NULL;
Now, how can I add a check for employe_id?
Thanks guys, if you need anything more I'm always available.
EDIT:
Here is sample:
Employe:
1. sam
2. mike
3. joe
importantDate:
1. 2013-01-01
2. 2013-01-02
3. ...
40. 2013-02-23
inserts:
1. 2013-02-01, 1
2. 2013-02-01, 2
3. 2013-02-01, 3
4. 2013-02-02, 3
5. 2013-02-03, 1
6. 2013-02-03, 2
7. 2013-01-12, 1
So, when I run the query, I should get all "missing" inserts. For each employee I should get the date and the employee ID wherever an insert is missing. It is a lot of data, but it is important to know which dates have been inserted and which have not.

Assuming you have an employee table, try:
select sq.* from
(select e.employe_id, i.date
FROM importantDates i
CROSS JOIN employee e
WHERE i.date BETWEEN '2013-1-1' AND '2013-2-23') sq
LEFT OUTER JOIN inserts s
ON sq.date = DATE(s.date) and sq.employe_id = s.employe_id
WHERE s.date IS NULL;
If you don't have a separate employee table, you can simulate one by changing employee in the above query to be:
(select distinct employe_id from inserts) as e
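The cross-join approach above can be tried on a small data set. The following is a minimal sketch, assuming an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for MySQL, a hypothetical employee table with an employe_id column, and a shortened version of the sample data from the question. Date literals are zero-padded because SQLite compares them as text.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema from the question.
# The employee table and its columns are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employe_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE importantDates (id INTEGER PRIMARY KEY, date TEXT);
CREATE TABLE inserts (id INTEGER PRIMARY KEY, date TEXT, employe_id INTEGER);
INSERT INTO employee VALUES (1, 'sam'), (2, 'mike'), (3, 'joe');
INSERT INTO importantDates (date) VALUES ('2013-02-01'), ('2013-02-02'), ('2013-02-03');
INSERT INTO inserts (date, employe_id) VALUES
  ('2013-02-01 08:00:00', 1), ('2013-02-01 09:00:00', 2), ('2013-02-01 09:30:00', 3),
  ('2013-02-02 10:00:00', 3), ('2013-02-03 11:00:00', 1), ('2013-02-03 12:00:00', 2);
""")

# Build every (employee, important date) pair, then keep the pairs
# that have no matching row in inserts.
missing = conn.execute("""
SELECT sq.employe_id, sq.date
FROM (SELECT e.employe_id, i.date
      FROM importantDates i
      CROSS JOIN employee e
      WHERE i.date BETWEEN '2013-01-01' AND '2013-02-23') sq
LEFT OUTER JOIN inserts s
  ON sq.date = DATE(s.date) AND sq.employe_id = s.employe_id
WHERE s.date IS NULL
ORDER BY sq.employe_id, sq.date
""").fetchall()

for employe_id, date in missing:
    print(employe_id, date)
```

Each row of the result is one employee and date pair with no insert, so an insert by one worker no longer hides the date for the others.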

LEFT OUTER JOIN and LEFT JOIN are synonyms in MySQL, so you can use the shorter LEFT JOIN; also write the date literals in full YYYY-MM-DD form:
SELECT
i.date
FROM importantDates i
LEFT JOIN inserts s
ON i.date = DATE(s.date)
WHERE i.date BETWEEN '2013-01-01'
AND '2013-02-23'
AND s.date IS NULL;

I'm not quite sure I understand what you're trying to achieve, but if you want to get every row from table_a that is not in table_b, you can do this:
SELECT * FROM table_a
WHERE table_a.col NOT IN
(
SELECT col FROM table_b
)
So (if I understand you correctly) in your case:
SELECT i.date FROM importantDates i
WHERE i.date NOT IN
(
SELECT DATE(date) FROM inserts WHERE employe_id = 213
)
AND i.date BETWEEN '2013-01-01' AND '2013-02-23';
For more on the IN clause, see the MySQL documentation.
UPDATE:
To get the corresponding employee you can alter the statement to this:
SELECT i.date, e.* FROM importantDates i
CROSS JOIN employees e
WHERE i.date NOT IN
(
SELECT DATE(s.date) FROM inserts s WHERE s.employe_id = e.employe_id
)
AND i.date BETWEEN '2013-01-01' AND '2013-02-23';
However, this is not recommended because the subquery is correlated with the main query, so it may be re-evaluated for each candidate row.
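One caveat with NOT IN is that it matches nothing when the subquery returns a NULL. The sketch below contrasts it with a NOT EXISTS variant, which is a swapped-in alternative and not part of the answer above. It assumes an in-memory SQLite database as a stand-in for MySQL and a made-up row whose date is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE importantDates (id INTEGER PRIMARY KEY, date TEXT);
CREATE TABLE inserts (id INTEGER PRIMARY KEY, date TEXT, employe_id INTEGER);
INSERT INTO importantDates (date) VALUES ('2013-02-01'), ('2013-02-02');
-- The second row has a NULL date (hypothetical data to show the pitfall).
INSERT INTO inserts (date, employe_id) VALUES ('2013-02-01 08:00:00', 213), (NULL, 213);
""")

# NOT IN: the NULL in the subquery result makes the predicate unknown
# for every non-matching date, so no rows come back.
not_in = conn.execute("""
SELECT i.date FROM importantDates i
WHERE i.date NOT IN (SELECT DATE(s.date) FROM inserts s WHERE s.employe_id = 213)
""").fetchall()

# NOT EXISTS: only asks whether a matching row exists, so NULLs are harmless.
not_exists = conn.execute("""
SELECT i.date FROM importantDates i
WHERE NOT EXISTS (SELECT 1 FROM inserts s
                  WHERE s.employe_id = 213 AND DATE(s.date) = i.date)
""").fetchall()

print(not_in)
print(not_exists)
```

If inserts.date is declared NOT NULL, both forms return the same rows and the choice is mostly a matter of taste.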

Related

Mysql Query where max(time) less than today

I have two tables: the first table (job) stores the data and the second table (job_locations) stores the locations for each job. I'm trying to show the number of jobs whose job location cutoff times are earlier than today.
I use DateTime for the date column.
Unfortunately, the numbers I get when I test the following code are wrong.
My code:
SELECT *
FROM `job`
left join job_location
on job_location.job_id = job.id
where job_location.cutoff_time < CURDATE()
group by job.id
Please help me write a working query.
I think you need to rephrase your query slightly. Select a count of jobs where the cutoff time is earlier than the start of today.
SELECT
j.id,
COUNT(CASE WHEN jl.cutoff_time < CURDATE() THEN 1 END) AS cnt
FROM job j
LEFT JOIN job_location jl
ON j.id = jl.job_id
GROUP BY
j.id;
Note that the left join is important here because it means that we won't drop any jobs having no matching criteria. Instead, those jobs would still appear in the result set, just with a zero count.
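To see the effect of the LEFT JOIN on jobs without locations, here is a minimal sketch that assumes an in-memory SQLite database as a stand-in for MySQL and made-up rows. CURDATE() is replaced by a bound parameter (a hypothetical fixed "today") so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job (id INTEGER PRIMARY KEY);
CREATE TABLE job_location (id INTEGER PRIMARY KEY, job_id INTEGER, cutoff_time TEXT);
INSERT INTO job VALUES (1), (2), (3);
-- Job 3 has no locations at all.
INSERT INTO job_location (job_id, cutoff_time) VALUES
  (1, '2024-01-08 12:00:00'), (1, '2024-01-09 18:00:00'), (1, '2024-01-11 09:00:00'),
  (2, '2024-01-12 09:00:00');
""")

today = "2024-01-10"  # stands in for CURDATE()

# Conditional count: rows failing the CASE yield NULL, which COUNT ignores.
counts = conn.execute("""
SELECT j.id, COUNT(CASE WHEN jl.cutoff_time < ? THEN 1 END) AS cnt
FROM job j
LEFT JOIN job_location jl ON j.id = jl.job_id
GROUP BY j.id
ORDER BY j.id
""", (today,)).fetchall()

print(counts)
```

Jobs 2 and 3 stay in the result with a count of zero; with an inner join plus a WHERE filter on cutoff_time they would be dropped.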
As a note, you can simplify the count (in MySQL). And, assuming that all jobs have at least one location, you don't need a JOIN at all. So:
SELECT jl.job_id, sum( jl.cutoff_time < CURDATE() )
FROM job_location jl
GROUP BY jl.job_id;
If this is not correct (and you need the JOIN), then the condition on the date should go in the ON clause:
SELECT j.id, COUNT(jl.job_id)
FROM job j LEFT JOIN
job_location jl
ON jl.job_id = j.id AND jl.cutoff_time < CURDATE()
GROUP BY j.id;

SQL Join with data associated to dates

Currently I have a simple SQL request to get all group departure dates and the associated group size (teamLength) between 2 dates, but it doesn't work properly.
SELECT `groups`.`departure`, COUNT(`group_users`.`group_id`) as 'teamLength'
FROM `groups`
INNER JOIN `group_users`
ON `groups`.`id` = `group_users`.`group_id`
WHERE departure BETWEEN '2017-03-01' AND '2017-03-31'
In fact, if I have more than 1 group between the 2 dates, only 1 date is returned, together with the total teamLength across all groups.
For example, if I have 2 groups in the same interval, with 2 people in group 1 and 1 person in group 2, the result is a single row with one departure date and a teamLength of 3.
Here are 2 screenshots of the current state of my groups and group_users tables:
Is it even possible to do what I want in only 1 SQL request? Thanks
In addition to what jarlh commented (use JOIN with ON): don't ever aggregate data without an explicit GROUP BY. I don't know why MySQL still allows this...
Change your query to something like this and you should get the result you are looking for. Currently, the other departure dates get lost in the aggregation.
SELECT
`groups`.departure,
COUNT(1) as team_length
FROM
`groups`
INNER JOIN group_users
ON `groups`.id = group_users.group_id
WHERE
`groups`.departure BETWEEN '2017-03-01' AND '2017-03-31'
GROUP BY
`groups`.departure
I think that you have a syntax issue in your query. You are missing the ON clause, so your database could be trying to compute a Cartesian product since there is no join condition.
SELECT `groups`.`departure`, COUNT(`group_users`.`id`) as 'teamLength'
FROM `groups`
INNER JOIN `group_users` ON `groups`.`id` = `group_users`.`group_id`
WHERE departure BETWEEN '2017-03-01' AND '2017-03-31'
GROUP BY `groups`.`departure`
You are also missing the GROUP BY clause, which is not mandatory in every RDBMS, but it is good practice to include it.
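The difference the GROUP BY makes can be checked with the two-group example from the question. This is a minimal sketch assuming an in-memory SQLite database as a stand-in for MySQL and made-up departure dates. SQLite, like MySQL in its permissive mode, accepts the ungrouped form and collapses everything into one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "groups" (id INTEGER PRIMARY KEY, departure TEXT);
CREATE TABLE group_users (id INTEGER PRIMARY KEY, group_id INTEGER);
INSERT INTO "groups" VALUES (1, '2017-03-05'), (2, '2017-03-20');
-- Group 1 has 2 people, group 2 has 1 person.
INSERT INTO group_users (group_id) VALUES (1), (1), (2);
""")

base = """
SELECT g.departure, COUNT(gu.group_id) AS teamLength
FROM "groups" g
INNER JOIN group_users gu ON g.id = gu.group_id
WHERE g.departure BETWEEN '2017-03-01' AND '2017-03-31'
"""

# Without GROUP BY the aggregate runs over all joined rows at once.
collapsed = conn.execute(base).fetchall()

# With GROUP BY there is one row, and one count, per departure date.
per_departure = conn.execute(base + " GROUP BY g.departure ORDER BY g.departure").fetchall()

print(collapsed)
print(per_departure)
```

The ungrouped query returns a single row with a count of 3, which matches the behaviour described in the question.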

Merging 3 Tables, Limiting 1 Table With Multiple Fields Needed

Been looking into this for awhile. Hoping someone might be able to provide some insight. I have 3 tables. All of which I'm grabbing multiple columns, but the 3rd I need to limit the output to just the most recent timestamp entry, BUT still display multiple columns.
If I have the following data [ Please see SQL Fiddle ]:
http://sqlfiddle.com/#!2/84b91/6
The fiddle is a list of (names) in Table1(users), (job_name,years) in Table2(job), and then (score, timestamp) in Table3(job_details). All linked together by the users id.
I am definitely not great at MySQL. I know I'm missing something, possibly a series of JOINs. I have been able to get Table 1, Table 2 and one column of Table 3 by doing this:
select a.id, a.name, b.job_name, b.years,
(select c.timestamp
from job_details as c
where c.user_id = a.id
order by c.timestamp desc limit 1) score
from users a, job as b where a.id = b.user_id;
At this point, I can get multiple column data on the first two columns, limit the 3rd to one value and sort that value on the last timestamp...
My question is: How does one go about adding a second column to the limit? In the example in the fiddle, I'd like to add the score as well as the timestamp to the output.
I'd like the output to be:
NAME, JOB, YEARS, SCORE, TIMESTAMP. The last two columns would only be the last entry in job_details sorted by the most recent TIMESTAMP.
Please let me know if more information is required! Thank you for your time!
Try this:
select a.id, a.name, b.job_name, b.years, c.timestamp, c.score
from users a
INNER JOIN job as b ON a.id = b.user_id
INNER JOIN (SELECT jd.user_id, jd.timestamp, jd.score
FROM job_details as jd
INNER JOIN (select user_id, MAX(timestamp) as tstamp
from job_details
GROUP BY user_id) as max_ts ON jd.user_id = max_ts.user_id
AND jd.timestamp = max_ts.tstamp
) as c ON a.id = c.user_id
;
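The query above joins job_details to a derived table holding each user's latest timestamp, so both score and timestamp come from the same row. Here is a minimal sketch of it, assuming an in-memory SQLite database as a stand-in for MySQL and made-up rows (the names and values are hypothetical, not taken from the fiddle).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE job (user_id INTEGER, job_name TEXT, years INTEGER);
CREATE TABLE job_details (user_id INTEGER, score INTEGER, timestamp TEXT);
INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO job VALUES (1, 'Welder', 3), (2, 'Painter', 5);
INSERT INTO job_details VALUES
  (1, 70, '2014-01-01 10:00:00'),
  (1, 85, '2014-02-01 10:00:00'),
  (2, 60, '2014-01-15 10:00:00');
""")

# max_ts holds the latest timestamp per user; joining it back to
# job_details keeps only the row carrying that timestamp.
latest = conn.execute("""
SELECT a.id, a.name, b.job_name, b.years, c.timestamp, c.score
FROM users a
INNER JOIN job AS b ON a.id = b.user_id
INNER JOIN (SELECT jd.user_id, jd.timestamp, jd.score
            FROM job_details AS jd
            INNER JOIN (SELECT user_id, MAX(timestamp) AS tstamp
                        FROM job_details
                        GROUP BY user_id) AS max_ts
              ON jd.user_id = max_ts.user_id
             AND jd.timestamp = max_ts.tstamp
           ) AS c ON a.id = c.user_id
ORDER BY a.id
""").fetchall()

for row in latest:
    print(row)
```

If two rows for the same user share the maximum timestamp, this pattern returns both of them, so a tie-breaker may be needed in that case.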

Slow aggregate query with join on same table

I have a query to show customers and the total dollar value of all their orders. The query takes about 100 seconds to execute.
I'm querying on an ExpressionEngine CMS database. ExpressionEngine uses one table exp_channel_data, for all content. Therefore, I have to join on that table for both customer and order data. I have about 14,000 customers, 30,000 orders and 160,000 total records in that table.
Can I change this query to speed it up?
SELECT link.author_id AS customer_id,
customers.field_id_122 AS company,
Sum(orders.field_id_22) AS total_orders
FROM exp_channel_data customers
JOIN exp_channel_titles link
ON link.author_id = customers.field_id_117
AND customers.channel_id = 7
JOIN exp_channel_data orders
ON orders.entry_id = link.entry_id
AND orders.channel_id = 3
GROUP BY customer_id
Thanks, and please let me know if I should include other information.
UPDATE SOLUTION
My apologies. I noticed that entry_id for the exp_channel_data table customers corresponds to author_id for the exp_channel_titles table, so I don't have to use field_id_117 in the join. field_id_117 duplicates entry_id, but in a TEXT field. Joining on that text field slowed things down. The query now takes 3 seconds.
However, the inner join solution posted by @DRapp takes 1.5 seconds. Here is his SQL with a minor edit:
SELECT
PQ.author_id CustomerID,
c.field_id_122 CompanyName,
PQ.totalOrders
FROM
( SELECT
t.author_id,
SUM( o.field_id_22 ) as totalOrders
FROM
exp_channel_data o
JOIN
exp_channel_titles t ON t.author_id = o.entry_id AND o.channel_id = 3
GROUP BY
t.author_id ) PQ
JOIN
exp_channel_data c ON PQ.author_id = c.entry_id AND c.channel_id = 7
ORDER BY CustomerID
If this is the same table, then all alias instances have the same columns across the board.
I would ensure an index on (channel_id, entry_id, field_id_117) if possible, and another index on (author_id) for the prequery of order totals.
Then, start with what will become an inner query that does nothing but a per-customer sum of order amounts. Since the join uses "author_id" as the customer ID, just query/sum that first. I don't completely understand the (what I would consider poor) design of the structure or what "Channel_ID" really indicates, but you don't want to duplicate summation values because of the other rows in the mix.
select
o.author_id,
sum( o.field_id_22 ) as totalOrders
FROM
exp_channel_data o
where
o.channel_id = 3
group by
o.author_id
If that is correct on the per customer (via author_id column), then that can be wrapped as follows
select
PQ.author_id CustomerID,
c.field_id_122 CompanyName,
PQ.totalOrders
from
( select
o.author_id,
sum( o.field_id_22 ) as totalOrders
FROM
exp_channel_data o
where
o.channel_id = 3
group by
o.author_id ) PQ
JOIN exp_channel_data c
on PQ.author_id = c.field_id_117
AND c.channel_id = 7
Can you post the results of an EXPLAIN query?
I'm guessing that your tables are not indexed well for this operation. All of the columns that you join on should probably be indexed. As a first guess I'd look at indexing exp_channel_data.field_id_117
Try something like this. You may have an error in your joins, so also check that the join columns are correct in your database. If the join conditions are wrong, you can accidentally get a cross join, which takes a long time on large data.
select
link.author_id as customer_id,
customers.field_id_122 as company,
sum(orders.field_id_22) as total_or_orders
from exp_channel_data customers
join exp_channel_titles link on (link.author_id = customers.field_id_117 and
customers.channel_id = 7)
join exp_channel_data orders on (orders.entry_id = link.entry_id and orders.channel_id = 3)
group by customer_id

Subquery returns more than 1 row

I'm getting this error when trying to do 2 counts inside my query.
First I'll show you the query:
$sql = mysql_query("select c.id, c.number, d.name,
(select count(*) from `parts` where `id_container`=c.id group by `id_car`) as packcount,
(select count(*) from `parts` where `id_container`=c.id) as partcount
from `containers` as c
left join `destinations` as d on (d.id = c.id_destination)
order by c.number asc") or die(mysql_error());
now the parts table has 2 fields that i need to use in the count:
id_car
id_container
id_car = the ID of the car the part is for
id_container = the ID of the container the part is in
for packcount, all I want is a count of the total cars per container
for partcount, all I want is a count of the total parts per container
It's because of the GROUP BY you're using.
Try something like
(select count(distinct id_car) from `parts` where `id_container`=c.id)
in your subquery (can't check right now)
EDIT
PFY - I think UNIQUE is for indexes
The grouping in your first subquery is causing multiple rows to be returned. You will probably need to run separate queries to get the results you are looking for.
This subquery may return more than one row.
(select count(*) from `parts` where `id_container`=c.id group by `id_car`) as packcount, ...
so, i'd suggest to try something of the following:
(select count(DISTINCT `id_car`) from `parts` where `id_container`=c.id) as packcount, ...
see: COUNT(DISTINCT) on dev.mysql.com
and: QA on stackoverflow
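The COUNT(DISTINCT) suggestion can be checked on a few rows. This is a minimal sketch assuming an in-memory SQLite database as a stand-in for MySQL, made-up container and part rows, and no destinations join. Note that SQLite does not raise the "Subquery returns more than 1 row" error, so only the corrected query is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE containers (id INTEGER PRIMARY KEY, number TEXT);
CREATE TABLE parts (id INTEGER PRIMARY KEY, id_car INTEGER, id_container INTEGER);
INSERT INTO containers VALUES (1, 'C-1'), (2, 'C-2');
-- Container 1 holds three parts for two cars; container 2 holds one part.
INSERT INTO parts (id_car, id_container) VALUES (10, 1), (10, 1), (11, 1), (12, 2);
""")

# Each scalar subquery now returns exactly one value per container:
# distinct cars for packcount, all parts for partcount.
rows = conn.execute("""
SELECT c.id, c.number,
  (SELECT COUNT(DISTINCT id_car) FROM parts WHERE id_container = c.id) AS packcount,
  (SELECT COUNT(*) FROM parts WHERE id_container = c.id) AS partcount
FROM containers AS c
ORDER BY c.number ASC
""").fetchall()

for row in rows:
    print(row)
```

With GROUP BY id_car the first subquery would produce one count per car, which is why MySQL rejects it in a scalar position.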