how to perform a table join when the other table has many rows corresponding to one row in mother table - mysql

I have two tables: tbl_book_details and tbl_table_traking.
tbl_book_details has columns bd_book_code,
bd_isbn,
bd_title,
bd_edition,
bd_author,
bd_publisher,
bd_supplier,
bd_page,
bd_price_type,
bd_cost_price,
bd_price,
bd_Tax,
bd_covering,
bd_availability,
bd_keywords,
bd_notes,
bd_details,
bd_news_latter,
bd_etDate,
bd_weight,
bd_expire_date,
bd_status
tbl_table_traking has columns
tt_id,
tt_action,
tt_table,
tt_record_id,
tt_on_date,
tt_user,
tt_status
The process is: a trigger defined on tbl_book_details fires on insert/modify and inserts a row into tbl_table_traking, to track when and by whom the record was modified.
Until now I have been using the following query, which is not a join:
SELECT
tbl_books_details.bd_book_code AS bkid,
tbl_books_details.bd_isbn,
tbl_books_details.bd_title AS title,
-- This part is what I believe is slowing down my query
(SELECT
tt_on_date
FROM
tbl_table_tracking
WHERE tt_action = 'MODIFY'
AND tt_record_id = tbl_books_details.bd_book_code ORDER BY tt_on_date) AS bd_etdate
It worked fine while the record count was below 3 million, but now the script times out.
I have created indexes on tbl_table_traking on tt_on_date and on tt_action.
Is there any way I can convert it to a join or otherwise improve the performance?
The tracking subquery is meant to return the most recent date on which the record was modified.
My database is MySQL.

You should be able to do something like this.
SELECT
tbl_books_details.bd_book_code AS bkid,
tbl_books_details.bd_isbn,
tbl_books_details.bd_title AS title,
tbl_table_tracking.tt_on_date
FROM
tbl_books_details
INNER JOIN tbl_table_tracking
ON tbl_books_details.bd_book_code = tbl_table_tracking.tt_record_id
WHERE tbl_table_tracking.tt_action = 'MODIFY';
Hope this helps...
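Since the question asks for only the most recent modification date per book, a join combined with MAX() and GROUP BY may be closer to what is needed. Below is a minimal sketch, run through Python's sqlite3 module as a stand-in for MySQL. The sample rows, the reduced column set, and the composite index ix_tracking_lookup are assumptions for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_books_details (
    bd_book_code INTEGER PRIMARY KEY, bd_isbn TEXT, bd_title TEXT);
CREATE TABLE tbl_table_tracking (
    tt_id INTEGER PRIMARY KEY, tt_action TEXT,
    tt_record_id INTEGER, tt_on_date TEXT);
-- Assumed composite index: record id first, so each book's MODIFY rows
-- (and their latest date) can be located without scanning the table.
CREATE INDEX ix_tracking_lookup
    ON tbl_table_tracking (tt_record_id, tt_action, tt_on_date);
INSERT INTO tbl_books_details VALUES (1, '111', 'Book A'), (2, '222', 'Book B');
INSERT INTO tbl_table_tracking (tt_action, tt_record_id, tt_on_date) VALUES
    ('INSERT', 1, '2013-01-01'),
    ('MODIFY', 1, '2013-02-01'),
    ('MODIFY', 1, '2013-03-01'),
    ('INSERT', 2, '2013-01-05');
""")

# LEFT JOIN keeps books that were never modified (their date is NULL).
rows = conn.execute("""
SELECT b.bd_book_code AS bkid, b.bd_isbn, b.bd_title AS title,
       MAX(t.tt_on_date) AS bd_etdate
FROM tbl_books_details AS b
LEFT JOIN tbl_table_tracking AS t
       ON t.tt_record_id = b.bd_book_code AND t.tt_action = 'MODIFY'
GROUP BY b.bd_book_code, b.bd_isbn, b.bd_title
ORDER BY b.bd_book_code
""").fetchall()
print(rows)
```

The SELECT itself uses only standard SQL, so it should carry over to MySQL largely unchanged; whether the index helps on the real data is something to confirm with EXPLAIN.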

Related

selecting most recent row from joined table in MySQL

I have two tables, Project and Projectnote
There is a one to many relationship between project and projectnote.
I want to be able to list my projects and select the most recent projectnotes based on the id.
Is this possible to do in Mysql query, I can't figure it out.
Thanks for any help!
Edit: so far I have a basic query (below) that joins the two tables. However, this only selects projects where a note exists and I get multiple rows where there are several notes per project.
SELECT `driver_checkins`.*, `driver_trips`.`id` AS `trip_id`, `driver_trips`.`trip_num` AS `trip_num`, `driver_trips`.`status` AS `trip_status`, `driver_trips`.`ride_date` AS `ride_date`, `driver_trips`.`today_date` AS `trip_today_date`, `driver_trips`.`pick_up_time` AS `pick_up_time`, `driver_trips`.`d_time` AS `d_time`, `driver_trips`.`trip_type` AS `trip_type`
FROM `driver_checkins`
LEFT JOIN `driver_trips` ON `driver_trips`.`driver_id` = `driver_checkins`.`driver_id` WHERE `checkin_status` = 1 AND `booking_status` = 0;
Using a GROUP BY clause and the GROUP_CONCAT function should do the trick.
I am assuming that "driver_checkins" is your "Project" table and "driver_trips" is your "Projectnote" table.
SELECT `driver_checkins`.*, SUBSTRING_INDEX(GROUP_CONCAT(`driver_trips`.`id`, `driver_trips`.`status` ORDER BY `driver_trips`.`id` DESC SEPARATOR ' --- '), ' --- ', 3) AS `recent_trips`
FROM `driver_checkins`
LEFT JOIN `driver_trips` ON `driver_trips`.`driver_id` = `driver_checkins`.`driver_id`
WHERE `checkin_status` = 1 AND `booking_status` = 0
GROUP BY `driver_checkins`.id;
This should display id and status for the last 3 driver_trips per driver_checkin, separated by " --- ". MySQL's GROUP_CONCAT does not accept a LIMIT clause (MariaDB 10.3.3 and later does), so SUBSTRING_INDEX is used here to keep only the first three entries of the concatenated list.
Something to consider: while ordering by id will be chronological in many cases, it is better to add a timestamp column (e.g. called created) and order by that instead.
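If only the single most recent note per project is wanted (rather than a concatenated list), one common pattern is a LEFT JOIN whose ON clause picks the highest note id per project. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the table and column names (project, projectnote, project_id, note) are hypothetical, since the question does not show the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: one project has many notes.
CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE projectnote (
    id INTEGER PRIMARY KEY, project_id INTEGER, note TEXT);
INSERT INTO project VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
INSERT INTO projectnote VALUES
    (1, 1, 'first'), (2, 1, 'second'), (3, 3, 'only');
""")

# The correlated subquery selects the newest note id for each project;
# LEFT JOIN keeps projects that have no notes at all.
latest = conn.execute("""
SELECT p.id, p.name, n.id AS note_id, n.note
FROM project AS p
LEFT JOIN projectnote AS n
       ON n.id = (SELECT MAX(n2.id)
                  FROM projectnote AS n2
                  WHERE n2.project_id = p.id)
ORDER BY p.id
""").fetchall()
print(latest)
```

This returns exactly one row per project, which addresses both problems mentioned in the question: projects without notes are kept, and projects with several notes are not duplicated.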

MySQL Query gets too complex for me

I'm trying to write a MYSQL Query that updates a cell in table1 with information gathered from 2 other tables;
The gathering of data from the other 2 tables goes without much issues (it is slow, but that's because one of the 2 tables has 4601537 records in it.. (because all the rows for one report are split in a separate record, meaning that 1 report has more than 200 records)).
The Query that I use to Join the two tables together is:
# First Table, containing Report_ID's: RE
# Table that has to be updated: REGI
# Join Table: JT
SELECT JT.report_id as ReportID, REGI.Serienummer as SerialNo FROM Blancco_Registration.TrialTable as REGI
JOIN (SELECT RE.Value_string, RE.report_id
FROM Blancco_new.mc_report_Entry as RE
WHERE RE.path_id=92) AS JT ON JT.Value_string = REGI.Serienummer
WHERE REGI.HardwareType="PC" AND REGI.BlanccoReport=0 LIMIT 100
This returns 100 records (I limit it because the database is in use during work hours and I don't want to steal all resources).
However, I want to use these results in a Query that updates the REGI table (which it uses to select the 100 records in the first place).
However, I get an error saying that I cannot select from the table while updating it (logically). So I tried selecting the results of the statement above into a temp table and then updating from it; however, then I get too many results (logically! I only need 1 result and get 100), and I'm getting stuck in my own thoughts. I ultimately need to fill the ReportID into each record of REGI.
I know it should be possible, but I'm no expert in MySQL.. is there anybody that can point me into the right direction?
Ps. fixing the table containing 400k records is not an option, it's a program from an external developer and I can only read that database.
The errors I'm talking about are as follows:
Error Code: 1093. You can't specify target table 'TrialTable' for update in FROM clause
When I use:
UPDATE TrialTable SET TrialTable.BlanccoReport =
(SELECT JT.report_id as ReportID, REGI.Serienummer as SerialNo FROM Blancco_Registration.TrialTable as REGI
JOIN (SELECT RE.Value_string, RE.report_id
FROM Blancco_new.mc_report_Entry as RE
WHERE RE.path_id=92) AS JT ON JT.Value_string = REGI.Serienummer
WHERE REGI.HardwareType="PC" AND REGI.BlanccoReport=0 LIMIT 100)
WHERE TrialTable.HardwareType="PC" AND TrialTable.BlanccoReport=0
Then I tried:
UPDATE TrialTable SET TrialTable.BlanccoReport = (SELECT ReportID FROM (<<and the rest of the SQL>>> ) as x WHERE x.SerialNo = TrialTable.Serienummer)
but that gave me the following error:
Error Code: 1242. Subquery returns more than 1 row
Running the query above with a LIMIT 1 gives every record the same result.
Firstly, your query seems to be functionally identical to the following:
SELECT RE.report_id ReportID
, REGI.Serienummer SerialNo
FROM Blancco_Registration.TrialTable REGI
JOIN Blancco_new.mc_report_Entry RE
ON RE.Value_string = REGI.Serienummer
WHERE REGI.HardwareType = "PC"
AND REGI.BlanccoReport=0
AND RE.path_id=92
LIMIT 100
So, why not use that?
EDIT:
I still don't get it. I can't see what part of the problem the following fails to solve...
UPDATE TrialTable REGI
JOIN Blancco_new.mc_report_Entry RE
ON RE.Value_string = REGI.Serienummer
SET REGI.BlanccoReport = RE.report_id
WHERE REGI.HardwareType = "PC"
AND REGI.BlanccoReport=0
AND RE.path_id=92;
(This is not an answer, but maybe a pointer towards a few points that need further attention)
Your JT sub query looks suspicious to me:
(SELECT RE.Value_string, RE.report_id
FROM Blancco_new.mc_report_Entry as RE
WHERE RE.path_id=92
GROUP BY RE.report_id)
You use group by but don't actually use any aggregate functions. The column RE.Value_string should strictly be something like MAX(RE.Value_string) instead.
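To illustrate the point about aggregates: if several report entries can match one serial number, wrapping the lookup in an aggregate such as MAX() gives exactly one value per row and avoids error 1242. The sketch below uses Python's sqlite3 module as a stand-in, with a correlated subquery rather than MySQL's multi-table UPDATE ... JOIN syntax. The simplified table and column names (trial, report_entry) and the choice of MAX as the tie-breaker are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical versions of TrialTable and mc_report_Entry.
CREATE TABLE trial (
    serienummer TEXT, hardware_type TEXT, blancco_report INTEGER);
CREATE TABLE report_entry (
    report_id INTEGER, path_id INTEGER, value_string TEXT);
INSERT INTO trial VALUES ('S1', 'PC', 0), ('S2', 'PC', 0), ('S3', 'Laptop', 0);
INSERT INTO report_entry VALUES
    (10, 92, 'S1'), (11, 92, 'S1'), (12, 50, 'S2'), (13, 92, 'S3');
""")

# MAX() collapses multiple matching reports into one value per serial;
# EXISTS leaves rows without any matching report untouched.
conn.execute("""
UPDATE trial
SET blancco_report = (SELECT MAX(re.report_id)
                      FROM report_entry AS re
                      WHERE re.path_id = 92
                        AND re.value_string = trial.serienummer)
WHERE hardware_type = 'PC'
  AND blancco_report = 0
  AND EXISTS (SELECT 1
              FROM report_entry AS re
              WHERE re.path_id = 92
                AND re.value_string = trial.serienummer)
""")

updated = conn.execute(
    "SELECT serienummer, blancco_report FROM trial ORDER BY serienummer"
).fetchall()
print(updated)
```

Here only S1 is updated: S2 has no entry with path_id 92, and S3 is not a PC. Whether the highest report_id is the right one to keep depends on the data, so that choice should be checked against the real reports.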

Leave out certain dates on Access report

I have a report in Access 2013 that prints an equipment log. There is a bunch of dates listed for each piece of equipment. I wanted to only print the newest date for each piece of equipment. I have searched the internet and this site with no luck. So any suggestions will be greatly appreciated.
My SQL statement is:
SELECT dbo_eq_location_transfer_d.equipment_id, dbo_equipment.description, dbo_eq_location_transfer_d.transaction_no, dbo_eq_location_transfer_d.job_no, dbo_jobs.description, dbo_eq_location_transfer_d.date_booked, dbo_eq_location_transfer_d.delivery_time, dbo_eq_location_transfer_d.line_no, dbo_eq_location_transfer_d.row_modified_by, dbo_eq_location_transfer_d.comment
FROM (dbo_eq_location_transfer_d INNER JOIN dbo_jobs ON dbo_eq_location_transfer_d.job_no = dbo_jobs.job_no) INNER JOIN dbo_equipment ON dbo_eq_location_transfer_d.equipment_no = dbo_equipment.equipment_no
ORDER BY dbo_eq_location_transfer_d.equipment_id, dbo_eq_location_transfer_d.transaction_no;
The date_booked field is the date field I am trying to narrow down. I have a simple SQL query that works, and I have been trying to copy it into the above SQL but cannot seem to get it to mesh. It is:
SELECT [dbo_eq_location_transfer_d.equipment_no], Max(dbo_eq_location_transfer_d.date_booked) AS ["Newest Date"]
FROM dbo_eq_location_transfer_d
GROUP BY [dbo_eq_location_transfer_d.equipment_no];
In your query set the date fields criteria to:
>Now()-30
This will show any dates from the last 30 days; just change 30 to the number of days you want to see.
Now that I understand your structure & data, here is what I did:
(1) Create the following query to select only the most recent 'date_booked' for each 'equipment_no'; save the query with name '23020071_A'. Max() is used because First() returns a value from an arbitrary row of each group and is not affected by the query's ORDER BY:
SELECT dbo_eq_location_transfer_d.equipment_no,
Max(dbo_eq_location_transfer_d.date_booked) AS MaxOfdate_booked
FROM (dbo_eq_location_transfer_d
INNER JOIN dbo_jobs ON dbo_eq_location_transfer_d.job_no = dbo_jobs.job_no)
INNER JOIN dbo_equipment ON dbo_eq_location_transfer_d.equipment_no = dbo_equipment.equipment_no
GROUP BY dbo_eq_location_transfer_d.equipment_no;
(2) I created the following query combining the new query with your existing query:
SELECT dbo_eq_location_transfer_d.equipment_id, dbo_equipment.description,
dbo_eq_location_transfer_d.transaction_no, dbo_eq_location_transfer_d.job_no,
dbo_jobs.description, dbo_eq_location_transfer_d.date_booked,
dbo_eq_location_transfer_d.delivery_time, dbo_eq_location_transfer_d.line_no,
dbo_eq_location_transfer_d.row_modified_by, dbo_eq_location_transfer_d.comment
FROM [23020071_A] INNER JOIN ((dbo_eq_location_transfer_d
INNER JOIN dbo_jobs ON dbo_eq_location_transfer_d.job_no = dbo_jobs.job_no)
INNER JOIN dbo_equipment ON dbo_eq_location_transfer_d.equipment_no = dbo_equipment.equipment_no)
ON ([23020071_A].equipment_no = dbo_eq_location_transfer_d.equipment_no)
AND ([23020071_A].MaxOfdate_booked = dbo_eq_location_transfer_d.date_booked)
ORDER BY dbo_eq_location_transfer_d.equipment_id, dbo_eq_location_transfer_d.transaction_no;
Now when I run the second query, it returns only the most recent row for that piece of equipment.

Why this Query takes such a long time to execute

I have three tables
glSalesJournal
HMISAdd
HMISMain
What I am trying to do is add the glSalesJournal amt to the HMISAdd amt, grouping by various fields, and insert the result into HMISMain.
The glSalesJournal contains 633173 records
The HMISAdd contains 4193 records
HMISAdd and glSalesJournal contain the same columns, which are
loc
glAcct
glSubAcct
batchNbr
contractNbr
amt
I added indexes to the table still the results are the same.
Here is my code:
INSERT INTO hmismain
(loc,
glacct,
subacct,
batchnbr,
contractnbr,
amt)
SELECT glsalesjournal.loc,
glsalesjournal.glacct,
glsalesjournal.glsubacct,
glsalesjournal.batchnbr,
glsalesjournal.salescontnbr,
( glsalesjournal.amt + hmisadd.amt ) AS sumAmt
FROM glsalesjournal
LEFT OUTER JOIN hmisadd
ON ( glsalesjournal.loc = hmisadd.loc
AND glsalesjournal.glacct = hmisadd.glacct
AND glsalesjournal.glsubacct = hmisadd.subacct
AND glsalesjournal.batchnbr = hmisadd.batchnbr
AND glsalesjournal.salescontnbr = hmisadd.contractnbr )
GROUP BY glsalesjournal.loc,
hmisadd.loc,
glsalesjournal.glacct,
hmisadd.glacct,
glsalesjournal.glsubacct,
hmisadd.subacct,
glsalesjournal.batchnbr,
hmisadd.batchnbr,
glsalesjournal.salescontnbr,
hmisadd.contractnbr
The time taken by the script to execute is more than 2 hours. Even when I limit the Records to 100 the time taken is the same.
Can someone please guide me on how I can optimize the script?
Thanks
1) It looks like it's a one-off query, am I correct here? If not, then you are inserting the same data into the hmismain table every time.
2) You are grouping on fields from TWO separate tables, so no amount of indexing will ever help you. The ONLY index that will help is an index over a view linking these two tables in the same way.
Further note:
What is the point of
GROUP BY glsalesjournal.loc,
hmisadd.loc,
glsalesjournal.glacct,
hmisadd.glacct,
glsalesjournal.glsubacct,
hmisadd.subacct,
glsalesjournal.batchnbr,
hmisadd.batchnbr,
glsalesjournal.salescontnbr,
hmisadd.contractnbr
You are grouping the data by the same fields twice
glsalesjournal.loc, hmisadd.loc
glsalesjournal.glacct, hmisadd.glacct,
...
Remove the duplicates from GROUP BY and it should run fast
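As a rough illustration of grouping on one table's keys only, here is a sketch using Python's sqlite3 module as a stand-in for the real database. It pre-aggregates hmisadd per key and then joins once. The sample rows are made up, and treating a missing hmisadd row as 0 via COALESCE is an assumption (the original expression would yield NULL for unmatched rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE glsalesjournal (
    loc TEXT, glacct INTEGER, glsubacct INTEGER,
    batchnbr INTEGER, salescontnbr INTEGER, amt REAL);
CREATE TABLE hmisadd (
    loc TEXT, glacct INTEGER, subacct INTEGER,
    batchnbr INTEGER, contractnbr INTEGER, amt REAL);
INSERT INTO glsalesjournal VALUES
    ('A', 1, 1, 1, 1, 10.0), ('A', 1, 1, 1, 1, 5.0), ('B', 2, 2, 2, 2, 7.0);
INSERT INTO hmisadd VALUES ('A', 1, 1, 1, 1, 2.5);
""")

# hmisadd is reduced to one row per key first, so the outer GROUP BY
# only needs the glsalesjournal columns.
totals = conn.execute("""
SELECT g.loc, g.glacct, g.glsubacct, g.batchnbr, g.salescontnbr,
       SUM(g.amt) + COALESCE(MAX(h.amt), 0) AS sumAmt
FROM glsalesjournal AS g
LEFT JOIN (SELECT loc, glacct, subacct, batchnbr, contractnbr,
                  SUM(amt) AS amt
           FROM hmisadd
           GROUP BY loc, glacct, subacct, batchnbr, contractnbr) AS h
       ON  g.loc = h.loc
       AND g.glacct = h.glacct
       AND g.glsubacct = h.subacct
       AND g.batchnbr = h.batchnbr
       AND g.salescontnbr = h.contractnbr
GROUP BY g.loc, g.glacct, g.glsubacct, g.batchnbr, g.salescontnbr
ORDER BY g.loc
""").fetchall()
print(totals)
```

The result of this SELECT could then feed the INSERT INTO hmismain, with the column list in the same order as the select list.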
Did you add indexes on these fields:
glSalesJournal.loc
glSalesJournal.glAcct
glSalesJournal.glSubAcct
glSalesJournal.batchNbr
glSalesJournal.salesContNbr
HMISAdd.Loc
HMISAdd.GlAcct
HMISAdd.SubAcct
HMISAdd.batchNbr
HMISAdd.contractNbr
If these fields are unindexed, the query will perform a full table scan for each individual record, causing slow performance.
MySQL Create Index Syntax

indexes in mysql SELECT AS or using Views

I'm in over my head with a big mysql query (mysql 5.0), and i'm hoping somebody here can help.
Earlier I asked how to get distinct values from a joined query
mysql count only for distinct values in joined query
The response I got worked (using a subquery with join as)
select *
from media m
inner join
( select uid
from users_tbl
limit 0,30) map
on map.uid = m.uid
inner join users_tbl u
on u.uid = m.uid
Unfortunately, my query has grown more unruly, and though I have it running, joining to a derived table is taking too long because there are no indexes available on the derived table.
my query now looks like this
SELECT mdate.bid, mdate.fid, mdate.date, mdate.time, mdate.title, mdate.name,
mdate.address, mdate.rank, mdate.city, mdate.state, mdate.lat, mdate.`long`,
ext.link,
ext.source, ext.pre, meta, mdate.img
FROM ext
RIGHT OUTER JOIN (
SELECT media.bid,
media.date, media.time, media.title, users.name, users.img, users.rank, media.address,
media.city, media.state, media.lat, media.`long`,
GROUP_CONCAT(tags.tagname SEPARATOR ' | ') AS meta
FROM media
JOIN users ON media.bid = users.bid
LEFT JOIN tags ON users.bid=tags.bid
WHERE `long` BETWEEN -122.52224684058 AND -121.79760915942
AND lat BETWEEN 37.07500915942 AND 37.79964684058
AND date = '2009-02-23'
GROUP BY media.bid, media.date
ORDER BY media.date, users.rank DESC
LIMIT 0, 30
) mdate ON (mdate.bid = ext.bid AND mdate.date = ext.date)
phew!
So, as you can see, if I understand my problem correctly, I have two derived tables without indexes (and I don't deny that I may have screwed up the JOIN statements somehow, but I kept messing with different types, and this ended up giving me the result I wanted).
What's the best way to create a query similar to this which will allow me to take advantage of the indexes?
Dare I say, I actually have one more table to add into the mix at a later date.
Currently, my query is taking .8 seconds to complete, but I'm sure if I could take advantage of the indexes, this could be significantly faster.
First, check for indices on ext(bid, date), users(bid) and tags(bid), you should really have them.
It seems, though, that it's LONG and LAT that cause you the most problems. You should try keeping your LONG and LAT as a single POINT column (coordinate), create a SPATIAL INDEX on this column and query like this:
WHERE MBRContains(@MySquare, coordinate)
If you can't change your schema for some reason, you can try creating additional indices that include date as the first field:
CREATE INDEX ix_date_long ON media (date, `long`);
CREATE INDEX ix_date_lat ON media (date, lat);
These indices will be more efficient for your query, as you use an exact search on date combined with a ranged search on the axes.
Starting fresh:
Question - why are you grouping by both media.bid and media.date? Can a bid have records for more than one date?
Here's a simpler version to try:
SELECT
mdate.bid,
mdate.fid,
mdate.date,
mdate.time,
mdate.title,
mdate.name,
mdate.address,
mdate.rank,
mdate.city,
mdate.state,
mdate.lat,
mdate.`long`,
ext.link,
ext.source,
ext.pre,
mdate.img,
( SELECT GROUP_CONCAT(tags.tagname SEPARATOR ' | ')
FROM tags
WHERE ext.bid = tags.bid
GROUP BY tags.bid
) AS meta
FROM
ext
LEFT JOIN
media ON ext.bid = media.bid AND ext.date = media.date
JOIN
users ON ext.bid = users.bid
WHERE
`long` BETWEEN -122.52224684058 AND -121.79760915942
AND lat BETWEEN 37.07500915942 AND 37.79964684058
AND ext.date = '2009-02-23'
AND users.userid IN
(
SELECT userid FROM users ORDER BY rank DESC LIMIT 30
)
ORDER BY
media.date,
users.rank DESC
LIMIT 0, 30
You might want to compare your performance against using a temporary table for each selection and joining those tables together. In MySQL the tables are created with CREATE TEMPORARY TABLE (the #name form is SQL Server syntax):
CREATE TEMPORARY TABLE whatever AS SELECT ...;
CREATE TEMPORARY TABLE whatever2 AS SELECT ...;
SELECT ... FROM whatever JOIN whatever2 ON ...;
....
DROP TEMPORARY TABLE whatever;
DROP TEMPORARY TABLE whatever2;
If your system has enough memory to hold full tables this might work out much faster. It depends on how big your database is.
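A small sketch of the temporary-table idea, using Python's sqlite3 module as a stand-in for MySQL. The reduced media and ext schemas, the sample rows, and the index name are assumptions for illustration. The point is that, unlike a derived table in the original query, a temporary table can be given its own index before it is joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced, hypothetical versions of the media and ext tables.
CREATE TABLE media (bid INTEGER, date TEXT, title TEXT, lat REAL, "long" REAL);
CREATE TABLE ext (bid INTEGER, date TEXT, link TEXT);
INSERT INTO media VALUES
    (1, '2009-02-23', 'Show A', 37.5, -122.0),
    (2, '2009-02-23', 'Show B', 40.0, -100.0),
    (3, '2009-02-24', 'Show C', 37.5, -122.0),
    (4, '2009-02-23', 'Show D', 37.2, -121.9);
INSERT INTO ext VALUES (1, '2009-02-23', 'http://example.com/a');

-- Materialise the filtered selection once, then index it for the join.
CREATE TEMP TABLE mdate AS
    SELECT bid, date, title
    FROM media
    WHERE date = '2009-02-23'
      AND lat BETWEEN 37.0 AND 38.0
      AND "long" BETWEEN -123.0 AND -121.0;
CREATE INDEX ix_mdate_bid_date ON mdate (bid, date);
""")

joined = conn.execute("""
SELECT m.bid, m.title, e.link
FROM mdate AS m
LEFT JOIN ext AS e ON e.bid = m.bid AND e.date = m.date
ORDER BY m.bid
""").fetchall()
print(joined)

conn.execute("DROP TABLE mdate")
```

Whether this beats the single-query version depends on table sizes and available memory, so it is worth timing both against the real data.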