I have a user table holding user data:
id | username | btc_recive_address
----------------------------------------
1 | myuser | 123kahpoiq31328
an order table:
order_id | user_id | amount | order_timestamp
------------------------------------------------------
h6765-a1s | 1 | 0.1 BTC | 2014-04-09 13:21:34
kzg765-a1 | 1 | 0.1 BTC | 2014-04-09 17:11:23
and a collector table that stores data retrieved from the Bitcoin API (here I identify the sender by btc_recive_address):
block_chain | user | amount | timestamp
--------------------------------------------------------------------------------
2d37e5351196... | 1 | 0.1 | 2014-04-09 16:21:34
123kjhg7231k.. | 1 | 0.1 | 2014-04-08 19:33:56
I am trying to assign each transaction to an order_id by generating a joined view from the order and collector tables, but I run into problems when the amount and the user are the same.
THE PROBLEM
A user places multiple orders with the same value, e.g. 0.1 x 3.
I get the transaction data back from the API, then I identify the user by the receiver address:
someaddress - (here the transaction has 3 incoming confirms)
Then I try to build a MySQL view that compares the order table with the collector table by joining on user and amount. When the amount and the user are the same, I do not get the unique transaction_id from the order table in my view.
Here is the view query:
ALTER ALGORITHM=UNDEFINED DEFINER=`my_view`@`%` SQL SECURITY DEFINER VIEW `ci_orders_in` AS (
SELECT
`c`.`block_chain` AS `block_chain`,
`c`.`assigned_user` AS `assigned_user`,
`c`.`incoming_amount` AS `incoming_amount`,
`c`.`timestamp` AS `timestamp`,
`c`.`type` AS `type`,
`c`.`category` AS `category`,
`c`.`import_timestamp` AS `import_timestamp`,
`o`.`transaction_id` AS `transaction_id`,
`o`.`datum` AS `datum`,
`o`.`status` AS `status`,
`o`.`convert_coin` AS `convert_coin`,
`o`.`convert_coin_to` AS `convert_coin_to`,
`o`.`amount` AS `amount`,
`o`.`converted_amount` AS `converted_amount`,
`o`.`conversion_rate` AS `conversion_rate`,
`o`.`user` AS `user`,
`o`.`units_to_transfer` AS `units_to_transfer`,
`o`.`provision` AS `provision`
FROM (`ci_orders` `o`
JOIN `ci_collector` `c`
ON ((`o`.`user` = `c`.`assigned_user`)))
WHERE (`o`.`convert_coin` = `c`.`type`)
GROUP BY `o`.`converted_amount`)$$
DELIMITER ;
Here I should probably use another join that picks the nearest timestamp, but I am stuck on how to write it.
Well, just looking at your view, it's obvious you aren't showing us all the columns in these tables, and you've fudged the table names: the query uses the column "user" in the ci_orders table, but the sample data has the column "user_id". But since I read your question on the Bitcoin SE site and am vaguely familiar with what you're trying to do, I'm guessing you're going to want a query similar to this:
[incorrect query]
Edit:
Sorry for not looking at the timestamps more closely. I actually bothered to load your dataset into SQL (MS SQL 2014) this time, although I may have renamed the columns slightly. How about this? It would also help if you could provide details about the delay, such as whether the order timestamp always comes after the collector timestamp.
select *
from ci_orders
join ci_collector
on ci_orders.user_id = ci_collector.user_id
and ci_orders.amount = ci_collector.amount
and ci_collector.timestamp = (
select top 1 timestamp
from ci_collector
where ci_orders.user_id = ci_collector.user_id
and ci_orders.amount = ci_collector.amount
and ci_orders.timestamp > ci_collector.timestamp
order by timestamp desc
)
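For readers who want to try the idea without an MS SQL instance, here is a minimal sketch of the same "latest earlier payment" join. It is an assumption-laden translation, not the answer's exact query: it runs on SQLite through Python's sqlite3 module, uses LIMIT 1 in place of TOP 1, names the order column order_timestamp as in the question's sample data, and uses only the visible prefixes of the truncated block_chain values.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL / MS SQL tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ci_orders (order_id TEXT, user_id INTEGER, amount REAL, order_timestamp TEXT);
CREATE TABLE ci_collector (block_chain TEXT, user_id INTEGER, amount REAL, timestamp TEXT);
INSERT INTO ci_orders VALUES
  ('h6765-a1s', 1, 0.1, '2014-04-09 13:21:34'),
  ('kzg765-a1', 1, 0.1, '2014-04-09 17:11:23');
-- block_chain values are the truncated prefixes shown in the question
INSERT INTO ci_collector VALUES
  ('2d37e5351196', 1, 0.1, '2014-04-09 16:21:34'),
  ('123kjhg7231k', 1, 0.1, '2014-04-08 19:33:56');
""")

# For each order, pick the collector row (same user, same amount) whose
# timestamp is the latest one that is still earlier than the order.
matches = conn.execute("""
SELECT o.order_id, c.block_chain
FROM ci_orders AS o
JOIN ci_collector AS c
  ON c.user_id = o.user_id
 AND c.amount = o.amount
 AND c.timestamp = (
       SELECT c2.timestamp
       FROM ci_collector AS c2
       WHERE c2.user_id = o.user_id
         AND c2.amount = o.amount
         AND c2.timestamp < o.order_timestamp
       ORDER BY c2.timestamp DESC
       LIMIT 1)
ORDER BY o.order_timestamp
""").fetchall()

for order_id, block_chain in matches:
    print(order_id, "->", block_chain)
```

Note that this is not a strict one-to-one assignment: if two orders are placed before the next payment arrives, both would match the same collector row, so the direction of the timestamp comparison and the expected delay still need to be confirmed.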
I have a table that contains the task list for each person. The columns are as follows:
+---------+-----------+-------------------+------------+---------------------+
| task_id | person_id | task_name | status | due_date_time |
+---------+-----------+-------------------+------------+---------------------+
| 1 | 111 | walk 20 min daily | INCOMPLETE | 2017-04-13 17:20:23 |
| 2 | 111 | brisk walk 30 min | COMPLETE | 2017-03-14 20:20:54 |
| 3 | 111 | take medication | COMPLETE | 2017-04-20 15:15:23 |
| 4 | 222 | sport | COMPLETE | 2017-03-18 14:45:10 |
+---------+-----------+-------------------+------------+---------------------+
I want to find the monthly compliance as a percentage (completed tasks / total tasks * 100) for each person, like this:
+---------------+-----------+------------+------------+
| compliance_id | person_id | compliance | month |
+---------------+-----------+------------+------------+
| 1 | 111 | 100 | 2017-03-01 |
| 2 | 111 | 50 | 2017-04-01 |
| 3 | 222 | 100 | 2017-03-01 |
+---------------+-----------+------------+------------+
Here person_id 111 has one task due in March 2017 (2017-03-14) and its status is COMPLETE; since 1 out of 1 task is completed in March, the compliance is 100%.
Currently I use a separate table to store this compliance, but I have to recalculate the compliance and update that table every time a task status changes.
I have also tried creating a view, but it takes too long to execute: almost 0.5 seconds for 1 million records.
CREATE VIEW `person_compliance_view` AS
SELECT
`t`.`person_id`,
CAST((`t`.`due_date_time` - INTERVAL (DAYOFMONTH(`t`.`due_date_time`) - 1) DAY)
AS DATE) AS `month`,
COUNT(`t`.`status`) AS `total_count`,
COUNT((CASE
WHEN (`t`.`status` = 'COMPLETE') THEN 1
END)) AS `completed_count`,
CAST(((COUNT((CASE
WHEN (`t`.`status` = 'COMPLETE') THEN 1
END)) / COUNT(`t`.`status`)) * 100)
AS DECIMAL (10 , 2 )) AS `compliance`
FROM
`task` `t`
WHERE
((`t`.`isDeleted` = 0)
AND (`t`.`due_date_time` < NOW()))
GROUP BY `t`.`person_id` , EXTRACT(YEAR_MONTH FROM `t`.`due_date_time`)
Is there any optimized way to do it?
The first question to consider is whether the view can be optimized to give the required performance. This may mean making some changes to the underlying tables and data structure. For example, you might want indexes and you should check query plans to see where they would be most effective.
Other possible changes which would improve efficiency include adding an extra column "year_month" to the base table, which you could populate via a trigger. Another possibility would be to move all the deleted tasks to an 'archive' table to give the view less data to search through.
Whatever you do, a view will always perform worse than a table (assuming the table has relevant indexes). So depending on your needs you may find you need to use a table. That doesn't mean you should junk your view entirely. For example, if a daily refresh of your table is sufficient, you could use your view to help:
truncate table compliance;
insert into compliance select * from compliance_view;
Truncate is more efficient than delete, but you can't use a rollback, so you might prefer to use delete and top-and-tail with START TRANSACTION; ... COMMIT;. I've never created scheduled jobs in MySQL, but if you need help, this looks like a good starting point: here
If daily isn't often enough, you could schedule this to run more often, but better options will be triggers and/or "partial refreshes" (my term; I've no idea if there is a technical term for the idea).
A perfectly written trigger would spot any relevant insert/update/delete and then insert/update/delete the related records in the compliance table. The logic is a little daunting, and I won't attempt it here. An easier option would be a "partial refresh" called from within a trigger. The trigger would spot the user targeted by the change, delete only the records from compliance which are related to that user, and then insert from your compliance_view the records relating to that user. You should be able to put that into a stored procedure which is called by the trigger.
Update expanding on the options (if a view just won't do):
Option 1: Daily full (or more frequent) refresh via a schedule
You'd want code like this executed (at least) daily.
truncate table compliance;
insert into compliance select * from compliance_view;
Option 2: Partial refresh via trigger
I don't work with triggers often, so treat the following as an untested sketch of the logic in MySQL syntax (it assumes the base table is called task):
DELIMITER $$
CREATE TRIGGER task_after_insert
AFTER INSERT ON task -- you may need one for each of INSERT / UPDATE / DELETE
FOR EACH ROW -- MySQL only supports row-level triggers, so this fires once per changed row
BEGIN
  DELETE FROM compliance
  WHERE person_id = NEW.person_id; -- use OLD.person_id in a DELETE trigger
  INSERT INTO compliance SELECT * FROM compliance_view WHERE person_id = NEW.person_id;
END$$
DELIMITER ;
Option 3: Smart update via trigger
This would be similar to option 2, but instead of deleting all the rows from compliance that relate to the relevant person_id and creating them from scratch, you'd work out which ones to update and whether any should be added or deleted. The logic is a little involved, and I'm not going to attempt it here.
Personally, I'd be most tempted by Option 2, but you'd need to combine it with option 1, since the data goes stale due to the use of now().
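To make the "partial refresh" idea concrete, here is a minimal sketch of the delete-then-reinsert step for one person. It is not the MySQL trigger itself: it runs on SQLite through Python's sqlite3 module, the function names refresh_person and rows_for are made up for illustration, and the simplified view leaves out the isDeleted and NOW() filters from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (task_id INTEGER PRIMARY KEY, person_id INTEGER,
                   status TEXT, due_date_time TEXT);
CREATE TABLE compliance (person_id INTEGER, month TEXT, compliance REAL);
-- Simplified stand-in for the question's view (no isDeleted / NOW() filter).
CREATE VIEW compliance_view AS
  SELECT person_id,
         strftime('%Y-%m-01', due_date_time) AS month,
         100.0 * SUM(status = 'COMPLETE') / COUNT(*) AS compliance
  FROM task
  GROUP BY person_id, strftime('%Y-%m-01', due_date_time);
INSERT INTO task VALUES
  (1, 111, 'INCOMPLETE', '2017-04-13 17:20:23'),
  (2, 111, 'COMPLETE',   '2017-03-14 20:20:54'),
  (3, 111, 'COMPLETE',   '2017-04-20 15:15:23'),
  (4, 222, 'COMPLETE',   '2017-03-18 14:45:10');
""")


def refresh_person(conn, person_id):
    """Rebuild only the compliance rows that belong to one person."""
    with conn:  # both statements commit together or not at all
        conn.execute("DELETE FROM compliance WHERE person_id = ?", (person_id,))
        conn.execute(
            "INSERT INTO compliance "
            "SELECT * FROM compliance_view WHERE person_id = ?",
            (person_id,),
        )


def rows_for(conn, person_id):
    """Return (month, compliance) pairs stored for one person."""
    return conn.execute(
        "SELECT month, compliance FROM compliance "
        "WHERE person_id = ? ORDER BY month",
        (person_id,),
    ).fetchall()


refresh_person(conn, 111)
before = rows_for(conn, 111)

# A task status changes, so only person 111 needs to be recalculated.
conn.execute("UPDATE task SET status = 'COMPLETE' WHERE task_id = 1")
refresh_person(conn, 111)
after = rows_for(conn, 111)

print(before)
print(after)
```

In MySQL the body of refresh_person would live in the trigger or in a stored procedure called by it; the point of the sketch is only that the work is limited to one person's rows.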
Here's a similar way of writing the same thing...
Views are of very limited benefit in MySQL, and I think should generally be avoided.
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(task_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,person_id INT NOT NULL
,task_name VARCHAR(30) NOT NULL
,status ENUM('INCOMPLETE','COMPLETE') NOT NULL
,due_date_time DATETIME NOT NULL
);
INSERT INTO my_table VALUES
(1,111,'walk 20 min daily','INCOMPLETE','2017-04-13 17:20:23'),
(2,111,'brisk walk 30 min','COMPLETE','2017-03-14 20:20:54'),
(3,111,'take medication','COMPLETE','2017-04-20 15:15:23'),
(4,222,'sport','COMPLETE','2017-03-18 14:45:10');
SELECT person_id
, DATE_FORMAT(due_date_time,'%Y-%m') yearmonth
, SUM(status = 'complete')/COUNT(*) x
FROM my_table
GROUP
BY person_id
, yearmonth;
person_id yearmonth x
111 2017-03 1.0
111 2017-04 0.5
222 2017-03 1.0
I have two tables
Users:
user | name | country | rank | country_rank | points
1 | frank | US | to be determined | to be determined | to be determined
Awards:
awarded_to | points_awarded
1 | 10
1 | 30
How can I make a stored procedure to update the users total points based off of the points from awards, then their rank and country_rank respectively based off of the order of the points (i.e. rank 1 would be the user with the most points)?
I considered making a PHP script and using a crontab to call it occasionally that would just select the info and do the math etc in PHP, but stored procedures seems much more practical for my use-case.
create temporary table awardsum (user int, total int); #temp
insert into awardsum
select a.awarded_to, sum(a.points_awarded)
from users u
inner join awards a on u.user=a.awarded_to
group by a.awarded_to;
update users
join awardsum on users.user=awardsum.user
set users.points = awardsum.total;
SET @row := 0;
UPDATE users
SET rank = (@row := @row + 1)
ORDER BY points desc;
drop table awardsum;
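The statements above fill points and rank but not country_rank. As a rough sketch of one way to cover all three columns, the following uses correlated COUNT(*) subqueries instead of a row counter, which is a different technique from the answer's user variable. It runs on SQLite through Python's sqlite3 module rather than as a MySQL stored procedure; the users anna and lena and the function name refresh_ranks are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user INTEGER PRIMARY KEY, name TEXT, country TEXT,
                    points INTEGER, "rank" INTEGER, country_rank INTEGER);
CREATE TABLE awards (awarded_to INTEGER, points_awarded INTEGER);
-- frank comes from the question; anna and lena are made-up extra users
INSERT INTO users (user, name, country) VALUES
  (1, 'frank', 'US'), (2, 'anna', 'US'), (3, 'lena', 'DE');
INSERT INTO awards VALUES (1, 10), (1, 30), (2, 50), (3, 20);
""")


def refresh_ranks(conn):
    """Recompute points, then the overall and per-country ranks."""
    with conn:
        # Total points per user; users without awards get 0.
        conn.execute("""
            UPDATE users SET points = (
                SELECT COALESCE(SUM(points_awarded), 0)
                FROM awards WHERE awarded_to = users.user)""")
        # Rank = 1 + number of users with strictly more points,
        # so users with equal points share a rank.
        conn.execute("""
            UPDATE users SET
              "rank" = 1 + (SELECT COUNT(*) FROM users AS u2
                            WHERE u2.points > users.points),
              country_rank = 1 + (SELECT COUNT(*) FROM users AS u2
                                  WHERE u2.country = users.country
                                    AND u2.points > users.points)""")


refresh_ranks(conn)
ranking = conn.execute(
    'SELECT name, points, "rank", country_rank FROM users ORDER BY "rank"'
).fetchall()
print(ranking)
```

Unlike the row counter, this gives tied users the same rank; whether that is wanted depends on the application.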
I have a script which uploads a file and stores the file name in the database. When a document gets uploaded, I want to update the file name in the database so that it is suffixed with an incremental number such as _1, _2, _3 (before the file extension) if the DOCUMENT_ID already exists. The table structure looks like this:
ID | DOCUMENT_ID | NAME | MODIFIED | USER_ID
33 | 81 | document.docx | 2014-03-21 | 1
34 | 82 | doc.docx | 2014-03-21 | 1
35 | 82 | doc.docx | 2014-03-21 | 1
36 | 82 | doc.docx | 2014-03-21 | 1
So in the case above I would want ID 35 NAME to be doc_1.docx and ID 36 NAME to be doc_2.docx.
This is where I have got to so far. I have retrieved the last file details that have been uploaded:
$result1 = mysqli_query($con,"SELECT ID, DOCUMENT_ID, NAME, MODIFIED
FROM b_bp_history ORDER BY ID DESC LIMIT 1");
while($row = mysqli_fetch_array($result1))
{
$ID = $row['ID'];
$documentID = $row['DOCUMENT_ID'];
$documentName = $row['NAME'];
$documentModified = $row['MODIFIED'];
}
So this will give me the details I need to see whether the DOCUMENT_ID exists already. Now I thought it would be best to see if it does exist then by carrying out the following:
$sql = "SELECT ID, DOCUMENT_ID
FROM b_bp_history WHERE DOCUMENT_ID = $documentID";
$result2 = mysqli_query($con, $sql);
if(mysqli_num_rows($result2) >0){
/* This is where I need my update */
} else {
/* I don't need an update in here as it will automatically add to the database
table with no number after it. Not sure if I should always add the first one
with a _1 after it so the increment is easy? */
}
As you can see from the above, I need an update in there that checks whether a number already exists after the name and, if it does, increments it by one. In the else branch, i.e. if the DOCUMENT_ID doesn't already exist, I could add the first one with an _1.docx so that the increment will be easier?
If the DOCUMENT_ID does already exist, the update in the first half will need to check the last number before the extension and increment it by 1, so if it's _1 then the next will be _2. I'm not sure how to do this either. The end result I want is:
ID | DOCUMENT_ID | NAME | MODIFIED | USER_ID
33 | 81 | document.docx | 2014-03-21 | 1
34 | 82 | doc.docx | 2014-03-21 | 1
35 | 82 | doc_1.docx | 2014-03-21 | 1
36 | 82 | doc_2.docx | 2014-03-21 | 1
Generating a Sequence ID Value in MySQL to Represent a Revision ID Based Naming Convention
I used MySQL 5.5.32 to develop and test this solution. Be sure to review the bottom section of my solution for a few homework assignments for future consideration in your overall design approach.
Summary of Requirements and Initial Comments
An external script writes to a document history table. Meta information about a user-submitted file is kept in this table, including its user-assigned name. The OP requests a SQL update statement or procedural block of DML operations that will change the original document name to one that represents the concept of a discrete REVISION ID.
The original table design contains an independent primary key: ID
An implied business key also exists in the relationship between DOCUMENT_ID (a numerical id possibly assigned externally by the script itself) and MODIFIED (a DATE typed value representing when the latest revision of a document was submitted/recorded).
Although other RDBMS systems have useful objects and built-in features such as Oracle's SEQUENCE object and ANALYTICAL FUNCTIONS, there are options available with MySQL's SQL based capabilities.
Setting up a Working Schema
Below is the DDL script used to build the environment discussed in this solution. It should match the OP description with an exception (discussed below):
CREATE TABLE document_history
(
id int auto_increment primary key,
document_id int,
name varchar(100),
modified datetime,
user_id int
);
INSERT INTO document_history (document_id, name, modified,
user_id)
VALUES
(81, 'document.docx', convert('2014-03-21 05:00:00',datetime),1),
(82, 'doc.docx', convert('2014-03-21 05:30:00',datetime),1),
(82, 'doc.docx', convert('2014-03-21 05:35:00',datetime),1),
(82, 'doc.docx', convert('2014-03-21 05:50:00',datetime),1);
COMMIT;
The table DOCUMENT_HISTORY was designed with a DATETIME typed column for the column called MODIFIED. Entries into the document_history table would otherwise have a high likeliness of returning multiple records for queries organized around the composite business key combination of: DOCUMENT_ID and MODIFIED.
How to Provide a Sequenced Revision ID Assignment
A creative solution to SQL based, partitioned row counts is in an older post: ROW_NUMBER() in MySQL by @bobince.
A SQL query adapted for this task:
select t0.document_id, t0.modified, count(*) as revision_id
from document_history as t0
join document_history as t1
on t0.document_id = t1.document_id
and t0.modified >= t1.modified
group by t0.document_id, t0.modified
order by t0.document_id asc, t0.modified asc;
The resulting output of this query using the supplied test data:
| DOCUMENT_ID | MODIFIED | REVISION_ID |
|-------------|------------------------------|-------------|
| 81 | March, 21 2014 05:00:00+0000 | 1 |
| 82 | March, 21 2014 05:30:00+0000 | 1 |
| 82 | March, 21 2014 05:35:00+0000 | 2 |
| 82 | March, 21 2014 05:50:00+0000 | 3 |
Note that the revision id sequence follows the correct order that each version was checked in and the revision sequence properly resets when it is counting a new series of revisions related to a different document id.
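The counting query relies on (document_id, modified) being unique, which is why the test schema uses a DATETIME column. If several revisions share the same MODIFIED value, as in the question's date-only sample, one possible variation is to break ties on the auto-increment id. The sketch below is an illustration of that variation, not part of the original solution; it runs on SQLite through Python's sqlite3 module and uses the ids and dates from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document_history (id INTEGER PRIMARY KEY, document_id INTEGER,
                               name TEXT, modified TEXT, user_id INTEGER);
-- Date-only MODIFIED values, as in the question: three ties for document 82.
INSERT INTO document_history VALUES
  (33, 81, 'document.docx', '2014-03-21', 1),
  (34, 82, 'doc.docx',      '2014-03-21', 1),
  (35, 82, 'doc.docx',      '2014-03-21', 1),
  (36, 82, 'doc.docx',      '2014-03-21', 1);
""")

# Count, for every row, the rows of the same document that come earlier
# or are the row itself; equal MODIFIED values are ordered by id.
revisions = conn.execute("""
SELECT t0.id, t0.document_id, COUNT(*) AS revision_id
FROM document_history AS t0
JOIN document_history AS t1
  ON t1.document_id = t0.document_id
 AND (t1.modified < t0.modified
      OR (t1.modified = t0.modified AND t1.id <= t0.id))
GROUP BY t0.id, t0.document_id
ORDER BY t0.id
""").fetchall()

for row in revisions:
    print(row)
```

Grouping by t0.id keeps one output row per history record, so tied timestamps no longer collapse into a single group with an inflated count.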
EDIT: A good comment from @ThomasKöhne is to consider keeping this REVISION_ID as a persistent attribute of your version tracking table. It could be derived from the assigned file name, but a stored column may be preferable because an index on a single-value column is more likely to be used. The revision ID alone may also be useful for other purposes, such as providing an accurate sort column when querying a document's history.
Using MySQL String Manipulation Functions
Revision identification can also benefit from an additional convention: the width of the name column should be sized to accommodate the appended revision id suffix as well. Some MySQL string operations that will help:
-- Resizing String Values:
SELECT SUBSTR('EXTRALONGFILENAMEXXX',1,17) FROM DUAL
| SUBSTR('EXTRALONGFILENAMEXXX',1,17) |
|-------------------------------------|
| EXTRALONGFILENAME |
-- Substituting and Inserting Text Within Existing String Values:
SELECT REPLACE('THE QUICK <LEAN> FOX','<LEAN>','BROWN') FROM DUAL
| REPLACE('THE QUICK <LEAN> FOX','<LEAN>','BROWN') |
|--------------------------------------------------|
| THE QUICK BROWN FOX |
-- Combining Strings Using Concatenation
SELECT CONCAT(id, '-', document_id, '-', name)
FROM document_history
| CONCAT(ID, '-', DOCUMENT_ID, '-', NAME) |
|-----------------------------------------|
| 1-81-document.docx |
| 2-82-doc.docx |
| 3-82-doc.docx |
| 4-82-doc.docx |
Pulling it All Together: Constructing a New File Name Using Revision Notation
Using the previous query from above as a base, inline view (or sub query), this is a next step in generating the new file name for a given revision log record:
SQL Query With Revised File Name
select replace(docrec.name, '.', CONCAT('_', rev.revision_id, '.')) as new_name,
rev.document_id, rev.modified
from (
select t0.document_id, t0.modified, count(*) as revision_id
from document_history as t0
join document_history as t1
on t0.document_id = t1.document_id
and t0.modified >= t1.modified
group by t0.document_id, t0.modified
order by t0.document_id asc, t0.modified asc
) as rev
join document_history as docrec
on docrec.document_id = rev.document_id
and docrec.modified = rev.modified;
Output With Revised File Name
| NEW_NAME | DOCUMENT_ID | MODIFIED |
|-----------------|-------------|------------------------------|
| document_1.docx | 81 | March, 21 2014 05:00:00+0000 |
| doc_1.docx | 82 | March, 21 2014 05:30:00+0000 |
| doc_2.docx | 82 | March, 21 2014 05:35:00+0000 |
| doc_3.docx | 82 | March, 21 2014 05:50:00+0000 |
These (NEW_NAME) values are the ones required to update the DOCUMENT_HISTORY table. An inspection of the MODIFIED column for DOCUMENT_ID = 82 shows that the check-in revisions are numbered in the correct order with respect to this part of the composite business key.
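One caveat: REPLACE substitutes every dot in the name, so a file called my.report.docx would receive the suffix twice. If the renaming is done in application code instead, a small helper can insert the suffix before the last extension only. This is a sketch in Python rather than the OP's PHP, and the function name revision_name is made up for illustration.

```python
import os


def revision_name(name: str, revision: int) -> str:
    """Insert _<revision> before the final extension of a file name."""
    # splitext splits at the last dot only, so earlier dots are untouched.
    stem, ext = os.path.splitext(name)
    return f"{stem}_{revision}{ext}"


print(revision_name("doc.docx", 2))
print(revision_name("my.report.docx", 1))
print(revision_name("README", 3))
```

Names without an extension simply get the suffix appended at the end.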
Finding Un-processed Document Records
If the file name format is fairly consistent, a SQL LIKE operator may be enough to identify the record names which have been already altered. MySQL also offers filtering capabilities through REGULAR EXPRESSIONS, which offers more flexibility with parsing through document name values.
What remains is figuring out how to update just a single record or a set of records. The appropriate place to put the filter criteria would be on the outermost part of the query right after the join between aliased tables:
...
and docrec.modified = rev.modified
WHERE docrec.id = ??? ;
There are other places where you can optimize for faster response times, such as within the internal sub query that derives the revision id value... the more you know about the specific set of records that you are interested in, you can segment the beginning SQL statements to look only at what is of interest.
Homework: Some Closing Comments on the Solution
This section is purely optional; it collects some side thoughts on design and usability that came to mind while writing this up.
Two-Step or One-Step?
With the current design, there are two discrete operations per record: INSERT by a script and then UPDATE of the value via a SQL DML call. It may be annoying to have to remember two SQL commands. Consider building a second table built for insert only operations.
Use the second table (DOCUMENT_LIST) to hold nearly identical information, except possibly two columns:
BASE_FILE_NAME (i.e., doc.docx or document.docx) which may apply for multiple HISTORY_ID values.
FILE_NAME (i.e., doc_1.docx, doc_2.docx, etc.) which will be unique for each record.
Set a database TRIGGER on the source table: DOCUMENT_HISTORY and put the SQL query we've developed inside of it. This will automatically populate the correct revision file name at roughly the same moment after the script fills the history table.
WHY BOTHER? This suggestion mainly fits under the category of SCALABILITY of your database design. The assignment of a revision name is still a two step process, but the second step is now handled automatically within the database, whereas you'd have to remember to include it everywhere you invoked a DML operation on top of the history table.
Managing Aliases
I didn't see it anywhere, but I assume that the USER initially assigns some name to the file being tracked. In the end, it appears that it may not matter as it is an internally tracked thing that the end user of the system would never see.
For your information, this information isn't portrayed to the customer, it is saved in a table in the database as a version history...
Reading the history of a given document would be easier if the "base" name was kept the same once it has been given:
In the data sample above, unless the DOCUMENT_ID is known, it may not be clear that all the file names listed are related. This may not necessarily be a problem, but it is a good practice from a semantic point of view to separate user assigned file names as ALIASES that can be changed and assigned at will at any time.
Consider setting up a separate table for tracking the "User-Friendly" name given by the end user, and associating it with the document id it is supposed to represent. A user may make hundreds or thousands of rename requests... while the back end file system uses a simpler, more consistent naming approach.
I had similar trouble recently, but I'm using MSSQL and I don't know MySQL syntax, so here is some T-SQL code. I hope it helps!
declare
@id int,
@document_id int,
@document_name varchar(255),
@append_name int,
@name varchar(255),
@extension varchar(10)
set @append_name = 1
select top 1
@id = ID,
@document_id = DOCUMENT_ID,
@document_name = NAME
from
b_bp_history
while exists (
select *
from b_bp_history
where
NAME = @document_name and
DOCUMENT_ID = @document_id and
ID <> @id)
begin
set @name = ''
set @extension = ''
declare @dot_index int -- index of dot-symbol in document name
set @dot_index = charindex('.', reverse(@document_name))
if (@dot_index > 0)
begin
set @name = substring(@document_name, 0, len(@document_name) - @dot_index + 1)
set @extension = substring(@document_name, len(@document_name) - @dot_index + 2, len(@document_name) - len(@name))
end
else
set @name = @document_name
if (@append_name > 1) -- if not first try to rename file
begin
if (right(@name, len(cast(@append_name - 1 as varchar)) + 1)) = '_' + cast(@append_name - 1 as varchar)
begin
set @name = substring(@name, 0, len(@name) - (len(cast(@append_name - 1 as varchar))))
end
end
set @name = @name + '_' + cast(@append_name as varchar)
if (len(@extension) > 0)
set @document_name = @name + '.' + @extension
else
set @document_name = @name
set @append_name = @append_name + 1
end
update b_bp_history
set NAME = @document_name
where ID = @id
Here is the Working UPDATE QUERY
UPDATE document_history
INNER JOIN (SELECT dh.id, IF(rev.revision_id = 0, dh.name,REPLACE(dh.name, '.', CONCAT('_', rev.revision_id, '.'))) AS new_name,
rev.document_id, rev.modified
FROM (
SELECT t0.document_id, t0.modified, count(*) - 1 AS revision_id
FROM document_history as t0
JOIN document_history as t1
ON t0.document_id = t1.document_id
AND t0.modified >= t1.modified
GROUP BY t0.document_id, t0.modified
ORDER BY t0.document_id ASC, t0.modified ASC) AS rev
JOIN document_history dh
ON dh.document_id = rev.document_id
AND dh.modified = rev.modified) update_record
ON document_history.id = update_record.id
SET document_history.name = update_record.new_name;
You can see the SQL Fiddle at http://www.sqlfiddle.com/#!2/9b3cda/1
I used the information available on this page on UPDATE to assemble my query:
MySQL - UPDATE query based on SELECT Query
Used the page below for generating a Revision ID:
ROW_NUMBER() in MySQL
Also used the schema provided by Richard Pascual in his elaborate answer.
Hope this query helps you to name your document as you wish.
In my application I have association between two entities employees and work-groups.
This association usually changes over time, so in my DB I have something like:
employees
| EMPLOYEE_ID | NAME |
| ... | ... |
workgroups
| GROUP_ID | NAME |
| ... | ... |
employees_workgroups
| EMPLOYEE_ID | GROUP_ID | DATE |
| ... | ... | ... |
So suppose I have an association between employee 1 and group 1, valid from 2014-01-01 on.
When a new association is created, for example from 2014-02-01 on, the old one is no longer valid.
This structure for the associative table is a bit problematic for queries, but I would rather avoid adding an END_DATE field to the table because it would be a redundant value and would also require an insert plus an update (or an update on two rows) every time an association changes.
So have you any idea to create a more practical architecture to solve my problem? Is this the better approach?
You have what is called a slowly changing dimension. That means that you need to have dates in the employees_workgroup table in order to find the right workgroup at the right time for a set of employees.
The best way to handle this is to have two dates on each row, which I often call effdate and enddate. This greatly simplifies queries where you are trying to find the workgroup at a particular point in time. With this structure, such a query might look like:
select ew.*
from employees_workgroup ew
where MYDATE between effdate and enddate;
Now consider the same results using only one date per row. It might be something like this:
select ew.*
from employees_workgroup ew join
(select employee_id, max(date) as maxdate
from employees_workgroup ew2
where ew2.date <= MYDATE
group by employee_id
) as rec
on ew.employee_id = rec.employee_id and ew.date = rec.maxdate;
The expense of doing an update along with the insert is minimal compared to the complexity this will introduce in the queries.
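To show how small the extra update is in practice, here is a minimal sketch of the insert-plus-update pattern and the point-in-time lookup. Several details are assumptions on my part rather than part of the answer: it runs on SQLite through Python's sqlite3 module, open-ended rows use the sentinel end date 9999-12-31, the previous row is closed on the day before the new effdate so that BETWEEN never matches two rows, and the function names assign and group_on are made up.

```python
import sqlite3

OPEN_END = "9999-12-31"  # assumed sentinel meaning "still valid"

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employees_workgroup (
    employee_id INTEGER, group_id INTEGER, effdate TEXT, enddate TEXT)""")


def assign(conn, employee_id, group_id, effdate):
    """Close the employee's open association, then insert the new one."""
    with conn:  # the update and the insert commit together
        conn.execute(
            """UPDATE employees_workgroup
               SET enddate = date(?, '-1 day')
               WHERE employee_id = ? AND enddate = ?""",
            (effdate, employee_id, OPEN_END),
        )
        conn.execute(
            "INSERT INTO employees_workgroup VALUES (?, ?, ?, ?)",
            (employee_id, group_id, effdate, OPEN_END),
        )


def group_on(conn, employee_id, day):
    """Return the group the employee belonged to on the given day."""
    row = conn.execute(
        """SELECT group_id FROM employees_workgroup
           WHERE employee_id = ? AND ? BETWEEN effdate AND enddate""",
        (employee_id, day),
    ).fetchone()
    return row[0] if row else None


assign(conn, 1, 1, "2014-01-01")
assign(conn, 1, 2, "2014-02-01")

print(group_on(conn, 1, "2014-01-31"))
print(group_on(conn, 1, "2014-02-01"))
print(group_on(conn, 1, "2013-12-31"))
```

The lookup stays a single range condition no matter how many changes an employee has had, which is the simplification the two-date design buys.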
I have a number of tables in my database and I need help structuring queries that are quick and efficient. With the different queries that I have written so far, the results have either been inconsistent or too big (returning more information than I need, so that I have to use PHP later to constrain the results). Here is the background.
Our database handles leads for the senior housing industry. Each lead has many notes (a sales history) that not only give a history of the sale but inform the user of the next follow up date (actionDate). There are also many different statuses (i.e., active, top 10, move in, etc.) each lead can be assigned to (though not at the same time). The status of a lead is a history of the progression of a lead through the sales process. We can see what status the lead was and when.
In the base "lead" table each lead has a Primary Key called "inquiryID" that auto increments. This key is referenced in most other tables to relate them to the "lead" table. Here is the structure of the "lead" table.
TABLE: lead (~500 rows)
+-------------------+------------+-------+--------+
| Field | Type | Key | Extra |
+-------------------+------------+-------+--------+
| inquiryID | int(11) | PK | AI |
| communityID | int(3) | | |
| initialDate | date | | |
| inquirySource | tinytext | | |
| inquiryType | tinytext | | |
+-------------------+------------+-------+--------+
Another table is titled "leadNote". This table handles the sales journal for each lead. Basically a salesperson would enter in the date the note was written (date) who is writing the note (salesCounselor), the note itself (note), who is to follow up with the lead (actionCounselor), and what date they will follow up (actionDate).
TABLE: leadNote (~15000 rows)
+-------------------+------------+-------+--------+
| Field | Type | Key | Extra |
+-------------------+------------+-------+--------+
| inquiryNoteID | int(11) | PK | AI |
| inquiryID | int(11) | FK | |
| date | date | | |
| salesCounselor | tinytext | | |
| note | text | | |
| actionCounselor | int(5) | | |
+-------------------+------------+-------+--------+
The final table I will reference is titled "leadStatusHistory". This table handles the history of the status of this lead. A lead can have many different statuses, but not at the same time. We want to be able to track what a lead's status is and when. A lead would have a status (leadStatus), a date the status was assigned to them (statusDate), and who assigned the status to them (author) among other gathered data.
TABLE: leadStatusHistory (~1200 rows)
+-------------------+-------------+-------+--------+
| Field | Type | Key | Extra |
+-------------------+-------------+-------+--------+
| historyID | int(11) | PK | AI |
| inquiryID | int(11) | FK | |
| leadStatus | tinytext | | |
| date | datetime | | |
| communityID | int(3) | | |
| timestamp | timestamp | | |
+-------------------+-------------+-------+--------+
My goal is to be able to run a query that returns the inquiryID, actionCounselor, actionDate, and current leadStatus. As I said earlier, the many different queries that I have tried have brought mixed results. There are two ways that I want to gather this list: 1) find all leads that have a next contact date that is less than or equal to today (this is the list of leads scheduled for follow-up today); 2) find all leads that currently match a certain leadStatus (i.e., look up all leads that currently have a status of "move in").
This is how I would ORDER the tables to get the information that I am looking for.
1) Find inquiryID, actionCounselor (value in the actionCounselor column on the most recently created "leadNote" row or "date" that is the greatest), actionDate (value in the actionDate column on the most recently created "leadNote" row or "date" that is the greatest), and leadStatus (value in the leadStatus column on the most recently created "leadStatusHistory" row or "timestamp" that is the greatest) WHERE the actionDate (value in the actionDate column on the most recently created "leadNote" row or "date" that is the greatest) is less than or equal to today.
2) Find inquiryID, actionCounselor (value in the actionCounselor column on the most recently created "leadNote" row or "date" that is the greatest), actionDate (value in the actionDate column on the most recently created "leadNote" row or "date" that is the greatest), and leadStatus (value in the leadStatus column on the most recently created "leadStatusHistory" row or "timestamp" that is the greatest) WHERE leadStatus (value in the leadStatus column on the most recently created "leadStatusHistory" row or "timestamp" that is the greatest) is equal to "move in".
Here are some examples of current and past queries with my comments as to what is wrong with them.
query #1:
SELECT
    tt.inquiryID,
    tt.actionDate,
    tt.date,
    tt.actionCounselor,
    (SELECT
        leadstatushistory.leadstatus
    FROM
        leadstatushistory
    WHERE
        leadstatushistory.inquiryID = tt.inquiryID
        AND leadstatushistory.historyID = (SELECT
            MAX(leadstatushistory.historyID) AS historyID
        FROM
            leadstatushistory
        WHERE
            inquiryID = tt.inquiryID)) AS leadStatus
FROM
    leadnote tt
INNER JOIN
    (SELECT
        inquiryID,
        MAX(inquiryNoteID) AS inquiryNoteID,
        MAX(leadnote.actionDate) AS actionDate
    FROM
        leadnote
    GROUP BY inquiryID) groupedtt
    ON tt.inquiryID = groupedtt.inquiryID
    AND tt.inquiryNoteID = groupedtt.inquiryNoteID
WHERE
    tt.actionDate <= '2012-08-27'
    AND tt.actionDate != '0000-00-00'
    AND (SELECT
        leadstatushistory.leadstatus
    FROM
        leadstatushistory
    WHERE
        leadstatushistory.inquiryID = tt.inquiryID
        AND leadstatushistory.historyID = (SELECT
            MAX(leadstatushistory.historyID) AS historyID
        FROM
            leadstatushistory
        WHERE
            inquiryID = tt.inquiryID)) != 'Resident'
    AND tt.communityID = 4
GROUP BY tt.inquiryID
COMMENTS: Gave me the columns I needed, but I have had complaints that now and then the "actionDate" column does not reflect the date of the most recently created leadNote row, and sometimes the leadStatus was wrong. For example, the max(historyID) for the leadStatusHistory table is not necessarily the most recent status that we want to find. Sometimes our employees will go back and fill in a missing leadStatus for leads in the past. This creates a new leadStatusHistory row with a new auto increment historyID. In this case the most recent row (the one with the greatest historyID) does not have the greatest "leadStatusHistory.date", because the date the user entered was a past date (filling in past information so our historical records are accurate). We have the exact same problem with entering past notes into the leadNote table: the new auto increment inquiryNoteID does not necessarily match the row with the greatest "tt.date".
query #2:
SELECT
    maxDate.inquiryID, maxDate.date, maxDate.actionDate, maxDate.actionCounselor
FROM
    (SELECT
        *
    FROM
        leadnote
    ORDER BY date DESC, type ASC, inquiryNoteID DESC) AS maxDate
LEFT JOIN
    staff ON maxDate.actionCounselor = staff.staffID
WHERE
    maxDate.communityID = 4
GROUP BY inquiryID
COMMENTS: Gave me the columns I needed, but it also finds the information for all leads. This wastes valuable time and makes the response slower. I then have to use PHP to constrain the data to show only those leads that have an actionDate of <= today and a date that isn't "0000-00-00" or I constrain the data to show only those leads with a leadStatus of "Move In". Again, this does give me the results that I am looking for, but it is slow. Also, if I add into the query "WHERE date<=[today] AND date != '0000-00-00'" in the subquery it changes the results so they are not accurate and then I still have to use PHP to constrain the results to show only those that are of the status that I am looking for.
Looking at the above information, does anyone have any ideas of how to better structure my query so that I can quickly find the exact information that I am looking for? Or is there a way to change the structure or relationship of the tables to get the results I am looking for? Any help is appreciated.
My goal is to be able to run a query that returns the inquiryID, actionCounselor, actionDate, and current leadStatus.
You are seeking to find the groupwise maxima from your leadNote and leadStatusHistory tables: namely the records with the maximum dates within each group of inquiryID.
You can achieve this with a query along the following lines:
SELECT inquiryID, actionCounselor, actionDate, leadStatus
FROM (
    leadNote NATURAL JOIN (
        SELECT inquiryID, MAX(actionDate) AS actionDate
        FROM leadNote
        GROUP BY inquiryID
    ) AS latestNote
) JOIN (
    leadStatusHistory NATURAL JOIN (
        SELECT inquiryID, MAX(statusDate) AS statusDate
        FROM leadStatusHistory
        GROUP BY inquiryID
    ) AS latestStatus
) USING (inquiryID)
For the best performance, you should ensure that leadNote has a composite index on (inquiryID, actionDate) and that leadStatusHistory has a composite index on (inquiryID, statusDate, leadStatus):
ALTER TABLE leadNote ADD INDEX (inquiryID, actionDate);
ALTER TABLE leadStatusHistory ADD INDEX (inquiryID, statusDate, leadStatus);
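To check that the new indexes are actually being used, one option is to run EXPLAIN on each of the grouped subqueries. This is only a sketch: statusDate is the column name assumed in the query above, and the exact EXPLAIN output depends on the MySQL version and the data in the tables.

```sql
-- Inspect the execution plan of each groupwise-maximum subquery.
-- statusDate is the assumed name of the status date column.
EXPLAIN SELECT inquiryID, MAX(actionDate) AS actionDate
FROM leadNote
GROUP BY inquiryID;

EXPLAIN SELECT inquiryID, MAX(statusDate) AS statusDate
FROM leadStatusHistory
GROUP BY inquiryID;
```

If the key column of the output names the composite index, the subquery is being served from it.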
There are two types of ways that I want to gather this list. 1) find all leads that have a next contact date that is less than or equal to today (this is the list of leads scheduled to follow up with today). 2) find all leads that match a certain leadStatus currently (i.e., to look up all leads that are currently with a status of "move in".
1) Add WHERE actionDate <= CURRENT_DATE to the end of the query above.
2) Add WHERE leadStatus = 'move in' to the end of the query above.
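Put together, the follow-up list (case 1) might look like the sketch below. It reuses the column names from the query above (statusDate is an assumed name, so substitute the real status date column), and the derived-table aliases latestNote and latestStatus are made up for illustration.

```sql
-- Case 1: leads whose next contact date is today or earlier.
SELECT inquiryID, actionCounselor, actionDate, leadStatus
FROM (
    leadNote NATURAL JOIN (
        -- latest actionDate per inquiry
        SELECT inquiryID, MAX(actionDate) AS actionDate
        FROM leadNote
        GROUP BY inquiryID
    ) AS latestNote
) JOIN (
    leadStatusHistory NATURAL JOIN (
        -- latest status date per inquiry (statusDate is an assumed column name)
        SELECT inquiryID, MAX(statusDate) AS statusDate
        FROM leadStatusHistory
        GROUP BY inquiryID
    ) AS latestStatus
) USING (inquiryID)
WHERE actionDate <= CURRENT_DATE;

-- Case 2: replace the WHERE clause with
--   WHERE leadStatus = 'move in'
```

Because the filter is applied after the groupwise maxima are joined, it tests each lead's latest note and latest status rather than any older row.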