Assistance with complex MySQL query (using LIMIT ?) - mysql

I wonder if anyone could help with a MySQL query I am trying to write to return relevant results.
I have a big table of change log data, and I want to retrieve a number of record 'groups'. For example, in this case a group would be where two or more records are entered with the same timestamp.
Here is a sample table.
==============================================
ID   DATA               TIMESTAMP
==============================================
1    Some text          1379000000
2    Something          1379011111
3    More data          1379011111
4    Interesting data   1379022222
5    Fascinating text   1379033333
If I wanted the first two grouped sets, I could use LIMIT 0,2 but this would miss the third record. The ideal query would return three rows (as two rows have the same timestamp).
==============================================
ID   DATA               TIMESTAMP
==============================================
1    Some text          1379000000
2    Something          1379011111
3    More data          1379011111
Currently I've been using PHP to process the entire table, which mostly works, but for a table of 1000+ records, this is not very efficient on memory usage!
Many thanks in advance for any help you can give...

Get the timestamps for the filtering using a join. For instance, the following returns every row whose timestamp matches one of the first two rows, so the group containing the second row comes back complete:
select t.*
from t join
     (select timestamp
      from t
      order by timestamp
      limit 2
     ) tt
     on t.timestamp = tt.timestamp;
The following would get the first three groups, no matter what their size:
select t.*
from t join
     (select distinct timestamp
      from t
      order by timestamp
      limit 3
     ) tt
     on t.timestamp = tt.timestamp;
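To check the second query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs without a server). The helper name first_groups is hypothetical, and the ids are numbered 1 to 5 for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, data TEXT, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "Some text", 1379000000),
        (2, "Something", 1379011111),
        (3, "More data", 1379011111),
        (4, "Interesting data", 1379022222),
        (5, "Fascinating text", 1379033333),
    ],
)


def first_groups(conn, n):
    """Return all rows belonging to the n earliest distinct timestamps."""
    sql = """
        SELECT t.id, t.data, t.timestamp
        FROM t
        JOIN (SELECT DISTINCT timestamp
              FROM t
              ORDER BY timestamp
              LIMIT ?) AS tt
          ON t.timestamp = tt.timestamp
        ORDER BY t.timestamp, t.id
    """
    return conn.execute(sql, (n,)).fetchall()


# Two groups, three rows: the second group has two rows with the same timestamp.
for row in first_groups(conn, 2):
    print(row)
```

Because the LIMIT applies to distinct timestamps rather than to rows, the number of rows returned depends on the size of each group.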

Related

SQL Capture duplicate records across two DIFFERENT columns

I am writing an Exception Catching Page using MySQL for catching duplicate billing entries in the following scenario.
Items details are entered in a table which has the following two columns (among others).
ItemCode VARCHAR(50), BillEntryDate DATE
It often happens that the same item's bill is entered multiple times, but over a period of a few days. For example:
"Football","2019-01-02"
"Basketball","2019-01-02"
...
...
"Football","2019-01-05"
"Rugby","2019-01-05"
...
"Handball","2019-01-05"
"Rugby","2019-01-07"
"Rugby","2019-01-10"
In the above example, the item Football is billed twice: first on 2 Jan and again on 5 Jan. Similarly, the item Rugby is billed three times, on 5, 7 and 10 Jan.
I am looking to write simple SQL which can pick up each item [say, using a distinct(ItemCode) clause], and then display all the records which are duplicates over a period of 30 days.
In the above case, the expected output should be the following 5 records:
"Football","2019-01-02"
"Football","2019-01-05"
"Rugby","2019-01-05"
"Rugby","2019-01-07"
"Rugby","2019-01-10"
I am trying to run the following SQL:
select * from tablen a, tablen b where a.ItemCode=b.ItemCode and a.BillEntryDate = b.BillEntryDate+30;
However, this seems to be highly inefficient, as it runs for a long time without displaying any records.
Is there a less complex and faster method?
I did explore existing topics (like How do I find duplicates across multiple columns?), but it is catching duplicates where BOTH columns have same value. My requirement is one column same value, and second column varying over a month-long date range.
You can use:
select t.*
from tablen t
where exists (select 1
              from tablen t2
              where t2.ItemCode = t.ItemCode and
                    t2.BillEntryDate <> t.BillEntryDate and
                    t2.BillEntryDate >= t.BillEntryDate - interval 30 day and
                    t2.BillEntryDate <= t.BillEntryDate + interval 30 day
             );
This will pick up both duplicates in the pair.
For performance, you want an index on (ItemCode, BillEntryDate).
With EXISTS:
select ItemCode, BillEntryDate
from tablename t
where exists (
  select 1 from tablename
  where
    ItemCode = t.ItemCode
    and
    abs(datediff(BillEntryDate, t.BillEntryDate)) between 1 and 30
)
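The EXISTS approach can be tried on the sample rows from the question with the sketch below. It uses Python's built-in sqlite3 module in place of MySQL, so julianday() differences stand in for MySQL's datediff(); that substitution, the index name idx_item_date, and the variable names are assumptions of this example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablen (ItemCode TEXT, BillEntryDate TEXT)")
# Composite index as suggested in the answer; the index name is hypothetical.
conn.execute("CREATE INDEX idx_item_date ON tablen (ItemCode, BillEntryDate)")
conn.executemany(
    "INSERT INTO tablen VALUES (?, ?)",
    [
        ("Football", "2019-01-02"),
        ("Basketball", "2019-01-02"),
        ("Football", "2019-01-05"),
        ("Rugby", "2019-01-05"),
        ("Handball", "2019-01-05"),
        ("Rugby", "2019-01-07"),
        ("Rugby", "2019-01-10"),
    ],
)

# A row is a duplicate if another row with the same ItemCode lies
# between 1 and 30 days away from it, in either direction.
DUPLICATES_SQL = """
    SELECT t.ItemCode, t.BillEntryDate
    FROM tablen AS t
    WHERE EXISTS (
        SELECT 1
        FROM tablen AS t2
        WHERE t2.ItemCode = t.ItemCode
          AND ABS(julianday(t2.BillEntryDate) - julianday(t.BillEntryDate))
              BETWEEN 1 AND 30
    )
    ORDER BY t.ItemCode, t.BillEntryDate
"""

duplicates = conn.execute(DUPLICATES_SQL).fetchall()
for row in duplicates:
    print(row)
```

On this data the query returns the five Football and Rugby rows listed as the expected output in the question.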

MYSQL pagination performance

I have the following sample MYSQL table:
id | count_likes
-----------
1 | 30
2 | 95
3 | 60
4 | 60
5 | 22
I want to order the table by column count_likes descending and display 5 rows at a time (this is a sample table so assume thousands of rows).
To achieve this I run the following command:
SELECT * FROM table ORDER BY count_likes DESC, id DESC LIMIT 5
I want to give the option for users to load more rows like loading facebook comments for example (5 rows at a time).
To achieve this I run the following command:
SELECT * FROM table WHERE id NOT IN(values already loaded)
ORDER BY count_likes DESC, id DESC LIMIT 5
This could work well for a few pages, but I think it's not recommended to have something like a hundred values in the WHERE NOT IN clause.
If I make the command like this:
SELECT * FROM table WHERE count_likes < 'the last displayed count number'
I could miss some rows which have the same count like the last loaded row.
If I make the command like this:
SELECT * FROM table WHERE count_likes <= 'the last displayed count number'
I could get duplicate values that are already loaded.
If I make the command like this:
SELECT * FROM table ORDER BY count_likes DESC LIMIT offset,5
I may get disorganized or duplicate rows as the count_likes for any row may increase or decrease while other users are manipulating the same page.
What is the best way to load more rows in my case above?
The most accurate one would be the WHERE NOT IN, but I don't know if it causes performance issues with a large number of values, like a hundred or even a thousand.
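One possible way to avoid both the missed rows of the "<" variant and the duplicates of the "<=" variant is to compare on both ordering columns, using id as a tie-breaker (often called keyset pagination). The sketch below is an illustration and not an answer from the thread; it uses Python's built-in sqlite3 module in place of MySQL, a hypothetical table name posts, a hypothetical helper next_page, and a page size of 2 so that the tie on count_likes = 60 falls across a page boundary.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, count_likes INTEGER)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [(1, 30), (2, 95), (3, 60), (4, 60), (5, 22)],
)


def next_page(conn, page_size, last=None):
    """Fetch the next page; last is the (count_likes, id) of the last row shown."""
    if last is None:
        sql = """
            SELECT id, count_likes FROM posts
            ORDER BY count_likes DESC, id DESC
            LIMIT ?
        """
        params = (page_size,)
    else:
        last_likes, last_id = last
        # Rows strictly after the last shown row in (count_likes DESC, id DESC) order.
        sql = """
            SELECT id, count_likes FROM posts
            WHERE count_likes < ? OR (count_likes = ? AND id < ?)
            ORDER BY count_likes DESC, id DESC
            LIMIT ?
        """
        params = (last_likes, last_likes, last_id, page_size)
    return conn.execute(sql, params).fetchall()


page1 = next_page(conn, 2)
page2 = next_page(conn, 2, last=(page1[-1][1], page1[-1][0]))
page3 = next_page(conn, 2, last=(page2[-1][1], page2[-1][0]))
print(page1, page2, page3)
```

This keeps the WHERE clause the same size no matter how many rows were already loaded. It does not fully solve the concurrency concern: a row whose count_likes changes between requests can still move ahead of or behind the cursor position.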

How to select two MySQL rows and then compare a column and return an output

I've a table with a structure something like this,
Device | paid | time
abc 1 2 days ago
abc 0 1 day ago
abc 0 5 mins ago
Is it possible to write a query that checks the paid column on all the rows where Device = abc and then outputs the most recent two rows that differ? Basically, something like an if statement saying: if row 1 = 1 and row 2 = 0, output that, but only if they are the most recent two rows that are different. For example, in this case, the first and second row. The table is updated whenever a user changes from a free to a paid account, etc. It is also updated in different columns for different reasons, hence the duplicate 0s.
I know this would probably be done better by having another table altogether and updating that every time the user switches account type, but is there any way to make this work?
Thanks
Example:
http://rextester.com/MABU7860 This needs further testing on edge cases, but it seems to work.
SELECT A.*, B.*
FROM SQLfoo A
INNER JOIN SQLfoo B
   on A.Device = B.Device
  and A.mTime < B.mTime
WHERE A.Paid <> B.Paid
  and A.Device = 'abc'
ORDER BY B.mTime DESC, A.mTime DESC
LIMIT 1
The self join pairs rows for the same device where the time in A is earlier than the time in B, so a row never matches itself and each pair appears only once. Ordering by those times descending puts the most recent rows first, and because the query is limited to a single device there is no need to order by device. It then only remains to compare paid in A with paid in B and return the first result encountered, hence LIMIT 1.
Or using user variables
http://rextester.com/TWVEVX7830
In other engines one might accomplish this task by performing the join as above, assigning a row number partitioned by the device, and then simply returning the rows whose row number is 1, which would be the first discrepancy in the chosen ordering.
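Since the answer notes that edge cases need testing, the sketch below runs the self join on the three sample rows and compares it with a variant that only pairs each row with the row immediately before it. It uses Python's built-in sqlite3 module in place of MySQL; the integer mTime values 1, 2, 3 (standing for "2 days ago", "1 day ago", "5 mins ago") and the adjacent-row variant are assumptions of this example, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SQLfoo (device TEXT, paid INTEGER, mTime INTEGER)")
# mTime 1, 2, 3 are hypothetical stand-ins for the three timestamps.
conn.executemany(
    "INSERT INTO SQLfoo VALUES (?, ?, ?)",
    [("abc", 1, 1), ("abc", 0, 2), ("abc", 0, 3)],
)

# The self join from the answer: any earlier row A with a different paid value.
SELF_JOIN_SQL = """
    SELECT a.mTime, a.paid, b.mTime, b.paid
    FROM SQLfoo AS a
    JOIN SQLfoo AS b
      ON a.device = b.device AND a.mTime < b.mTime
    WHERE a.paid <> b.paid AND a.device = ?
    ORDER BY b.mTime DESC, a.mTime DESC
    LIMIT 1
"""

# Variant (assumption): A must be the row immediately preceding B.
ADJACENT_SQL = """
    SELECT a.mTime, a.paid, b.mTime, b.paid
    FROM SQLfoo AS a
    JOIN SQLfoo AS b
      ON a.device = b.device
    WHERE a.device = ?
      AND a.paid <> b.paid
      AND a.mTime = (SELECT MAX(p.mTime)
                     FROM SQLfoo AS p
                     WHERE p.device = b.device AND p.mTime < b.mTime)
    ORDER BY b.mTime DESC
    LIMIT 1
"""

self_join_pair = conn.execute(SELF_JOIN_SQL, ("abc",)).fetchone()
adjacent_pair = conn.execute(ADJACENT_SQL, ("abc",)).fetchone()
print(self_join_pair)  # pairs the first row with the latest row
print(adjacent_pair)   # pairs the first row with the second row
```

On this data the plain self join pairs the first row with the third, while the adjacent-row variant returns the first and second rows, which is the pair the question describes.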
Use LIMIT to limit the number of records in MySQL:
http://www.mysqltutorial.org/mysql-limit.aspx
In your case, use LIMIT 2,
then put the 2 records you just selected into an array and compare the values in the array. If they are different, print them.

SQL group by and order by issue

Let's say I have a table - tasks - with the following data:
task user when_added
---------------------------
run 1 2012-08-09
walk 2 2012-08-07
bike 2 2012-08-07
car 1 2012-08-06
run 2 2012-08-06
car 1 2012-08-05
bike 1 2012-08-04
run 1 2012-08-04
As you can see the task is repetitive.
The question is, when I show the data, e.g.
select * from tasks group by task order by when_added desc
How does the group by affect the results? Does 'group by' group them in any particular order, and can I control it?
The reason I ask is that I have a large table whose data I show as above. If I drop the group by and just show results in date order, I get some results which do not show with the group by. That means the task has been done before, but it seems to be grouping by the oldest date, and I want the newest date at the top of the pile.
Hope this makes sense... is it possible to affect the group by order?
Is that what you want?
select task, group_concat(user) as users, max(when_added) as last_added
from tasks
group by task
order by last_added desc
GROUP BY is a clause that works together with aggregate functions. With ONLY_FULL_GROUP_BY disabled, MySQL lets you select non-aggregated columns anyway, but you should not do that.
If you group by a column, the results will be distinct for that column and all other data will be grouped around it. So there might be multiple rows where task is run, for instance. Just selecting other columns will return a value from an arbitrary row in the group. You should pick a specific result from that group, like max, min or sum, or concatenate the values.
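A small sketch of the "pick a specific result from the group" advice, run on the sample tasks data with Python's built-in sqlite3 module in place of MySQL (an assumption so the example runs without a server). The aliases last_added and times_done are hypothetical names chosen for this example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (task TEXT, user INTEGER, when_added TEXT)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [
        ("run", 1, "2012-08-09"),
        ("walk", 2, "2012-08-07"),
        ("bike", 2, "2012-08-07"),
        ("car", 1, "2012-08-06"),
        ("run", 2, "2012-08-06"),
        ("car", 1, "2012-08-05"),
        ("bike", 1, "2012-08-04"),
        ("run", 1, "2012-08-04"),
    ],
)

# One row per task, carrying the newest date explicitly via MAX(),
# and ordered by that aggregate instead of by a non-aggregated column.
LATEST_PER_TASK_SQL = """
    SELECT task, MAX(when_added) AS last_added, COUNT(*) AS times_done
    FROM tasks
    GROUP BY task
    ORDER BY last_added DESC, task
"""

latest = conn.execute(LATEST_PER_TASK_SQL).fetchall()
for row in latest:
    print(row)
```

Ordering by the aggregate makes "newest date at the top" explicit, rather than relying on whichever row the engine happens to pick for each group.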

Fetching multiple records from MAX mysql keyword?

I have a single table named offers that holds multiple offers; each time we pull in an offer, we create a new row. For example, for travelling to Timbuktu there can be 10 or more rows, each containing an offer. Each time an offer comes in, it is saved with a PHP Unix timestamp in the column 'created_on'. To figure out which offer is the latest, I am currently using the following query:
SELECT * FROM offers WHERE city= 'Timbuktu' AND created_on=(SELECT max(created_on)from offers WHERE city = 'Timbuktu')
This serves the purpose if I only need the single latest row. If, say, I want to fetch the last 4 or 8 rows with the greatest timestamps, how can I do that in the most efficient way?
SELECT *
FROM offers
WHERE city = 'Timbuktu'
ORDER BY created_on DESC
LIMIT 0, 8
For 1 row you can use the same query; just replace 8 with 1.
SELECT * FROM offers WHERE city='Timbuktu' ORDER BY created_on DESC LIMIT 4;
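The ORDER BY ... DESC LIMIT n pattern from both answers can be wrapped in a small parameterized helper. The sketch below uses Python's built-in sqlite3 module in place of MySQL; the details column, the index name idx_city_created, the sample rows, and the helper latest_offers are all hypothetical and only serve this example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offers (city TEXT, created_on INTEGER, details TEXT)")
# An index on (city, created_on) lets the engine read the newest rows directly.
conn.execute("CREATE INDEX idx_city_created ON offers (city, created_on)")
conn.executemany(
    "INSERT INTO offers VALUES (?, ?, ?)",
    [
        ("Timbuktu", 100, "offer A"),
        ("Timbuktu", 200, "offer B"),
        ("Timbuktu", 300, "offer C"),
        ("Timbuktu", 400, "offer D"),
        ("Timbuktu", 500, "offer E"),
        ("Paris", 600, "offer F"),
    ],
)


def latest_offers(conn, city, n):
    """Return the n most recent offers for a city, newest first."""
    sql = """
        SELECT city, created_on, details
        FROM offers
        WHERE city = ?
        ORDER BY created_on DESC
        LIMIT ?
    """
    return conn.execute(sql, (city, n)).fetchall()


for row in latest_offers(conn, "Timbuktu", 4):
    print(row)
```

Unlike the created_on = (SELECT max(created_on) ...) form, LIMIT 1 returns exactly one row even when several offers share the greatest timestamp.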