Delete all rows except first N from a table having single column - mysql

I need a single query. Delete all rows from the table except the top N rows. The table has only one column. Like,
|friends_name|
==============
| Arunji |
| Roshit |
| Misbahu |
| etc... |
This column may contain repeated names as well.
Contains repeated names
Only one column.

If you can order your records by friends_name, and if there are no duplicates, you could use this:
DELETE FROM names
WHERE
friends_name NOT IN (
SELECT * FROM (
SELECT friends_name
FROM names
ORDER BY friends_name
LIMIT 10) s
)
Please see fiddle here.
Or you can use this:
DELETE FROM names ORDER BY friends_name DESC
LIMIT total_records-10
where total_records is (SELECT COUNT(*) FROM names), but you have to do this by code, you can't put a count in the LIMIT clause of your query.

If you don't have an id field, I suppose you can use alphabetical order.
MYSQL
DELETE FROM friends
WHERE friends_name
NOT IN (
SELECT * FROM (
SELECT friends_name
FROM friends
ORDER BY friends_name ASC
LIMIT 10) r
)
This deletes all rows except the first 10 in alphabetical order.

I just wanted to follow up on this relatively old question because the existing answers don't capture the requirement and/or are incorrect. The question states the names can be repeated, but only the top N must be preserved. Other answers will delete incorrect rows and/or incorrect number of them.
For example, if we have this table:
|friends_name|
==============
| Arunji |
| Roshit |
| Misbahu |
| Misbahu |
| Roshit |
| Misbahu |
| Rohan |
And we want to delete all but top 3 rows (N = 3), the expected result would be:
|friends_name|
==============
| Arunji |
| Roshit |
| Misbahu |
The DELETE statement from the currently selected answer will result in:
|friends_name|
==============
| Arunji |
| Misbahu |
| Misbahu |
| Misbahu |
See this sqlfiddle. The reason for this is that it first sorts names alphabetically, then takes top 3, then deletes all that don't equal that. But since they are sorted by name they may not be the top 3 we want, and there's no guarantee that we'll end up with only 3.
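The effect described above can be reproduced with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the helper name delete_all_but_sorted_top is hypothetical. SQLite does not need the extra derived-table wrapper that MySQL requires, but the row selection logic is the same.

```python
import sqlite3


def delete_all_but_sorted_top(names, n):
    """Apply the name-based NOT IN delete and return the surviving rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (friends_name TEXT)")
    conn.executemany("INSERT INTO names VALUES (?)", [(x,) for x in names])
    # Keep every row whose name appears among the first n names
    # in alphabetical order; delete everything else.
    conn.execute(
        """
        DELETE FROM names
        WHERE friends_name NOT IN (
            SELECT friends_name FROM names ORDER BY friends_name LIMIT ?
        )
        """,
        (n,),
    )
    rows = conn.execute("SELECT friends_name FROM names ORDER BY rowid")
    return [r[0] for r in rows]


if __name__ == "__main__":
    sample = ["Arunji", "Roshit", "Misbahu", "Misbahu",
              "Roshit", "Misbahu", "Rohan"]
    print(delete_all_but_sorted_top(sample, 3))
```

With the seven sample rows and N = 3, four rows survive instead of three, because every copy of a kept name is kept.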
In the absence of unique indexes and other fields to determine what "top N" means, we go by the order returned by the database. We could be tempted to do something like this (substitute 99999 with any sufficiently high number):
DELETE FROM names LIMIT 99999 OFFSET 3
But according to MySQL docs, while the DELETE supports the LIMIT clause, it does not support OFFSET. So, doing this in a single query, as requested, does not seem to be possible; we must perform the steps manually.
Solution 1 - temporary table to hold top 3
CREATE TEMPORARY TABLE temp_names LIKE names;
INSERT INTO temp_names SELECT * FROM names LIMIT 3;
DELETE FROM names;
INSERT INTO names SELECT * FROM temp_names;
Here's the sqlfiddle for reference.
Solution 2 - new table with rename
CREATE TABLE new_names LIKE names;
INSERT INTO new_names SELECT * FROM names LIMIT 3;
RENAME TABLE names TO old_names, new_names TO names;
DROP TABLE old_names;
Here's the sqlfiddle for this one.
In either case, we end up with top 3 rows in our original table:
|friends_name|
==============
| Arunji |
| Roshit |
| Misbahu |
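As a rough illustration of Solution 1, here is a sketch using Python's built-in sqlite3 module instead of MySQL (an assumption, chosen only so the example runs without a server). SQLite has no CREATE TABLE ... LIKE, so the temporary table is declared explicitly, and "top N" is pinned to insertion order with ORDER BY rowid, which is SQLite-specific. The helper names make_names_table and keep_first_n are hypothetical.

```python
import sqlite3


def make_names_table(names):
    """Create an in-memory `names` table holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (friends_name TEXT)")
    conn.executemany("INSERT INTO names VALUES (?)", [(x,) for x in names])
    conn.commit()
    return conn


def keep_first_n(conn, n):
    """Keep only the first n rows of `names`, by insertion order."""
    conn.execute("DROP TABLE IF EXISTS temp_names")
    conn.execute("CREATE TEMP TABLE temp_names (friends_name TEXT)")
    # Step 1: copy the rows we want to keep.
    conn.execute(
        "INSERT INTO temp_names "
        "SELECT friends_name FROM names ORDER BY rowid LIMIT ?",
        (n,),
    )
    # Step 2: empty the original table.
    conn.execute("DELETE FROM names")
    # Step 3: copy the kept rows back in their original order.
    conn.execute(
        "INSERT INTO names SELECT friends_name FROM temp_names ORDER BY rowid"
    )
    conn.execute("DROP TABLE temp_names")
    conn.commit()


if __name__ == "__main__":
    conn = make_names_table(["Arunji", "Roshit", "Misbahu", "Misbahu",
                             "Roshit", "Misbahu", "Rohan"])
    keep_first_n(conn, 3)
    print([r[0] for r in conn.execute("SELECT friends_name FROM names")])
```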

Related

MySQL/MariaDB find one or more numbers in list, matching lottery numbers with past results

I have a MariaDB table with an archive of past lottery results, imagine EuroMillions or Powerball lotteries.
For example, on EuroMillions the numbers go from 1 to 50 and the extra balls from 1 to 12; each result is 5 numbers from the main pool and 2 from the extra pool. So my historic results table could look like this:
Lottery Results table
(other columns like id, date, draw number, etc) | main_numbers | extra_numbers | (timestamp columns)
... | 1,2,3,4,5 | 1,2 | ...
... | 3,12,34,35,45 | 5,11 | ...
... | 4,15,34,39,45 | 10,11 | ...
... | 7,11,25,28,44 | 10,12 | ...
(you get the idea, I have thousands of records...)
So I could select main_numbers and get result "3,12,34,35,45" for that second example row. And for the extra_numbers I would get "5,11".
What I want is, given a set of main and extra numbers, to see whether they match any of my results, finding any number of matching numbers (numbered lottery balls).
So for example if I SELECT to find main_numbers "5,9,22,34,45" with extra_numbers "2,11" I would get (from my extracted example) two records:
... | 3,12,34,35,45 | 5,11 | ...
... | 4,15,34,39,45 | 10,11 | ...
Matching two main numbers and one extra number, in this case finding lottery prizes in the results table. Makes sense?
I'm using MariaDB and I'm a bit lost on how to proceed. I tried WHERE IN, FIND_IN_SET, etc.
Is there a way to perform a SELECT to find results in only one statement or do I have to pick all records and then iterate elsewhere, php for example?
My aim would be to have it in one statement, so I could just send the numbers and get the matching records... Possible?
I hope this makes sense.
Many thanks for your answers.
Consider the following.
For simplicity, let's say that a lottery comprises 3 main balls, and two bonus balls:
DROP TABLE IF EXISTS lottery_results;
CREATE TABLE lottery_results
(draw_id INT NOT NULL
,ball_no INT NOT NULL
,ball_val INT NOT NULL
,PRIMARY KEY(draw_id,ball_no)
);
INSERT INTO lottery_results VALUES
(1,1,22),
(1,2,35),
(1,3,62),
(1,4,27),
(1,5,17),
(2,1,18),
(2,2,33),
(2,3,49),
(2,4, 4),
(2,5,35);
And we want to find all results where 34, 35, or 36 were drawn as a main number...
SELECT draw_id
FROM lottery_results
WHERE ball_no <=3
AND ball_val IN(34,35,36);
+---------+
| draw_id |
+---------+
| 1 |
+---------+
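The one-row-per-ball layout above also makes it possible to count hits per draw in a single statement, which is what the question asks for. The following is only a sketch under assumptions: it runs against SQLite through Python's sqlite3 module rather than MariaDB, the function name count_hits and the sample ticket are made up, and it relies on comparisons evaluating to 0 or 1 inside SUM (which MySQL and MariaDB also do).

```python
import sqlite3


def build_lottery_db():
    """Load the sample draws: balls 1-3 are main balls, 4-5 are bonus balls."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lottery_results ("
        " draw_id INT NOT NULL, ball_no INT NOT NULL, ball_val INT NOT NULL,"
        " PRIMARY KEY (draw_id, ball_no))"
    )
    conn.executemany(
        "INSERT INTO lottery_results VALUES (?, ?, ?)",
        [(1, 1, 22), (1, 2, 35), (1, 3, 62), (1, 4, 27), (1, 5, 17),
         (2, 1, 18), (2, 2, 33), (2, 3, 49), (2, 4, 4), (2, 5, 35)],
    )
    return conn


def count_hits(conn, main_picks, extra_picks, main_balls=3):
    """Return (draw_id, main_hits, extra_hits) for every draw with a hit."""
    main_marks = ",".join("?" * len(main_picks))
    extra_marks = ",".join("?" * len(extra_picks))
    sql = f"""
        SELECT draw_id, main_hits, extra_hits
        FROM (
            SELECT draw_id,
                   SUM(ball_no <= ? AND ball_val IN ({main_marks})) AS main_hits,
                   SUM(ball_no >  ? AND ball_val IN ({extra_marks})) AS extra_hits
            FROM lottery_results
            GROUP BY draw_id
        )
        WHERE main_hits + extra_hits > 0
        ORDER BY main_hits DESC, extra_hits DESC, draw_id
    """
    params = [main_balls, *main_picks, main_balls, *extra_picks]
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = build_lottery_db()
    # Hypothetical ticket: main numbers 22, 35, 49 and bonus numbers 4, 17.
    print(count_hits(conn, [22, 35, 49], [4, 17]))
```

Filtering on main_hits and extra_hits in the outer query is where a prize rule (for example "2 or more main numbers") would go.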
Thanks Strawberry,
I found a solution if I have all numbers in distinct columns, but is there a way to do it when they are stored as CSV in a single column?
If I split the CSV into separate columns for the numbers (n_1...n_5) and for the stars (s_1, s_2), I can look for matches across those columns.
This is the multi-column version:
To find matches for numbers 1,2,3,4,5 with stars 1,2...
In EuroMillions you get a prize with 2 or more numbers and any star (one or two).
SELECT
main_numbers, extra_numbers,
((n_1 IN (1,2,3,4,5)) +
(n_2 IN (1,2,3,4,5)) +
(n_3 IN (1,2,3,4,5)) +
(n_4 IN (1,2,3,4,5)) +
(n_5 IN (1,2,3,4,5))) AS matched_numbers,
((s_1 IN (1,2)) +
(s_2 IN (1,2))) AS matched_stars,
created_at
FROM `lottery_results_archive`
HAVING matched_numbers >= 3 OR matched_numbers = 2 AND matched_stars > 0
ORDER BY matched_numbers DESC, matched_stars DESC, created_at DESC
Makes sense?
Thanks.

get one record at a time from joint table

I want to get one record at a time from joined tables, but I would rather not join the tables in full each time.
The actual tables are as follows.
table contents -- stores content information.
+----+----------+----------+----------+-------------------+
| id | name |status |priority |last_registered_day|
+----+----------+----------+----------+-------------------+
| 1 | content_1|0 |1 |2020/10/10 11:20:20|
| 2 | content_2|2 |1 |2020/10/10 11:21:20|
| 3 | content_3|2 |2 |2020/10/10 11:22:20|
+----+----------+----------+----------+-------------------+
table clusters -- stores cluster information
+----+----------+
| id | name |
+----+----------+
| 1 | cluster_1|
| 2 | cluster_2|
+----+----------+
table content_cluster -- each record indicates that one content is on one cluster
+----------+----------+-------------------+
|content_id|cluster_id| last_update_date|
+----------+----------+-------------------+
| 1 | 1 |2020-10-01T11:30:00|
| 2 | 2 |2020-10-01T11:30:00|
| 3 | 1 |2020-10-01T10:30:00|
| 3 | 2 |2020-10-01T10:30:00|
+----------+----------+-------------------+
By specifying a cluster_id, I want to get one content name at a time where contents.status = 2 and the (content, cluster_id) pair is in content_cluster. The SQL query is something like the following.
SELECT contents.name
FROM contents
JOIN content_cluster
ON contents.id = content_cluster.content_id
where contents.status = 2
AND content_cluster.cluster_id = <cluster_id>
ORDER
BY contents.priority
, contents.last_registered_day
, contents.name
LIMIT 1;
However, I don't want the tables to be joined as a whole every time as I have to do it frequently and the tables are large. Is there any efficient way to do this? I can add some indices to the tables. What should I do?
I would try writing the query like this:
SELECT c.name
FROM contents c
WHERE EXISTS (SELECT 1
FROM content_cluster cc
WHERE cc.content_id = c.id AND
cc.cluster_id = <cluster_id>
) AND
c.status = 2
ORDER BY c.priority, c.last_registered_day, c.name
LIMIT 1;
Then create the following indexes:
contents(status, priority, last_registered_day, name, id)
content_cluster(content_id, cluster_id).
The goal is for the execution plan to scan the index on contents and, for each row, look up whether there is a match in content_cluster. The query stops at the first match.
I can't guarantee that this will generate that plan (avoiding the sort), but it is worth a try.
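To see the EXISTS form return the expected row on the question's sample data, here is a small sketch. It uses SQLite via Python's sqlite3 module as a stand-in for MySQL (an assumption, so it says nothing about MySQL's actual execution plan), joins on contents.id as shown in the question's table, and the index names and the next_content helper are hypothetical.

```python
import sqlite3


def build_db():
    """Create the three sample tables' relevant parts plus the two indexes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE contents (
            id INTEGER PRIMARY KEY, name TEXT, status INTEGER,
            priority INTEGER, last_registered_day TEXT
        );
        CREATE TABLE content_cluster (
            content_id INTEGER, cluster_id INTEGER, last_update_date TEXT
        );
        INSERT INTO contents VALUES
            (1, 'content_1', 0, 1, '2020/10/10 11:20:20'),
            (2, 'content_2', 2, 1, '2020/10/10 11:21:20'),
            (3, 'content_3', 2, 2, '2020/10/10 11:22:20');
        INSERT INTO content_cluster VALUES
            (1, 1, '2020-10-01T11:30:00'),
            (2, 2, '2020-10-01T11:30:00'),
            (3, 1, '2020-10-01T10:30:00'),
            (3, 2, '2020-10-01T10:30:00');
        -- Index matching the filter column followed by the ORDER BY columns.
        CREATE INDEX idx_contents_pick
            ON contents (status, priority, last_registered_day, name);
        -- Index for the per-row existence check.
        CREATE INDEX idx_cc_lookup
            ON content_cluster (content_id, cluster_id);
    """)
    return conn


def next_content(conn, cluster_id):
    """Return the name of the first status=2 content on the cluster, or None."""
    row = conn.execute(
        """
        SELECT c.name
        FROM contents c
        WHERE c.status = 2
          AND EXISTS (SELECT 1
                      FROM content_cluster cc
                      WHERE cc.content_id = c.id
                        AND cc.cluster_id = ?)
        ORDER BY c.priority, c.last_registered_day, c.name
        LIMIT 1
        """,
        (cluster_id,),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_db()
    print(next_content(conn, 1), next_content(conn, 2))
```

On a real MySQL server, EXPLAIN on the same query is the way to check whether the sort is actually avoided.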
This query can easily be optimized by applying the correct indexes. Apply the ALTER statements below and let me know whether performance has improved considerably:
alter table contents
add index idx_1 (id),
add index idx_2(status);
alter table content_cluster
add index idx_1 (content_id),
add index idx_2(cluster_id);
If a content can be in multiple clusters and the number of clusters can change, I think that doing a join like this is the best solution.
You could try splitting your contents table into different tables each containing the contents of a specific cluster, but it would need to be updated frequently.

NOT IN subquery gives 0 results

I'm not a mysqlologist, but I have to deal with the following problem.
Given the following table:
+-------+-----------+-------------+------+
| id | articleID | img | main |
+-------+-----------+-------------+------+
| 48350 | 4325 | scr426872xa | 1 |
| 48351 | 4325 | scr426872ih | 2 |
| 48352 | 4325 | scr426872jk | 2 |
| 48353 | 4326 | scr426882vs | 1 |
| 48354 | 4326 | scr426882ss | 2 |
| 48355 | 4326 | scr426882nf | 2 |
+-------+-----------+-------------+------+
Each set of images for one distinct articleID should have one image set as main=1 and an unspecified number of images with a main value of 2.
Due to processing issues it can happen that no main=1 image is set for an article, and I need to find the articleIDs where images with main=2 exist, but none with main=1.
Explaining it backwards makes it easier to formulate my thinking process for the query. My idea was to create a result set (subquery) by querying the table for articleID where main is "1", then use that result to check which distinct articleIDs of a query where main=2 are not in the results of the aforementioned (sub-)query. Basically "subtracting" all matching articleID lines.
This should basically give the leftover of all main=2 lines which have no line with the same articleID where main=1:
SELECT DISTINCT articleID
FROM img_table WHERE main = 2
AND articleID
NOT IN (SELECT articleID FROM img_table WHERE main = 1 );
I get no result even though I know for a fact that there are some. There is surely something I'm doing wrong. I hope my problem is explained in a way that makes sense to others, not only to me :)
Given your problem description, it looks like you're actually looking for NOT EXISTS to check for rows that don't have a matching row in the subselect. Note that you do have to add the article id to the where clause in the subselect:
SELECT DISTINCT articleID
FROM img_table t1
WHERE main = 2
AND NOT EXISTS
(SELECT articleID
FROM img_table t2
WHERE main = 1
AND t2.articleID = t1.articleID);
I think your current solution should work too, but maybe you didn't show all the data. For the data you specified, the query would indeed return 0 rows, because all articleIDs have at least one main=1 and a main=2 image.
One important thing to remember: the subquery must not return any NULL value, otherwise NOT IN won't work properly. So if articleID is nullable, make sure your subselect looks like this:
(SELECT articleID FROM img_table WHERE main = 1 and articleID IS NOT NULL)
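The NULL pitfall can be reproduced with a short sketch. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption; both follow the same three-valued logic for NOT IN), and the sample rows, the QUERIES dictionary and the article_ids helper are made up for illustration.

```python
import sqlite3

QUERIES = {
    # Original form: breaks as soon as the subquery returns a NULL.
    "not_in": """
        SELECT DISTINCT articleID FROM img_table
        WHERE main = 2
          AND articleID NOT IN (SELECT articleID FROM img_table WHERE main = 1)
    """,
    # Same query with NULLs filtered out of the subquery.
    "not_in_filtered": """
        SELECT DISTINCT articleID FROM img_table
        WHERE main = 2
          AND articleID NOT IN (SELECT articleID FROM img_table
                                WHERE main = 1 AND articleID IS NOT NULL)
    """,
    # NOT EXISTS is not affected by NULLs in the inner rows.
    "not_exists": """
        SELECT DISTINCT articleID FROM img_table t1
        WHERE main = 2
          AND NOT EXISTS (SELECT 1 FROM img_table t2
                          WHERE t2.main = 1 AND t2.articleID = t1.articleID)
    """,
}


def build_img_table():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE img_table "
        "(id INTEGER, articleID INTEGER, img TEXT, main INTEGER)"
    )
    conn.executemany("INSERT INTO img_table VALUES (?, ?, ?, ?)", [
        (1, 4325, "a", 1),
        (2, 4325, "b", 2),
        (3, 4327, "c", 2),   # article with no main=1 image
        (4, None, "d", 1),   # main=1 row whose articleID is NULL
    ])
    return conn


def article_ids(conn, variant):
    """Run one of the three query variants and return the articleIDs."""
    return [r[0] for r in conn.execute(QUERIES[variant])]


if __name__ == "__main__":
    conn = build_img_table()
    for variant in QUERIES:
        print(variant, article_ids(conn, variant))
```

The single NULL articleID among the main=1 rows is enough to make the plain NOT IN version return nothing, while the other two variants still find article 4327.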
I didn't find any issue in your query. Please add some data where an article ID has only main=2 rows. In your sample data, every article ID has both main=1 and main=2 rows, which is why you are not getting any result.

How does SELECT DISTINCT work in MySQL?

I have a table with multiple rows that contain the same data. I used SELECT DISTINCT to get unique rows and it works fine. But when I use ORDER BY with SELECT DISTINCT, it gives me unsorted data.
Can anyone tell me how distinct works?
Based on what criteria it selects the row?
From your comment earlier, the query you are trying to run is
Select distinct id from table where id2 =12312 order by time desc.
As I expected, here is your problem: your select column and your ORDER BY column are different. Your output rows are ordered by time, but that order does not necessarily carry over to the id column. Here is an example.
id | id2 | time
-------------------
1 | 12312 | 34
2 | 12312 | 12
3 | 12312 | 48
If you run
SELECT * FROM table WHERE id2=12312 ORDER BY time DESC
you will get the following result
id | id2 | time
-------------------
3 | 12312 | 48
1 | 12312 | 34
2 | 12312 | 12
Now if you select only the id column from this, you will get
id
--
3
1
2
This is why your results look unsorted: the rows are ordered by time, not by id.
When you specify SELECT DISTINCT it will give you all the rows, eliminating duplicates from the result set. By "duplicates" I mean rows where all fields have the same values. For example, say you have a table that looks like:
id | num
--------------
1 | 1
2 | 3
3 | 3
SELECT DISTINCT * would return all rows above, whereas SELECT DISTINCT num would return two rows:
num
-----
1
3
Note that which actual row it selects (e.g. whether it's row 2 or row 3) is irrelevant, as the result would be indistinguishable.
Finally, DISTINCT should not affect how ORDER BY works.
Reference: MySQL SELECT statement
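The behaviour you describe happens when you ORDER BY an expression that is not present in the SELECT clause. The SQL standard does not allow such a query, but MySQL has historically been less strict and allows it (MySQL 5.7.5 and later reject it when the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default).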
Let's try an example:
SELECT DISTINCT column1, column2
FROM table1
WHERE ...
ORDER BY column3
Let's say the content of the table table1 is:
id | column1 | column2 | column3
----+---------+---------+---------
1 | A | B | 1
2 | A | B | 5
3 | X | Y | 3
Without the ORDER BY clause, the above query returns following two records (without ORDER BY the order is not guaranteed):
column1 | column2
---------+---------
A | B
X | Y
But with ORDER BY column3 the order is also not guaranteed.
The DISTINCT clause operates on the values of the expressions present in the SELECT clause. If row #1 is processed first then (A, B) is placed in the result set and it is associated with row #1. Then, when row #2 is processed, the values of the SELECT expressions produce the record (A, B) that is already in the result set. Because of DISTINCT it is dropped. Row #3 produces (X, Y) that is also put in the result set. Then, the ORDER BY column3 clause makes the records be sorted in the result set as (A, B), (X, Y).
But if row #2 is processed before row #1 then, following the same logic exposed in the previous paragraph, the records in the result set are sorted as (X, Y), (A, B).
There is no rule imposed on the database engine about the order in which it processes the rows when it runs a query. The database is free to process the rows in whatever order it considers best for performance.
Your query is invalid SQL and the fact that it can return different results using the same input data proves it.
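One way to make the ordering well defined is to state which column3 value should represent each (column1, column2) group, using GROUP BY with an aggregate in ORDER BY. The sketch below is an illustration under assumptions rather than part of the answer above: it runs on SQLite via Python's sqlite3 module, and the function name ordered_pairs is hypothetical.

```python
import sqlite3


def ordered_pairs(aggregate):
    """Return distinct (column1, column2) pairs ordered by MIN or MAX of column3."""
    if aggregate not in ("MIN", "MAX"):
        raise ValueError("aggregate must be 'MIN' or 'MAX'")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 "
        "(id INTEGER, column1 TEXT, column2 TEXT, column3 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?, ?)",
        [(1, "A", "B", 1), (2, "A", "B", 5), (3, "X", "Y", 3)],
    )
    # GROUP BY removes duplicates like DISTINCT does, and the aggregate
    # says explicitly which column3 value each group is sorted by.
    sql = (
        "SELECT column1, column2 FROM table1 "
        "GROUP BY column1, column2 "
        f"ORDER BY {aggregate}(column3)"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(ordered_pairs("MIN"))
    print(ordered_pairs("MAX"))
```

The two calls return the pairs in opposite orders, which is exactly the ambiguity that the DISTINCT ... ORDER BY column3 query leaves to the engine.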

Reorder rows in a MySQL table

I have a table:
+--------+-------------------+-----------+
| ID | Name | Order |
+--------+-------------------+-----------+
| 1 | John | 1 |
| 2 | Mike | 3 |
| 3 | Daniel | 4 |
| 4 | Lisa | 2 |
| 5 | Joe | 5 |
+--------+-------------------+-----------+
The order can be changed by an admin, hence the Order column. On the admin side I have a form with an "Insert After:" select box for adding entries to the database. What query should I use to increment the order (order+1) of the rows after the inserted one?
I want to do this in a such way that keeps server load to a minimum because this table has 1200 rows at present. Is this the correct way to save an order of the table or is there a better way?
Any help appreciated
EDIT:
Here's what I want to do, thanks to itsmatt:
want to reorder row number 1 to be after row 1100, you plan to leave 2-1100 the same and then modify 1 to be 1101 and increment 1101-1200
You need to do this in steps. MySQL generally rejects a subquery that reads from the table the statement is modifying (error 1093), so read the target position into a variable first:
SET @pos = (SELECT `Order`
FROM MyTable
WHERE ID = <insert-after-id>);
Then:
UPDATE MyTable
SET `Order` = `Order` + 1
WHERE `Order` > @pos;
...which will shift the order number of every row further down the list than the person you're inserting after.
Then:
INSERT INTO MyTable (Name, `Order`)
VALUES ('<new-name>', @pos + 1);
This inserts the new row (assuming ID is auto-increment) with an order number one higher than the person you're inserting after.
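The shift-then-insert idea can be tried end to end with a small sketch. It uses SQLite through Python's sqlite3 module instead of MySQL (an assumption made so it runs anywhere), reads the target position into a Python variable before modifying the table, and the helper names build_table and insert_after plus the new name "Anna" are made up.

```python
import sqlite3


def build_table():
    """Create MyTable with the five sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE MyTable ('
        ' ID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT, "Order" INTEGER)'
    )
    conn.executemany(
        'INSERT INTO MyTable (Name, "Order") VALUES (?, ?)',
        [("John", 1), ("Mike", 3), ("Daniel", 4), ("Lisa", 2), ("Joe", 5)],
    )
    conn.commit()
    return conn


def insert_after(conn, after_id, name):
    """Insert `name` directly after the row with ID `after_id`."""
    row = conn.execute(
        'SELECT "Order" FROM MyTable WHERE ID = ?', (after_id,)
    ).fetchone()
    if row is None:
        raise ValueError(f"no row with ID {after_id}")
    pos = row[0]
    with conn:  # commit both statements together, roll back on error
        # Step 1: open a gap by shifting every later row down by one.
        conn.execute(
            'UPDATE MyTable SET "Order" = "Order" + 1 WHERE "Order" > ?',
            (pos,),
        )
        # Step 2: put the new row into the gap.
        conn.execute(
            'INSERT INTO MyTable (Name, "Order") VALUES (?, ?)',
            (name, pos + 1),
        )


if __name__ == "__main__":
    conn = build_table()
    insert_after(conn, 4, "Anna")  # ID 4 is Lisa, who has Order 2
    print([r[0] for r in
           conn.execute('SELECT Name FROM MyTable ORDER BY "Order"')])
```

Running both statements in one transaction keeps other readers from seeing the table with a gap or a duplicate order number.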
Just add the new row in any normal way and let a later SELECT use ORDER BY to sort. 1200 rows is infinitesimally small by MySQL standards. You really don't have to (and don't want to) keep the physical table sorted. Instead, use keys and indexes to access the table in a way that will give you what you want.
You can use INSERT ... SELECT:
insert into tablename (name, `order`)
select 'new_name', `order` + 1 from tablename where name = 'name';
You can also use id = id_val in your inner select.
Hopefully this is what you're after, the question isn't altogether clear.