How to get count of matching strings in a large table - mysql

I have a table with following structure:
+-----+-------------------+
| ID | Name |
+-----+-------------------+
| 1 | abc |
+-----+-------------------+
| 2 | abc (duplicate) |
+-----+-------------------+
| 3 | bcd |
+-----+-------------------+
| 4 | bcd (duplicate) |
+-----+-------------------+
| 5 | bcd (duplicate) |
+-----+-------------------+
| 6 | efg |
+-----+-------------------+
| 7 | hij |
+-----+-------------------+
I need to count each Name occurrence (with the "(duplicate)" rows included), i.e.:
+-------------------+--------+
| Name | Count |
+-------------------+--------+
| abc | 2 |
+-------------------+--------+
| bcd | 3 |
+-------------------+--------+
| efg | 1 |
+-------------------+--------+
| hij | 1 |
+-------------------+--------+
Note that the Name column is of type TINYTEXT, and the table will hold a lot of rows: 5396 in test mode already. I tried a self join of the table on TRIM(REPLACE(Name, '(duplicate)', '')) with grouping:
SELECT
DISTINCT TRIM(REPLACE(`t`.`Name`, '(duplicate)', '')) as `name`,
COUNT(`s`.`ID`) as `count`
FROM
`Table` as `t` INNER JOIN `Table` as `s` ON
TRIM(REPLACE(`t`.`Name`, '(duplicate)', '')) LIKE TRIM(REPLACE(`s`.`Name`, '(duplicate)', ''))
GROUP BY 1;
And... well, it took 122.62 sec (?!) and returned 4846 rows on my development machine.
Q1: Was this the correct approach?
Q2: Is there any way to make it faster?
Any help would be appreciated.

Just remove the "duplicate" text:
select replace(name, ' (duplicate)', ''), count(*)
from mytable
group by 1
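A minimal sketch of this approach on the sample data from the question, assuming an in-memory SQLite database as a stand-in for MySQL (replace() and count() behave the same way for this case). The alias base_name is introduced for the sketch and is not part of the original answer.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO mytable (id, name) VALUES (?, ?)",
    [
        (1, "abc"), (2, "abc (duplicate)"),
        (3, "bcd"), (4, "bcd (duplicate)"), (5, "bcd (duplicate)"),
        (6, "efg"), (7, "hij"),
    ],
)

# Strip the suffix once per row, then group on the cleaned value.
# No self join is needed: each row is read exactly once.
counts = conn.execute(
    """
    SELECT replace(name, ' (duplicate)', '') AS base_name, count(*) AS cnt
    FROM mytable
    GROUP BY base_name
    ORDER BY base_name
    """
).fetchall()

for base_name, cnt in counts:
    print(base_name, cnt)
```

The single pass over the table is what makes this cheaper than joining the table to itself on a computed expression.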

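This should be quicker, although with that many rows the server still has to keep track of every distinct value it is counting, and since it's a TINYTEXT field each value can be up to 255 bytes long.
select Name,count(ID) from `Table` group by Name
I see what you're saying now. Here's an updated SQL. The alias is deliberately different from the column name, because MySQL resolves GROUP BY against the table's columns before the select aliases, and DISTINCT is not needed once you group:
select TRIM(REPLACE(Name, ' (duplicate)', ''))
as clean_name, count(ID) from `Table` group by clean_name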

How to select only a specific set that includes some or all of another set in mySQL

I'm trying to extract specific rows from a mySQL table that contains lists of numbers.
I have a single table with 2 columns: id and data. Each row has a sorted, comma-separated record of numbers ranging from 1 to 1000. I want to select only the records that contain a partial or full set of specific numbers and nothing else. I've tried using LIKE and IN and also looked at FIND_IN_SET.
t1.id t1.data
1 2,9,569
2 2,9,991,979
3 9,569,763
4 52,57,569,763,892,897
5 763
6 2,9,10,15,151,569,771,801,888,973
If I'm looking for rows with one or more of the values (2,9,569,763), I don't want to have to write:
SELECT t1.id from t1
WHERE t1.data NOT IN (1,3,4,5,6,7,8,10,11,...........,1000);
to return 3 rows, t1.id = 1,3 and 5.
Is there a simpler way? Something like (in mySQL):
SELECT t1.id from t1
WHERE t1.data "only includes one or more of" (2,9,569,763);
Paul Spiegel's answer gives the correct result, but it can't be optimized with indexes, because of the use of FIND_IN_SET(). It will always do a table-scan, which will get more and more expensive the more rows you have.
You should take this as a clue that storing lists of numbers as a comma-separated list in a string column is a bad idea when you actually want to do some searches for discrete members of that list.
What you should do instead is store the list as a child table, with one member per row.
CREATE TABLE mydata (
t1id INT NOT NULL,
member INT NOT NULL,
PRIMARY KEY (t1id, member),
FOREIGN KEY (t1id) REFERENCES t1(id)
);
INSERT INTO mydata VALUES
(1,2),(1,9),(1,569),
(2,2),(2,9),(2,991),(2,979),
(3,9),(3,569),(3,763),
(4,52),(4,57),(4,569),(4,763),(4,892),(4,897),
(5,763),
(6,2),(6,9),(6,10),(6,15),(6,151),(6,569),(6,771),(6,801),(6,888),(6,973);
Now you would join your original table t1 to mydata but exclude the matches to values in your desired list.
mysql> select * from t1 left join mydata on t1.id=mydata.t1id
and mydata.member not in (2,9,569,763);
+----+------+--------+
| id | t1id | member |
+----+------+--------+
| 1 | NULL | NULL |
| 2 | 2 | 979 |
| 2 | 2 | 991 |
| 3 | NULL | NULL |
| 4 | 4 | 52 |
| 4 | 4 | 57 |
| 4 | 4 | 892 |
| 4 | 4 | 897 |
| 5 | NULL | NULL |
| 6 | 6 | 10 |
| 6 | 6 | 15 |
| 6 | 6 | 151 |
| 6 | 6 | 771 |
| 6 | 6 | 801 |
| 6 | 6 | 888 |
| 6 | 6 | 973 |
+----+------+--------+
You see there are NULLs for ids 1, 3 and 5, because those rows have no members that are NOT in your specified list. Those are the ids that you want to return.
mysql> select t1.id from t1 left join mydata on t1.id=mydata.t1id
and mydata.member not in (2,9,569,763)
where mydata.member is null;
+----+
| id |
+----+
| 1 |
| 3 |
| 5 |
+----+
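The child-table approach can be tried end to end with a small script. This is a sketch that assumes an in-memory SQLite database in place of MySQL; the table and column names follow the answer, while the Python variable names are made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (id INTEGER PRIMARY KEY);
    CREATE TABLE mydata (
        t1id   INTEGER NOT NULL REFERENCES t1(id),
        member INTEGER NOT NULL,
        PRIMARY KEY (t1id, member)
    );
    """
)

# One entry per original row: id -> list members.
lists = {
    1: [2, 9, 569],
    2: [2, 9, 991, 979],
    3: [9, 569, 763],
    4: [52, 57, 569, 763, 892, 897],
    5: [763],
    6: [2, 9, 10, 15, 151, 569, 771, 801, 888, 973],
}
conn.executemany("INSERT INTO t1 (id) VALUES (?)", [(i,) for i in lists])
conn.executemany(
    "INSERT INTO mydata (t1id, member) VALUES (?, ?)",
    [(i, m) for i, members in lists.items() for m in members],
)

# Anti-join: keep only ids that have no member outside the wanted list.
only_wanted = [
    row[0]
    for row in conn.execute(
        """
        SELECT t1.id
        FROM t1
        LEFT JOIN mydata
               ON t1.id = mydata.t1id
              AND mydata.member NOT IN (2, 9, 569, 763)
        WHERE mydata.member IS NULL
        ORDER BY t1.id
        """
    )
]
print(only_wanted)
```

Note that an id with no members at all would also pass this filter, since it has nothing outside the list either.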
Not simple, but it works on the comma-separated column as it is:
Count the hits for each wanted value and compare the total with the number of values in the data column. They must be equal.
select id, data
from t1
where (find_in_set(2, data) > 0)
+ (find_in_set(9, data) > 0)
+ (find_in_set(569, data) > 0)
+ (find_in_set(763, data) > 0)
= char_length(data) - char_length(replace(data, ',', '')) + 1
Demo: https://www.db-fiddle.com/f/oLcrz4vmXRWXqYQhCnZA5z/0
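The counting idea can be spelled out in plain Python, which may make the arithmetic easier to follow. This is only an illustration of the logic, not of MySQL itself; the function name only_wanted_values is hypothetical, and it assumes, as the query does, that a list never repeats a value.

```python
WANTED = {2, 9, 569, 763}

# Sample rows from the question: id -> comma-separated data.
rows = {
    1: "2,9,569",
    2: "2,9,991,979",
    3: "9,569,763",
    4: "52,57,569,763,892,897",
    5: "763",
    6: "2,9,10,15,151,569,771,801,888,973",
}


def only_wanted_values(data, wanted):
    """Return True when every element of the list is one of the wanted values."""
    members = data.split(",")
    # Equivalent of summing (find_in_set(v, data) > 0) over the wanted values.
    hits = sum(1 for value in wanted if str(value) in members)
    # Equivalent of char_length(data) - char_length(replace(data, ',', '')) + 1.
    element_count = data.count(",") + 1
    return hits == element_count


matching_ids = [row_id for row_id, data in rows.items() if only_wanted_values(data, WANTED)]
print(matching_ids)
```

If the number of hits equals the number of elements, no element can be outside the wanted set.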
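Try the REGEXP operator. The pattern has to be anchored at both ends and use alternation, so that every comma-separated element must be one of the wanted values (a pattern like '(2,9,569,763)' would only match that literal substring):
SELECT t1.id from t1
WHERE t1.data REGEXP '^(2|9|569|763)(,(2|9|569|763))*$';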
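A quick way to see how a pattern behaves on the sample rows is to test it outside the database. The sketch below uses Python's re module as a stand-in for MySQL's REGEXP (an assumption; the two regex dialects differ in details, though not for these simple patterns) and compares a literal pattern with an anchored alternation.

```python
import re

# Sample rows from the question: id -> comma-separated data.
rows = {
    1: "2,9,569",
    2: "2,9,991,979",
    3: "9,569,763",
    4: "52,57,569,763,892,897",
    5: "763",
    6: "2,9,10,15,151,569,771,801,888,973",
}

# Matches only the exact substring "2,9,569,763" anywhere in the data.
literal = re.compile(r"(2,9,569,763)")

# Every element must be one of the wanted values; fullmatch supplies the anchors.
anchored = re.compile(r"(2|9|569|763)(,(2|9|569|763))*")

literal_ids = [row_id for row_id, data in rows.items() if literal.search(data)]
anchored_ids = [row_id for row_id, data in rows.items() if anchored.fullmatch(data)]

print(literal_ids)
print(anchored_ids)
```

Like FIND_IN_SET, a regular expression on the column cannot use an index, so this still scans every row.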

MySQL concat columns and rows from multiple tables

I'm trying to concatenate data from three related tables according to:
orders orderrow orderrow_op
+----+ +----+----------+ +----+-------------+
| id | | id | id_order | | id | id_orderrow |
+----+ +----+----------+ +----+-------------+
| 1 | | 1 | 1 | | 1 | 1 |
| 2 | | 2 | 1 | | 2 | 1 |
| 3 | | 3 | 2 | | 3 | 2 |
+----+ | 4 | 3 | | 4 | 3 |
+----+----------+ | 5 | 3 |
| 6 | 3 |
+----+-------------+
The result I'm looking for is something like:
orderops (Desired Result)
+----------+-----------------+
| id_order | id_row:id_ops |
+----------+-----------------+
| 1 | 1:(1,2); 2:(3); |
| 2 | 3:(4,5,6) |
| 3 | 4:NULL |
+----------+-----------------+
I.e. I want the operations and rows to be displayed on one row per order. So far I've tried things like:
SELECT
db.orders.id AS orderid,
db.orderrow.id AS rowids,
GROUP_CONCAT(DISTINCT db.orderrow.id) AS a,
GROUP_CONCAT(db.orderrow.id, ':', db.orderrow_op.id) AS b
FROM
db.orders
LEFT JOIN db.orderrow ON db.orders.id = db.orderrow.id_order
LEFT JOIN db.orderrow_op ON db.orderrow.id = db.orderrow_op.id_orderrow
GROUP BY orderid
Here column 'a' gives me the row ids and column 'b' gives me the operation ids with the corresponding row id prepended. I'd like to combine the two into a single column such that related values in 'b' start with the id from 'a', which is shown only once.
I'm fairly new to MySQL, so I don't know if this is even possible, or if it's a good idea at all. The aim is to structure the data into JSON for delivery via a REST application, so perhaps it's better to deliver the rows directly to the web server and build the JSON there? I just figured that this approach might be faster.
This is not the nicest query but it's working for your example table setup.
SELECT
o.id AS id_order,
group_concat(sub.ops
SEPARATOR ' ') AS id_row_id_ops
FROM
(SELECT
orderrow.id_order,
IF(isnull(l3.ops), concat(orderrow.id, ':', 'NULL'), concat(orderrow.id, ':', l3.ops)) as ops
FROM
orderrow
LEFT JOIN (SELECT
orderrow_op.id_orderrow,
concat('(', group_concat(orderrow_op.id), '); ') as ops
FROM
orderrow_op
GROUP BY orderrow_op.id_orderrow) l3 ON l3.id_orderrow = orderrow.id) sub
LEFT JOIN
orders o ON o.id = sub.id_order
GROUP BY o.id;
One of the things to mind is the LEFT JOIN and that you need to cast a "null" value to a "null" text (otherwise your element 4 will vanish).
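Since the stated goal is JSON, the alternative raised in the question (fetch flat rows, nest them in the application) is worth a look too. This is a minimal sketch, assuming SQLite in place of MySQL and a nesting of order id to row id to list of operation ids; that output shape is an assumption, not something the question specifies.

```python
import json
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE orderrow (id INTEGER PRIMARY KEY, id_order INTEGER);
    CREATE TABLE orderrow_op (id INTEGER PRIMARY KEY, id_orderrow INTEGER);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO orderrow VALUES (1, 1), (2, 1), (3, 2), (4, 3);
    INSERT INTO orderrow_op VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3);
    """
)

# One flat row per (order, row, operation); LEFT JOINs keep rows without operations.
flat = conn.execute(
    """
    SELECT o.id, r.id, op.id
    FROM orders o
    LEFT JOIN orderrow r ON r.id_order = o.id
    LEFT JOIN orderrow_op op ON op.id_orderrow = r.id
    ORDER BY o.id, r.id, op.id
    """
).fetchall()

# Nest in the application: order id -> row id -> list of operation ids.
nested = {}
for order_id, row_id, op_id in flat:
    rows_for_order = nested.setdefault(order_id, {})
    if row_id is None:
        continue  # order without any rows
    ops = rows_for_order.setdefault(row_id, [])
    if op_id is not None:
        ops.append(op_id)

payload = json.dumps(nested)
print(payload)
```

A row without operations ends up with an empty list instead of the text "NULL", which is usually easier for a JSON consumer to handle.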

Get the count of data based on id in mysql result

I have a table like the below one
id | id_fk | data |
-------------------------
1 | 2 | data1 |
2 | 2 | data2 |
3 | 1 | data3 |
4 | 3 | data4 |
5 | 1 | data5 |
-------------------------
Here 'id' is the table's own id and id_fk is a foreign key from another table.
What I am trying to achieve is an incrementing count for each foreign key. That is, the first time id_fk = 2 occurs the count should be 1, at the next occurrence it becomes 2, and so on for every id_fk. I tried many ways, but none gives me the expected output.
From the above table, the result table will look like:
id_fk | count |
------------------
1 | 1 |
1 | 2 |
2 | 1 |
2 | 2 |
3 | 1 |
------------------
Please help me to solve this.. any help will be appreciated.
Try this
SELECT `id_fk`,
@a:=IF(id_fk=@b,@a+1,1) serial_number,
@b:=id_fk
FROM your_table,(SELECT @a:= 0,@b:=0) AS a
ORDER BY `id_fk` ASC
It also works with a self join, counting for each row the rows with the same id_fk and an id that is not larger:
select t1.id_fk,t1.id,count(*)
from your_table t1
left join your_table t2
on t1.id_fk=t2.id_fk and t1.id>=t2.id
group by t1.id_fk,t1.id
See Sql Fiddle Demo
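The self-join version can be checked against the sample table with a short script. This is a sketch that assumes SQLite in place of MySQL; the alias running_count is introduced here for readability.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER PRIMARY KEY, id_fk INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO your_table (id, id_fk, data) VALUES (?, ?, ?)",
    [(1, 2, "data1"), (2, 2, "data2"), (3, 1, "data3"), (4, 3, "data4"), (5, 1, "data5")],
)

# For each row, count the rows with the same id_fk and an id that is not larger.
# Every row matches itself, so the count starts at 1.
running = conn.execute(
    """
    SELECT t1.id_fk, t1.id, count(*) AS running_count
    FROM your_table t1
    LEFT JOIN your_table t2
           ON t1.id_fk = t2.id_fk AND t1.id >= t2.id
    GROUP BY t1.id_fk, t1.id
    ORDER BY t1.id_fk, t1.id
    """
).fetchall()

for id_fk, row_id, running_count in running:
    print(id_fk, row_id, running_count)
```

The join compares every row with all earlier rows of the same id_fk, so the cost grows quickly for keys with many rows.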

SQL : Select statement order by

I want to select a column, but in a different order.
I have 2 tables:
table_name:
+------+-----------+
| id | name |
+------+-----------+
| 1 | Sindra |
| 2 | Auli |
| 3 | Brian |
| 4 | Bina |
| 5 | zian |
| 6 | Bri |
| 7 | Andre |
+------+-----------+
table_temp, where id_temp_name is a foreign key to id (table_name):
+------+--------------+
| id | id_temp_name |
+------+--------------+
| 1 | 1 |
| 2 | 3 |
| 3 | 4 |
| 4 | 2 |
+------+--------------+
with this query :
SELECT *
FROM table_name
WHERE id IN
(SELECT id_temp_name FROM table_temp)
The result always comes back in the same order as table_name. I am looking for a result in exactly the order of the id_temp_name rows, so the result should be:
+------+-----------+
| id | name |
+------+-----------+
| 1 | Sindra |
| 3 | Brian |
| 4 | Bina |
| 2 | Auli |
+------+-----------+
Thanks for any advice.
You need to rewrite the query to be a JOIN between both tables, then you can set an ordering based on any column involved, even when not in the final result set:
SELECT table_name.id,
table_name.name
FROM table_name
INNER JOIN table_temp ON table_name.id = table_temp.id_temp_name
ORDER BY table_temp.id ;
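A small script to confirm the ordering on the sample data, as a sketch that assumes SQLite in place of MySQL:

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_name (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_temp (id INTEGER PRIMARY KEY, id_temp_name INTEGER);
    INSERT INTO table_name VALUES
        (1, 'Sindra'), (2, 'Auli'), (3, 'Brian'), (4, 'Bina'),
        (5, 'zian'), (6, 'Bri'), (7, 'Andre');
    INSERT INTO table_temp VALUES (1, 1), (2, 3), (3, 4), (4, 2);
    """
)

# Join on the foreign key and sort by table_temp's own id,
# which is the order in which the references were stored.
ordered = conn.execute(
    """
    SELECT table_name.id, table_name.name
    FROM table_name
    INNER JOIN table_temp ON table_name.id = table_temp.id_temp_name
    ORDER BY table_temp.id
    """
).fetchall()

for row_id, name in ordered:
    print(row_id, name)
```

Without an ORDER BY, the database is free to return the rows in any order, which is why the IN subquery gave the table_name order.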
Use a join instead of a sub-query, matching table_name.id against table_temp.id_temp_name and sorting by table_temp.id:
SELECT table_name.id, table_name.name
FROM table_name
INNER JOIN table_temp ON table_name.id = table_temp.id_temp_name
ORDER BY table_temp.id
And... usually best to list the fields explicitly instead of using * to select all.
You should use a simple JOIN to achieve your result.
Your query:
SELECT *
FROM table_name
WHERE id IN
(SELECT id_temp_name FROM table_temp)
returns only columns of table_name, for the rows whose id appears in table_temp, and an IN subquery says nothing about the order of those rows. The order you want is stored in table_temp, so you need to join the two tables and sort by table_temp.id. An INNER JOIN is enough here, because you only want the rows of table_name that are referenced by table_temp.
So, what you need to do, is this:
SELECT tn.id, tn.name
FROM table_name AS tn
INNER JOIN table_temp AS tt
ON tn.id= tt.id_temp_name
ORDER BY tt.id

Selecting entire rows from two tables

I don't know if it's possible, and I might be asking a stupid question here; if so, forgive me.
I have two tables that are somewhat similar but not entirely.
Table 1 (user_opinions)
| o_id | user_id | opinion |is_shared_from| date,isglobal etc
|:---------|---------|:------------:|:------------:|
| 1 | 11| text 1 | 0
| 2 | 13| text 2 | 2
| 3 | 9| text 3 | 0
Table 2 (Buss_opinions)
| bo_id | o_id | user_id | opinion | date
|:---------|--------|:------------:|:------------:|
| 1 | 2| 52 | bus text 1
| 2 | 3| 41 | bus text 2
If I do a standard select and join like this:
SELECT * FROM user_opinions uo
JOIN Buss_opinions bo
ON uo.o_id = bo.o_id
This returns rows with both tables' data joined together.
My question is: what if I wanted to get the data from these two tables in separate rows?
Result should be something like this:
| oid | bo_id | opinion | nb: and other rows from both tables
|:---------|---------|:------------:|
| 1 | NULL | text 1 | nb:from table 1
| NULL | 1| bus text 1| nb:from table 2
| 2 | NULL | text 2 |nb:from table 1
and so on
It gets both tables' data, and where there's no common field it puts a NULL value in the field. Is there a type of join for this? Or are there other ways of doing this?
One way you could go about it is using a UNION (http://dev.mysql.com/doc/refman/5.0/en/union.html):
SELECT o_id AS OID, null AS BOID, user_id AS USERID, opinion AS Opinion
FROM user_opinions
UNION
SELECT null AS OID, bo_id AS BOID, user_id AS USERID, opinion AS Opinion
FROM Buss_opinions
Edit: You could even take this a step further and marry it with the CONCAT function:
SELECT CONCAT('OID-', o_id) AS ID, user_id AS USERID, opinion AS Opinion
FROM user_opinions
UNION
SELECT CONCAT('BOID-', bo_id) AS ID, user_id AS USERID, opinion AS Opinion
FROM Buss_opinions
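The NULL-padded variant can be tried on the sample rows from the question. This is a sketch that assumes SQLite in place of MySQL and only the columns shown in the question; it uses UNION ALL, which keeps every row and skips the duplicate check that plain UNION performs.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_opinions (
        o_id INTEGER PRIMARY KEY, user_id INTEGER, opinion TEXT, is_shared_from INTEGER
    );
    CREATE TABLE Buss_opinions (
        bo_id INTEGER PRIMARY KEY, o_id INTEGER, user_id INTEGER, opinion TEXT
    );
    INSERT INTO user_opinions VALUES (1, 11, 'text 1', 0), (2, 13, 'text 2', 2), (3, 9, 'text 3', 0);
    INSERT INTO Buss_opinions VALUES (1, 2, 52, 'bus text 1'), (2, 3, 41, 'bus text 2');
    """
)

# Both halves must have the same number of columns; NULL fills the column
# that does not exist in the other table.
combined = conn.execute(
    """
    SELECT o_id AS oid, NULL AS bo_id, user_id, opinion FROM user_opinions
    UNION ALL
    SELECT NULL AS oid, bo_id, user_id, opinion FROM Buss_opinions
    """
).fetchall()

for row in combined:
    print(row)
```

An ORDER BY on the combined result is needed if the rows from the two tables should be interleaved in a particular order.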