I have a small problem. I have a table like this:
id|name|group|date_created
1|Volvo|1,3|06-04-2020 10:00:00
2|Audi|3|06-04-2020 10:00:00
etc....
Now I would like to get all the records that have the value 1 in the group column.
I tried LIKE "%1%", but I don't think it's a good query. Can you point me in the right direction?
SELECT id FROM cars WHERE group LIKE '%1%'
The problem with your query is that it would wrongly match '1' against a list like '45,12,5' for example.
One method is to add commas on both ends before searching:
where concat(',', `group`, ',') like '%,1,%';
But in MySQL, it is much more convenient to use the string function find_in_set(), whose purpose is exactly what you are looking for, i.e. searching for a value in a comma-separated list:
select id from cars where find_in_set('1', `group`) > 0
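To see why the comma-wrapped comparison is safer than a plain LIKE '%1%', here is a minimal Python sketch of the two checks. The function names are made up for illustration, and the code only mimics the string logic; it does not run against MySQL.

```python
def naive_match(value: str, csv_list: str) -> bool:
    """Mimics `col LIKE '%value%'`: a plain substring search."""
    return value in csv_list


def wrapped_match(value: str, csv_list: str) -> bool:
    """Mimics `concat(',', col, ',') LIKE '%,value,%'`."""
    return f",{value}," in f",{csv_list},"


if __name__ == "__main__":
    # Columns: list, naive result, wrapped result when searching for '1'.
    for groups in ("1,3", "3", "45,12,5"):
        print(groups, naive_match("1", groups), wrapped_match("1", groups))
```

The naive check reports a hit for '45,12,5' because '12' contains the character '1'; the wrapped check only matches whole list items.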
Notes:
you should fix your data model, and have a separate table to store the relationship between ids and groups, with each tuple on a separate row. Related reading: Is storing a delimited list in a database column really that bad?
group is a reserved word in MySQL, so it is not a good choice for a column name (you would need to surround it with backticks every time you use it, which is error-prone)
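As a rough illustration of the first note, the sketch below builds a separate relationship table with one (car, group) pair per row. It uses Python's built-in sqlite3 module with an in-memory database purely so that it is runnable; the table name car_groups, its column names and the helper cars_in_group are assumptions, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cars (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical junction table: one row per (car, group) pair.
    CREATE TABLE car_groups (
        car_id   INTEGER REFERENCES cars(id),
        group_id INTEGER,
        PRIMARY KEY (car_id, group_id)
    );
    INSERT INTO cars VALUES (1, 'Volvo'), (2, 'Audi');
    INSERT INTO car_groups VALUES (1, 1), (1, 3), (2, 3);
""")


def cars_in_group(group_id: int) -> list:
    """Return the ids of all cars that belong to the given group."""
    rows = conn.execute(
        "SELECT c.id FROM cars c "
        "JOIN car_groups g ON g.car_id = c.id "
        "WHERE g.group_id = ? ORDER BY c.id",
        (group_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(cars_in_group(1))
    print(cars_in_group(3))
```

With this layout the lookup is a plain equality comparison on an integer column, which the database can index.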
Related
I have a MySQL column which contains a string of scores separated by a semi-colon eg: "5;21;24;25;26;28;117".
This column was created not by design, but by collecting the values from multiple rows in a table using GROUP_CONCAT and GROUP BY. The original data arrived as a spreadsheet with multiple rows sharing the same ID value.
I can use a select clause with REPLACE function to replace the ; with a +.
SELECT values, REPLACE(values,";","+") AS score FROM [table_name] WHERE 1
values|score
5;21;24;25;26;28;117|5+21+24+25+26+28+117
However what I need is the sum of: 5+21+24+25+26+28+117 to get a total of 246.
Is there any way to do this in MySQL without using some other scripting language?
The SELECT clause shows me a string of numbers joined with the + symbol.
I am looking for a way to evaluate that string to give me the result: 246
UPDATE:
As I was framing my question, I did more research and came up with this link which solves my problem:
(https://dba.stackexchange.com/questions/120747/evaluate-a-string-value-as-a-computed-expression-in-an-sql-statement-sthg-like).
I am keeping this question and the link to the answer here in case it helps other people searching for the same thing.
I have a comma-separated string like "1,2,3", and a column in a table also contains comma-separated values like "1,2,4,5,3". How can I get all records where any value in the string matches any value in the column?
For example:
id---category
1---1,2,4,5
2---1,2,3,6
3---2,3,5
If I search for the string "1,2,3", I should get every record whose category contains 1 or 2 or 3, or any combination of them (1,2 or 1,3 or 2,3 or 1,2,3). The result should not contain duplicates; we can group them if needed.
Is it possible to get all these records with a single query?
Try this:
select * from `table` where find_in_set(1, category) or find_in_set(2, category) or find_in_set(3, category);
You should not be storing lists of ids in strings. Here are reasons why:
Values should be stored using the correct type. int <> string.
SQL has lousy string processing functions.
Foreign keys should be properly declared.
SQL will not be able to optimize these queries.
SQL has a great way to store lists. It is called a table, not a string.
But sometimes you are stuck with someone else's really, really, really bad data modeling decisions. In that case you can do something using regular expressions:
where category regexp replace($string, ',', '|')
or perhaps more accurately:
where concat(',', category, ',') regexp concat(',', replace($string, ',', ',|,'), ',')
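The difference between the two regular expressions can be checked with a small Python sketch. It only reproduces the string building done by replace() and concat(); Python's re module stands in for MySQL's REGEXP, which behaves the same for digits and the | alternation used here. All function names are illustrative.

```python
import re


def naive_pattern(search: str) -> str:
    """'1,2,3' -> '1|2|3' (matches digits anywhere in the list)."""
    return search.replace(",", "|")


def wrapped_pattern(search: str) -> str:
    """'1,2,3' -> ',1,|,2,|,3,' (matches whole list items only)."""
    return "," + search.replace(",", ",|,") + ","


def matches_naive(search: str, category: str) -> bool:
    return re.search(naive_pattern(search), category) is not None


def matches_wrapped(search: str, category: str) -> bool:
    # The column value is wrapped in commas as well, like concat(',', category, ',').
    return re.search(wrapped_pattern(search), "," + category + ",") is not None


if __name__ == "__main__":
    for category in ("1,2,4,5", "2,3,5", "11,45"):
        print(category, matches_naive("1,2,3", category), matches_wrapped("1,2,3", category))
```

The list '11,45' contains none of 1, 2 or 3, yet the first pattern matches it because '11' contains the digit '1'; the wrapped pattern does not.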
I have a table in sqlite that contains roughly 3 billion values (a lot of them will be repeats). It's basically a giant vector of values. I'm trying to calculate the frequency with which values appear in the table by running this:
SELECT abs(diffs), count(*) as total FROM mzdiff GROUP by abs(diffs);
abs(diffs) is the name of my column and mzdiff is my table name, but when I try performing the code above it comes up with an error message saying that the column diffs doesn't exist. I know that the naming of my column isn't really ideal for sql, but is there any way I can get around this?
Thanks
An alias is not the answer here, since the column name must be identified before it can be aliased. Instead, quote the name with backticks (SQLite also accepts standard double quotes), and make it a habit to always quote identifiers.
SELECT `abs(diffs)`, count(*) as total FROM `mzdiff` GROUP by `abs(diffs)`;
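The behaviour is easy to reproduce with Python's built-in sqlite3 module. The sketch below is only a small-scale illustration with three made-up values: it creates a column literally named abs(diffs), shows that the unquoted query fails, and runs the quoted version using standard double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column is literally named abs(diffs), so it must be quoted everywhere.
conn.execute('CREATE TABLE mzdiff ("abs(diffs)" REAL)')
conn.executemany("INSERT INTO mzdiff VALUES (?)", [(1.5,), (2.0,), (1.5,)])

# Unquoted, SQLite parses abs(diffs) as a call to abs() on a column named diffs.
try:
    conn.execute("SELECT abs(diffs) FROM mzdiff")
    unquoted_error = None
except sqlite3.OperationalError as exc:
    unquoted_error = str(exc)

# Quoted, the whole name is treated as one identifier.
frequencies = conn.execute(
    'SELECT "abs(diffs)", count(*) AS total FROM mzdiff '
    'GROUP BY "abs(diffs)" ORDER BY "abs(diffs)"'
).fetchall()

if __name__ == "__main__":
    print(unquoted_error)
    print(frequencies)
```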
I need to export a single column from a MySQL database which shows each entry only once. So in the following table:
id | author(s)        | content
________________________________________
1  | Bill, Sara, Mike | foo1
1  | Sara             | foo2
2  | Bill, Sara, Mike | foo3
2  | Sara             | foo4
3  | David            | foo5
3  | Mike             | foo5
I would need to export a list of authors as "Bill, Sara, Mike, Susan" so that each name is shown only once.
Thanks!
UPDATE: I realize this may not be possible, so I am going to have to accept an exported list which simply eliminates any exact duplicates within the column, so the output would be: Bill, Sara, Mike, Sara, David, Mike. Any help forming this query would be appreciated.
Thanks again!
It's possible to get the resultset, but I'd really only do this to convert this to another table, with one row per author. I wouldn't want to run queries like this from application code.
The SUBSTRING_INDEX function can be used to extract the first, second, etc. author from the list, e.g.
SUBSTRING_INDEX(SUBSTRING_INDEX(authors,',', 1 ),',',-1) AS author1
SUBSTRING_INDEX(SUBSTRING_INDEX(authors,',', 2 ),',',-1) AS author2
SUBSTRING_INDEX(SUBSTRING_INDEX(authors,',', 3 ),',',-1) AS author3
But this gets messy at the end, because you get the last author when you retrieve beyond the length of the list.
So, you can either count the number of commas, with a rather ugly expression:
LENGTH(authors)-LENGTH(REPLACE(authors,',','')) AS count_commas
But it's just as easy to append a trailing comma, and then convert empty strings to NULL.
So, replace authors with:
CONCAT(authors,',')
And then wrap that in TRIM and NULLIF functions.
NULLIF(TRIM( foo ),'')
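The effect of the trailing comma is easier to see in a small Python sketch. The substring_index function below is a hand-written emulation of MySQL's SUBSTRING_INDEX for illustration only, and nth_author and clean are hypothetical helper names.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Emulates MySQL SUBSTRING_INDEX(s, delim, count)."""
    parts = s.split(delim)
    if count >= 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Everything to the right of the count-th delimiter from the end.
    return delim.join(parts[count:])


def nth_author(authors: str, n: int) -> str:
    """SUBSTRING_INDEX(SUBSTRING_INDEX(authors, ',', n), ',', -1)."""
    return substring_index(substring_index(authors, ",", n), ",", -1)


def clean(value: str):
    """NULLIF(TRIM(value), ''): trimmed text, or None for an empty string."""
    return value.strip() or None


if __name__ == "__main__":
    authors = "Bill, Sara, Mike"
    print(clean(nth_author(authors, 4)))        # past the end, without the trailing comma
    print(clean(nth_author(authors + ",", 4)))  # past the end, with the trailing comma
```

Without the trailing comma, asking for a fourth author repeats the last one; with it, the result is an empty string that becomes NULL.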
Then, you can write a query that gets the first author from each row, another query that gets the second author from each row (identical to the first query, just change the '1' to a '2'), another for the third author, etc., up to the maximum number of authors in a column value. Combine all those queries together with UNION operations (this will eliminate the duplicates for you).
So, this query:
SELECT NULLIF(TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(a.authors,','),',',1),',',-1)),'') AS author
FROM unfortunately_designed_table a
UNION
SELECT NULLIF(TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(a.authors,','),',',2),',',-1)),'')
FROM unfortunately_designed_table a
UNION
SELECT NULLIF(TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(a.authors,','),',',3),',',-1)),'')
FROM unfortunately_designed_table a
UNION
SELECT NULLIF(TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(a.authors,','),',',4),',',-1)),'')
FROM unfortunately_designed_table a
This returns a resultset of unique author names (and undoubtedly a NULL). It only gets the first four authors in the list; you'd need to extend it to get the fifth, sixth, etc.
You can get the maximum count of entries in that column by finding the maximum number of commas, and adding 1:
SELECT MAX(LENGTH(a.authors)-LENGTH(REPLACE(a.authors,',','')))+1 AS max_count
FROM unfortunately_designed_table a
That lets you know how far you need to extend the query above to get all of the author values (at the particular point in time you run the query; nothing prevents someone from adding another author to the list within a column at a later time).
After all the work to get distinct author values on separate rows, you'd probably want to leave them in a list like that. It's easier to work with.
But, of course, it's also possible to convert that resultset back into a comma-delimited list, though the size of the string returned is limited by the group_concat_max_len variable (1024 by default), which in turn is constrained by max_allowed_packet.
To get it back as a single row, with a comma-separated list, take that whole mess of a query from above, wrap it in parens as an inline view, give it an alias, and use the GROUP_CONCAT function.
SELECT GROUP_CONCAT(d.author ORDER BY d.author) AS distinct_authors
FROM (
...
) d
WHERE d.author IS NOT NULL
If you think all of these expressions are ugly, and there should be an easier way to do this, unfortunately (aside from writing procedural code), there really isn't. The relational database is designed to handle information in tuples (rows), with each row representing one entity. Stuffing multiple entities or values into a single column goes against relational design. As such, SQL does not provide a simple way to extract values from a string into separate tuples, which is why the code to do this is so messy.
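For comparison, the procedural route mentioned above is short. The following Python sketch assumes the column values have already been fetched into a list (the rows are copied from the example table in the question) and produces the distinct, sorted names; distinct_authors is a hypothetical helper, not part of any library.

```python
# Column values from the question's example table, one string per row.
rows = [
    "Bill, Sara, Mike",
    "Sara",
    "Bill, Sara, Mike",
    "Sara",
    "David",
    "Mike",
]


def distinct_authors(author_lists):
    """Split every list on commas, trim each name, drop blanks and duplicates."""
    seen = set()
    for authors in author_lists:
        for name in authors.split(","):
            name = name.strip()
            if name:
                seen.add(name)
    return sorted(seen)


if __name__ == "__main__":
    names = distinct_authors(rows)
    print(names)
    print(",".join(names))
```

Each name from the result could then be inserted as its own row into a separate authors table.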
I have a mysql table in which there is a column e.g. called name. The column data has a specific pattern nameBase+number. E.g.
name
----------
test0
test1
test2
stack0
stack1
stack2
Each time I want to add data to the column, I have to find the last number for the specific nameBase and add the new entry with number+1.
For example, if test comes in now, I have to add test3 to the db.
My question: what is the best way to 1. check if the nameBase already exists in the db (something like contains) and 2. find the last nameBase number? E.g. here the last number for test is 2, so the next one is 3.
Update: Everyone, one update. I finally used the Java Pattern class. So cool and easy; it made everything so simple. I just added \d to the pattern, checked whether it matches the name, and used the pattern group to easily access the second part.
The real solution here is to change the database schema to split this into two columns, the name and its number. It becomes trivial then to get the aggregate MAX() via
SELECT name, MAX(num) AS num FROM tbl GROUP BY name
However, if changing it is not an option, I would recommend using REPLACE() to remove the name portion from the column value, leaving only the number portion, then casting that to a number and taking the aggregate MAX() to find the highest existing number:
SELECT
MAX(CAST(REPLACE(name, '<the name to search>', '') AS UNSIGNED)) AS maxnum
FROM tbl
WHERE
name LIKE '<the name to search>%'
Or, instead of LIKE, use a regular expression, which is more accurate (LIKE also matches any longer name that merely starts with the searched name) but more expensive:
SELECT
MAX(CAST(REPLACE(name, '<the name to search>', '') AS UNSIGNED)) AS maxnum
FROM tbl
WHERE
name REGEXP '^<the name to search>[0-9]+$'
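The same filter-then-maximum logic can be sketched in Python, which may help when checking the query's result by hand. The function next_name is hypothetical, and starting a new nameBase at 0 is an assumption based on the sample data in the question.

```python
import re


def next_name(existing_names, base: str) -> str:
    """Return base + (highest existing number + 1), comparing numerically."""
    # Same idea as REGEXP '^base[0-9]+$': the whole name must be base + digits.
    pattern = re.compile(re.escape(base) + r"(\d+)")
    numbers = []
    for name in existing_names:
        match = pattern.fullmatch(name)
        if match:
            numbers.append(int(match.group(1)))
    if not numbers:
        return f"{base}0"  # assumption: numbering starts at 0
    return f"{base}{max(numbers) + 1}"


if __name__ == "__main__":
    names = ["test0", "test1", "test2", "stack0", "stack1", "stack2"]
    print(next_name(names, "test"))
    print(next_name(["test9", "test10", "tester5"], "test"))
```

In the second call, tester5 is ignored because it is not test followed only by digits, and 10 beats 9 because the comparison is numeric rather than textual.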
I would do this with an additional table with two columns, storing each name and the last assigned id. Then replace the nameBase+number column in your original table with a name column that is a foreign key to the additional table, and a number column holding the appropriate count for that entry.
This will be much easier and more efficient to manipulate.
If possible, I would restructure the table to place these in either 2 tables (better) or at least two columns (medium). The structure you have is not normalized at all :-/
Without knowing too much about your schema, here is my recommendation for the two-table solution (note: this is normalized and also follows the idiom "Do not store that which can be calculated"):
names
------
id | name
01 | test
02 | stack
name_hits
-------
name_id | date
01 | 01/01/2001
01 | 01/15/2001
01 | 04/03/2001
02 | 01/01/2001
...
and then select like this:
SELECT names.name, count(name_hits.name_id) as hits
FROM names JOIN name_hits ON names.id=name_hits.name_id
GROUP BY names.id
and insert like this:
INSERT INTO name_hits SELECT id, NOW() FROM names WHERE name = "stack";
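Here is a runnable sketch of the two-table layout, using Python's built-in sqlite3 module and an in-memory database. It is not MySQL: datetime('now') stands in for NOW(), the sample dates are rewritten in ISO format, and the helpers record_hit and hit_counts are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE name_hits (name_id INTEGER REFERENCES names(id), date TEXT);
    INSERT INTO names VALUES (1, 'test'), (2, 'stack');
    INSERT INTO name_hits VALUES
        (1, '2001-01-01'), (1, '2001-01-15'), (1, '2001-04-03'),
        (2, '2001-01-01');
""")


def record_hit(name: str) -> None:
    """Add one hit row for an existing name (no-op if the name is unknown)."""
    conn.execute(
        "INSERT INTO name_hits SELECT id, datetime('now') FROM names WHERE name = ?",
        (name,),
    )


def hit_counts() -> dict:
    """Return {name: number of hits}; the count is calculated, never stored."""
    rows = conn.execute(
        "SELECT names.name, count(*) FROM names "
        "JOIN name_hits ON names.id = name_hits.name_id "
        "GROUP BY names.id, names.name"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    print(hit_counts())
    record_hit("stack")
    print(hit_counts())
```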
Presuming that you are unable to change the structure of the table, you can do what you want. However, it is rather expensive.
What you would like to do is something like:
select name
from t
where left(name, length(<name parameter>)) = <name parameter>
order by name desc
limit 1
Unfortunately, your naming probably does not allow this, because you are not left padding the numeric portion with zeroes.
So, the following gets around this:
select name,
cast(substring(name, length(<name parameter>) + 1, 1000) as unsigned) as number
from t
where left(name, length(<name parameter>)) = <name parameter>
order by 2 desc
limit 1
This is not particularly efficient. Also, indexes cannot really help with this because the collating sequence for strings is different than for numbers (test0, test1, test10, test100, test11, etc. versus 0, 1, 2, 3, 4 . . .).
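The ordering mismatch is easy to demonstrate with a short Python sketch. The names below are made-up sample values, and numeric_suffix is a hypothetical helper that assumes every name starts with the same base.

```python
names = ["test0", "test1", "test2", "test10", "test11", "test100"]

# String ordering compares character by character, so 'test2' sorts last.
string_order = sorted(names)


def numeric_suffix(name: str, base: str = "test") -> int:
    """Return the number that follows the base, as an integer."""
    return int(name[len(base):])


# Ordering by the numeric suffix gives the sequence a person would expect.
numeric_order = sorted(names, key=numeric_suffix)

if __name__ == "__main__":
    print(string_order)
    print(numeric_order)
    print(max(names), max(names, key=numeric_suffix))
```

Taking the "last" name in string order would pick test2 here, while the highest number actually belongs to test100.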
If you can, I would follow the advice of the others who suggest multiple columns or tables. I only offer this as a method where you don't have to modify the current table.
If you cannot change the schema, try this:
INSERT INTO names (name)
SELECT CONCAT("stack", CAST(TRIM(LEADING "stack" FROM name) AS UNSIGNED)+1)
FROM names
WHERE name LIKE "stack%"
ORDER BY CAST(TRIM(LEADING "stack" FROM name) AS UNSIGNED) DESC LIMIT 1;
The idea is:
select the "highest" previous value,
chop off the name,
cast the remaining string as an int,
add one to it,
then put the name back on it.
I have not tested this... I hope it leads you in the right direction.
Note that I have used a constant string "stack" as an example, you will likely want to make that dynamic.