How to replace MySQL enum values for sorting - mysql

I have a MySQL table with an enum field storing the state of elements, e.g.:
draft
inactive
published
These states get translated into the user's locale in the application. Because the translations can differ greatly, it is not possible to sort records by state in the MySQL query: the order of the enum values will not match the order of the translated strings.
For example:
SELECT state FROM records ORDER BY state ASC
This would give the following results for English, German and French:
draft » Draft / Entwurf / Ébauche
inactive » Inactive / Inaktiv / Inactif
published » Published / Freigeschaltet / Publié
Because MySQL sorts by the enum values, using this order in the application makes the sorting by state look jumbled.
Of course it is possible to do the sorting by state afterwards in the application using the translated strings, but it would remove a layer of complexity to be able to do this directly in the query - as well as improve application performance.
One solution I found would be to use a CASE statement in the query:
SELECT
CASE state
WHEN 'draft' THEN 'Entwurf'
WHEN 'inactive' THEN 'Inaktiv'
WHEN 'published' THEN 'Freigeschaltet'
END AS state_label
FROM
records
ORDER BY
state_label ASC
Are there better/faster ways to sort an enum by custom strings translated in the application?

You can use ORDER BY FIELD(state, opt1, opt2, opt3, ...)
https://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_field
Since the arguments are just a comma-separated list, the application can pass the values in whatever order it wants.
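As a minimal sketch of this approach, assuming the German labels from the question: sorted alphabetically they are Entwurf, Freigeschaltet, Inaktiv, so the application would pass the enum keys in the matching order. The application is assumed to build this list after sorting its own translations.

```sql
-- German labels sort as Entwurf (draft), Freigeschaltet (published), Inaktiv (inactive),
-- so the enum keys are listed in that order.
SELECT state
FROM records
ORDER BY FIELD(state, 'draft', 'published', 'inactive');
```

FIELD() returns 0 for a value that is not in the list, so any state missing from the list sorts before all listed ones.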

Probably the quickest solution you can get would be to create a translation table along the lines of:
CREATE TABLE myEnum (
lang VARCHAR(2),
keyValue VARCHAR(32),
localeValue VARCHAR(32)
);
Then fill the table with your translations of the enum, and use this table in a join in your SQL Query:
SELECT r.state FROM records r, myEnum e WHERE e.lang = 'de' and e.keyValue = r.state ORDER BY e.localeValue ASC
You'll probably not want to use those table or column names, of course.
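For illustration, the table could be filled with the translations given in the question and joined as follows. This is only a sketch: the rows are sample data, and the resulting order depends on the collation of localeValue.

```sql
-- Sample rows taken from the translations listed in the question.
INSERT INTO myEnum (lang, keyValue, localeValue) VALUES
    ('de', 'draft',     'Entwurf'),
    ('de', 'inactive',  'Inaktiv'),
    ('de', 'published', 'Freigeschaltet'),
    ('fr', 'draft',     'Ébauche'),
    ('fr', 'inactive',  'Inactif'),
    ('fr', 'published', 'Publié');

-- Same query as above, written with an explicit JOIN;
-- for 'de' the rows come back as Entwurf, Freigeschaltet, Inaktiv.
SELECT r.state, e.localeValue
FROM records r
JOIN myEnum e ON e.keyValue = r.state AND e.lang = 'de'
ORDER BY e.localeValue ASC;
```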

Related

Select distinct column and then count 2 columns that relate to that column in MySQL

So I have an error log that I need to analyze.
In that error log there are fields called
EVENT_ATTRIBUTE that displays the name of the device that collected that information.
EVENT_SEVERITY that displays a number from 1 to 5. In this column I need to find the number of 4's and 5's.
The problem is I need to get the distinct EVENT_ATTRIBUTES and then count all the 4's and 5's related to that specific EVENT_ATTRIBUTE and output the count.
Basically the sensors(event_attribute) detect different errors. I need to analyze how many 4's and 5's each of the sensors picks up so that I can analyze them.
I am having problems taking the distinct sensors and linking the counts to each specific sensor. I have tried this so far, but it returns the same number for 4 and 5, so I don't think I am doing it correctly.
SELECT DISTINCT LEFT(EVENT_ATTRIBUTE, locate('(', EVENT_ATTRIBUTE, 1)-1) AS
SensorName,
COUNT(CASE WHEN 'EVENT_SEVERITY' <>5 THEN 1 END) AS ERROR5,
COUNT(CASE WHEN 'EVENT_SEVERITY' <>4 THEN 1 END) AS ERROR4
FROM nodeapp.disc_event
WHERE EVENT_SEVERITY IN (5,4)
Group BY SensorName;
Here is the table that I am looking at.
Event Error Table
I'm truncating the event attribute because the IP address doesn't matter. Basically I want to make the unique event_attribute act as a primary key and count the number of 4's and 5's connected to that primary key.
With the code above I get this output: Event Result Table
Thank you for all your help!
You're very close.
DISTINCT is unnecessary when you're grouping.
You want SUM(). COUNT() simply counts everything that's not null. You can exploit the hack that a boolean expression evaluates to either 1 or 0.
SELECT LEFT(EVENT_ATTRIBUTE, LOCATE('(', EVENT_ATTRIBUTE, 1)-1) AS SensorName,
SUM(EVENT_SEVERITY = 5) ERROR_5,
SUM(EVENT_SEVERITY = 4) ERROR_4,
COUNT(*) ALL_ERRORS
FROM nodeapp.disc_event
GROUP BY LEFT(EVENT_ATTRIBUTE, LOCATE('(', EVENT_ATTRIBUTE, 1)-1);
Even if EVENT_SEVERITY values are stored as strings in your DBMS, expressions like EVENT_SEVERITY = 4 implicitly coerce them to integers.
It's generally good practice to include batch totals like COUNT(*) especially when you're debugging; they form a good sanity check that you're handling your data correctly.
The query is interpreting 'EVENT_SEVERITY' as a string literal; delimit the column name with backticks (`) instead. Double quotes are the standard SQL identifier delimiter, but MySQL treats them as string delimiters unless the ANSI_QUOTES SQL mode is enabled, so I tend to shy away from them.
Edit (for clarity): I mean it is literally interpreting 'EVENT_SEVERITY' as the string "EVENT_SEVERITY", not the underlying value of the field as a string.
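To make the point concrete, here is a sketch of the original query with the column properly delimited. In the original, the literal 'EVENT_SEVERITY' is coerced to the number 0 in a numeric comparison, so both `<>` tests are true for every row and both counts come out equal. The sketch also replaces `<>` with `=`, since the goal is to count the rows that match each severity.

```sql
SELECT LEFT(EVENT_ATTRIBUTE, LOCATE('(', EVENT_ATTRIBUTE) - 1) AS SensorName,
       -- CASE without ELSE yields NULL for non-matching rows, which COUNT() skips.
       COUNT(CASE WHEN `EVENT_SEVERITY` = 5 THEN 1 END) AS ERROR5,
       COUNT(CASE WHEN `EVENT_SEVERITY` = 4 THEN 1 END) AS ERROR4
FROM nodeapp.disc_event
WHERE `EVENT_SEVERITY` IN (4, 5)
GROUP BY SensorName;
```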

Mysql Match Against Ranking

I'm currently using a query with LIKE for an autocomplete box. However, I want to use MATCH ... AGAINST, which should be faster, but I'm running into some issues with the sorting.
I want to rank a query like this:
[query] %
[query]%
% [query]%
%[query]%
For now I use
SELECT * FROM table
WHERE name LIKE '%query%'
ORDER BY (case
WHEN name LIKE 'query %' THEN 1
WHEN name LIKE 'query%' THEN 2
WHEN name LIKE '% query%' THEN 3
ELSE 4 END) ASC
When I use...
SELECT * FROM table
WHERE MATCH(name) AGAINST('query*' IN BOOLEAN MODE)
...all results get the same 'ranking score'.
For example searching for Natio
returns Pilanesberg National Park and National Park Kruger with the same score, while I want the second result first because it starts with the query.
How can I achieve this?
I had the same problem and had to approach it in a different way.
The documentation of MySQL says:
The term frequency (TF) value is the number of times that a word appears in a document. The inverse document frequency (IDF) value of a word is calculated using the following formula, where total_records is the number of records in the collection, and matching_records is the number of records that the search term appears in.
${IDF} = log10( ${total_records} / ${matching_records} )
When a document contains a word multiple times, the IDF value is multiplied by the TF value:
${TF} * ${IDF}
Using the TF and IDF values, the relevancy ranking for a document is calculated using this formula:
${rank} = ${TF} * ${IDF} * ${IDF}
This is followed by an example that illustrates the formulas: it searches for the word 'database' in different fields and returns a rank based on the results.
In your example, "Pilanesberg National Park" and "National Park Kruger" get the same rank for AGAINST('Natio*' IN BOOLEAN MODE) because the rank is not based on an intuitive notion of similarity (for that, you would have to tell the database what "similar to" means for you); it is based on the formula above, which depends on frequency.
Note also that the frequency value is affected by the storage engine (InnoDB or MyISAM) and by the MySQL version (older versions do not support full-text indexes on InnoDB tables).
To solve your problem, you can use MySQL user-defined variables, functions, or procedures to compute the rank according to your own idea of relevance.
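One simple way to impose a custom rank, sketched here as an assumption and not as the method from the linked examples, is to keep the full-text index for filtering and reuse the CASE expression from the question for ordering. The table name `places` is hypothetical, and a FULLTEXT index on `name` is assumed.

```sql
SELECT *
FROM places
WHERE MATCH(name) AGAINST('natio*' IN BOOLEAN MODE)
ORDER BY CASE
             WHEN name LIKE 'natio %'  THEN 1  -- query is the complete first word
             WHEN name LIKE 'natio%'   THEN 2  -- name starts with the query
             WHEN name LIKE '% natio%' THEN 3  -- a later word starts with the query
             ELSE 4
         END,
         name;
```

The LIKE tests only run on the rows that the full-text search already selected, so they stay cheap. Note that `natio*` matches word prefixes only, so names that contain the query in the middle of a word are not returned, unlike with `LIKE '%query%'`.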
See also:
MySQL match() against() - order by relevance and column?
MYsql FULLTEXT query yields unexpected ranking; why?

MySQL WHERE, LIMIT and pagination

I have tables: documents, languages and document_languages. Documents exist in one or more languages and this relationship is mapped in document_languages.
Imagine now I want to display the documents and all of its languages on a page, and paginate my result set to show 10 records on each page. There will be a WHERE statement, specifying which languages should be retrieved (ex: en, fr, it).
Even though I only want to display 10 documents on the page (LIMIT 10), I have to return more than 10 records if a document has more than one language (which most do).
How can you combine the WHERE statement with the LIMIT in a single query to get the records I need?
Use a subquery to limit only the documents rows:
select * from
(select * from documents limit 0,10) as doc,
languages lan,
document_languages dl
where doc.docid = dl.docid
and lan.langid = dl.langid
Check the subquery documentation as well:
http://dev.mysql.com/doc/refman/5.0/en/from-clause-subqueries.html
http://dev.mysql.com/doc/refman/5.0/en/subqueries.html
You can add a little counter to each row counting how many unique documents you're returning and then return just 10. You just specify what document_id to start with and then it returns the next coming 10.
SELECT document_id,
if (@storedDocumentId <> document_id,(@docNum:=@docNum+1),@docNum),
@storedDocumentId:=document_id
FROM document, document_languages,(SELECT @docNum:=0) AS document_count
where @docNum<10
and document_id>=1234
and document.id=document_languages.document_id
order by document_id;
I created these tables:
create table documents (iddocument int, name varchar(30));
create table languages (idlang char(2), lang_name varchar(30));
create table document_languages (iddocument int, idlang char(2));
Write a basic query using the GROUP_CONCAT function to collapse each document's languages into a single row:
select d.iddocument, group_concat(dl.idlang)
from documents d, document_languages dl
where d.iddocument = dl.iddocument
group by d.iddocument;
And finally set the number of the documents with LIMIT option:
select d.iddocument, group_concat(dl.idlang)
from documents d, document_languages dl
where d.iddocument = dl.iddocument
group by d.iddocument limit 10;
You can check more info about GROUP_CONCAT here: http://dev.mysql.com/doc/refman/5.0/es/group-by-functions.html
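Building on the tables created above, the language filter and the paging can be combined in the same query. This is a sketch under the assumption that a page holds 10 documents and that documents are ordered by iddocument; the OFFSET value selects the page.

```sql
SELECT d.iddocument,
       d.name,
       GROUP_CONCAT(dl.idlang ORDER BY dl.idlang) AS langs
FROM documents d
JOIN document_languages dl ON dl.iddocument = d.iddocument
WHERE dl.idlang IN ('en', 'fr', 'it')   -- languages requested by the page
GROUP BY d.iddocument, d.name
ORDER BY d.iddocument
LIMIT 10 OFFSET 10;                     -- second page of 10 documents
```

Because the grouping produces one row per document, LIMIT counts documents and not document/language pairs.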
If you post your query (SQL statement), it might be easier to spot the error. Your outermost LIMIT clause should do the trick. As Rakesh said, you can use subqueries. However, depending on your data, you probably just want to use simple JOINs (e.g. where a.id = b.id...).
This should be fairly straightforward in MySQL. In the unlikely case that you're doing something fancy, you can always pull the datasets into variables to be parsed by an external language (e.g., Python). If you're literally just trying to limit screen output in an interactive session, check out the "pager" command (I like "pager less;").
Lastly, check out the UNION statement. I hope that something here is useful. Good luck!

Mysql View of last version of some strings

In my database, I have a table that contains all the versions of certain translated strings. I want to create a view that returns a table with the latest version of each string.
I've come up with the following solution.
Table Strings
idString, Date, String, Chapter
These are the attributes of the table
View StringOrdered
Create View StringOrdered(idString, Date, String, Chapter) AS
SELECT * FROM Strings ORDER BY Date DESC
This is the supporting view.
View LastVersion
CREATE VIEW LastVersion (idString, Date, String, Chapter) AS
SELECT * FROM StringOrdered GROUP BY Chapter
Is there a way to obtain it without the supporting view?
You are depending on the group by taking the last version of something, based on the ordering of the data. Although this might work in practice, MySQL documentation specifically says this is not supported. Instead, try something like:
create view LastVersion as
select s.*
from strings s
where s.chapter = (select max(chapter)
from strings s2
where s.String = s2.String
)
I am assuming that you are looking for the latest chapter for String. If you are looking for the latest chapter for StringId, then change the WHERE clause accordingly.
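If "latest" is instead meant by date within each chapter, as the GROUP BY Chapter in the question suggests, the same correlated-subquery pattern can be keyed on the Date column. This is a sketch under that assumption; if two versions in a chapter share the same latest date, both are returned.

```sql
CREATE VIEW LastVersion AS
SELECT s.*
FROM Strings s
WHERE s.`Date` = (SELECT MAX(s2.`Date`)   -- newest version date in the same chapter
                  FROM Strings s2
                  WHERE s2.Chapter = s.Chapter);
```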
Note that MySQL (before version 5.7.7) does not support subqueries in the FROM clause of a view. The documentation is quite clear:
The SELECT statement cannot contain a subquery in the FROM clause.
It does support subqueries in other clauses.

mySQL convert integer to text in SELECT query

I want to convert an integer to text in a MySQL SELECT query. Here's what a table looks like:
Languages
--------
1,2,3
I want to convert each integer to a language (e.g., 1 => English, 2 => French, etc.)
I've been reading up on the CONVERT and CAST functions in MySQL, but they mostly seem to focus on converting various data types to integers. I also couldn't find anything that dealt with the specific way I'm storing the data (multiple numbers in one field).
How can I convert the integers to text in a MySQL query?
UPDATE
Here's my MySQL query:
SELECT u.id, ulp.userid, ulp.languages, ll.id, ll.language_detail
FROM users AS u
JOIN user_language_profile AS ulp ON (ulp.userid = u.id)
JOIN language_detail AS ll ON (ulp.languages = ll.id)
Use either:
MySQL's ELT() function:
SELECT
ELT(Languages
, 'English' -- 1
, 'French' -- 2
-- etc.
)
FROM table_name
A CASE expression:
SELECT
CASE Languages
WHEN 1 THEN 'English'
WHEN 2 THEN 'French'
-- etc.
END
FROM table_name
That said, if possible I would be tempted to either JOIN with a lookup table (as @Mr.TAMER says) or change the data type of the column to ENUM('English','French',...).
UPDATE
From your comments, it now seems that each field contains a set (perhaps even using the SET data type?) of languages and you want to replace the numeric values with strings?
First, read Bill Karwin's excellent answer to "Is storing a delimited list in a database column really that bad?".
In this case, I suggest you normalise your database a tad: create a new language-entity table wherein each record associates the PK of the entities in the existing table with a single language. Then you can use a SELECT query (joining on that new table) with GROUP_CONCAT aggregation to obtain the desired list of language names.
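A minimal sketch of that normalisation follows. The table and column names (`language_names`, `user_language`, `user_id`, `lang_id`, `lang_desc`) are hypothetical and not taken from the asker's schema.

```sql
-- Lookup table: one row per language.
CREATE TABLE language_names (
    lang_id   INT PRIMARY KEY,
    lang_desc VARCHAR(30) NOT NULL
);

-- Association table: one row per (user, language) pair.
CREATE TABLE user_language (
    user_id INT NOT NULL,
    lang_id INT NOT NULL,
    PRIMARY KEY (user_id, lang_id),
    FOREIGN KEY (lang_id) REFERENCES language_names (lang_id)
);

-- Rebuild the comma-separated list, now with names instead of numbers.
SELECT ul.user_id,
       GROUP_CONCAT(l.lang_desc ORDER BY l.lang_id SEPARATOR ',') AS language_list
FROM user_language ul
JOIN language_names l ON l.lang_id = ul.lang_id
GROUP BY ul.user_id;
```

Adding a language then only requires a new row in the lookup table; no query has to change.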
Without such normalisation, your only option is to do string-based search & replace (which would not be particularly efficient); for example:
SELECT CONCAT_WS(',',
IF(FIND_IN_SET('1', Languages), 'English', NULL),
IF(FIND_IN_SET('2', Languages), 'French' , NULL)
-- , etc. (one more IF(...) argument per language)
)
FROM table_name
Why don't you make a number-language table and, when SELECTing, look up the language associated with the number you selected?
This is better in case you want to add a new language. You will only insert it into the table instead of changing all the queries in your code, and also easier if others are using your code (they won't be happy debugging and editing all the queries).
From your other comments, are you saying that the languages field is a literal string embedded with commas?
From an SQL perspective, that's a pretty unworkable design. A variable number of languages should be stored in another table.
However, if you're stuck with what you've got, you might be able to construct a regexp replacement algorithm, but it seems terribly fragile, and I wouldn't recommend it. If you've got more than 9 languages, the following will be broken, and you would need the Regexp UDF, which introduces a bunch of complexity.
Assuming the simple case:
SELECT REPLACE(
REPLACE(
REPLACE(Languages, '1', 'English'),
'2', 'French'),
N, DESCRIPTION)
and so on. But I repeat: this is an awful data design. If it's possible to fix it to something like:
person                person_lang             language
==========            ============            =========
person_id -----<      person_id
...                   lang_id     >-----      lang_id
                                              lang_desc
Then I strongly suggest you do so.