Prevent MySQL from converting a string to 0 [duplicate] - mysql

I came across this automatic typecasting in MySQL from string to integer, and it seems weird to me.
mysql> select * from `isps` where `id` = '3ca6fb49-9749-3099-b30d-19ce56349ad6' OR `unique_id` = '3ca6fb49-9749-3099-b30d-19ce56349ad6';
+----+--------------------------------------+---------------+--------------+---------------------+---------------------+
| id | unique_id                            | name          | code         | created_at          | updated_at          |
+----+--------------------------------------+---------------+--------------+---------------------+---------------------+
|  3 | ee8db3be-1bf7-3440-8add-37232cfc4ecb | TTN           | ttn          | 2019-09-26 08:12:14 | 2019-09-26 08:12:14 |
|  7 | 3ca6fb49-9749-3099-b30d-19ce56349ad6 | ONE BROADBAND | onebroadband | 2019-09-26 08:12:14 | 2019-09-26 08:12:14 |
+----+--------------------------------------+---------------+--------------+---------------------+---------------------+
I had not expected the row with id = 3 in the result. Can anyone help with this?
Data types in my database:
id - BIGINT
unique_id - varchar(200)

When an integer column is compared to a string, MySQL converts the string to a number; '3ca6fb49-9749-3099-b30d-19ce56349ad6' converts to 3, which is why the row with id = 3 matches. You can cast id to a string before comparing instead.
select * from `isps` where CAST(`id` AS CHAR) = '3ca6fb49-9749-3099-b30d-19ce56349ad6' OR `unique_id` = '3ca6fb49-9749-3099-b30d-19ce56349ad6';
Note that this will slow down the query significantly, since it won't be able to use the index on the id column.
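A minimal demonstration of the coercion (MySQL converts the string to a number for the comparison, keeping only its leading digits and raising a truncation warning):
SELECT '3ca6fb49-9749-3099-b30d-19ce56349ad6' = 3;
-- returns 1 (true), with warning 1292: Truncated incorrect DOUBLE value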

Related

Mysql: Order By with numerical value shows wrong order

I'm using Spring Boot 2.2.6.RELEASE. I have a repository method that looks like this:
@Query(value = "SELECT m FROM Media m ORDER BY m.viewCount DESC")
Page<Media> findMedias(Pageable pageable);
I get an unordered result list with this. I tried running the following query in the CLI:
SELECT media.view_count FROM mydb.media ORDER BY media.view_count DESC;
The result looks like this:
--------------
|          9 |
|          8 |
|          7 |
|          6 |
|          5 |
|          4 |
|          3 |
|          3 |
|         20 |
|         19 |
|         18 |
|         17 |
|         16 |
|         15 |
|         13 |
|         12 |
|         12 |
|         11 |
|         10 |
|          1 |
|          1 |
--------------
I want the value of 20 to be first, not 9. Why does MySQL order this way, showing single-digit values first instead of the highest number?
EDIT:
I use an SQL file to create my tables. The view_count column has LONG as its type, not String. The query looks like this:
CREATE TABLE IF NOT EXISTS media(m_id INTEGER PRIMARY KEY AUTO_INCREMENT, title VARCHAR(255) NOT NULL, category VARCHAR(10) NOT NULL, file_name VARCHAR(255) NOT NULL, view_count LONG NOT NULL, download_count LONG NOT NULL);
The view_count is stored as a string in your table: in MySQL, LONG is a synonym for MEDIUMTEXT, not an integer type, so the values sort lexicographically. If you can, change the column to an integer type. If you cannot do that, then use the query below to get the desired output.
SELECT m.view_count
FROM media m
ORDER BY CAST(m.view_count AS UNSIGNED) DESC;
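If changing the type is an option, a sketch of the schema fix (assuming every stored value is numeric, so the conversion cannot fail):
ALTER TABLE media MODIFY view_count BIGINT NOT NULL;      -- BIGINT replaces the LONG/MEDIUMTEXT column
ALTER TABLE media MODIFY download_count BIGINT NOT NULL;  -- download_count was declared LONG as well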

Get distinct results from several tables

I need to implement a MySQL query to calculate the space used by a user's mailbox.
A message thread may have multiple messages (reply, follow up) by 2 parties
(sender/recipient) and is tagged with one or more tags (Inbox, Sent etc.).
The following conditions have to be met:
a) the user is either the recipient OR the author of the message;
b) the message IS TAGGED with any of the tags: 1, 2, 3, 4;
c) distinct records only, i.e. if the thread containing the messages is tagged with
more than one of the 4 tags (for example 1 and 4: Inbox and Sent), the calculation
is done on one tag only.
I have tried the following query but I am not able to get distinct values - the
subject/body values are duplicated:
SELECT SUM(LENGTH(subject)+LENGTH(body)) AS sum
FROM om_msg_message omm
JOIN om_msg_index omi ON omm.mid = omi.mid
JOIN om_msg_tags_index omti ON omi.thread_id = omti.thread_id AND omti.uid = user_id
WHERE (omi.recipient = user_id OR omi.author = user_id) AND omti.tag_id IN (1,2,3,4)
GROUP BY omi.mid;
Structure of the tables:
om_msg_message - fields subject and body are the ones to be calculated
+--------------+------------------+------+-----+---------+----------------+
| Field        | Type             | Null | Key | Default | Extra          |
+--------------+------------------+------+-----+---------+----------------+
| mid          | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| subject      | varchar(255)     | NO   |     | NULL    |                |
| body         | longtext         | NO   |     | NULL    |                |
| timestamp    | int(10) unsigned | NO   |     | NULL    |                |
| reply_to_mid | int(10) unsigned | NO   |     | 0       |                |
+--------------+------------------+------+-----+---------+----------------+
om_msg_index
+-----+-----------+-----------+--------+--------+---------+
| mid | thread_id | recipient | author | is_new | deleted |
+-----+-----------+-----------+--------+--------+---------+
|   1 |         1 |      1392 |   1211 |      0 |       0 |
|   2 |         1 |      1211 |   1392 |      1 |       0 |
+-----+-----------+-----------+--------+--------+---------+
om_msg_tags_index
+--------+------+-----------+
| tag_id | uid  | thread_id |
+--------+------+-----------+
|      1 | 1211 |         1 |
|      4 | 1211 |         1 |
|      1 | 1392 |         1 |
|      4 | 1392 |         1 |
+--------+------+-----------+
Here's another solution:
SELECT SUM(LENGTH(omm.subject) + LENGTH(omm.body)) AS totalLength
FROM om_msg_message omm
JOIN om_msg_index omi
  ON omi.mid = omm.mid
  AND (omi.recipient = user_id OR omi.author = user_id)
JOIN (SELECT DISTINCT thread_id
      FROM om_msg_tags_index
      WHERE uid = user_id
        AND tag_id IN (1, 2, 3, 4)) omti
  ON omti.thread_id = omi.thread_id
I'm assuming that:
user_id is a parameter marker/host variable, being queried for an individual user.
You want the total of all messages per user, not the total length of each message (which is what the GROUP BY clause in your version was getting you).
That mid in both om_msg_message and om_msg_index is unique.
So, your problem is the IN clause. I'm not a MySQL guru, but in T-SQL you could change it to use a WHERE clause with an EXISTS subquery so your join doesn't pop out two rows. You need to compensate for the fact that you have two rows with different tag_ids associated with each row of your primary join data.
The cross-platform way would be four LEFT JOINs that link the tables and then demand a non-null value for tag 1, 2, 3, or 4. Fairly inefficient; I'm sure there's a better way to do it in MySQL, but now that you know what the problem is you might know a better solution.
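A hedged sketch of that EXISTS rewrite in MySQL terms (user_id again stands in for the parameter; a sketch, not tested against your schema):
SELECT SUM(LENGTH(omm.subject) + LENGTH(omm.body)) AS totalLength
FROM om_msg_message omm
JOIN om_msg_index omi
  ON omi.mid = omm.mid
WHERE (omi.recipient = user_id OR omi.author = user_id)
  AND EXISTS (SELECT 1
              FROM om_msg_tags_index omti
              WHERE omti.thread_id = omi.thread_id
                AND omti.uid = user_id
                AND omti.tag_id IN (1, 2, 3, 4));
-- EXISTS matches at most once per omi row, so a thread tagged with
-- several of the 4 tags (e.g. 1 and 4) no longer duplicates the sum.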

SQL algorithm to as near to linear time as possible and tweaking of select statement

I am using MySQL version 5.5 on Ubuntu.
My database tables are set up as follows:
DDLs:
CREATE TABLE `asx` (
  `code` char(3) NOT NULL,
  `high` decimal(9,3),
  `low` decimal(9,3),
  `close` decimal(9,3),
  `histID` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`histID`),
  UNIQUE KEY `code` (`code`)
);
CREATE TABLE `asxhist` (
  `date` date NOT NULL,
  `average` decimal(9,3),
  `histID` int(11) NOT NULL,
  PRIMARY KEY (`date`,`histID`),
  KEY `histID` (`histID`),
  CONSTRAINT `asxhist_ibfk_1` FOREIGN KEY (`histID`) REFERENCES `asx` (`histID`)
    ON UPDATE CASCADE
);
t1:
| code | high   | low    | close  | histID (primary key) |
| asx  | 10.000 | 9.500  | 9.800  | 1                    |
| nab  | 42.000 | 41.250 | 41.350 | 2                    |
t2:
| date       | average | histID (foreign key) |
| 2013-01-01 | 10.000  | 1                    |
| 2013-01-01 | 39.000  | 2                    |
| 2013-01-02 | 9.000   | 1                    |
| 2013-01-02 | 38.000  | 2                    |
| 2013-01-03 | 9.500   | 1                    |
| 2013-01-03 | 39.500  | 2                    |
| 2013-01-04 | 11.000  | 1                    |
| 2013-01-04 | 38.500  | 2                    |
I am attempting to write a select query that produces this as a result:
| code | high   | low    | close  | asxhist.average |
| asx  | 10.000 | 9.500  | 9.800  | 11.000,9.500    |
| nab  | 42.000 | 41.250 | 41.350 | 38.500,39.500   |
where the most recent values from table 2 are returned with table 1 in CSV format.
I have managed to get this far:
SELECT code, high, low, close,
  (SELECT GROUP_CONCAT(DISTINCT t2.average ORDER BY date DESC SEPARATOR ',')
   FROM t2
   WHERE t2.histID = t1.histID)
FROM t1;
Unfortunately this returns all values associated with the histID. I'm taking a look at xaprb.com's first/least/max-row-per-group-in-sql solution, but I have been banging my head all day and the slight wooziness seems to be dimming my ability to comprehend how I should use it to my benefit. How can I limit the results to the 5 most recent values and, considering the tables will eventually be megabytes in size, try to remain in O(n²) or less? (Or can I?)
A temporary workaround using SUBSTRING_INDEX, not a feasible solution for huge data:
SELECT code, high, low, close,
  (SELECT SUBSTRING_INDEX(GROUP_CONCAT(asxhist.average ORDER BY date DESC), ',', 3)
   FROM asxhist
   WHERE asxhist.histID = asx.histID)
FROM asx;
From what I gather, a LIMIT option in GROUP_CONCAT is still a feature request.
There is also a GROUP_CONCAT limiting hack on Stack Overflow.
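As a sketch of the xaprb-style approach mentioned above (my own adaptation, untested against this data): keep an asxhist row only when fewer than 5 rows with the same histID have a later date, which bounds GROUP_CONCAT to the 5 most recent values. An extra index on (histID, date) would keep the correlated COUNT from scanning the whole table.
SELECT asx.code, asx.high, asx.low, asx.close,
  GROUP_CONCAT(h.average ORDER BY h.date DESC SEPARATOR ',') AS recent_averages
FROM asx
JOIN asxhist h ON h.histID = asx.histID
WHERE (SELECT COUNT(*)            -- how many rows are newer than h for this histID
       FROM asxhist newer
       WHERE newer.histID = h.histID
         AND newer.date > h.date) < 5
GROUP BY asx.histID;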

Average Time in MySQL from a row

Hi all, I have a table with two times in each row.
How do I get the average of the two times for each row with a SELECT?
E.g. 22:56:39 should be the result:
+---------------------+---------------------+
| Day_16              | Day_12              |
+---------------------+---------------------+
| NULL                | NULL                |
| NULL                | NULL                |
| NULL                | NULL                |
| 2011-01-16 23:52:34 | 2011-02-15 22:00:45 |
P.S. There is an ID for each row also.
SELECT SEC_TO_TIME((TIME_TO_SEC(Day_16) + TIME_TO_SEC(Day_12)) / 2) FROM Table1;
SELECT FROM_UNIXTIME(UNIX_TIMESTAMP(Day_12) + ((UNIX_TIMESTAMP(Day_16) - UNIX_TIMESTAMP(Day_12)) / 2)) FROM tablename
Edit: Ulvund's solution is much cleaner
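A quick check of the SEC_TO_TIME solution against the sample row, with the values inlined as literals (fractional-second display varies by MySQL version):
SELECT SEC_TO_TIME((TIME_TO_SEC('2011-01-16 23:52:34') + TIME_TO_SEC('2011-02-15 22:00:45')) / 2);
-- (85954 + 79245) / 2 = 82599.5 seconds, i.e. 22:56:39.5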

JOIN query is far too slow. Won't use INDEX?

I have a transitional table that I temporarily fill with some values before querying it and destroying it.
CREATE TABLE SearchListA(
`pTime` int unsigned NOT NULL ,
`STD` double unsigned NOT NULL,
`STD_Pos` int unsigned NOT NULL,
`SearchEnd` int unsigned NOT NULL,
UNIQUE INDEX (`pTime`,`STD` ASC) USING BTREE
) ENGINE = MEMORY;
It looks like this:
+------------+------------+---------+------------+
| pTime      | STD        | STD_Pos | SearchEnd  |
+------------+------------+---------+------------+
| 1105715400 | 1.58474499 |       0 | 1105723200 |
| 1106297700 |  2.5997839 |       0 | 1106544000 |
| 1107440400 | 2.04860375 |       0 | 1107440700 |
| 1107440700 | 1.58864998 |       0 | 1107467400 |
| 1107467400 | 1.55207218 |       0 | 1107790500 |
| 1107790500 | 2.04239417 |       0 | 1108022100 |
| 1108022100 | 1.61385678 |       0 | 1108128000 |
| 1108771500 | 1.58835083 |       0 | 1108771800 |
| 1108771800 | 1.65734727 |       0 | 1108772100 |
| 1108772100 | 2.09378189 |       0 | 1109027700 |
+------------+------------+---------+------------+
Only columns pTime and SearchEnd are relevant to my problem.
My intention is to use this table to speed up searching through a much larger, static table.
The first column, pTime, is where the search should start.
The fourth column, SearchEnd, is where the search should end.
The larger table is similar; it looks like this:
CREATE TABLE `b50d1_abs` (
`pTime` int(10) unsigned NOT NULL,
`Slope` double NOT NULL,
`STD` double NOT NULL,
`Slope_Pos` int(11) NOT NULL,
`STD_Pos` int(11) NOT NULL,
PRIMARY KEY (`pTime`),
KEY `Slope` (`Slope`) USING BTREE,
KEY `STD` (`STD`),
KEY `ID1` (`pTime`,`STD`) USING BTREE
) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=339331 MAX_ROWS=539331 PACK_KEYS=1 ROW_FORMAT=FIXED;
+------------+-------------+------------+-----------+---------+
| pTime      | Slope       | STD        | Slope_Pos | STD_Pos |
+------------+-------------+------------+-----------+---------+
| 1107309300 |  1.63257919 | 1.39241698 |         0 |       1 |
| 1107314400 |   6.8959276 | 0.22425643 |         1 |       1 |
| 1107323100 | 18.19909502 | 1.46854808 |         1 |       0 |
| 1107335400 |  2.50135747 |  0.4736305 |         0 |       0 |
| 1107362100 |  4.28778281 | 0.85576985 |         0 |       1 |
| 1107363300 |  6.96289593 | 1.41299044 |         0 |       0 |
| 1107363900 |  8.10316742 |  0.2859726 |         0 |       0 |
| 1107367500 | 16.62443439 | 0.61587645 |         0 |       0 |
| 1107368400 | 19.37918552 | 1.18746968 |         0 |       0 |
| 1107369300 | 21.94570136 | 0.94261744 |         0 |       0 |
| 1107371400 | 25.85701357 |  0.2741292 |         0 |       1 |
| 1107375300 | 21.98914027 | 1.59521158 |         0 |       1 |
| 1107375600 | 20.80542986 | 1.59231289 |         0 |       1 |
| 1107375900 | 19.62714932 | 1.50661679 |         0 |       1 |
| 1107381900 |  8.23167421 | 0.98048205 |         1 |       1 |
| 1107383400 | 10.68778281 | 1.41607579 |         1 |       0 |
+------------+-------------+------------+-----------+---------+
...etc (439340 rows)
Here, the columns pTime, STD, and STD_Pos are relevant to my problem.
For every element in the smaller table (SearchListA), I need to search the specified range within the larger table (b50d1_abs) and return the row with the lowest b50d1_abs.pTime that is higher than the current SearchListA.pTime and that also matches the following conditions:
SearchListA.STD < b50d1_abs.STD AND SearchListA.STD_Pos <> b50d1_abs.STD_Pos
AND
b50d1_abs.pTime < SearchListA.SearchEnd
The latter condition is simply to reduce the length of the search.
This seems to me like a pretty straightforward query that should be able to use indexes; especially since all values are unsigned numbers - But I cannot get it to execute nearly fast enough! I think it is because it rebuilds the entire table each time instead of just omitting values from it.
I would be extremely grateful if someone takes a look at my code and figures out a more efficient way to go about this:
SELECT
m.pTime as OpenTime,
m.STD,
m.STD_Pos,
mu.pTime AS CloseTime
FROM
SearchListA m
JOIN b50d1_abs mu ON mu.pTime =(
SELECT
md.pTime
FROM
b50d1_abs as md
WHERE
md.pTime > m.pTime
AND md.pTime <=m.SearchEnd
AND m.STD < md.STD AND m.STD_Pos <> md.STD_Pos
LIMIT 1
);
Here is my EXPLAIN EXTENDED statement:
+----+--------------------+-------+--------+-----------------+---------+---------+------+--------+----------+--------------------------+
| id | select_type        | table | type   | possible_keys   | key     | key_len | ref  | rows   | filtered | Extra                    |
+----+--------------------+-------+--------+-----------------+---------+---------+------+--------+----------+--------------------------+
|  1 | PRIMARY            | m     | ALL    | NULL            | NULL    | NULL    | NULL |    365 |   100.00 |                          |
|  1 | PRIMARY            | mu    | eq_ref | PRIMARY,ID1     | PRIMARY | 4       | func |      1 |   100.00 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | md    | ALL    | PRIMARY,STD,ID1 | NULL    | NULL    | NULL | 439340 |   100.00 | Using where              |
+----+--------------------+-------+--------+-----------------+---------+---------+------+--------+----------+--------------------------+
It looks like the lengthiest step (#2, the dependent subquery) doesn't use indexes at all!
If I try FORCE INDEX then it will list it under possible_keys, but still list NULL under Key and still take an extremely long time (over 80 seconds).
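(For reference, the hint placement: FORCE INDEX goes right after the table reference, as in FROM b50d1_abs AS md FORCE INDEX (ID1) WHERE ... — ID1 being one of the candidate index names from the DDL above.)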
I need to get this query under 10 seconds, and even 10 is too long.
Your subquery is a dependent subquery, so the best case is that it's going to be evaluated once for every row in table m. Since m contains few rows, that would be OK.
But if you put that subquery in a JOIN condition, it is going to be executed (rows in m)*(rows in mu) times, no matter what.
Note that your results may be incorrect, since you ask to
return the row with the lowest b50d1_abs.pTime
but you don't specify that anywhere: your subquery has LIMIT 1 without an ORDER BY.
Try this query:
SELECT
m.pTime as OpenTime,
m.STD,
m.STD_Pos,
(
SELECT min( big.pTime )
FROM b50d1_abs as big
WHERE big.pTime > m.pTime
AND big.pTime <= m.SearchEnd
AND m.STD < big.STD AND m.STD_Pos <> big.STD_Pos
) AS CloseTime
FROM SearchListA m
or this one:
SELECT
m.pTime as OpenTime,
m.STD,
m.STD_Pos,
min( big.pTime )
FROM
SearchListA m
JOIN b50d1_abs as big ON (
big.pTime > m.pTime
AND big.pTime <= m.SearchEnd
AND m.STD < big.STD AND m.STD_Pos <> big.STD_Pos
)
GROUP BY m.pTime
(if you also want rows where the search was unsuccessful, make that a LEFT JOIN).
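Spelled out, that LEFT JOIN variant would look roughly like this (unmatched searches come back with NULL for CloseTime):
SELECT
m.pTime as OpenTime,
m.STD,
m.STD_Pos,
min( big.pTime ) AS CloseTime  -- NULL when nothing in (pTime, SearchEnd] qualifies
FROM
SearchListA m
LEFT JOIN b50d1_abs as big ON (
big.pTime > m.pTime
AND big.pTime <= m.SearchEnd
AND m.STD < big.STD AND m.STD_Pos <> big.STD_Pos
)
GROUP BY m.pTime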
And a third one:
SELECT
m.pTime as OpenTime,
m.STD,
m.STD_Pos,
(
SELECT big.pTime
FROM b50d1_abs as big
WHERE big.pTime > m.pTime
AND big.pTime <= m.SearchEnd
AND m.STD < big.STD AND m.STD_Pos <> big.STD_Pos
ORDER BY big.pTime LIMIT 1
) AS CloseTime
FROM SearchListA m
Try an index on b50d1_abs(pTime, STD, STD_Pos).
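Spelled out as DDL, a sketch of that index (the name ID2 is my assumption, matching the ID2 that shows up in the EXPLAIN further down):
ALTER TABLE b50d1_abs ADD INDEX ID2 (pTime, STD, STD_Pos);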
FYI, here are some tests using Postgres on a test data set that should look like yours (maybe remotely, lol):
CREATE TABLE small (
pTime INT PRIMARY KEY,
STD FLOAT NOT NULL,
STD_POS BOOL NOT NULL,
SearchEnd INT NOT NULL
);
CREATE TABLE big(
pTime INTEGER PRIMARY KEY,
Slope FLOAT NOT NULL,
STD FLOAT NOT NULL,
Slope_Pos BOOL NOT NULL,
STD_POS BOOL NOT NULL
);
INSERT INTO small SELECT
n*100000,
random(),
random()<0.1,
n*100000+random()*50000
FROM generate_series( 1, 365 ) n;
INSERT INTO big SELECT
n*100,
random(),
random(),
random() > 0.5,
random() > 0.5
FROM generate_series( 1, 500000 ) n;
Query 1 : 6.90 ms (yes milliseconds)
Query 2 : 48.20 ms
Query 3 : 6.46 ms
I'll start a new answer cause it starts to look like a mess ;)
With your data I get, using MySQL 5.1.41:
Query 1 : takes forever, Ctrl-C
Query 2 : 520 ms
Query 3 : takes forever, Ctrl-C
EXPLAIN for 2 looks good:
+----+-------------+-------+------+---------------------+------+---------+------+--------+------------------------------------------------+
| id | select_type | table | type | possible_keys       | key  | key_len | ref  | rows   | Extra                                          |
+----+-------------+-------+------+---------------------+------+---------+------+--------+------------------------------------------------+
|  1 | SIMPLE      | m     | ALL  | PRIMARY,STD,ID1,ID2 | NULL | NULL    | NULL |    743 | Using temporary; Using filesort                |
|  1 | SIMPLE      | big   | ALL  | PRIMARY,ID1,ID2     | NULL | NULL    | NULL | 439340 | Range checked for each record (index map: 0x7) |
+----+-------------+-------+------+---------------------+------+---------+------+--------+------------------------------------------------+
So, I loaded your data into postgres...
Query 1 : 14.8 ms
Query 2 : 100 ms
Query 3 : 14.8 ms (same plan as 1)
In fact rewriting 2 as query 1 (or 3) fixes a little optimizer shortcoming and finds the optimal query plan for this scenario.
Would you recommend using Postgres over MySql for this scenario?
Speed is extremely important to me.
Well, I don't know why MySQL barfs so much on queries 1 and 3 (which are pretty simple and easy); in fact it should even beat Postgres (using an index-only scan), but apparently not, eh. You should ask a MySQL specialist!
I'm more used to Postgres... got fed up with MySQL a long time ago! If you need complex queries, Postgres usually wins big time (but you'll need to re-learn how to optimize and tune your new database)...