Convert a MySQL script to Presto involving DATE_FORMAT - mysql

I'm trying to convert this MySQL line:
if(DATE_FORMAT(y.first_endperiod,"%Y-%m-%d") = DATE_FORMAT(x.end_period,"%Y-%m-%d"), 1, 0) = 1
to PrestoDB. I have tried using date_format, date_parse, and to_char, and all of them return the following error:
An error has been thrown from the AWS Athena client. SYNTAX_ERROR: line 40:41: Column '%y-%m-%d' cannot be resolved.
I'm using Athena to query data from an S3 bucket. Any idea how to fix this?

It looks like you're comparing date/time by the date portion, so you should just be able to do this:
CAST(y.first_endperiod AS date) = CAST(x.end_period AS date)

In standard SQL, double quotes delimit identifiers, e.g. column names. So the format string in your query is interpreted as a column named %Y-%m-%d, which is why the error says the column cannot be resolved. A column with that name is unlikely, but technically it'd be a legal identifier in SQL.
You're probably accustomed to MySQL, in which by default double-quotes are used the same as single-quotes, to delimit a string literal. This is a non-standard feature of MySQL.
Switch to single-quotes around your string literals and it should fix your problem:
if(DATE_FORMAT(y.first_endperiod,'%Y-%m-%d') = DATE_FORMAT(x.end_period,'%Y-%m-%d'), 1, 0) = 1
See also Do different databases use different name quote?

Related

Match Numeric value after comma separated, concatenated by underscore values using MYSQL/MariaDB & REGEXP_SUBSTR

I have field column values stored like:
texta_123,textb_456
My SQL:
SELECT *
FROM mytable
WHERE 456 = REGEXP_SUBSTR(mytable.concatenated_csv_values, 'textb_(?<number>[0-9]+)')
NOTE: I'm aware there are multiple ways of doing this, but for the purposes of example I simplified my query substantially; the part I need to work is REGEXP_SUBSTR()
Effectively, I want to: "query results where an id equals the numeric value extracted after an underscore in a column with comma-separated values"
When I test my Regex, it seems to work fine.
However, in MySQL (technically, I'm using MariaDB 10.4.19), when I run the query I get a warning: "Warning: #1292 Truncated incorrect INTEGER value:textb_456"
You should seriously consider fixing your database design to not store unnormalized CSV data like this. As a temporary workaround, we can use REGEXP_REPLACE along with FIND_IN_SET:
SELECT *
FROM mytable
WHERE FIND_IN_SET(
'456',
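REGEXP_REPLACE(concatenated_csv_values, '[^,]*_', '')) > 0;
The regex trick used here strips the text prefix (everything up to and including the underscore) from each item, converting a CSV input of texta_123,textb_456 to just 123,456. A greedy pattern such as '^.*_' would not do this: it matches up to the last underscore and leaves only 456. Then, we can easily search for a given ID using FIND_IN_SET.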
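To check what the prefix-stripping step is meant to produce, here is a rough Python sketch of the same idea. The function names and the pattern `[^,]*_` are assumptions made for illustration, and Python's `re` is not guaranteed to behave identically to MariaDB's PCRE in every case.

```python
import re

def strip_prefixes(csv_value: str) -> str:
    # Remove everything up to and including the underscore in each item,
    # without ever crossing a comma.
    return re.sub(r"[^,]*_", "", csv_value)

def find_in_set(needle: str, csv_value: str) -> bool:
    # Rough equivalent of FIND_IN_SET(...) > 0 for comma-separated values.
    return needle in csv_value.split(",")

ids = strip_prefixes("texta_123,textb_456")
print(ids)
print(find_in_set("456", ids))
```

Running the transformation outside the database first is a cheap way to confirm the pattern before putting it in a WHERE clause.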

MySQL 5.7 - Query to set the value of a JSON key to a JSON Object

Using MySQL 5.7, how can I set the value of a JSON key in a JSON column to a JSON object rather than a string?
I used this query:
SELECT json_set(profile, '$.twitter', '{"key1":"val1", "key2":"val2"}')
from account WHERE id=2
Output:
{"twitter": "{\"key1\":\"val1\", \"key2\":\"val2\"}", "facebook": "value", "googleplus": "google_val"}
But it seems like it considers it as a string since the output escapes the JSON characters in it. Is it possible to do that without using JSON_OBJECT()?
There are a few options that I know of:
Use the JSON_UNQUOTE function to unquote the output (i.e. not cast it to a string), as described in the MySQL documentation.
Possibly use the ->> operator and select a specific path, also described in the documentation.
Disable backslashes as an escape character. This has a lot of implications and I haven't tried it, so I don't know whether it works, but it's mentioned in the docs.
On balance, I'd either use the ->> operator, or handle the conversion on the client side, depending on what you want to do.
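If the conversion is handled on the client side, it could look roughly like this in Python. This is only a sketch: `decode_nested_json` is a hypothetical helper, and the sample document is the output shown in the question.

```python
import json

def decode_nested_json(doc_text: str, key: str) -> dict:
    """Parse a JSON document and, if `key` holds a JSON-encoded string,
    decode that string into a real object."""
    doc = json.loads(doc_text)
    value = doc.get(key)
    if isinstance(value, str):
        # Raises json.JSONDecodeError if the string is not valid JSON.
        doc[key] = json.loads(value)
    return doc

# Raw string so the backslash-escaped quotes reach the JSON parser unchanged.
row = r'{"twitter": "{\"key1\":\"val1\", \"key2\":\"val2\"}", "facebook": "value", "googleplus": "google_val"}'
profile = decode_nested_json(row, "twitter")
print(profile["twitter"]["key1"])
```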

MySQL group and merge JSON values

I am using some native JSON fields to store information about some application entities in a MySQL 5.7.10 database. I can have 'N' rows per "entity" and need to roll-up and merge the JSON objects together, and any conflicting keys should replace instead of merge. I can do this through code, but if I can do it natively and efficiently in MySQL even better.
I have attempted this using a combination of GROUP_CONCAT and JSON_MERGE, but I've run into two issues:
JSON_MERGE won't take the results of GROUP_CONCAT as a valid argument
JSON_MERGE combines conflicting keys instead of replacing them. What I really need is more of a JSON_SET but with 'N' number of JSON docs instead of "key, value" notation.
Is this possible with the current MySQL JSON implementation?
First of all, GROUP_CONCAT only returns a string, so you have to cast it. Second of all, there is a function doing exactly what you want called JSON_MERGE_PATCH(). Try the following:
SELECT
JSON_MERGE_PATCH(
yourExistingJson,
CAST(
CONCAT(
'[',GROUP_CONCAT(myJson),']'
)
AS JSON)
) AS myJsonArray
....
I just noticed your version. You would have to upgrade to 5.7.22 or higher. Is that possible in your case? If not, there may be other ways, but they won't be elegant :(
You could do something like the following:
SELECT
CAST(CONCAT(
'[',
GROUP_CONCAT(
DISTINCT JSON_OBJECT(
'foo', mytable.foo,
'bar', mytable.bar
)
),
']'
) AS JSON) AS myJsonArr
FROM mytable
GROUP BY mytable.someGroup;
JSON_MERGE won't take the results of GROUP_CONCAT as a valid argument
GROUP_CONCAT gives a,b,c,d, not a JSON array. Use JSON_ARRAYAGG (introduced in MySQL 5.7.22), which works just like GROUP_CONCAT but gives a correct array ["a", "b", "c", "d"] that the JSON functions should accept.
Prior to 5.7.22, you need to use a workaround:
cast(
concat('["', -- begin bracket and quote
group_concat(`field` separator '", "'), -- separator comma and quotes
'"]' -- end quote and bracket
) as json
)
JSON_MERGE combines conflicting keys instead of replacing them. What I really need is more of a JSON_SET but with 'N' number of JSON docs instead of "key, value" notation.
Use JSON_MERGE_PATCH instead, as introduced in MySQL 5.7.22. JSON_MERGE is a synonym for JSON_MERGE_PRESERVE.
See https://dev.mysql.com/doc/refman/5.7/en/json-function-reference.html.
Read my Best Practices for using MySQL as JSON storage.
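If upgrading is not an option and the roll-up has to happen in application code, the replace-on-conflict behaviour can be sketched in Python. This assumes the semantics of RFC 7396 (JSON Merge Patch), which is what JSON_MERGE_PATCH is documented to follow; `merge_patch` and `merge_rows` are hypothetical names.

```python
from functools import reduce

def merge_patch(target, patch):
    """RFC 7396-style merge: objects merge recursively, any other value
    in `patch` replaces the target, and null (None) removes the key."""
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result

def merge_rows(docs):
    # Fold N already-decoded JSON documents into one; later rows win.
    return reduce(merge_patch, docs, {})

rows = [{"a": 1, "b": {"x": 1}}, {"a": 2, "b": {"y": 2}}]
print(merge_rows(rows))
```

The order of the rows matters here, so the query feeding this would need an explicit ORDER BY.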
Aggregation of JSON Values
For aggregation of JSON values, SQL NULL values are ignored as for other data types. Non-NULL values are converted to a numeric type and aggregated, except for MIN(), MAX(), and GROUP_CONCAT(). The conversion to number should produce a meaningful result for JSON values that are numeric scalars, although (depending on the values) truncation and loss of precision may occur. Conversion to number of other JSON values may not produce a meaningful result.
I just found this in the MySQL docs.

MySQL for replace with wildcard

I'm trying to write a SQL update to replace a specific xml node with a new string:
UPDATE table
SET Configuration = REPLACE(Configuration,
"<tag>%%ANY_VALUE%%</tag>",
"<tag>NEW_DATA</tag>");
So that
<root><tag>SDADAS</tag></root>
becomes
<root><tag>NEW_DATA</tag></root>
Is there a syntax I'm missing for this type of request?
Update: MySQL 8.0 has a function REGEXP_REPLACE().
Below is my answer from 2014, which still applies to any version of MySQL before 8.0:
REPLACE() does not have any support for wildcards, patterns, regular expressions, etc. REPLACE() only replaces one constant string for another constant string.
You could try something complex, to pick out the leading part of the string and the trailing part of the string:
UPDATE table
SET Configuration = CONCAT(
SUBSTR(Configuration, 1, LOCATE('<tag>', Configuration)+4),
'NEW_DATA',
SUBSTR(Configuration, LOCATE('</tag>', Configuration))
);
But this doesn't work for cases when you have multiple occurrences of <tag>.
You may have to fetch the row back into an application, perform string replacement using your favorite language, and post the row back. In other words, a three-step process for each row.
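The string-replacement step of that process might look like this in Python. It is a sketch only: `replace_tag` is a hypothetical helper, and a regular expression is used on the assumption that `<tag>` elements are simple and never nested; a real XML parser would be safer for anything more complex.

```python
import re

# Non-greedy match so each <tag>...</tag> pair is handled separately.
TAG_PATTERN = re.compile(r"<tag>.*?</tag>", re.DOTALL)

def replace_tag(config: str, new_value: str) -> str:
    # A function replacement avoids backslash-escape surprises in new_value.
    return TAG_PATTERN.sub(lambda match: f"<tag>{new_value}</tag>", config)

print(replace_tag("<root><tag>SDADAS</tag></root>", "NEW_DATA"))
```

Unlike the CONCAT/SUBSTR approach, this replaces every occurrence of the tag in the string.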

Single Quotes in MySQL queries

If I have a MySQL query like:
SELECT this FROM that WHERE id='10'
and
SELECT this FROM that WHERE id=10
both seem to work correctly.
What is the use of the single speech marks in MySQL queries? When is it correct to use them?
When MySQL performs the query, there is an implicit conversion of the argument.
If id is INT, then '10' is cast to an integer.
If id is VARCHAR or another text type, 10 is cast to string.
In both cases both queries will work (unless you are running in STRICT mode).
From a performance point of view, you have to use the right data type (do not use quotes for integer arguments) - the implicit cast adds overhead and in some cases, it may hurt the performance of index lookups.
From a security perspective, it is easier to always use quotes together with mysql_real_escape_string (if the argument is not quoted, mysql_real_escape_string won't stop an attack that uses no quotes, for example 'UNION SELECT password FROM users'). However, a better approach is to cast your variable to int when it's expected to be an int, or to use prepared statements.
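A minimal sketch of the cast-then-bind approach, written in Python rather than PHP. It uses the standard-library sqlite3 module purely as a stand-in so the example is self-contained; a MySQL driver would follow the same pattern but typically with a different placeholder syntax. `find_by_id` is a hypothetical helper, and the table and column names are the ones from the question.

```python
import sqlite3

def find_by_id(conn, raw_id):
    # int() raises ValueError for input such as "10 UNION SELECT ...",
    # so malformed values never reach the database.
    user_id = int(raw_id)
    # The value is bound as a parameter, never spliced into the SQL text.
    cursor = conn.execute("SELECT this FROM that WHERE id = ?", (user_id,))
    return cursor.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE that (id INTEGER PRIMARY KEY, this TEXT)")
conn.execute("INSERT INTO that VALUES (10, 'hello')")
print(find_by_id(conn, "10"))
```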
If the value is a string, you have to use ' or ".
If the value is a number, like in your example, you don't have to use ', but MySQL handles it if you wrap the number in quotes.
Assuming that id is a numeric column, what happens is that MySQL casts your parameter to a number automatically so the data types match before comparing. It works flawlessly unless the cast produces unexpected results. E.g., these expressions will match the row with id=10 because all the strings cast to 10:
id='10'
id=' 10'
id='00010'
id='10foo'
The following will not match the row because non-parseable strings cast to 0 and 10<>0:
id='foo10'
id='bar'
When to use each? If you want a string, you need to quote it (there's no other way to type a string and get valid SQL). If you want a number, it must be unquoted (otherwise, you'll get a string that happens to contain a number). Of course, you can always provide numbers as strings and let MySQL do the conversion, but it doesn't really add anything to the query apart from one extra step and possibly incorrect results that go unnoticed.
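The casting behaviour described above can be approximated in Python to see why those examples match or fail. This is a simplified sketch and not MySQL's actual implementation: it only handles an optional sign and leading integer digits, whereas MySQL's conversion also deals with decimals and exponents. `mysql_like_to_int` is a hypothetical name.

```python
import re

# Optional leading whitespace, optional sign, then the leading digits.
_NUMERIC_PREFIX = re.compile(r"\s*[+-]?\d+")

def mysql_like_to_int(text: str) -> int:
    """Return the leading integer of `text`, or 0 if there is none."""
    match = _NUMERIC_PREFIX.match(text)
    return int(match.group()) if match else 0

for candidate in ["10", " 10", "00010", "10foo", "foo10", "bar"]:
    print(repr(candidate), mysql_like_to_int(candidate) == 10)
```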
You should always use them. They can help to stop SQL injection attacks because mysql_real_escape_string isn't enough on its own.
That is assuming you are running a query via PHP.