SQL WHERE clause with CAST or CONVERT doesn't work - mysql

I have a table in ArcGIS which contains numbers and dates. I need to filter these via an SQL query, but I can only change the where clause.
See here: https://services3.arcgis.com/rKOPqLnqVBkPP9th/arcgis/rest/services/Arbeitsmappe1/FeatureServer/0//query
Just enter 1=1 as the where clause and * as the out fields, and you will get all results.
I have to filter installierte_leistung which contains numbers in the following formats:
1.050,20 ; 18; 0,1 ; 1.230
and dates in the following format: 11.04.08
Desired filters:
installierte_leistung: I want to execute an SQL statement like: where (installierte_leistung BETWEEN '1' AND '2'). But the result also contains the 18. And if I ask for values greater than 10, it also shows me the 1.050,20.
I tried converting with CAST and CONVERT to decimal, signed, unsigned, integer and so on, but the query was always invalid. I tried 'number', number and "number", in lowercase and uppercase, and almost every conceivable combination. I get no results with CAST or CONVERT.
Same issue with the date. I want to filter by month, for instance between 01.2008 and 09.2009.
Could someone please help me? Thanks a lot!
Falk

I had a similar problem in the past with nested queries. Database-specific functions (such as CAST) don't work because ArcGIS Server is configured by default to accept only standardized queries. If you need database-specific queries, you have to set "standardizedQueries": "false" in the server settings; the bottom of this page explains how: http://resources.arcgis.com/en/help/main/10.2/index.html#//015400000641000000. That should work for you. Good luck.
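The symptom in the question (18 turning up for BETWEEN '1' AND '2') is what text comparison produces when the field is stored as a string. The sketch below only illustrates that effect and shows one possible client-side fallback in Python. The helper names are hypothetical, the dd.mm.yy reading of 11.04.08 is an assumption, and the server's collation may order strings differently from Python.

```python
from datetime import date, datetime


def parse_german_number(text: str) -> float:
    """Parse text like '1.050,20': '.' groups thousands, ',' is the decimal mark."""
    return float(text.strip().replace(".", "").replace(",", "."))


def parse_german_date(text: str) -> date:
    """Parse text like '11.04.08', assumed here to mean dd.mm.yy."""
    return datetime.strptime(text.strip(), "%d.%m.%y").date()


# Sample values taken from the question.
values = ["1.050,20", "18", "0,1", "1.230"]

# Compared as text, '18' sorts between '1' and '2', so it passes the filter.
as_text = [v for v in values if "1" <= v <= "2"]

# Compared as numbers, none of the sample values lies between 1 and 2.
as_numbers = [v for v in values if 1 <= parse_german_number(v) <= 2]

# A monthly range such as 01.2008 to 09.2009 becomes a plain date comparison.
in_range = date(2008, 1, 1) <= parse_german_date("11.04.08") <= date(2009, 9, 30)

print(as_text, as_numbers, in_range)
```

Filtering on the client only makes sense for small result sets; with standardized queries disabled, the conversion can stay in the where clause.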

Related

Find the Number of Books in each format

I have a query to run for coursework that has to show the number of books for each format.
Here is the table I am trying to query; format can be hardback, softback, audio or ecopy:
booktable
Here is the code I have tried; I am not sure how to expand it to include all format types:
SELECT format, COUNT(format) FROM book WHERE format = 'hardback' OR 'softback' OR 'audio' OR 'ecopy'
I know this is incorrect but it only shows the hardback format and how many hardback books are included.
I've decided to write an answer, because you must be wondering what happens in your query. I suppose you think it should either work or fail, but instead it works correctly for one format and shows none of the others. Why?
Your query works as follows: a where clause consists of a boolean expression. This can be multiple sub-expressions combined with AND and OR. Your sub-conditions are: format = 'hardback', 'softback', 'audio', 'ecopy'. Now, 'softback' is not really a condition; format = 'softback' would be. And here it gets weird. Rather than reporting a syntax error, MySQL wants a boolean, so it brashly converts your string.
It does so in two steps, because a string cannot be converted to a boolean directly, but a string can be converted to a number and a number to a boolean. Hence the DBMS first converts your string 'softback' into a number. That should fail, but evidently it doesn't. This is the second time we expect an error that never comes. MySQL takes the liberty of converting non-numeric strings into zero.
Then MySQL converts this number into a boolean. In MySQL true = 1 and false = 0. So you have: WHERE format = 'hardback' OR false OR false OR false. Thus you only get 'hardback' books and count these. As there is just one format you select, it can be shown along with the count. I don't know whether MySQL really detects that this is valid because the query only selects one format. I find it more likely that you are in MySQL's cheat mode (i.e. you haven't SET sql_mode = 'ONLY_FULL_GROUP_BY', which is a bad idea, because by working outside ONLY_FULL_GROUP_BY mode you tell MySQL to let certain invalid queries pass and muddle through). So MySQL sees there is a format to be selected, but it must choose which row to pick it from, and it muddles through by silently applying ANY_VALUE(format).
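The two-step conversion described above can be imitated in a few lines of Python. This is only an emulation of the rule as stated in this answer (leading numeric prefix, otherwise zero; zero is false), not MySQL's actual implementation, and both function names are made up for the illustration.

```python
import re


def to_number_like_mysql(text: str) -> float:
    """Emulate the described string-to-number rule: use the leading
    numeric prefix if there is one, otherwise fall back to 0."""
    match = re.match(r"\s*[+-]?\d+(\.\d+)?", text)
    return float(match.group()) if match else 0.0


def where_clause(fmt: str) -> bool:
    """Emulate: format = 'hardback' OR 'softback' OR 'audio' OR 'ecopy'."""
    return (
        fmt == "hardback"
        or to_number_like_mysql("softback") != 0  # 0 -> false
        or to_number_like_mysql("audio") != 0     # 0 -> false
        or to_number_like_mysql("ecopy") != 0     # 0 -> false
    )


for fmt in ["hardback", "softback", "audio", "ecopy"]:
    print(fmt, where_clause(fmt))
```

Only the first comparison ever depends on the row, which is why just the hardback rows survive the filter.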
What you want is an aggregation (count) with one result row per format. "Per ____" translates to GROUP BY ____ in SQL. So you want:
SELECT format, COUNT(*)
FROM book
GROUP BY format;
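To see the grouped query produce one row per format, here is a small sketch that runs it from Python against an in-memory SQLite database rather than MySQL. The title column and the sample rows are invented for the example; only the book table name and the format column come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the question only shows that book has a format column.
conn.execute("CREATE TABLE book (title TEXT, format TEXT)")
conn.executemany(
    "INSERT INTO book (title, format) VALUES (?, ?)",
    [
        ("A", "hardback"),
        ("B", "hardback"),
        ("C", "softback"),
        ("D", "audio"),
        ("E", "audio"),
        ("F", "ecopy"),
    ],
)

# One result row per distinct format, each with its own count.
counts = dict(conn.execute("SELECT format, COUNT(*) FROM book GROUP BY format"))
print(counts)
conn.close()
```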
You just need to add GROUP BY format at the end of the query.
You have to write a select query for each format, separately!

Snowflake interpreting timestamp wrong?

I'm loading a bunch of semi-structured data (JSON) into my database through Snowflake. The timestamp values in the entries are javascript timestamps that look like this:
"time": 1621447619899
Snowflake automatically converts this into a timestamp variable that looks like this:
53351-08-15 22:04:10.000.
All good so far. However, I think that the new timestamp is wrong. The actual datetime should be May 19, 2021 around 12pm MDT. Am I reading it wrong? Is it dependent on the timezone that my Snowflake instance is in?
When comparing the following options manually in SQL:
with x as (
SELECT parse_json('{time:1621447619899}') as var
)
SELECT var:time,
var:time::number,
var:time::varchar::timestamp,
1621447619899::timestamp,
'1621447619899'::timestamp,
var:time::timestamp
FROM x;
It appears that what you want to do is execute the following:
var:time::varchar::timestamp
Reviewing the documentation, it looks like to_timestamp expects the number as a string, so you need to cast to varchar first and then to timestamp; otherwise you get the result you are seeing.
The question suggests that Snowflake's conversion to "53351-08-15 22:04:10.000" is fine, but it doesn't look right to me.
When I try the input number in Snowflake I get this:
select '1621447619899'::timestamp;
-- 2021-05-19T18:06:59.899Z
That makes a lot more sense.
You'll need to provide more code or context for further debugging - but if you tell Snowflake to transform that number to a timestamp, you'll get the correct timestamp out.
See the rules that Snowflake uses here:
https://docs.snowflake.com/en/sql-reference/functions/to_timestamp.html#usage-notes
The ::timestamp cast handles string and numeric inputs differently: the string is added to 1970-01-01 as milliseconds (correct), whereas the numeric value is added as seconds, which returns a date far in the future, "53351-08-18 20:38:19.000".
SELECT TO_VARCHAR(1621447619899::timestamp) AS numeric_input
,'1621447619899'::timestamp AS string_input
numeric_input = 53351-08-18 20:38:19.000
string_input = 2021-05-19 18:06:59.899
Solutions are to convert to a string or divide by 1000:
SELECT TO_TIMESTAMP(time::string)
SELECT TO_TIMESTAMP(time/1000)
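The arithmetic behind the two readings can be checked outside Snowflake. The Python sketch below treats the value from the question once as milliseconds and once as seconds; the function names are made up, and the year estimate uses a mean year length, so it is only approximate.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60  # mean Gregorian year


def from_epoch_millis(ms: int) -> datetime:
    """Interpret the value as milliseconds since the Unix epoch (UTC)."""
    return EPOCH + timedelta(milliseconds=ms)


def approx_year_if_seconds(value: int) -> int:
    """Estimate the calendar year if the same value were read as seconds.
    Python's datetime stops at year 9999, so this is computed by hand."""
    return 1970 + int(value / SECONDS_PER_YEAR)


raw = 1621447619899  # the "time" value from the question

print(from_epoch_millis(raw).isoformat())  # 18:06 UTC is 12:06 MDT
print(approx_year_if_seconds(raw))
```

Read as milliseconds, the value lands on 19 May 2021; read as seconds, it lands in the year 53351, which matches the timestamps reported above.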

Reading negative numbers in a column

I'm using SSIS to separate good data from unusable data. To do that I used derived columns, a script task and a conditional split to which I assigned certain conditions. One of the conditions I need to apply is that none of the numbers in one column can be negative. I'm guessing that the best way to solve this would be a conditional split, but I cannot get it to work. I'm new to SSIS, so any help would be appreciated.
You'd have an Expression like
[MyCaseSensitiveColumnName] < 0
and then name the output path something like BadData_NegativeValue
From the comments
that is what I did before, but I'm getting an error saying that The data types "DT_WSTR" and "DT_I4" are incompatible for binary operator ">"
That error message indicates that you are attempting to compare a unicode string (DT_WSTR) and an integer (DT_I4) and that the expression language does not allow it.
To resolve this type incompatibility, you would need to first convert the value of MyCaseSensitiveColumnName from DT_WSTR to an integer.
I'd likely add a Derived Column Component to my data flow and create a new column called MyCaseSensitiveColumnNameAsInteger with an expression like
(DT_I4) [MyCaseSensitiveColumnName]
Now, that may be perilous depending on the quality of your source data. I don't know why you are pulling numeric data in as a string. If there could be non whole numbers in the data set, then we will need to check before making the cast. If there are NULLs in that dataset, those too may cause issues.
That would result in our conditional split check becoming
[MyCaseSensitiveColumnNameAsInteger] < 0
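The order of checks discussed above (NULLs first, then values that are not whole numbers, then the sign) can be sketched in Python. This is not SSIS expression syntax, just an illustration of the routing logic; apart from BadData_NegativeValue, the output path names are hypothetical.

```python
from typing import Optional


def classify(raw: Optional[str]) -> str:
    """Return the name of the output path for one raw string value."""
    # NULLs and empty strings cannot be cast, so route them out first.
    if raw is None or raw.strip() == "":
        return "BadData_Null"
    # Values such as '3.7' or 'abc' would break an integer cast.
    try:
        value = int(raw.strip())
    except ValueError:
        return "BadData_NotAnInteger"
    # Only now is the numeric comparison safe.
    return "BadData_NegativeValue" if value < 0 else "GoodData"


for sample in ["12", "-5", "3.7", None]:
    print(repr(sample), "->", classify(sample))
```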

How can I order a mySQL column by a number imbedded in a string?

I am parsing genomic positions from a MySQL field. The field is called "change" and the entries are of the form:
g.100214985T>C
g.100249769C>A
g.10185G>T
I am trying to order the field by the numerical portion of the string, and I am trying to figure out what MySQL query I can use to accomplish this. I have tried using REGEXPs and SUBSTRING_INDEX but am still running into issues. Any help would be much appreciated!
Assuming there are always 2 characters in front of the number and 3 after it that need to be removed:
SELECT CAST(SUBSTR(col from 3) AS UNSIGNED) AS value
FROM `my_table`
ORDER BY value
See also this SQL Fiddle: http://sqlfiddle.com/#!2/7bc0e/67
Thank you @MarcusAdams and @amoudhgz! The following code works:
CAST(SUBSTR(field, 3) AS UNSIGNED).
MySQL already stops the conversion at the first non-numerical character.
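For comparison, the same ordering can be sketched in Python: skip the "g." prefix, read digits until the first non-digit, and sort by that number. The function name is made up, and the regular expression assumes every entry starts with "g." as in the examples from the question.

```python
import re


def genomic_position(change: str) -> int:
    """Return the numeric position from a string like 'g.100214985T>C'."""
    match = re.match(r"g\.(\d+)", change)
    if match is None:
        raise ValueError(f"unexpected format: {change!r}")
    return int(match.group(1))


changes = ["g.100214985T>C", "g.100249769C>A", "g.10185G>T"]

# Sort by the embedded number instead of by the raw text.
ordered = sorted(changes, key=genomic_position)
print(ordered)
```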

Performance in rlike expression or alternate query?

I am doing a series of updates on some tables after I import them from tab-separated values. The data comes with dates in a format I do not like. I bring them in as strings, manipulate them so that they are in the same format as MySQL dates and then convert the column. Or sometimes not, but I want them to be like MySQL dates even if they are strings.
They start out like '1/4/2013 12:00:00 AM' or '11/4/2012 2:37:45 PM'.
I turn these into '2013-01-04' (usually, since times are present even when the original schema clearly specifies dates only) and '2012-11-04 14:37:45'.
I am using rlike. And this does not use indexes? Wow. That sucks.
But already, for each column, I have to use 4 updates to handle the different cases ('1/7', '2/13', '11/2', '12/24'). If I did these using like, it might take 16 different updates for each column....
And, if I am seeing it right, I cannot even get positional parameters out of the rlike expression, yes? You know, the part of the expression wrapped in parentheses that becomes $1 or $2....
So, it seems as though it is going to be quicker to pre-process the tsv file with perl. Really? Wow. Again, this sucks.
Any other suggestions? I cannot have this taking 3 hours every time I need to pull in the data.
Recall the classic 1997 quote from Jamie Zawinski:
Some people, when confronted with a problem, think "I know, I'll use regular expressions."
Now they have two problems.
Have you tried using STR_TO_DATE()? This is exactly for parsing nonstandard date/time strings into canonical datetime values.
If you try parsing with STR_TO_DATE() and the string doesn't match the expected format, the function returns NULL.
So you could try parsing in different formats, and return the first one that gives a non-null result.
UPDATE mytable
SET datecolumn = COALESCE(
STR_TO_DATE(stringcolumn, '%m/%d'),
STR_TO_DATE(stringcolumn, '%d/%m/%Y'),
...etc.
);
I can't tell what your different cases are. It might or might not be possible to cover all cases in one pass.
Another alternative is, as you say, to preprocess the raw data with Perl before you load it into MySQL. But even then, don't fight with regular expressions; use Date::Parse instead.
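The "try each format and keep the first that parses" idea also works as a preprocessing step. Here is a minimal sketch in Python rather than Perl, assuming the two input shapes quoted in the question (month/day/year, with or without a 12-hour time); the function name and the format list are illustrative and would need extending for other cases.

```python
from datetime import datetime
from typing import Optional

# Assumed input shapes, based on the examples in the question.
FORMATS = ("%m/%d/%Y %I:%M:%S %p", "%m/%d/%Y")


def normalize(text: str) -> Optional[str]:
    """Return the value as 'YYYY-MM-DD HH:MM:SS', or None if no format matches."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # try the next format, like the next COALESCE argument
        return parsed.strftime("%Y-%m-%d %H:%M:%S")
    return None


for sample in ["1/4/2013 12:00:00 AM", "11/4/2012 2:37:45 PM", "not a date"]:
    print(sample, "->", normalize(sample))
```

Returning None for unparseable input mirrors STR_TO_DATE() returning NULL, so bad rows can be counted or logged instead of silently loaded.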