How do I partition data in a column table in SnappyData?

I am unable to figure out the syntax to partition my 'column' table. Here is an example that fails for me, as do many variations on it.
CREATE TABLE SENSOR_DATA_COL_BY_YEAR USING column OPTIONS(PARTITION_BY year_num, buckets '11') AS (SELECT sensor_id,metric,collection_time,value,sensor_time,year AS year_num, month AS month_num from STAGING_1);
And... the error.
ERROR 38000: (SQLState=38000 Severity=-1)
(Server=172.31.8.115[1528],Thread[DRDAConnThread_34,5,gemfirexd.daemons])
The exception 'Invalid input 'C', expected dmlOperation, insert,
withIdentifier, select or put (line 1, column 1): CREATE TABLE
SENSOR_DATA_COL_BY_YEAR USING column OPTIONS(PARTITION_BY
year_num, buckets '11') AS (SELECT
sensor_id,metric,collection_time,value,sensor_time,year AS year_num,
month AS month_num from STAGING_1) ^;' was thrown while evaluating an
expression.

The column name specified in the PARTITION_BY clause should be in quotes: "year_num".
Modified query:
CREATE TABLE SENSOR_DATA_COL_BY_YEAR USING column OPTIONS(PARTITION_BY "year_num", buckets '11') AS (SELECT sensor_id,metric,collection_time,value,sensor_time,year AS year_num, month AS month_num from STAGING_1);

Related

MYSQL Column addition based on a condition in another Column

I am trying to add a column to the query output based on a condition in another column. There is a column for Enrollment_Status; if the value is 'Cancelled' or 'Disenrolled', then the new column should have a concatenated value.
I have written a query for this, but the output is not correct. I tried to write a CASE WHEN query, but it gives me the error "Sub-query is returning multiple rows", so I am currently using an IF condition instead.
The name of the table in the database is 'sbi'.
The output should be the concatenation of multiple columns, one of which is a date field that should take its minimum value.
IF((sbi.enrollment_status='Cancelled' OR sbi.enrollment_status='Disenrolled'), CONCAT(HCID, enrollment_status, created_at) IN (SELECT
CONCAT(hcid, enrollment_status, MIN(created_at))
FROM
sbi
WHERE sbi.enrollment_status='Cancelled' OR sbi.enrollment_status='Disenrolled'
Group By hcid),
'--') AS 'Disenroll Date'
The expected output is '000M99920Cancelled2019-08-28 00:00:00'
But currently the query generates '0' instead.
Attached is an image of the output that I received.

Error while converting mysql query to postgres

The following query works in MySQL:
SELECT
f.created_date as time_sec
,sum(f.value) as value
, date_format( f.created_date , '%a') as metric
FROM ck_view_fills as f
GROUP BY date_format(f.created_date, '%a' )
I have migrated my database to PostgreSQL and I am now converting my queries to match. My naive conversion looks like this:
SELECT
f.created_date as time_sec
,sum(f.value) as value
, to_char( f.created_date , "D") as metric
FROM ck_view_fills as f
GROUP BY to_char( f.created_date , "D")
This query is not accepted and the error message produced by PostgreSQL is the following:
Error in query (7): ERROR: column "f.created_date" must appear in the
GROUP BY clause or be used in an aggregate function LINE 2:
f.created_date as time_sec
As far as I can tell f.created_date is indeed used in the group by clause. I have also seen examples using this very syntax. So what is the cause of this error and how do I get around it?
Postgres is correct. In fact, your query would fail with the most recent versions of MySQL as well -- using the default settings.
Just use an aggregation function:
SELECT MIN(f.created_date) as time_sec,
SUM(f.value) as value,
TO_CHAR(f.created_date, 'D') as metric
FROM ck_view_fills f
GROUP BY to_char(f.created_date , 'D');
You should have used a similar construct in MySQL (regarding MIN() -- or MAX()).
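To see the MIN() fix in action without a PostgreSQL server, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. Two things are assumptions on my part: the sample rows are made up, and strftime('%w', ...) is SQLite's stand-in for to_char(..., 'D') (it numbers Sunday as 0, whereas PostgreSQL's 'D' numbers Sunday as 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for ck_view_fills; the rows are invented for illustration.
conn.execute("CREATE TABLE ck_view_fills (created_date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO ck_view_fills VALUES (?, ?)",
    [("2019-08-26", 10), ("2019-09-02", 20), ("2019-08-28", 5)],
)

# Every selected column is either aggregated or is the GROUP BY expression.
rows = conn.execute(
    """
    SELECT MIN(f.created_date) AS time_sec,
           SUM(f.value) AS value,
           strftime('%w', f.created_date) AS metric
    FROM ck_view_fills AS f
    GROUP BY strftime('%w', f.created_date)
    ORDER BY metric
    """
).fetchall()
print(rows)
```

The two Mondays collapse into one row carrying the earliest date, which is the value the bare f.created_date could not supply unambiguously.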

MySQL return summed values and a virtual column as (count - sum)

I have a table as follows:
log (log_id, log_success (bool), log_created)
I would like to SELECT and return three columns, date, success and no_success, where the last does not exist in the table, and aggregate them by day.
I have created this query:
SELECT
log_created as 'date'
COUNT(*) AS 'count',
SUM(log_success) AS 'success'
SUM('count' - 'success') AS 'no_success'
FROM send_log
GROUP BY DATE_FORMAT(log_created, '%Y-%m-%d');
Would I be able to achieve it with this query? Is my syntax correct?
Thanks.
You can't reuse an alias defined in the select within the same select clause, because it might not even have been defined when you go to access it. But you can easily repeat the logic:
SELECT
log_created AS date,
SUM(log_success) AS success,
COUNT(*) - SUM(log_success) AS no_success
FROM send_log
GROUP BY
log_created;
I don't know why you are calling DATE_FORMAT in the group by clause of your query. DATE_FORMAT is usually a presentation layer function, which you call because you want to view a date formatted a certain way. Since it appears that log_created is already a date, there is no need to call DATE_FORMAT on it when aggregating. You should not even need it in the select clause, because the default format for a MySQL date is already Y-m-d.
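If you want to check the COUNT(*) - SUM(log_success) idea quickly, here is a small sketch using an in-memory SQLite database from Python rather than MySQL. The sample rows are invented, log_created is stored as a plain 'YYYY-MM-DD' string, and I named the first output column log_date to avoid any clash with the date keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE send_log ("
    "log_id INTEGER PRIMARY KEY, log_success INTEGER, log_created TEXT)"
)
# Invented rows: two successes and one failure on the 1st, one failure on the 2nd.
conn.executemany(
    "INSERT INTO send_log (log_success, log_created) VALUES (?, ?)",
    [(1, "2019-08-01"), (0, "2019-08-01"), (1, "2019-08-01"), (0, "2019-08-02")],
)

# The aggregate expressions are repeated instead of referring to their aliases.
rows = conn.execute(
    """
    SELECT log_created AS log_date,
           SUM(log_success) AS success,
           COUNT(*) - SUM(log_success) AS no_success
    FROM send_log
    GROUP BY log_created
    ORDER BY log_created
    """
).fetchall()
print(rows)
```

Each day comes back as one row with its success and failure counts side by side.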
You must select DATE_FORMAT(log_created, '%Y-%m-%d') if you want to group by this.
You can also get the no_success counter with SUM(abs(log_success - 1)):
SELECT
DATE_FORMAT(log_created, '%Y-%m-%d') date,
SUM(log_success) log_success,
SUM(abs(log_success - 1)) no_success
FROM send_log
GROUP BY date;

Extracting last value of unique filtered column

As a beginner in MySQL I'm having some difficulties building a query. I want to extract the values of the second column (Fecha) in my table for every unique value in the first one (CodigoEst). My final goal is to know the last/most recent value of "Fecha".
My table looks like
Then I want to have the values of "Fecha" for any different value of "CodigoEst".
I have tried DISTINCT but this gives the list of unique values in CodigoEst, not the values in Fecha. I have also tried
SELECT DISTINCT `CodigoEst`,`Fecha` FROM temperatura_max ORDER BY `Fecha` DESC LIMIT 1
But this gives the last value of "Fecha" just for one value of "CodigoEst". Expected output would be something like
CodigoEst Fecha
7031 2010-10-31
8460 2012-01-15
3610 2010-12-31
where the values in "Fecha" are the most recent dates
Any suggestion would be welcome, thanks
Group by CodigoEst and select the max value:
SELECT CodigoEst, MAX(Fecha) mostRecent FROM temperatura_max GROUP BY CodigoEst
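A rough way to try the GROUP BY / MAX() approach is the sketch below, which uses an in-memory SQLite database from Python instead of MySQL. The latest date per station is taken from the expected output in the question; the earlier rows are invented so that each station has more than one date, and Fecha is stored as a 'YYYY-MM-DD' string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temperatura_max (CodigoEst INTEGER, Fecha TEXT)")
conn.executemany(
    "INSERT INTO temperatura_max VALUES (?, ?)",
    [
        (7031, "2010-09-30"), (7031, "2010-10-31"),   # earlier rows are made up
        (8460, "2011-12-31"), (8460, "2012-01-15"),
        (3610, "2010-11-30"), (3610, "2010-12-31"),
    ],
)

# One row per station, carrying the most recent date for that station.
rows = conn.execute(
    """
    SELECT CodigoEst, MAX(Fecha) AS mostRecent
    FROM temperatura_max
    GROUP BY CodigoEst
    ORDER BY CodigoEst
    """
).fetchall()
print(rows)
```

Unlike ORDER BY ... LIMIT 1, which keeps a single row overall, the grouping keeps one row for every CodigoEst.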
Use the MAX() to get the most recent data:
SELECT MAX(ColName) FROM Table
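Note that LAST_INSERT_ID() is not an alternative here: it returns the AUTO_INCREMENT value generated by the most recent INSERT on the current connection, not the latest value stored in a column.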

SQL INSERT INTO SELECT Statement Invalid use of group function

I've the following query:
INSERT INTO StatisticalConsultationAgreement VALUES (
queryType, entityCode, entityType, queryClass,queryTables,period,
COUNT(queryClass), SUM(numberRecords), SUM(recordsFound),
SUM(NorecordsFound), NOW(), 'system');
SELECT
MONTH(EndDateTimeProcessing),YEAR(EndDateTimeProcessing),
entityType,
entityCode,
queryType,
queryClass,
EndDateTimeProcessing as period
FROM agreementFile
WHERE
MONTH(EndDateTimeProcessing)=MONTH(DATE_SUB( CURDATE(), INTERVAL 1 MONTH ))
AND YEAR(EndDateTimeProcessing)=YEAR(CURDATE())
GROUP BY entityType,entitycode,queryType, queryClass;
When I run the query I get the following error:
Error code 1111, SQL state HY000: Invalid use of group function
Line 1, column 1
Executed successfully in 0,002 s.
Line 5, column 2
Why does this occur?
How do I fix it?
You are mixing a values statement with a select statement in insert. You only need select. This is my best guess on what you want:
INSERT INTO StatisticalConsultationAgreement
SELECT queryType, entityCode, entityType, queryClass,queryTables,period,
COUNT(queryClass), SUM(numberRecords), SUM(recordsFound),
SUM(NorecordsFound), NOW(), 'system'
FROM agreementFile
WHERE MONTH(EndDateTimeProcessing)=MONTH(DATE_SUB( CURDATE(), INTERVAL 1 MONTH )) AND
YEAR(EndDateTimeProcessing)=YEAR(CURDATE())
GROUP BY entityType, entitycode, queryType, queryClass;
However, you should also list the column names for StatisticalConsultationAgreement in the insert statement.
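To illustrate why listing the column names matters, here is a minimal sketch that runs against an in-memory SQLite database from Python. Both tables are cut-down, hypothetical versions of the real ones (the queryCount and totalRecords column names are my invention), and the target table deliberately declares its columns in a different order from the SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, reduced schemas; the real tables have more columns.
    CREATE TABLE agreementFile (
        entityCode TEXT, queryClass TEXT, numberRecords INTEGER);
    CREATE TABLE StatisticalConsultationAgreement (
        queryClass TEXT, entityCode TEXT,
        queryCount INTEGER, totalRecords INTEGER);
    INSERT INTO agreementFile VALUES
        ('E1', 'A', 10), ('E1', 'A', 5), ('E2', 'B', 7);
    """
)

# The explicit column list maps each SELECT expression to a named target
# column, so the table's physical column order no longer matters.
conn.execute(
    """
    INSERT INTO StatisticalConsultationAgreement
        (entityCode, queryClass, queryCount, totalRecords)
    SELECT entityCode, queryClass, COUNT(queryClass), SUM(numberRecords)
    FROM agreementFile
    GROUP BY entityCode, queryClass
    """
)

summary = conn.execute(
    "SELECT entityCode, queryClass, queryCount, totalRecords "
    "FROM StatisticalConsultationAgreement ORDER BY entityCode"
).fetchall()
print(summary)
```

Without the column list, the same SELECT would have put entityCode into the queryClass column and vice versa.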
You are not grouping by EndDateTimeProcessing, so when you run the INSERT it can't figure out which EndDateTimeProcessing value, from the grouped rows, it should take.
The solution is either to add it to your GROUP BY clause:
GROUP BY entityType,entitycode,queryType, queryClass, EndDateTimeProcessing;
Or to use an aggregate function such as MAX() or MIN().
EDIT
As Gordon Linoff said, you are also mixing a VALUES list with the query in the INSERT; every value should come from the query.
The right syntax should be:
INSERT INTO StatisticalConsultationAgreement
SELECT
'queryType', -- I don't know what the query type is, so I put it in single quotes
entityCode,
entityType,
queryClass,
queryTables,
MAX(EndDateTimeProcessing), -- period wrapped in the aggregate MAX(); it could instead be added to the GROUP BY below or put into another aggregate function
COUNT(queryClass), --
SUM(numberRecords), -- ASSUMING THOSE ARE COLUMNS IN agreementFile
SUM(recordsFound), --
SUM(NorecordsFound), --
NOW(),
'system'
FROM agreementFile
WHERE
MONTH(EndDateTimeProcessing)=MONTH(DATE_SUB( CURDATE(), INTERVAL 1 MONTH ))
AND YEAR(EndDateTimeProcessing)=YEAR(CURDATE())
GROUP BY entityType,entitycode,queryType, queryClass;
The fields MONTH(EndDateTimeProcessing) and YEAR(EndDateTimeProcessing) were removed from the query because I didn't know where those should go.