Amazon Athena - map column name to value with Java SDK - aws-sdk

After calling the GetQueryResults SDK method, I was wondering how I should map between the column name and the actual value in each row.
Can I assume that the column info list (under ResultSetMetadata) is ordered the same way as the values in the list of Datum (under the Row)?
If not, how can I map between the key (column name) and the value (the actual value of the column in a specific row)?
Thanks

Yes, you can. The documentation doesn't say so explicitly, but that's how it works. It would be impossible for Athena to return the results to you in any other way because of the underlying storage format of the result, which is CSV.
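As a minimal sketch of that positional mapping, the helper below pairs the i-th column name with the i-th value of a row. It uses plain lists so that it runs without the AWS SDK; in real code the names would come from the column info list under ResultSetMetadata and the values from the list of Datum under each Row. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the AWS SDK.
class RowMapper {

    // Pairs column names with row values by position.
    // LinkedHashMap keeps the original column order.
    static Map<String, String> mapRow(List<String> columnNames, List<String> values) {
        if (columnNames.size() != values.size()) {
            throw new IllegalArgumentException(
                "column count " + columnNames.size() + " != value count " + values.size());
        }
        Map<String, String> mapped = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.size(); i++) {
            mapped.put(columnNames.get(i), values.get(i));
        }
        return mapped;
    }

    public static void main(String[] args) {
        // Stand-ins for the names and values extracted from the query result.
        List<String> columns = List.of("id", "name", "city");
        List<String> row = List.of("1", "Alice", "Berlin");
        System.out.println(mapRow(columns, row)); // {id=1, name=Alice, city=Berlin}
    }
}
```

The size check is there to fail fast if the two lists ever disagree, rather than silently pairing values with the wrong columns.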

Related

NiFi - QueryDatabaseTable processor. How to query rows which are modified?

I am working on a NiFi data flow where my use case is to fetch MySQL table data and put it into HDFS or the local file system.
I have built a pipeline using QueryDatabaseTable -> ConvertRecord -> PutFile.
My table schema: id, name, city, Created_date
Files arrive at the destination when I insert new records into the table.
However, when I update existing rows, the processor does not fetch those records. It looks like it has some limitation.
My question is: how do I handle this scenario? Do I need another processor, or do I need to update some property?
Please, could someone help?
@Bryan Bende
QueryDatabaseTable Processor needs to be informed which columns it can use to identify new data.
A serial id or created timestamp is not sufficient.
From the documentation:
Maximum-value Columns:
A comma-separated list of column names. The processor will keep track of the maximum value for each column that has been returned since the processor started running. Using multiple columns implies an order to the column list, and each column's values are expected to increase more slowly than the previous columns' values. Thus, using multiple columns implies a hierarchical structure of columns, which is usually used for partitioning tables. This processor can be used to retrieve only those rows that have been added/updated since the last retrieval. Note that some JDBC types such as bit/boolean are not conducive to maintaining maximum value, so columns of these types should not be listed in this property, and will result in error(s) during processing. If no columns are provided, all rows from the table will be considered, which could have a performance impact. NOTE: It is important to use consistent max-value column names for a given table for incremental fetch to work properly.
Judging by the table schema, there is no way in SQL to tell whether a row was updated.
There are many ways to solve this. In your case, the easiest thing to do might be to rename the Created_date column to a modified column and set it to now() on updates,
or to work with a second timestamp column.
So, for instance,
| stamp_updated | timestamp | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
is the new column. In the processor, you use the stamp_updated column to identify new and updated rows.
Don't forget to set Maximum-value Columns to that column.
So what I am basically saying is:
if you cannot tell from SQL yourself that a record is new or changed, NiFi cannot either.
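To illustrate why the updates are missed, here is a rough sketch of the maximum-value idea from the documentation quoted above: remember the largest value seen so far and return only rows above it. It is a simplified model under that assumption, not NiFi's implementation, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of incremental fetch by a maximum-value column.
// Not NiFi code; all names here are hypothetical.
class MaxValueTracker {
    private long maxSeen = Long.MIN_VALUE;

    // Returns the positions of rows whose tracked value exceeds the stored
    // maximum, then advances the stored maximum.
    List<Integer> fetchNew(long[] trackedColumn) {
        List<Integer> picked = new ArrayList<>();
        long newMax = maxSeen;
        for (int i = 0; i < trackedColumn.length; i++) {
            if (trackedColumn[i] > maxSeen) {
                picked.add(i);
                newMax = Math.max(newMax, trackedColumn[i]);
            }
        }
        maxSeen = newMax;
        return picked;
    }

    public static void main(String[] args) {
        MaxValueTracker tracker = new MaxValueTracker();
        // Two rows inserted with timestamps 100 and 200: both are fetched.
        System.out.println(tracker.fetchNew(new long[] {100, 200})); // [0, 1]
        // Row 0 is updated, but a creation timestamp does not change: nothing is fetched.
        System.out.println(tracker.fetchNew(new long[] {100, 200})); // []
        // If the tracked column behaved like stamp_updated, the update would
        // bump row 0 to 300 and the row would be fetched again.
        System.out.println(tracker.fetchNew(new long[] {300, 200})); // [0]
    }
}
```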

Two-way table in KNIME?

I am new to KNIME and I have a question. I have a Column Splitter node that outputs one column and one row, so the table has a single value in a single cell. I want to feed this value into a column of another table in KNIME. How do I do this?
I don't see two-way tables in KNIME.
You can use the Cross Joiner node to append the constant column. (There is also the Table Row to Variable and Constant Value Column combination if the constant value is one of the primitive types (String, Double, Int).)
You may need the RowID node to replace/restore the row ids.

How to store the result of a SQL statement as a variable and use the result in an SSIS Expression?

I am using an SSIS Data Flow Task to transfer data from one table to another. Column A in Table A contains a number, the last 3 digits of which I want to store in Column B of Table B.
First, I'm trying to grab all of the data in Column A and store it in a variable via a simple SELECT statement: SELECT COLUMN_A FROM TABLE_A. However, the variable stores the statement as a string when I want the result set of the query. I have set the EvaluateAsExpression property to False, but to no avail.
Secondly I want to be able to use the result of this query in the Derived Column of my Data Flow to extract the last 3 digits and store the values in Column_B in the destination. The expression I have is:
(DT_STR,3,1252)RIGHT(@[User::VariableName],3)
I want to store this as a string hence the (DT_STR,3,1252) data type.
All I'm getting so far in Column_B of Table_B is the last 3 characters of the SELECT statement, "E_A". There is a lot of useful information on the web, including YouTube videos, for things like setting file paths and server names as parameters or variables, but I can't find much that is relevant to the specifics of my query.
I have used an Execute SQL Task to insert row counts from flat files but, in this example, I want to use the Derived Column tool instead.
What am I doing wrong? Any help is gratefully appreciated.
I prefer to do all the work in SQL if you aren't doing anything else with that number.
select right(cast(ColA as varchar(20)),3) from tableA
-- you can add another cast if you want it to be an int
Use that in an Execute SQL Task with ResultSet set to Single row.
Map that to a variable.
In a derived column in data flow you can set that variable to the new column.
Thanks, KeithL, that's one solution I will use in future, but I found another.
I dropped the variable and, in the Expression box of the Transformation Editor, used:
(DT_STR,3,1252)RIGHT((DT_STR,3,1252)Column_A,3)
In my question, I failed to cast Column_A from Table_A as a string. The first use of (DT_STR,3,1252) simply sets the destination column as a string so as not to use the same data type as the source, which in my case was int.
It's the second use of (DT_STR,3,1252) that actually casts Column_A from int to a string.

Is there a way to get an example value from the database that can be inserted into a column in MySQL?

I'm working on a tool that will replace values in specified columns with some other values. The user specifies the needed table columns in a config and the tool replaces their values. I can get the type of a column from information_schema. Is there a way to get a valid value for this column, or do I need to specify one myself for each column type?

SQL Query for retrieving items and their data stored as key value pairs

I'm setting up a system to store different types of items that can have different data types associated with them but I'm unsure how I should query the database to return items and all of their data.
I am using MySQL.
This is the model of the tables that I'm using.
Items have an associated record ID (rID) and the records table stores key-value pairs containing the data (datatype as key, value as value).
Ideally I would be able to query the database for items where a certain record's value is equal to x and it would return a single row for each item with a field for each data type's value and its data type ID (I'm not sure whether this is possible or not though).
As an added constraint, new item types and data types may be added at any time, so a query that allows for this would make things much easier.
Any help would greatly be appreciated.
Thanks.