How to select all the column names from a file (e.g. CSV, JSON, or Parquet) in Apache Drill

I am using Drill 1.8 (the latest version) in embedded mode on Windows 10.
I have a Drill query that selects all the columns of a SQL Server database table:
SELECT DISTINCT
  info.COLUMN_NAME AS `NAME`,
  info.TABLE_SCHEMA AS `TABLESCHEMA`,
  info.TABLE_NAME AS `TABLENAME`,
  info.ORDINAL_POSITION AS `POSITION`,
  info.IS_NULLABLE AS `ISNULLABLE`,
  info.DATA_TYPE AS `DATATYPE`,
  tc.CONSTRAINT_TYPE AS `CONSTRAINTTYPE`,
  kcufk.TABLE_SCHEMA AS `REFRENCESCHEMA`,
  kcufk.TABLE_NAME AS `REFRENCETABLE`,
  kcufk.COLUMN_NAME AS `REFRENCECOLUMN`
FROM DemoSQLServer.INFORMATION_SCHEMA.`COLUMNS` info
LEFT OUTER JOIN DemoSQLServer.INFORMATION_SCHEMA.`KEY_COLUMN_USAGE` kcu
  ON kcu.COLUMN_NAME = info.COLUMN_NAME AND kcu.TABLE_NAME = info.TABLE_NAME
LEFT OUTER JOIN DemoSQLServer.INFORMATION_SCHEMA.`TABLE_CONSTRAINTS` tc
  ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME AND tc.TABLE_NAME = kcu.TABLE_NAME
LEFT OUTER JOIN DemoSQLServer.INFORMATION_SCHEMA.`REFERENTIAL_CONSTRAINTS` rk
  ON rk.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
LEFT OUTER JOIN DemoSQLServer.INFORMATION_SCHEMA.`KEY_COLUMN_USAGE` kcufk
  ON kcufk.CONSTRAINT_NAME = rk.UNIQUE_CONSTRAINT_NAME
WHERE info.TABLE_NAME = 'Attribute' AND info.TABLE_SCHEMA = 'dbo'
ORDER BY info.ORDINAL_POSITION ASC;
This query returns all the columns of the Attribute table.
I want to select the column names from files as well (e.g. CSV, Parquet, JSON, etc.).
Is this possible using Drill?

You can select columns from files in much the same way as from tables.
Sample query:
select N_NAME,N_REGIONKEY from dfs.`<drill-home>/sample-data/nation.parquet`;
P.S. Make sure the dfs storage plugin is enabled.
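To see which columns a file actually contains, one option is to let Drill discover them at read time. A minimal sketch, assuming the dfs plugin is enabled; the CSV path below is hypothetical, and the `columns` array applies to delimited text files that are read without a header row:

```sql
-- Parquet and JSON are self-describing, so the result header of SELECT *
-- shows the column names Drill found in the file.
SELECT *
FROM dfs.`<drill-home>/sample-data/nation.parquet`
LIMIT 1;

-- A CSV file read without a header row is exposed as a single array
-- named `columns`; individual fields are addressed by position.
-- The path below is a placeholder, not a real sample file.
SELECT columns[0] AS first_field, columns[1] AS second_field
FROM dfs.`/path/to/your/file.csv`
LIMIT 5;
```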

Related

mysql group by partially

Hi, I've written this code:
select `wp_term_taxonomy`.*, count(*)
from `wp_posts`
inner join `wp_term_relationships` on `wp_term_relationships`.`object_id` = `wp_posts`.`ID`
inner join `wp_term_taxonomy` on `wp_term_taxonomy`.`term_taxonomy_id` = `wp_term_relationships`.`term_taxonomy_id`
where `post_type` = 'product'
and exists(select *
from `wp_term_taxonomy`
inner join `wp_term_relationships` on `wp_term_taxonomy`.`term_taxonomy_id` = `wp_term_relationships`.`term_taxonomy_id`
where `wp_posts`.`ID` = `wp_term_relationships`.`object_id`
and `wp_term_taxonomy`.`term_taxonomy_id` in (401)
and `taxonomy` = 'product_cat')
and wp_posts.ID = wp_term_relationships.object_id
group by `wp_term_taxonomy`.`term_taxonomy_id`
Here is the result:
In the taxonomy column every value is what I want except pa_shirt_size (rows 22 and 23). For pa_% items I want only one row per taxonomy, no matter how many rows there are or which one gets chosen: one row for pa_shirt_size, one for pa_shirt_color, and so on, basically one row for each pa_% value in taxonomy. How is this possible? (I don't want to use a separate query.)
Try this:
GROUP BY CASE WHEN `wp_term_taxonomy`.`taxonomy` LIKE 'pa_%'
THEN `wp_term_taxonomy`.`taxonomy`
ELSE `wp_term_taxonomy`.`term_taxonomy_id`
END
The ONLY_FULL_GROUP_BY SQL mode must be disabled for this to work. If it is enabled, wrap the selected columns in ANY_VALUE() or another aggregate function.
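As a rough sketch of the ANY_VALUE() variant (MySQL 5.7 or later), reduced to the wp_term_taxonomy table alone for brevity. The aliases tt and cnt are only illustrative, and the pattern is written as 'pa\_%' so that the underscore is matched literally rather than as a single-character wildcard:

```sql
-- One row per pa_* taxonomy, one row per term_taxonomy_id for everything else.
-- ANY_VALUE() tells MySQL that any row of the group may supply the value,
-- which keeps the query valid under ONLY_FULL_GROUP_BY.
SELECT ANY_VALUE(tt.term_taxonomy_id) AS term_taxonomy_id,
       ANY_VALUE(tt.taxonomy) AS taxonomy,
       COUNT(*) AS cnt
FROM wp_term_taxonomy tt
GROUP BY CASE WHEN tt.taxonomy LIKE 'pa\_%'
              THEN tt.taxonomy
              ELSE tt.term_taxonomy_id
         END;
```

The joins and the EXISTS filter from the question would go back in between FROM and GROUP BY.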

SQL Inner Query WHERE clause access to Outer Query tables

Good morning -
This is my first post here, after many years using SO as a very useful resource.
I've run into a problem with a complex (for me) query I'm putting together for a WordPress site running WooCommerce to process orders. I'm trying to add a filter to the order list that shows only orders containing products in a particular product category.
I'm afraid I've gotten in over my head with this query, which joins several meta tables in inner queries to get at the information I need to determine the product's category.
The problem is that I can't get the scoping rules to work so that the inner queries can access the outer table information they need.
The query is:
SELECT SQL_CALC_FOUND_ROWS
wp_ot6q6i_posts.ID
FROM
wp_ot6q6i_posts
WHERE
1 = 1 AND YEAR(wp_ot6q6i_posts.post_date) = 2015 AND MONTH(wp_ot6q6i_posts.post_date) = 12 AND wp_ot6q6i_posts.post_type = 'shop_order' AND(
(
wp_ot6q6i_posts.post_status = 'wc-pending' OR wp_ot6q6i_posts.post_status = 'wc-processing' OR wp_ot6q6i_posts.post_status = 'wc-on-hold' OR wp_ot6q6i_posts.post_status = 'wc-completed' OR wp_ot6q6i_posts.post_status = 'wc-cancelled' OR wp_ot6q6i_posts.post_status = 'wc-refunded' OR wp_ot6q6i_posts.post_status = 'wc-failed'
)
) AND EXISTS(
SELECT
t2.PROD_ID
FROM
(
SELECT
wp_ot6q6i_woocommerce_order_itemmeta.meta_value AS PROD_ID
FROM
wp_ot6q6i_woocommerce_order_items
LEFT JOIN
wp_ot6q6i_woocommerce_order_itemmeta
ON
wp_ot6q6i_woocommerce_order_itemmeta.order_item_id = wp_ot6q6i_woocommerce_order_items.order_item_id
WHERE
wp_ot6q6i_woocommerce_order_items.order_item_type = 'line_item' AND wp_ot6q6i_woocommerce_order_itemmeta.meta_key = '_product_id' AND wp_ot6q6i_posts.ID = wp_ot6q6i_woocommerce_order_items.order_id
) t1
INNER JOIN
(
SELECT DISTINCT
wposts.ID AS PROD_ID
FROM
wp_ot6q6i_posts wposts
LEFT JOIN
wp_ot6q6i_postmeta wpostmeta
ON
wposts.ID = wpostmeta.post_id
LEFT JOIN
wp_ot6q6i_term_relationships
ON
(
wposts.ID = wp_ot6q6i_term_relationships.object_id
)
LEFT JOIN
wp_ot6q6i_term_taxonomy
ON
(
wp_ot6q6i_term_relationships.term_taxonomy_id = wp_ot6q6i_term_taxonomy.term_taxonomy_id
)
WHERE
wp_ot6q6i_term_taxonomy.taxonomy = 'product_cat' AND wp_ot6q6i_term_taxonomy.term_id IN(
SELECT
term_id
FROM
`wp_ot6q6i_terms`
WHERE
slug = 'preorder'
)
ORDER BY
wpostmeta.meta_value
) t2
ON
t1.PROD_ID = t2.PROD_ID
)
ORDER BY
wp_ot6q6i_posts.post_date
DESC
LIMIT 0, 20
And the error I'm getting is:
1054 - Unknown column 'wp_ot6q6i_posts.ID' in 'where clause'
Thanks all for your help. I ended up going in a different direction to solve this problem, one I'm more comfortable with as a dev: I take the fixed list of items from the last join and build the query in code, with a series of simpler conditions in the WHERE clause, thereby avoiding the EXISTS approach altogether.
Thanks again for your help.
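For anyone hitting the same 1054 error: in MySQL, a derived table (a subquery in the FROM clause, such as t1 above) generally cannot reference columns of the enclosing query, whereas the EXISTS subquery itself can. One possible rewrite, sketched here with short aliases that are not in the original query and not tested against a real WooCommerce schema, joins the tables directly inside EXISTS so that the reference to the outer order row sits at that level:

```sql
SELECT p.ID
FROM wp_ot6q6i_posts p
WHERE p.post_type = 'shop_order'
  -- the date and post_status filters from the original query are omitted for brevity
  AND EXISTS (
    SELECT 1
    FROM wp_ot6q6i_woocommerce_order_items oi
    INNER JOIN wp_ot6q6i_woocommerce_order_itemmeta oim
            ON oim.order_item_id = oi.order_item_id
    INNER JOIN wp_ot6q6i_term_relationships tr
            ON tr.object_id = oim.meta_value      -- meta_value holds the product ID
    INNER JOIN wp_ot6q6i_term_taxonomy tt
            ON tt.term_taxonomy_id = tr.term_taxonomy_id
    INNER JOIN wp_ot6q6i_terms t
            ON t.term_id = tt.term_id
    WHERE oi.order_id = p.ID                      -- correlation to the outer order row
      AND oi.order_item_type = 'line_item'
      AND oim.meta_key = '_product_id'
      AND tt.taxonomy = 'product_cat'
      AND t.slug = 'preorder'
  )
ORDER BY p.post_date DESC
LIMIT 0, 20;
```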

converting mysql sql to sql server

I have a MySQL query that works fine in a Jaspersoft report:
SELECT pr.id AS project_id,
pr.project_name as project_name,
pr.export_event_id,
au.full_name,
ee.timestamp
FROM (
SELECT project.id, project.project_name, MAX(project.export_event_id) AS max_export_event_id FROM project INNER JOIN export_event iee ON project.export_event_id = iee.id
where IIF ($P{exportEventDate} IS NULL, TRUE, CONVERT(DATE, iee.timestamp) <= $P{exportEventDate})
GROUP BY project_name
) AS in_PR INNER JOIN project AS pr ON pr.project_name = in_PR.project_name AND pr.export_event_id = in_PR.max_export_event_id
INNER JOIN project_owner_base pob ON pob.id = pr.project_owner_id
INNER JOIN export_event AS ee ON pr.export_event_id = ee.id
INNER JOIN auth_user au ON pob.auth_user_id = au.id
WHERE IIF ($P{projectOwner} IS NULL, TRUE, au.id = $P{projectOwner})
I am trying to convert it to SQL Server but can't figure out the equivalent.
Think of the $P{...} as '?' in dynamic SQL
Any idea?
I think this is just a simple OR condition.
WHERE @ProjectOwner IS NULL
OR au.id = @ProjectOwner
Your query is pretty close. I would remove the IIF() entirely -- in either database. The result is something like this:
SELECT pr.id AS project_id,
pr.project_name as project_name,
pr.export_event_id,
au.full_name,
ee.timestamp
FROM (SELECT p.project_name, MAX(p.export_event_id) AS max_export_event_id
FROM project p INNER JOIN
export_event iee
ON p.export_event_id = iee.id
WHERE ? IS NULL OR CONVERT(DATE, iee.timestamp) <= ?
GROUP BY p.project_name
) in_PR INNER JOIN
project pr
ON pr.project_name = in_PR.project_name AND
pr.export_event_id = in_PR.max_export_event_id INNER JOIN
project_owner_base pob
ON pob.id = pr.project_owner_id INNER JOIN
export_event ee
ON pr.export_event_id = ee.id INNER JOIN
auth_user au
ON pob.auth_user_id = au.id
WHERE ? IS NULL OR au.id = ?;
I replaced the variables with ? (as suggested by your question). The above should work in either database.
Note that this also fixes the aggregation in the subquery to remove p.id which seems unnecessary (and should cause an error in SQL Server).
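To try the same optional-filter pattern directly in SQL Server, outside the report, a local variable can stand in for the parameter. A small sketch, assuming au.id is an integer (the question does not say):

```sql
DECLARE @projectOwner INT = NULL;  -- NULL means "no filter"; the INT type is an assumption

SELECT au.id, au.full_name
FROM auth_user au
WHERE @projectOwner IS NULL        -- when no owner is supplied, every row passes
   OR au.id = @projectOwner;       -- otherwise only the matching owner
```

Setting @projectOwner to a concrete id should narrow the result to that owner, which mirrors what the two ? placeholders do in the query above.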

Problems with simple MySQL join

I am working with Expression Engine and the query module which allows you to use MySQL to get results. I have a set of data which I'm trying to associate with a user. My query is currently as follows:
SELECT COUNT(*)
FROM exp_channel_grid_field_11
INNER JOIN exp_member_data
WHERE `col_id_12` = 'Race' && `member_id` = '1'
So, I'm not too clued up when it comes to joins, but I am just looking for the count. Thanks.
I'm not sure what you're after. In MySQL you don't strictly need an ON clause to do a JOIN, but you probably do need to say which table each column belongs to. I don't know which columns belong to which tables (and neither does MySQL; perhaps that's the problem).
Assuming that 'member_id' is in exp_member_data and 'col_id_12' is in exp_channel_grid_field_11, you probably need to do something like this:
SELECT COUNT(*)
FROM exp_channel_grid_field_11
INNER JOIN exp_member_data
WHERE `exp_channel_grid_field_11`.`col_id_12` = 'Race'
AND `exp_member_data`.`member_id` = '1'
You can tidy it up with table aliases, like this:
SELECT COUNT(*)
FROM exp_channel_grid_field_11 e11
INNER JOIN exp_member_data ed
WHERE e11.`col_id_12` = 'Race'
AND ed.`member_id` = '1'
Or, maybe there should be an 'ON' member_id?
SELECT COUNT(*)
FROM exp_channel_grid_field_11 e11
INNER JOIN exp_member_data ed
ON e11.member_id = ed.member_id
WHERE e11.`col_id_12` = 'Race'
AND ed.`member_id` = '1'
Instead of WHERE col_id_12 = 'Race', use: ON col_id_12 = 'Race'
SELECT COUNT(*)
FROM exp_channel_grid_field_11
INNER JOIN exp_member_data ON `col_id_12` = 'Race'
WHERE `member_id` = '1'

How to find Full-text indexing on database in SQL Server 2008?

Hi, I am looking for a query that finds the full-text indexes on all tables and columns within a database in SQL Server 2008. Any information or help is welcome.
Here's how you get them:
SELECT
t.name AS ObjectName,
c.name AS FTCatalogName ,
i.name AS UniqueIdxName,
cl.name AS ColumnName
FROM
sys.objects t
INNER JOIN
sys.fulltext_indexes fi
ON
t.[object_id] = fi.[object_id]
INNER JOIN
sys.fulltext_index_columns ic
ON
ic.[object_id] = t.[object_id]
INNER JOIN
sys.columns cl
ON
ic.column_id = cl.column_id
AND ic.[object_id] = cl.[object_id]
INNER JOIN
sys.fulltext_catalogs c
ON
fi.fulltext_catalog_id = c.fulltext_catalog_id
INNER JOIN
sys.indexes i
ON
fi.unique_index_id = i.index_id
AND fi.[object_id] = i.[object_id];
select distinct
object_name(fic.[object_id]) as table_name,
[name]
from
sys.fulltext_index_columns fic
inner join sys.columns c
on c.[object_id] = fic.[object_id]
and c.[column_id] = fic.[column_id]
I know this is an old thread, but I needed this answer just now. I found Sadra Abedinzadeh's answer above useful but slightly lacking for my needs, so I'm posting a modified version of it that also covers indexed views with full-text indexes and returns some extra column information:
use MyDatabaseName -- Modify here, of course
SELECT
tblOrVw.[name] AS TableOrViewName,
tblOrVw.[type_desc] AS TypeDesc,
tblOrVw.[stoplist_id] AS StopListID,
c.name AS FTCatalogName ,
cl.name AS ColumnName,
i.name AS UniqueIdxName
FROM
(
SELECT TOP (1000)
idxs.[object_id],
idxs.[stoplist_id],
tbls.[name],
tbls.[type_desc]
FROM sys.fulltext_indexes idxs
INNER JOIN sys.tables tbls
on tbls.[object_id] = idxs.[object_id]
union all
SELECT TOP (1000)
idxs.[object_id],
idxs.[stoplist_id],
tbls.[name],
tbls.[type_desc]
FROM sys.fulltext_indexes idxs
INNER JOIN sys.views tbls -- 'tbls' reused here to mean 'views'
on tbls.[object_id] = idxs.[object_id]
) tblOrVw
INNER JOIN sys.fulltext_indexes fi
on tblOrVw.[object_id] = fi.[object_id]
INNER JOIN
sys.fulltext_index_columns ic
ON
ic.[object_id] = tblOrVw.[object_id]
INNER JOIN
sys.columns cl
ON
ic.column_id = cl.column_id
AND ic.[object_id] = cl.[object_id]
INNER JOIN
sys.fulltext_catalogs c
ON
fi.fulltext_catalog_id = c.fulltext_catalog_id
INNER JOIN
sys.indexes i
ON
fi.unique_index_id = i.index_id
AND fi.[object_id] = i.[object_id];
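If only a quick check is needed, the built-in metadata functions can answer simpler questions than the catalog joins above. A sketch, assuming it is run in the database of interest:

```sql
-- Returns 1 if the full-text component is installed on this instance.
SELECT FULLTEXTSERVICEPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;

-- Tables in the current database that have an active full-text index.
SELECT SCHEMA_NAME(t.schema_id) AS SchemaName,
       t.name AS TableName
FROM sys.tables t
WHERE OBJECTPROPERTY(t.[object_id], 'TableHasActiveFulltextIndex') = 1;
```

This only tells you which tables are indexed; for the individual columns and catalogs the joins on sys.fulltext_index_columns shown above are still needed.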