Apache Drill 1.8 UI doesn't display Hive join query results - exception

I am using Apache Drill 1.8.0 on AWS EMR to join two Hive tables. A sample query is below. It works fine in the Drill CLI, but in the UI it fails with the error below after running for a few minutes. A simple select query (select t1.col from hive.table t1) works fine in both the Drill CLI and the UI; the problem occurs only with the join query.
If I cancel the join query in the background, the results are displayed in the UI, which is very strange.
Join Query:
select t1.col FROM hive.table1 as t1 join hive.table2 as t2 on t1.col = t2.col limit 1000;
Error:
Query Failed: An Error Occurred
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: RpcException: Data not accepted downstream. Fragment 1:4 [Error Id: 0b5ed2db-3653-4e3a-9c92-d0a6cd69b66e on ip-172-31-16-222.us-west-2.compute.internal:31010]

Related

MySql Query acts differently on my local version (8.0.21) and my staging (5.7.12)

I have an issue running a query on MySQL from a Go application.
My local setup works fine with MySQL 8.0.21, but the same query fails on my staging setup, which runs 5.7.12 on Aurora:
SELECT COUNT(*) AS cnt
FROM (
SELECT ? AS item_id, ? AS ar
)
AS A
JOIN item ON A.item_id = item.id
AND A.ar + item.existing_qty > item.qty;
Running this query in DataGrip with the placeholders replaced works fine on both local and staging.
Running it from the application with the question marks replaced by ints also works fine.
The error I get is:
Error 1054: Unknown column 'A.ar' in 'field list'
I suspect a driver or version issue.
Does it work if you phrase the query like this?
select count(*)
from item
where id = ? and ? + existing_qty > qty
This is the same logic as your original query, and the parameters are passed in the same order.
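One way to check whether a flattened query matches the derived-table version is to run both against the same sample data. The sketch below does that under loud assumptions: SQLite (Python's built-in sqlite3) stands in for MySQL, the item rows are invented, and the flattened form filters on id with the same comparison direction as the original join condition. It says nothing about the Aurora 5.7.12 driver behaviour itself.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, existing_qty INTEGER, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO item (id, existing_qty, qty) VALUES (?, ?, ?)",
    [(1, 5, 10), (2, 9, 10), (3, 0, 3)],
)

# The query from the question: placeholders inside a derived table.
DERIVED = """
SELECT COUNT(*) AS cnt
FROM (SELECT ? AS item_id, ? AS ar) AS A
JOIN item ON A.item_id = item.id
AND A.ar + item.existing_qty > item.qty
"""

# Flattened form: same two parameters, in the same order, no derived table.
FLAT = """
SELECT COUNT(*) AS cnt
FROM item
WHERE id = ? AND ? + existing_qty > qty
"""


def count(sql, item_id, ar):
    """Run one of the two queries with (item_id, ar) and return the count."""
    return conn.execute(sql, (item_id, ar)).fetchone()[0]


# (item_id, ar) pairs, including a boundary case and a missing id.
cases = [(1, 2), (1, 6), (2, 2), (3, 3), (4, 1)]
results = [(count(DERIVED, i, a), count(FLAT, i, a)) for i, a in cases]
for (item_id, ar), (derived_cnt, flat_cnt) in zip(cases, results):
    print(item_id, ar, derived_cnt, flat_cnt)
```

If the two counts ever differ for a pair, the flattened query is not a drop-in replacement and the condition needs another look.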

Lost connection to MySQL server during query just on MacOS

I have an SQL query with 2 subqueries. Whenever I run it in MySQL Workbench on macOS, it gives "Error Code: 2013. Lost connection to MySQL server during query". However, when I run it in Workbench on Windows, it runs normally without any errors.
I tried to increase the connection timeout, but still no success!
Any clue on how to solve this issue?
I appreciate your support and cooperation.
Here is the query that gives the error:
with t1 as(
SELECT s.name rep_name, r.name region_name, sum(o.total_amt_usd) as total_amt
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
group by 1,2),
t2 as(
select region_name, max(total_amt) as total_amt
from t1
group by 1)
select t1.rep_name, t1.region_name, t1.total_amt
from t1
join t2
ON t1.region_name = t2.region_name AND t1.total_amt = t2.total_amt;
Your query is taking too long to return data so the connection gets dropped. There are 2 ways to fix this issue.
(i) Optimize query
(ii) Increase MySQL timeout
To apply the 2nd option:
1. In the application menu, select Edit > Preferences > SQL Editor.
2. Look for the MySQL Session section and increase the DBMS connection read timeout value.
3. Save the settings, quit MySQL Workbench and reopen the connection.
Finally, I uninstalled Workbench and installed it again, and now it is working properly. Thanks to everyone who tried to answer my question.

ON clause in HIVE (for version < 0.14)

I tried running the query below in Hive, but it fails with the following error:
" Both left and right aliases encountered in JOIN 'eff_start_dt' ". I think the problem is with the ON clause in the query.
The same query runs perfectly fine on Teradata.
So my questions are:
Which algebraic expressions can be used in the ON clause in Hive?
How can I make the query below run in Hive?
SELECT
q.calendar_dt as calendar_dt ,
x.corp_id as corp_id ,
x.mkt_cd as mkt_cd ,
x.bill_curr_cd as bill_curr_cd
FROM
corpmis_daily_time_dim as q
INNER JOIN corpmis_daily_global_corp_daily_expsr as x
INNER JOIN (select max(eff_start_dt) as max_eff_start_dt from corpmis_daily_global_corp_daily_expsr) as y
ON
q.calendar_dt between x.eff_start_dt and x.eff_end_dt WHERE
q.calendar_dt <= y.max_eff_start_dt;
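The thread records no answer. As a hedged note: older Hive releases are documented as accepting only equality conditions in ON, and a commonly suggested workaround is to leave the join unconditioned and move range predicates such as BETWEEN into WHERE (which can be expensive, since it is effectively a cross join before filtering). The sketch below only checks that the two forms return the same rows. It runs on SQLite, not Hive; the shortened table names time_dim and expsr and all rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Hive; table names are shortened and rows invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (calendar_dt TEXT);
CREATE TABLE expsr (corp_id INTEGER, eff_start_dt TEXT, eff_end_dt TEXT);
INSERT INTO time_dim VALUES
  ('2016-01-01'), ('2016-01-15'), ('2016-02-10'), ('2016-03-01');
INSERT INTO expsr VALUES
  (1, '2016-01-01', '2016-01-31'),
  (2, '2016-02-01', '2016-02-28');
""")

# Range predicates in ON: the shape that older Hive rejects.
RANGE_IN_ON = """
SELECT q.calendar_dt, x.corp_id
FROM time_dim AS q
INNER JOIN expsr AS x
  ON q.calendar_dt BETWEEN x.eff_start_dt AND x.eff_end_dt
INNER JOIN (SELECT MAX(eff_start_dt) AS max_eff_start_dt FROM expsr) AS y
  ON q.calendar_dt <= y.max_eff_start_dt
ORDER BY 1, 2
"""

# Workaround shape: unconditioned joins, every range predicate in WHERE.
RANGE_IN_WHERE = """
SELECT q.calendar_dt, x.corp_id
FROM time_dim AS q
CROSS JOIN expsr AS x
CROSS JOIN (SELECT MAX(eff_start_dt) AS max_eff_start_dt FROM expsr) AS y
WHERE q.calendar_dt BETWEEN x.eff_start_dt AND x.eff_end_dt
  AND q.calendar_dt <= y.max_eff_start_dt
ORDER BY 1, 2
"""

on_rows = conn.execute(RANGE_IN_ON).fetchall()
where_rows = conn.execute(RANGE_IN_WHERE).fetchall()
print(on_rows)
print(where_rows)
```

Whether the WHERE-based form is accepted by a given Hive version, and how it performs on real table sizes, would need to be confirmed on the cluster itself.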

Convert MS Access "TOP" to MySQL "LIMIT" in subquery

I am trying to convert an MS Access query to MySQL, and the problem is converting the MS Access TOP clause to a MySQL LIMIT so that I get the same result. When I change the subquery to use LIMIT, I get an error saying that this version of MySQL does not support LIMIT in a subquery.
This is the MS Access query:
SELECT a.FK_CONTRIBUTOR_ID
FROM tPUBLISHERS
INNER JOIN (tCONTRIBUTORS AS b
INNER JOIN tCLIPS AS a ON b.CONTRIBUTOR_ID = a.FK_CONTRIBUTOR_ID)
ON tPUBLISHERS.PUBLISHER_ID = b.FK_PUBLISHER_ID
WHERE ((a.CLIP_ID) In
(select top 5 CLIP_ID
from tCLIPS
where FK_CONTRIBUTOR_ID = a.FK_CONTRIBUTOR_ID
AND SUSPEND = a.SUSPEND))
AND ((a.FK_CONTRIBUTOR_ID) In (1922,2034,2099))
Previously answered at:
MySQL Subquery LIMIT
Basically, change the subquery to a join.
Google for more with "mysql limit on subquery"
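A minimal sketch of the subquery-to-join idea, under stated assumptions: SQLite stands in for MySQL, the tPUBLISHERS and tCONTRIBUTORS joins are left out for brevity, the rows are invented, and "top N" is taken to mean the N lowest CLIP_ID values per contributor and SUSPEND value (the Access query has no ORDER BY, so its ordering is not defined in the source). Each clip is joined to the clips in its group with an equal or lower CLIP_ID, and the count of matches acts as a rank.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tCLIPS ("
    "CLIP_ID INTEGER PRIMARY KEY, FK_CONTRIBUTOR_ID INTEGER, SUSPEND INTEGER)"
)
conn.executemany(
    "INSERT INTO tCLIPS VALUES (?, ?, ?)",
    [
        (1, 1922, 0), (2, 1922, 0), (3, 1922, 0), (4, 1922, 1),
        (5, 2034, 0), (6, 2034, 0), (7, 9999, 0),
    ],
)

# Self-join replaces "IN (SELECT TOP n ...)": COUNT(*) is the clip's rank
# within its (FK_CONTRIBUTOR_ID, SUSPEND) group, ordered by CLIP_ID.
TOP_N_JOIN = """
SELECT a.CLIP_ID, a.FK_CONTRIBUTOR_ID
FROM tCLIPS AS a
JOIN tCLIPS AS b
  ON b.FK_CONTRIBUTOR_ID = a.FK_CONTRIBUTOR_ID
 AND b.SUSPEND = a.SUSPEND
 AND b.CLIP_ID <= a.CLIP_ID
WHERE a.FK_CONTRIBUTOR_ID IN (1922, 2034, 2099)
GROUP BY a.CLIP_ID, a.FK_CONTRIBUTOR_ID
HAVING COUNT(*) <= ?
ORDER BY a.CLIP_ID
"""

# n = 2 so the cutoff is visible on the small sample; the question uses 5.
top_rows = conn.execute(TOP_N_JOIN, (2,)).fetchall()
print(top_rows)
```

The self-join grows with the square of the group size, so on large tables an index on (FK_CONTRIBUTOR_ID, SUSPEND, CLIP_ID) would likely matter.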

mysql: Java api to call mysql metadata service

Is there a Java API for calling the MySQL metadata service? I am particularly interested in reading a table's schema through the API, not in modifying it.
The best source for table metadata is MySQL itself. Use the INFORMATION_SCHEMA tables/views in MySQL to get this data. You can execute the queries and read the result set back like a normal query.
For table information:
SELECT * FROM INFORMATION_SCHEMA.TABLES
For columns:
SELECT * FROM INFORMATION_SCHEMA.COLUMNS
You can use this form for indexes on InnoDB:
SELECT t.name AS `Table`,
i.name AS `Index`,
GROUP_CONCAT(f.name ORDER BY f.pos) AS `Columns`
FROM information_schema.innodb_sys_tables t
JOIN information_schema.innodb_sys_indexes i USING (table_id)
JOIN information_schema.innodb_sys_fields f USING (index_id)
WHERE t.schema = 'sakila'
GROUP BY 1,2;