I'm trying to send email from my server, but I am getting an SMTP timeout error.

This is the error I get (extracted from the Airflow pod launcher logs):
File "/opt/seller-360/verify_fms_employee.py", line 214, in send_notification_mail\n'
{logging_mixin.py:98} INFO - {pod_launcher.py:106} INFO - b' send_email(msg, smtp_server=conf["smtp.server"])\n'
{logging_mixin.py:98} INFO -{pod_launcher.py:106} INFO - b' File "/opt/seller-360/helpers.py", line 189, in send_email\n'
{logging_mixin.py:98} INFO - {pod_launcher.py:106} INFO - b' smtp = smtplib.SMTP(smtp_server)\n'
{logging_mixin.py:98} INFO -{pod_launcher.py:106} INFO - b' File "/usr/lib/python3.6/smtplib.py", line 251, in init\n'
{logging_mixin.py:98} INFO -{pod_launcher.py:106} INFO - b' (code, msg) = self.connect(host, port)\n'
{logging_mixin.py:98} INFO - {pod_launcher.py:106} INFO - b' File "/usr/lib/python3.6/smtplib.py", line 336, in connect\n'
{logging_mixin.py:98} INFO - {pod_launcher.py:106} INFO - b' self.sock = self._get_socket(host, port, self.timeout)\n'
{logging_mixin.py:98} INFO - {pod_launcher.py:106} INFO - b' File "/usr/lib/python3.6/smtplib.py", line 307, in _get_socket\n'
{logging_mixin.py:98} INFO -{pod_launcher.py:106} INFO - b' self.source_address)\n'
{logging_mixin.py:98} INFO - {pod_launcher.py:106} INFO - b' File "/usr/lib/python3.6/socket.py", line 724, in create_connection\n'
{logging_mixin.py:98} INFO - {pod_launcher.py:106} INFO - b' raise err\n'
{logging_mixin.py:98} INFO - {pod_launcher.py:106} INFO - b' File "/usr/lib/python3.6/socket.py", line 713, in create_connection\n'
{logging_mixin.py:98} INFO -{pod_launcher.py:106} INFO - b' sock.connect(sa)\n'
{logging_mixin.py:98} INFO - [2023-02-10 17:30:14,794] {pod_launcher.py:106} INFO - b'TimeoutError: [Errno 110] Connection timed out\n'
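The failing call is just smtp = smtplib.SMTP(smtp_server), so [Errno 110] usually means the SMTP host/port is not reachable from the machine (or pod) that runs the task, i.e. a firewall or wrong-port problem rather than a code bug. A minimal diagnostic sketch, assuming a placeholder host name and the default port 25 (substitute whatever conf["smtp.server"] contains; try 587 if 25 is blocked):

import smtplib
import socket

SMTP_HOST = "smtp.example.com"  # placeholder for conf["smtp.server"]
SMTP_PORT = 25                  # try 587 (submission port) if 25 is blocked

# 1. Check raw TCP reachability with a short timeout instead of waiting
#    the couple of minutes it takes the OS to raise Errno 110.
try:
    with socket.create_connection((SMTP_HOST, SMTP_PORT), timeout=10):
        print("TCP connection OK")
except OSError as exc:
    raise SystemExit(f"Cannot reach {SMTP_HOST}:{SMTP_PORT}: {exc}")

# 2. If TCP works, try the SMTP handshake itself with an explicit timeout.
with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=10) as smtp:
    smtp.noop()
    print("SMTP handshake OK")

If the TCP check already fails, the fix is on the network side (allow outbound traffic to the mail server from wherever the task runs, or point smtp.server at a reachable relay), not in the Python code.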

Related

Snowflake SQLAlchemy (Python) can't connect because of "getaddrinfo failed" inside a virtual machine on the company network

I need to read and write some data to a Snowflake database. I have the credentials, and everything works fine on my local PC, but inside the company's virtual machine I get the error below.
I think it's a proxy problem, but I don't know what to do or how to fix it. On the virtual machine I can access the Snowflake URLs flawlessly, and everything works in Google Chrome, for example. So why is this request not working in Python, and how can I fix it?
Error message:
`Traceback (most recent call last):
File "snowflake\connector\vendored\urllib3\connection.py", line 174, in _new_conn
File "snowflake\connector\vendored\urllib3\util\connection.py", line 72, in create_connection
File "socket.py", line 954, in getaddrinfo
socket.gaierror: [Errno 11002] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 703, in urlopen
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 386, in _make_request
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 1042, in _validate_conn
File "snowflake\connector\vendored\urllib3\connection.py", line 358, in connect
File "snowflake\connector\vendored\urllib3\connection.py", line 186, in _new_conn
snowflake.connector.vendored.urllib3.exceptions.NewConnectionError: <snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x000002073B6C6C10>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "snowflake\connector\vendored\requests\adapters.py", line 489, in send
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 815, in urlopen
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 787, in urlopen
File "snowflake\connector\vendored\urllib3\util\retry.py", line 592, in increment
snowflake.connector.vendored.urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='kzhwbsi-gb82213.snowflakecomputing.com', port=443): Max retries exceeded with url: /session/v1/login-request?request_id=d395ab35-5f2a-4fb2-a83b-48458979f2c9&databaseName=project_database&schemaName=project_schema&request_guid=19c8e3a5-33ad-48c2-8124-fc69a1fa2af9 (Caused by NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x000002073B6C6C10>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "snowflake\connector\network.py", line 1018, in _request_exec
File "snowflake\connector\vendored\requests\sessions.py", line 587, in request
File "snowflake\connector\vendored\requests\sessions.py", line 701, in send
File "snowflake\connector\vendored\requests\adapters.py", line 565, in send
snowflake.connector.vendored.requests.exceptions.ConnectionError: HTTPSConnectionPool(host='kzhwbsi-gb82213.snowflakecomputing.com', port=443): Max retries exceeded with url: /session/v1/login-request?request_id=d395ab35-5f2a-4fb2-a83b-48458979f2c9&databaseName=project_database&schemaName=project_schema&request_guid=19c8e3a5-33ad-48c2-8124-fc69a1fa2af9 (Caused by NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x000002073B6C6C10>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "snowflake\connector\connection.py", line 1072, in __authenticate
File "snowflake\connector\auth.py", line 257, in authenticate
File "snowflake\connector\network.py", line 704, in _post_request
File "snowflake\connector\network.py", line 794, in fetch
File "snowflake\connector\network.py", line 917, in _request_exec_wrapper
File "snowflake\connector\network.py", line 837, in _request_exec_wrapper
File "snowflake\connector\network.py", line 1095, in _request_exec
snowflake.connector.errors.OperationalError: 251011: 251011: ConnectionTimeout occurred. Will be handled by authenticator
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "sqlalchemy\engine\base.py", line 3361, in _wrap_pool_connect
File "sqlalchemy\pool\base.py", line 327, in connect
File "sqlalchemy\pool\base.py", line 894, in _checkout
File "sqlalchemy\pool\base.py", line 493, in checkout
File "sqlalchemy\pool\impl.py", line 146, in _do_get
File "sqlalchemy\util\langhelpers.py", line 70, in __exit__
File "sqlalchemy\util\compat.py", line 211, in raise_
File "sqlalchemy\pool\impl.py", line 143, in _do_get
File "sqlalchemy\pool\base.py", line 273, in _create_connection
File "sqlalchemy\pool\base.py", line 388, in __init__
File "sqlalchemy\pool\base.py", line 691, in __connect
File "sqlalchemy\util\langhelpers.py", line 70, in __exit__
File "sqlalchemy\util\compat.py", line 211, in raise_
File "sqlalchemy\pool\base.py", line 686, in __connect
File "sqlalchemy\engine\create.py", line 578, in connect
File "sqlalchemy\engine\default.py", line 598, in connect
File "snowflake\connector\__init__.py", line 51, in Connect
File "snowflake\connector\connection.py", line 297, in __init__
File "snowflake\connector\connection.py", line 550, in connect
File "snowflake\connector\connection.py", line 789, in __open_connection
File "snowflake\connector\connection.py", line 1052, in _authenticate
File "snowflake\connector\connection.py", line 1117, in __authenticate
File "snowflake\connector\connection.py", line 1094, in __authenticate
File "snowflake\connector\auth_by_plugin.py", line 117, in handle_timeout
snowflake.connector.errors.OperationalError: 250001: 250001: Could not connect to Snowflake backend after 0 attempt(s).Aborting
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "src\snowflakesqlalchemy.py", line 19, in <module>
File "sqlalchemy\engine\base.py", line 3315, in connect
File "sqlalchemy\engine\base.py", line 96, in __init__
File "sqlalchemy\engine\base.py", line 3394, in raw_connection
File "sqlalchemy\engine\base.py", line 3364, in _wrap_pool_connect
File "sqlalchemy\engine\base.py", line 2198, in _handle_dbapi_exception_noconnection
File "sqlalchemy\util\compat.py", line 211, in raise_
File "sqlalchemy\engine\base.py", line 3361, in _wrap_pool_connect
File "sqlalchemy\pool\base.py", line 327, in connect
File "sqlalchemy\pool\base.py", line 894, in _checkout
File "sqlalchemy\pool\base.py", line 493, in checkout
File "sqlalchemy\pool\impl.py", line 146, in _do_get
File "sqlalchemy\util\langhelpers.py", line 70, in __exit__
File "sqlalchemy\util\compat.py", line 211, in raise_
File "sqlalchemy\pool\impl.py", line 143, in _do_get
File "sqlalchemy\pool\base.py", line 273, in _create_connection
File "sqlalchemy\pool\base.py", line 388, in __init__
File "sqlalchemy\pool\base.py", line 691, in __connect
File "sqlalchemy\util\langhelpers.py", line 70, in __exit__
File "sqlalchemy\util\compat.py", line 211, in raise_
File "sqlalchemy\pool\base.py", line 686, in __connect
File "sqlalchemy\engine\create.py", line 578, in connect
File "sqlalchemy\engine\default.py", line 598, in connect
File "snowflake\connector\__init__.py", line 51, in Connect
File "snowflake\connector\connection.py", line 297, in __init__
File "snowflake\connector\connection.py", line 550, in connect
File "snowflake\connector\connection.py", line 789, in __open_connection
File "snowflake\connector\connection.py", line 1052, in _authenticate
File "snowflake\connector\connection.py", line 1117, in __authenticate
File "snowflake\connector\connection.py", line 1094, in __authenticate
File "snowflake\connector\auth_by_plugin.py", line 117, in handle_timeout
sqlalchemy.exc.OperationalError: (snowflake.connector.errors.OperationalError) 250001: 250001: Could not connect to Snowflake backend after 0 attempt(s).Aborting
(Background on this error at: https://sqlalche.me/e/14/e3q8)
[44952] Failed to execute script 'snowflakesqlalchemy' due to unhandled exception!`
I also printed these logs:
`2023-01-10 11:17:18,250 - MainThread connection.py:275 - __init__() - INFO - Snowflake Connector for Python Version: 2.8.3, Python Version: 3.9.13, Platform: Windows-10-10.0.17763-SP0
2023-01-10 11:17:18,250 - MainThread connection.py:520 - connect() - DEBUG - connect
2023-01-10 11:17:18,250 - MainThread connection.py:810 - __config() - DEBUG - __config
2023-01-10 11:17:18,250 - MainThread connection.py:934 - __config() - INFO - This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-01-10 11:17:18,251 - MainThread connection.py:952 - __config() - INFO - Setting use_openssl_only mode to False
2023-01-10 11:17:18,251 - MainThread converter.py:145 - __init__() - DEBUG - use_numpy: False
2023-01-10 11:17:18,251 - MainThread converter_issue23517.py:27 - __init__() - DEBUG - initialized
2023-01-10 11:17:18,251 - MainThread connection.py:713 - __open_connection() - DEBUG - REST API object was created: kzhwbsi-gb82213.snowflakecomputing.com:443
2023-01-10 11:17:18,252 - MainThread auth.py:170 - authenticate() - DEBUG - authenticate
2023-01-10 11:17:18,252 - MainThread auth.py:200 - authenticate() - DEBUG - assertion content: *********
2023-01-10 11:17:18,252 - MainThread auth.py:203 - authenticate() - DEBUG - account=kzhwbsi-gb82213, user=karlpd4c, database=project_database, schema=project_schema, warehouse=None, role=None, request_id=10256208-cea8-4269-a480-820a1c55e4a3
2023-01-10 11:17:18,252 - MainThread auth.py:236 - authenticate() - DEBUG - body['data']: {'CLIENT_APP_ID': 'PythonConnector', 'CLIENT_APP_VERSION': '2.8.3', 'SVN_REVISION': None, 'ACCOUNT_NAME': 'kzhwbsi-gb82213', 'LOGIN_NAME': 'karlpd4c', 'CLIENT_ENVIRONMENT': {'APPLICATION': 'PythonConnector', 'OS': 'Windows', 'OS_VERSION': 'Windows-10-10.0.17763-SP0', 'PYTHON_VERSION': '3.9.13', 'PYTHON_RUNTIME': 'CPython', 'PYTHON_COMPILER': 'MSC v.1929 64 bit (AMD64)', 'OCSP_MODE': 'FAIL_OPEN', 'TRACING': 10, 'LOGIN_TIMEOUT': 120, 'NETWORK_TIMEOUT': None}, 'SESSION_PARAMETERS': {'AUTOCOMMIT': False, 'CLIENT_PREFETCH_THREADS': 4}}
2023-01-10 11:17:18,252 - MainThread auth.py:254 - authenticate() - DEBUG - Timeout set to 120
2023-01-10 11:17:18,253 - MainThread retry.py:351 - from_int() - DEBUG - Converted retries value: 1 -> Retry(total=1, connect=None, read=None, redirect=None, status=None)
2023-01-10 11:17:18,253 - MainThread retry.py:351 - from_int() - DEBUG - Converted retries value: 1 -> Retry(total=1, connect=None, read=None, redirect=None, status=None)
2023-01-10 11:17:18,253 - MainThread network.py:1147 - _use_requests_session() - DEBUG - Session status for SessionPool 'kzhwbsi-gb82213.snowflakecomputing.com', SessionPool 1/1 active sessions
2023-01-10 11:17:18,253 - MainThread network.py:827 - _request_exec_wrapper() - DEBUG - remaining request timeout: 120, retry cnt: 1
2023-01-10 11:17:18,254 - MainThread network.py:808 - add_request_guid() - DEBUG - Request guid: 3bcec6fe-8f7d-4c05-9203-626636c975ea
2023-01-10 11:17:18,254 - MainThread network.py:1006 - _request_exec() - DEBUG - socket timeout: 60
2023-01-10 11:17:18,257 - MainThread connectionpool.py:1003 - _new_conn() - DEBUG - Starting new HTTPS connection (1): kzhwbsi-gb82213.snowflakecomputing.com:443
2023-01-10 11:17:19,295 - MainThread retry.py:594 - increment() - DEBUG - Incremented Retry for (url='/session/v1/login-request?request_id=10256208-cea8-4269-a480-820a1c55e4a3&databaseName=project_database&schemaName=project_schema&request_guid=3bcec6fe-8f7d-4c05-9203-626636c975ea'): Retry(total=0, connect=None, read=None, redirect=None, status=None)
2023-01-10 11:17:19,295 - MainThread connectionpool.py:812 - urlopen() - WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x00000299937F9A30>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed')': /session/v1/login-request?request_id=10256208-cea8-4269-a480-820a1c55e4a3&databaseName=project_database&schemaName=project_schema&request_guid=3bcec6fe-8f7d-4c05-9203-626636c975ea
2023-01-10 11:17:19,296 - MainThread connectionpool.py:1003 - _new_conn() - DEBUG - Starting new HTTPS connection (2): kzhwbsi-gb82213.snowflakecomputing.com:443
2023-01-10 11:17:19,297 - MainThread network.py:1090 - _request_exec() - DEBUG - Hit a timeout error while logging in. Will be handled by authenticator. Ignore the following. Error stack: HTTPSConnectionPool(host='kzhwbsi-gb82213.snowflakecomputing.com', port=443): Max retries exceeded with url: /session/v1/login-request?request_id=10256208-cea8-4269-a480-820a1c55e4a3&databaseName=project_database&schemaName=project_schema&request_guid=3bcec6fe-8f7d-4c05-9203-626636c975ea (Caused by NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x00000299937F9BE0>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed'))
Traceback (most recent call last):
File "snowflake\connector\vendored\urllib3\connection.py", line 174, in _new_conn
File "snowflake\connector\vendored\urllib3\util\connection.py", line 72, in create_connection
File "socket.py", line 954, in getaddrinfo
socket.gaierror: [Errno 11002] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 703, in urlopen
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 386, in _make_request
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 1042, in _validate_conn
File "snowflake\connector\vendored\urllib3\connection.py", line 358, in connect
File "snowflake\connector\vendored\urllib3\connection.py", line 186, in _new_conn
snowflake.connector.vendored.urllib3.exceptions.NewConnectionError: <snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x00000299937F9BE0>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "snowflake\connector\vendored\requests\adapters.py", line 489, in send
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 815, in urlopen
File "snowflake\connector\vendored\urllib3\connectionpool.py", line 787, in urlopen
File "snowflake\connector\vendored\urllib3\util\retry.py", line 592, in increment
snowflake.connector.vendored.urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='kzhwbsi-gb82213.snowflakecomputing.com', port=443): Max retries exceeded with url: /session/v1/login-request?request_id=10256208-cea8-4269-a480-820a1c55e4a3&databaseName=project_database&schemaName=project_schema&request_guid=3bcec6fe-8f7d-4c05-9203-626636c975ea (Caused by NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x00000299937F9BE0>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "snowflake\connector\network.py", line 1018, in _request_exec
File "snowflake\connector\vendored\requests\sessions.py", line 587, in request
File "snowflake\connector\vendored\requests\sessions.py", line 701, in send
File "snowflake\connector\vendored\requests\adapters.py", line 565, in send
snowflake.connector.vendored.requests.exceptions.ConnectionError: HTTPSConnectionPool(host='kzhwbsi-gb82213.snowflakecomputing.com', port=443): Max retries exceeded with url: /session/v1/login-request?request_id=10256208-cea8-4269-a480-820a1c55e4a3&databaseName=project_database&schemaName=project_schema&request_guid=3bcec6fe-8f7d-4c05-9203-626636c975ea (Caused by NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x00000299937F9BE0>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed'))
2023-01-10 11:17:19,305 - MainThread network.py:1152 - _use_requests_session() - DEBUG - Session status for SessionPool 'kzhwbsi-gb82213.snowflakecomputing.com', SessionPool 0/1 active sessions
2023-01-10 11:17:19,305 - MainThread connection.py:1087 - __authenticate() - DEBUG - Operational Error raised at authenticationfor authenticator: AuthByDefault
2023-01-10 11:17:19,305 - MainThread auth_by_plugin.py:114 - handle_timeout() - DEBUG - Default timeout handler invoked for authenticator
2023-01-10 11:17:19,305 - MainThread auth_by_plugin.py:123 - handle_timeout() - DEBUG - Hit connection timeout, attempt number 0. Will retry in a bit...
2023-01-10 11:17:19,305 - MainThread auth_by_plugin.py:56 - next_sleep_duration() - DEBUG - Sleeping for 2 seconds
2023-01-10 11:17:21,306 - MainThread auth.py:170 - authenticate() - DEBUG - authenticate
2023-01-10 11:17:21,306 - MainThread auth.py:200 - authenticate() - DEBUG - assertion content: *********
2023-01-10 11:17:21,306 - MainThread auth.py:203 - authenticate() - DEBUG - account=kzhwbsi-gb82213, user=karlpd4c, database=project_database, schema=project_schema, warehouse=None, role=None, request_id=f8210470-5260-46c4-b7a5-1458f5dc318a
2023-01-10 11:17:21,306 - MainThread auth.py:236 - authenticate() - DEBUG - body['data']: {'CLIENT_APP_ID': 'PythonConnector', 'CLIENT_APP_VERSION': '2.8.3', 'SVN_REVISION': None, 'ACCOUNT_NAME': 'kzhwbsi-gb82213', 'LOGIN_NAME': 'karlpd4c', 'CLIENT_ENVIRONMENT': {'APPLICATION': 'PythonConnector', 'OS': 'Windows', 'OS_VERSION': 'Windows-10-10.0.17763-SP0', 'PYTHON_VERSION': '3.9.13', 'PYTHON_RUNTIME': 'CPython', 'PYTHON_COMPILER': 'MSC v.1929 64 bit (AMD64)', 'OCSP_MODE': 'FAIL_OPEN', 'TRACING': 10, 'LOGIN_TIMEOUT': 120, 'NETWORK_TIMEOUT': None}, 'SESSION_PARAMETERS': {'AUTOCOMMIT': False, 'CLIENT_PREFETCH_THREADS': 4}}
2023-01-10 11:17:21,307 - MainThread auth.py:254 - authenticate() - DEBUG - Timeout set to 120
2023-01-10 11:17:21,307 - MainThread network.py:1147 - _use_requests_session() - DEBUG - Session status for SessionPool 'kzhwbsi-gb82213.snowflakecomputing.com', SessionPool 1/1 active sessions
2023-01-10 11:17:21,307 - MainThread network.py:827 - _request_exec_wrapper() - DEBUG - remaining request timeout: 120, retry cnt: 1
2023-01-10 11:17:21,307 - MainThread network.py:808 - add_request_guid() - DEBUG - Request guid: dbe010e1-6776-46ed-bc5a-9979d617bee4
2023-01-10 11:17:21,307 - MainThread network.py:1006 - _request_exec() - DEBUG - socket timeout: 60
2023-01-10 11:17:21,311 - MainThread connectionpool.py:1003 - _new_conn() - DEBUG - Starting new HTTPS connection (3): kzhwbsi-gb82213.snowflakecomputing.com:443
2023-01-10 11:17:21,312 - MainThread retry.py:594 - increment() - DEBUG - Incremented Retry for (url='/session/v1/login-request?request_id=f8210470-5260-46c4-b7a5-1458f5dc318a&databaseName=project_database&schemaName=project_schema&request_guid=dbe010e1-6776-46ed-bc5a-9979d617bee4'): Retry(total=0, connect=None, read=None, redirect=None, status=None)
2023-01-10 11:17:21,312 - MainThread connectionpool.py:812 - urlopen() - WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x000002999382B790>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed')': /session/v1/login-request?request_id=f8210470-5260-46c4-b7a5-1458f5dc318a&databaseName=project_database&schemaName=project_schema&request_guid=dbe010e1-6776-46ed-bc5a-9979d617bee4
2023-01-10 11:17:21,312 - MainThread connectionpool.py:1003 - _new_conn() - DEBUG - Starting new HTTPS connection (4): kzhwbsi-gb82213.snowflakecomputing.com:443
2023-01-10 11:17:21,314 - MainThread network.py:1090 - _request_exec() - DEBUG - Hit a timeout error while logging in. Will be handled by authenticator. Ignore the following. Error stack: HTTPSConnectionPool(host='kzhwbsi-gb82213.snowflakecomputing.com', port=443): Max retries exceeded with url: /session/v1/login-request?request_id=f8210470-5260-46c4-b7a5-1458f5dc318a&databaseName=project_database&schemaName=project_schema&request_guid=dbe010e1-6776-46ed-bc5a-9979d617bee4 (Caused by NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x000002999382BA00>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed'))
Traceback (most recent call last):
File "snowflake\connector\vendored\urllib3\connection.py", line 174, in _new_conn
File "snowflake\connector\vendored\urllib3\util\connection.py", line 72, in create_connection
File "socket.py", line 954, in getaddrinfo
socket.gaierror: [Errno 11002] getaddrinfo failed
During handling of the above exception, another exception occurred:
...
This is my Python code:
#!/usr/bin/env python
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine
import pandas as pd
import logging
import os

# URL form is snowflake://<user>:<password>@<account>/<database>/<schema>
# (account is the account locator, e.g. asdfhjk-jh45567)
engine = create_engine('snowflake://user:password@myaccount/project_database/project_schema')
os.environ['NO_PROXY'] = 'snowflakecomputing.com'
path = input('logpath:\n')
for logger_name in ['snowflake', 'botocore']:
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.FileHandler(path + 'python_connector.log')
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
try:
    connection = engine.connect()
    sql = "SELECT * FROM project_comments"
    df = pd.read_sql_query(sql, connection)
    print(df)
    df.to_csv("./snowflakedata.csv", index=False)
finally:
    connection.close()
    engine.dispose()
Besides working with SQLAlchemy, I also tried the Snowflake connector:
import pandas as pd
import snowflake.connector
import sys
import os

us = input("user: \n")
pw = input("password: \n")
acc = input("account: \n")
cnn = snowflake.connector.connect(
    user=us,
    password=pw,
    account=acc,
)
cs = cnn.cursor()
wh = input("warehouse: \n")
db = input("database: \n")
schema = input("schema: \n")
table = input("table: \n")
path = input("output directory: \n").replace("\\", "/")
try:
    sql = "USE WAREHOUSE " + wh
    cs.execute(sql)
    sql = "USE DATABASE " + db
    cs.execute(sql)
    sql = "USE SCHEMA " + schema
    cs.execute(sql)
    sql = "SELECT * FROM " + table
    df = pd.read_sql_query(sql, cnn)
    print(df)
    df.to_csv(path + "snowflakedata.csv", index=False)
    input("snowflake has been read, press any key to close")
finally:
    cs.close()
    cnn.close()
The same problem occurs: on my local PC it works fine, but on the virtual machine in the on-premise network it does not.
I also tried to deactivate the proxy in Windows with:
set NO_PROXY=snowflakecomputing.com
Do I need the IP of the Snowflake database, and how would I get it? Is something wrong with the DNS configuration in SQLAlchemy, and how do I fix it? Or do I need another connector or database engine?
I figured out the answer:
If you are behind a corporate proxy, you need to configure the proxy in your environment variables. Add these two new variables:
HTTP_PROXY
http://companyuser:companypw@proxy.companydomain.companyname.com:8080
and
HTTPS_PROXY
http://companyuser:companypw@proxy.companydomain.companyname.com:8080
Then you also need to configure the HTTP and HTTPS proxy at the application level, meaning at the beginning of your Python program:
os.environ["http_proxy"] = "http://companyuser:companypw@proxy.companydomain.companyname.com:8080"
os.environ["https_proxy"] = "http://companyuser:companypw@proxy.companydomain.companyname.com:8080"
In my case, companyuser is the user from my Windows machine.
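Putting it together, a minimal sketch of how the proxy settings might be applied at the top of the script, before the engine is created (the proxy URL, user, password and account below are placeholders):

import os

PROXY = "http://companyuser:companypw@proxy.companydomain.companyname.com:8080"

# Set the proxy before the Snowflake connector builds its HTTPS session,
# so its vendored requests/urllib3 stack routes through the proxy.
os.environ["HTTP_PROXY"] = PROXY
os.environ["HTTPS_PROXY"] = PROXY

from sqlalchemy import create_engine, text

engine = create_engine("snowflake://user:password@myaccount/project_database/project_schema")
try:
    with engine.connect() as connection:
        print(connection.execute(text("SELECT CURRENT_VERSION()")).fetchone())
finally:
    engine.dispose()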

ERROR 1066: Unable to open iterator for alias - Pig script

I have been facing this issue for a long time and have not been able to solve it, so I need some expert advice.
I am trying to load a sample tweets JSON file.
sample.json:
{"filter_level":"low","retweeted":false,"in_reply_to_screen_name":"FilmFan","truncated":false,"lang":"en","in_reply_to_status_id_str":null,"id":689085590822891521,"in_reply_to_user_id_str":"6048122","timestamp_ms":"1453125782100","in_reply_to_status_id":null,"created_at":"Mon Jan 18 14:03:02 +0000 2016","favorite_count":0,"place":null,"coordinates":null,"text":"#filmfan hey its time for you guys follow #acadgild To #AchieveMore and participate in contest Win Rs.500 worth vouchers","contributors":null,"geo":null,"entities":{"symbols":[],"urls":[],"hashtags":[{"text":"AchieveMore","indices":[56,68]}],"user_mentions":[{"id":6048122,"name":"Tanya","indices":[0,8],"screen_name":"FilmFan","id_str":"6048122"},{"id":2649945906,"name":"ACADGILD","indices":[42,51],"screen_name":"acadgild","id_str":"2649945906"}]},"is_quote_status":false,"source":"<a href=\"https://about.twitter.com/products/tweetdeck\" rel=\"nofollow\">TweetDeck<\/a>","favorited":false,"in_reply_to_user_id":6048122,"retweet_count":0,"id_str":"689085590822891521","user":{"location":"India ","default_profile":false,"profile_background_tile":false,"statuses_count":86548,"lang":"en","profile_link_color":"94D487","profile_banner_url":"https://pbs.twimg.com/profile_banners/197865769/1436198000","id":197865769,"following":null,"protected":false,"favourites_count":1002,"profile_text_color":"000000","verified":false,"description":"Proud Indian, Digital Marketing Consultant,Traveler, Foodie, Adventurer, Data Architect, Movie Lover, Namo Fan","contributors_enabled":false,"profile_sidebar_border_color":"000000","name":"Bahubali","profile_background_color":"000000","created_at":"Sat Oct 02 17:41:02 +0000 2010","default_profile_image":false,"followers_count":4467,"profile_image_url_https":"https://pbs.twimg.com/profile_images/664486535040000000/GOjDUiuK_normal.jpg","geo_enabled":true,"profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","follow_request_sent":null,"url":null,"utc_offset":19800,"time_zone":"Chennai","notifications":null,"profile_use_background_image":false,"friends_count":810,"profile_sidebar_fill_color":"000000","screen_name":"Ashok_Uppuluri","id_str":"197865769","profile_image_url":"http://pbs.twimg.com/profile_images/664486535040000000/GOjDUiuK_normal.jpg","listed_count":50,"is_translator":false}}
I have tried to load this JSON file using Elephant Bird.
Script:
REGISTER json-simple-1.1.1.jar
REGISTER elephant-bird-2.2.3.jar
REGISTER guava-11.0.2.jar
REGISTER avro-1.7.7.jar
REGISTER piggybank-0.12.0.jar
twitter = LOAD 'sample.json' USING com.twitter.elephantbird.pig.load.JsonLoader();
B = foreach twitter generate (chararray)$0#'created_at' as created_at,(chararray)$0#'id' as id,(chararray)$0#'id_str' as id_str,(chararray)$0#'text' as text,(chararray)$0#'source' as source,com.twitter.elephantbird.pig.piggybank.JsonStringToMap($0#'entities') as entities,(boolean)$0#'favorited' as favorited;
describe B;
Output:
B: {created_at: chararray,id: chararray,id_str: chararray,text: chararray,source: chararray,entities: map[chararray],favorited: boolean}
But when I tried to DUMP B, the following error occurred:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B
I am providing the complete logs here.
2016-09-11 14:07:57,184 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2016-09-11 14:07:57,184 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2016-09-11 14:07:57,194 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,194 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2016-09-11 14:07:57,194 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cache
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp directory: /tmp/1473583077199-0
2016-09-11 14:07:57,206 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2016-09-11 14:07:57,207 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,208 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2016-09-11 14:07:57,211 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-09-11 14:07:57,211 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-09-11 14:07:57,212 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2016-09-11 14:07:57,216 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local360376249_0009
2016-09-11 14:07:57,267 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2016-09-11 14:07:57,267 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2016-09-11 14:07:57,271 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
2016-09-11 14:07:57,272 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local360376249_0009_m_000000_0
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2016-09-11 14:07:57,278 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1 Total Length = 2416 Input split[0]: Length = 2416 ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit Locations: -----------------------
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/root/PIG/PIG/sample.json:0+2416
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,288 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-09-11 14:07:57,290 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: twitter[20,10],B[21,4] C: R:
2016-09-11 14:07:57,291 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2016-09-11 14:07:57,296 [Thread-214] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local360376249_0009
java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
at com.twitter.elephantbird.pig.util.PigCounterHelper.incrCounter(PigCounterHelper.java:55)
at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.incrCounter(LzoBaseLoadFunc.java:70)
at com.twitter.elephantbird.pig.load.JsonLoader.getNext(JsonLoader.java:130)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local360376249_0009
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases B,twitter
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: twitter[20,10],B[21,4] C: R:
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-09-11 14:07:57,468 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local360376249_0009 has failed! Stop running all dependent jobs
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-09-11 14:07:57,469 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,469 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,469 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2016-09-11 14:07:57,470 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1.2.3.4.7-4 0.15.0.2.3.4.7-4 root 2016-09-11 14:07:57 2016-09-11 14:07:57 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local360376249_0009 B,twitter MAP_ONLY Message: Job failed! file:/tmp/temp252944192/tmp-470484503,
Input(s):
Failed to read data from "file:///root/PIG/PIG/sample.json"
Output(s):
Failed to produce result in "file:/tmp/temp252944192/tmp-470484503"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local360376249_0009
Please also clarify how to use the jar files and which versions to use; I am confused about which version to pick.
Some say to use Elephant Bird and others say to use Avro, but neither of them is working for me.
Please help.
Mohan.V
I got it on my own; it was a jar versions issue.
Script:
REGISTER elephant-bird-core-4.1.jar
REGISTER elephant-bird-pig-4.1.jar
REGISTER elephant-bird-hadoop-compat-4.1.jar
And it worked fine.

Apache Pig error while dumping JSON data

I have a JSON file and want to load it using Apache Pig.
I am using the built-in JsonLoader to load the JSON data. Below is the sample data:
cat jsondata1.json
{ "response": { "id": 10123, "thread": "Sloths", "comments": ["Sloths are adorable So chill"] }, "response_time": 0.425 }
{ "response": { "id": 13828, "thread": "Bigfoot", "comments": ["hello world"] } , "response_time": 0.517 }
Here I am loading the JSON data using the built-in JsonLoader. Loading produces no error, but dumping the data gives the following error:
grunt> a = load '/home/cloudera/jsondata1.json' using JsonLoader('response:tuple (id:int, thread:chararray, comments:bag {tuple(comment:chararray)}), response_time:double');
grunt> dump a;
2016-04-17 01:11:13,286 [pool-4-thread-1] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/home/cloudera/jsondata1.json:0+229
2016-04-17 01:11:13,287 [pool-4-thread-1] WARN org.apache.hadoop.conf.Configuration - dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
2016-04-17 01:11:13,311 [pool-4-thread-1] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2016-04-17 01:11:13,321 [pool-4-thread-1] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: a[5,4] C: R:
2016-04-17 01:11:13,349 [Thread-16] INFO org.apache.hadoop.mapred.LocalJobRunner - Map task executor complete.
2016-04-17 01:11:13,351 [Thread-16] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local801054416_0004
java.lang.Exception: org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream#2484de3c; line: 1, column: 120]
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
Caused by: org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream#2484de3c; line: 1, column: 120]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
at org.codehaus.jackson.impl.JsonNumericParserBase._parseNumericValue(JsonNumericParserBase.java:399)
at org.codehaus.jackson.impl.JsonNumericParserBase.getDoubleValue(JsonNumericParserBase.java:311)
at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:203)
at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:157)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:483)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2016-04-17 01:11:13,548 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local801054416_0004
2016-04-17 01:11:13,548 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a
2016-04-17 01:11:13,548 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[5,4] C: R:
2016-04-17 01:11:18,059 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-04-17 01:11:18,059 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local801054416_0004 has failed! Stop running all dependent jobs
2016-04-17 01:11:18,059 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-04-17 01:11:18,059 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2016-04-17 01:11:18,060 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2016-04-17 01:11:18,060 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.0.0-cdh4.7.0 0.11.0-cdh4.7.0 cloudera 2016-04-17 01:11:12 2016-04-17 01:11:18 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local801054416_0004 a MAP_ONLY Message: Job failed! file:/tmp/temp-1766116741/tmp1151698221,
Input(s):
Failed to read data from "/home/cloudera/jsondata1.json"
Output(s):
Failed to produce result in "file:/tmp/temp-1766116741/tmp1151698221"
Job DAG:
job_local801054416_0004
2016-04-17 01:11:18,060 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2016-04-17 01:11:18,061 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias a
Details at logfile: /home/cloudera/pig_1460877001124.log
I am not able to find the issue. How do I define the correct schema for the above JSON data?
Try this:
comments:{(chararray)}
because this version:
comments:bag {tuple(comment:chararray)}
fits this JSON schema:
"comments": [{comment:"hello world"}]
whereas what you have are simple string values, not nested documents:
"comments": ["hello world"]

Error processing a complex Twitter JSON object with Pig's JsonLoader() from the elephant-bird jars

I wanted to process a Twitter JSON object with Pig using the elephant-bird jars, for which I wrote the Pig script below.
REGISTER '/usr/lib/pig/lib/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/usr/lib/pig/lib/elephant-bird-pig-4.1.jar';
A = LOAD '/user/flume/tweets/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS myMap;
B = FOREACH A GENERATE myMap#'id' AS ID,myMap#'created_at' AS createdAT;
DUMP B;
which gave me the error below:
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1439883208520_0177
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[3,4],B[4,4] C: R:
2015-08-25 11:06:34,303 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-08-25 11:06:34,303 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1439883208520_0177]
2015-08-25 11:07:06,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2015-08-25 11:07:06,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1439883208520_0177]
2015-08-25 11:07:09,458 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-08-25 11:07:09,458 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1439883208520_0177 has failed! Stop running all dependent jobs
2015-08-25 11:07:09,459 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-08-25 11:07:09,667 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://trinityhadoopmaster.com:8188/ws/v1/timeline/
2015-08-25 11:07:09,668 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at trinityhadoopmaster.com/192.168.1.135:8032
2015-08-25 11:07:09,678 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-08-25 11:07:09,780 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.14.0 hdfs 2015-08-25 11:06:33 2015-08-25 11:07:09 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1439883208520_0177 A,B MAP_ONLY Message: Job failed! hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559,
Input(s):
Failed to read data from "hdfs://trinityhadoopmaster.com:9000/user/flume/tweets/data.json"
Output(s):
Failed to produce result in "hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1439883208520_0177
2015-08-25 11:07:09,780 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2015-08-25 11:07:09,787 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B. Backend error : java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
Details at logfile: /tmp/pig-err.log
grunt>
I have no clue how to approach this; can anyone help me with it?
The backend error is a ClassNotFoundException for org.json.simple.parser.ParseException, so the json-simple jar (and the other elephant-bird dependencies) has to be registered as well:
REGISTER '/tmp/elephant-bird-core-4.1.jar';
REGISTER '/tmp/elephant-bird-pig-4.1.jar';
REGISTER '/tmp/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/tmp/google-collections-1.0.jar';
REGISTER '/tmp/json-simple-1.1.jar';
It works.

JsonLoader throws an error in Pig

I am unable to decode this simple JSON, and I don't know what I am doing wrong; please help me with this Pig script.
I have to decode the data below, which is in JSON format.
3.json
{
"id": 6668,
"source_name": "National Stock Exchange of India",
"source_code": "NSE"
}
and my Pig script is:
a = LOAD '3.json' USING org.apache.pig.builtin.JsonLoader ('id:int, source_name:chararray, source_code:chararray');
dump a;
The error I get is given below:
2015-07-23 13:40:08,715 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local1664361500_0001_m_000000_0
2015-07-23 13:40:08,775 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2015-07-23 13:40:08,780 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 88
Input split[0]:
Length = 88
Locations:
-----------------------
2015-07-23 13:40:08,793 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/home/hariprasad.sudo/3.json:0+88
2015-07-23 13:40:08,844 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: a[1,4] C: R:
2015-07-23 13:40:08,861 [Thread-5] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2015-07-23 13:40:08,867 [Thread-5] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1664361500_0001
java.lang.Exception: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.ByteArrayInputStream#61a79110; line: 1, column: 0])
at [Source: java.io.ByteArrayInputStream#61a79110; line: 1, column: 3]
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.ByteArrayInputStream#61a79110; line: 1, column: 0])
at [Source: java.io.ByteArrayInputStream#61a79110; line: 1, column: 3]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportInvalidEOF(JsonParserMinimalBase.java:318)
at org.codehaus.jackson.impl.JsonParserBase._handleEOF(JsonParserBase.java:354)
at org.codehaus.jackson.impl.Utf8StreamParser._skipWSOrEnd(Utf8StreamParser.java:1841)
at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:275)
at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:180)
at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:164)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-07-23 13:40:09,179 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-07-23 13:40:09,179 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1664361500_0001 has failed! Stop running all dependent jobs
2015-07-23 13:40:09,179 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-07-23 13:40:09,180 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2015-07-23 13:40:09,180 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2015-07-23 13:40:09,181 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.3.0-cdh5.1.3 0.12.0-cdh5.1.3 hariprasad.sudo 2015-07-23 13:40:07 2015-07-23 13:40:09 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1664361500_0001 a MAP_ONLY Message: Job failed! file:/tmp/temp-65649055/tmp1240506051,
Input(s):
Failed to read data from "file:///home/hariprasad.sudo/3.json"
Output(s):
Failed to produce result in "file:/tmp/temp-65649055/tmp1240506051"
Job DAG:
job_local1664361500_0001
2015-07-23 13:40:09,181 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2015-07-23 13:40:09,186 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias a
Details at logfile: /home/hariprasad.sudo/pig_1437673203961.log
grunt> 2015-07-23 13:40:14,754 [communication thread] INFO org.apache.hadoop.mapred.LocalJobRunner - map > map
Please help me in understanding what is wrong.
Thanks,
Hari
Put the compact version of the JSON in 3.json. We can use http://www.jsoneditoronline.org for that.
3.json
{"id":6668,"source_name":"National Stock Exchange of India","source_code":"NSE"}
With this we are able to dump the data:
(6668,National Stock Exchange of India,NSE)
Ref: Error from Json Loader in Pig, where a similar issue is discussed.
Extract from the above reference link:
Pig doesn't usually like "human readable" JSON. Get rid of the spaces and/or indentations, and you're good.
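If you prefer to compact the file programmatically instead of using an online editor, here is a small sketch with Python's standard json module (file names are just examples) that rewrites a pretty-printed object as a single line, which is the layout Pig's JsonLoader expects:

import json

# Read the pretty-printed JSON and rewrite it as one compact object per line.
with open("3.json") as src:
    record = json.load(src)

with open("3_compact.json", "w") as dst:
    dst.write(json.dumps(record, separators=(",", ":")) + "\n")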