I am trying to fine-tune the Donut model on the naver-clova-ix/cord-v2 dataset, but I get an "EOFError: Ran out of input" error. I have a Windows-based server with 2 GPUs and tried to make only one of them visible for training by setting CUDA_VISIBLE_DEVICES=0. The downloaded data, including the .json and .arrow files for train/test/validation, is not empty. Any idea?
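(For context: on Windows the variable has to be set before the process starts, e.g. set CUDA_VISIBLE_DEVICES=0 in cmd, or from Python before torch initializes CUDA. A minimal sketch of the latter, not the actual train.py code:)

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # must run before torch initializes CUDA

import torch
print(torch.cuda.device_count())           # prints 1 if the mask took effect

The full console output of the run is below: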
>>train.py --config ../config/train_cord.yaml
resume_from_checkpoint_path: None
result_path: ./result
pretrained_model_name_or_path: naver-clova-ix/donut-base
dataset_name_or_paths:
- naver-clova-ix/cord-v2
sort_json_key: False
train_batch_sizes:
- 8
val_batch_sizes:
- 1
input_size:
- 1280
- 960
max_length: 768
align_long_axis: False
num_nodes: 1
seed: 2022
lr: 3e-05
warmup_steps: 300
num_training_samples_per_epoch: 800
max_epochs: 30
max_steps: -1
num_workers: 8
val_check_interval: 1.0
check_val_every_n_epoch: 3
gradient_clip_val: 1.0
verbose: True
exp_name: train_cord
exp_version: 20221103_132724
Config is saved at result\train_cord\20221103_132724\config.yaml
D:\Tools\PVPythonVE\Donut\lib\site-packages\pytorch_lightning\utilities\seed.py:49: LightningDeprecationWarning: `pytorch_lightning.utilities.seed.seed_everything` has been deprecated in v1.8.0 and will be removed in v1.10.0. Please use `lightning_lite.utilities.seed.seed_everything` instead.
"`pytorch_lightning.utilities.seed.seed_everything` has been deprecated in v1.8.0 and will be"
Global seed set to 2022
D:\Tools\PVPythonVE\Donut\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:3191.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Some weights of DonutModel were not initialized from the model checkpoint at naver-clova-ix/donut-base and are newly initialized because the shapes did not match:
- encoder.model.layers.0.blocks.1.attn_mask: found shape torch.Size([3072, 100, 100]) in the checkpoint and torch.Size([768, 100, 100]) in the model instantiated
- encoder.model.layers.1.blocks.1.attn_mask: found shape torch.Size([768, 100, 100]) in the checkpoint and torch.Size([192, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.1.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.3.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.5.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.7.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.9.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.11.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.13.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.3.blocks.1.attn_mask: found shape torch.Size([48, 100, 100]) in the checkpoint and torch.Size([12, 100, 100]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using custom data configuration naver-clova-ix--cord-v2-6daad2d1fa36191a
Found cached dataset parquet (C:/Users/s.yousefi/.cache/huggingface/datasets/naver-clova-ix___parquet/naver-clova-ix--cord-v2-6daad2d1fa36191a/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)
Using custom data configuration naver-clova-ix--cord-v2-6daad2d1fa36191a
Found cached dataset parquet (C:/Users/s.yousefi/.cache/huggingface/datasets/naver-clova-ix___parquet/naver-clova-ix--cord-v2-6daad2d1fa36191a/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)
D:\Tools\PVPythonVE\Donut\lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py:447: LightningDeprecationWarning: Setting `Trainer(gpus=2)` is deprecated in v1.7 and will be removed in v2.0. Please use `Trainer(accelerator='gpu', devices=2)` instead.
f"Setting `Trainer(gpus={gpus!r})` is deprecated in v1.7 and will be removed"
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
[rank: 0] Global seed set to 2022
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2
[W C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [kubernetes.docker.internal]:59749 (system error: 10049 - The requested address is not valid in its context.).
[rank: 1] Global seed set to 2022
D:\Tools\PVPythonVE\Donut\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:3191.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Some weights of DonutModel were not initialized from the model checkpoint at naver-clova-ix/donut-base and are newly initialized because the shapes did not match:
- encoder.model.layers.0.blocks.1.attn_mask: found shape torch.Size([3072, 100, 100]) in the checkpoint and torch.Size([768, 100, 100]) in the model instantiated
- encoder.model.layers.1.blocks.1.attn_mask: found shape torch.Size([768, 100, 100]) in the checkpoint and torch.Size([192, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.1.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.3.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.5.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.7.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.9.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.11.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.2.blocks.13.attn_mask: found shape torch.Size([192, 100, 100]) in the checkpoint and torch.Size([48, 100, 100]) in the model instantiated
- encoder.model.layers.3.blocks.1.attn_mask: found shape torch.Size([48, 100, 100]) in the checkpoint and torch.Size([12, 100, 100]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using custom data configuration naver-clova-ix--cord-v2-6daad2d1fa36191a
Found cached dataset parquet (C:/Users/s.yousefi/.cache/huggingface/datasets/naver-clova-ix___parquet/naver-clova-ix--cord-v2-6daad2d1fa36191a/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)
Using custom data configuration naver-clova-ix--cord-v2-6daad2d1fa36191a
Found cached dataset parquet (C:/Users/s.yousefi/.cache/huggingface/datasets/naver-clova-ix___parquet/naver-clova-ix--cord-v2-6daad2d1fa36191a/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)
[rank: 1] Global seed set to 2022
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
[W C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [kubernetes.docker.internal]:59749 (system error: 10049 - The requested address is not valid in its context.).
[W C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [kubernetes.docker.internal]:59749 (system error: 10049 - The requested address is not valid in its context.).
[W C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [kubernetes.docker.internal]:59749 (system error: 10049 - The requested address is not valid in its context.).
----------------------------------------------------------------------------------------------------
distributed_backend=gloo
All distributed processes registered. Starting with 2 processes
----------------------------------------------------------------------------------------------------
D:\Tools\PVPythonVE\Donut\lib\site-packages\pytorch_lightning\callbacks\model_checkpoint.py:606: UserWarning: Checkpoint directory E:\Sahar\ILR_sahar\donut\result\train_cord\20221103_132724 exists and is not empty.
rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1]
| Name | Type | Params
-------------------------------------
0 | model | DonutModel | 201 M
-------------------------------------
201 M Trainable params
0 Non-trainable params
201 M Total params
402.248 Total estimated model params size (MB)
Epoch 0: 0%| | 0/50 [00:00<?, ?it/s] win32
win32
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "D:\Tools\PVPython\lib\multiprocessing\spawn.py", line 106, in spawn_main
exitcode = _main(fd)
File "D:\Tools\PVPython\lib\multiprocessing\spawn.py", line 115, in _main
preparation_data = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "D:\Tools\PVPython\lib\multiprocessing\spawn.py", line 106, in spawn_main
exitcode = _main(fd)
File "D:\Tools\PVPython\lib\multiprocessing\spawn.py", line 115, in _main
preparation_data = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Good day!
I have built a small IoT device that monitors the conditions inside a specific enclosure using an ESP32 and a couple of sensors. I want to monitor that data by publishing it to the ThingSpeak cloud, then writing it to InfluxDB with Telegraf and finally using the InfluxDB data source in Grafana to visualize it.
So far I have made everything work flawlessly, but with one small exception.
The exception: one of the plugins in my Telegraf config fails with this error:
parsing metrics failed: Unable to convert field 'temperature' to type int: strconv.ParseInt: parsing "15.4": invalid syntax
The plugins involved are [[inputs.http]] and [[inputs.http.json_v2]]; I use them to authenticate against the ThingSpeak API and parse the JSON output of my fields. Under [[inputs.http.json_v2.field]] in /etc/telegraf/telegraf.conf I have added type = int, because otherwise Telegraf writes my metrics as strings to InfluxDB and the only way to visualize them is a table or a single stat (the rest of the Flux queries fail with the error unsupported input type for mean aggregate: string). However, when I change to type = float in the config file I get a different error:
unprocessable entity: failure writing points to database: partial write: field type conflict: input field "temperature" on measurement "sensorData" is type float, already exists as type string dropped=1
I suspect that I have misconfigured the parser plugin, but after hours of debugging I couldn't come up with a solution.
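(In other words, "15.4" is simply not an integer literal, so Go's strconv.ParseInt rejects it, the same way e.g. Python would:)

print(float("15.4"))          # 15.4, parses fine as a float
try:
    print(int("15.4"))        # fails: "15.4" is not an integer literal
except ValueError as err:
    print(err)                # invalid literal for int() with base 10: '15.4'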
Some information that might be of use:
Telegraf version: Telegraf 1.24.2
Influxdb version: InfluxDB v2.4.0
Please see below for my telegraf.conf as well as the error messages.
Any help would be highly appreciated! (:
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 1000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false
[[outputs.influxdb_v2]]
urls = ["http://localhost:8086"]
token = "XXXXXXXX"
organization = "XXXXXXXXX"
bucket = "sensor"
[[inputs.http]]
urls = [
"https://api.thingspeak.com/channels/XXXXX/feeds.json?api_key=XXXXXXXXXX&results=2"
]
name_override = "sensorData"
tagexclude = ["url", "host"]
data_format = "json_v2"
## HTTP method
method = "GET"
[[inputs.http.json_v2]]
[[inputs.http.json_v2.field]]
path = "feeds.1.field1"
rename = "temperature"
type = "int" #Error message 1
#type = "float" #Error message 2
Error when type = "float":
me@myserver:/etc/telegraf$ telegraf -config telegraf.conf --debug
2022-10-16T00:31:43Z I! Starting Telegraf 1.24.2
2022-10-16T00:31:43Z I! Available plugins: 222 inputs, 9 aggregators, 26 processors, 20
parsers, 57 outputs
2022-10-16T00:31:43Z I! Loaded inputs: http
2022-10-16T00:31:43Z I! Loaded aggregators:
2022-10-16T00:31:43Z I! Loaded processors:
2022-10-16T00:31:43Z I! Loaded outputs: influxdb_v2
2022-10-16T00:31:43Z I! Tags enabled: host=myserver
2022-10-16T00:31:43Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"myserver",
Flush Interval:10s
2022-10-16T00:31:43Z D! [agent] Initializing plugins
2022-10-16T00:31:43Z D! [agent] Connecting outputs
2022-10-16T00:31:43Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2022-10-16T00:31:43Z D! [agent] Successfully connected to outputs.influxdb_v2
2022-10-16T00:31:43Z D! [agent] Starting service inputs
2022-10-16T00:31:53Z E! [outputs.influxdb_v2] Failed to write metric to sensor (will be
dropped: 422 Unprocessable Entity): unprocessable entity: failure writing points to
database: partial write: field type conflict: input field "temperature" on measurement
"sensorData" is type float, already exists as type string dropped=1
2022-10-16T00:31:53Z D! [outputs.influxdb_v2] Wrote batch of 1 metrics in 8.9558ms
2022-10-16T00:31:53Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
Error when type = "int":
me@myserver:/etc/telegraf$ telegraf -config telegraf.conf --debug
2022-10-16T00:37:05Z I! Starting Telegraf 1.24.2
2022-10-16T00:37:05Z I! Available plugins: 222 inputs, 9 aggregators, 26 processors, 20
parsers, 57 outputs
2022-10-16T00:37:05Z I! Loaded inputs: http
2022-10-16T00:37:05Z I! Loaded aggregators:
2022-10-16T00:37:05Z I! Loaded processors:
2022-10-16T00:37:05Z I! Loaded outputs: influxdb_v2
2022-10-16T00:37:05Z I! Tags enabled: host=myserver
2022-10-16T00:37:05Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"myserver",
Flush Interval:10s
2022-10-16T00:37:05Z D! [agent] Initializing plugins
2022-10-16T00:37:05Z D! [agent] Connecting outputs
2022-10-16T00:37:05Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2022-10-16T00:37:05Z D! [agent] Successfully connected to outputs.influxdb_v2
2022-10-16T00:37:05Z D! [agent] Starting service inputs
2022-10-16T00:37:10Z E! [inputs.http] Error in plugin:
[url=https://api.thingspeak.com/channels/XXXXXX/feeds.json?
api_key=XXXXXXX&results=2]: parsing metrics failed: Unable to convert field
'temperature' to type int: strconv.ParseInt: parsing "15.3": invalid syntax
Fixed it by leaving type = "float" under [[inputs.http.json_v2.field]] in telegraf.conf and creating a NEW bucket with a new API key in InfluxDB.
The issue was that the bucket sensor previously defined in my telegraf.conf already contained the field temperature from earlier attempts, stored as a string (so only aggregates like last worked, not mean), and that existing type could not be overwritten with float.
As soon as I deleted all pre-existing buckets, everything started working as expected.
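(An alternative to creating a brand-new bucket would be to delete the conflicting data so the field can be rewritten with the new type; a rough sketch with the influxdb-client Python package, reusing the placeholder URL/token/org/bucket names from the config above:)

from influxdb_client import InfluxDBClient

# Placeholders: use your own URL, token, org and bucket.
with InfluxDBClient(url="http://localhost:8086", token="XXXXXXXX", org="XXXXXXXXX") as client:
    client.delete_api().delete(
        start="1970-01-01T00:00:00Z",
        stop="2100-01-01T00:00:00Z",
        predicate='_measurement="sensorData"',
        bucket="sensor",
        org="XXXXXXXXX",
    )
# Once the old string-typed points are gone, Telegraf can write "temperature" as a float.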
InfluxDB dashboard
I'm using a connection pool and I'm unsure what to do when the MySQL server drops my client's connection due to inactivity, or when the MySQL server goes down. I'm calling the function below every time I have to make a query:
def getDbCnx():
    try:
        dbConn = mysql.connector.connect(pool_name="connectionPool", pool_size=3, pool_reset_session=True, **dbConfig)
    except mysql.connector.Error as err:
        if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
            print("Something is wrong with your user name or password")
            dbConn.close()
            return None
        elif err.errno == errorcode.ER_BAD_DB_ERROR:
            print("Database does not exist")
            dbConn.close()
            return None
        else:
            print(err)
            dbConn.close()
            return None
    else:
        return dbConn
As per my current understanding, the connection pool is initialised on the first call of this function, and after that it just returns a free connection from the already initialised pool. Now suppose the pool is initialised successfully on the first call, and some time later the MySQL server goes down or drops the connection due to inactivity. What will happen when I run a query after that? I assume the older connection context would have gone stale.
Basically, how do I ensure that the connection pool refreshes its internal connections every time it loses connectivity with the MySQL server?
When you invoke dbConn.close(), the connection is reset (see the source here: https://github.com/mysql/mysql-connector-python/.../mysql/connector/pooling.py#L118; session variables are deallocated, uncommitted transactions are lost, etc.). The connection is not fully closed, which you can check by printing the connection id (it should not change if it is the same connection).
When you attempt to retrieve another connection from the pool with mysql.connector.connect(pool_name="connectionPool"), the pool checks the connection; if it cannot be reconnected, a new connection is opened (with a new connection id), and if opening that new connection also fails, an error is raised. So as long as the server is online, the user account you are using exists on the server and the pool is not exhausted, you are almost certain to get a connection, even if the server was restarted or upgraded after the pool was created, or if the server had closed the inactive session. Just make sure you close each connection so it goes back to the pool and can be reused.
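(A small illustration of that last point, not from the original code: wrapping the connection in contextlib.closing guarantees it is returned to the pool even if a query raises.)

from contextlib import closing

cnx = getDbCnx()
if cnx:
    # closing() calls .close() on exit, which puts a pooled connection back into the pool
    with closing(cnx) as conn, closing(conn.cursor()) as cur:
        cur.execute("SELECT 1")
        print(cur.fetchall())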
In the example below I shut down the server with the SHUTDOWN command from the MySQL console and then restart it with mysqladmin. You can see the connection id of each connection in the pool (some connections were reused), and that the session variables are deallocated because the connection is reset when it goes back to the pool.
from time import sleep

import mysql.connector
from mysql.connector import errorcode
from builtins import range
from mysql.connector import errors

dbConfig = {
    'host': '127.0.0.1',
    'user': 'some_user', 'password': 'some_pass',
    'port': 4824,
}

def getDbCnx():
    try:
        dbConn = mysql.connector.connect(
            pool_name="connectionPool",
            pool_size=3,
            pool_reset_session=True,
            **dbConfig
        )
        return dbConn
    except (AttributeError, errors.InterfaceError) as err:
        # Errors from bad configuration: unsupported or invalid options
        print(f"Something is wrong with the connection pool: {err}", flush=True)
    except errors.PoolError as err:
        # Errors from bad connection pool configuration or an exhausted pool
        print(f"Something is wrong with the connection pool: {err}", flush=True)
    except errors.OperationalError as err:
        # Errors from MySQL like lost connection (2013, 2055)
        print(f"Something is wrong with the MySQL server: {err}", flush=True)
    except errors.ProgrammingError as err:
        # Errors from bad connection data
        if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
            print("Something is wrong with your user name or password", flush=True)
        elif err.errno == errorcode.ER_BAD_DB_ERROR:
            print("Database does not exist", flush=True)
    except mysql.connector.Error as err:
        print(f"{err}", flush=True)
        print(f"err type: {type(err)}")
    return None

def can_connect():
    print("Getting connections...")
    greetings = ["hello", "hi", "howdy", "hola"]
    for n in range(4):
        print(f"getting connection {n}")
        cnx = getDbCnx()
        if not cnx:
            print("No database connection!!!")
            return False
        cur = cnx.cursor()
        cur.execute("select connection_id()")
        res = cur.fetchall()
        print(f"connection id: {res}")
        cur.execute('show variables like "%greeting%"')
        res = cur.fetchall()
        print(f"greeting?: {res}")
        cur.execute("select @greeting")
        greet = cur.fetchall()
        print(f"greet: {greet}")
        cur.execute(f"SET @greeting='{greetings[n]}'")
        cur.execute("select @greeting")
        greet = cur.fetchall()
        print(f"greet: {greet}\n")
        cur.close()
        cnx.close()
    print("")
    return True

def pause(sleep_secs=30, count_down=29):
    sleep(sleep_secs)
    for s in range(count_down, 0, -1):
        print(f"{s}, ", end='')
        sleep(1)
    print()

def test():
    print("Initial test")
    assert can_connect()
    print("\nStop the server now...")
    pause(10, 20)
    print("\ntest with server stopped")
    print("\ngetting connections with server shutdown should fail")
    assert not can_connect()
    print("\nStart the server now...")
    pause()
    print("\ntest if we can get connections again")
    print("second test")
    assert can_connect()

if __name__ == "__main__":
    test()
Here is the output of the example above; even though the server was shut down, you can still retrieve connections once it comes back online:
Initial test
Getting connections...
getting connection 0
connection id: [(9,)]
greeting?: []
greet: [(None,)]
greet: [('hello',)]
getting connection 1
connection id: [(10,)]
greeting?: []
greet: [(None,)]
greet: [('hi',)]
getting connection 2
connection id: [(11,)]
greeting?: []
greet: [(None,)]
greet: [('howdy',)]
getting connection 3
connection id: [(9,)]
greeting?: []
greet: [(None,)]
greet: [('hola',)]
Stop the server now...
20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1,
test with server stopped
getting connections with server shutdown should fail
Getting connections...
getting connection 0
Something is wrong with the connection pool: Can not reconnect to MySQL after 1 attempt(s): 2003: Can't connect to MySQL server on '127.0.0.1:4824' (10061 No connection could be made because the target machine actively refused it)
No database connection!!!
Start the server now...
29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1,
test if we can get connections again
second test
Getting connections...
getting connection 0
connection id: [(23,)]
greeting?: []
greet: [(None,)]
greet: [('hello',)]
getting connection 1
connection id: [(24,)]
greeting?: []
greet: [(None,)]
greet: [('hi',)]
getting connection 2
connection id: [(25,)]
greeting?: []
greet: [(None,)]
greet: [('howdy',)]
getting connection 3
connection id: [(23,)]
greeting?: []
greet: [(None,)]
greet: [('hola',)]
We can see that the first time we retrieve connections from the pool we get the connection ids [9, 10, 11], with connection 9 being reused. After the server is shut down, the "No database connection!!!" text is printed, and once I start the server again the connection ids are [23, 24, 25], with connection 23 being reused. In addition, the @greeting variable was deallocated on the server whenever a connection was reset and returned to the pool.
When I try to take a data backup from a Couchbase VM using the command below
cbbackup -v http://...:8091 /opt/couchbase/backup -u Administrator -p ******
I get the following error:
2018-10-22 07:13:01,647: mt cbbackup...
2018-10-22 07:13:01,648: mt source : http://**.***.**.***:8091
2018-10-22 07:13:01,648: mt sink : /opt/couchbase/backup
2018-10-22 07:13:01,648: mt opts : {'username': '<xxx>', 'verbose': 1, 'extra':
{'max_retry': 10.0, 'rehash': 0.0, 'dcp_consumer_queue_length': 1000.0, 'data_only': 0.0, 'uncompress': 0.0, 'nmv_retry': 1.0, 'conflict_resolve': 1.0, 'cbb_max_mb': 100000.0, 'report': 5.0, 'mcd_compatible': 1.0, 'try_xwm': 1.0, 'backoff_cap': 10.0, 'batch_max_bytes': 400000.0, 'report_full': 2000.0, 'flow_control': 1.0, 'batch_max_size': 1000.0, 'seqno': 0.0, 'design_doc_only': 0.0, 'allow_recovery_vb_remap': 0.0, 'recv_min_bytes': 4096.0}
, 'collection': None, 'ssl': False, 'threads': 4, 'key': None, 'password': '<xxx>', 'id': None, 'bucket_source': None, 'silent': False, 'dry_run': False, 'single_node': False, 'vbucket_list': None, 'separator': '::', 'mode': 'diff'}
2018-10-22 07:13:01,655: mt Starting new HTTP connection (1): *********
2018-10-22 07:13:01,662: mt bucket: sample_bucket
Exception in thread s3:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self._target(*self.args, **self._kwargs)
File "/opt/couchbase/lib/python/pump_bfd.py", line 646, in run
return self.future_done(future, rv)
UnboundLocalError: local variable 'future' referenced before assignment
I'm using Couchbase EE 5.1.1. Any suggestions?
Use cbbackupmgr instead for EE
I ended up reading the documentation and found out that for the Enterprise Edition there is a backup manager, cbbackupmgr, which is faster and more efficient than cbbackup and cbrestore.
All it requires is to first configure an empty directory as the backup archive. For more information please read: https://docs.couchbase.com/server/5.5/backup-restore/cbbackupmgr-tutorial.html
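(For reference, the tutorial boils down to two commands, which could also be scripted; a rough Python sketch with placeholder paths, credentials and repository name, so double-check the flags against the docs for your server version:)

import subprocess

ARCHIVE = "/opt/couchbase/backup"   # must initially be an empty directory
REPO = "nightly"                    # hypothetical repository name

# One-time setup of the backup archive/repository.
subprocess.run(["cbbackupmgr", "config", "--archive", ARCHIVE, "--repo", REPO], check=True)

# Back up the cluster (credentials are placeholders).
subprocess.run([
    "cbbackupmgr", "backup",
    "--archive", ARCHIVE, "--repo", REPO,
    "--cluster", "couchbase://127.0.0.1",
    "--username", "Administrator",
    "--password", "password",
], check=True)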
I am a newbie to Grails. I have just started creating my first application, which simply performs CRUD operations against my local MySQL database, and I want to see the entered data on my machine. But I am unable to connect to the database at all. Please help me find where I am going wrong.
My DataSource.groovy code is as shown below:
dataSource {
    pooled = true
    jmxExport = true
    driverClassName = com.mysql.jdbc.Driver
    dialect = org.hibernate.dialect.MySQL5InnoDBDialect
    username = "test"
    password = "test"
}
environments {
    development {
        dataSource {
            dbCreate = "update"
            url = "jdbc:mysql://localhost:3306/librarydb"
        }
    }
    test {
        dataSource {
            dbCreate = "update"
            url = "jdbc:mysql://localhost:3306/librarydb"
            //url = "jdbc:h2:mem:testDb;MVCC=TRUE;LOCK_TIMEOUT=10000;DB_CLOSE_ON_EXIT=FALSE"
        }
    }
    production {
        dataSource {
            dbCreate = "update"
            url = "jdbc:mysql://localhost:3306/librarydb"
            //url = "jdbc:h2:prodDb;MVCC=TRUE;LOCK_TIMEOUT=10000;DB_CLOSE_ON_EXIT=FALSE"
            properties {
                // See http://grails.org/doc/latest/guide/conf.html#dataSource for documentation
                jmxEnabled = true
                initialSize = 5
                maxActive = 50
                minIdle = 5
                maxIdle = 25
                maxWait = 10000
                maxAge = 10 * 60000
                timeBetweenEvictionRunsMillis = 5000
                minEvictableIdleTimeMillis = 60000
                validationQuery = "SELECT 1"
                validationQueryTimeout = 3
                validationInterval = 15000
                testOnBorrow = true
                testWhileIdle = true
                testOnReturn = false
                jdbcInterceptors = "ConnectionState"
                defaultTransactionIsolation = java.sql.Connection.TRANSACTION_READ_COMMITTED
            }
        }
    }
}
And my BuildConfig.groovy file is configured as shown.
grails.servlet.version = "3.0" // Change depending on target container compliance (2.5 or 3.0)
grails.project.class.dir = "target/classes"
grails.project.test.class.dir = "target/test-classes"
grails.project.test.reports.dir = "target/test-reports"
grails.project.work.dir = "target/work"
grails.project.target.level = 1.6
grails.project.source.level = 1.6
//grails.project.war.file = "target/${appName}-${appVersion}.war"
grails.project.fork = [
    // configure settings for compilation JVM, note that if you alter the Groovy version forked compilation is required
    // compile: [maxMemory: 256, minMemory: 64, debug: false, maxPerm: 256, daemon:true],
    // configure settings for the test-app JVM, uses the daemon by default
    test: [maxMemory: 768, minMemory: 64, debug: false, maxPerm: 256, daemon:true],
    // configure settings for the run-app JVM
    run: [maxMemory: 768, minMemory: 64, debug: false, maxPerm: 256, forkReserve:false],
    // configure settings for the run-war JVM
    war: [maxMemory: 768, minMemory: 64, debug: false, maxPerm: 256, forkReserve:false],
    // configure settings for the Console UI JVM
    console: [maxMemory: 768, minMemory: 64, debug: false, maxPerm: 256]
]
grails.project.dependency.resolver = "maven" // or ivy
grails.project.dependency.resolution = {
    // inherit Grails' default dependencies
    inherits("global") {
        // specify dependency exclusions here; for example, uncomment this to disable ehcache:
        // excludes 'ehcache'
    }
    log "error" // log level of Ivy resolver, either 'error', 'warn', 'info', 'debug' or 'verbose'
    checksums true // Whether to verify checksums on resolve
    legacyResolve false // whether to do a secondary resolve on plugin installation, not advised and here for backwards compatibility
    repositories {
        inherits true // Whether to inherit repository definitions from plugins
        grailsPlugins()
        grailsHome()
        mavenLocal()
        grailsCentral()
        mavenCentral()
        // uncomment these (or add new ones) to enable remote dependency resolution from public Maven repositories
        //mavenRepo "http://repository.codehaus.org"
        //mavenRepo "http://download.java.net/maven/2/"
        //mavenRepo "http://repository.jboss.com/maven2/"
    }
    dependencies {
        // specify dependencies here under either 'build', 'compile', 'runtime', 'test' or 'provided' scopes e.g.
        runtime 'mysql:mysql-connector-java:5.1.44'
        // runtime 'org.postgresql:postgresql:9.3-1101-jdbc41'
        test "org.grails:grails-datastore-test-support:1.0.2-grails-2.4"
    }
    plugins {
        // plugins for the build system only
        build ":tomcat:7.0.70" // or ":tomcat:8.0.22"
        // plugins for the compile step
        compile ":scaffolding:2.1.2"
        compile ':cache:1.1.8'
        // asset-pipeline 2.0+ requires Java 7, use version 1.9.x with Java 6
        compile ":asset-pipeline:2.5.7"
        // plugins needed at runtime but not for compilation
        runtime ":hibernate4:4.3.10" // or ":hibernate:3.6.10.18"
        runtime ":database-migration:1.4.0"
        runtime ":jquery:1.11.1"
        // Uncomment these to enable additional asset-pipeline capabilities
        //compile ":sass-asset-pipeline:1.9.0"
        //compile ":less-asset-pipeline:1.10.0"
        //compile ":coffee-asset-pipeline:1.8.0"
        //compile ":handlebars-asset-pipeline:1.3.0.3"
    }
}
But I am getting the below error in my command prompt.
[localhost-startStop-1] ERROR pool.ConnectionPool - Unable to create initial connections of pool.
Message: class com.mysql.jdbc.Driver
Line | Method
->> 266 | run in java.util.concurrent.FutureTask
1149 | runWorker in java.util.concurrent.ThreadPoolExecutor
624 | run . . . in java.util.concurrent.ThreadPoolExecutor$Worker
748 | run in java.lang.Thread
Caused by ClassNotFoundException: class com.mysql.jdbc.Driver
->> 381 | findClass in java.net.URLClassLoader
424 | loadClass in java.lang.ClassLoader
348 | forName . in java.lang.Class
266 | run in java.util.concurrent.FutureTask
1149 | runWorker in java.util.concurrent.ThreadPoolExecutor
624 | run in java.util.concurrent.ThreadPoolExecutor$Worker
748 | run . . . in java.lang.Thread
Error |
2017-11-21 23:22:20,629 [localhost-startStop-1] ERROR
pool.ConnectionPool - Unable to create initial connections of pool.
Message: class com.mysql.jdbc.Driver
Line | Method
->>266 | run in java.util.concurrent.FutureTask
I am using Grails version 2.5.6, I have added the mysql-connector-java-5.1.44-bin.jar file to the lib folder of my Grails application, and JAVA_HOME is set as well. Please help me solve this and connect to my database to store the data. Thanks in advance.
Try grails clean
and then grails refresh-dependencies, then check and confirm that the jar mysql:mysql-connector-java:5.1.29 is in your build path.
UPDATE:
Remove the mysql dependency from BuildConfig.groovy if you have the mysql-connector-java:5.x.x jar in lib, or vice versa.
Having the same jar twice causes trouble.
Related post:
Unable to create initial connections of pool issues in Grails
java.lang.ClassNotFoundException:com.mysql.jdbc.Driver
The database needs to be created first. Grails will try to connect to an existing database; it will not create the database for you.
Run these commands in the MySQL shell (if you are using Linux/Unix, become a superuser first):
CREATE DATABASE librarydb;
CREATE USER 'test'@'localhost' IDENTIFIED BY 'test';
GRANT ALL ON librarydb.* TO 'test'@'localhost';
If you have done all of that, please check your username/password combination.
In a terminal, run:
mysql -u test -p librarydb
and see if you can connect to the database.
It worked for me when I removed the jar file from the lib folder of my Grails application and used the default MySQL connection, i.e. username="root" and password="root" in my DataSource file. Thanks to @devbd for the answer.
When I try to convert a model from Caffe to a Core ML model with coremltools, I get the following:
================= Starting Conversion from Caffe to CoreML ======================
Layer 0: Type: 'Data', Name: 'data'. Output(s): 'data', 'label'.
WARNING: Skipping Data Layer 'data' of type 'Data'. It is recommended to use Input layer for deployment.
Layer 1: Type: 'Split', Name: 'label_data_1_split'. Input(s): 'label'. Output(s): 'label_data_1_split_0', 'label_data_1_split_1'.
Layer 2: Type: 'Convolution', Name: 'conv1'. Input(s): 'data'. Output(s): 'conv1'.
Layer 3: Type: 'Slice', Name: 'slice1'. Input(s): 'conv1'. Output(s): 'slice1_1', 'slice1_2'.
Layer 4: Type: 'Eltwise', Name: 'etlwise1'. Input(s): 'slice1_1', 'slice1_2'. Output(s): 'eltwise1'.
Traceback (most recent call last):
File "test.py", line 2, in <module>
coreml_model = coremltools.converters.caffe.convert('_iter_3560000.caffemodel')
File "/Users/zfh/Desktop/face_verification_experiment/model/python27/lib/python2.7/site-packages/coremltools/converters/caffe/_caffe_converter.py", line 142, in convert
predicted_feature_name)
File "/Users/zfh/Desktop/face_verification_experiment/model/python27/lib/python2.7/site-packages/coremltools/converters/caffe/_caffe_converter.py", line 187, in _export
predicted_feature_name
RuntimeError: Unsupported option 'Max' for the parameter 'operation' in layer 'etlwise1' of type 'Elementwise' during caffe conversion.
This is the code I am using:
import coremltools
coreml_model = coremltools.converters.caffe.convert(('_iter_3560000.caffemodel', 'LCNN_deploy.prototxt'))
coreml_model.save('_iter_3560000.mlmodel')
Any ideas what the problem is? Thank you very much!
As the error message says, the problem is that the Max operation in an Eltwise layer is not supported by coremltools. Core ML only supports a limited number of layers.
However... it seems like maybe you're trying to convert the .prototxt that was used for training (even though the filename is LCNN_deploy.prototxt). Are you sure this is the correct deploy.prototxt?
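(One way to check is to parse the deploy prototxt and list its Eltwise layers and their operations; a small sketch assuming pycaffe's generated caffe_pb2 module is importable, with the filename taken from the question and the newer "layer" field format:)

from google.protobuf import text_format
from caffe.proto import caffe_pb2   # assumes a Caffe installation that provides caffe_pb2

net = caffe_pb2.NetParameter()
with open("LCNN_deploy.prototxt") as f:   # filename from the question
    text_format.Merge(f.read(), net)

for layer in net.layer:
    if layer.type == "Eltwise":
        # operation is an enum: 0 = PROD, 1 = SUM, 2 = MAX
        print(layer.name, layer.eltwise_param.operation)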
Recently, I extracted the Caffe-to-mlmodel conversion tool from coremltools; it is the C++ implementation.
First of all, you need to know which Caffe layers the tool supports; they are defined in caffe.proto (included in the caffeconverter directory).
Then open caffe.proto and locate the message LayerParameter, as shown below. There you can find the supported Caffe layers.
caffe.proto in the caffeconverter directory of coremltools
Finally, if you need a custom Caffe layer, adapt caffe.proto and study the Core ML model protobuf specification (https://apple.github.io/coremltools/coremlspecification/#).