I've recently installed OrientDB and am trying to create an import using the ETL module.
I'm running on OS X and installed OrientDB using Homebrew.
I've created the following ETL script:
{
"config": {
"log": "debug"
},
"begin": [
],
"extractor" : {
"row": {}
},
"transformers": [
{ "jdbc": {
"driver": "com.mysql.jdbc.Driver",
"url": "jdbc:mysql://localhost/dev_database",
"userName": "root",
"userPassword": "",
"query": "select * from users limit 20"
}
},
{ "vertex": { "class": "V" } }
],
"loader": {
"orientdb": {
"dbURL": "plocal:../databases/ETLDemo",
"dbUser": "admin",
"dbPassword": "admin",
"dbAutoCreate": true,
"tx": false,
"batchCommit": 1000,
"dbType": "graph"
}
}
}
I followed the instructions here: http://www.orientechnologies.com/docs/2.0/orientdb-etl.wiki/Import-from-DBMS.html
installed the JDBC driver for MySQL from here: http://dev.mysql.com/downloads/connector/j/
and set the classpath as described.
Running the command:
./oetl.sh ../import_mysql.json
Gives the following output:
OrientDB etl v.2.0.2 (build #BUILD#) www.orientechnologies.com
Exception in thread "main" com.orientechnologies.orient.core.exception.OConfigurationException: Error on creating ETL processor
at com.orientechnologies.orient.etl.OETLProcessor.parse(OETLProcessor.java:278)
at com.orientechnologies.orient.etl.OETLProcessor.parse(OETLProcessor.java:188)
at com.orientechnologies.orient.etl.OETLProcessor.main(OETLProcessor.java:163)
Caused by: java.lang.IllegalArgumentException: Transformer 'jdbc' not found
at com.orientechnologies.orient.etl.OETLComponentFactory.getTransformer(OETLComponentFactory.java:141)
at com.orientechnologies.orient.etl.OETLProcessor.parse(OETLProcessor.java:260)
... 2 more
I did manage to create a working import using a CSV file, so I'm pretty sure that the database is set up correctly.
Thoughts?
In the ETL module, jdbc is an extractor, not a transformer, which is why you get "Transformer 'jdbc' not found". Try moving the jdbc block into the extractor section:
{
"config": {
"log": "debug"
},
"extractor": {
"jdbc": {
"driver": "com.mysql.jdbc.Driver",
"url": "jdbc:mysql://localhost/dev_database",
"userName": "root",
"userPassword": "",
"query": "select * from users limit 20"
}
},
"transformers" : [
{ "vertex": { "class": "V"} }
],
"loader": {
"orientdb": {
"dbURL": "plocal:../databases/ETLDemo",
"dbUser": "admin",
"dbPassword": "admin",
"dbAutoCreate": true,
"tx": false,
"batchCommit": 1000,
"dbType": "graph"
}
}
}
Can you see if this solves the problem?
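Separately, make sure the MySQL JDBC driver jar is actually visible to oetl.sh. One simple way is to drop the connector jar into OrientDB's lib directory; the path and version number below are assumptions for a Homebrew install, so adjust them to your machine:
# copy the MySQL Connector/J jar into the OrientDB lib folder picked up by oetl.sh
cp ~/Downloads/mysql-connector-java-5.1.34-bin.jar /usr/local/Cellar/orientdb/2.0.2/libexec/lib/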
Error: String cannot be coerced to a nodeId
Hi,
I was busy setting up a connection between the Orion Broker and a PLC with an OPC-UA server, using the opcua iotagent.
I managed to set up all the parts and I am able to receive (test) data, but I am unable to follow the tutorial with regard to adding an entity to the Orion Broker using a JSON file:
curl http://localhost:4001/iot/devices -H "fiware-service: plcservice" -H "fiware-servicepath: /demo" -H "Content-Type: application/json" -d @add_device.json
The expected result would be an entity added to the Orion Broker with the supplied data, but this only results in an error message:
{"name":"Error","message":"String cannot be coerced to a nodeId : ns*4:s*MAIN.mainVar"}
Suspected error
Is it possible that the iotagent does not work nicely with nested variables?
Steps taken
Double-checked availability of OPC data:
OPC data changes every second and can be seen in the Broker log
Reduced the complexity of the setup to only include the Broker and the IoT Agent
Additional information:
add_device.json file:
{
"devices": [
{
"device_id": "plc1",
"entity_name": "PLC1",
"entity_type": "plc",
"attributes": [
{
"object_id": "ns*4:s*MAIN.mainVar",
"name": "main",
"type": "Number"
}
],
"lazy": [
],
"commands" : []
}
]
}
config of IOT-agent (from localhost:4081/config):
{
"config": {
"logLevel": "DEBUG",
"contextBroker": {
"host": "orion",
"port": 1026
},
"server": {
"port": 4001,
"baseRoot": "/"
},
"deviceRegistry": {
"type": "memory"
},
"mongodb": {
"host": "iotmongo",
"port": "27017",
"db": "iotagent",
"retries": 5,
"retryTime": 5
},
"types": {
"plc": {
"service": "plcservice",
"subservice": "/demo",
"active": [
{
"name": "main",
"type": "Int16"
},
{
"name": "test1",
"type": "Int16"
},
{
"name": "test2",
"type": "Int16"
}
],
"lazy": [],
"commands": []
}
},
"browseServerOptions": null,
"service": "plc",
"subservice": "/demo",
"providerUrl": "http://iotage:4001",
"pollingExpiration": "200000",
"pollingDaemonFrequency": "20000",
"deviceRegistrationDuration": "P1M",
"defaultType": null,
"contexts": [
{
"id": "plc_1",
"type": "plc",
"service": "plcservice",
"subservice": "/demo",
"polling": false,
"mappings": [
{
"ocb_id": "test1",
"opcua_id": "ns=4;s=test.TestVar.test1",
"object_id": null,
"inputArguments": []
},
{
"ocb_id": "test2",
"opcua_id": "ns=4;s=test.TestVar.test2",
"object_id": null,
"inputArguments": []
},
{
"ocb_id": "main",
"opcua_id": "ns=4;s=MAIN.mainVar",
"object_id": null,
"inputArguments": []
}
]
}
]
}
}
I'm one of the maintainers of the iotagent-opcua repo. We have identified and fixed the bug you reported, so please update your agent to the latest version (1.4.0).
In case you haven't heard about it, starting from 1.3.8 we have introduced a new configuration property called "relaxTemplateValidation" which lets you use previously forbidden characters (e.g. = and ;). I suggest you have a look at the configuration examples provided.
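For reference, the flag appears to be a simple boolean switch. A minimal sketch of enabling it is below; the file name and syntax are assumptions on my part, so please check the configuration examples in the repo for where it actually goes:
# assumption: the exact file and syntax may differ from this sketch
relaxTemplateValidation=true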
The DHCP server is on a different network. I brought up a Linux virtual machine with two interfaces. I get the error DHCPDISCOVER PACKET_NAK_0001.
On the Linux virtual machine, I execute the commands:
dhcrelay ip_dhcp -i name_interface
dhclient -v name_interface -s ip_dhcp
An example of a config which I send through "curl":
{
"command": "config-set",
"service": [
"dhcp4"
],
"arguments": {
"Dhcp4": {
"option-def": [
{
"name": "configRevision",
"code": 254,
"type": "string",
"space": "dhcp4"
}
],
"interfaces-config": {
"interfaces": [
"*"
],
"dhcp-socket-type": "udp"
},
"control-socket": {
"socket-type": "unix",
"socket-name": "/tmp/kea-dhcp4-ctrl.sock"
},
"lease-database": {
"type": "postgresql",
"host": "host",
"name": "name",
"user": "name",
"password": "pass",
"port": 5432,
"lfc-interval": 600
},
"expired-leases-processing": {
"reclaim-timer-wait-time": 10,
"flush-reclaimed-timer-wait-time": 25,
"hold-reclaimed-time": 3600,
"max-reclaim-leases": 100,
"max-reclaim-time": 250,
"unwarned-reclaim-cycles": 5
},
"valid-lifetime": 3600,
"authoritative": true,
"hooks-libraries": [
{
"library": "/usr/local/lib/hooks/libdhcp_lease_cmds.so"
},
{
"library": "/usr/local/lib/hooks/libdhcp_stat_cmds.so"
}
],
"option-data": [
{
"name": "configRevision",
"code": 254,
"data": "1",
"always-send": false
},
{
"name": "domain-name-servers",
"data": "<IP>, <IP>",
"always-send": true
},
{
"name": "time-servers",
"data": "<IP>",
"always-send": true
},
{
"name": "ntp-servers",
"data": "<IP>",
"always-send": true
},
{
"name": "domain-name",
"data": "<DOMAIN>",
"always-send": true
},
{
"name": "dhcp-server-identifier",
"data": "<IP>"
}
],
"shared-networks": [
{
"name": "Zone 1",
"relay": {
"ip-addresses": [
"172.100.100.100",
"<IP>",
"<IP>",
"<IP>"
]
},
"option-data": [],
"subnet4": [
{
"id": 1314,
"subnet": "172.100.100.99/23",
"option-data": [
{
"name": "routers",
"data": "172.100.100.100"
}
],
"pools": [
{
"pool": "172.100.100.130-172.100.100.254",
"client-class": "UNKNOWN"
}
],
"valid-lifetime": 86400,
"reservations": []
}
]
}
]
}
}
Expected Result:
Successful issuance of IP address.
Actual result:
ERROR [kea-dhcp4.bad-packets/26218] DHCP4_PACKET_NAK_0001 [hwtype=1
], cid=[no info], tid=0x23acf436: failed to select a subnet for
incoming packet, src 172.100.100.100, type DHCPDISCOVER
The problem lies with the client-class not being known in time for subnet selection; see the Kea docs:
The determination whether there is a reservation for a given client is made after a subnet is selected, so it is not possible to use “KNOWN”/”UNKNOWN” classes to select a shared network or a subnet.
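As a quick way to test that diagnosis, here is a sketch of the affected pool with the class restriction dropped (same addresses as in your config; whether you can really do without the KNOWN/UNKNOWN classification depends on your reservations):
"pools": [
  {
    "pool": "172.100.100.130-172.100.100.254"
  }
],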
userauth.json
{
"name": "userauth",
"base": "PersistedModel",
"idInjection": true,
"options": {
"validateUpsert": true
},
"properties": {
"id":{
"type":"number",
"required":true,
"length":11,
"mysql":
{
"columnName":"id",
"dataType":"INT",
"dataLength":11,
"nullable":"N"
}
},
"firstname":{
"type":"string",
"required":true,
"length":25,
"mysql":
{
"columnName":"firstname",
"dataType":"VARCHAR",
"dataLength":25,
"nullable":"N"
}
},
"lastname":{
"type":"string",
"required":true,
"length":25,
"mysql":
{
"columnName":"lastname",
"dataType":"VARCHAR",
"dataLength":25,
"nullable":"N"
}
},
"email":{
"type":"string",
"required":true,
"length":50,
"mysql":
{
"columnName":"email",
"dataType":"VARCHAR",
"dataLength":50,
"nullable":"N"
}
},
"password":{
"type":"string",
"required":true,
"length":30,
"mysql":
{
"columnName":"password",
"dataType":"VARCHAR",
"dataLength":30,
"nullable":"N"
}
},
"dd":{
"type":"number",
"required":true,
"length":2,
"mysql":
{
"columnName":"dd",
"dataType":"INT",
"dataLength":2,
"nullable":"N"
}
},
"mm":{
"type":"number",
"required":true,
"length":2,
"mysql":
{
"columnName":"mm",
"dataType":"INT",
"dataLength":2,
"nullable":"N"
}
},
"yyyy":{
"type":"number",
"required":true,
"length":4,
"mysql":
{
"columnName":"yyyy",
"dataType":"INT",
"dataLength":4,
"nullable":"N"
}
}
},
"validations": [],
"relations": {},
"acl": [],
"methods": {}
}
userauth.js
'use strict';
module.exports = function(userauth) {
};
model-config.json
{
"_meta": {
"sources": [
"loopback/common/models",
"loopback/server/models",
"../common/models",
"./models"
],
"mixins": [
"loopback/common/mixins",
"loopback/server/mixins",
"../common/mixins",
"./mixins"
]
},
"User": {
"dataSource": "db"
},
"AccessToken": {
"dataSource": "db",
"public": false
},
"ACL": {
"dataSource": "db",
"public": false
},
"RoleMapping": {
"dataSource": "db",
"public": false
},
"Role": {
"dataSource": "db",
"public": false
},
"userauth": {
"dataSource": "db",
"public": true
}
}
datasource.json
{
"db": {
"host": "localhost",
"port": 3306,
"url": "",
"database": "users",
"password": "12121212",
"name": "db",
"user": "root",
"connector": "mysql"
}
}
ERROR IN RESPONSE WHEN TRYING TO GET or POST
{
"error": {
"statusCode": 500,
"name": "Error",
"message": "ER_BAD_FIELD_ERROR: Unknown column 'model' in 'field list'",
"code": "ER_BAD_FIELD_ERROR",
"errno": 1054,
"sqlMessage": "Unknown column 'model' in 'field list'",
"sqlState": "42S22",
"index": 0,
"sql": "SELECT `model`,`property`,`accessType`,`permission`,`principalType`,`principalId`,`id` FROM `ACL` WHERE `model` IN ('userauth','*') AND `property` IN ('find','*') AND `accessType` IN ('READ','*') ORDER BY `id`",
The MySQL db is already connected.
Another point I noted is that LoopBack is creating its own db named "acl"
and not using the db name defined while creating the model.
I have a db named users, and I created the table acl with column
names exactly matching the property names in the userauth.json file.
LoopBack is actually not creating its own database named "ACL". As you have created a datasource with the required details of the database, your app is using that database.
You can use the following script to identify and create tables in the database based on the LoopBack models you've created.
Create a file "createTables.js" in the server folder, and add the following code:
var server = require('./server');
// use the datasource name from datasource.json ("db" in your case)
var ds = server.dataSources.db;
// built-in LoopBack models plus your own models
var lbTables = ['User', 'AccessToken', 'ACL', 'RoleMapping', 'Role', 'userauth'];
ds.automigrate(lbTables, function (er) {
  if (er) throw er;
  console.log('Loopback tables [' + lbTables + '] created in ', ds.adapter.name);
  ds.disconnect();
});
This will create all of the tables with columns based on the properties of the LoopBack models. You can run the script by moving into the server directory and running the node createTables.js command.
Refer to their documentation: https://loopback.io/doc/en/lb2/Creating-database-tables-for-built-in-models.html
Warning: Auto-migration will drop an existing table if its name
matches a model name. When tables with data exist, use
auto-update to avoid data loss.
This is actually used for creating tables for built-in LoopBack models, but you can add other models (using their names) to the lbTables array and the script will create the database tables for those models too.
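If your tables already hold data, a gentler variant is ds.autoupdate(), which alters existing tables instead of dropping and recreating them. A minimal sketch, reusing the same datasource and model names as above:
var server = require('./server');
var ds = server.dataSources.db;
var lbTables = ['User', 'AccessToken', 'ACL', 'RoleMapping', 'Role', 'userauth'];
// autoupdate alters the schema in place rather than dropping tables like automigrate
ds.autoupdate(lbTables, function (er) {
  if (er) throw er;
  console.log('Loopback tables updated in', ds.adapter.name);
  ds.disconnect();
});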
This is currently my config file
{
"config": {
"haltOnError": false
},
"source": {
"file": {
"path": "/home/user1/temp/real_user/user3.csv"
}
},
"extractor": {
"csv": {
"columns": ["id", "name", "token", "username", "password", "created", "updated", "enabled", "is_admin", "is_banned", "userAvatar"],
"columnsOnFirstLine": true
},
"field": {
"fieldName": "created",
"expression": "created.asDateTime()"
}
},
"transformers": [{
"vertex": {
"class": "user"
}
}],
"loader": {
"orientdb": {
"dbURL": "plocal:/home/user1/orientdb/real_user",
"dbAutoCreateProperties": true,
"dbType": "graph",
"classes": [{
"name": "user",
"extends": "V"
}],
"indexes": [{
"class": "user",
"fields": ["id:long"],
"type": "UNIQUE"
}]
}
}
}
and my csv currently looks like this
6,Olivia Ong,2jkjkl54k5jklj5k4j5k4jkkkjjkj,\N,\N,2013-11-15 16:36:33,2013-11-15 16:36:33,1,0,\N,\N
7,Matthew,32kj4h3kjh44hjk3hk43hkhhkjhasd,\N,\N,2013-11-18 17:29:13,2013-11-15 16:36:33,1,0,\N,\N
Still, when I execute the ETL, OrientDB won't recognize my datetime columns as datetime.
I tried putting the data type in the column fields ("created:datetime"), but it ended up not showing any data.
I wonder what the proper solution is for this case.
From the next version, 2.2.8, you will be able to define a different default pattern for date and datetime: see the CSV extractor documentation.
Note that when you define the columns, you need to specify the column's type:
"columns": ["id:string", "created:date", "updated:datetime"],
You can use the snapshot jar of 2.2.8 of ETL module with 2.2.7 without any problem:
https://oss.sonatype.org/content/repositories/snapshots/com/orientechnologies/orientdb-etl/2.2.8-SNAPSHOT/
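Putting that together for your file, a sketch of the csv extractor block could look like the following; only created/updated are typed here, and dateTimeFormat is the 2.2.8 option mentioned above (its exact name and default should be double-checked against the linked documentation):
"extractor": {
  "csv": {
    "columns": ["id", "name", "token", "username", "password", "created:datetime", "updated:datetime", "enabled", "is_admin", "is_banned", "userAvatar"],
    "columnsOnFirstLine": true,
    "dateTimeFormat": "yyyy-MM-dd HH:mm:ss"
  }
},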
I want to import two CSV files into an OrientDB database. The first contains the vertices, with 1 million records. The second contains the edges, with 59 million records.
I have two JSON files to import:
vertex
{
"source": { "file": { "path": "../csvs/metodo01/pesquisador.csv" } },
"extractor": { "row": {} },
"transformers": [
{ "csv": {} },
{ "vertex": { "class": "Pesquisador" } }
],
"loader": {
"orientdb": {
"dbURL": "remote:localhost/dbCemMilM01",
"dbType": "graph",
"batchCommit": 1000,
"classes": [
{"name": "Pesquisador", "extends": "V"}
], "indexes": [
{"class":"Pesquisador", "fields":["psq_id:integer"], "type":"UNIQUE" }
]
}
}
}
edge
{
"config": {
"log": "info",
"parallel": false
},
"source": {
"file": {
"path": "../csvs/metodo01/a10.csv"
}
},
"extractor": {
"row": {
}
},
"transformers": [{
"csv": {
"separator": ",",
"columnsOnFirstLine": true,
"columns": ["psq_id_from:integer",
"pub_id_to:integer",
"ordem:integer"]
}
},
{
"command": {
"command": "create edge PUBLICOU from (select from Pesquisador where psq_id = ${input.psq_id_from}) to (select from Publicacao where pub_id = ${input.pub_id_to}) set ordem = ${input.ordem} ",
"output": "edge"
}
}],
"loader": {
"orientdb": {
"dbURL": "remote:localhost/dbUmMilhaoM01",
"dbType": "graph",
"standardElementConstraints": false,
"batchCommit": 1000,
"classes": [{
"name": "PUBLICOU",
"extends": "E"
}]
}
}
}
During the process, OrientDB suggests using an index to speed it up.
How do I do that?
The command is just: create edge PUBLICOU from (select from Pesquisador where psq_id = ${input.psq_id_from}) to (select from Publicacao where pub_id = ${input.pub_id_to}) set ordem = ${input.ordem}
To speed up the create edge process you may need indexes on both properties: Pesquisador.psq_id, which you already have, and Publicacao.pub_id.
Ivan
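For example, assuming pub_id is an integer property on Publicacao and is unique (as psq_id is for Pesquisador in your vertex config), the index can be created from the console with:
CREATE PROPERTY Publicacao.pub_id INTEGER
CREATE INDEX Publicacao.pub_id UNIQUE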
You can declare indexes directly in the ETL configuration. Example taken from DBPedia importer:
"orientdb": {
"dbURL": "plocal:/temp/databases/dbpedia",
"dbUser": "importer",
"dbPassword": "IMP",
"dbAutoCreate": true,
"tx": false,
"batchCommit": 1000,
"wal" : false,
"dbType": "graph",
"classes": [
{"name":"Person", "extends": "V" },
{"name":"Customer", "extends": "Person", "clusters":8 }
],
"indexes": [
{"class":"V", "fields":["URI:string"], "type":"UNIQUE" },
{"class":"Person", "fields":["town:string"], "type":"NOTUNIQUE" ,
metadata : { "ignoreNullValues" : false }
}
]
}
For more information look at: http://orientdb.com/docs/2.2/Loader.html
To speed up the load process my suggestion is to work in plocal mode and then move the created db to a standalone OrientDB server.
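A rough sketch of that workflow; the paths are assumptions for a typical installation:
# run the ETL with a plocal dbURL, e.g. "dbURL": "plocal:/home/user1/databases/dbUmMilhaoM01"
# then, with the target server stopped, copy the database directory into the server's databases folder
cp -r /home/user1/databases/dbUmMilhaoM01 $ORIENTDB_HOME/databases/
# restart the server and connect with remote:localhost/dbUmMilhaoM01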