I have a simple MongoDB database that I'm dumping with mongodump.
Dump command:
mongodump --db user_profiles --out /data/dumps/user-profiles
Here is the content of the user_profiles database. It has one collection (user_data) consisting of the following:
{ "_id" : ObjectId("555a882a722f2a009fc136e4"), "username" : "thor", "passwd" : "*1D28C7B35C0CD618178988146861D37C97883D37", "email" : "thor#avengers.com", "phone" : "4023331000" }
{ "_id" : ObjectId("555a882a722f2a009fc136e5"), "username" : "ironman", "passwd" : "*626AC8265C7D53693CB7478376CE1B4825DFF286", "email" : "tony#avengers.com", "phone" : "4023331001" }
{ "_id" : ObjectId("555a882a722f2a009fc136e6"), "username" : "hulk", "passwd" : "*CB375EA58EE918755D4EC717738DCA3494A3E668", "email" : "hulk#avengers.com", "phone" : "4023331002" }
{ "_id" : ObjectId("555a882a722f2a009fc136e7"), "username" : "captain_america", "passwd" : "*B43FA5F9280F393E7A8C57D20648E8E4DFE99BA0", "email" : "steve#avengers.com", "phone" : "4023331003" }
{ "_id" : ObjectId("555a882a722f2a009fc136e8"), "username" : "daredevil", "passwd" : "*B91567A0A3D304343624C30B306A4B893F4E4996", "email" : "daredevil#avengers.com", "phone" : "4023331004" }
After copying the dump to an NFS share and then trying to load it into a test server using mongorestore
mongorestore --host db-test --port 27017 /remote/dumps/user-profiles
I'm getting the following error:
Mon May 18 20:19:23.918 going into namespace [user_profiles.user_data]
assertion: 16619 code FailedToParse: FailedToParse: Bad characters in value: offset:30
How do I resolve this FailedToParse error?
To do further testing, I created a test_db with a test_collection that only had one simple value 'x':1, and even that didn't work. So I knew something else had to be going on.
Versions of your tools matter
The version of mongodump that was being used was 3.0.3. The version on another virtual machine that was using mongorestore was 2.4.x. This was the cause of the errors. Once I got mongodb-org-tools updated on my virtual machine (see the official guide), I was able to get up and running as expected.
Hopefully this helps someone in the future. Check your versions!
mongodump --version
mongorestore --version
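As a quick sanity check before restoring, it can help to compare the two versions up front. A minimal sketch, assuming you can ssh to the test host (the host name and dump path are the ones from above; adjust as needed):

# On the machine that produced the dump
mongodump --version

# On the machine that will restore it
ssh db-test 'mongorestore --version'

# Only run the restore once both report the same release series (e.g. 3.0.x)
mongorestore --host db-test --port 27017 /remote/dumps/user-profiles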
Related
Any idea what's wrong with the following curl statement? I am using this to upload files to a Neptune database from an EC2 instance.
curl -X POST \
  -H 'Content-Type: application/json' \
  https://*my neptune endpoint*:8182/loader -d '
  {
    "source" : "s3://<file path>/<file name>.nq",
    "format" : "nquads",
    "iamRoleArn" : "arn:aws:iam::##########:role/NeptuneLoadFromS3",
    "region" : "us-east-1",
    "failOnError" : "FALSE",
    "parallelism" : "MEDIUM",
    "updateSingleCardinalityProperties" : "FALSE",
    "queueRequest" : "TRUE"
  }'
I have used this command template multiple times before without issue. The only things that I have changed here are the Neptune endpoint and the file location on S3. When I run it now, I get the following error:
{"detailedMessage":"Json parse error: Unexpected character ('' (code 8203 / 0x200b)): was expecting double-quote to
start field name\n at [Source: (String)\"{\n \"source\" : \"s3://<file path>/<file name>.nq\",\n \"format\"
: \"nquads\",\n \"iamRoleArn\" : \"arn:aws:iam::#########:role/NeptuneLoadFromS3\",\n \"region\"
: \"us-east-1\",\n \"failOnError\" : \"FALSE\",\n \"parallelism\" : \"MEDIUM\",\n
\"updateSingleCardinalityProperties\" : \"FALSE\",\n \"queueRequest\" : \"TRUE\"\n }\"; line: 1, column: 3]",
"requestId":"4ebb82c9-107d-8578-cf84-8056817e504e","code":"BadRequestException"}
Nothing that I change in the statement seems to have an effect on the outcome. Is there something really obvious that I am missing here?
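Character code 8203 / 0x200B in that error is a zero-width space, an invisible character that commonly sneaks in when a command is copied from a web page or a chat window. A hedged way to confirm and remove it, assuming the JSON body is saved in a hypothetical file loader-request.json:

# Show any lines containing an invisible zero-width space (U+200B)
grep -n $'\u200b' loader-request.json

# Strip the character and resend the cleaned payload
perl -CSD -pe 's/\x{200B}//g' loader-request.json > loader-request.clean.json
curl -X POST \
  -H 'Content-Type: application/json' \
  'https://*my neptune endpoint*:8182/loader' -d @loader-request.clean.json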
Tomcat does not want to start cas-management 6.0.1.
I do not understand why; I have put the JSON file in place.
I am not using HTTPS.
What do I need to check?
In cas-management.properties
cas.server.name=http://192.168.0.112:8443
cas.server.prefix=${cas.server.name}/cas
mgmt.serverName=http://192.168.0.112:8443
mgmt.serverName=http://192.168.0.112
server.context-path=/cas-management
server.port=8443
mgmt.adminRoles[0]=ROLE_ADMIN
logging.config=file:/etc/cas/config/log4j2-management.xml
cas.serviceRegistry.json.location=file:/etc/cas/services
cas.authn.attributeRepository.stub.attributes.cn=cn
cas.authn.attributeRepository.stub.attributes.displayName=displayName
cas.authn.attributeRepository.stub.attributes.givenName=givenName
cas.authn.attributeRepository.stub.attributes.mail=mail
cas.authn.attributeRepository.stub.attributes.sn=sn
cas.authn.attributeRepository.stub.attributes.uid=uid
In the file /etc/cas/services/http_cas_management-1560930209.json
{
  "@class" : "org.apereo.cas.services.RegexRegisteredService",
  "serviceId" : "^http://192.168.0.112/cas-management/.*",
  "name" : "CAS Services Management",
  "id" : 1560930209,
  "description" : "CAS services management webapp",
  "evaluationOrder" : 5500
  "allowedAttributes":["cn","mail"]
}
Thanks, best regards.
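One thing worth checking before digging into Tomcat itself (a suggestion on my part, not something from the post above): make sure the service definition actually parses as JSON, since a syntax error in a file under /etc/cas/services is one possible cause of startup trouble. Either of these commands will point at the offending line:

python3 -m json.tool /etc/cas/services/http_cas_management-1560930209.json
# or, if jq is installed:
jq . /etc/cas/services/http_cas_management-1560930209.json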
I am attempting to import a MySQL table into Elasticsearch. It is a table containing 10 different columns with an 8-digit VARCHAR set as the primary key. The MySQL database is located on a remote host.
To transfer data from MySQL into Elasticsearch I've decided to use Logstash and the JDBC MySQL driver.
I am assuming that Logstash will create the index for me if it isn't there.
Here's my logstash.conf script:
input {
  jdbc {
    jdbc_driver_library => "/home/user/logstash/mysql-connector-java-5.1.17-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://[remotehostipnumber]/databasename"
    jdbc_validate_connection => true
    jdbc_user => "username"
    jdbc_password => "password"
    schedule => "* * * * *"
    statement => "select * from table"
  }
}
output {
  elasticsearch {
    index => "tables"
    document_type => "table"
    document_id => "%{table_id}"
    hosts => "localhost:9200"
  }
  stdout { codec => json_lines }
}
When running the Logstash config test, it outputs a 'Configuration OK' message:
sudo /opt/logstash/bin/logstash --configtest -f /home/user/logstash/logstash.conf
Also, when executing with this logstash.conf, Logstash outputs:
Settings: Default filter workers: 1
Logstash startup completed
But when I go to check whether the index has been created and data has also been added:
curl -XGET 'localhost:9200/tables/table/_search?pretty=true'
I get:
{
"error" : {
"root_cause" : [ {
"type" : "index_not_found_exception",
"reason" : "no such index",
"resource.type" : "index_or_alias",
"resource.id" : "tables",
"index" : "table"
} ],
"type" : "index_not_found_exception",
"reason" : "no such index",
"resource.type" : "index_or_alias",
"resource.id" : "tables",
"index" : "tables"
},
"status" : 404
}
What could be the potential reasons behind the data not being indexed?
PS: I am keeping the Elasticsearch server running in a separate terminal window to ensure Logstash can connect and interact with it.
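One quick thing to check against the running node is whether Logstash created any index at all, and under what name:

# List all indices on the node, with document counts
curl 'localhost:9200/_cat/indices?v'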
For those who end up here looking for the answer to a similar problem:
My database had 4 million rows, and it must have been too much for Logstash/Elasticsearch/the JDBC driver to handle in one command.
After I divided the initial transfer into 4 separate chunks of work, the script ran and added the desired table to Elasticsearch.
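The answer above does not show how the split was done; one way to get a similar effect without splitting by hand, assuming a jdbc input plugin recent enough to support paging, is to let the plugin page through the table (a sketch only, the page size is arbitrary):

input {
  jdbc {
    # ... same connection settings as in the config above ...
    statement => "select * from table"
    jdbc_paging_enabled => true
    jdbc_page_size => 50000   # fetch the rows in many smaller queries
  }
}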
Use the following code to export data from a MySQL table and create an index in Elasticsearch (this uses the elasticsearch-jdbc importer rather than Logstash):
echo '{
"type":"jdbc",
"jdbc":{
"url":"jdbc:mysql://localhost:3306/your_database_name",
"user":"your_database_username",
"password":"your_database_password",
"useSSL":"false",
"sql":"SELECT * FROM table1",
"index":"Index_name",
"type":"Index_type",
"poll" : "6s",
"autocommit":"true",
"metrics": {
"enabled" : true
},
"elasticsearch" : {
"cluster" : "clustername",
"host" : "localhost",
"port" : 9300
}
}
}' | java -cp "/etc/elasticsearch/elasticsearch-jdbc-2.3.4.0/lib/*" "-Dlog4j.configurationFile=file:////etc/elasticsearch/elasticsearch-jdbc-2.3.4.0/bin/log4j2.xml" "org.xbib.tools.Runner" "org.xbib.tools.JDBCImporter"
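After the importer finishes, a quick way to verify that documents actually arrived (index_name is a placeholder; note that Elasticsearch index names must be lowercase):

curl 'localhost:9200/index_name/_count?pretty'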
I am trying to connect MongoDB with Hadoop. I have Hadoop-1.2.1 installed on my Ubuntu 14.04 machine. I installed MongoDB-3.0.4 and also downloaded and added the mongo-hadoop-hive-1.3.0.jar and mongo-java-driver-2.13.2.jar jars in the Hive session. I have downloaded mongo-connector.sh (found on this site) and included it under Hadoop_Home/lib.
I have set input and output sources like this :
hive> set MONGO_INPUT=mongodb://[user:password@]<MongoDB Instance IP>:27017/DBname.collectionName;
hive> set MONGO_OUTPUT=mongodb://[user:password@]<MongoDB Instance IP>:27017/DBname.collectionName;
hive> add JAR brickhouse-0.7.0.jar;
hive> create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';
My collection in MongoDB is this:
> db.shows.find()
{ "_id" : ObjectId("559eb22fa7999b1a5f50e4e6"), "title" : "Arrested Development", "airdate" : "November 2, 2003", "network" : "FOX" }
{ "_id" : ObjectId("559eb238a7999b1a5f50e4e7"), "title" : "Stella", "airdate" : "June 28, 2005", "network" : "Comedy Central" }
{ "_id" : ObjectId("559eb23ca7999b1a5f50e4e8"), "title" : "Modern Family", "airdate" : "September 23, 2009", "network" : "ABC" }
>
Now I am trying to create a Hive table:
CREATE EXTERNAL TABLE mongoTest(title STRING,network STRING)
> STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
> WITH SERDEPROPERTIES('mongo.columns.mapping'='{"title":"name",”airdate”:”date”,”network”:”name”}')
> TBLPROPERTIES('mongo.uri'='${hiveconf:MONGO_INPUT}');
When I run this command, it says
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. com/mongodb/util/JSON
Then I added the hive-json-serde.jar and hive-serdes-1.0-SNAPSHOT.jar jars and tried to create the table again, but the error remains the same. How can I rectify this error?
I actually added these mongo-hadoop-core-1.3.0.jar, mongo-hadoop-hive-1.3.0.jar and mongo-java-driver-2.13.2.jar jars to the Hadoop_Home/lib folder. Then I was able to get data from MongoDB to Hive without any errors.
There are smart quotes (”) in the mapping, which the parser is seeing:
”airdate”:”date”,”network”:”name”
They should be straight quotes:
"airdate":"date","network":"name"
I am using Elasticsearch version 1.2.0 and JDBC river version 1.2.0.1.
The following is my JDBC river command.
curl -XPUT 'localhost:9200/_river/tbl_messages/_meta' -d '{
"type" : "jdbc",
"jdbc" : {
"strategy" : "simple",
"url" : "jdbc:mysql://localhost:3306/messageDB",
"user" : "username",
"password" : "password",
"sql" : "select messageAlias.id as _id,messageAlias.subject as subject from tbl_messages messageAlias",
"index" : "MessageDb",
"type" : "tbl_messages",
"maxbulkactions":1000,
"maxconcurrentbulkactions" : 4,
"autocommit" : true,
"schedule" : "0 0-59 0-23 ? * *"
}
}'
The subject column's index metadata:
subject: {
type: string
}
This table has 2 million records and the subject field contains arbitrary strings. Some sample values are "You're invited ", "{New York:45} We rock!!", "{Invitation:27}", and so on.
My problem is that when the JDBC river encounters one such record with {anything inside braces}, it stalls and throws a parsing exception. It never moves on to index the next records.
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [subject]
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:418)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:537)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:479)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:515)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:462)
at org.elasticsearch.index.shard.service.InternalIndexShard.prepareIndex(InternalIndexShard.java:394)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:413)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:155)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:534)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:433)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.ElasticsearchIllegalArgumentException: unknown property [Inivitation]
at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateFieldForString(StringFieldMapper.java:332)
at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateField(StringFieldMapper.java:278)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:408)
... 12 more
Deleting this record in the database, clearing the data inside ES_HOME/data, and recreating the river seems to be the only way to proceed, until it encounters another record formatted like the ones above.
How do I make it continue indexing irrespective of exceptions when parsing a few records?
It is related to Elasticsearch and not to the river.
https://github.com/jprante/elasticsearch-river-jdbc/issues/258
https://github.com/elasticsearch/elasticsearch/issues/2898
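One way to confirm that the failure is in Elasticsearch rather than in the river (a diagnostic sketch; the index and type names here are placeholders, not the ones from the river config) is to index a single document containing one of the offending subjects directly and see how the node reacts:

curl -XPUT 'localhost:9200/messagedb_test/tbl_messages/1' -d '
{
  "subject" : "{Invitation:27}"
}'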