Lost results in ElasticSearch index with autoupdate - mysql

I have a problem with data being lost when using a JDBC driver to update Elasticsearch from a MySQL database. My river is below:
curl -XPUT 'http://localhost:9200/_river/river_index_type/_meta' -d '
{
"type": "jdbc",
"jdbc": {
"strategy": "simple",
"driver": "com.mysql.jdbc.Driver",
"url": "jdbc:mysql://localhost/tt",
"user": "user",
"password": "password",
"sql": "SELECT p.product_id AS _id, ... FROM product p ... WHERE ...",
"poll": "5m",
"autocommit": true
},
"index": {
"type": "type",
"index": "index"
}
}'
Initially everything works fine, but later, instead of 1200 results in my index I only have 800-900, and the count changes every five minutes. I don't understand what the problem is. Any help would be appreciated.
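(For reference, the index size between polls can be watched with a count query; index and type here are the placeholder names from the river definition above.)
curl -XGET 'http://localhost:9200/index/type/_count?pretty'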

Related

Import database schema in MongoDB

I created a JSON schema for MongoDB. It looks like this:
{
"schemaType": "Collection",
"name": "Manufacturier",
"defaultValue": "",
"description": "...",
"fields": [
{
"schemaType": "Field",
"name": "_id",
"type": "ObjectId",
"required": true,
"unique": true,
"defaultValue": "",
"description": "...",
"index": 0,
"customProps": []
}
]
}
...
Can I import this schema into MongoDB? And if it's possible, how?
One of the biggest advantages of MongoDB is that you can insert documents without defining the schema in advance. Of course, there is the option to add schema validation rules, as commented by @Alex Blex, but doing so limits the main reason you would use a NoSQL database like MongoDB instead of other possible data storage systems.
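As noted, MongoDB has no importer for a custom schema document like the one above, but if you do want validation, a rule can be attached when the collection is created. A minimal sketch using the $jsonSchema validator via mongosh (the database name mydb is a placeholder, and only the _id field from the schema above is mapped):
mongosh mydb --eval '
db.createCollection("Manufacturier", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["_id"],
      properties: {
        _id: { bsonType: "objectId", description: "unique identifier" }
      }
    }
  }
})'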

How to select multiple parameters from JSON output that meet a condition, and further select individual values

I have JSON output from which I need to get the id value and the ipv4_address value where ipv4_address exists (it shouldn't be null). I have to use this id value in another request along with a randomly generated string.
Here is the breakdown of the requirement:
STEP 1:
In the following example, for ipv4_address 1.1.1.1 and ipv4_address 1.1.1.2, I need to get the id values, which are "4e-0365-4e29-95ca-329165eecf8a" and "c9061b6674a8546cea", along with the IP address.
My output should look something like this:
1.1.1.1 4e-0365-4e29-95ca-329165eecf8a
1.1.1.2 c9061b6674a8546cea
I was trying to use jq, but with this I'm not able to get both values:
ID="$(echo "$test" | jq -r '.USER[] | select(.ipv4_address) | .ipv4_address')"
ID1="$(echo "$test" | jq -r '.USER[] | select(.ipv4_address) | .id')"
Sample output displayed by the above two commands:
ID value is : 1.1.1.1 1.1.1.2
ID1 value is : 4e-0365-4e29-95ca-329165eecf8a c9061b6674a8546cea
STEP 2: Profile creation: I need to use each $ID1 value in another request along with a randomly generated string. One random string is generated per $ID1 value (so here I will generate 2 random strings).
And thus 2 profiles are created.
Question: How can I get each ID from the $ID1 variable? I tried something like ID1[0], but that seems to be wrong.
STEP 3:
I will use each ID and random string in another request. Once it's done, or if that step fails, I need to write the output to a file.
My requirement for the final output is:
1.1.1.1 4e-0365-4e29-95ca-329165eecf8a <randomvalue-1> <profile-1> DONE
1.1.1.2 c9061b6674a8546cea <randomvalue-2> <profile-2> FAILED
where the random value is generated randomly and used against the corresponding ID.
JSON output which needs to be parsed:
{
"errorcode": 0,
"message": "Done",
"operation": "get",
"resourceType": "USER",
"username": "root",
"tenant_name": "Owner",
"tenant_id": "05db6674ad458546cd2",
"resourceName": "",
"USER": [
{
"is_default": "false",
"session_timeout": "0",
"permission": "root",
"name": "ee",
"session_timeout_unit": "",
"tenant_id": "55bcb6674ad45854",
"id": "4e-0365-4e29-95ca-329165eecf8a",
"ipv4_address": "1.1.1.1",
"state": "Up",
"tenant_name": "Owner",
"encrypted": "false",
"groups": [
"owner"
],
"root_user": ""
},
{
"is_default": "false",
"session_timeout": "0",
"permission": "read",
"name": "test",
"session_timeout_unit": "",
"tenant_id": "bc906674ad458546cd2",
"id": "12cd0-fb7f-4abf-b060-48e98b794b06",
"tenant_name": "Owner",
"encrypted": "false",
"groups": [
"read"
],
"root_user": ""
},
{
"is_default": "true",
"session_timeout": "0",
"permission": "root",
"name": "root",
"session_timeout_unit": "",
"tenant_id": "c905db6d458546cd2",
"id": "c9061b6674a8546cea",
"ipv4_address": "1.1.1.2",
"state": "Not Reachable",
"tenant_name": "Owner",
"encrypted": "false",
"groups": [
"owner"
],
"root_user": ""
},
{
"is_default": "false",
"session_timeout": "0",
"permission": "readonly",
"name": "a",
"session_timeout_unit": "",
"tenant_id": "c905674ad458546cd2",
"id": "bc8a-4fd6-bc09-8c39c131b54e",
"tenant_name": "Owner",
"encrypted": "false",
"groups": [
"read"
],
"root_user": ""
}
]
}
The logic of marking entries DONE and FAILED isn't quite clear to me, but to answer your first question about selecting multiple fields, you can do something like this:
$ cat input.js | jq -r '.USER[] | select(.ipv4_address) | "\(.ipv4_address) \(.id)"' > result.js
This will write the result to a file named result.js. You can apply your custom logic of marking DONE and FAILED to this file.
In the above command, select(.ipv4_address) drops all the records for which the ipv4_address value is null or not present.
If you want to select the records that have ipv4_address as null, the select statement becomes:
select(.ipv4_address == null)
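For STEP 2 and STEP 3, a minimal bash sketch along these lines could work (input.json holds the JSON shown in the question, and create_profile is a hypothetical placeholder for your actual profile-creation request):
#!/usr/bin/env bash
# Read "ip id" pairs, one array element per user that has an ipv4_address.
mapfile -t PAIRS < <(jq -r '.USER[] | select(.ipv4_address) | "\(.ipv4_address) \(.id)"' input.json)

for pair in "${PAIRS[@]}"; do
  ip=${pair%% *}                # first field: IPv4 address
  id=${pair#* }                 # second field: user id
  rand=$(openssl rand -hex 8)   # one random string per id
  profile="profile-$rand"       # hypothetical profile name
  # create_profile stands in for the real profile-creation request; replace it with your own call.
  if create_profile "$id" "$rand" "$profile"; then
    status=DONE
  else
    status=FAILED
  fi
  echo "$ip $id $rand $profile $status" >> result.txt
done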

Regex on Zabbix API?

After some hours spent reading the Zabbix API documentation, I haven't found a way to search by key with more than one possible value.
So, with this code:
{
"jsonrpc": "2.0",
"method": "item.get",
"params": {
"output": "extend",
"hostids": " 10355",
"search": {
"key_": "[in_*|out_*]"
},
"sortfield": "name"
},
"auth": "15729708df1f5936f6ea840ae1b41cb6",
"id": 0
}
I'm trying to get every item whose key is in_<anything> OR out_<anything>, so the output would be the combination of all the items related to the interfaces. Instead, I get this:
{"jsonrpc":"2.0","result":[],"id":0}
I know that there's the possibility to use filter instead of search, but from what I read it is used when you want an exact match, which is not the case here.
Zabbix API (and filtering in other places) does not support regexp. In some versions you could pass wildcards, but that won't solve your current issue. You will have to do two separate API queries.
To answer the question in the comment here, search can be negated with the excludeSearch parameter - see the API documentation for more detail.
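A sketch of the two-query approach from the shell (zabbix.example.com is a placeholder for your frontend URL, and the auth token is the one from the question; each pass sends one item.get request, first for keys containing in_ and then for out_):
for prefix in in_ out_; do
  curl -s -H 'Content-Type: application/json-rpc' \
    -d '{"jsonrpc": "2.0", "method": "item.get", "params": {"output": "extend", "hostids": "10355", "search": {"key_": "'"$prefix"'"}, "sortfield": "name"}, "auth": "15729708df1f5936f6ea840ae1b41cb6", "id": 0}' \
    http://zabbix.example.com/zabbix/api_jsonrpc.php
done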
For this to work, you need to add the searchWildcardsEnabled key:
{
"jsonrpc": "2.0",
"method": "item.get",
"params": {
"output": "extend",
"hostids": " 10355",
"searchWildcardsEnabled": "true",
"search": {
"key_": "[in_*|out_*]"
},
"sortfield": "name"
},
"auth": "15729708df1f5936f6ea840ae1b41cb6",
"id": 0
}
I think you should consider using the searchByAny parameter, which lets you find items that match any of the search criteria. Here is the correct JSON that you should try:
{
"jsonrpc": "2.0",
"method": "item.get",
"params": {
"output": "extend",
"hostids": " 10355",
"searchWildcardsEnabled": "true",
"search": {
"key_": [
"in_*",
"out_*"
]
},
"sortfield": "name"
},
"auth": "15729708df1f5936f6ea840ae1b41cb6",
"id": 1
}
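If it helps, a request like the one above can be posted with curl (the endpoint URL is a placeholder for your Zabbix frontend, and request.json would contain the JSON shown above):
curl -s -H 'Content-Type: application/json-rpc' -d @request.json http://zabbix.example.com/zabbix/api_jsonrpc.php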

How to index multidimensional arrays in couchdb

I have a multidimensional array that I want to index with CouchDB (really using Cloudant). I have users who have a list of the teams they belong to. I want to search to find every member of a given team, i.e. get all the User objects that have a team object with id 79d25d41d991890350af672e0b76faed. I tried to make a JSON index on "Teams.id", but it didn't work because Teams isn't a flat array but a multidimensional array.
User
{
"_id": "683be6c086381d3edc8905dc9e948da8",
"_rev": "238-963e54ab838935f82f54e834f501dd99",
"type": "Feature",
"Kind": "Profile",
"Email": "gc#gmail.com",
"FirstName": "George",
"LastName": "Castanza",
"Teams": [
{
"id": "79d25d41d991890350af672e0b76faed",
"name": "First Team",
"level": "123"
},
{
"id": "e500c1bf691b9cfc99f05634da80b6d1",
"name": "Second Team Name",
"level": ""
},
{
"id": "4645e8a4958421f7d843d9b34c4cd9fe",
"name": "Third Team Name",
"level": "123"
}
],
"LastTeam": "79d25d41d991890350af672e0b76faed"
}
This is a lot like my response at Cloudant Selector Query but here's the deal, applied to your question:
The easiest way to run this query is using "Cloudant Query" (or "Mango", as it's called in the forthcoming CouchDB 2.0 release) -- and not the traditional MapReduce view indexing system in CouchDB. (This blog covers the differences: https://cloudant.com/blog/mango-json-vs-text-indexes/ and this one is an overview: https://developer.ibm.com/clouddataservices/2015/11/24/cloudant-query-json-index-arrays/).
Here's what your CQ index should look like:
{
"index": {
"fields": [
{"name": "Teams.[].id", "type": "string"}
]
},
"type": "text"
}
And what the subsequent query looks like:
{
"selector": {
"Teams": {"$elemMatch": {"id": "79d25d41d991890350af672e0b76faed"}}
},
"fields": [
"_id",
"FirstName",
"LastName"
]
}
You can try it yourself in the "Query" section of the Cloudant dashboard or via curl with something like this:
curl -H "Content-Type: application/json" -X POST -d '{"selector":{"Teams":{"$elemMatch":{"id":"79d25d41d991890350af672e0b76faed"}}},"fields":["_id","FirstName","LastName"]}' https://broberg.cloudant.com/teams_test/_find
That database is world-readable, so you can see the sample documents I created in there here: https://broberg.cloudant.com/teams_test/_all_docs?include_docs=true
Dig the Seinfeld theme :D
You simply need to loop through the Teams array and emit a view entry for each of the teams.
function (doc) {
if(doc.Kind === "Profile"){
for (var i=0; i<doc.Teams.length; i++) {
var team = doc.Teams[i];
emit(team.id, [doc.FirstName, doc.LastName]);
}
}
}
You can then query for all profiles with a specific team id by keying on the team id like this
.../view?key="79d25d41d991890350af672e0b76faed"
giving
{"total_rows":7,"offset":2,"rows":[
{"id":"0d15041f43b43ae07e8faa737f00032c","key":"79d25d41d991890350af672e0b76faed","value":["Adam","Alpha"]},
{"id":"68779729be3610fd8b52b22574000ae8","key":"79d25d41d991890350af672e0b76faed","value":["Bob","Bravo"]},
{"id":"9f97f1565f03aebae9ca73e207001ee1","key":"79d25d41d991890350af672e0b76faed","value":["Chuck","Charlie"]}
]}
or you can include the actual profiles in the result by adding &include_docs=true to the query.
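For example, if the map function above were saved in a design document (the database name mydb and the design document/view names teams and by_team are hypothetical), the query could be issued with curl against a local CouchDB, or the equivalent Cloudant URL, like this:
curl 'http://localhost:5984/mydb/_design/teams/_view/by_team?key="79d25d41d991890350af672e0b76faed"&include_docs=true'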

UnavailableShardsException

I want to index and search a MySQL database using Elasticsearch, and I followed this tutorial:
https://github.com/jprante/elasticsearch-river-jdbc/wiki/Quickstart
First I downloaded Elasticsearch and installed river-jdbc in its plugins folder, then added the MySQL JDBC driver inside ES_HOME/plugins/river-jdbc/. Then I started Elasticsearch, opened another terminal window, and created a new JDBC river named my_jdbc_river with this curl command:
curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
"type" : "jdbc",
"jdbc" : {
"driver" : "com.mysql.jdbc.Driver",
"url" : "jdbc:mysql://localhost:3306/bablool",
"user" : "root",
"password" : "babloo",
"sql" : "select * from details"
},
"index" : {
"index" : "jdbc",
"type" : "jdbc"
}
}'
I'm getting the UnavailableShardsException mentioned in the title.
Then, when I run this command: curl -XGET 'localhost:9200/jdbc/jdbc/_search?pretty&q=*'
I'm getting the following error:
"error": "IndexMissingException[[jdbc] missing]", "status" : 404
And when I give this in my browser:
http://localhost:9201/_search?q=*
I'm getting this:
{
"took": 51,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.0,
"hits": [
{
"_index": "_river",
"_type": "my_jdbc_river",
"_id": "_meta",
"_score": 1.0,
"_source": {
"type": "jdbc",
"jdbc": {
"driver": "com.mysql.jdbc.Driver",
"url": "jdbc:mysql://localhost:3306/bablool",
"user": "root",
"password": "babloo",
"sql": "select * from details"
},
"index": {
"index": "jdbc",
"type": "jdbc"
}
}
}
]
}
}
Is the MySQL DB indexed? How can I search my DB?
I encountered a similar problem, and this is how I managed to solve it:
First, I checked all the indices via http://localhost:9200/_cat/indices?v
Then I deleted all indices whose health status was red (there was just one, the _river index). This is how you delete an index: curl -XDELETE 'localhost:9200/_river/'
Finally, redo step 7 in the link you mentioned: https://github.com/jprante/elasticsearch-river-jdbc/wiki/Quickstart
Hope it solves your problem as well :) Good luck!!
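As a final sanity check (not part of the original steps), once the river has run again you can confirm that documents landed in the jdbc index:
curl -XGET 'localhost:9200/jdbc/_count?pretty'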