I have a JSON file that I'm trying to import into MongoDB.
Compass says it is invalid JSON, so I used the mongoimport command; it did import it, but everything ended up in one document.
How can I import the following JSON format as separate documents, using the id as the main _id instead of the auto-generated ObjectId?
{
"id": {
"value1": "value"
},
"id": {
"value1": "value",
"value2": "value"
}
}
("id" is a string with the value of the actual id so the json doesnt actually have "id" there)
I guess one way to solve this is to fully reformat my json file to the correct format but I have a lot of records in the json and would like to keep it this way.
Edit:
I have reformatted my JSON files and everything works.
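For anyone hitting the same problem, here is a minimal sketch of that reformatting, assuming jq is available and the file is a single top-level object keyed by id (file names are hypothetical):
# Turn { "<id>": { ... }, ... } into one JSON document per line, with the key as _id
jq -c 'to_entries[] | .value + {_id: .key}' input.json > output.ndjson
# mongoimport accepts newline-delimited JSON by default
mongoimport --db mydb --collection mycoll --file output.ndjson
Setting _id directly in each document makes MongoDB keep the original id instead of generating an ObjectId.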
I have inherited a project where an Avro file is being consumed by Snowflake. The schema of the Avro file is as follows:
{
"name": "TableName",
"namespace": "sqlserver",
"type": "record",
"fields": [
{
"name": "hAccount",
"type": "string"
},
{
"name": "hTableName",
"type": "string"
},
{
"name": "hRawJSON",
"type": "string"
}
]
}
The hRawJSON field is a blob of JSON itself. The previous dev declared it as type string, and this is where I believe the problem lies.
The application takes a JSON object (the JSON is variable, so I never know what it contains) and populates the hRawJSON field in the Avro record. But it contains the escape characters for the double quotes in the string:
hAccount:"H11122"
hTableName:"Departments"
hRawJSON:"{\"DepartmentID\":1,\"ModelID\":0,\"Description\":\"P Medicines\",\"Margin\":\"3.300000000000000e+001\",\"UCSVATRateID\":0,\"References\":719,\"HeadOfficeID\":1,\"DividendID\":0}"
As a result the JSON blob is staged into Snowflake as a VARIANT field but still retains the escape characters:
(screenshot: the staged VARIANT field in Snowflake, still showing the escape characters)
This means when querying the data in the JSON I constantly have to use this:
PARSE_JSON(RAW_FILE:hRawJSON):DepartmentID
I can't help feeling that the string field type in the Avro file is causing the issue and that a different type should be used. I've tried Record, but without fields it's unusable; Doc didn't work either.
The other alternative is that this behavior is correct and when moving the hRawJSON from staging into "proper" tables I should use something like:
INSERT INTO DATA.PUBLIC.DEPARTMENTS
SELECT
RAW_FILE:hAccount::VARCHAR(4) as Account,
PARSE_JSON(RAW_FILE:hRawJSON) as JsonRaw
FROM DATA.STAGING.AVRO_RAW WHERE RAW_FILE:hTableName::STRING = 'Departments';
So if this is the correct approach and I'm overthinking this, I'd appreciate guidance.
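One workaround I'm considering, to avoid repeating PARSE_JSON in every query, is a view over the staged data (a sketch, untested; the view name is made up, the table and field names are from above):
CREATE OR REPLACE VIEW DATA.STAGING.AVRO_PARSED AS
SELECT
    RAW_FILE:hAccount::STRING     AS ACCOUNT,
    RAW_FILE:hTableName::STRING   AS TABLE_NAME,
    PARSE_JSON(RAW_FILE:hRawJSON) AS JSON_RAW
FROM DATA.STAGING.AVRO_RAW;
-- Nested values can then be reached directly:
-- SELECT JSON_RAW:DepartmentID FROM DATA.STAGING.AVRO_PARSED WHERE TABLE_NAME = 'Departments';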
I have a POSTAL_CODE field in my JSON file. If I try importing that data into Solr using solr/post, the field type is set as 'plongs', which is not suitable for data like "108-0023". Because of that, the data import throws an error. Is there any workaround for this kind of issue?
Edit:
Sample data you can use to reproduce it:
[
  {
    "id": "1",
    "POSTAL_CODE": "1982"
  },
  {
    "id": "2",
    "POSTAL_CODE": "1947"
  },
  {
    "id": "3",
    "POSTAL_CODE": "19473"
  },
  {
    "id": "4",
    "POSTAL_CODE": "19471"
  },
  {
    "id": "5",
    "POSTAL_CODE": "1947-123"
  }
]
In the above sample, I don't understand why 'id' is not considered 'plongs' or 'pints' while only 'POSTAL_CODE' has that issue. If the first element has POSTAL_CODE as, say, "1947-145", then the field type is taken as 'text_general'. Generally, if the value has double quotes (i.e., "Data": "123"), shouldn't it be considered a string value?
Remove the collection, create it anew, and before you index anything, define a field POSTAL_CODE in your schema with type string. Solr will then index any incoming data in this field without guessing and use the string type instead, which means it is indexed as-is.
Copied and adapted from https://lucene.apache.org/solr/guide/7_0/schema-api.html, but untested:
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{
"name":"POSTAL_CODE",
"type":"string",
"stored":true }
}' http://localhost:8983/solr/yourcollectionhere/schema
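If you need to start over first, dropping and recreating the collection from the command line might look like this (collection name assumed to be yourcollectionhere):
bin/solr delete -c yourcollectionhere
bin/solr create -c yourcollectionhere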
I tried to import the data by creating a raw JSON document with the field POSTAL_CODE. Below is my JSON; my Solr version is 7.2.1.
{"array": [1,2,3],"boolean": true,"color": "#82b92c","null": null,"number": 123,"POSTAL_CODE": "108-0023"}
It is indexed as a text field in Solr (screenshot omitted). The command I triggered to index the data is as follows:
bin/post -c gettingstarted test.json
Could you please provide the sample data and the Solr version on which you are facing this issue?
My goal is to retrieve JSON-type fields from a Solr index and also perform search queries on such fields.
I have the following documents in a Solr index, using the auto-generated schema from Solr's schemaless feature.
POST http://localhost:8983/solr/test1/update?commitWithin=1000
[
{"id" : "1", "type_s":"book", "title_t" : "The Way of Kings", "author_s" : "Brandon Sanderson",
"miscinfo": {"provider": "orielly", "site": "US"}
},
{"id" : "2", "type_s":"book", "title_t" : "The Game of Thrones", "author_s" : "James Sanderson",
"miscinfo": {"provider": "pacman", "site": "US"}
}
]
I see the JSON objects are stored as strings in the schema field type, as seen in the output of the following:
GET http://localhost:8983/solr/test1/schema/fields
{
  "name": "miscinfo",
  "type": "strings"
}
I tried using srcField as mentioned in this post. However, a query to retrieve the JSON type returns an empty response. Below are the GET requests I used for the same:
GET http://localhost:8983/solr/test1/select?q=1&fl=miscinfo&wt=json
GET http://localhost:8983/solr/test1/select?q=1&fl=miscinfo,source_s:[json]&wt=json
Also, search queries for values inside JSON-type fields return an empty response:
http://localhost:8983/solr/test1/select?q=pacman&wt=json
{
"responseHeader": {
"status": 0,
"QTime": 0,
"params": {
"q": "pacman",
"json": "",
"wt": "json"
}
},
"response": {
"numFound": 0,
"start": 0,
"docs": []
}
}
Please help with searching object types in Solr.
Have you checked this: https://cwiki.apache.org/confluence/display/solr/Response+Writers
JSON Response Writer: a very commonly used Response Writer is the JsonResponseWriter, which formats output in JavaScript Object Notation (JSON), a lightweight data interchange format specified in RFC 4627. Setting the wt parameter to json invokes this Response Writer. The guide also shows a sample response for a simple query like q=id:VS1GB400C3&wt=json.
I've been able to successfully import a single JSON file into Power BI, but I'm struggling to find out how to appease the 'Folder' import data source with the same data schema across multiple files.
The error is always some variation on "We found extra characters at the end of JSON input", usually pointing at the opening character of the 2nd file.
I assume there's some spec for how it expects the data to be split across files, but I really can't figure out what it wants. Help?
For example:
I've been able to import a single file (in JSON-array format) like:
[
{
"name": "Person1",
"id": 1
},
{
"name": "Person2",
"id": 2
}
]
When I create a 2nd file with the same schema (and slightly varied data), I get the error.
I tried moving the array out under a named property, but I still get the same issue:
{
"data": [
{
"name": "Person1",
"id": 1
},
{
"name": "Person2",
"id": 2
}
]
}
Note: the error appears after expanding the Binary content, so I assume it comes from one of the last two steps.
You don't want to use the combine binaries button here. Instead, you can add a custom column with the custom formula Json.Document([Content]).
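As a sketch in Power Query M (the folder path is made up; the Content column comes from the default Folder connector output):
let
    Source = Folder.Files("C:\json\files"),
    // Parse each file's binary content as JSON instead of using Combine Binaries
    Parsed = Table.AddColumn(Source, "Json", each Json.Document([Content])),
    // Each file holds a JSON array, so expand the resulting list into rows...
    Rows = Table.ExpandListColumn(Parsed, "Json"),
    // ...and then expand each record into columns (field names from the sample above)
    Result = Table.ExpandRecordColumn(Rows, "Json", {"name", "id"})
in
    Result
Because each file is parsed independently, there is no requirement that the concatenated bytes of all files form one valid JSON document, which is what the Combine Binaries step trips over.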
I have this JSON that is returned from a REST service I'm using.
{
"id": "6804",
"signatories": [
{
"id": "12125",
"fields": [
{
"type": "standard",
"name": "fstname",
"value": "John"
},
{
"type": "standard",
"name": "sndname",
"value": "Doe"
},
{
"type": "standard",
"name": "email",
"value": "john.doe#somwhere.com"
},
{
"type": "standard",
"name": "sigco",
"value": "Company"
}
]
}
]
}
Currently I'm looking into a way to parse this with json4s and iterate over the "fields" array, to be able to change the property "value" of the different objects in there. So far I've tried a few JSON libs and ended up with json4s.
json4s allows me to parse the JSON into a JObject, which I can extract the "fields" array from.
import org.json4s._
import org.json4s.native.JsonMethods._
// parse to JObject
val data = parse(json)
// navigate to the fields array (this returns a JValue, not a Map)
val fields = data \ "signatories" \ "fields"
// parse back to JSON
println(compact(render(fields)))
I've managed to extract the fields like this and render them back to JSON again. What I can't figure out, though, is how to loop through these fields and change the property "value" in them.
I've read the json4s documentation, but I'm very new to both Scala and its syntax, so I'm having a difficult time.
The question becomes: how do I iterate over a parsed JSON result to change the property "value"?
Here's the flow I want to achieve:
1. Parse the JSON into an iterable object.
2. Loop through it, look for certain "names", and change their value, for example fstname, from John to some other name.
3. Parse it back to JSON, so I can send the new JSON with the updated values back.
I don't know if this is the best way to do this at all; I'd really appreciate input. Maybe there's an easier way to do this.
Thanks in advance,
Best regards,
Stefan Konno
You can convert the JSON into an array of case classes, which is the easiest thing to do. For example, you can have a case class for the fields like:
case class Field(`type`: String, name: String, value: String)
and you can convert your JSON into an array of fields with read[Array[Field]](json), where json is:
[
{
"type": "standard",
"name": "fstname",
"value": "John"
},
...
]
which will give you an array of fields. Similarly, you can model your entire JSON.
Now that you have an array of case classes, it's pretty simple to iterate over the objects and change the value using the case class's copy method.
After that, to convert the array of objects back into JSON, you can simply use write(objects) (the read and write functions of json4s are available in the org.json4s.native.Serialization package).
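Putting that together, a minimal runnable sketch (the replacement name "Jane" is made up):
import org.json4s._
import org.json4s.native.Serialization
import org.json4s.native.Serialization.{read, write}

case class Field(`type`: String, name: String, value: String)

object Example extends App {
  // formats are required implicitly by read/write
  implicit val formats: Formats = Serialization.formats(NoTypeHints)

  val json = """[{"type":"standard","name":"fstname","value":"John"}]"""

  // JSON -> case classes
  val fields = read[Array[Field]](json)

  // change the value of the "fstname" field via copy
  val updated = fields.map {
    case f if f.name == "fstname" => f.copy(value = "Jane")
    case f => f
  }

  // case classes -> JSON
  println(write(updated)) // [{"type":"standard","name":"fstname","value":"Jane"}]
}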
Update
To do it without converting into case classes, you can use the transformField function:
parse(json).transformField {
  case JField(x, v) if x == "value" && v == JString("Company") =>
    JField("value1", JString("Company1"))
}
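Note that the pattern above only sees one field at a time, so it cannot check the sibling "name" property. To change "value" only where "name" is fstname (the original goal), one sketch is to match on the enclosing object instead (imports as in the question; the replacement name "Jane" is again made up):
val updated = parse(json).transform {
  case obj: JObject if (obj \ "name") == JString("fstname") =>
    obj.transformField { case JField("value", _) => JField("value", JString("Jane")) }
}
println(compact(render(updated)))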