Error with the JSON file while using Scrapy for crawling

I am using a script, Financial-News-Crawler, that I found on GitHub and that is built on Scrapy. It crawls a given website, using a JSON file that lists the rules and paths for the site to crawl. I looked at the Scrapy documentation in order to write the JSON file for the Yahoo Finance website. When I try to run the script, it says "No json could be decoded."
Here is the structure of the .json file:
{
"allowed_domains" : [“finance.yahoo.com”],
"start_urls": [
"http://finance.yahoo.com/investing-news/"
],
"rules": [
{
"allow": [“/investing-news"],
"follow": true
},
],
"paths": {
"title" : ["//title/text()"],
"date" : ["//span[@class='datestamp']/text()"],
"text" : ["//div[@id=‘article_content’]”, "//div[@id='article_body']"]
},
"source": “finance yahoo“,
"company": “Yahoo”
}
What am I doing wrong?
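One way to narrow this down is to run the file's contents through Python's json module, which reports where decoding fails. The sketch below also scans for typographic (curly) quotes, which JSON does not accept as string delimiters. The helper names find_curly_quotes and check_json and the one-line samples are illustrative assumptions, not part of Financial-News-Crawler.

```python
import json

# Typographic quotes that often sneak in when JSON is edited in a word processor.
CURLY_QUOTES = "\u201c\u201d\u2018\u2019"


def find_curly_quotes(text):
    """Return (line, column, character) for every curly quote in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in CURLY_QUOTES:
                hits.append((line_no, col_no, ch))
    return hits


def check_json(text):
    """Return None if text parses as JSON, else the parser's error message."""
    try:
        json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        return str(exc)
    return None


# Hypothetical one-line samples modelled on the rules file in the question.
bad = '{"allowed_domains": [\u201cfinance.yahoo.com\u201d]}'
good = '{"allowed_domains": ["finance.yahoo.com"]}'

print(find_curly_quotes(bad))  # positions of the two curly quotes
print(check_json(bad))         # the parser's complaint
print(check_json(good))        # None, the sample is valid
```

The same check also flags a trailing comma before a closing bracket, which Python's json module rejects.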

Related

Trying to generate a json asset file in Flutter

I have created a Flutter web project that uses the flutter_azure_b2c package, which reads its configuration from asset JSON files. Everything works fine when I run it.
Now I want to generate these json config files automatically using env variables when I build the app.
Here is my json file format:
{
"client_id" : "",
"redirect_uri" : "",
"cache_location": "localStorage",
"interaction_mode": "redirect",
"authorities": [
{
"type": "B2C",
"authority_url":""
},
{
"type": "B2C",
"authority_url":""
}
],
"default_scopes": [
]
}
How can it be done?
I tried writing to the file when main.dart runs, using this package: https://pub.dev/packages/global_configuration, but the JSON values are not updated when I pass them to the flutter_azure_b2c method.
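One possible approach is to generate the asset file in a separate step that runs before the Flutter build, so the file already holds the right values when it is bundled. The sketch below does this in Python as an illustration. The environment variable names (B2C_CLIENT_ID, B2C_REDIRECT_URI, B2C_AUTHORITY_URLS, B2C_DEFAULT_SCOPES) and the output path are assumptions, not names defined by flutter_azure_b2c.

```python
import json
import os


def build_config(env):
    """Build the config dict from a mapping of environment variables."""
    return {
        "client_id": env.get("B2C_CLIENT_ID", ""),
        "redirect_uri": env.get("B2C_REDIRECT_URI", ""),
        "cache_location": "localStorage",
        "interaction_mode": "redirect",
        # Comma-separated lists; empty entries are dropped.
        "authorities": [
            {"type": "B2C", "authority_url": url}
            for url in env.get("B2C_AUTHORITY_URLS", "").split(",")
            if url
        ],
        "default_scopes": [
            scope
            for scope in env.get("B2C_DEFAULT_SCOPES", "").split(",")
            if scope
        ],
    }


def write_config(path, env=os.environ):
    """Write the generated config as JSON to path."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_config(env), fh, indent=2)
```

A build script could then call write_config("assets/auth_config.json") (a hypothetical path) before running the Flutter build.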

Can Filebeat parse JSON fields instead of the whole JSON object into kibana?

I am able to get a single JSON object in Kibana by having this in the filebeat.yml file:
output.elasticsearch:
  hosts: ["localhost:9200"]
How can I get the individual elements in the JSON string? Say I wanted to compare the "pseudorange" fields of all my JSON objects. How would I:
Select the "pseudorange" field from all my JSON messages to compare them?
Compare them visually in Kibana? At the moment I can't even find the message, let alone the individual fields, in the visualisation tab.
I have heard of people using Logstash to parse the string somehow, but is there no way of doing this simply with Filebeat? If there isn't, what do I do with Logstash to filter the individual fields in the JSON, instead of having my message be one big JSON string that I cannot interact with?
I get the following output from output.console (note that I have replaced some information with <> placeholders to hide it):
{
"@timestamp": "2021-03-23T09:37:21.941Z",
"@metadata": {
"beat": "filebeat",
"type": "doc",
"version": "6.8.14",
"truncated": false
},
"message": "{\n\t\"Signal_data\" : \n\t{\n\t\t\"antenna type:\" : \"GPS\",\n\t\t\"frequency type:\" : \"GPS\",\n\t\t\"position x:\" : 0.0,\n\t\t\"position y:\" : 0.0,\n\t\t\"position z:\" : 0.0,\n\t\t\"pseudorange:\" : 20280317.359730639,\n\t\t\"pseudorange_error:\" : 0.0,\n\t\t\"pseudorange_rate:\" : -152.02620448094211,\n\t\t\"svid\" : 18\n\t}\n}\u0000",
"source": <ip address>,
"log": {
"source": {
"address": <ip address>
}
},
"input": {
"type": "udp"
},
"prospector": {
"type": "udp"
},
"beat": {
"name": <ip address>,
"hostname": "ip-<ip address>",
"version": "6.8.14"
},
"host": {
"name": "ip-<ip address>",
"os": {
<ubuntu info>
},
"id": <id>,
"containerized": false,
"architecture": "x86_64"
},
"meta": {
"cloud": {
<cloud info>
}
}
}
In Filebeat, you can use the decode_json_fields processor to decode a JSON string and add the decoded fields to the root object:
processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 2
      target: ""
      overwrite_keys: true
      add_error_key: false
Credit to Val for this. His answer worked; however, as he suggested, my JSON string had a \000 (null terminator) at the end, which made it invalid JSON and prevented the decode_json_fields processor from working as it should.
Upgrading to version 7.12 of Filebeat (also make sure Elasticsearch and Kibana are on 7.12, because mismatched versions between them can cause issues) allows us to use the script processor: https://www.elastic.co/guide/en/beats/filebeat/current/processor-script.html.
Credit to Val here again; this script removed the null terminator:
  - script:
      lang: javascript
      id: trim
      source: >
        function process(event) {
            event.Put("message", event.Get("message").trim());
        }
After the null terminator was removed, the decode_json_fields processor did its job as Val suggested, and I was able to extract the individual elements of the JSON field, which let me visualise the elements I wanted in Kibana!
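To see why the trailing null terminator matters, the effect can be reproduced outside Filebeat. The Python sketch below is only an illustration of the same idea (strip the terminator, then decode). The payload is a shortened, hypothetical version of the message above, and decode_message is an illustrative helper, not a Filebeat API. Python's str.strip() with no arguments does not remove NUL characters, so they are listed explicitly here.

```python
import json

# Shortened, hypothetical payload modelled on the UDP message in the question.
raw_message = (
    '{\n\t"Signal_data" : \n\t{\n\t\t"pseudorange:" : 20280317.359730639,'
    '\n\t\t"svid" : 18\n\t}\n}\u0000'
)


def decode_message(message):
    """Strip trailing NUL characters and whitespace, then parse the JSON."""
    cleaned = message.rstrip("\x00 \t\r\n")
    return json.loads(cleaned)


try:
    json.loads(raw_message)
except ValueError as exc:
    # The trailing NUL is not valid JSON whitespace, so strict parsing fails.
    print("raw message rejected:", exc)

decoded = decode_message(raw_message)
print(decoded["Signal_data"]["svid"])
```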

Copy JSON Array data from REST data factory to Azure Blob as is

I have used the REST connector to get data from an API, and the JSON output contains arrays. When I try to copy the JSON as is to Blob storage using a copy activity, I only get the first object's data and the rest is ignored.
The documentation says we can copy JSON as is by skipping the schema section on both the dataset and the copy activity. I followed that and I am getting the output below.
https://learn.microsoft.com/en-us/azure/data-factory/connector-rest#export-json-response-as-is
I tried the copy activity without a schema, using the header as the first row, and writing the output files to Blob storage as .json and .txt.
Sample REST output:
{
"totalPages": 500,
"firstPage": true,
"lastPage": false,
"numberOfElements": 50,
"number": 0,
"totalElements": 636,
"columns": {
"dimension": {
"id": "variables/page",
"type": "string"
},
"columnIds": [
"0"
]
},
"rows": [
{
"itemId": "1234",
"value": "home",
"data": [
65
]
},
{
"itemId": "1235",
"value": "category",
"data": [
92
]
},
],
"summaryData": {
"totals": [
157
],
"col-max": [
123
],
"col-min": [
1
]
}
}
The Blob output as text is below; it contains only the first object's data:
totalPages,firstPage,lastPage,numberOfElements,number,totalElements
500,True,False,50,0,636
If you want to write the JSON response as is, you can use an HTTP connector. However, please note that the HTTP connector doesn't support pagination.
If you want to keep using the REST connector and write a CSV file as output, can you please specify how you want the nested objects and arrays to be written?
CSV files cannot hold arrays. You could always use a custom activity or an Azure Function activity to call the REST API, parse the response the way you want, and write it to a CSV file.
Hope this helps.
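As a rough sketch of the parsing step such a custom activity or function might perform, the Python below flattens the rows array of the sample response into CSV text. Joining the data array with ";" is just one assumed convention for fitting an array into a single CSV cell, and rows_to_csv is an illustrative helper, not a Data Factory API.

```python
import csv
import io


def rows_to_csv(response):
    """Flatten response["rows"] into CSV text, one line per row object."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["itemId", "value", "data"])
    for row in response.get("rows", []):
        # Arrays have no CSV equivalent, so join the values into one cell.
        data_cell = ";".join(str(item) for item in row.get("data", []))
        writer.writerow([row.get("itemId"), row.get("value"), data_cell])
    return buffer.getvalue()


# Trimmed copy of the sample REST output from the question.
sample = {
    "totalPages": 500,
    "rows": [
        {"itemId": "1234", "value": "home", "data": [65]},
        {"itemId": "1235", "value": "category", "data": [92]},
    ],
}

print(rows_to_csv(sample))
```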

How to parse, and render, json for use in Django template

I'm new to Django and trying to learn, but I'm confused about how to take JSON data pulled from a URL and render it in a Django template so that it shows up in the HTML page.
A sample of the JSON data:
{
"docs":
[
{
"hostIP": "X.X.X.X",
"time": "August 13, 2018 13:43:44",
"site":
[
{
"site": "site1",
"path": "/path/to/site1",
"git_branch": "master",
"git_commit_message": "New changes"
},
{
"site": "site2",
"path": "/path/to/site2",
"git_branch": "master",
"git_commit_message": "add card"
}
]
}
]
}
Also, how can I loop over it using Jinja2? Could someone please help me with this?
In your view code, parse it with json.loads():
import json
data = json.loads(my_json_data)
Then pass data as a context variable to the template. You can then access these values and loop over them however you like in the template.
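A minimal sketch of those two steps, kept free of Django imports so it runs on its own: build_context and summarize_sites are illustrative helpers, and the raw JSON is a trimmed copy of the sample above. In a real view, the dict returned by build_context would be passed as the context argument to django.shortcuts.render, and the nested Python loop corresponds to nested {% for %} tags in the template.

```python
import json

# Trimmed copy of the sample JSON from the question.
raw_json = """
{"docs": [{"hostIP": "X.X.X.X",
           "time": "August 13, 2018 13:43:44",
           "site": [{"site": "site1", "path": "/path/to/site1",
                     "git_branch": "master",
                     "git_commit_message": "New changes"},
                    {"site": "site2", "path": "/path/to/site2",
                     "git_branch": "master",
                     "git_commit_message": "add card"}]}]}
"""


def build_context(text):
    """Parse the JSON text and return a template context dict."""
    data = json.loads(text)
    return {"docs": data["docs"]}


def summarize_sites(context):
    """Mirror the template loop:
    {% for doc in docs %}{% for site in doc.site %}...{% endfor %}{% endfor %}
    """
    lines = []
    for doc in context["docs"]:
        for site in doc["site"]:
            lines.append(f'{doc["hostIP"]}: {site["site"]} ({site["git_branch"]})')
    return lines


for line in summarize_sites(build_context(raw_json)):
    print(line)
```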

Angular 6 - Unexpected token in JSON at position 0

My Angular 5 project was working without issues. Right after I updated it to version 6, it stopped building with ng build, failing with the following error:
ERROR in ./src/app/assets/i18/en.json Module parse failed: Unexpected
token in JSON at position 0 You may need an appropriate loader to
handle this file type.
Here is my JSON file:
{
"app": {
"Welcome": "Welcome",
"New": "New"
},
"mainMenu": {
"Home": "Home",
"Logout": "Logout"
},
"pageHeader": {
"About": "About",
"Settings": "Settings"
}
}
Most solutions on the web talk about CopyWebpackPlugin, but the project doesn't use any Webpack configuration file.
Then, following this link, I tried wrapping the JSON in an array:
{
"menu":[
"app": {
"Welcome": "Welcome",
"New": "New"
},
"mainMenu": {
"Home": "Home",
"Logout": "Logout"
},
"pageHeader": {
"About": "About",
"Settings": "Settings"
}
]
}
But I got the following error, even though the file contains only 16 lines:
Unexpected token : in JSON at position 24
Any ideas?
Inspired by @AndrewJuniorHoward, I found that during the upgrade process all the JSON files had been re-encoded as UTF-8 with BOM instead of plain UTF-8, which is why Angular was unable to load them during the build.
In Visual Studio Code, I created empty files, pasted in the content of the old JSON files, and overwrote the originals with them; after that everything worked perfectly.
Resave the angular.json file as UTF-8. There seems to be a recent problem with upgrading to Angular 6 regarding this.
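If many files are affected, resaving them by hand gets tedious. As a hedged alternative, the Python sketch below removes a leading UTF-8 BOM from a file in place; strip_bom is an illustrative helper and is not part of the Angular CLI.

```python
import codecs
from pathlib import Path


def strip_bom(path):
    """Rewrite path without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False otherwise.
    """
    raw = Path(path).read_bytes()
    if not raw.startswith(codecs.BOM_UTF8):
        return False
    Path(path).write_bytes(raw[len(codecs.BOM_UTF8):])
    return True
```

It could be applied to every translation file with something like: for p in Path("src/app/assets").rglob("*.json"): strip_bom(p), where the directory is taken from the error message above.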
I hope you have resolved the issue. If you still want to make some minor changes, you can try adding an "id" to the objects in the array as below. I tried this in my CLI project on Angular 6 while performing CRUD operations on a JSON file.
{
"menu":[
"app": {
"id": 1,
"Welcome": "Welcome",
"New": "New"
},
"mainMenu": {
"id": 2,
"Home": "Home",
"Logout": "Logout"
},
"pageHeader": {
"id": 3,
"About": "About",
"Settings": "Settings"
}
]
}