Submitting parsed content into Elasticsearch - json

I am trying to upload files (.txt, .pdf) to Elasticsearch. Elasticsearch only accepts content in JSON format. Is there any way to send the parsed content (.pdf or .txt converted to a String) directly, or do I have to wrap the String in a JSON document before sending it to Elasticsearch?

You can only send JSON when indexing a document, so a base64-encoded version of the file in some field of that JSON will do just fine. If you do not wish to search inside this content, all you have to do is disable indexing on that "binary data" field (option index: false in your mapping).
If you wish to send a PDF file and have the textual content extracted and indexed / searchable, you should have a look at the Ingest Attachment Plugin.
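The base64 approach described above can be sketched as follows. This is a minimal illustration, not the way any particular client does it: the field names "filename" and "data" and the sample bytes are assumptions made up for the example, and actually sending the resulting JSON to Elasticsearch is left out.

```python
import base64
import json


def build_document(filename: str, raw_bytes: bytes) -> str:
    """Wrap raw file bytes in a JSON document suitable for indexing."""
    # JSON cannot carry raw bytes, so the content is base64-encoded first.
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    # "filename" and "data" are hypothetical field names for this sketch.
    return json.dumps({"filename": filename, "data": encoded})


# Stand-in bytes; in practice these would come from open(path, "rb").read().
body = build_document("report.pdf", b"%PDF-1.4 example bytes")
print(body)

# Decoding the field gives back the original bytes unchanged.
restored = base64.b64decode(json.loads(body)["data"])
```

The string in `body` is what would go into the request body of an index call; the mapping for the "data" field is where indexing would be switched off if the content should not be searchable.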

You can look at https://github.com/dadoonet/fscrawler for your use case.
Basically, this crawler helps to index binary documents such as PDF, Open Office and MS Office files, and gives you the following features:
Local file system (or a mounted drive) crawling: indexes new files, updates existing ones and removes old ones.
Remote file system crawling over SSH.
REST interface to let you "upload" your binary documents to Elasticsearch.

Related

import json without caching the file (next.js or node.js)

I built an api with next.js.
I use a JSON file as a data source. I import it as a module. But if the content of the JSON changes, it still shows the original content, the same as when I started the server.
Is there a way to avoid caching JSON with import?
I need to get the JSON content, but also the updates in the JSON file, without restarting my api.
If your server returns the JSON files with a specific file extension like .json, you could try to turn off caching for those file types:
Here is an example for nginx servers
Here is an example for Apache servers
Another possibility is to load the JSON via JavaScript and add a random parameter to the query string of the URL
Here is the example

What does a JSON file do?

I went through previous posts on SO and some of the answers say that a JSON file is used to send data from server to client.
Well, that seems to be okay, but then we can create package.json, Apidoc.json and manifest.json, which do not interact with the client and server.
So can someone tell me what actually is a JSON file?
JSON stands for JavaScript Object Notation. It is used to describe a data structure in a simple format. It can be a plain text file, which may be used to pass data from the server to a client, but it could equally be used to hold and consume that data at the same layer, e.g. you could have a configuration file on the client side which is read and interpreted by your application.
Note also that JSON does not need to be held in a file; you could create a string variable with JSON data in it and pass this from one method to another without ever storing it in a file.
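As a small illustration of JSON living only in a string, here is a sketch in which one function produces JSON text and another consumes it, with no file involved. The function names and the settings values are invented for the example.

```python
import json


def make_settings() -> str:
    """Serialize a configuration structure to a JSON string."""
    settings = {"name": "demo", "retries": 3, "tags": ["a", "b"]}
    return json.dumps(settings)


def read_retries(payload: str) -> int:
    """Parse the JSON string and pull one value back out."""
    return json.loads(payload)["retries"]


payload = make_settings()        # a plain str, never written to disk
retries = read_retries(payload)
print(retries)
```

The same `payload` string could just as well be written to a .json file or sent over HTTP; the format is identical in each case.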
The tag definition in Stack Overflow can be found here https://stackoverflow.com/tags/json/info and further information can be found here https://www.json.org/.
JSON is a file format, just like CSV. Just because CSV is used with Microsoft Excel, does not mean that is all it is used for (just like with JSON). Just because it is common to get info from a server in JSON format, does not mean that is all JSON is used for. Do some googling before asking a question like this on Stack Overflow.
Here is an intro to JSON. JSON Intro W3Schools

How to store a blob of JSON in Airtable?

There does not appear to be a dedicated field type in Airtable for "meta" data blobs and/or a JSON string.
Is the "Attachment" type my best bet?
I could store it either as a json attachment, or on a String type column.
Since a full json on a text column would likely not be readable, I would store it as attachments.
However, it seems that, at least for now, uploading attachments requires the file to be already hosted somewhere first, so this route might not be the easiest one:
https://community.airtable.com/t/is-it-possible-to-upload-attachments/188
Right now this isn’t possible with the Airtable API alone. It’s
something we’ll think about for future API versions though. A
workaround for now is to use a different service
(e.g. Filestack, imgur, etc.) to process the upload before then
sending the url to Airtable. When Airtable processes the attachment,
it will copy the file to Airtable’s own (S3) server for safekeeping,
so it’s OK if the original uploaded file url is just temporary

Polymer: how to handle CSV file upload, convert to JSON and send to server

I want to handle a requirement in Polymer web components where a user can upload a CSV file from the UI, and the CSV file is parsed to JSON and sent to the server. I searched and found vaadin-upload and looked over the API, but I am not sure how to receive the CSV file, convert it to JSON and send it to the server. Can anyone show a jsfiddle of vaadin-upload or any other web component that handles this scenario?
First of all, I am wondering why you would not simply do the conversion on the server side.
In this case, you would be able to use the vaadin-upload directly indeed.
Here is a snippet that would upload all files to the example.com server, and only allow CSV files.
<vaadin-upload target="https://example.com/upload" method="POST" accept="text/csv">
</vaadin-upload>
There are plenty of resources on how to convert CSV files to JSON.
Here is a snippet
And here is a node library
If you really wanted to do the conversion client side, then I would suggest creating an element that embeds a vaadin-upload and converts the Files array to JSON before manually calling the uploadFiles method.
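For the server-side conversion suggested at the start of this answer, the core step might look like the sketch below. It assumes the uploaded CSV has already been received as text and that its first row is a header; the function name and the sample data are made up for illustration, and the server framework that receives the upload is not shown.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text (first row = header) to a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every value stays a string; the csv module does no type inference.
    return json.dumps(list(reader))


sample = "name,age\nAda,36\nLinus,54\n"
result = csv_to_json(sample)
print(result)
```

Each CSV row becomes one JSON object keyed by the header names. Converting numeric columns to numbers would be an extra step on top of this.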

Send Json inside FormData as File alongside files

I was wondering if it was OK to encode JSON data as a blob (using atob) and send it alongside some other binary data to the server, and then reload the JSON file into memory to save it in the database?
I am doing this to avoid multiple posts to the server. I could, for example, send the JSON data, validate it, and when it's OK, send the files with some IDs, and then link the two server side.
But this method seems broken, because if the user leaves the browser while the data is being sent, and only part of the files have been sent, then I'd have a lot of incomplete posts.
Thanks for your answers in advance.