Displaying a PDF file stored in a Dataset on Palantir Foundry in Slate Application - palantir-foundry

I am trying to display PDF files in Slate on Palantir Foundry. I managed to display PDF files that are stored in a folder on Foundry without a schema, but not PDFs that are in a dataset.
Is there a way to display PDF files that are stored in a dataset? Alternatively, how can I store PDF files that I extracted from an email file using Code Repository in a folder on Foundry?
Edit: Since displaying PDF files stored in a dataset seems to be difficult, can someone help me with the API call to store PDFs in a folder?

You can do it in two steps:
1. Obtain the precise address of the PDF you are trying to display.
2. Use an image widget in Slate and specify its source as the address from step 1.
You can get the address for step 1 by looking at the address of any single file (dataset view -> details -> files). It will usually have the shape (...)/foundry-data-proxy/api/web/dataproxy/datasets/<dataset_rid>/transactions/<transaction_rid>/filename
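As a rough sketch, the address from step 1 could be assembled like this. The function name and the base_url parameter are made up for illustration; the part of the address shown as (...) above depends on your Foundry instance and has to be supplied by you.

```python
from urllib.parse import quote


def dataset_file_url(base_url, dataset_rid, transaction_rid, filename):
    """Build the data-proxy address of one file in a dataset transaction.

    base_url stands for the elided "(...)" prefix of your Foundry instance
    (an assumption, it is not given in the answer above).
    """
    # Percent-encode the file name so spaces and special characters are URL-safe;
    # "/" is left intact in case the file sits in a sub-path of the dataset.
    encoded_name = quote(filename)
    return (
        f"{base_url.rstrip('/')}/foundry-data-proxy/api/web/dataproxy/datasets/"
        f"{dataset_rid}/transactions/{transaction_rid}/{encoded_name}"
    )
```

The returned string is what you would put into the widget's source property in Slate.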

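I found a solution for saving files in a Palantir Foundry folder:
    import urllib.parse
    import requests

    # Base URL of the Blobster service on your Foundry instance, ending in "/" (placeholder, set it for your stack).
    blobster = "<blobster-base-url>/"

    def upload_file(token: str, target_filename: str, parent_folder_rid: str, content_type: str, raw_data) -> dict:
        # content_type is accepted but not used; the request is sent with the header below.
        header = {'content-type': "application/json",
                  'authorization': f"Bearer {token}"}
        # URL-encode the file name so it is safe inside the query string.
        target = urllib.parse.quote(target_filename, safe='')
        response = requests.post(f"{blobster}salt?filename={target}&parent={parent_folder_rid}",
                                 headers=header,
                                 verify=True,
                                 data=raw_data)
        return response.json()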

Related

Azure Logic Apps how to combine multiple JSON files to Blob storage

I have a logic app that calls an API daily and saves the output to a .JSON file in a blob storage container.
These JSON files are then picked up by Power BI for reporting purposes.
The number of files is growing quickly. Is it possible to have just one JSON file that gets appended with the new data each day?
Power BI can then just connect to one file.
Use the Update blob connector to append the data to the blob.
The workflow reads the current file with Get blob content (V2) and then writes the combined content back with Update blob.
For the Blob Content field of Update blob, use this concat expression to append your API response to the existing content:
concat(body('Get_blob_content_(V2)'),outputs('Compose'))
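The concat expression is plain string concatenation. Here is a minimal Python sketch of the same idea, assuming (this is not stated in the answer) that each daily response is stored as one JSON object per line so the combined file can still be parsed line by line. The function name is hypothetical.

```python
import json


def append_daily_response(existing_blob, api_response):
    """Return the blob text with one more JSON line appended.

    existing_blob plays the role of body('Get_blob_content_(V2)') and
    api_response the role of outputs('Compose') in the expression above.
    """
    line = json.dumps(api_response)
    # Make sure the previous content ends with a newline before appending.
    if existing_blob and not existing_blob.endswith("\n"):
        existing_blob += "\n"
    return existing_blob + line + "\n"
```

Concatenating two complete JSON documents without a separator would not give one valid JSON document, which is why the sketch writes one object per line.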

export nested JSON from GCS into Spreadsheet

I have a nested NDJSON file that I exported from BQ into Google Cloud Storage. From there I would like to open it in Spreadsheet again as a nested table.
I see a lot of Appscripts to import JSON files but none are for files stored in GCS.
What would be the best solution to open the data table in spreadsheet?
the csv file I see when I use the tool suggested by Alex
This is the NDJSON example:
{"page":"/xxxx","country":"DE","pageviews":"72136","daily_peak_pageviews":"5465","daily_peak_users":"3118","users_unique":"37763","SEO":true,"campaign_info":[{"channel_group":"Referral","users_c":"16","pageviews_c":"17","title":"404"},{"channel_group":"Social","users_c":"2255","pageviews_c":"3839","title":"OK"},{"channel_group":"other","users_c":"33185","pageviews_c":"63320","title":"OK"},{"channel_group":"Referral","users_c":"316","pageviews_c":"556","title":"OK"},{"channel_group":"Paid","users_c":"47","pageviews_c":"49","title":"404"},{"channel_group":"Paid","users_c":"1088","pageviews_c":"1706","title":"OK"},{"channel_group":"other","users_c":"1888","pageviews_c":"2517","title":"404"},{"channel_group":"Social","users_c":"100","pageviews_c":"132","title":"404"}]}
{"page":"/yyy","country":"DE","pageviews":"67576","daily_peak_pageviews":"5390","daily_peak_users":"2843","users_unique":"32772","SEO":true,"campaign_info":[{"channel_group":"other","users_c":"7","pageviews_c":"10","title":"404"},{"channel_group":"other","users_c":"30951","pageviews_c":"64345","title":"OK"},{"channel_group":"Paid","users_c":"782","pageviews_c":"1303","title":"OK"},{"channel_group":"Referral","users_c":"265","pageviews_c":"467","title":"OK"},{"channel_group":"Social","users_c":"889","pageviews_c":"1450","title":"OK"},{"channel_group":"Paid","users_c":"1","pageviews_c":"1","title":"404"}]}
{"page":"/zzz","country":"DE","pageviews":"7558","daily_peak_pageviews":"619","daily_peak_users":"331","users_unique":"4117","SEO":true,"campaign_info":[{"channel_group":"other","users_c":"7","pageviews_c":"14","title":"404"},{"channel_group":"Paid","users_c":"38","pageviews_c":"70","title":"OK"},{"channel_group":"other","users_c":"3987","pageviews_c":"7309","title":"OK"},{"channel_group":"Paid","users_c":"1","pageviews_c":"1","title":"404"},{"channel_group":"Referral","users_c":"18","pageviews_c":"26","title":"OK"},{"channel_group":"Social","users_c":"70","pageviews_c":"138","title":"OK"}]}
{"page":"hdhh","country":"DE","pageviews":"3616","daily_peak_pageviews":"336","daily_peak_users":"206","users_unique":"2131","campaign_info":[{"channel_group":"Social","users_c":"267","pageviews_c":"379","title":"OK"},{"channel_group":"Paid","users_c":"776","pageviews_c":"1394","title":"OK"},{"channel_group":"other","users_c":"1089","pageviews_c":"1814","title":"OK"},{"channel_group":"Referral","users_c":"17","pageviews_c":"24","title":"OK"},{"channel_group":"other","users_c":"2","pageviews_c":"5","title":"404"}]}
{"page":"/ethehh","country":"DE","pageviews":"1394","daily_peak_pageviews":"322","daily_peak_users":"294","users_unique":"1232","campaign_info":[{"channel_group":"Paid","users_c":"61","pageviews_c":"67","title":"OK"},{"channel_group":"Social","users_c":"271","pageviews_c":"301","title":"OK"},{"channel_group":"other","users_c":"3","pageviews_c":"5","title":"404"},{"channel_group":"Referral","users_c":"10","pageviews_c":"10","title":"OK"},{"channel_group":"other","users_c":"888","pageviews_c":"1011","title":"OK"}]}
and this is the csv example:
page,country,pageviews,daily_peak_pageviews,daily_peak_users,users_unique,SEO,campaign_info/0/channel_group,campaign_info/0/users_c,campaign_info/0/pageviews_c,campaign_info/0/title,campaign_info/1/channel_group,campaign_info/1/users_c,campaign_info/1/pageviews_c,campaign_info/1/title,campaign_info/2/channel_group,campaign_info/2/users_c,campaign_info/2/pageviews_c,campaign_info/2/title,campaign_info/3/channel_group,campaign_info/3/users_c,campaign_info/3/pageviews_c,campaign_info/3/title,campaign_info/4/channel_group,campaign_info/4/users_c,campaign_info/4/pageviews_c,campaign_info/4/title,campaign_info/5/channel_group,campaign_info/5/users_c,campaign_info/5/pageviews_c,campaign_info/5/title,campaign_info/6/channel_group,campaign_info/6/users_c,campaign_info/6/pageviews_c,campaign_info/6/title,campaign_info/7/channel_group,campaign_info/7/users_c,campaign_info/7/pageviews_c,campaign_info/7/title
/xxxx,DE,72136,5465,3118,37763,true,Referral,16,17,404,Social,2255,3839,OK,other,33185,63320,OK,Referral,316,556,OK,Paid,47,49,404,Paid,1088,1706,OK,other,1888,2517,404,Social,100,132,404
/yyy,DE,67576,5390,2843,32772,true,other,7,10,404,other,30951,64345,OK,Paid,782,1303,OK,Referral,265,467,OK,Social,889,1450,OK,Paid,1,1,404,,,,,,,,
/zzz,DE,7558,619,331,4117,true,other,7,14,404,Paid,38,70,OK,other,3987,7309,OK,Paid,1,1,404,Referral,18,26,OK,Social,70,138,OK,,,,,,,,
hdhh,DE,3616,336,206,2131,,Social,267,379,OK,Paid,776,1394,OK,other,1089,1814,OK,Referral,17,24,OK,other,2,5,404,,,,,,,,,,,,
/ethehh,DE,1394,322,294,1232,,Paid,61,67,OK,Social,271,301,OK,other,3,5,404,Referral,10,10,OK,other,888,1011,OK,,,,,,,,,,,,
I found some scripts that load JSON files into a Google Spreadsheet, but all of them need to load the file from a URL. The steps to get a public link to your JSON file in GCS are:
1. Go to your Google Cloud Storage bucket and click the three dots to the right of your JSON file.
2. Click "Edit permissions".
3. Click "Add item".
4. In "ENTITY" choose "User", in "NAME" type "allUsers", and in "ACCESS" choose "Reader".
Now you have an external link to load your JSON using some scripts, like this one or this other one, but you need to edit the JSON file or the code a bit.
Another solution (and the easiest one) is to convert the JSON file into CSV using this tool and then import the CSV into Google Spreadsheet by clicking "File" -> "Import" -> "Upload" and selecting your CSV file.
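If you would rather do the conversion yourself, here is a minimal Python sketch that flattens NDJSON into the same column layout as the CSV example in the question (campaign_info/0/channel_group and so on). It is not the tool mentioned above, and the function names are made up for illustration.

```python
import csv
import io
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into {"a/0/b": leaf} with "/" as separator."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}/{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


def ndjson_to_csv(ndjson_text):
    """Convert newline-delimited JSON text into CSV text."""
    rows = [flatten(json.loads(line)) for line in ndjson_text.splitlines() if line.strip()]
    # Collect the union of all columns, in the order they first appear.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Write booleans in JSON style (true/false) instead of Python's True/False.
        writer.writerow({k: (json.dumps(v) if isinstance(v, bool) else v) for k, v in row.items()})
    return out.getvalue()
```

Records with fewer campaign_info entries simply get empty cells in the trailing columns, as in the CSV example.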

Convert html to google docs or docx using drive api v3

I'm trying to convert an HTML file (or a preformatted HTML string) to Google Docs using the Drive API v3 in Android Studio, using these lines:
MetadataChangeSet changeSet = new MetadataChangeSet.Builder()
        .setTitle("report.html")
        .setMimeType("text/html")
        .build();
(I took the code from the android-demos-master examples.)
If I try another MIME type such as "application/vnd.google-apps.document", my app crashes. I want to upload the file and convert it to the Google Docs editor format or to Docx. Do I need to convert before or after uploading the file? Can someone guide me?
Using the Python libraries, I found I had to specify two MIME types:
Use 'application/vnd.google-apps.document' when creating the metadata for the Drive file. This is the type of file you want created: a Google Document.
Use 'text/html' for the object representing the uploaded data, as that is the type of the content. In Python, these were objects of type MediaFileUpload (file upload) or MediaIoBaseUpload (in-memory content) from googleapiclient.http.
I imagine it's something similar in Java.
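To make the two MIME types concrete without depending on a client library, here is a sketch that builds the raw multipart upload request for the Drive v3 files endpoint by hand. The function name and the boundary handling are illustrative, and sending the request (with an OAuth bearer token) is left out.

```python
import json
import uuid

# Drive v3 multipart upload endpoint (metadata and content in one request).
UPLOAD_URL = "https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart"


def build_conversion_upload(name, html, boundary=None):
    """Return (headers, body) for a multipart/related upload of HTML as a Google Doc."""
    boundary = boundary or uuid.uuid4().hex
    # MIME type 1: the type of file Drive should create (the conversion target).
    metadata = {"name": name, "mimeType": "application/vnd.google-apps.document"}
    body = (
        f"--{boundary}\r\n"
        "Content-Type: application/json; charset=UTF-8\r\n\r\n"
        f"{json.dumps(metadata)}\r\n"
        f"--{boundary}\r\n"
        # MIME type 2: the type of the content actually being uploaded.
        "Content-Type: text/html; charset=UTF-8\r\n\r\n"
        f"{html}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    headers = {"Content-Type": f"multipart/related; boundary={boundary}"}
    return headers, body
```

You would POST the body to UPLOAD_URL with these headers plus an Authorization header; the same split between target type and content type should carry over to the Java client.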

Need to convert html to google doc native format using Java

I've read through the API documentation at https://developers.google.com/drive/v2/reference/ however I cannot find the answer to my question. And attempts to google a solution have failed.
I have a series of previously uploaded small HTML files sitting in Google Drive. What I want to do is write a short application to convert each of these to native Google Document format (mime type "application/vnd.google-apps.document").
I want to do this using Java code and not using GAS code.
The approach I used was to query drive for the File object corresponding to the item I want to convert. Then I pull that file's content as a string. Then I create a new file of mime type "application/vnd.google-apps.document" and upload it with the HTML content. Not surprisingly it didn't work.
So then I tried a different approach: Upload the content as "text/html" but set the "convert" flag to "true". Well I didn't see any direct API to set the convert flag to true. So I tried:
File oBody = new File();
oBody.setTitle(sTitle);
oBody.setDescription(sDescription);
oBody.setMimeType(sMimeType);
oBody.set("convert", bConvert);
This did not fail. But it did not create a Google Document either. It just created a text file identical to the original file.
How do I upload a document containing "text/html" content and get Google Drive to convert it automatically to a Google Document?
The convert flag has to be set in the files.insert request and not the File resource.
Using the snippet in the files.insert documentation as reference, this is what you should do:
...
File file = service.files().insert(body, mediaContent).setConvert(true).execute();
...

Partial download for Drive documents

I want to use partial download for all files from Google Drive.
For files that have 'downloadUrl' attribute - partial download works great.
But partial download doesn't work for 'exportLinks'.
range = "bytes=0-100"
result = client.execute!(:uri => url, :headers => {"Range" => range})
result.body.size
=> 3960
How can I use partial download for Google Documents(Documents, Spreadsheets, Presentations, etc.)?
There is an argument that a partial download of an export link makes no sense. The media content contained in an export download is generated dynamically by a Google file conversion process. There is no guarantee that the exported data will always be the same, e.g. Google could easily tweak its toHtml conversion, or add additional metadata within a PDF.
What is your use case - perhaps there is a different approach? It shouldn't be too difficult to create a proxy on AppEngine which downloads the exported content and then offers a partial download.
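The range-handling part of such a proxy could look roughly like this in Python. It is a simplified sketch: it assumes the exported content has already been downloaded in full into memory, it handles only a single byte range, and the function name is made up.

```python
import re


def serve_range(content, range_header):
    """Return (status, headers, body) for an optional single 'bytes=start-end' range."""
    total = len(content)
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", range_header or "")
    if not match or match.group(1) == match.group(2) == "":
        # No usable Range header: send the whole export.
        return 200, {"Content-Length": str(total)}, content
    start_text, end_text = match.groups()
    if start_text == "":
        # Suffix range such as "bytes=-100": the last N bytes.
        start = max(total - int(end_text), 0)
        end = total - 1
    else:
        start = int(start_text)
        end = min(int(end_text), total - 1) if end_text else total - 1
    if start > end or start >= total:
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    body = content[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

The proxy has to serve every range from the same cached copy of the export, since a fresh export may not be byte-identical.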