CakePHP 3: searching for a plugin or a solution to find a keyword in PDF files - cakephp-3.0

I'm looking for a solution to use with CakePHP. I have a directory that contains PDF files, and inside my web app I need to implement a search input that finds the files containing a given keyword. I haven't been able to find a solution. Can someone help me?

I had a similar problem but could not find anything suitable.
We ended up storing the PDFs in Google Drive under a dedicated account and using the Google API to do the search.
Obviously, this only works if the documents contain no sensitive data.
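A minimal sketch of what such a search could look like, in Python rather than PHP and not necessarily what we actually ran. It assumes `service` is an authorized Drive v3 client (for example one built with google-api-python-client) and relies on the `fullText contains` term of the Drive `files.list` query language. The function names are made up for illustration.

```python
def build_pdf_query(keyword):
    """Build a Drive `q` string matching PDFs whose indexed text contains keyword."""
    # Drive query syntax requires backslashes and single quotes to be escaped.
    escaped = keyword.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"fullText contains '{escaped}' "
        "and mimeType = 'application/pdf' and trashed = false"
    )


def search_pdfs(service, keyword):
    """Return (id, name) pairs for matching PDFs, following pagination.

    `service` is assumed to be an authorized Drive v3 client object.
    """
    results = []
    page_token = None
    while True:
        response = service.files().list(
            q=build_pdf_query(keyword),
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,
        ).execute()
        results.extend((f["id"], f["name"]) for f in response.get("files", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            return results
```

The web app would then only need to pass the search input to `search_pdfs` and render the returned names.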

Related

Limiting downloading, copying and printing for EDITORS in Google drive

I am looking to find out if it is POSSIBLE to restrict EDITORS from downloading, copying or printing a Google sheet or other documents in a Google Drive. We share sheets/documents with our customers so they can fill in the details. For that we need to make them EDITORS (so they can edit and even invite others to the party). I know that we can restrict COMMENTERS and VIEWERS from downloading, but in our case we need to prevent EDITORS.
We have a LOT of intellectual property in our sheets (custom formulas and approaches), and we would like to be able to prevent people from simply downloading it. As I understand it this SHOULD be possible using the Google Drive API, but I have not been able to figure out how to do it, yet. Looking at the API it obliquely says it MAY be possible, but it is not clear :-(
Direction, or sample code, would be VERY much appreciated.
TIA
It's not possible to restrict editors from downloading files. The documentation mentions that only commenters and viewers can be prevented from this. Then in the API docs concerning permissions and their definitions you will see that there's nothing controlling downloads either. This is just a UI change.
If you think about it, the reason is clear: Even if you manage to stop direct downloads, to anyone with at least read access to the file or API this is just a minor inconvenience. They can still read all the content and metadata from the API and replicate the file perfectly. Even viewers with copy disabled can still read the formulas from the formula bar. Sharing the Sheets file is inherently unsafe if you have confidential data in it, since a determined attacker can still get all your trade secrets easily. You're only supposed to share these files with trusted users.
My suggestion is to take a different approach. Do not share the Sheets file at all and use something else as intermediary to request data from your customers. For example:
Create a Google Form to save the responses to a Sheet. Your customers will only need to fill out the form and the sheet will be filled with data that you can handle on your side.
If you need the users to also view some information in the Sheet before filling out their info, you can build an Apps Script Web App that displays only the plain data that you need to show them. With this you can hide the formulas and other sensitive information. Using templates and server functions you can allow the users to interface with the Sheet data similarly to how they do it now, but with a more restricted view. You could even allow them to edit only the data you want them to. This requires more work and is starting to delve into web development, but it's much easier than a fully fledged website since the hosting and interfacing with APIs is handled by Apps Script.
You could just create your own application and use the Sheets API to read and write data from the Sheet. This is pretty much the previous suggestion but much harder, though in the end it will give you more flexibility.
The bottom line is that sharing your Sheet in any way is akin to giving your users full database read or write access, and there's no single setting that can prevent that. Your best bet is to avoid sharing these files and use a different method to request user data.
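As a rough illustration of the custom-application option above, here is a sketch in Python. It assumes `service` is an authorized Sheets v4 client that runs under your own credentials, so customers never get access to the file itself. The function names, spreadsheet ID and ranges are placeholders, not anything from your setup.

```python
def read_plain_values(service, spreadsheet_id, cell_range):
    """Return the displayed cell values only, never the underlying formulas."""
    response = service.spreadsheets().values().get(
        spreadsheetId=spreadsheet_id,
        range=cell_range,
        # FORMATTED_VALUE returns what the cell shows, not its formula.
        valueRenderOption="FORMATTED_VALUE",
    ).execute()
    return response.get("values", [])


def append_customer_row(service, spreadsheet_id, cell_range, row):
    """Append one row of customer-supplied data below the given range."""
    return service.spreadsheets().values().append(
        spreadsheetId=spreadsheet_id,
        range=cell_range,
        # RAW stores the input as-is, so user text is not parsed as a formula.
        valueInputOption="RAW",
        body={"values": [row]},
    ).execute()
```

Your application decides which ranges are exposed through these two functions, which is what keeps the formulas out of reach.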

Retrieving comments in the most recent revision of Google Doc

I'm pretty new to Google Apps Script and the API, so please bear with me!
As part of a project I'm working on, I would like to retrieve all the comments that are present in the current revision of a file, hopefully using either Google Apps Script or the Google API.
The reason I want this is that the documents I'm working with have had many collaborators and many different revisions, and some of the people who were supposed to be 'resolving' comments didn't realise that even though deleting the text a comment is anchored to removes the comment from the user interface, this comment still counts as "open". i.e., these comments still appear in the comments thread, it just looks like they link to an earlier revision of the file.
Now, I want to retrieve all the comments present in the most recent versions of these files and -- say -- export them to a Google Sheet. However, if I choose all "open" comments, I get many, many more comments than I want, for the reason I stated above.
One approach I was considering is to decipher the comment's anchor id "kix.xxxxxxxxxxxx", hoping that it would at least give me information on the revision history, but I see no documentation on this, and I'm not even sure it's possible since this is tied to Google's proprietary editor, Kix.
I've read these articles that don't give me much hope:
How to match comments on an image using kix anchor (or not) in Google Docs
Anchor documentation does not exist?
Creating anchored comments programmatically in Google Docs
However, if I download the Google Doc as a .docx, then I get only the comments that are present in the latest revision, and this is what gave me some hope that maybe it would be possible to extract them using GAS.
I suppose if all else fails I could download all the documents and then try to get some sort of macro on Microsoft Word to extract the comments, but I decided to ask here before I resorted to that! If anyone could suggest any ideas, I'd be grateful!
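For the download-and-extract fallback, a Word macro may not be needed. A minimal sketch in Python, assuming the exported .docx keeps its comments in the standard `word/comments.xml` part of the archive (the function name is made up for illustration):

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/comments.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_comments(docx_file):
    """Return a list of {author, date, text} dicts from a .docx file.

    `docx_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        if "word/comments.xml" not in archive.namelist():
            return []  # the document has no comments part
        root = ET.fromstring(archive.read("word/comments.xml"))

    comments = []
    for comment in root.iter(f"{{{W_NS}}}comment"):
        # A comment's text is spread over one or more <w:t> runs.
        text = "".join(t.text or "" for t in comment.iter(f"{{{W_NS}}}t"))
        comments.append({
            "author": comment.get(f"{{{W_NS}}}author"),
            "date": comment.get(f"{{{W_NS}}}date"),
            "text": text,
        })
    return comments
```

The resulting list could then be written to a CSV file and imported into a Google Sheet.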

advice on coding duplicate file delete on google drive

I would like to write a program that removes duplicate files, but I don't see a way to compare the files' binary content without first downloading them. I would like the software to run on the server and allow other people to use it as well.
Any advice on doing this is appreciated.
You first need to define "duplicate" in the context of the Google Drive file system. For example, do you consider files to be duplicates if the title is the same, or if the content is the same?
In either case, you'll need to fetch the metadata for each of your files, then look for duplicate titles or duplicate md5 checksums. Note that the md5 checksum is only available for non-Google documents.
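A minimal sketch of the content-based comparison, in Python. It assumes the metadata has already been fetched as a list of dicts, for example from a Drive v3 `files.list` call requesting `fields="files(id, name, md5Checksum)"` (in v3 the title field is called `name`). The function name is hypothetical.

```python
from collections import defaultdict


def group_duplicates(files):
    """Group file metadata dicts by md5Checksum, keeping groups of 2 or more."""
    by_checksum = defaultdict(list)
    for f in files:
        checksum = f.get("md5Checksum")
        if checksum is None:
            continue  # native Google documents carry no md5 checksum
        by_checksum[checksum].append(f)
    return {c: group for c, group in by_checksum.items() if len(group) > 1}
```

No file content is downloaded here; only the metadata is compared, so this can run on the server for any user who has authorized the app.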

File name conversion for cloud storages?

Let's say I have a web URL to a file on a cloud storage service (like Dropbox, Google Drive, etc.). How do I convert that to the corresponding file path on my PC? On Android? On iOS?
Assuming of course I have the utilities/apps installed locally.
EDIT: I am interested in the reverse direction too. (I.e., when I have the local file path, what is the web path?)
EDIT 2: @Greg just made me realize that the problem with file names is much worse on Google Drive than on Dropbox.
And that is very bad. :-(
The reason? Google has good search capabilities on Drive, and therefore I and many, many others have put our documents on Drive. However, once I have found a document I must locate it on my own computer/device. (If I want to edit a PDF, for example.)
EDIT 3: @Dan McGrath kindly asked what parts remain unsolved.
Short answer: All. ;-)
Long answer: My actual use case, see below.
My actual use case is a Zotero web app. Zotero is a reference database where you store references to scientific articles, web pages, etc. The items stored in Zotero may include PDF files or - which I prefer - links to PDF files.
I just want to be able to easily access (read) these PDF files from any computer through the web app. And on my own computer I want to be able to edit the files with my local PDF editor. (Be it Android, Windows or whatever.)
By using a cloud storage I do not have to download/upload the files myself. The cloud storage takes care of that part.
For the "reverse" scenario, that is, you have a file and you want the Dropbox shared link, you can use this API endpoint, assuming you're connected to the account via the API:
https://www.dropbox.com/developers/core/docs#shares

Determine if a certain file was processed by a project that uses the Google Drive API

Good day! I am currently developing a website using the Google Drive API. I am wondering whether it is possible to know if a certain file was created, uploaded or shared by the project, using an App Id. I thought this might be one of the attributes of a file, but when I checked, there seems to be no such thing.
What I am trying to do is filter the files shared by other users with the owner of the account using my website. Is it possible? Any suggestions on how to do it?
Thank you in advance.
File metadata doesn't contain this information, unfortunately, so you can't tell whether a file was created by your app. FYI, Google does store it somewhere: if you upload a file without providing its content type (* / *) and then try to open it in the browser, you will see this message:
No preview available
This item was created with YourAppName, a Google Drive app.
Download this file or use one of the apps you have installed to open it.