Exiftool export JSON with Python

I'm trying to extract some metadata and store it in a JSON file using Exiftool via Python.
If I run the following command (taken from the documentation) in CMD, it works fine and generates a temp.json file:
exiftool -filename -createdate -json C:/Users/XXX/Desktop/test_folder > C:/Users/XXX/Desktop/test_folder/temp.json
When managing Exiftool from Python the data is extracted correctly but no JSON file is generated.
import os
import subprocess
root_path = 'C:/Users/XXX/Desktop/test_folder'
for path, dirs, files in os.walk(root_path):
    for file in files:
        file_path = path + os.sep + file
        exiftool_exe = 'C:/Users/XXX/Desktop/exiftool.exe'
        json_path = path + os.sep + 'temp.json'
        export = os.path.join(path + ' > ' + json_path)
        exiftool_command = [exiftool_exe, '-filename', '-createdate', '-json', export]
        process = subprocess.run(exiftool_command)
        print(process.stdout)
When I run the code it shows the error:
Error: File not found - C:/Users/XXX/Desktop/test_folder > C:/Users/XXX/Desktop/test_folder/temp.json
What am I missing, any ideas on how to get it to work? Thanks!
Edit with the solution:
I'm leaving the fixed code here in case it helps someone else:
import os
import subprocess
root_path = 'C:/Users/XXX/Desktop/test_folder'
for path, dirs, files in os.walk(root_path):
    for file in files:
        file_path = path + os.sep + file
        exiftool_exe = 'C:/Users/XXX/Desktop/exiftool.exe'
        export = root_path + os.sep + 'temp.json'
        exiftool_command = [exiftool_exe, file_path, '-filename', '-createdate', '-json', '-W+!', export]
        process = subprocess.run(exiftool_command)
        print(process.stdout)
Thanks to StarGeek!

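I believe the problem is that output redirection (>) is a feature of the command-line shell and isn't applied by subprocess.run when the command is passed as a list: every list item is handed to the program as a literal argument, which is why exiftool looks for a file named "... > ...". See this StackOverflow question.
For an exiftool solution, you would use the -W (-tagOut) option, specifically -W+! C:/Users/XXX/Desktop/test_folder/temp.json. See Note #3 under that link.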
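
As a Python-side alternative to -W, the redirection can be reproduced by opening the output file yourself and passing it as stdout. This is only a minimal sketch: run_to_file is a hypothetical helper name, and the demo uses the Python interpreter as a stand-in command so it runs without exiftool installed.

```python
import os
import subprocess
import sys
import tempfile


def run_to_file(command, out_path):
    """Run `command` (a list of arguments) and write its stdout to `out_path`."""
    # Opening the file ourselves takes the place of the shell's `>` redirection.
    with open(out_path, "wb") as out_file:
        completed = subprocess.run(command, stdout=out_file, check=True)
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command (assumption): with exiftool the list would instead be
    # [exiftool_exe, "-filename", "-createdate", "-json", root_path].
    demo_command = [sys.executable, "-c", "print('[]')"]
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "temp.json")
        run_to_file(demo_command, target)
        with open(target) as result:
            print(result.read())
```

Because exiftool accepts a folder as input, a single call on root_path would mirror the original CMD command rather than one call per file.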

Related

Creating an exe - file for data exchange with server in Tcl

I am completely lost and I do not know how to approach the following problem which my boss assigned to me.
I have to create an exe file containing code that works as follows when I run it: it sends a certain file, say file_A, to a server. When the server receives this file, it sends back a JSON file, say file_B, which contains a URL. More precisely, the url attribute of the JSON file contains the URL. The code should then open the URL in a browser.
And here are the details:
The above code (one version in Tcl) must accept three required parameters and a fourth, optional parameter. The three required parameters are server, type and file.
server: this is the address of the server. For example, https://localhost:0008.
type: this is the type of the file (file_A) to be sent to the server: xdt / png
file: this is the path to the file (file_A) to be sent to the server.
The fourth, optional parameter is:
wksName: if this parameter is given, the URL should be opened with it in the browser.
I got example code for the above procedure written in Python. It should serve as orientation. I do not know anything about this language, but to a large extent I understand the code. It looks as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import platform
import sys
import webbrowser
import requests
args_dict = {}
for arg in sys.argv[1:]:
    if '=' in arg:
        sep = arg.find('=')
        key, value = arg[:sep], arg[sep + 1:]
        args_dict[key] = value
server = args_dict.get('server', 'http://localhost:0008')
request_url = server + '/NAME_List_number'
type = args_dict.get('type', 'xdt')
file = args_dict.get('file', 'xdtExamples/xdtExample.gdt')
wksName = args_dict.get('wksName', platform.node())
try:
    with open(file, 'rb') as f:
        try:
            r = requests.post(request_url, data={'type': type}, files={'file': f})
            request_url = r.json()['url'] + '?wksName=' + wksName
            webbrowser.open(url = request_url, new = 2)
        except Exception as e:
            print('Error:', e)
            print('Request url:', request_url)
except:
    print('File \'' + file + '\' not found')
As you can see, the crucial part of the above code is this:
try:
    with open(file, 'rb') as f:
        try:
            r = requests.post(request_url, data={'type': type}, files={'file': f})
            request_url = r.json()['url'] + '?wksName=' + wksName
            webbrowser.open(url = request_url, new = 2)
        except Exception as e:
            print('Error:', e)
            print('Request url:', request_url)
except:
    print('File \'' + file + '\' not found')
Everything above it is just definitions. If possible, I would like to translate the above code into Tcl. Could you please help me with this issue?
It doesn't have to be a 1-1 Tcl translation as long as it works as described above and is, hopefully, as simple as the Python version.
I am not familiar with concepts such as sending data to and receiving data from servers, reading JSON files, etc.
Any help is welcome.

Download a file from a webpage using python

I need to download a file from a webpage every 2 weeks. The file is a new one each time, so its name changes too, but only the last 3 characters change; the leading "Vermeldung %%%" part stays the same. After that I need to send it to someone via email. Could someone help me accomplish that?
This is the code I have right now:
url ='https://worbis-kirche.de/downloads?view=document&id=339:vermeldungen-kw-9&catid=61'
from bs4 import BeautifulSoup
from bs4.dammit import EncodingDetector
import requests
parser = 'html.parser' # or 'lxml' (preferred) or 'html5lib', if installed
resp = requests.get(url)
http_encoding = resp.encoding if 'charset' in resp.headers.get('content-type', '').lower() else None
html_encoding = EncodingDetector.find_declared_encoding(resp.content, is_html=True)
encoding = html_encoding or http_encoding
soup = BeautifulSoup(resp.content, parser, from_encoding=encoding)
for link in soup.find_all('a', href=True):
    print(link['href'])
It gives me all the links I need, but how do I tell the program which link to download? The link that needs to be downloaded is /downloads?view=document&id=339&format=raw
I think you need to get this link:
https://worbis-kirche.de/downloads?view=document&id=339&format=raw
So, you could just do this:
import os
import shutil
...
for link in soup.find_all('a', href=True):
    myLink = link['href'] # Assuming the myLink is /downloads?view=document&id=339&format=raw
    if not myLink.endswith('format=raw'): # Skip every link except the raw document download
        continue
    myLink = "https://worbis-kirche.de" + myLink
    r = requests.get(myLink, stream=True) # To download it
    r.raw.decode_content = True
    with open(filename, "wb") as f: # Filename is the name of pdf (define it yourself)
        shutil.copyfileobj(r.raw, f)
    try:
        # Directory is your aimed (preferred) downloads folder (define it yourself)
        shutil.move(os.path.join(os.getcwd(), filename), os.path.join(directory, filename))
    except Exception as e:
        print(e, ": File couldn't be transferred")
I hope I answered your question...
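
Since the page lists many links, one way to tell the program which one to download is to filter the hrefs for the raw-download marker before requesting anything. The following is a rough sketch under that assumption: pick_download_links and the marker parameter are hypothetical names, and it relies on the download link always containing format=raw as in the question.

```python
from urllib.parse import urljoin

BASE_URL = "https://worbis-kirche.de"


def pick_download_links(hrefs, base_url=BASE_URL, marker="format=raw"):
    """Return absolute URLs for every href that contains `marker`."""
    # urljoin turns site-relative links such as "/downloads?..." into full URLs.
    return [urljoin(base_url, href) for href in hrefs if marker in href]


if __name__ == "__main__":
    # Sample hrefs copied from the question; the real list would come from
    # [link['href'] for link in soup.find_all('a', href=True)].
    sample_hrefs = [
        "/downloads?view=document&id=339:vermeldungen-kw-9&catid=61",
        "/downloads?view=document&id=339&format=raw",
    ]
    for url in pick_download_links(sample_hrefs):
        print(url)
```

Filtering first keeps the download loop from requesting every navigation link on the page.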

Python read JSON files from all sub-directories

I have the following folder structure:
Directory
- Subdirectory 1:
file.json
- Subdirectory 2:
file.json
- Subdirectory 3:
file.json
- Subdirectory 4:
file.json
How do I read these JSON files using Pandas?
Try this code:
import pandas as pd
from pathlib import Path
files = Path("Directory").glob("**/*.json")
for file in files:
    df = pd.read_json(file)
To learn more about converting JSON string to Pandas object:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html
You could do the following:
import glob, os
import pandas as pd
working_directory = os.getcwd()
sub_directories = [working_directory + "/" + x for x in os.listdir(working_directory) if os.path.isdir(working_directory + "/" + x)]
all_json_files = []
for sub_dir in sub_directories:
    os.chdir(sub_dir)
    for file in glob.glob("*.json"):
        all_json_files.append(sub_dir + "/" + file)
# Get back to original working directory
os.chdir(working_directory)
list_of_dfs = [pd.read_json(x) for x in all_json_files]
From there, if all json files have the same structure, you could concatenate them to get one single dataframe:
final_df = pd.concat(list_of_dfs)
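
The file collection step can also be done without changing the working directory. Below is a small sketch using only the standard library; find_json_files is a hypothetical helper name, and the demo builds a throwaway folder tree shaped like the one in the question.

```python
import tempfile
from pathlib import Path


def find_json_files(root):
    """Return every *.json file below `root`, sorted for a stable order."""
    # rglob searches `root` and all of its sub-directories recursively.
    return sorted(Path(root).rglob("*.json"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a small tree that mimics the layout from the question.
        for name in ("Subdirectory 1", "Subdirectory 2"):
            sub = Path(tmp) / name
            sub.mkdir()
            (sub / "file.json").write_text('{"a": [1, 2]}')
        for json_path in find_json_files(tmp):
            print(json_path.relative_to(tmp))
```

Each returned path could then be passed to pd.read_json and the resulting frames combined with pd.concat, as in the answers above.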

User to download file to their local directory

Backend - I wrote a python script that creates a csv file after some aggregation.
Frontend - Once the method has finished running and the .csv file is generated and saved to a directory on the server, I want to prompt the user to save the .csv file on their local computer (just like the Windows prompt you get when you press "save as..." on a webpage).
This is an example of what I've done so far from what I learned in Return Excel file in Flask app and Download a file when button is pressed on web application? :
Sample code:
with open(save_path + unique_filename + ".csv", 'w', encoding = 'utf8') as g:
    writer = csv.writer(g, lineterminator = '\n')
    writer.writerow(['name', 'place', 'location'])
HTML:
@app.route('/login', method='POST')
def do_login():
    category = request.forms.get('category')
    return '''
    <html><body>
    Hello. Save Results
    </body></html>
    '''
@app.route("/getCSV", methods = ['GET', 'POST'])
def getPlotCSV():
    return send_from_directory(save_path + unique_filename + ".csv", as_attachment=True)
if __name__ == "__main__":
    run(app, host = 'localhost', port = 8000)
My questions are:
1) send_from_directory is from flask, what is the bottle equivalent?
2) Where in the code do I place the csv I created so the user can download it to their local machine?
3) What else is wrong with my code?
Bottle example: From https://bottlepy.org/docs/dev/tutorial.html
from bottle import route, static_file
@route('/download/<filename:path>')
def download(filename):
    return static_file(filename, root='/path/to/static/files', download=filename)

Open JSON files in different directory - Python3, Windows, pathlib

I am trying to open JSON files located in a directory other than the current working directory (cwd). My setting: Python3.5 on Windows (using Anaconda).
from pathlib import *
import json
path = Path("C:/foo/bar")
filelist = []
for f in path.iterdir():
    filelist.append(f)
for file in filelist:
    with open(file.name) as data_file:
        data = json.load(data_file)
In this setting I have these values:
file >> C:\foo\bar\0001.json
file.name >> 0001.json
However, I get the following error message:
---> 13 with open(file.name) as data_file:
14 data = json.load(data_file)
FileNotFoundError: [Errno 2] No such file or directory: '0001.json'
Here is what I tried so far:
Use .joinpath() to add the directory to the file name in the open command:
with open(path.joinpath(file.name)) as data_file:
    data = json.load(data_file)
TypeError: invalid file: WindowsPath:('C:/foo/bar/0001.json')
Used .resolve() as that works for me to load CSV files into Pandas. Did not work here.
for file in filelist:
    j = Path(path, file.name).resolve()
    with open(j) as data_file:
        data = json.load(data_file)
Since I'm on Windows, write the path as follows (and yes, the file is in that directory):
path = Path("C:\\foo\\bar") #resulted in the same FileNotFoundError above.
Instantiate path like this:
path = WindowsPath("C:/foo/bar")
#Same TypeError as above for both '\\' and '/'
The accepted answer contains a lot of redundancy: it re-collects the generator into a list and mixes a with statement with pathlib.Path.
pathlib.Path is a great way to handle paths, especially if we want to create scripts that work on both Linux and Windows.
# modules
from pathlib import Path
import json
# static values
JSON_SUFFIXES = [".json", ".js", ".other_suffix"]
folder_path = Path("C:/users/user/documents")
for file_path in folder_path.iterdir():
    if file_path.suffix in JSON_SUFFIXES:
        data = json.loads(file_path.read_bytes())
Just adding a modification for new users. pathlib.Path works with Python 3.
Complete solution; thanks @eryksun:
from pathlib import *
import json
path = Path("C:/foo/bar")
filelist = []
for f in path.iterdir():
    filelist.append(f)
for file in filelist:
    with open(str(file)) as data_file:
        data = json.load(data_file)
This line works as well:
with file.open() as data_file:
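
The two errors in the question have different causes. The FileNotFoundError comes from file.name holding only the last path component, so open() looks for it in the current working directory. The TypeError arises because the built-in open() accepts path objects only from Python 3.6 onward. A minimal sketch that avoids both by calling the path's own open() method follows; load_json_files is a hypothetical helper name and the demo uses a temporary folder instead of C:/foo/bar.

```python
import json
import tempfile
from pathlib import Path


def load_json_files(folder):
    """Load every *.json file directly inside `folder`, keyed by file name."""
    loaded = {}
    for file_path in Path(folder).iterdir():
        if file_path.suffix == ".json":
            # file_path still includes the folder; file_path.name alone would not.
            with file_path.open() as data_file:
                loaded[file_path.name] = json.load(data_file)
    return loaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "0001.json").write_text('{"id": 1}')
        print(load_json_files(tmp))
```

Using Path.open() works on Python 3.5 as well, which matches the setup described in the question.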