Opening a downloaded mht file from Selenium (Help needed) - selenium-chromedriver

Long story short, I'm not a coder.
My team used to have a coder who wrote this Python/Selenium code to extract some information from the Chrome browser (echocardiography reports) and/or from a downloaded mht file (also echocardiography reports).
This code was working fine until recently, when it stopped working.
The program still successfully downloads the mht file via Chrome.
However, it fails to open the file, so the code continues without extracting any information, resulting in empty extractions.
This is the part I need help figuring out:
driver.get('chrome://downloads')
# driver.get('file:///C:/Users/name/Downloads/')
root1 = driver.find_element_by_tag_name('downloads-manager')
shadow_root1 = expand_shadow_element(root1)
time.sleep(2)
root2 = shadow_root1.find_element_by_css_selector('downloads-item')
shadow_root2 = expand_shadow_element(root2)
time.sleep(1.5)
openEchoFileButton = shadow_root2.find_element_by_id('file-link')
mhtFileName = openEchoFileButton.text
driver.get('file:///C:/Users/name/Downloads/' + mhtFileName)  # go to web page
try:
    echoDateElement = WebDriverWait(driver, delay).until(
        EC.presence_of_element_located((By.XPATH, '/html/body/div[3]/p[1]/span[3]')))
except TimeoutException:
    print("Loading page took too much time!")
I'm trying to figure out why it suddenly fails to open the downloaded mht files.
The last time our team used this code was back in 2020, and it was successful.
Were there any updates to Chrome perhaps?
Help would be immensely appreciated.
Thank you so much in advance.

There are three obvious weaknesses in this code. The first two are the use of time.sleep() to wait for the element to appear and be manipulable. What if the machine is busy doing something else, and 1.5 seconds isn't enough? The right way to do that is to repeatedly check for the element to be ready. You've got a great example of how to do that using WebDriverWait() in this code already. The third weakness is the locator used in that presence_of_element_located() call. XPath locators rooted at "/html" are notoriously fragile, subject to breakage by small changes to the web page. Try to find something in the page that you can check via a more stable locator - ideally, an element with an ID= attribute.
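The "repeatedly check until ready" idea can be sketched in plain Python. The helper below is hypothetical: the name wait_until and its timeout and poll parameters are not part of the original script. It only illustrates the polling pattern that Selenium's WebDriverWait(...).until(...) applies to page elements.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], Optional[T]],
               timeout: float = 10.0,
               poll: float = 0.25) -> T:
    """Call condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Ready: return immediately instead of sleeping a fixed amount.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)
```

Unlike time.sleep(2), this returns as soon as the condition holds and fails loudly when it never does. In the script above, each time.sleep() call could be replaced in the same spirit by a WebDriverWait on the element that the next line needs.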

Related

Convert a JSON Response to pdf - NodeJS

When a GET request is sent to 'http://localhost:4000/features',
there is a response with JSON data which has HTML inside it.
I need the contents of the fields name and description to be saved as a PDF.
Sample:
[{"_id":"5ad4951d0ba1c37c65818bc7","name":"Find your work faster","description":"<p>With an improved <strong>quick search</strong>, searching through all your issues and projects will be nothing else but a breeze. Whether you know the full issue key, part of the issue name, or just have a distant memory of a project from a year ago, start typing the words, and we’ll do the rest for you. The quick search instantly shows the most relevant results, and refreshes them whenever you change your search term.</p>\n\n<p><img alt=\"\" src=\"https://confluence.atlassian.com/jirasoftware/files/945521251/945528523/1/1518181922686/quicksearch.png\" style=\"height:400px; width:800px\" /></p>\n\n<p>If you’ve already found what you were looking for, just treat quick search as a handy work diary. Click anywhere in the box to see the issues and projects you’ve been working on recently, and have the most important work always at your fingertips.</p>\n\n<p>Learn more</p>\n","__v":0},{"_id":"5ad5ddddcd054b2b5b20143c","name":"Project sidebar","description":"<p>The project sidebar that we previewed in JIRA 6.4 is here to stay. We built this new navigation experience to make it easier for you to find what you need in your projects. It's even better, if you are using JIRA Agile: your backlog, sprints, and reports are now just a click away. If you've used the sidebar with JIRA Agile before, you'll notice that cross-project boards, which include multiple projects, now have a project sidebar as well — albeit a simpler version.</p>\n","__v":0}]
Can this be done in nodeJS?
Conversion isn't quite the right word; generation is. Based on the structure of the JSON response, you can write logic on the Node.js server to generate a PDF from it.
PDFKit and pdfmake are two good libraries for this purpose.
I've used pdfmake and it is very good.
See the docs here: https://pdfmake.github.io/docs/
Use html-pdf to generate a PDF from HTML; it works on top of PhantomJS.
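var pdf = require('html-pdf');
// file is the parsed JSON array: description holds the HTML body, name gives the file name.
pdf.create(file[0].description).toFile('./' + file[0].name + '.pdf', function (err, res) {
    if (err) return console.error(err);
    console.log(res.filename);
});
Note: the sample snippet above handles only the first object in your array.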

Scan an area of a web page's source code for changes while reporting it?

This is one heck of a confusing question to ask, so here goes. Firstly, I'm not asking you to write me any code; I just need help going in the right direction for what I'm trying to achieve. Basically the task is this: I want to scan a select area of a web page's source code for changes and, if something does change, report it somewhere (like a console). However, I do not want just a notification of a change; I also want to know what the change is/was. I've been looking into things like jsoup, but I am still struggling to even find out what this is called.
Any pointers would be insanely appreciated. Thanks, Optimistic.
Here are some steps assuming this is from a node.js project:
1. Get the URL for the specific script file you're looking for a change in.
2. Using the request() module, fetch that URL.
3. Break the data up into lines (probably using .split()).
4. Find the specific line you are looking for, either by counting line numbers or by searching for some representative text in that line.
5. Using some sort of search in that line (perhaps a regex), find the current value of the exact item in that line you are looking for.
6. Save the current value.
7. Then, at some future time, repeat this whole process and compare what you find to the previous value.
If this is being done from a browser instead of node.js, then use an Ajax call to retrieve the file. If the file is on another domain from your web page and that domain does not permit cross-origin requests, then you cannot solve this problem in an automated fashion from a browser in your own web page.
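The "compare with the previous value" step can also report what changed, not only that something changed. Here is a minimal Python sketch, assuming the two snapshots have already been fetched as text (for example with urllib.request); the function name report_changes is hypothetical.

```python
import difflib
from typing import List


def report_changes(old_text: str, new_text: str) -> List[str]:
    """Return removed ('-') and added ('+') lines between two snapshots."""
    diff = list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="previous", tofile="current",
        lineterm="", n=0,  # n=0: no unchanged context lines
    ))
    # The first two entries are the '---'/'+++' file headers; '@@' lines are hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]


previous = "var price = 10;\nvar name = 'a';"
current = "var price = 12;\nvar name = 'a';"
for line in report_changes(previous, current):
    print(line)
```

Each returned line is prefixed with "-" for text that disappeared and "+" for text that appeared, which is usually enough for a console report.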
Here is how I would do it with Jsoup:
Document doc = Jsoup.connect(url).get();
String scriptCssQuery = "script"; // Tune this CSS query to find THE script you need.
Element script = doc.select(scriptCssQuery).first();
if (script != null) {
    String scriptLines = script.html();
    // Store the changing line somewhere and compare it to its previous value...
}

Verify a Tif with ApprovalTests

I have been asked to update a system where header information gets injected into a tif via a 3rd party console application. I don't need to worry about that bit.
The part I have been asked to look at is the merge process that generates the header information.
The file currently generated by the process is assumed to be correct before I make any changes, so I want to add it as an approved result. From that, I can then check that the changes I make alter the file as expected.
I thought this would be a good opportunity to look at using ApprovalTests
The problem I have is that, for whatever reason, the links to the videos are considered corruptible (possibly they would show me kittens jumping into boxes or something, which would stop me working; ironically, this slows my work down because I cannot see any help videos).
What I have been looking at is the Approvals.Verify and Approvals.VerifyFile extensions.
But what appears to be happening is confusing me.
Using VerifyFile creates a received file, but the contents of the file are just a single line containing the name of the file I have asked it to verify.
Using Verify(new FileInfo("FileNameHere")) does not appear to generate the received file that I need to flag as approved, but the test does return saying that it cannot find the approved tif file.
I am probably using VerifyFile completely wrong and might be looking at using Verify wrong as well.
useful info?
Might be useful to know, that as this is a legacy application, running as a windows service, I have wrapped the service in a harness that allows me to call the routines, so the files are physically being written elsewhere on the machine outside of my control (well there is a config, but the return of the service I call generates a file in a fixed location if it is successful). I have tried copying that into the Unit Test project, but that doesn't appear to help.
Verify(File) and VerifyFile(string) are both meant to verify an existing file. As such, they merely set the received file to the file you pass in. You will still need to move/approve/create the approved file.
Here is the pseudo code and process.
[UseReporter(typeof(DiffReporter), typeof(ClipboardReporter))]
public void TestTiff()
{
    string tif = YourProcessToCreateTifFile();
    Approvals.VerifyFile(tif);
}
[Note: if you don't have an image diff installed, like TortoiseDiff, you might want to use the FileLauncherReporter]
Run this. Once you get the result, move the file over by pasting your clipboard into a cmd window.
It will move the temporary tif to your test directory with the name ClassName.TestTiff.approved.tif
After that the test should pass until something changes.
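The received/approved cycle itself is easy to picture outside ApprovalTests. The Python sketch below is not the ApprovalTests API; verify_file and approve are hypothetical names that only illustrate the workflow: the check fails until an approved copy exists, passes while the output matches it byte for byte, and fails again when the output changes.

```python
import shutil
from pathlib import Path


def verify_file(received: Path, approved: Path) -> bool:
    """Return True only if an approved copy exists and matches byte for byte."""
    if not approved.exists():
        # First run: nothing has been approved yet, so the check must fail.
        return False
    return received.read_bytes() == approved.read_bytes()


def approve(received: Path, approved: Path) -> None:
    """Promote the received file to be the new approved baseline."""
    shutil.copyfile(received, approved)
```

The "move the file over" step described above plays the role of approve() here: a human inspects the received tif once and then promotes it.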
Happy Testing!

Close a single tab in Chrome using Batch command

I'm relatively new to batch commands and have been learning steadily. My problem is like this:
I've understood how to kill processes using batch commands using many different methods. However, I've been unable to figure out how to close a single tab in, preferably, chrome.
Any thoughts would be greatly appreciated!
Thanks!
So, I suppose I should state my exact problem.
I'm using notepad++ as my LaTeX compiler and sending the final pdf to chrome. The reason: I usually have ~20 tabs open related to the project I'm working on and it just makes my work much easier to split my screen between notepad++ and chrome.
My current batch file compiles the LaTeX code and sends the compiled document to chrome as a new tab. For obvious reasons, I don't want to close a tab each time I compile, so I thought that closing the current tab at the same time during compiling would solve my problem. But I just can't find a way to get my batch file to only close the tab with my compiled pdf.
Thanks in advance!
Check all running chrome instances/tabs with:
wmic process where "caption='chrome.exe'" get
and look at the process properties. Probably the best indicator that you can rely on in this case is CreationDate (other properties are basically the same for all chrome instances). It always comes in the format YYYYMMDDHHmmss.ms and is easy to compare as a string. But you'll have to know the time when the process was started.
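Picking the most recently started process from those CreationDate values could look like the Python sketch below. The sample values and the function names are made up for illustration, and the "+060" UTC-offset suffix is an assumption about what wmic prints; the parser simply ignores anything after the fractional seconds.

```python
from datetime import datetime


def parse_creation_date(value: str) -> datetime:
    """Parse the leading 'YYYYMMDDHHMMSS.ffffff' part of a CreationDate string."""
    return datetime.strptime(value[:21], "%Y%m%d%H%M%S.%f")


def newest_pid(processes: dict) -> int:
    """Given {pid: CreationDate string}, return the pid that started last."""
    return max(processes, key=lambda pid: parse_creation_date(processes[pid]))


# Hypothetical sample data in the shape wmic reports it.
sample = {
    4120: "20240105093000.123456+060",
    5980: "20240105101500.000001+060",
}
print(newest_pid(sample))
```

If the batch file records the time just before it opens the PDF, the process created right after that moment is the candidate to look at.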

Protect Air application content

On Mac OS, I see that all the content of my application is readable (mxml and as files).
Indeed, by right-clicking on the application, you can see all of the application content and therefore all the files.
So it's very dangerous for a company to distribute an AIR application like that.
Does a solution exist to protect those files?
Thanks
It is not possible to protect your code 100%. After all, if the computer can run it, it can be decompiled, regardless of the language. However, you can make it more difficult.
One method is to encrypt the swf, as stated in another answer. But all the "attacker" needs to do is find the key, and then they can decrypt all your swfs.
Another method is to use obfuscators. Obfuscators don't depend on encryption, nor do they prevent decompiling; they just make it harder to understand what gets decompiled.
For example, if you had a method called saveInvoice(), the obfuscator would rename it to aa1() or something like that, making it difficult to guess what that function does. It basically turns everything into spaghetti code.
You can use a decompiler to see what can be obtained from a SWF file (which is a lot), and play with obfuscators to see if they meet your expectations.
An example of one is http://www.kindi.com/ which I'm not endorsing btw, it just shows up quickly on google.
There are loads of decompilers which can read all your code. However, one person came up with an encryption solution that might be worth a try. (It's for desktop AIR applications.)
Have a look at this post: http://forums.adobe.com/message/3510525#3510525
Quoted text (in case of page being erased)
The method I use will allow you to encrypt most of your source code using
a key that is unique to every computer. The initial download of my
software is a simple AIR app that does not contain the actual program.
It is more like a shell that first retrieves a list of the client's MAC
addresses and the user-entered activation code that is created at time
of purchase. This is sent to the server and logged. The activation code
is saved to a file client side. At the server the MAC address and
activation key are used to create the encryption key. The bulk of the
program code is then encrypted using that key, then divided into parts
and sent back to the client. The client puts the parts back together
and saves the encrypted file. At runtime the shell finds the MAC
address list and the activation key, then, using the same method as the server,
gets the encryption key and decrypts the program file. Run a simple
check to make sure it loaded. For encryption I found an AES method that
works in PHP and JavaScript.
Next I use this code to load the program
var loader = air.HTMLLoader.createRootWindow(true, options, true, windowBounds);
loader.cacheResponse=false;
loader.placeLoadStringContentInApplicationSandbox=true;
loader.loadString(page);
This method makes it very difficult to copy
to another computer, although since I wrote it I know there are some
weaknesses in the security, but to make it harder I obfuscated the shell
code. It at least keeps most from pirating. However, there are issues
with this that I have found. First, I was using networkInfo to get the
list of MAC addresses, but this failed on a test Windows XP computer.
When the wireless was off it did not return the MAC. I was not able
to recreate this in Vista or 7. Not sure if it could happen. It was not
tested on a Mac computer. To fix this (at least for Windows), I
wrote a simple bat file that gets the MAC list, then converted it to
an exe which is included. This does force you to create native
installers. Call the exe with this:
var nativeProcessStartupInfo = new air.NativeProcessStartupInfo();
var file = air.File.applicationDirectory.resolvePath("findmac.exe");
nativeProcessStartupInfo.executable = file;
process = new air.NativeProcess();
// Register the listeners before starting so that no early output or exit event is missed.
process.addEventListener(air.ProgressEvent.STANDARD_OUTPUT_DATA, onOutputData);
process.addEventListener(air.ProgressEvent.STANDARD_ERROR_DATA, onErrorData);
process.addEventListener(air.NativeProcessExitEvent.EXIT, onExit);
process.addEventListener(air.IOErrorEvent.STANDARD_OUTPUT_IO_ERROR, onIOError);
process.addEventListener(air.IOErrorEvent.STANDARD_ERROR_IO_ERROR, onIOError);
process.start(nativeProcessStartupInfo);
Put the list together in the onOutputData event using array.push and
continue in the onExit event. Using findmac.exe will return the
same info every time (that I know of). Beware, though, that using the
native install will break the standard application update process, so
you will have to write your own. My updates are processed the same way
as above. These are the contents of the .bat file to get the MAC list:
@Echo off
SETLOCAL
SET MAC =
SET Media = Connected
FOR /F "Tokens=1-2 Delims=:" %%a in ('ipconfig /all^| FIND "Physical Address"') do @echo %%b
ENDLOCAL
Using this method makes it simple to implement a try-before-you-buy
method: at runtime, if there is no activation code, get the trial version from
the server instead of the full version.
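The quoted post does not say how the key is built from the MAC addresses and the activation code. As a rough illustration only, the Python sketch below uses PBKDF2 from the standard library; that choice, the function name derive_key and the iteration count are all assumptions, not the poster's scheme.

```python
import hashlib


def derive_key(mac_addresses, activation_code: str) -> bytes:
    """Derive a 32-byte key from a machine's MAC list and its activation code."""
    # Normalise and sort so the same machine always produces the same salt,
    # regardless of the order or formatting in which the MACs were reported.
    normalised = sorted(m.strip().upper().replace("-", ":") for m in mac_addresses)
    salt = ",".join(normalised).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", activation_code.encode("utf-8"), salt, 100_000)
```

Because client and server run the same derivation, the key never has to be stored next to the encrypted program. It still has to exist in memory at runtime, which is why this only raises the bar for copying and does not prevent it.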