I have written some Python code:
from bs4 import BeautifulSoup
from urllib.request import urlopen
r = input()
html = urlopen("http://www.google.co.in/" + r)  # input() already returns a str
soup = BeautifulSoup(html, "lxml")
print(soup)
How do I connect my Python file to an HTML file?
Are there any links (like JavaScript)?
Edit:
Well, going by your comment, I think you mean using Python as the backend server language and HTML as the frontend (pretty much like any website).
There are multiple ways to do that. You can create your server from scratch, using the base http.server module from Python 3 (see here for more information).
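A minimal sketch of that from-scratch approach (the handler, port, and page content here are just illustrative):

from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with a tiny HTML page
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<html><body><h1>Hello from Python</h1></body></html>")

HTTPServer(("localhost", 8000), Handler).serve_forever()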
You also have multiple ways to build a Python-based server, and frameworks are, these days, the best way to do it.
Several Python frameworks do the server job for you, letting you define routes, process data, and so on.
These frameworks include Django, Flask, Bottle, and Tornado.
A good one to start with for the server part in Python is Flask. The syntax is easy to understand, it does the job well, and there is no over-complicated extra work to do. The community is also great (see here for a good way to start learning Flask).
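To give an idea, a minimal Flask app could look like this (the route and message are placeholders of my own, not from the question):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # Return any HTML string here (or use render_template for real pages)
    return "<h1>Hello from Flask</h1>"

if __name__ == "__main__":
    app.run(debug=True)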
Old:
You need to be more precise. What do you want to achieve?
What do you mean by "connect my python file into a html file"?
An HTML file is not something you can connect to; it's just a plain file.
If you meant writing the HTML response body into a file, take a look at this for file interaction. If not, please be more precise about your problem/question.
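In case that is what you meant, here is a minimal sketch (the output filename is arbitrary):

from urllib.request import urlopen

html = urlopen("http://www.google.co.in/").read()
# read() returns bytes, so open the file in binary mode
with open("page.html", "wb") as f:
    f.write(html)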
Related
I am trying to use Node.js to implement data crawling. I used axios to GET the HTML file and cheerio to extract the data.
However, I found that the HTML comes back with only the layout and no data. I guess the website loads the layout first, then makes AJAX calls to query the data and render it.
So, does anyone know how to GET the full HTML with the data? Any library or tools?
Thanks.
I would suggest using the Selenium library together with the bs4 library in Python, if you have some experience with Python.
For Node:
https://www.npmjs.com/package/selenium-webdriver
I have written a scraper in Python using both libraries.
The scraper is for LinkedIn profiles: it takes names from an Excel file, searches for them, and if data is available, adds it to another Excel file.
https://github.com/harsh4870/Scraper_LinkedIn
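In Python, the combination described above looks roughly like this (the URL is a placeholder, and Firefox is just one possible driver):

from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Firefox()
driver.get("http://example.com")
# page_source holds the DOM after JavaScript has run
soup = BeautifulSoup(driver.page_source, "lxml")
driver.quit()
print(soup.title)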
For Node, the code goes something like this:
// inside an async function
const { Builder } = require('selenium-webdriver');
const driver = await new Builder().forBrowser('firefox').build();
await driver.get("http://example.com");
const html = await driver.getPageSource();
I came across two ways to import local JSON files into my code.
Using Angular's HTTP GET.
That's well known for loading JSON input. You can switch easily from remote to local JSON files.
TypeScript require
Another way to load JSON in TypeScript files is via require. This is simpler, as I don't have to deal with Promises/Observables. I just include them like this:
data: any = require('assets/json/my.json');
I want to know the advantages and disadvantages of these two approaches. Is there a preferred way, and why?
Hi, it depends on your requirements.
If your file is constant and will not change, then the best option is to use require().
- require() will cache your file, and when you import it again it will give you the cached copy, so it can be a bad option if you want up-to-date data, because you will not get the updated contents of that file.
But if your file is getting updated, then you have to use HTTP.
Is there a way to decompile Flash files into HTML in Python 3?
I'm using urllib to gather HTML data from a website and would like to include the Flash content, in HTML format, as part of the rest of the HTML content, preferably without downloading the file.
The few packages available are old and not made for Python 3, or are not web-based.
There are many free online decompiler tools for this, so I thought it would be easier to find code for it.
Thanks in advance!
UPDATE
As a workaround I have found this:
http://www.nowrap.de/flare.html
which is a command-line SWF-to-HTML converter. I'm considering calling it as a subprocess, unless someone has a better idea?
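Roughly what that subprocess call would look like (the binary name and argument order are assumptions on my part, untested):

import subprocess

# Assumes the flare executable is on PATH and takes the SWF path as its argument
subprocess.run(["flare", "input.swf"], check=True)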
I want to make a program that prepares an HTML file. It would either run on the server side or just on my local machine.
I think it would be nice to be able to use the dart:html library, since it has a lot of methods for manipulating HTML (obviously). But it is meant to be used dynamically on the client side, and I want to use it like this: manipulate an HTML DOM tree with dart:html, and when it's ready, write a static HTML file, for instance using query('body').innerHtml.
The problem I'm running into is that if I start a project with the "console application" template, I am not able to make dart:html talk to an HTML file. And if I choose "web application", in which I am able to do this, I cannot load the dart:io library; maybe it has to do with it being tagged as [server] in the SDK?
Of course I could just do:
print(query('body').innerHtml);
and manually copy the output to a file, but I thought maybe there is a more elegant solution.
See html5lib.
html5lib in Pure Dart
This is a pure Dart html5 parser. It's a port of html5lib from Python. Since it's 100% Dart you can use it safely from a script or server side app.
Eventually the parse tree API will be compatible with dart:html, so the same code will work on the client or the server.
It doesn't support much in the way of queries yet.
On an Ubuntu platform, I installed the nice little Perl library
libtext-mediawikiformat-perl - Convert Mediawiki markup into other text formats
which is available on CPAN. I'm not familiar with Perl and have no idea how to use this library to write a Perl script that would convert a MediaWiki file to an HTML file. E.g., I'd like to have a script I can run such as
./my_convert_script input.wiki > output.html
(perhaps also specifying the base URL, etc.), but I have no idea where to start. Any suggestions?
I believe @amon is correct that the Perl library I referenced in the question is not the right tool for the task I proposed.
I ended up using the MediaWiki API with action="parse" to convert to HTML using the MediaWiki engine, which turned out to be much more reliable than any of the alternative parsers I tried from those proposed on the list. (I then used pandoc to convert my HTML to Markdown.) The MediaWiki API handles extraction of categories and other metadata too, and I just had to append the base URL to internal image and page links.
Given the page title and base url, I ended up writing this as an R function.
wiki_parse <- function(page, baseurl, format="json", ...){
  require(httr)
  action = "parse"
  addr <- paste(baseurl, "/api.php?format=", format, "&action=", action, "&page=", page, sep="")
  config <- c(add_headers("User-Agent" = "rwiki"), ...)
  out <- GET(addr, config=config)
  parsed_content(out)
}
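For comparison, the same API call sketched in Python with requests (the parameters mirror the R function above; this is my own translation, not part of the original answer):

import requests

def wiki_parse(page, baseurl, fmt="json"):
    # Same MediaWiki API call as the R function above
    params = {"format": fmt, "action": "parse", "page": page}
    headers = {"User-Agent": "rwiki"}
    return requests.get(baseurl + "/api.php", params=params, headers=headers).json()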
The Perl library Text::MediawikiFormat isn't really intended for stand-alone use but rather as a formatting engine inside a larger application.
The documentation at CPAN does show how to use this library, and notes that other modules might provide better support for one-off conversions.
You could try this (untested) one-liner
perl -MText::MediawikiFormat -e'$/=undef; print Text::MediawikiFormat::format(<>)' input.wiki >output.html
although that defies the whole point (and customization abilities) of this module.
I am sure that someone has already come up with a better way to convert single MediaWiki files, so here is a list of alternative MediaWiki processors on the MediaWiki site. This SO question could also be of help.
Other markup languages, such as Markdown, provide better support for single-file conversions. Markdown is especially well suited for technical documents and mirrors email conventions. (Also, it is used on this site.)
The libfoo-bar-perl packages in the Ubuntu repositories are precompiled Perl modules. Usually, these would be installed via cpan or cpanm. While some of these libraries do include scripts, most don't, and aren't meant as stand-alone applications.