I am having difficulty rendering my static files through the CSS background-image property.
For example, take the following piece of code:
<section class="banner-area" style="background-image: url('../../static/main/img/search4.jpg')">
When I inspect the element in my browser, I can see that it is not linked to my AWS S3 bucket, and the URL remains exactly as written:
url('../../static/main/img/search4.jpg')
However, if I render an image using the src attribute of an img tag, I get the behaviour I want, for example:
<img src="{% static 'main/img/search4.jpg' %}"/>
Here, when inspecting the element, I can see in the browser that it is linked to my S3 bucket:
src="https://mybucket-bucket.s3.amazonaws.com/main/img/search4.jpg?ETC...."
In my settings.py:
STATIC_ROOT = os.path.join(BASE_DIR, 'staticfiles')
STATICFILES_DIRS = (
    os.path.join(BASE_DIR, "main/static"),
)
MEDIA_ROOT = os.path.join(BASE_DIR, 'static/img')
MEDIA_URL = "/media/"
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
STATICFILES_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
STATIC_URL = 'https://mybucket-bucket.s3.us-east-2.amazonaws.com/'
Can you please advise?
Kind regards,
Marvin
Try changing
url('../../static/main/img/search4.jpg')
to
url('{% static 'main/img/search4.jpg' %}')
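The reason is that the browser resolves a relative URL in an inline style against the URL of the page, and your site is hosted on a different domain than your S3 bucket, so the relative URL can never point at the bucket.
For example, if your website is at example.com, the browser tries to go up two path components (the parts of the URL separated by forward slashes, e.g. /component1/component2/) from the page's location. At the site root there is nothing left to remove, so the .. segments are simply dropped and the image is requested from example.com instead of S3. You can still use a relative path inside your CSS file, since that file is hosted on the same domain as your image.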
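To see why the relative URL never reaches the bucket, here is a small sketch using Python's standard urllib.parse.urljoin, which follows the same resolution rules (Python 3.5 or newer). The page URL and the stylesheet location main/css/style.css are assumptions made up for illustration; only the image path and bucket host come from the question.

```python
from urllib.parse import urljoin

RELATIVE = "../../static/main/img/search4.jpg"

# Hypothetical page URL on the site's own domain.
page_url = "https://example.com/"
# The inline style is resolved against the page, so the result stays on example.com.
from_page = urljoin(page_url, RELATIVE)

# Hypothetical stylesheet location inside the bucket.
css_url = "https://mybucket-bucket.s3.amazonaws.com/main/css/style.css"
# A url() inside the stylesheet is resolved against the stylesheet itself.
from_css = urljoin(css_url, "../img/search4.jpg")

print(from_page)
print(from_css)
```

The first result stays on the site's domain, while the second lands inside the bucket, which is why the same kind of relative path can work in a CSS file but not in an inline style.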
I am looking for a way to parse the images on a web page. Many posts already exist on the subject, and I was inspired by several of them, in particular:
How Can I Download An Image From A Website In Python
The script presented in that post works very well, but I have encountered a type of image whose saving I cannot automate. On the website, inspecting the page gives me:
<img class="lazy faded" data-src="Uploads/Media/20220315/1582689.jpg" src="Uploads/Media/20220315/1582689.jpg">
And when I parse the page with BeautifulSoup4, I get this (fonts.gstatic.com Source section content):
<a class="ohidden" data-size="838x1047" href="Uploads/Media/20220315/1582689.jpg" itemprop="contentUrl">
<img class="lazy" data-src="Uploads/Media/20220315/1582689.jpg" />
</a>
The given URL is not a complete web URL that can be used to download the image from anywhere, but a link into the "Sources" section of the web page (Ctrl + Shift + I on the page), where the image is.
When I hover over the src link in the page's source code, I can see the true complete URL under "Current source". This information is located in the Elements/Properties panel of the DevTools (Ctrl + Shift + I on the page), but I don't know how to automate saving the images, either by using the link to reach the page sources directly or by obtaining the complete address to download them. Do you have any ideas?
PS: I found this article about lazy fading images, but my HTML knowledge isn't enough to find a solution to my problem (https://davidwalsh.name/lazyload-image-fade)
I'm not too familiar with web scraping or the benefits. However, I found this article here that you can reference and I hope it helps!
Reference
However, here is the code and everything you need in one place.
First you have to pick the web page you want to download the images from, which is up to you.
Now we collect the links: create an empty list, open the page, select the matching link elements, loop through them, and append each link's address to the list.
import urllib.request
from bs4 import BeautifulSoup

url = ""  # the address of the page you want to scrape
link_list = []
response = urllib.request.urlopen(url)
soup = BeautifulSoup(response, "html.parser")
image_list = soup.select('div.boxmeta.clearfix > h2 > a')
for image_link in image_list:
    link_url = image_link.attrs['href']
    link_list.append(link_url)
In theory, this selects every a element matched by the CSS selector and appends its href attribute, the address of a page that contains images, to the list.
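Now we have to visit each linked page and get its image tags.
import re
from pathlib import Path
import requests

output_folder = Path("images")  # placeholder: any writable directory
output_folder.mkdir(exist_ok=True)

for page_url in link_list:
    page_html = urllib.request.urlopen(page_url)
    page_soup = BeautifulSoup(page_html, "html.parser")
    img_list = page_soup.select('div.seperator > a > img')
This selector should match every img element that is a direct child of an a element, which in turn is a direct child of a div with the class seperator. Still inside the page loop, derive a file name from each image URL:
    for img in img_list:
        img_url = img.attrs['src']
        file_name = re.search(".*/(.*png|.*jpg)$", img_url)
        if file_name is None:
            continue  # skip anything that is not a .png or .jpg
        save_path = output_folder.joinpath(file_name.group(1))
Finally, still inside the inner loop, we try to download the image inside a try/except block.
        try:
            image = requests.get(img_url)
            with open(save_path, 'wb') as f:
                f.write(image.content)
            print(save_path)
        except ValueError:
            print("ValueError!")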
I think you are asking about relative versus absolute paths.
Something like Uploads/Media/20220315/1582689.jpg is a relative path.
The main difference between absolute and relative paths is that absolute URLs always include the domain name of the site with http://www. Relative links show the path to the file or refer to the file itself. A relative URL is useful within a site to transfer a user from point to point within the same domain. --- ref.
So in your case try this:
import requests
from bs4 import BeautifulSoup
from PIL import Image

URL = 'YOUR_URL_HERE'
r = requests.get(URL)
soup = BeautifulSoup(r.text, 'html.parser')
for img in soup.find_all("img"):
    # Lazy-loaded images keep their path in data-src; fall back to src
    relative_path = img.get('data-src') or img.get('src')
    if not relative_path:
        continue
    # Get the image absolute path url
    absolute_path = requests.compat.urljoin(URL, relative_path)
    # Download the image
    image = Image.open(requests.get(absolute_path, stream=True).raw)
    image.save(absolute_path.split('/')[-1].split('?')[0])
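If you want to check which URLs would be downloaded before fetching anything, here is a rough sketch that uses only the standard library. The page address is a made-up placeholder and the class and function names are hypothetical; the sample markup is the snippet from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LazyImageCollector(HTMLParser):
    """Collect absolute image URLs, preferring data-src over src."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Lazy-loading scripts usually keep the real path in data-src.
        relative = attributes.get("data-src") or attributes.get("src")
        if relative:
            self.image_urls.append(urljoin(self.page_url, relative))


def collect_image_urls(page_url, html):
    """Return the absolute URL of every img found in the given HTML."""
    collector = LazyImageCollector(page_url)
    collector.feed(html)
    return collector.image_urls


PAGE_URL = "https://example.com/gallery/page.html"  # hypothetical page address
SAMPLE_HTML = (
    '<a class="ohidden" href="Uploads/Media/20220315/1582689.jpg">'
    '<img class="lazy" data-src="Uploads/Media/20220315/1582689.jpg" /></a>'
)
image_urls = collect_image_urls(PAGE_URL, SAMPLE_HTML)
print(image_urls)
```

Each relative path is resolved against the address of the page it was found on, so the resulting URLs can be passed to requests.get from any location.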
I have a Go HTTP web server and I'm loading static assets like so:
http.Handle("/assets/", http.StripPrefix("/assets/", http.FileServer(http.Dir("assets/"))))
The assets directory exists in the directory the web server runs from, and the image file assets/images/logo.svg exists.
If I try going to http://localhost/assets/images/logo.svg it redirects to http://localhost/.
From an HTML page I have the following:
<img src="assets/images/logo.svg">
This fails to load the image.
I then tried the following as well with no luck:
<img src="./assets/images/logo.svg">
<img src="//localhost/assets/images/logo.svg">
I'm unsure what I'm doing wrong in hosting static files and using them from HTML.
EDIT
I've added the code for everything here.
Along with a photo showing the broken images.
Try changing the line from:
http.Handle(
    "/assets/",
    http.StripPrefix(
        "/assets/",
        http.FileServer(http.Dir("assets/")),
    ),
)
to
http.Handle(
    "/assets/",
    http.StripPrefix(
        "/assets/",
        http.FileServer(http.Dir("./assets/")),
    ),
)
Please note that your img src should then be something like assets/images/logo.svg
EDITED:
The below image is the response to the comment link:
I have a problem with my project.
I'm working on a university project based on Django, and we don't really know HTML or CSS at all.
I have a small directory hierarchy:
-> Project
    -> LibProject
        -> templates
            -> LibProject
                -> main.html
    -> images
        -> logo.png
Here LibProject and images are at the same level.
I would like to add logo.png inside the main.html file.
I've tried going 3 directories up and then into images/logo.png, but it doesn't work.
PS: It has to be a static image.
Can you suggest a solution? Thanks in advance!
You need to add a few settings first.
Add
STATIC_URL = '/static/'
to your settings.py.
Add
STATICFILES_DIRS = [
    os.path.join(BASE_DIR, "static"),  # this is your static files dir
]
to your settings.py.
In your urls.py file, add this:
from django.conf import settings
from django.conf.urls.static import static

urlpatterns = [
    # ... your URL patterns ...
] + static(settings.STATIC_URL, document_root=settings.STATIC_ROOT)
For more details, please read the Django docs:
https://docs.djangoproject.com/en/2.2/howto/static-files/
I'm making a Django app in which I need to embed many external HTML files in the template. Each HTML file is stored in its own directory, along with the subdirectory that contains all the images. The file structure:
Abstract1
    Pictures
        image1.png
        image2.png
    abstract1.html
I use a custom template tag for embedding (see below). My problem: the HTML files are loaded, but linked resources (e.g. img) are not displayed. The HTML files use relative URLs, which, combined with the Django template base path, produce invalid URLs, but even if I use hardcoded absolute URLs the problem remains. I feel like I'm missing something obvious. Is there some proper (or not proper but working) way to overcome such a problem?
template
{% load abstracts_extras %}
<!DOCTYPE html>
<html>
<body style="margin-left:10px">
<h2>{{abstract}}</h2>
<b>Authors:</b><br/>
<ul>
{% for author in authors %}
<li>{{author}}</li>
{% endfor %}
</ul>
<p>
<b>Title: </b>{{abstract.title}}
</p>
<hr>
{% include_external filename|add:'.html' %}
</body>
</html>
abstracts_extras
from django.template import Library
register = Library()
def include_external(url):
    url = 'file:///' + url
    import urllib2
    return urllib2.urlopen(url).read()
If I understand correctly, your templates load but static files such as images do not.
That points to a configuration error.
You should check both settings.py for Django and httpd.conf for Apache, and make sure static files are configured properly.
Is any error shown, or do the images simply fail to load without an error?
I am making a project in Django. I am able to render HTML pages, but I am not able to show images on those pages.
<img src= "home/ashish/PycharmProjects/django/webapp/family/static/images/error-img.png" />
It shows this error:
Resource interpreted as Image but transferred with MIME type text/html
when I request the URL. Request URL: http://127.0.0.1:8002/home/ashish/PycharmProjects/django_webapp/family/static/images/error-img.png
If I point src at an online image instead, it works fine.
settings.py
STATIC_URL = '/static/' # You may find this is already defined as such.
STATICFILES_DIRS = (
    STATIC_PATH,
)
MEDIA_URL = '/media/'
All the images are inside app/static/images.
Thanks
You have not defined static file URLconf in your urls.py file. You can define them for development server use, like this:
from django.contrib.staticfiles.urls import staticfiles_urlpatterns
...
urlpatterns += staticfiles_urlpatterns()
And then you can view your static files at:
http://127.0.0.1:8002/static/images/error-img.png
See: Django docs on Managing static files
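To make the difference between a file path and a static URL concrete, here is a small sketch of the mapping. The helper name static_url_for is hypothetical and not part of Django, and the location of the static directory is an assumption based on the path in the question (with a leading slash added); in a template you would normally let the static tag build this URL for you.

```python
from pathlib import PurePosixPath

STATIC_URL = "/static/"  # same value as in the settings above

# Assumed location of the app's static directory, based on the question.
STATIC_DIR = PurePosixPath("/home/ashish/PycharmProjects/django/webapp/family/static")


def static_url_for(file_path):
    """Map a file inside STATIC_DIR to the URL it is served under."""
    relative = PurePosixPath(file_path).relative_to(STATIC_DIR)
    return STATIC_URL + relative.as_posix()


image_url = static_url_for(
    "/home/ashish/PycharmProjects/django/webapp/family/static/images/error-img.png"
)
print(image_url)
```

Only the part of the path below the static directory appears in the URL, which is why the full filesystem path in the src attribute returns an HTML error page instead of the image.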