Why is the data attribute being used in this way? - html

Learning how to utilize Bootstrap, I noticed that the thumbnails had strange markup for the image source (at least, strange to me.)
<img data-src="holder.js/260x120" alt="260x120" style="width: 260px; height: 120px;" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAQQAAAB4CAYAAAAUn4wEAAAGP0lEQVR4Xu3aPUsdWxQG4LEwKiSFTbQTsUwsRfDvp7KRVBFrEVIEsRFT+HHvHJjDODrH96ghQ9aT5nLD8rjXs/Z+3TNx5fLy8qHxhwABAv8LrAgE+4AAgU5AINgLBAjMBQSCzUCAgECwBwgQeCrghmBXECDghmAPECDghmAPECCwQMAjg+1BgIBHBnuAAAGPDPYAAQIeGewBAgQSAe8QEiU1BIoICIQig9YmgURAICRKaggUERAIRQatTQKJgEBIlNQQKCIgEIoMWpsEEgGBkCipIVBEQCAUGbQ2CSQCAiFRUkOgiIBAKDJobRJIBARCoqSGQBEBgVBk0NokkAgIhERJDYEiAgKhyKC1SSAREAiJkhoCRQQEQpFBa5NAIiAQEiU1BIoICIQig9YmgURAICRKaggUERAIRQatTQKJgEBIlNQQKCIgEIoMWpsEEgGBkCipIVBEQCAUGbQ2CSQCAiFRUkOgiIBAKDJobRJIBARCoqSGQBEBgVBk0NokkAgIhERJDYEiAgKhyKC1SSAREAiJkhoCRQQEQpFBa5NAIiAQEiU1BIoICIQig9YmgURAICRKaggUERAIRQatTQKJgEBIlNQQKCIgEIoMWpsEEgGBkCipIVBEQCAUGbQ2CSQCAiFRUkOgiIBAmOCgT09Pm4uLi/nKPn361BweHj5a6eXlZfP9+/fm/v5+9vfP1ZyfnzdnZ2fzr9vd3W329vZe3fHJyUnTft+tra1mf3//0ecka37v9by6EV84KiAQJrY5ukM3XFb/wA/DoKvtH9Th4etqXhsK/c8bBkKy5vdez8TG9s8sRyBMaJS3t7fNt2/fmru7u6Y7uP2D9PXr1+bz58/N8fFxc319/aRmZWWlOTo6atbW1uY13eHtfoKvr6/Pbhurq6tx58MD3w+EZdf8HuuJF65waQGBsDTZn/uC7vB3B3tjY6MZHridnZ1ZaLSPCu3hb2uGf/o3iDZEtre3Z1f99hHj4eFh9nU3NzfzR44ufIZf1w+f9nu0YfL79+9HjwzJmjc3N+ffa2w9z/Xx56R98piAQJj43hge0g8fPswOV/vf9oBeXV3NOuj/1B4e/mGwdIey+8nffs7BwUHT/n978+g+qw2d9jby8ePH5suXL09uHWN0Y2vuwmhsPRMfRYnlCYQJj7k7kO0h7d4hjD2Lt20MaxbdNNqXi/3bR3uL+PnzZ9P/mj5Nfy3PvVTsahet+aX1THgUZZYmECY66v7B6h+kfiAM3zN0db9+/Zr960JyANM3/0kgvLTmZD0THUeZZQmECY567GC1S00eB7rHiuSK3r8ljN0O2u/7UiC8dc3tDcWfvy8gEP7+DB6tYNHBWjYQ2s966SXe8PcHxh4HFgVCuuZkPRMbR7nlCISJjbx70bfMs/zwTX/6z47920b70vDHjx+zm0AXIuk7hNes+S3/DDqxkf1TyxEIExrn2C8cdUt87ncT+stf5heTnvuJv+iQjt0Q3rrm1/6i1ITG9k8tRSBMaJzD6/twaf3DMzyIz131F70w7L7X2Iu+4eeNBcIya05fYE5oJOWWIhDKjVzDBMYFBILdQYDAXEAg2AwECAgEe4AAgacCbgh2BQECbgj2AAECbgj2AAECCwQ8MtgeBAh4ZLAHCBDwyGAPECDgkcEeIEAgEfAOIVFSQ6CIgEAoMmhtEkgEBEKipIZAEQGBUGTQ2iSQCAiEREkNgSICAqHIoLVJIBEQCImSGgJFBARCkUFrk0AiIBASJTUEiggIhCKD1iaBREAgJEpqCBQREAhFBq1NAomAQEiU1BAoIiAQigxamwQSAYGQKKkhUERAIBQZtDYJJAICIVFSQ6CI
gEAoMmhtEkgEBEKipIZAEQGBUGTQ2iSQCAiEREkNgSICAqHIoLVJIBEQCImSGgJFBARCkUFrk0AiIBASJTUEiggIhCKD1iaBREAgJEpqCBQREAhFBq1NAomAQEiU1BAoIiAQigxamwQSAYGQKKkhUERAIBQZtDYJJAICIVFSQ6CIgEAoMmhtEkgEBEKipIZAEQGBUGTQ2iSQCAiEREkNgSICAqHIoLVJIBEQCImSGgJFBARCkUFrk0AiIBASJTUEiggIhCKD1iaBREAgJEpqCBQREAhFBq1NAonAfyCREfyopr43AAAAAElFTkSuQmCC">
What's going on here, and why is this being done? Is the image somehow saved to local storage at some point in base64?
To clarify, I'm asking about the src="data:image/png;base64,..." part.

What you are seeing is not the HTML5 data- attribute, but the data URI scheme. To quote Wikipedia:
The data URI scheme ... provides a way to include data in-line in web
pages as if they were external resources. This technique allows
normally separate elements such as images and style sheets to be
fetched in a single HTTP request rather than multiple HTTP requests,
which can be more efficient.
What you're seeing is the base64-encoded image data, in this case a PNG. When browsers see this, they decode the data as instructed, and display it as if it were an external resource.
Given this image's size, the creators of Bootstrap rightly believe it is more efficient to inline the image like this rather than keeping it separate. Had they kept it separate, it would require an additional HTTP request to load the image, which increases the total load time of the page.
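To make the encoding concrete, here is a minimal Python sketch of building and decoding such a URI. The helper names to_data_uri and from_data_uri are made up for illustration, and the payload is just the 8-byte PNG signature rather than a real image.

```python
import base64
from typing import Tuple


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data URI back into its MIME type and raw bytes."""
    header, _, payload = uri.partition(",")
    mime_type = header[len("data:"):].split(";")[0]
    return mime_type, base64.b64decode(payload)


# The first 8 bytes of every PNG file; this is why base64-encoded PNGs
# start with "iVBORw0KGgo", as in the markup from the question.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(PNG_SIGNATURE)
print(uri)
```

A browser does the equivalent of from_data_uri when it meets such a src value: it reads the MIME type, decodes the payload, and treats the result as if it had been fetched.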

In the case of Bootstrap, what I think you are seeing is JavaScript being used to generate the data that goes in the src attribute.
If you look at the raw source (not the source in the inspector), you will probably not see the src attribute, only data-src.
The data-src attribute is an instruction to the holder.js script to generate the data that goes in src.
So holder.js generates the image, which is then loaded into the img as a data URI, as explained by the other answers.

Related

HTML reference an image using extension only without using explicit image file name

I have a question about referencing an image in HTML. I have a single snippet of HTML code, shown below, where I need to reference an image in a folder called static. The image has an .svg extension, but its name is dynamically created. Is there a way for HTML to refer to this image by only the file extension? The code below, which uses a wildcard, doesn't work.
<p>
<img src="/static/*.svg" width="1000">
</p>
Additionally, can we add logic in HTML such that if there is no SVG file in the static folder, nothing is rendered, and if there is one, it is rendered?
Thank you.
I don't think it's possible to do what you're asking with just HTML. However, you can easily do this with JavaScript by adding an id to the img tag and setting its src attribute based on your condition. There are plenty of examples on the web of how to do this, including one already answered here: Javascript set img src
This is possible with a server that is designed for this purpose. No feature of HTML alone will do this, however. If you don't control the server on the backend, you probably can't get this to work, as it will most likely require custom backend code.
On the backend, you make a simple static HTTP server that matches file patterns, then determines and serves the best match. You can do this any number of ways, and if you look up "how to make a static http server" for your favorite backend language, you will likely find an example to get you started, like these:
https://www.digitalocean.com/community/tutorials/how-to-create-a-web-server-in-node-js-with-the-http-module
https://blog.appsignal.com/2016/11/23/ruby-magic-building-a-30-line-http-server-in-ruby.html
https://stackabuse.com/serving-files-with-pythons-simplehttpserver-module/
You would then have to modify whatever base example you chose with your custom pattern matching code. Your server could be designed to do whatever you wanted with the request you send it, including the scenario you described.
This works because an img tag like the one in your example causes the browser to make an HTTP GET request to your server, passing along the URL in the src attribute. So, if you control that server, you can have it respond in any way you want, including treating specific characters like * specially.
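As a rough illustration of the server-side lookup, here is a small Python sketch. It assumes the page is rendered on the server, which can list the static folder; find_svg and render_img_tag are hypothetical names, and picking the alphabetically first file is an arbitrary choice.

```python
import html
import os
from typing import Optional


def find_svg(static_dir: str) -> Optional[str]:
    """Return the name of the first .svg file in static_dir, or None."""
    names = sorted(
        name for name in os.listdir(static_dir)
        if name.lower().endswith(".svg")
    )
    return names[0] if names else None


def render_img_tag(static_dir: str, url_prefix: str = "/static/") -> str:
    """Build the <img> tag only when an SVG exists; otherwise render nothing."""
    name = find_svg(static_dir)
    if name is None:
        return ""
    src = html.escape(url_prefix + name, quote=True)
    return f'<img src="{src}" width="1000">'
```

The server would call render_img_tag when building the page, so the browser only ever sees a concrete file name, or no img tag at all when the folder has no SVG.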

Can browsers cache the embedded base64 images also?

I was wondering whether any modern browsers actually cache embedded images (base64 strings).
Also, is that a possibility in the near future, based on the official documents from either the W3C or the major browsers?
I don't think so, because you're missing a resource identifier to use as the key for the cached image. With embedded images you only have the data itself.
Furthermore a potential conditional request for inlined images must be at the level of the HTML document containing it. The inlined image is just data with no additional request. But HTTP does not support something like conditional requests for parts of the data.
As I understand it, if the base64 string is part of the HTML document (inline), then it has to be both downloaded and parsed as an image each time the document is downloaded; there is no way to cache fragments of documents. If it is a background image in an external CSS file, then it can be cached with the CSS file, but it will still need to be parsed with every request. I have also read that base64 encoding adds roughly a third of overhead on top of the image bytes (it emits 4 characters for every 3 bytes), but this can largely be negated by gzipping.
Browsers can cache downloaded files. If the base64 string is in a text (or JSON) file, then it can be cached. This data can then be used directly in the HTML (or, if JSON, parsed with JavaScript and used with HTML).
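The overhead figure is easy to check. The sketch below is only an illustration: it uses 3000 random bytes as a stand-in for already-compressed image data (an assumption, not a real image), and inline_sizes is a made-up helper name.

```python
import base64
import gzip
import os


def inline_sizes(raw: bytes) -> dict:
    """Compare the size of raw bytes, their base64 form, and gzipped base64."""
    encoded = base64.b64encode(raw)
    return {
        "raw": len(raw),
        "base64": len(encoded),
        "base64_gzipped": len(gzip.compress(encoded)),
    }


# Random bytes approximate compressed image data, which gzip cannot shrink.
sizes = inline_sizes(os.urandom(3000))
print(sizes)
```

Every 3 input bytes become 4 base64 characters, so 3000 bytes always encode to 4000 characters. Gzip then recovers most of that, because base64 text uses only 64 distinct symbols.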

Image with parameter - HTML

I do not know how to frame the question, and I do not know how the tag below works:
<img src="img.png?value=23"/>
This tag works fine and renders the image correctly.
Does value=23 have any effect, or is it ignored by the browser?
I do not even know how to Google this. Is it like passing a parameter to the image? If so, how do I retrieve the value? Does the parameter make any sense?
It depends on the server. If the server is configured to treat .png files as text files containing PHP code and to run them through the PHP parser, then the parameter has an effect.
It really depends on the configuration of the server, not the browser.
Moreover, mod_rewrite can be used to map URLs that look like .png files to PHP files.
Parsing .png files with the PHP parser:
AddType application/x-httpd-php .png
Using mod_rewrite:
RewriteEngine On
RewriteRule ^([a-zA-Z0-9_\-]*)\.png$ img.php?value=$1
With these lines, asdfasdf.png will be treated as img.php?value=asdfasdf.
So in this case, $_GET['value'] is set whether you request asdfasdf.png or img.php?value=asdfasdf, and it will have an effect.
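To see what that RewriteRule pattern does to a request path, here is the same regular expression applied in Python. This only imitates the pattern match; it is not how Apache itself is implemented, and rewrite is a hypothetical helper.

```python
import re

# Same pattern as the RewriteRule above: a name made of letters, digits,
# underscores or hyphens, followed by ".png".
RULE = re.compile(r"^([a-zA-Z0-9_\-]*)\.png$")


def rewrite(path: str) -> str:
    """Map name.png to img.php?value=name; leave other paths untouched."""
    match = RULE.match(path)
    if match is None:
        return path
    return f"img.php?value={match.group(1)}"


print(rewrite("asdfasdf.png"))
```

Paths that don't match the pattern, such as a .jpg file or a name containing a dot, fall through unchanged.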
If the server is not configured to do such things and images are just images (yes, I know it's a brilliant sentence), then the parameter has no effect on an ordinary image.
To sum up:
It depends on the server configuration, not on the browser.
If this image is dynamic in some way, then the server that's hosting this image must be generating the image from PHP code.
Take a look at the GD library, which lets you use PHP to generate images based on nothing or other images. The parameter must be passed to include that value inside the image (for example, an image that includes the text "123" or calculates using it somehow, for example a user ID).
Then an .htaccess on the server rewrites the extension of .png to .php (or maybe another one) to make it look like a genuine image to some libraries and crawlers, or scripts etc.
Another option is that this is a simple image and the value is being ignored, or is just a random value to make sure the image isn't cached.
The value=23 only has an effect if the server uses it. The browser requests http://example.com/img.png?value=23, so the server will see the parameter.
For example, with PHP, if you use $_GET['value'] and that variable changes which image is sent, then the value=23 is needed.
Parameters are often sent with images to specify a height or width, or to determine which image is loaded.
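For illustration, this is roughly what a server sees when it takes that URL apart, sketched in Python rather than PHP. image_request_params is a made-up helper; what the server does with the value afterwards is entirely up to its own code.

```python
from urllib.parse import parse_qs, urlsplit


def image_request_params(url: str):
    """Split an image URL into its path and a dict of query parameters."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep the first value for simplicity.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.path, params


path, params = image_request_params("http://example.com/img.png?value=23")
print(path, params)
```

A static file server would use only the path and drop the parameters, while a dynamic handler could read params["value"] to pick or generate an image.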
I don't know what the parameter means in this example.
But it's possible to write
<img src="path_img.png?<?php echo time(); ?>" />
to force the browser to download the resource instead of using its cache.
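The same cache-busting idea, sketched in Python for comparison. The parameter name t and the helper cache_busted are assumptions made for this example; any name works as long as the resulting URL changes between requests.

```python
import time
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url: str, now: Optional[float] = None) -> str:
    """Append the current Unix timestamp as a query parameter."""
    stamp = int(time.time() if now is None else now)
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("t", str(stamp)))  # "t" is an arbitrary parameter name
    return urlunsplit(parts._replace(query=urlencode(query)))


print(cache_busted("path_img.png"))
```

Because the URL differs on every call, the browser treats each one as a new resource and cannot reuse a cached copy.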
It sure does!
For example, take a look at this piece of software that is intended to dynamically resize an image.
http://imageresizing.net/docs/basics
If done correctly, adding parameters to an image url could be very useful.
Edit:
As others point out, you need to ensure that the server knows how to handle the extra parameter. In this case it is intended to resize/watermark/rotate an image. It can certainly do other wonderful things.
value=23 is not ignored by your browser but by the server.
It's like passing parameters to the web server. For images they will mostly be ignored.
The arguments in a URL are mostly used to get information about a specific item, but they can be used in other ways. For images, the browser won't ignore the argument value=23, but the server you are using will.
But if the image is dynamic in some way, then it may be used to change the image's URL or other things.

Extracting meta tags attribute using wget

I have a file with one URL per line. I need to extract the "keywords" present in the meta tags, i.e. if there is a meta tag for "keywords", then I want to get its "content" value. Example: if the web page has the meta tag <meta name="keywords" content="wikipedia,encyclopedia">, then for that URL I want "wikipedia,encyclopedia" to be extracted.
One approach is to download the web page using "wget" and then parse it using some standard HTML parser.
I was wondering whether there is a better way to do this without downloading the entire web page.
What you described is the simplest solution to implement.
If you are worried about the network traffic generated, you could write a small program that only reads the head of the document. As soon as you read the <body..> tag, you can stop downloading.
Update: You have to set a very small receive buffer for your socket, otherwise the kernel will probably still download the whole page. Verify your solution with tcpdump.
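Here is a rough Python sketch of the parsing half of that idea. It assumes the page arrives as an iterable of already-decoded text chunks; the socket handling and small receive buffer are left out, and KeywordsFinder and extract_keywords are hypothetical names.

```python
from html.parser import HTMLParser


class KeywordsFinder(HTMLParser):
    """Collects the keywords meta tag and notes when to stop reading."""

    def __init__(self):
        super().__init__()
        self.keywords = None
        self.done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self.keywords = attrs.get("content")
            self.done = True
        elif tag == "body":
            self.done = True  # past the head, nothing more to find

    def handle_endtag(self, tag):
        if tag == "head":
            self.done = True


def extract_keywords(chunks):
    """Feed chunks to the parser and stop as soon as the answer is known."""
    finder = KeywordsFinder()
    for chunk in chunks:
        finder.feed(chunk)
        if finder.done:
            break  # the caller can close the connection here
    return finder.keywords
```

The parser buffers incomplete tags between calls to feed, so a tag that is split across two chunks is still recognized once the rest arrives.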

How does Flickr prevent people from downloading images from the site?

Just wondering how does Flickr prevent people from downloading images from its site? What are they using?
Transparent .gif over the image. You can still download the actual image by viewing the HTML source and finding the image's actual URL.
For example, a random image:
http://www.flickr.com/photos/34285128@N00/4300352607/
<img style="position:absolute;top:0px;left:0px;display:block" src="http://l.yimg.com/g/images/spaceball.gif" alt="" width="500" height="366">
That's the transparent image on top.
<img src="http://farm5.static.flickr.com/4057/4300352607_edcc5a4a9e.jpg" alt="Say It With Flowers by *sido* (back in a few days)." title="" width="500" height="366" class="reflect">
That's the actual image, which is displayed below spaceball.gif.
Not to thread dump, but conceptually, if you are really trying to lock out downloading an image, you could (I think). Using a framework like ASP.NET MVC, you could tag the image with a unique key, store the key either in memory or in some other form of persistence, and pass it down to the client with the key as the filename. On the returning end, when the file is requested, you could intercept the request for the image and look up the key to match it to the actual file. Once you have the file, you return the image as a custom result with the appropriate meta tags (at least in MVC; I'm not sure how you'd do it elsewhere). Before you return it, though, you flag the result as having been viewed.
It would be a great deal of work on the server, but it would require a great deal of effort for anyone to snag the image if you utilized Flickr's transparent gif technique in conjunction with it.
The idea being that a single request would be issued on a normal view and any further attempts to view the image directly (by viewing source and grabbing the url) would be blocked.
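A bare-bones Python sketch of that one-time key idea follows. It keeps the keys in memory only, and the class and method names are invented for illustration; a real site would need persistence, expiry, and a request handler wired around it.

```python
import secrets
from typing import Optional


class OneTimeImageKeys:
    """Maps single-use keys to image paths (in-memory, illustrative only)."""

    def __init__(self):
        self._paths = {}

    def issue(self, image_path: str) -> str:
        """Create a fresh key to put in the page in place of the real file name."""
        key = secrets.token_urlsafe(16)
        self._paths[key] = image_path
        return key

    def redeem(self, key: str) -> Optional[str]:
        """Return the image path on first use; None on any later attempt."""
        return self._paths.pop(key, None)


store = OneTimeImageKeys()
key = store.issue("photos/4300352607.jpg")
first = store.redeem(key)   # the page's own image request succeeds
second = store.redeem(key)  # a later direct request with the same key fails
```

Removing the key on first use is what blocks a second fetch of a URL copied out of the page source.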
<./threadump>
Sorry, just had the idea and wanted to add it to the already answered question (sleep deprivation and all that jazz).