Finding my way around Chromium's source code

I just checked out Chromium's source, but I desperately need to learn how to navigate around this monster.
How would I search for parts of the code that implement behavior/features I'm interested in?
Let's say I want to see what happens after a URL is entered into the address bar. How do I find that piece of code?
Or say I want to see what happens when a certain tag is reached while parsing HTML.
I have before me a huge amount of source code, and no skill of navigating around it. How do I learn that skill?

My recommendation for diving in is to take a look at the source for the Chromium Embedded Framework (CEF): http://code.google.com/p/chromiumembedded/.
It's sort of a condensed version of Chrome, and it's worth looking at the files it specifically uses, either ones included in its own source tree or files pulled in from the Chromium repo at large. The Chromium code base is a huge amount of stuff, most of which isn't actually in the browser. There's a ton of code pulled in from third-party repos that either gets boiled down in the build process or whose Chromium-side implementation lives somewhere else in the tree, and there are a lot of side projects that (while interesting and an awesome resource for a wide breadth of stuff) will keep you from your goal of homing in on the browser implementation and how it all fits together.
CEF is great because you can see how someone has already gone through the process of pulling all that stuff together to build a project scoped very specifically at the browser view and nothing else. You can easily see which parts are primarily derived from WebKit, where the crossover comes in with Google's implementations, and how V8 gets tossed into the mix.
I do say "easily" in relative terms, because we're still talking about a huge amount of code overall. CEF will put you smack in the center of the requirements, but that stuff still pulls in a massive number of things from the rest of the tree. Compiling it takes me about an hour on a really good computer with 12 GB of RAM and 8 cores, and the generated files take up something like 6-10 GB, depending.
At the very least, there's not going to be any sort of quick jump into the shallow end to pick up something here or there piecemeal. Browsers are necessarily incredibly complex pieces of engineering, because they have to subsume a huge number of individual pieces of functionality and then combine them into a shared context. You may find the one thing you're looking for, but you'll find that it's part of a class library likely composed of dozens or hundreds of files, which in turn relies on a hundred more such libraries to handle each task, so to really take something away you'll have to commit time to taking in a lot more than any one piece of information.
Edit: Oh, also, as for your specific example:
src is root http://src.chromium.org/viewvc/chrome/trunk/src
/chrome http://src.chromium.org/viewvc/chrome/trunk/src/chrome
The "chrome" tree largely contains the direct implementations (a lot of stuff isn't in there though, most of it even, but that's the starting point). This has overlap with chromeos (chromeos is kind of chromium browser taken to a crazy extreme)
/chrome/browser http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/
Is getting you close to where you want to be. You start to see specific references to things you can match to the browser, like the tabs and whatnot (ignoring the giant elephant of the actual browser implementation itself, which is what takes up the majority of the mindspace in all this stuff).
/chrome/browser/ui http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/ui/
Brings you to where most of the UI code for the browser lives. It can be confusing when there's crossover or when stuff migrates; for example, there's also a "ui" directory at the root of src that has some overlap.
And finally http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/ui/omnibox/
Which has a surprisingly small amount of code in it. But this is what you find a lot: the code here is an implementation of a number of classes that are built up elsewhere. For non-webview GUI components, you'll find them mostly pointing back to the root "ui" directory and the native widgets stuff there, which is where the bulk of the actual event-handling code is, if I remember correctly.

You can try this... it may actually lead somewhere too :-)
http://aaronboodman-com-v1.blogspot.com/2010/10/wherein-i-help-you-get-good-job.html
Reading through the dev forums may help too...
http://groups.google.com/a/chromium.org/group/chromium-dev/topics
Also, this section has a lot of useful documents, such as style guides, etc.
http://dev.chromium.org/developers/contributing-code
Last, but not least, IRC is your friend...
http://dev.chromium.org/developers/irc

Related

Should I preload a large image that's above the fold?

I'm excited about the rel="preload" link relation, because it looks like it can help speed up page render times.
The use case is a web page with a large image above the fold. Right now, Chrome doesn't start downloading the image until after fetching jQuery (a fairly heavy file). With preloading enabled, they download in parallel.
But I'm reading conflicting reports about whether I should use preload for things that are in visible HTML elements elsewhere (as opposed to things made visible by user interaction, like a dropdown menu).
This post seems to recommend not preloading:
When not to use preloading:
When the asset is referred to somewhere else on the same page.
When you're not sure the user will actually require that asset. Like on a page visitors only go to 3% of the time.
While this one seems to indicate it was really helpful for a similar situation on the Financial Times website:
When the Financial Times introduced a Link preload header to their site, they shaved 1 second off the time it took to display the masthead image...
So which is it? Should I provide an early "hint" to display the always-shown, above-the-fold image? Or should I just let the browser get to it in the usual order?
I think that in cases involving performance optimization, you really want to create an A/B test to determine which one you should do. There really is no cookie cutter answer for image preloading that applies to everybody as a best practice.
One of the biggest tenets of a favorite book of mine, Lean Enterprise, is to use A/B tests to prove or disprove a HiPPO (highest paid person's opinion). Certain people's opinions carry a lot of weight, both in your organization and on the internet. Because of their importance and reputation, their opinions veer towards the realm of fact, even though they may not be.
The practice of measuring performance empirically is also touted by another book I love which deals with performance tuning - Code Complete 2nd Ed. In that book, McConnell gives several code examples where you would expect one piece of code to be optimal, but in fact, it performed poorly (see chapters 25 and 26). One of his key points is that you should always test a performance optimization. If it isn't worth testing, it isn't worth writing the "highly performant" code in the first place. McConnell's premise doesn't just apply to his low level coding examples, but to high level decisions such as preloading images above the fold as well.
I can also attest to the importance of A/B testing at a professional level. I used to work on Amazon's SEO team and we A/B tested everything. The fact of the matter is that you never really know how customers will respond to something. Nobody can predict customer behavior - not even Jeff Bezos - and you really need to back up your hypothesis with real data to prove or disprove the validity of what you're doing.
Even though you can find multiple blog articles and online sources discussing whether preloading is better or not, you don't really know whether that will work for you until you've done it. Different people have different servers with different performance characteristics and different network topologies, etc. You just don't know which way is better for you until you have the data. If you launch your A/B test and find that your repel rate goes up when preloading, then you know that you have to dial back your treatment and return to your control. If, however, you find that customers don't get bored waiting and click through at a higher rate than before, then you have a winner and you dial up the treatment, deleting the control code entirely after a period.
I hope that helps.
The article is wrong about those statements. Preloading is used by the developer because he already knows those assets are needed on a page. Preloading is not a "hint". Prefetching is more in line with what he states, that you are telling a browser that it's possible the asset may be needed.
If an image is above the fold, then you know it will be needed, and that is exactly what preloading is for.
Addy Osmani of Google:
preload is a declarative fetch, allowing you to force the browser to make a request for a resource without blocking the document’s onload event.
I would use rel="preload" for images only if they are referenced from CSS or JavaScript that would otherwise lead to a bottleneck in the rendering of the page at a later time, if user-driven events will request the image and a delay would have a detrimental impact on the user experience, or if you are seeing issues in the rendering of the page due to this large image.
I understand your image is above the fold, and if it's to be found in the source code from the beginning I would let the browser prioritize what to download first.
On the other hand, you could always A/B test this change and see if it works for you.
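For what it's worth, here is a minimal sketch of the hint being discussed (the image path is a placeholder). The usual form is a static tag in the <head>, so the browser's preload scanner sees it before heavier resources like jQuery are fetched; the script version below only illustrates the same attributes being set from JavaScript.

```js
// Equivalent static markup (placeholder path), placed in the <head>:
//   <link rel="preload" as="image" href="/img/masthead.jpg">
// The same hint injected from script, purely to show the attributes involved:
const hint = document.createElement('link');
hint.rel = 'preload';
hint.as = 'image';               // resource type, so the browser can prioritize it correctly
hint.href = '/img/masthead.jpg'; // placeholder: your large above-the-fold image
document.head.appendChild(hint);
```

Whether the hint actually helps on your page is exactly what the A/B test suggested above would tell you.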

Auditing unused CSS on complex web pages

I know there are several tools available to find unused CSS on a static web page. But in most real-world scenarios I encounter, a lot of the CSS is only used after some interaction or other on the page, maybe a new modal opening up, an options popup, etc.
In such scenarios, what would you suggest? How do I keep tabs on my ever-growing render-blocking CSS?
The only way I guess one could do that is by running the usual unused-CSS-detector type tools in conjunction with Selenium: test known interactions and see what's left unused. But a big assumption here is that I'd need to know all the interactions on my page which could use new CSS. Is there a way to achieve my goal without making this assumption?
In an ideal world, I'd be able to post-back all CSS used by a visitor's browser on my page to my server. Then I'd collect data over a month, aggregate, and get a pretty accurate idea about actual unused CSS.
Any good ideas?
I am the author of a tool that aims to do what you are describing. Everywhere I have worked, the CSS is this "append-only" thing that is too risky and too time-consuming to clean up. And even when you try, the ROI is so low that it's not worth it.
So I am working on a tool that is very similar to what you are describing. The goal is to bring confidence about what can be removed, and to actually do it automatically by submitting pull requests.
A snippet of JavaScript runs in the browser and sends reports of what is being used to a server. Once enough data is accumulated to build some "confidence score", it can create pull requests automatically to remove selectors that are actually not used.
It is still very early stage, but you are welcome to try it and give some feedback about it.
https://www.bleachcss.com/
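I haven't used BleachCSS, so the following is not its actual code, just a rough sketch of what such a reporting snippet can look like (the /css-usage endpoint is made up for illustration, and a real tool would also have to handle dynamically added stylesheets and aggregate across many visitors and interactions):

```js
// Rough sketch: record which selectors currently match something on the page
// and beacon them to a collection endpoint ("/css-usage" is a made-up URL).
function collectUsedSelectors() {
  const used = new Set();
  for (const sheet of Array.from(document.styleSheets)) {
    let rules;
    try {
      rules = sheet.cssRules;                        // cross-origin stylesheets throw here
    } catch (e) {
      continue;
    }
    for (const rule of Array.from(rules)) {
      if (!(rule instanceof CSSStyleRule)) continue; // skip @media, @font-face, etc.
      try {
        if (document.querySelector(rule.selectorText)) {
          used.add(rule.selectorText);
        }
      } catch (e) {
        // vendor hacks / unparsable selectors can throw; ignore them here
      }
    }
  }
  return Array.from(used);
}

// Report once after load; a real tool would also sample on user interaction,
// since selectors for modals and popups only start matching later.
window.addEventListener('load', () => {
  navigator.sendBeacon('/css-usage', JSON.stringify(collectUsedSelectors()));
});
```

Aggregated over enough page views, the selectors that never show up in these reports become your candidates for removal.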

Is there something like profile-guided dead code removal?

When building applications, especially when using static linking and having a lot of dependencies, I often feel that most of this 50-megabyte executable is just unused bloat, especially if I consider only the mode I want.
Is there something that lets you run the program in various scenarios, collect data, and build the program again (or tinker with the already-compiled code) to remove the unvisited code (replacing it with abort)? If so, what is this correctly called and where is it implemented?
I'm perfectly willing to use techniques, rather than tools.
What I've done for your problem is get a map file and just look through it.
There may well be a lot of methods for classes you doubt you need. Find out what references put them in there. Chances are it's just because somewhere something fancy was coded, like a bells-and-whistles container class, when something simple would do. Or a whole math library when all you needed was max.
After fixing that, the map file is smaller, and something else is the biggest thing in it, so you can do it all again.
And again...
This can cut out gobs of bloated binary.

Planning web applications

I'm about to start building a new start-up, so I need some guidelines from you.
What's the best way to plan a website? I don't mean "first design, then the database relations, then start development", but rather "how do I plan the way the application is going to work"?
Are there some proven methods, like THE best way to do website 'blueprints', like with some tool or something?
I need as much feedback as you people can give me, this is really important to me.
Links, experiences, everything is welcome =)
I'd like some tool to be able to draw the process like
page
- if logged in - do that
- if not logged in - do that 2
Write it down. On a piece of paper. Draw lines between the related parts.
Rinse and repeat as necessary.
I'm serious. Tools, fancy diagrams, flowcharts, all look pretty for management, but they get in the way of actually understanding how your app is going to work. If it's so complex you can't get it all on a couple sheets of paper, do a big picture view and then do each sub-section.
If you can get a HUGE whiteboard or piece of butcher paper, it's even better. For some reason, having a large space to work on is fantastic for working things out.
I used to do huge specifications in Word... a hundred pages or more. I no longer believe in that, since as soon as you write it down it is likely to change (your ideas, features, etc.). Instead, I suggest that you look at an MS product called SketchFlow (which comes with Blend). This allows you to quickly, and without writing any code, snap together a working wireframe, sitemap, and mock-up. While you can create a high-fidelity mock-up (one that functions and looks very close to the real thing), I suggest that you instead focus on creating a low-fidelity mock-up. There are sketch styles which look like hand-drawn UI elements. This way you can focus purely on how your product works and not so much on how it looks. If there is too much 'finish' put on your mock-up, you will get hung up on the "big blue button" syndrome, where people are more concerned with how a button looks and less concerned with what it does.
I wrote four articles on wireframes, mock-ups, and the like, in which I suggest various tools and explain why I chose SketchFlow. Then I go into building a mock-up in SketchFlow.
http://dotnetslackers.com/articles/aspnet/Building-a-StackOverflow-inspired-Knowledge-Exchange-Sitemap-and-wireframes-with-Expression-Blend-3-and-SketchFlow-part-1.aspx
http://dotnetslackers.com/articles/aspnet/Building-a-StackOverflow-inspired-Knowledge-Exchange-Sitemap-and-wireframes-with-Expression-Blend-3-and-SketchFlow-part-2.aspx
http://dotnetslackers.com/articles/aspnet/Building-a-StackOverflow-inspired-Knowledge-Exchange-Sitemap-and-wireframes-with-Expression-Blend-3-and-SketchFlow-part-3.aspx
http://dotnetslackers.com/articles/aspnet/Building-a-StackOverflow-inspired-Knowledge-Exchange-Sitemap-and-wireframes-with-Expression-Blend-3-and-SketchFlow-part-4.aspx
Hope this helps you!
This is a pretty big question. My best advice is to start by thinking hard about your user, what their goals are etc and then draw up some personas. Personas are descriptions of the typical people who will use your site. Once you have your personas you can start to plan your site.
http://en.wikipedia.org/wiki/Personas
Once you have your personas you can work out user journeys - these are essentially flow diagrams detailing how users will achieve the tasks you want to help them with.
http://www.boxesandarrows.com/view/an_introduction_to_user_journeys
From user journeys you can work out the pages you'll need to create in the form of a site map.
http://en.wikipedia.org/wiki/Site_map
Then finally you can work out the content of the pages as wireframes.
http://en.wikipedia.org/wiki/Website_wireframe
There are loads of tools out there: Visio (http://office.microsoft.com/en-us/visio/FX100487861033.aspx) for the PC and Omnigraffle (http://www.omnigroup.com/applications/OmniGraffle/) for the Mac. Both these tools have web design stencils available for free download on the web. There's also a great online tool called Balsamiq (http://www.balsamiq.com/products/mockups) that allows you to lay out pages without having to use a design tool. In the first instance, though, a pen and paper are the only tools you need.
Once you have these details sorted you can start to think about data models, graphic design etc etc. However this whole process is iterative and you might say will never be finished :)
Good luck!
Develop your data model first. Even if you don't go whole-hog with UML, get a sense of how your data will be represented.
Come up with some use cases to validate your data model. It doesn't have to be perfect, but it should be 99% there to avoid pain in the future.
Once you have a data model and use cases, your interface needs (in your case, your website) will become a great deal clearer. Of course, I'm coming at this from a purely back-end perspective.
Once you have an interface that's workable, hire a good UX specialist to round out the details for you (both workflow and actual interface).

Why are HTML/JavaScript/CSS not compiled languages, and will they ever be?

Why aren't HTML/JavaScript/CSS becoming compiled languages (or maybe even merging into a single compiled language)? What if browsers ran a "browser virtual machine" and HTML/JavaScript/CSS sources could be compiled to a "browser bytecode"? Wouldn't that help developers and users a lot?
I can see a few challenges:
What to do with the zillions of existing pages? Make this compilation optional, so if you want you can use plain old HTML. If you want to feed a browser a compiled page, just use .chtml, for example.
How would search providers index pages? Make a decompiler that would decompile the bytecode into the exact original sources (for example, like Flash can be decompiled). Or search providers could use the same virtual machine and get the data they need from there.
How to make it compatible with all browsers? Have one centralized developer (let's say the W3C) develop this virtual machine, and then each browser would embed it.
But what about benefits:
Speed.
Size.
No more "loose" and "half-correct" html. It is either correct or won't compile.
Looks the same in every (supported) browser.
If not a bytecode, then at least have some native compression going on; HTML probably is not the most efficient way of storing data. I know there is gzip, but why compress pages every time on the server and decompress them in the browser if we can compress a page once and feed it to the browser?
So what stops us from taking this road (well, besides a huge amount of effort to make it all happen)?
Ah, but Javascript IS becoming a compiled language. Check out Firefox 3.5 with TraceMonkey. It's insanely fast compared to um you-know-who's browser. It's true that JS will never be C, but it's a much more dynamic language than C is, and in many ways that makes it more expressive and powerful.
As far as HTML goes, I don't think that the lack of validity of HTML is a huge detriment to speed. I think the engines that put together the visual representation and manipulate the DOM need to get a lot better (um, IE, I'm looking in your general direction...). CSS compliance needs to get better, and CSS itself needs to get more powerful. (Get on the bus with CSS 3 people!)
But I do think that speed is going to get better on Firefox and Chrome to such an extent that people really ARE going to start using it for mainstream application development. It's funny. Adobe seems to be selling Flash as their platform for dynamic web content, MSFT is selling Silverlight for dynamic web content, and Google just wants to really improve HTML and Javascript to display dynamic web content. And Google's doing pretty well at it so far, I must say...
Your ideas have validity when they are applied to JavaScript. As others have noted, to one degree or another several vendors are trying to apply those principles to JS even now. Another big step in this area will likely be the Chrome OS Google has announced. However, when it comes to (X)HTML and CSS I think your ideas may be missing the point.
The world wide web is not a buggy and inconsistent application platform but a massive and unprecedented collection of interconnected documents. The power of the web is in the abstraction of the data from the often rigid (and breakable) visual layouts and increasingly complex in-page functionality largely provided via JavaScript. Encoding these pages in (X)HTML is ideal for making them accessible to the widest possible audience both in terms of browsers and in terms of technical knowledge required to author a page.
More and more the web is being used as an application platform - which is a powerful and exciting use of this technology - but we cannot lose sight of the fact that these Ajax-driven "web 2.0" apps are merely documents with extended functionality. Compilation doesn't make sense for a document and compression is already happening (via gzip and the like).
On a more practical note, the W3C moves at a glacial pace, and browser vendors take turns between jumping the gun supporting experimental features in unfinished specs and taking their sweet time supporting other specs which have been on the table and in common usage for years. The whole process is like herding cats. I wouldn't hold my breath for them to make the kind of radical changes you're proposing any time soon.
Since HTML and CSS aren't code, they can't be compiled. Google Chrome's V8 engine does actually convert JS into bytecode, so expect other rendering engines to follow suit!
http://code.google.com/apis/v8/design.html
We recently reworked a PHP template system I helped create to use Minify to compress multiple JS and CSS files into one file each, seeing our file sizes drop to about 20% of the original combined sizes. Minify also does gzip and caching, so it's really amazing for speeding up websites.
http://code.google.com/p/minify/
In short, you can't compile non-code, which HTML and CSS are. JS can be compiled, and is starting to be, but it all depends on what browsers feel like doing.
Browsers just need to be on the ball regarding supporting web standards. The more browsers do this, the less headache us web developers have. I was quite happy with YouTube's very public drop of support for IE6. We need more action like that for the web to move forward.
The V8 javascript engine (also embedded in Google Chrome, but it's open-source and liberally licensed so you're welcome to use it in the next browser you write!) does compile Javascript to native machine code -- of course, it does it "just in time" (like most modern compilers -- Java, C#, etc!), not "ahead of time" (like Fortran did in 1954 when computers were just too weak to handle compilation in the midst of execution). I'd be surprised if other good JS engines, like those in the very latest Firefox and Safari, didn't do the same.
Looks like you're not advocating "JavaScript as a compiled language" (since it obviously already IS compiled, if you're using a good JS engine), but rather "ahead-of-time" compilation for it (just when most modern languages are essentially abandoning ahead-of-time compilation). Pushing machine code rather than compilable code down the wire sounds like a mostly horrible idea (much larger size, difficulties in supporting one CPU vs. another, security nightmares in properly sandboxing it, etc., etc.) with not much in terms of compensating benefits.
That said, if you're really keen on pushing machine code to the client, try out nativeclient (as long as the client is an x86 machine - forget every smart phone on the planet, many netbooks, good old macs, etc) -- at least it promises a fix to the security nightmares. If and when you're happy with nativeclient, transforming a just-in-time compiler into an ahead-of-time one is a far easier technical challenge (if you want to keep using Javascript for the sources rather than other languages, of course).
See here for a previous discussion on the matter
Not all of the reasons given are necessarily valid, but one important one is that, unless you're Google, server-side CPU cycles are a lot more valuable than client-side cycles: so it's easier to have the client compile/optimize what is quite often dynamically generated HTML/JavaScript, rather than the server.
Ken
Speed.
You're assuming that it takes significant time to parse HTML. However, it might be that that time is insignificant compared to the time required for something else, e.g. the time required to lay out the text in the end-user's window.
No more "loose" and "half-correct" html. It is either correct or won't compile.
You already get that, using [X]HTML.
Looks the same in every (supported) browser.
You seem to be saying that there should only be one browser, or that all browsers would support it equally.
Internet standards don't happen by having a single body (the w3c) implementing something and declaring it a standard. Instead, internet standards happen by having multiple independent bodies creating multiple implementations. A consequence is:
Some people have developed something that isn't standard yet (i.e. they're ahead of the standard)
Some people haven't yet developed something that is standard (i.e. they're behind the standard)
I think your idea is sound; however, there's still no way to enforce a standard. Thus, if there were a non-supported feature, there's a good chance the entire page would simply not display anything. In the current setup, critical information can still be passed.
Google V8, which is one of many new-generation JavaScript engines, 'compiles' JavaScript into pseudocode, much like .NET 'compiles' C# on the fly. Nothing magical here. Expect more of it, especially as web apps get heavier and more demanding.
HTML
HTML is pretty much XML. DTDs exist for various versions, and developers can validate against them at any time.
CSS
CSS is not a programming language; however, I do agree that "compiled" CSS could work, seeing as compilation would compress it. However, with the uneven support that CSS has and the number of essential hacks any stylesheet needs to carry, you'd never manage to compile it without errors.
JS
As others have mentioned, JS IS becoming a compiled language, except that the browser compiles it for you rather than you doing it yourself.