Firefox and WordPress PHP widgets

Today I tested a WordPress site in Firefox and found that a PHP widget in the sidebar didn't work. More precisely, Firefox showed the raw PHP code instead of the normal HTML generated by PHP. The theme used on this site is Mystique, but I don't think the theme is the cause, because the widget worked just fine in all other browsers. I tested it with Firefox for Mac. I don't understand how this is possible, because the page had finished loading.

The CSS parser of Firefox

The main file for studying the CSS parser of Firefox (the latest browser release is 4, which corresponds to the Mozilla 2.0 branch) is located at http://mxr.mozilla.org/mozilla2.0/source/layout/style/nsCSSScanner.cpp, as part of the style component of the layout engine. In its pure, outstanding precision, everything a web developer must know about CSS parsing is there. You will learn how a parser builds its lexical scanner by providing a set of allowed tokens (according to the Lex notation of the CSS grammar), thus dividing tokens into significant and non-significant ones according to the parsing context. You will learn how a single flow of tokens is put in order by following every single character contained in a style sheet.

You will then learn how whitespace is handled by separating its various components (space, carriage return, form feed, tab, new line) and then reintegrating them in a single entity:

There are four types of newlines in CSS: "\r", "\n", "\r\n", and "\f". To simplify dealing with newlines, they are all normalized to "\n" here

lines 630-31
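The normalization described in that comment can be sketched in a few lines. This is only an illustration, not Firefox's code: NormalizeNewlines is a hypothetical helper, and it works on a whole string, whereas the real scanner reads its input character by character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of nsCSSScanner.cpp): rewrite the four
// CSS newline forms "\r", "\n", "\r\n" and "\f" as a single "\n".
std::string NormalizeNewlines(const std::string& input) {
  std::string out;
  out.reserve(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    char ch = input[i];
    if (ch == '\r') {
      // "\r\n" counts as one newline: skip the "\n" that follows.
      if (i + 1 < input.size() && input[i + 1] == '\n') {
        ++i;
      }
      out.push_back('\n');
    } else if (ch == '\f') {
      out.push_back('\n');
    } else {
      // "\n" and every other character are copied unchanged.
      out.push_back(ch);
    }
  }
  return out;
}
```

Once every newline looks the same, the rest of a scanner only has to test for one character.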

You will learn how the parser "eats" CSS comments and, for backward compatibility reasons, even HTML comments that may appear in a CSS file. More importantly, you will see how CSS selectors are recognized by the presence of some special delimiters (like spaces, colons, dashes and so on) and how the parser handles them. Finally, you will learn how the parser gets back to the main flow after a parsing error.
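To give an idea of what "eating" comments means, here is a minimal sketch. StripComments is a hypothetical helper of mine, not a Firefox function. It assumes that only the HTML comment delimiters themselves ("<!--" and "-->") are skipped, as in the CSS 2.1 grammar, and it does not treat quoted strings specially, which a real scanner must do.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: drop /* ... */ comments and the HTML comment
// delimiters "<!--" and "-->" from a style sheet.
// Simplification: quoted strings are not handled.
std::string StripComments(const std::string& css) {
  std::string out;
  std::size_t i = 0;
  while (i < css.size()) {
    if (css.compare(i, 2, "/*") == 0) {
      std::size_t end = css.find("*/", i + 2);
      if (end == std::string::npos) {
        break;  // unterminated comment: eat the rest of the input
      }
      i = end + 2;  // resume right after the closing "*/"
    } else if (css.compare(i, 4, "<!--") == 0) {
      i += 4;  // skip the delimiter only, keep what follows
    } else if (css.compare(i, 3, "-->") == 0) {
      i += 3;
    } else {
      out.push_back(css[i]);
      ++i;
    }
  }
  return out;
}
```

With this sketch, the input a{/* x */color:red} comes out as a{color:red}.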

A good example of this is how the parser handles URLs contained in a CSS property's value:

Process a url lexical token. A CSS1 url token can contain characters beyond identifier characters (e.g. '/', ':', etc.) Because of this the normal rules for tokenizing the input don't apply very well. To simplify the parser and relax some of the requirements on the scanner we parse url's here. If we find a malformed URL then we emit a token of type "InvalidURL" so that the CSS1 parser can ignore the invalid input. The parser must treat an InvalidURL token like a Function token, and process tokens until a matching parenthesis.

lines 882-90

I recommend that you study this code with an open mind, enjoying the magic cohesion that holds together this simple atom of the vast structure of Firefox. There's much to learn from it. First, you'll get a better understanding of the CSS standard. Second, you'll appreciate the work of browser implementors more. Finally, you'll be able to see what happens behind the scenes of a browser's mind.

The rise and fall of Firefox

After four years of predominance in the new trends of the browser market, Firefox seems to have exhausted its power to attract new users and adopters. In the meantime, we're witnessing the meteoric rise of Chrome among web users. The point is that during all these years Firefox has not been able to develop new strategies for gaining more users.

Supporting web standards is not enough: now even the "infamous" Internet Explorer is on its way to becoming a fully standards-compliant browser as of version 9. Using extensions is not enough: even though this kind of add-on seems to have a real effect on overall performance, other browsers are now starting to adopt a similar strategy. Performance is not enough: even if the overall performance of the newest release seems to put Firefox on the level of performance monsters like Chrome and Safari, all other browsers are now adopting the same strategy.

In a nutshell: before Chrome, Firefox was The alternative browser, with a capital "T". Now it seems to be just another alternative browser.

Firefox: the Layout component

This is an interesting slideshow about the Layout component of the latest releases of Firefox. I have to say that this component has improved noticeably during the last two years. With the new version 4, still in beta at the moment, Firefox's developers have boosted the overall rendering speed. Further, Firefox now supports more CSS3 features than before.

Firefox 4 unveiled: new HTML5 elements supported

Firefox 4 now supports new HTML5 tags. This is reflected by the list of macros contained in the nsHTMLTagList.h header file, in the parser/htmlparser/public folder of the source code. Here is the complete list of available macros:

HTML_TAG(a, Anchor)
HTML_HTMLELEMENT_TAG(abbr)
HTML_HTMLELEMENT_TAG(acronym)
HTML_HTMLELEMENT_TAG(address)
HTML_TAG(applet, SharedObject)
HTML_TAG(area, Area)
HTML_HTMLELEMENT_TAG(article)
HTML_HTMLELEMENT_TAG(aside)
#if defined(MOZ_MEDIA)
HTML_TAG(audio, Audio)
#endif
HTML_HTMLELEMENT_TAG(b)
HTML_TAG(base, Shared)
HTML_TAG(basefont, Span)
HTML_HTMLELEMENT_TAG(bdo)
HTML_TAG(bgsound, Span)
HTML_HTMLELEMENT_TAG(big)
HTML_HTMLELEMENT_TAG(blink)
HTML_TAG(blockquote, Shared)
HTML_TAG(body, Body)
HTML_TAG(br, BR)
HTML_TAG(button, Button)
HTML_TAG(canvas, Canvas)
HTML_TAG(caption, TableCaption)
HTML_HTMLELEMENT_TAG(center)
HTML_HTMLELEMENT_TAG(cite)
HTML_HTMLELEMENT_TAG(code)
HTML_TAG(col, TableCol)
HTML_TAG(colgroup, TableCol)
HTML_HTMLELEMENT_TAG(dd)
HTML_TAG(del, Mod)
HTML_HTMLELEMENT_TAG(dfn)
HTML_TAG(dir, Shared)
HTML_TAG(div, Div)
HTML_TAG(dl, SharedList)
HTML_HTMLELEMENT_TAG(dt)
HTML_HTMLELEMENT_TAG(em)
HTML_TAG(embed, SharedObject)
HTML_TAG(fieldset, FieldSet)
HTML_HTMLELEMENT_TAG(figcaption)
HTML_HTMLELEMENT_TAG(figure)
HTML_TAG(font, Font)
HTML_HTMLELEMENT_TAG(footer)
HTML_TAG(form, Form)
HTML_TAG(frame, Frame)
HTML_TAG(frameset, FrameSet)
HTML_TAG(h1, Heading)
HTML_TAG(h2, Heading)
HTML_TAG(h3, Heading)
HTML_TAG(h4, Heading)
HTML_TAG(h5, Heading)
HTML_TAG(h6, Heading)
HTML_TAG(head, Shared)
HTML_HTMLELEMENT_TAG(header)
HTML_HTMLELEMENT_TAG(hgroup)
HTML_TAG(hr, HR)
HTML_TAG(html, Shared)
HTML_HTMLELEMENT_TAG(i)
HTML_TAG(iframe, IFrame)
HTML_TAG(image, Span)
HTML_TAG(img, Image)
HTML_TAG(input, Input)
HTML_TAG(ins, Mod)
HTML_TAG(isindex, Shared)
HTML_HTMLELEMENT_TAG(kbd)
HTML_TAG(keygen, Span)
HTML_TAG(label, Label)
HTML_TAG(legend, Legend)
HTML_TAG(li, LI)
HTML_TAG(link, Link)
HTML_HTMLELEMENT_TAG(listing)
HTML_TAG(map, Map)
HTML_HTMLELEMENT_TAG(mark)
HTML_TAG(marquee, Div)
HTML_TAG(menu, Shared)
HTML_TAG(meta, Meta)
HTML_TAG(multicol, Span)
HTML_HTMLELEMENT_TAG(nav)
HTML_HTMLELEMENT_TAG(nobr)
HTML_HTMLELEMENT_TAG(noembed)
HTML_HTMLELEMENT_TAG(noframes)
HTML_HTMLELEMENT_TAG(noscript)
HTML_TAG(object, Object)
HTML_TAG(ol, SharedList)
HTML_TAG(optgroup, OptGroup)
HTML_TAG(option, Option)
HTML_TAG(output, Output)
HTML_TAG(p, Paragraph)
HTML_TAG(param, Shared)
HTML_HTMLELEMENT_TAG(plaintext)
HTML_TAG(pre, Pre)
HTML_TAG(q, Shared)
HTML_HTMLELEMENT_TAG(s)
HTML_HTMLELEMENT_TAG(samp)
HTML_TAG(script, Script)
HTML_HTMLELEMENT_TAG(section)
HTML_TAG(select, Select)
HTML_HTMLELEMENT_TAG(small)
#if defined(MOZ_MEDIA)
HTML_TAG(source, Source)
#endif
HTML_TAG(spacer, Shared)
HTML_TAG(span, Span)
HTML_HTMLELEMENT_TAG(strike)
HTML_HTMLELEMENT_TAG(strong)
HTML_TAG(style, Style)
HTML_HTMLELEMENT_TAG(sub)
HTML_HTMLELEMENT_TAG(sup)
HTML_TAG(table, Table)
HTML_TAG(tbody, TableSection)
HTML_TAG(td, TableCell)
HTML_TAG(textarea, TextArea)
HTML_TAG(tfoot, TableSection)
HTML_TAG(th, TableCell)
HTML_TAG(thead, TableSection)
HTML_TAG(title, Title)
HTML_TAG(tr, TableRow)
HTML_HTMLELEMENT_TAG(tt)
HTML_HTMLELEMENT_TAG(u)
HTML_TAG(ul, SharedList)
HTML_HTMLELEMENT_TAG(var)
#if defined(MOZ_MEDIA)
HTML_TAG(video, Video)
#endif
HTML_HTMLELEMENT_TAG(wbr)
HTML_HTMLELEMENT_TAG(xmp)

/* These are not for tags. But they will be included in the nsHTMLTag
   enum anyway */

HTML_OTHER(text)
HTML_OTHER(whitespace)
HTML_OTHER(newline)
HTML_OTHER(comment)
HTML_OTHER(entity)
HTML_OTHER(doctypeDecl)
HTML_OTHER(markupDecl)
HTML_OTHER(instruction)
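A list like this is usually consumed by defining the macros just before the list is expanded, so that the same entries can produce an enum, a table of names and so on. The sketch below shows that general technique on a three-entry stand-in. All the Demo names are mine, and wrapping the list in a DEMO_TAG_LIST macro is an assumption made to keep the example in one file. As far as I understand, the real header is simply included again after the macros have been redefined.

```cpp
#include <cstring>

// Three-entry stand-in for nsHTMLTagList.h (hypothetical). The three
// parameters are the macros to apply to each kind of entry.
#define DEMO_TAG_LIST(HTML_TAG, HTML_HTMLELEMENT_TAG, HTML_OTHER) \
  HTML_TAG(a, Anchor)                                              \
  HTML_HTMLELEMENT_TAG(article)                                    \
  HTML_OTHER(text)

// Expansion 1: one enum value per entry.
#define DEMO_AS_ENUM2(tag, cls) DemoTag_##tag,
#define DEMO_AS_ENUM1(tag) DemoTag_##tag,
enum DemoTag {
  DemoTag_unknown,
  DEMO_TAG_LIST(DEMO_AS_ENUM2, DEMO_AS_ENUM1, DEMO_AS_ENUM1)
  DemoTag_count
};

// Expansion 2: a table of names, in the same order as the enum.
#define DEMO_AS_NAME2(tag, cls) #tag,
#define DEMO_AS_NAME1(tag) #tag,
static const char* const kDemoTagNames[] = {
  "unknown",
  DEMO_TAG_LIST(DEMO_AS_NAME2, DEMO_AS_NAME1, DEMO_AS_NAME1)
};

// Map a tag name to its enum value with a linear search.
DemoTag LookupDemoTag(const char* name) {
  for (int i = 1; i < DemoTag_count; ++i) {
    if (std::strcmp(kDemoTagNames[i], name) == 0) {
      return static_cast<DemoTag>(i);
    }
  }
  return DemoTag_unknown;
}
```

The advantage of this design is that adding a new element, say article, means touching one line of the list, and every table generated from it stays in sync.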

Since Firefox 4 now supports these new HTML5 elements, there have been some slight changes in the html.css default style sheet. In short, elements such as aside and article now all have the declaration display: block and apparently work exactly like normal div elements.

Firefox unveiled: Gecko basic data flow

David Baron wrote an excellent presentation on the Gecko rendering engine more than four years ago. Though many changes have occurred during all these years, the basic concepts behind Gecko remain the same and are well explained in the following data flow that shows what happens when Firefox loads and renders a web document:

First comes the markup, which is parsed by the Firefox SGML parser. While a rough sketch of the future DOM tree is being constructed, we are said to be in the content sink phase. In this phase, style sheets are parsed by the CSS parser of Firefox. Once they are parsed, a set of style rules has been constructed. After the content sink phase, the DOM is fully built up in a content model. At this point, the content model and the style rules can actually work together in the frame constructor, which later yields a frame tree.

In Gecko terminology, a frame is a rectangular region of the screen defined by its x, y and z coordinates. A frame can be affected by reflow, that is, a change in its own state, for example when we define a CSS property on it or when a user changes the base font size of the page or adjusts the window dimensions.
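To make the idea of reflow a bit more concrete, here is a toy model. It is in no way Gecko's frame class: DemoFrame, its fields and the Reflow function are all hypothetical, the z coordinate is left out, and text wrapping is reduced to dividing a content width by the available width (which is assumed to be greater than zero).

```cpp
#include <vector>

// Hypothetical, heavily simplified frame: a rectangle plus child frames.
struct DemoFrame {
  int x = 0, y = 0, width = 0, height = 0;
  int contentWidth = 0;  // width the frame's own text needs on one line
  int lineHeight = 0;    // height of one line of that text
  std::vector<DemoFrame> children;
};

// Toy reflow: the frame takes the available width, wraps its own text
// into as many lines as needed, then stacks its children vertically.
// Assumes availableWidth > 0.
void Reflow(DemoFrame& frame, int x, int y, int availableWidth) {
  frame.x = x;
  frame.y = y;
  frame.width = availableWidth;
  int lines = 0;
  if (frame.contentWidth > 0) {
    // Integer ceiling of contentWidth / availableWidth.
    lines = (frame.contentWidth + availableWidth - 1) / availableWidth;
  }
  int cursor = y + lines * frame.lineHeight;
  for (DemoFrame& child : frame.children) {
    Reflow(child, x, cursor, availableWidth);
    cursor += child.height;
  }
  frame.height = cursor - y;
}
```

Calling Reflow again with a smaller width, as happens when the user narrows the window, makes the text frames taller and pushes the frames below them further down.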

When the whole frame hierarchy is in proper order, we enter the painting and display phases, where the final layout of a web document actually takes place.

Be quick or be dead: is Firefox falling?

Tipping Firefox across the chasm is an excellent post written more than five years ago. In this post, the author analyzes the rise of Firefox in browser market share, listing some of the causes that made it possible. However, there's one point that is really worth mentioning: performance. In fact, Firefox is not considered from this point of view, but only as an alternative to the obsolete Internet Explorer 6.

Time changes everything. Five years later, new browsers have been released, and the key aspect that now seems to be at the top of any browser wish list is precisely performance. Take Chrome, for example: it's fast, maybe one of the fastest browsers on the market. Safari is fast too, just like Opera. Internet Explorer is filling (or trying to fill) the gap with its next release (9). And Firefox? Actually, it's much slower than Chrome, Opera and Safari.

But why performance? Because in the meantime users got faster connections, so they want to see fast responses when they surf the web. A website must load in a snap. Period. I know that this is a utopia from a web developer's point of view, but the success of a browser is often determined by its users who, if we look at them as simple percentages, are of course mostly non-developers. In other words, ordinary users can effectively push a browser to the top or, conversely, let it fall into a minority. This is something that Firefox developers must take seriously into account. In simple words: be quick or be dead.

How Firefox parses CSS URLs

Here's a quotation taken from the nsCSSScanner.cpp of Firefox:

Process a url lexical token. A CSS1 url token can contain characters beyond identifier characters (e.g. '/', ':', etc.) Because of this the normal rules for tokenizing the input don't apply very well. To simplify the parser and relax some of the requirements on the scanner we parse url's here. If we find a malformed URL then we emit a token of type "InvalidURL" so that the CSS1 parser can ignore the invalid input. We attempt to eat the right amount of input data when an invalid URL is presented.

Basically, CSS URLs are written inside the url() function, which can be used with background properties, list properties and generated content properties. Everything inside the parentheses is considered a URL. Anyway, as stated above, parsing CSS URLs can be quite challenging, because the tokenization must include new types of characters which are not included by default in the Flex notation for IDENTs of the CSS specifications. Further, URLs may either be enclosed in quotes or not, which adds a further level of complexity (e.g. checking that quotes occur in matching pairs, like "..."). Here's how Firefox copes with this:

    aToken.mType = eCSSToken_InvalidURL;
    nsString& ident = aToken.mIdent;
    ident.SetLength(0);

    if (ch == ')') {
      Pushback(ch);
      // empty url spec; just get out of here
      aToken.mType = eCSSToken_URL;
    } else {
      // start of a non-quoted url
      Pushback(ch);
      PRBool ok = PR_TRUE;
      for (;;) {
        ch = Read(aErrorCode);
        if (ch < 0) break;
        if (ch == CSS_ESCAPE) {
          ch = ParseEscape(aErrorCode);
          if (0 < ch) {
            ident.Append(PRUnichar(ch));
          }
        } else if ((ch == '"') || (ch == '\'') || (ch == '(')) {
          // This is an invalid URL spec
          ok = PR_FALSE;
        } else if ((256 > ch) && ((gLexTable[ch] & IS_WHITESPACE) != 0)) {
          // Whitespace is allowed at the end of the URL
          (void) EatWhiteSpace(aErrorCode);
          if (LookAhead(aErrorCode, ')')) {
            Pushback(')');  // leave the closing symbol
            // done!
            break;
          }
          // Whitespace is followed by something other than a
          // ")". This is an invalid url spec.
          ok = PR_FALSE;
        } else if (ch == ')') {
          Unread();
          // All done
          break;
        } else {
          // A regular url character.
          ident.Append(PRUnichar(ch));
        }
      }

      // If the result of the above scanning is ok then change the token
      // type to a useful one.
      if (ok) {
        aToken.mType = eCSSToken_URL;
      }
    }
  }
  return PR_TRUE;
}

Parsing starts with a '(' token and ends with a ')' token. Firefox checks if:

  • the url() function is empty
  • the url() function contains a quoted or a non-quoted URL
  • the url() function contains invalid tokens (e.g. a '(' token)
  • the url() function contains whitespace, which is allowed at the end of the URL (Firefox removes/eats it, however)
  • the URL is followed by other tokens after the whitespace (this is invalid)
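The excerpt depends on the rest of the scanner (Read, Pushback, gLexTable and so on), so it can't be compiled on its own. Here is a stand-alone sketch of the same checks for the non-quoted case only. DemoUrlToken, IsCssWhitespace and ScanUnquotedUrl are hypothetical names of mine, the input is assumed to be the text that follows "url(", and escapes are not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: the URL text plus a validity flag.
struct DemoUrlToken {
  bool valid;
  std::string url;
};

// The five whitespace characters listed in the CSS lexical table.
bool IsCssWhitespace(char ch) {
  return ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n' || ch == '\f';
}

// Sketch of the non-quoted branch. `input` is the text after "url(".
DemoUrlToken ScanUnquotedUrl(const std::string& input) {
  DemoUrlToken token{true, ""};
  std::size_t i = 0;
  while (i < input.size()) {
    char ch = input[i];
    if (ch == ')') {
      return token;  // closing symbol reached (an empty url() is fine)
    }
    if (ch == '"' || ch == '\'' || ch == '(') {
      token.valid = false;  // not allowed inside a non-quoted URL
      ++i;
    } else if (IsCssWhitespace(ch)) {
      // Whitespace is allowed at the end of the URL only.
      while (i < input.size() && IsCssWhitespace(input[i])) {
        ++i;
      }
      if (i >= input.size() || input[i] != ')') {
        token.valid = false;  // something other than ")" follows
      }
    } else {
      token.url.push_back(ch);  // a regular URL character
      ++i;
    }
  }
  return token;
}
```

For example, the input images/bg.png) gives a valid token whose text is images/bg.png, while bg.png other) is flagged as invalid.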

Safari has a problem with nested matching quotes within the url() function, i.e. it accepts them instead of marking them as invalid. This is not the case with Firefox, as you can see.

How Firefox builds its CSS tokenizer

Here's a little excerpt taken from the nsCSSScanner.cpp file of the Firefox source code. Notice how the CSS lexical table is built character by character.

void
 nsCSSScanner::BuildLexTable()
 {
   gLexTableSetup = PR_TRUE;
 
   PRUint8* lt = gLexTable;
   int i;
   lt[CSS_ESCAPE] = START_IDENT;
   lt['-'] |= IS_IDENT;
   lt['_'] |= IS_IDENT | START_IDENT;
   lt[' '] |= IS_WHITESPACE;   // space
   lt['\t'] |= IS_WHITESPACE;  // horizontal tab
   lt['\r'] |= IS_WHITESPACE;  // carriage return
   lt['\n'] |= IS_WHITESPACE;  // line feed
   lt['\f'] |= IS_WHITESPACE;  // form feed
   for (i = 161; i <= 255; i++) {
     lt[i] |= IS_IDENT | START_IDENT;
   }
   for (i = '0'; i <= '9'; i++) {
     lt[i] |= IS_DIGIT | IS_HEX_DIGIT | IS_IDENT;
   }
   for (i = 'A'; i <= 'Z'; i++) {
     if ((i >= 'A') && (i <= 'F')) {
       lt[i] |= IS_HEX_DIGIT;
       lt[i+32] |= IS_HEX_DIGIT;
     }
     lt[i] |= IS_IDENT | START_IDENT;
     lt[i+32] |= IS_IDENT | START_IDENT;
   }
}
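A table like this turns every question about a character into one array lookup and one bit test. The following sketch rebuilds a reduced version of the table and uses it to check whether a string looks like an identifier. The flag values, gDemoLexTable, BuildDemoLexTable and LooksLikeIdent are all assumptions of mine, since the excerpt doesn't show how Firefox defines or uses them. The sketch also ignores escapes and the special case of a leading dash.

```cpp
#include <cstdint>
#include <string>

// Flag values are assumptions: the excerpt doesn't show the real ones.
constexpr std::uint8_t kIsIdent = 0x01;
constexpr std::uint8_t kStartIdent = 0x02;
constexpr std::uint8_t kIsDigit = 0x04;

// One entry per 8-bit character; zero-initialized as a global.
std::uint8_t gDemoLexTable[256];

// Reduced version of the table above: dash, underscore, digits, letters.
void BuildDemoLexTable() {
  std::uint8_t* lt = gDemoLexTable;
  lt['-'] |= kIsIdent;
  lt['_'] |= kIsIdent | kStartIdent;
  for (int i = '0'; i <= '9'; ++i) {
    lt[i] |= kIsDigit | kIsIdent;
  }
  for (int i = 'A'; i <= 'Z'; ++i) {
    lt[i] |= kIsIdent | kStartIdent;
    lt[i + 32] |= kIsIdent | kStartIdent;  // the matching lowercase letter
  }
}

// True if `text` starts with a START_IDENT character and every
// character in it is an IS_IDENT character.
bool LooksLikeIdent(const std::string& text) {
  if (text.empty()) {
    return false;
  }
  auto flags = [](char c) {
    return gDemoLexTable[static_cast<unsigned char>(c)];
  };
  if ((flags(text[0]) & kStartIdent) == 0) {
    return false;
  }
  for (char c : text) {
    if ((flags(c) & kIsIdent) == 0) {
      return false;
    }
  }
  return true;
}
```

After a call to BuildDemoLexTable(), margin-top passes the check while 9px does not, because a digit is an IS_IDENT character but not a START_IDENT one.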