JavaScript: V8 engine preprocessing

One of the main files used by the V8 engine during JavaScript preprocessing is located in the folder /v8/src of the Chromium source code and is called preparser.cc. This file has the task of recognizing and separate input tokens that will be later feed to the JavaScript processor. For example, a main goal of a JavaScript preprocessor is the ability of recognizing JavaScript statements (i.e. all the core statements defined by the current ECMAScript specifications). V8 does the following:

PreParser::SourceElements PreParser::ParseSourceElements(int end_token,
                                                         bool* ok) {
  // SourceElements ::
  //   (Statement)* <end_token>

  while (peek() != end_token) {
    ParseStatement(CHECK_OK);
  }
  return kUnknownSourceElements;
}

The ParseStatement() method, which does all the work, is defined later in the source:

PreParser::Statement PreParser::ParseStatement(bool* ok) {
  // Statement ::
  //   Block
  //   VariableStatement
  //   EmptyStatement
  //   ExpressionStatement
  //   IfStatement
  //   IterationStatement
  //   ContinueStatement
  //   BreakStatement
  //   ReturnStatement
  //   WithStatement
  //   LabelledStatement
  //   SwitchStatement
  //   ThrowStatement
  //   TryStatement
  //   DebuggerStatement

  // Note: Since labels can only be used by 'break' and 'continue'
  // statements, which themselves are only valid within blocks,
  // iterations or 'switch' statements (i.e., BreakableStatements),
  // labels can be simply ignored in all other cases; except for
  // trivial labeled break statements 'label: break label' which is
  // parsed into an empty statement.

  // Keep the source position of the statement
  switch (peek()) {
    case i::Token::LBRACE:
      return ParseBlock(ok);

    case i::Token::CONST:
    case i::Token::VAR:
      return ParseVariableStatement(ok);

    case i::Token::SEMICOLON:
      Next();
      return kUnknownStatement;

    case i::Token::IF:
      return  ParseIfStatement(ok);

    case i::Token::DO:
      return ParseDoWhileStatement(ok);

    case i::Token::WHILE:
      return ParseWhileStatement(ok);

    case i::Token::FOR:
      return ParseForStatement(ok);

    case i::Token::CONTINUE:
      return ParseContinueStatement(ok);

    case i::Token::BREAK:
      return ParseBreakStatement(ok);

    case i::Token::RETURN:
      return ParseReturnStatement(ok);

    case i::Token::WITH:
      return ParseWithStatement(ok);

    case i::Token::SWITCH:
      return ParseSwitchStatement(ok);

    case i::Token::THROW:
      return ParseThrowStatement(ok);

    case i::Token::TRY:
      return ParseTryStatement(ok);

    case i::Token::FUNCTION:
      return ParseFunctionDeclaration(ok);

    case i::Token::NATIVE:
      return ParseNativeDeclaration(ok);

    case i::Token::DEBUGGER:
      return ParseDebuggerStatement(ok);

    default:
      return ParseExpressionOrLabelledStatement(ok);
  }
}

As you can see, the above method performs a sequential check using a switch statement. On each step, which corresponds to a different input token, an appropriate method is called on each token, depending on the type of JavaScript component encountered. Note that this is a one-way check, because each case statement terminates with a return statement so that the break statement is not required.

Leave a Reply

Note: Only a member of this blog may post a comment.