Lexical Analyser Phase and Token Generation Process


Explore the key concepts of the lexical analyser phase in programming, including token generation, lexemes, patterns, and interaction with symbol tables. Learn how the lexical analyser reads characters in the source program, groups them into meaningful sequences called lexemes, and produces tokens. Dive into examples of tokens, specifications of tokens, alphabets, and strings. Discover the tasks involved in lexical analysis and the crucial role it plays in the compilation process.

  • Lexical Analyser
  • Token Generation
  • Lexemes
  • Patterns
  • Symbol Tables


Presentation Transcript


  1. Lexical analyser phase Saja Almodhafar 7/11/2021 3

  2. Lexical analyzer: The lexical analyzer reads the stream of characters making up the source program and groups the characters into meaningful sequences called lexemes. For each lexeme, the lexical analyzer outputs a token of the form <token-name, attribute-value>, where token-name is an abstract symbol that is used during syntax analysis and attribute-value points to an entry in the symbol table for this token.
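The token form described on this slide can be sketched in Python. This is a minimal illustration under assumptions, not the presentation's own implementation: the token names (`id`, `number`), the regular expressions, and the `tokenize` function are hypothetical, and storing a number's value directly as its attribute is a simplification.

```python
import re

# Hypothetical token patterns; names and regexes are illustrative assumptions.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id",     r"[A-Za-z_]\w*"),
    ("op",     r"[=+\-*/]"),
    ("skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symbol_table) for a source string."""
    symbol_table = []   # one entry per distinct identifier
    tokens = []         # list of (token-name, attribute-value) pairs
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "skip":
            continue                      # whitespace produces no token
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            # attribute-value is the 1-based symbol-table position
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "number":
            # assumption: keep the numeric value itself as the attribute
            tokens.append(("number", int(lexeme)))
        else:
            tokens.append((lexeme, None))  # operators need no attribute
    return tokens, symbol_table
```

For example, `tokenize("position = initial + rate * 60")` yields the tokens `("id", 1)`, `("=", None)`, `("id", 2)`, `("+", None)`, `("id", 3)`, `("*", None)`, `("number", 60)` together with the symbol table `["position", "initial", "rate"]`.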

  3. Interaction between the lexical analyser and the parser

  4. Lexical analyser tasks: 1. Deleting comments and white space. 2. Correlating error messages with line numbers. 3. Keeping a copy of the source code and error messages (in some compilers). 4. Producing tokens. 5. Interacting with the symbol table.
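The first two tasks (deleting comments and white space while keeping line numbers for error messages) can be sketched as follows. The function name `strip_comments_and_spaces` is hypothetical, and the sketch assumes `//` line comments and no string literals containing `//`; a real lexer would have to handle both more carefully.

```python
def strip_comments_and_spaces(source):
    """Return (line_number, code) pairs with comments and blank lines removed."""
    result = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        # Assumption: everything after "//" on a line is a comment.
        code = line.split("//", 1)[0].strip()
        if code:
            # The original line number is kept so that later error
            # messages can still point at the right source line.
            result.append((line_number, code))
    return result
```

Called on the three-line input `"x = 1 // set x\n\n  y = 2\n"`, this returns `[(1, "x = 1"), (3, "y = 2")]`: the comment and the blank line are gone, but `y = 2` is still reported as line 3.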

  5. Tokens, Patterns, and Lexemes: A token is a pair consisting of a token name and an optional attribute value: <token-name, attribute-value>. A pattern is a description of the form that the lexemes of a token may take. For example, the pattern of a keyword is just the sequence of characters that forms the keyword. A lexeme is a sequence of characters in the source program that matches the pattern of a token and is identified by the lexical analyser as an instance of that token.
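The relationship between the three terms can be illustrated with regular expressions standing in for patterns. The token names and patterns below are assumptions chosen for illustration, and `classify` is a hypothetical helper; the keyword pattern is listed first so that it takes priority over the more general identifier pattern.

```python
import re

# Hypothetical patterns: each token name maps to the form its lexemes may take.
PATTERNS = {
    "if":     r"if",                       # a keyword's pattern is the word itself
    "id":     r"[A-Za-z_][A-Za-z0-9_]*",   # letter or underscore, then letters/digits
    "number": r"[0-9]+",                   # one or more digits
}

def classify(lexeme):
    """Return the name of the first token whose pattern the lexeme matches."""
    for token_name, pattern in PATTERNS.items():
        if re.fullmatch(pattern, lexeme):
            return token_name
    return None  # no pattern matches: not a valid lexeme here
```

Here `classify("if")` returns `"if"`, `classify("count")` returns `"id"`, and `classify("42")` returns `"number"`, so `count` is a lexeme that is an instance of the token `id`.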

  6. Examples of tokens

  7. Specifications of Tokens: Alphabets. An alphabet is any finite set of symbols. {0,1} is the binary alphabet, {0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F} is the hexadecimal alphabet, and {a-z, A-Z} is the alphabet of English letters.

  8. Specifications of Tokens: Strings. Any finite sequence of symbols from an alphabet is called a string. The length of a string is the total number of symbol occurrences in it, e.g., the length of the string tutorialspoint is 14, denoted |tutorialspoint| = 14. A string with no symbols, i.e. a string of length zero, is known as the empty string and is denoted by ε (epsilon).
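As a small illustration of alphabets and strings, Python sets can model alphabets and `len` gives the string length. The names `BINARY`, `HEX`, and `is_string_over` are hypothetical, introduced only for this sketch.

```python
BINARY = set("01")
HEX = set("0123456789ABCDEF")

def is_string_over(s, alphabet):
    """True if every symbol of s belongs to the given alphabet."""
    return all(ch in alphabet for ch in s)

# |tutorialspoint| = 14, and the empty string has length 0.
length = len("tutorialspoint")
empty_length = len("")
```

Note that `is_string_over("", BINARY)` is `True`: the empty string is a string over every alphabet, since it contains no symbol that could fall outside it.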

  9. Specifications of Tokens: Language. A language is a set of strings formed from the symbols of an alphabet.

  10. Operations on languages

  11. Example: Let L be the set of letters {A, B, ..., Z, a, b, ..., z} and let D be the set of digits {0, 1, ..., 9}. L ∪ D is the set of letters and digits, i.e. 62 strings of length one. LD is the set of 520 strings of length two, each consisting of one letter followed by one digit. L^4 is the set of all four-letter strings. L* is the set of all strings of letters, including ε, the empty string. D+ is the set of all strings of one or more digits.
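The counts in this example can be checked with a short Python sketch that treats languages as sets of strings. The helper names `concat` and `power` are hypothetical. The closures L* and D+ are infinite sets, so they cannot be enumerated this way; only the finite operations are shown.

```python
import string
from itertools import product

L = set(string.ascii_letters)   # 52 letters: A-Z and a-z
D = set(string.digits)          # 10 digits: 0-9

def concat(A, B):
    """Concatenation AB: every string of A followed by every string of B."""
    return {a + b for a in A for b in B}

def power(A, n):
    """A^n: all concatenations of n strings from A; A^0 is {""}."""
    return {"".join(p) for p in product(A, repeat=n)}

union_size = len(L | D)         # strings of length one in L ∪ D
concat_size = len(concat(L, D)) # one letter followed by one digit
```

With these definitions `union_size` is 62 and `concat_size` is 520, matching the figures above, and `power(L, 0)` is the set containing only the empty string.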

  12. Special Symbols: A typical high-level language contains the following symbols:
