package xml

import "github.com/tdewolff/parse/xml"

Package xml is an XML 1.0 lexer, following the specification at http://www.w3.org/TR/xml/.

Index

Examples

Functions

func EscapeAttrVal

func EscapeAttrVal(buf *[]byte, b []byte) []byte

EscapeAttrVal returns the escaped attribute value bytes, without surrounding quotes.
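The package's exact escaping rules are internal, but the core idea of attribute-value escaping can be sketched with the standard library alone. The helper name and replacement set below are illustrative, not the package's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// attrReplacer maps the characters that may not appear literally inside
// a quoted XML attribute value to their entity references.
var attrReplacer = strings.NewReplacer(
	"&", "&amp;",
	"<", "&lt;",
	`"`, "&quot;",
)

// escapeAttrVal is an illustrative stand-in for xml.EscapeAttrVal.
func escapeAttrVal(b []byte) []byte {
	return []byte(attrReplacer.Replace(string(b)))
}

func main() {
	fmt.Println(string(escapeAttrVal([]byte(`a<b & "c"`))))
	// prints: a&lt;b &amp; &quot;c&quot;
}
```

The real function additionally takes a `buf *[]byte` parameter so the caller can reuse a scratch buffer across calls and avoid allocations; the sketch omits that for clarity.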

func EscapeCDATAVal

func EscapeCDATAVal(buf *[]byte, b []byte) ([]byte, bool)

EscapeCDATAVal returns the escaped text bytes.
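Conceptually, escaping CDATA means converting the section's raw content into ordinary character data with entity references. A minimal sketch, assuming the boolean reports whether the escaped text form is worth using instead of keeping the `<![CDATA[...]]>` wrapper (the heuristic and names here are assumptions, not the package's implementation):

```go
package main

import (
	"bytes"
	"fmt"
)

// escapeCDATAVal is an illustrative stand-in for xml.EscapeCDATAVal: it
// escapes the characters that are special in XML character data and
// reports whether the escaped form is no longer than keeping the
// original content inside a <![CDATA[...]]> wrapper (12 extra bytes).
func escapeCDATAVal(b []byte) ([]byte, bool) {
	escaped := bytes.ReplaceAll(b, []byte("&"), []byte("&amp;")) // must run first to avoid double-escaping
	escaped = bytes.ReplaceAll(escaped, []byte("<"), []byte("&lt;"))
	escaped = bytes.ReplaceAll(escaped, []byte(">"), []byte("&gt;"))
	return escaped, len(escaped) <= len(b)+len("<![CDATA[]]>")
}

func main() {
	text, ok := escapeCDATAVal([]byte("x < y"))
	fmt.Println(string(text), ok)
	// prints: x &lt; y true
}
```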

Types

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer is the state for the lexer.

func NewLexer

func NewLexer(r io.Reader) *Lexer

NewLexer returns a new Lexer for a given io.Reader.

Example

Code:

{
	l := NewLexer(bytes.NewBufferString("<span class='user'>John Doe</span>"))
	out := ""
	for {
		tt, data := l.Next()
		if tt == ErrorToken {
			break
		}
		if tt == StartTagToken {
			out += "<"
		} else if tt == EndTagToken {
			out += "</"
		}
		out += string(data)
		if tt == StartTagToken {
			out += " "
		} else if tt == EndTagToken {
			out += ">"
		} else if tt == AttributeToken {
			out += "=" + string(l.AttrVal())
		}
	}
	fmt.Println(out)
	// Output: <span class='user'>John Doe</span>
}

Output:

<span class='user'>John Doe</span>

func (*Lexer) AttrVal

func (l *Lexer) AttrVal() []byte

AttrVal returns the attribute value when an AttributeToken was returned from Next.

func (Lexer) Err

func (l Lexer) Err() error

Err returns the error encountered during lexing. This is often io.EOF, but other errors can be returned as well.

func (Lexer) IsEOF

func (l Lexer) IsEOF() bool

IsEOF returns true when the lexer has encountered EOF and has therefore loaded the last buffer into memory.

func (*Lexer) Next

func (l *Lexer) Next() (TokenType, []byte)

Next returns the next token type and its data. It returns ErrorToken when an error is encountered; use Err to retrieve the underlying error.

type Token

type Token struct {
	TokenType
	Data    []byte
	AttrVal []byte
}

Token is a single token with its type, data, and attribute value (if any).

type TokenBuffer

type TokenBuffer struct {
	// contains filtered or unexported fields
}

TokenBuffer is a buffer that allows for token look-ahead.

func NewTokenBuffer

func NewTokenBuffer(l *Lexer) *TokenBuffer

NewTokenBuffer returns a new TokenBuffer.

func (*TokenBuffer) Peek

func (z *TokenBuffer) Peek(i int) *Token

Peek returns the ith element and possibly does an allocation. Peeking past an error will panic.

func (*TokenBuffer) Shift

func (z *TokenBuffer) Shift() *Token

Shift returns the first element and advances position.
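TokenBuffer's internals are unexported, but a look-ahead buffer with this Peek/Shift contract can be sketched over a plain slice. The lexer is replaced here by a fixed token stream, and all names are illustrative:

```go
package main

import "fmt"

// token mirrors the shape of a lexed token for this sketch.
type token struct {
	typ  string
	data string
}

// tokenBuffer is an illustrative look-ahead buffer: Peek(i) exposes the
// ith upcoming token without consuming it, Shift consumes the first one.
type tokenBuffer struct {
	src []token // stand-in for the lexer feeding the buffer
	buf []token // tokens read ahead but not yet shifted
}

// fill pulls tokens from the source until index n is available (or the
// source is exhausted); this is where Peek may allocate.
func (z *tokenBuffer) fill(n int) {
	for len(z.buf) <= n && len(z.src) > 0 {
		z.buf = append(z.buf, z.src[0])
		z.src = z.src[1:]
	}
}

// Peek returns the ith upcoming token, or nil past the end of the stream.
func (z *tokenBuffer) Peek(i int) *token {
	z.fill(i)
	if i >= len(z.buf) {
		return nil
	}
	return &z.buf[i]
}

// Shift returns the first token and advances the position.
func (z *tokenBuffer) Shift() *token {
	t := z.Peek(0)
	if t != nil {
		z.buf = z.buf[1:]
	}
	return t
}

func main() {
	z := &tokenBuffer{src: []token{{"StartTag", "span"}, {"Text", "hi"}, {"EndTag", "span"}}}
	fmt.Println(z.Peek(1).data) // look ahead without consuming: hi
	fmt.Println(z.Shift().data) // consumes the first token: span
}
```

The real TokenBuffer reads from a *Lexer instead of a slice and copies each token's data out of the lexer's buffer so look-ahead survives subsequent reads; the slice stand-in keeps the sketch self-contained.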

type TokenType

type TokenType uint32

TokenType determines the type of token, e.g. a start tag or an attribute.

const (
	ErrorToken TokenType = iota // extra token when errors occur
	CommentToken
	DOCTYPEToken
	CDATAToken
	StartTagToken
	StartTagPIToken
	StartTagCloseToken
	StartTagCloseVoidToken
	StartTagClosePIToken
	EndTagToken
	AttributeToken
	TextToken
)

TokenType values.

func (TokenType) String

func (tt TokenType) String() string

String returns the string representation of a TokenType.

Source Files

buffer.go lex.go util.go

Version: v1.0.0 (published Aug 14, 2015)