package xml

import "github.com/tdewolff/parse/xml"

Package xml is an XML 1.0 lexer, following the specification at http://www.w3.org/TR/xml/.

Index

Examples

Functions

func EscapeAttrVal

func EscapeAttrVal(buf *[]byte, b []byte) []byte

EscapeAttrVal returns the escaped attribute value bytes, without surrounding quotes.
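The package's exact escaping rules are internal, but the core idea of attribute-value escaping can be sketched with the standard library alone. The helper name and replacement set below are illustrative, not the package's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// attrReplacer maps the characters that may not appear literally inside
// a quoted XML attribute value to their entity references.
var attrReplacer = strings.NewReplacer(
	"&", "&amp;",
	"<", "&lt;",
	`"`, "&quot;",
)

// escapeAttrVal is an illustrative stand-in for xml.EscapeAttrVal.
func escapeAttrVal(b []byte) []byte {
	return []byte(attrReplacer.Replace(string(b)))
}

func main() {
	fmt.Println(string(escapeAttrVal([]byte(`a<b & "c"`))))
	// prints: a&lt;b &amp; &quot;c&quot;
}
```

The real function additionally takes a `buf *[]byte` parameter so the caller can reuse a scratch buffer across calls and avoid allocations; the sketch omits that for clarity.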

func EscapeCDATAVal

func EscapeCDATAVal(buf *[]byte, b []byte) ([]byte, bool)

EscapeCDATAVal returns the escaped text bytes.
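Conceptually, escaping CDATA means converting the section's raw content into ordinary character data with entity references. A minimal sketch, assuming the boolean reports whether the escaped text form is worth using instead of keeping the `<![CDATA[...]]>` wrapper (the heuristic and names here are assumptions, not the package's implementation):

```go
package main

import (
	"bytes"
	"fmt"
)

// escapeCDATAVal is an illustrative stand-in for xml.EscapeCDATAVal: it
// escapes the characters that are special in XML character data and
// reports whether the escaped form is no longer than keeping the
// original content inside a <![CDATA[...]]> wrapper (12 extra bytes).
func escapeCDATAVal(b []byte) ([]byte, bool) {
	escaped := bytes.ReplaceAll(b, []byte("&"), []byte("&amp;")) // must run first to avoid double-escaping
	escaped = bytes.ReplaceAll(escaped, []byte("<"), []byte("&lt;"))
	escaped = bytes.ReplaceAll(escaped, []byte(">"), []byte("&gt;"))
	return escaped, len(escaped) <= len(b)+len("<![CDATA[]]>")
}

func main() {
	text, ok := escapeCDATAVal([]byte("x < y"))
	fmt.Println(string(text), ok)
	// prints: x &lt; y true
}
```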

Types

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer is the state for the lexer.

func NewLexer

func NewLexer(r io.Reader) *Lexer

NewLexer returns a new Lexer for a given io.Reader.

Example

Code:

{
	l := NewLexer(bytes.NewBufferString("<span class='user'>John Doe</span>"))
	out := ""
	for {
		tt, data := l.Next()
		if tt == ErrorToken {
			break
		}
		if tt == StartTagToken {
			out += "<"
		} else if tt == EndTagToken {
			out += "</"
		}
		out += string(data)
		if tt == StartTagToken {
			out += " "
		} else if tt == EndTagToken {
			out += ">"
		} else if tt == AttributeToken {
			out += "=" + string(l.AttrVal())
		}
	}
	fmt.Println(out)
	// Output: <span class='user'>John Doe</span>
}

Output:

<span class='user'>John Doe</span>

func (*Lexer) AttrVal

func (l *Lexer) AttrVal() []byte

AttrVal returns the attribute value when an AttributeToken was returned from Next.

func (Lexer) Err

func (l Lexer) Err() error

Err returns the error encountered during lexing. This is often io.EOF, but other errors can be returned as well.

func (Lexer) IsEOF

func (l Lexer) IsEOF() bool

IsEOF returns true when the lexer has encountered EOF and has therefore loaded the last buffer into memory.

func (*Lexer) Next

func (l *Lexer) Next() (TokenType, []byte)

Next returns the next token type and its data. It returns ErrorToken when an error is encountered; use Err to retrieve the underlying error.

type Token

type Token struct {
	TokenType
	Data    []byte
	AttrVal []byte
}

Token is a single token with its type, data, and attribute value (if any).

type TokenBuffer

type TokenBuffer struct {
	// contains filtered or unexported fields
}

TokenBuffer is a buffer that allows for token look-ahead.

func NewTokenBuffer

func NewTokenBuffer(l *Lexer) *TokenBuffer

NewTokenBuffer returns a new TokenBuffer.

func (*TokenBuffer) Peek

func (z *TokenBuffer) Peek(i int) *Token

Peek returns the ith element and possibly does an allocation. Peeking past an error will panic.

func (*TokenBuffer) Shift

func (z *TokenBuffer) Shift() *Token

Shift returns the first element and advances position.
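TokenBuffer's internals are unexported, but a look-ahead buffer with this Peek/Shift contract can be sketched over a plain slice. The lexer is replaced here by a fixed token stream, and all names are illustrative:

```go
package main

import "fmt"

// token mirrors the shape of a lexed token for this sketch.
type token struct {
	typ  string
	data string
}

// tokenBuffer is an illustrative look-ahead buffer: Peek(i) exposes the
// ith upcoming token without consuming it, Shift consumes the first one.
type tokenBuffer struct {
	src []token // stand-in for the lexer feeding the buffer
	buf []token // tokens read ahead but not yet shifted
}

// fill pulls tokens from the source until index n is available (or the
// source is exhausted); this is where Peek may allocate.
func (z *tokenBuffer) fill(n int) {
	for len(z.buf) <= n && len(z.src) > 0 {
		z.buf = append(z.buf, z.src[0])
		z.src = z.src[1:]
	}
}

// Peek returns the ith upcoming token, or nil past the end of the stream.
func (z *tokenBuffer) Peek(i int) *token {
	z.fill(i)
	if i >= len(z.buf) {
		return nil
	}
	return &z.buf[i]
}

// Shift returns the first token and advances the position.
func (z *tokenBuffer) Shift() *token {
	t := z.Peek(0)
	if t != nil {
		z.buf = z.buf[1:]
	}
	return t
}

func main() {
	z := &tokenBuffer{src: []token{{"StartTag", "span"}, {"Text", "hi"}, {"EndTag", "span"}}}
	fmt.Println(z.Peek(1).data) // look ahead without consuming: hi
	fmt.Println(z.Shift().data) // consumes the first token: span
}
```

The real TokenBuffer reads from a *Lexer instead of a slice and copies each token's data out of the lexer's buffer so look-ahead survives subsequent reads; the slice stand-in keeps the sketch self-contained.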

type TokenType

type TokenType uint32

TokenType determines the type of token, e.g. a start tag or an attribute.

const (
	ErrorToken TokenType = iota // extra token when errors occur
	CommentToken
	DOCTYPEToken
	CDATAToken
	StartTagToken
	StartTagPIToken
	StartTagCloseToken
	StartTagCloseVoidToken
	StartTagClosePIToken
	EndTagToken
	AttributeToken
	TextToken
)

TokenType values.

func (TokenType) String

func (tt TokenType) String() string

String returns the string representation of a TokenType.

Source Files

buffer.go lex.go util.go

Version: v1.0.0 (published Aug 14, 2015)