package xml

import "github.com/tdewolff/parse/xml"

Package xml is an XML 1.0 lexer following the specification at http://www.w3.org/TR/xml/.

Functions

func EscapeAttrVal

func EscapeAttrVal(buf *[]byte, b []byte) []byte

EscapeAttrVal returns the escaped attribute value bytes without quotes.
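
To illustrate what escaping an attribute value involves, here is a minimal standalone sketch. It does not call this package: escapeAttr is a hypothetical helper, and the set of replaced characters (&, < and ") is an assumption based on the XML 1.0 rules for double-quoted attribute values, not a description of how EscapeAttrVal works internally.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeAttr is a hypothetical helper, not part of this package.
// It replaces the characters that may not appear literally inside a
// double-quoted XML attribute value and returns the result without
// surrounding quotes.
func escapeAttr(s string) string {
	// strings.Replacer scans the input once, so the "&" in an emitted
	// entity such as "&lt;" is never escaped a second time.
	r := strings.NewReplacer("&", "&amp;", "<", "&lt;", `"`, "&quot;")
	return r.Replace(s)
}

func main() {
	fmt.Println(escapeAttr(`a < b & "c"`))
	// prints: a &lt; b &amp; &quot;c&quot;
}
```

The caller is left to add the quotes, which matches the "without quotes" wording above.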

func EscapeCDATAVal

func EscapeCDATAVal(buf *[]byte, b []byte) ([]byte, bool)

EscapeCDATAVal returns the escaped text bytes.

Types

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer is the state for the lexer.

func NewLexer

func NewLexer(r io.Reader) *Lexer

NewLexer returns a new Lexer for a given io.Reader.

Example

Code:

{
	l := NewLexer(bytes.NewBufferString("<span class='user'>John Doe</span>"))
	out := ""
	for {
		tt, data, n := l.Next()
		if tt == ErrorToken {
			break
		}
		if tt == StartTagToken {
			out += "<"
		} else if tt == EndTagToken {
			out += "</"
		}
		out += string(data)
		if tt == StartTagToken {
			out += " "
		} else if tt == EndTagToken {
			out += ">"
		} else if tt == AttributeToken {
			out += "=" + string(l.AttrVal())
		}
		l.Free(n)
	}
	fmt.Println(out)
	// Output: <span class='user'>John Doe</span>
}

Output:

<span class='user'>John Doe</span>

func (*Lexer) AttrVal

func (l *Lexer) AttrVal() []byte

AttrVal returns the attribute value when an AttributeToken was returned from Next.

func (*Lexer) Err

func (l *Lexer) Err() error

Err returns the error encountered during lexing. This is often io.EOF, but other errors can be returned as well.

func (*Lexer) Free

func (l *Lexer) Free(n int)

Free frees up n bytes from previously shifted tokens, where n is typically the length returned by Next.

func (*Lexer) Next

func (l *Lexer) Next() (TokenType, []byte, int)

Next returns the next token. It returns ErrorToken when an error was encountered; use Err() to retrieve the error.

type TokenType

type TokenType uint32

TokenType determines the type of token, e.g. a start tag or a comment.

const (
	ErrorToken TokenType = iota // extra token when errors occur
	CommentToken
	DOCTYPEToken
	CDATAToken
	StartTagToken
	StartTagPIToken
	StartTagCloseToken
	StartTagCloseVoidToken
	StartTagClosePIToken
	EndTagToken
	AttributeToken
	TextToken
)

TokenType values.

func (TokenType) String

func (tt TokenType) String() string

String returns the string representation of a TokenType.

Source Files

lex.go util.go

Version: v1.1.0 (latest)
Published: Nov 2, 2015
Platform: linux/amd64
Imports: 4 packages