v2 – github.com/tdewolff/parse/v2

package parse

import "github.com/tdewolff/parse/v2"

Package parse contains a collection of parsers for various formats in its subpackages.

Constants

const PageSize = 4096

Variables

var DataURIEncodingTable = [256]bool{ /* 256 elements not displayed */

}

DataURIEncodingTable is a charmap for which characters need escaping in the Data URI encoding scheme. Escape only non-printable characters, unicode and %, #, &. IE11 additionally requires encoding of \, [, ], ", <, >, `, {, }, |, ^, which is not required by Chrome, Firefox, Opera, Edge, Safari, Yandex. To pass the HTML validator, restricted URL characters must be escaped: non-printable characters, space, <, >, #, %, ".

var ErrBadDataURI = errors.New("not a data URI")

ErrBadDataURI is returned by DataURI when the byte slice does not start with 'data:' or is too short.

var URLEncodingTable = [256]bool{ /* 256 elements not displayed */

}

URLEncodingTable is a charmap for which characters need escaping in the URL encoding scheme.

Functions

func Copy

func Copy(src []byte) (dst []byte)

Copy returns a copy of the given byte slice.

func DataURI

func DataURI(dataURI []byte) ([]byte, []byte, error)

DataURI parses the given data URI and returns the mediatype, the data, and an error.
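The following is a minimal sketch of the kind of split DataURI performs, written with the standard library only. It is not the package's implementation: parseDataURI and errBadDataURI are hypothetical names, only the ";base64" form is decoded, and non-base64 data is returned as-is.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"errors"
	"fmt"
)

// errBadDataURI plays the role of ErrBadDataURI (hypothetical name).
var errBadDataURI = errors.New("not a data URI")

// parseDataURI is a hypothetical stand-in for DataURI.
// It splits "data:[<mediatype>][;base64],<data>" into mediatype and data.
func parseDataURI(uri []byte) ([]byte, []byte, error) {
	if !bytes.HasPrefix(uri, []byte("data:")) {
		return nil, nil, errBadDataURI
	}
	rest := uri[len("data:"):]
	comma := bytes.IndexByte(rest, ',')
	if comma < 0 {
		return nil, nil, errBadDataURI
	}
	mediatype, data := rest[:comma], rest[comma+1:]
	if bytes.HasSuffix(mediatype, []byte(";base64")) {
		// Assumption: the ";base64" marker is stripped from the mediatype.
		mediatype = mediatype[:len(mediatype)-len(";base64")]
		decoded := make([]byte, base64.StdEncoding.DecodedLen(len(data)))
		n, err := base64.StdEncoding.Decode(decoded, data)
		if err != nil {
			return nil, nil, err
		}
		data = decoded[:n]
	}
	return mediatype, data, nil
}

func main() {
	mediatype, data, err := parseDataURI([]byte("data:text/plain;base64,SGVsbG8="))
	fmt.Printf("%s %s %v\n", mediatype, data, err) // text/plain Hello <nil>
}
```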

func DecodeURL

func DecodeURL(b []byte) []byte

DecodeURL decodes a URL encoded using the URL encoding scheme.

func Dimension

func Dimension(b []byte) (int, int)

Dimension parses a byte-slice and returns the length of the number and its unit.

func EncodeURL

func EncodeURL(b []byte, table [256]bool) []byte

EncodeURL encodes bytes using the URL encoding scheme.
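To illustrate how a [256]bool table can drive percent-encoding, here is a small standard-library sketch. encodeWithTable is a hypothetical helper, not the package's EncodeURL, and the table built in main is a toy table that is far smaller than URLEncodingTable or DataURIEncodingTable.

```go
package main

import "fmt"

// encodeWithTable is a hypothetical stand-in for EncodeURL:
// every byte c with table[c] == true is written as %XX.
func encodeWithTable(b []byte, table [256]bool) []byte {
	const hex = "0123456789ABCDEF"
	out := make([]byte, 0, len(b))
	for _, c := range b {
		if table[c] {
			out = append(out, '%', hex[c>>4], hex[c&15])
		} else {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	var table [256]bool // toy table (assumption), escapes only space and %
	table[' '] = true
	table['%'] = true
	fmt.Println(string(encodeWithTable([]byte("a b%"), table))) // a%20b%25
}
```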

func EqualFold

func EqualFold(s, targetLower []byte) bool

EqualFold returns true when s matches targetLower case-insensitively (targetLower must be lowercase).
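The lowercase requirement on targetLower can be seen in a minimal sketch of such a comparison. equalFoldLower is a hypothetical helper that assumes ASCII-only folding; it is not the package's code.

```go
package main

import "fmt"

// equalFoldLower is a hypothetical stand-in for EqualFold. Only s is folded
// to lowercase (ASCII A-Z), so targetLower has to be lowercase already.
func equalFoldLower(s, targetLower []byte) bool {
	if len(s) != len(targetLower) {
		return false
	}
	for i, want := range targetLower {
		c := s[i]
		if 'A' <= c && c <= 'Z' {
			c += 'a' - 'A'
		}
		if c != want {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(equalFoldLower([]byte("DocType"), []byte("doctype"))) // true
	fmt.Println(equalFoldLower([]byte("doctype"), []byte("DOCTYPE"))) // false
}
```

The second call fails because the target is not lowercase, which is why the precondition is stated in the signature's parameter name.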

func IsAllWhitespace

func IsAllWhitespace(b []byte) bool

IsAllWhitespace returns true when the entire byte slice consists of space, \n, \r, \t, \f.

func IsNewline

func IsNewline(c byte) bool

IsNewline returns true for \n, \r.

func IsWhitespace

func IsWhitespace(c byte) bool

IsWhitespace returns true for space, \n, \r, \t, \f.

func Mediatype

func Mediatype(b []byte) ([]byte, map[string]string)

Mediatype parses a given mediatype and splits the mimetype from the parameters. It works similarly to mime.ParseMediaType but is faster.
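For reference, this is the standard library call that Mediatype is compared against. The sketch only shows mime.ParseMediaType; the wrapper name splitMediatype is hypothetical, and whether Mediatype agrees with it in every edge case is not stated here.

```go
package main

import (
	"fmt"
	"mime"
)

// splitMediatype is a hypothetical wrapper around the standard library,
// shown only to illustrate the mimetype/parameter split.
func splitMediatype(s string) (string, map[string]string, error) {
	return mime.ParseMediaType(s)
}

func main() {
	mimetype, params, err := splitMediatype("text/html; charset=utf-8")
	fmt.Println(mimetype, params, err) // text/html map[charset:utf-8] <nil>
}
```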

func Number

func Number(b []byte) int

Number returns the number of bytes that parse as a number of the regex format (+|-)?([0-9]+(\.[0-9]+)?|\.[0-9]+)((e|E)(+|-)?[0-9]+)?.
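The regex above can be tried directly with the standard regexp package. The sketch below is an assumption-laden illustration: numberLen is a hypothetical helper that anchors the documented pattern at the start of the input and returns the matched length, and it may differ from Number on inputs the regex description does not cover.

```go
package main

import (
	"fmt"
	"regexp"
)

// numberRe is the documented pattern, anchored at the start of the input.
var numberRe = regexp.MustCompile(`^[+-]?([0-9]+(\.[0-9]+)?|\.[0-9]+)([eE][+-]?[0-9]+)?`)

// numberLen is a hypothetical stand-in for Number: it returns how many
// leading bytes of b match the pattern, or 0 when none do.
func numberLen(b []byte) int {
	return len(numberRe.Find(b))
}

func main() {
	fmt.Println(numberLen([]byte("1.5e3px"))) // 5
	fmt.Println(numberLen([]byte(".5")))      // 2
	fmt.Println(numberLen([]byte("px")))      // 0
}
```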

func Position

func Position(r io.Reader, offset int) (line, col int, context string)

Position returns the line and column number for a certain position in a file. It is useful for recovering the position in a file that caused an error. It only treats \n, \r, and \r\n as newlines, which might be different from some languages also recognizing \f, \u2028, and \u2029 to be newlines.
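The newline rule described above can be sketched as follows. lineCol is a hypothetical helper, not the package's Position: it works on a byte slice instead of an io.Reader, counts columns in bytes, omits the context string, and assumes 1-based line and column numbers.

```go
package main

import "fmt"

// lineCol is a hypothetical stand-in for Position. It treats \n, \r, and
// \r\n as a single newline each. Lines and columns are 1-based (assumption).
func lineCol(b []byte, offset int) (line, col int) {
	line, col = 1, 1
	for i := 0; i < offset && i < len(b); i++ {
		switch b[i] {
		case '\n':
			line++
			col = 1
		case '\r':
			// \r\n counts once: skip the \n that follows.
			if i+1 < len(b) && b[i+1] == '\n' {
				i++
			}
			line++
			col = 1
		default:
			col++
		}
	}
	return line, col
}

func main() {
	src := []byte("ab\r\ncd\nef")
	fmt.Println(lineCol(src, 5)) // 2 2
	fmt.Println(lineCol(src, 7)) // 3 1
}
```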

func Printable

func Printable(r rune) string

Printable returns a printable string for the given rune.

func QuoteEntity

func QuoteEntity(b []byte) (quote byte, n int)

QuoteEntity parses the given byte slice and returns the quote that got matched (' or ") and its entity length. TODO: deprecated

func ReplaceEntities

func ReplaceEntities(b []byte, entitiesMap map[string][]byte, revEntitiesMap map[byte][]byte) []byte

ReplaceEntities replaces all occurrences of entities (such as &quot;) with their respective unencoded bytes.

func ReplaceMultipleWhitespace

func ReplaceMultipleWhitespace(b []byte) []byte

ReplaceMultipleWhitespace replaces character series of space, \n, \t, \f, \r with a single space or newline (when the series contained a \n or \r).
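A minimal sketch of this collapsing rule, using only the standard library. collapseWhitespace and isWhitespace are hypothetical helpers; unlike the package function, this version allocates a new slice, and whether the original modifies its argument in place is not covered here.

```go
package main

import "fmt"

// isWhitespace mirrors the documented set: space, \n, \r, \t, \f.
func isWhitespace(c byte) bool {
	return c == ' ' || c == '\n' || c == '\r' || c == '\t' || c == '\f'
}

// collapseWhitespace is a hypothetical stand-in for ReplaceMultipleWhitespace.
// Each whitespace run becomes one space, or one newline if the run
// contained \n or \r.
func collapseWhitespace(b []byte) []byte {
	out := make([]byte, 0, len(b))
	for i := 0; i < len(b); {
		if !isWhitespace(b[i]) {
			out = append(out, b[i])
			i++
			continue
		}
		repl := byte(' ')
		for i < len(b) && isWhitespace(b[i]) {
			if b[i] == '\n' || b[i] == '\r' {
				repl = '\n'
			}
			i++
		}
		out = append(out, repl)
	}
	return out
}

func main() {
	fmt.Printf("%q\n", collapseWhitespace([]byte("a  \t b")))   // "a b"
	fmt.Printf("%q\n", collapseWhitespace([]byte("a \r\n  b"))) // "a\nb"
}
```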

func ReplaceMultipleWhitespaceAndEntities

func ReplaceMultipleWhitespaceAndEntities(b []byte, entitiesMap map[string][]byte, revEntitiesMap map[byte][]byte) []byte

ReplaceMultipleWhitespaceAndEntities is a combination of ReplaceMultipleWhitespace and ReplaceEntities. It is faster than executing both sequentially.

func ToLower

func ToLower(src []byte) []byte

ToLower converts all characters in the byte slice from A-Z to a-z.

func TrimWhitespace

func TrimWhitespace(b []byte) []byte

TrimWhitespace removes any leading and trailing whitespace characters.

Types

type BinaryFileReader

type BinaryFileReader struct {
	Endianness binary.ByteOrder
	// contains filtered or unexported fields
}

func NewBinaryFileReader

func NewBinaryFileReader(f *os.File, chunk int) (*BinaryFileReader, error)

func (*BinaryFileReader) BufferLen

func (r *BinaryFileReader) BufferLen() int

BufferLen returns the length of the buffer.

func (*BinaryFileReader) Len

func (r *BinaryFileReader) Len() uint64

Len returns the remaining length of the buffer.

func (*BinaryFileReader) Offset

func (r *BinaryFileReader) Offset() uint64

Offset returns the offset of the buffer.

func (*BinaryFileReader) Pos

func (r *BinaryFileReader) Pos() uint64

Pos returns the reader's position.

func (*BinaryFileReader) Read

func (r *BinaryFileReader) Read(b []byte) (int, error)

Read complies with io.Reader.

func (*BinaryFileReader) ReadByte

func (r *BinaryFileReader) ReadByte() byte

ReadByte reads a single byte.

func (*BinaryFileReader) ReadBytes

func (r *BinaryFileReader) ReadBytes(n int) []byte

ReadBytes reads n bytes.

func (*BinaryFileReader) ReadInt16

func (r *BinaryFileReader) ReadInt16() int16

ReadInt16 reads an int16.

func (*BinaryFileReader) ReadInt32

func (r *BinaryFileReader) ReadInt32() int32

ReadInt32 reads an int32.

func (*BinaryFileReader) ReadInt64

func (r *BinaryFileReader) ReadInt64() int64

ReadInt64 reads an int64.

func (*BinaryFileReader) ReadInt8

func (r *BinaryFileReader) ReadInt8() int8

ReadInt8 reads an int8.

func (*BinaryFileReader) ReadString

func (r *BinaryFileReader) ReadString(n int) string

ReadString reads a string of length n.

func (*BinaryFileReader) ReadUint16

func (r *BinaryFileReader) ReadUint16() uint16

ReadUint16 reads a uint16.

func (*BinaryFileReader) ReadUint32

func (r *BinaryFileReader) ReadUint32() uint32

ReadUint32 reads a uint32.

func (*BinaryFileReader) ReadUint64

func (r *BinaryFileReader) ReadUint64() uint64

ReadUint64 reads a uint64.

func (*BinaryFileReader) ReadUint8

func (r *BinaryFileReader) ReadUint8() uint8

ReadUint8 reads a uint8.

func (*BinaryFileReader) Seek

func (r *BinaryFileReader) Seek(pos uint64) error

Seek sets the reader position in the buffer.

type BinaryReader

type BinaryReader struct {
	Endianness binary.ByteOrder
	// contains filtered or unexported fields
}

BinaryReader is a binary big endian file format reader.

func NewBinaryReader

func NewBinaryReader(buf []byte) *BinaryReader

NewBinaryReader returns a big endian binary file format reader.

func NewBinaryReaderLE

func NewBinaryReaderLE(buf []byte) *BinaryReader

NewBinaryReaderLE returns a little endian binary file format reader.

func (*BinaryReader) EOF

func (r *BinaryReader) EOF() bool

EOF returns true if we reached the end-of-file.

func (*BinaryReader) Len

func (r *BinaryReader) Len() uint32

Len returns the remaining length of the buffer.

func (*BinaryReader) Pos

func (r *BinaryReader) Pos() uint32

Pos returns the reader's position.

func (*BinaryReader) Read

func (r *BinaryReader) Read(b []byte) (int, error)

Read complies with io.Reader.

func (*BinaryReader) ReadByte

func (r *BinaryReader) ReadByte() byte

ReadByte reads a single byte.

func (*BinaryReader) ReadBytes

func (r *BinaryReader) ReadBytes(n uint32) []byte

ReadBytes reads n bytes.

func (*BinaryReader) ReadInt16

func (r *BinaryReader) ReadInt16() int16

ReadInt16 reads an int16.

func (*BinaryReader) ReadInt32

func (r *BinaryReader) ReadInt32() int32

ReadInt32 reads an int32.

func (*BinaryReader) ReadInt64

func (r *BinaryReader) ReadInt64() int64

ReadInt64 reads an int64.

func (*BinaryReader) ReadInt8

func (r *BinaryReader) ReadInt8() int8

ReadInt8 reads an int8.

func (*BinaryReader) ReadString

func (r *BinaryReader) ReadString(n uint32) string

ReadString reads a string of length n.

func (*BinaryReader) ReadUint16

func (r *BinaryReader) ReadUint16() uint16

ReadUint16 reads a uint16.

func (*BinaryReader) ReadUint32

func (r *BinaryReader) ReadUint32() uint32

ReadUint32 reads a uint32.

func (*BinaryReader) ReadUint64

func (r *BinaryReader) ReadUint64() uint64

ReadUint64 reads a uint64.

func (*BinaryReader) ReadUint8

func (r *BinaryReader) ReadUint8() uint8

ReadUint8 reads a uint8.

func (*BinaryReader) Seek

func (r *BinaryReader) Seek(pos uint32) error

Seek sets the reader position in the buffer.

type BinaryReader2

type BinaryReader2 struct {
	Endian binary.ByteOrder
	// contains filtered or unexported fields
}

func NewBinaryReader2

func NewBinaryReader2(f IBinaryReader) *BinaryReader2

func NewBinaryReader2Bytes

func NewBinaryReader2Bytes(data []byte) *BinaryReader2

func NewBinaryReader2File

func NewBinaryReader2File(filename string) (*BinaryReader2, error)

func NewBinaryReader2Mmap

func NewBinaryReader2Mmap(filename string) (*BinaryReader2, error)

func NewBinaryReader2Reader

func NewBinaryReader2Reader(r io.Reader, n int64) (*BinaryReader2, error)

func (*BinaryReader2) Close

func (r *BinaryReader2) Close() error

func (*BinaryReader2) Err

func (r *BinaryReader2) Err() error

func (*BinaryReader2) Free

func (r *BinaryReader2) Free()

Free frees all previously read bytes; afterwards you cannot seek to before this position (for reader).

func (*BinaryReader2) InPageCache

func (r *BinaryReader2) InPageCache(start, end int64) bool

InPageCache returns true if the range is already in the page cache (for mmap).

func (*BinaryReader2) Len

func (r *BinaryReader2) Len() int

Len returns the remaining length of the buffer.

func (*BinaryReader2) Pos

func (r *BinaryReader2) Pos() int64

Pos returns the reader's position.

func (*BinaryReader2) Read

func (r *BinaryReader2) Read(b []byte) (int, error)

Read complies with io.Reader.

func (*BinaryReader2) ReadByte

func (r *BinaryReader2) ReadByte() byte

ReadByte reads a single byte.

func (*BinaryReader2) ReadBytes

func (r *BinaryReader2) ReadBytes(n int) []byte

ReadBytes reads n bytes.

func (*BinaryReader2) ReadInt16

func (r *BinaryReader2) ReadInt16() int16

ReadInt16 reads an int16.

func (*BinaryReader2) ReadInt32

func (r *BinaryReader2) ReadInt32() int32

ReadInt32 reads an int32.

func (*BinaryReader2) ReadInt64

func (r *BinaryReader2) ReadInt64() int64

ReadInt64 reads an int64.

func (*BinaryReader2) ReadInt8

func (r *BinaryReader2) ReadInt8() int8

ReadInt8 reads an int8.

func (*BinaryReader2) ReadString

func (r *BinaryReader2) ReadString(n int) string

ReadString reads a string of length n.

func (*BinaryReader2) ReadUint16

func (r *BinaryReader2) ReadUint16() uint16

ReadUint16 reads a uint16.

func (*BinaryReader2) ReadUint32

func (r *BinaryReader2) ReadUint32() uint32

ReadUint32 reads a uint32.

func (*BinaryReader2) ReadUint64

func (r *BinaryReader2) ReadUint64() uint64

ReadUint64 reads a uint64.

func (*BinaryReader2) ReadUint8

func (r *BinaryReader2) ReadUint8() uint8

ReadUint8 reads a uint8.

func (*BinaryReader2) Seek

func (r *BinaryReader2) Seek(pos int64)

type BinaryWriter

type BinaryWriter struct {
	Endian binary.ByteOrder
	// contains filtered or unexported fields
}

BinaryWriter is a big endian binary file format writer.

func NewBinaryWriter

func NewBinaryWriter(buf []byte) *BinaryWriter

NewBinaryWriter returns a big endian binary file format writer.

func (*BinaryWriter) Bytes

func (w *BinaryWriter) Bytes() []byte

Bytes returns the buffer's bytes.

func (*BinaryWriter) Len

func (w *BinaryWriter) Len() uint32

Len returns the buffer's length in bytes.

func (*BinaryWriter) Write

func (w *BinaryWriter) Write(b []byte) (int, error)

Write complies with io.Writer.

func (*BinaryWriter) WriteByte

func (w *BinaryWriter) WriteByte(v byte)

WriteByte writes the given byte to the buffer.

func (*BinaryWriter) WriteBytes

func (w *BinaryWriter) WriteBytes(v []byte)

WriteBytes writes the given bytes to the buffer.

func (*BinaryWriter) WriteInt16

func (w *BinaryWriter) WriteInt16(v int16)

WriteInt16 writes the given int16 to the buffer.

func (*BinaryWriter) WriteInt32

func (w *BinaryWriter) WriteInt32(v int32)

WriteInt32 writes the given int32 to the buffer.

func (*BinaryWriter) WriteInt64

func (w *BinaryWriter) WriteInt64(v int64)

WriteInt64 writes the given int64 to the buffer.

func (*BinaryWriter) WriteInt8

func (w *BinaryWriter) WriteInt8(v int8)

WriteInt8 writes the given int8 to the buffer.

func (*BinaryWriter) WriteString

func (w *BinaryWriter) WriteString(v string)

WriteString writes the given string to the buffer.

func (*BinaryWriter) WriteUint16

func (w *BinaryWriter) WriteUint16(v uint16)

WriteUint16 writes the given uint16 to the buffer.

func (*BinaryWriter) WriteUint32

func (w *BinaryWriter) WriteUint32(v uint32)

WriteUint32 writes the given uint32 to the buffer.

func (*BinaryWriter) WriteUint64

func (w *BinaryWriter) WriteUint64(v uint64)

WriteUint64 writes the given uint64 to the buffer.

func (*BinaryWriter) WriteUint8

func (w *BinaryWriter) WriteUint8(v uint8)

WriteUint8 writes the given uint8 to the buffer.
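As a rough picture of what an append-style big endian writer does, here is a standard-library sketch. The writer type and its methods are hypothetical and much smaller than BinaryWriter; only the byte layout is meant to carry over.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// writer is a hypothetical, minimal big endian writer that appends to buf.
type writer struct {
	buf []byte
}

// writeUint16 appends v in big endian byte order.
func (w *writer) writeUint16(v uint16) {
	var tmp [2]byte
	binary.BigEndian.PutUint16(tmp[:], v)
	w.buf = append(w.buf, tmp[:]...)
}

// writeString appends the raw bytes of s without a length prefix.
func (w *writer) writeString(s string) {
	w.buf = append(w.buf, s...)
}

func main() {
	w := &writer{}
	w.writeUint16(0x0102)
	w.writeString("ok")
	fmt.Printf("% x\n", w.buf) // 01 02 6f 6b
}
```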

type BitmapReader

type BitmapReader struct {
	// contains filtered or unexported fields
}

BitmapReader is a binary bitmap reader.

func NewBitmapReader

func NewBitmapReader(buf []byte) *BitmapReader

NewBitmapReader returns a binary bitmap reader.

func (*BitmapReader) EOF

func (r *BitmapReader) EOF() bool

EOF returns true if we reached the buffer's end-of-file.

func (*BitmapReader) Pos

func (r *BitmapReader) Pos() uint32

Pos returns the current bit position.

func (*BitmapReader) Read

func (r *BitmapReader) Read() bool

Read reads the next bit.
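Reading a buffer bit by bit can be sketched as below. bitReader is a hypothetical type, not BitmapReader: the most-significant-bit-first order and the behavior of returning false past the end are assumptions made for the example.

```go
package main

import "fmt"

// bitReader is a hypothetical bit-level reader over a byte slice.
type bitReader struct {
	buf []byte
	pos uint32 // current bit position
}

// eof reports whether all bits have been read.
func (r *bitReader) eof() bool {
	return r.pos >= uint32(len(r.buf))*8
}

// read returns the next bit, most significant bit of each byte first
// (assumption), and false once the end is reached.
func (r *bitReader) read() bool {
	if r.eof() {
		return false
	}
	bit := r.buf[r.pos>>3]&(0x80>>(r.pos&7)) != 0
	r.pos++
	return bit
}

func main() {
	r := &bitReader{buf: []byte{0xA0}}
	for !r.eof() {
		if r.read() {
			fmt.Print("1")
		} else {
			fmt.Print("0")
		}
	}
	fmt.Println() // 10100000
}
```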

type BitmapWriter

type BitmapWriter struct {
	// contains filtered or unexported fields
}

BitmapWriter is a binary bitmap writer.

func NewBitmapWriter

func NewBitmapWriter(buf []byte) *BitmapWriter

NewBitmapWriter returns a binary bitmap writer.

func (*BitmapWriter) Bytes

func (w *BitmapWriter) Bytes() []byte

Bytes returns the buffer's bytes.

func (*BitmapWriter) Len

func (w *BitmapWriter) Len() uint32

Len returns the buffer's length in bytes.

func (*BitmapWriter) Write

func (w *BitmapWriter) Write(bit bool)

Write writes the next bit.

type Error

type Error struct {
	Message string
	Line    int
	Column  int
	Context string
}

Error is a parsing error returned by a parser. It contains a message and the line, column, and context at which the error occurred.

func NewError

func NewError(r io.Reader, offset int, message string, a ...interface{}) *Error

NewError creates a new error.

func NewErrorLexer

func NewErrorLexer(l *Input, message string, a ...interface{}) *Error

NewErrorLexer creates a new error from an active Input.

func (*Error) Error

func (e *Error) Error() string

Error returns the error string, containing the context and line + column number.

func (*Error) Position

func (e *Error) Position() (int, int, string)

Position returns the line, column, and context of the error. Context is the entire line at which the error occurred.

type IBinaryReader

type IBinaryReader interface {
	Close() error
	Len() int
	Bytes(int, int64) ([]byte, error)
}

type Indenter

type Indenter struct {
	io.Writer
	// contains filtered or unexported fields
}

func NewIndenter

func NewIndenter(w io.Writer, n int) Indenter

func (Indenter) Indent

func (in Indenter) Indent() int

func (Indenter) Write

func (in Indenter) Write(b []byte) (int, error)

type Input

type Input struct {
	// contains filtered or unexported fields
}

Input is a buffered reader that allows peeking forward and shifting, taking an io.Reader. It keeps data in-memory until Free, taking a byte length, is called to move beyond the data.

func NewInput

func NewInput(r io.Reader) *Input

NewInput returns a new Input for a given io.Reader and uses io.ReadAll to read it into a byte slice. If the io.Reader implements Bytes, that is used instead. It will append a NULL at the end of the buffer.

func NewInputBytes

func NewInputBytes(b []byte) *Input

NewInputBytes returns a new Input for a given byte slice and appends NULL at the end. To avoid reallocation, make sure the capacity has room for one more byte.

func NewInputString

func NewInputString(s string) *Input

NewInputString returns a new Input for a given string and appends NULL at the end.

func (*Input) Bytes

func (z *Input) Bytes() []byte

Bytes returns the underlying buffer.

func (*Input) Err

func (z *Input) Err() error

Err returns the error returned from io.Reader or io.EOF when the end has been reached.

func (*Input) Len

func (z *Input) Len() int

Len returns the length of the underlying buffer.

func (*Input) Lexeme

func (z *Input) Lexeme() []byte

Lexeme returns the bytes of the current selection.

func (*Input) Move

func (z *Input) Move(n int)

Move advances the position.

func (*Input) MoveRune

func (z *Input) MoveRune()

MoveRune advances the position by the length of the current rune.

func (*Input) Offset

func (z *Input) Offset() int

Offset returns the character position in the buffer.

func (*Input) Peek

func (z *Input) Peek(pos int) byte

Peek returns the ith byte relative to the end position. Peek returns 0 when an error has occurred; Err returns the error.

func (*Input) PeekErr

func (z *Input) PeekErr(pos int) error

PeekErr returns the error at position pos. When pos is zero, this is the same as calling Err().

func (*Input) PeekRune

func (z *Input) PeekRune(pos int) (rune, int)

PeekRune returns the rune and rune length of the ith byte relative to the end position.

func (*Input) Pos

func (z *Input) Pos() int

Pos returns a mark to which the position can be rewound.

func (*Input) Reset

func (z *Input) Reset()

Reset resets position to the underlying buffer.

func (*Input) Restore

func (z *Input) Restore()

Restore restores the byte past the end of the buffer that was replaced by NULL.

func (*Input) Rewind

func (z *Input) Rewind(pos int)

Rewind rewinds the position to the given position.

func (*Input) Shift

func (z *Input) Shift() []byte

Shift returns the bytes of the current selection and collapses the position to the end of the selection.

func (*Input) Skip

func (z *Input) Skip()

Skip collapses the position to the end of the selection.
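The peek, move, shift, and skip operations described above form a selection-based scanning model. The sketch below imitates that model with a hypothetical input type built on a NULL-terminated byte slice; it is not the package's Input, and words is a made-up tokenizer used only to exercise it.

```go
package main

import "fmt"

// input is a hypothetical, minimal imitation of the Input model.
type input struct {
	buf   []byte // data followed by a NULL byte
	start int    // start of the current selection
	pos   int    // end of the current selection
}

func newInput(s string) *input { return &input{buf: append([]byte(s), 0)} }

// peek returns the ith byte relative to the end of the selection.
func (z *input) peek(i int) byte { return z.buf[z.pos+i] }

// move extends the selection by n bytes.
func (z *input) move(n int) { z.pos += n }

// shift returns the selection and collapses it to its end.
func (z *input) shift() []byte {
	b := z.buf[z.start:z.pos]
	z.start = z.pos
	return b
}

// skip drops the selection without returning it.
func (z *input) skip() { z.start = z.pos }

// words splits s on single spaces; the NULL byte marks the end of input.
func words(s string) []string {
	z := newInput(s)
	var out []string
	for z.peek(0) != 0 {
		if z.peek(0) == ' ' {
			z.move(1)
			z.skip()
			continue
		}
		for z.peek(0) != ' ' && z.peek(0) != 0 {
			z.move(1)
		}
		out = append(out, string(z.shift()))
	}
	return out
}

func main() {
	fmt.Println(words("foo bar")) // [foo bar]
}
```

The trailing NULL lets the loop test for the end of input with a plain byte comparison instead of a bounds check on every peek.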

Source Files

binary.go binary_unix.go common.go error.go input.go position.go util.go

Directories

Path       Synopsis
buffer     Package buffer contains buffer and wrapper types for byte slices.
css        Package css is a CSS3 lexer and parser following the specifications at http://www.w3.org/TR/css-syntax-3/.
html       Package html is an HTML5 lexer following the specifications at http://www.w3.org/TR/html5/syntax.html.
js         Package js is an ECMAScript5.1 lexer following the specifications at http://www.ecma-international.org/ecma-262/5.1/.
json       Package json is a JSON parser following the specifications at http://json.org/.
strconv
xml        Package xml is an XML1.0 lexer following the specifications at http://www.w3.org/TR/xml/.

Version: v2.7.20 (latest)
Published: Jan 28, 2025
Platform: linux/amd64
Imports: 13 packages