package match

import "github.com/google/licensecheck/internal/match"

Package match defines matching algorithms and support code for the license checker.

Variables ¶

var TraceDFA int

TraceDFA controls whether DFA execution prints debug tracing when stuck. If TraceDFA > 0 and the DFA has followed a path of at least TraceDFA symbols since the last matching state but hits a dead end, it prints out information about the dead end.

Types ¶

type Dict ¶

type Dict struct {
	// contains filtered or unexported fields
}

A Dict maps words to their integer indexes, of type WordID, in a word list. The zero Dict is an empty dictionary ready for use.

Lookup and Words are read-only operations, safe for any number of concurrent calls from multiple goroutines. Insert is a write operation; it must not run concurrently with any other call, whether to Insert, Lookup, or Words.
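
The concurrency contract above can be met with a sync.RWMutex: readers share the lock and Insert takes it exclusively. The sketch below is illustrative only. Because this package is internal and cannot be imported from outside its module, wordList is a hypothetical stand-in with the same read/write split as Dict, and safeDict is a hypothetical wrapper, not part of the package.

```go
package main

import (
	"fmt"
	"sync"
)

// wordList is a hypothetical stand-in for Dict: lookup only reads,
// insert mutates.
type wordList struct {
	ids   map[string]int32
	words []string
}

func (l *wordList) lookup(w string) int32 {
	if id, ok := l.ids[w]; ok {
		return id
	}
	return -1 // mirrors BadWord
}

func (l *wordList) insert(w string) int32 {
	if id, ok := l.ids[w]; ok {
		return id // already present: return the existing index
	}
	if l.ids == nil {
		l.ids = make(map[string]int32)
	}
	id := int32(len(l.words))
	l.ids[w] = id
	l.words = append(l.words, w)
	return id
}

// safeDict (hypothetical) lets readers run in parallel and makes each
// Insert exclusive with respect to every other call.
type safeDict struct {
	mu sync.RWMutex
	l  wordList
}

func (s *safeDict) Lookup(w string) int32 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.l.lookup(w)
}

func (s *safeDict) Insert(w string) int32 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.l.insert(w)
}

func main() {
	var d safeDict
	var wg sync.WaitGroup
	for _, w := range []string{"mit", "license", "mit"} {
		wg.Add(1)
		go func(w string) {
			defer wg.Done()
			d.Insert(w)
		}(w)
	}
	wg.Wait()
	fmt.Println(d.Lookup("mit") >= 0, d.Lookup("apache"))
}
```

If all inserts happen during setup and only lookups happen afterwards, no lock is needed at all; the wrapper only pays off when writes and reads interleave.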

func (*Dict) Insert ¶

func (d *Dict) Insert(w string) WordID

Insert adds the word w to the word list, returning its index. If w is already in the word list, it is not added again; Insert returns the existing index.

func (*Dict) InsertSplit ¶

func (d *Dict) InsertSplit(text string) []Word

InsertSplit splits text into a sequence of lowercase words, inserting any new words in the dictionary.

func (*Dict) Lookup ¶

func (d *Dict) Lookup(w string) WordID

Lookup looks for the word w in the word list and returns its index. If w is not in the word list, Lookup returns BadWord.

func (*Dict) Split ¶

func (d *Dict) Split(text string) []Word

Split splits text into a sequence of lowercase words. It does not add any new words to the dictionary. Unrecognized words are reported as having ID = BadWord.
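
The difference between InsertSplit and Split can be illustrated with a toy model. This is a sketch under stated assumptions, not the package's implementation: toyDict, word, and badWord are hypothetical local mirrors of Dict, Word, and BadWord, and the rule "a word is a run of letters and digits" is an assumption, since the real splitting rules are not described above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

const badWord int32 = -1 // mirrors BadWord

// word mirrors the documented Word type: an ID plus the range text[lo:hi].
type word struct {
	id     int32
	lo, hi int32
}

// toyDict is a hypothetical stand-in for Dict.
type toyDict struct {
	ids   map[string]int32
	words []string
}

// wordRE treats runs of letters and digits as words (an assumption).
var wordRE = regexp.MustCompile(`[\pL\pN]+`)

// split lowercases each word and looks it up. With insert set, unknown
// words are added to the dictionary; otherwise they are reported as badWord.
func (d *toyDict) split(text string, insert bool) []word {
	var out []word
	for _, loc := range wordRE.FindAllStringIndex(text, -1) {
		w := strings.ToLower(text[loc[0]:loc[1]])
		id, ok := d.ids[w]
		if !ok {
			id = badWord
			if insert {
				if d.ids == nil {
					d.ids = make(map[string]int32)
				}
				id = int32(len(d.words))
				d.ids[w] = id
				d.words = append(d.words, w)
			}
		}
		out = append(out, word{id, int32(loc[0]), int32(loc[1])})
	}
	return out
}

func main() {
	var d toyDict
	d.split("Permission is hereby granted", true) // behaves like InsertSplit

	text := "Permission is NOT granted."
	for _, w := range d.split(text, false) { // behaves like Split
		fmt.Printf("%-14q id=%d\n", text[w.lo:w.hi], w.id)
	}
}
```

In this model the second call leaves the dictionary at four words, and "NOT" comes back with the bad-word ID because it was never inserted.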

func (*Dict) Words ¶

func (d *Dict) Words() []string

Words returns the current word list. The list is not a copy; the caller can read but must not modify the list.

type LRE ¶

type LRE struct {
	// contains filtered or unexported fields
}

An LRE is a compiled license regular expression.

TODO: Move this comment somewhere non-internal later.

A license regular expression (LRE) is a pattern syntax intended for describing large English texts such as software licenses, with minor allowed variations. The pattern syntax and the matching are word-based and case-insensitive; punctuation is ignored in the pattern and in the matched text.

The valid LRE patterns are:

word            - a single case-insensitive word
__N__           - any sequence of up to N words
expr1 expr2     - concatenation
expr1 || expr2  - alternation
(( expr ))      - grouping
expr??          - zero or one instances of expr
//** text **//  - a comment

To make patterns harder to misread in large texts:

|| must only appear inside (( ))
?? must only follow (( ))
(( must be at the start of a line, preceded only by spaces
)) must be at the end of a line, followed only by spaces and ??.

For example:

//** https://en.wikipedia.org/wiki/Filler_text **//
Now is
((not))??
the time for all good
((men || women || people))
to come to the aid of their __1__.
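
To make the semantics of the example concrete, here is a hand translation of it into an ordinary Go regexp applied to normalized text. It is only an illustration of what the pattern accepts, not how the package compiles or matches LREs. The names normalize, fillerRE, and matchesFiller are hypothetical, the word-splitting rule (runs of letters and digits) is an assumption, and the regexp is anchored to the whole input, whereas the package looks for matches inside a larger text.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wordRE finds runs of letters and digits; an assumed stand-in for the
// package's word splitting.
var wordRE = regexp.MustCompile(`[\pL\pN]+`)

// normalize lowercases text and joins its words with single spaces,
// which drops punctuation and case differences.
func normalize(text string) string {
	return strings.Join(wordRE.FindAllString(strings.ToLower(text), -1), " ")
}

// fillerRE is a hand translation of the example pattern:
//   ((not))??                   -> ( not)?
//   ((men || women || people))  -> (men|women|people)
//   __1__                       -> ( [^ ]+)?   (zero or one word)
var fillerRE = regexp.MustCompile(
	`^now is( not)? the time for all good (men|women|people)` +
		` to come to the aid of their( [^ ]+)?$`)

// matchesFiller reports whether the whole text matches the example pattern.
func matchesFiller(text string) bool {
	return fillerRE.MatchString(normalize(text))
}

func main() {
	for _, s := range []string{
		"Now is the time for all good men to come to the aid of their party.",
		"NOW IS NOT the time, for all good people to come to the aid of their country!",
		"Now is the time for all good dogs to come to the aid of their party.",
	} {
		fmt.Printf("%v  %s\n", matchesFiller(s), s)
	}
}
```

The first two inputs match despite differences in case and punctuation; the third does not, because "dogs" is not one of the listed alternatives.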

func ParseLRE ¶

func ParseLRE(d *Dict, file, s string) (*LRE, error)

ParseLRE parses the string s as a license regexp. The file name is used in error messages if non-empty.

func (*LRE) Dict ¶

func (re *LRE) Dict() *Dict

Dict returns the Dict used by the LRE.

func (*LRE) File ¶

func (re *LRE) File() string

File returns the file name passed to ParseLRE.

type Match ¶

type Match struct {
	ID    int // index of LRE in list passed to NewMultiLRE
	Start int // word index of start of match
	End   int // word index of end of match
}

A Match records the position of a single match in a text.

type Matches ¶

type Matches struct {
	Text  string  // the entire text
	Words []Word  // the text, split into Words
	List  []Match // the matches
}

A Matches is a collection of all leftmost-longest, non-overlapping matches in text.
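
Because Match positions are word indexes while Word carries offsets into the text, recovering the matched text takes two steps. The sketch below uses local mirrors of the documented types, since the package is internal, and hand-built data rather than real matcher output. It assumes End is exclusive (one past the last matched word); the documentation above does not say so, so treat that as an assumption to verify. The helper matchedText is hypothetical.

```go
package main

import "fmt"

// Local mirrors of the documented types.
type Word struct {
	ID     int32
	Lo, Hi int32 // word appears at text[Lo:Hi]
}

type Match struct {
	ID, Start, End int
}

type Matches struct {
	Text  string
	Words []Word
	List  []Match
}

// matchedText returns the part of ms.Text covered by m.
// ASSUMPTION: m.End is exclusive, so the last matched word is Words[End-1].
func matchedText(ms *Matches, m Match) string {
	if m.Start >= m.End {
		return ""
	}
	lo := ms.Words[m.Start].Lo
	hi := ms.Words[m.End-1].Hi
	return ms.Text[lo:hi]
}

func main() {
	// Hand-built example data, not produced by the real matcher.
	ms := &Matches{
		Text: "Copyright 2020. MIT License applies.",
		Words: []Word{
			{0, 0, 9}, {1, 10, 14}, {2, 16, 19}, {3, 20, 27}, {4, 28, 35},
		},
		List: []Match{{ID: 0, Start: 2, End: 4}},
	}
	for _, m := range ms.List {
		fmt.Printf("LRE %d matched %q\n", m.ID, matchedText(ms, m))
	}
}
```

Slicing the original text, rather than joining the words, preserves the original case, spacing, and punctuation between the first and last matched word.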

type MultiLRE ¶

type MultiLRE struct {
	// contains filtered or unexported fields
}

A MultiLRE matches multiple LREs simultaneously against a text. It is more efficient than matching each LRE in sequence against the text.

func NewMultiLRE ¶

func NewMultiLRE(list []*LRE) (_ *MultiLRE, err error)

NewMultiLRE returns a MultiLRE looking for the given LREs. All the LREs must have been parsed using the same Dict; if not, NewMultiLRE panics.

func (*MultiLRE) Dict ¶

func (re *MultiLRE) Dict() *Dict

Dict returns the Dict used by the MultiLRE.

func (*MultiLRE) Match ¶

func (re *MultiLRE) Match(text string) *Matches

Match reports all leftmost-longest, non-overlapping matches in text. It always returns a non-nil *Matches, in order to return the split text. Check len(matches.List) to see whether any matches were found.

type SyntaxError ¶

type SyntaxError struct {
	File    string
	Offset  int
	Context string
	Err     string
}

A SyntaxError reports a syntax error during parsing.

func (*SyntaxError) Error ¶

func (e *SyntaxError) Error() string

type Word ¶

type Word struct {
	ID WordID
	Lo int32 // Word appears at text[Lo:Hi].
	Hi int32
}

A Word represents a single word found in a text.

type WordID ¶

type WordID int32

A WordID is the index of a word in a dictionary.

const AnyWord WordID = -2

AnyWord represents a wildcard matching any word.

const BadWord WordID = -1

BadWord represents a word not present in the dictionary.

Source Files ¶

dict.go regexp.go rematch.go resyntax.go

Version: v0.3.1 (latest)
Published: Sep 3, 2020
Platform: linux/amd64
Imports: 10 packages