package chunk
import "github.com/jdkato/prose/chunk"
Package chunk implements functions for finding useful chunks in text that has previously been tagged with parts of speech.
Example ¶
Code:
{
	txt := "Go is a open source programming language created at Google."
	words := tokenize.TextToWords(txt)
	tagger := tag.NewPerceptronTagger()
	fmt.Println(Chunk(tagger.Tag(words), TreebankNamedEntities))
	// Output: [Go Google]
}
Output:
[Go Google]
Index ¶
- Variables
- func Chunk(tagged []tag.Token, rx *regexp.Regexp) []string
- func Locate(tagged []tag.Token, rx *regexp.Regexp) [][]int
Variables ¶
var TreebankNamedEntities = regexp.MustCompile( `((CD__)*(NNP.)+(CD__|NNP.)*)+` + `((IN__)*(CD__)*(NNP.)+(CD__|NNP.)*)*`)
TreebankNamedEntities matches proper names, excluding any preceding adjectives. A match may include numbers and a link by a preposition or subordinating conjunction (for example "Bank of England").
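The literals in the pattern (CD__, IN__, NNP.) suggest that it is matched against a string in which every tag is padded to four characters, so that a byte offset divided by four gives a token index. The sketch below illustrates that reading using only the standard library. It is an assumption inferred from the pattern, not a description of the package internals; padTag and tagString are hypothetical helpers, and the tags are assigned by hand rather than by a tagger.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namedEntities copies the TreebankNamedEntities pattern shown above.
var namedEntities = regexp.MustCompile(
	`((CD__)*(NNP.)+(CD__|NNP.)*)+` +
		`((IN__)*(CD__)*(NNP.)+(CD__|NNP.)*)*`)

// padTag is a hypothetical helper. It pads or truncates a tag to exactly
// four characters, using '_' as filler (an assumption based on the pattern).
func padTag(t string) string {
	if len(t) >= 4 {
		return t[:4]
	}
	return t + strings.Repeat("_", 4-len(t))
}

// tagString concatenates the padded tags into a single string.
func tagString(tags []string) string {
	var b strings.Builder
	for _, t := range tags {
		b.WriteString(padTag(t))
	}
	return b.String()
}

func main() {
	// Hand-assigned Penn Treebank tags for "Bank of England raised rates".
	tags := []string{"NNP", "IN", "NNP", "VBD", "NNS"}
	s := tagString(tags)
	loc := namedEntities.FindStringIndex(s)

	fmt.Println(s)                  // NNP_IN__NNP_VBD_NNS_
	fmt.Println(loc[0]/4, loc[1]/4) // 0 3
}
```

Under this reading, the match covers tokens 0 up to (but not including) 3, which is "Bank of England". The trailing "." in NNP. lets the same alternative accept both NNP_ and NNPS.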
Functions ¶
func Chunk ¶
func Chunk(tagged []tag.Token, rx *regexp.Regexp) []string
Chunk returns a slice containing the chunks of interest according to the regexp.
This is a convenience wrapper around Locate, which should be used if you need access to the in-text locations of each chunk.
func Locate ¶
func Locate(tagged []tag.Token, rx *regexp.Regexp) [][]int
Locate finds the locations of the chunks of interest according to the regexp.
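To make the relationship between locations and chunk strings concrete, here is a minimal, self-contained sketch. It assumes that each location is a [start, end) pair of token indices, which the signature suggests but this page does not state. The token type is a hypothetical stand-in for tag.Token (the field names Text and Tag are assumed), joinSpans is a hypothetical helper, and the spans are written by hand rather than produced by Locate.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical stand-in for tag.Token, reduced to two
// assumed fields: the word and its part-of-speech tag.
type token struct {
	Text string
	Tag  string
}

// joinSpans builds one space-separated string per [start, end) pair of
// token indices. This mirrors what a convenience wrapper over
// location pairs could do; it is not the package's implementation.
func joinSpans(tagged []token, locs [][]int) []string {
	chunks := make([]string, 0, len(locs))
	for _, loc := range locs {
		words := make([]string, 0, loc[1]-loc[0])
		for _, t := range tagged[loc[0]:loc[1]] {
			words = append(words, t.Text)
		}
		chunks = append(chunks, strings.Join(words, " "))
	}
	return chunks
}

func main() {
	tagged := []token{
		{"The", "DT"}, {"Bank", "NNP"}, {"of", "IN"}, {"England", "NNP"},
		{"is", "VBZ"}, {"in", "IN"}, {"London", "NNP"},
	}
	// Hand-written spans, standing in for the result of a Locate-style call.
	locs := [][]int{{1, 4}, {6, 7}}

	for i, c := range joinSpans(tagged, locs) {
		fmt.Printf("%v -> %q\n", locs[i], c)
	}
}
```

Working with index pairs rather than strings keeps the position of each chunk available, for example to highlight it in the original text or to inspect the neighbouring tokens.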
- Version: v1.2.0
- Published: Jun 16, 2020
- Platform: darwin/amd64
- Imports: 2 packages