package segmenter
import "github.com/go-text/typesetting/segmenter"
Package segmenter implements Unicode rules used to segment a paragraph of text according to several criteria. In particular, it provides a way of delimiting line break opportunities.
The API of the package follows the very nice iterator pattern proposed in github.com/npillmayer/uax, but uses a somewhat simpler internal implementation, inspired by Pango.
The reference documentation is at https://unicode.org/reports/tr14 (UAX #14, line breaking) and https://unicode.org/reports/tr29 (UAX #29, text segmentation).
Types ¶
type Grapheme ¶
type Grapheme struct {
	// Text is a subslice of the original input slice, containing the delimited grapheme
	Text []rune
	// Offset is the start of the grapheme in the input rune slice
	Offset int
}
Grapheme is the content of a grapheme delimited by the segmenter.
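The relation between `Text` and `Offset` can be illustrated with a small standalone sketch. It does not call this package: the `grapheme` type and the `splitClusters` helper are hypothetical stand-ins, and the clustering rule (a nonspacing mark attaches to the rune before it) is a deliberate simplification of UAX #29, not the rules implemented here.

```go
package main

import (
	"fmt"
	"unicode"
)

// grapheme mirrors the shape of the documented Grapheme type.
// It is a local stand-in, not the package's own type.
type grapheme struct {
	Text   []rune // subslice of the input, shares its backing array
	Offset int    // index of Text[0] in the input
}

// splitClusters is a hypothetical helper. It starts a new cluster at every
// rune that is not a nonspacing mark (category Mn). Real UAX #29
// segmentation has many more rules.
func splitClusters(input []rune) []grapheme {
	var out []grapheme
	start := 0
	for i := 1; i <= len(input); i++ {
		if i == len(input) || !unicode.Is(unicode.Mn, input[i]) {
			out = append(out, grapheme{Text: input[start:i], Offset: start})
			start = i
		}
	}
	return out
}

func main() {
	// 'e' followed by U+0301 (combining acute accent), then 'a'.
	input := []rune("e\u0301a")
	for _, g := range splitClusters(input) {
		fmt.Printf("offset=%d runes=%d text=%q\n", g.Offset, len(g.Text), string(g.Text))
	}
}
```

In this sketch each `Text` equals `input[Offset : Offset+len(Text)]`, so no runes are copied.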
type GraphemeIterator ¶
type GraphemeIterator struct {
// contains filtered or unexported fields
}
GraphemeIterator provides a convenient way of iterating over the graphemes delimited by a `Segmenter`.
func (*GraphemeIterator) Grapheme ¶
func (gr *GraphemeIterator) Grapheme() Grapheme
Grapheme returns the current `Grapheme`.
func (*GraphemeIterator) Next ¶
func (gr *GraphemeIterator) Next() bool
Next returns true if there is still a grapheme to process, and advances the iterator; otherwise it returns false.
type Line ¶
type Line struct {
	// Text is a subslice of the original input slice, containing the delimited line
	Text []rune
	// Offset is the start of the line in the input rune slice
	Offset int
	// IsMandatoryBreak is true if breaking (at the end of the line)
	// is mandatory
	IsMandatoryBreak bool
}
Line is the content of a line delimited by the segmenter.
type LineIterator ¶
type LineIterator struct {
// contains filtered or unexported fields
}
LineIterator provides a convenient way of iterating over the lines delimited by a `Segmenter`.
func (*LineIterator) Line ¶
func (li *LineIterator) Line() Line
Line returns the current `Line`.
func (*LineIterator) Next ¶
func (li *LineIterator) Next() bool
Next returns true if there is still a line to process, and advances the iterator; otherwise it returns false.
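The `Next`/`Line` contract can be sketched without this package. In the standalone example below, the `line` and `lineIter` types and the `collect` helper are hypothetical, and the only break rule is a mandatory break after `'\n'`. UAX #14 defines many more break opportunities, and how the real iterator flags the final line is not shown here.

```go
package main

import "fmt"

// line mirrors the shape of the documented Line type (local stand-in).
type line struct {
	Text             []rune
	Offset           int
	IsMandatoryBreak bool
}

// lineIter is a toy iterator that only knows mandatory breaks after '\n'.
type lineIter struct {
	src []rune
	pos int  // start of the next line to deliver
	cur line // value returned by Line
}

// Next advances to the next line and reports whether one was found.
func (it *lineIter) Next() bool {
	if it.pos >= len(it.src) {
		return false
	}
	end := it.pos
	for end < len(it.src) && it.src[end] != '\n' {
		end++
	}
	mandatory := end < len(it.src) // stopped on a newline
	if mandatory {
		end++ // the newline belongs to the line it terminates
	}
	it.cur = line{Text: it.src[it.pos:end], Offset: it.pos, IsMandatoryBreak: mandatory}
	it.pos = end
	return true
}

// Line returns the line found by the last successful call to Next.
func (it *lineIter) Line() line { return it.cur }

// collect drains an iterator over s into a slice.
func collect(s string) []line {
	it := &lineIter{src: []rune(s)}
	var out []line
	for it.Next() {
		out = append(out, it.Line())
	}
	return out
}

func main() {
	for _, l := range collect("ab\ncd") {
		fmt.Printf("offset=%d mandatory=%t text=%q\n", l.Offset, l.IsMandatoryBreak, string(l.Text))
	}
}
```

Calling `Line` is only meaningful after `Next` has returned true, which is why the loop condition and the accessor are separate methods in this pattern.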
type Segmenter ¶
type Segmenter struct {
// contains filtered or unexported fields
}
Segmenter is the entry point of the package.
Usage:
	var seg Segmenter
	seg.Init(...)
	iter := seg.LineIterator()
	for iter.Next() {
		... // do something with iter.Line()
	}
func (*Segmenter) GraphemeIterator ¶
func (sg *Segmenter) GraphemeIterator() *GraphemeIterator
GraphemeIterator returns an iterator over the graphemes delimited in [Init].
func (*Segmenter) Init ¶
Init resets the segmenter storage with the given input, and computes the attributes required to segment the text.
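One way to read "resets the storage and computes the attributes" is sketched below. This is an assumption about the general shape, not this package's implementation: `toySegmenter`, its `breakAfter` field, the `paragraph []rune` parameter, and the rule "a break is allowed after a space or newline" are all hypothetical.

```go
package main

import "fmt"

// toySegmenter is a hypothetical stand-in for the documented Segmenter.
type toySegmenter struct {
	input      []rune
	breakAfter []bool // breakAfter[i]: a break is allowed after input[i]
}

// Init replaces the input and recomputes one attribute per rune.
// Truncating with [:0] keeps the allocated capacity, so calling Init
// again on the same value can reuse the earlier allocation.
func (s *toySegmenter) Init(paragraph []rune) {
	s.input = paragraph
	s.breakAfter = s.breakAfter[:0]
	for _, r := range paragraph {
		s.breakAfter = append(s.breakAfter, r == ' ' || r == '\n')
	}
}

// breakPositions lists the rune indices after which a break is allowed.
func (s *toySegmenter) breakPositions() []int {
	var out []int
	for i, ok := range s.breakAfter {
		if ok {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	var seg toySegmenter
	seg.Init([]rune("a b c"))
	fmt.Println(seg.breakPositions())

	seg.Init([]rune("xy")) // the second call discards the previous attributes
	fmt.Println(seg.breakPositions())
}
```

With this shape, one segmenter value can be declared once and fed successive paragraphs, and iterators created after each `Init` read the freshly computed attributes.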
func (*Segmenter) LineIterator ¶
func (sg *Segmenter) LineIterator() *LineIterator
LineIterator returns an iterator over the lines delimited in [Init].
Source Files ¶
segmenter.go unicode14_rules.go unicode29_rules.go
- Version
- v0.1.0
- Published
- Dec 26, 2023
- Platform
- windows/amd64
- Imports
- 2 packages