package gk

import "github.com/DataDog/sketches-go/gkarray"

Example

Code:play 

package main

import (
	"fmt"
	"math/rand"

	gk "github.com/DataDog/sketches-go/gkarray"
)

func main() {
	rand.Seed(1234)

	sketch := gk.NewDefaultGKArray()

	for i := 0; i < 500; i++ {
		v := rand.NormFloat64()
		sketch.Add(v)
	}

	anotherSketch := gk.NewDefaultGKArray()
	for i := 0; i < 500; i++ {
		v := rand.NormFloat64()
		anotherSketch.Add(v)
	}
	sketch.Merge(anotherSketch)

	fmt.Println(quantiles(sketch))
	fmt.Println(quantiles(anotherSketch))
}

func quantiles(sketch *gk.GKArray) []float64 {
	qs := []float64{0.5, 0.75, 0.9, 1}
	quantiles := make([]float64, len(qs))
	for i, q := range qs {
		quantiles[i] = sketch.Quantile(q)
	}
	return quantiles
}

Output:

[-0.0681257027936446 0.6425452806032255 1.3493584822458902 3.0655851632106974]
[0.0432544888835891 0.7615197295175491 1.4662069162660956 3.0655851632106974]

Index

Examples

Constants

const DefaultEpsilon float64 = 0.01

Types

type Entries

type Entries []Entry

Entries is a slice of Entry

func (Entries) Len

func (slice Entries) Len() int

func (Entries) Less

func (slice Entries) Less(i, j int) bool

func (Entries) Swap

func (slice Entries) Swap(i, j int)

type Entry

type Entry struct {
	// contains filtered or unexported fields
}

Entry is an element of the sketch. For the definition of g and delta, see the original paper http://infolab.stanford.edu/~datar/courses/cs361a/papers/quantiles.pdf

type GKArray

type GKArray struct {
	// contains filtered or unexported fields
}

GKArray is a version of GK with a buffer for the incoming values. epsilon is the accuracy of the sketch. Once Merge() is called, the sketch has a 2*epsilon rank-accuracy guarantee.

func NewDefaultGKArray

func NewDefaultGKArray() *GKArray

func NewGKArray

func NewGKArray(epsilon float64) *GKArray

NewGKArray allocates a new GKArray summary.

func (*GKArray) Add

func (s *GKArray) Add(v float64)

Add a new value to the summary.

func (*GKArray) Avg

func (s *GKArray) Avg() float64

func (*GKArray) Compress

func (s *GKArray) Compress()

func (*GKArray) Copy

func (s *GKArray) Copy(o *GKArray)

func (*GKArray) Count

func (s *GKArray) Count() int64

func (*GKArray) MakeCopy

func (s *GKArray) MakeCopy() *GKArray

func (*GKArray) MemorySize

func (s *GKArray) MemorySize() int

func (*GKArray) Merge

func (s *GKArray) Merge(o *GKArray)

Merge another GKArray into this in-place.

Here is one way to merge summaries so that the sketch is one-way mergeable: we extract an epsilon-approximate distribution from one of the summaries (o) and we insert this distribution into the other summary (s). More specifically, to extract the approximate distribution, we can query for all the quantiles i/(o.count-1) where i is between 0 and o.count-1 (included). Then we insert those values into s as usual. This way, when querying a quantile from the merged summary, the returned quantile has a rank error from the inserted values that is lower than epsilon, but the inserted values, because of the merge process, have a rank error from the actual data that is also lower than epsilon, so that the total rank error is bounded by 2*epsilon. However, querying and inserting each value as described above has a complexity that is linear in the number of values that have been inserted in o rather than in the number of entries in the summary. To tackle this issue, we can notice that each of the quantiles that are queried from o is a v of one of the entry of o. Instead of actually querying for those quantiles, we can count the number of times each v will be returned (when querying the quantiles

i/(o.valcount-1)); we end up with the values n below. Then instead of successively inserting each v n times,

we can actually directly append them to s.incoming as new entries where g = n. This is possible because the values of n will never violate the condition n <= int(s.eps * (s.count+o.count-1)).

func (*GKArray) Quantile

func (s *GKArray) Quantile(q float64) float64

Quantile returns an epsilon estimate of the element at q.

func (*GKArray) String

func (s *GKArray) String() string

func (*GKArray) Sum

func (s *GKArray) Sum() float64

Source Files

gkarray.go

Version
v0.0.1
Published
Sep 23, 2019
Platform
windows/amd64
Imports
5 packages
Last checked
9 hours ago

Tools for package owners.