package dataflux

import "cloud.google.com/go/storage/dataflux"

Package dataflux provides an easy way to parallelize listing in Google Cloud Storage.

More information about Google Cloud Storage is available at https://cloud.google.com/storage/docs.

See https://pkg.go.dev/cloud.google.com/go for authentication, timeouts, connection pooling and similar aspects of this package.

NOTE: This package is in preview. It is not stable, and is likely to change.

Types

type Lister

type Lister struct {
	// contains filtered or unexported fields
}

Lister is used for interacting with Dataflux fast-listing. The caller should initialize it with NewLister() instead of creating it directly.

func NewLister

func NewLister(c *storage.Client, in *ListerInput) *Lister

NewLister creates a new dataflux Lister to list objects in the given bucket.

func (*Lister) Close

func (c *Lister) Close()

Close closes the range channel of the Lister.

func (*Lister) NextBatch

func (c *Lister) NextBatch(ctx context.Context) ([]*storage.ObjectAttrs, error)

NextBatch runs the worksteal algorithm and sequential listing in parallel to quickly return a list of objects in the bucket. For smaller datasets, sequential listing is expected to be faster. For larger datasets, worksteal listing is expected to be faster.

type ListerInput

type ListerInput struct {
	// BucketName is the name of the bucket to list objects from. Required.
	BucketName string

	// Parallelism is the number of parallel workers to use for listing.
	// Default value is 10x the number of available CPUs. Optional.
	Parallelism int

	// BatchSize is the number of objects to return in each batch. The
	// default value returns all objects at once. The number of objects
	// returned is rounded up to a multiple of the GCS page size. Optional.
	BatchSize int

	// Query is the query to filter objects for listing. Default value is nil.
	// Use ProjectionNoACL for faster listing. Including ACLs increases
	// latency while fetching objects. Optional.
	Query storage.Query

	// SkipDirectoryObjects indicates whether directory objects are omitted
	// from the listing. Default value is false. Optional.
	SkipDirectoryObjects bool
}

ListerInput contains options for listing objects.
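The Parallelism and BatchSize comments describe two small pieces of arithmetic: a default of 10x the CPU count, and rounding up to a multiple of the page size. The sketch below works through both under stated assumptions and is not the package's implementation. The helper names resolveParallelism and roundUpToPages are hypothetical, a non-positive value is assumed to mean "unset", and the page size of 1000 is an illustrative figure rather than a value taken from the package.

```go
package main

import (
	"fmt"
	"runtime"
)

// resolveParallelism returns the worker count to use. A non-positive request
// is treated as "unset" (an assumption) and falls back to 10x the CPU count.
func resolveParallelism(requested int) int {
	if requested > 0 {
		return requested
	}
	return 10 * runtime.NumCPU()
}

// roundUpToPages rounds batchSize up to the next multiple of pageSize.
// A non-positive batchSize is treated as "no limit" and returned as 0;
// a non-positive pageSize is rejected the same way.
func roundUpToPages(batchSize, pageSize int) int {
	if batchSize <= 0 || pageSize <= 0 {
		return 0
	}
	pages := (batchSize + pageSize - 1) / pageSize // ceiling division
	return pages * pageSize
}

func main() {
	const pageSize = 1000 // hypothetical page size, for illustration only

	fmt.Println("workers:", resolveParallelism(0))
	fmt.Println("batch of 2500 becomes:", roundUpToPages(2500, pageSize))
	fmt.Println("batch of 3000 becomes:", roundUpToPages(3000, pageSize))
}
```

Under these assumptions, a requested batch of 2500 objects comes back as 3000, while a request that is already a whole number of pages is unchanged.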

Source Files

doc.go fast_list.go range_splitter.go sequential.go worksteal.go

Version
v1.44.0
Published
Oct 3, 2024
Platform
darwin/amd64
Imports
12 packages