storage package

This package contains classes for interacting with remote storage services.

storage.remote_storage module

Abstract base class for remote storage services.

class pis.storage.remote_storage.RemoteStorage

Bases: ABC

Abstract base class for remote storage services.

abstract check(uri: str) -> bool

Check if the provided storage is valid.

This method should check if the storage exists and has proper permissions. On Google Cloud Storage, for example, this would check if the bucket exists and the service account has get, list and create permissions.

Parameters:

uri (str) – The URI to check.

Returns:

True if the storage exists and is accessible with the required permissions, False otherwise.

Return type:

bool

abstract download_to_file(uri: str, dst: Path) -> int

Download a file to the local filesystem.

Parameters:
  • uri (str) – The URI of the file to download.

  • dst (Path) – The destination path to download the file to.

Returns:

The revision number of the file.

Return type:

int

abstract download_to_string(uri: str) -> tuple[str, int]

Download a file and return its contents as a string.

Parameters:

uri (str) – The URI of the file to download.

Returns:

A tuple containing the file contents and the revision number.

Return type:

tuple[str, int]

abstract get_session() -> Any

Return a session for making requests.

Returns:

The session.

abstract list(uri: str, pattern: str | None = None) -> list[str]

List files under a prefix URI.

Optionally, a pattern can be provided to filter the results. The pattern is a simple substring match; prefix it with an exclamation mark to exclude matching files instead. For example, ‘foo’ will match all files containing ‘foo’, while ‘!foo’ will exclude all files containing ‘foo’.

Parameters:
  • uri (str) – The prefix URI by which to list files.

  • pattern (str | None) – Optional. The pattern to match files against.

Returns:

A list of file URIs.

Return type:

list[str]
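
The include/exclude rule described above can be sketched in a few lines. This is only an illustration of the documented behaviour, not the package's implementation; `matches` and `filter_uris` are hypothetical helper names, and the sketch assumes the pattern is compared against the full URI.

```python
from __future__ import annotations


def matches(uri: str, pattern: str | None) -> bool:
    """Return True if `uri` passes the optional include/exclude pattern."""
    if pattern is None:
        return True  # no pattern: keep every file
    if pattern.startswith("!"):
        # Leading '!' inverts the match: keep files that do NOT contain the text.
        return pattern[1:] not in uri
    return pattern in uri  # plain substring match


def filter_uris(uris: list[str], pattern: str | None = None) -> list[str]:
    """Apply `matches` to a list of URIs (hypothetical helper)."""
    return [uri for uri in uris if matches(uri, pattern)]


uris = [
    "gs://bucket/in/foo.json",
    "gs://bucket/in/bar.json",
    "gs://bucket/in/foobar.json",
]
included = filter_uris(uris, "foo")
excluded = filter_uris(uris, "!foo")
```

Here `included` holds the two URIs containing ‘foo’ and `excluded` holds only the ‘bar’ URI.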

abstract stat(uri: str) -> dict

Get metadata for a file.

Currently, only the modification time is required, as it is used for the download_latest task. This method should be expanded as needed.

Parameters:

uri (str) – The URI to get metadata for.

Returns:

A dictionary containing metadata.

Return type:

dict

Raises:

NotFoundError – If the file does not exist.
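
As a rough illustration of how a task such as download_latest might use this metadata, the sketch below picks the most recently modified file from a set of URIs. The dictionary key `"mtime"`, the helper `latest_uri`, and the fake `stat` function are all assumptions made for the example; the actual key names returned by `stat` are not specified here.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Callable


def latest_uri(uris: list[str], stat: Callable[[str], dict]) -> str:
    """Return the URI whose metadata reports the newest modification time.

    Assumes each metadata dict exposes a comparable value under "mtime"
    (a hypothetical key name).
    """
    return max(uris, key=lambda uri: stat(uri)["mtime"])


# Stand-in for RemoteStorage.stat, backed by a fixed table.
fake_metadata = {
    "gs://bucket/data/2024-01.json": {"mtime": datetime(2024, 1, 31, tzinfo=timezone.utc)},
    "gs://bucket/data/2024-02.json": {"mtime": datetime(2024, 2, 29, tzinfo=timezone.utc)},
}


def fake_stat(uri: str) -> dict:
    return fake_metadata[uri]


newest = latest_uri(list(fake_metadata), fake_stat)
```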

abstract upload(src: Path, uri: str, revision: int | None = None) -> int

Upload a file to the remote storage.

Optionally, a revision number can be provided to ensure that the file has not been modified since the last time it was read.

Parameters:
  • src (Path) – The source path of the file to upload.

  • uri (str) – The URI to upload the file to.

  • revision (int | None) – Optional. The expected revision number of the file.

Returns:

The new revision number of the file.

Return type:

int
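
The revision check amounts to optimistic concurrency control: a writer passes back the revision it last read, and the upload is rejected if the stored revision has moved on. The in-memory sketch below illustrates that idea only. `InMemoryStore` and `RevisionMismatchError` are hypothetical names, the store takes string content rather than a `Path` for brevity, and the simple incrementing counter is a simplification (a real backend may assign revisions differently).

```python
from __future__ import annotations


class RevisionMismatchError(Exception):
    """Hypothetical error raised when the expected revision is stale."""


class InMemoryStore:
    """Toy stand-in for a remote storage backend with revision numbers."""

    def __init__(self) -> None:
        # Maps URI -> (content, revision).
        self._objects: dict[str, tuple[str, int]] = {}

    def download_to_string(self, uri: str) -> tuple[str, int]:
        return self._objects[uri]

    def upload(self, content: str, uri: str, revision: int | None = None) -> int:
        current = self._objects.get(uri, ("", 0))[1]
        # Only enforce the check when the caller supplied an expected revision.
        if revision is not None and revision != current:
            raise RevisionMismatchError(f"expected {revision}, found {current}")
        new_revision = current + 1
        self._objects[uri] = (content, new_revision)
        return new_revision


store = InMemoryStore()
uri = "mem://manifest.json"
store.upload("v1", uri)  # unconditional first write

# Two writers read the same revision, then both try to write.
_, seen = store.download_to_string(uri)
after_a = store.upload("v2 from writer A", uri, revision=seen)
try:
    store.upload("v2 from writer B", uri, revision=seen)
    conflict = False
except RevisionMismatchError:
    conflict = True  # writer B must re-read and retry
```

Writer A succeeds because the revision it read is still current; writer B is rejected because A's upload already advanced it.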

pis.storage.remote_storage.get_remote_storage(uri: str | None) -> RemoteStorage

Get a storage object for a URI.

Parameters:

uri (str | None) – The URI to get a storage object for.

Returns:

A remote storage object.

Return type:

RemoteStorage

Raises:

ValueError – If the URI is not supported.
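
A factory of this kind is commonly written as a dispatch on the URI scheme. The sketch below shows one way that could look; it is not the package's code. It assumes that `gs://` URIs select the Google backend and that `None` selects the no-op backend, neither of which is stated above, and it uses placeholder classes (`FakeGoogleStorage`, `FakeNoopStorage`) and a hypothetical function name `pick_backend`.

```python
from __future__ import annotations

from urllib.parse import urlparse


class FakeGoogleStorage:
    """Placeholder standing in for a Google Cloud Storage backend."""


class FakeNoopStorage:
    """Placeholder standing in for a no-op backend."""


def pick_backend(uri: str | None) -> FakeGoogleStorage | FakeNoopStorage:
    """Choose a backend from the URI scheme (assumed mapping, see lead-in)."""
    if uri is None:
        return FakeNoopStorage()  # assumption: no URI means local, no-op storage
    scheme = urlparse(uri).scheme
    if scheme == "gs":  # assumption: gs:// maps to Google Cloud Storage
        return FakeGoogleStorage()
    raise ValueError(f"Unsupported URI: {uri}")


google = pick_backend("gs://bucket/path/file.json")
local = pick_backend(None)
try:
    pick_backend("ftp://host/file.json")
    rejected = False
except ValueError:
    rejected = True
```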

storage.google module

Google Cloud Storage class.

class pis.storage.google.GoogleStorage

Bases: RemoteStorage

Google Cloud Storage helper class.

This class implements the RemoteStorage interface for Google Cloud Storage.

Variables:
  • credentials (google.auth.credentials.Credentials) – The Google Cloud Storage credentials.

  • client (google.cloud.storage.client.Client) – The Google Cloud Storage client.

check(uri: str) -> bool

Check if a bucket exists in Google Cloud Storage.

Parameters:

uri (str) – The URI of the bucket to check.

Returns:

True if the bucket exists, False otherwise.

Return type:

bool

download_to_file(uri: str, dst: Path) -> int

Download a file from Google Cloud Storage to the local filesystem.

Parameters:
  • uri (str) – The URI of the file to download.

  • dst (Path) – The destination path to download the file to.

Returns:

The generation number of the file.

Return type:

int

download_to_string(uri: str) -> tuple[str, int]

Download a file from Google Cloud Storage and return its contents as a string.

Parameters:

uri (str) – The URI of the file to download.

Returns:

A tuple containing the file contents and the generation number.

Return type:

tuple[str, int]

get_session() -> AuthorizedSession

Get the current authenticated session.

Returns:

An authorized session.

Return type:

AuthorizedSession

list(uri: str, pattern: str | None = None) -> list[str]

List blobs in a bucket.

Parameters:
  • uri (str) – The URI prefix to list blobs for.

  • pattern (str | None) – The pattern to match blobs against.

Returns:

A list of blob URIs.

Return type:

list[str]

stat(uri: str) -> dict

Get metadata for a file in Google Cloud Storage.

Parameters:

uri (str) – The URI of the file to get metadata for.

Returns:

A dictionary containing metadata.

Return type:

dict

Raises:

NotFoundError – If the file does not exist.

upload(src: Path, uri: str, revision: int | None = None) -> int

Upload a file to Google Cloud Storage.

Parameters:
  • src (Path) – The source path of the file to upload.

  • uri (str) – The URI to upload the file to.

  • revision (int | None) – The expected revision number of the file.

Returns:

The new revision number of the file.

Return type:

int

storage.noop module

No-op storage class.

class pis.storage.noop.NoopStorage

Bases: RemoteStorage

No-op storage helper class.

This class implements the RemoteStorage interface but does not perform any operations. It is used when PIS is run locally.

check(uri: str) -> bool

Check if a file exists.

download_to_file(uri: str, dst: Path) -> int

Download a file to the local filesystem.

download_to_string(uri: str) -> tuple[str, int]

Download a file and return its contents as a string.

get_session() -> None

Return a session for making requests.

list(uri: str, pattern: str | None = None) -> list[str]

List files.

stat(uri: str) -> dict

Get metadata for a file.

upload(src: Path, uri: str, revision: int | None = None) -> int

Upload a file.

Module contents

Remote storage implementation classes.