lamindb.Storage

class lamindb.Storage(root: str, type: str, region: str | None)

Bases: SQLRecord, TracksRun, TracksUpdates

Storage locations of artifacts such as S3 buckets or local directories.

A storage location is either a directory/folder (local or in the cloud) or an entire S3/GCP bucket.

A LaminDB instance can manage and link multiple storage locations, but any given storage location is managed by at most one LaminDB instance.

Managed vs. linked storage locations

The LaminDB instance can update & delete artifacts in managed storage locations but merely read artifacts in linked storage locations.

When you transfer artifacts from another instance, the default is to copy only the metadata into the target instance and merely link the data.

The instance_uid field indicates the managing LaminDB instance of a storage location.

When you delete a LaminDB instance, you’ll be warned about data in managed storage locations while data in linked storage locations is ignored.
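The managed-vs-linked rule above can be sketched in plain Python (illustrative names only, not the lamindb API): a location whose instance_uid matches the current instance's uid is managed and writable; any other location is merely linked and read-only.

```python
# Illustrative sketch, not the lamindb API: the managed-vs-linked rule.
# A storage location is managed iff its instance_uid matches the uid
# of the current LaminDB instance.
def allowed_operations(storage_instance_uid: str, current_instance_uid: str) -> set:
    """Return the operations this instance may perform on a storage location."""
    if storage_instance_uid == current_instance_uid:
        return {"read", "update", "delete"}  # managed: full control
    return {"read"}  # linked: data is read-only
```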

See also

storage

Default storage.

StorageSettings

Storage settings.

Examples

Configure the default storage location when initializing a LaminDB instance:

lamin init --storage ./mydata # or "s3://my-bucket" or "gs://my-bucket"

View the default storage location:

>>> ln.settings.storage
PosixPath('/home/runner/work/lamindb/lamindb/docs/guide/mydata')

Dynamically change the default storage:

>>> ln.settings.storage = "./storage_2" # or a cloud bucket

Attributes

DoesNotExist = <class 'lamindb.models.core.Storage.DoesNotExist'>
Meta = <class 'lamindb.models.sqlrecord.SQLRecord.Meta'>
MultipleObjectsReturned = <class 'lamindb.models.core.Storage.MultipleObjectsReturned'>
artifacts: Artifact

Artifacts contained in this storage location.

branch: int

Whether the record is on a branch or in another “special state”.

This dictates where a record appears in exploration, queries & searches, whether a record can be edited, and whether a record acts as a template.

Branch name coding is handled through LaminHub. “Special state” coding is as defined below.

Note that there is no “main” branch as in git; rather, all five special codes (-1, 0, 1, 2, 3) act as sub-specifications of what git would call the main branch. This also means that records living on a branch exist only in the “default state”: a record can only be turned into a template, locked, archived, or trashed once it is merged onto the main branch.

  • 3: template (hidden in queries & searches)

  • 2: locked (same as default, but locked for edits except for space admins)

  • 1: default (visible in queries & searches)

  • 0: archive (hidden, meant to be kept, locked for edits for everyone)

  • -1: trash (hidden, scheduled for deletion)

An integer greater than 3 codes a branch that collaborators can use to create drafts, which can be merged onto the main branch in an experience akin to a pull request. The mapping onto a semantic branch name is handled through LaminHub.
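The special-state coding can be summarized in a small lookup. This is a sketch with illustrative names, not part of the lamindb API; the real coding is handled by lamindb and LaminHub:

```python
# Sketch of the special-state coding documented above; names are
# illustrative, not part of the lamindb API.
BRANCH_STATES = {
    3: "template",  # hidden in queries & searches
    2: "locked",    # like default, but edits restricted to space admins
    1: "default",   # visible in queries & searches
    0: "archive",   # hidden, kept, locked for edits for everyone
    -1: "trash",    # hidden, scheduled for deletion
}

def describe_branch(code: int) -> str:
    """Map a branch integer onto its documented meaning."""
    if code > 3:
        # draft branch for collaborators; semantic name resolved via LaminHub
        return "draft"
    return BRANCH_STATES.get(code, "unknown")
```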

branch_id
created_by: User

Creator of record.

created_by_id
objects = <lamindb.models.query_manager.QueryManager object>
property path: Path | UPath

Bucket or folder path.

Cloud storage bucket:

>>> ln.Storage("s3://my-bucket").save()

Directory/folder in cloud storage:

>>> ln.Storage("s3://my-bucket/my-directory").save()

Local directory/folder:

>>> ln.Storage("./my-directory").save()
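For a local storage location, the returned path behaves like a pathlib.Path, so standard path operations apply; cloud roots return a UPath with the same interface. A minimal sketch using only pathlib, assuming a local root (the temporary directory stands in for storage.path):

```python
import tempfile
from pathlib import Path

# Sketch: for a local storage root, Storage.path returns a pathlib.Path;
# for s3:// or gs:// roots it would be a UPath with the same interface.
tmp = tempfile.mkdtemp()
root = Path(tmp)  # stands in for storage.path of a local location
(root / "example.txt").write_text("hello")

# Iterate the contents of the storage root
names = sorted(p.name for p in root.iterdir())
```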
property pk
run: Run | None

Run that created record.

run_id
space: Space

The space in which the record lives.

space_id

Methods

async adelete(using=None, keep_parents=False)
async arefresh_from_db(using=None, fields=None, from_queryset=None)
async asave(*args, force_insert=False, force_update=False, using=None, update_fields=None)
clean()

Hook for doing any extra model-wide validation after clean() has been called on every field by self.clean_fields. Any ValidationError raised by this method will not be associated with a particular field; it will have a special-case association with the field defined by NON_FIELD_ERRORS.

clean_fields(exclude=None)

Clean all fields and raise a ValidationError containing a dict of all validation errors if any occur.

date_error_message(lookup_type, field_name, unique_for)
delete()

Delete.

Return type:

None

get_constraints()
get_deferred_fields()

Return a set containing names of deferred fields for this instance.

prepare_database_save(field)
refresh_from_db(using=None, fields=None, from_queryset=None)

Reload field values from the database.

By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.

Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.

When accessing deferred fields of an instance, the deferred loading of the field will call this method.

save(*args, **kwargs)

Save.

Always saves to the default database.

Return type:

SQLRecord

save_base(raw=False, force_insert=False, force_update=False, using=None, update_fields=None)

Handle the parts of saving which should be done only once per save, yet need to be done in raw saves, too. This includes some sanity checks and signal sending.

The ‘raw’ argument is telling save_base not to save any parent models and not to do any changes to the values before save. This is used by fixture loading.

serializable_value(field_name)

Return the value of the field name for this instance. If the field is a foreign key, return the id value instead of the object. If there’s no Field object with this name on the model, return the model attribute’s value.

Used to serialize a field’s value (in the serializer, or form output, for example). Normally, you would just access the attribute directly and not use this method.

unique_error_message(model_class, unique_check)
validate_constraints(exclude=None)
validate_unique(exclude=None)

Check unique constraints on the model and raise ValidationError if any failed.