lamindb.Storage¶
- class lamindb.Storage(root: str, type: str, region: str | None)¶
Bases: SQLRecord, TracksRun, TracksUpdates
Storage locations of artifacts such as S3 buckets or local directories.
A storage location is either a directory/folder (local or in the cloud) or an entire S3/GCP bucket.
A LaminDB instance can manage and link multiple storage locations, but any storage location is managed by at most one LaminDB instance.
Managed vs. linked storage locations
The LaminDB instance can update & delete artifacts in its managed storage locations but can only read artifacts in linked storage locations.
When you transfer artifacts from another instance, the default is to copy only the metadata into the target instance and merely link the underlying data.
The instance_uid field indicates the managing LaminDB instance of a storage location. When you delete a LaminDB instance, you’ll be warned about data in managed storage locations, while data in linked storage locations is ignored.
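For example, transferring an artifact from another instance registers its source storage location as a linked location in the current instance. A minimal sketch, assuming a hypothetical instance slug and artifact key:
>>> artifact = ln.Artifact.using("account/other-instance").get(key="datasets/example.parquet")
>>> artifact.save()  # copies metadata into the current instance and links the data in place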
See also
storage
Default storage.
StorageSettings
Storage settings.
Examples
Configure the default storage location when initializing a LaminDB instance:
lamin init --storage ./mydata # or "s3://my-bucket" or "gs://my-bucket"
View the default storage location:
>>> ln.settings.storage
PosixPath('/home/runner/work/lamindb/lamindb/docs/guide/mydata')
Dynamically change the default storage:
>>> ln.settings.storage = "./storage_2" # or a cloud bucket
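To see which storage locations are registered in the current instance, query the registry like any other; a minimal sketch (the columns returned by .df() depend on the lamindb version):
>>> ln.Storage.filter().df()  # all registered storage locations as a DataFrame
>>> ln.Storage.filter(root__startswith="s3://").df()  # restrict to S3 locations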
Attributes¶
- DoesNotExist = <class 'lamindb.models.core.Storage.DoesNotExist'>¶
- Meta = <class 'lamindb.models.sqlrecord.SQLRecord.Meta'>¶
- MultipleObjectsReturned = <class 'lamindb.models.core.Storage.MultipleObjectsReturned'>¶
- artifacts: Artifact¶
Artifacts contained in this storage location.
- branch: int¶
Whether the record is on a branch or in another “special state”.
This dictates where a record appears in exploration, queries & searches, whether a record can be edited, and whether a record acts as a template.
Branch name coding is handled through LaminHub. “Special state” coding is as defined below.
Note that there is no “main” branch as in git; rather, all five special codes (-1, 0, 1, 2, 3) act as sub-specifications of what git would call the main branch. This also means that records living on a branch only exist in the “default state”: a record can only be turned into a template, locked, archived, or trashed once it’s merged onto the main branch.
3: template (hidden in queries & searches)
2: locked (same as default, but locked for edits except for space admins)
1: default (visible in queries & searches)
0: archive (hidden, meant to be kept, locked for edits for everyone)
-1: trash (hidden, scheduled for deletion)
An integer greater than 3 codes a branch that collaborators can use to create drafts, which can then be merged onto the main branch in an experience akin to a Pull Request. The mapping onto a semantic branch name is handled through LaminHub.
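As a minimal sketch, the integer code can be read off the branch_id field of a record, e.g. to check whether it sits in the trash (the key below is hypothetical):
>>> artifact = ln.Artifact.get(key="my-file.parquet")
>>> artifact.branch_id == -1  # True if the record is in the trash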
- branch_id¶
- created_by: User¶
Creator of record.
- created_by_id¶
- objects = <lamindb.models.query_manager.QueryManager object>¶
- property path: Path | UPath¶
Bucket or folder path.
Cloud storage bucket:
>>> ln.Storage("s3://my-bucket").save()
Directory/folder in cloud storage:
>>> ln.Storage("s3://my-bucket/my-directory").save()
Local directory/folder:
>>> ln.Storage("./my-directory").save()
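The returned object follows the pathlib interface (a UPath for cloud locations), so it can be used to browse the location directly; a minimal sketch assuming the bucket above was saved:
>>> storage = ln.Storage.filter(root="s3://my-bucket").one()
>>> storage.path.exists()  # check that the location is reachable
>>> list(storage.path.iterdir())  # list top-level objects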
- property pk¶
- run: Run | None¶
Run that created record.
- run_id¶
- space: Space¶
The space in which the record lives.
- space_id¶
Methods¶
- async adelete(using=None, keep_parents=False)¶
- async arefresh_from_db(using=None, fields=None, from_queryset=None)¶
- async asave(*args, force_insert=False, force_update=False, using=None, update_fields=None)¶
- clean()¶
Hook for doing any extra model-wide validation after clean() has been called on every field by self.clean_fields. Any ValidationError raised by this method will not be associated with a particular field; it will have a special-case association with the field defined by NON_FIELD_ERRORS.
- clean_fields(exclude=None)¶
Clean all fields and raise a ValidationError containing a dict of all validation errors if any occur.
- date_error_message(lookup_type, field_name, unique_for)¶
- delete()¶
Delete.
- Return type:
None
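A minimal sketch of removing a storage location record that is no longer needed (assuming no artifacts reference it anymore):
>>> storage = ln.Storage.filter(root="./storage_2").one()
>>> storage.delete()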
- get_constraints()¶
- get_deferred_fields()¶
Return a set containing names of deferred fields for this instance.
- prepare_database_save(field)¶
- refresh_from_db(using=None, fields=None, from_queryset=None)¶
Reload field values from the database.
By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.
Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.
When accessing deferred fields of an instance, the deferred loading of the field will call this method.
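For instance, to reload a single field after another process updated it (a minimal sketch; description is assumed to be a field on the record):
>>> record = ln.Storage.filter(root="s3://my-bucket").one()
>>> record.refresh_from_db(fields=["description"])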
- save_base(raw=False, force_insert=False, force_update=False, using=None, update_fields=None)¶
Handle the parts of saving which should be done only once per save, yet need to be done in raw saves, too. This includes some sanity checks and signal sending.
The ‘raw’ argument tells save_base not to save any parent models and not to change any values before saving. This is used by fixture loading.
- serializable_value(field_name)¶
Return the value of the field name for this instance. If the field is a foreign key, return the id value instead of the object. If there’s no Field object with this name on the model, return the model attribute’s value.
Used to serialize a field’s value (in the serializer, or form output, for example). Normally, you would just access the attribute directly and not use this method.
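A minimal sketch of the difference for a foreign-key field:
>>> storage.created_by  # the related User object
>>> storage.serializable_value("created_by")  # the related user's primary key instead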
- unique_error_message(model_class, unique_check)¶
- validate_constraints(exclude=None)¶
- validate_unique(exclude=None)¶
Check unique constraints on the model and raise ValidationError if any failed.