julee.repositories.minio
========================

.. py:module:: julee.repositories.minio

.. autoapi-nested-parse::

   Minio repository implementations for the julee domain.

   This module exports Minio-based implementations of all repository
   protocols for the Capture, Extract, Assemble, Publish workflow. These
   implementations use Minio for object storage and are suitable for
   production environments where persistent, scalable storage is required.

   All implementations maintain the same async interfaces as their memory
   counterparts while providing durable, distributed storage capabilities.


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/julee/repositories/minio/assembly/index
   /autoapi/julee/repositories/minio/assembly_specification/index
   /autoapi/julee/repositories/minio/client/index
   /autoapi/julee/repositories/minio/document/index
   /autoapi/julee/repositories/minio/document_policy_validation/index
   /autoapi/julee/repositories/minio/knowledge_service_config/index
   /autoapi/julee/repositories/minio/knowledge_service_query/index
   /autoapi/julee/repositories/minio/policy/index


Classes
-------

.. autoapisummary::

   julee.repositories.minio.MinioAssemblyRepository
   julee.repositories.minio.MinioAssemblySpecificationRepository
   julee.repositories.minio.MinioDocumentPolicyValidationRepository
   julee.repositories.minio.MinioDocumentRepository
   julee.repositories.minio.MinioKnowledgeServiceConfigRepository
   julee.repositories.minio.MinioKnowledgeServiceQueryRepository
   julee.repositories.minio.MinioPolicyRepository


Package Contents
----------------

.. py:class:: MinioAssemblyRepository(client)

   Bases: :py:obj:`julee.domain.repositories.assembly.AssemblyRepository`, :py:obj:`julee.repositories.minio.client.MinioRepositoryMixin`

   Minio implementation of AssemblyRepository.

   This implementation stores assembly data as JSON objects in the
   "assemblies" bucket.

   .. py:method:: generate_id()
      :async:

      Generate a unique assembly identifier.

   .. py:method:: get(assembly_id)
      :async:

      Retrieve an assembly by ID.

   .. py:method:: get_many(assembly_ids)
      :async:

      Retrieve multiple assemblies by ID.

      :param assembly_ids: List of unique assembly identifiers
      :returns: Dict mapping assembly_id to Assembly (or None if not found)

   .. py:method:: save(assembly)
      :async:

      Save assembly metadata (status, updated_at, etc.).

   .. py:attribute:: assembly_bucket
      :value: 'assemblies'

   .. py:attribute:: client

   .. py:attribute:: logger


.. py:class:: MinioAssemblySpecificationRepository(client)

   Bases: :py:obj:`julee.domain.repositories.assembly_specification.AssemblySpecificationRepository`, :py:obj:`julee.repositories.minio.client.MinioRepositoryMixin`

   Minio implementation of AssemblySpecificationRepository.

   This implementation stores assembly specifications as JSON objects in the
   "assembly-specifications" bucket. Each specification includes its complete
   JSON schema definition and knowledge service query mappings.

   .. py:method:: generate_id()
      :async:

      Generate a unique assembly specification identifier.

   .. py:method:: get(assembly_specification_id)
      :async:

      Retrieve an assembly specification by ID.

   .. py:method:: get_many(assembly_specification_ids)
      :async:

      Retrieve multiple assembly specifications by ID.

      :param assembly_specification_ids: List of unique specification identifiers
      :returns: Dict mapping specification_id to AssemblySpecification (or None if not found)

   .. py:method:: list_all()
      :async:

      List all assembly specifications.

      :returns: List of all assembly specifications, sorted by assembly_specification_id

   .. py:method:: save(assembly_specification)
      :async:

      Save an assembly specification to Minio.

   .. py:attribute:: client

   .. py:attribute:: logger

   .. py:attribute:: specifications_bucket
      :value: 'assembly-specifications'


.. py:class:: MinioDocumentPolicyValidationRepository(client)

   Bases: :py:obj:`julee.domain.repositories.document_policy_validation.DocumentPolicyValidationRepository`, :py:obj:`julee.repositories.minio.client.MinioRepositoryMixin`

   Minio implementation of DocumentPolicyValidationRepository.

   This implementation stores document policy validations as JSON objects in
   the "document-policy-validations" bucket. Each validation includes its
   complete status tracking, validation scores, transformation results, and
   metadata.

   .. py:method:: generate_id()
      :async:

      Generate a unique validation identifier.

   .. py:method:: get(validation_id)
      :async:

      Retrieve a document policy validation by ID.

   .. py:method:: get_many(validation_ids)
      :async:

      Retrieve multiple document policy validations by ID.

      :param validation_ids: List of unique validation identifiers
      :returns: Dict mapping validation_id to DocumentPolicyValidation (or None if not found)

   .. py:method:: save(validation)
      :async:

      Save a document policy validation to Minio.

   .. py:attribute:: client

   .. py:attribute:: logger

   .. py:attribute:: validations_bucket
      :value: 'document-policy-validations'


.. py:class:: MinioDocumentRepository(client)

   Bases: :py:obj:`julee.domain.repositories.document.DocumentRepository`, :py:obj:`julee.repositories.minio.client.MinioRepositoryMixin`

   Minio implementation of DocumentRepository.

   This implementation stores document metadata and content separately:

   - Metadata: JSON objects in the "documents" bucket
   - Content: Binary objects in the "documents-content" bucket

   This separation allows for efficient metadata queries while supporting
   large content files without hitting Temporal's 2 MB payload limit.

   .. py:method:: generate_id()
      :async:

      Generate a unique document identifier.

   .. py:method:: get(document_id)
      :async:

      Retrieve a document with metadata and content.

   .. py:method:: get_many(document_ids)
      :async:

      Retrieve multiple documents by ID using batch operations.

      :param document_ids: List of unique document identifiers
      :returns: Dict mapping document_id to Document (or None if not found)

      .. note::

         This implementation batch-fetches metadata first, then batch-fetches
         the unique content streams, and finally splices the two together.

   .. py:method:: list_all()
      :async:

      List all documents.

      :returns: List of all documents, sorted by document_id

   .. py:method:: save(document)
      :async:

      Save a document with its content and metadata.

      If the document has ``content_string``, it is converted to a
      ``ContentStream`` and stored. The ``content_string`` field should only
      be used for small content (a few KB) when saving from workflows or use
      cases. Call sites in activities should always use the content stream.

   .. py:attribute:: client

   .. py:attribute:: content_bucket
      :value: 'documents-content'

   .. py:attribute:: logger

   .. py:attribute:: metadata_bucket
      :value: 'documents'


.. py:class:: MinioKnowledgeServiceConfigRepository(client)

   Bases: :py:obj:`julee.domain.repositories.knowledge_service_config.KnowledgeServiceConfigRepository`, :py:obj:`julee.repositories.minio.client.MinioRepositoryMixin`

   Minio implementation of KnowledgeServiceConfigRepository.

   This implementation stores knowledge service configurations as JSON
   objects in the "knowledge-service-configs" bucket.

   Each configuration is stored with its ``knowledge_service_id`` as the
   object name for efficient retrieval and updates.

   .. py:method:: generate_id()
      :async:

      Generate a unique knowledge service identifier.

      :returns: Unique knowledge service ID string

   .. py:method:: get(knowledge_service_id)
      :async:

      Retrieve a knowledge service configuration by ID.

      :param knowledge_service_id: Unique knowledge service identifier
      :returns: KnowledgeServiceConfig object if found, None otherwise

   .. py:method:: get_many(knowledge_service_ids)
      :async:

      Retrieve multiple knowledge service configs by ID.

      :param knowledge_service_ids: List of unique knowledge service identifiers
      :returns: Dict mapping knowledge_service_id to KnowledgeServiceConfig (or None if not found)

   .. py:method:: list_all()
      :async:

      List all knowledge service configurations.

      :returns: List of all knowledge service configurations, sorted by knowledge_service_id

   .. py:method:: save(knowledge_service)
      :async:

      Save a knowledge service configuration.

      :param knowledge_service: Complete KnowledgeServiceConfig to save

   .. py:attribute:: bucket_name
      :value: 'knowledge-service-configs'

   .. py:attribute:: client

   .. py:attribute:: logger


.. py:class:: MinioKnowledgeServiceQueryRepository(client)

   Bases: :py:obj:`julee.domain.repositories.knowledge_service_query.KnowledgeServiceQueryRepository`, :py:obj:`julee.repositories.minio.client.MinioRepositoryMixin`

   Minio implementation of KnowledgeServiceQueryRepository.

   This implementation stores knowledge service queries as JSON objects in
   Minio buckets, following the established patterns for Minio repositories
   in this system. Each query is stored as a separate object with
   deterministic naming.

   .. py:method:: generate_id()
      :async:

      Generate a unique query identifier.

      :returns: Unique string identifier for a new query

   .. py:method:: get(query_id)
      :async:

      Retrieve a knowledge service query by ID.

      :param query_id: Unique query identifier
      :returns: KnowledgeServiceQuery object if found, None otherwise

   .. py:method:: get_many(query_ids)
      :async:

      Retrieve multiple knowledge service queries by ID.

      :param query_ids: List of unique query identifiers
      :returns: Dict mapping query_id to KnowledgeServiceQuery (or None if not found)

   .. py:method:: list_all()
      :async:

      List all knowledge service queries.

      :returns: List of all knowledge service queries, sorted by query_id

   .. py:method:: save(query)
      :async:

      Store or update a knowledge service query.

      :param query: KnowledgeServiceQuery object to store

   .. py:attribute:: bucket_name
      :value: 'knowledge-service-queries'

   .. py:attribute:: client

   .. py:attribute:: logger


.. py:class:: MinioPolicyRepository(client)

   Bases: :py:obj:`julee.domain.repositories.policy.PolicyRepository`, :py:obj:`julee.repositories.minio.client.MinioRepositoryMixin`

   Minio implementation of PolicyRepository.

   This implementation stores policies as JSON objects in the "policies"
   bucket. Each policy includes its complete validation scores and optional
   transformation queries.

   .. py:method:: generate_id()
      :async:

      Generate a unique policy identifier.

   .. py:method:: get(policy_id)
      :async:

      Retrieve a policy by ID.

   .. py:method:: get_many(policy_ids)
      :async:

      Retrieve multiple policies by ID.

      :param policy_ids: List of unique policy identifiers
      :returns: Dict mapping policy_id to Policy (or None if not found)

   .. py:method:: save(policy)
      :async:

      Save a policy to Minio.

   .. py:attribute:: client

   .. py:attribute:: logger

   .. py:attribute:: policies_bucket
      :value: 'policies'
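
The repositories above share one async calling pattern: ``generate_id``, ``save``, ``get``, and a ``get_many`` that maps every requested ID to an entity or ``None``. The sketch below illustrates that contract only. ``InMemoryPolicyRepository`` is a hypothetical stand-in written for this example, not part of julee, and policies are modelled as plain dicts purely as an assumption; the real entity types and the Minio client wiring are not shown here.

```python
import asyncio
import uuid
from typing import Dict, List, Optional


class InMemoryPolicyRepository:
    """Hypothetical stand-in mirroring the documented async method names.

    It keeps entities in a dict instead of a Minio bucket, so it only
    demonstrates the calling pattern, not the storage behaviour.
    """

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}

    async def generate_id(self) -> str:
        # Assumption: IDs are opaque unique strings.
        return f"policy-{uuid.uuid4()}"

    async def save(self, policy: dict) -> None:
        # Assumption: the entity carries its own ID under "policy_id".
        self._store[policy["policy_id"]] = policy

    async def get(self, policy_id: str) -> Optional[dict]:
        return self._store.get(policy_id)

    async def get_many(self, policy_ids: List[str]) -> Dict[str, Optional[dict]]:
        # Every requested ID appears in the result; missing ones map to None.
        return {pid: self._store.get(pid) for pid in policy_ids}


async def demo() -> Dict[str, Optional[dict]]:
    repo = InMemoryPolicyRepository()
    policy_id = await repo.generate_id()
    await repo.save({"policy_id": policy_id, "title": "Example policy"})
    # Ask for one stored ID and one unknown ID.
    return await repo.get_many([policy_id, "missing-id"])


result = asyncio.run(demo())
```

Because ``get_many`` returns an entry for every requested ID, callers can tell "not found" apart from "not requested" without a second lookup. Code written against this shape should, under the assumptions stated above, carry over to the Minio classes, which take a ``client`` constructor argument.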