julee.domain.models.document.document

Document domain models for the Capture, Extract, Assemble, Publish (CEAP) workflow.

This module contains the core document domain objects that represent documents and their metadata in the CEAP workflow system.

All domain models use Pydantic BaseModel for validation, serialization, and type safety, following the patterns established in the sample project.

Classes

Document

Complete document entity including content and metadata.

DocumentStatus

Status of a document through the Capture, Extract, Assemble, Publish pipeline.

Functions

delegate_to_content(*method_names)

Decorator to delegate IO methods to the content stream property.

Module Contents

class julee.domain.models.document.document.Document(/, **data)

Bases: pydantic.BaseModel

Complete document entity including content and metadata.

This is the primary domain model that represents a complete document in the CEAP workflow system. Content is provided as a ContentStream for efficient handling of both small and large documents.

The content stream is excluded from JSON serialization - use separate content endpoints for streaming binary data over HTTP.
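As a rough illustration of how a field can be kept out of JSON output, the sketch below uses Pydantic's `Field(exclude=True)`. `SketchDocument` is a hypothetical stand-in, `io.BytesIO` substitutes for the real `ContentStream` type, and nothing here is taken from the actual `Document` implementation, which may achieve the exclusion differently.

```python
import io
import json
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class SketchDocument(BaseModel):
    """Hypothetical stand-in for Document; not the real julee model."""

    # io.BytesIO is not a Pydantic-native type, so arbitrary types are allowed.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    document_id: str
    content_type: str
    # exclude=True keeps the stream out of model_dump() and model_dump_json().
    content: Optional[io.BytesIO] = Field(default=None, exclude=True)


doc = SketchDocument(
    document_id="doc-1",
    content_type="text/plain",
    content=io.BytesIO(b"hello"),
)

# Only the metadata fields are serialized; the stream stays on the instance.
payload = json.loads(doc.model_dump_json())
```

The stream remains readable through `doc.content`, so a separate endpoint could send its bytes while the JSON response carries only metadata.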

classmethod content_multihash_must_not_be_empty(v)
classmethod content_type_must_not_be_empty(v)
classmethod document_id_must_not_be_empty(v)
classmethod filename_must_not_be_empty(v)
validate_content_fields(info)

Ensure document has either content or content_string, not both.
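The validators listed above could plausibly be written with Pydantic v2's `field_validator` and `model_validator`. The following is a minimal sketch under that assumption: `SketchDocument` and the `is_valid` helper are hypothetical, `bytes` stands in for the content stream, and the error messages are invented. The sketch only rejects the case where both content fields are set; whether the real validator also rejects a document with neither is not stated here.

```python
from typing import Any, Optional

from pydantic import BaseModel, ValidationError, field_validator, model_validator


class SketchDocument(BaseModel):
    """Hypothetical stand-in for Document; not the real julee model."""

    document_id: str
    content: Optional[bytes] = None  # stand-in for the content stream
    content_string: Optional[str] = None

    @field_validator("document_id")
    @classmethod
    def document_id_must_not_be_empty(cls, v: str) -> str:
        # Treat whitespace-only identifiers as empty (an assumption).
        if not v.strip():
            raise ValueError("document_id must not be empty")
        return v

    @model_validator(mode="after")
    def validate_content_fields(self) -> "SketchDocument":
        # Runs after field validation, so both attributes are populated.
        if self.content is not None and self.content_string is not None:
            raise ValueError("provide content or content_string, not both")
        return self


def is_valid(**data: Any) -> bool:
    """Hypothetical helper: True if the data passes validation."""
    try:
        SketchDocument(**data)
    except ValidationError:
        return False
    return True
```

For example, `is_valid(document_id="doc-1", content_string="hi")` passes, while supplying both `content` and `content_string` fails.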

additional_metadata: dict[str, Any] = None
assembly_types: list[str] = None
content: julee.domain.models.custom_fields.content_stream.ContentStream | None = None
content_multihash: str = None
content_string: str | None = None
content_type: str
created_at: datetime.datetime | None = None
document_id: str
knowledge_service_id: str | None = None
original_filename: str
size_bytes: int = None
status: DocumentStatus
updated_at: datetime.datetime | None = None

class julee.domain.models.document.document.DocumentStatus

Bases: str, enum.Enum

Status of a document through the Capture, Extract, Assemble, Publish pipeline.

ASSEMBLED = 'assembled'
ASSEMBLY_SPECIFICATION_IDENTIFIED = 'assembly_specification_identified'
CAPTURED = 'captured'
EXTRACTED = 'extracted'
FAILED = 'failed'
PUBLISHED = 'published'
REGISTERED = 'registered'
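Because the enum derives from both `str` and `enum.Enum`, its members behave like plain strings in comparisons and JSON encoding. The sketch below shows this with a hypothetical `SketchStatus` that copies three of the values listed above; it does not import the real class.

```python
import json
from enum import Enum


class SketchStatus(str, Enum):
    """Hypothetical subset of DocumentStatus, for illustration only."""

    CAPTURED = "captured"
    PUBLISHED = "published"
    FAILED = "failed"


# Look a member up from its stored string value.
status = SketchStatus("captured")

# The str mixin lets members compare equal to their raw values.
matches_raw_string = SketchStatus.FAILED == "failed"

# json.dumps encodes the member as its string value.
encoded = json.dumps({"status": SketchStatus.PUBLISHED})
```

This is why a status field typed with such an enum round-trips cleanly through JSON as a lowercase string.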

julee.domain.models.document.document.delegate_to_content(*method_names)

Decorator to delegate IO methods to the content stream property.
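One way such a decorator might work is as a class decorator that attaches forwarding methods. The sketch below is an assumption about the general technique, not the actual implementation: `delegate_to_content_sketch` and `SketchHolder` are hypothetical names, and `io.BytesIO` stands in for the content stream.

```python
import io
from typing import Any, Callable


def delegate_to_content_sketch(*method_names: str) -> Callable[[type], type]:
    """Hypothetical class decorator: forward named methods to self.content."""

    def make_forwarder(method_name: str) -> Callable[..., Any]:
        # A factory binds method_name per method, avoiding late-binding bugs.
        def forwarder(self: Any, *args: Any, **kwargs: Any) -> Any:
            return getattr(self.content, method_name)(*args, **kwargs)

        forwarder.__name__ = method_name
        return forwarder

    def decorate(cls: type) -> type:
        for name in method_names:
            setattr(cls, name, make_forwarder(name))
        return cls

    return decorate


@delegate_to_content_sketch("read", "seek", "tell")
class SketchHolder:
    """Hypothetical object that owns a content stream."""

    def __init__(self, content: io.BytesIO) -> None:
        self.content = content


holder = SketchHolder(io.BytesIO(b"hello world"))
first = holder.read(5)    # forwarded to holder.content.read(5)
position = holder.tell()  # forwarded to holder.content.tell()
```

With this pattern, callers can treat the owning object as file-like without reaching into its `content` attribute directly.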