julee.domain.models.document
Document domain package for the Capture, Extract, Assemble, Publish (CEAP) workflow.
This package contains the Document domain object and its related functionality for the CEAP workflow system.
Document represents complete document entities including content and metadata, providing a stream-like interface for efficient handling of both small and large documents.
Classes
| Document | Complete document entity including content and metadata. |
| DocumentStatus | Status of a document through the Capture, Extract, Assemble, Publish pipeline. |
Package Contents
- class julee.domain.models.document.Document(/, **data)
Bases: pydantic.BaseModel
Complete document entity including content and metadata.
This is the primary domain model that represents a complete document in the CEAP workflow system. Content is provided as a ContentStream for efficient handling of both small and large documents.
The content stream is excluded from JSON serialization - use separate content endpoints for streaming binary data over HTTP.
- additional_metadata: dict[str, Any] = None
- assembly_types: list[str] = None
- content: julee.domain.models.custom_fields.content_stream.ContentStream | None = None
- content_bytes: bytes | None = None
- content_multihash: str = None
- content_type: str
- created_at: datetime.datetime | None = None
- document_id: str
- knowledge_service_id: str | None = None
- original_filename: str
- size_bytes: int = None
- status: DocumentStatus
- updated_at: datetime.datetime | None = None
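The stream-like handling described above can be illustrated with a minimal standard-library sketch. It reads a binary stream in chunks and derives values of the kind stored in size_bytes and content_multihash. Everything here is an assumption made for illustration: summarize_stream and CHUNK_SIZE are hypothetical names, io.BytesIO stands in for ContentStream, and the hex-encoded SHA-256 multihash is only one possible encoding, which this page does not confirm the package uses.

```python
import hashlib
import io
from typing import BinaryIO, Tuple

CHUNK_SIZE = 64 * 1024  # illustrative chunk size, not taken from the package


def summarize_stream(stream: BinaryIO) -> Tuple[int, str]:
    """Return (size in bytes, hex multihash) without loading the whole stream.

    Hypothetical helper: it is not part of julee.
    """
    digest = hashlib.sha256()
    size = 0
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        digest.update(chunk)
        size += len(chunk)
    # Multihash layout for sha2-256: code 0x12, digest length 0x20, then the digest.
    # Hex encoding of the result is an assumption made for this sketch.
    multihash = bytes([0x12, 0x20]) + digest.digest()
    return size, multihash.hex()


# io.BytesIO stands in for a ContentStream here.
size_bytes, content_multihash = summarize_stream(io.BytesIO(b"hello world"))
```

Because only one chunk is held in memory at a time, the same loop works for a few bytes or for a file of several gigabytes.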
- class julee.domain.models.document.DocumentStatus
Bases: str, enum.Enum
Status of a document through the Capture, Extract, Assemble, Publish pipeline.
- ASSEMBLED = 'assembled'
- ASSEMBLY_SPECIFICATION_IDENTIFIED = 'assembly_specification_identified'
- CAPTURED = 'captured'
- EXTRACTED = 'extracted'
- FAILED = 'failed'
- PUBLISHED = 'published'
- REGISTERED = 'registered'
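Because DocumentStatus derives from both str and enum.Enum, its members behave like plain strings. The sketch below redefines the enum locally, copying the members listed above, so that it runs without julee installed; in real code the class would be imported from julee.domain.models.document instead. The variable names are illustrative only.

```python
import json
from enum import Enum


class DocumentStatus(str, Enum):
    """Local stand-in mirroring the members listed above (illustration only)."""

    ASSEMBLED = "assembled"
    ASSEMBLY_SPECIFICATION_IDENTIFIED = "assembly_specification_identified"
    CAPTURED = "captured"
    EXTRACTED = "extracted"
    FAILED = "failed"
    PUBLISHED = "published"
    REGISTERED = "registered"


# Look a member up by its stored value, e.g. when reading a persisted record.
status = DocumentStatus("captured")

# The str mixin makes members compare equal to their raw string values.
matches_raw_string = status == "captured"

# The standard json module encodes str subclasses as ordinary JSON strings.
encoded = json.dumps({"status": status})
```

Looking up a value that is not a member, such as DocumentStatus("unknown"), raises ValueError, which gives a simple way to reject unexpected status strings at a boundary.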