FileReference
Content
A Tesseract that mounts input and output directories as datasets. Use it for Tesseracts with large inputs and/or outputs.
Example Tesseract (examples/filereference)
Using InputFileReference and OutputFileReference, you can include references to files in the InputSchema and OutputSchema of a Tesseract. The file reference schemas make sure that a file exists (either locally or in the Tesseract) and resolve paths correctly in both tesseract-runtime and tesseract run calls.
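The idea of resolving a relative reference against a base directory and checking that the file exists can be sketched in plain Python. This is only an illustration of the behavior described above, not the Tesseract implementation; resolve_input_reference is a hypothetical name and the base directory stands in for the configured input path.

```python
from pathlib import Path


def resolve_input_reference(base_dir: Path, reference: str) -> Path:
    """Hypothetical helper: resolve a relative file reference against base_dir.

    Illustration only; the real file reference schemas may behave differently.
    """
    resolved = (base_dir / reference).resolve()
    # Mirror the "file must exist" guarantee of the file reference schemas.
    if not resolved.is_file():
        raise FileNotFoundError(f"{reference!r} not found under {base_dir}")
    return resolved
```

Under this assumption, a payload entry such as "sample_0.json" would map to a path below the input directory, and a missing file would fail before apply runs.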
import shutil
from pathlib import Path

from pydantic import BaseModel

# InputFileReference, OutputFileReference, and get_config are provided by the
# Tesseract runtime; their import lines are omitted here.

class InputSchema(BaseModel):
    data: list[InputFileReference]

class OutputSchema(BaseModel):
    data: list[OutputFileReference]

def apply(inputs: InputSchema) -> OutputSchema:
    output_path = Path(get_config().output_path)
    files = []
    for source in inputs.data:
        # source is a pathlib.Path starting with /path/to/input_path/...
        target = output_path / source.name
        # target must be a pathlib.Path at /path/to/output_path
        target = target.with_suffix(".copy")
        shutil.copy(source, target)
        files.append(target)
    return OutputSchema(data=files)
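To try the copy logic of apply without the Tesseract runtime, the loop can be isolated into a standalone function. This is a rough sketch under the assumption that inputs are ordinary pathlib.Path objects; copy_with_suffix and its output_dir parameter are hypothetical names that replace the schema and get_config().output_path.

```python
import shutil
from pathlib import Path


def copy_with_suffix(sources: list[Path], output_dir: Path) -> list[Path]:
    """Copy each source file into output_dir, replacing its suffix with .copy."""
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sources:
        # Same target naming as in apply: keep the file name, swap the suffix.
        target = (output_dir / source.name).with_suffix(".copy")
        shutil.copy(source, target)
        copied.append(target)
    return copied
```

With this naming, an input called sample_0.json ends up as sample_0.copy in the output directory.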
For the tesseract-runtime command, paths are relative to the local input/output paths:
tesseract-runtime apply \
--input-path ./testdata \
--output-path ./output \
'{"inputs": {"data": ["sample_0.json", "sample_1.json"]}}'
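When the input directory holds many files, the JSON payload can be assembled programmatically instead of typed by hand. The following is a minimal sketch; build_payload and its pattern parameter are hypothetical, and it assumes every matching file directly inside the input directory should be passed by its relative name.

```python
import json
from pathlib import Path


def build_payload(input_dir: Path, pattern: str = "*.json") -> str:
    """Return the JSON payload listing matching files relative to input_dir."""
    # Only file names are used, since paths are relative to the input path.
    names = sorted(p.name for p in input_dir.glob(pattern) if p.is_file())
    return json.dumps({"inputs": {"data": names}})
```

The resulting string can be passed as the final argument of the command in place of the literal JSON.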
For the tesseract run command, the file reference schemas resolve to the mounted input/output folders inside the Tesseract:
tesseract run filereference apply \
--input-path ./testdata \
--output-path ./output \
'{"inputs": {"data": ["sample_2.json", "sample_3.json"]}}'
For Python SDK usage examples, see test_tesseract.py.