19 changes: 19 additions & 0 deletions .coveragerc
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
[report]
exclude_lines =
# Skip any pass lines such as may be used for @abstractmethod
pass

# Have to re-enable the standard pragma
pragma: no cover

# Don't complain about missing debug-only code:
def __repr__
if self\.debug

# Don't complain if tests don't hit defensive assertion code:
raise AssertionError
raise NotImplementedError

# Don't complain if non-runnable code isn't run:
if 0:
if __name__ == .__main__.:
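
Each `exclude_lines` entry above is treated by coverage.py as a regular expression that is searched for within a source line. The sketch below imitates that matching with Python's `re` module to show what the patterns catch; `EXCLUDE_PATTERNS` and `is_excluded` are hypothetical names for illustration, not part of coverage.py or of this PR.

```python
import re

# Patterns copied from the [report] exclude_lines section above.
EXCLUDE_PATTERNS = [
    r"pass",
    r"pragma: no cover",
    r"def __repr__",
    r"if self\.debug",
    r"raise AssertionError",
    r"raise NotImplementedError",
    r"if 0:",
    r"if __name__ == .__main__.:",
]


def is_excluded(line: str) -> bool:
    """Return True if any pattern is found anywhere in the line (illustrative helper)."""
    return any(re.search(pattern, line) for pattern in EXCLUDE_PATTERNS)


# The dots around __main__ match either quote style.
double_quoted = is_excluded('if __name__ == "__main__":')
single_quoted = is_excluded("if __name__ == '__main__':")

# The bare "pass" pattern is a substring search, so it also matches this line.
password_line = is_excluded("password = load_secret()")

# An ordinary statement matches none of the patterns.
ordinary_line = is_excluded("total = a + b")
```

Under this matching, the bare `pass` pattern would also exclude lines such as `password = ...`; a narrower pattern may be worth considering if that is not intended.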
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,9 @@
# Change Log
All notable changes to this project will be documented in this file.

## 2.2.1 - 2026-05-31
### Runner
- added option to use day="last" for monthly scheduling

## 2.2.0 - 2026-03
### Runner
21 changes: 11 additions & 10 deletions README.md
@@ -50,11 +50,12 @@ Runner uses a schedule-based approach:
#### Data Flow

For each pipeline execution:
1. **Dependency verification**: Check that all required input tables have data within specified time intervals
2. **Transformation execution**: Run your transformation to produce a Spark DataFrame
3. **Automatic enrichment**: Runner adds `INFORMATION_DATE` (the run date) and `VERSION` (package version) columns
4. **Partitioned write**: Data is written to Databricks with partitioning configuration
5. **Reporting**: Optional email notifications on failure and run information stored to tracking table
1. **Previous completion check**: Runner checks if the target table partition for the run date already exists (unless `rerun` is enabled)
2. **Dependency verification**: Check that all required input tables have data within specified time intervals
3. **Transformation execution**: Run your transformation to produce a Spark DataFrame
4. **Automatic enrichment**: Runner adds `INFORMATION_DATE` (the run date) and `VERSION` (package version) columns
5. **Partitioned write**: Data is written to Databricks with partitioning configuration
6. **Reporting**: Optional email notifications on failure; run information is stored in a tracking table
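
The six steps above can be sketched as a single function. This is only an illustrative outline, not Rialto's implementation: `execute_pipeline`, its callable parameters, and the returned status strings are hypothetical; plain dicts stand in for a Spark DataFrame; and skipping the run when the partition already exists is an assumption about what the completion check does.

```python
from datetime import date
from typing import Callable, Dict, List

PACKAGE_VERSION = "2.2.1"  # illustrative value, taken from the changelog entry

Row = Dict[str, object]


def execute_pipeline(
    run_date: date,
    rerun: bool,
    partition_exists: Callable[[date], bool],
    dependencies_ok: Callable[[date], bool],
    transform: Callable[[date], List[Row]],
    write: Callable[[List[Row]], None],
) -> str:
    """Hypothetical outline of one pipeline execution; not Rialto's actual code."""
    # 1. Previous completion check, bypassed when rerun is enabled.
    if not rerun and partition_exists(run_date):
        return "skipped"
    # 2. Dependency verification.
    if not dependencies_ok(run_date):
        return "failed"
    # 3. Transformation execution (plain dicts stand in for a Spark DataFrame).
    rows = transform(run_date)
    # 4. Automatic enrichment with the run date and package version.
    enriched = [
        {**row, "INFORMATION_DATE": run_date, "VERSION": PACKAGE_VERSION}
        for row in rows
    ]
    # 5. Partitioned write, delegated to the caller-supplied writer.
    write(enriched)
    # 6. Reporting is reduced here to the returned status string.
    return "success"


written: List[Row] = []
status = execute_pipeline(
    run_date=date(2024, 1, 15),
    rerun=False,
    partition_exists=lambda d: False,
    dependencies_ok=lambda d: True,
    transform=lambda d: [{"id": 1}],
    write=written.extend,
)
```

Passing the checks and the writer as callables keeps the sketch free of any Spark or Databricks dependency, so the ordering of the steps can be followed in isolation.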

#### Dependency Tracking

@@ -66,13 +67,13 @@ Runner's dependency tracking ensures that all required input data is available b
* **Missing data handling**: If required data is missing, Runner raises an error for that specific pipeline/date but continues executing other pipelines and dates in the queue

**Example:** If you're running a pipeline on 2024-01-15 with a dependency that has a 7-day interval:
* Runner checks if the dependency table has data for 2024-01-08 (15 days - 7 days)
* Runner checks whether the dependency table has data between 2024-01-08 and 2024-01-15 (the run date minus the 7-day interval, up to the run date)
* If the dependency has `filters: {VERSION: "v2"}`, it specifically checks for data where VERSION='v2'
* If data exists, the pipeline proceeds; otherwise, an error is raised for this specific execution, but other scheduled runs continue
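
The date arithmetic in this example can be sketched in a few lines. The helper names `dependency_window` and `has_data`, the row layout, and the inclusive bounds on both ends are assumptions made for illustration; they are not taken from Rialto's code.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple


def dependency_window(run_date: date, interval_days: int) -> Tuple[date, date]:
    """Return the (start, end) dates checked for a dependency (hypothetical helper)."""
    return run_date - timedelta(days=interval_days), run_date


def has_data(
    rows: List[Dict[str, object]],
    run_date: date,
    interval_days: int,
    filters: Optional[Dict[str, object]] = None,
) -> bool:
    """True if any row falls inside the window and matches every filter value."""
    start, end = dependency_window(run_date, interval_days)
    for row in rows:
        in_window = start <= row["date"] <= end  # inclusive bounds are an assumption
        matches = all(row.get(k) == v for k, v in (filters or {}).items())
        if in_window and matches:
            return True
    return False


window = dependency_window(date(2024, 1, 15), 7)

available = [
    {"date": date(2024, 1, 10), "VERSION": "v1"},
    {"date": date(2024, 1, 2), "VERSION": "v2"},
]
any_version = has_data(available, date(2024, 1, 15), 7)
only_v2 = has_data(available, date(2024, 1, 15), 7, {"VERSION": "v2"})
```

Here the only `v2` row falls outside the window, so the filtered check fails even though unfiltered data exists for the period.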

### Transformation
### Job/Transformation
For the details on the interface see the [implementation](rialto/runner/transformation.py)
Inside the transformation you have access to a [TableReader](#common), date of running, and if provided to Runner, a live spark session and [metadata manager](#metadata).
Inside the job you have access to a [TableReader](#common), the run date, and, if provided to Runner, a live Spark session and [metadata manager](#metadata).
You can implement your jobs either directly by extending the Transformation class or by using the [jobs](#jobs) abstraction.

### Runner
@@ -163,7 +164,7 @@ pipelines: # a list of pipelines to run
python_class: Pipeline2Class
schedule:
frequency: monthly
day: 6
day: 6 # or 'latest' for the last day of the month, otherwise avoid using days higher than 28 to ensure all months are covered
info_date_shift: # can be combined as a list
- units: "days"
value: 5
@@ -434,7 +435,7 @@ With that sorted out, we can now provide a quick example of the *rialto.jobs* mo
from pyspark.sql import DataFrame
from rialto.common import TableReader
from rialto.jobs import config_parser, job, datasource
from rialto.runner.config_loader import PipelineConfig
from rialto.runner.services.config_loader import PipelineConfig
from pydantic import BaseModel


4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -26,9 +26,9 @@
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = "rialto"
copyright = "2022, Marek Dobransky"
copyright = "2022-2026, Marek Dobransky"
author = "Marek Dobransky"
release = "2.2.0"
release = "2.2.1"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
7 changes: 7 additions & 0 deletions docs/source/modules.rst
@@ -0,0 +1,7 @@
rialto
======

.. toctree::
:maxdepth: 4

rialto
10 changes: 5 additions & 5 deletions docs/source/rialto.common.rst
@@ -5,33 +5,33 @@ Submodules
----------

rialto.common.env\_yaml module
----------------------------------
------------------------------

.. automodule:: rialto.common.env_yaml
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.common.table\_reader module
----------------------------------

.. automodule:: rialto.common.table_reader
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.common.utils module
--------------------------

.. automodule:: rialto.common.utils
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

Module contents
---------------

.. automodule:: rialto.common
:members:
:undoc-members:
:show-inheritance:
:undoc-members:
25 changes: 12 additions & 13 deletions docs/source/rialto.jobs.rst
@@ -5,50 +5,49 @@ Submodules
----------

rialto.jobs.decorators module
----------------------------------------
-----------------------------

.. automodule:: rialto.jobs.decorators
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.jobs.job\_base module
---------------------------------------
----------------------------

.. automodule:: rialto.jobs.job_base
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.jobs.module\register module
---------------------------------------
rialto.jobs.module\_register module
-----------------------------------

.. automodule:: rialto.jobs.module_register
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.jobs.resolver module
--------------------------------------
---------------------------

.. automodule:: rialto.jobs.resolver
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.jobs.test\_utils module
-----------------------------------------
------------------------------

.. automodule:: rialto.jobs.test_utils
:members:
:undoc-members:
:show-inheritance:

:undoc-members:

Module contents
---------------

.. automodule:: rialto.jobs
:members:
:undoc-members:
:show-inheritance:
:undoc-members:
9 changes: 4 additions & 5 deletions docs/source/rialto.loader.rst
@@ -9,30 +9,29 @@ rialto.loader.config\_loader module

.. automodule:: rialto.loader.config_loader
:members:
:undoc-members:
:show-inheritance:

:undoc-members:

rialto.loader.interfaces module
-------------------------------

.. automodule:: rialto.loader.interfaces
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.loader.pyspark\_feature\_loader module
---------------------------------------------

.. automodule:: rialto.loader.pyspark_feature_loader
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

Module contents
---------------

.. automodule:: rialto.loader
:members:
:undoc-members:
:show-inheritance:
:undoc-members:
10 changes: 5 additions & 5 deletions docs/source/rialto.maker.rst
@@ -9,37 +9,37 @@ rialto.maker.containers module

.. automodule:: rialto.maker.containers
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.maker.feature\_maker module
----------------------------------

.. automodule:: rialto.maker.feature_maker
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.maker.utils module
-------------------------

.. automodule:: rialto.maker.utils
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.maker.wrappers module
----------------------------

.. automodule:: rialto.maker.wrappers
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

Module contents
---------------

.. automodule:: rialto.maker
:members:
:undoc-members:
:show-inheritance:
:undoc-members:
6 changes: 3 additions & 3 deletions docs/source/rialto.metadata.data_classes.rst
@@ -9,21 +9,21 @@ rialto.metadata.data\_classes.feature\_metadata module

.. automodule:: rialto.metadata.data_classes.feature_metadata
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.metadata.data\_classes.group\_metadata module
----------------------------------------------------

.. automodule:: rialto.metadata.data_classes.group_metadata
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

Module contents
---------------

.. automodule:: rialto.metadata.data_classes
:members:
:undoc-members:
:show-inheritance:
:undoc-members:
8 changes: 4 additions & 4 deletions docs/source/rialto.metadata.rst
@@ -17,29 +17,29 @@ rialto.metadata.enums module

.. automodule:: rialto.metadata.enums
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.metadata.metadata\_manager module
----------------------------------------

.. automodule:: rialto.metadata.metadata_manager
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

rialto.metadata.utils module
----------------------------

.. automodule:: rialto.metadata.utils
:members:
:undoc-members:
:show-inheritance:
:undoc-members:

Module contents
---------------

.. automodule:: rialto.metadata
:members:
:undoc-members:
:show-inheritance:
:undoc-members:
23 changes: 23 additions & 0 deletions docs/source/rialto.rst
@@ -0,0 +1,23 @@
rialto package
==============

Subpackages
-----------

.. toctree::
:maxdepth: 4

rialto.common
rialto.jobs
rialto.loader
rialto.maker
rialto.metadata
rialto.runner

Module contents
---------------

.. automodule:: rialto
:members:
:show-inheritance:
:undoc-members: