What is this Python project?
Dagster is a job orchestration framework for building data assets. It is a great tool that works on multiple platforms and is very easy to set up. It has a lot of useful features, such as:
- Manages data assets with code: you build Python-based data jobs to load, transfer, and transform data
- Integrates with dbt and provides a user interface to run dbt models
- Provides a UI to monitor jobs, debug runs, inspect assets, and launch backfills
- Multiplatform - works on Windows and macOS with a plain pip install (no separate non-Python setup needed)
- Offers multiple executors to run tasks, including in-process, multi-process, Dask, Celery, Docker, and Kubernetes
What's the difference between this Python project and similar ones?
- The closest comparison for Dagster is Airflow, but Dagster concentrates more on orchestrating data assets, such as table materializations
- It is not just a scheduler, but a framework for building jobs with the Dagster API
Anyone who agrees with this pull request can submit an Approve review.