Upgrading Apache Airflow Versions

In a previous post we explained how to Install and Configure Apache Airflow (a platform to programmatically author, schedule and monitor workflows). The project is under active development, and new features and bug fixes are added with each release. At some point, you will want to upgrade to take advantage of these new features.

In this post we’ll go over the process that you should follow for upgrading Apache Airflow versions.

Note: You will need to separately make sure that your dags will be able to work on the new version of Airflow.

Upgrade Airflow

Note: These steps can also work to downgrade versions of Airflow

Note: Execute all of this on all the instances in your Airflow Cluster (if you have more than one machine)

  1. Gather information about your current environment and your target setup:
    • Get the Airflow Home directory. Placeholder for this value: {AIRFLOW_HOME}
    • Get the current version of Airflow you are running. Placeholder for this value: {OLD_AIRFLOW_VERSION}
      1. To get this value you can run:
        $ airflow version
    • Get the new version of Airflow you want to run. Placeholder for this value: {NEW_AIRFLOW_VERSION}
    • Are you using SQLite? Placeholder for this value: {USING_SQLITE?}
    • If you’re not using SQLite, check the airflow.cfg file (the sql_alchemy_conn and celery_result_backend configurations) to get the metastore type {AIRFLOW_DB_TYPE}, host name {AIRFLOW_DB_HOST}, database schema name {AIRFLOW_DB_SCHEMA}, username {AIRFLOW_DB_USERNAME}, and password {AIRFLOW_DB_PASSWORD}
  2. Ensure the new version of Airflow you want to Install is Available
    1. Run the following command (don’t forget to include the ‘==’):
      $ pip install airflow==
      • Note: This will throw an error saying that the version is not provided and then show you all the versions available. This is supposed to happen and is a way that you can find out what versions are available.
    2. View the list of versions available and make sure the version you want to install ‘{NEW_AIRFLOW_VERSION}’ is available
  3. Shutdown all the Airflow Services on the Master and Worker nodes
    1. webserver
      1. gunicorn processes
    2. scheduler
    3. worker – if applicable
      1. celeryd daemons
    4. flower – if applicable
    5. kerberos ticket renewer – if applicable
  4. Take backups of various components to ensure you can Rollback
    1. Optionally, you can create a directory to house all of these backups. The below steps assume you’re going to create this type of folder and push all your backup objects to {AIRFLOW_BACKUP_FOLDER} (a consolidated sketch of these backup commands appears after this list). But you can just as easily rename the files you want to back up if that’s more convenient.
      • Create the backup folder:
        $ mkdir -p {AIRFLOW_BACKUP_FOLDER}
    2. Backup your Configurations
      • Move the airflow.cfg file to the backup folder:
        $ cd {AIRFLOW_HOME}
        $ mv airflow.cfg {AIRFLOW_BACKUP_FOLDER}
    3. Backup your DAGs
      • Zip up the Airflow DAGs folder and move it to the backup folder:
        $ cd {AIRFLOW_HOME}
        $ zip -r airflow_dags.zip dags
        $ mv airflow_dags.zip {AIRFLOW_BACKUP_FOLDER}
      • Note: You may need to install the zip package
    4. Backup your DB/Metastore
      1. If you’re using sqlite ({USING_SQLITE?}):
        • Move the airflow.db sqlite db to the backup folder:
          $ cd {AIRFLOW_HOME}
          $ mv airflow.db {AIRFLOW_BACKUP_FOLDER}
      2. If you’re using a SQL database like MySQL or PostgreSQL, take a dump of the database.
        • If you’re using MySQL you can use the following command:
          $ mysqldump --host={AIRFLOW_DB_HOST} --user={AIRFLOW_DB_USERNAME} --password={AIRFLOW_DB_PASSWORD} {AIRFLOW_DB_SCHEMA} > {AIRFLOW_BACKUP_FOLDER}/airflow_metastore_backup.sql
  5. Upgrade Airflow
    1. Run the following PIP command to install Airflow and the required dependencies:
      $ sudo pip install airflow=={NEW_AIRFLOW_VERSION} --upgrade
      $ sudo pip install airflow[hive]=={NEW_AIRFLOW_VERSION} --upgrade
    2. Note: If you installed additional sub-packages of Airflow you will need to upgrade those too
  6. Regenerate and Update Airflow Configurations
    1. Regenerate the airflow.cfg that was backed up using the following command:
      $ airflow initdb
      • Note: The reason you want to regenerate the airflow.cfg file is that, between versions of Airflow, new configurations might have been added, or the default values of configurations you don’t override might have changed.
    2. Remove the generated airflow.db file
      $ cd {AIRFLOW_HOME}
      $ rm airflow.db
    3. If you’re using sqlite, copy the old airflow.db file you backed up back to the original place
      $ cd {AIRFLOW_HOME}
      $ cp {AIRFLOW_BACKUP_FOLDER}/airflow.db .
    4. Manually copy all of the individual updated configurations from the old airflow.cfg file that you backed up to the new airflow.cfg file
      • Compare the airflow.cfg files (backed up and new one) to determine which configurations you need to copy over. This may include the following configurations:
        • executor
        • sql_alchemy_conn
        • base_url
        • load_examples
        • broker_url
        • celery_result_backend
    5. Review the airflow.cfg file further to ensure all values are set to the correct value
  7. Upgrade Metastore DB
    • Run the following command:
      $ airflow upgradedb
  8. Restart your Airflow Services
    • The same ones you shutdown in step #3
  9. Test the upgraded Airflow Instance
    • High Level Checklist:
      • Do the services start up without errors?
      • Do the DAGs run as expected?
      • Do the plugins you have installed (if any) load and work as expected?
  10. Once you’re satisfied with the upgrade, you can optionally delete the {AIRFLOW_BACKUP_FOLDER} folder and its contents
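
For reference, the backup commands from step 4 can be strung together into one small script. This is only a sketch, assuming a MySQL metastore and the placeholder values gathered in step 1; drop or adjust the mysqldump line if you’re using SQLite or PostgreSQL.

#!/bin/bash
# Sketch: back up the Airflow config, DAGs, and MySQL metastore before upgrading
mkdir -p {AIRFLOW_BACKUP_FOLDER}
cd {AIRFLOW_HOME}
mv airflow.cfg {AIRFLOW_BACKUP_FOLDER}
zip -r airflow_dags.zip dags
mv airflow_dags.zip {AIRFLOW_BACKUP_FOLDER}
mysqldump --host={AIRFLOW_DB_HOST} --user={AIRFLOW_DB_USERNAME} --password={AIRFLOW_DB_PASSWORD} {AIRFLOW_DB_SCHEMA} > {AIRFLOW_BACKUP_FOLDER}/airflow_metastore_backup.sql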

Rollback Airflow

In the event you encounter a problem during the upgrade process and would like to roll back to the version you had before, follow these instructions:

  1. Take note of what step you stopped at in the upgrade process
  2. Stop all the Airflow Services
  3. If you reached step #7 in the upgrade steps above (Step: Upgrade Metastore DB)
    1. Restore the database to the original state
      1. If you’re using sqlite ({USING_SQLITE?})
        1. Delete the airflow.db file that’s there and copy the old airflow.db file from your backup folder to its original place:
          $ cd {AIRFLOW_HOME}
          $ rm airflow.db
          $ cp {AIRFLOW_BACKUP_FOLDER}/airflow.db .
      2. If you’re using a SQL database like MySQL or PostgreSQL, restore the dump of the database
        • If you’re using MySQL you can use the following command:
          $ mysql --host={AIRFLOW_DB_HOST} --user={AIRFLOW_DB_USERNAME} --password={AIRFLOW_DB_PASSWORD} {AIRFLOW_DB_SCHEMA} < {AIRFLOW_BACKUP_FOLDER}/airflow_metastore_backup.sql
  4. If you reached step #6 in the upgrade steps above (Step: Regenerate and Update Airflow Configurations)
    • Copy the airflow.cfg file that you backed up back to its original place:
      $ cd {AIRFLOW_HOME}
      $ rm airflow.cfg
      $ cp {AIRFLOW_BACKUP_FOLDER}/airflow.cfg .
  5. If you reached step #5 in the upgrade steps above (Step: Upgrade Airflow)
    • Downgrade Airflow back to the original version:
      $ sudo pip install airflow=={OLD_AIRFLOW_VERSION} --upgrade
      $ sudo pip install airflow[hive]=={OLD_AIRFLOW_VERSION} --upgrade
    • Note: If you installed additional sub-packages of Airflow you will need to downgrade those too
  6. If you reached step #4 in the upgrade steps above (Step: Take backups)
    1. Restore the airflow.cfg file (if you haven’t already done so)
      $ cd {AIRFLOW_HOME}
      $ cp {AIRFLOW_BACKUP_FOLDER}/airflow.cfg .
    2. If you’re using sqlite ({USING_SQLITE?}), restore the airflow.db file (if you haven’t already done so)
      $ cd {AIRFLOW_HOME}
      $ cp {AIRFLOW_BACKUP_FOLDER}/airflow.db .
  7. Restart all the Airflow Services
  8. Test the restored Airflow Instance

Installing and Configuring Apache Airflow

Apache Airflow is a platform to programmatically author, schedule and monitor workflows – it supports integration with 3rd party platforms so that you, our developer and user community, can adapt it to your needs and stack.

Additional Documentation:

Documentation: https://airflow.incubator.apache.org/

Install Documentation: https://airflow.incubator.apache.org/installation.html

GitHub Repo: https://github.com/apache/incubator-airflow

Preparing the Environment

Install all needed system dependencies

Ubuntu

  1. SSH onto target machine(s) where you want to install Airflow
  2. Login as Root
  3. Install Required Libraries
    #Update package lists
    apt-get update
     
    #Unzip
    apt-get install unzip
     
    #Build Essentials - GCC Compiler
    apt-get install build-essential
     
    #Python Development
    apt-get install python-dev
     
    #SASL
    apt-get install libsasl2-dev
     
    #Pandas
    apt-get install python-pandas
  4. Check Python Version
    1. Run the command:
      python -V
    2. If the version comes back as “Python 2.7.X” you can skip the rest of this step
    3. Install Python 2.7.X
      cd /opt
      sudo wget --no-check-certificate https://www.python.org/ftp/python/2.7.6/Python-2.7.6.tar.xz
      tar xf Python-2.7.6.tar.xz
      cd Python-2.7.6
      ./configure --prefix=/usr/local
      make && make altinstall
      
      
      ls -ltr /usr/local/bin/python*
      
      
      vi ~/.bashrc
      #add this line alias python='/usr/local/bin/python2.7'
      
      source ~/.bashrc
  5. Install PIP
    1. Run Install
      cd /tmp/
       
      wget https://bootstrap.pypa.io/ez_setup.py
       
      python ez_setup.py
      
      unzip setuptools-*.zip
      cd setuptools-*
      
      easy_install pip
    2. Verify Installation
       which pip
       
      # Should print out the path to the pip command
    3. If you come across an issue where, while using pip below, it’s still referring to Python 2.6, you can follow these instructions (a quick check to verify the result is sketched after this step)
      1. Replace the binaries in the /usr/bin/ directory with the ones that were just installed
        cd /usr/bin/
        
        #Backup old binaries
        mv pip pip-BACKUP
        mv pip2 pip2-BACKUP
        mv pip2.6 pip2.6-BACKUP
        
        #Setup symlinks to the new version of pip that was installed
        ln -s /usr/local/bin/pip pip
        ln -s /usr/local/bin/pip2 pip2
        ln -s /usr/local/bin/pip2.7 pip2.7
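
After relinking, it’s worth a quick sanity check that pip is now backed by Python 2.7 (pip --version reports the interpreter it runs under):

which pip
pip --version
# The output should mention python 2.7, e.g. a path under /usr/local/lib/python2.7/site-packages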

Troubleshooting installation on Ubuntu:

  • If you later get the error “error trying to exec ‘as’: execvp: No such file or directory” while trying to install airflow with PIP
    • Install the following:
      apt-get install binutils
      apt-get install gcc
      apt-get install build-essential
      pip install pandas
    • Retry installation
    • If the problem persists, uninstall the packages listed above and reinstall. Then rerun.

CentOS

  1. SSH onto target machine(s) where you want to install Airflow
  2. Login as Root
  3. Install Required Libraries
    yum groupinstall "Development tools"
     
    yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel python-devel wget cyrus-sasl-devel.x86_64
  4. Check Python Version
    1. Run the command:
      python -V
    2. If the version comes back as “Python 2.7.X” you can skip the rest of this step
    3. Install Python 2.7.X
      cd /opt
      sudo wget --no-check-certificate https://www.python.org/ftp/python/2.7.6/Python-2.7.6.tar.xz
      tar xf Python-2.7.6.tar.xz
      cd Python-2.7.6
      ./configure --prefix=/usr/local
      make && make altinstall
      
      ls -ltr /usr/local/bin/python*
      
      vi ~/.bashrc
      #add this line alias python='/usr/local/bin/python2.7'
      
      
      source ~/.bashrc
  5. Install PIP
    1. Run Install
      cd /tmp/
      
      wget https://bootstrap.pypa.io/ez_setup.py
      
      python ez_setup.py
      
      unzip setuptools-X.X.zip
      cd setuptools-X.X
      
      easy_install pip
    2. Verify Installation
       which pip
       
      #Should print out "/usr/local/bin/pip"

Troubleshooting on CentOS:

  • If you get an error saying ImportError: No module named extern while Installing PIP with easy_install
    1. Reinstall python-setuptools:
      yum reinstall python-setuptools
    2. Retry installation

Install Airflow

Login as Root and run:

pip install airflow==1.7.0
pip install airflow[hive]==1.7.0
pip install airflow[celery]==1.7.0

Update: Common Issue with Celery

Recently there were some updates to the dependencies of Airflow where if you were to install the airflow[celery] dependency for Airflow 1.7.x, pip would install celery version 4.0.2. This version of celery is incompatible with Airflow 1.7.x. This would result in various types of errors including messages saying that the CeleryExecutor can’t be loaded or that tasks are not getting executed as they should.

To get around this issue, install an older version of celery using pip:

pip install celery==3.1.17
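
To confirm which celery version actually ended up installed, a quick pip check works (nothing Airflow-specific here):

pip freeze | grep -i celery
# Should report celery==3.1.17 after the downgrade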

Install RabbitMQ

If you intend to use RabbitMQ as a message broker you will need to install RabbitMQ. If you don’t intend to, you can skip this step. For production it is recommended that you use the CeleryExecutor, which requires a message broker such as RabbitMQ.

Setup

Follow these steps: Install RabbitMQ

Recovering from a RabbitMQ Node Failure

If you’ve opted to set up RabbitMQ to run as a cluster and one of those cluster nodes fails, you can follow these steps to recover Airflow (a command-line alternative to step 4 is sketched after the list):

  1. Bring the RabbitMQ node and daemon back up
  2. Navigate to the RabbitMQ Management UI
  3. Click on Queues
  4. Delete the “Default” queue
  5. Restart Airflow Scheduler service
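
If you’d rather script step 4 than click through the Management UI, the RabbitMQ management HTTP API can delete the queue. A sketch, assuming the management plugin is enabled on port 15672, the default vhost (/), the default guest credentials, and that {QUEUE_NAME} matches the default_queue setting in your airflow.cfg:

curl -i -u guest:guest -X DELETE http://{RABBITMQ_HOST}:15672/api/queues/%2F/{QUEUE_NAME}
# %2F is the URL-encoded name of the default vhost "/"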

Install MySQL Dependencies

If you intend to use MySQL as the DB repo you will need to install some MySQL dependencies. If you don’t intend to, you can skip this step.

Install MySQL Dependencies on Ubuntu

  1. Install MySQL Dependencies
     apt-get install python-dev libmysqlclient-dev
     pip install MySQL-python

Install MySQL Dependencies on CentOS

  1. Install MySQL Dependencies
    yum install -y mysql-devel python-devel python-setuptools
    pip install MySQL-python
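
A quick way to confirm the MySQL bindings installed correctly is to import them from the Python interpreter you’ll be running Airflow with:

python -c "import MySQLdb; print(MySQLdb.__version__)"
# Should print the MySQL-python version without an ImportError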

Configuring Airflow

It’s recommended to use RabbitMQ as the message broker.

Apache Airflow needs a home, ~/airflow is the default, but you can lay foundation somewhere else if you prefer (OPTIONAL)

export AIRFLOW_HOME=~/airflow

Run the following as the desired user (whoever you want executing the Airflow jobs) to set up the airflow directories and default configs

airflow initdb
 
#note: When you run this the first time, it will generate a sqlite file (airflow.db) in the AIRFLOW_HOME directory for the Airflow Metastore. If you don't intend to use sqlite as the Metastore then you can remove this file.

Make the following changes to the {AIRFLOW_HOME}/airflow.cfg file (a quick way to review the values afterwards is sketched after this list)

  1. Change the Executor to CeleryExecutor (Recommended for production)
    executor = CeleryExecutor
  2. Point SQL Alchemy to MySQL (if using MySQL)
    sql_alchemy_conn = mysql://{USERNAME}:{PASSWORD}@{MYSQL_HOST}:3306/airflow
  3. Set DAGs to be paused at creation. This is a good idea to avoid unwanted runs of the workflow. (Recommended)
    # Are DAGs paused by default at creation
    dags_are_paused_at_creation = True
  4. Don’t load examples
    load_examples = False
  5. Set the Broker URL (If you’re using CeleryExecutors)
    1. If you’re using RabbitMQ:
      broker_url = amqp://guest:guest@{RABBITMQ_HOST}:5672/
    2. If you’re using AWS SQS:
      broker_url = sqs://{ACCESS_KEY_ID}:{SECRET_KEY}@
       
      # Note: You will also need to install boto:
      $ pip install -U boto
  6. Point Celery to MySQL (if using MySQL)
    celery_result_backend = db+mysql://{USERNAME}:{PASSWORD}@{MYSQL_HOST}:3306/airflow
  7. Set the default_queue name used by the CeleryExecutor (Optional: primarily if you have a preference for the default queue name or plan on using the same broker for multiple Airflow instances)
    # Default queue that tasks get assigned to and that workers listen on.
    default_queue = {YOUR_QUEUE_NAME_HERE}
  8. Setup MySQL (if using MySQL)
    1. Login to the mysql machine
    2. Create the airflow database if it doesn’t exist
      CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
    3. Grant access
      GRANT ALL ON airflow.* TO '{USERNAME}'@'%' IDENTIFIED BY '{PASSWORD}';
  9. Run initdb to setup the database tables
    airflow initdb
  10. Create needed directories
    cd {AIRFLOW_HOME}
    mkdir dags
    mkdir logs
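
Once the edits are done, a quick grep over airflow.cfg is a handy way to double-check the values you just set (key names as used above):

grep -E "^(executor|sql_alchemy_conn|broker_url|celery_result_backend|load_examples|dags_are_paused_at_creation|default_queue) =" {AIRFLOW_HOME}/airflow.cfg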

Configuring Airflow – Advanced (Optional)

Email Alerting

Enable email alerting for when a task or job fails.

  1. Edit the {AIRFLOW_HOME}/airflow.cfg file
  2. Set the properties
    1. Properties
      • SMTP_HOST - Host of the SMTP Server
      • SMTP_TLS - Whether to use TLS when connecting to the SMTP Server
      • SMTP_USE_SSL - Whether to use SSL when connecting to the SMTP Server
      • SMTP_USER - Username for connecting to the SMTP Server
      • SMTP_PORT - Port to use for SMTP Server
      • SMTP_PASSWORD - Password associated with the user that’s used to connect to the SMTP Server
      • SMTP_EMAIL_FROM - Email to send Alert Emails as
    2. Example
      [email]
      email_backend = airflow.utils.send_email_smtp
      
      [smtp]
      # If you want airflow to send emails on retries, failure, and you want to use
      # the airflow.utils.send_email function, you have to configure an smtp
      # server here
      smtp_host = {SMTP_HOST}
      smtp_starttls = {SMTP_TLS: True or False}
      smtp_ssl = {SMTP_USE_SSL: True or False}
      smtp_user = {SMTP_USER}
      smtp_port = {SMTP_PORT}
      smtp_password = {SMTP_PASSWORD}
      smtp_mail_from = {SMTP_EMAIL_FROM}

Password Authentication

To enable password authentication for the web app, follow these instructions: http://airflow.incubator.apache.org/security.html

Controlling Airflow Services

By default you have to use the Airflow command line tool to start up the services. You can use the below commands to start the processes in the background and dump the output to log files (a small helper script that wraps these commands is sketched after the list).

Starting Services

  1. Start Web Server
    nohup airflow webserver $* >> ~/airflow/logs/webserver.logs &
  2. Start Celery Workers
    nohup airflow worker $* >> ~/airflow/logs/worker.logs &
  3. Start Scheduler
    nohup airflow scheduler >> ~/airflow/logs/scheduler.logs &
  4. Navigate to the Airflow UI
  5. Start Flower (Optional)
    • Flower is a web UI built on top of Celery, to monitor your workers.
    nohup airflow flower >> ~/airflow/logs/flower.logs &
  6. Navigate to the Flower UI (Optional)
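
If you find yourself starting these services together often, the commands above can be wrapped in a small helper script. A minimal sketch (the script name and log locations are just examples; drop the worker and flower lines on nodes that don’t run them):

#!/bin/bash
# start_airflow.sh - start the Airflow services in the background (sketch)
LOG_DIR=~/airflow/logs
mkdir -p $LOG_DIR
nohup airflow webserver >> $LOG_DIR/webserver.logs 2>&1 &
nohup airflow scheduler >> $LOG_DIR/scheduler.logs 2>&1 &
nohup airflow worker >> $LOG_DIR/worker.logs 2>&1 &
nohup airflow flower >> $LOG_DIR/flower.logs 2>&1 &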

Stopping Services

Search for the service and run the kill command:

# Get the PID of the service you want to stop
ps -eaf | grep airflow
# Kill the process
kill -9 {PID}
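
If you’d rather target one service at a time instead of picking PIDs out of the ps output, pkill can match on the command line. A sketch; the webserver’s gunicorn processes may need a separate check:

pkill -f "airflow scheduler"
pkill -f "airflow worker"
pkill -f "airflow webserver"
# Verify nothing is left behind (gunicorn processes belong to the webserver)
ps -eaf | grep -E "airflow|gunicorn" | grep -v grep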

Setting up Systemd to Run Airflow

Deploy Systemd Scripts

  1. Login as Root
  2. Get the zipped up Airflow
    cd /tmp/
    wget https://github.com/apache/incubator-airflow/archive/{AIRFLOW_VERSION}.zip
    
    #Example: "wget https://github.com/apache/incubator-airflow/archive/1.7.0.zip"
  3. Unzip the file
    unzip {AIRFLOW_VERSION}.zip
    # This will extract the contents into: incubator-airflow-{AIRFLOW_VERSION}
  4. Distribute the Systemd files
    cd incubator-airflow-{AIRFLOW_VERSION}/scripts/systemd/
    
    # Update the contents of the airflow file.
    # Set the AIRFLOW_HOME if it's anything other than the default
    vi airflow
    # Copy the airflow property file to the target location
    cp airflow /etc/sysconfig/
    
    # Update the contents of the airflow-*.service files
    # Set the User and Group values to the user and group you want the airflow service to run as
    vi airflow-*.service
    # Copy the airflow services files to the target location
    cp airflow-*.service /etc/systemd/system/
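
On distributions where systemd is the active init system, reload it after copying the unit files so it picks them up:

# Make systemd pick up the newly copied unit files
systemctl daemon-reload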

How to Use Systemd

Webserver
# Starting up the Service
service airflow-webserver start

# Stopping the Service
service airflow-webserver stop
# Restarting the Service
service airflow-webserver restart
# Checking the Status of the Service
service airflow-webserver status
# Viewing the Logs
journalctl -u airflow-webserver -e
Celery Worker
# Starting up the Service
service airflow-worker start

# Stopping the Service
service airflow-worker stop
# Restarting the Service
service airflow-worker restart
# Checking the Status of the Service
service airflow-worker status
# Viewing the Logs
journalctl -u airflow-worker -e
Scheduler
# Starting up the Service
service airflow-scheduler start

# Stopping the Service
service airflow-scheduler stop
# Restarting the Service
service airflow-scheduler restart
# Checking the Status of the Service
service airflow-scheduler status
# Viewing the Logs
journalctl -u airflow-scheduler -e
Flower (Optional)
# Starting up the Service
service airflow-flower start

# Stopping the Service
service airflow-flower stop
# Restarting the Service
service airflow-flower restart
# Checking the Status of the Service
service airflow-flower status
# Viewing the Logs
journalctl -u airflow-flower -e

Setting up Airflow Services to Run on Machine Startup

Webserver
chkconfig airflow-webserver on
Celery Worker
chkconfig airflow-worker on
Scheduler
chkconfig airflow-scheduler on
Flower (Optional)
chkconfig airflow-flower on
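
chkconfig is the SysV-style tool; on systems managed purely by systemd, the equivalent is to enable the units directly (same service names as above):

systemctl enable airflow-webserver
systemctl enable airflow-worker
systemctl enable airflow-scheduler
systemctl enable airflow-flower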

Testing Airflow

Example Dags

https://github.com/apache/incubator-airflow/tree/master/airflow/example_dags

High Level Testing

Note: You will need to deploy the tutorial.py dag.

airflow test tutorial print_date 2016-03-30

#[2016-03-30 18:39:46,621] {bash_operator.py:72} INFO - Output:
#[2016-03-30 18:39:46,623] {bash_operator.py:76} INFO - Wed Mar 30 18:39:46 UTC 2016

Running a Sample Airflow DAG

Assume the following code is in the dag at {AIRFLOW_HOME}/dags/sample.py

from airflow import DAG
from airflow.operators import DummyOperator
from datetime import datetime, timedelta

# Default arguments applied to every task in this DAG
default_args = {
    'owner': 'airflow',
    'start_date': datetime.now() - timedelta(seconds=10),
    'retries': 0
}

# A minimal DAG with a single no-op task, used purely to test the installation
dag = DAG('sample', default_args=default_args, start_date=datetime.now() - timedelta(seconds=10))

op = DummyOperator(task_id='dummy', dag=dag)
Verify the DAG is Available

Verify that the DAG you deployed is available in the list of DAGs

airflow list_dags

The output should list the ‘sample’ DAG

Running a Test

Let’s test by running the actual task instances on a specific date. The date specified in this context is an execution_date, which simulates the scheduler running your task or dag at a specific date + time:

airflow test sample dummy 2016-03-30
Run

Here’s how to run a particular task. Note: It might fail if the dependent tasks are not run successfully.

airflow run sample dummy 2016-04-22T00:00:00 --local
Trigger DAG

Trigger a DAG run

airflow trigger_dag sample
Backfill

Backfill will respect your dependencies, emit logs into files, and talk to the database to record status. If you have a webserver up, you’ll be able to track the progress visually as your backfill progresses.

airflow backfill sample -s 2016-08-21
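
Backfill also accepts an end date via the -e flag if you want to bound the run; the dates here are just examples:

airflow backfill sample -s 2016-08-21 -e 2016-08-23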

Helpful Operations

Getting Airflow Version

airflow version

Find Airflow Site-Packages Installation Location

Sometimes it might be helpful to find the source code so you can perform some other operations to help customize your Airflow experience. This is how you can find where the Airflow source code is installed:

  1. Start up a Python CLI
    python
  2. Run the following code to find where the airflow source code is installed
    import site
    import os

    # Look through every site-packages directory for an installed airflow package
    SITE_PACKAGES = site.getsitepackages()
    print "All Site Packages: " + str(SITE_PACKAGES)

    AIRFLOW_INSTALL_DIR = None
    for site_package in SITE_PACKAGES:
        test_path = site_package + "/airflow"
        if os.path.exists(test_path):
            AIRFLOW_INSTALL_DIR = test_path

    print "Site Package Containing Airflow: " + str(AIRFLOW_INSTALL_DIR)

Usual Site Package Paths:

  • Centos
    • /usr/lib/python2.7/site-packages

Change Alert Email Subject

By default, the Airflow Alert Emails are always sent with a subject like: Airflow alert: <TaskInstance: [DAG_NAME].[TASK_ID] [DATE] [failed]>. If you would like to change this to provide more information as to which Airflow cluster you’re working with, you can follow these steps.

Note: It requires a very small modification of the Airflow Source Code.

  1. Go to the Airflow Site-Packages Installation Location
    1. Example Path: /usr/lib/python2.7/site-packages/airflow
  2. Edit the models.py file
  3. Search for the text “Airflow alert: ”
    1. Using nano
      1. Open the file
      2. Hit CTRL+w
      3. Type in “Airflow alert” and hit enter
  4. Modify this string to whatever you would like.
    1. The original value title = "Airflow alert: {self}".format(**locals()) will produce ‘Airflow alert: <TaskInstance: [DAG_NAME].[TASK_ID] [DATE] [failed]>’
    2. An updated value like title = "Test Updated Airflow alert: {self}".format(**locals()) will produce ‘Test Updated Airflow alert: <TaskInstance: [DAG_NAME].[TASK_ID] [DATE] [failed]>’
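
To locate the exact line before editing, you can also grep for the string (the path here is the example site-packages location from step 1):

grep -n "Airflow alert:" /usr/lib/python2.7/site-packages/airflow/models.py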

Set Logging Level

If you want to get more information in the logs (debug) or log less information (warn) you can follow these steps to set the logging level

Note: It requires a very small modification of the Airflow Source Code.

  1. Go to the Airflow Site-Packages Installation Location of airflow
  2. Edit the settings.py file
  3. Set the LOGGING_LEVEL variable to your desired value
    1. debug → logging.DEBUG
    2. info → logging.INFO
    3. warn → logging.WARN
  4. Restart the Airflow Services
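
The same grep approach helps you find the variable before editing (the path is the example site-packages location used earlier):

grep -n "LOGGING_LEVEL" /usr/lib/python2.7/site-packages/airflow/settings.py
# Then change the line to, for example:
# LOGGING_LEVEL = logging.DEBUG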