Implementing Airflow¶
We use Google Cloud Composer to provide districts with a managed Airflow service that is deeply integrated with Google Cloud. You’ll need to create a Composer environment in your Google Cloud project that will house all DAG files for your integrations.
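For reference, a DAG file is just a Python module that Composer picks up from the `dags/` folder of the environment's Cloud Storage bucket. A minimal sketch of what one looks like (the DAG id, schedule, and task are placeholders, assuming an Airflow 1.10-era image):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.10 import path

default_args = {
    "owner": "airflow",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

# One file like this per integration; names here are placeholders.
with DAG(
    dag_id="example_vendor_integration",
    default_args=default_args,
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    say_hello = BashOperator(task_id="say_hello", bash_command="echo hello")
```

Uploading a file like this to the environment's `dags/` folder is all it takes for the scheduler to register it.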
Creating a Composer environment¶
Open your Composer environments page and click Create to make a new environment. Here are the settings we typically use (a scripted equivalent is sketched after the list):
- Name: we use the district’s name
- Node count: 3
- Location: select a location near you
- Machine type: n1-standard-1
- Image version: latest Airflow version available
- Python version: 3
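If you would rather script environment creation than click through the console, the same settings can be supplied through the Cloud Composer API. A sketch using google-api-python-client (the project, region, environment name, and image version string are placeholders; field names follow the Composer v1 REST API):

```python
from googleapiclient.discovery import build

# Placeholders -- substitute your own project, region, and district name.
project = "my-gcp-project"
location = "us-central1"
env_name = "district-name"

composer = build("composer", "v1")  # uses application default credentials
parent = f"projects/{project}/locations/{location}"

body = {
    "name": f"{parent}/environments/{env_name}",
    "config": {
        "nodeCount": 3,
        "nodeConfig": {
            # Machine types are zonal resources, so the zone is part of the path
            # and must match the zone the environment's nodes run in.
            "machineType": f"projects/{project}/zones/{location}-a/machineTypes/n1-standard-1",
        },
        "softwareConfig": {
            # Example only -- pin to the latest image in the Composer release notes.
            "imageVersion": "composer-1.10.0-airflow-1.10.6",
            "pythonVersion": "3",
        },
    },
}

operation = (
    composer.projects()
    .locations()
    .environments()
    .create(parent=parent, body=body)
    .execute()
)
print(operation["name"])  # long-running operation you can poll for completion
```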
PyPI Packages¶
There are a few PyPI packages that need to be installed. Click into your new Composer environment and enter the following packages under PYPI PACKAGES (a usage sketch follows the list):
- paramiko
- pysftp
- pyarrow
- pandas >=0.25.0
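These four packages cover the SFTP and file-format work in our integrations: paramiko is the SSH layer that pysftp wraps, and pyarrow supplies the Parquet engine for pandas. A hypothetical task showing how they fit together (the host, credentials, and paths are placeholders; real credentials belong in an Airflow connection or a secret store, not in the DAG file):

```python
import pandas as pd
import pysftp

# Hypothetical connection details -- do not hard-code real credentials.
SFTP_HOST = "sftp.vendor.example.com"
SFTP_USER = "district_user"
SFTP_PASS = "change-me"

def pull_vendor_extract():
    """Download a vendor CSV over SFTP and rewrite it as Parquet."""
    cnopts = pysftp.CnOpts()
    # Skips host-key verification for the sketch; in production, load the
    # vendor's host key into known_hosts instead.
    cnopts.hostkeys = None
    with pysftp.Connection(
        SFTP_HOST, username=SFTP_USER, password=SFTP_PASS, cnopts=cnopts
    ) as sftp:
        sftp.get("exports/students.csv", "/tmp/students.csv")

    df = pd.read_csv("/tmp/students.csv")
    # pyarrow provides the Parquet engine for pandas >= 0.25.
    df.to_parquet("/tmp/students.parquet", engine="pyarrow")
```

In a DAG, a function like this would run inside a `PythonOperator(task_id="pull_extract", python_callable=pull_vendor_extract)`.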
Assigning static IPs to Airflow¶
Some edtech vendors require the district to whitelist the IP addresses that requests will come from. For example, Lexia only allows requests to its SFTP server from whitelisted IPs. With Cloud Composer, the Airflow scheduler and workers run on Google Kubernetes Engine nodes, so you need to change the IP type of those nodes from ephemeral to static on the External IP addresses page.
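Promoting a node's IP works by reserving its current ephemeral address as a static one, which keeps the IP itself unchanged. You can do this on the External IP addresses page, or script it against the Compute Engine API; a sketch with google-api-python-client (the project, region, address name, and IP below are placeholders):

```python
from googleapiclient.discovery import build

# Placeholders -- use your project, the region of the environment's GKE
# cluster, and the node's current ephemeral external IP.
project = "my-gcp-project"
region = "us-central1"
node_ip = "203.0.113.10"

compute = build("compute", "v1")  # uses application default credentials

# Reserving an address with the node's existing external IP promotes it
# from ephemeral to static without changing the IP.
operation = compute.addresses().insert(
    project=project,
    region=region,
    body={"name": "composer-node-ip", "address": node_ip},
).execute()
print(operation["name"])
```

Keep in mind that GKE can recreate nodes during upgrades or repairs, so re-check the reserved addresses after cluster changes.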