Integrating Apache Spark and NiFi for Data Lakes

Think Big’s Pipeline Controller (PCNG) is open-source software, with subscription support, for creating a turn-key Data Lake solution. It includes an intuitive user interface for self-service data ingest and wrangling, with no coding required. It tracks metadata, including lineage, so that data stewards and data scientists can quickly catalog, discover, and qualify data. It also offers an operations dashboard for SLA tracking and feed monitoring, and it builds on modern open-source frameworks such as Apache Spark and NiFi.
