Introduction to DataPilot

What is DataPilot?

DataPilot is an AI-powered assistant for data engineers and analysts who work with SQL and dbt (data build tool). It runs inside the development environment and gives real-time insights and suggestions that help teams follow best practices and improve the quality of their data projects.

With DataPilot, teams can automate the review of their SQL queries and dbt models, which helps keep data transformations efficient, well-documented, and maintainable. It also supports organization-wide consistency by enforcing project standards through integration with version control systems and continuous integration/continuous deployment (CI/CD) pipelines.

Key Features

DataPilot's main features are:

  • Insightful Analysis: DataPilot performs in-depth analysis of SQL code and dbt projects, highlighting areas of concern such as model fanouts, hard-coded references, and potential duplications.

  • Seamless Integration: DataPilot can be integrated into local development environments as well as Git workflows and CI/CD pipelines, so it fits teams of all sizes.

  • Early Detection: By identifying potential issues early in the development cycle, DataPilot helps prevent costly and time-consuming fixes down the line.

  • Best Practice Enforcement: DataPilot encourages the adoption of best practices in SQL and dbt project development, aiding in the maintenance of high-quality data models.

  • Automated Checks: The tool includes a range of automated checks for detecting unused sources, ensuring dependency integrity, and encouraging comprehensive testing and documentation.

How DataPilot Works

DataPilot operates by scanning your SQL and dbt project files, identifying patterns and structures that indicate potential problems or deviations from best practices. Once an issue is detected, it provides feedback and recommendations on how to address it.
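
To make the idea of pattern scanning concrete, here is a minimal sketch of one such check: flagging hard-coded table references in a dbt model's SQL. It is an illustration only, not DataPilot's implementation. The function name find_hard_coded_references and the regular expression are assumptions made for this example, and a production check would more likely parse the SQL than rely on a regex.

```python
import re

# Hypothetical pattern: FROM or JOIN followed by a dotted identifier such as
# analytics.raw_orders. Jinja calls like {{ ref('orders') }} begin with "{"
# and therefore never match.
HARD_CODED_REF = re.compile(
    r"\b(?:from|join)\s+([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+)",
    re.IGNORECASE,
)


def find_hard_coded_references(sql: str) -> list[str]:
    """Return dotted table names that appear directly after FROM or JOIN."""
    return HARD_CODED_REF.findall(sql)


model_sql = """
select o.id, c.name
from analytics.raw_orders as o
join {{ ref('customers') }} as c on c.id = o.customer_id
"""

# The first relation is hard-coded; the second goes through ref().
print(find_hard_coded_references(model_sql))
```

In this sketch the recommendation attached to each finding would be to replace the hard-coded name with a ref() or source() call so that dbt can track the dependency.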

For dbt projects, DataPilot uses the manifest and catalog files generated by dbt (manifest.json and catalog.json) to perform its analysis. Its insights therefore reflect the state of your project at the time those files were last generated, so regenerate them after making changes to keep the analysis up to date.
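
As a rough illustration of what manifest-based analysis can look like, the sketch below reads two top-level keys of a dbt manifest, sources and child_map, to find unused sources and models with a high fanout. It is not DataPilot's actual code. The function names and the threshold value are assumptions chosen for this example, and the sample dictionary is a hand-written stand-in for a real manifest.json.

```python
import json
from pathlib import Path


def load_manifest(path: str) -> dict:
    """Load a dbt manifest.json file into a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def unused_sources(manifest: dict) -> list[str]:
    """Return the IDs of sources that no other node depends on."""
    child_map = manifest.get("child_map", {})
    return sorted(
        source_id
        for source_id in manifest.get("sources", {})
        if not child_map.get(source_id)
    )


def model_fanouts(manifest: dict, threshold: int = 3) -> dict[str, int]:
    """Return models whose number of direct child models exceeds threshold.

    The default threshold is an arbitrary value picked for this sketch.
    """
    fanouts = {}
    for node_id, children in manifest.get("child_map", {}).items():
        if not node_id.startswith("model."):
            continue
        child_models = [c for c in children if c.startswith("model.")]
        if len(child_models) > threshold:
            fanouts[node_id] = len(child_models)
    return fanouts


# Hand-written stand-in for a real manifest (hypothetical project "shop").
sample_manifest = {
    "sources": {"source.shop.raw.orders": {}, "source.shop.raw.refunds": {}},
    "child_map": {
        "source.shop.raw.orders": ["model.shop.stg_orders"],
        "source.shop.raw.refunds": [],
        "model.shop.stg_orders": [
            "model.shop.orders_daily",
            "model.shop.orders_weekly",
            "model.shop.orders_monthly",
            "test.shop.not_null_stg_orders_id",
        ],
    },
}

print(unused_sources(sample_manifest))
print(model_fanouts(sample_manifest, threshold=2))
```

With a real project, load_manifest("target/manifest.json") would replace the sample dictionary, assuming dbt's default target directory.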