Connecting to data sources in Power BI is a fundamental step for creating insightful and interactive reports. Without connecting to data source(s), we cannot access data and cannot run queries to perform analytics. Power BI provides robust tools for connecting, transforming, and visualizing data.
Types of Data Sources in Power BI
Power BI supports and can connect to a wide variety of data sources, ranging from excel files, databases to online & cloud services. Here are key data sources that Power BI can connect to:
- Files – These are data files stored on your local computer or cloud storage. Files data sources are Excel, CSV, XML, JSON, PDF, etc.
- Databases – Power BI can connect to various databases, both on-premises and cloud-based. Database data sources are SQL Server, Oracle, MySQL, PostgreSQL, etc.
- Online Services – Power BI integrates with a wide range of online services, allowing to connect to cloud-based data sources. Online service based data sources are SharePoint, Salesforce, Dynamics 365, Google Analytics, etc.
- Azure Data Services – Power BI supports deep integration with Azure data services. Azure data services are Azure SQL Database, Azure Data Lake, Azure Blob Storage etc.
- SaaS providers – Connect directly to Software as a Service (SaaS) applications like Microsoft 365, Mailchimp, Google BigQuery etc.
- Web and API Data Sources – Power BI supports to import data directly from websites or APIs. Example – OData feeds, REST APIs, Web data, etc.
- Big Data and Analytics – Power BI connects to big data platforms and tools for advanced analytics. Example – Hadoop (HDFS), Spark, Impala etc.
- On-Premise Data Sources – For on-premises data, Power BI uses data gateways to connect securely.
- Cloud Storage Services – Power BI can connect to various cloud storage services like One Drive, Google Drive, Dropbox etc.
- Other Data Sources – Power BI also supports various other data sources like R and Python Scripts, Azure Marketplace etc.
Connecting to Data Sources
Here’s a detailed guide on how to connect to different types of data sources in Power BI. Follow the following steps to connect to data sources and import data in Power BI.
After starting the Power BI application, either we can directly go to connect to the data source or we can first start the blank report and then connect the data source for that report.
- Start the Power BI application
- Open the blank report by clicking on the black report icon
Get Data:
- In the Power BI Desktop home ribbon, click on the “Get Data” button. A window will pop up with a list of available data sources.
- Choose the data source you want to connect to, such as Excel, SQL Server, SharePoint, etc.
Examples to connect to different data sources
Connecting to an Excel File
- In the “Get Data” window, select Excel from the list of data sources.
- Click “Connect”.
- Browse to Your File – A file browser will open. Navigate to the location of your Excel file and select that file. Click “Open”.
- Select Tables/Sheets: – The Navigator window will appear, listing all the tables or sheets in the Excel file.
- Select the tables or sheets you want to load into Power BI.
- Click “Load” to import the data or “Transform Data” to clean and transform it before loading.
Connecting to SQL Server Database
- In the “Get Data” window, choose SQL Server from the database list.
- Click “Connect”.
Enter Server Details:
- Enter the server name and, if required, the database name.
- Choose the appropriate authentication method (Windows or Database authentication).
- Click “OK”.
Select Data to Import:
- In the Navigator window, select the tables or views from the database.
- Click “Load” to import the data into Power BI.
Connecting to Web Data
- In the “Get Data” window, select Web from the list of data sources.
- Click “Connect”.
- Enter the URL of the web page that contains the data we want to extract.
- Click “OK”.
- Power BI will analyze the web page and list the available tables.
- Select the data tables you want to import.
- Click “Load” to import the data into Power BI.
The imported data will appear in the data pane on the left side of the screen.
After importing the data, we can transform and clean this data and can start creating visuals and reports using this data in Power BI.
Refreshing Data
Note – Remember to refresh the data periodically to keep your reports up to date with the latest information from the data source. We can do this by clicking on the “Data” tab and selecting “Refresh Data” or by setting up a schedule for data refresh in the Power BI service.
- Manual Refresh:
- To refresh the data manually in Power BI Desktop, click the “Refresh” button in the Home ribbon.
- Scheduled Refresh:
- If we publish the report to Power BI Service, we can set up a scheduled refresh to automatically update data from the data source.
- Go to Power BI Service, navigate to dataset, and configure the Scheduled Refresh settings.
DirectQuery Mode vs Import Mode
Direct Query and Import Mode are two data connectivity methods in Power BI that determine how data is accessed, stored, and processed within your reports and dashboards. Each mode has its own advantages and disadvantages, making it suitable for different use cases.
Memory and data queries are very important concepts in Power BI, especially while working with the Import method of connection. So, before we go on to contrast these two very important terms, ‘memory’ and ‘data query’, we shall have to know the role of these while we use Power BI.
Import Mode
In Import Mode, data from the data source is imported and stored in Power BI’s in-memory data model. This means that a copy of the data that we are connected to and is loaded into Power BI Desktop or Power BI Service.
- Imported data is stored within a compressed, columnar storage format in memory in Power BI, where it remains after importation.
- Performance – Importing data with this mode gives higher performance. The data is stored in-memory and access locally, this makes queries and visualizations operations very fast.
- Data Refresh – Data in Import Mode is static after it’s loaded. To update the data, we need to perform a manual or scheduled refresh, which re-imports the data from the source.
- Using the in-memory data we can create complex models, with DAX (Data Analysis Expression) language and perform advanced calculations. It supports generation of reports together with visualizations just from the data imported.
Advantages of Import Mode
- High Performance – Because of in-memory data model, queries are executed quickly
- Complex Transformations – Power BI allows complex transformations and modeling on the imported data.
- Offline Capability – We can work with data offline because it’s stored locally within the Power BI file.
Disadvantages of Import Mode
- Memory Consumption – Large datasets can consume significant amount of memory, which can limit the size of the datasets we can import.
- Data Staleness – Data can become outdated. Manual refresh and schedule refresh options are possible, But is not be suitable for real-time data analysis.
- Refresh Limitations – In the Power BI Service, there are limitations on how often you can refresh your data.
Use Cases for Import Mode
In Power BI Desktop, import mode is used to upload data from different data sources like Excel files, CSV files, databases, and cloud services like Microsoft Dynamics, Salesforce, and Azure. With the Import method, data is extracted from the source into Power BI for analysis. Here are a few scenarios in which we can use Import mode in Power BI:
- In cases of Static data analysis where data doesn’t change frequently or is produced in batches
- When Performance is critical, importing data into Power BI will give faster query performance than Direct Query mode.
- Import data for transformation and filtering
- We can combine multiple sources efficiently with Power BI Desktop because import will help you merge diverse information sources into one.
- We can load records into a model so all the relations among tables are available. This helps in complex analysis between tables.
- Good for smaller data size and scheduled refresh scenarios.
DirectQuery Mode
In DirectQuery Mode, data is not imported into Power BI. Instead, queries are sent directly to the underlying data source every time a visualization or report is refreshed. The data remains in the original source and is fetched on demand.
- DirectQuery in Power BI is the way to connect to the data that permits users to link up with a data source for real-time analysis and reporting. Power BI DirectQuery connects to the original data source, but it doesn’t store any data in memory.
- The performance depends on the underlying data source’s ability to handle the queries. If the source is powerful and optimized, performance can be good. otherwise, it can be slower compared to Import Mode.
- Real-Time Data – Data is queried directly from the data source. It always perform queries on the latest data. This makes DirectQuery suitable for real-time or near real-time analysis.
Advantages of DirectQuery Mode
- Real-Time Data – DirectQuery always works with the latest data from the data source. It does not refreshing of data.
- Handles Large Datasets – Data is not loaded into local in-memory. It is suitable for very large datasets that cannot be imported (or impractical to import) into Power BI.
- Storage Efficiency – Data is not imported into memory, so no additional memory required in Power BI.
Disadvantages of DirectQuery
- Performance Dependency – Performance depends heavily on the data source’s ability to process queries. Complex queries may lead to slower performance.
- Limited Transformations – DirectQuery supports limited data transformations compared to Import Mode. Might not be good to perform complex Power Query operations.
- Restrictions on DAX Calculations – Some DAX functions and features, like calculated columns and complex measures, are restricted in DirectQuery mode.
Use Cases for DirectQuery
Direct Query is a powerful feature in Power BI that enables real-time data analysis. It is supported by various data sources, including databases like SQL Server, Azure SQL Database, Oracle Database, and many others. Here are a few scenarios in which we can use direct query in Power BI:
- In case of very large data, we can use DirectQuery into Power BI because it allows us to query the data directly from the data source without first loading it into Power BI, which saves both space and time.
- For accessing real-time data, DirectQuery allows to access real-time information. Any changes made on the original datasets are reflected instantly in analytical reports and dashboards.
- DirectQuery supports situations where there’s frequent change in dataset like stock price or sensors without necessarily refreshing all data in Power BI.
Differences between Import & DirectQuery
Category | Import Mode | DirectQuery Mode |
---|---|---|
Data Storage | Data from the data source is copied (imported) and stored in-memory into the Power BI. | Data is stored in the actual data source (like database). No import of complete data into Power BI. Queries are sent directly to a data source via Power BI. |
Data Refreshing | Manual or Schedules import of data from data source on regular basis | Queries are running on actual data source and always get the latest results. No need of manual refresh of data. |
Performance | Data is stored in-memory. Performance is high for complex queries and aggregations. Better response and fast load of reports and dashboards. | Performance is is mostly depends on power (like database server configuration) of data source. Can cause longer wait times complex requests or larger datasets. |
Data Limitation | Specific limitations about dataset volume. Smaller data size is preferred. Amount of dataset depends on the memory available for Power BI. | Handles big datasets without any limits imposed by memory. Also queries based on large historical datasets. |
Real Time Analytics | After importing dataset, it does require any live connection with the data-source. Manual or schedule refresh of data can be done. Real time analytics is not supported. | Queries are sent directly to data-source with live connection and real time data. This supports real-time or near real-time analytics. |
Functionality | Certain advanced functionalities can be achieved using Import mode because the data is already stored and can be manipulated within the memory. | Some operations or changes are constrained with DirectQuery. Limiting live queries which makes it difficult for transformation of certain types of data models into an aggregate form. |
Hybrid Models (Composite Models)
Power BI also supports hybrid or composite models. In hybrid models we can use both Import Mode and DirectQuery Mode to generate analytics and to create the reports. This allows for a more flexible approach, combining the benefits of Import mode and DirectQuery.
Key features of Hybrid Model
- Combines both Import and DirectQuery modes within the same model.
- Specific tables or partitions within a table can be set to either Import or DirectQuery.
- Offers flexibility by allowing high-performance analytics on imported data while keeping certain tables or partitions in DirectQuery mode for real-time updates.
Benefits of Hybrid Model
- Performance Optimization – Import static datasets in-memory for performance and keep frequently updated tables in DirectQuery mode for real-time access
- Combine static reports with real-time analytics by using DirectQuery for current data while leveraging Import mode for historical analysis.
- Partitioned Data Loading – Partition a large table into segments where some partitions use Import mode (for older data) and others use DirectQuery (for recent data).
- Hybrid models add complexity to dataset management. Proper understanding and planning is necessary before using hybrid models.
Use Cases of Hybrid Model
- Sales Analysis – Import historical sales data for fast trend analysis while keeping recent sales data in DirectQuery for real-time reporting.
- Inventory Management – Use Import mode for historical stock levels and DirectQuery for current stock data to ensure up-to-date reporting.
- Financial Reporting – Combine static financial data (Import) with live operational data (DirectQuery) for comprehensive financial dashboards.
Data loading and Transformations
When we need to load the data from different sources like Excel files, CSV files, databases, and cloud service, there are two option available:
- Click Load to import the data directly into Power BI.
- Click Transform Data to open Power Query Editor for further manipulation before loading.
When you click on load, data directly move to Power BI Desktop which can be used to create visuals, dashboards and to make a report which can be shared with team members.
If we click on Transform data, it open into Power Query Editor for further manipulation before loading the data into Power BI Desktop.
This section was to get basic understanding between data loading and data transformation. Complete detail about Data transformation are covered in next article.
Summary
In this article, we learned about connecting to data sources in Power BI. Following sections were explored: