Use Copilot in Power BI to quickly visualize information and uncover insights once your semantic model is prepared. By generating report pages, visuals, and summaries based on your prompts, Copilot makes it easier to create compelling, data-driven stories, even if you're new to Power BI.
Create reports with Copilot in Power BI
Copilot in Power BI helps you build interactive reports faster by turning your ideas into report elements with just a prompt. You can use suggested prompts or enter your own to create new report pages, suggest content, or answer specific data questions in the Copilot pane.
The following example was generated by using the Create a new report page suggested prompt. Copilot asked for a name and description, and then created the report elements, including:
Header with slicers for product and salesperson.
Card visuals for the sums of sales and cost, and a measure for profit (sales minus cost).
Area charts evaluating sales, cost, and profit over time.
Column charts evaluating sales, cost, and profit by product.
A Map visual for sales by region.
The report and all of its elements were generated quickly and in a consistent theme. Using Copilot saves time and helps you focus on analysis rather than manual layout and formatting.
Create reports in the Power BI service
When you create reports in Power BI Desktop and publish them, both the semantic model and the report are saved in your workspace. The semantic model contains your data, relationships, and measures, while the report is the visual layer you design.
Copilot can also help create reports based on semantic models in the Power BI service. To get started, select the ellipsis (…) next to your semantic model and choose Create report.
For this example report, Copilot offered several different possibilities, including:
Sales Performance by Salesperson
Regional Sales Analysis
Product Cost and Profitability
In the following screenshot, we selected the Product Cost and Profitability prompt to generate a new report page. This page has a layout similar to the one created with Power BI Desktop. There are Card visuals, Bar and Column charts, and Area charts to help analyze the cost and profitability of different products, including standard cost, profit, and profit margin.
Create pages with custom prompts
You can also provide your own prompt based on your knowledge of the data and business requirements for the report. There might be limitations to what Copilot can create based on your prompt. Some ideas for our model are:
Create a page that shows the top 10 products by sales.
Create a visual on a new page that shows the top three colors across all products sold.
Copilot can also help you communicate key findings by generating narrative summaries of your report data. Whether you’re preparing an executive summary or answering likely questions from leadership, Copilot’s summaries help keep your audience engaged and informed. Use the Narrative visual or Copilot pane to create summaries with suggested or custom prompts, referencing all visuals on a page or only selected ones.
Example prompts:
Give an executive summary
Answer likely questions from leadership
Create a bulleted list of insights
In the following screenshot of a Copilot-generated summary, some of the highlighted points are:
Highest and lowest sales across products.
A significant spike in sales.
Overall profit and highest profit on a single day.
These details offer insights at a glance while the whole report offers interactive visuals for further analysis.
Both developers and consumers can use the Copilot pane to summarize or answer questions about the report. Copilot considers all data, including data hidden behind filters or slicers that aren’t currently applied. You can specify the scope (entire report or current page) in your prompt.
Important
Copilot responses might vary and can contain errors. Treat Copilot-generated content as a first draft and always review it before completing development.
Power BI semantic models are the foundation for all reports, and incorrectly designed semantic models can result in limited or inaccurate reports. Copilot in Power BI helps increase productivity and bridge skill gaps by assisting with data preparation, model configuration, and DAX calculations. This unit explores ways to use Copilot in Power BI Desktop to develop a semantic model.
Clean, consistent data is the foundation for accurate reporting and analysis. Copilot can help automate repetitive tasks such as resolving inconsistencies, setting data types, and applying naming conventions. Use Power Query Editor to profile, clean, and transform your data.
Profile: Assess column quality, distribution, and profile.
Clean: Address missing values, correct out-of-range or inconsistent entries, and remove duplicates.
Transform: Rename columns, set correct data types, and reshape tables as needed.
This process also helps prepare your data for use with AI, which is covered later in this module.
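To make these steps concrete, here's a minimal pandas sketch of the same profile, clean, and transform sequence. This is for illustration only (Power Query performs these steps with M code behind the scenes), and the column names and sample values are hypothetical.

Python
# Minimal sketch of profile, clean, and transform steps in pandas.
# Illustration only; Power Query performs equivalent steps with M code.
# Column names and sample values are hypothetical.
import pandas as pd

df = pd.DataFrame({
    " product name": ["Bike", "bike", None, "Helmet"],
    "SalesAmount": ["100", "100", "250", "25"],
})

# Profile: assess column quality (missing values) and distribution.
print(df.isna().sum())
print(df.describe(include="all"))

# Clean: resolve inconsistent casing, drop missing values, remove duplicates.
df[" product name"] = df[" product name"].str.title()
df = df.dropna().drop_duplicates()

# Transform: apply naming conventions and set correct data types.
df = df.rename(columns={" product name": "ProductName"})
df["SalesAmount"] = df["SalesAmount"].astype(float)
print(df)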
Configure a semantic model
Relationships allow you to filter and summarize data in report visuals later in the development process. You can use the autodetect relationships feature to get started, and then ask Copilot to summarize the initial semantic model to determine whether any other relationships are needed.
In the following image, there’s a single fact table connected by relationships to dimension tables. Power BI reports work best when using a star or snowflake schema for the semantic model.
Copilot can summarize your semantic model and provide a high-level overview of the structure and key elements. This summary can help you understand important objects or missing relationships, for example.
Query with DAX
There are four views in Power BI Desktop: Report, Table, Model, and DAX Query. In the DAX Query view, you can select Copilot in the ribbon and use natural language to describe what you want.
Consider the following prompt, "total sales for all salespeople individually for all items in the accessories category", entered into Copilot in DAX Query view.
This prompt is intended to calculate each individual salesperson's total sales for the accessories category. At AdventureWorks, there are multiple categories, each containing multiple products. Accessories have low sales, and the business requirement is to better understand who sells the most accessories, so that those salespeople can share valuable information.
DAX
// DAX query generated by Fabric Copilot with "total sales for all salespeople individually for all items in the accessories category"
// Total sales for each salesperson for items in the accessories category
EVALUATE
SUMMARIZECOLUMNS(
    'Salesperson'[Salesperson],
    FILTER('Product', 'Product'[Category] == "Accessories"),
    "Total Sales", [Total Sales]
)
The following table shows sample results of the Copilot-generated DAX query.
Salesperson | Total Sales
Stephen Jiang | 8374.76
Michael Blythe | 38682.84
Linda Mitchell | 66916.05
Create measures from DAX queries
Use Copilot in DAX Query view to explore the data and determine which measures you need to create, and then select Update model with changes to create the measures. The following query was generated from a suggest measures prompt.
DAX
// DAX query generated by Fabric Copilot with "Suggest new measures in a DAX query for further analysis and try them out with one or more suitable columns"
DEFINE
    // New measure to calculate the average profit per product sold
    MEASURE 'Sales'[Avg Profit per Product] = DIVIDE([Profit], [Unique Products Sold])
    // New measure to calculate the average sales per reseller
    MEASURE 'Sales'[Avg Sales per Reseller] = DIVIDE([Total Sales], [Unique Resellers])
    // New measure to calculate the average quantity per order
    MEASURE 'Sales'[Avg Quantity per Order] = DIVIDE([Total Quantity], [Orders])
    // New measure to calculate the average sales per order
    MEASURE 'Sales'[Avg Sales per Order] = DIVIDE([Total Sales], [Orders])

// Evaluate the new measures
EVALUATE
ROW(
    "Avg Profit per Product", [Avg Profit per Product],
    "Avg Sales per Reseller", [Avg Sales per Reseller],
    "Avg Quantity per Order", [Avg Quantity per Order],
    "Avg Sales per Order", [Avg Sales per Order]
)
And here’s the resulting table:
Avg Profit per Product | Avg Sales per Reseller | Avg Quantity per Order | Avg Sales per Order
2992.4987 | 122703.4339 | 56.44745575221239 | 21445.9541
The following screenshot is the result of three simple steps:
Enter the suggest measures prompt.
Select Keep query after results return.
Run the query.
As a developer, you can then select Update model with changes to create the measures best suited to your project.
Copilot allows you to explore, design, and enhance semantic models more efficiently, making report development more accessible for all skill levels.
After you deploy your model to an endpoint, you can start interacting with it to see how it works. Let’s explore how you can use prompt engineering techniques to optimize your model’s performance.
Apply prompt patterns to optimize your model’s output
The quality of the questions you send to the language model directly influences the quality of the responses you get back. You can carefully construct your question, or prompt, to receive better and more interesting responses. The process of designing and optimizing prompts to improve a model's performance is known as prompt engineering.
Prompt engineering requires users to ask relevant, specific, unambiguous, and well-structured questions, instructing the model to generate more accurate responses. To understand how to create well-defined prompts, let’s explore some patterns that help you improve the output of a model:
Instruct the model to act as a persona.
Guide the model to suggest better questions.
Provide a template to generate output in a specific format.
Understand how a model reasons by asking it to reflect.
Add context to improve the accuracy of the model’s output.
As a developer, an effective approach is to add instructions to your deployed model in the form of a system prompt. The system prompt sets the model's behavior and lets you guide the model without exposing the instructions to the end user. The best results often come from combining an explicit system prompt with guidance or templates that help users submit prompts reflecting the patterns described here.
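For example, here's a minimal sketch of setting a system prompt when calling a deployed model from code, using the openai Python package's Azure client. The endpoint, API key, API version, and deployment name are placeholders for your own deployment's values.

Python
# Minimal sketch: the system prompt guides the model without being shown
# to the end user. Endpoint, key, and deployment name are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="<your-deployment-name>",
    messages=[
        # The system prompt sets the model's behavior.
        {"role": "system", "content": "You're a seasoned marketing professional who writes advertising copy for an audience of technical customers."},
        # The user prompt is what the end user actually types.
        {"role": "user", "content": "Write a one-sentence description of a CRM product."},
    ],
)
print(response.choices[0].message.content)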
Instruct the model to act as a persona
When you want the model to take a specific point of view or perspective when answering a question, you can ask it to act as a persona.
For example, if you're working in sales and need insights about new CRM software, you can ask the model to respond as if it's a seasoned marketing professional.
No persona specified:
System prompt: None
User prompt: Write a one-sentence description of a CRM product.
Model response: A CRM product is a software tool designed to manage a company's interactions with current and potential customers, encompassing sales, marketing, customer service, and technical support.

With persona specified:
System prompt: You're a seasoned marketing professional that writes advertising copy for an audience of technical customers.
User prompt: Write a one-sentence description of a CRM product.
Model response: Experience seamless customer relationship management with our CRM product, designed to streamline your business operations, enhance customer interactions, and drive sales growth with its intuitive interface and robust analytics tools.
You can create different personas for various business roles. Think of roles like a product manager, data analyst, customer service expert, or teacher. Define the persona and add more detail to what you expect that persona to know or do to get tailored, context-driven responses.
Ask for better question suggestions
When you want the model to help refine your question, you can ask it to suggest better ways to phrase your query. You can also ask it to provide extra context.
For example, if you’re seeking advice on cooking a meal, you can prompt the model to suggest clarifying questions to help you get a more targeted answer.
No suggestions given:
System prompt: You're an AI chef who can advise on menus.
User prompt: I have to host a dinner party for four people. What should I cook?
Model response: Hosting a dinner party for four is a wonderful opportunity to showcase your culinary skills. Here's a suggested three-course menu that balances flavors and provides a pleasant dining experience:
Appetizer: Caprese Salad Skewers. Ingredients: Cherry tomatoes, fresh mozzarella balls, fresh basil leaves, balsamic glaze, olive oil, salt, and pepper. Instructions: Thread a cherry tomato, a basil leaf, and a mozzarella ball onto small skewers. Drizzle with olive oil and balsamic glaze, then season with salt and pepper. This dish is light, refreshing, and easy to prepare ahead of time.
Main Course: Herb-Crusted Salmon with Lemon Asparagus. Ingredients: Salmon fillets, fresh herbs (such as parsley, dill, and thyme), breadcrumbs, lemon, asparagus, olive oil, salt, and pepper. Instructions: Preheat your oven to 400°F (200°C). Mix breadcrumbs with chopped fresh herbs and a bit of olive oil to form a paste. Press this mixture onto the salmon fillets and season with salt and pepper. Place salmon on a baking sheet and bake for about 15-20 minutes, or until cooked through. In the last 10 minutes of cooking, toss asparagus with olive oil, lemon zest, salt, and pepper, and roast alongside the salmon. Serve with a squeeze of fresh lemon juice over the salmon and asparagus.
Dessert: Chocolate Mousse with Fresh Berries. Ingredients: Dark chocolate, eggs, sugar, heavy cream, vanilla extract, and mixed berries (such as strawberries, blueberries, and raspberries). Instructions: Melt the chocolate over a double boiler, then let it cool slightly. Separate egg yolks and whites. Whip the egg whites to soft peaks, then gradually add sugar to make a meringue. Whip the heavy cream until soft peaks form and mix in vanilla extract. Fold the melted chocolate into the egg yolks, then gently fold in the whipped cream and meringue until combined. Spoon the mousse into serving dishes and refrigerate for at least 2 hours. Serve topped with fresh berries for a burst of color and freshness.
Additional Tips: Pairings: Consider serving a light white wine with the appetizer and main course, and perhaps a dessert wine with the chocolate mousse. Timing: Prepare the appetizer and dessert in advance to minimize last-minute stress. The main course will take the most time in the oven, so plan accordingly.
This menu is designed to be elegant yet approachable, ensuring your guests enjoy both the food and the experience. Enjoy your dinner party!

With suggestions given:
System prompt: You're an AI chef who can advise on menus. You're approachable and always want to provide the best advice you can.
User prompt: I have to host a dinner party for four people. What should I cook? What other information do you need to help me plan a great meal for my guests?
Model response: Hosting a dinner party is a wonderful opportunity to showcase your culinary skills and create memorable experiences for your guests. To help you plan a great meal, here are a few questions that can guide you in choosing the perfect menu:
Dietary Preferences/Restrictions: Are any of your guests vegetarian, vegan, gluten-free, or have any food allergies or intolerances? Knowing this will help in selecting dishes that everyone can enjoy.
Flavor Preferences: Do your guests have any favorite cuisines or flavors? For example, do they enjoy Italian, Mexican, Mediterranean, or perhaps something more adventurous like Thai or Indian?
Time and Skill Level: How much time do you have to prepare the meal, and what is your comfort level in the kitchen? Are you looking for a simple meal or are you up for something more elaborate?
Occasion: Is this a casual get-together, a celebration, or something more formal? The nature of the event might influence the menu style.
Seasonal Ingredients: Are you interested in using seasonal ingredients? This can help in choosing fresh and flavorful produce.
Drinks Pairing: Would you like to include alcoholic or non-alcoholic beverages to pair with the meal?
With this information, I can suggest a tailored menu. Here's a general idea that can be adapted based on your answers:
Starter: Garden Salad with Citrus Vinaigrette (fresh greens, seasonal vegetables, and a light citrus dressing), or Stuffed Mushrooms (a vegetarian option, filled with herbs and cheese).
Main Course: Herb-Crusted Chicken Breast served with roasted vegetables, or Quinoa Stuffed Bell Peppers (a flavorful vegetarian or vegan option).
Side Dishes: Garlic Butter Roasted Potatoes (classic and comforting), and Steamed Green Beans with Almonds (a simple, healthy side).
Dessert: Chocolate Lava Cake (a decadent treat to end the meal), or Fruit Tart (a lighter, refreshing option).
Drinks: A versatile white wine like Sauvignon Blanc or a light red like Pinot Noir; for a mocktail option, a refreshing citrus mocktail or sparkling water with herbs.
Feel free to share more details, and I can help refine the menu further!
Encouraging the model to suggest clarifications or extra details helps you ask the right questions and achieve better, more accurate answers in fewer interactions.
Specify the desired format for responses
When you want the model to generate output in a specific format, you can provide a template or structure in your prompt.
For example, if you're a sports reporter composing a historical article, you can request that the model follow a specific template, which includes headings, bullet points, and data breakdowns.
No template specified:
System prompt: You're a helpful AI assistant.
User prompt: What happened in the 2018 Soccer World Cup final?
Model response: The 2018 FIFA World Cup final took place on July 15, 2018, at the Luzhniki Stadium in Moscow, Russia. France faced Croatia in the match. France emerged victorious with a 4-2 win. The goals for France were scored by Mario Mandzukic (own goal), Antoine Griezmann (penalty), Paul Pogba, and Kylian Mbappé. Croatia's goals were scored by Ivan Perišić and Mandzukic. Kylian Mbappé, at just 19 years old, became the second youngest player to score in a World Cup final, after Pelé. This victory marked France's second World Cup title, their first being in 1998.

With template specified:
System prompt: You're a helpful AI assistant for sports reporters.
User prompt: What happened in the 2018 Soccer World Cup final? Format the result to show the match date, location, and the two teams competing. Then the final score, and finally any notable events that occurred during the match.
Model response:
Match Date: July 15, 2018
Location: Luzhniki Stadium, Moscow, Russia
Teams Competing: France vs. Croatia
Final Score: France 4, Croatia 2
Notable Events:
First Half: France took an early lead in the 18th minute with an own goal by Mario Mandžukić after a cross from Antoine Griezmann. Croatia equalized in the 28th minute with a header from Ivan Perišić, who had just come on as a substitute. Just before halftime, in the 44th minute, Paul Pogba scored from a free kick to put France back in the lead.
Second Half: The game saw a dramatic increase in intensity after the break. In the 68th minute, Kylian Mbappé scored his first World Cup goal, bringing the score to 3-1 in favor of France. Croatia managed to score again in the 69th minute when Mandžukić headed in a cross from Perišić, reducing the deficit to 3-2.
Final Moments: In the 81st minute, a controversial penalty was awarded to France after a handball by Croatian defender Dejan Lovren. Griezmann converted the penalty, sealing the victory for France with a 4-2 scoreline.
Additional Context: This match was notable for being the youngest ever World Cup final in terms of average age, featuring players like Mbappé and N'Golo Kanté. France's victory marked their second World Cup title, having previously won in 1998. Croatia reached the final for the first time in their history, finishing as runners-up.
You can apply this approach to other scenarios where a specific format is needed, such as generating emails, summaries, proposals, or even code and scripts. Define the format template clearly and provide details on how you want the output structured to get consistent and organized responses.
You can also use a one-shot or few-shot approach by providing one or more examples to help the model identify a desired pattern, as the following sketch shows.
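Here's a minimal few-shot sketch, reusing the client and placeholder deployment name from the earlier example: example user and assistant turns teach the model the desired output pattern before the real question is asked.

Python
# Minimal few-shot sketch: example turns teach the output pattern.
# Reuses the client from the earlier sketch; deployment is a placeholder.
messages = [
    {"role": "system", "content": "You describe products in exactly five words."},
    # One example pair (one-shot); add more pairs for a few-shot prompt.
    {"role": "user", "content": "Describe a CRM product."},
    {"role": "assistant", "content": "Manages customer relationships, drives sales."},
    # The real request follows the examples.
    {"role": "user", "content": "Describe a project management product."},
]
response = client.chat.completions.create(model="<your-deployment-name>", messages=messages)
print(response.choices[0].message.content)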
Ask for an explanation of reasoning
When you want the model to explain the reasoning behind its answers, you can ask it to reflect on its rationale and assumptions after providing a response.
For example, if you’re working on a mathematical problem, you can ask the model to explain the reasoning behind specific calculations.
No reflection specified:
System prompt: You're an AI math assistant.
User prompt: A right-angled triangle has a hypotenuse of length 3 cm and an adjacent side of length 2 cm. What is the length of the remaining side?
Model response: Approximately 2.24 cm.

With reflection specified:
System prompt: You're an AI math assistant. You always explain your answers.
User prompt: A right-angled triangle has a hypotenuse of length 3 cm and an adjacent side of length 2 cm. What is the length of the remaining side?
Model response: We can use the Pythagorean theorem to solve this problem. The Pythagorean theorem states that in a right-angled triangle, the square of the length of the hypotenuse (the side opposite the right angle) is equal to the sum of the squares of the lengths of the other two sides. Mathematically, this can be written as: c² = a² + b², where c is the length of the hypotenuse, and a and b are the lengths of the other two sides. In this problem, we are given the length of the hypotenuse (c = 3 cm) and the length of one of the other sides (a = 2 cm). We need to find the length of the remaining side (b). Plugging the given values into the Pythagorean theorem, we get: 3² = 2² + b², so 9 = 4 + b². Now, we can solve for b²: b² = 9 - 4 = 5. Finally, we take the square root of both sides to find the length of the remaining side: b = √5 ≈ 2.24 cm. So, the length of the remaining side is approximately 2.24 cm.
You can apply this approach when you want explanations in data analysis, marketing strategy, or technical troubleshooting. Asking the model to explain its reasoning uses a technique called chain-of-thought prompting, which makes the model think step by step.
Add context
When you want the model to focus on specific topics, you can specify the context to consider. You can also tell the model to ignore irrelevant information.
For example, if you’re planning a trip, you can provide the model with more context to help improve the relevance of its response.
No context specified:
System prompt: You're an AI travel assistant.
User question: When should I visit Edinburgh?
Model response: Edinburgh, the capital city of Scotland, is a wonderful destination with a rich history and vibrant culture. The best time to visit Edinburgh largely depends on your preferences for weather, events, and tourist crowds.
Spring (April to June): The weather starts to warm up, and the days get longer. Springtime is perfect for exploring the city's historic sites, gardens, and parks, such as the Royal Botanic Garden. The Edinburgh Festival Fringe, the world's largest arts festival, takes place in August, which is the end of summer.
Summer (July to August): The weather is at its warmest, and the days are long. This is a popular time to visit, but it can also be crowded with tourists. Summer is ideal for outdoor activities, and you can enjoy the Edinburgh International Festival and the Royal Edinburgh Military Tattoo.
Autumn (September to November): The weather begins to cool down, and the leaves change color, creating a picturesque backdrop for the city's historic architecture. It's less crowded, and you can enjoy events like the Edinburgh International Book Festival.
Winter (December to February): The weather is colder, and there may be snowfall, which can create a magical atmosphere. Winter is a quieter time to visit, with fewer tourists and lower prices. You can attend events like the Edinburgh Christmas Market and the Hogmanay Street Party, which celebrates the New Year.
In summary, the best time to visit Edinburgh depends on your preferences. Spring and autumn offer pleasant weather and fewer crowds, while summer is ideal for outdoor activities and festivals. Winter can be a quieter, more magical experience, with festive events to enjoy.

With context specified:
System prompt: You're an AI travel assistant.
User question: When should I visit Edinburgh? I'm particularly interested in attending Scotland's home matches in the Six Nations rugby tournament.
Model response: The Six Nations rugby tournament typically runs from late January to early March. Edinburgh, as a host city, often has matches at Murrayfield Stadium, which is Scotland's national stadium. To ensure you don't miss Scotland's home matches in the Six Nations, I recommend planning your visit for this period.
However, keep in mind that match dates and times can vary, so it's a good idea to check the official Six Nations website or the Scottish Rugby Union's website for the exact schedule of matches. Additionally, Edinburgh has a lot to offer beyond rugby, with its rich history, stunning architecture, and vibrant cultural scene, so you might consider extending your stay to explore the city before or after the tournament.
Also, remember that visiting during the tournament might mean larger crowds and busier accommodations, so book your flights, hotel, and transportation well in advance to ensure a smooth experience. Enjoy your trip to Edinburgh and the thrilling atmosphere of the Six Nations rugby matches!
By defining what the model should focus on or disregard, you can ensure the conversation stays on track and generate more relevant, tailored responses.
You can specify the context by describing what it should or shouldn’t include, and by connecting the model to data sources it should retrieve context from before generating an answer.
Apply model optimization strategies
Note
This section discusses model optimization options that you might consider beyond prompt engineering. A full exploration of how to apply these optimization strategies is beyond the scope of this module.
Prompt engineering can be an effective way to optimize model responses, but in some cases it may not provide sufficient context or guidance to always meet your exact needs. As a developer, you can consider the following additional optimization strategies to improve the relevance of your generative AI application’s responses:
Retrieval Augmented Generation (RAG): A technique that uses a data source to provide grounding context to prompts. RAG is a useful approach when you need the model to answer questions based on a specific knowledge domain, or to consider information about events that occurred after its training data was collected.
Fine-tuning: A technique that involves extending the training of a foundation model by providing example prompts and responses that reflect the desired output format and style.
Both of these approaches involve additional cost, complexity, and maintainability challenges, so as a general rule it’s best to start your optimization efforts through prompt engineering, and then consider additional strategies if necessary.
The strategy you should choose as a developer depends on your requirements:
Optimize for context: When the model lacks contextual knowledge and you want to maximize response accuracy.
Optimize the model: When you want to improve the response format, style, or speech by maximizing consistency of behavior.
To optimize for context, you can apply a Retrieval Augmented Generation (RAG) pattern. With RAG, you ground the model's responses in your data by retrieving context from a data source before generating a response. For example, suppose you want employees to ask questions about expense claim processes and limits based on your own corporation's expense policy documentation.
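Here's a deliberately simplified RAG sketch, reusing the client and placeholder deployment from the earlier example. A few invented policy snippets stand in for real documentation, and retrieval is naive keyword overlap rather than the embedding-based search a production system would use.

Python
# Minimal RAG sketch: retrieve grounding context, then generate.
# Policy snippets are invented; retrieval is naive keyword overlap.
import re

policy_snippets = [
    "Meal expenses are reimbursed up to 50 USD per day during business travel.",
    "Hotel stays require pre-approval for rates above 200 USD per night.",
    "Expense claims must be submitted within 30 days of the purchase date.",
]

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    q = tokens(question)
    return max(documents, key=lambda d: len(q & tokens(d)))

question = "What is the meal limit per day during business travel?"
context = retrieve(question, policy_snippets)

# The retrieved context grounds the prompt before generation.
messages = [
    {"role": "system", "content": "Answer only from the provided policy context."},
    {"role": "user", "content": f"Context: {context}\n\nQuestion: {question}"},
]
response = client.chat.completions.create(model="<your-deployment-name>", messages=messages)
print(response.choices[0].message.content)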
When you want the model to respond in a specific style or format, you can instruct it by adding guidelines in the system message. If you notice the model's behavior still isn't consistent, you can further enforce consistency by fine-tuning. With fine-tuning, you train a base language model on a dataset of example prompts and responses before integrating it into your application; the fine-tuned model then produces responses consistent with the examples in the fine-tuning dataset.
You can use any combination of optimization strategies, for example, prompt engineering, RAG, and a fine-tuned model, to improve your language application.
When you develop a generative AI app, you need to integrate language models into your application, and to use a language model, you need to deploy it. Let's explore how to deploy language models in Azure AI Foundry, starting with why you deploy a model at all.
Why deploy a model?
A model is trained to generate output based on some input. To get value out of your model, you need a solution that lets you send input to the model, have the model process it, and then see the output.
With generative AI apps, the most common type of solution is a chat application that expects a user question, which the model processes, to generate an adequate response. The response is then visualized to the user as a response to their question.
You can integrate a language model with a chat application by deploying the model to an endpoint. An endpoint is a specific URL where a deployed model or service can be accessed. Each model deployment typically has its own unique endpoint, which allows different applications to communicate with the model through an API (Application Programming Interface).
When a user asks a question (sketched in code after this list):
An API request is sent to the endpoint.
The endpoint specifies the model that processes the request.
The result is sent back to the app through an API response.
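As a sketch of those three steps in code: the URL shape below follows the Azure OpenAI chat completions REST API, and the resource name, deployment name, API version, and key are all placeholders for your own deployment's values.

Python
# Minimal sketch of the question/answer flow against a deployed endpoint.
# Resource, deployment, API version, and key are placeholders.
import requests

endpoint = "https://<your-resource>.openai.azure.com"
deployment = "<your-deployment-name>"
url = f"{endpoint}/openai/deployments/{deployment}/chat/completions"

# 1. An API request is sent to the endpoint.
response = requests.post(
    url,
    params={"api-version": "2024-02-01"},
    headers={"api-key": "<your-api-key>"},
    json={"messages": [{"role": "user", "content": "What is an endpoint?"}]},
)

# 2. The endpoint routes the request to the deployed model, which processes it.
# 3. The result is sent back to the app through the API response.
print(response.json()["choices"][0]["message"]["content"])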
Now that you understand why you want to deploy a model, let’s explore the deployment options with Azure AI Foundry.
Deploy a language model with Azure AI Foundry
When you deploy a language model with Azure AI Foundry, several deployment types are available, depending on the model you want to deploy.
Deployment options include:
Standard deployment: Models are hosted in the Azure AI Foundry project resource.
Serverless compute: Models are hosted in Microsoft-managed dedicated serverless endpoints in an Azure AI Foundry hub project.
Managed compute: Models are hosted in managed virtual machine images in an Azure AI Foundry hub project.
The associated cost depends on the type of model you deploy, which deployment option you choose, and what you're doing with the model. For example, the supported models differ by option: standard deployment supports Azure AI Foundry models (including Azure OpenAI models and Models-as-a-service models).
The model catalog in Azure AI Foundry provides a central repository of models that you can browse to find the right language model for your particular generative AI use case.
Selecting a foundation model for your generative AI app is important as it affects how well your app works. To find the best model for your app, you can use a structured approach by asking yourself the following questions:
Can AI solve my use case?
How do I select the best model for my use case?
Can I scale for real-world workloads?
Let’s explore each of these questions.
Can AI solve my use case?
Nowadays we have thousands of language models to choose from. The main challenge is to understand if there’s a model that satisfies your needs and to answer the question: Can AI solve my use case?
To start answering this question, you need to discover, filter, and deploy a model. You can explore the available language models through three different catalogs:
Hugging Face: Vast catalog of open-source models across various domains.
GitHub: Access to diverse models via GitHub Marketplace and GitHub Copilot.
Azure AI Foundry: Comprehensive catalog with robust tools for deployment.
Though you can use each of these catalogs to explore models, the model catalog in Azure AI Foundry makes it easiest to explore and deploy a model to build your prototype, while offering the best selection of models.
Let’s explore some of the options you need to consider when searching for suitable models.
Choose between large and small language models
First of all, you have a choice between Large Language Models (LLMs) and Small Language Models (SLMs).
LLMs like GPT-4, Mistral Large, Llama3 70B, Llama 405B, and Command R+ are powerful AI models designed for tasks that require deep reasoning, complex content generation, and extensive context understanding.
SLMs like Phi3, Mistral OSS models, and Llama3 8B are efficient and cost-effective, while still handling many common Natural Language Processing (NLP) tasks. They’re perfect for running on lower-end hardware or edge devices, where cost and speed are more important than model complexity.
Focus on a modality, task, or tool
Language models like GPT-4 and Mistral Large are also known as chat completion models, designed to generate coherent and contextually appropriate text-based responses. When you need higher levels of performance in complex tasks like math, coding, science, strategy, and logistics, you can also use reasoning models like DeepSeek-R1 and o1.
Beyond text-based AI, some models are multi-modal, meaning they can process images, audio, and other data types alongside text. Models like GPT-4o and Phi3-vision are capable of analyzing and generating both text and images. Multi-modal models are useful when your application needs to process and understand images, such as in computer vision or document analysis. Or when you want to build an AI app that interacts with visual content, such as a digital tutor explaining images or charts.
If your use case involves generating images, tools like DALL·E 3 and Stability AI can create realistic visuals from text prompts. Image generation models are great for designing marketing materials, illustrations, or digital art.
Another group of task-specific models is embedding models like Ada and Cohere. Embedding models convert text into numerical representations and are used to improve search relevance by capturing semantic meaning. These models are often implemented in Retrieval Augmented Generation (RAG) scenarios to enhance recommendation engines by linking similar content.
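As a sketch of how embeddings support search relevance (reusing the client from the earlier example; the embedding deployment name is a placeholder), you can embed a query and a document and compare them with cosine similarity:

Python
# Minimal sketch: embeddings turn text into vectors; cosine similarity
# scores semantic closeness for search or recommendations.
import math

def embed(text: str) -> list[float]:
    result = client.embeddings.create(model="<your-embedding-deployment>", input=text)
    return result.data[0].embedding

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = embed("reimbursement limit for meals")
document = embed("Meal expenses are reimbursed up to 50 USD per day.")
print(cosine(query, document))  # rank documents by this score in a search scenario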
When you want to build an application that interacts with other software tools dynamically, you can add function calling and JSON support. These capabilities allow AI models to work efficiently with structured data, making them useful for automating API calls, database queries, and structured data processing.
Specialize with regional and domain-specific models
Certain models are designed for specific languages, regions, or industries. These models can outperform general-purpose generative AI in their respective domains. For example:
Core42 JAIS is an Arabic language LLM, making it the best choice for applications targeting Arabic-speaking users.
Mistral Large has a strong focus on European languages, ensuring better linguistic accuracy for multilingual applications.
Nixtla TimeGEN-1 specializes in time-series forecasting, making it ideal for financial predictions, supply chain optimization, and demand forecasting.
If your project has regional, linguistic, or industry-specific needs, these models can provide more relevant results than general-purpose AI.
Balance flexibility and performance with open versus proprietary models
You also need to decide whether to use open-source models or proprietary models, each with its own advantages.
Proprietary models are best for cutting-edge performance and enterprise use. Azure offers models like OpenAI’s GPT-4, Mistral Large, and Cohere Command R+, which deliver industry-leading AI capabilities. These models are ideal for businesses needing enterprise-level security, support, and high accuracy.
Open-source models are best for flexibility and cost-efficiency. Hundreds of open-source models are available in the Azure AI Foundry model catalog, including models from Hugging Face, Meta, Databricks, Snowflake, and Nvidia. Open models give developers more control, allowing fine-tuning, customization, and local deployment.
Whatever model you choose, you can use the Azure AI Foundry model catalog. Using models through the model catalog meets the key enterprise requirements for usage:
Data and privacy: you get to decide what happens with your data.
Security and compliance: built-in security.
Responsible AI and content safety: evaluations and content safety.
Now that you know which language models are available to you, you should have an understanding of whether AI can indeed solve your use case. If you think a language model would enrich your application, you then need to select the specific model that you want to deploy and integrate.
How do I select the best model for my use case?
To select the best language model for your use case, you need to decide which criteria to use to filter the models. The criteria are the necessary characteristics you identify for a model. Four characteristics you can consider are:
Task type: What type of task do you need the model to perform? Does it include the understanding of only text, or also audio, or video, or multiple modalities?
Precision: Is the base model good enough or do you need a fine-tuned model that is trained on a specific skill or dataset?
Openness: Do you want to be able to fine-tune the model yourself?
Deployment: Do you want to deploy the model locally, on a serverless endpoint, or do you want to manage the deployment infrastructure?
You already explored the various types of models available in the previous section. Now, let’s explore in more detail how precision and performance can be important filters when choosing a model.
Filter models for precision
In generative AI, precision refers to the accuracy of the model in generating correct and relevant outputs. It measures the proportion of true positive results (correct outputs) among all generated outputs. High precision means fewer irrelevant or incorrect results, making the model more reliable.
When integrating a language model into an app, you can choose between a base model or a fine-tuned model. A base model, like GPT-4, is pretrained on a large dataset and can handle various tasks but can lack precision for specific domains. Techniques like prompt engineering can improve this, but sometimes fine-tuning is necessary.
A fine-tuned model is trained further on a smaller, task-specific dataset to improve its precision and ability to generate relevant outputs for specific applications. You can either use a fine-tuned model or fine-tune a model yourself.
Filter models for performance
You can evaluate your model performance at different phases, using various evaluation approaches.
When you're exploring models through the Azure AI Foundry model catalog, you can use model benchmarks to compare publicly available metrics like coherence and accuracy across models and datasets. These benchmarks can help in the initial exploration phase, but they give little information about how the model would perform in your specific use case.
Accuracy: Compares model-generated text with the correct answer according to the dataset. The result is one if the generated text matches the answer exactly, and zero otherwise.
Coherence: Measures whether the model output flows smoothly, reads naturally, and resembles human-like language.
Fluency: Assesses how well the generated text adheres to grammatical rules, syntactic structures, and appropriate usage of vocabulary, resulting in linguistically correct and natural-sounding responses.
Groundedness: Measures alignment between the model's generated answers and the input data.
GPT Similarity: Quantifies the semantic similarity between a ground truth sentence (or document) and the prediction sentence generated by an AI model.
Quality index: A comparative aggregate score between 0 and 1, with better-performing models scoring a higher value.
Cost: The cost of using the model based on a price-per-token. Cost is a useful metric to weigh against quality, enabling you to determine an appropriate tradeoff for your needs.
To evaluate how a selected model performs regarding your specific requirements, you can consider manual or automated evaluations. Manual evaluations allow you to rate your model’s responses. Automated evaluations include traditional machine learning metrics and AI-assisted metrics that are calculated and generated for you.
When you evaluate a model’s performance, it’s common to start with manual evaluations, as they quickly assess the quality of the model’s responses. For more systematic comparisons, automated evaluations using metrics like precision, recall, and F1 score based on your own ground truth offer a faster, scalable, and more objective approach.
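As a sketch of what such an automated evaluation can look like, the following computes token-overlap precision, recall, and F1 between a generated answer and your own ground truth; real evaluation pipelines typically use established libraries or AI-assisted metrics instead.

Python
# Minimal sketch: token-overlap precision, recall, and F1 against ground truth.
def token_scores(expected: str, generated: str) -> tuple[float, float, float]:
    exp, gen = expected.lower().split(), generated.lower().split()
    shared = sum(min(exp.count(t), gen.count(t)) for t in set(gen))
    if shared == 0:
        return 0.0, 0.0, 0.0
    precision = shared / len(gen)  # fraction of the output that is correct
    recall = shared / len(exp)     # fraction of the truth that is covered
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

expected = "the 2018 final was held in Moscow"
generated = "the final was played in Moscow"
print(token_scores(expected, generated))  # (0.83, 0.71, 0.77)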
Can I scale for real-world workloads?
You selected a model for your use case and have successfully built a prototype. Now, you need to understand how to scale for real-world workloads.
Considerations for scaling a generative AI solution include:
Model deployment: Where will you deploy the model for the best balance of performance and cost?
Model monitoring and optimization: How will you monitor, evaluate, and optimize model performance?
Prompt management: How will you orchestrate and optimize prompts to maximize the accuracy and relevance of generated responses?
Model lifecycle: How will you manage model, data, and code updates as part of an ongoing Generative AI Operations (GenAIOps) lifecycle?
Azure AI Foundry provides visual and code-first tools that can help you build and maintain a scalable generative AI solution.
The data analyst is one of several critical roles in an organization, helping uncover and make sense of information to keep the company balanced and operating efficiently. Therefore, it's vital that a data analyst clearly understands their responsibilities and the tasks that they perform on a near-daily basis. Data analysts are essential in helping organizations gain valuable insights into the expanse of data that they have, and they work closely with others in the organization to help reveal valuable information.
The following figure shows the five key areas that you’ll engage in during the data analysis process.
Prepare
As a data analyst, you'll likely divide most of your time between the prepare and model tasks. Deficient or incorrect data can result in invalid reports, a loss of trust, and poorly informed business decisions, which can lead to lost revenue and other negative business impacts.
Before a report can be created, data must be prepared. Data preparation is the process of profiling, cleaning, and transforming raw data into trusted, understandable information that's ready to model and visualize. It involves, among other things, ensuring the integrity of the data, correcting wrong or inaccurate data, identifying missing data, converting data from one structure or type to another, or even a task as simple as making data more readable.
Data preparation also involves understanding how you're going to get and connect to the data, and the performance implications of those decisions. When connecting to data, you need to make decisions that ensure models and reports meet, and perform to, acknowledged requirements and expectations.
Privacy and security assurances are also important. These assurances can include anonymizing data to avoid oversharing or preventing people from seeing personally identifiable information when it isn’t needed. Alternatively, helping to ensure privacy and security can involve removing that data completely if it doesn’t fit in with the story that you’re trying to shape.
Data preparation can often be a lengthy process. Data analysts follow a series of steps and methods to put data into a proper context and state, eliminating poor data quality and allowing it to be turned into valuable insights.
Model
When the data is in a proper state, it’s ready to be modeled. Data modeling is the process of determining how your tables are related to each other. This process is done by defining and creating relationships between the tables. From that point, you can enhance the model by defining metrics and adding custom calculations to enrich your data.
Creating an effective and proper semantic model is a critical step in helping organizations understand and gain valuable insights into the data. An effective semantic model makes reports more accurate, allows the data to be explored faster and more efficiently, decreases time for the report writing process, and simplifies future report maintenance.
The model is another critical component that has a direct effect on the performance of your report and overall data analysis. A poorly designed model can have a drastically negative impact on the general accuracy and performance of your report. Conversely, a well-designed model with well-prepared data ensures a properly efficient and trusted report. This effect is amplified when you're working with data at scale.
From a Power BI perspective, if your report is performing slowly, or your refreshes are taking a long time, you will likely need to revisit the data preparation and modeling tasks to optimize your report.
Preparing and modeling data is an iterative process. Data preparation is the first task in data analysis, and understanding and preparing your data before you model it makes the modeling step much easier.
Visualize
The visualization task is where you get to bring your data to life. The ultimate goal of the visualize task is to solve business problems. A well-designed report should tell a compelling story about that data, which will enable business decision makers to quickly gain needed insights. By using appropriate visualizations and interactions, you can provide an effective report that guides the reader through the content quickly and efficiently, therefore allowing the reader to follow a narrative into the data.
The reports that are created during the visualization task help businesses and decision makers understand what that data means so that accurate and vital decisions can be made. Reports drive the overall actions, decisions, and behaviors of an organization that is trusting and relying on the information that is discovered in the data.
The business might communicate that they need all data points on a given report to help them make decisions. As a data analyst, you should take the time to fully understand the problem that the business is trying to solve. Determine whether all their data points are necessary because too much data can make detecting key points difficult. Having a small and concise data story can help find insights quickly.
With the built-in AI capabilities in Power BI, data analysts can build powerful reports, without writing any code, that enable users to get insights and answers and find actionable objectives. The AI capabilities in Power BI, such as the built-in AI visuals and Copilot, let users explore data by asking questions, use the Quick Insights feature, or create machine learning models directly within Power BI.
An important aspect of visualizing data is designing and creating reports for accessibility. As you build reports, it is important to think about people who will be accessing and reading the reports. Reports should be designed with accessibility in mind from the outset so that no special modifications are needed in the future.
Many components of your report will help with storytelling. From a color scheme that is complementary and accessible, to fonts and sizing, to picking the right visuals for what is being displayed, they all come together to tell that story.
Analyze
The analyze task is the important step of understanding and interpreting the information that is displayed on the report. In your role as a data analyst, you should understand the analytical capabilities of Power BI and use those capabilities to find insights, identify patterns and trends, predict outcomes, and then communicate those insights in a way that everyone can understand.
Advanced analytics enables businesses and organizations to ultimately drive better decisions throughout the business and create actionable insights and meaningful results. With advanced analytics, organizations can drill into the data to predict future patterns and trends, identify activities and behaviors, and enable businesses to ask the appropriate questions about their data.
Previously, analyzing data was a difficult and intricate process that was typically performed by data engineers or data scientists. Today, Power BI makes data analysis accessible, which simplifies the data analysis process. Users can quickly gain insights into their data by using visuals and metrics directly from their desktop and then publish those insights to dashboards so that others can find needed information.
This is another area where AI integrations within Power BI can take your analysis to the next level. Integrations with Azure Machine Learning, Cognitive Services, Copilot, and built-in AI visuals help enrich your data and analysis.
Manage
Power BI consists of many components, including reports, dashboards, workspaces, semantic models, and more. As a data analyst, you are responsible for the management of these Power BI assets, overseeing the sharing and distribution of items, such as reports and dashboards, and ensuring the security of Power BI assets.
Apps can be a valuable distribution method for your content and allow easier management for large audiences. This feature also allows you to have custom navigation experiences and link to other assets within your organization to complement your reports.
The management of your content helps foster collaboration between teams and individuals. Sharing and discovery of your content is important so that the right people get the answers they need. It's also important to ensure that items are secure: make sure the right people have access and that data isn't exposed beyond the intended stakeholders.
Proper management can also help reduce data silos within your organization. Data duplication can make data difficult to manage and can introduce data latency when resources are overused. Power BI helps reduce data silos with the use of shared semantic models, and it allows you to reuse data that you've prepared and modeled. For key business data, endorsing a semantic model as certified can help ensure trust in that data.
The management of Power BI assets helps reduce the duplication of efforts and helps ensure security of the data.
Before data can be used to tell a story, it must be run through a process that makes it usable in the story. Data analysis is the process of identifying, cleaning, transforming, and modeling data to discover meaningful and useful information. The data is then crafted into a story through reports for analysis to support the critical decision-making process.
As the world becomes more data-driven, storytelling through data analysis is becoming a vital component and aspect of large and small businesses. It is the reason that organizations continue to hire data analysts.
Data-driven businesses make decisions based on the story that their data tells, and in today’s data-driven world, data is not being used to its full potential, a challenge that most businesses face. Data analysis is, and should be, a critical aspect of all organizations to help determine the impact to their business, including evaluating customer sentiment, performing market and product research, and identifying trends or other data insights.
While the process of data analysis focuses on the tasks of cleaning, modeling, and visualizing data, the concept of data analysis and its importance to business shouldn't be understated. Core components of analytics are divided into the following categories:
Descriptive
Diagnostic
Predictive
Prescriptive
Artificial Intelligence (AI)
Descriptive analytics
Descriptive analytics help answer questions about what has happened based on historical data. Descriptive analytics techniques summarize large semantic models to describe outcomes to stakeholders.
By developing key performance indicators (KPIs), these strategies can help track the success or failure of key objectives. Metrics such as return on investment (ROI) are used in many industries, and specialized metrics are developed to track performance in specific industries.
An example of descriptive analytics is generating reports to provide a view of an organization’s sales and financial data.
Diagnostic analytics
Diagnostic analytics help answer questions about why events happened. Diagnostic analytics techniques supplement basic descriptive analytics, and they use the findings from descriptive analytics to discover the cause of these events. Then, performance indicators are further investigated to discover why these events improved or became worse. Generally, this process occurs in three steps:
Identify anomalies in the data. These anomalies might be unexpected changes in a metric or a particular market.
Collect data that’s related to these anomalies.
Use statistical techniques to discover relationships and trends that explain these anomalies. (A sketch of the first step follows this list.)
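Here's a minimal sketch of the first step, using invented monthly sales figures and a simple z-score test to flag values that sit far from the mean; real diagnostic work would apply richer statistics.

Python
# Minimal sketch: flag metric values more than two standard deviations
# from the mean. The sales figures are invented for illustration.
monthly_sales = [1020, 980, 1010, 995, 1005, 2150, 990, 1000]

mean = sum(monthly_sales) / len(monthly_sales)
variance = sum((x - mean) ** 2 for x in monthly_sales) / len(monthly_sales)
std_dev = variance ** 0.5

for month, value in enumerate(monthly_sales, start=1):
    z = (value - mean) / std_dev
    if abs(z) > 2:
        print(f"Month {month}: {value} is an anomaly (z = {z:.1f})")  # flags month 6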
Predictive analytics
Predictive analytics help answer questions about what will happen in the future. Predictive analytics techniques use historical data to identify trends and determine if they’re likely to recur. Predictive analytical tools provide valuable insight into what might happen in the future. Techniques include a variety of statistical and machine learning techniques such as neural networks, decision trees, and regression.
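As a minimal regression sketch with invented quarterly sales, the following fits a least-squares trend line to historical data and extrapolates one quarter ahead; production forecasting would validate the model and usually use richer techniques.

Python
# Minimal sketch: fit y = a + b * x by least squares and forecast ahead.
quarters = [1, 2, 3, 4, 5, 6]
sales = [100.0, 112.0, 119.0, 131.0, 140.0, 152.0]  # invented history

n = len(quarters)
mean_x = sum(quarters) / n
mean_y = sum(sales) / n

# Least-squares slope and intercept for the trend line.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(quarters, sales)) \
    / sum((x - mean_x) ** 2 for x in quarters)
intercept = mean_y - slope * mean_x

next_q = 7
print(f"Forecast for quarter {next_q}: {intercept + slope * next_q:.1f}")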
Prescriptive analytics
Prescriptive analytics help answer questions about which actions should be taken to achieve a goal or target. By using insights from prescriptive analytics, organizations can make data-driven decisions. This technique allows businesses to make informed decisions in the face of uncertainty. Prescriptive analytics techniques rely on machine learning as one of the strategies to find patterns in large semantic models. By analyzing past decisions and events, organizations can estimate the likelihood of different outcomes.
Artificial Intelligence
Artificial Intelligence (AI) helps answer questions about your data. AI refers to the simulation of human intelligence in machines that are programmed to think, learn, and adapt. In the context of analytics, AI enables systems to process vast amounts of data, recognize patterns, and deliver insights with minimal human intervention. It supports a wide range of applications, from natural language processing and image recognition to code generation and intelligent visualization suggestions.
Example
Through reporting and data visualizations, a retail business uses descriptive analytics to look at patterns of purchases from previous years to determine what products might be popular next year. The company might also look at supporting data to understand why a particular product was popular, and whether that trend is continuing, which helps them determine whether to continue stocking that product.
A business might determine that a certain product was popular over a specific timeframe. Then, they can use this analysis to determine whether certain marketing efforts or online social activities contributed to the sales increase.
An underlying facet of data analysis is that a business needs to trust its data. As a practice, the data analysis process will capture data from trusted sources and shape it into something that is consumable, meaningful, and easily understood to help with the decision-making process. Data analysis enables businesses to fully understand their data through data-driven processes and decisions, allowing them to be confident in their decisions.
As the amount of data grows, so does the need for data analysts. A data analyst knows how to organize information and distill it into something relevant and comprehensible. A data analyst knows how to gather the right data and what to do with it; in other words, how to make sense of the data in a world of data overload.
Azure DNS is a hosting service for DNS domains that provides name resolution by using Microsoft Azure infrastructure. By hosting your domains in Azure, you can manage your DNS records using the same credentials, APIs, tools, and billing as your other Azure services.
Benefits of Azure DNS
Azure DNS uses the scope and scale of Microsoft Azure to provide numerous benefits, including:
Reliability and performance
Security
Ease of use
Customizable virtual networks
Alias records
Reliability and performance
DNS domains in Azure DNS are hosted on Azure’s global network of DNS name servers, providing resiliency and high availability. Azure DNS uses anycast networking, so the closest available DNS server answers each DNS query, providing fast performance and high availability for your domain.
Security
Azure DNS is based on Azure Resource Manager, which provides features such as:
Azure role-based access control (Azure RBAC) to control who has access to specific actions for your organization.
Activity logs to monitor how a user in your organization modified a resource or to find an error when troubleshooting.
Resource locking to lock a subscription, resource group, or resource. Locking prevents other users in your organization from accidentally deleting or modifying critical resources. (A sketch of creating a lock follows this list.)
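As a sketch of resource locking in practice, the following example uses the Azure SDK for Python (the azure-mgmt-resource package) to place a CanNotDelete lock on a resource group; the subscription ID and names are placeholders for your own values.

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource.locks import ManagementLockClient

lock_client = ManagementLockClient(DefaultAzureCredential(), "<subscription-id>")

# Prevent anyone from accidentally deleting resources in this group.
lock_client.management_locks.create_or_update_at_resource_group_level(
    "my-dns-rg",       # placeholder resource group
    "no-delete-lock",  # placeholder lock name
    {"level": "CanNotDelete", "notes": "Protects critical DNS resources."},
)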
Ease of use
Azure DNS can manage DNS records for your Azure services and provide DNS for your external resources as well. Azure DNS is integrated in the Azure portal and uses the same credentials, support contract, and billing as your other Azure services.
Because Azure DNS runs on Azure, you can manage your domains and records with the Azure portal, Azure PowerShell cmdlets, and the cross-platform Azure CLI. Applications that require automated DNS management can integrate with the service by using the REST API and SDKs.
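For example, the following hedged sketch uses the Azure SDK for Python (the azure-mgmt-dns package) to create an A record in an existing zone; the subscription ID, resource group, zone name, and IP address are placeholders.

from azure.identity import DefaultAzureCredential
from azure.mgmt.dns import DnsManagementClient

dns_client = DnsManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Create (or update) an A record named "www" in an existing zone.
dns_client.record_sets.create_or_update(
    "my-dns-rg",    # placeholder resource group
    "contoso.com",  # placeholder zone name
    "www",
    "A",
    {"ttl": 3600, "a_records": [{"ipv4_address": "203.0.113.10"}]},
)

The same call pattern applies to other record types, such as CNAME or TXT records.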
Customizable virtual networks with private domains
Azure DNS also supports private DNS domains. This feature allows you to use your own custom domain names in your private virtual networks, rather than being stuck with the Azure-provided names.
Alias records
Azure DNS also supports alias record sets. You can use an alias record set to refer to an Azure resource, such as an Azure public IP address, an Azure Traffic Manager profile, or an Azure Content Delivery Network (CDN) endpoint. If the IP address of the underlying resource changes, the alias record set seamlessly updates itself during DNS resolution. The alias record set points to the service instance, and the service instance is associated with an IP address.
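As a sketch of creating an alias record set programmatically (again with the azure-mgmt-dns package, and with placeholder identifiers), setting target_resource instead of a literal IP address makes the record track the referenced Azure resource.

from azure.identity import DefaultAzureCredential
from azure.mgmt.dns import DnsManagementClient

dns_client = DnsManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Placeholder resource ID of an existing Azure public IP address.
public_ip_id = (
    "/subscriptions/<subscription-id>/resourceGroups/my-dns-rg"
    "/providers/Microsoft.Network/publicIPAddresses/my-public-ip"
)

# target_resource makes this an alias record set: it resolves to whatever
# IP address the referenced resource currently has.
dns_client.record_sets.create_or_update(
    "my-dns-rg",
    "contoso.com",
    "app",
    "A",
    {"ttl": 3600, "target_resource": {"id": public_ip_id}},
)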
Important
You can’t use Azure DNS to buy a domain name. For an annual fee, you can buy a domain name by using App Service domains or a third-party domain name registrar. Once purchased, your domains can be hosted in Azure DNS for record management.
Azure ExpressRoute lets you extend your on-premises networks into the Microsoft cloud over a private connection, with the help of a connectivity provider. This connection is called an ExpressRoute circuit. With ExpressRoute, you can establish connections to Microsoft cloud services, such as Microsoft Azure and Microsoft 365. This feature allows you to connect offices, datacenters, or other facilities to the Microsoft cloud. Each location would have its own ExpressRoute circuit.
Connectivity can be from an any-to-any (IP VPN) network, a point-to-point Ethernet network, or a virtual cross-connection through a connectivity provider at a colocation facility. ExpressRoute connections don’t go over the public Internet. This setup allows ExpressRoute connections to offer more reliability, faster speeds, consistent latencies, and higher security than typical connections over the Internet.
Features and benefits of ExpressRoute
There are several benefits to using ExpressRoute as the connection service between Azure and on-premises networks.
Connectivity to Microsoft cloud services across all regions in the geopolitical region.
Global connectivity to Microsoft services across all regions with ExpressRoute Global Reach.
Dynamic routing between your network and Microsoft via Border Gateway Protocol (BGP).
Built-in redundancy in every peering location for higher reliability.
Connectivity to Microsoft cloud services
ExpressRoute enables direct access to the following services in all regions:
Microsoft Office 365
Microsoft Dynamics 365
Azure compute services, such as Azure Virtual Machines
Azure cloud services, such as Azure Cosmos DB and Azure Storage
Global connectivity
You can enable ExpressRoute Global Reach to exchange data across your on-premises sites by connecting your ExpressRoute circuits. For example, say you had an office in Asia and a datacenter in Europe, both with ExpressRoute circuits connecting them to the Microsoft network. You could use ExpressRoute Global Reach to connect those two facilities, allowing them to communicate without transferring data over the public internet.
Dynamic routing
ExpressRoute uses BGP to exchange routes between on-premises networks and resources running in Azure. This protocol enables dynamic routing between your on-premises network and services running in the Microsoft cloud.
Built-in redundancy
Each connectivity provider uses redundant devices to ensure that connections established with Microsoft are highly available. You can configure multiple circuits to complement this feature.
ExpressRoute connectivity models
ExpressRoute supports four models that you can use to connect your on-premises network to the Microsoft cloud:
Colocation at a cloud exchange
Point-to-point Ethernet connection
Any-to-any networks
Directly from ExpressRoute sites
Colocation at a cloud exchange
Colocation refers to your datacenter, office, or other facility being physically colocated at a cloud exchange, such as an ISP. If your facility is colocated at a cloud exchange, you can request a virtual cross-connect to the Microsoft cloud.
Point-to-point Ethernet connection
Point-to-point Ethernet connection refers to using a point-to-point connection to connect your facility to the Microsoft cloud.
Any-to-any networks
With any-to-any connectivity, you can integrate your wide area network (WAN) with Azure by providing connections to your offices and datacenters. Azure integrates with your WAN connection to provide a connection like you would have between your datacenter and any branch offices.
Directly from ExpressRoute sites
You can connect directly into Microsoft's global network at peering locations strategically distributed across the world. ExpressRoute Direct provides dual 100-Gbps or 10-Gbps connectivity, which supports Active/Active connectivity at scale.
Security considerations
With ExpressRoute, your data doesn’t travel over the public internet, reducing the risks associated with internet communications. ExpressRoute is a private connection from your on-premises infrastructure to your Azure infrastructure. Even if you have an ExpressRoute connection, DNS queries, certificate revocation list checking, and Azure Content Delivery Network requests are still sent over the public internet.
A virtual private network (VPN) uses an encrypted tunnel within another network. VPNs are typically deployed to connect two or more trusted private networks to one another over an untrusted network (typically the public internet). Traffic is encrypted while traveling over the untrusted network to prevent eavesdropping or other attacks. VPNs can enable networks to safely and securely share sensitive information.
VPN gateways
A VPN gateway is a type of virtual network gateway. Azure VPN Gateway instances are deployed in a dedicated subnet of the virtual network and enable the following connectivity:
Connect on-premises datacenters to virtual networks through a site-to-site connection.
Connect individual devices to virtual networks through a point-to-site connection.
Connect virtual networks to other virtual networks through a network-to-network connection.
All data transfer is encrypted inside a private tunnel as it crosses the internet. You can deploy only one VPN gateway in each virtual network. However, you can use one gateway to connect to multiple locations, which includes other virtual networks or on-premises datacenters.
When setting up a VPN gateway, you must specify the type of VPN: either policy-based or route-based. The primary distinction between these two types is how they determine which traffic needs encryption. In Azure, regardless of the VPN type, the method of authentication employed is a preshared key.
Policy-based VPN gateways statically specify the IP addresses of packets that should be encrypted through each tunnel. This type of device evaluates every data packet against those sets of IP addresses to choose the tunnel through which to send the packet.
In route-based gateways, IPsec tunnels are modeled as a network interface or virtual tunnel interface. IP routing (either static routes or dynamic routing protocols) decides which of these tunnel interfaces to use when sending each packet. Route-based VPNs are the preferred connection method for on-premises devices because they're more resilient to topology changes, such as the creation of new subnets.
Use a route-based VPN gateway if you need any of the following types of connectivity (a configuration sketch follows the list):
Connections between virtual networks
Point-to-site connections
Multisite connections
Coexistence with an Azure ExpressRoute gateway
High-availability scenarios
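As the configuration sketch promised above, the following example uses the Azure SDK for Python (the azure-mgmt-network package) to request a route-based VPN gateway; all names, IDs, and the region are placeholders, and provisioning a real gateway can take a long time.

from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

network_client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

poller = network_client.virtual_network_gateways.begin_create_or_update(
    "my-network-rg",   # placeholder resource group
    "my-vpn-gateway",  # placeholder gateway name
    {
        "location": "westeurope",
        "gateway_type": "Vpn",
        "vpn_type": "RouteBased",  # the policy-based vs. route-based choice
        "sku": {"name": "VpnGw1", "tier": "VpnGw1"},
        "ip_configurations": [
            {
                "name": "gw-ipconfig",
                "subnet": {"id": "<GatewaySubnet-resource-id>"},
                "public_ip_address": {"id": "<public-ip-resource-id>"},
            }
        ],
    },
)
gateway = poller.result()  # blocks until provisioning finishes

The vpn_type field is where the policy-based versus route-based decision is made, and the gateway is deployed into the dedicated gateway subnet mentioned earlier.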
If you’re configuring a VPN to keep your information safe, you also want to be sure that it’s a highly available and fault tolerant VPN configuration. There are a few ways to maximize the resiliency of your VPN gateway.
Active/standby
By default, VPN gateways are deployed as two instances in an active/standby configuration, even if you only see one VPN gateway resource in Azure. When planned maintenance or unplanned disruption affects the active instance, the standby instance automatically assumes responsibility for connections without any user intervention. Connections are interrupted during this failover, but they typically restore within a few seconds for planned maintenance and within 90 seconds for unplanned disruptions.
Active/active
With the introduction of support for the BGP routing protocol, you can also deploy VPN gateways in an active/active configuration. In this configuration, you assign a unique public IP address to each instance. You then create separate tunnels from the on-premises device to each IP address. You can extend the high availability by deploying an additional VPN device on-premises.
ExpressRoute failover
Another high-availability option is to configure a VPN gateway as a secure failover path for ExpressRoute connections. ExpressRoute circuits have resiliency built in. However, they aren’t immune to physical problems that affect the cables delivering connectivity or outages that affect the complete ExpressRoute location. In high-availability scenarios, where there’s risk associated with an outage of an ExpressRoute circuit, you can also provision a VPN gateway that uses the internet as an alternative method of connectivity. In this way, you can ensure there’s always a connection to the virtual networks.
Zone-redundant gateways
In regions that support availability zones, VPN gateways and ExpressRoute gateways can be deployed in a zone-redundant configuration. This configuration brings resiliency, scalability, and higher availability to virtual network gateways. Deploying gateways in Azure availability zones physically and logically separates gateways within a region while protecting your on-premises network connectivity to Azure from zone-level failures. These gateways require different gateway stock keeping units (SKUs) and use Standard public IP addresses instead of Basic public IP addresses.