Blog

Technical insights, industry trends, and lessons from our data transformation journey.

The Hidden Cost of MCPs and Custom Instructions on Your Context Window

SelfServiceBI· 23 Nov 2025

Large context windows sound limitless: 200K, 400K, even a million tokens. But once you bolt on a few MCP servers, dump in a giant CLAUDE.md, and drag a long chat history behind you, you can easily burn over 50% of that window before you paste a single line of code. This post is about that hidden tax and how to stop paying it.

Where This Started

This exploration started when I came across a LinkedIn post by Johnny Winter featuring a YouTube video about terminal-based AI tools and context management. The video demonstrates how tools like Claude Code, Gemini CLI, and others leverage project-aware context files, which got me thinking about what is actually consuming all that context space.

Video by NetworkChuck

ℹ️ Note: While this post uses Claude Code for examples, these concepts apply to any AI coding agent: GitHub Copilot, Cursor, Windsurf, Gemini CLI, and others.

The Problem: You're Already at 50% Before You Start

Think of a context window as working memory. Modern AI models have impressive limits (as of 2025):

- Claude Sonnet 4.5: 200K tokens (1M beta for tier 4+)
- GPT-5: 400K tokens via API
- Gemini 3 Pro: 1M input tokens

A token is roughly 3-4 characters, so 200K tokens equals about 150,000 words. That sounds like plenty, right? Here's what actually consumes it:

- System prompt and system tools
- MCP server tool definitions
- Memory files (CLAUDE.md, .cursorrules)
- Autocompact buffer (reserved for conversation management)
- Conversation history
- Your code and the response being generated

By the time you add a few MCPs and memory files, a large chunk of your context window is already gone, before you have written a single line of code.

Real Numbers: The MCP Tax

Model Context Protocol (MCP) servers make it easier to connect AI agents to external tools and data. But each server you add costs tokens. Here's what my actual setup looked like (from Claude Code's /context command): MCP tools alone consume 16.3% of the context window, before I have even started a conversation. Combined with system overhead, I'm already at 51% usage with essentially zero messages.

The Compounding Effect

The real problem emerges when overhead compounds. Here's my actual breakdown:

Category             Tokens   % of Window
System prompt        3.0k     1.5%
System tools         14.8k    7.4%
MCP tools            32.6k    16.3%
Custom agents        794      0.4%
Memory files         5.4k     2.7%
Messages             8        0.0%
Autocompact buffer   45.0k    22.5%
Free space           99k      49.3%

Total: 101k/200k tokens used (51%)

You're working with less than half your theoretical capacity, and that's with essentially zero conversation history. Once you start coding, the available space shrinks even further.

Why This Matters: Performance and Quality

Context consumption affects more than just space:

Processing latency: Empirical testing with GPT-4 Turbo shows that time to first token increases by approximately 0.24 ms per input token. That means every additional 10,000 tokens adds roughly 2.4 seconds of latency to initial response time. (Source: Glean's research on input token impact)

Cache invalidation: Modern AI systems cache frequently used context. Any change (adding an MCP, editing instructions) invalidates that cache, forcing full reprocessing.

Quality degradation: When context gets tight, models may:

- Skip intermediate reasoning steps
- Miss edge cases
- Spread attention too thinly across information
- Fill gaps with plausible but incorrect information
- Truncate earlier conversation, losing track of prior requirements

I've noticed this particularly in long coding sessions.
After discussing architecture early in a conversation, the agent later suggests solutions that contradict those earlier decisions, because that context has been truncated away.

Practical Optimization: Real-World Example

Let me share a before/after from my own setup:

Before optimization:
- 10+ MCPs enabled (all the time)
- MCP tools consuming 32.6k tokens (16.3%)
- Only 99k tokens free (49.3%)
- Frequent need to summarize/restart sessions

After optimization:
- 3-4 MCPs enabled by default
- MCP tools reduced to ~12k tokens (~6%)
- Memory files trimmed to essentials (~3k tokens)
- Over 140k tokens free (70%+)

Results: More working space, better reasoning quality, fewer context limit issues, and faster responses.

Optimization Checklist

Before adding another MCP or expanding instructions:

- Have I measured my current context overhead?
- Is my custom instruction file under 5,000 tokens?
- Do I actively use all enabled MCPs?
- Have I removed redundant or outdated instructions?
- Could I accomplish this goal without consuming more context?

In Claude Code: Use the /context command to see your current context usage breakdown.

Specific Optimization Strategies

1. Audit Your MCPs Regularly

Ask yourself: Do I use this MCP daily? Weekly? Monthly? Could I accomplish this task without the MCP?

Action: Disable MCPs you don't use regularly. Enable them only when needed.

Impact of Selective MCP Usage

By selectively disabling MCPs you don't frequently use, you can immediately recover significant context space. This screenshot shows the difference in available context when strategically choosing which MCPs to keep active versus loading everything. In Claude Code, you can toggle MCPs through the settings panel. This simple action can recover 10-16% of your context window.

2. Ruthlessly Edit Custom Instructions

Your CLAUDE.md memory files, .cursorrules, or copilot-instructions.md should be:

- Concise (under 5,000 tokens)
- Focused on patterns, not examples
- Project-specific, not general AI guidance

Bad example:

When writing code, always follow best practices. Use meaningful variable names. Write comments. Test your code. Follow SOLID principles. Consider performance. Think about maintainability... (Continues for 200 lines)

Good example:

Code Style:
- TypeScript strict mode
- Functional patterns preferred
- Max function length: 50 lines
- All public APIs must have JSDoc

Testing:
- Vitest for unit tests
- Each function needs test coverage
- Mock external dependencies

3. Start Fresh When Appropriate

Long conversations accumulate context. Sometimes the best optimization is:

- Summarizing what has been decided
- Starting a new session with that summary
- Dropping irrelevant historical context

4. Understand Autocompact Buffer

Claude Code includes an autocompact buffer that helps manage context automatically. When you run /context, you'll see something like:

Autocompact buffer: 45.0k tokens (22.5%)

This buffer reserves space to prevent hitting hard token limits by automatically compacting or summarizing older messages during long conversations. It maintains continuity without abrupt truncation, but it also means that 22.5% of your window is already taken.

You can also see and toggle this behavior in Claude Code's /config settings. In this screenshot, Auto-compact is enabled, which keeps a dedicated buffer for summarizing older messages so long conversations stay coherent without suddenly hitting hard context limits.
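A Quick Way to Measure Memory-File Overhead

If you want a rough check that your memory files stay under that ~5,000-token budget before the agent ever loads them, a few lines of Python are enough. This is a minimal sketch using the 3-4 characters-per-token heuristic from earlier in this post, not a real tokenizer; the file paths are only examples, so point it at whatever instruction files your agent actually reads.

from pathlib import Path

# Rough heuristic from this post: one token is roughly 3-4 characters.
CHARS_PER_TOKEN = 4

# Example paths - replace with the memory/instruction files your agent loads.
memory_files = [
    Path("CLAUDE.md"),
    Path(".cursorrules"),
    Path(".github/copilot-instructions.md"),
]

total_tokens = 0
for file in memory_files:
    if not file.exists():
        continue
    text = file.read_text(encoding="utf-8")
    estimated = len(text) // CHARS_PER_TOKEN
    total_tokens += estimated
    print(f"{file}: ~{estimated:,} estimated tokens")

print(f"Estimated memory-file overhead: ~{total_tokens:,} tokens")
if total_tokens > 5_000:
    print("Consider trimming: the memory files alone exceed the ~5k-token guideline.")

The point is not precision; it is having a number you can track over time, the same way /context gives you one inside Claude Code.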
Claude Code Specific Limitations: The Granularity Problem

Claude Code currently has a platform-level limitation that makes fine-grained control challenging, documented in GitHub Issue #7328: "MCP Tool Filtering".

The core issue: Claude Code loads ALL tools from configured MCP servers. You can only enable or disable entire servers, not individual tools within a server.

The impact: Large MCP servers with 20+ tools can easily consume 50,000+ tokens just on definitions. If a server has 25 tools but you only need 3, you must either:

- Load all 25 tools and accept the context cost
- Disable the entire server and lose access to the 3 tools you need
- Build a custom minimal MCP server (significant development effort)

This makes tool-level filtering essential for context optimization, not just a convenience. The feature is under active development with community support. In the meantime:

- Use MCP servers sparingly
- Prefer smaller, focused servers over large multi-tool servers
- Regularly audit which servers you actually need enabled
- Provide feedback on the GitHub issues to help prioritize this feature

Key Takeaways

You're burning a huge portion of your context window before you even paste in your first file. MCP tools alone can consume 16%+ of your window. System tools add another 7%. The autocompact buffer reserves 22%. It adds up fast.

Optimization is ongoing. Regular audits of MCPs and memory files keep your agent running smoothly. Aim to keep baseline overhead under 30% of total context (excluding the autocompact buffer).

Measurement matters. Use /context in Claude Code to monitor your overhead. You can't optimize what you don't measure.

Performance degrades subtly. Latency increases roughly 2.4 seconds per 10,000 tokens based on empirical testing. Reasoning quality drops as context fills up.

Start minimal, add intentionally. The best developers using AI agents:

- Start minimal
- Add capabilities intentionally
- Monitor performance impact
- Optimize regularly
- Remove what isn't providing value

The goal isn't to minimize context usage at all costs. The goal is intentional, efficient context usage that maximizes response quality, processing speed, and available working space. Think of your context window like RAM in a computer: more programs running means less memory for each program, and eventually everything slows down. It's not about having every tool available. It's about having the right tools, configured optimally, for the work at hand.

Resources

Official documentation:
- Claude Code MCP Documentation
- Model Context Protocol (MCP) Overview
- Claude Code Best Practices
- Claude Code Cost Management
- Claude Context Windows

Research & performance:
- How Input Token Count Impacts LLM Latency - Glean

Community resources:
- Model Context Protocol Documentation
- GitHub Copilot Custom Instructions
- Johnny Winter's LinkedIn Post on Terminal AI
- You've Been Using AI the Hard Way (Use This Instead) - YouTube Video

Have you optimized your AI agent setup? What context window challenges have you encountered? I'd love to hear your experiences and optimization strategies.

Guiding AI Agent - Power BI Report Development Example

SelfServiceBI· 12 Aug 2025

How LLMs Mirror Human Pattern Recognition

LLMs are not thinking machines but pattern-matching algorithms. The reason we feel like they are thinking is that much of what our brains do is pattern matching, and it works most of the time. Daniel Kahneman, in his book "Thinking, Fast and Slow", described "System 1", which is fast, instinctive, and emotional, and "System 2", which is slower, more deliberative, and more logical. My view is that current LLMs represent our brain running on "System 1". That works well if you have ample experience with the topic you want to apply it to (this is the famous 10,000 Hours principle). If the challenge you would like to solve does not have a huge literature, or you want to solve problems in a novel way, that is the time to use "System 2". But AI does not have a "System 2" option that works well. The newer reasoning models are one attempt to solve this problem, but they still require ample data in their training set to avoid hallucinating.

Power BI Report Development in Code

Developing Power BI reports using the new PBIR format is relatively new, and there is not a lot of data available on how to do it. There is not really a best practice, because report development is as much art as it is science. Even using AI for DAX requires "Carefully designed instructions and examples", according to Jeffrey Wang.

Using LLM with DAX: LinkedIn Post on NL2DAX Benchmark (view this post on LinkedIn).

AI Agent Research - Audio blog

I started researching how to use the available tools to guide agents toward generating better code. I encourage you to listen to it, because you will find a lot of useful information in this!

AI Agents Unveiled: Harnessing Power, Dodging Pitfalls, and Protecting Your Brain in the Age of AI, by Mihaly Kavasi (generated by NotebookLM). New to NotebookLM?

Agent Instructions

Almost all agents have a prompt file you can adjust to guide your agent on how you want it to interact with the code you are working on.

AI Agent Name     Instruction File Name
Claude            claude.md
GitHub Copilot    copilot-instructions.md
Gemini            gemini-config.yaml
Llama             llama-instructions.md
Mistral           mistral-prompt.md
Lovable           knowledge.md
Cursor            .cursorrules

These are the files you can populate; here is an example of teaching the AI to understand the PBIR format. Notice that you need a detailed understanding of how Power BI works in order to validate and correct the information the AI generates for itself.

Prompt steps example for claude.md:

1. Init.
2. Extend claude.md with Static Resources containing the images and icons added to Power BI, as well as the Base Theme and the Applied Theme. The way Power BI decides how to display a report element is based on the following hierarchy: attribute specified in the visual; if not, attribute specified in the registered theme; if not, attribute specified in the base theme; if not, the default value. This is relevant to understand where certain design settings are set within the report.
3. Extend claude.md to include:
   - report.json contains the report-level filters
   - pages.json contains the settings for the opening page when opening the report
   - page.json contains the page-level filters
   - visual.json contains visual-level filtering
4. Extend claude.md with the information about where to find the names of entities such as pages, visuals, and bookmarks (a small folder-walking sketch for this follows below).
5. In claude.md, clarify the definition of visual names by checking multiple visuals, since some visuals have names and others do not...
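Scripting the entity inventory

To give the agent (and yourself) a quick inventory of those pages and visuals, a short script can walk the PBIR folder and list what it finds. This is only a sketch under assumptions: it expects the usual PBIR layout of definition/pages/<page>/visuals/<visual>/visual.json inside the .Report folder, the JSON keys it reads (displayName, visual.visualType) are the ones I have seen in that format, and the MyReport.Report path is a placeholder; adjust everything to your own report.

import json
from pathlib import Path

# Assumed PBIR layout: <report>.Report/definition/pages/<page>/visuals/<visual>/visual.json
report_folder = Path("MyReport.Report")  # placeholder - point at your own .Report folder

pages_root = report_folder / "definition" / "pages"
for page_json in sorted(pages_root.glob("*/page.json")):
    page = json.loads(page_json.read_text(encoding="utf-8"))
    # displayName is what report users see; the folder name is the internal identifier
    print(f"Page {page_json.parent.name}: {page.get('displayName', '(no display name)')}")

    for visual_json in sorted(page_json.parent.glob("visuals/*/visual.json")):
        visual = json.loads(visual_json.read_text(encoding="utf-8"))
        visual_type = visual.get("visual", {}).get("visualType", "unknown")
        print(f"  Visual {visual_json.parent.name}: {visual_type}")

The output of a walk like this can be pasted straight into claude.md as a lookup table of entity names, which is exactly the kind of grounding step 4 asks for.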
Join me: I will show you in a live demo how to work with AI agents in Power BI. Interested in learning more about AI agents and Power BI development? Join us at the next London Fabric and Power BI User Group (FABUG) meetup!

I also recommend watching this video to understand where AI agents and LLMs in general are heading: How to change your prompts for GPT-5 (watch on YouTube).

Conclusion

Working with AI agents in specialized domains like PBIR development requires bridging the gap between pattern matching and true reasoning. We must act as the "System 2", providing detailed instructions, validating outputs, and correcting misconceptions through instruction files that transfer our domain expertise.

The key takeaway? AI agents can accelerate Power BI development when paired with human expertise. Success requires understanding both the technology (Power BI, DAX, PBIR) and how to effectively communicate this knowledge to AI partners. The future isn't about replacing human expertise; it's about amplifying it through thoughtful human-AI collaboration. As tools evolve, our role shifts from coding everything to skillfully guiding AI agents toward desired outcomes.

Resources

- Claude Code Best Practices (Obsidian Publish)
- Why GitHub Copilot Custom Instructions Matter - Thomas Thornton

Creating a Power BI Knowledge base with NotebookLM

SelfServiceBI· 20 Jun 2025

Power BI documentation is extensive, but finding the right information when you need it can be challenging. What if you could have an AI assistant that knows all the Power BI documentation inside and out? That's exactly what we can achieve using Google's NotebookLM. NotebookLM is Google's AI-powered research assistant that can analyze and understand large collections of documents, making them searchable and queryable through natural language. In this post, I'll show you how to create a comprehensive Power BI knowledge base using NotebookLM.

Why Create a Power BI Knowledge Base?

Power BI has thousands of pages of documentation scattered across Microsoft Learn, community forums, and various resources. Finding specific information often involves:

- Searching through multiple documentation sites
- Reading lengthy articles to find relevant sections
- Trying to remember where you saw that specific feature explanation
- Piecing together information from different sources

A centralized knowledge base powered by AI can solve these problems by providing instant, contextual answers to your Power BI questions.

Isn't just using an LLM good enough?

Large language models (LLMs) are incredibly versatile because they're trained on massive amounts of data. However, this also means they can sometimes struggle to find specific details within a huge web of information. Plus, since LLMs generate responses based on patterns rather than exact facts, they might occasionally provide outdated info or even make things up (a phenomenon known as "hallucination").

Getting Started with Power BI Documentation

The first step is gathering the Power BI documentation. Microsoft provides comprehensive documentation on Microsoft Learn, covering everything from basic concepts to advanced features (Power BI official documentation). You might also want to collect documentation covering:

- Community Tools – Add-ons, custom visuals, and utilities for Power BI.
- Code Repositories – GitHub/GitLab samples, DAX, Power Query, and scripts.
- Related Technology Docs – Docs for Entra, Dataverse, Fabric, Synapse, SQL Server.
- Community Blogs & Videos – Tutorials and tips from the Power BI community.
- Official Power BI Blog – Updates and best practices from the product team.
- Conference Materials – Presentations and slides from Power BI events.
- Sample Datasets & Templates – Example datasets, PBIX files, and templates.
- FAQ & Troubleshooting – Common issues and solutions from forums.
- API & Developer Docs – Resources for REST API, embedding, and automation.
- Security & Compliance – Guides on governance and data protection.

You can also take advantage of NotebookLM's built-in Discover feature to search for additional resources and further enrich your knowledge base.

Setting Up NotebookLM

Once you have your documentation organized:

1. Visit NotebookLM - Go to notebooklm.google.com
2. Create a new notebook - Start a new project for your Power BI knowledge base
3. Upload documents - Add your collected documentation files
4. Let NotebookLM process - The AI will analyze and index your content

NotebookLM supports various file formats including PDF, text files, and even Google Docs. It can handle substantial amounts of content, making it perfect for comprehensive documentation collections.

Querying Your Knowledge Base

Once your knowledge base is set up, you can start asking questions in natural language. NotebookLM will provide detailed answers with citations, showing you exactly which documents contain the relevant information.

Important reminder: Verify!
- Cross-reference answers with the sources
- Test suggested solutions before implementing
- Keep your documentation up to date

Conclusion

Creating a Power BI knowledge base with NotebookLM transforms how you access and use Power BI documentation. Instead of spending time searching through multiple resources, you can get instant, contextual answers to your questions. Start building your Power BI knowledge base today, and experience the difference of having all Power BI information at your fingertips, ready to answer any question you might have.

How to find which visual generated the query in Log Analytics?

SelfServiceBI· 11 Mar 2024

Model metadata in Power BI is getting more and more accessible, and there are a fair number of tools we can use to work with it. Report metadata, on the other hand, is not easy to work with. In my previous blog post, I showed how you can use Log Analytics to improve the performance of models and reports. The only missing piece was how to find the visual that you need to work on. This post will try to cover all the different ways we can extract and interact with report metadata.

Manual method

If you have access to the PBIX file for the report, you can extract the information from the file:

1. Rename the file from .pbix to .zip
2. Open the zip file and find the Layout file
3. Open it in an editor like VS Code
4. Format the file to JSON
5. Search for the VisualID (CTRL+F)
6. Search for the display name that you can look up in the Selection view in Power BI Desktop

To make your life easier with a larger report, you can also find the page name: you need to search for the ReportSection that encompasses the visual.

Limited Python knowledge & GitHub Copilot method :)

I had long wanted to find a meaningful task where I could see how to use GitHub Copilot to enhance my limited Python knowledge. It was quite an interesting learning experience that helped me understand how to interact with Copilot in a way that is productive.

Python script: SelfServiceBIBlog/Governance/ExportReportMetadata.py at main · KavasiMihaly/SelfServiceBIBlog (github.com)

This script currently extracts the most essential information to help with the analysis in Log Analytics. Feel free to add to this script and update it with additional extracted information. Happy to receive any code contributions on GitHub.

⚠️ You might need to open the Layout file in VS Code first and format the file to JSON. ⚠️

Output: the script will create a CSV file containing the visual and page details.

Adding the results to Log Analytics

When you add this extracted information to the Log Analytics template report we discussed in my previous post, it greatly improves the insights and makes it really easy to find the visuals in a report. It also enables more general analytics about the types of visualisations used and their performance.

Community tools available

Power BI Helper

"Power BI Helper is an external tool for Power BI Desktop. Using this tool, you can analyze your Power BI file, explore the data sources, the Power Query script, the data model, DAX expressions, and visualizations. This is a tool for Power BI developers and analysts and administrators, architects, and managers to create documentation of their work and have a better Power BI development experience." https://powerbihelper.org/

This tool gives you an overview of the visuals and the ability to export this information.

Measure Killer

https://en.brunner.bi/measurekiller

Measure Killer analyzes the model and report to find unused measures and columns. The Enterprise version allows tenant-level analysis. The UI doesn't show the visual information, but after exporting the data you can get to the visual details.

Microsoft Fabric APIs

With the appearance of the Fabric APIs, we now have a Microsoft-supported way to access Power BI report metadata, among many other item types. As you can see in the image, the API can also be used to manage these items.

Fabric Report Definition API: https://learn.microsoft.com/en-us/rest/api/fabric/articles/item-management/definitions/report-definition

Here is the information you can access through the API. The visual and page metadata is stored in the report.json.
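Before wrapping up, here is a rough sketch of the kind of extraction the Python script mentioned earlier performs, in case you want to see the moving parts: open the PBIX as a zip archive, read the Layout entry (which in my experience is UTF-16 encoded JSON), and write the page and visual identifiers to a CSV. The JSON keys used below (sections, visualContainers, config, singleVisual) match the Layout structure I have worked with, but treat them, and the file paths, as assumptions to adjust for your own report.

import csv
import json
import zipfile

pbix_path = "MyReport.pbix"         # placeholder - use your own report file
output_path = "report_visuals.csv"

with zipfile.ZipFile(pbix_path) as pbix:
    # The Layout entry inside a PBIX is typically UTF-16 LE encoded JSON.
    layout = json.loads(pbix.read("Report/Layout").decode("utf-16-le"))

with open(output_path, "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(["PageName", "PageDisplayName", "VisualId", "VisualType"])
    for section in layout.get("sections", []):            # sections are the report pages
        for container in section.get("visualContainers", []):
            # Each visual's config is itself a JSON string holding the visual name and type.
            config = json.loads(container.get("config", "{}"))
            writer.writerow([
                section.get("name"),
                section.get("displayName"),
                config.get("name"),
                config.get("singleVisual", {}).get("visualType"),
            ])

print(f"Visual and page details written to {output_path}")

With a CSV like this, you can join the VisualId column to the VisualId extracted from Log Analytics and immediately see which page and visual a slow query belongs to.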
Conclusion

Report metadata is useful for performance monitoring, and integrating it with Log Analytics data can improve usability.

Log Analytics for Power BI: How to start and improve insights for better model performance

SelfServiceBI· 05 Mar 2024

I had a task to help a managed services team monitor multiple production datasets and reports in Power BI. These resources, accessed by thousands of users, demanded optimal performance. For this we had to streamline information gathering, implement an intuitive platform, and empower the team to explore both high-level overviews and granular query details. This was a perfect opportunity to work with Log Analytics for Power BI, to dig deep into the data, and to find ways to advance the available solutions.

What is Azure Log Analytics?

Azure Log Analytics (LA) is a service within Azure Monitor that Power BI uses to save activity logs. Azure Monitor enables you to collect, analyze, and act on telemetry data from both Azure and on-premises environments. Key features of Azure Log Analytics include long-term storage, an ad-hoc query interface, and API access for data export and integration with other systems.

How does the integration work?

Power BI integrates with Log Analytics to expose events from the Analysis Services engine. These events are derived from the existing diagnostic logs available for Azure Analysis Services. Once connected to Power BI, data is sent continuously and becomes available in Log Analytics in approximately 5 minutes.

- Integration is at the workspace level, which allows selective data collection and better cost control.
- It only works with workspaces on a Premium capacity.
- By default, 30 days of data are preserved. This can be modified, and there are options for archiving.

If you need further guidance on setting up Log Analytics, refer to the official documentation. 🚀

Analyzing collected data

In Azure

In the Azure Portal you can analyse the data in the Logs section of the Log Analytics workspace. Here is the official tutorial that helps you get familiar with it: https://learn.microsoft.com/en-us/azure/azure-monitor/logs/log-analytics-tutorial

⚠️ There is a big limitation when analysing data from Log Analytics: the limited query result set size. It is the same whether you are analysing the data in Azure or connecting through the API via Power BI. You can see that in the notification in the picture above. ⚠️

In Power BI

Fortunately, Rui Romano, Bhavik Merchant, and other Microsoft contributors created a really useful Power BI report template to analyse Log Analytics data.

Power BI Log Analytics Template Reports: https://github.com/microsoft/PowerBI-LogAnalytics-Template-Reports

This is a brilliant report that provides a wide range of insights about both model query and refresh performance. I highly recommend using it if you plan to implement Log Analytics at your organization. The solution also resolves the data volume issue by querying Log Analytics in increments.

But something was missing...

As I implemented the solution at one of the projects, I noticed that there was no way in the template report to attribute a query to a particular visual. The reports that connected to the models had multiple pages and potentially more than 100 visual elements, which made finding the problematic visuals quite hard. So I started searching through the logs to see if there was any information available that would help. As you can see in the image, the visual details were buried in the ApplicationContext column.

How to add the missing piece?

Because the report is really sophisticated, it took me a while to understand its inner workings, and I also received some help from Will Cisler, thank you for that!
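Before touching the template, you can confirm that the visual information really is buried in ApplicationContext with a quick ad-hoc query from Python. This is a minimal sketch, assuming the azure-identity and azure-monitor-query packages and the PowerBIDatasetsWorkspace table the integration writes to; the workspace ID is a placeholder for your own Log Analytics workspace.

from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

workspace_id = "<log-analytics-workspace-guid>"  # placeholder - your Log Analytics workspace ID

# Pull a few recent QueryEnd events and surface the VisualId hidden in ApplicationContext.
kql = """
PowerBIDatasetsWorkspace
| where OperationName == 'QueryEnd'
| extend Sources = parse_json(ApplicationContext).Sources
| mv-expand Sources
| extend VisualId = tostring(parse_json(Sources).VisualId)
| project TimeGenerated, ArtifactName, DurationMs, VisualId
| take 10
"""

client = LogsQueryClient(DefaultAzureCredential())
response = client.query_workspace(workspace_id, kql, timespan=timedelta(days=1))

for table in response.tables:
    for row in table.rows:
        print(dict(zip(table.columns, row)))

Seeing VisualId values come back here is a good sanity check that the template modification below will have something to join on.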
Add Visual ID to the data

First we need to modify the Power Query and embedded KQL code to add the missing element. As a starting point, extend the fnGetOperationsExceptRefreshes function:

| extend Sources = parse_json(ApplicationContext).Sources
| mv-expand Sources
| extend ReportId = tostring(parse_json(Sources).ReportId)
| extend VisualId = tostring(parse_json(Sources).VisualId)
| extend TextDataHash = hash(EventText)
| extend DurationMs = coalesce(DurationMs, 0)
| extend User = iff(toupper(PowerBIWorkspaceId) in (workspaceIdList), hash_md5(User), User)
| extend ExecutingUser = iff(toupper(PowerBIWorkspaceId) in (workspaceIdList), hash_md5(ExecutingUser), ExecutingUser)
| distinct ApplicationName, CpuTimeMs, ArtifactId, ArtifactName, ArtifactKind, DatasetMode, DurationMs, EventText, ExecutingUser, OperationDetailName, OperationName, PremiumCapacityId, ReportId, SourceSystem, Status, StatusCode, TenantId, TextDataHash, TimeGenerated, Type, User, PowerBIWorkspaceId, PowerBIWorkspaceName, XmlaRequestId, XmlaSessionId, _ResourceId, _SubscriptionId, VisualId
| extend QuerySubeventKey = case(
    OperationName == 'QueryEnd' and isnotempty(XmlaRequestId), XmlaRequestId, // add rootActivityId as the key if this is a QueryEnd. This joins to the Suboperation table.
    dynamic(null)) // if we are not expecting subevents then make this null. In PowerQuery we will add a dummy index value for null values to keep this column unique for the 1:* relationship
| project-rename DatasetId = ArtifactId, Dataset = ArtifactName, StartTime = TimeGenerated, ResourceId = _ResourceId, SubscriptionId = _SubscriptionId, ['Duration (ms)'] = DurationMs, VisualGUID = VisualId
"],Timeout=#duration(0,0,60,0)])),

⚠️ Other parts of the solution might need to be adjusted to let this new column flow through the pipeline. ⚠️

Add Visual Details Page to the Report

Finding the visuals that cause the performance issues was our main goal, so I created a page dedicated to showcasing a query summary by visuals. I also added some time analysis to show how changes in usage or improvements in the solution affect performance. To connect the page with the rest of the report, a drill through was added from the Workspace summary page to the Visual details, and also from the Visual details to the Query details, to understand how each query performs.

Real life examples of performance improvements using Log Analytics

Here are some examples of what we were able to achieve by analysing Log Analytics data with our current setup:

- A slow table visual (40s+) that was designed for exporting data and ran more than 3,000 times in two weeks was moved to a separate drill through page (progressive disclosure). It turns out nobody really wanted to export data, because it ran fewer than 20 times in the next two weeks.
- A complex DAX issue was causing a visual to fail after running for more than 4 minutes. Correcting the calculation made it run in under 10 seconds.
- Legacy, now unnecessary filters used in some visuals for "data cleansing" were removed, improving query performance from 60 seconds to 2 seconds.

What's next?

Unfortunately, there is currently no option in Power BI to extract report metadata from published reports through the Power BI APIs. In my next blog post I will show how you can find the exact details of a visual and page in a PBIX file manually, what community-built options exist, and how the Fabric APIs will help.

New Page Available: Explore Power BI features with Power BI

SelfServiceBI· 04 Aug 2022

I spend a lot of time going through the Power BI documentation to find answers about Power BI features. As always, I look for opportunities to make my life easier when I need to look through a lot of data. What better tool for this than Power BI itself? So, I created a few reports to help both me and you find what we are searching for more easily and faster. Questions like:

- Is it mandatory to use a data gateway if I need to connect to Amazon Athena?
- Will I be able to connect live to Essbase?
- What are the parameters I need to use if I want to get all the reports using the REST APIs?
- Is there an API call to cancel a dataset refresh?
- What options do admins have to manage external user access?
- I saw a feature in a blog post/video, but I cannot see it in my tenant; can the admin turn it off?

Link to the page: Power BI Features

Explore Power BI features with these reports. Hope you find them useful for finding answers to your questions faster.

Hidden features of Analyze in Excel - Part 3 - Focus

SelfServiceBI· 15 Jul 2022

In the first two parts (part 1 and part 2) of this series we explored the Show Details functionality of a pivot table in Excel. Now we look at another hidden feature that will help you limit distractions. Whenever you connect to a Power BI dataset you immediately see the whole model. The issue is that many models contain a lot of tables, columns, and measures, most of them irrelevant to your current analysis. Which model would you rather work with: the full data model, or the Housing Market submodel?

When creating the dataset there is usually a desire to make it as comprehensive as possible, but that comes with a cost: using it can become confusing and complicated for business users. I have a blog post about the differences between making a data model for yourself and for others, which talks about these challenges and the different approach you need to take when designing a dataset for someone else to use. This is even more true for Excel users. But we have a "secret" tool to make their life much easier, called a perspective.

What is a perspective?

"Perspectives, in tabular models, define viewable subsets of a model that provide focused, business-specific, or application-specific viewpoints of the model." - AAS documentation.

In Power BI, perspectives are used to support the Personalized Visuals feature. By default, you cannot use them when creating a Power BI report, and creating them is not available in Power BI Desktop either; you need to use Tabular Editor.

Tip: In Premium, using the Analysis Services connector, the perspectives are accessible in Power BI Desktop.

Using perspectives in Excel

In Excel, when connecting to a Power BI dataset, you can modify some properties of that connection that are not available in Power BI. By opening the connection properties and navigating to the Definition section you can define the perspective you would like to use.

Tip: The perspective name cannot have spaces in it.

Once you have defined it, the pivot table will only show the tables, columns, and measures that are added to that perspective (connecting to the Housing Market Data perspective). This can greatly simplify the work of an analyst, because they do not need to spend so much time finding the needle in the haystack.

Creating a perspective connection file

You can export the connection into a connection file, and this property will be saved.

Summary

Perspectives are a fantastic way to create different, simpler views of a complex data model without the need to split it up. This helps both the Power BI developer and the end users. Hope you liked this blog post, see you in the next one!

Hidden features of Analyze in Excel - Part 2 - Give me all the data

SelfServiceBI· 11 Jul 2022

In the previous blog post we discussed how to define the details of the drill through feature of a pivot table (Show Details - default results). Having learned how to define the results of the Show Details capability, we encounter another potential issue. As you can see in the image below, the selected cell represents almost 2,000 rows, but by default you only get the first 1,000 rows (default Show Details functionality - 1,000 rows). That makes the feature not that useful. You would expect to see all the data, and if you only get the first 1,000 rows you might believe you actually have all of it. This can cause a lot of confusion. Fortunately, there is a solution to get all the details.

Connection Properties - OLAP Drill Through

If you go to the properties of the connection, you will see an option that most people have ignored so far: Maximum number of records to retrieve (see the Connection properties dialog). You can increase this number to anything that an Excel sheet can cope with. Of course, getting more data requires an efficient data model, a good internet connection, and suitable hardware.

Show Details made more useful

Together with the Detail Rows Expression and the increased number of rows retrieved by Show Details, we can increase the value of these standard Excel functions (modified Show Details functionality - 10,000 rows).

Considerations

- Because this setting is in Excel, it cannot be set in the Power BI model.
- You need to open the file in the desktop version of Excel; you cannot use or set this feature in Excel Online.
- It is possible to create a connection file to save the details, but it does not save this setting. This will be useful in the next part of this series. For now, you need to save the Excel file created from Analyze in Excel with this setting modified to a shared drive for the Excel users to use.
- I am currently searching for a solution to change this setting globally for all Excel files, so it is not a manual effort every time. If you know one, please share it with me.

Export connections into an .ODC file

Summary

This and the previous post not only ensure that Excel users feel no negative impact from using a Power BI dataset, but actually add value for them. In the next blog post I will show how you can reduce distractions and complexity when working with a Power BI dataset in Excel. Hope you like this blog post, see you in the next one.

Hidden features of Analyze in Excel - Part 1 - Show me the details

SelfServiceBI· 07 Jul 2022

Why focus on Analyze in Excel?

In my previous post about Organizational Data Types I introduced a chart showing the potential reach of certain features in an organization. There we covered the widest-reaching feature. In this series I would like to cover some hidden features of Analyze in Excel. These will improve the usability of your solutions and solve some of the issues you would face on your adoption journey to Power BI. There is an Adoption Roadmap available that also talks about these challenges. Not everyone in an organization will want to use Power BI, but that does not mean we should not focus on them too. We need to reach people where they currently are and not necessarily force them to change.

Show Details

In a pivot table there is a standard feature called Show Details. This gives a filtered view of the top 1,000 rows based on our selection (Show Details functionality in Excel). While it works with standard pivot tables by default, this behaviour changes when we use Analyze in Excel.

Data Model

First, let's understand the data model we are going to use with Analyze in Excel. It is a fairly straightforward subset of a bigger model:

- One fact table - Housing Market Data
- Three dimension tables - Calendar, Post Code, Lower Super Output Area
- A Calculations table that only contains measures organized into folders (following Power BI data modelling best practice)

Standard experience

Let's see what happens when you try to use Analyze in Excel with this standard model. After connecting to the published dataset, it is straightforward to use the pivot table as expected. If you want to see the details, you can double click on a cell, or right click and select Show Details. What you would expect is a nice table of columns from the Housing Market table with the filtered rows; this is how it works with simple Excel pivot tables. Instead, a screen with no data is going to welcome you, because the measure table is empty. But why? Because we are using a Power BI data model, the measures are in their own table, which does not have any columns or rows. What if the measures are in the fact table? We would still have a problem, because it would exclude all the relationship columns, like date and postcode in our example, losing valuable information from the details. So, does using Power BI have some limitations for Excel users?

How to see the details - Detail Rows Expression

To solve this issue, and even provide better results, we need to use Tabular Editor. The rows that are displayed by Show Details depend on the measures. To determine which columns should appear, we need to provide a Detail Rows Expression (set in Tabular Editor). Because we are using a data model, we are not limited to columns from one table; any related information can be added to it. Below is the sample DAX query that I used for this blog post. The best way to design a query for your data model is to use DAX Studio.
Detail Rows Expression sample:

SELECTCOLUMNS(
    'Housing Market Data',
    "Transaction ID", [Transaction ID],
    "Estate Type", [Estate Type],
    "Property Type", [Property Type],
    "New Build", [New Build?],
    "Local Areas District", RELATED('Local Area District'[LA Name]),
    "Post Code", RELATED('Post Code'[Post Codes]),
    "Street", [Street],
    "House Number/Name", [House Number/Name],
    "Transaction Date", RELATED('Calendar'[Date]),
    "Properties Sold", [Properties Sold],
    "Average Price Paid", [Average Price Paid]
)

Default expression

If you have a lot of measures, it could seem like a lot of work to add this to each of them. If you would like to use the same expression for all the measures in a table, you can use the table's Default Detail Rows Expression setting (a table-level expression). You are still able to add a specific expression to certain measures if they need to differ.

Summary

Using a Detail Rows Expression, you can ensure that Excel users do not feel they have lost a valuable feature because of Power BI; instead they now have an even better version of it. Hope you like this blog post, the next part will arrive shortly.

Organizational Data Types - Supercharge your Excel users’ productivity

SelfServiceBI· 29 Jun 2022

Data Types have been part of Excel for some time now, but a lot of people are not familiar with them. You can find them under the Data tab. You might have wondered what they are for. They are built-in datasets about common information like stocks, currencies, and geographic locations. There are also other data types by Wolfram, but they are going to disappear on 11th June 2023. You can use these data types to get additional information about the data you work with in Excel, like the current stock price for a ticker symbol or population information for a city.

What are Organizational Data Types?

Organizational Data Types are like these other Data Types, but they are specific to the organizations you are in. They can be created using Power BI Featured Tables. To see them, you need to sign in to your organization's tenant in Excel.

Why should you use Organizational Data Types?

To make organizations data driven, we should make working with data as easy as possible. The number of people in an organization whose core task is to analyze data, and who need the capabilities of a tool like Power BI, is relatively small; but almost everyone uses Excel for ad hoc tasks. Providing them with easy access to organizational or even third-party information supercharges their productivity and improves consistency. They won't need to look for a set of data across files or send emails. Using this feature effectively can have the biggest impact on your organization's use of data and exponentially increase the value you generate.

Additional recommendation: Organizational Data Types are not just for organizational data

Just because the feature is called organizational data types does not mean it should only contain data that the organization manages. The examples you see contain data about schools, GPs, or registered companies. These might not be relevant for your organization, but I'm sure there are a lot of open or third-party datasets available that you could use to help your colleagues.

Setup Organizational Data Types

Checking tenant settings

First, you need to check whether featured tables are enabled in your organization. This can be done in the Power BI tenant settings.

Selecting the data source

When setting up an Organizational Data Type, my recommendation is not to use an existing dataset, because the modelling requirements for a Data Types dataset and for standard usage are different. Using common dataflows could be a good option here, as it would reduce data duplication and the load on the source systems (a separate Organizational Data Types dataset in Power BI).

Structuring data

For Data Types models you need flat tables, because everything you would like to use for a data type needs to be in the same table. So you do not need to build a data model: put all your columns, calculated columns, and measures in the table you want to use as a Data Type in Excel.

Set up a featured table

Once you have your data ready, setting it up as a Featured Table is relatively easy. Most importantly, you need to ensure you have a unique column to serve as your Key Column. After publishing, the featured tables should appear in Excel after a restart.

Summary

Organizational Data Types (Featured Tables) are an underutilized tool that can improve the productivity of a much larger group of users than Power BI itself.
If you haven't already, I highly recommend checking out this feature and thinking about how you could utilize it in your organization. Hope you found this blog post useful.