<?xml version="1.0" encoding="utf-8"?> 
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-us">
    <generator uri="https://gohugo.io/" version="0.152.2">Hugo</generator><title type="html"><![CDATA[Scientific Python on Blog]]></title>
    
    
    
            <link href="https://blog.scientific-python.org/" rel="alternate" type="text/html" title="html" />
            <link href="https://blog.scientific-python.org/atom.xml" rel="self" type="application/atom+xml" title="atom" />
    <updated>2026-04-04T04:32:36+00:00</updated>
    
    
    
    
        <id>https://blog.scientific-python.org/</id>
    
        
        <entry>
            <title type="html"><![CDATA[Community Considerations Around AI Contributions]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/community-considerations-around-ai/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-2/?utm_source=atom_feed" rel="related" type="text/html" title="Developer Summit 2" />
                <link href="https://blog.scientific-python.org/scientific-python/translations/?utm_source=atom_feed" rel="related" type="text/html" title="Translations for Scientific Python projects" />
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1/?utm_source=atom_feed" rel="related" type="text/html" title="Developer Summit 1" />
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1-sparse/?utm_source=atom_feed" rel="related" type="text/html" title="Developer Summit 1: Sparse Arrays" />
                <link href="https://blog.scientific-python.org/scientific-python/2022-czi-grant/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python awarded CZI grant to improve communications infrastructure &amp; accessibility" />
            
                <id>https://blog.scientific-python.org/scientific-python/community-considerations-around-ai/</id>
            
            
            <published>2026-01-29T00:00:00+00:00</published>
            <updated>2026-01-29T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>An attempt at exploring risks and impacts of LLMs and agents on our ecosystem, and how we, as a community, may agree upon common cultural norms and standards for integrating such technology into contributor pathways and workflows.</blockquote><p>As LLM- / Agent-generated workflows and PRs become commonplace, we, the Scientific Python maintainer community, have to decide how to engage with them. Much of our ecosystem was crafted by hand, with a lot of care and love, so it is unsurprising that the rise in LLM contributions may at first feel threatening.
I know from personal experience that I felt somewhat deflated the first time one of these landed in front of me.
There was a sense of frustration and loss, and it took me a few days to process the repercussions.
Discussing this with a colleague, he rightfully positioned it as follows:</p>
<blockquote>
<p><em>The reason for the success of projects like NumPy and SciPy is not primarily superior coding ability, or better tooling, but the explosive effect of humans working well together, and enjoying their work. We cannot ignore that human and social element, because if we do, we lose it all, and we really do become lesser versions of programming teams in companies.</em></p>
<p><em>In Rebel Code (Moody, 2001), there is this idea that open-source has always been a movement of rebellion, an attempt at taking back the commons from enclosure. What does the rebellion look like when it loses its social form? What kind of people will we lose?</em> — Matthew Brett</p>
</blockquote>
<p>It is against this backdrop that I spent a week pacing the corridor, hand-wringing.
This post is part of the result: an attempt at exploring risks and impacts of LLMs and agents on our ecosystem, and how we, as a community, may agree upon common cultural norms and standards for integrating such technology into contributor pathways and workflows. It is not a philosophical piece; it jumps pretty much directly into pragmatic concerns. I think the other conversation, the one that ties more directly into Matthew&rsquo;s concerns above, is also very much worth having.</p>
<p>Admittedly, my initial sense of foreboding around LLMs upending our community&rsquo;s culture of collaboration, while by no means gone, has somewhat dissipated as I&rsquo;ve started to also contemplate the other side of the coin: how these changes may benefit maintainers, who have, over the years, increasingly been burdened with more menial tasks, drawing them away from the work that originally attracted them to the ecosystem.</p>
<p>Whether we like it or not, the world has changed irrevocably, and now is a good time to consider how to position ourselves within it.</p>
<p>I start this post outlining concerns, since these have been the topic of most AI conversations within our community.
I&rsquo;ll follow with a section on proposed guidelines that we may iterate on, and end with a more hopeful section of how we may benefit from the revolution underfoot.</p>
<h2 id="maintainer-concerns">Maintainer concerns<a class="headerlink" href="#maintainer-concerns" title="Link to this heading">#</a></h2>
<h3 id="licensing">Licensing<a class="headerlink" href="#licensing" title="Link to this heading">#</a></h3>
<p>The earliest concern raised around LLMs was that they almost certainly violate licensing conditions.
In many cases, they will readily produce material derived from training sources that have licenses incompatible with the library you are contributing to. Being a summarization of a large corpus, an LLM is unlikely to even know that it drew upon a BSD-licensed source when generating code, and as such attribution will not be given.</p>
<p>Of course, it matters <em>what</em> you generate. If you refactor a test suite or correct spelling, you are unlikely to contravene any licenses. If, however, you are implementing a sophisticated algorithm, perhaps one that exists in, say, GPL&rsquo;d libraries (incompatible with our BSD-based ecosystem), the risk increases significantly.</p>
<p>Colleagues I spoke to prior to writing this post mentioned that they often use LLMs for annoying one-off tasks: generating an <code>nginx</code> configuration, writing an OPML-to-YAML converter, or setting up some throwaway experiment. There&rsquo;s little reason to have licensing concerns about such use-cases.</p>
<h3 id="introduction-of-subtle-bugs">Introduction of subtle bugs<a class="headerlink" href="#introduction-of-subtle-bugs" title="Link to this heading">#</a></h3>
<p>LLMs typically operate with limited <em>context</em>. Problem-specific context needs to be selected and provided by the user, and it is not clear what <em>optimal</em> context entails. Certain categories of prediction mistakes occur frequently, including hallucinations and over-confidence. Even having a &ldquo;good memory&rdquo; (i.e., being able to process and reference a lot of material quickly) cannot account for such missing context, and therefore proposed solutions may be sub-optimal.
It happens, therefore, that contributions generated by LLMs introduce subtle bugs, due to a lack of systems architecture awareness.
These are bugs which, unfortunately, you—the maintainer—will be responsible for resolving in the future :)</p>
<p>In a (mostly positive) summary of the current (beginning of 2026) state of AI for coding<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, Andrej Karpathy writes:</p>
<blockquote>
<p><em>The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don’t manage their confusion, they don’t seek clarifications, they don’t surface inconsistencies, they don’t present tradeoffs, they don’t push back when they should, and they are still a little too sycophantic [telling users what they want to hear].</em> — Andrej Karpathy<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
</blockquote>
<h3 id="reviewer-frustration">Reviewer frustration<a class="headerlink" href="#reviewer-frustration" title="Link to this heading">#</a></h3>
<p>LLM contributions can be generated at a staggering pace, but reviews still require careful human attention. And while a first pass LLM review is fine, we&rsquo;re not going to merge changes without looking at and understanding them first.
After all, that is why the review process exists in the first place—because we&rsquo;ve learned the cost of moving forward too hastily, and of making decisions without carefully considering their impact.
Unless the contributor is attentive and deliberately careful in the follow-up conversation, the interaction may feel hollow and dehumanized, and generate frustration among reviewers.
There are ways to improve the situation (see &ldquo;Potential guidelines&rdquo; below), but the essence of this concern is that it can be very discouraging for &ldquo;artisans&rdquo; to have to engage with—and spend time on—code that cost very little to build.</p>
<h2 id="general-concerns">General concerns<a class="headerlink" href="#general-concerns" title="Link to this heading">#</a></h2>
<h3 id="a-reduction-in-learning">A reduction in learning<a class="headerlink" href="#a-reduction-in-learning" title="Link to this heading">#</a></h3>
<p>When programmers shift towards relying on LLMs, they will be tempted to focus less on learning. Why go through the pain of figuring out a complex code base, or understanding an algorithm you are working on, when all that can be taken care of on your behalf? The reward is immediate, but over time there is a cost to bear as the contributor&rsquo;s abilities decrease—or never evolve in the first place. This affects, in particular, those who are learning programming and problem-solving skills for the first time.</p>
<p>To be fair, there are good learning opportunities with LLMs as well: they may help you better understand a codebase, advise on subtleties of translating code to a new language, etc. If you apply it with care, you <em>can</em> benefit without succumbing to the risks—but it requires a fair amount of discipline, and humans are better at avoiding suffering than disciplining ourselves.</p>
<h3 id="uncertain-improved-efficiency">Uncertain improved efficiency<a class="headerlink" href="#uncertain-improved-efficiency" title="Link to this heading">#</a></h3>
<p>The jury is still out on whether AI improves coding efficiency, especially when it comes to experienced developers working on open source projects they know well. Preliminary studies of this specific scenario suggest neutral to negative results.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>Somewhat tongue in cheek, software engineer Mike Judge notes:</p>
<blockquote>
<p><em>If so many developers are so extraordinarily productive using these tools, where is the flood of shovelware? We should be seeing apps of all shapes and sizes, video games, new websites, mobile apps, software-as-a-service apps — we should be drowning in choice. We should be in the middle of an indie software revolution. We should be seeing 10,000 Tetris clones on Steam.</em> — Mike Judge<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></p>
</blockquote>
<p>Experiments to determine the &ldquo;AI efficiency multiplier&rdquo; are structured as follows: you generate a list of tasks. For each, you estimate how long it will take, and then flip a coin to decide whether you implement the solution using the &ldquo;classic approach&rdquo;, or by using an agent. You then do the task, estimate how long it took, and note the time it <em>actually</em> took.</p>
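<p>The protocol described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <code>Task</code> record, the <code>flip_coin</code> and <code>mean_ratio</code> helpers, and all field names are hypothetical and not taken from any study&rsquo;s tooling.</p>

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class Task:
    """One task in the experiment (all field names are hypothetical)."""

    name: str
    forecast_hours: float   # estimate made before starting
    use_ai: bool            # outcome of the coin flip
    perceived_hours: float  # developer's own estimate after finishing
    actual_hours: float     # measured wall-clock time


def flip_coin(rng: random.Random) -> bool:
    """Randomly assign a task to the agent (True) or the classic approach (False)."""
    return rng.choice([True, False])


def mean_ratio(tasks: list[Task], use_ai: bool, field: str) -> float:
    """Mean of (field / forecast) over the tasks in one arm of the experiment."""
    ratios = [
        getattr(task, field) / task.forecast_hours
        for task in tasks
        if task.use_ai == use_ai
    ]
    return statistics.mean(ratios)
```

<p>Comparing <code>mean_ratio(tasks, True, "perceived_hours")</code> with <code>mean_ratio(tasks, True, "actual_hours")</code> would expose any gap between how fast the agent-assisted work felt and how fast it actually was.</p>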
<p>What the METR study showed (albeit with low N, so results are uncertain) is that programmers may <em>feel</em> like they&rsquo;re faster with AI when often they&rsquo;re not. It&rsquo;s an easy trap to step into, especially when having to decide between two options where one <em>requires effort</em> and the other does not.
Again, it&rsquo;s worth noting that &ldquo;farming out&rdquo; the task means there is a good chance that you will fail to fully comprehend the solution and its potential impact, unless you deliberately and carefully review the result.</p>
<p>I think the equation clearly shifts when doing tasks you are unfamiliar with. For example, if you spend most of your time building scientific code in Python, scaffolding a website from scratch by hand will take longer than it would using an agent. But of course, having done it by hand, you then have the benefit of knowing how to build websites, and so it matters how many times you will be doing that type of task in the future.</p>
<h3 id="eroded-artistic-co-creation">Eroded artistic co-creation<a class="headerlink" href="#eroded-artistic-co-creation" title="Link to this heading">#</a></h3>
<p>Many of us got into open source because there is a deep satisfaction that comes from productive collaboration with other humans. We enjoy thinking about and talking through hard problems, learning from the best, and sharing our art. The software we build is a culmination and reflection of this culture of collaboration.</p>
<p>Like with any tool, we need to learn whether, when, and how to apply AI.
If it is used to replace <em>thinking</em>, instead of merely to reduce grunt-work, it risks derailing collaboration and sucking the joy out of attentive design and meticulous problem solving.</p>
<p>Given all the concerns above, some projects may well decide that they should not be using it at all.
We are only starting to engage more with AI contributions, only starting to see its impact on our projects and our collaborative culture. I think 2026 will prove to be highly educational.</p>
<h2 id="potential-guidelines">Potential guidelines<a class="headerlink" href="#potential-guidelines" title="Link to this heading">#</a></h2>
<p>Given the above concerns, what should we as a community do to engage with (a) the new tools, (b) developers who utilize these tools, and (c) the contributions they generate?
What I&rsquo;m asking is: instead of being prescriptive around the tools others choose to use to do their work, can we instead formulate guidelines that will allow us to continue enjoying working together?</p>
<p>Here are some initial suggestions:</p>
<p><strong>1. Be transparent</strong></p>
<p>Trust would greatly increase if contributors <em>declared</em> their AI use. This would help reviewers decide how they want to engage in the review process, and make them aware of potential risks. Declaring the use of AI sets the stage for honest conversations, improving the likelihood of a good interaction.</p>
<blockquote>
<p><em>Kernel contributors have been using tooling to generate contributions for a long time. These tools can increase the volume of contributions. At the same time, reviewer and maintainer bandwidth is a scarce resource. Understanding which portions of a contribution come from humans versus tools is helpful to maintain those resources and keep kernel development healthy.</em> — Proposed Linux kernel developer guideline<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
</blockquote>
<p>Needless to say, if a project has AI guidance (and many already do), it should be followed.</p>
<p><strong>2. Take responsibility</strong></p>
<p>When you submit a PR, no matter <em>what</em> tools you use, it is <em>your</em> responsibility to make sure that it addresses the problem <em>correctly</em> and doesn&rsquo;t introduce subtle errors; in that sense, it is no different from contributing code you wrote yourself. With hand-written code, however, you may have a clearer idea about potential pitfalls, and how careful you were in avoiding them. When using LLMs, there should be a <em>deliberate effort</em> to ensure that code conforms to community norms.</p>
<p>For example, when writing tests with an LLM, it may simply generate a large number of very similar tests, instead of using parametrization and fixtures the way the rest of the project does. It would then be up to you to identify that deficiency and refactor your code (or have the LLM refactor the code) to address it.</p>
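<p>As a rough illustration of that refactoring, here is a minimal sketch using <code>pytest</code>. The function under test, <code>clip_to_unit</code>, and the <code>tolerance</code> fixture are invented for this example and do not come from any particular project.</p>

```python
import pytest


def clip_to_unit(x):
    """Hypothetical function under test: clamp x to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


@pytest.fixture
def tolerance():
    # Shared setup lives in one place instead of being repeated in every test.
    return 1e-12


# One parametrized test replaces several near-identical copies such as
# test_clip_negative, test_clip_inside, test_clip_above, and so on.
@pytest.mark.parametrize(
    ("value", "expected"),
    [(-2.0, 0.0), (0.0, 0.0), (0.25, 0.25), (1.0, 1.0), (3.5, 1.0)],
)
def test_clip_to_unit(value, expected, tolerance):
    assert clip_to_unit(value) == pytest.approx(expected, abs=tolerance)
```

<p>Adding a new case then means adding one tuple to the list, which also keeps the diff small for reviewers.</p>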
<p><strong>3. Gain understanding</strong></p>
<p>With the assistance of AI, it will happen that contributors venture into territory beyond their technical expertise.
With the help of LLMs, contributors are able to solve harder problems than before, but it remains crucial that they put in the effort to understand their contribution before submitting.
We have to appreciate that there are some risks associated with LLM-assisted contributions, both as these models evolve to be &ldquo;better programmers&rdquo; and avoid common pitfalls, and as authors work out how to best provide context and assess LLM-generated content.
Transparency here helps a great deal to set expectations.</p>
<p>For contributors, it is important to keep in mind that reviews remain human conversations between peers.
Reviewers typically prefer not engaging an LLM directly, but having a knowledgeable exchange around the motivation behind a PR, how various technical choices are justified, and the impact it may have on the rest of the code.</p>
<p>We therefore recommend that contributors work to fully understand the changes they submit, and present them in such a way that reviewers can dive straight into the motivation, design decisions, and technical requirements.</p>
<p><strong>4. Honor copyright</strong></p>
<p>I had a longer section here around copyright and attribution, but I cut most of it, since I fear we may have run out of good options.
Projects in our ecosystem have been very deliberate about adhering to licensing requirements, and about giving attribution.
But LLMs are unlikely to ever produce meaningful license updates, and by this stage they&rsquo;ve read and assimilated most of our library codebases.
Patterns that were once copyrighted are now commonly duplicated and fully generic.</p>
<p>Personally, I am pulled in two directions. On the one hand, I care about people getting credit for the work they do. On the other hand, we really want our work to be used widely and impact as many lives as possible, which is also the reason we do not employ copyleft licenses. Credit is important, especially for younger people starting their careers, and we&rsquo;ll have to continue thinking about how to give it justly. Practically speaking, however, I&rsquo;m not convinced that licenses are an effective mechanism to enforce credit anymore.</p>
<p>I think it is reasonable to expect, still, that when you make a contribution you have a reasonable sense that you are not violating copyright.
After all, you cannot force contributors to some random GPL library to be OK with your using their code.
In the case of a test suite refactoring, or contributing to a typical React app, this is unlikely to be a problem.
But once you start implementing new algorithms in SciPy, for example, you run the risk of copying algorithms or bringing in copyrighted ideas from other projects.</p>
<p>I&rsquo;d therefore recommend playing it safe and only making AI-guided contributions that steer well clear of copyright infringement. Also see <a href="https://devguide.python.org/getting-started/generative-ai/#acceptable-uses">the Python Developer&rsquo;s Guide</a> on what they consider reasonable use-cases.</p>
<h2 id="potential-benefits">Potential Benefits<a class="headerlink" href="#potential-benefits" title="Link to this heading">#</a></h2>
<p>Notwithstanding the above concerns, and pervading sentiments around AI in our community, I think it&rsquo;s worth honestly assessing potential benefits.</p>
<p>When I started writing this blog post, I had dabbled with LLM-generated code from time to time &ldquo;to keep my finger on the pulse&rdquo;. I was, frankly, quite underwhelmed.
During a routine re-evaluation in December, however, I noticed a marked shift in how quickly an agent was able to solve a routine coding task. The same caveats as usual applied (the AI sometimes went down rabbit holes, it made up function names, etc.), but it <em>did</em> give me pause.</p>
<p>As I wrestled with the implications of AI for our community, I realized that the Scientific Python project was formed, in part, because of the enormous burdens that maintainers now face. To address those burdens, we build tools, we coordinate, and we explore new solutions. But one thing that is very difficult to increase is <em>labor</em>. The existing maintainer community is slow to grow—after all, it takes a very specific kind of person to do (and enjoy doing) the work we do. And there are only 24 hours in a day—substantially fewer for many of us as we move from being students to having families, industry careers, etc.</p>
<p>So, here we are, at a time where we risk being immobilized by our own success: as our libraries grow and are adopted by more and more users, we are unable to add new features because we are so overburdened by maintenance requirements. And now we are presented with a tool that cannot handle the sophisticated thinking and problem solving required to architect libraries and implement novel algorithms, but <em>that is</em> useful for solving common maintenance chores. Then one has to wonder: is this perhaps an opportunity for us to see our maintenance burden lightened, so we can get back to the craft we love—i.e., producing hand-crafted APIs and novel implementations of algorithms that give researchers across the world access to cutting edge methods?</p>
<h2 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h2>
<p>When it comes to a disruptive technology that has the potential to rapidly reshape the ecosystem that we have built with so much care and attention to detail, how we engage with it demands careful consideration.
From what I&rsquo;ve seen from the current generation of LLMs, they&rsquo;re not ready to provide us with best-in-class solutions, but they may already be useful in reducing some of the tedium involved in contributing.
Will utilizing AI, due to the risks outlined above, unravel the very tapestry of collaboration that holds our ecosystem together, or can it be harnessed to restore developer bandwidth and preserve the &ldquo;explosive effect of humans working well together&rdquo;?
It is worth exploring how to adjust to and incorporate these changes, and how to best let our different &ldquo;coding philosophies&rdquo; and tool choices co-exist—all while preserving the incredible benefit that our open-source scientific software ecosystem provides.</p>
<p><em>Please let us know your thoughts in the comments below. This post is part of an effort to come up with a community approach to AI and AI-generated contributions.</em></p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<ul>
<li><a href="https://docs.fedoraproject.org/en-US/council/policy/ai-contribution-policy/">Fedora AI contribution policy</a></li>
<li><a href="https://github.com/scikit-image/scikit-image/pull/7982">scikit-image AI contribution guidelines</a></li>
<li><a href="https://wiki.gentoo.org/wiki/Project:Council/AI_policy">Gentoo AI policy</a></li>
<li><a href="https://lkml.org/lkml/2025/11/5/1802">Linux Kernel: [PATCH] [v3] Documentation: Provide guidelines for tool-generated content</a></li>
<li><a href="https://devguide.python.org/getting-started/generative-ai/#acceptable-uses">Python Developer&rsquo;s Guide: Generative AI</a></li>
</ul>
<h2 id="credit">Credit<a class="headerlink" href="#credit" title="Link to this heading">#</a></h2>
<p>I would like to thank <em>Matthew Brett, Henry Schreiner, Dan McCloy</em> for feedback on early drafts, and <em>Chris Holdgraf, Angus Hollands, Brian Hawthorne</em> for conversations on the topic. This post does not necessarily reflect their views.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://x.com/karpathy/status/2015883857489522876?s=20">https://x.com/karpathy/status/2015883857489522876?s=20</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p><a href="https://x.com/karpathy/status/2015883857489522876">https://x.com/karpathy/status/2015883857489522876</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://secondthoughts.ai/p/ai-coding-slowdown">https://secondthoughts.ai/p/ai-coding-slowdown</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p><a href="https://mikelovesrobots.substack.com/p/wheres-the-shovelware-why-ai-coding">https://mikelovesrobots.substack.com/p/wheres-the-shovelware-why-ai-coding</a>&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p><a href="https://lore.kernel.org/ksummit/20251114183528.1239900-1-dave.hansen@linux.intel.com/">https://lore.kernel.org/ksummit/20251114183528.1239900-1-dave.hansen@linux.intel.com/</a>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="Scientific-Python" />
                             
                                <category scheme="taxonomy:Tags" term="ai" label="AI" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[A Year of Typing: My NumPy Fellowship Retrospective]]></title>
            <link href="https://blog.scientific-python.org/numpy/fellowship-program-2025-retrospective/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/numpy/fellowship-program-2025/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy&#39;s Second Developer in Residence: Joren Hammudoglu" />
                <link href="https://blog.scientific-python.org/numpy/fellowship-program/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy&#39;s first Developer in Residence: Sayed Adel" />
                <link href="https://blog.scientific-python.org/numpy/numpy2/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy 2.0: an evolutionary milestone" />
                <link href="https://blog.scientific-python.org/numpy/numpy-rng/?utm_source=atom_feed" rel="related" type="text/html" title="Best Practices for Using NumPy&#39;s Random Number Generators" />
                <link href="https://blog.scientific-python.org/numpy/mukulikapahari/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy Contributor Spotlight: Mukulika Pahari" />
            
                <id>https://blog.scientific-python.org/numpy/fellowship-program-2025-retrospective/</id>
            
            
            <published>2026-01-08T00:00:00+00:00</published>
            <updated>2026-01-08T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A Year of Typing: My NumPy Fellowship Retrospective</blockquote><p>It’s been exactly one year since I started my journey as a NumPy Fellow, and looking back, it has honestly been the best job I&rsquo;ve ever had. My main goal for 2025 was to push the boundaries of static typing within the Scientific Python ecosystem. I&rsquo;m happy to report that we didn&rsquo;t just push the boundaries; we reshaped them.</p>
<p>Here is a high-level look at what we achieved, from making <code>numpy</code> fully type-checked to bridging the gap between scientific computing and the wider Python typing community.</p>
<h2 id="numpy-is-now-fully-type-checked">NumPy is Now Fully Type-Checked<a class="headerlink" href="#numpy-is-now-fully-type-checked" title="Link to this heading">#</a></h2>
<p>One of the biggest wins this year is that NumPy is now fully type-checked. When I started, there were significant gaps between the runtime behavior and the typing stubs.</p>
<p>For example, I spent a lot of time integrating <code>stubtest</code> — a <a href="https://github.com/python/mypy">mypy</a> tool that checks if stubs match the runtime — into the CI pipeline. For this I had to fix thousands (yes, thousands) of errors in the stubs. Now typing correctness is enforced by running <code>stubtest</code> and <code>mypy</code> on the stubs in CI, ensuring that technical debt doesn&rsquo;t creep back in.</p>
<p>Crucially, NumPy is now largely compatible with the official <a href="https://typing.python.org/en/latest/">Python typing specification</a>. I helped drop support for Python 3.11 to update our stubs to use the modern PEP 695 syntax, making the codebase cleaner and more future-proof.</p>
<p>Besides static typing, since NumPy 2.4.0, all function and class signatures can also be inspected at runtime using <code>inspect.signature</code>. This is a game-changer for runtime type-checkers like <a href="https://github.com/beartype/beartype">beartype</a> and <a href="https://github.com/agronholm/typeguard">typeguard</a>.</p>
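<p>A quick way to try this out is sketched below. The <code>describe</code> helper is hypothetical, and which callables report a signature depends on the installed NumPy version, so the helper falls back to <code>None</code> rather than assuming success.</p>

```python
import inspect

import numpy as np


def describe(func):
    """Return the signature as a string, or None if it cannot be introspected."""
    try:
        return str(inspect.signature(func))
    except (ValueError, TypeError):
        # Older NumPy versions expose no signature for some C-level callables.
        return None


# A Python-level function, a ufunc, and a C-level function.
for func in (np.mean, np.add, np.array):
    print(func.__name__, describe(func))
```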
<p>Over the past year I&rsquo;ve made <a href="https://github.com/numpy/numpy/pulls?q=is%3Apr&#43;author%3Ajorenham&#43;created%3A2025-01-01..2025-12-31&#43;is%3Amerged">hundreds of contributions</a> to NumPy, so there&rsquo;s a good chance I&rsquo;m forgetting some important achievements.</p>
<h2 id="the-shape-typing-frontier-numtype">The Shape-Typing Frontier: NumType<a class="headerlink" href="#the-shape-typing-frontier-numtype" title="Link to this heading">#</a></h2>
<p>A massive part of my fellowship was dedicated to the &ldquo;holy grail&rdquo; of array typing: shape-typing. For this I had to rely on type-checker behavior that isn&rsquo;t well-specified, and therefore subject to change. Using these typing acrobatics in NumPy would be too risky, so we decided to create a new project for this, called <a href="https://github.com/numpy/numtype">NumType</a>.</p>
<p>When NumType is installed, your static type-checker will use its <code>.pyi</code> stubs instead of those bundled with NumPy. There are three main advantages to this:</p>
<ol>
<li>Improved ufunc annotations.</li>
<li>Full support of the <a href="https://numpy.org/neps/nep-0050-scalar-promotion.html">NEP 50</a> promotion rules for all scalars, exhaustively verified to be 100% accurate.</li>
<li>Experimental shape-typing with automatic static broadcasting types.</li>
</ol>
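<p>To give a feel for the kind of rule that has to be encoded in the stubs, here is a minimal runtime sketch of one NEP 50 behavior: a Python <code>int</code> or <code>float</code> is &ldquo;weakly&rdquo; typed and does not upcast the NumPy operand. The <code>promoted_dtype</code> helper is made up for this example. It uses arrays for simplicity; NEP 50 applies the same rule to NumPy scalars.</p>

```python
import numpy as np


def promoted_dtype(array, python_scalar):
    """Return the dtype NumPy picks when adding a Python scalar to an array."""
    return (array + python_scalar).dtype


small_ints = np.array([1, 2, 3], dtype=np.uint8)
single_floats = np.array([0.5, 1.5], dtype=np.float32)

# The Python scalar adapts to the array, so the array's dtype is preserved.
print(promoted_dtype(small_ints, 2))
print(promoted_dtype(single_floats, 3.0))
```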
<p>The &ldquo;magic&rdquo; types that enable the dtype promotion and shape-type broadcasting are currently only accessible from the private type-check-only <code>_numtype</code> API, but the plan is to eventually make these part of the public <code>numtype</code> API.</p>
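<p>As a rough idea of what shape-typed annotations look like, the sketch below uses only NumPy&rsquo;s own generic <code>ndarray</code> and none of the private <code>_numtype</code> helpers. The <code>Vector</code> and <code>Matrix</code> aliases are invented for this example, and how much of it a type-checker can actually verify depends on the stubs in use.</p>

```python
import numpy as np

# The first type argument of ndarray is the shape, the second is the dtype.
Vector = np.ndarray[tuple[int], np.dtype[np.float64]]
Matrix = np.ndarray[tuple[int, int], np.dtype[np.float64]]


def outer(u: Vector, v: Vector) -> Matrix:
    # The annotations state that two 1-D inputs produce a 2-D result.
    return np.outer(u, v)


result = outer(np.ones(2), np.ones(3))
print(result.shape)
```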
<p>But before you drop everything to install NumType, note that it&rsquo;s currently in alpha, so there&rsquo;s no backwards-compatibility guarantee. However, if you <em>do</em> decide to use it and encounter an issue, be sure to complain about it in high definition at <a href="https://github.com/numpy/numtype/issues">https://github.com/numpy/numtype/issues</a> :)</p>
<h2 id="strengthening-the-ecosystem-scipy-stubs-and-beyond">Strengthening the Ecosystem: <code>scipy-stubs</code> and Beyond<a class="headerlink" href="#strengthening-the-ecosystem-scipy-stubs-and-beyond" title="Link to this heading">#</a></h2>
<p>Typing NumPy is useless if the libraries built <em>on top</em> of it aren&rsquo;t typed. A significant portion of my time went into <code>scipy-stubs</code>.</p>
<ul>
<li>We transferred ownership of <a href="https://github.com/scipy/scipy-stubs/"><code>scipy-stubs</code></a> — which started as <code>jorenham/scipy-stubs</code> — to the official SciPy organization.</li>
<li><code>scipy-stubs</code> now covers the full SciPy API, and only uses <code>Any</code> when absolutely necessary.</li>
<li>It has grown massively: it now contains over 72,000 lines of code (according to <a href="https://github.com/boyter/scc"><code>scc</code></a>), making it the largest hand-written stubs-only Python package, even if you include <code>typeshed</code>&rsquo;s standard library stubs (which currently count 69,439 lines of code).</li>
<li>We added runtime support for the generic types, including the sparse arrays, probability distributions, and interpolation classes.</li>
<li>I helped large libraries such as <a href="https://github.com/pandas-dev/pandas"><code>pandas</code></a>, <a href="https://github.com/jax-ml/jax"><code>jax</code></a>, <a href="https://github.com/colour-science/colour"><code>colour</code></a>, and <a href="https://github.com/apache/spark"><code>pyspark</code></a> adopt <code>scipy-stubs</code>.</li>
</ul>
<p>I also made sure to spread the love to other corners of the ecosystem by adding typing support to <a href="https://github.com/numpy/numpy-financial"><code>numpy-financial</code></a>, <a href="https://github.com/wolph/numpy-stl"><code>numpy-stl</code></a>, <a href="https://pypi.org/project/numpy-quaddtype/"><code>numpy-quaddtype</code></a>, <a href="https://github.com/pydata/numexpr"><code>numexpr</code></a>, and <a href="https://github.com/joblib/threadpoolctl"><code>threadpoolctl</code></a>.</p>
<h2 id="bridging-communities">Bridging Communities<a class="headerlink" href="#bridging-communities" title="Link to this heading">#</a></h2>
<p>Perhaps the achievement I&rsquo;m most proud of is the collaboration with the type-checker maintainers. Scientific Python has complex needs that often stretch the limits of Python&rsquo;s type system.</p>
<p>Throughout the year, I discovered and investigated bugs in all five major type-checkers: <a href="https://github.com/python/mypy"><code>mypy</code></a>, <a href="https://github.com/microsoft/pyright"><code>pyright</code></a>, <a href="https://github.com/DetachHead/basedpyright"><code>basedpyright</code></a>, <a href="https://github.com/facebook/pyrefly"><code>pyrefly</code></a>, and <a href="https://github.com/astral-sh/ty"><code>ty</code></a>. <a href="https://github.com/python/mypy/pulls?q=is%3Apr&#43;author%3Ajorenham&#43;created%3A2025-01-01..2025-12-31">Some</a> of the <code>mypy</code> bugs I even managed to fix myself. We fixed critical bugs affecting NumPy users and improved analysis times.</p>
<p>I feel that this work has brought the Scientific Python community much closer to the Python Typing community. I&rsquo;m incredibly grateful to the maintainers of these tools for their responsiveness and willingness to collaborate.</p>
<h2 id="wrapping-up">Wrapping Up<a class="headerlink" href="#wrapping-up" title="Link to this heading">#</a></h2>
<p>This fellowship has been an absolute privilege, and I feel like I&rsquo;ve made the most out of it. If you want to dive into the nitty-gritty details, you can find all of my activity on my GitHub profile (<a href="https://github.com/jorenham"><code>@jorenham</code></a>), but for now, I&rsquo;m just happy to have made those squiggly lines a bit more meaningful.</p>
<p>Thanks to everyone who made this possible, and type safe!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="numpy" label="numpy" />
                             
                                <category scheme="taxonomy:Tags" term="developer-in-residence" label="developer-in-residence" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Hacking Shortest Paths: Solve Harder Problems by Tweaking Graphs]]></title>
            <link href="https://blog.scientific-python.org/networkx/hacking-shortest-paths/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/outreachy2023/internship/?utm_source=atom_feed" rel="related" type="text/html" title="Outreachy Part II: Internship Guide " />
                <link href="https://blog.scientific-python.org/networkx/outreachy2023/contribution-phase/?utm_source=atom_feed" rel="related" type="text/html" title="Outreachy Part I: My experience as a first-time contributor in Open-Source" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/graph-iso-vf2pp/?utm_source=atom_feed" rel="related" type="text/html" title="The VF2&#43;&#43; algorithm" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/iso-feasibility-candidates/?utm_source=atom_feed" rel="related" type="text/html" title="ISO Feasibility &amp; Candidates" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/node-ordering-ti-updating/?utm_source=atom_feed" rel="related" type="text/html" title="Updates on VF2&#43;&#43;" />
            
                <id>https://blog.scientific-python.org/networkx/hacking-shortest-paths/</id>
            
            
            <published>2025-12-04T00:00:00+00:00</published>
            <updated>2025-12-04T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Modeling Constraints in Shortest Path Problems using NetworkX and Graph Transformations</blockquote><p>Many <strong>real-world shortest path problems</strong> include constraints that classic algorithms don’t directly handle. <a href="https://networkx.org/">NetworkX</a> provides robust, optimized implementations of <a href="https://networkx.org/documentation/latest/reference/algorithms/shortest_paths.html">algorithms</a> like Dijkstra’s, Bellman-Ford, and A*. But what if your problem doesn’t fit the classic shortest path formulation?</p>
<p>Instead of designing a new algorithm from scratch, a powerful approach is to <strong>transform your problem into a standard shortest path query by modifying the input graph</strong>. This lets you <strong>leverage existing, well-tested tools</strong>.</p>
<p>In this post, I’ll present a few common “tricks” to encode more complex shortest-path-like problems using simple graph modifications. Each trick includes a real-world example and can be implemented with just a few lines of NetworkX code.</p>
<h3 id="trick-1-multiple-targets-with-a-sentinel-node">Trick 1: Multiple Targets with a &ldquo;Sentinel&rdquo; Node<a class="headerlink" href="#trick-1-multiple-targets-with-a-sentinel-node" title="Link to this heading">#</a></h3>
<p><strong>Problem</strong>: Find the shortest path from a source node to the closest of several target nodes:</p>
<p>$$
\min_{\substack{p \in \mathcal{P}(s, t) \\ t \in \text{targets}}} \mathrm{length}(p)
$$</p>
<p><strong>Classic scenario</strong>: You&rsquo;re part of an emergency response team trying to reach the closest hospital. You know the locations of several hospitals across the city, and you want to get to the nearest one as quickly as possible.</p>
<p><strong>Solution</strong>: Add a new sentinel (or sink) node and connect it to each target node. Then, compute the shortest path from the source to the sentinel. The path will pass through the closest original target, which appears as its second-to-last node.</p>
<h4 id="example">Example:<a class="headerlink" href="#example" title="Link to this heading">#</a></h4>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Create your city map (graph)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># This Erdős-Rényi model generates random graphs for demonstration.</span>
</span></span><span class="line"><span class="cl"><span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">erdos_renyi_graph</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="mi">15</span><span class="p">,</span> <span class="n">p</span><span class="o">=</span><span class="mf">0.2</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="mi">111</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ambulance_location</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="n">hospitals</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">11</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Add sentinel node</span>
</span></span><span class="line"><span class="cl"><span class="n">G</span><span class="o">.</span><span class="n">add_node</span><span class="p">(</span><span class="s2">&#34;sentinel&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">h</span> <span class="ow">in</span> <span class="n">hospitals</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">G</span><span class="o">.</span><span class="n">add_edge</span><span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="s2">&#34;sentinel&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Find shortest path from ambulance to the closest hospital (via sentinel)</span>
</span></span><span class="line"><span class="cl"><span class="n">path</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">shortest_path</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">source</span><span class="o">=</span><span class="n">ambulance_location</span><span class="p">,</span> <span class="n">target</span><span class="o">=</span><span class="s2">&#34;sentinel&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="s2">&#34;Route to closest hospital:&#34;</span><span class="p">,</span> <span class="n">path</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span></span></span></code></pre>
</div>
<p><strong>Output:</strong></p>

<div class="highlight">
  <pre>Route to closest hospital: [0, 8, 3, 11]</pre>
</div>

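<p>By adding just a single sentinel node and one extra edge per target, you can turn a multi-target search into a single-target shortest path problem. In a weighted graph, give these edges zero weight so they do not distort path lengths. In the unweighted example above, each sentinel edge adds the same single hop to every candidate route, so the ranking of the targets is unchanged.</p>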
<p><img src="/networkx/hacking-shortest-paths/sentinel_trick.png" alt="Path to nearest hospital."></p>
<p>While it’s possible to modify Dijkstra’s algorithm to support multiple targets directly—by stopping as soon as one is reached—doing so requires reimplementing logic already handled efficiently by libraries like NetworkX. In contrast, the sentinel trick keeps your code simple and makes use of preexisting tools.</p>
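<p>The example above uses an unweighted graph. As a minimal sketch of the weighted case, assume a hypothetical road map whose <code>weight</code> attribute holds travel time in minutes (the node names and numbers are made up for illustration). The sentinel edges then get an explicit weight of zero:</p>

```python
import networkx as nx

# Hypothetical weighted road map: edge weights are travel times in minutes.
G = nx.Graph()
G.add_weighted_edges_from(
    [
        ("depot", "h1", 10),  # one hop, but slow
        ("depot", "a", 1),
        ("a", "h2", 2),  # two hops, but fast
    ]
)
hospitals = ["h1", "h2"]

# Zero-weight edges keep the sentinel from changing any path length.
for h in hospitals:
    G.add_edge(h, "sentinel", weight=0)

path = nx.shortest_path(G, source="depot", target="sentinel", weight="weight")
route = path[:-1]  # drop the sentinel itself
minutes = nx.path_weight(G, path, weight="weight")
print(route, minutes)
```

<p>Here the hospital with the fewest hops is not the closest one in travel time, which is why the <code>weight</code> argument and the zero-weight sentinel edges matter.</p>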
<h3 id="trick-2-forced-detours--pass-through-a-set-of-nodes">Trick 2: Forced Detours — Pass Through a Set of Nodes<a class="headerlink" href="#trick-2-forced-detours--pass-through-a-set-of-nodes" title="Link to this heading">#</a></h3>
<p><strong>Problem</strong>: You want the shortest path from a source to a target, but the path must pass through at least one node from a specific set $C$ (checkpoints):</p>
<p>$$
\min_{\substack{p \in \mathcal{P}(s, t) \\ p \cap C \ne \emptyset}} \mathrm{length}(p)
$$</p>
<p><strong>Scenario</strong>: You&rsquo;re heading home but need to stop at any one of several nearby convenience stores to pick up a specific item. You want the shortest overall route home that includes at least one store stop.</p>
<p><strong>Solution</strong>: Create <strong>two versions of each original node</strong>: one for the <strong>before checkpoint</strong> state and one for <strong>after checkpoint</strong>. Edges from the original graph are duplicated within each state. Next, add the state transition edges: connect the pre-checkpoint version of each checkpoint node to its post-checkpoint version with an edge.</p>
<p>This effectively models the constraint into the graph’s structure. The solution to the problem is equivalent to finding the shortest path from source (&ldquo;before checkpoint&rdquo;) to target (&ldquo;after checkpoint&rdquo;).</p>
<h4 id="example-1">Example:<a class="headerlink" href="#example-1" title="Link to this heading">#</a></h4>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">groupby</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Create your city map (graph)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># This Erdős-Rényi model generates random graphs for demonstration.</span>
</span></span><span class="line"><span class="cl"><span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">erdos_renyi_graph</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="mi">15</span><span class="p">,</span> <span class="n">p</span><span class="o">=</span><span class="mf">0.2</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="mi">111</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">checkpoints</span> <span class="o">=</span> <span class="p">[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">10</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="n">source</span><span class="p">,</span> <span class="n">target</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">14</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">G2</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">Graph</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># For each edge in the original graph G,</span>
</span></span><span class="line"><span class="cl"><span class="c1"># add corresponding edges in the expanded graph G2 for both states:</span>
</span></span><span class="line"><span class="cl"><span class="c1"># - (node, False): before visiting a checkpoint</span>
</span></span><span class="line"><span class="cl"><span class="c1"># - (node, True): after visiting a checkpoint</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">edges</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">G2</span><span class="o">.</span><span class="n">add_edge</span><span class="p">((</span><span class="n">u</span><span class="p">,</span> <span class="kc">False</span><span class="p">),</span> <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="kc">False</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">G2</span><span class="o">.</span><span class="n">add_edge</span><span class="p">((</span><span class="n">u</span><span class="p">,</span> <span class="kc">True</span><span class="p">),</span> <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="kc">True</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Checkpoints allow you to transition from &#34;before checkpoint&#34; to &#34;after checkpoint&#34;</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">checkpoint</span> <span class="ow">in</span> <span class="n">checkpoints</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">G2</span><span class="o">.</span><span class="n">add_edge</span><span class="p">((</span><span class="n">checkpoint</span><span class="p">,</span> <span class="kc">False</span><span class="p">),</span> <span class="p">(</span><span class="n">checkpoint</span><span class="p">,</span> <span class="kc">True</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Start from (source, False), end at (target, True)</span>
</span></span><span class="line"><span class="cl"><span class="n">path_with_state</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">shortest_path</span><span class="p">(</span><span class="n">G2</span><span class="p">,</span> <span class="n">source</span><span class="o">=</span><span class="p">(</span><span class="n">source</span><span class="p">,</span> <span class="kc">False</span><span class="p">),</span> <span class="n">target</span><span class="o">=</span><span class="p">(</span><span class="n">target</span><span class="p">,</span> <span class="kc">True</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># We remove the state and use `groupby` to drop duplicated consecutive locations</span>
</span></span><span class="line"><span class="cl"><span class="c1"># because the path in the stateful graph includes the checkpoint location twice</span>
</span></span><span class="line"><span class="cl"><span class="c1"># with different states (e.g., before and after visiting a checkpoint).</span>
</span></span><span class="line"><span class="cl"><span class="c1"># This collapses consecutive duplicates to produce a cleaner path over original nodes.</span>
</span></span><span class="line"><span class="cl"><span class="n">path</span> <span class="o">=</span> <span class="p">[</span><span class="n">node</span> <span class="k">for</span> <span class="n">node</span><span class="p">,</span> <span class="n">_</span> <span class="ow">in</span> <span class="n">groupby</span><span class="p">(</span><span class="n">path_with_state</span><span class="p">,</span> <span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">])]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="s2">&#34;Shortest Path through checkpoint:&#34;</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span></span></span></code></pre>
</div>
<p><strong>Output:</strong></p>

<div class="highlight">
  <pre>Shortest Path through checkpoint: [0, 8, 13, 5, 13, 3, 11, 14]</pre>
</div>

<p>The shortest path from node $0$ to node $14$ that passes through at least one checkpoint ($5$ or $10$) is $0, 8, 13, 5, 13, 3, 11, 14$. In the graph below, blue nodes represent the states before visiting a checkpoint, while green nodes represent the states after visiting a checkpoint. Checkpoint nodes are colored red and are the only ones that allow transitions from &ldquo;before checkpoint&rdquo; to &ldquo;after checkpoint&rdquo; states.</p>
<p><img src="/networkx/hacking-shortest-paths/checkpoint_trick.png" alt="Shortest Path through checkpoint."></p>
<p>The path begins in the &ldquo;before checkpoint&rdquo; state, moving from node $0$ to $5$ via $8$ and $13$. Upon reaching checkpoint $5$, the state transitions to &ldquo;after checkpoint&rdquo;, allowing the rest of the path to continue from $5$ through $13$, $3$, $11$, and finally $14$.</p>
<p><strong>Note:</strong> the resulting path is not loop-free: it passes through $13$ twice. This is allowed in the state-expanded graph because the location is visited in two different contexts, once before and once after satisfying the checkpoint constraint.</p>
<h3 id="generalization-modeling-arbitrary-state">Generalization: Modeling Arbitrary State<a class="headerlink" href="#generalization-modeling-arbitrary-state" title="Link to this heading">#</a></h3>
<p>The previous approach introduces the idea of stateful graphs, where <strong>each node encodes not just a location but a context—like &ldquo;before or after checkpoint&rdquo;</strong>. And once you start thinking in terms of state, a whole new world of graph transformations opens up.</p>
<p>You can extend this framework even further. The same idea applies to more complex types of state; it&rsquo;s just a matter of encoding the context. For example, you could model:</p>
<ul>
<li><strong>Multiple types of checkpoints</strong>, e.g., before getting home you need to stop at a grocery store, a pharmacy, and a coffee shop.</li>
<li><strong>Weight-changing nodes</strong>, e.g., you can stop by the mechanic and upgrade your car so it becomes 2x faster.</li>
<li><strong>A budget of fuel or money</strong>, e.g., each edge consumes a given amount of fuel, and you can stop at service stations to refuel.</li>
</ul>
<p>All of these can be modeled by attaching a state to each node and adjusting the graph accordingly. This turns the constrained problem into a plain shortest path over an expanded state-space graph, while <strong>the underlying algorithm (e.g., Dijkstra’s) remains untouched</strong>.</p>
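<p>The first item in the list above (several checkpoint types) could be sketched as follows. This is one possible encoding, not the only one: the state is the set of errands already done, stored as a <code>frozenset</code>, and the small map and shop locations are hypothetical.</p>

```python
from itertools import combinations, groupby

import networkx as nx

# Hypothetical map: a path 0-1-2-3-4 plus a shortcut 0-4.
G = nx.path_graph(5)
G.add_edge(0, 4)

# Hypothetical errands: which kind of shop sits at which node.
shops = {1: "grocery", 3: "pharmacy"}
kinds = sorted(set(shops.values()))

# Every subset of errand kinds is one layer of the state graph.
layers = [frozenset(c) for r in range(len(kinds) + 1) for c in combinations(kinds, r)]

S = nx.Graph()
for done in layers:
    # Moving along a road keeps the set of finished errands unchanged.
    for u, v in G.edges():
        S.add_edge((u, done), (v, done))
    # Visiting a shop adds its kind to the set of finished errands.
    for node, kind in shops.items():
        if kind not in done:
            S.add_edge((node, done), (node, done | {kind}))

start = (0, frozenset())
goal = (4, frozenset(kinds))
path_with_state = nx.shortest_path(S, source=start, target=goal)

# Strip the state and collapse the repeated shop locations.
path = [node for node, _ in groupby(path_with_state, key=lambda x: x[0])]
print(path)
```

<p>Without the errands the shortest route would be the direct edge from $0$ to $4$. With them, the state graph forces the detour through both shops, at the cost of one layer per subset of errand kinds.</p>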
<p>For example, to model a budget of $5$ coins where each edge has a cost, you can construct a stateful graph in which each node is labeled with its location and remaining budget. So a node $A$ with $5$ coins becomes $(A, 5)$, and an edge of cost $1$ from $A$ to $B$ corresponds to a transition from $(A, 5)$ to $(B, 4)$. Edges are only added if the cost does not exceed the current budget, ensuring that invalid paths are automatically excluded.</p>
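<p>A minimal sketch of this budget construction, under stated assumptions: the toll map, the attribute names <code>time</code> and <code>toll</code>, and the <code>"sink"</code> node are all invented for illustration. The sink reuses the sentinel trick, because arriving at the destination with any number of coins left is acceptable.</p>

```python
import networkx as nx

# Hypothetical toll map: each edge has a travel time and a toll in coins.
G = nx.Graph()
G.add_edge("A", "B", time=1, toll=4)
G.add_edge("B", "D", time=1, toll=4)
G.add_edge("A", "C", time=3, toll=1)
G.add_edge("C", "D", time=3, toll=2)

budget = 5

# State graph: (location, coins left). Directed, since coins only decrease.
S = nx.DiGraph()
for u, v, data in G.edges(data=True):
    for coins in range(data["toll"], budget + 1):
        # The edge exists only for states that can afford the toll.
        for a, b in ((u, v), (v, u)):
            S.add_edge((a, coins), (b, coins - data["toll"]), time=data["time"])

# A zero-time sink collects every "arrived at D" state (the sentinel trick).
for coins in range(budget + 1):
    if ("D", coins) in S:
        S.add_edge(("D", coins), "sink", time=0)

path = nx.shortest_path(S, source=("A", budget), target="sink", weight="time")
route = [location for location, _ in path[:-1]]
print(route, path[-2][1])
```

<p>The fastest road (via <code>B</code>) costs 8 coins and therefore never appears in the state graph, so the search returns the slower but affordable route via <code>C</code>.</p>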
<h4 id="limitations">Limitations<a class="headerlink" href="#limitations" title="Link to this heading">#</a></h4>
<p>One key limitation of this approach is that <strong>modifying the input graph to encode additional constraints can lead to a significant increase in its size</strong>. For example, when enforcing multiple checkpoint requirements by duplicating nodes, the graph can grow <strong>exponentially</strong> in the number of constraints.</p>
<p>This size explosion impacts <strong>both memory and runtime performance</strong>. Even if the underlying shortest path algorithm is efficient (e.g., Dijkstra&rsquo;s or A*), it now operates on a much larger graph, potentially becoming impractical for large scenarios.</p>
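<p>To get a feel for the growth, here is a rough back-of-the-envelope estimate. It assumes one layer per subset of checkpoint kinds, ignores the transition edges between layers, and uses made-up graph sizes; the function name is hypothetical.</p>

```python
def state_graph_size(n_nodes, n_edges, n_kinds):
    """Size of a state graph with one layer per subset of checkpoint kinds."""
    layers = 2**n_kinds
    # Transition edges between layers are not counted here.
    return n_nodes * layers, n_edges * layers


for k in range(5):
    nodes, edges = state_graph_size(n_nodes=1000, n_edges=3000, n_kinds=k)
    print(k, nodes, edges)
```

<p>Each additional checkpoint kind doubles both counts, so even a handful of constraints multiplies the work of the shortest path search.</p>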
<p><strong>Pruning strategies</strong> may mitigate this to some extent, but there is a <strong>trade-off between how accurately we can express constraints and the size of the transformed graph</strong>.</p>
<p>For instance, in our budget example, modeling every possible budget value can make the stateful graph <strong>explode</strong>. One possible pruning strategy is to consider only budgets that are multiples of ten, which reduces the size of the graph by <strong>sacrificing accuracy</strong> in the final result (the solution becomes an approximation).</p>
<h4 id="wrapping-up-transform-the-input-reuse-the-algorithm">Wrapping Up: Transform the Input, Reuse the Algorithm<a class="headerlink" href="#wrapping-up-transform-the-input-reuse-the-algorithm" title="Link to this heading">#</a></h4>
<p><strong>Transforming your input data to encode constraints for problem variations is a powerful technique that extends far beyond shortest path problems.</strong> By modifying the input structure you can solve more complex challenges while reusing well-established, efficient algorithms.</p>
<p>This approach simplifies development, and can lead to cleaner, more maintainable code.</p>
<p>So next time you encounter a “non-standard” problem, ask yourself: <strong>can I modify the input to fit a classic algorithm instead of creating a new one from scratch?</strong> This mindset opens up a broad toolbox of solutions across many algorithmic domains.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="shortest-paths" label="shortest-paths" />
                             
                                <category scheme="taxonomy:Tags" term="dijkstra" label="dijkstra" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Pytrees for Scientific Python]]></title>
            <link href="https://blog.scientific-python.org/pytrees/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/pytrees/</id>
            
            
            <published>2025-07-08T00:00:00+00:00</published>
            <updated>2025-07-08T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Introducing PyTrees for Scientific Python. We discuss what PyTrees are, how they&rsquo;re useful in the realm of scientific Python, and how to work <em>efficiently</em> with them.</blockquote><h2 id="manipulating-tree-like-data-using-functional-programming-paradigms">Manipulating Tree-like Data using Functional Programming Paradigms<a class="headerlink" href="#manipulating-tree-like-data-using-functional-programming-paradigms" title="Link to this heading">#</a></h2>
<p>A &ldquo;PyTree&rdquo; is a nested collection of Python containers (e.g. dicts, (named) tuples, lists, &hellip;), where the leaves are of interest.
In the scientific world, such a PyTree could consist of experimental measurements of different properties at different timestamps and measurement settings resulting in a highly complex, nested and not necessarily rectangular data structure.
Such collections can be cumbersome to manipulate <em>efficiently</em>, especially if they are nested to an arbitrary depth.
It often requires complex recursive logic which usually does not generalize to other nested Python containers (PyTrees), e.g. for new measurements.</p>
<p>The core concept of PyTrees is being able to flatten them into a flat collection of leaves and a &ldquo;blueprint&rdquo; of the tree structure, and then being able to unflatten them back into the original PyTree.
This allows for the application of generic transformations.
In this blog post, we use <a href="https://github.com/metaopt/optree/tree/main/optree"><code>optree</code></a> — a standalone PyTree library — that enables these transformations. It focuses on performance, is feature rich, has minimal dependencies, and has been adopted by <a href="https://pytorch.org">PyTorch</a>, <a href="https://keras.io">Keras</a>, and <a href="https://github.com/tensorflow/tensorflow">TensorFlow</a> (through Keras) as a core dependency.
For example, on a PyTree with NumPy arrays as leaves, you can take the square root of each leaf with <code>optree.tree_map(np.sqrt, tree)</code>:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">optree</span> <span class="k">as</span> <span class="nn">pt</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># tuple of a list of a dict with an array as value, and an array</span>
</span></span><span class="line"><span class="cl"><span class="n">tree</span> <span class="o">=</span> <span class="p">([[{</span><span class="s2">&#34;foo&#34;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mf">4.0</span><span class="p">])}],</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mf">9.0</span><span class="p">])],)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># sqrt of each leaf array</span>
</span></span><span class="line"><span class="cl"><span class="n">sqrt_tree</span> <span class="o">=</span> <span class="n">pt</span><span class="o">.</span><span class="n">tree_map</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">,</span> <span class="n">tree</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">sqrt_tree</span><span class="si">=}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># &gt;&gt; sqrt_tree=([[{&#39;foo&#39;: array([2.])}], array([3.])],)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># reductions</span>
</span></span><span class="line"><span class="cl"><span class="n">all_positive</span> <span class="o">=</span> <span class="nb">all</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">all</span><span class="p">(</span><span class="n">x</span> <span class="o">&gt;</span> <span class="mf">0.0</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">pt</span><span class="o">.</span><span class="n">tree_iter</span><span class="p">(</span><span class="n">tree</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">all_positive</span><span class="si">=}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># &gt;&gt; all_positive=True</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">summed</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">pt</span><span class="o">.</span><span class="n">tree_reduce</span><span class="p">(</span><span class="nb">sum</span><span class="p">,</span> <span class="n">tree</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">summed</span><span class="si">=}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># &gt;&gt; summed=np.float64(13.0)</span></span></span></code></pre>
</div>
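<p>The trick here is that such an operation can be implemented in three steps: flatten the tree, apply the function to the flat leaves, and rebuild the original structure. For <code>tree_map</code>, for example:</p>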


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># step 1:</span>
</span></span><span class="line"><span class="cl"><span class="n">leaves</span><span class="p">,</span> <span class="n">treedef</span> <span class="o">=</span> <span class="n">pt</span><span class="o">.</span><span class="n">tree_flatten</span><span class="p">(</span><span class="n">tree</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># step 2:</span>
</span></span><span class="line"><span class="cl"><span class="n">new_leaves</span> <span class="o">=</span> <span class="nb">tuple</span><span class="p">(</span><span class="nb">map</span><span class="p">(</span><span class="n">fun</span><span class="p">,</span> <span class="n">leaves</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># step 3:</span>
</span></span><span class="line"><span class="cl"><span class="n">result_tree</span> <span class="o">=</span> <span class="n">pt</span><span class="o">.</span><span class="n">tree_unflatten</span><span class="p">(</span><span class="n">treedef</span><span class="p">,</span> <span class="n">new_leaves</span><span class="p">)</span></span></span></code></pre>
</div>
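<p>To make the three steps concrete, here is a minimal pure-Python sketch of the same flatten, map, unflatten pattern. It is an illustration only, not how <code>optree</code> is implemented: the helper names <code>flatten</code>, <code>unflatten</code> and <code>simple_tree_map</code> are hypothetical, only <code>dict</code>, <code>list</code> and <code>tuple</code> (including named tuples) are treated as containers, and plain floats stand in for arrays.</p>

```python
import math


def flatten(tree):
    """Return (leaves, rebuild): a flat list of leaves plus a function that
    restores the original structure from an iterator over new leaves."""
    if isinstance(tree, dict):
        keys = sorted(tree)  # fixed key order keeps the leaf order deterministic
        parts = [flatten(tree[key]) for key in keys]

        def rebuild(leaf_iter):
            return {key: part[1](leaf_iter) for key, part in zip(keys, parts)}

    elif isinstance(tree, (list, tuple)):
        parts = [flatten(child) for child in tree]

        def rebuild(leaf_iter):
            children = [part[1](leaf_iter) for part in parts]
            if hasattr(tree, "_fields"):  # named tuples take positional fields
                return type(tree)(*children)
            return type(tree)(children)

    else:
        # anything that is not a known container is a leaf
        return [tree], lambda leaf_iter: next(leaf_iter)

    leaves = [leaf for part in parts for leaf in part[0]]
    return leaves, rebuild


def unflatten(rebuild, leaves):
    """Rebuild the original structure from a flat sequence of leaves."""
    return rebuild(iter(leaves))


def simple_tree_map(fun, tree):
    leaves, rebuild = flatten(tree)  # step 1: flatten
    new_leaves = [fun(leaf) for leaf in leaves]  # step 2: map over the flat leaves
    return unflatten(rebuild, new_leaves)  # step 3: restore the structure


tree = ([[{"foo": 4.0}], 9.0],)
sqrt_tree = simple_tree_map(math.sqrt, tree)
print(sqrt_tree)
# >> ([[{'foo': 2.0}], 3.0],)
```

<p>In this sketch the &ldquo;tree definition&rdquo; is simply a closure that remembers the container types and how many leaves each one consumes. A real library stores the same information in a dedicated object.</p>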
<h3 id="pytree-origins">PyTree Origins<a class="headerlink" href="#pytree-origins" title="Link to this heading">#</a></h3>
<p>Originally, the concept of PyTrees was developed by the <a href="https://docs.jax.dev/en/latest/">JAX</a> project to make nested collections of JAX arrays work transparently at the &ldquo;JIT-boundary&rdquo; (the JAX JIT toolchain does not know about Python containers, only about JAX Arrays).
However, PyTrees were quickly adopted by AI researchers for broader use-cases: semantically grouping layers of weights and biases in a list of named tuples (or dictionaries) is a common pattern in the JAX-AI-world, as shown in the following (pseudo) Python snippet:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">NamedTuple</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">jax</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">jax.numpy</span> <span class="k">as</span> <span class="nn">jnp</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Layer</span><span class="p">(</span><span class="n">NamedTuple</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">W</span><span class="p">:</span> <span class="n">jax</span><span class="o">.</span><span class="n">Array</span>
</span></span><span class="line"><span class="cl">    <span class="n">b</span><span class="p">:</span> <span class="n">jax</span><span class="o">.</span><span class="n">Array</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">layers</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="n">Layer</span><span class="p">(</span><span class="n">W</span><span class="o">=</span><span class="n">jnp</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="o">...</span><span class="p">),</span> <span class="n">b</span><span class="o">=</span><span class="n">jnp</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="o">...</span><span class="p">)),</span>  <span class="c1"># first layer</span>
</span></span><span class="line"><span class="cl">    <span class="n">Layer</span><span class="p">(</span><span class="n">W</span><span class="o">=</span><span class="n">jnp</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="o">...</span><span class="p">),</span> <span class="n">b</span><span class="o">=</span><span class="n">jnp</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="o">...</span><span class="p">)),</span>  <span class="c1"># second layer</span>
</span></span><span class="line"><span class="cl">    <span class="o">...</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nd">@jax.jit</span>
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">neural_network</span><span class="p">(</span><span class="n">layers</span><span class="p">:</span> <span class="nb">list</span><span class="p">[</span><span class="n">Layer</span><span class="p">],</span> <span class="n">x</span><span class="p">:</span> <span class="n">jax</span><span class="o">.</span><span class="n">Array</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">jax</span><span class="o">.</span><span class="n">Array</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">layer</span> <span class="ow">in</span> <span class="n">layers</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">x</span> <span class="o">=</span> <span class="n">jnp</span><span class="o">.</span><span class="n">tanh</span><span class="p">(</span><span class="n">layer</span><span class="o">.</span><span class="n">W</span> <span class="o">@</span> <span class="n">x</span> <span class="o">+</span> <span class="n">layer</span><span class="o">.</span><span class="n">b</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">x</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">prediction</span> <span class="o">=</span> <span class="n">neural_network</span><span class="p">(</span><span class="n">layers</span><span class="o">=</span><span class="n">layers</span><span class="p">,</span> <span class="n">x</span><span class="o">=</span><span class="n">jnp</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="o">...</span><span class="p">))</span></span></span></code></pre>
</div>
<p>Here, <code>layers</code> is a PyTree (a <code>list</code> of <code>Layer</code> instances), and the JIT-compiled <code>neural_network</code> function <em>just works</em> with this data structure as input.
Although you cannot see it happen, the <code>jax.jit</code> decorator automatically flattens <code>layers</code> into a flat iterable of arrays, which the JAX JIT toolchain understands, unlike a Python <code>list</code> of <code>NamedTuple</code>s.</p>
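<p>As a rough picture of what &ldquo;flattening&rdquo; means here, the sketch below flattens a list of <code>Layer</code> named tuples by hand and rebuilds it. It uses only the standard library, with plain floats standing in for JAX arrays, and it is not what <code>jax.jit</code> does internally; the values are made up for illustration.</p>

```python
from typing import NamedTuple


class Layer(NamedTuple):
    W: float  # stand-in for a weight array
    b: float  # stand-in for a bias array


layers = [Layer(W=0.5, b=0.1), Layer(W=-2.0, b=0.3)]

# flatten: walk the list, then each named tuple, collecting the leaves in order
leaves = [value for layer in layers for value in layer]
print(leaves)
# >> [0.5, 0.1, -2.0, 0.3]

# the structure that was stripped away: a list of Layer, each holding n_fields leaves
n_fields = len(Layer._fields)

# unflatten: regroup the flat leaves into Layer instances
rebuilt = [Layer(*leaves[i : i + n_fields]) for i in range(0, len(leaves), n_fields)]
print(rebuilt == layers)
# >> True
```

<p>The flat list carries the numbers, while the container types and field counts carry the structure. Keeping the two apart is what lets a tool that only understands arrays still accept nested Python containers.</p>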
<h3 id="pytrees-in-scientific-python">PyTrees in Scientific Python<a class="headerlink" href="#pytrees-in-scientific-python" title="Link to this heading">#</a></h3>
<p>Wouldn&rsquo;t it be nice to make workflows in the scientific Python ecosystem <em>just work</em> with any PyTree?</p>
<p>Giving semantic meaning to numeric data through PyTrees can be useful for applications outside of AI as well.
Consider the following minimization of the <a href="https://en.wikipedia.org/wiki/Rosenbrock_function">Rosenbrock</a> function:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">scipy.optimize</span> <span class="kn">import</span> <span class="n">minimize</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">rosenbrock</span><span class="p">(</span><span class="n">params</span><span class="p">:</span> <span class="nb">tuple</span><span class="p">[</span><span class="nb">float</span><span class="p">,</span> <span class="nb">float</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="nb">float</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Rosenbrock function. Minimum: f(1, 1) = 0.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    https://en.wikipedia.org/wiki/Rosenbrock_function
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">params</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">+</span> <span class="mi">100</span> <span class="o">*</span> <span class="p">(</span><span class="n">y</span> <span class="o">-</span> <span class="n">x</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">x0</span> <span class="o">=</span> <span class="p">(</span><span class="mf">0.9</span><span class="p">,</span> <span class="mf">1.2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="n">minimize</span><span class="p">(</span><span class="n">rosenbrock</span><span class="p">,</span> <span class="n">x0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">res</span><span class="o">.</span><span class="n">x</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># &gt;&gt; [0.99999569 0.99999137]</span></span></span></code></pre>
</div>
<p>Now, let&rsquo;s consider a minimization that uses a more complex type for the parameters — a NamedTuple that describes our fit parameters:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">optree</span> <span class="k">as</span> <span class="nn">pt</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">NamedTuple</span><span class="p">,</span> <span class="n">Callable</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">scipy.optimize</span> <span class="kn">import</span> <span class="n">minimize</span> <span class="k">as</span> <span class="n">sp_minimize</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Params</span><span class="p">(</span><span class="n">NamedTuple</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">x</span><span class="p">:</span> <span class="nb">float</span>
</span></span><span class="line"><span class="cl">    <span class="n">y</span><span class="p">:</span> <span class="nb">float</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">rosenbrock</span><span class="p">(</span><span class="n">params</span><span class="p">:</span> <span class="n">Params</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">float</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Rosenbrock function. Minimum: f(1, 1) = 0.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    https://en.wikipedia.org/wiki/Rosenbrock_function
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">params</span><span class="o">.</span><span class="n">x</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">+</span> <span class="mi">100</span> <span class="o">*</span> <span class="p">(</span><span class="n">params</span><span class="o">.</span><span class="n">y</span> <span class="o">-</span> <span class="n">params</span><span class="o">.</span><span class="n">x</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">minimize</span><span class="p">(</span><span class="n">fun</span><span class="p">:</span> <span class="n">Callable</span><span class="p">,</span> <span class="n">params</span><span class="p">:</span> <span class="n">Params</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Params</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># flatten and store PyTree definition</span>
</span></span><span class="line"><span class="cl">    <span class="n">flat_params</span><span class="p">,</span> <span class="n">treedef</span> <span class="o">=</span> <span class="n">pt</span><span class="o">.</span><span class="n">tree_flatten</span><span class="p">(</span><span class="n">params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># wrap fun to work with flat_params</span>
</span></span><span class="line"><span class="cl">    <span class="k">def</span> <span class="nf">wrapped_fun</span><span class="p">(</span><span class="n">flat_params</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">params</span> <span class="o">=</span> <span class="n">pt</span><span class="o">.</span><span class="n">tree_unflatten</span><span class="p">(</span><span class="n">treedef</span><span class="p">,</span> <span class="n">flat_params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">fun</span><span class="p">(</span><span class="n">params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># actual minimization</span>
</span></span><span class="line"><span class="cl">    <span class="n">res</span> <span class="o">=</span> <span class="n">sp_minimize</span><span class="p">(</span><span class="n">wrapped_fun</span><span class="p">,</span> <span class="n">flat_params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># re-wrap the bestfit values into Params with stored PyTree definition</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">pt</span><span class="o">.</span><span class="n">tree_unflatten</span><span class="p">(</span><span class="n">treedef</span><span class="p">,</span> <span class="n">res</span><span class="o">.</span><span class="n">x</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># scipy minimize that works with any PyTree</span>
</span></span><span class="line"><span class="cl"><span class="n">x0</span> <span class="o">=</span> <span class="n">Params</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mf">0.9</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="mf">1.2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">bestfit_params</span> <span class="o">=</span> <span class="n">minimize</span><span class="p">(</span><span class="n">rosenbrock</span><span class="p">,</span> <span class="n">x0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">bestfit_params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># &gt;&gt; Params(x=np.float64(0.999995688776513), y=np.float64(0.9999913673387226))</span></span></span></code></pre>
</div>
<p>This new <code>minimize</code> function works with <em>any</em> PyTree of scalar parameters!</p>
<p>Let&rsquo;s now consider a modified and more complex version of the Rosenbrock function that relies on two sets of <code>Params</code> as input — a common pattern for hierarchical models (e.g. a superposition of various probability density functions):</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">rosenbrock_modified</span><span class="p">(</span><span class="n">two_params</span><span class="p">:</span> <span class="nb">tuple</span><span class="p">[</span><span class="n">Params</span><span class="p">,</span> <span class="n">Params</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="nb">float</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Modified Rosenbrock where the x and y parameters are determined by
</span></span></span><span class="line"><span class="cl"><span class="s2">    non-linear transformations of two versions of each, i.e.:
</span></span></span><span class="line"><span class="cl"><span class="s2">      x = arcsin(min(x1, x2) / max(x1, x2))
</span></span></span><span class="line"><span class="cl"><span class="s2">      y = sigmoid(y1 / y2)
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">p1</span><span class="p">,</span> <span class="n">p2</span> <span class="o">=</span> <span class="n">two_params</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># calculate `x` and `y` from two sources:</span>
</span></span><span class="line"><span class="cl">    <span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">asin</span><span class="p">(</span><span class="nb">min</span><span class="p">(</span><span class="n">p1</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">p2</span><span class="o">.</span><span class="n">x</span><span class="p">)</span> <span class="o">/</span> <span class="nb">max</span><span class="p">(</span><span class="n">p1</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">p2</span><span class="o">.</span><span class="n">x</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">y</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">+</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="p">(</span><span class="n">p1</span><span class="o">.</span><span class="n">y</span> <span class="o">/</span> <span class="n">p2</span><span class="o">.</span><span class="n">y</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">+</span> <span class="mi">100</span> <span class="o">*</span> <span class="p">(</span><span class="n">y</span> <span class="o">-</span> <span class="n">x</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">x0</span> <span class="o">=</span> <span class="p">(</span><span class="n">Params</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mf">0.9</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="mf">1.2</span><span class="p">),</span> <span class="n">Params</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mf">0.8</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="mf">1.3</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">bestfit_params</span> <span class="o">=</span> <span class="n">minimize</span><span class="p">(</span><span class="n">rosenbrock_modified</span><span class="p">,</span> <span class="n">x0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">bestfit_params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># &gt;&gt; (</span>
</span></span><span class="line"><span class="cl"><span class="c1">#     Params(x=np.float64(4.686181110201706), y=np.float64(0.05129869722505759)),</span>
</span></span><span class="line"><span class="cl"><span class="c1">#     Params(x=np.float64(3.9432263101976073), y=np.float64(0.005146110126174016)),</span>
</span></span><span class="line"><span class="cl"><span class="c1"># )</span></span></span></code></pre>
</div>
<p>The new <code>minimize</code> still works, because a <code>tuple</code> of <code>Params</code> is just <em>another</em> PyTree!</p>
<h3 id="final-thought">Final Thought<a class="headerlink" href="#final-thought" title="Link to this heading">#</a></h3>
<p>Working with nested data structures doesn’t have to be messy.
PyTrees let you focus on the data and the transformations you want to apply, in a generic manner.
Whether you&rsquo;re building neural networks, optimizing scientific models, or just dealing with complex nested Python containers, PyTrees can make your code cleaner, more flexible, and just nicer to work with.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="pytrees" label="PyTrees" />
                             
                                <category scheme="taxonomy:Tags" term="functional-programming" label="Functional Programming" />
                             
                                <category scheme="taxonomy:Tags" term="tree-like-data-manipulation" label="Tree-like data manipulation" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[PyPalettes: all the colors you'll ever need]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/pypalettes/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/how-to-create-custom-tables/?utm_source=atom_feed" rel="related" type="text/html" title="How to create custom tables" />
                <link href="https://blog.scientific-python.org/matplotlib/unc-biol222/?utm_source=atom_feed" rel="related" type="text/html" title="Art from UNC BIOL222" />
                <link href="https://blog.scientific-python.org/matplotlib/book/?utm_source=atom_feed" rel="related" type="text/html" title="Newly released open access book" />
                <link href="https://blog.scientific-python.org/matplotlib/visualising-usage-using-batteries/?utm_source=atom_feed" rel="related" type="text/html" title="Battery Charts - Visualise usage rates &amp; more" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_final/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Final Report" />
            
                <id>https://blog.scientific-python.org/matplotlib/pypalettes/</id>
            
            
            <published>2025-04-01T00:00:00+00:00</published>
            <updated>2025-04-01T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Matplotlib is the go-to library for data visualization in Python. While it offers quality built-in colormaps like viridis and inferno, the limited selection can make Matplotlib charts look similar. To address this, I developed pypalettes, a Python library with over 2,500 high-quality, pre-made color palettes, based on Paletteer. The library includes a web app for browsing and previewing all of them.</blockquote><h2 id="finding-the-right-color-has-never-been-easier">Finding the right color has never been easier<a class="headerlink" href="#finding-the-right-color-has-never-been-easier" title="Link to this heading">#</a></h2>
<p><a href="https://github.com/JosephBARBIERDARNAL/pypalettes">PyPalettes</a> is a new Python library designed to simplify the use of color palettes in Python charts.</p>
<p>It mainly provides two things:</p>
<ul>
<li>a <a href="https://github.com/JosephBARBIERDARNAL/pypalettes">super-easy-to-use library</a> that requires only 1 line of code (in 99.99% of cases, 2 otherwise 🙃) to access thousands of pre-defined and attractive palettes.</li>
<li>a <a href="https://python-graph-gallery.com/color-palette-finder/">web app</a> to browse, filter, search, and preview all available palettes (with <strong>bonus</strong>: copy-pastable code to reproduce the charts).</li>
</ul>
<p><a href="https://python-graph-gallery.com/color-palette-finder/"><img src="https://github.com/holtzy/The-Python-Graph-Gallery/raw/master/static/asset/pypalettes.gif" alt="Preview and try all the palettes"></a></p>
<center><i>A small sample of the available palettes</i></center>
<br>
<h2 id="from-r-to-python">From R to Python<a class="headerlink" href="#from-r-to-python" title="Link to this heading">#</a></h2>
<p>In R, there are dozens of packages dedicated to colors for data visualization. Then <a href="https://emilhvitfeldt.github.io/paletteer/">Paletteer</a> came out to <strong>aggregate</strong> every color palette from those packages into a single one, meaning you <strong>only need one package</strong> to access almost all the color palettes people have created!</p>
<p>While re-crafting the <a href="https://python-graph-gallery.com/python-colors/">colors section of the Python Graph Gallery</a>, I started thinking of a way to have a similar tool to Paletteer but for Python.</p>
<center><h3 style="color: lightgray;">That's where PyPalettes comes in.</h3></center>
<p>Paletteer has a community-maintained <a href="https://pmassicotte.github.io/paletteer_gallery/">gallery</a>—a single page showcasing all its color palettes, along with their original sources and names. With the author’s approval, I scraped this gallery to compile the data.</p>
<p>While there may have been other ways to obtain this information, using a short Python script to reproduce the dataset ensures both simplicity and reproducibility (the script scrapes a page stored locally instead of the web page). To make <strong>pypalettes</strong> more comprehensive, I also incorporated all <strong>built-in colors</strong> from <code>Matplotlib</code>.</p>
<p>As a result, I created a dataset containing approximately <strong>2,500 unique palettes</strong>, each with a name, a list of hexadecimal colors, and a source.</p>
<p>At this point, the hardest part was already done. I just had to create a simple API to make them usable in a Python environment and add some additional simple features.</p>
<p>And since <a href="https://www.yan-holtz.com/">Yan</a> supported the idea, he created this amazing <a href="https://python-graph-gallery.com/color-palette-finder/">web app</a>, making it much easier to browse available palettes.</p>
<p>As a thank-you to <code>Paletteer</code>, Yan also created a color finder that features only <code>Paletteer</code> palettes! If you use R, <a href="https://r-graph-gallery.com/color-palette-finder">check it out here</a>.</p>
<br>
<h2 id="how-to-use-pypalettes">How to use pypalettes<a class="headerlink" href="#how-to-use-pypalettes" title="Link to this heading">#</a></h2>
<p>The goal was to make the simplest API possible, and I&rsquo;m quite satisfied with the result. For example, suppose you really like the <a href="https://python-graph-gallery.com/color-palette-finder/?palette=Esox_lucius">&ldquo;Esox lucius&rdquo; palette</a> and want to make a chart with it.</p>
<p>First, you import the <code>load_cmap()</code> function (the main function of the library):</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">pypalettes</span> <span class="kn">import</span> <span class="n">load_cmap</span></span></span></code></pre>
</div>
<p>And then you just have to call this function with <code>name=&quot;Esox_lucius&quot;</code>:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">cmap</span> <span class="o">=</span> <span class="n">load_cmap</span><span class="p">(</span><span class="s2">&#34;Esox_lucius&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>The output of <code>load_cmap()</code> is either a <a href="https://matplotlib.org/stable/api/_as_gen/matplotlib.colors.ListedColormap.html">matplotlib.colors.ListedColormap</a> or a <a href="https://matplotlib.org/stable/api/_as_gen/matplotlib.colors.LinearSegmentedColormap.html">matplotlib.colors.LinearSegmentedColormap</a>, depending on the value of the <code>cmap_type</code> argument (default is <code>&quot;discrete&quot;</code>, so it&rsquo;s <code>ListedColormap</code> in this case).</p>
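<p>To make the difference between the two return types concrete, here is a minimal sketch that builds both kinds of colormap directly with Matplotlib. The hex colors are placeholders, not a real PyPalettes palette, and how <code>load_cmap()</code> constructs its colormaps internally is not covered in this post.</p>

```python
from matplotlib.colors import LinearSegmentedColormap, ListedColormap

# Placeholder hex colors for illustration only; not an actual pypalettes palette.
hex_colors = ["#1b9e77", "#d95f02", "#7570b3"]

# Discrete: one entry per color, no interpolation between them.
discrete = ListedColormap(hex_colors, name="demo_discrete")

# Continuous: colors are interpolated linearly into a 256-entry lookup table.
continuous = LinearSegmentedColormap.from_list("demo_continuous", hex_colors, N=256)

print(discrete.N)    # number of entries equals the number of colors
print(continuous.N)  # size of the interpolated lookup table
```

<p>A discrete colormap suits categorical data such as the country names in the map example, while a continuous one is usually the better fit for numeric values.</p>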
<p>Finally, you can create your chart as you normally would:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># load libraries</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">cartopy.crs</span> <span class="k">as</span> <span class="nn">ccrs</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">geopandas</span> <span class="k">as</span> <span class="nn">gpd</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.font_manager</span> <span class="kn">import</span> <span class="n">FontProperties</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">highlight_text</span> <span class="kn">import</span> <span class="n">fig_text</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># load font</span>
</span></span><span class="line"><span class="cl"><span class="n">personal_path</span> <span class="o">=</span> <span class="s2">&#34;/Users/josephbarbier/Library/Fonts/&#34;</span>  <span class="c1"># change this to your own path</span>
</span></span><span class="line"><span class="cl"><span class="n">font</span> <span class="o">=</span> <span class="n">FontProperties</span><span class="p">(</span><span class="n">fname</span><span class="o">=</span><span class="n">personal_path</span> <span class="o">+</span> <span class="s2">&#34;FiraSans-Light.ttf&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">bold_font</span> <span class="o">=</span> <span class="n">FontProperties</span><span class="p">(</span><span class="n">fname</span><span class="o">=</span><span class="n">personal_path</span> <span class="o">+</span> <span class="s2">&#34;FiraSans-Medium.ttf&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># projection</span>
</span></span><span class="line"><span class="cl"><span class="n">proj</span> <span class="o">=</span> <span class="n">ccrs</span><span class="o">.</span><span class="n">Mercator</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># load the world dataset</span>
</span></span><span class="line"><span class="cl"><span class="n">df</span> <span class="o">=</span> <span class="n">gpd</span><span class="o">.</span><span class="n">read_file</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;https://raw.githubusercontent.com/holtzy/The-Python-Graph-Gallery/master/static/data/all_world.geojson&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="o">~</span><span class="n">df</span><span class="p">[</span><span class="s2">&#34;name&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">isin</span><span class="p">([</span><span class="s2">&#34;Antarctica&#34;</span><span class="p">])]</span>
</span></span><span class="line"><span class="cl"><span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">to_crs</span><span class="p">(</span><span class="n">proj</span><span class="o">.</span><span class="n">proj4_init</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">12</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">,</span> <span class="n">subplot_kw</span><span class="o">=</span><span class="p">{</span><span class="s2">&#34;projection&#34;</span><span class="p">:</span> <span class="n">proj</span><span class="p">})</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_axis_off</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">df</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">column</span><span class="o">=</span><span class="s2">&#34;name&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">cmap</span><span class="o">=</span><span class="n">cmap</span><span class="p">,</span>  <span class="c1"># here we pass the colormap loaded before</span>
</span></span><span class="line"><span class="cl">    <span class="n">edgecolor</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">linewidth</span><span class="o">=</span><span class="mf">0.2</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig_text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">x</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">y</span><span class="o">=</span><span class="mf">0.93</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">s</span><span class="o">=</span><span class="s2">&#34;World map with &lt;PyPalettes&gt; colors&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">fontsize</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">font</span><span class="o">=</span><span class="n">font</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">highlight_textprops</span><span class="o">=</span><span class="p">[{</span><span class="s2">&#34;font&#34;</span><span class="p">:</span> <span class="n">bold_font</span><span class="p">}],</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig_text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">x</span><span class="o">=</span><span class="mf">0.85</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="mf">0.14</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="s2">&#34;Joseph Barbier &amp; Yan Holtz&#34;</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">,</span> <span class="n">font</span><span class="o">=</span><span class="n">font</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<center>
<p><img src="/matplotlib/pypalettes/map.png" alt=""></p>
</center>
<p>And once the code is working, you can change the color map name and see straight away what it would look like!</p>
<br>
<h2 id="other-usages">Other usages<a class="headerlink" href="#other-usages" title="Link to this heading">#</a></h2>
<p>PyPalettes is primarily designed for <code>matplotlib</code> due to its <strong>high compatibility</strong> with the <code>cmap</code> argument, but one can imagine <strong>much more</strong>.</p>
<p>For example, the output of <code>load_cmap()</code> includes attributes like <code>colors</code> and <code>rgb</code>, which return lists of hexadecimal colors or RGB values. These can be used in <strong>any context</strong>—from Python visualization libraries like Plotly, Plotnine, and Altair to colorimetry, image processing, or any application that requires color!</p>
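<p>As a rough illustration of reusing a palette outside Matplotlib, the sketch below converts a list of hex colors into RGB tuples and CSS-style strings using only the standard library. The hex values are placeholders standing in for a palette&rsquo;s color list, and <code>hex_to_rgb</code> is a hypothetical helper, not part of PyPalettes.</p>

```python
def hex_to_rgb(hex_color):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of 0-255 integers."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {hex_color!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


# Placeholder values standing in for a palette's list of hex colors.
palette_hex = ["#264653", "#2a9d8f", "#e9c46a"]
palette_rgb = [hex_to_rgb(color) for color in palette_hex]

# CSS-style strings, for libraries that accept "rgb(r, g, b)" colors.
css_colors = [f"rgb({r}, {g}, {b})" for r, g, b in palette_rgb]
print(css_colors)
```

<p>Most plotting libraries accept hex strings directly, so a conversion like this is only needed when a tool expects numeric channels.</p>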
<br>
<h2 id="learn-more">Learn more<a class="headerlink" href="#learn-more" title="Link to this heading">#</a></h2>
<p>The main links to find out more about this project are as follows:</p>
<ul>
<li>the <a href="https://python-graph-gallery.com/color-palette-finder/">web app</a> to browse the palettes</li>
<li>this <a href="https://python-graph-gallery.com/introduction-to-pypalettes/">introduction to PyPalettes</a> for a more in-depth code explanation</li>
<li>the <a href="https://github.com/JosephBARBIERDARNAL/pypalettes">GitHub repo</a> with source code and palettes (give us a star! ⭐)</li>
</ul>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                             
                                <category scheme="taxonomy:Tags" term="color" label="color" />
                             
                                <category scheme="taxonomy:Tags" term="colormap" label="colormap" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[NumPy's Second Developer in Residence: Joren Hammudoglu]]></title>
            <link href="https://blog.scientific-python.org/numpy/fellowship-program-2025/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/numpy/fellowship-program/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy&#39;s first Developer in Residence: Sayed Adel" />
                <link href="https://blog.scientific-python.org/numpy/numpy2/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy 2.0: an evolutionary milestone" />
                <link href="https://blog.scientific-python.org/numpy/numpy-rng/?utm_source=atom_feed" rel="related" type="text/html" title="Best Practices for Using NumPy&#39;s Random Number Generators" />
                <link href="https://blog.scientific-python.org/numpy/mukulikapahari/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy Contributor Spotlight: Mukulika Pahari" />
            
                <id>https://blog.scientific-python.org/numpy/fellowship-program-2025/</id>
            
            
            <published>2025-01-01T00:00:00+00:00</published>
            <updated>2025-01-01T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Introducing NumPy&rsquo;s second developer in residence, Joren Hammudoglu.</blockquote><p>The NumPy team is excited to announce the appointment of Joren Hammudoglu (@jorenham) as the second NumPy Developer in Residence. For the second time, the project is in a position to use its project funds to pay for a full year of maintainer time through the NumPy Fellowship Program.</p>
<p>Joren has been the driving force behind improvements to NumPy&rsquo;s static typing support since he started contributing in mid-2024. He has authored many of those improvements himself, from the annotations to CI support and work towards fundamental design changes such as ndarray shape typing. He also helps guide and integrate the work of other NumPy contributors in this area, and engages with upstream projects like mypy and Pyright and with the typing standards/PEP process to move static typing support forward for the ecosystem as a whole. Beyond NumPy, he contributes widely to static typing support in the ecosystem as the author of <code>scipy-stubs</code>, <code>numtype</code>, and more.</p>
<p>The NumPy Steering Council sees Joren’s appointment to this role both as recognition of his contributions and expertise and as an opportunity to continue improving NumPy’s static typing support, an area that few maintainers are knowledgeable about but that a significant fraction of end users care about a lot.</p>
<p>Joren&rsquo;s role is for the calendar year 2025. Thanks to individual and corporate donations, as well as payments from Tidelift, NumPy has been able to fund another full-time position for one year, following the first such role held by Sayed Adel in 2023. The funds are still <a href="https://numpy.org/neps/nep-0048-spending-project-funds.html">transparently administered</a> on Open Collective.</p>
<p>Welcome aboard, Joren! We&rsquo;re excited to see the impact of your work.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="numpy" label="numpy" />
                             
                                <category scheme="taxonomy:Tags" term="developer-in-residence" label="developer-in-residence" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Developer Summit 2]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/dev-summit-2/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1/?utm_source=atom_feed" rel="related" type="text/html" title="Developer Summit 1" />
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1-sparse/?utm_source=atom_feed" rel="related" type="text/html" title="Developer Summit 1: Sparse Arrays" />
                <link href="https://blog.scientific-python.org/scientific-python/translations/?utm_source=atom_feed" rel="related" type="text/html" title="Translations for Scientific Python projects" />
                <link href="https://blog.scientific-python.org/scientific-python/2022-czi-grant/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python awarded CZI grant to improve communications infrastructure &amp; accessibility" />
                <link href="https://blog.scientific-python.org/scientific-python/alt-text-workshop-summary/?utm_source=atom_feed" rel="related" type="text/html" title="Team up! Alt text and cross-project community" />
            
                <id>https://blog.scientific-python.org/scientific-python/dev-summit-2/</id>
            
            
            <published>2024-09-29T00:00:00+00:00</published>
            <updated>2024-09-29T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Summary of the 2024 Scientific Python Developer summit.</blockquote><p>This post is a while overdue, but it&rsquo;s been a busy summer for everyone!</p>
<p>In June, several of us got together for the &ldquo;annual&rdquo; (well, we&rsquo;ve had it twice now) Scientific Python developer summit in Seattle.
Our friends at the eScience Institute were again kind enough to host us.
This time around, we made the event a bit shorter to avoid clashing with graduation.</p>
<p>As before, the Developer Summits are for members of the community to come together, in person, so they can work.
Of course, we work together already online, but the event allows us to focus our energies on cross-project concerns (that often fall by the wayside) with planning and intent.</p>
<p>This is why, before the summit, we held planning Zoom calls where we identified topics of interest, turned them into issues, and fleshed those out and discussed them prior to the event.
That way, we hoped to hit the ground running, and we did!</p>
<!-- prettier-ignore-start -->
<figure class="align-default" id="id000">
<img src="group.jpg" alt="Photo of the summit attendees, in front of a fountain on the University of Washington campus, with a volcano in the background" class="align-center">



<figcaption>
<p><span class="caption-text">Our intrepid community developers, with Mt Rainier in the background</span></p>
</figcaption>
</figure>

<!-- prettier-ignore-end -->
<h2 id="topics">Topics<a class="headerlink" href="#topics" title="Link to this heading">#</a></h2>
<p>You can get a rough idea of what we worked on by browsing the <a href="https://github.com/scientific-python/summit-2024/issues/">planning issues</a> and the <a href="https://hackmd.io/wsJVTMYdQGG_Zgz7rgxSzw">summit worklog</a>.</p>
<p>Broad topics included <a href="https://scientific-python.org/specs/">SPECs</a>, documentation, tools &amp; bots, <a href="https://lectures.scientific-python.org/">lectures</a>, <code>scipy.sparse</code>, telemetry, Array API, and type annotation.</p>
<h3 id="documentation">Documentation<a class="headerlink" href="#documentation" title="Link to this heading">#</a></h3>
<p>Documentation was a much more popular topic than anticipated!</p>
<ul>
<li>The new <a href="https://mystmd.org/guide">mystmd</a> tooling generated some excitement, and an <a href="https://github.com/numpy/numpy-tutorials/tree/mystjs">experimental port of the NumPy tutorials</a> was made by Melissa and Ross.</li>
<li>Recommendations on consistent use of <a href="https://github.com/numpy/numpydoc/pull/525">backticks</a> and <a href="https://github.com/pydata/pydata-sphinx-theme/issues/1852">monospaced font</a> were submitted to numpydoc and pydata-sphinx-theme, respectively.</li>
<li>Madicken, Paul, and Dan worked together to <a href="https://github.com/pydata/pydata-sphinx-theme/pull/1861">extend PyData Sphinx Theme&rsquo;s testing infrastructure</a>, by combining Sphinx Build Factory (for generating small test sites) with Playwright (for browser automation).</li>
<li>Eric and Elliott fixed an <a href="https://github.com/sphinx-gallery/sphinx-gallery/pull/1320">intersphinx issue in sphinx-gallery</a>.</li>
</ul>
<h3 id="specs">SPECs<a class="headerlink" href="#specs" title="Link to this heading">#</a></h3>
<p>The Scientific Python Ecosystem Coordination documents (SPECs) aim to improve coordination of technical development across the ecosystem.</p>
<p>Several new SPECs were started:</p>
<ul>
<li><a href="https://hackmd.io/yI1iAqekQIq0a4jLS9WPyw">SPEC-?: Dispatching (<code>spatch</code>)</a></li>
<li><a href="https://scientific-python.org/specs/spec-0008/">SPEC-8: Securing The Release Process</a></li>
<li><a href="https://scientific-python.org/specs/spec-0009/">SPEC-9: Governance</a></li>
<li><a href="https://github.com/scientific-python/specs/pull/321">SPEC-10: Changelog and release documentation</a></li>
<li><a href="https://github.com/scientific-python/specs/pull/326">SPEC-12: Formatting mathematical expressions</a></li>
<li><a href="https://github.com/scientific-python/specs/pull/324">SPEC-13: Naming conventions</a></li>
</ul>
<p>Some existing SPECs were discussed and improved:</p>
<ul>
<li><a href="https://scientific-python.org/specs/spec-0007/">SPEC-7: Seeding pseudo-random number generation (SPRaNG)</a></li>
</ul>
<p>Matplotlib <a href="https://scientific-python.org/specs/purpose-and-process/#decision-points">endorsed</a> several SPECs.</p>
<h3 id="tooling">Tooling<a class="headerlink" href="#tooling" title="Link to this heading">#</a></h3>
<ul>
<li>We created a new tools team to handle the ever-growing <a href="https://tools.scientific-python.org/">list of tools we maintain</a>.</li>
<li>Eric added his <a href="https://github.com/scientific-python/circleci-artifacts-redirector-action">circleci-artifacts-redirector-action</a> to the suite.</li>
<li>Matthias brought over his <a href="https://github.com/scientific-python/MeeseeksDev">backport bot</a> and set up a maintenance team.</li>
</ul>
<h3 id="scipy">SciPy<a class="headerlink" href="#scipy" title="Link to this heading">#</a></h3>
<p>Several of the SciPy developers were present, and we used the opportunity to celebrate Dan Schult joining as a core developer 🎉!
Matt and Pamphile did some work on the new distribution infrastructure, Dan worked on sparse (remotely with CJ), and a <a href="https://github.com/scipy/scipy/pull/20891">PR adding newly-supported <code>const</code> statements to Cython code</a> got reviewed and merged.
Eric isolated <a href="https://github.com/sphinx-doc/sphinx/issues/12409">a non-deterministic bug in Sphinx</a> that was impacting parallel builds of SciPy&rsquo;s documentation.
He found a work-around that had been eluding the team for months!</p>
<h3 id="unplanned-collaborations">Unplanned collaborations<a class="headerlink" href="#unplanned-collaborations" title="Link to this heading">#</a></h3>
<p>As is the nature of these events, some collaborations arise spontaneously.
For example:</p>
<ul>
<li>Nick and Ariel explored using Awkward Array for neuro-tractography.</li>
<li>Nick and Mridul explored using <a href="https://scipp.github.io/index.html">scipp</a> for high-energy physics data.</li>
<li>Guen worked on telemetry.</li>
<li>Inessa and Sanket discussed best practices for community surveys and project governance.</li>
<li>Sebastian and Thomas <a href="https://hackmd.io/84thx0ucQ2ab17ZYrBhWRw">discussed parallelization APIs</a>.</li>
<li>Inessa, with input from Tim and Thomas, finalized the design of the 2024 scikit-learn user survey.</li>
<li>Erik and Dan discussed index compression options for CSR-like N-D sparse arrays.</li>
</ul>
<h3 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h3>
<!-- prettier-ignore-start -->
<figure class="align-default" id="id001">
<img src="chess.jpg" alt="Members of the team playing chess at Big Time Ale Brewery in Seattle" class="align-center">



<figcaption>
<p><span class="caption-text">Nothing like a relaxing game of chess after a long day&rsquo;s work</span></p>
</figcaption>
</figure>

<!-- prettier-ignore-end -->
<p>Numerous other PRs were made, of which a number were probably not even captured in the <a href="https://hackmd.io/wsJVTMYdQGG_Zgz7rgxSzw">worklog</a>.
But, besides the inherent satisfaction of working together with this great group, the best feature of the summit was that we were able to hang out, bonding over our communal joys and struggles—both technical and personal.</p>
<p>We are grateful to the ecosystem developers who gave up their time to attend the summit (many had to put in leave <em>just to do more work</em>!).
The summits are valuable, and translate to a lot of work getting done and decisions being made.
We hope that there will be more on the horizon!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="summit" label="Summit" />
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="Scientific-Python" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Automated tests with GPUs for your project]]></title>
            <link href="https://blog.scientific-python.org/scikit-learn/gpu-ci/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/scikit-learn/gpu-ci/</id>
            
            
            <published>2024-08-15T00:00:00+00:00</published>
            <updated>2024-08-15T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Setting up CI with a GPU to test your code</blockquote><p>TL;DR: If you have GPU code in your project, set up a GitHub-hosted GPU runner today.
It is fairly quick to do and will free you from having to run tests manually.</p>
<p>Writing automated tests for your code base, and certainly for the more complex parts
of it, has become as normal as brushing your teeth in the morning. Having a system
that automatically runs a project&rsquo;s tests for every Pull Request
is completely normal. However, until recently it was very complex and expensive
to set up a system that can run tests on a machine with a GPU. This means that,
when dealing with GPU-related code, we were thrown back into the dark ages where
you had to rely on manual testing.</p>
<p>In this blog post I will describe how we set up a GitHub Actions-based GPU runner
for the scikit-learn project and the things we learnt along the way. The goal is
to give you some additional information and details about the setup we now use.</p>
<ul>
<li><a href="/scikit-learn/gpu-ci/#larger-runners-with-gpus">Setting up larger runners for your project</a></li>
<li><a href="/scikit-learn/gpu-ci/#vm-image-contents">VM image contents and setup</a></li>
<li><a href="/scikit-learn/gpu-ci/#workflow-configuration">Workflow configuration</a></li>
<li><a href="/scikit-learn/gpu-ci/#bonus-material">Bonus material</a></li>
</ul>
<h2 id="larger-runners-with-gpus">Larger runners with GPUs<a class="headerlink" href="#larger-runners-with-gpus" title="Link to this heading">#</a></h2>
<p>All workflows for your GitHub project are executed on a
runner. Normally all your workflows run on the default runner, but you can have additional runners too. If you wanted
to, you could host a runner yourself on your own infrastructure. Until recently this
was the only way to get access to a runner with a GPU. However, hosting your
own runner is complicated and comes with pitfalls regarding security.</p>
<p>Since about April 2024 GitHub has made <a href="https://docs.github.com/en/actions/using-github-hosted-runners/about-larger-runners/about-larger-runners">larger runners with a
GPU</a> generally available.</p>
<p>To use these you will have to <a href="https://docs.github.com/en/billing/managing-your-github-billing-settings/adding-or-editing-a-payment-method#updating-your-organizations-payment-method">set up a credit card for your organisation</a>. Configure a spending limit so that you are not surprised
by a very large bill. For scikit-learn we currently use a limit of $50 per month.</p>
<p>When <a href="https://github.com/organizations/YOUR_OWN_ORG_NAME/settings/actions/runners">adding a new GitHub-hosted runner</a>, make sure to select the &ldquo;Partner&rdquo; tab when
choosing the VM&rsquo;s image. You need to select the &ldquo;NVIDIA GPU-Optimized Image for AI and HPC&rdquo;
image in order to be able to choose the GPU runner later on.</p>
<p>The group the runner is assigned to can be configured to only allow particular repositories
and workflows to use the runner group. It makes sense to only enable the runner
group for the repository in which you plan to use it. Limiting which workflows your
runner will pick up requires an additional level of indirection in your workflow
setup, so I will not cover it in this blog post.</p>
<p>Name your runner group <code>cuda-gpu-runner-group</code> to match the name used in the examples
below.</p>
<h2 id="vm-image-contents">VM Image contents<a class="headerlink" href="#vm-image-contents" title="Link to this heading">#</a></h2>
<p>The GPU runner uses a disk image provided by NVIDIA. This means that there are
some differences to the image that the default runner uses.</p>
<p>The <code>gh</code> command-line utility is not installed by default. Keep this in mind
if you want to do things like removing a label from the Pull Request or
other such tasks.</p>
<p>The biggest difference to the standard image is that the GPU image contains
a conda installation, but the file permissions do not allow the workflow user
to modify the existing environment or create new environments. As a result,
for scikit-learn we install conda a second time via miniforge. The conda environment is
created from a lockfile, so we do not need to run the dependency solver.</p>
<h2 id="workflow-configuration">Workflow configuration<a class="headerlink" href="#workflow-configuration" title="Link to this heading">#</a></h2>
<p>A key difference between the GPU runner and the default runner is that a project
has to pay for the time of the GPU runner. This means that you might want to
execute your GPU workflow only for some Pull Requests instead of all of them.</p>
<p>The GPU available in the runner is not very powerful, so it is not
a particularly attractive target for people looking to abuse free GPU resources.
Nevertheless, once in a while someone might try, which is another reason not to run
the GPU workflow by default.</p>
<p>A nice way to deal with running the workflow only after some form of human review
is to use a label. To mark a Pull Request (PR) for execution on the GPU runner, a
reviewer applies a particular label. Applying a label does not cause a notification
to be sent to all PR participants, unlike using a special comment to trigger the
workflow.
In the following example the <code>CUDA CI</code> label is used to mark a PR for execution and
the <code>runs-on</code> directive is used to select the GPU runner. This is a snippet from
<a href="https://github.com/scikit-learn/scikit-learn/blob/9d39f57399d6f1f7d8e8d4351dbc3e9244b98d28/.github/workflows/cuda-ci.yml">the full GPU workflow</a> used in the scikit-learn repository.</p>

<div class="highlight">
  <pre>name: CUDA GPU
on:
  pull_request:
    types:
      - labeled

jobs:
  tests:
    if: contains(github.event.pull_request.labels.*.name, &#39;CUDA CI&#39;)
    runs-on:
      group: cuda-gpu-runner-group
    steps:
      - uses: actions/setup-python@v5
        with:
          python-version: &#39;3.12.3&#39;
      - name: Checkout main repository
        uses: actions/checkout@v4
      ...</pre>
</div>

<p>In order to remove the label again we need a workflow with elevated
permissions. It needs to be able to edit a Pull Request. This privilege is not
available for workflows triggered from Pull Requests from forks. Instead
the workflow has to run in the context of the main repository and should only
do the minimum amount of work.</p>

<div class="highlight">
  <pre>on:
  # Using `pull_request_target` gives us the possibility to get an API token
  # with write permissions
  pull_request_target:
    types:
      - labeled

# In order to remove the &#34;CUDA CI&#34; label we need to have write permissions for PRs
permissions:
  pull-requests: write

jobs:
  label-remover:
    if: contains(github.event.pull_request.labels.*.name, &#39;CUDA CI&#39;)
    runs-on: ubuntu-20.04
    steps:
      - uses: actions-ecosystem/action-remove-labels@v1
        with:
          labels: CUDA CI</pre>
</div>

<p>This snippet is from the <a href="https://github.com/scikit-learn/scikit-learn/blob/9d39f57399d6f1f7d8e8d4351dbc3e9244b98d28/.github/workflows/cuda-label-remover.yml">label remover workflow</a>
we use in scikit-learn.</p>
<h2 id="bonus-material">Bonus Material<a class="headerlink" href="#bonus-material" title="Link to this heading">#</a></h2>
<p>For scikit-learn we have been using the GPU runner for about six weeks. So far we have stayed
below the $50 monthly spending limit we set. This includes some runs to debug the setup at the
start.</p>
<p>One of the scikit-learn contributors created a <a href="https://gist.github.com/EdAbati/ff3bdc06bafeb92452b3740686cc8d7c">Colab notebook that people can use to set up and run the scikit-learn test suite on Colab</a>. This is useful
for contributors who do not have easy access to a GPU. They can test their changes or debug
failures without having to wait for a maintainer to label the Pull Request. We plan to add
a workflow that comments on PRs with information on how to use this notebook to increase its
discoverability.</p>
<h2 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h2>
<p>Overall it was not too difficult to set up the GPU runner. It took a little bit of fiddling to
deal with the differences in VM image content as well as a few iterations on how to set up
the workflow, given that we wanted to trigger it manually.</p>
<p>The GPU runner has been reliably working and picking up work when requested. It saves us (the
maintainers) a lot of time, as we do not have to check out a PR locally and run the tests
by hand.</p>
<p>The costs so far have been manageable and it has been worth spending the money as it removes
a repetitive and tedious manual task from the reviewing workflow. However, it does require
having the funds and a credit card.</p>]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="scikit-learn" label="scikit-learn" />
                             
                                <category scheme="taxonomy:Tags" term="ci" label="ci" />
                             
                                <category scheme="taxonomy:Tags" term="gpu" label="gpu" />
                             
                                <category scheme="taxonomy:Tags" term="cuda" label="cuda" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Translations for Scientific Python projects]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/translations/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/2022-czi-grant/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python awarded CZI grant to improve communications infrastructure &amp; accessibility" />
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1/?utm_source=atom_feed" rel="related" type="text/html" title="Developer Summit 1" />
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1-sparse/?utm_source=atom_feed" rel="related" type="text/html" title="Developer Summit 1: Sparse Arrays" />
                <link href="https://blog.scientific-python.org/scientific-python/alt-text-workshop-summary/?utm_source=atom_feed" rel="related" type="text/html" title="Team up! Alt text and cross-project community" />
                <link href="https://blog.scientific-python.org/scientific-python/gsod-2022-proposal/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python GSoD 2022 Proposal" />
            
                <id>https://blog.scientific-python.org/scientific-python/translations/</id>
            
            
            <published>2024-08-13T00:00:00+00:00</published>
            <updated>2024-08-13T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Setting up and managing translations for Scientific Python projects.</blockquote><p>In November 2022, <a href="https://blog.scientific-python.org/scientific-python/2022-czi-grant/">the Chan Zuckerberg Initiative (CZI) awarded the Scientific Python project with a grant to improve communications infrastructure and accessibility</a>. This proposal involves several key areas to improve sustainability, inclusivity and accessibility of the Scientific Python ecosystem. One of these areas is making software documentation and user interfaces available in multiple languages. We are happy to announce that we have organized the necessary infrastructure and processes to allow volunteers to start translating multiple project websites.</p>
<p>In this blog post, we will discuss how we set up and manage translations for Scientific Python projects, and how you can participate in the translation and localization effort. The work described here was done by Quansight Labs.</p>
<h2 id="why-translations-are-important">Why translations are important<a class="headerlink" href="#why-translations-are-important" title="Link to this heading">#</a></h2>
<p>Accessibility and inclusion are important aspects of building a healthy community around open source software development. By providing translations of our websites, documentation and user interfaces in multiple languages, we can reach a wider global audience, thereby making our projects more inclusive and increasing the diversity of contributions and ideas. This is especially important for scientific software projects, which are used by researchers and scientists from around the world with a wide range of language proficiencies and backgrounds.</p>
<p>Recently, machine translation tools have made it easier to translate content into multiple languages. While such tools are improving, they often fall short in the context of technical documentation. In addition, without a human in the loop, maintainers can never be sure if translated content is correct for languages they are not familiar with. For this reason, we chose to work with a group of volunteers to help us translate selective subsets of project websites and documentation.</p>
<h2 id="setting-up-translations-workflow">Setting up translations workflow<a class="headerlink" href="#setting-up-translations-workflow" title="Link to this heading">#</a></h2>
<p>You may have seen that translations into some languages are already available for <a href="https://numpy.org">numpy.org</a>, with a language switcher in the top right corner.</p>
<p><img src="/scientific-python/translations/numpyorg.png" alt="Screenshot of the numpy.org site in Japanese, with a language switcher in the top right corner showing the English and Portuguese language options."></p>
<p>A number of core projects have also joined this effort and are set up to start with translations. At the moment, we are targeting <a href="https://numpy.org">NumPy</a>, <a href="https://scipy.org">SciPy</a>, <a href="https://networkx.org">NetworkX</a>, <a href="https://xarray.dev">Xarray</a>, and <a href="https://pandas.pydata.org">pandas</a>. We&rsquo;re offering to help other core projects integrate something similar into their websites, and we aim to accomplish this in a way that requires minimal effort from the core project maintainers, using <a href="https://scientific-python.crowdin.com">Crowdin</a>.</p>
<p>For the moment, our scope is to translate only the projects&rsquo; websites—the landing pages you see when you check out the links above—and <strong>not</strong> full documentation. We are intentionally starting small with the goal of completing this first phase and then potentially expanding once the translations team is established.</p>
<h2 id="translations-team">Translations team<a class="headerlink" href="#translations-team" title="Link to this heading">#</a></h2>
<p>For new contributors who are looking to get involved in the projects they already use and depend on, joining the translations team can be a great way to get started. Because of how the translations infrastructure is set up in Crowdin, this workflow is particularly well-suited for new contributors who are not yet familiar with GitHub or other development tools—they can work entirely on the Crowdin web platform.</p>
<p>One advantage of setting this up at the Scientific Python level, and not on a per-project basis, is that the translations team can work on multiple projects, and knowledge and experience can be shared. This also helps to ensure that translations are consistent across projects.</p>
<p>The translations team will be responsible for:</p>
<ul>
<li>Translating (and reviewing) content into multiple languages;</li>
<li>Ensuring that translations are accurate and up-to-date;</li>
<li>Engaging with the community to help maintain and improve translations.</li>
</ul>
<p>Once the translations are complete and reviewed, a maintainer for each project can merge the translations and publish them to the project website. After the initial setup, the translations team will be able to manage the translations workflow independently, with minimal input from the project maintainers.</p>
<p>For more information on the infrastructure and team working on the translations, and how to join as a translator, see <a href="https://scientific-python-translations.github.io/">https://scientific-python-translations.github.io/</a>. You can also join the <code>#translation</code> channel at the <a href="https://discord.gg/vur45CbwMz">Scientific Python Discord server</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="Scientific-Python" />
                             
                                <category scheme="taxonomy:Tags" term="translations" label="translations" />
                             
                                <category scheme="taxonomy:Tags" term="inclusion" label="inclusion" />
                             
                                <category scheme="taxonomy:Tags" term="czi" label="CZI" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[NumPy 2.0: an evolutionary milestone]]></title>
            <link href="https://blog.scientific-python.org/numpy/numpy2/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/numpy/numpy-rng/?utm_source=atom_feed" rel="related" type="text/html" title="Best Practices for Using NumPy&#39;s Random Number Generators" />
                <link href="https://blog.scientific-python.org/numpy/fellowship-program/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy&#39;s first Developer in Residence: Sayed Adel" />
                <link href="https://blog.scientific-python.org/numpy/mukulikapahari/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy Contributor Spotlight: Mukulika Pahari" />
                <link href="https://blog.scientific-python.org/matplotlib/book/?utm_source=atom_feed" rel="related" type="text/html" title="Newly released open access book" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_final/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Final Report" />
            
                <id>https://blog.scientific-python.org/numpy/numpy2/</id>
            
            
            <published>2024-06-17T00:00:00+00:00</published>
            <updated>2024-06-17T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>We announce the release of NumPy 2.0, which brings major improvements in functionality and usability. Improvements to NumPy internals set the stage for future development. We discuss the motivation behind breaking changes and how those impact users, as well as some of the history of the 2.0 development process.</blockquote><p>Eighteen years since the release of NumPy 1.0, we are thrilled to announce the
launch of NumPy 2.0! This major release marks a significant milestone in the
evolution of NumPy, bringing a wealth of enhancements and improvements to users,
and setting the stage for future feature development.</p>
<p>NumPy has improved and evolved over the past 18 years, with many releases bringing
significant performance, usability, and consistency improvements.
That said, our approach for a long time has been to make only incremental changes while
carefully managing backward compatibility. This approach minimizes user breakage,
but also limits the scope of improvements that can be made, both to the API and its underlying implementation.
Therefore, for this one-off major release, we are breaking backward
compatibility to implement significant improvements in NumPy&rsquo;s type system. The
type system is fundamental to NumPy, and major behavioral changes could not be
made incrementally without mixing two different
type systems, which would be a recipe for disaster.</p>
<p>The journey to an actual 2.0 release has been long, and it was difficult to
build the necessary momentum. In part, this may be because, for a time, the
NumPy developers associated a NumPy 2.0 release with nothing less than a
revolutionary rewrite of significant key pieces of the code base. Many of these
rewrites and changes happened over the years, but because of backward
compatibility concerns they remained largely invisible to the users. NumPy 2.0
is the culmination of these efforts, allowing us to discard some legacy
ABI (Application Binary Interface) that prevented future improvements.</p>
<p>Some major changes to NumPy internals—required for key features in
2.0—have been in the works since 2019 at least.
We started concrete plans for the 2.0 release more than a year ago, at a four-hour
<a href="https://github.com/numpy/archive/tree/main/2.0_developer_meeting">public planning meeting</a>
in April 2023. Many of the key changes were proposed and discussed. The key goals
we decided on there were perhaps even larger and more ambitious in scope than
some of us expected. This also unlocked some extra energy, which has been great to see.
After the meeting and over the course of the last year, NumPy enhancement
proposals (<a href="https://numpy.org/neps/">NEPs</a>) were written,
reviewed, and implemented for each major change.</p>
<p>Some key highlights are:</p>
<ul>
<li>
<p>Cleaned-up and streamlined Python API (<a href="https://numpy.org/neps/nep-0052-python-api-cleanup.html">NEP 52</a>):
The Python API has undergone a thorough cleanup, making it easier to learn
and use NumPy. The main namespace has been reduced by approximately 10%, and
the more niche <code>numpy.lib</code> namespace has been reduced by about 80%, providing
a clearer distinction between public and private API elements.</p>
</li>
<li>
<p>Improved scalar promotion rules: The scalar promotion rules have been
updated, as proposed in <a href="https://numpy.org/neps/nep-0050-scalar-promotion.html">NEP 50</a>,
addressing surprising behaviors in type promotion, e.g. with zero-dimensional arrays.</p>
</li>
<li>
<p>Powerful new DType API and a new string dtype: NumPy 2.0 introduces a new API
for implementing user-defined custom data types as proposed by
<a href="https://numpy.org/neps/nep-0041-improved-dtype-support.html">NEP 41</a>. We used
this new API to implement <code>StringDType</code>, which offers efficient and painless
support for variable-length strings, as proposed in
<a href="https://numpy.org/neps/nep-0055-string_dtype.html">NEP 55</a>. We hope
that this will enable new data types with interesting capabilities in the
PyData ecosystem and in NumPy itself.</p>
</li>
<li>
<p>Windows compatibility enhancements: The default integer on Windows, previously
32-bit, is now 64-bit on 64-bit architectures, addressing one
of the most common problems with having NumPy work portably across operating
systems.</p>
</li>
<li>
<p>Support for the Python array API standard: This is the first release to
include full support for the array API standard (v2022.12), made possible
by the new promotion rules, APIs, and API cleanup mentioned above.
We also aligned existing APIs and behavior with the standard,
as proposed in <a href="https://numpy.org/neps/nep-0056-array-api-main-namespace.html">NEP 56</a>.</p>
</li>
</ul>
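<p>A few of these highlights can be seen directly in an interactive session. The
snippet below is a minimal sketch, assuming NumPy 2.0 or later on a 64-bit
platform; the variable names are illustrative only.</p>

<div class="highlight">
  <pre class="chroma"><code>import numpy as np
from numpy.dtypes import StringDType

# NEP 50: a Python float is treated as "weak", so it no longer
# upcasts a float32 scalar to float64.
promoted = np.float32(3) + 3.0
print(promoted.dtype)  # float32 under NumPy 2.x

# NEP 55: variable-length strings without a fixed per-element width.
words = np.array(["numpy", "a much longer string"], dtype=StringDType())
lengths = np.strings.str_len(words)
print(lengths)  # [ 5 20]

# The default integer is 64-bit on 64-bit platforms, including Windows.
default_int = np.array([1, 2, 3]).dtype
print(default_int)</code></pre>
</div>
<p>Under NumPy 1.x the first expression would have produced a <code>float64</code>
result, which is one of the behavior changes to look for when migrating.</p>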
<p>These are just some of the more impactful changes in behavior and usability. In addition,
NumPy 2.0 contains significant performance and documentation improvements,
and much more. For an extensive list of changes, see
the <a href="https://numpy.org/devdocs/release/2.0.0-notes.html">NumPy 2 release notes</a>.</p>
<p>To adopt this major release, users will likely need to adjust existing code, but we
worked hard to strike a balance between improvements and ensuring that the
transition to NumPy 2.0 is as seamless as possible. We wrote a comprehensive
<a href="https://numpy.org/devdocs/numpy_2_0_migration_guide.html">migration guide</a>,
and a <a href="https://numpy.org/devdocs/numpy_2_0_migration_guide.html#ruff-plugin">ruff plugin</a>
that helps to update Python code so it will work with both NumPy 1.x and
NumPy 2.x.</p>
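<p>As a small illustration of the kind of adjustment involved, the sketch below uses
only spellings that exist in both NumPy 1.x and 2.x. The two removed aliases named in
the comments are examples only; the migration guide remains the authoritative list.</p>

<div class="highlight">
  <pre class="chroma"><code>import numpy as np

# np.float_ was removed in NumPy 2.0; np.float64 works in 1.x and 2.x.
values = np.array([1.5, 2.0, 4.0], dtype=np.float64)

# np.product was removed in NumPy 2.0; np.prod works in 1.x and 2.x.
total = np.prod(values)
print(total)  # 12.0</code></pre>
</div>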
<p>While we do require C API users to recompile their projects to support
NumPy 2.0, we prepared for this in NumPy 1.25 already. The build process was
simplified so that you can now compile with the latest NumPy version,
and remain backward compatible.
This means that projects built with NumPy 2.x are &ldquo;magically&rdquo; compatible with
1.x. It also means that projects no longer need to build their binaries using
the oldest supported version of NumPy.</p>
<p>We knew throughout development that rolling out NumPy 2.0
would be (temporarily) disruptive, because of the backwards-incompatible API and
ABI changes. We spent an extraordinary amount of effort communicating these
changes, helping downstream projects adapt, tracking compatibility of popular
open source projects (see, e.g.,
<a href="https://github.com/numpy/numpy/issues/26191">numpy#26191</a>), and completing the
release process at a limited pace to provide time for adoption. No
doubt the next few weeks will bring to light some new challenges; however, we fully expect these
to be manageable and well worth it in the long run.</p>
<p>The NumPy 2.0 release is the result of a collaborative, largely volunteer,
effort spanning many years and involving contributions from a diverse community
of developers. In addition, many of the changes above would not have been
possible without funders and institutional sponsors allowing several team members
to work on NumPy as part of their day jobs. We&rsquo;d like to acknowledge in particular:
the Gordon and Betty Moore Foundation, the Alfred P. Sloan Foundation,
NASA, NVIDIA, Quansight Labs, the Chan Zuckerberg Initiative, and Tidelift.</p>
<p>We are excited about future improvements to NumPy, many of which will be
possible due to changes in NumPy 2.0. See <a href="https://numpy.org/neps/roadmap.html">the NumPy roadmap</a>
for some features in the pipeline or on the wishlist. Let&rsquo;s
continue working together to improve NumPy and the scientific Python and PyData
ecosystem!</p>]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="numpy" label="numpy" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Best Practices for Using NumPy's Random Number Generators]]></title>
            <link href="https://blog.scientific-python.org/numpy/numpy-rng/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1-development-guide/?utm_source=atom_feed" rel="related" type="text/html" title="The Scientific Python Development Guide" />
                <link href="https://blog.scientific-python.org/numpy/fellowship-program/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy&#39;s first Developer in Residence: Sayed Adel" />
                <link href="https://blog.scientific-python.org/numpy/mukulikapahari/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy Contributor Spotlight: Mukulika Pahari" />
                <link href="https://blog.scientific-python.org/matplotlib/how-to-create-custom-tables/?utm_source=atom_feed" rel="related" type="text/html" title="How to create custom tables" />
                <link href="https://blog.scientific-python.org/matplotlib/visualising-usage-using-batteries/?utm_source=atom_feed" rel="related" type="text/html" title="Battery Charts - Visualise usage rates &amp; more" />
            
                <id>https://blog.scientific-python.org/numpy/numpy-rng/</id>
            
            
            <published>2024-01-26T23:22:46+02:00</published>
            <updated>2024-01-26T23:22:46+02:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Best Practices for Using NumPy&rsquo;s Random Number Generators</blockquote><p>Given the practical challenges of achieving true randomness, deterministic algorithms, known as Pseudo Random Number Generators (RNGs), are employed in science to create sequences that mimic randomness. These generators are used for simulations, experiments, and analysis where it is essential to have numbers that appear unpredictable. I want to share here what I have learned about best practices with pseudo RNGs and especially the ones available in <a href="https://numpy.org/">NumPy</a>.</p>
<p>A pseudo RNG works by updating an internal state through a deterministic algorithm. This internal state is initialized with a value known as a seed and each update produces a number that appears randomly generated. The key here is that the process is deterministic, meaning that if you start with the same seed and apply the same algorithm, you will get the same sequence of internal states (and numbers). Despite this determinism, the resulting numbers exhibit properties of randomness, appearing unpredictable and evenly distributed. Users can either specify the seed manually, providing a degree of control over the generated sequence, or they can opt to let the RNG object automatically derive the seed from system entropy. The latter approach enhances unpredictability by incorporating external factors into the seed.</p>
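<p>To make the idea of a deterministic internal state concrete, here is a toy linear congruential generator in pure Python. It is an illustration only and is <strong>not</strong> the algorithm NumPy uses; the class name <code>ToyLCG</code> is hypothetical and the constants are the classic ones from Numerical Recipes.</p>

<div class="highlight">
  <pre class="chroma"><code>class ToyLCG:
    """Toy pseudo RNG for illustration only; do not use it for real work."""

    def __init__(self, seed):
        # The seed initializes the internal state.
        self.state = seed % 2**32

    def random(self):
        # Deterministic update of the internal state ...
        self.state = (1664525 * self.state + 1013904223) % 2**32
        # ... from which a float in [0, 1) is derived.
        return self.state / 2**32


a = ToyLCG(seed=12345)
b = ToyLCG(seed=12345)
c = ToyLCG(seed=54321)

seq_a = [a.random() for _ in range(3)]
seq_b = [b.random() for _ in range(3)]
seq_c = [c.random() for _ in range(3)]

print(seq_a == seq_b)  # True: same seed, same sequence
print(seq_a == seq_c)  # False: different seed, different sequence</code></pre>
</div>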
<p>I assume a certain knowledge of NumPy and that NumPy 1.17 or greater is used. The reason for this is that great new features were introduced in the <a href="https://numpy.org/doc/stable/reference/random/index.html">random</a> module of version 1.17. As <code>numpy</code> is usually imported as <code>np</code>, I will sometimes use <code>np</code> instead of <code>numpy</code>. Finally, RNG will always mean pseudo RNG in the rest of this blog post.</p>
<h3 id="the-main-messages">The main messages<a class="headerlink" href="#the-main-messages" title="Link to this heading">#</a></h3>
<ol>
<li>Avoid using the global NumPy RNG. This means that you should avoid using <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.seed.html"><code>np.random.seed</code></a> and <code>np.random.*</code> functions, such as <code>np.random.random</code>, to generate random values.</li>
<li>Create a new RNG and pass it around using the <a href="https://numpy.org/doc/stable/reference/random/generator.html#numpy.random.default_rng"><code>np.random.default_rng</code></a> function.</li>
<li>Be careful with parallel random number generation and use the <a href="https://numpy.org/doc/stable/reference/random/parallel.html">strategies provided by NumPy</a>.</li>
</ol>
<p>Note that, with older versions of NumPy (&lt;1.17), the way to create a new RNG is to use <a href="https://numpy.org/doc/stable/reference/random/legacy.html#numpy.random.RandomState"><code>np.random.RandomState</code></a> which is based on the popular Mersenne Twister 19937 algorithm. This is also how the global NumPy RNG is created. This class is still available in newer versions of NumPy, but it is now recommended to use <code>default_rng</code> instead, which returns an instance of the statistically better <a href="https://www.pcg-random.org">PCG64</a> RNG. You might still see <code>np.random.RandomState</code> being used in tests as it has strong stability guarantees between different NumPy versions.</p>
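<p>The difference between the two interfaces can be inspected directly. This is a minimal sketch, assuming NumPy 1.17 or later; the seed value is arbitrary.</p>

<div class="highlight">
  <pre class="chroma"><code>import numpy as np

legacy = np.random.RandomState(12345)  # legacy interface
modern = np.random.default_rng(12345)  # recommended interface

# The legacy object is driven by the Mersenne Twister ...
legacy_algorithm = legacy.get_state()[0]
print(legacy_algorithm)  # MT19937

# ... while default_rng returns a Generator backed by PCG64.
modern_algorithm = type(modern.bit_generator).__name__
print(modern_algorithm)  # PCG64

# The method names differ slightly between the two interfaces.
print(legacy.random_sample(), modern.random())</code></pre>
</div>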
<h2 id="random-number-generation-with-numpy">Random number generation with NumPy<a class="headerlink" href="#random-number-generation-with-numpy" title="Link to this heading">#</a></h2>
<p>When you import <code>numpy</code> in your Python script, an RNG is created behind the scenes. This RNG is the one used when you generate a new random value using a function such as <code>np.random.random</code>. I will here refer to this RNG as the global NumPy RNG.</p>
<p>Although not recommended, it is a common practice to reset the seed of this global RNG at the beginning of a script using the <code>np.random.seed</code> function. Fixing the seed at the beginning ensures that the script is reproducible: the same values and results will be produced each time you run it. However, although sometimes convenient, using the global NumPy RNG is a bad practice. A simple reason is that using global variables can lead to undesired side effects. For instance, one might use <code>np.random.random</code> without knowing that the seed of the global RNG was set somewhere else in the codebase. Quoting <a href="https://numpy.org/neps/nep-0019-rng-policy.html">NumPy Enhancement Proposal (NEP) 19</a> by Robert Kern:</p>
<blockquote>
<p>The implicit global RandomState behind the <code>np.random.*</code> convenience functions can cause problems, especially when threads or other forms of concurrency are involved. Global state is always problematic. We categorically recommend avoiding using the convenience functions when reproducibility is involved. [&hellip;] The preferred best practice for getting reproducible pseudorandom numbers is to instantiate a generator object with a seed and pass it around.</p>
</blockquote>
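<p>The side effect described above can be reproduced in a few lines. In this sketch, <code>helper_with_hidden_reseed</code> is a hypothetical function standing in for any code, yours or a dependency's, that touches the global RNG.</p>

<div class="highlight">
  <pre class="chroma"><code>import numpy as np


def helper_with_hidden_reseed():
    # Hypothetical helper that reseeds the global RNG as a side effect.
    np.random.seed(0)
    return np.random.random()


# Global RNG: the helper silently changes what comes next.
np.random.seed(42)
first = np.random.random()

np.random.seed(42)
helper_with_hidden_reseed()
second = np.random.random()
print(first == second)  # False

# Local Generator: unaffected by whatever happens to the global state.
rng = np.random.default_rng(42)
local_first = rng.random()

rng = np.random.default_rng(42)
helper_with_hidden_reseed()
local_second = rng.random()
print(local_first == local_second)  # True</code></pre>
</div>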
<p>In short:</p>
<ul>
<li>Instead of using <code>np.random.seed</code>, which reseeds the already created global NumPy RNG, and then using <code>np.random.*</code> functions, you should create a new RNG.</li>
<li>You should create an RNG at the beginning of your script (with your own seed if you want reproducibility) and use this RNG in the rest of the script.</li>
</ul>
<p>To create a new RNG you can use the <code>default_rng</code> function as illustrated in the <a href="https://numpy.org/doc/stable/reference/random/index.html">introduction of the random module documentation</a>:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">rng</span><span class="o">.</span><span class="n">random</span><span class="p">()</span>  <span class="c1"># generate a floating point number between 0 and 1</span></span></span></code></pre>
</div>
<p>If you want to use a seed for reproducibility, <a href="https://numpy.org/doc/stable/reference/random/index.html#quick-start">the NumPy documentation</a> recommends using a large random number, where large means at least 128 bits. The first reason for using a large random number is that this increases the probability of having a different seed than anyone else and thus independent results. The second reason is that relying only on small numbers for your seeds can lead to biases as they do not fully explore the state space of the RNG. This limitation implies that the first number generated by your RNG may not seem as random as expected due to inaccessible first internal states. For example, some numbers will never be produced as the first output. One possibility would be to pick the seed at random in the state space of the RNG but <a href="https://github.com/numpy/numpy/issues/25778#issuecomment-1930441151">according to Robert Kern</a> a 128-bit random number is large enough<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. To generate a 128-bit random number for your seed you can rely on the <a href="https://docs.python.org/3/library/secrets.html">secrets module</a>:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">secrets</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">secrets</span><span class="o">.</span><span class="n">randbits</span><span class="p">(</span><span class="mi">128</span><span class="p">)</span></span></span></code></pre>
</div>
<p>When running this code I get <code>65647437836358831880808032086803839626</code> for the number to use as my seed. This number is randomly generated, so you need to copy and paste the value returned by <code>secrets.randbits(128)</code> into your script; otherwise you will get a different seed each time you run your code and thus break reproducibility:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">seed</span> <span class="o">=</span> <span class="mi">65647437836358831880808032086803839626</span>
</span></span><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">rng</span><span class="o">.</span><span class="n">random</span><span class="p">()</span></span></span></code></pre>
</div>
<p>The reason for seeding your RNG only once (and passing that RNG around) is that a good RNG, such as the one returned by <code>default_rng</code>, ensures good randomness and independence of the generated numbers. However, if not done properly, using several RNGs (each one created with its own seed) might lead to streams of random numbers that are less independent than the ones created from the same seed<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. That being said, <a href="https://github.com/numpy/numpy/issues/15322#issuecomment-573890207">as explained by Robert Kern</a>, with the RNGs and seeding strategies introduced in NumPy 1.17, it is considered fairly safe to create RNGs using system entropy, i.e. using <code>default_rng(None)</code> multiple times. However, as explained later, be careful when running jobs in parallel and relying on <code>default_rng(None)</code>. Another reason for seeding your RNG only once is that obtaining a good seed can be time-consuming. Once you have a good seed to instantiate your generator, you might as well use it.</p>
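<p>As a minimal sketch of the "seed once, pass the RNG around" pattern, the snippet below creates a single RNG and hands it to two helper functions. The helpers <code>simulate_noise</code> and <code>pick_index</code> are hypothetical names invented for this illustration; only <code>default_rng</code>, <code>normal</code> and <code>integers</code> come from NumPy.</p>

```python
import numpy as np


# Hypothetical helpers for illustration: each one draws from the RNG it receives.
def simulate_noise(rng, size=3):
    return rng.normal(size=size)


def pick_index(rng, n=10):
    return rng.integers(n)


seed = 65647437836358831880808032086803839626
# Seed a single RNG once, at the top of the script...
rng = np.random.default_rng(seed)
# ...and pass that same RNG to every function that needs randomness.
noise = simulate_noise(rng)
index = pick_index(rng)

# Rebuilding the RNG from the same seed and making the same calls
# in the same order reproduces the same values.
rng_again = np.random.default_rng(seed)
noise_again = simulate_noise(rng_again)
index_again = pick_index(rng_again)
print(np.array_equal(noise, noise_again), index == index_again)
```

<p>Because both helpers share one stream, the result of <code>pick_index</code> depends on how many numbers were drawn before it, which is why the order of the calls has to stay the same for the run to be reproducible.</p>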
<h2 id="passing-a-numpy-rng-around">Passing a NumPy RNG around<a class="headerlink" href="#passing-a-numpy-rng-around" title="Link to this heading">#</a></h2>
<p>As you write functions that you will use on their own as well as in a more complex script, it is convenient to be able to pass either a seed or an already created RNG. The function <code>default_rng</code> allows you to do this very easily. As written above, this function can be used to create a new RNG from your chosen seed, if you pass a seed to it, or from system entropy when passing <code>None</code>, but you can also pass an already created RNG. In this case the returned RNG is the one that you passed.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">stochastic_function</span><span class="p">(</span><span class="n">high</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">rng</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">(</span><span class="n">rng</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">rng</span><span class="o">.</span><span class="n">integers</span><span class="p">(</span><span class="n">high</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span></span></span></code></pre>
</div>
<p>You can either pass an <code>int</code> seed or your already created RNG to <code>stochastic_function</code>. To be precise, the <code>default_rng</code> function returns the exact same RNG passed to it for certain kinds of RNGs, such as the ones created with <code>default_rng</code> itself. You can refer to the <a href="https://numpy.org/doc/stable/reference/random/generator.html#numpy.random.default_rng"><code>default_rng</code> documentation</a> for more details on the arguments that you can pass to this function<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>.</p>
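<p>The two ways of calling <code>stochastic_function</code> can be compared with a short sketch. The seed <code>2024</code> is an arbitrary small value chosen only to keep the example readable, and the variable names are illustrative.</p>

```python
import numpy as np


def stochastic_function(high=10, rng=None):
    rng = np.random.default_rng(rng)
    return rng.integers(high, size=5)


# Passing an int seed: a fresh RNG is built on every call,
# so two calls with the same seed return the same numbers.
first = stochastic_function(rng=2024)
second = stochastic_function(rng=2024)
print(np.array_equal(first, second))

# Passing a Generator: default_rng returns that very same object...
shared_rng = np.random.default_rng(2024)
print(np.random.default_rng(shared_rng) is shared_rng)

# ...so successive calls keep consuming the same stream.
third = stochastic_function(rng=shared_rng)   # first draw from the stream
fourth = stochastic_function(rng=shared_rng)  # continues where third stopped
```

<p>Here <code>third</code> matches <code>first</code> because both are the first draw of an RNG seeded with the same value, while <code>fourth</code> comes from the advanced state of <code>shared_rng</code>.</p>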
<h2 id="parallel-processing">Parallel processing<a class="headerlink" href="#parallel-processing" title="Link to this heading">#</a></h2>
<p>You must be careful when using RNGs in conjunction with parallel processing. Let&rsquo;s consider the context of Monte Carlo simulation: you have a random function returning random outputs and you want to generate these random outputs many times, for instance to compute an empirical mean. If the function is expensive to compute, an easy way to reduce the computation time is to resort to parallel processing. Depending on the parallel processing library or backend that you use, different behaviors can be observed. For instance, if you do not set the seed yourself, forked Python processes may use the same random seed, generated for instance from system entropy, and thus produce the exact same outputs, which is a waste of computational resources. A very nice example illustrating this when using the Joblib parallel processing library is available <a href="https://joblib.readthedocs.io/en/latest/auto_examples/parallel_random_state.html">here</a>.</p>
<p>If you fix the seed at the beginning of your main script for reproducibility and then pass your seeded RNG to each process to be run in parallel, most of the time this will not give you what you want as this RNG will be deep copied. The same results will thus be produced by each process. One of the solutions is to create as many RNGs as parallel processes with a different seed for each of these RNGs. The issue now is that you cannot choose the seeds as easily as you would think. When you choose two different seeds to instantiate two different RNGs how do you know that the numbers produced by these RNGs will appear as statistically independent?<sup id="fnref1:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> The design of independent RNGs for parallel processes has been an important research question. See, for example, <a href="https://www.sciencedirect.com/science/article/pii/S0378475416300829">Random numbers for parallel computers: Requirements and methods, with emphasis on GPUs</a> by L&rsquo;Ecuyer et al. (2017) for a good summary of different methods.</p>
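<p>The copying problem described above can be reproduced without any parallel library. The sketch below uses <code>copy.deepcopy</code> as a stand-in for what happens when the same seeded RNG is sent to several worker processes; the three "workers" are simulated in a plain loop, which is an assumption made to keep the example self-contained.</p>

```python
import copy

import numpy as np

seed = 319929794527176038403653493598663843656
rng = np.random.default_rng(seed)

# Simulate three workers that each receive a deep copy of the same RNG.
worker_rngs = [copy.deepcopy(rng) for _ in range(3)]
draws = [worker_rng.random(4) for worker_rng in worker_rngs]

# Every copy starts from the same internal state, so all workers
# produce exactly the same numbers.
identical = all(np.array_equal(draws[0], other) for other in draws[1:])
print(identical)
```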
<p>Starting with NumPy 1.17, it is very easy to instantiate independent RNGs. Depending on the type of RNG you use, different strategies are available as documented in the <a href="https://numpy.org/doc/stable/reference/random/parallel.html">Parallel random number generation section</a> of the NumPy documentation. One of the strategies is to use <code>SeedSequence</code>, an algorithm that makes sure that poor input seeds are transformed into good initial RNG states. More precisely, this ensures that you will not have a degenerate behavior from your RNG and that the subsequent numbers will appear random and independent. Additionally, it ensures that close seeds are mapped to very different initial states, resulting in RNGs that are, with very high probability, independent of each other. You can refer to the documentation of <a href="https://numpy.org/doc/stable/reference/random/parallel.html#seedsequence-spawning">SeedSequence Spawning</a> for examples on how to generate independent RNGs from a <code>SeedSequence</code> or an existing RNG. Here I show how to apply this to the <a href="https://joblib.readthedocs.io/en/latest/auto_examples/parallel_random_state.html#fixing-the-random-state-to-obtain-deterministic-results">joblib example</a> mentioned above.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">joblib</span> <span class="kn">import</span> <span class="n">Parallel</span><span class="p">,</span> <span class="n">delayed</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">stochastic_function</span><span class="p">(</span><span class="n">high</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">rng</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">(</span><span class="n">rng</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">rng</span><span class="o">.</span><span class="n">integers</span><span class="p">(</span><span class="n">high</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">seed</span> <span class="o">=</span> <span class="mi">319929794527176038403653493598663843656</span>
</span></span><span class="line"><span class="cl"><span class="c1"># creating the RNG that is passed around.</span>
</span></span><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># create 5 independent RNGs</span>
</span></span><span class="line"><span class="cl"><span class="n">child_rngs</span> <span class="o">=</span> <span class="n">rng</span><span class="o">.</span><span class="n">spawn</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># use 2 processes to run the stochastic_function 5 times with joblib</span>
</span></span><span class="line"><span class="cl"><span class="n">random_vector</span> <span class="o">=</span> <span class="n">Parallel</span><span class="p">(</span><span class="n">n_jobs</span><span class="o">=</span><span class="mi">2</span><span class="p">)(</span>
</span></span><span class="line"><span class="cl">    <span class="n">delayed</span><span class="p">(</span><span class="n">stochastic_function</span><span class="p">)(</span><span class="n">rng</span><span class="o">=</span><span class="n">child_rng</span><span class="p">)</span> <span class="k">for</span> <span class="n">child_rng</span> <span class="ow">in</span> <span class="n">child_rngs</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">random_vector</span><span class="p">)</span></span></span></code></pre>
</div>
<p>By using a fixed seed you always get the same results each time you run this code, and by using <code>rng.spawn</code> you have an independent RNG for each call to <code>stochastic_function</code>. Note that here you could also spawn from a <code>SeedSequence</code> created with the seed instead of creating an RNG. However, since in general you pass around an RNG, I only assume access to an RNG. Also note that spawning from an RNG is only possible from version 1.25 of NumPy<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>.</p>
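<p>For completeness, here is a minimal sketch of the <code>SeedSequence</code> alternative mentioned above, which also works on NumPy versions older than 1.25. It runs the draws sequentially rather than with joblib, and the variable names are illustrative.</p>

```python
import numpy as np

seed = 319929794527176038403653493598663843656

# Build a SeedSequence from the seed and spawn 5 child sequences from it.
seed_sequence = np.random.SeedSequence(seed)
child_seeds = seed_sequence.spawn(5)

# Each child sequence seeds its own RNG.
child_rngs = [np.random.default_rng(child_seed) for child_seed in child_seeds]
samples = [child_rng.random(3) for child_rng in child_rngs]

# Repeating the whole procedure with the same seed gives the same samples.
replay_rngs = [
    np.random.default_rng(child_seed)
    for child_seed in np.random.SeedSequence(seed).spawn(5)
]
replay = [replay_rng.random(3) for replay_rng in replay_rngs]
print(all(np.array_equal(a, b) for a, b in zip(samples, replay)))
```

<p>The list <code>child_rngs</code> can be passed to <code>stochastic_function</code> through joblib in the same way as the RNGs obtained from <code>rng.spawn</code>.</p>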
<p>I hope this blog post helped you understand the best ways to use NumPy RNGs. The new NumPy API gives you all the tools you need for that. The resources below are available for further reading. Finally, I would like to thank Pamphile Roy, Stefan van der Walt and Jarrod Millman for their great feedback and comments, which greatly improved the original version of this blog post.</p>
<h2 id="resources">Resources<a class="headerlink" href="#resources" title="Link to this heading">#</a></h2>
<h3 id="numpy-rngs">NumPy RNGs<a class="headerlink" href="#numpy-rngs" title="Link to this heading">#</a></h3>
<ul>
<li><a href="https://numpy.org/doc/stable/reference/random/index.html">The documentation of the NumPy random module</a> is the best place to find information and where I found most of the information that I share here.</li>
<li><a href="https://numpy.org/neps/nep-0019-rng-policy.html">The NumPy Enhancement Proposal (NEP) 19 on the Random Number Generator Policy</a>, which led to the changes introduced in NumPy 1.17.</li>
<li>A <a href="https://github.com/numpy/numpy/issues/15322">NumPy issue</a> about the <code>check_random_state</code> function and RNG good practices, especially <a href="https://github.com/numpy/numpy/issues/15322#issuecomment-573890207">this comment</a> by Robert Kern.</li>
<li>Check also <a href="https://github.com/numpy/numpy/issues/25778#issuecomment-1930441151">this answer of Robert Kern</a> to my question about what <code>SeedSequence</code> can and cannot do. This also explains why it is recommended to use very large random numbers for seeds.</li>
<li><a href="https://scikit-learn.org/stable/faq.html#how-do-i-set-a-random-state-for-an-entire-execution">How do I set a random_state for an entire execution?</a> from the scikit-learn FAQ.</li>
<li>There are <a href="https://github.com/scientific-python/specs/pull/180">ongoing discussions</a> about standardizing the APIs used by different libraries to seed RNGs.</li>
</ul>
<h3 id="rngs-in-general">RNGs in general<a class="headerlink" href="#rngs-in-general" title="Link to this heading">#</a></h3>
<ul>
<li><a href="https://www.sciencedirect.com/science/article/pii/S0378475416300829">Random numbers for parallel computers: Requirements and methods, with emphasis on GPUs</a> by L&rsquo;Ecuyer et al. (2017)</li>
<li>To know more about the default RNG used in NumPy, named PCG, I recommend the <a href="https://www.pcg-random.org/paper.html">PCG paper</a> which also contains lots of useful information about RNGs in general. The <a href="https://www.pcg-random.org">pcg-random.org website</a> is also full of interesting information about RNGs.</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>If you only need a seed for reproducibility and do not need independence with respect to others, say for a unit test, a small seed is perfectly fine.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>A good RNG is expected to produce independent numbers for a given seed. However, the independence of sequences generated from two different seeds is not always guaranteed. For instance, it is possible that the sequence started with the second seed might quickly converge to an internal state also obtained by the first seed. This can result in both RNGs producing the same subsequent numbers, which would compromise the randomness expected from distinct seeds.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Before knowing about <code>default_rng</code>, and before NumPy 1.17, I was using the scikit-learn function <a href="https://scikit-learn.org/stable/modules/generated/sklearn.utils.check_random_state.html"><code>check_random_state</code></a> which is of course heavily used in the scikit-learn codebase. While writing this post I discovered that this function is now available in <a href="https://github.com/scipy/scipy/blob/62d2af2e13280d29781585aa39a3c5a5dfdfba17/scipy/_lib/_util.py#L231">scipy</a>. A look at the docstring and/or the source code of this function will give you a good idea about what it does. The differences with <code>default_rng</code> are that <code>check_random_state</code> currently relies on <code>np.random.RandomState</code> and that when <code>None</code> is passed to <code>check_random_state</code> the function returns the already existing global NumPy RNG. The latter can be convenient because if you fix the seed of the global RNG earlier in your script using <code>np.random.seed</code>, <code>check_random_state</code> returns the generator that you seeded. However, as explained above, this is not the recommended practice and you should be aware of the risks and the side effects.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Before 1.25 you need to get the <code>SeedSequence</code> from the RNG using the <code>_seed_seq</code> private attribute of the underlying bit generator: <code>rng.bit_generator._seed_seq</code>. You can then spawn from this <code>SeedSequence</code> to get child seeds that will result in independent RNGs.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="numpy" label="numpy" />
                             
                                <category scheme="taxonomy:Tags" term="rng" label="rng" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[The Scientific Python Development Guide]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1-development-guide/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/scientific-python-project/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python: Community developed, community owned" />
                <link href="https://blog.scientific-python.org/matplotlib/how-to-create-custom-tables/?utm_source=atom_feed" rel="related" type="text/html" title="How to create custom tables" />
                <link href="https://blog.scientific-python.org/matplotlib/visualising-usage-using-batteries/?utm_source=atom_feed" rel="related" type="text/html" title="Battery Charts - Visualise usage rates &amp; more" />
                <link href="https://blog.scientific-python.org/matplotlib/python-graph-gallery.com/?utm_source=atom_feed" rel="related" type="text/html" title="The Python Graph Gallery: hundreds of python charts with reproducible code." />
                <link href="https://blog.scientific-python.org/matplotlib/stellar-chart-alternative-radar-chart/?utm_source=atom_feed" rel="related" type="text/html" title="Stellar Chart, a Type of Chart to Be on Your Radar" />
            
                <id>https://blog.scientific-python.org/scientific-python/dev-summit-1-development-guide/</id>
            
            
            <published>2023-08-26T12:00:00-05:00</published>
            <updated>2023-08-26T12:00:00-05:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Introducing the Scientific Python Development Guide!</blockquote><p>One outcome of the
<a href="https://scientific-python.org/summits/developer/2023/">2023 Scientific Python Developer Summit</a>
was the <a href="https://learn.scientific-python.org/development">Scientific Python Development Guide</a>, a comprehensive guide to modern
Python package development, complete with a <a href="https://github.com/scientific-python/cookie">new project template</a>
supporting 10+ build backends and a <a href="https://learn.scientific-python.org/development/guides/repo-review">WebAssembly-powered checker</a>
with checks linked to the guide. The guide covers topics like <a href="https://learn.scientific-python.org/development/guides/packaging-simple/">modern</a>,
<a href="https://learn.scientific-python.org/development/guides/packaging-compiled/">compiled</a>, and <a href="https://learn.scientific-python.org/development/guides/packaging-classic/">classic</a> packaging, <a href="https://learn.scientific-python.org/development/guides/style/">style</a> checks, <a href="https://learn.scientific-python.org/development/guides/mypy/">type
checking</a>, <a href="https://learn.scientific-python.org/development/guides/docs/">docs</a>, <a href="https://learn.scientific-python.org/development/guides/tasks/">task runners</a>, <a href="https://learn.scientific-python.org/development/guides/gha-basic/">CI</a>, <a href="https://learn.scientific-python.org/development/guides/pytest/">tests</a>,
and much more! There also are sections of <a href="https://learn.scientific-python.org/development/tutorials/">tutorials</a>, <a href="https://learn.scientific-python.org/development/principles/">principles</a>, and
some common <a href="https://learn.scientific-python.org/development/patterns/">patterns</a>.</p>
<p>This guide (along with cookie &amp; repo-review) started in <a href="https://scikit-hep.org">Scikit-HEP</a> in 2020.
During the summit, it was merged with the <a href="https://nsls-ii.github.io/">NSLS-II</a> guidelines, which provided
the basis for the <a href="https://learn.scientific-python.org/development/principles/">principles</a> section. I&rsquo;d like to thank and acknowledge Dan
Allan and Gregory Lee for working tirelessly during the summit to rework,
rewrite, merge, and fix the guide, including writing most of the <a href="https://learn.scientific-python.org/development/tutorials/">tutorials</a>
pages and first <a href="https://learn.scientific-python.org/development/patterns/">patterns</a> page, and rewriting the <a href="https://learn.scientific-python.org/development/tutorials/dev-environment/">environment</a> page as a
tutorial.</p>
<h2 id="the-guide">The guide<a class="headerlink" href="#the-guide" title="Link to this heading">#</a></h2>
<p>The core of the project is the guide, which consists of four sections:</p>
<ul>
<li><a href="https://learn.scientific-python.org/development/tutorials/">Tutorials</a>: How to go from &ldquo;research&rdquo; code to a basic package, for
beginning readers.</li>
<li><a href="https://learn.scientific-python.org/development/guides/">Topical guides</a>: The core of the guide, for intermediate to advanced
readers.</li>
<li><a href="https://learn.scientific-python.org/development/principles/">Principles</a>: Some general principles from the <a href="https://nsls-ii.github.io/">NSLS-II</a> guide.</li>
<li><a href="https://learn.scientific-python.org/development/patterns/">Patterns</a>: Recipes for common situations. Three pages are there now:
<a href="https://learn.scientific-python.org/development/patterns/data-files/">including data</a>, <a href="https://learn.scientific-python.org/development/patterns/backports/">backports</a>, and <a href="https://learn.scientific-python.org/development/patterns/exports/">exports</a>.</li>
</ul>
<p>From the original <a href="https://scikit-hep.org/developer">Scikit-HEP dev pages</a>, a lot was added:</p>
<ul>
<li>Brand new guide page on documentation, along with new <a href="https://learn.scientific-python.org/development/guides/repo-review">sp-repo-review</a> checks to
help with readthedocs.</li>
<li>A compiled projects page! Finally! With <a href="https://scikit-build-core.readthedocs.io">scikit-build-core</a>,
<a href="https://meson-python.readthedocs.io">meson-python</a>, and <a href="https://www.maturin.rs">maturin</a>. The page shows real outputs from the
<a href="https://www.cookiecutter.io">cookiecutter</a>, kept in sync with <a href="https://nedbatchelder.com/code/cog">cog</a> (a huge benefit of using a single
repo for all three components!)</li>
<li>Big update to <a href="https://learn.scientific-python.org/development/guides/gha-basic/">GHA CI page</a> including a section on Composite
Actions given at the Dev summit.</li>
<li>CLI entry points are now included.</li>
<li>Python 3.12 support added, Python 3.7 dropped.</li>
<li>New <a href="https://learn.scientific-python.org/development/guides/repo-review">sp-repo-review</a> badges throughout (more on that later!)</li>
<li>Updates for <a href="https://beta.ruff.rs">Ruff</a>&rsquo;s move and support for requires-python.</li>
<li>Lots of additions for GitHub Actions.</li>
</ul>
<p>The infrastructure was updated too:</p>
<ul>
<li>Using latest Jekyll (version 4) and latest Just the Docs theme. More colorful
callout boxes. Plugins are used now.</li>
<li>Live PR preview (provided by probably the world’s first readthedocs Jekyll
build!). Developed with the Zarr developers during the summit.</li>
<li>Better advertising for <a href="https://github.com/scientific-python/cookie">cookie</a> and <a href="https://learn.scientific-python.org/development/guides/repo-review">sp-repo-review</a> on the index page(s).</li>
<li>Auto bump and auto-sync CI jobs.</li>
</ul>
<h2 id="cookie">Cookie<a class="headerlink" href="#cookie" title="Link to this heading">#</a></h2>
<p>We also did something I&rsquo;ve wanted to do for a long time: the guide, the
cookiecutter template, and the checks are all in a single repo! The repo is
<a href="https://github.com/scientific-python/cookie">scientific-python/cookie</a>, which is the moved <code>scikit-hep/cookie</code> (the
old URL for cookiecutter still works!).</p>
<p>Cookie is a new project template supporting multiple backends (including
compiled ones), kept in sync with the dev guide. We recommend starting with
the dev guide and setting up your first package by hand, so that you understand
what each part is for, but once you&rsquo;ve done that, <a href="https://github.com/scientific-python/cookie">cookie</a> allows you to get
started on a new project in seconds.</p>
<p>A lot of work went into <a href="https://github.com/scientific-python/cookie">cookie</a>, too!</p>
<ul>
<li>Generalized defaults. We still have special integration if someone sets the
org to <code>scikit-hep</code>; the same integration can be offered to other orgs.</li>
<li>All custom hooks removed; standard jinja now used throughout templates. Using
cookiecutter computed variables idiom too. Windows still fully supported and
tested. Adding new choices is much easier now.</li>
<li>Added cruft (a cookiecutter updater) testing.</li>
<li>Dual-supporting <a href="https://copier.readthedocs.io">copier</a> now too, a <a href="https://www.cookiecutter.io">cookiecutter</a> replacement with a huge
CLI improvement (and also supports updating). Might be the first project to
support both at the same time. CI (or <a href="https://nox.thea.codes">nox</a> locally) checks to ensure
generation is identical. Much better interface with copier, including
validation, descriptive text, arrow selection, etc.</li>
<li>Improved CLI experience even if using cookiecutter (like no longer requesting
current year).</li>
<li>Reworked docs template.</li>
<li>Support for cookiecutter 2.2 pretty descriptions (added about four hours after
cookiecutter 2.2.0 was released) and cookiecutter 2.2.3 choice descriptions.</li>
<li>GitLab CI support when not targeting github.com URLs (added by Giordon Stark).</li>
<li>Support for selecting VCS or classic versioning.</li>
</ul>
<h2 id="repo-review">Repo-review<a class="headerlink" href="#repo-review" title="Link to this heading">#</a></h2>
<p>See the <a href="https://iscinumpy.dev/post/repo-review/">introduction to repo-review</a>
for information about this one!</p>
<p>Along with this was probably the biggest change, one requested by several people
at the summit: <a href="https://github.com/scientific-python/repo-review">scientific-python/repo-review</a> (was
<code>scikit-hep/repo-review</code>) is now a completely general framework for implementing
checks in Python 3.10+. The checks have been moved to <code>sp-repo-review</code>, which is
now part of scientific-python/cookie. There are too many changes to list here,
so just the key ones in 0.6, 0.7, 0.8, 0.9, and 0.10:</p>
<ul>
<li>Extensive, beautiful <a href="https://repo-review.readthedocs.io">documentation</a> for
check authors (used to help guide the new docs guide page &amp; template
update!)</li>
<li>Support for four output formats, <a href="https://rich.readthedocs.io">rich</a> (improved), svg, json (new), html
(new).</li>
<li>Support for listing all checks.</li>
<li>GitHub Actions support with HTML step summary, <a href="https://pre-commit.com">pre-commit</a> support.</li>
<li>Generalized topological sorting to fixtures, dynamic fixtures.</li>
<li>Dynamic check selection (via fixtures! Basically everything is powered by
fixtures now.)</li>
<li>URL support in all output formats (including the terminal and WebApp!)</li>
<li>Support for package not at root of repo.</li>
<li>Support for running on a remote repo from the command line.</li>
<li>Support for select/ignore config in <code>pyproject.toml</code> or command line.</li>
<li>Pretty printed and controllable sorting for families.</li>
<li>Supports running from Python, including inside a readme with something like
cog.</li>
<li>Support for dynamic family descriptions (such as to output build system and
licence used).</li>
<li>Support for limiting the output to just errors or errors and skips.</li>
<li>Support for running on multiple repos at once, with output tailored to
multiple repos. Also supports passing <code>pyproject.toml</code> path instead to make
running on mixed repos easier.</li>
<li>Support for linting <code>[tool.repo-review]</code> with <a href="https://validate-pyproject.readthedocs.io">validate-pyproject</a>.</li>
</ul>
<p>The
<a href="https://repo-review.readthedocs.io/en/latest/changelog.html">full changelog</a>
has more - you can even see the 10 beta releases in-between 0.6.x and 0.7.0
where a lot of this refactoring work was happening. If you have configuration
you’d like to write checks for, feel free to write a plugin!</p>
<p><a href="https://validate-pyproject.readthedocs.io">validate-pyproject</a> 0.14 has added support for being used as a repo-review
plugin, so you can validate <code>pyproject.toml</code> files with repo-review! This lints
<code>[project]</code> and <code>[build-system]</code> tables, <code>[tool.setuptools]</code>, and other tools
via plugins. <a href="https://scikit-build-core.readthedocs.io">Scikit-build-core</a> 0.5 can be used as a validate-pyproject plugin
to lint <code>[tool.scikit-build]</code>. Repo-review has a plugin for
<code>[tool.repo-review]</code>.</p>
<h2 id="sp-repo-review">sp-repo-review<a class="headerlink" href="#sp-repo-review" title="Link to this heading">#</a></h2>
<p>Finally, <a href="https://learn.scientific-python.org/development/guides/repo-review">sp-repo-review</a> contains the previous repo-review plugins with checks:</p>
<ul>
<li>Fully cross-linked with the development guide. Every check has a URL that
points to a matching badge in the development guide, placed where the item the
check looks for is discussed!</li>
<li>Full list of checks (including URLs), produced by cog, in the
<a href="https://pypi.org/p/sp-repo-review">readme</a>.</li>
<li>Also ships with a GitHub Action and <a href="https://pre-commit.com">pre-commit</a> checks.</li>
<li>Released (in sync with cookie &amp; guide, as they are in the same repo) as
CalVer,
<a href="https://github.com/scientific-python/cookie/releases">with release notes</a>.</li>
<li>Split CI that selects just what to run based on changed files, with a green
checkmark that respects skips (based on the excellent contribution to
pypa/build).</li>
</ul>
<!-- prettier-ignore-start -->
<figure class="align-default" id="id000">
<img src="cibw_example.png" alt="Image of sp-repo-review showing checks" class="align-center" width="60%">



<figcaption>
<p><span class="caption-text">Running sp-repo-review on cibuildwheel</span>
</figcaption>
</figure>

<!-- prettier-ignore-end -->
<h2 id="using-the-guide">Using the guide<a class="headerlink" href="#using-the-guide" title="Link to this heading">#</a></h2>
<p>If you have a guide, we&rsquo;d like for you to compare it with the Scientific Python
Development Guide, and see if we are missing anything - bring it to our
attention, and maybe we can add it. And then you can link to the centrally
maintained guide instead of manually maintaining a complete custom guide. See
<a href="https://scikit-hep.org/developer">scikit-hep/developer</a> for an example; many pages now point at this guide.
We can also provide org integrations for <a href="https://github.com/scientific-python/cookie">cookie</a>, providing some
customizations when a user targets your org (targeting <code>scikit-hep</code> will add a
badge).</p>]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="cookie" label="cookie" />
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="scientific-python" />
                             
                                <category scheme="taxonomy:Tags" term="summit" label="summit" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Your Code Could Go To Space]]></title>
            <link href="https://blog.scientific-python.org/community-stories/will_tirone/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scipy/internships/smit/?utm_source=atom_feed" rel="related" type="text/html" title="SciPy Internship: 2021-2022" />
                <link href="https://blog.scientific-python.org/scipy/qmc-basics/?utm_source=atom_feed" rel="related" type="text/html" title="A quick tour of QMC with SciPy" />
            
                <id>https://blog.scientific-python.org/community-stories/will_tirone/</id>
            
            
            <published>2023-07-19T00:00:00+00:00</published>
            <updated>2023-07-19T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>How I learned to code and contributed to a library used on the Mars 2020 mission.</blockquote><p>In mid-2018 I started learning Python by reading textbooks and watching online tutorials. I had absolutely zero background in computer science, but it seemed interesting so I continued to try. At some point, I decided I wanted to do a master’s degree in statistics, so I began to work on more statistics-based programming. That’s when I found SciPy. I became (and still am) fascinated by the idea of open-source software that is completely free to use and supported by a community of diligent programmers.</p>
<p>With plenty of extra time on my hands during the pandemic, I made it my goal to contribute to a Python library. My first contribution was actually to a project called <a href="https://github.com/firstcontributions/first-contributions">first contributions</a> which walks you through a very basic commit and push to GitHub. That built up my confidence a bit, so I decided to tackle a SciPy issue.
It was not easy. I watched several videos and guides on how to contribute to an open-source library but got stuck many times along the way! I have to admit I felt incompetent trying to make changes to this huge library, but the maintainers and community could not have been nicer or more supportive. That’s really the magic of open source. I was confused and lost, but the (largely volunteer) community was amazing.</p>
<p>Eventually, I managed to get a very small commit merged into the main branch of SciPy (which you can see <a href="https://github.com/scipy/scipy/pull/12962">here</a>). Despite being, at most, a few lines of code, this was a huge landmark for me as a programmer. To my surprise though, in early 2021, a little badge popped up on my GitHub profile that said “Mars 2020 Helicopter Contributor”. I was confused. I didn’t recall working on helicopters, much less helicopters that flew on Mars. I still remember getting chills when I read that I had contributed to a library that was used on NASA’s Mars 2020 mission, which involved the robotic helicopter Ingenuity. GitHub posted an <a href="https://github.blog/2021-04-19-open-source-goes-to-mars/">article</a> explaining how about 12,000 people received a badge indicating that they had contributed to an open-source library that was used on the mission! Keep in mind that I made an absolutely tiny contribution, but I was extremely proud to be recognized in that way.</p>
<p>In an even stranger twist, this summer, I’m interning at NASA’s Jet Propulsion Laboratory (JPL) which built and still flies the Ingenuity helicopter along with the other robotic space exploration missions. It’s truly surreal to have come full circle like this, from learning Python in my living room during the pandemic to using SciPy in my daily work at JPL. Here, my work involves writing statistical simulations to estimate the probability of system failures during a mission.
If you’re reading this and interested in contributing, please know that your contributions to an open-source library, no matter how small, can have an impact larger than you could ever imagine. Collaboration like this is essential to pushing forward the boundaries of science. If you want to contribute, please feel free to reach out to me (@WillTirone) or anyone else in the SciPy community, and we can get you started on the right path. Last, I want to thank the maintainers of SciPy for their endless support and assistance while I was learning the basics of contribution to the library.</p>]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="open-source" label="Open-source" />
                             
                                <category scheme="taxonomy:Tags" term="testimonial" label="testimonial" />
                             
                                <category scheme="taxonomy:Tags" term="nasa" label="NASA" />
                             
                                <category scheme="taxonomy:Tags" term="jpl" label="JPL" />
                             
                                <category scheme="taxonomy:Tags" term="scipy" label="scipy" />
                             
                                <category scheme="taxonomy:Tags" term="internship" label="internship" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Developer Summit 1]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1-sparse/?utm_source=atom_feed" rel="related" type="text/html" title="Developer Summit 1: Sparse Arrays" />
                <link href="https://blog.scientific-python.org/scientific-python/2022-czi-grant/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python awarded CZI grant to improve communications infrastructure &amp; accessibility" />
                <link href="https://blog.scientific-python.org/scientific-python/alt-text-workshop-summary/?utm_source=atom_feed" rel="related" type="text/html" title="Team up! Alt text and cross-project community" />
                <link href="https://blog.scientific-python.org/scientific-python/gsod-2022-proposal/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python GSoD 2022 Proposal" />
            
                <id>https://blog.scientific-python.org/scientific-python/dev-summit-1/</id>
            
            
            <published>2023-07-09T18:24:17-07:00</published>
            <updated>2023-07-09T18:24:17-07:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Learn about the first Scientific Python Developer Summit.</blockquote><p>The first <a href="https://scientific-python.org/summits/developer/2023/">Scientific Python Developer Summit</a> (May 22-26, 2023) brought together 34 developers at the eScience Institute at the University of Washington to develop shared infrastructure, documentation, tools, and recommendations for libraries in the Scientific Python ecosystem.</p>
<h2 id="pre-summit-planning">Pre-summit planning<a class="headerlink" href="#pre-summit-planning" title="Link to this heading">#</a></h2>
<p>Prior to the summit we held several hour-long planning meetings:</p>
<ul>
<li><a href="https://scientific-python.org/summits/developer/2023/general-planning/">General (2023-02-27)</a></li>
<li><a href="https://hackmd.io/UNwG2BjJSxOUJ0M1iWI-nQ">May 15, Package metrics, DevStats</a></li>
<li><a href="https://hackmd.io/MmbP4VTATyG129_U56xdJQ">May 15, SPECs</a></li>
<li><a href="https://hackmd.io/YL5DNtsaSsS-1ZU3Pxkrxg">May 18, Community &amp; Documentation</a></li>
<li><a href="https://hackmd.io/0M1Yh7KwTnaXSsU14BiyQw">May 19, Build Systems &amp; CI Infrastructure</a></li>
<li><a href="https://hackmd.io/JL5slkxORA-q7VRN79v1sA">May 19, PyTest plugins &amp; Sphinx extensions</a></li>
</ul>
<h2 id="summit-execution">Summit execution<a class="headerlink" href="#summit-execution" title="Link to this heading">#</a></h2>
<p><img src="/scientific-python/dev-summit-1/checkin.png" alt="Morning group check-in"></p>
<p>At the summit, we had a brief check-in and then split into several groups based on each developer&rsquo;s time and interests. Raw work progress and logs have been collected in a <a href="https://hackmd.io/iEtdfbxfSbGwOAJTXmqyIQ?both">document</a>; we highlight just a few of the things we accomplished below:</p>
<h3 id="sparse-arrays">Sparse arrays<a class="headerlink" href="#sparse-arrays" title="Link to this heading">#</a></h3>
<p>Almost a quarter of the group worked on <a href="https://scientific-python.org/summits/sparse/">sparse arrays</a> for the entire week.
This work is part of a larger, <a href="https://scientific-python.org/grants/sparse_arrays/">multi-year effort</a> to improve and expand SciPy&rsquo;s
<a href="https://github.com/scipy/scipy/pull/14822">sparse array API</a>, which will eventually
involve removing the sparse matrix API and eventually <code>np.matrix</code>.</p>
<p>More details can be found in the <a href="https://blog.scientific-python.org/scientific-python/dev-summit-1-sparse/">Developer Summit 1: Sparse</a> blog post.</p>
<h3 id="scientific-python-ecosystem-coordination-documents">Scientific Python Ecosystem Coordination documents<a class="headerlink" href="#scientific-python-ecosystem-coordination-documents" title="Link to this heading">#</a></h3>
<p>We made significant progress on several <a href="https://scientific-python.org/specs/">SPECs</a>, which had been drafted during previous sprints.</p>
<p><img src="/scientific-python/dev-summit-1/specs.png" alt="Snapshot of the current SPECs and their endorsements"></p>
<p><a href="https://scientific-python.org/specs/spec-0000">SPEC 0&mdash;Minimum Supported Versions</a>, an updated and expanded recommendation similar to
<a href="https://numpy.org/neps/nep-0029-deprecation_policy.html">NEP 29</a>, was discussed and endorsed by several
<a href="https://scientific-python.org/specs/core-projects/">core projects</a>.</p>
<p><a href="https://scientific-python.org/specs/spec-0001/">SPEC 1&mdash;Lazy Loading of Submodules and Functions</a> was discussed and endorsed by
two <a href="https://scientific-python.org/specs/core-projects/">core projects</a>.</p>
<p><a href="https://scientific-python.org/specs/spec-0002/">SPEC 2&mdash;API Dispatch</a> was discussed (in a follow-up video meeting just after the summit)
and is in the process of being marked as <code>withdrawn</code> or something similar.</p>
<p><a href="https://scientific-python.org/specs/spec-0003/">SPEC 3&mdash;Accessibility</a> was discussed and updated. We hope to see it endorsed by
several core projects in the near future.</p>
<p><a href="https://scientific-python.org/specs/spec-0004/">SPEC 4&mdash;Using and creating nightly wheels</a> was rewritten, a helper GitHub action
<a href="https://github.com/scientific-python/upload-nightly-action">upload-nightly-action</a> was created, and PRs to update the various
projects to use the new <a href="https://anaconda.org/scientific-python-nightly-wheels/">nightly wheels location</a> were made. The updates
are now complete and the SPEC was endorsed by two core projects.</p>
<p>We anticipate that several more core projects will endorse the existing SPECs over the coming months and we are now holding regular
SPEC steering committee meetings to continue developing and expanding the SPECs.</p>
<h3 id="community-building">Community building<a class="headerlink" href="#community-building" title="Link to this heading">#</a></h3>
<p>We created a comprehensive <a href="https://learn.scientific-python.org/community/">community guide</a> to empower projects in fostering their communities. This guide includes essential information on the role of community managers, along with practical strategies for community meetings, outreach, onboarding, and project management.</p>
<h3 id="development-documentation">Development Documentation<a class="headerlink" href="#development-documentation" title="Link to this heading">#</a></h3>
<p>We created a <a href="https://learn.scientific-python.org/development/">development guide</a>,
a <a href="https://github.com/scientific-python/cookie">new project template</a>,
and <a href="https://learn.scientific-python.org/development/guides/repo-review/">existing project review</a>.</p>
<h3 id="serendipitous-collaboration">Serendipitous Collaboration<a class="headerlink" href="#serendipitous-collaboration" title="Link to this heading">#</a></h3>
<p>One of the fun things that happens at summits like these is the chance encounters of people from different projects.
For example, a couple of attendees worked on creating a co-collaboration network across the broader scientific Python ecosystem.
This gave us the opportunity to look at how contributors collaborate across projects.
We could see how the bigger projects were all clustered together as there are multiple contributors who share maintenance duties for multiple projects.
We could also, for example, see how the Scikit-HEP cluster was a bit further away from the usual scientific Python cluster.
An action item for us :) We need more collaboration!!</p>
<p><img src="/scientific-python/dev-summit-1/collab.png" alt="Visualization of co-collaboration network"></p>
<h3 id="pytest-pluginssphinx-extensions">Pytest plugins/Sphinx extensions<a class="headerlink" href="#pytest-pluginssphinx-extensions" title="Link to this heading">#</a></h3>
<p>Several attendees worked on pytest plugins and Sphinx extensions:</p>
<ul>
<li>
<p><a href="https://github.com/tylerjereddy/pytest-regex">pytest-regex</a> was created to support selecting tests with regular expressions.</p>
</li>
<li>
<p><a href="https://github.com/scientific-python/pytest-doctestplus">pytest-doctestplus</a> was moved upstream into the Scientific Python organization.
The summit provided new momentum to develop new features (e.g., producing updated docstrings), and to use it for the NumPy documentation testing.</p>
</li>
<li>
<p>sphinx-scientific-python is a new extension intended as a home for various features from the ecosystem; e.g.,
we agreed to bring existing extensions from MNE tools into it.</p>
</li>
<li>
<p>pydata-sphinx-theme updates</p>
</li>
</ul>
<h3 id="scipy-release-management-progress">SciPy release management progress<a class="headerlink" href="#scipy-release-management-progress" title="Link to this heading">#</a></h3>
<p>The first release candidate of SciPy 1.11.0 was published on PyPI
on May 31, 2023, five days after the conclusion of the summit. The
summit facilitated high-bandwidth decision making on several proposed
SciPy code changes by allowing the current SciPy release manager (Tyler Reddy,
Los Alamos National Laboratory) to consult with other SciPy core developers
in person. Specific code changes were discussed with the following SciPy
maintainers: Stefan van der Walt (<code>scipy.ndimage</code>), CJ Carey (<code>scipy.sparse</code>),
Matt Haberland (<code>scipy.stats</code>), and Pamphile Roy (<code>scipy.stats</code>). When SciPy
releases are performed out of band from the summit, the release manager
will often have to delay incorporation of useful code changes to the
next release six months later, due to lack of availability of the
pertinent domain experts.</p>
<!--

### Lecture notes
-->
<h3 id="package-metrics">Package metrics<a class="headerlink" href="#package-metrics" title="Link to this heading">#</a></h3>
<p>We factored out a general <a href="https://github.com/scientific-python/devstats">developer statistics package</a>
from our prototype <a href="https://devstats.scientific-python.org/">developer statistics website</a>.</p>
<!--

## Post-summit implementation

We are still in the process of
-->]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="summit" label="Summit" />
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="Scientific-Python" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Developer Summit 1: Sparse Arrays]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/dev-summit-1-sparse/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/2022-czi-grant/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python awarded CZI grant to improve communications infrastructure &amp; accessibility" />
                <link href="https://blog.scientific-python.org/scientific-python/alt-text-workshop-summary/?utm_source=atom_feed" rel="related" type="text/html" title="Team up! Alt text and cross-project community" />
                <link href="https://blog.scientific-python.org/scientific-python/gsod-2022-proposal/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python GSoD 2022 Proposal" />
            
                <id>https://blog.scientific-python.org/scientific-python/dev-summit-1-sparse/</id>
            
            
            <published>2023-07-09T10:07:40-04:00</published>
            <updated>2023-07-09T10:07:40-04:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Sparse Arrays at the May Developer Summit in Seattle</blockquote><p>(May 22-26, 2023, Seattle WA) &ndash;
The first <a href="https://blog.scientific-python.org/scientific-python/dev-summit-1/">Scientific Python Developer Summit</a> provided an opportunity
for core developers from the scientific Python ecosystem to come together to:</p>
<ol>
<li>improve joint infrastructure</li>
<li>better coordinate core projects</li>
<li>work on a shared strategic plan</li>
</ol>
<p>Related notes/sites:</p>
<ul>
<li><a href="https://hackmd.io/iEtdfbxfSbGwOAJTXmqyIQ?view">Worklog</a>.</li>
<li><a href="https://scientific-python.org/summits/developer/2023/">Planning Meeting Notes and Info</a>.</li>
</ul>
<p>One of the focuses of the summit was Sparse Arrays, and specifically their implementation in SciPy.
This post attempts to recap what happened with &ldquo;sparse&rdquo; at the summit
and to give a glimpse of plans for our continuing work. The Sparse Array working group
holds <a href="https://scientific-python.org/calendars">open follow-up meetings</a>, currently scheduled every two weeks,
to continue the momentum and move this project forward.</p>
<p>At the Summit, we focused on improving the newly added Sparse Array API
in SciPy, which lets users manipulate sparse data with NumPy
semantics (before, SciPy used NumPy&rsquo;s 2D-only Matrix API, but that is <a href="https://stackoverflow.com/questions/53254738/deprecation-status-of-the-numpy-matrix-class">slated for deprecation</a>).
Our goal at the summit was to give focused energy to the effort,
bring new people on board, and connect downstream users with the development
effort. We also worked to create a working group for this project that would
last beyond the summit itself.</p>
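<p>To make the difference in semantics concrete, here is a minimal sketch, assuming SciPy 1.8 or newer (the release that introduced the sparse array classes). The variable names are illustrative only. With the legacy matrix classes, <code>*</code> means matrix multiplication, whereas the array classes follow NumPy: <code>*</code> is element-wise and <code>@</code> is the matrix product.</p>
<pre><code class="language-python">import numpy as np
from scipy import sparse

dense = np.array([[1, 2], [3, 4]])
M = sparse.csr_matrix(dense)  # legacy sparse matrix interface
A = sparse.csr_array(dense)   # sparse array interface (NumPy semantics)

# Sparse matrix: `*` is the matrix product, as with np.matrix.
matrix_star = (M * M).toarray()

# Sparse array: `*` is element-wise, `@` is the matrix product.
array_star = (A * A).toarray()
array_matmul = (A @ A).toarray()

print(matrix_star)   # same values as dense @ dense
print(array_star)    # same values as dense * dense
print(array_matmul)  # same values as dense @ dense
</code></pre>
<p>Code that relies on <code>*</code> for matrix products is therefore the kind of thing that needs attention when moving to sparse arrays; writing <code>@</code> gives the matrix product with both interfaces.</p>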
<p>The specific PRs and Issues involved in <code>scipy.sparse</code> are detailed in the
<a href="https://hackmd.io/1Q2832LDR_2Uv_-cV-wnYg">Summit 2023 scipy.sparse Report</a>,
with more detailed description appearing in the
<a href="https://hackmd.io/iEtdfbxfSbGwOAJTXmqyIQ?view">Summit Worklog</a>.
Some big picture take-aways are:</p>
<ul>
<li>Reorganized how to check for matrix/array/format info. This involved
adding a <code>format</code> attribute describing which format of sparse storage is used,
changing functions <code>issparse</code>/<code>isspmatrix</code> as well as shifting
the class hierarchy to allow easy <code>isinstance</code> checking.
The interface going forward includes:
<ul>
<li><code>issparse(A)</code>: True when a sparse array or matrix.</li>
<li><code>isinstance(A, sparray)</code>: True when a sparse array.</li>
<li><code>isspmatrix(A)</code>: True when a sparse matrix.
To check the format of a sparse array or matrix use <code>A.format == &quot;csr&quot;</code> or similar.</li>
</ul>
</li>
<li>Made decisions about how to approach the &ldquo;creation functions&rdquo; for sparse arrays.
The big-picture approach is to introduce new functions with an <code>_array</code> suffix which
construct sparse arrays. The old names will continue to create sparse matrices until
post-deprecation removal.
Some specific changes made include:
<ul>
<li>Add the creation function <code>diags_array(A)</code> (with <code>eye_array</code>, <code>random_array</code> and others planned).</li>
<li>Create a <code>sparse.linalg.matrix_power</code> function for positive integer matrix powers of a sparse array.</li>
</ul>
</li>
<li>Made progress toward 1D sparse arrays. The data structures for 1d may be quite different from 2d.
A prototype <code>coo_array</code> allowed exploration of possible n-d arrays, though that is not a short-term goal.</li>
<li>Explored feasibility and usefulness of defining <code>__array_ufunc__</code> and other <code>__array_*__</code> protocols for sparse arrays</li>
<li>Made clearer distinction between private and public methods for sparse arrays</li>
<li>Improved documentation for sparse arrays</li>
</ul>
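<p>The checks and the new creation function listed above could be used roughly as follows. This is a sketch assuming SciPy 1.11 or newer; <code>describe</code> is a hypothetical helper written for this example and is not part of SciPy.</p>
<pre><code class="language-python">import numpy as np
from scipy import sparse


def describe(A):
    """Hypothetical helper: classify A using the checks listed above."""
    if not sparse.issparse(A):
        return "dense"
    kind = "array" if isinstance(A, sparse.sparray) else "matrix"
    return f"sparse {kind} ({A.format})"


dense = np.eye(3)
arr = sparse.csr_array(dense)   # sparse array
mat = sparse.csr_matrix(dense)  # sparse matrix
diag = sparse.diags_array([1.0, 2.0, 3.0])  # values on the main diagonal

for obj in (dense, arr, mat, diag):
    print(describe(obj))
</code></pre>
<p>Comparing <code>A.format</code> instead of testing for a specific class keeps the same check working for both sparse arrays and sparse matrices.</p>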
<p>Our goal is to have a working set of sparse array construction functions
and a 1d sparse array class (focusing on <code>coo_array</code> first) in plenty of
time for intensive testing before SciPy v1.12. This will then allow us to
focus on creating migration documents and tools as well as helping downstream
libraries make the shift to sparse arrays. We hope to enable the removal of
deprecated sparse matrix interfaces in favor of the array interface. For this
to happen we will need most downstream users to shift to the sparse array API.
We intend to help them do that.</p>
<p>Our work continues with a community call every <a href="https://scientific-python.org/calendars">two weeks on Fridays.</a>
Near term work is to:</p>
<ul>
<li>Continue improving sparse creation functions: diags, eye, random and others.</li>
<li>Deprecate some matrix-specific functionality</li>
<li>Make general performance improvements</li>
<li>Adapt scikit-learn to support sparse arrays (to be discussed with scikit-learn&rsquo;s maintainers)</li>
</ul>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="summit" label="Summit" />
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="Scientific-Python" />
                             
                                <category scheme="taxonomy:Tags" term="scipy.sparse" label="scipy.sparse" />
                             
                                <category scheme="taxonomy:Tags" term="sparse" label="Sparse" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Outreachy Part II: Internship Guide ]]></title>
            <link href="https://blog.scientific-python.org/networkx/outreachy2023/internship/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/outreachy2023/contribution-phase/?utm_source=atom_feed" rel="related" type="text/html" title="Outreachy Part I: My experience as a first-time contributor in Open-Source" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/graph-iso-vf2pp/?utm_source=atom_feed" rel="related" type="text/html" title="The VF2&#43;&#43; algorithm" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/iso-feasibility-candidates/?utm_source=atom_feed" rel="related" type="text/html" title="ISO Feasibility &amp; Candidates" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/node-ordering-ti-updating/?utm_source=atom_feed" rel="related" type="text/html" title="Updates on VF2&#43;&#43;" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/gsoc-2022/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2022: NetworkX VF2&#43;&#43; Implementation" />
            
                <id>https://blog.scientific-python.org/networkx/outreachy2023/internship/</id>
            
            
            <published>2023-02-26T22:16:12-03:00</published>
            <updated>2023-02-26T22:16:12-03:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>My experience during the Outreachy internship at NetworkX.</blockquote><p>This is the second part of a blog series where I talk about my experience during my Outreachy internship at NetworkX. If you haven’t read the first part you can find it <a href="../contribution-phase">here</a>.</p>
<p>As you advance through the contribution phase you may wonder how your internship is gonna be in case you get selected. Here is my experience as a <strong>NetworkX</strong> intern and some tips that could help you through the internship.</p>
<h2 id="my-experience-as-an-intern-at-networkx">My experience as an intern at NetworkX<a class="headerlink" href="#my-experience-as-an-intern-at-networkx" title="Link to this heading">#</a></h2>
<p>I started my internship in December. I was almost done with my assignments at school and was heading into finals season. It&rsquo;s a wild time to start an internship but the beginning is usually not intense. The first week is about meeting the NetworkX team and deciding what you want to do during your internship. As part of your internship, you’re encouraged to research and contribute in any way you want. That means that you don’t necessarily need to work on the proposed project. I started writing notebooks because I felt confident doing that but you can explore other tasks too. As part of writing notebooks, I spent a lot of time reading papers and doing research. That was fun and let me develop some interesting skills. Also, this gave me a better view of what I want to do in the future as a computer scientist. As my notebooks were about graph isomorphism I researched new isomorphism algorithms and evaluated the possibility of implementing them in NetworkX. While writing the notebooks I read the documentation a lot so I fixed and added some things there too. Definitely contributing in any way you can is the key. For me working at NetworkX was not about fulfilling specific tasks but getting a broad vision of the project and thinking about ways I can make it grow. This approach gave me a good insight into how projects like this are managed and maintained, which I think is the most important thing I learned during the internship.</p>
<h3 id="here-are-some-specific-tips-that-can-help-you-during-your-internship">Here are some specific tips that can help you during your internship:<a class="headerlink" href="#here-are-some-specific-tips-that-can-help-you-during-your-internship" title="Link to this heading">#</a></h3>
<ul>
<li>
<p>When you start a new notebook do an <em>initial draft</em> with the general structure of your notebook. That will help you to aim your research and organize your ideas better.</p>
</li>
<li>
<p><em>Always do some research first</em> even if you think you know all the material. There’s always some idea, intuition, or interesting application that you don’t know about.</p>
</li>
<li>
<p><em>Take your time to learn things that can be helpful for your internship.</em> Outreachy internships aim to help you gain skills so you can continue your tech career. Sometimes it can feel like you spend more time learning than doing and that’s ok! This is above all a learning experience!</p>
</li>
<li>
<p><em>Out of ideas for notebooks?</em> Reading what’s already on the <strong>nx-guides</strong> can be a source of inspiration. Also, you can look for cool real-world graph applications in books and on the internet.</p>
</li>
<li>
<p><em>The repository is an incredible source of information about the project.</em> If you are struggling with something, you can look at all related issues and PRs. There, you will be able to find discussions and explanations that can give you a better sense of why things are a certain way.</p>
</li>
<li>
<p><em>Learn about the project structure.</em> A Python package is not just a lot of Python code put together; many other packages are used to make documentation and testing work. Learning how everything works underneath will usually make your work easier, and it is also a great skill to gain. For me, understanding how a project like this comes to life was extremely interesting because it was something I had never paid attention to before.</p>
</li>
<li>
<p><em>You will understand things as you go.</em> So don’t overstress if you don’t understand everything. With time, some details will click. But it’s also important not to immediately give up when you don’t get something. The key is to keep your confidence even when you are feeling a bit lost.</p>
</li>
<li>
<p><em>Organize your work and learn how to work remotely.</em> If this is your first time working remotely, it’s important that you find your own way to organize your time. There are many strategies that can help you figure out how to organize your work throughout the day. Try different techniques until you find whatever suits you best. If you are a college student you may want to use the same system that works for you at school, but working is different, so you may need to explore other options. For me, it was useful to have two lists: a <em>to-do list</em>, because it was motivating to track my progress, and an <em>ideas list</em> with things I wanted to do, usually smaller contributions that I could do when I was tired of the bigger tasks. I also tried the Pomodoro technique, but for me it was more effective to work on a task until I finished it and then take a break if I wanted to.</p>
</li>
<li>
<p>As part of your Outreachy internship, you will need to write blog posts, turn in feedback, and attend informal chats. Be aware of that and keep track of all the deadlines so you and your mentor don’t miss any of them.</p>
</li>
<li>
<p><em>Make a cheat sheet with all the useful commands and links.</em> That way you don’t have to go through the process of finding that information again every time you need it. If there is a series of commands that you use a lot, try writing a bash script. Here is a repository with my cheat sheet:
<a href="https://github.com/paulitapb/Outreachy2023">https://github.com/paulitapb/Outreachy2023</a></p>
</li>
</ul>
<p>Overall, my experience as a <strong>NetworkX</strong> intern was amazing! Not only did I gain many different skills, but I am also now more confident in my ability to work in tech. I discovered Open-Source communities and realized that I am able to contribute in valuable ways. Furthermore, I now have a better sense of what I want my future in tech to look like and what my options are.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="outreachy" label="outreachy" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="first-time-contributor" label="first-time-contributor" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Outreachy Part I: My experience as a first-time contributor in Open-Source]]></title>
            <link href="https://blog.scientific-python.org/networkx/outreachy2023/contribution-phase/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/vf2pp/graph-iso-vf2pp/?utm_source=atom_feed" rel="related" type="text/html" title="The VF2&#43;&#43; algorithm" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/iso-feasibility-candidates/?utm_source=atom_feed" rel="related" type="text/html" title="ISO Feasibility &amp; Candidates" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/node-ordering-ti-updating/?utm_source=atom_feed" rel="related" type="text/html" title="Updates on VF2&#43;&#43;" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/gsoc-2022/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2022: NetworkX VF2&#43;&#43; Implementation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/?utm_source=atom_feed" rel="related" type="text/html" title="My Summer of Code 2021" />
            
                <id>https://blog.scientific-python.org/networkx/outreachy2023/contribution-phase/</id>
            
            
            <published>2023-02-26T22:15:58-03:00</published>
            <updated>2023-02-26T22:15:58-03:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Some tips to help you during the contribution phase.</blockquote><h2 id="whats-outreachy">What&rsquo;s Outreachy?<a class="headerlink" href="#whats-outreachy" title="Link to this heading">#</a></h2>
<p><strong>Outreachy</strong> is a paid remote internship program for underrepresented groups in tech. All internships are in Open Source and Open Science. To be selected as an intern, you first need to:</p>
<ol>
<li>
<p><em>Fill out an initial application:</em> You’ll need to answer some questions about how you are affected by systemic bias, and how being underrepresented in your local tech industry has impacted your development. Maybe you don’t know how to answer some of these questions, especially if you are not yet looking for a job, but it’s important to do some research first. If you can’t find any “official” information, tech communities often do surveys and publish the results. Reaching out to local tech communities that work with underrepresented groups is a great way to find mentors and like-minded people who can support you through your tech career. Take your time to reflect on these questions before writing your answers. Also, if you are a college student, you need to submit your school calendar. Read all the time requirements carefully and reach out to the Outreachy coordinators if you think there are any details about your school calendar that you need to discuss.</p>
</li>
<li>
<p><em>Take part in the contribution phase:</em> Once your initial application is approved, you will be able to see all the projects. Finding the right project for you is important and also very challenging. You may feel tempted to contribute to multiple projects, but unless you have a lot of free time I don’t feel like it&rsquo;s the best option. Smaller and more constant contributions are the way to go. The contribution phase is not about introducing huge contributions but rather an opportunity to interact with the community, learn about the project, and gain new skills. Finding the right project for you is definitely key, and it depends a lot on your current skills and on how much time you are willing to put into it.</p>
</li>
</ol>
<p>For more information about Outreachy go to: <a href="https://www.outreachy.org/">https://www.outreachy.org/</a></p>
<h2 id="my-experience-at-networkx-during-the-contribution-phase">My experience at NetworkX during the contribution phase<a class="headerlink" href="#my-experience-at-networkx-during-the-contribution-phase" title="Link to this heading">#</a></h2>
<p>If this is your first time contributing to an open-source project, you may feel overwhelmed. Understanding an almost 20-year-old project like NetworkX can feel like it&rsquo;s going to take forever, but don&rsquo;t worry: I have some tips that may come in handy during the contribution phase.</p>
<ul>
<li>
<p><em>Learn about the project:</em> Understanding the project is a process that may take some time. Don’t rush it! You don’t need to understand the entire codebase in a day. The most important things that you need to know will only take you a few hours to go through: the project mission and values,
community rules, and contribution process. For NetworkX, all you need to know is here: <a href="https://networkx.org/documentation/stable/developer/index.html">https://networkx.org/documentation/stable/developer/index.html</a></p>
</li>
<li>
<p><em>Start contributing right away:</em> You don&rsquo;t need to understand every part of the project to make valuable contributions. Start small and use that experience to level up your contributions. At the beginning of the contribution phase, some good first issues are added. Work on them first and then start opening your own issues. (Don’t forget to link your PR with the issues so they can be automagically closed.) Also, record your contributions on the Outreachy website as you submit them. I only recorded all my contributions at the end, and that took me a lot of time. If you struggle to find issues or ideas for contributions, here are my contributions at NetworkX: <a href="https://github.com/networkx/networkx/pulls?q=is%3Apr&#43;paulitapb">https://github.com/networkx/networkx/pulls?q=is%3Apr+paulitapb</a></p>
</li>
<li>
<p><em>It’s not just about writing code.</em> What&rsquo;s great about big projects is that you can explore many different things. Making contributions to different parts of the project shows that you understand the project on a general level and can be a valuable member of the community.</p>
</li>
<li>
<p><em>Don’t be afraid of the community!</em> As a beginner, you may worry about the technical side of the project, but understanding the community review process is key. Usually, communities want to grow, and that means teaching new contributors about the project. It’s fine if your contributions are not perfect or if you need to ask questions. That’s the beauty of Open-Source communities! Also, don’t be discouraged if a contribution is not merged into the project. Maybe it was already suggested, tested, or deprecated. Take that as a learning experience; it can even give you some ideas for future contributions.</p>
</li>
</ul>
<p>I hope this information helps you to start your Open-Source journey! The NetworkX team is waiting for your great contributions!</p>
<p>If you are interested in my experience during the internship you can find the second part of this blog <a href="../internship">here</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="outreachy" label="outreachy" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="first-time-contributor" label="first-time-contributor" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[NumPy's first Developer in Residence: Sayed Adel]]></title>
            <link href="https://blog.scientific-python.org/numpy/fellowship-program/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/numpy/mukulikapahari/?utm_source=atom_feed" rel="related" type="text/html" title="NumPy Contributor Spotlight: Mukulika Pahari" />
            
                <id>https://blog.scientific-python.org/numpy/fellowship-program/</id>
            
            
            <published>2022-12-01T00:00:00+00:00</published>
            <updated>2022-12-01T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Introducing the NumPy Fellowship Program and our first
Developer in Residence, Sayed Adel, who will be working on performance
optimization.</blockquote><p>The NumPy team is excited to announce the launch of the NumPy Fellowship
Program and the appointment of Sayed Adel
(<a href="https://github.com/seiko2plus">@seiko2plus</a>) as the first NumPy
Developer in Residence. This is a significant milestone in the history
of the project: for the first time, NumPy is in a position to use its
project funds to pay for a full year of maintainer time. We believe that
this will be an impactful program that will contribute to NumPy&rsquo;s
long-term sustainability as a community-driven open source project.</p>
<p>Sayed has been making major contributions to NumPy
since the start of 2020, in particular around computational performance.
He is the main author of the NumPy SIMD architecture (<a href="https://numpy.org/neps/nep-0038-SIMD-optimizations.html"><em>NEP
38</em></a>,
<a href="https://numpy.org/devdocs/reference/simd/index.html"><em>docs</em></a>),
generously shared his knowledge of SIMD instructions with the core
developer team, and helped integrate the work of various volunteer and
industry contributors in this area. As a result, we&rsquo;ve been able to
expand support to multiple CPU architectures, integrating contributions
from IBM, Intel, Apple, and others, none of which would have been
possible without Sayed. Furthermore, when NumPy tentatively started
using C++ in 2021, Sayed was one of the proponents of the move and
helped with its implementation.</p>
<p>The NumPy Steering Council sees Sayed&rsquo;s appointment to this role as both
recognition of his past outstanding contributions and an
opportunity to continue improving NumPy&rsquo;s computational performance. In
the next 12 months, we&rsquo;d like to see Sayed focus on the following:</p>
<ul>
<li>SIMD code maintenance,</li>
<li>code review of SIMD contributions from others,</li>
<li>performance-related features,</li>
<li>sharing SIMD and C++ expertise with the team and growing a NumPy
sub-team around it,</li>
<li>SIMD build system migration to Meson,</li>
<li>and wherever else Sayed&rsquo;s interests take him.</li>
</ul>
<blockquote>
<p><em>&ldquo;I&rsquo;m both happy and nervous: this is a great opportunity, but also a
great responsibility,&rdquo; said Sayed in response to his appointment.</em></p>
</blockquote>
<p>The funds for the NumPy Fellowship Program come from a partnership with
Tidelift and from individual donations. We sincerely thank both
Tidelift and everyone who donated to the project&mdash;without you, this
program would not be possible! We also acknowledge the CPython
Developer-in-Residence and the Django Fellowship programs, which
served as inspiration for this program.</p>
<p>Sayed officially starts as the NumPy Developer in Residence today, 1
December 2022. Already, we are thinking about opportunities beyond
this first year: we imagine &ldquo;in residence&rdquo; roles that focus on
developing, improving, and maintaining other parts of the NumPy
project (e.g., documentation, website, translations, contributor
experience, etc.). We look forward to this exciting new chapter of the
NumPy contributor community and will keep you posted on our progress.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="numpy" label="numpy" />
                             
                                <category scheme="taxonomy:Tags" term="developer-in-residence" label="developer-in-residence" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Scientific Python awarded CZI grant to improve communications infrastructure & accessibility]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/2022-czi-grant/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/alt-text-workshop-summary/?utm_source=atom_feed" rel="related" type="text/html" title="Team up! Alt text and cross-project community" />
                <link href="https://blog.scientific-python.org/scientific-python/gsod-2022-proposal/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python GSoD 2022 Proposal" />
            
                <id>https://blog.scientific-python.org/scientific-python/2022-czi-grant/</id>
            
            
            <published>2022-11-08T00:00:00+00:00</published>
            <updated>2022-11-08T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>New Scientific Python grant to focus on common web themes, adopting and promoting access-centered practices, translations, and interactivity of documentation.</blockquote><p>We are delighted to announce a <a href="https://scientific-python.org/doc/scientific-python-community-and-communications-infrastructure-2022.pdf">two-year grant</a> from the Chan Zuckerberg Initiative (CZI) in support of the <a href="https://scientific-python.org/">Scientific Python project</a>.
This grant will support work on common web themes, joint infrastructure and practices, accessibility, and interactivity of core library documentation.
We are particularly excited that, through this work, we may expand global participation of scientific communities in using and contributing to Python tools.
It is, to the best of our knowledge, the first time that a scientific open source community has received significant support for accessibility and internationalization efforts.</p>
<h2 id="czi--scientific-python">CZI &amp; Scientific Python<a class="headerlink" href="#czi--scientific-python" title="Link to this heading">#</a></h2>
<p>CZI continues to support many impactful and innovative projects in the scientific Python community through its <a href="https://chanzuckerberg.com/eoss/">Essential Open Source Software for Science (EOSS) program</a>.
Today, they announced the <a href="https://czi.co/3fFaQMZ">5th funding cycle of that program</a>.
This grant to Scientific Python, while outside the EOSS program, complements it well.
Among other things, the Scientific Python project aims to support, document, and make accessible common practices &amp; infrastructure.
Such infrastructure will benefit not only the projects at the core of the ecosystem, but also those well beyond it.</p>
<p>&ldquo;We are thrilled to partner with the Scientific Python project, an effort to harmonize a critical set of open source research software projects widely used across all the areas of biomedical research that CZI supports.
The distributed nature of the scientific open source ecosystem will greatly benefit from their efforts to standardize best practices and focus on ecosystem-level initiatives,&rdquo; said Dario Taraborelli, Science Program Officer at the Chan Zuckerberg Initiative.</p>
<h2 id="what-will-we-be-working-on">What will we be working on?<a class="headerlink" href="#what-will-we-be-working-on" title="Link to this heading">#</a></h2>
<p>This grant will support core scientific Python projects by doing release management, writing documentation, building and supporting joint infrastructure, and by measuring and publishing metrics on community involvement and project health.
In addition, here are some specific deliverables:</p>
<h3 id="common-web-themes">Common web themes<a class="headerlink" href="#common-web-themes" title="Link to this heading">#</a></h3>
<p>There are two web themes commonly deployed on community sites: the <a href="https://theme.scientific-python.org/">Scientific Python Hugo Theme</a>—for project websites, and the <a href="https://github.com/pydata/pydata-sphinx-theme">pydata-sphinx-theme</a>—for documentation.
We will improve these themes, effectively upgrading several project websites simultaneously.
By fostering theme adoption, we will help the ecosystem present a more unified front to users, while reducing the web maintenance burden on developers.
Other theme work includes better responsive layouts (important for use on mobile and tablets), blogging facilities, increased usability, and accessibility compliance.</p>
<h3 id="adopting-and-promoting-access-centered-practices">Adopting and promoting access-centered practices<a class="headerlink" href="#adopting-and-promoting-access-centered-practices" title="Link to this heading">#</a></h3>
<p>Better accessibility of online resources increases usability for everyone, while fostering community participation and inclusion.
The Scientific Python Hugo Theme and pydata-sphinx-theme are natural conduits for introducing accessibility standards and best practices to the broader ecosystem.
We will develop access-centered best practices and contribution guidelines, organize online workshops, and work with other maintainers to improve their projects&rsquo; documentation and homepage accessibility.
A set of access-centered practices will be written up as a Scientific Python Ecosystem Coordination document (or SPEC, for short), to provide guidance to those projects we cannot support directly.</p>
<p>A key aim of this work is to have web and documentation themes, as well as core scientific Python project websites, meet the applicable <a href="https://www.w3.org/WAI/standards-guidelines/wcag/">Web Content Accessibility Guidelines</a>.</p>
<h3 id="interactive-documentation--translations">Interactive documentation &amp; translations<a class="headerlink" href="#interactive-documentation--translations" title="Link to this heading">#</a></h3>
<p>Documentation is key to a project’s success, and good documentation is approachable to end users with a wide range of backgrounds and skills.
While most scientific Python projects value documentation and work hard at it, there is still much room for improvement.</p>
<p>One such improvement is translation and localization.
Development takes place in English, as reflected by project websites and documentation.
While many contributors are comfortable with English as a first, second, or even third language, the language barrier especially excludes users who are very young, are new to the community, have learning disabilities, or are from the Global South, all of whom are potential future contributors and leaders in the scientific Python community! We will therefore translate key pages of core project websites, and provide translation infrastructure for the web themes.</p>
<p>A second area of improvement is interactivity.
Interactive project documentation has the potential to engage less experienced users, making it easier to experiment with and teach ecosystem libraries.
We will work on documentation interactivity by providing seamless, in-browser execution of code via JupyterLite, a WebAssembly Jupyter distribution.</p>
<h2 id="who-will-be-involved">Who will be involved<a class="headerlink" href="#who-will-be-involved" title="Link to this heading">#</a></h2>
<p>The four PIs for this grant are Stéfan van der Walt (UC Berkeley; NumPy, scikit-image, SciPy; Scientific Python Hugo Theme), Tania Allard (Quansight Labs; JupyterHub, NumFOCUS DISC, Jupyter accessibility), Jarrod Millman (Scientific Python; NetworkX; scikit-image; Scientific Python Hugo Theme, pydata-sphinx-theme), and Ralf Gommers (Quansight Labs; SciPy, NumPy, <a href="https://data-apis.org/">data-apis.org</a>).
Melissa Weber Mendonça (Quansight Labs; NumPy, SciPy) and Chris Holdgraf (2i2c; Project Jupyter, MyST, pydata-sphinx-theme) will participate as key personnel, providing expertise in documentation and Sphinx themes in particular.
Jarrod and Stéfan are co-creators of the Scientific Python project, and everyone on the grant has been involved in the larger scientific Python ecosystem and community for many years.</p>
<h2 id="next-steps">Next steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>Today was announcement day, but the real work starts in December.
Some topics we’ll be able to dive straight into; others will require hiring—and we’re excited to involve new web designers, accessibility experts, and engineers in this journey.
Stay tuned—there’s a lot more to come!</p>
<p>To connect with the team, and to follow job posts, please join us at <a href="https://discuss.scientific-python.org">https://discuss.scientific-python.org</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="Scientific-Python" />
                             
                                <category scheme="taxonomy:Tags" term="czi" label="CZI" />
                             
                                <category scheme="taxonomy:Tags" term="funding" label="funding" />
                             
                                <category scheme="taxonomy:Tags" term="accessibility" label="accessibility" />
                             
                                <category scheme="taxonomy:Tags" term="common-infrastructure" label="common-infrastructure" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Scientific Python: Community developed, community owned]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/scientific-python-project/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/scientific-python/scientific-python-project/</id>
            
            
            <published>2022-08-14T00:00:00+00:00</published>
            <updated>2022-08-14T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Introducing the Scientific Python project: why it is necessary,
what it hopes to achieve, and the various mechanisms by which
it will operate.</blockquote><p>The Scientific Python project is an initiative to better coordinate and support the scientific Python ecosystem of libraries and to grow the surrounding community.
It aims to improve communication between ecosystem projects, to better plan for their joint future, and to make that future a reality.</p>
<h2 id="why-is-this-important">Why is this important?<a class="headerlink" href="#why-is-this-important" title="Link to this heading">#</a></h2>
<p>Initially, the Scientific Python developer community was small, so it was easy to discuss important ecosystem-wide decisions at events like the annual SciPy conference.
But with the rapid growth of the community and the number of libraries, as well as geographical diversification, this was no longer possible.
Scientific Python is a loose federation of somewhat independent community projects, and while this configuration is robust, it also tends to favor reinvention of the wheel and decisions that focus on project needs, instead of being strategically aligned with the entire ecosystem.
Ultimately, the different projects depend on one another, so it makes sense to have close coordination between them.</p>
<h2 id="how-are-we-doing-this">How are we doing this?<a class="headerlink" href="#how-are-we-doing-this" title="Link to this heading">#</a></h2>
<h3 id="specs">SPECs<a class="headerlink" href="#specs" title="Link to this heading">#</a></h3>
<p>The <a href="https://scientific-python.org/specs/">SPECs</a>, or Scientific Python Ecosystem Coordination documents, provide a mechanism through which the community can establish cross-project policies.
They function similarly to PEPs, NEPs, SKIPs, or any of the other enhancement proposals—except that they are relevant to multiple projects in the ecosystem.</p>
<p>These documents will be recommendations written up by the community, and their authority will derive from endorsement by popular libraries.
Some of them are already in progress and many are on the way!</p>
<p>SPECs are short and concise, and are endorsed by core projects in the ecosystem once they are adopted.</p>
<h3 id="shared-infrastructure">Shared infrastructure<a class="headerlink" href="#shared-infrastructure" title="Link to this heading">#</a></h3>
<p>We provide common engineering infrastructure to help maintainers.
Some tools we currently work on include
<a href="https://github.com/scientific-python/scientific-python-hugo-theme">a Hugo web theme</a> for project websites,
a self-hosted privacy-friendly web analytics platform,
a <a href="https://discuss.scientific-python.org">shared discussion forum</a>,
the <a href="https://github.com/scientific-python/devpy">devpy</a> developer CLI,
<a href="https://blog.scientific-python.org">this blog</a>,
and a <a href="https://devstats.scientific-python.org">project development statistics dashboard</a>.</p>
<h3 id="developer-events">Developer events<a class="headerlink" href="#developer-events" title="Link to this heading">#</a></h3>
<p>We organize virtual &ldquo;domain summits&rdquo; where developers can meet to discuss relevant cross-project topics.
These will be recorded and shared on our <a href="https://www.youtube.com/@scientific-python">YouTube channel</a>.
Thus far, we&rsquo;ve organized four such events on: API dispatching, alt-text for improved accessibility, domain stacks, and sparse arrays.</p>
<p>We also organize an annual in-person developer summit: a week of intense collaboration, with work scheduled ahead of time, during which we address as many cross-project concerns as we can.</p>
<h3 id="documentation">Documentation<a class="headerlink" href="#documentation" title="Link to this heading">#</a></h3>
<p>We work on documentation for new contributors and maintainers.
Our YouTube channel hosts onboarding videos that show how to get started contributing to a scientific Python project, as well as developer interviews.
Over the next year, we also plan to unify several disparate community resources into a maintainer guide.</p>
<h3 id="community-outreach">Community outreach<a class="headerlink" href="#community-outreach" title="Link to this heading">#</a></h3>
<p>We love to reach out to and connect with our growing community of users and developers!
Platforms we are present on include
<a href="https://twitter.com/scientific_py">Twitter</a>, <a href="https://www.facebook.com/scientific.python">Facebook</a>, <a href="https://www.instagram.com/scientific.python">Instagram</a>, and <a href="https://www.tiktok.com/@scientific.python">TikTok</a>.</p>
<h2 id="who-is-behind-this">Who is behind this?<a class="headerlink" href="#who-is-behind-this" title="Link to this heading">#</a></h2>
<p>The short answer: anyone who wants to be.
The long answer: we are a community of volunteers from different scientific Python ecosystem packages.
There are several teams working on the different aspects of the project, such as our <a href="https://scientific-python.org/about/">community managers &amp; leaders</a>, the <a href="https://scientific-python.org/about/">SPEC steering committee</a>, and <a href="https://blog.scientific-python.org/about/">blog content reviewers and editors</a>.
The project is led by Jarrod Millman and Stéfan van der Walt, both long-term community members who care deeply about the success of the ecosystem and its developers.</p>
<p>Currently there are eight <a href="https://scientific-python.org/specs/core-projects/">projects</a> that endorse the SPECs: IPython, Matplotlib, NetworkX, NumPy, pandas, scikit-image, scikit-learn, and SciPy.
However, contributors from many more projects participate on our discussion forum, write blogs, and contribute to the community in other ways.
We welcome everyone to become part of the community and to contribute however they can!</p>
<p><img src="/scientific-python/scientific-python-project/community.png" alt="Picture of the Scientific Python Community"></p>
<h2 id="what-am-i-doing-here">What am I doing here?<a class="headerlink" href="#what-am-i-doing-here" title="Link to this heading">#</a></h2>
<p>For the past couple of months I have been a community manager for the project.
This includes recording documentation videos for the website, recording developer interviews for our YouTube channel, presenting talks at conferences, hosting developer events, creating content for our Instagram, Facebook, TikTok, and Twitter channels, and many other things that I never thought I would do.</p>
<p>Why? Because I believe in this.
Jarrod and Stéfan reached out to me last year, inviting me to be part of this amazing idea and I was honored and very grateful.
I wasn&rsquo;t sure that I could do it, but now I find myself here and I know that this is the right place for me.
Not because I have a lot of experience in these things (I had actually never even used TikTok before joining the project), but because I care.
I have learned the importance of building community and while the Scientific Python tools are amazing, what makes the difference is the community around them and I&rsquo;m grateful to be able to help make this community great.</p>
<p>I have learned a lot from the Scientific Python ecosystem by being a community manager, I have met a lot of wonderful people, and I have seen what people can do with the tools that the ecosystem offers.
So, my take: The Scientific Python project is a great bet.
Open source Scientific Python is about much more than coding; it is about collaborating, teaching, and communicating.
So unifying the community and promoting the integration of the projects sounds like the perfect path to follow in order to get the most out of the ecosystem.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="community" label="community" />
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="scientific-python" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[The VF2++ algorithm]]></title>
            <link href="https://blog.scientific-python.org/networkx/vf2pp/graph-iso-vf2pp/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/vf2pp/iso-feasibility-candidates/?utm_source=atom_feed" rel="related" type="text/html" title="ISO Feasibility &amp; Candidates" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/node-ordering-ti-updating/?utm_source=atom_feed" rel="related" type="text/html" title="Updates on VF2&#43;&#43;" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/gsoc-2022/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2022: NetworkX VF2&#43;&#43; Implementation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/?utm_source=atom_feed" rel="related" type="text/html" title="My Summer of Code 2021" />
                <link href="https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/?utm_source=atom_feed" rel="related" type="text/html" title="Completing the Asadpour Algorithm" />
            
                <id>https://blog.scientific-python.org/networkx/vf2pp/graph-iso-vf2pp/</id>
            
            
            <published>2022-08-10T00:00:00+00:00</published>
            <updated>2022-08-10T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Implementing the VF2++ algorithm for the Graph Isomorphism.</blockquote><p>The final post discussing the <strong>VF2++ helpers</strong> can be found <a href="../iso-feasibility-candidates">here</a>.
Now that we&rsquo;ve figured out how to solve all the sub-problems that <strong>VF2++</strong> consists of, we are ready to combine our
implemented functionalities to create the final solver for the <strong>Graph Isomorphism</strong> problem.</p>
<h2 id="introduction">Introduction<a class="headerlink" href="#introduction" title="Link to this heading">#</a></h2>
<p>We should quickly review the individual functionalities used in the VF2++ algorithm:</p>
<ul>
<li><strong>Node ordering</strong>, which finds the optimal order in which to access the nodes, so that those more likely to match are placed first. This reduces the chance that infeasible branches are explored first.</li>
<li><strong>Candidate selection</strong>, which, given a node $u$ from $G_1$, returns the candidate nodes $v$ from $G_2$.</li>
<li><strong>Feasibility rules</strong>, which introduce easy-to-check cutting and consistency conditions. If a candidate pair of nodes $u$ from $G_1$ and $v$ from $G_2$ satisfies them, the mapping is extended.</li>
<li><strong>$T_i$ updating</strong>, which updates the $T_i$ and $\tilde{T}_i$, $i=1,2$ parameters when a new pair is added to the mapping, and restores them when a pair is popped from it.</li>
</ul>
<p>We are going to use all these functionalities to form our <strong>Isomorphism solver</strong>.</p>
<h2 id="vf2">VF2++<a class="headerlink" href="#vf2" title="Link to this heading">#</a></h2>
<p>First of all, let&rsquo;s describe the algorithm in simple terms, before presenting the pseudocode. The algorithm will look something like this:</p>
<ol>
<li>Check if all <strong>preconditions</strong> are satisfied before calling the actual solver. For example, there&rsquo;s no point examining two graphs with different numbers of nodes for isomorphism.</li>
<li>Initialize all the necessary <strong>parameters</strong> ($T_i$, $\tilde{T}_i$, $i=1,2$) and maybe cache some information that is going to be used later.</li>
<li>Take the next unexamined node $u$ from the ordering.</li>
<li>Find its candidates and check if there&rsquo;s a candidate $v$ such that the pair $u-v$ satisfies the <strong>feasibility rules</strong>.</li>
<li>If there is one, extend the mapping and <strong>go to 3</strong>.</li>
<li>If not, pop the last pair $\hat{u}-\hat{v}$ from the mapping and try a different candidate $\hat{v}$ from the remaining candidates of $\hat{u}$.</li>
<li>The two graphs are <strong>isomorphic</strong> if the number of <strong>mapped nodes</strong> equals the number of nodes of the two graphs.</li>
<li>The two graphs are <strong>not isomorphic</strong> if there are no remaining candidates for the first node of the ordering (root).</li>
</ol>
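<p>As a rough illustration of the precondition in step 1, here is a minimal sketch of a degree-sequence check. It assumes graphs are given as plain adjacency dicts (a stand-in for NetworkX graphs), and <code>degree_sequences_match</code> is a hypothetical helper name, not part of NetworkX. Passing this test is necessary for isomorphism but does not prove it.</p>

```python
def degree_sequences_match(adj1, adj2):
    """Cheap necessary test: same node count and same sorted degree sequence.

    adj1 and adj2 are hypothetical adjacency dicts mapping each node
    to the set of its neighbours.
    """
    if len(adj1) != len(adj2):
        return False
    degrees1 = sorted(len(nbrs) for nbrs in adj1.values())
    degrees2 = sorted(len(nbrs) for nbrs in adj2.values())
    return degrees1 == degrees2


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
other_path = {1: {2}, 2: {1, 3}, 3: {2}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}

print(degree_sequences_match(path, other_path))  # True
print(degree_sequences_match(path, triangle))    # False
```

<p>Only graph pairs that survive a check like this need to be handed to the much more expensive search.</p>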
<p>The official code for <strong>VF2++</strong> is presented below.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># Check if there&#39;s a graph with no nodes in it</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="n">G1</span><span class="o">.</span><span class="n">number_of_nodes</span><span class="p">()</span> <span class="o">==</span> <span class="mi">0</span> <span class="ow">or</span> <span class="n">G2</span><span class="o">.</span><span class="n">number_of_nodes</span><span class="p">()</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">False</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Check that both graphs have the same number of nodes and degree sequence</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="ow">not</span> <span class="n">nx</span><span class="o">.</span><span class="n">faster_could_be_isomorphic</span><span class="p">(</span><span class="n">G1</span><span class="p">,</span> <span class="n">G2</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">False</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Initialize parameters (Ti/Ti_tilde, i=1,2) and cache necessary information about degree and labels</span>
</span></span><span class="line"><span class="cl"><span class="n">graph_params</span><span class="p">,</span> <span class="n">state_params</span> <span class="o">=</span> <span class="n">_initialize_parameters</span><span class="p">(</span><span class="n">G1</span><span class="p">,</span> <span class="n">G2</span><span class="p">,</span> <span class="n">node_labels</span><span class="p">,</span> <span class="n">default_label</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Check if G1 and G2 have the same labels, and that number of nodes per label is equal between the two graphs</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="ow">not</span> <span class="n">_precheck_label_properties</span><span class="p">(</span><span class="n">graph_params</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">False</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Calculate the optimal node ordering</span>
</span></span><span class="line"><span class="cl"><span class="n">node_order</span> <span class="o">=</span> <span class="n">_matching_order</span><span class="p">(</span><span class="n">graph_params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Initialize the stack to contain node-candidates pairs</span>
</span></span><span class="line"><span class="cl"><span class="n">stack</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="n">candidates</span> <span class="o">=</span> <span class="nb">iter</span><span class="p">(</span><span class="n">_find_candidates</span><span class="p">(</span><span class="n">node_order</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">graph_params</span><span class="p">,</span> <span class="n">state_params</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">stack</span><span class="o">.</span><span class="n">append</span><span class="p">((</span><span class="n">node_order</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">candidates</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">mapping</span> <span class="o">=</span> <span class="n">state_params</span><span class="o">.</span><span class="n">mapping</span>
</span></span><span class="line"><span class="cl"><span class="n">reverse_mapping</span> <span class="o">=</span> <span class="n">state_params</span><span class="o">.</span><span class="n">reverse_mapping</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Index of the node from the order, currently being examined</span>
</span></span><span class="line"><span class="cl"><span class="n">matching_node</span> <span class="o">=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">while</span> <span class="n">stack</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">current_node</span><span class="p">,</span> <span class="n">candidate_nodes</span> <span class="o">=</span> <span class="n">stack</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">candidate</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">candidate_nodes</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">except</span> <span class="ne">StopIteration</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># If no remaining candidates, return to a previous state, and follow another branch</span>
</span></span><span class="line"><span class="cl">        <span class="n">stack</span><span class="o">.</span><span class="n">pop</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="n">matching_node</span> <span class="o">-=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">stack</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="c1"># Pop the previously added u-v pair, and look for a different candidate _v for u</span>
</span></span><span class="line"><span class="cl">            <span class="n">popped_node1</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">stack</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">            <span class="n">popped_node2</span> <span class="o">=</span> <span class="n">mapping</span><span class="p">[</span><span class="n">popped_node1</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">            <span class="n">mapping</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="n">popped_node1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="n">reverse_mapping</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="n">popped_node2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="n">_restore_Tinout</span><span class="p">(</span><span class="n">popped_node1</span><span class="p">,</span> <span class="n">popped_node2</span><span class="p">,</span> <span class="n">graph_params</span><span class="p">,</span> <span class="n">state_params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">continue</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">_feasibility</span><span class="p">(</span><span class="n">current_node</span><span class="p">,</span> <span class="n">candidate</span><span class="p">,</span> <span class="n">graph_params</span><span class="p">,</span> <span class="n">state_params</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># Terminate if mapping is extended to its full</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">mapping</span><span class="p">)</span> <span class="o">==</span> <span class="n">G2</span><span class="o">.</span><span class="n">number_of_nodes</span><span class="p">()</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">cp_mapping</span> <span class="o">=</span> <span class="n">mapping</span><span class="o">.</span><span class="n">copy</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">            <span class="n">cp_mapping</span><span class="p">[</span><span class="n">current_node</span><span class="p">]</span> <span class="o">=</span> <span class="n">candidate</span>
</span></span><span class="line"><span class="cl">            <span class="k">yield</span> <span class="n">cp_mapping</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1"># Feasibility rules pass, so extend the mapping and update the parameters</span>
</span></span><span class="line"><span class="cl">        <span class="n">mapping</span><span class="p">[</span><span class="n">current_node</span><span class="p">]</span> <span class="o">=</span> <span class="n">candidate</span>
</span></span><span class="line"><span class="cl">        <span class="n">reverse_mapping</span><span class="p">[</span><span class="n">candidate</span><span class="p">]</span> <span class="o">=</span> <span class="n">current_node</span>
</span></span><span class="line"><span class="cl">        <span class="n">_update_Tinout</span><span class="p">(</span><span class="n">current_node</span><span class="p">,</span> <span class="n">candidate</span><span class="p">,</span> <span class="n">graph_params</span><span class="p">,</span> <span class="n">state_params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># Append the next node and its candidates to the stack</span>
</span></span><span class="line"><span class="cl">        <span class="n">candidates</span> <span class="o">=</span> <span class="nb">iter</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="n">_find_candidates</span><span class="p">(</span><span class="n">node_order</span><span class="p">[</span><span class="n">matching_node</span><span class="p">],</span> <span class="n">graph_params</span><span class="p">,</span> <span class="n">state_params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">stack</span><span class="o">.</span><span class="n">append</span><span class="p">((</span><span class="n">node_order</span><span class="p">[</span><span class="n">matching_node</span><span class="p">],</span> <span class="n">candidates</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="n">matching_node</span> <span class="o">+=</span> <span class="mi">1</span></span></span></code></pre>
</div>
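<p>To experiment with the stack-of-iterators pattern in isolation, here is a simplified, self-contained sketch. It is not the NetworkX implementation: it assumes plain adjacency dicts instead of NetworkX graphs, orders nodes naively by degree, and replaces the VF2++ feasibility rules and $T_i$ bookkeeping with a direct adjacency check. The names <code>iter_isomorphisms</code> and <code>consistent</code> are hypothetical.</p>

```python
def iter_isomorphisms(adj1, adj2):
    """Yield every adjacency-preserving bijection between two small graphs.

    adj1 and adj2 are adjacency dicts mapping each node to the set of
    its neighbours (undirected, no self-loops).
    """
    if not adj1 or len(adj1) != len(adj2):
        return

    # Naive ordering (an assumption of this sketch): highest degree first.
    order = sorted(adj1, key=lambda n: len(adj1[n]), reverse=True)
    mapping, used = {}, set()

    def consistent(u, v):
        # Degrees must agree, and u-v must respect every pair mapped so far.
        if len(adj1[u]) != len(adj2[v]):
            return False
        return all((w in adj1[u]) == (mapping[w] in adj2[v]) for w in mapping)

    # Each stack entry holds a node and an iterator over its untried candidates.
    stack = [(order[0], iter(adj2))]
    while stack:
        u, candidates = stack[-1]
        if u in mapping:
            # We came back to this level: undo the previous choice for u.
            used.discard(mapping.pop(u))
        for v in candidates:
            if v not in used and consistent(u, v):
                break
        else:
            # No candidates left for u: backtrack to the previous level.
            stack.pop()
            continue
        mapping[u] = v
        used.add(v)
        if len(mapping) == len(order):
            yield dict(mapping)
            continue
        # Descend: push the next node of the ordering with fresh candidates.
        stack.append((order[len(mapping)], iter(adj2)))


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
other_path = {1: {2}, 2: {1, 3}, 3: {2}}
print(list(iter_isomorphisms(path, other_path)))
```

<p>Because each stack entry keeps a partially consumed iterator, returning to a level resumes exactly where the search left off, which is what a recursive call would do implicitly.</p>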
<h2 id="performance">Performance<a class="headerlink" href="#performance" title="Link to this heading">#</a></h2>
<p>This section compares the performance of <strong>VF2</strong> and <strong>VF2++</strong>. The comparison was performed on
<strong>random graphs</strong> without labels, with the number of nodes ranging from 100 to 2000. The results are shown
in the following two diagrams.</p>
<center><img src="times.png" alt="vf2++ and vf2 times"/></center>
<center><img src="speedup.png" alt="speedup"/></center>
<p>We notice that the maximum speedup achieved is <strong>14x</strong>, and that the speedup continues to increase as the number of nodes increases.
It is also clear that the increase in the number of nodes doesn&rsquo;t seem to affect the performance of <strong>VF2++</strong> to
a significant extent, compared to the drastic impact on the performance of <strong>VF2</strong>. Our results are almost identical
to those presented in the original <strong><a href="https://www.sciencedirect.com/science/article/pii/S0166218X18300829">VF2++ paper</a></strong>, verifying the theoretical analysis and premises of the literature.</p>
<h2 id="optimizations">Optimizations<a class="headerlink" href="#optimizations" title="Link to this heading">#</a></h2>
<p>The achieved boost is due to some key improvements and optimizations, specifically:</p>
<ul>
<li><strong>Optimal node ordering</strong>, which avoids following unfruitful branches that will result in infeasible states. We make sure that the nodes most likely to match are accessed first.</li>
<li><strong>Implementation in a non-recursive manner</strong>, avoiding Python&rsquo;s maximum recursion limit while also reducing function call overhead.</li>
<li><strong>Caching</strong> of both node degrees and nodes per degree at the beginning, so that we don&rsquo;t have to access those features in every degree check. For example, instead of doing</li>
</ul>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">G2</span><span class="o">.</span><span class="n">nodes</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">G1</span><span class="o">.</span><span class="n">degree</span><span class="p">[</span><span class="n">u</span><span class="p">]</span> <span class="o">==</span> <span class="n">G2</span><span class="o">.</span><span class="n">degree</span><span class="p">[</span><span class="n">node</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># do stuff with res ...</span></span></span></code></pre>
</div>
<p>to get the nodes with the same degree as <code>u</code> (which happens many times in the implementation), we just do:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="n">G2_nodes_of_degree</span><span class="p">[</span><span class="n">G1</span><span class="o">.</span><span class="n">degree</span><span class="p">[</span><span class="n">u</span><span class="p">]]</span>
</span></span><span class="line"><span class="cl"><span class="c1"># do stuff with res ...</span></span></span></code></pre>
</div>
<p>where <code>G2_nodes_of_degree</code> stores the set of nodes for each degree. The same is done with node labels.</p>
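<p>One way such a cache could be built is sketched below. This is an illustration under assumptions, not the code used in NetworkX: the graph is a plain adjacency dict rather than a NetworkX graph, and <code>group_nodes_by_degree</code> is a hypothetical helper name.</p>

```python
from collections import defaultdict


def group_nodes_by_degree(adj):
    """Return a mapping from each degree to the set of nodes with that degree.

    adj is a hypothetical adjacency dict mapping each node to the set
    of its neighbours.
    """
    nodes_of_degree = defaultdict(set)
    for node, neighbours in adj.items():
        nodes_of_degree[len(neighbours)].add(node)
    return nodes_of_degree


# Toy graph: a star centred on node 2.
G2_adj = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2}}
G2_nodes_of_degree = group_nodes_by_degree(G2_adj)

print(G2_nodes_of_degree[1])  # the three leaves
print(G2_nodes_of_degree[3])  # the centre
```

<p>The cache is built once in a single pass over the nodes, after which every lookup by degree is a dictionary access instead of a scan over the whole graph.</p>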
<ul>
<li><strong>Extra shrinking of the candidate set for each node</strong> by adding more checks in the candidate selection method and removing some from the feasibility checks. In simple terms, instead of checking a lot of conditions on a larger set of candidates, we check fewer conditions but on a more targeted and significantly smaller set of candidates.
For example, in this code:</li>
</ul>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">candidates</span> <span class="o">=</span> <span class="nb">set</span><span class="p">(</span><span class="n">G2</span><span class="o">.</span><span class="n">nodes</span><span class="p">())</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">candidate</span> <span class="ow">in</span> <span class="n">candidates</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">feasibility</span><span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">candidate</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">do_stuff</span><span class="p">()</span></span></span></code></pre>
</div>
<p>we start from a huge set of candidates, which results in poor performance because <code>feasibility</code> is called as many times as possible,
so the feasibility checks run over a very large set. Now compare that to the following alternative:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">candidates</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="n">n</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">n</span> <span class="ow">in</span> <span class="n">G2_nodes_of_degree</span><span class="p">[</span><span class="n">G1</span><span class="o">.</span><span class="n">degree</span><span class="p">[</span><span class="n">u</span><span class="p">]]</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">G2_nodes_of_label</span><span class="p">[</span><span class="n">G1_labels</span><span class="p">[</span><span class="n">u</span><span class="p">]]</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">candidate</span> <span class="ow">in</span> <span class="n">candidates</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">feasibility</span><span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">candidate</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">do_stuff</span><span class="p">()</span></span></span></code></pre>
</div>
<p>Immediately we have drastically reduced the number of checks performed and calls to the function, as now we only apply them to nodes of the same degree and label as $u$. This is a simplification for demonstration purposes. In the actual implementation there are more checks and extra shrinking of the candidate set.</p>
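<p>To make the reduction concrete, here is a small sketch that counts the candidates in both approaches on a toy graph. The adjacency dict, the labels, and the helper name <code>group_by</code> are all made up for illustration. Only the intersection of the degree and label caches mirrors the snippet above.</p>

```python
from collections import defaultdict


def group_by(values):
    """Invert a node-to-value mapping into a value-to-set-of-nodes mapping."""
    groups = defaultdict(set)
    for node, value in values.items():
        groups[value].add(node)
    return groups


# Hypothetical toy data for G2: adjacency and one label per node.
G2_adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
G2_labels = {1: "red", 2: "blue", 3: "red", 4: "red"}

G2_nodes_of_degree = group_by({n: len(nbrs) for n, nbrs in G2_adj.items()})
G2_nodes_of_label = group_by(G2_labels)

# Suppose the node u from G1 has degree 2 and label "red".
u_degree, u_label = 2, "red"

naive = set(G2_adj)
shrunk = G2_nodes_of_degree[u_degree].intersection(G2_nodes_of_label[u_label])

print(len(naive), len(shrunk))  # 4 1
```

<p>Here the feasibility check would run once instead of four times for this node, and the gap widens as the graphs grow.</p>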
<h2 id="demo">Demo<a class="headerlink" href="#demo" title="Link to this heading">#</a></h2>
<p>Let&rsquo;s demonstrate our <strong>VF2++</strong> solver on a real graph. We are going to use the graphs from the Wikipedia article on Graph Isomorphism.</p>
<p float="center">
  <img src="https://upload.wikimedia.org/wikipedia/commons/9/9a/Graph_isomorphism_a.svg" width="200" height="200">
  <img src="https://upload.wikimedia.org/wikipedia/commons/8/84/Graph_isomorphism_b.svg" width="395" height="250">
</p>
<p>Let&rsquo;s start by constructing the graphs from the image above. We&rsquo;ll call
the graph on the left <code>G</code> and the graph on the right <code>H</code>:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">Graph</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;a&#34;</span><span class="p">,</span> <span class="s2">&#34;g&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;a&#34;</span><span class="p">,</span> <span class="s2">&#34;h&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;a&#34;</span><span class="p">,</span> <span class="s2">&#34;i&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;g&#34;</span><span class="p">,</span> <span class="s2">&#34;b&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;g&#34;</span><span class="p">,</span> <span class="s2">&#34;c&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;b&#34;</span><span class="p">,</span> <span class="s2">&#34;h&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;b&#34;</span><span class="p">,</span> <span class="s2">&#34;j&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;h&#34;</span><span class="p">,</span> <span class="s2">&#34;d&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;c&#34;</span><span class="p">,</span> <span class="s2">&#34;i&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;c&#34;</span><span class="p">,</span> <span class="s2">&#34;j&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;i&#34;</span><span class="p">,</span> <span class="s2">&#34;d&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="s2">&#34;d&#34;</span><span class="p">,</span> <span class="s2">&#34;j&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">H</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">Graph</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
<h3 id="use-the-vf2-without-taking-labels-into-consideration">Use VF2++ without taking labels into consideration<a class="headerlink" href="#use-the-vf2-without-taking-labels-into-consideration" title="Link to this heading">#</a></h3>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">vf2pp_is_isomorphic</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">H</span><span class="p">,</span> <span class="n">node_label</span><span class="o">=</span><span class="kc">None</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># res: True</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">vf2pp_isomorphism</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">H</span><span class="p">,</span> <span class="n">node_label</span><span class="o">=</span><span class="kc">None</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># res: {1: &#34;a&#34;, 2: &#34;h&#34;, 3: &#34;d&#34;, 4: &#34;i&#34;, 5: &#34;g&#34;, 6: &#34;b&#34;, 7: &#34;j&#34;, 8: &#34;c&#34;}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">nx</span><span class="o">.</span><span class="n">vf2pp_all_isomorphisms</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">H</span><span class="p">,</span> <span class="n">node_label</span><span class="o">=</span><span class="kc">None</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># res: all isomorphic mappings (there might be more than one). This function is a generator.</span></span></span></code></pre>
</div>
<h3 id="use-the-vf2-taking-labels-into-consideration">Use VF2++ taking labels into consideration<a class="headerlink" href="#use-the-vf2-taking-labels-into-consideration" title="Link to this heading">#</a></h3>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># Assign some label to each node</span>
</span></span><span class="line"><span class="cl"><span class="n">G_node_attributes</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;a&#34;</span><span class="p">:</span> <span class="s2">&#34;blue&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;g&#34;</span><span class="p">:</span> <span class="s2">&#34;green&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;b&#34;</span><span class="p">:</span> <span class="s2">&#34;pink&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;h&#34;</span><span class="p">:</span> <span class="s2">&#34;red&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;c&#34;</span><span class="p">:</span> <span class="s2">&#34;yellow&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;i&#34;</span><span class="p">:</span> <span class="s2">&#34;orange&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;d&#34;</span><span class="p">:</span> <span class="s2">&#34;cyan&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;j&#34;</span><span class="p">:</span> <span class="s2">&#34;purple&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">nx</span><span class="o">.</span><span class="n">set_node_attributes</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">G_node_attributes</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s2">&#34;color&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">H_node_attributes</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="mi">1</span><span class="p">:</span> <span class="s2">&#34;blue&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mi">2</span><span class="p">:</span> <span class="s2">&#34;red&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mi">3</span><span class="p">:</span> <span class="s2">&#34;cyan&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mi">4</span><span class="p">:</span> <span class="s2">&#34;orange&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mi">5</span><span class="p">:</span> <span class="s2">&#34;green&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mi">6</span><span class="p">:</span> <span class="s2">&#34;pink&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mi">7</span><span class="p">:</span> <span class="s2">&#34;purple&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mi">8</span><span class="p">:</span> <span class="s2">&#34;yellow&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">nx</span><span class="o">.</span><span class="n">set_node_attributes</span><span class="p">(</span><span class="n">H</span><span class="p">,</span> <span class="n">H_node_attributes</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s2">&#34;color&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">vf2pp_is_isomorphic</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">H</span><span class="p">,</span> <span class="n">node_label</span><span class="o">=</span><span class="s2">&#34;color&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># res: True</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">vf2pp_isomorphism</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">H</span><span class="p">,</span> <span class="n">node_label</span><span class="o">=</span><span class="s2">&#34;color&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># res: {1: &#34;a&#34;, 2: &#34;h&#34;, 3: &#34;d&#34;, 4: &#34;i&#34;, 5: &#34;g&#34;, 6: &#34;b&#34;, 7: &#34;j&#34;, 8: &#34;c&#34;}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">res</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">nx</span><span class="o">.</span><span class="n">vf2pp_all_isomorphisms</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">H</span><span class="p">,</span> <span class="n">node_label</span><span class="o">=</span><span class="s2">&#34;color&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># res: [{1: &#34;a&#34;, 2: &#34;h&#34;, 3: &#34;d&#34;, 4: &#34;i&#34;, 5: &#34;g&#34;, 6: &#34;b&#34;, 7: &#34;j&#34;, 8: &#34;c&#34;}] (a list holding the single mapping)</span></span></span></code></pre>
</div>
<p>Notice how in the first case, our solver may return a different mapping every time, since without labels a node can map to more than one other node. For example, node 1 can map to both a and h, since the graph is symmetrical.
In the second case, though, the single, unique label on each node means that there&rsquo;s only one match for each node, so the mapping returned is deterministic. This is easily observed from
the output of <code>list(nx.vf2pp_all_isomorphisms)</code> which, in the first case, returns all possible mappings while, in the second, it returns a single, unique isomorphic mapping.</p>
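<p>The effect of labels on the number of valid mappings can be reproduced on a much smaller graph. The sketch below is a brute-force counter written for illustration only: it is not how VF2++ works, and <code>count_isomorphisms</code> and the 4-cycle data are hypothetical, not the graphs used above.</p>


<div class="highlight">
  <pre class="chroma"><code>from itertools import permutations


def count_isomorphisms(edges_g, edges_h, labels_g=None, labels_h=None):
    """Count node bijections G to H that preserve edges (and labels, if given)."""
    nodes_g = sorted({n for e in edges_g for n in e})
    nodes_h = sorted({n for e in edges_h for n in e})
    target = {frozenset(e) for e in edges_h}
    count = 0
    for perm in permutations(nodes_h):
        m = dict(zip(nodes_g, perm))
        # Reject the bijection if any node lands on a differently labelled node.
        if labels_g is not None and any(
            labels_g[n] != labels_h[m[n]] for n in nodes_g
        ):
            continue
        # Accept it only if the mapped edge set is exactly the edge set of H.
        if {frozenset((m[a], m[b])) for a, b in edges_g} == target:
            count += 1
    return count


# Hypothetical toy data: a 4-cycle and a relabelled copy of it.
G_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
H_edges = [(1, 2), (2, 3), (3, 4), (4, 1)]

without_labels = count_isomorphisms(G_edges, H_edges)
with_labels = count_isomorphisms(
    G_edges,
    H_edges,
    labels_g={"a": "blue", "b": "red", "c": "cyan", "d": "pink"},
    labels_h={1: "blue", 2: "red", 3: "cyan", 4: "pink"},
)
print(without_labels, with_labels)</code></pre>
</div>
<p>Without labels the 4-cycle admits eight mappings (its rotations and reflections); with a unique label per node only one survives.</p>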
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="vf2&#43;&#43;" label="vf2&#43;&#43;" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[ISO Feasibility & Candidates]]></title>
            <link href="https://blog.scientific-python.org/networkx/vf2pp/iso-feasibility-candidates/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/vf2pp/node-ordering-ti-updating/?utm_source=atom_feed" rel="related" type="text/html" title="Updates on VF2&#43;&#43;" />
                <link href="https://blog.scientific-python.org/networkx/vf2pp/gsoc-2022/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2022: NetworkX VF2&#43;&#43; Implementation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/?utm_source=atom_feed" rel="related" type="text/html" title="My Summer of Code 2021" />
                <link href="https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/?utm_source=atom_feed" rel="related" type="text/html" title="Completing the Asadpour Algorithm" />
                <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="related" type="text/html" title="Looking at the Big Picture" />
            
                <id>https://blog.scientific-python.org/networkx/vf2pp/iso-feasibility-candidates/</id>
            
            
            <published>2022-07-11T00:00:00+00:00</published>
            <updated>2022-07-11T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Information about my progress on two important features of the algorithm.</blockquote><p>The previous post can be found <a href="../node-ordering-ti-updating">here</a>; be sure to check it out so you
can follow the process step by step. Since then, two more significant features of the algorithm have been
implemented and tested: <strong>node pair candidate selection</strong> and <strong>feasibility checks</strong>.</p>
<h2 id="introduction">Introduction<a class="headerlink" href="#introduction" title="Link to this heading">#</a></h2>
<p>As previously described, in the ISO problem we are basically trying to create a <strong>mapping</strong> such that every node
from the first graph is matched to a node from the second graph. This search for &ldquo;feasible pairs&rdquo; can be visualized
as a tree, where each node is a candidate pair that we should examine. This becomes much clearer if we take a look
at the figure below.</p>
<center><img src="dfs.png" alt="DFS VF2++ example"/></center>
<p>In order to check if the graphs $G_1$, $G_2$ are isomorphic, we check every candidate pair of nodes and if it is
feasible, we extend the mapping and go deeper into the tree of pairs. If it&rsquo;s not feasible, we climb up and follow a
different branch, until every node in $G_1$ is mapped to a node of $G_2$. In our example, we start by examining node 0 of $G_1$ with
node 0 of $G_2$. After some checks (details below), we decide that
nodes 0 and 0 match, so we go deeper to map the remaining nodes. The next pair is 1-3, which fails the
feasibility check, so we have to examine a different branch as shown. The new branch is 1-2, which is feasible, so we
continue using the same logic until all the nodes are mapped.</p>
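<p>The search described above can be sketched as a plain backtracking routine. This is only a minimal illustration under simplifying assumptions: graphs are dictionaries of neighbor sets, nodes are tried in arbitrary order, and the feasibility test is a bare adjacency check rather than the VF2++ rules. The name <code>find_mapping</code> and the toy graphs are hypothetical.</p>


<div class="highlight">
  <pre class="chroma"><code>def find_mapping(adj1, adj2):
    """Depth-first search over candidate pairs; returns a dict or None."""
    if len(adj1) != len(adj2):
        return None
    order = list(adj1)  # naive order; VF2++ uses a smarter one
    mapping = {}

    def feasible(u, v):
        if len(adj1[u]) != len(adj2[v]):
            return False
        # Adjacency to every covered node must agree on both sides.
        return all((w in adj1[u]) == (mapping[w] in adj2[v]) for w in mapping)

    def extend(depth):
        if depth == len(order):
            return True
        u = order[depth]
        used = set(mapping.values())
        for v in adj2:
            if v in used or not feasible(u, v):
                continue
            mapping[u] = v  # extend the mapping and go deeper
            if extend(depth + 1):
                return True
            del mapping[u]  # climb back up and try another branch
        return False

    return dict(mapping) if extend(0) else None


# Hypothetical toy graphs: two paths on three nodes and a triangle.
adj1 = {0: {1}, 1: {0, 2}, 2: {1}}
adj2 = {"x": {"y", "z"}, "y": {"x"}, "z": {"x"}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}

print(find_mapping(adj1, adj2))
print(find_mapping(triangle, adj2))</code></pre>
</div>
<p>The first call finds a mapping that sends the middle node of one path to the middle node of the other; the second returns <code>None</code> because no pair passes the degree test.</p>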
<h2 id="candidate-pair-selection">Candidate Pair Selection<a class="headerlink" href="#candidate-pair-selection" title="Link to this heading">#</a></h2>
<p>Although in our example we use a random candidate pair of nodes, in the actual implementation we are able to target
specific pairs that are more likely to be matched, and hence boost the performance of the algorithm. The idea is that, in
every step of the algorithm, <strong>given a candidate</strong></p>
<p>$$u\in V_1$$</p>
<p><strong>we compute the candidates</strong></p>
<p>$$v\in V_2$$</p>
<p>where $V_1$ and $V_2$ are the nodes of $G_1$ and $G_2$ respectively. Now this is a puzzle that does not require a lot of
specific knowledge of graphs or the algorithm itself. Bear with me, and you will see it for yourself. First, let $M$
be the mapping so far, which includes all the &ldquo;covered nodes&rdquo; until this point. There are actually <strong>three</strong> different
types of $u$ nodes that we might encounter.</p>
<h3 id="case-1">Case 1<a class="headerlink" href="#case-1" title="Link to this heading">#</a></h3>
<p>Node $u$ has no neighbors (the degree of $u$ is zero). It would be redundant to test
nodes from $G_2$ that have more than zero neighbors as candidates for $u$. So we eliminate most of the possible
candidates and keep those that have the same degree as $u$ (in this case, zero). Pretty easy, right?</p>
<h3 id="case-2">Case 2<a class="headerlink" href="#case-2" title="Link to this heading">#</a></h3>
<p>Node $u$ has neighbors, but none of them belong to the mapping. This situation is illustrated in the following figure.</p>
<center><img src="c2.png" alt="candidates"/></center>
<p>The grey lines indicate that the nodes of $G_1$ (left 1,2) are mapped to the nodes of $G_2$ (right 1,2). They are basically
the mapping. Again, given $u$, we make the observation that candidates $v$ of $u$ should also have no neighbors in the
mapping, and also have the same degree as $u$ (as in the figure). Notice how if we add a neighbor to $v$, or if we place
one of its neighbors inside the mapping, there is no point examining the pair $u-v$ for matching.</p>
<h3 id="case-3">Case 3<a class="headerlink" href="#case-3" title="Link to this heading">#</a></h3>
<p>Node $u$ has neighbors and some of them belong to the mapping. This scenario is depicted in the figure below.</p>
<center><img src="c3.png" alt="candidates"/></center>
<p>In this case, to obtain the candidates for $u$, we must look into the neighborhoods of the nodes of $G_2$ that the
covered neighbors of $u$ map to. In our example, $u$ has one covered neighbor (1), and 1 from $G_1$ maps to 1 from $G_2$,
which has $v$ as a neighbor. Also, for $v$ to be considered a candidate, it should have the same degree as $u$.
Notice how every node that is not in the neighborhood of 1 (in $G_2$) cannot be matched to $u$ without breaking the
isomorphism.</p>
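<p>The three cases can be folded into one small function. The following is a sketch under stated assumptions rather than the actual NetworkX code: graphs are dictionaries of neighbor sets, <code>mapping</code> sends covered nodes of $G_1$ to nodes of $G_2$, and the name <code>find_candidates</code> and the example graphs are hypothetical.</p>


<div class="highlight">
  <pre class="chroma"><code>def find_candidates(u, adj1, adj2, mapping):
    """Return the nodes of G2 worth testing against node u of G1."""
    covered_g2 = set(mapping.values())
    uncovered_g2 = set(adj2) - covered_g2
    same_degree = {v for v in uncovered_g2 if len(adj2[v]) == len(adj1[u])}
    covered_nbrs = [w for w in adj1[u] if w in mapping]

    if not covered_nbrs:
        # Cases 1 and 2: u has no covered neighbors (possibly no neighbors
        # at all), so v needs the same degree and no covered neighbors.
        return {v for v in same_degree if adj2[v].isdisjoint(covered_g2)}

    # Case 3: v must be adjacent to the image of every covered neighbor of u.
    common = set.intersection(*(adj2[mapping[w]] for w in covered_nbrs))
    return common.intersection(same_degree)


# Hypothetical example: nodes 1 and 2 are already mapped to A and B.
adj1 = {1: {2, 3}, 2: {1}, 3: {1, 4}, 4: {3}, 5: set()}
adj2 = {"A": {"B", "C"}, "B": {"A"}, "C": {"A", "D"}, "D": {"C"}, "E": set()}
mapping = {1: "A", 2: "B"}

case1 = find_candidates(5, adj1, adj2, mapping)  # isolated node
case2 = find_candidates(4, adj1, adj2, mapping)  # no covered neighbors
case3 = find_candidates(3, adj1, adj2, mapping)  # one covered neighbor
print(case1, case2, case3)</code></pre>
</div>
<p>In this toy setup each case leaves a single candidate: <code>E</code> for the isolated node, <code>D</code> for node 4 and <code>C</code> for node 3.</p>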
<h2 id="iso-feasibility-rules">ISO Feasibility Rules<a class="headerlink" href="#iso-feasibility-rules" title="Link to this heading">#</a></h2>
<p>Let&rsquo;s assume that given a node $u$, we obtained its candidate $v$ following the process described in the previous section.
At this point, the <strong>Feasibility Rules</strong> are going to determine whether the mapping should be extended by the pair $u-v$
or if we should try another candidate. The <strong>feasibility</strong> of a pair $u-v$ is examined by <strong>consistency</strong> and
<strong>cutting</strong> checks.</p>
<h3 id="consistency-rules">Consistency rules<a class="headerlink" href="#consistency-rules" title="Link to this heading">#</a></h3>
<p>First, I am going to present the mathematical expression of the consistency check. It may seem complicated at first,
but it&rsquo;s going to be made simple by using a visual illustration. Using the notation $nbh_i(u)$ for the neighborhood of $u$
in graph $G_i$, the consistency rule is:</p>
<p>$$\forall\tilde{v}\in nbh_2(v)\cap M:(u, M^{-1}(\tilde{v}))\in E_1 \wedge \forall\tilde{u}\in nbh_1(u)\cap M:(v, M(\tilde{u}))\in E_2$$</p>
<p>We are going to use the following simple figure to demystify the above equation.</p>
<center><img src="const.png" alt="consistency check scenario"/></center>
<p>The mapping is depicted as grey lines between the nodes that are already mapped, meaning that 1 maps to A and 2 to B.
What is implied by the equation is that, for two nodes $u$ and $v$ to pass the consistency check, the neighbors of $u$
that belong in the mapping should map to neighbors of $v$ (and vice versa). This could be checked by code as simple
as:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">for</span> <span class="n">neighbor</span> <span class="ow">in</span> <span class="n">G1</span><span class="p">[</span><span class="n">u</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">neighbor</span> <span class="ow">in</span> <span class="n">mapping</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">mapping</span><span class="p">[</span><span class="n">neighbor</span><span class="p">]</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">G2</span><span class="p">[</span><span class="n">v</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="kc">False</span>
</span></span><span class="line"><span class="cl">        <span class="k">elif</span> <span class="n">G1</span><span class="o">.</span><span class="n">number_of_edges</span><span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">neighbor</span><span class="p">)</span> <span class="o">!=</span> <span class="n">G2</span><span class="o">.</span><span class="n">number_of_edges</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="n">v</span><span class="p">,</span> <span class="n">mapping</span><span class="p">[</span><span class="n">neighbor</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="p">):</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="kc">False</span></span></span></code></pre>
</div>
<p>where the <code>elif</code> branch also checks the number of edges between node $u$ and its neighbor $\tilde{u}$, which should be
the same as the number between $v$ and the neighbor that $\tilde{u}$ maps to. At a very high level, we could describe this
check as a 1-look-ahead check.</p>
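<p>The snippet above only shows the direction from $u$ to $v$. A self-contained sketch of both directions, assuming simple graphs stored as dictionaries of neighbor sets (so the edge-count comparison is not needed), might look like this. The name <code>is_consistent</code> and the toy graphs are hypothetical.</p>


<div class="highlight">
  <pre class="chroma"><code>def is_consistent(u, v, adj1, adj2, mapping, reverse_mapping):
    """Consistency check for the pair (u, v) on simple graphs."""
    # Covered neighbors of u must map to neighbors of v ...
    for nbr in adj1[u]:
        if nbr in mapping and mapping[nbr] not in adj2[v]:
            return False
    # ... and covered neighbors of v must map back to neighbors of u.
    for nbr in adj2[v]:
        if nbr in reverse_mapping and reverse_mapping[nbr] not in adj1[u]:
            return False
    return True


# Hypothetical example: 1 maps to A and 2 maps to B.
adj1 = {1: {2, "u"}, 2: {1, "u"}, "u": {1, 2}}
adj2 = {"A": {"B", "v", "w"}, "B": {"A", "v"}, "v": {"A", "B"}, "w": {"A"}}
mapping = {1: "A", 2: "B"}
reverse_mapping = {"A": 1, "B": 2}

ok_pair = is_consistent("u", "v", adj1, adj2, mapping, reverse_mapping)
bad_pair = is_consistent("u", "w", adj1, adj2, mapping, reverse_mapping)
print(ok_pair, bad_pair)</code></pre>
</div>
<p>Here the pair <code>u</code>-<code>v</code> passes, while <code>u</code>-<code>w</code> fails because node 2 is a neighbor of <code>u</code> but its image <code>B</code> is not a neighbor of <code>w</code>.</p>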
<h3 id="cutting-rules">Cutting rules<a class="headerlink" href="#cutting-rules" title="Link to this heading">#</a></h3>
<p>We have previously discussed what $T_i$ and $\tilde{T_i}$ represent (see previous post). These sets are used in the
cutting checks as follows: the number of neighbors of $u$ that belong to $T_1$ should be equal to the number of
neighbors of $v$ that belong to $T_2$. Take a moment to observe the figure below.</p>
<center><img src="cut.png" alt="cutting check scenario"/></center>
<p>Once again, node 1 maps to A and 2 to B. The red nodes (4,5,6) are basically $T_1$ and the yellow ones (C,D,E) are $T_2$.
Notice that in order for $u-v$ to be feasible, $u$ should have the same number of neighbors inside $T_1$
as $v$ has inside $T_2$. In every other case, the two graphs are not isomorphic, which can be verified visually. For this
example, both nodes have 2 of their neighbors (4,6 and C,E) in $T_1$ and $T_2$ respectively. Careful! If we delete the
$v-E$ edge and connect $v$ to $D$, the cutting condition is still satisfied. However, the pair is still going to be rejected
by the consistency checks of the previous section. A simple piece of code to apply the cutting check would be:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">T1</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span><span class="n">G1</span><span class="p">[</span><span class="n">u</span><span class="p">]))</span> <span class="o">!=</span> <span class="nb">len</span><span class="p">(</span><span class="n">T2</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span><span class="n">G2</span><span class="p">[</span><span class="n">v</span><span class="p">]))</span> <span class="ow">or</span> <span class="nb">len</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">T1out</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span><span class="n">G1</span><span class="p">[</span><span class="n">u</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span> <span class="o">!=</span> <span class="nb">len</span><span class="p">(</span><span class="n">T2out</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span><span class="n">G2</span><span class="p">[</span><span class="n">v</span><span class="p">])):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">False</span></span></span></code></pre>
</div>
<p>where <code>T1out</code> and <code>T2out</code> correspond to $\tilde{T_1}$ and $\tilde{T_2}$ respectively. And yes, we have to check
those as well; we only skipped them in the above explanation for simplicity.</p>
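<p>To try the cutting check in isolation, here is a small runnable sketch in which the four sets are supplied by hand. The function name <code>passes_cutting</code> and the data are hypothetical, loosely following the scenario described above.</p>


<div class="highlight">
  <pre class="chroma"><code>def passes_cutting(u, v, adj1, adj2, T1, T2, T1_out, T2_out):
    """True if u and v have equally many neighbors in each pair of sets."""
    return len(adj1[u].intersection(T1)) == len(
        adj2[v].intersection(T2)
    ) and len(adj1[u].intersection(T1_out)) == len(adj2[v].intersection(T2_out))


# Hypothetical data: only the neighborhoods of u and v are needed here.
adj1 = {"u": {1, 4, 6}}
adj2_good = {"v": {"A", "C", "E"}}
adj2_bad = {"v": {"A", "C"}}
T1, T2 = {4, 5, 6}, {"C", "D", "E"}
T1_out, T2_out = set(), set()

good = passes_cutting("u", "v", adj1, adj2_good, T1, T2, T1_out, T2_out)
bad = passes_cutting("u", "v", adj1, adj2_bad, T1, T2, T1_out, T2_out)
print(good, bad)</code></pre>
</div>
<p>With two neighbors on each side inside $T_1$ and $T_2$ the pair passes; removing one of $v$&rsquo;s neighbors from $T_2$ makes the counts differ and the pair is cut.</p>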
<h2 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h2>
<p>At this point, we have successfully implemented and tested all the major components of the algorithm <strong>VF2++</strong>:</p>
<ul>
<li><strong>Node Ordering</strong></li>
<li><strong>$T_i/\tilde{T_i}$ Updating</strong></li>
<li><strong>Feasibility Rules</strong></li>
<li><strong>Candidate Selection</strong></li>
</ul>
<p>This means that, in the next post, hopefully, we are going to discuss our first, full and functional implementation of
<strong>VF2++</strong>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="vf2&#43;&#43;" label="vf2&#43;&#43;" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Updates on VF2++]]></title>
            <link href="https://blog.scientific-python.org/networkx/vf2pp/node-ordering-ti-updating/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/vf2pp/gsoc-2022/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2022: NetworkX VF2&#43;&#43; Implementation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/?utm_source=atom_feed" rel="related" type="text/html" title="My Summer of Code 2021" />
                <link href="https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/?utm_source=atom_feed" rel="related" type="text/html" title="Completing the Asadpour Algorithm" />
                <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="related" type="text/html" title="Looking at the Big Picture" />
                <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Sampling a Spanning Tree" />
            
                <id>https://blog.scientific-python.org/networkx/vf2pp/node-ordering-ti-updating/</id>
            
            
            <published>2022-07-06T00:00:00+00:00</published>
            <updated>2022-07-06T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Summary of the progress on VF2++</blockquote><p>This post includes all the major updates since the <a href="../gsoc-2022">last post</a> about VF2++. Each section
is dedicated to a different sub-problem and presents the progress on it so far. General progress, milestones and related
issues can be <a href="https://github.com/kpetridis24/networkx/milestone/1">found here</a>.</p>
<h2 id="node-ordering">Node ordering<a class="headerlink" href="#node-ordering" title="Link to this heading">#</a></h2>
<p>The node ordering is one major modification that <strong>VF2++</strong> proposes. Basically, the nodes are examined in an order that
makes the matching faster by first examining nodes that are more likely to match. This part of the algorithm has been
implemented; however, there is an issue: the existence of detached nodes (not connected to the rest of the graph) causes
the code to crash. Fixing this bug will be a top priority during the next steps. The ordering implementation is described
by the following pseudocode.</p>
<blockquote>
<hr>
<p><strong>Matching Order</strong></p>
<hr>
<ol>
<li><strong>Set</strong> $M = \varnothing$.</li>
<li><strong>Set</strong> $\bar{V_1}$ : nodes not in order yet</li>
<li><strong>while</strong> $\bar{V_1}$ not empty <strong>do</strong>
<ul>
<li>$rareNodes=[$nodes from $V_1$ with the rarest labels$]$</li>
<li>$maxNode=argmax_{degree}(rareNodes)$</li>
<li>$T=$ BFSTree with $maxNode$ as root</li>
<li><strong>for</strong> every level in $T$ <strong>do</strong>
<ul>
<li>$V_d=[$nodes of the $d^{th}$ level$]$</li>
<li>$\bar{V_1} = \bar{V_1} \setminus V_d$</li>
<li>$ProcessLevel(V_d)$</li>
</ul>
</li>
</ul>
</li>
<li>Output $M$: the matching order of the nodes.</li>
</ol>
</blockquote>
<blockquote>
<hr>
<p><strong>Process Level</strong></p>
<hr>
<ol>
<li><strong>while</strong> $V_d$ not empty <strong>do</strong>
<ul>
<li>$S=[$nodes from $V_d$ with the most neighbors in $M]$</li>
<li>$maxNodes=argmax_{degree}(S)$</li>
<li>$m=$ node from $maxNodes$ with the rarest label</li>
<li>$V_d = V_d \setminus m$</li>
<li>Append $m$ to $M$</li>
</ul>
</li>
</ol>
</blockquote>
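<p>A rough Python rendering of the two pseudocode blocks above is given below. It is a sketch under assumptions, not the NetworkX implementation: the graph is a dictionary of neighbor sets, label rarity is counted within the same graph, ties are broken arbitrarily, and <code>matching_order</code> and the example data are hypothetical names.</p>


<div class="highlight">
  <pre class="chroma"><code>from collections import Counter


def matching_order(adj, labels):
    """Order nodes by BFS levels, preferring well-connected and rare nodes."""
    label_count = Counter(labels.values())
    unordered = set(adj)
    order = []
    in_order = set()

    def process_level(level):
        level = set(level)
        while level:
            # Most neighbors already ordered, then degree, then rarest label.
            m = max(
                level,
                key=lambda n: (
                    len(adj[n].intersection(in_order)),
                    len(adj[n]),
                    -label_count[labels[n]],
                ),
            )
            level.remove(m)
            order.append(m)
            in_order.add(m)

    while unordered:
        rarest = min(label_count[labels[n]] for n in unordered)
        rare_nodes = [n for n in unordered if label_count[labels[n]] == rarest]
        root = max(rare_nodes, key=lambda n: len(adj[n]))
        # Walk the BFS levels from the root, skipping nodes already ordered.
        level, seen = [root], {root}
        while level:
            unordered.difference_update(level)
            process_level(level)
            next_level = []
            for n in level:
                for nbr in adj[n]:
                    if nbr in unordered and nbr not in seen:
                        seen.add(nbr)
                        next_level.append(nbr)
            level = next_level
    return order


# Hypothetical example with one detached node (4).
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}, 4: set()}
labels = {0: "a", 1: "b", 2: "b", 3: "c", 4: "b"}
order = matching_order(adj, labels)
print(order)</code></pre>
</div>
<p>Because the outer loop picks a new root whenever unordered nodes remain, this sketch also places the detached node, at the very end of the order.</p>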
<h2 id="t_i-and-tildet_i">$T_i$ and $\tilde{T_i}$<a class="headerlink" href="#t_i-and-tildet_i" title="Link to this heading">#</a></h2>
<p>According to the VF2++ paper notation:</p>
<p>$$T_1=\{u\in V_1 \setminus m: \exists \tilde{u} \in m: (u,\tilde{u})\in E_1\}$$</p>
<p>where $V_1$ and $E_1$ contain all the nodes and edges of the first graph respectively, and $m$ is a dictionary, mapping
every node of the first graph to a node of the second graph. Now if we interpret the above equation, we conclude that
$T_1$ contains uncovered neighbors of covered nodes. In simple terms, it includes all the nodes that do not belong in
the mapping $m$ yet, but are neighbors of nodes that are in the mapping. In addition,</p>
<p>$$\tilde{T_1}=(V_1 \setminus m \setminus T_1)$$</p>
<p>The following figure is meant to provide some visual explanation of what exactly $T_i$ is.</p>
<p><img src="/networkx/vf2pp/node-ordering-ti-updating/Ti.png" alt="Illustration of $T_i$."></p>
<p>The blue nodes 1,2,3 are nodes from graph $G_1$ and the green nodes A,B,C belong to the graph $G_2$. The grey lines connecting
those two indicate that in this current state, node 1 is mapped to node A, node 2 is mapped to node B, etc. The yellow
edges connect the covered (mapped) nodes to their neighbors. Here, $T_1$ contains the red nodes (4,5,6) which are
neighbors of the covered nodes 1,2,3, and $T_2$ contains the grey ones (D,E,F). None of the nodes depicted would be
included in $\tilde{T_1}$ or $\tilde{T_2}$. The latter sets would contain all the remaining nodes from the two graphs.</p>
<p>Regarding the computation of these sets, it&rsquo;s not practical to use the brute force method and iterate over all nodes in
every step of the algorithm to find the desired nodes and compute $T_i$ and $\tilde{T_i}$. We use the following
observations to implement an incremental computation of $T_i$ and $\tilde{T_i}$ and make VF2++ more efficient.</p>
<ul>
<li>$T_i$ is empty in the beginning, since there are no mapped nodes ($m=\varnothing$) and therefore no neighbors of
mapped nodes.</li>
<li>$\tilde{T_i}$ initially contains all the nodes from graph $G_i, i=1,2$ which can be realized directly from the
notation if we consider both $m$ and $T_1$ empty sets.</li>
<li>Every step of the algorithm either adds one node $u$ to the mapping or pops one from it.</li>
</ul>
<p>We can conclude that in every step, $T_i$ and $\tilde{T_i}$ can be incrementally updated. This method avoids a ton of
redundant operations and results in significant performance improvement.</p>
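<p>A minimal sketch of the incremental idea, for the $G_1$ side only and only for the step that adds a node to the mapping (popping a node is not shown), could look as follows. The graph is assumed to be a dictionary of neighbor sets, and <code>add_to_mapping</code>, <code>brute_force_T</code> and the path graph are hypothetical names and data used to compare the two approaches.</p>


<div class="highlight">
  <pre class="chroma"><code>def add_to_mapping(u, adj1, covered, T1, T1_out):
    """Update T1 and T1_out in place after node u becomes covered."""
    covered.add(u)
    T1.discard(u)
    T1_out.discard(u)
    for nbr in adj1[u]:
        if nbr not in covered:
            # An uncovered neighbor of a covered node belongs to T1.
            T1.add(nbr)
            T1_out.discard(nbr)


def brute_force_T(adj1, covered):
    """Recompute both sets from scratch, for comparison."""
    T1 = {nbr for n in covered for nbr in adj1[n] if nbr not in covered}
    T1_out = set(adj1) - covered - T1
    return T1, T1_out


# Hypothetical example: a path 0-1-2-3-4, covering node 2 and then node 1.
adj1 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
covered, T1, T1_out = set(), set(), set(adj1)
for node in (2, 1):
    add_to_mapping(node, adj1, covered, T1, T1_out)

print(T1, T1_out)
print(brute_force_T(adj1, covered))</code></pre>
</div>
<p>Each update only touches the neighbors of the newly covered node, yet it ends in the same sets that the full recomputation produces.</p>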
<p><img src="/networkx/vf2pp/node-ordering-ti-updating/acceleration.png" alt="Performance comparison between brute force Ti computing and incremental updating."></p>
<p>The above graph shows the difference in performance between using the exhaustive brute force and incrementally updating
$T_i$ and $\tilde{T_i}$. The graph used to obtain these measurements was a regular
<a href="https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model">GNP Graph</a> with a probability for an edge equal to
$0.7$. It can clearly be seen that execution time of the brute force
method increases much more rapidly with the number of nodes/edges than
the incremental update method, as expected.
The brute force method looks like this:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">compute_Ti</span><span class="p">(</span><span class="n">G1</span><span class="p">,</span> <span class="n">G2</span><span class="p">,</span> <span class="n">mapping</span><span class="p">,</span> <span class="n">reverse_mapping</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">T1</span> <span class="o">=</span> <span class="p">{</span><span class="n">nbr</span> <span class="k">for</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">mapping</span> <span class="k">for</span> <span class="n">nbr</span> <span class="ow">in</span> <span class="n">G1</span><span class="p">[</span><span class="n">node</span><span class="p">]</span> <span class="k">if</span> <span class="n">nbr</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">mapping</span><span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">T2</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">nbr</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">reverse_mapping</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="n">nbr</span> <span class="ow">in</span> <span class="n">G2</span><span class="p">[</span><span class="n">node</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">nbr</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">reverse_mapping</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">T1_out</span> <span class="o">=</span> <span class="p">{</span><span class="n">n1</span> <span class="k">for</span> <span class="n">n1</span> <span class="ow">in</span> <span class="n">G1</span><span class="o">.</span><span class="n">nodes</span><span class="p">()</span> <span class="k">if</span> <span class="n">n1</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">mapping</span> <span class="ow">and</span> <span class="n">n1</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">T1</span><span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">T2_out</span> <span class="o">=</span> <span class="p">{</span><span class="n">n2</span> <span class="k">for</span> <span class="n">n2</span> <span class="ow">in</span> <span class="n">G2</span><span class="o">.</span><span class="n">nodes</span><span class="p">()</span> <span class="k">if</span> <span class="n">n2</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">reverse_mapping</span> <span class="ow">and</span> <span class="n">n2</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">T2</span><span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">T1</span><span class="p">,</span> <span class="n">T2</span><span class="p">,</span> <span class="n">T1_out</span><span class="p">,</span> <span class="n">T2_out</span></span></span></code></pre>
</div>
<p>If we assume that G1 and G2 have the same number of nodes (N), the average number of nodes in the mapping is $N_m$, and
the average node degree of the graphs is $D$, then the time complexity of this function is:</p>
<p>$$O(2N_mD + 2N) = O(N_mD + N)$$</p>
<p>in which we have excluded the lookup times in $T_i$, $mapping$ and $reverse\_mapping$ as they are all $O(1)$. Our
incremental method works like this:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">update_Tinout</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">G1</span><span class="p">,</span> <span class="n">G2</span><span class="p">,</span> <span class="n">T1</span><span class="p">,</span> <span class="n">T2</span><span class="p">,</span> <span class="n">T1_out</span><span class="p">,</span> <span class="n">T2_out</span><span class="p">,</span> <span class="n">new_node1</span><span class="p">,</span> <span class="n">new_node2</span><span class="p">,</span> <span class="n">mapping</span><span class="p">,</span> <span class="n">reverse_mapping</span>
</span></span><span class="line"><span class="cl"><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># This function should be called right after the feasibility is established and node1 is mapped to node2.</span>
</span></span><span class="line"><span class="cl">    <span class="n">uncovered_neighbors_G1</span> <span class="o">=</span> <span class="p">{</span><span class="n">nbr</span> <span class="k">for</span> <span class="n">nbr</span> <span class="ow">in</span> <span class="n">G1</span><span class="p">[</span><span class="n">new_node1</span><span class="p">]</span> <span class="k">if</span> <span class="n">nbr</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">mapping</span><span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">uncovered_neighbors_G2</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">nbr</span> <span class="k">for</span> <span class="n">nbr</span> <span class="ow">in</span> <span class="n">G2</span><span class="p">[</span><span class="n">new_node2</span><span class="p">]</span> <span class="k">if</span> <span class="n">nbr</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">reverse_mapping</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Add the uncovered neighbors of node1 and node2 in T1 and T2 respectively</span>
</span></span><span class="line"><span class="cl">    <span class="n">T1</span><span class="o">.</span><span class="n">discard</span><span class="p">(</span><span class="n">new_node1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">T2</span><span class="o">.</span><span class="n">discard</span><span class="p">(</span><span class="n">new_node2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">T1</span> <span class="o">=</span> <span class="n">T1</span><span class="o">.</span><span class="n">union</span><span class="p">(</span><span class="n">uncovered_neighbors_G1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">T2</span> <span class="o">=</span> <span class="n">T2</span><span class="o">.</span><span class="n">union</span><span class="p">(</span><span class="n">uncovered_neighbors_G2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># todo: maybe check this twice just to make sure</span>
</span></span><span class="line"><span class="cl">    <span class="n">T1_out</span><span class="o">.</span><span class="n">discard</span><span class="p">(</span><span class="n">new_node1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">T2_out</span><span class="o">.</span><span class="n">discard</span><span class="p">(</span><span class="n">new_node2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">T1_out</span> <span class="o">=</span> <span class="n">T1_out</span> <span class="o">-</span> <span class="n">uncovered_neighbors_G1</span>
</span></span><span class="line"><span class="cl">    <span class="n">T2_out</span> <span class="o">=</span> <span class="n">T2_out</span> <span class="o">-</span> <span class="n">uncovered_neighbors_G2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">T1</span><span class="p">,</span> <span class="n">T2</span><span class="p">,</span> <span class="n">T1_out</span><span class="p">,</span> <span class="n">T2_out</span></span></span></code></pre>
</div>
<p>Based on the previous notation, its time complexity is:</p>
<p>$$O(2D + 2(D + M_{T_1}) + 2D) = O(D + M_{T_1})$$</p>
<p>where $M_{T_1}$ is the expected (average) number of elements in $T_1$.</p>
<p>Certainly, the complexity is much better in this
case, as $D$ and $M_{T_1}$ are significantly smaller than $N_mD$ and $N$.</p>
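<p>As a sanity check, the two approaches can be compared on a toy example. The following is a minimal single-graph
sketch, not the NetworkX implementation: the helper names <code>brute_force_T</code> and <code>incremental_T</code> are
hypothetical, and the graph is assumed to be a plain adjacency dictionary rather than a NetworkX object.</p>
<div class="highlight">
  <pre class="chroma"><code># Hypothetical helpers for illustration; the graph is a plain adjacency dict.
def brute_force_T(G, covered):
    """Recompute T and T_out from scratch for the covered (mapped) nodes."""
    T = {nbr for node in covered for nbr in G[node] if nbr not in covered}
    T_out = {n for n in G if n not in covered and n not in T}
    return T, T_out


def incremental_T(G, T, T_out, new_node, covered):
    """Update T and T_out right after new_node has been added to covered."""
    uncovered = {nbr for nbr in G[new_node] if nbr not in covered}
    T = (T - {new_node}) | uncovered
    T_out = T_out - {new_node} - uncovered
    return T, T_out


# Path graph 0-1-2-3-4 with an extra chord 1-3.
G = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}

covered = set()
T, T_out = set(), set(G)  # initial state: T is empty, T_out holds every node
for node in [1, 0, 3]:
    covered.add(node)
    T, T_out = incremental_T(G, T, T_out, node, covered)
    # The incremental result must agree with the full recomputation.
    assert (T, T_out) == brute_force_T(G, covered)

print(T, T_out)</code></pre>
</div>
<p>Each incremental step only touches the neighbors of the newly covered node, while the brute force version walks over
every covered node and every node of the graph.</p>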
<p>In this post we investigated how node ordering works at a high level, and also
how we are able to calculate some important parameters so that the space and
time complexity are reduced.
The next post will continue with examining two more significant components of
the VF2++ algorithm: the <strong>candidate node pair selection</strong> and the
<strong>cutting/consistency</strong> rules that decide when the mapping should or shouldn&rsquo;t
be extended.
Stay tuned!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="vf2&#43;&#43;" label="vf2&#43;&#43;" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC 2022: NetworkX VF2++ Implementation]]></title>
            <link href="https://blog.scientific-python.org/networkx/vf2pp/gsoc-2022/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/?utm_source=atom_feed" rel="related" type="text/html" title="My Summer of Code 2021" />
                <link href="https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/?utm_source=atom_feed" rel="related" type="text/html" title="Completing the Asadpour Algorithm" />
                <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="related" type="text/html" title="Looking at the Big Picture" />
                <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
            
                <id>https://blog.scientific-python.org/networkx/vf2pp/gsoc-2022/</id>
            
            
            <published>2022-06-09T00:00:00+00:00</published>
            <updated>2022-06-09T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>This is the first blog of my GSoC-2022 journey. It includes general information about me, and a superficial description of the project.</blockquote><h2 id="intro">Intro<a class="headerlink" href="#intro" title="Link to this heading">#</a></h2>
<p>I got accepted as a <strong>GSoC</strong> contributor, and I am so excited to spend the summer working on such an incredibly
interesting project. The mentors are very welcoming, communicative, fun to be around, and I really look forward to
collaborating with them. My application for GSoC 2022 can
be found <a href="https://summerofcode.withgoogle.com/programs/2022/projects/V1hY83XG">here</a>.</p>
<h2 id="about-me">About me<a class="headerlink" href="#about-me" title="Link to this heading">#</a></h2>
<p>My name is Konstantinos Petridis, and I am an <strong>Electrical Engineering</strong> student at the Aristotle University of
Thessaloniki. I am currently in my 5th year of studies, with a <strong>Major in Electronics &amp; Computer Science</strong>. Although a
wide range of scientific fields fascinate me, I have a strong passion for <strong>Computer Science</strong>, <strong>Physics</strong> and
<strong>Space</strong>. I love to study, learn new things and don&rsquo;t hesitate to express my curiosity by asking a bunch of questions
to the point of being annoying. You can find me on GitHub <a href="https://github.com/kpetridis24">@kpetridis24</a>.</p>
<h2 id="project">Project<a class="headerlink" href="#project" title="Link to this heading">#</a></h2>
<p>The project I&rsquo;ll be working on is the implementation of <strong>VF2++</strong>, a state-of-the-art algorithm for the
<a href="https://en.wikipedia.org/wiki/Graph_isomorphism"><strong>Graph Isomorphism</strong></a> problem, which lies in the
<a href="https://en.wikipedia.org/wiki/Complexity_class">complexity class</a> <a href="https://en.wikipedia.org/wiki/NP_%28complexity%29"><strong>NP</strong></a>.
The algorithm works like a more elaborate form of
<a href="https://en.wikipedia.org/wiki/Depth-first_search"><strong>DFS</strong></a>, performed on the possible solutions rather than the
graph nodes. To verify or reject the isomorphism between two graphs, we examine every possible candidate pair of
nodes (one from the first and one from the second graph) and check whether going deeper into the DFS tree is feasible using
specific rules. If a pair is feasible, the DFS tree is expanded and deeper pairs are investigated. When a
pair is not feasible, we go up the tree and follow a different branch, just like in a regular <strong>DFS</strong>. More details
about the algorithm can be found <a href="https://doi.org/10.1016/j.dam.2018.02.018">here</a>.</p>
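<p>To give a feel for the search described above, here is a minimal sketch of a naive backtracking isomorphism check.
It is not VF2++: it has no node ordering and no cutting rules, the function name <code>is_isomorphic</code> is
hypothetical, and the graphs are assumed to be simple undirected graphs stored as dictionaries that map each node to the
set of its neighbors.</p>
<div class="highlight">
  <pre class="chroma"><code>def is_isomorphic(G1, G2):
    """Naive backtracking check on adjacency dicts (node to set of neighbors)."""
    if len(G1) != len(G2):
        return False
    nodes1 = list(G1)

    def feasible(u, v, mapping):
        # Degrees must agree, and u must be adjacent to an already mapped
        # node exactly when v is adjacent to that node's image.
        if len(G1[u]) != len(G2[v]):
            return False
        for w, x in mapping.items():
            if (w in G1[u]) != (x in G2[v]):
                return False
        return True

    def extend(mapping):
        if len(mapping) == len(nodes1):
            return True  # every node of G1 is mapped
        u = nodes1[len(mapping)]  # next node of G1, in a fixed order
        used = set(mapping.values())
        for v in G2:
            if v not in used and feasible(u, v, mapping):
                mapping[u] = v  # go deeper into the search tree
                if extend(mapping):
                    return True
                del mapping[u]  # backtrack and try another branch
        return False

    return extend({})


# A triangle with one pendant node, under two different labelings.
A = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
B = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b", "d"}, "d": {"b", "c"}}
# A path and a star on four nodes, which are not isomorphic.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
star = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}

print(is_isomorphic(A, B), is_isomorphic(path, star))</code></pre>
</div>
<p>The <code>feasible</code> helper plays the role of the rules that decide whether a candidate pair may extend the mapping.
VF2++ replaces this simple check and the fixed node order with far more effective ones.</p>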
<h2 id="motivation">Motivation<a class="headerlink" href="#motivation" title="Link to this heading">#</a></h2>
<p>The major reasons I chose this project stem from both my love for <strong>Graph Theory</strong> and the fascinating nature of
this particular project. The algorithm itself is so recent that <strong>NetworkX</strong> is possibly going to hold one of the first
implementations of it. This might become a reference that helps other organisations develop and optimize future
implementations of the algorithm. Regarding my personal gain, I will become more familiar with
the open source communities and their philosophy, collaborate with highly skilled individuals, and gain
significant experience in researching, working as a team, getting feedback and help when needed, and contributing
to an actual scientific library.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="vf2&#43;&#43;" label="vf2&#43;&#43;" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[SciPy Internship: 2021-2022]]></title>
            <link href="https://blog.scientific-python.org/scipy/internships/smit/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/scipy/internships/smit/</id>
            
            
            <published>2022-06-04T00:00:00+00:00</published>
            <updated>2022-06-04T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Internship Experience</blockquote><p>I was <a href="https://mail.python.org/archives/list/scipy-dev@python.org/message/4S43BYHDQIPQENNJ6EMQY5QZDZK3ZT5I/">selected as an intern</a> to work on the SciPy build system. In this blog post, I will describe my journey through this 10-month-long internship at SciPy. I worked on a variety of topics, from migrating the SciPy build system to <a href="https://mesonbuild.com/index.html">Meson</a> to cleaning up the public API namespaces and adding <a href="https://uarray.org/en/latest/">Uarray</a> support to SciPy submodules.</p>
<h1 id="experience">Experience<a class="headerlink" href="#experience" title="Link to this heading">#</a></h1>
<h2 id="meson-build-system">Meson Build System<a class="headerlink" href="#meson-build-system" title="Link to this heading">#</a></h2>
<p>The main reasons for switching to Meson include (in addition to <code>distutils</code> being deprecated):</p>
<ol>
<li><em>Much faster builds</em></li>
<li><em>Reliability</em></li>
<li><em>Support for cross-compiling</em></li>
<li><em>Better build logs</em></li>
<li><em>Easier to debug build issues</em></li>
</ol>
<p><em>For more details on the initial proposal to switch to Meson, see <a href="https://github.com/scipy/scipy/issues/13615">scipy-13615</a></em></p>
<p>I was initially selected to work on migrating the SciPy build system to <a href="https://mesonbuild.com/index.html">Meson</a>. I started by adding Meson build support
for <a href="https://github.com/rgommers/scipy/pull/35">scipy.misc</a> and <a href="https://github.com/rgommers/scipy/pull/37">scipy.signal</a>. While working on this, we came across many <a href="https://github.com/rgommers/scipy/issues/42">build warnings</a> which we wanted to fix, since they unnecessarily increased the build log and might point to some hidden bugs. I fixed these warnings, the majority of which came from <a href="https://github.com/rgommers/scipy/issues/30">deprecated NumPy C API calls</a>.</p>
<ul>
<li>I also started <a href="https://github.com/rgommers/scipy/issues/58">benchmarking</a> the Meson build with various optimization levels, during which I ended up finding some <a href="https://github.com/scipy/scipy/issues/14667">failing benchmark tests</a> and tried to fix them.</li>
<li>I implemented the <a href="https://github.com/rgommers/scipy/pull/94">dev.py</a> interface that works in a similar way to <code>runtests.py</code>, but using Meson for building SciPy.</li>
<li>I extended my work on the Meson build by writing Python scripts for checking the installation of all <a href="https://github.com/rgommers/scipy/issues/69">test files</a> and <a href="https://github.com/scipy/scipy/pull/16010">.pyi files</a>.</li>
<li>I documented how to use <a href="https://github.com/rgommers/scipy/pull/96">dev.py</a>, and use <a href="https://github.com/scipy/scipy/pull/15953">parallel builds and optimization levels</a> with Meson.</li>
<li>I added a <a href="https://github.com/rgommers/scipy/pull/130">Meson option</a> to switch between BLAS/LAPACK libraries.</li>
</ul>
<p>Meson build support including all the above work was merged into SciPy&rsquo;s <code>main</code> branch around Christmas 2021. Meson will now become the default build in the upcoming 1.9.0 release.</p>
<h2 id="making-cleaner-public-namespaces">Making cleaner public namespaces<a class="headerlink" href="#making-cleaner-public-namespaces" title="Link to this heading">#</a></h2>
<h4 id="whats-the-issue">What&rsquo;s the issue?<a class="headerlink" href="#whats-the-issue" title="Link to this heading">#</a></h4>
<p><em>&ldquo;A basic API design principle is: a public object should only be available from one namespace. Having any function in two or more places is just extra technical debt, and with things like dispatching on an API or another library implementing a mirror API, the cost goes up.&rdquo;</em></p>

<div class="highlight">
  <pre>&gt;&gt;&gt; from scipy import ndimage
&gt;&gt;&gt; ndimage.filters.gaussian_filter is ndimage.gaussian_filter  # :(
True</pre>
</div>

<p>The <a href="http://scipy.github.io/devdocs/reference/index.html#api-definition">API reference docs</a> of SciPy define the public API. However, SciPy still had some submodules that were <em>accidentally</em> somewhat public by missing an underscore at the start of their name.
I worked on <a href="https://github.com/scipy/scipy/issues/14360">cleaning the public namespaces</a> for a couple of months by carefully adding underscores to the <code>.py</code> files that were not meant to be public and adding deprecation warnings that are raised if anyone tries to access them.</p>
<h4 id="the-solution">The solution:<a class="headerlink" href="#the-solution" title="Link to this heading">#</a></h4>

<div class="highlight">
  <pre>&gt;&gt;&gt; from scipy import ndimage
&gt;&gt;&gt; ndimage.filters.gaussian_filter is ndimage.gaussian_filter
&lt;stdin&gt;:1: DeprecationWarning: Please use `gaussian_filter` from the `scipy.ndimage` namespace, the `scipy.ndimage.filters` namespace is deprecated.
True</pre>
</div>
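<p>One common way to produce such a warning is a module-level <code>__getattr__</code> (PEP 562) in the old module that
forwards to the new location. The sketch below is only an illustration and not necessarily how the SciPy shims are
written: the <code>pkg</code>, <code>pkg.filters</code> and <code>pkg._filters</code> names are hypothetical, and the modules
are built with <code>types.ModuleType</code> so that the example runs on its own.</p>
<div class="highlight">
  <pre class="chroma"><code>import types
import warnings

# Hypothetical stand-in for the private implementation module.
_impl = types.ModuleType("pkg._filters")
_impl.gaussian_filter = lambda data: data  # placeholder implementation

# Hypothetical stand-in for the old, accidentally public module.
shim = types.ModuleType("pkg.filters")


def _deprecated_getattr(name):
    # PEP 562: called only when normal attribute lookup on the module fails.
    if not hasattr(_impl, name):
        raise AttributeError(f"module 'pkg.filters' has no attribute {name!r}")
    warnings.warn(
        f"Please use `{name}` from the `pkg` namespace, "
        "the `pkg.filters` namespace is deprecated.",
        DeprecationWarning,
        stacklevel=2,
    )
    return getattr(_impl, name)


shim.__getattr__ = _deprecated_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    func = shim.gaussian_filter  # still works, but emits a warning

print(func is _impl.gaussian_filter, caught[0].category.__name__)</code></pre>
</div>
<p>Old imports keep working for a deprecation period, while every access through the old name points users to the
public namespace.</p>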

<h2 id="adding-uarray-support">Adding Uarray support<a class="headerlink" href="#adding-uarray-support" title="Link to this heading">#</a></h2>
<p><em>&ldquo;SciPy adopted uarray to support a multi-dispatch mechanism with the goal being: allow writing backends for public APIs that execute in parallel, distributed or on GPU.&rdquo;</em></p>
<p>For about the last four months, I worked on adding <a href="https://github.com/scipy/scipy/issues/14353">Uarray support</a> to SciPy submodules. I do recommend reading <a href="https://labs.quansight.org/blog/2021/10/array-libraries-interoperability/">this blog post</a> by Anirudh Dagar covering the motivation and actual usage of <code>uarray</code>. I picked up the following submodules for adding <code>uarray</code> compatibility:</p>
<ul>
<li><a href="https://github.com/rgommers/scipy/pull/101">signal</a></li>
<li><a href="https://github.com/scipy/scipy/pull/15610">linalg</a></li>
<li><a href="https://github.com/scipy/scipy/pull/15665">special</a></li>
</ul>
<p>At the same time, in order to show a working prototype, I also added <code>uarray</code> backends in CuPy to the following submodules:</p>
<ul>
<li><code>cupyx.scipy.ndimage</code> (<a href="https://github.com/cupy/cupy/pull/6403">PR #6403</a>)</li>
<li><code>cupyx.scipy.linalg</code> (<a href="https://github.com/cupy/cupy/pull/6460">PR #6460</a>)</li>
<li><code>cupyx.scipy.special</code> (<a href="https://github.com/cupy/cupy/pull/6564">PR #6564</a>)</li>
</ul>
<p>The pull requests contain links to Colab notebooks which show these features in action.</p>
<h4 id="what-does-usage-of-such-a-backend-look-like">What does usage of such a backend look like?<a class="headerlink" href="#what-does-usage-of-such-a-backend-look-like" title="Link to this heading">#</a></h4>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">scipy</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">cupy</span> <span class="k">as</span> <span class="nn">cp</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">scipy.linalg</span> <span class="kn">import</span> <span class="n">inv</span><span class="p">,</span> <span class="n">set_backend</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">cupyx.scipy.linalg</span> <span class="k">as</span> <span class="nn">_cupy_backend</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">x_cu</span><span class="p">,</span> <span class="n">x_nu</span> <span class="o">=</span> <span class="n">cp</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mf">1.0</span><span class="p">,</span> <span class="mf">2.0</span><span class="p">],</span> <span class="p">[</span><span class="mf">3.0</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">]]),</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mf">1.0</span><span class="p">,</span> <span class="mf">2.0</span><span class="p">],</span> <span class="p">[</span><span class="mf">3.0</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">]])</span>
</span></span><span class="line"><span class="cl"><span class="n">y_scipy</span> <span class="o">=</span> <span class="n">inv</span><span class="p">(</span><span class="n">x_nu</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">with</span> <span class="n">set_backend</span><span class="p">(</span><span class="n">_cupy_backend</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">y_cupy</span> <span class="o">=</span> <span class="n">inv</span><span class="p">(</span><span class="n">x_cu</span><span class="p">)</span></span></span></code></pre>
</div>
<h2 id="miscelleanous-work">Miscellaneous Work<a class="headerlink" href="#miscelleanous-work" title="Link to this heading">#</a></h2>
<ul>
<li><a href="https://github.com/scipy/scipy/pull/15440">Fix path issues in runtests.py</a></li>
<li><a href="https://github.com/scipy/scipy/pull/15250">Array inputs for stats.kappa4</a></li>
<li><a href="https://github.com/rgommers/scipy/pull/115">Fixes mac CI conda env cache</a></li>
</ul>
<h2 id="future-work">Future Work<a class="headerlink" href="#future-work" title="Link to this heading">#</a></h2>
<ul>
<li>The &ldquo;switch to Meson&rdquo; project is nearing its completion. One of the final issues was to allow <a href="https://github.com/scipy/scipy/pull/15476">building wheels</a> with the <code>meson-python</code> backend.</li>
<li>The PRs opened for adding <code>uarray</code> support are still under heavy discussion, and the main aim will be to get them merged as soon as possible once we have reached a concrete decision.</li>
</ul>
<h2 id="things-to-remember">Things to remember<a class="headerlink" href="#things-to-remember" title="Link to this heading">#</a></h2>
<ol>
<li><em>Patience</em>: Setting up a new project always takes some time. We might need to update/fix the system libraries and try to resolve the errors gradually.</li>
<li><em>Learning</em>: Learning new things was one of the main keys to the internship. I was completely new to build systems and GPU libraries.</li>
</ol>
<h2 id="thank-you">Thank You!!<a class="headerlink" href="#thank-you" title="Link to this heading">#</a></h2>
<p>I am very grateful to <a href="https://github.com/rgommers">Ralf Gommers</a> for providing me with this opportunity and believing in me. His guidance, support and patience played a major role during the entire course of the internship.
I am also thankful to the whole SciPy community for helping me with the PR reviews and providing essential feedback. Also, huge thanks to <a href="https://github.com/czgdp1807">Gagandeep Singh</a> for always being a part of this wonderful journey.</p>
<p><em>In a nutshell, I will remember this experience as: <a href="https://github.com/rgommers">Ralf Gommers</a> has boosted my career by millions!</em></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="scipy" label="SciPy" />
                             
                                <category scheme="taxonomy:Tags" term="internship" label="internship" />
                             
                                <category scheme="taxonomy:Tags" term="meson-build" label="meson-build" />
                             
                                <category scheme="taxonomy:Tags" term="uarray" label="uarray" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Team up! Alt text and cross-project community]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/alt-text-workshop-summary/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/scientific-python/gsod-2022-proposal/?utm_source=atom_feed" rel="related" type="text/html" title="Scientific Python GSoD 2022 Proposal" />
            
                <id>https://blog.scientific-python.org/scientific-python/alt-text-workshop-summary/</id>
            
            
            <published>2022-05-19T00:00:00+00:00</published>
            <updated>2022-05-19T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Learn more about alt text on the blog and how you can contribute.</blockquote><p><img src="/scientific-python/alt-text-workshop-summary/alt-text-header.png" alt="Icons of the accessibility symbol, a paper tag labeled ‘Alt’, and a photograph."></p>
<p>The Scientific Python blog has just gotten a little more accessible! If you
didn&rsquo;t catch our <a href="https://twitter.com/scientific_py/status/1523733881651834880">invite on Twitter</a>
or run into the problem firsthand, there&rsquo;s a good chance you might not have
noticed the new descriptions for a number of blog post images.</p>
<p>Since it&rsquo;s not a flashy improvement, we wanted to make a point to highlight
the community effort to make a more accessible blog&ndash;and internet as a
whole&ndash;last week.</p>
<h2 id="what-did-we-do">What did we do?<a class="headerlink" href="#what-did-we-do" title="Link to this heading">#</a></h2>
<p>In the spirit of <a href="https://scientific-python.org/about/">Scientific Python&rsquo;s mission to build community-developed and
inclusive spaces</a>, the project had its
first image description workshop on May 13, 2022. We gathered to learn about
and practice writing about images from the ground up. Image descriptions&ndash;
<a href="https://developer.mozilla.org/en-US/docs/Web/API/HTMLImageElement/alt">often called <code>alt text</code></a>&ndash;
are one of many accessibility considerations that people making content are
responsible for.
<a href="https://webaim.org/projects/million/#alttext">Missing alt text is among the most common culprits of inaccessible content on the internet</a>.
Fortunately, missing alt text has a clear fix even for those new to
accessibility: describe the image based on its context.</p>
<p><img src="/scientific-python/alt-text-workshop-summary/group.png" alt="Some of the team behind the alt text. Listed from left to right in Zoom, they are Mars Lee, Jarrod Millman, Isabela Presedo-Floyd, Jan-Hendrik Müller, Mridul Seth, Maxwell Grover, Noa Tamir, Pamphile Roy, and Saranjeet Kaur. Several event participants are not pictured per their request."></p>
<p>During this event, thirteen people wrote 23 image descriptions to improve 11
blog posts in one hour (plus the time for post-event feedback 😉). Wow! The
images covered range from illustrative additions, to creative tutorials, to
charts that carry the message of the blog post, so they provide a great set of
examples on how to consider
<a href="https://www.w3.org/WAI/tutorials/images/decision-tree/">writing alt text in different situations</a>.</p>
<p>That&rsquo;s not even counting the many questions, discussions, and epiphanies that
happened during our work time. Learning can&rsquo;t be captured as easily by a
number, but a taste of our less quantifiable wins can be found on
<a href="https://youtu.be/Zn-zyU2lS0k">the workshop recording</a>.</p>
<h2 id="want-to-learn-more">Want to learn more?<a class="headerlink" href="#want-to-learn-more" title="Link to this heading">#</a></h2>
<p>If you want to further explore the experience and relive the joy, you can find
resources on the <a href="https://hackmd.io/bfhftUCiTRqx2S8CTGUt6g?view">event agenda</a>,
discussions on the <a href="https://github.com/MarsBarLee/blog.scientific-python.org/pull/1">working pull request</a>,
and the final steps needed to make change on the <a href="https://github.com/scientific-python/blog.scientific-python.org/pull/71">contributing pull request</a>.</p>
<p>But wait, there&rsquo;s more! While the community made great progress on improving
the blog, the work to add or improve alt text on Scientific Python&rsquo;s blog,
website, and documentation is ongoing. You can continue these efforts by</p>
<ul>
<li>Contributing as an individual! There&rsquo;s no reason you can&rsquo;t write some image descriptions for your favorite blog post to make it even better. (👀 This is a great excuse to find a new favorite blog post.)</li>
<li>Contributing as a group! Collect your friends, co-workers, neighbors, local book club attendees, or others to team up and write alt text. (Help for duplicating the process of our group PR can be found <a href="https://github.com/isabela-pf/a11y-events/tree/main/workshop-resources/alt-text">on the draft event documentation</a>.)</li>
<li>Staying in touch for announcements of future contributing events on the <a href="https://twitter.com/scientific_py">Scientific Python Twitter</a>.</li>
</ul>
<hr>
<p><em>This image description event has been run across multiple open-source projects in the scientific computing ecosystem. <a href="https://github.com/alt-text-task-force">Reach out via issue</a> if you are interested in running a similar event on a project you are part of.</em> ❤️</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="Scientific-Python" />
                             
                                <category scheme="taxonomy:Tags" term="event" label="event" />
                             
                                <category scheme="taxonomy:Tags" term="accessibility" label="accessibility" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[NumPy Contributor Spotlight: Mukulika Pahari]]></title>
            <link href="https://blog.scientific-python.org/numpy/mukulikapahari/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/numpy/mukulikapahari/</id>
            
            
            <published>2022-04-12T00:00:00+00:00</published>
            <updated>2022-04-12T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>An interview with Mukulika Pahari, a NumPy contributor.</blockquote><p>Our first Contributor Spotlight interview is with Mukulika Pahari, our “go-to” person for NumPy documentation.
Mukulika is a Computer Science student at Mumbai University. Her passions outside of
computing involve things with paper, including reading books (fiction!), folding origami, and journaling.
During our interview she discussed why she joined NumPy, what keeps her motivated, and how likely
she would be to recommend becoming a NumPy contributor.</p>
<h2 id="tell-us-something-about-yourself">Tell us something about yourself.<a class="headerlink" href="#tell-us-something-about-yourself" title="Link to this heading">#</a></h2>
<p>Hi, I am Mukulika. I live in Mumbai, India, and I’m completing my Computer Science degree at Mumbai University.
I joined NumPy last summer during Google Season of Docs. The idea behind this initiative is to raise awareness
of open source, the role of documentation, and the importance of technical writing. It also gives technical
writers an opportunity to gain experience working on open source projects.</p>
<p>Apart from that, I like to read fiction &ndash; literally everything that I can put my hands on &ndash;
and I find it relaxing to learn origami from YouTube tutorials.</p>
<h2 id="what-is-your-role-in-the-numpy-project">What is your role in the NumPy project?<a class="headerlink" href="#what-is-your-role-in-the-numpy-project" title="Link to this heading">#</a></h2>
<p>I write technical documentation for NumPy, and I help new contributors with their questions.</p>
<h2 id="what-are-your-best-and-worst-experiences-contributing-to-numpy-or-other-open-source-projects">What are your best and worst experiences contributing to NumPy or other open source projects?<a class="headerlink" href="#what-are-your-best-and-worst-experiences-contributing-to-numpy-or-other-open-source-projects" title="Link to this heading">#</a></h2>
<p>The best part for me, honestly, is the people. It is inspiring to meet people from diverse backgrounds
all over the world and do something together. However, I do find it quite scary to put your code out
there for &ldquo;the whole world to see and evaluate.&rdquo; It can challenge my confidence. But meeting all
the contributors, seeing their work, and getting their valuable feedback is absolutely worth it.</p>
<h2 id="what-motivated-you-to-join-and-contribute-to-numpywhat-keeps-you-motivated">What motivated you to join and contribute to NumPy? What keeps you motivated?<a class="headerlink" href="#what-motivated-you-to-join-and-contribute-to-numpywhat-keeps-you-motivated" title="Link to this heading">#</a></h2>
<p>Since I already used NumPy in my data analysis courses in school, and now I am using it at my internship,
I thought that I could also contribute to it. It is always more fun to do side projects in a group.
Once you get to know the people in the NumPy community, you want to stay. They are really open and supportive!</p>
<h2 id="what-is-the-book-youve-given-most-as-a-gift-and-why">What is the book you’ve given most as a gift and why?<a class="headerlink" href="#what-is-the-book-youve-given-most-as-a-gift-and-why" title="Link to this heading">#</a></h2>
<p>Well, I do not really give out books to people &ndash; being a broke college student is quite a barrier.
But I think that everyone should read &ldquo;The Hitchhiker&rsquo;s Guide to the Galaxy&rdquo; by Douglas Adams. It is
absolutely hilarious! It is both entertaining and spiked with wisdom.</p>
<h2 id="what-small-purchase-has-most-impacted-your-life-recently">What small purchase has most impacted your life recently?<a class="headerlink" href="#what-small-purchase-has-most-impacted-your-life-recently" title="Link to this heading">#</a></h2>
<p>I recently bought a nice journal and started to write in it. I find it very cleansing to put thoughts
on paper and give them structure. I appreciate pretty paper products&ndash;this one has pastel pages.</p>
<h2 id="how-has-a-failure-or-apparent-failure-set-you-up-for-later-success">How has a failure, or apparent failure, set you up for later success?<a class="headerlink" href="#how-has-a-failure-or-apparent-failure-set-you-up-for-later-success" title="Link to this heading">#</a></h2>
<p>I can&rsquo;t think of a specific situation, but, in general, all my experiences so far seem
to follow a general theme: it is absolutely okay not to be great at everything.
You fail, and then you learn for the future.</p>
<h2 id="who-do-you-consider-a-successful-person">Who do you consider a successful person?<a class="headerlink" href="#who-do-you-consider-a-successful-person" title="Link to this heading">#</a></h2>
<p>My definition of success is being happy without causing harm to anyone.</p>
<h2 id="what-advice-would-you-give-someone-at-the-start-of-their-career-what-advice-should">What advice would you give someone at the start of their career? What advice should they ignore?<a class="headerlink" href="#what-advice-would-you-give-someone-at-the-start-of-their-career-what-advice-should" title="Link to this heading">#</a></h2>
<p>Since I am at the beginning of my career, I can&rsquo;t say much. But I think it is nice
to listen to everyone and get feedback, with the mindset that you do not necessarily have
to act on their advice. Having multiple perspectives is good.</p>
<h2 id="how-likely-you-are-to-recommend-contributing-to-numpy-on-a-scale-of-zero-to-ten">How likely are you to recommend contributing to NumPy on a scale of zero to ten?<a class="headerlink" href="#how-likely-you-are-to-recommend-contributing-to-numpy-on-a-scale-of-zero-to-ten" title="Link to this heading">#</a></h2>
<p>I&rsquo;d say a solid nine! It is overall a great experience.</p>
<h2 id="is-there-something-you-wish-id-asked-but-havent">Is there something you wish I’d asked but haven’t?<a class="headerlink" href="#is-there-something-you-wish-id-asked-but-havent" title="Link to this heading">#</a></h2>
<p>Yes. What I like the most about the NumPy community is that
it does not require huge commitments time-wise. Every little thing is appreciated,
so that is certainly motivating.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="contributor-spotlight" label="contributor-spotlight" />
                             
                                <category scheme="taxonomy:Tags" term="numpy" label="numpy" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[A quick tour of QMC with SciPy]]></title>
            <link href="https://blog.scientific-python.org/scipy/qmc-basics/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/scipy/qmc-basics/</id>
            
            
            <published>2022-04-04T00:00:00+00:00</published>
            <updated>2022-04-04T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Do you need to use random numbers? Use Quasi-Monte Carlo (QMC) methods instead. What is QMC? Why should you care? And how do you use it?</blockquote><p>By the end of this article, my goal is to convince you that if you need to
use random numbers, you <em>should</em> consider using
<a href="https://scipy.github.io/devdocs/reference/stats.qmc.html"><code>scipy.stats.qmc</code></a>
instead of
<a href="https://numpy.org/doc/stable/reference/random/index.html"><code>np.random</code></a>.</p>
<p>In the following, we assume that <em>SciPy</em>, <em>NumPy</em> and <em>Matplotlib</em> are
installed and imported:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">scipy.stats</span> <span class="kn">import</span> <span class="n">qmc</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span></span></span></code></pre>
</div>
<blockquote>
<p>Note that no seeding is used in these examples. This will be the topic
of another article: seeding should <strong>only</strong> be used for testing purposes.</p>
</blockquote>
<p><strong>So what are Monte Carlo (MC) and Quasi-Monte Carlo (QMC)?</strong></p>
<h2 id="monte-carlo-mc">Monte Carlo (MC)<a class="headerlink" href="#monte-carlo-mc" title="Link to this heading">#</a></h2>
<blockquote>
<p>MC methods are a broad class of computational algorithms that rely on
repeated random sampling to obtain numerical results.
The underlying concept is to use randomness to solve problems that might be
deterministic in principle. They are often used in physical and mathematical
problems and are most useful when it is difficult or impossible to use other
approaches. MC methods are mainly used in three classes of problem:
optimization, numerical integration, and generating draws from a probability
distribution.</p>
</blockquote>
<p>Put simply, this is how you would usually generate a <em>sample</em> of points using
MC:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">sample</span> <span class="o">=</span> <span class="n">rng</span><span class="o">.</span><span class="n">random</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="mi">256</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span></span></span></code></pre>
</div>
<p>In this case, <code>sample</code> is a 2-dimensional array with 256 points which can be
visualized using a 2D scatter plot.</p>
<p><img src="/scipy/qmc-basics/mc.png" alt="MC sample using a 2D scatter plot."></p>
<p>In the plot above, points are generated randomly without any
knowledge about previously drawn points. It is clear that some regions of the
space are left unexplored while other regions have clusters. In an optimization
problem, it could mean that you would need to generate more samples to find the
optimum. Or in a regression problem, you could overfit a model because of
some clusters of points.</p>
<p>Generating random numbers is a more complex problem than it sounds. Simple MC
methods are designed to sample points that are independent and identically
distributed (IID).</p>
<p>One could think that the solution is just to use a grid! But look what
happens if we place 10 evenly spaced points along each dimension of the unit
hypercube (with all bounds ranging from 0 to 1), a spacing of roughly 0.1.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">disc</span> <span class="o">=</span> <span class="mi">10</span>
</span></span><span class="line"><span class="cl"><span class="n">x1</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">disc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">x2</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">disc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">x3</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">disc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">,</span> <span class="n">x3</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">meshgrid</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">,</span> <span class="n">x3</span><span class="p">)</span></span></span></code></pre>
</div>
<p>Filling the unit interval this way takes 10 points.
In a 2-dimensional hypercube the same spacing requires 100 points, and in 3
dimensions 1,000. The number of samples required to fill the space rises
exponentially with the number of dimensions. This exponential growth is called
the <em>curse of dimensionality</em>.</p>
<p><img src="/scipy/qmc-basics/curse.png" alt="Curse of dimensionality. 3 figures: 1D, 2D and 3D scatter plots of samples."></p>
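<p>As a rough illustration of this growth, the sketch below counts the points of
a full grid with 10 points per dimension. The list of dimensions and the names
<code>grid</code> and <code>n_points_3d</code> are only chosen for this example.</p>

```python
import numpy as np

disc = 10  # points per dimension, as in the grid above

# A full grid needs disc**d points in d dimensions.
for d in (1, 2, 3, 5, 10):
    print(f"d={d:2d}: {disc ** d:,} points")

# Cross-check the 3D case against an actual meshgrid.
axes = [np.linspace(0, 1, disc)] * 3
grid = np.stack(np.meshgrid(*axes), axis=-1).reshape(-1, 3)
n_points_3d = grid.shape[0]
print(n_points_3d)
```

<p>Already at 10 dimensions, the same grid would need ten billion points.</p>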
<h2 id="quasi-monte-carlo-qmc">Quasi-Monte Carlo (QMC)<a class="headerlink" href="#quasi-monte-carlo-qmc" title="Link to this heading">#</a></h2>
<p>To mitigate the <em>curse of dimensionality</em>, you could decide to randomly
remove points from the sample or randomly sample in n dimensions. In both
cases, this <strong>will</strong> lead to empty regions and clusters of points elsewhere.</p>
<p>Quasi-Monte Carlo (QMC) methods have been created specifically to answer this
problem. As opposed to MC methods, QMC methods are deterministic, which means
that the points are not IID: each new point knows about the previous points.
The result is that we can construct samples with good coverage of the space.</p>
<blockquote>
<p>Deterministic does <strong>not</strong> mean that samples are always the same:
the sequences can be scrambled.</p>
</blockquote>
<p>Starting with version 1.7, SciPy provides QMC methods in
<a href="https://scipy.github.io/devdocs/reference/stats.qmc.html"><code>scipy.stats.qmc</code></a>.</p>
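<p>To make the note on scrambling more concrete, here is a minimal sketch that
compares unscrambled and scrambled Sobol&rsquo; engines through the
<code>scramble</code> argument of <code>qmc.Sobol</code>. The sample size of 8
and the variable names are arbitrary choices for this illustration.</p>

```python
import numpy as np
from scipy.stats import qmc

# Two unscrambled engines generate exactly the same points.
plain_a = qmc.Sobol(d=2, scramble=False).random(n=8)
plain_b = qmc.Sobol(d=2, scramble=False).random(n=8)
same_unscrambled = np.array_equal(plain_a, plain_b)

# Scrambling (the default) randomizes the sequence, so two engines
# created without a seed are expected to produce different points.
scr_a = qmc.Sobol(d=2, scramble=True).random(n=8)
scr_b = qmc.Sobol(d=2, scramble=True).random(n=8)
same_scrambled = np.array_equal(scr_a, scr_b)

print(same_unscrambled, same_scrambled)
```

<p>The unscrambled samples should match exactly, while the scrambled ones should
differ from one engine to the next.</p>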
<p>Let&rsquo;s generate two samples: one with MC and one with a QMC method named <em>Sobol&rsquo;</em>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">n</span><span class="p">,</span> <span class="n">d</span> <span class="o">=</span> <span class="mi">256</span><span class="p">,</span> <span class="mi">2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">sample_mc</span> <span class="o">=</span> <span class="n">rng</span><span class="o">.</span><span class="n">random</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">d</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">qrng</span> <span class="o">=</span> <span class="n">qmc</span><span class="o">.</span><span class="n">Sobol</span><span class="p">(</span><span class="n">d</span><span class="o">=</span><span class="n">d</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">sample_qmc</span> <span class="o">=</span> <span class="n">qrng</span><span class="o">.</span><span class="n">random</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="n">n</span><span class="p">)</span></span></span></code></pre>
</div>
<p>A very similar interface, but as seen below, with radically different results.</p>
<p><img src="/scipy/qmc-basics/mc_sobol.png" alt="Comparison between MC and QMC samples using a 2D scatter plot."></p>
<p>The 2D space clearly exhibits fewer empty areas and fewer clusters with the QMC
sample.</p>
<h2 id="quality">Quality?<a class="headerlink" href="#quality" title="Link to this heading">#</a></h2>
<p>Beyond the visual improvement of <em>quality</em>, there are metrics to assess the
quality of a sample. Geometrical criteria are commonly used: for example, one can
compute the distance (L1, L2, etc.) between all pairs of points. But there
are also statistical criteria, such as the
<a href="https://scipy.github.io/devdocs/reference/generated/scipy.stats.qmc.discrepancy.html">discrepancy</a>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">qmc</span><span class="o">.</span><span class="n">discrepancy</span><span class="p">(</span><span class="n">sample_mc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># 0.0009</span>
</span></span><span class="line"><span class="cl"><span class="n">qmc</span><span class="o">.</span><span class="n">discrepancy</span><span class="p">(</span><span class="n">sample_qmc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># 1.1e-05</span></span></span></code></pre>
</div>
<p>The lower the value, the better the quality.</p>
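<p>The geometrical criterion mentioned above can be sketched in a similar way.
The example below uses <code>scipy.spatial.distance.pdist</code> to compute the
L2 distance between all pairs of points and reports the smallest one. Taking the
minimum as the summary statistic is an assumption made for this illustration,
and other choices are possible.</p>

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import qmc

n, d = 256, 2
sample_mc = np.random.default_rng().random(size=(n, d))
sample_qmc = qmc.Sobol(d=d).random(n=n)

# pdist returns the Euclidean (L2) distance for every pair of points.
min_dist_mc = pdist(sample_mc).min()
min_dist_qmc = pdist(sample_qmc).min()

print(f"smallest pairwise distance, MC:  {min_dist_mc:.4f}")
print(f"smallest pairwise distance, QMC: {min_dist_qmc:.4f}")
```

<p>Here, unlike with the discrepancy, a larger value is better: it means that no
two points sit almost on top of each other.</p>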
<h2 id="convergence">Convergence<a class="headerlink" href="#convergence" title="Link to this heading">#</a></h2>
<p>If this still does not convince you, let&rsquo;s look at a concrete example:
integrating a function. Let&rsquo;s look at the mean of the squared sum in
5 dimensions:</p>
<p>$$f(\mathbf{x}) = \left( \sum_{j=1}^{5}x_j \right)^2,$$</p>
<p>with $x_j \sim \mathcal{U}(0,1)$. It has a known mean value,
$\mu = 5/3+5(5-1)/4$. By sampling points, we can compute that mean numerically.</p>
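<p>This reference value can be checked by hand, assuming the $x_j$ are
independent. Expanding the square and using $\mathbb{E}[x_j] = 1/2$ and
$\mathbb{E}[x_j^2] = 1/3$ gives:</p>
<p>$$\mathbb{E}[f(\mathbf{x})] = \sum_{j=1}^{5}\mathbb{E}[x_j^2] + \sum_{i \neq j}\mathbb{E}[x_i]\,\mathbb{E}[x_j] = \frac{5}{3} + \frac{5(5-1)}{4} = \frac{20}{3} \approx 6.67.$$</p>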
<blockquote>
<p>The MC samplings are done 99 times and averaged, while the Sobol&rsquo; sample is
drawn once per sample size. The variance is not reported for simplicity; just
know that it is expected to be lower with QMC than with MC.</p>
</blockquote>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">dim</span> <span class="o">=</span> <span class="mi">5</span>
</span></span><span class="line"><span class="cl"><span class="n">ref</span> <span class="o">=</span> <span class="mi">5</span> <span class="o">/</span> <span class="mi">3</span> <span class="o">+</span> <span class="mi">5</span> <span class="o">*</span> <span class="p">(</span><span class="mi">5</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="mi">4</span>
</span></span><span class="line"><span class="cl"><span class="n">n_conv</span> <span class="o">=</span> <span class="mi">99</span>
</span></span><span class="line"><span class="cl"><span class="n">ns_gen</span> <span class="o">=</span> <span class="mi">2</span> <span class="o">**</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">13</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">func</span><span class="p">(</span><span class="n">sample</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># dim 5, true value 5/3 + 5*(5 - 1)/4</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">sample</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">conv_method</span><span class="p">(</span><span class="n">sampler</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="n">n_samples</span><span class="p">,</span> <span class="n">n_conv</span><span class="p">,</span> <span class="n">ref</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">samples</span> <span class="o">=</span> <span class="p">[</span><span class="n">sampler</span><span class="p">(</span><span class="n">n_samples</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_conv</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">    <span class="n">samples</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">evals</span> <span class="o">=</span> <span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">func</span><span class="p">(</span><span class="n">sample</span><span class="p">))</span> <span class="o">/</span> <span class="n">n_samples</span> <span class="k">for</span> <span class="n">sample</span> <span class="ow">in</span> <span class="n">samples</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">squared_errors</span> <span class="o">=</span> <span class="p">(</span><span class="n">ref</span> <span class="o">-</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">evals</span><span class="p">))</span> <span class="o">**</span> <span class="mi">2</span>
</span></span><span class="line"><span class="cl">    <span class="n">rmse</span> <span class="o">=</span> <span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">squared_errors</span><span class="p">)</span> <span class="o">/</span> <span class="n">n_conv</span><span class="p">)</span> <span class="o">**</span> <span class="mf">0.5</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">rmse</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Analysis</span>
</span></span><span class="line"><span class="cl"><span class="n">sample_mc_rmse</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="n">sample_sobol_rmse</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">ns</span> <span class="ow">in</span> <span class="n">ns_gen</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Monte Carlo</span>
</span></span><span class="line"><span class="cl">    <span class="n">sampler_mc</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">rng</span><span class="o">.</span><span class="n">random</span><span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">dim</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">conv_res</span> <span class="o">=</span> <span class="n">conv_method</span><span class="p">(</span><span class="n">sampler_mc</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="n">ns</span><span class="p">,</span> <span class="n">n_conv</span><span class="p">,</span> <span class="n">ref</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">sample_mc_rmse</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">conv_res</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Sobol&#39;</span>
</span></span><span class="line"><span class="cl">    <span class="n">engine</span> <span class="o">=</span> <span class="n">qmc</span><span class="o">.</span><span class="n">Sobol</span><span class="p">(</span><span class="n">d</span><span class="o">=</span><span class="n">dim</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">conv_res</span> <span class="o">=</span> <span class="n">conv_method</span><span class="p">(</span><span class="n">engine</span><span class="o">.</span><span class="n">random</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="n">ns</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">ref</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">sample_sobol_rmse</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">conv_res</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">sample_mc_rmse</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">sample_mc_rmse</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">sample_sobol_rmse</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">sample_sobol_rmse</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Plot</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_aspect</span><span class="p">(</span><span class="s2">&#34;equal&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># MC</span>
</span></span><span class="line"><span class="cl"><span class="n">ratio</span> <span class="o">=</span> <span class="n">sample_mc_rmse</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">/</span> <span class="n">ns_gen</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">**</span> <span class="p">(</span><span class="o">-</span><span class="mi">1</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">ns_gen</span><span class="p">,</span> <span class="n">ns_gen</span> <span class="o">**</span> <span class="p">(</span><span class="o">-</span><span class="mi">1</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span> <span class="o">*</span> <span class="n">ratio</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="s2">&#34;-&#34;</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="s2">&#34;k&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">ns_gen</span><span class="p">,</span> <span class="n">sample_mc_rmse</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s2">&#34;MC: np.random&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Sobol&#39;</span>
</span></span><span class="line"><span class="cl"><span class="n">ratio</span> <span class="o">=</span> <span class="n">sample_sobol_rmse</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">/</span> <span class="n">ns_gen</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">**</span> <span class="p">(</span><span class="o">-</span><span class="mi">2</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">ns_gen</span><span class="p">,</span> <span class="n">ns_gen</span> <span class="o">**</span> <span class="p">(</span><span class="o">-</span><span class="mi">2</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span> <span class="o">*</span> <span class="n">ratio</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="s2">&#34;-&#34;</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="s2">&#34;k&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">ns_gen</span><span class="p">,</span> <span class="n">sample_sobol_rmse</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s2">&#34;QMC: qmc.Sobol&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="sa">r</span><span class="s2">&#34;$N_s$&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_xscale</span><span class="p">(</span><span class="s2">&#34;log&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_xticks</span><span class="p">(</span><span class="n">ns_gen</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_xticklabels</span><span class="p">([</span><span class="sa">rf</span><span class="s2">&#34;$2^</span><span class="se">{{</span><span class="si">{</span><span class="n">ns</span><span class="si">}</span><span class="se">}}</span><span class="s2">$&#34;</span> <span class="k">for</span> <span class="n">ns</span> <span class="ow">in</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">13</span><span class="p">)])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="sa">r</span><span class="s2">&#34;$\log (\epsilon)$&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_yscale</span><span class="p">(</span><span class="s2">&#34;log&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="s2">&#34;upper right&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">tight_layout</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/scipy/qmc-basics/conv_mc_sobol.png" alt="Convergence of the integration error with MC and QMC."></p>
<p>With MC, the approximation error follows a theoretical rate of $O(n^{-1/2})$.
QMC methods converge faster: they achieve $O(n^{-1})$
for this function, and even better rates on very smooth functions.</p>
<p>This means that using $2^8=256$ points from <em>Sobol&rsquo;</em> leads to a lower
error than using $2^{12}=4096$ points from MC! When the function evaluation
is costly, it can bring huge computational savings.</p>
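<p>To get a feel for what these rates imply, here is a rough back-of-the-envelope
sketch. It assumes idealized error models, $\epsilon = c\,n^{-1/2}$ for MC and
$\epsilon = c\,n^{-1}$ for QMC, with the same constant $c$ for both. That is a
simplification rather than something measured above, and the function names are
only illustrative.</p>

```python
def mc_error(n, c=1.0):
    """Idealized MC error model: c * n^(-1/2) (assumed, not measured)."""
    return c * n ** -0.5


def qmc_error(n, c=1.0):
    """Idealized QMC error model: c * n^(-1) (assumed, not measured)."""
    return c * n ** -1.0


def mc_points_to_match(n_qmc):
    """MC sample size with the same idealized error as n_qmc QMC points.

    Solving c * n_mc^(-1/2) = c * n_qmc^(-1) gives n_mc = n_qmc^2.
    """
    return n_qmc**2


n_qmc = 2**8
print(qmc_error(n_qmc) < mc_error(2**12))  # is 2^8 QMC better than 2^12 MC?
print(mc_points_to_match(n_qmc))  # MC points needed to catch up
```

<p>Under these assumptions, MC would need $2^{16}=65536$ points to reach the error
of $2^8$ QMC points. Real constants differ between methods and functions, so treat
this as an order of magnitude only.</p>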
<h2 id="sampling-from-any-distribution-advanced">Sampling from any distribution (advanced)<a class="headerlink" href="#sampling-from-any-distribution-advanced" title="Link to this heading">#</a></h2>
<p>But there is more! Another great use of QMC is to sample arbitrary
distributions. In SciPy 1.8, there are new classes of
<a href="https://scipy.github.io/devdocs/reference/stats.sampling.html">samplers</a>
that allow you to sample from any custom distribution. And some of
these methods can use QMC with a <code>qrvs</code> method:</p>
<ul>
<li><a href="https://scipy.github.io/devdocs/reference/generated/scipy.stats.sampling.NumericalInversePolynomial.html">NumericalInversePolynomial</a></li>
<li><a href="https://scipy.github.io/devdocs/reference/generated/scipy.stats.sampling.NumericalInverseHermite.html">NumericalInverseHermite</a></li>
</ul>
<p>Here is an example with a distribution from SciPy: <em>fisk</em>. We generate
an MC sample from the distribution (either directly from the distribution with
<code>fisk.rvs</code> or using <code>NumericalInverseHermite.rvs</code>) and another sample with
QMC using <code>NumericalInverseHermite.qrvs</code>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">scipy.stats</span> <span class="k">as</span> <span class="nn">stats</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">scipy.stats</span> <span class="kn">import</span> <span class="n">sampling</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Any distribution</span>
</span></span><span class="line"><span class="cl"><span class="n">c</span> <span class="o">=</span> <span class="mf">3.9</span>
</span></span><span class="line"><span class="cl"><span class="n">dist</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">fisk</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># MC</span>
</span></span><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">default_rng</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">sample_mc</span> <span class="o">=</span> <span class="n">dist</span><span class="o">.</span><span class="n">rvs</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="n">random_state</span><span class="o">=</span><span class="n">rng</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># QMC</span>
</span></span><span class="line"><span class="cl"><span class="n">rng_dist</span> <span class="o">=</span> <span class="n">sampling</span><span class="o">.</span><span class="n">NumericalInverseHermite</span><span class="p">(</span><span class="n">dist</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># sample_mc = rng_dist.rvs(128, random_state=rng)  # MC alternative same as above</span>
</span></span><span class="line"><span class="cl"><span class="n">qrng</span> <span class="o">=</span> <span class="n">qmc</span><span class="o">.</span><span class="n">Sobol</span><span class="p">(</span><span class="n">d</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">sample_qmc</span> <span class="o">=</span> <span class="n">rng_dist</span><span class="o">.</span><span class="n">qrvs</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="n">qmc_engine</span><span class="o">=</span><span class="n">qrng</span><span class="p">)</span></span></span></code></pre>
</div>
<p>Let&rsquo;s visualize the difference between MC and QMC by calculating the empirical
Probability Density Function (PDF). The QMC results are clearly superior
to MC.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># Visualization</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">axs</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">sharey</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">sharex</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="n">dist</span><span class="o">.</span><span class="n">ppf</span><span class="p">(</span><span class="mf">0.01</span><span class="p">),</span> <span class="n">dist</span><span class="o">.</span><span class="n">ppf</span><span class="p">(</span><span class="mf">0.99</span><span class="p">),</span> <span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">pdf</span> <span class="o">=</span> <span class="n">dist</span><span class="o">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">delta</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">max</span><span class="p">(</span><span class="n">pdf</span><span class="p">)</span> <span class="o">*</span> <span class="mf">5e-2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">samples</span> <span class="o">=</span> <span class="p">{</span><span class="s2">&#34;MC: np.random&#34;</span><span class="p">:</span> <span class="n">sample_mc</span><span class="p">,</span> <span class="s2">&#34;QMC: qmc.Sobol&#34;</span><span class="p">:</span> <span class="n">sample_qmc</span><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">ax</span><span class="p">,</span> <span class="n">sample</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">axs</span><span class="p">,</span> <span class="n">samples</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_title</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">pdf</span><span class="p">,</span> <span class="s2">&#34;-&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s2">&#34;fisk PDF&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">samples</span><span class="p">[</span><span class="n">sample</span><span class="p">],</span> <span class="o">-</span><span class="n">delta</span> <span class="o">-</span> <span class="n">delta</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">random</span><span class="p">(</span><span class="mi">128</span><span class="p">),</span> <span class="s2">&#34;+k&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">kde</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">gaussian_kde</span><span class="p">(</span><span class="n">samples</span><span class="p">[</span><span class="n">sample</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">kde</span><span class="p">(</span><span class="n">x</span><span class="p">),</span> <span class="s2">&#34;-.&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s2">&#34;empirical PDF&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># or use a histogram</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># ax.hist(samples[sample], density=True, histtype=&#39;stepfilled&#39;, alpha=0.2)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xlim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">axs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="s2">&#34;best&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">supylabel</span><span class="p">(</span><span class="s2">&#34;Density&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">supxlabel</span><span class="p">(</span><span class="s2">&#34;Sample value&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">tight_layout</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/scipy/qmc-basics/fisk_mc_sobol.png" alt="Probability density function of the fisk distribution.
Comparison with empirical distributions built with MC and QMC."></p>
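<p>If you prefer a number to a picture, one option is the Kolmogorov-Smirnov
statistic: the largest gap between the empirical CDF of a sample and the exact
CDF. This metric is a suggestion added here, not part of the comparison above.
The sketch rebuilds both samples so that it runs on its own.</p>

```python
import numpy as np
import scipy.stats as stats
from scipy.stats import qmc, sampling

dist = stats.fisk(3.9)
n = 128  # a power of 2, as Sobol' requires

# MC sample
rng = np.random.default_rng()
sample_mc = dist.rvs(n, random_state=rng)

# QMC sample
rng_dist = sampling.NumericalInverseHermite(dist)
sample_qmc = rng_dist.qrvs(n, qmc_engine=qmc.Sobol(d=1))

# KS statistic: largest gap between the empirical CDF and the exact CDF
ks_mc = stats.kstest(sample_mc, dist.cdf).statistic
ks_qmc = stats.kstest(sample_qmc, dist.cdf).statistic
print(f"KS statistic, MC:  {ks_mc:.4f}")
print(f"KS statistic, QMC: {ks_qmc:.4f}")
```

<p>A smaller value means the sample follows the distribution more closely. As with
the plot, nothing is seeded here, so run it a few times and compare.</p>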
<p>Careful readers will note that there is no seeding. This is intentional as
noted at the beginning of this article. You might run this code
again and have better results with MC. <strong>But</strong> only sometimes. And that&rsquo;s
exactly my point. On average, QMC gives you more consistent and
higher-quality results. I invite you to try it and see for
yourself!</p>
<h2 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h2>
<p>I hope that I convinced you to use QMC the next time you need random numbers.
QMC is superior to MC, period.</p>
<p>There is an extensive body of literature and rigorous proofs. One reason MC is
still more popular is that QMC is harder to implement and, depending on the
method, there are rules to follow.</p>
<p>Take the <em>Sobol&rsquo;</em> method we used: you must use exactly $2^n$ samples. If you
don&rsquo;t, you will break some properties and end up with the same
performance as MC. This is why some people argue that QMC is not better:
they simply don&rsquo;t use the methods properly, hence fail to see any benefits and
conclude that MC is &ldquo;enough&rdquo;.</p>
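<p>A simple way to stay within this rule is to ask for the sample size as an
exponent. The sketch below uses <code>Sobol.random_base2</code>, which draws $2^m$
points. The dimension and exponent are arbitrary values picked for illustration.</p>

```python
from scipy.stats import qmc

m = 8  # exponent: we draw 2**m = 256 points
sampler = qmc.Sobol(d=2)
sample = sampler.random_base2(m=m)

print(sample.shape)
```

<p>Thinking in exponents rather than raw counts makes it hard to request a sample
size that is not a power of 2 by accident.</p>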
<p>In
<a href="https://scipy.github.io/devdocs/reference/stats.qmc.html"><code>scipy.stats.qmc</code></a>,
we went to great lengths to explain how to use the methods, and we added some
explicit warnings to make the methods accessible and useful to
everyone.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="scipy" label="scipy" />
                             
                                <category scheme="taxonomy:Tags" term="tutorial" label="tutorial" />
                             
                                <category scheme="taxonomy:Tags" term="random-numbers" label="random-numbers" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Scientific Python GSoD 2022 Proposal]]></title>
            <link href="https://blog.scientific-python.org/scientific-python/gsod-2022-proposal/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsod-developing-matplotlib-entry-paths/?utm_source=atom_feed" rel="related" type="text/html" title="GSoD: Developing Matplotlib Entry Paths" />
            
                <id>https://blog.scientific-python.org/scientific-python/gsod-2022-proposal/</id>
            
            
            <published>2022-03-25T00:00:00+00:00</published>
            <updated>2022-03-25T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Create educational content for the Scientific Python Blog</blockquote><h2 id="create-educational-content-for-the-scientific-python-blog">Create educational content for the Scientific Python Blog<a class="headerlink" href="#create-educational-content-for-the-scientific-python-blog" title="Link to this heading">#</a></h2>
<h2 id="about-your-organization">About your organization<a class="headerlink" href="#about-your-organization" title="Link to this heading">#</a></h2>
<p>With an extensive and high-quality ecosystem of libraries, scientific Python
has emerged as the leading platform for data analysis.
This ecosystem is sustained largely by volunteers working on independent
projects with separate mailing lists, websites, roadmaps, documentation,
engineering and packaging solutions, and governance structures.</p>
<p>The Scientific Python project aims to better coordinate the ecosystem and
prepare the software projects in this ecosystem for the next decade of data
science.</p>
<h2 id="about-your-project">About your project<a class="headerlink" href="#about-your-project" title="Link to this heading">#</a></h2>
<h3 id="your-projects-problem">Your project’s problem<a class="headerlink" href="#your-projects-problem" title="Link to this heading">#</a></h3>
<p>There is no shortage of blog posts around the web about how to use and explore
different packages in the scientific Python ecosystem.
However, some of these posts are outdated or incomplete, and many don&rsquo;t follow
the best practices that the maintainers of these packages would advocate.</p>
<p>In addition, we would like to create a central, <em>community-driven</em> location where
Scientific Python projects can make announcements and share information.</p>
<p>Our project aims to be the definitive community blog&mdash;for people looking
to make use of these libraries in education, research and industry, contribute
to them, or maintain them&mdash;written, reviewed, and approved by the community
of developers and users.</p>
<p>While our core projects (NumPy, SciPy, Matplotlib, scikit-image, NetworkX, etc.)
will be regularly contributing content, we also would like to increase the number of
contributors by providing support to newer members to generate high-quality,
peer-reviewed blog posts.</p>
<h3 id="your-projects-scope">Your project’s scope<a class="headerlink" href="#your-projects-scope" title="Link to this heading">#</a></h3>
<!--
*Tell us about what documentation your organization will create, update, or improve. If some work is deliberately not being done, include that information as well. Include a time estimate, and whether you have already identified organization volunteers and a technical writer to work with your project.*
-->
<p>Our goal is to populate the <a href="https://blog.scientific-python.org/">https://blog.scientific-python.org/</a> website with
high-quality content, reviewed and approved by the maintainers of the
libraries in the ecosystem.
The main goal of these documents is to centralize information relevant to all
(or most) projects in the ecosystem, at the reduced cost of being maintained in
one place.</p>
<p>This project aims to:</p>
<ul>
<li>Create content for the <a href="https://blog.scientific-python.org/">https://blog.scientific-python.org/</a> website</li>
</ul>
<p>To ensure this project is successful, it is recommended that the technical
writer has some familiarity with at least a few of Scientific Python&rsquo;s
<a href="https://scientific-python.org/specs/core-projects">core projects</a>.</p>
<h3 id="measuring-your-projects-success">Measuring your project’s success<a class="headerlink" href="#measuring-your-projects-success" title="Link to this heading">#</a></h3>
<!--
*How will you know that your new documentation has helped solve your problem? What metrics will you use, and how will you track them?*
-->
<p>We would consider the project successful if:</p>
<ul>
<li>Each technical writer published at least 3 blog posts on
blog.scientific-python.org.</li>
<li>The submission and review guide was improved.</li>
</ul>
<h3 id="timeline">Timeline<a class="headerlink" href="#timeline" title="Link to this heading">#</a></h3>
<p>We anticipate that the project will be developed over six months, including onboarding
five technical writers, reviewing existing material, developing blog post ideas with
the project mentors and blog editorial board, writing and revising the
blog posts, and providing feedback on the submission and review process.</p>
<!-- prettier-ignore-start -->




<table class="table" id="id000">
  
  <col
    
    align="left"
    
  >
  
  <col
    
    align="right"
    
  >
  

  
  <th class="head">Dates</th>
  
  <th class="head">Action Items</th>
  

  
  <tr>
    
      
      <td>
        May
      </td>
      
    
      
      <td>
        Onboarding
      </td>
      
    
  </tr>
  
  <tr>
    
      
      <td>
        June
      </td>
      
    
      
      <td>
        Review existing documentation
      </td>
      
    
  </tr>
  
  <tr>
    
      
      <td>
        July
      </td>
      
    
      
      <td>
        Update contributor guide
      </td>
      
    
  </tr>
  
  <tr>
    
      
      <td>
        August&ndash;October
      </td>
      
    
      
      <td>
        Create and edit content
      </td>
      
    
  </tr>
  
  <tr>
    
      
      <td>
        November
      </td>
      
    
      
      <td>
        Project completion
      </td>
      
    
  </tr>
  

</table>


<!-- prettier-ignore-end -->
<h2 id="project-budget">Project budget<a class="headerlink" href="#project-budget" title="Link to this heading">#</a></h2>
<!-- prettier-ignore-start -->




<table class="table" id="id001">
  

  
  <th class="head">Budget item</th>
  
  <th class="head">Amount</th>
  
  <th class="head">Running Total</th>
  
  <th class="head">Notes/justifications</th>
  

  
  <tr>
    
      
      <td>
        Technical writers (5)
      </td>
      
    
      
      <td>
        $15,000.00
      </td>
      
    
      
      <td>
        $15,000.00
      </td>
      
    
      
      <td>
        $3,000 / writer
      </td>
      
    
  </tr>
  
  <tr>
    
      
      <td>
        TOTAL
      </td>
      
    
      
      <td>
        
      </td>
      
    
      
      <td>
        $15,000.00
      </td>
      
    
      
      <td>
        
      </td>
      
    
  </tr>
  

</table>


<!-- prettier-ignore-end -->
<h3 id="additional-information">Additional information<a class="headerlink" href="#additional-information" title="Link to this heading">#</a></h3>
<!--
*Include here any additional information that is relevant to your proposal.*

*- Previous experience with technical writers or documentation: If you or any of your mentors have worked with technical writers before, or have developed documentation, mention this in your application. Describe the documentation that you produced and the ways in which you worked with the technical writer. For example, describe any review processes that you used, or how the technical writer's skills were useful to your project. Explain how this previous experience may help you to work with a technical writer in Season of Docs.*
*- Previous participation in Season of Docs, Google Summer of Code or others: If you or any of your mentors have taken part in Google Summer of Code or a similar program, mention this in your application. Describe your achievements in that program. Explain how this experience may influence the way you work in Season of Docs.*
-->
<p>The Scientific Python project is a new initiative, and this is our first time
participating in Google Season of Docs.
However, both Jarrod Millman and Ross Barnowski are established members of the
Python community, with a vast collective experience in mentoring, managing and
maintaining large open source projects.</p>
<p>Jarrod cofounded the Neuroimaging in Python project. He was the NumPy and SciPy
release manager from 2007 to 2009. He cofounded NumFOCUS and served on its board
from 2011 to 2015. Currently, he is the release manager of NetworkX and cofounder
of the Scientific Python project.</p>
<p>Both mentors, Jarrod and Ross, have mentored many new
contributors on multiple projects including NumPy, SciPy, and NetworkX.
Ross has served as a co-mentor for three former GSoD students on the NumPy
project, largely related to generating new content for tutorials, as well as
refactoring existing user documentation.</p>
<p>Links:</p>
<ul>
<li><a href="https://scientific-python.org/">https://scientific-python.org/</a></li>
<li><a href="https://blog.scientific-python.org/">https://blog.scientific-python.org/</a></li>
<li><a href="https://github.com/scientific-python/">https://github.com/scientific-python/</a></li>
</ul>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsod" label="GSoD" />
                             
                                <category scheme="taxonomy:Tags" term="scientific-python" label="Scientific-Python" />
                             
                                <category scheme="taxonomy:Tags" term="proposal" label="proposal" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[How to create custom tables]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/how-to-create-custom-tables/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/visualising-usage-using-batteries/?utm_source=atom_feed" rel="related" type="text/html" title="Battery Charts - Visualise usage rates &amp; more" />
                <link href="https://blog.scientific-python.org/matplotlib/python-graph-gallery.com/?utm_source=atom_feed" rel="related" type="text/html" title="The Python Graph Gallery: hundreds of python charts with reproducible code." />
                <link href="https://blog.scientific-python.org/matplotlib/stellar-chart-alternative-radar-chart/?utm_source=atom_feed" rel="related" type="text/html" title="Stellar Chart, a Type of Chart to Be on Your Radar" />
                <link href="https://blog.scientific-python.org/matplotlib/ipcc-sr15/?utm_source=atom_feed" rel="related" type="text/html" title="Figures in the IPCC Special Report on Global Warming of 1.5°C (SR15)" />
                <link href="https://blog.scientific-python.org/matplotlib/codeswitching-visualization/?utm_source=atom_feed" rel="related" type="text/html" title="Visualizing Code-Switching with Step Charts" />
            
                <id>https://blog.scientific-python.org/matplotlib/how-to-create-custom-tables/</id>
            
            
            <published>2022-03-11T11:10:06+00:00</published>
            <updated>2022-03-11T11:10:06+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A tutorial on how to create custom tables in Matplotlib which allow for flexible design and customization.</blockquote><h1 id="introduction">Introduction<a class="headerlink" href="#introduction" title="Link to this heading">#</a></h1>
<p>This tutorial will teach you how to create custom tables in Matplotlib, which are extremely flexible in terms of the design and layout. You’ll hopefully see that the code is very straightforward! In fact, the main methods we will be using are <code>ax.text()</code> and <code>ax.plot()</code>.</p>
<p>I want to give a lot of credit to <a href="https://twitter.com/CrumpledJumper">Todd Whitehead</a>, who has created these types of tables for various basketball teams and players. His approach to tables is nothing short of fantastic due to the simplicity in design and how he manages to effectively communicate data to his audience. I was very much inspired by his approach and wanted to be able to achieve something similar in Matplotlib.</p>
<p>Before I begin with the tutorial, I wanted to go through the logic behind my approach as I think it&rsquo;s valuable and transferable to other visualizations (and tools!).</p>
<p>With that, I would like you to <strong>think of tables as highly structured and organized scatterplots</strong>. Let me explain why: for me, scatterplots are the most fundamental chart type (regardless of tool).</p>
<p><img src="/matplotlib/how-to-create-custom-tables/scatterplots.png" alt="Scatterplots"></p>
<p>For example, <code>ax.plot()</code> automatically &ldquo;connects the dots&rdquo; to form a line chart, and <code>ax.bar()</code> automatically &ldquo;draws rectangles&rdquo; across a set of coordinates. Very often (again, regardless of tool) we do not see this process happening. The point is that it is useful to think of any chart as a scatterplot, or simply as a collection of shapes based on xy coordinates. This thought process can unlock a ton of <em>custom</em> charts, as the only thing you need are the coordinates (which can be mathematically computed).</p>
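<p>To make the &ldquo;collection of shapes&rdquo; idea concrete, here is a minimal sketch that builds a bar chart by hand from rectangles. The <code>values</code> and <code>bar_width</code> are made-up illustration data, and this is only a rough imitation of what <code>ax.bar()</code> does, not its actual implementation.</p>

```python
import matplotlib.patches as patches
from matplotlib import pyplot as plt

# hypothetical values, purely for illustration
values = [3, 7, 5]
bar_width = 0.8

fig, ax = plt.subplots()
for x, height in enumerate(values):
    # each bar is just a rectangle whose bottom-left corner is computed from x
    ax.add_patch(patches.Rectangle((x - bar_width / 2, 0), bar_width, height))

# add_patch does not rescale the view automatically, so set the limits explicitly
ax.set_xlim(-1, len(values))
ax.set_ylim(0, max(values) + 1)
```

<p>Once the coordinates are known, the &ldquo;chart type&rdquo; is really just a choice of which shape to draw at each of them.</p>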
<p>With that in mind, we can move on to tables! So rather than plotting rectangles or circles we want to plot text and gridlines in a highly organized manner.</p>
<p>We will aim to create a table like this, which I have posted on Twitter <a href="https://twitter.com/TimBayer93/status/1476926897850359809">here</a>. Note, the only elements added outside of Matplotlib are the fancy arrows and their descriptions.</p>
<p><img src="/matplotlib/how-to-create-custom-tables/0_example.png" alt="Example"></p>
<h1 id="creating-a-custom-table">Creating a custom table<a class="headerlink" href="#creating-a-custom-table" title="Link to this heading">#</a></h1>
<p>Importing required libraries.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib</span> <span class="k">as</span> <span class="nn">mpl</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.patches</span> <span class="k">as</span> <span class="nn">patches</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib</span> <span class="kn">import</span> <span class="n">pyplot</span> <span class="k">as</span> <span class="n">plt</span></span></span></code></pre>
</div>
<p>First, we will need to set up a coordinate space - I like two approaches:</p>
<ol>
<li>working with the standard Matplotlib 0-1 scale (on both the x- and y-axis) or</li>
<li>an index system based on row / column numbers (this is what I will use here)</li>
</ol>
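<p>For comparison, the first approach (the 0-1 scale) could look roughly like the sketch below. The helper name <code>cell_centre</code> is hypothetical and only meant to show how a row / column index maps onto the 0-1 scale.</p>

```python
from matplotlib import pyplot as plt

rows, cols = 10, 6


def cell_centre(row, col):
    """Hypothetical helper: return the (x, y) centre of a table cell on the 0-1 scale."""
    return (col + 0.5) / cols, (row + 0.5) / rows


fig, ax = plt.subplots(figsize=(8, 6))
# a new, empty axes already spans 0-1 in both directions, so no limits are set here
x, y = cell_centre(row=9, col=0)
label = ax.text(x, y, "player1", va="center", ha="center")
```

<p>The trade-off is that every position becomes a fraction, which is why I find the row / column index system easier to reason about.</p>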
<p>I want to create a coordinate space for a table containing 6 columns and 10 rows. This means that (similar to pandas row/column indices) each row will have an index between 0 and 9 and each column will have an index between 0 and 6. This is technically one more column index than the 6 columns we defined, but one of the columns with a lot of text will span two column “indices”.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># first, we&#39;ll create a new figure and axis object</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">6</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># set the number of rows and cols for our table</span>
</span></span><span class="line"><span class="cl"><span class="n">rows</span> <span class="o">=</span> <span class="mi">10</span>
</span></span><span class="line"><span class="cl"><span class="n">cols</span> <span class="o">=</span> <span class="mi">6</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># create a coordinate system based on the number of rows/columns</span>
</span></span><span class="line"><span class="cl"><span class="c1"># adding a bit of padding on bottom (-1), top (1), right (0.5)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_ylim</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">rows</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_xlim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">cols</span> <span class="o">+</span> <span class="mf">0.5</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/how-to-create-custom-tables/1_coordinate_space.png" alt="Empty Coordinate Space"></p>
<p>Now, the data we want to plot is sports (football) data. We have information about 10 players and some values against a number of different metrics (which will form our columns) such as goals, shots, passes etc.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># sample data</span>
</span></span><span class="line"><span class="cl"><span class="n">data</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player10&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">79</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player9&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">72</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player8&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">47</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player7&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">4</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">99</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">5</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player6&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">5</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">84</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">4</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player5&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">6</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">56</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player4&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">7</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">67</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">3</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player3&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">8</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">91</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player2&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">9</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">75</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">2</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;player1&#34;</span><span class="p">,</span> <span class="s2">&#34;shots&#34;</span><span class="p">:</span> <span class="mi">10</span><span class="p">,</span> <span class="s2">&#34;passes&#34;</span><span class="p">:</span> <span class="mi">70</span><span class="p">,</span> <span class="s2">&#34;goals&#34;</span><span class="p">:</span> <span class="mi">4</span><span class="p">,</span> <span class="s2">&#34;assists&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">},</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span></span></span></code></pre>
</div>
<p>Next, we will start plotting the table (as a structured scatterplot). I did promise that the code would be very simple (fewer than 10 lines, really). Here it is:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># from the sample data, each dict in the list represents one row</span>
</span></span><span class="line"><span class="cl"><span class="c1"># each key in the dict represents a column</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">rows</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># extract the row data from the list</span>
</span></span><span class="line"><span class="cl">    <span class="n">d</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">row</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># the y (row) coordinate is based on the row index (loop)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># the x (column) coordinate is defined based on the order I want to display the data in</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># player name column</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">row</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">d</span><span class="p">[</span><span class="s2">&#34;id&#34;</span><span class="p">],</span> <span class="n">va</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;left&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># shots column - this is my &#34;main&#34; column, hence bold text</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">row</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">d</span><span class="p">[</span><span class="s2">&#34;shots&#34;</span><span class="p">],</span> <span class="n">va</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># passes column</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">row</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">d</span><span class="p">[</span><span class="s2">&#34;passes&#34;</span><span class="p">],</span> <span class="n">va</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># goals column</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">row</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">d</span><span class="p">[</span><span class="s2">&#34;goals&#34;</span><span class="p">],</span> <span class="n">va</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># assists column</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">row</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">d</span><span class="p">[</span><span class="s2">&#34;assists&#34;</span><span class="p">],</span> <span class="n">va</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
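<p>If the repeated <code>ax.text()</code> calls feel too manual, the same layout could be driven by a small column specification. This is only a sketch: the <code>columns</code> dict is a hypothetical structure of my own, and the data is abbreviated to two rows to keep it short.</p>

```python
from matplotlib import pyplot as plt

# abbreviated sample data (two rows only, for illustration)
data = [
    {"id": "player2", "shots": 9, "passes": 75, "goals": 3, "assists": 2},
    {"id": "player1", "shots": 10, "passes": 70, "goals": 4, "assists": 0},
]

# hypothetical column spec: key -> (x position, horizontal alignment, font weight)
columns = {
    "id": (0.5, "left", "normal"),
    "shots": (2, "right", "bold"),
    "passes": (3, "right", "normal"),
    "goals": (4, "right", "normal"),
    "assists": (5, "right", "normal"),
}

fig, ax = plt.subplots(figsize=(8, 6))
ax.set_ylim(-1, len(data) + 1)
ax.set_xlim(0, 6.5)

for row, d in enumerate(data):
    for key, (x, ha, weight) in columns.items():
        # y comes from the row index, x from the column spec
        ax.text(x=x, y=row, s=str(d[key]), va="center", ha=ha, weight=weight)
```

<p>Reordering or restyling a column then only means editing one entry in <code>columns</code>.</p>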
<p><img src="/matplotlib/how-to-create-custom-tables/2_adding_data.png" alt="Adding data"></p>
<p>As you can see, we are starting to get a basic wireframe of our table. Let&rsquo;s add column headers to further make this <em>scatterplot</em> look like a table.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># Add column headers</span>
</span></span><span class="line"><span class="cl"><span class="c1"># plot them at height y=9.75 to decrease the space to the</span>
</span></span><span class="line"><span class="cl"><span class="c1"># first data row (you&#39;ll see why later)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">9.75</span><span class="p">,</span> <span class="s2">&#34;Player&#34;</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;left&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mf">9.75</span><span class="p">,</span> <span class="s2">&#34;Shots&#34;</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mf">9.75</span><span class="p">,</span> <span class="s2">&#34;Passes&#34;</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mf">9.75</span><span class="p">,</span> <span class="s2">&#34;Goals&#34;</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mf">9.75</span><span class="p">,</span> <span class="s2">&#34;Assists&#34;</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mf">9.75</span><span class="p">,</span> <span class="s2">&#34;Special</span><span class="se">\n</span><span class="s2">Column&#34;</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;right&#34;</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="s2">&#34;bottom&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/how-to-create-custom-tables/3_headers.png" alt="Adding Headers"></p>
<h1 id="formatting-our-table">Formatting our table<a class="headerlink" href="#formatting-our-table" title="Link to this heading">#</a></h1>
<p>The rows and columns of our table are now done. The only thing that is left to do is formatting - much of this is personal choice. The following elements I think are generally useful when it comes to good table design (more research <a href="https://www.storytellingwithdata.com/blog/2019/10/29/how-i-improved-the-table">here</a>):</p>
<p>Gridlines: Some level of gridlines is useful (less is more). Generally, some guidance to help the audience trace their eyes or fingers across the screen can be helpful (this way we can also <em>group</em> items by drawing gridlines around them).</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">rows</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="n">cols</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span> <span class="p">[</span><span class="n">row</span> <span class="o">-</span> <span class="mf">0.5</span><span class="p">,</span> <span class="n">row</span> <span class="o">-</span> <span class="mf">0.5</span><span class="p">],</span> <span class="n">ls</span><span class="o">=</span><span class="s2">&#34;:&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="s2">&#34;grey&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># add a main header divider</span>
</span></span><span class="line"><span class="cl"><span class="c1"># remember that we plotted the header row slightly closer to the first data row</span>
</span></span><span class="line"><span class="cl"><span class="c1"># this helps to visually separate the header row from the data rows</span>
</span></span><span class="line"><span class="cl"><span class="c1"># each data row is 1 unit in height, thus bringing the header closer to our</span>
</span></span><span class="line"><span class="cl"><span class="c1"># gridline gives it a distinctive difference.</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="n">cols</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span> <span class="p">[</span><span class="mf">9.5</span><span class="p">,</span> <span class="mf">9.5</span><span class="p">],</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
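<p>As an aside, the dotted row gridlines could also be drawn in a single call with <code>ax.hlines()</code> instead of a loop. This is an alternative sketch rather than what I used for the images in this post; the <code>boundaries</code> list is just a name I picked for the y positions.</p>

```python
from matplotlib import pyplot as plt

rows, cols = 10, 6
fig, ax = plt.subplots(figsize=(8, 6))
ax.set_ylim(-1, rows + 1)
ax.set_xlim(0, cols + 0.5)

# one y value per row boundary: -0.5, 0.5, ..., 8.5
boundaries = [row - 0.5 for row in range(rows)]
gridlines = ax.hlines(
    boundaries, 0, cols + 1, colors="grey", linestyles=":", linewidth=0.5
)
```

<p>The result is a single line collection, which is handy if you later want to restyle all gridlines at once.</p>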
<p><img src="/matplotlib/how-to-create-custom-tables/4_gridlines.png" alt="Adding Gridlines"></p>
<p>Another important element for tables, in my opinion, is highlighting the <em>key</em> data points. We already bolded the values in the &ldquo;Shots&rdquo; column, but we can also shade this column to emphasize it further for our readers.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># highlight the column we are sorting by</span>
</span></span><span class="line"><span class="cl"><span class="c1"># using a rectangle patch</span>
</span></span><span class="line"><span class="cl"><span class="n">rect</span> <span class="o">=</span> <span class="n">patches</span><span class="o">.</span><span class="n">Rectangle</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mf">1.5</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.5</span><span class="p">),</span>  <span class="c1"># bottom left starting position (x,y)</span>
</span></span><span class="line"><span class="cl">    <span class="mf">0.65</span><span class="p">,</span>  <span class="c1"># width</span>
</span></span><span class="line"><span class="cl">    <span class="mi">10</span><span class="p">,</span>  <span class="c1"># height</span>
</span></span><span class="line"><span class="cl">    <span class="n">ec</span><span class="o">=</span><span class="s2">&#34;none&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">fc</span><span class="o">=</span><span class="s2">&#34;grey&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">alpha</span><span class="o">=</span><span class="mf">0.2</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">zorder</span><span class="o">=-</span><span class="mi">1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">rect</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/how-to-create-custom-tables/5_highlight_column.png" alt="Highlight column"></p>
<p>We&rsquo;re almost there. The magic piece is <code>ax.axis("off")</code>. This hides the axis, axis ticks, labels and everything “attached” to the axes, which means our table now looks like a clean table!</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s2">&#34;off&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/how-to-create-custom-tables/6_hide_axis.png" alt="Hide axis"></p>
<p>Adding a title is also straightforward.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_title</span><span class="p">(</span><span class="s2">&#34;A title for our table!&#34;</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="s2">&#34;left&#34;</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/how-to-create-custom-tables/6_title.png" alt="Title"></p>
<h1 id="bonus-adding-special-columns">Bonus: Adding special columns<a class="headerlink" href="#bonus-adding-special-columns" title="Link to this heading">#</a></h1>
<p>Finally, if you wish to add images, sparklines, or other custom shapes and patterns then we can do this too.</p>
<p>To achieve this, we will use <code>fig.add_axes()</code> to create a new set of floating axes positioned in figure coordinates (this is different from the coordinate system of our main axes!).</p>
<p>Remember that figure coordinates by default are between 0 and 1. [0,0] is the bottom left corner of the entire figure. If you’re unfamiliar with the differences between a figure and axes then check out <a href="https://matplotlib.org/stable/gallery/showcase/anatomy.html">Matplotlib&rsquo;s Anatomy of a Figure</a> for further details.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">newaxes</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">rows</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># offset each new axes by a set amount depending on the row</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># this is probably the most fiddly aspect (TODO: some neater way to automate this)</span>
</span></span><span class="line"><span class="cl">    <span class="n">newaxes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">([</span><span class="mf">0.75</span><span class="p">,</span> <span class="mf">0.725</span> <span class="o">-</span> <span class="p">(</span><span class="n">row</span> <span class="o">*</span> <span class="mf">0.063</span><span class="p">),</span> <span class="mf">0.12</span><span class="p">,</span> <span class="mf">0.06</span><span class="p">]))</span></span></span></code></pre>
</div>
<p>You can see below what these <em>floating</em> axes will look like (I say floating because they’re on top of our main axes object). The only tricky part is figuring out the xy (figure) coordinates for these.</p>
<p>These <em>floating</em> axes behave like any other Matplotlib axes, so we have access to the same methods, such as <code>ax.bar()</code> and <code>ax.plot()</code>, as well as patches. Importantly, each axes has its own independent coordinate system, and we can format each one as we wish.</p>
<p><img src="/matplotlib/how-to-create-custom-tables/7_floating_axes.png" alt="Floating axes"></p>
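<p>The offsets used above were tuned by hand. One possible way to take some of the guesswork out is to describe each box in the data coordinates of the main axes and let Matplotlib&rsquo;s transforms convert it to figure coordinates. The sketch below is only an illustration under assumptions: the helper <code>data_box_to_figure</code>, the table size, and the box dimensions are hypothetical and not part of the original code.</p>

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def data_box_to_figure(fig, ax, x0, y0, width, height):
    """Convert a box in ax data coordinates to [left, bottom, width, height]
    in figure coordinates (hypothetical helper, not from the original post)."""
    # data coordinates -> display pixels -> figure fractions
    to_fig = ax.transData + fig.transFigure.inverted()
    corners = [(x0, y0), (x0 + width, y0 + height)]
    (left, bottom), (right, top) = to_fig.transform(corners)
    return [left, bottom, right - left, top - bottom]


rows, cols = 10, 6  # assumed table size, for illustration only
fig, ax = plt.subplots(figsize=(8, 6))
ax.set_xlim(0, cols)
ax.set_ylim(0, rows)
ax.axis("off")

newaxes = []
for row in range(rows):
    # one small box per row, placed in the last column of the table
    rect = data_box_to_figure(fig, ax, x0=cols - 1, y0=row + 0.1, width=0.9, height=0.8)
    newaxes.append(fig.add_axes(rect))
```

<p>Note that the conversion uses the axis limits and layout at the moment it is called, so in this sketch the limits need to be set before the floating axes are created, and the boxes will not follow later changes to them.</p>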


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># plot dummy data as a sparkline for illustration purposes</span>
</span></span><span class="line"><span class="cl"><span class="c1"># you can plot _anything_ here, images, patches, etc.</span>
</span></span><span class="line"><span class="cl"><span class="n">newaxes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">c</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">newaxes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">set_ylim</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># once again, the key is to hide the axis!</span>
</span></span><span class="line"><span class="cl"><span class="n">newaxes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s2">&#34;off&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/how-to-create-custom-tables/8_sparklines.png" alt="Sparklines"></p>
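<p>The snippet above only fills the first floating axes. As a rough sketch of how every row could get its own sparkline, the loop below draws one line per axes. The number of rows and the random-walk data are assumptions made purely for illustration; in practice you would plot each row&rsquo;s real values.</p>

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rows = 10  # assumed number of table rows
rng = np.random.default_rng(0)  # seeded so the dummy data is reproducible

fig = plt.figure(figsize=(8, 6))
# same hand-tuned positions as in the snippet above
newaxes = [
    fig.add_axes([0.75, 0.725 - (row * 0.063), 0.12, 0.06]) for row in range(rows)
]

for spark_ax in newaxes:
    values = rng.normal(size=20).cumsum()  # dummy random walk standing in for real data
    spark_ax.plot(values, c="black", linewidth=1)
    spark_ax.axis("off")  # hide the frame and ticks so only the line remains
```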
<p>That’s it: custom tables in Matplotlib. I promised very simple code and a design flexible enough to fit whatever you need. You can adjust sizes, colors, and pretty much anything else with this approach; all you need is a loop that plots text in a structured and organized manner. I hope you found it useful. A Google Colab notebook with the code is available <a href="https://colab.research.google.com/drive/1JshATKxjs7NWz2U8Oy6xOJaLgjldC1CW">here</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Art from UNC BIOL222]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/unc-biol222/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/ipcc-sr15/?utm_source=atom_feed" rel="related" type="text/html" title="Figures in the IPCC Special Report on Global Warming of 1.5°C (SR15)" />
                <link href="https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/?utm_source=atom_feed" rel="related" type="text/html" title="Emoji Mosaic Art" />
                <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="related" type="text/html" title="Creating the Warming Stripes in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/using-matplotlib-to-advocate-for-postdocs/?utm_source=atom_feed" rel="related" type="text/html" title="Using Matplotlib to Advocate for Postdocs" />
                <link href="https://blog.scientific-python.org/matplotlib/book/?utm_source=atom_feed" rel="related" type="text/html" title="Newly released open access book" />
            
                <id>https://blog.scientific-python.org/matplotlib/unc-biol222/</id>
            
            
            <published>2021-11-19T08:46:00-08:00</published>
            <updated>2021-11-19T08:46:00-08:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>UNC BIOL222: Art created with Matplotlib</blockquote><p>As part of the University of North Carolina BIOL222 class, <a href="https://twitter.com/tylikcat">Dr. Catherine Kehl</a> asked her students to &ldquo;use <code>matplotlib.pyplot</code> to make art.&rdquo; BIOL222 is Introduction to Programming, aimed at students with no programming background. The emphasis is on practical, hands-on active learning.</p>
<p>The students completed the assignment with festive enthusiasm around Halloween. Here are some great examples:</p>
<p>Harris Davis showed an affinity for pumpkins, opting to go 3D!
<img src="/matplotlib/unc-biol222/pumpkin.png" alt="3D Pumpkin"></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># get library for 3d plotting</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">mpl_toolkits.mplot3d</span> <span class="kn">import</span> <span class="n">Axes3D</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># make a pumpkin :)</span>
</span></span><span class="line"><span class="cl"><span class="n">rho</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">3</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="mi">32</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">theta</span><span class="p">,</span> <span class="n">phi</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">meshgrid</span><span class="p">(</span><span class="n">rho</span><span class="p">,</span> <span class="n">rho</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">r</span><span class="p">,</span> <span class="n">R</span> <span class="o">=</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.5</span>
</span></span><span class="line"><span class="cl"><span class="n">X</span> <span class="o">=</span> <span class="p">(</span><span class="n">R</span> <span class="o">+</span> <span class="n">r</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">phi</span><span class="p">))</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">theta</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">Y</span> <span class="o">=</span> <span class="p">(</span><span class="n">R</span> <span class="o">+</span> <span class="n">r</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">phi</span><span class="p">))</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">theta</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">Z</span> <span class="o">=</span> <span class="n">r</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">phi</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># make the stem</span>
</span></span><span class="line"><span class="cl"><span class="n">theta1</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="mi">90</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">r1</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">50</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">T1</span><span class="p">,</span> <span class="n">R1</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">meshgrid</span><span class="p">(</span><span class="n">theta1</span><span class="p">,</span> <span class="n">r1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">X1</span> <span class="o">=</span> <span class="n">R1</span> <span class="o">*</span> <span class="mf">0.5</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">T1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">Y1</span> <span class="o">=</span> <span class="n">R1</span> <span class="o">*</span> <span class="mf">0.5</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">T1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">Z1</span> <span class="o">=</span> <span class="o">-</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">X1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">Y1</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">-</span> <span class="mf">0.7</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">Z1</span><span class="p">[</span><span class="n">Z1</span> <span class="o">&lt;</span> <span class="mf">0.3</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span>
</span></span><span class="line"><span class="cl"><span class="n">Z1</span><span class="p">[</span><span class="n">Z1</span> <span class="o">&gt;</span> <span class="mf">0.7</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Display the pumpkin &amp; stem</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">projection</span><span class="o">=</span><span class="s2">&#34;3d&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_xlim3d</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_ylim3d</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_zlim3d</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">plot_surface</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">,</span> <span class="n">Z</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;tab:orange&#34;</span><span class="p">,</span> <span class="n">rstride</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">cstride</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">plot_surface</span><span class="p">(</span><span class="n">X1</span><span class="p">,</span> <span class="n">Y1</span><span class="p">,</span> <span class="n">Z1</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;tab:green&#34;</span><span class="p">,</span> <span class="n">rstride</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">cstride</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p>Bryce Desantis stuck to the biological theme and demonstrated <a href="https://en.wikipedia.org/wiki/Fractal">fractal</a> art.
<img src="/matplotlib/unc-biol222/leaf.png" alt="Bryce Fern"></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Barnsley&#39;s Fern - Fractal; en.wikipedia.org/wiki/Barnsley_…</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># functions for each part of fern:</span>
</span></span><span class="line"><span class="cl"><span class="c1"># stem</span>
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">stem</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mf">0.16</span> <span class="o">*</span> <span class="n">y</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># smaller leaflets</span>
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">smallLeaf</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="mf">0.85</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="mf">0.04</span> <span class="o">*</span> <span class="n">y</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.04</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="mf">0.85</span> <span class="o">*</span> <span class="n">y</span> <span class="o">+</span> <span class="mf">1.6</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># large left leaflets</span>
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">leftLarge</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="mf">0.2</span> <span class="o">*</span> <span class="n">x</span> <span class="o">-</span> <span class="mf">0.26</span> <span class="o">*</span> <span class="n">y</span><span class="p">,</span> <span class="mf">0.23</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="mf">0.22</span> <span class="o">*</span> <span class="n">y</span> <span class="o">+</span> <span class="mf">1.6</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># large right leaflets</span>
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">rightLarge</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="o">-</span><span class="mf">0.15</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="mf">0.28</span> <span class="o">*</span> <span class="n">y</span><span class="p">,</span> <span class="mf">0.26</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="mf">0.24</span> <span class="o">*</span> <span class="n">y</span> <span class="o">+</span> <span class="mf">0.44</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">componentFunctions</span> <span class="o">=</span> <span class="p">[</span><span class="n">stem</span><span class="p">,</span> <span class="n">smallLeaf</span><span class="p">,</span> <span class="n">leftLarge</span><span class="p">,</span> <span class="n">rightLarge</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># number of data points and frequencies for parts of fern generated:</span>
</span></span><span class="line"><span class="cl"><span class="c1"># lists with all 75000 datapoints</span>
</span></span><span class="line"><span class="cl"><span class="n">datapoints</span> <span class="o">=</span> <span class="mi">75000</span>
</span></span><span class="line"><span class="cl"><span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="n">datapointsX</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="n">datapointsY</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="c1"># For 75,000 datapoints</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">n</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">datapoints</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">FrequencyFunction</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">choice</span><span class="p">(</span><span class="n">componentFunctions</span><span class="p">,</span> <span class="n">p</span><span class="o">=</span><span class="p">[</span><span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.85</span><span class="p">,</span> <span class="mf">0.07</span><span class="p">,</span> <span class="mf">0.07</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">    <span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">FrequencyFunction</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">datapointsX</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">datapointsY</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Scatter plot &amp; scaled down to 0.1 to show more definition:</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">datapointsX</span><span class="p">,</span> <span class="n">datapointsY</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="mf">0.1</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;g&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># Title of Figure</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&#34;Barnsley&#39;s Fern - Assignment 3&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># Changing background color</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">gca</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_facecolor</span><span class="p">(</span><span class="s2">&#34;#d8d7bf&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>Grace Bell got a little trippy with this rotationally symmetric art. It&rsquo;s pretty cool how she captured mouse events. It reminds us of a flower. What do you see?
<img src="/matplotlib/unc-biol222/rotations.png" alt="Rotations"></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.tri</span> <span class="kn">import</span> <span class="n">Triangulation</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.patches</span> <span class="kn">import</span> <span class="n">Polygon</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># I found this sample code online and manipulated it to make the art piece!</span>
</span></span><span class="line"><span class="cl"><span class="c1"># I was interested in it because it combined what we used for functions with what we used for plotting with (x, y)</span>
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">update_polygon</span><span class="p">(</span><span class="n">tri</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">tri</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">points</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">points</span> <span class="o">=</span> <span class="n">triang</span><span class="o">.</span><span class="n">triangles</span><span class="p">[</span><span class="n">tri</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">xs</span> <span class="o">=</span> <span class="n">triang</span><span class="o">.</span><span class="n">x</span><span class="p">[</span><span class="n">points</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">ys</span> <span class="o">=</span> <span class="n">triang</span><span class="o">.</span><span class="n">y</span><span class="p">[</span><span class="n">points</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">polygon</span><span class="o">.</span><span class="n">set_xy</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">column_stack</span><span class="p">([</span><span class="n">xs</span><span class="p">,</span> <span class="n">ys</span><span class="p">]))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">on_mouse_move</span><span class="p">(</span><span class="n">event</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">event</span><span class="o">.</span><span class="n">inaxes</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">tri</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">tri</span> <span class="o">=</span> <span class="n">trifinder</span><span class="p">(</span><span class="n">event</span><span class="o">.</span><span class="n">xdata</span><span class="p">,</span> <span class="n">event</span><span class="o">.</span><span class="n">ydata</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">update_polygon</span><span class="p">(</span><span class="n">tri</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_title</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;In triangle </span><span class="si">{</span><span class="n">tri</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">event</span><span class="o">.</span><span class="n">canvas</span><span class="o">.</span><span class="n">draw</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># this is the info that creates the angles</span>
</span></span><span class="line"><span class="cl"><span class="n">n_angles</span> <span class="o">=</span> <span class="mi">14</span>
</span></span><span class="line"><span class="cl"><span class="n">n_radii</span> <span class="o">=</span> <span class="mi">7</span>
</span></span><span class="line"><span class="cl"><span class="n">min_radius</span> <span class="o">=</span> <span class="mf">0.1</span>  <span class="c1"># the radius of the middle circle can move with this variable</span>
</span></span><span class="line"><span class="cl"><span class="n">radii</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="n">min_radius</span><span class="p">,</span> <span class="mf">0.95</span><span class="p">,</span> <span class="n">n_radii</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">angles</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="n">n_angles</span><span class="p">,</span> <span class="n">endpoint</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">angles</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">repeat</span><span class="p">(</span><span class="n">angles</span><span class="p">[</span><span class="o">...</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">newaxis</span><span class="p">],</span> <span class="n">n_radii</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">angles</span><span class="p">[:,</span> <span class="mi">1</span><span class="p">::</span><span class="mi">2</span><span class="p">]</span> <span class="o">+=</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">/</span> <span class="n">n_angles</span>
</span></span><span class="line"><span class="cl"><span class="n">x</span> <span class="o">=</span> <span class="p">(</span><span class="n">radii</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">angles</span><span class="p">))</span><span class="o">.</span><span class="n">flatten</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">y</span> <span class="o">=</span> <span class="p">(</span><span class="n">radii</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">angles</span><span class="p">))</span><span class="o">.</span><span class="n">flatten</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">triang</span> <span class="o">=</span> <span class="n">Triangulation</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">triang</span><span class="o">.</span><span class="n">set_mask</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">np</span><span class="o">.</span><span class="n">hypot</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">triang</span><span class="o">.</span><span class="n">triangles</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">),</span> <span class="n">y</span><span class="p">[</span><span class="n">triang</span><span class="o">.</span><span class="n">triangles</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="o">&lt;</span> <span class="n">min_radius</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">trifinder</span> <span class="o">=</span> <span class="n">triang</span><span class="o">.</span><span class="n">get_trifinder</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">subplot_kw</span><span class="o">=</span><span class="p">{</span><span class="s2">&#34;aspect&#34;</span><span class="p">:</span> <span class="s2">&#34;equal&#34;</span><span class="p">})</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">triplot</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">triang</span><span class="p">,</span> <span class="s2">&#34;y+-&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>  <span class="c1"># made the color of the plot yellow and there are &#34;+&#34; for the data points but you can&#39;t really see them because of the lines crossing</span>
</span></span><span class="line"><span class="cl"><span class="n">polygon</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">([[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">]],</span> <span class="n">facecolor</span><span class="o">=</span><span class="s2">&#34;y&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">update_polygon</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">polygon</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">canvas</span><span class="o">.</span><span class="n">mpl_connect</span><span class="p">(</span><span class="s2">&#34;motion_notify_event&#34;</span><span class="p">,</span> <span class="n">on_mouse_move</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p>As a bonus, did you like that fox in the banner? That was created (and well documented) by Emily Foster!</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s2">&#34;off&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># head</span>
</span></span><span class="line"><span class="cl"><span class="n">xhead</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="o">-</span><span class="mi">50</span><span class="p">,</span> <span class="mi">50</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">yhead</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.007</span> <span class="o">*</span> <span class="p">(</span><span class="n">xhead</span> <span class="o">*</span> <span class="n">xhead</span><span class="p">)</span> <span class="o">+</span> <span class="mi">100</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xhead</span><span class="p">,</span> <span class="n">yhead</span><span class="p">,</span> <span class="s2">&#34;darkorange&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># outer ears</span>
</span></span><span class="line"><span class="cl"><span class="n">xearL</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="o">-</span><span class="mf">45.8</span><span class="p">,</span> <span class="o">-</span><span class="mi">9</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">yearL</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.08</span> <span class="o">*</span> <span class="p">(</span><span class="n">xearL</span> <span class="o">*</span> <span class="n">xearL</span><span class="p">)</span> <span class="o">-</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">xearL</span> <span class="o">+</span> <span class="mi">70</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">xearR</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">9</span><span class="p">,</span> <span class="mf">45.8</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">yearR</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.08</span> <span class="o">*</span> <span class="p">(</span><span class="n">xearR</span> <span class="o">*</span> <span class="n">xearR</span><span class="p">)</span> <span class="o">+</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">xearR</span> <span class="o">+</span> <span class="mi">70</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xearL</span><span class="p">,</span> <span class="n">yearL</span><span class="p">,</span> <span class="s2">&#34;black&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xearR</span><span class="p">,</span> <span class="n">yearR</span><span class="p">,</span> <span class="s2">&#34;black&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># inner ears</span>
</span></span><span class="line"><span class="cl"><span class="n">xinL</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="o">-</span><span class="mf">41.1</span><span class="p">,</span> <span class="o">-</span><span class="mf">13.7</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">yinL</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.08</span> <span class="o">*</span> <span class="p">(</span><span class="n">xinL</span> <span class="o">*</span> <span class="n">xinL</span><span class="p">)</span> <span class="o">-</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">xinL</span> <span class="o">+</span> <span class="mi">59</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">xinR</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mf">13.7</span><span class="p">,</span> <span class="mf">41.1</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">yinR</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.08</span> <span class="o">*</span> <span class="p">(</span><span class="n">xinR</span> <span class="o">*</span> <span class="n">xinR</span><span class="p">)</span> <span class="o">+</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">xinR</span> <span class="o">+</span> <span class="mi">59</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xinL</span><span class="p">,</span> <span class="n">yinL</span><span class="p">,</span> <span class="s2">&#34;salmon&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xinR</span><span class="p">,</span> <span class="n">yinR</span><span class="p">,</span> <span class="s2">&#34;salmon&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># bottom of face</span>
</span></span><span class="line"><span class="cl"><span class="n">xfaceL</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="o">-</span><span class="mf">49.6</span><span class="p">,</span> <span class="o">-</span><span class="mi">14</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">xfaceR</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">14</span><span class="p">,</span> <span class="mf">49.3</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">xfaceM</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="o">-</span><span class="mi">14</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xfaceL</span><span class="p">,</span> <span class="nb">abs</span><span class="p">(</span><span class="n">xfaceL</span><span class="p">),</span> <span class="s2">&#34;darkorange&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xfaceR</span><span class="p">,</span> <span class="nb">abs</span><span class="p">(</span><span class="n">xfaceR</span><span class="p">),</span> <span class="s2">&#34;darkorange&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xfaceM</span><span class="p">,</span> <span class="nb">abs</span><span class="p">(</span><span class="n">xfaceM</span><span class="p">),</span> <span class="s2">&#34;black&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># nose</span>
</span></span><span class="line"><span class="cl"><span class="n">xnose</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="o">-</span><span class="mi">14</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ynose</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.03</span> <span class="o">*</span> <span class="p">(</span><span class="n">xnose</span> <span class="o">*</span> <span class="n">xnose</span><span class="p">)</span> <span class="o">+</span> <span class="mi">20</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xnose</span><span class="p">,</span> <span class="n">ynose</span><span class="p">,</span> <span class="s2">&#34;black&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># whiskers</span>
</span></span><span class="line"><span class="cl"><span class="n">xwhiskR</span> <span class="o">=</span> <span class="p">[</span><span class="mi">50</span><span class="p">,</span> <span class="mi">70</span><span class="p">,</span> <span class="mi">55</span><span class="p">,</span> <span class="mi">70</span><span class="p">,</span> <span class="mi">55</span><span class="p">,</span> <span class="mi">70</span><span class="p">,</span> <span class="mf">49.3</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="n">xwhiskL</span> <span class="o">=</span> <span class="p">[</span><span class="o">-</span><span class="mi">50</span><span class="p">,</span> <span class="o">-</span><span class="mi">70</span><span class="p">,</span> <span class="o">-</span><span class="mi">55</span><span class="p">,</span> <span class="o">-</span><span class="mi">70</span><span class="p">,</span> <span class="o">-</span><span class="mi">55</span><span class="p">,</span> <span class="o">-</span><span class="mi">70</span><span class="p">,</span> <span class="o">-</span><span class="mf">49.3</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="n">ywhisk</span> <span class="o">=</span> <span class="p">[</span><span class="mf">82.6</span><span class="p">,</span> <span class="mi">85</span><span class="p">,</span> <span class="mi">70</span><span class="p">,</span> <span class="mi">65</span><span class="p">,</span> <span class="mi">60</span><span class="p">,</span> <span class="mi">45</span><span class="p">,</span> <span class="mf">49.3</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xwhiskR</span><span class="p">,</span> <span class="n">ywhisk</span><span class="p">,</span> <span class="s2">&#34;darkorange&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xwhiskL</span><span class="p">,</span> <span class="n">ywhisk</span><span class="p">,</span> <span class="s2">&#34;darkorange&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># eyes</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span> <span class="mi">60</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="s2">&#34;o&#34;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mi">15</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="o">-</span><span class="mi">20</span><span class="p">,</span> <span class="mi">60</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="s2">&#34;o&#34;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mi">15</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="mi">22</span><span class="p">,</span> <span class="mi">62</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;white&#34;</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="s2">&#34;o&#34;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mi">6</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="o">-</span><span class="mi">18</span><span class="p">,</span> <span class="mi">62</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;white&#34;</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="s2">&#34;o&#34;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mi">6</span><span class="p">)</span></span></span></code></pre>
</div>
<p>We look forward to seeing these students continue in their plotting and scientific adventures!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="art" label="art" />
                             
                                <category scheme="taxonomy:Tags" term="academia" label="academia" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Newly released open access book]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/book/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_final/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Final Report" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_quarter/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Quarter Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_prequarter/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Pre-Quarter Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_midterm/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Mid-Term Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_introduction/?utm_source=atom_feed" rel="related" type="text/html" title="Aitik Gupta joins as a Student Developer under GSoC&#39;21" />
            
                <id>https://blog.scientific-python.org/matplotlib/book/</id>
            
            
            <published>2021-11-15T14:26:51+01:00</published>
            <updated>2021-11-15T14:26:51+01:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>New open access book released</blockquote><p>It&rsquo;s my great pleasure to announce that I&rsquo;ve finished my book on matplotlib and it is now freely available at <a href="https://www.labri.fr/perso/nrougier/scientific-visualization.html">www.labri.fr/perso/nrougier/scientific-visualization.html</a> while sources for the book are hosted at <a href="https://github.com/rougier/scientific-visualization-book">github.com/rougier/scientific-visualization-book</a>.</p>
<h2 id="abstract">Abstract<a class="headerlink" href="#abstract" title="Link to this heading">#</a></h2>
<p>The Python scientific visualisation landscape is huge. It is composed of a myriad of tools, ranging from the most versatile and widely used down to the more specialised and confidential. Some of these tools are community based while others are developed by companies. Some are made specifically for the web, others are for the desktop only, some deal with 3D and large data, while others target flawless 2D rendering. In this landscape, Matplotlib has a very special place. It is a versatile and powerful library that allows you to design very high quality figures, suitable for scientific publishing. It also offers a simple and intuitive interface as well as an object oriented architecture that allows you to tweak anything within a figure. Finally, it can be used as a regular graphic library in order to design non‐scientific figures. This book is organized into four parts. The first part considers the fundamental principles of the Matplotlib library. This includes reviewing the different parts that constitute a figure, the different coordinate systems, the available scales and projections, and we’ll also introduce a few concepts related to typography and colors. The second part is dedicated to the actual design of a figure. After introducing some simple rules for generating better figures, we’ll then go on to explain the Matplotlib defaults and styling system before diving into figure layout organization. We’ll then explore the different types of plot available and see how a figure can be ornamented with different elements. The third part is dedicated to more advanced concepts, namely 3D figures, optimization &amp; animation. The fourth and final part is a collection of showcases.</p>
<h3 id="book-gallery">Book gallery<a class="headerlink" href="#book-gallery" title="Link to this heading">#</a></h3>
<p><img src="/matplotlib/book/book-gallery.png" alt="A grid of multiple plots showing how data may be visualized."></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Battery Charts - Visualise usage rates & more]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/visualising-usage-using-batteries/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/python-graph-gallery.com/?utm_source=atom_feed" rel="related" type="text/html" title="The Python Graph Gallery: hundreds of python charts with reproducible code." />
                <link href="https://blog.scientific-python.org/matplotlib/stellar-chart-alternative-radar-chart/?utm_source=atom_feed" rel="related" type="text/html" title="Stellar Chart, a Type of Chart to Be on Your Radar" />
                <link href="https://blog.scientific-python.org/matplotlib/ipcc-sr15/?utm_source=atom_feed" rel="related" type="text/html" title="Figures in the IPCC Special Report on Global Warming of 1.5°C (SR15)" />
                <link href="https://blog.scientific-python.org/matplotlib/codeswitching-visualization/?utm_source=atom_feed" rel="related" type="text/html" title="Visualizing Code-Switching with Step Charts" />
                <link href="https://blog.scientific-python.org/matplotlib/elementary-cellular-automata/?utm_source=atom_feed" rel="related" type="text/html" title="Elementary Cellular Automata" />
            
                <id>https://blog.scientific-python.org/matplotlib/visualising-usage-using-batteries/</id>
            
            
            <published>2021-08-19T16:52:58+05:30</published>
            <updated>2021-08-19T16:52:58+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A tutorial on how to show usage rates and more using batteries</blockquote><h1 id="introduction">Introduction<a class="headerlink" href="#introduction" title="Link to this heading">#</a></h1>
<p>I have been creating common visualisations like scatter plots, bar charts, beeswarms etc. for a while and thought about doing something different. Since I&rsquo;m an avid football fan, I thought of ideas to represent players&rsquo; usage or involvement over a period (a season, a couple of seasons). I have seen some cool visualisations like donuts which depict usage and I wanted to make something different and simple to understand. I thought about representing batteries as a form of player usage and it made a lot of sense.</p>
<p>Players who have barely been used (played fewer minutes) are shown with a <strong><em>large amount of battery</em></strong> remaining, since they have plenty of energy left in the tank. Heavily used players get the opposite, i.e. a <strong><em>drained or nearly empty battery</em></strong>.</p>
<p>So, what is the purpose of a battery chart? You can use it to show usage, consumption, involvement, fatigue etc. (anything usage related).</p>
<p>The image below is a sample view of how a battery would look in our figure, although a single battery isn&rsquo;t exactly what we are going to recreate in this tutorial.</p>
<p><img src="/matplotlib/visualising-usage-using-batteries/battery.png" alt="A sample visualisation"></p>
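<p>To make the idea concrete, here is a minimal sketch of how minutes played could be turned into a battery level. The helper name <code>battery_left</code> and the choice of normalising by the maximum minutes available are assumptions made for illustration; they are not necessarily how the final figure computes its percentages.</p>

```python
def battery_left(minutes_played, max_minutes):
    """Return the battery percentage left for a player.

    Hypothetical helper: the more minutes a player has used up,
    the less battery remains.
    """
    # Fraction of the available minutes that were used, capped at fully drained.
    used = min(minutes_played / max_minutes, 1.0)
    return round(100 * (1 - used), 1)


# Assumed example: two 38-game league seasons of 90 minutes each.
available = 2 * 38 * 90
half_used = battery_left(3420, available)  # played half of the available minutes
unused = battery_left(0, available)        # never played, battery stays full
```

<p>A player who used half of the available minutes ends up with half a battery, while an unused player keeps a full one.</p>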
<h1 id="tutorial">Tutorial<a class="headerlink" href="#tutorial" title="Link to this heading">#</a></h1>
<p>Before jumping into the tutorial, note that the function can be tweaked to suit the number of subplots or any other size parameter. To build the figure, we will follow a series of steps one by one:</p>
<ol>
<li>Outlining what we are going to plot</li>
<li>Import necessary libraries</li>
<li>Write a function to draw the battery
<ul>
<li>This is the function that will be called to plot the battery chart</li>
</ul>
</li>
<li>Read the data and plot the chart accordingly
<ul>
<li>We will demonstrate it with an example</li>
</ul>
</li>
</ol>
<h2 id="plot-outline"><span style="text-decoration: underline">Plot Outline</span><a class="headerlink" href="#plot-outline" title="Link to this heading">#</a></h2>
<p>What is our use case?</p>
<ul>
<li>We are given a dataset of Liverpool&rsquo;s players and the minutes they played in the last 2 seasons (for whichever club they played for in that time period). We will use this data for our visualisation.</li>
<li>The final visualisation is the featured image of this blog post. We will go step by step through how to create it.</li>
</ul>
<h2 id="importing-libraries"><span style="text-decoration: underline">Importing Libraries</span><a class="headerlink" href="#importing-libraries" title="Link to this heading">#</a></h2>
<p>The first step is to import the libraries whose functions and classes we will use.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.path</span> <span class="kn">import</span> <span class="n">Path</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.patches</span> <span class="kn">import</span> <span class="n">FancyBboxPatch</span><span class="p">,</span> <span class="n">PathPatch</span><span class="p">,</span> <span class="n">Wedge</span></span></span></code></pre>
</div>
<p>The functions imported from <code>matplotlib.path</code> and <code>matplotlib.patches</code> will be used to draw lines, rectangles, boxes and so on to display the battery as it is.</p>
<h2 id="drawing-the-battery---a-function"><span style="text-decoration: underline">Drawing the Battery - A function</span><a class="headerlink" href="#drawing-the-battery---a-function" title="Link to this heading">#</a></h2>
<p>Next, we define a function named <code>draw_battery()</code> that draws a single battery on a given axes. Later on, we will call this function with specific parameters to build the figure we need. The code for the battery is as follows:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">draw_battery</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">percentage</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">bat_ec</span><span class="o">=</span><span class="s2">&#34;grey&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">tip_fc</span><span class="o">=</span><span class="s2">&#34;none&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">tip_ec</span><span class="o">=</span><span class="s2">&#34;grey&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">bol_fc</span><span class="o">=</span><span class="s2">&#34;#fdfdfd&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">bol_ec</span><span class="o">=</span><span class="s2">&#34;grey&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">invert_perc</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    fig : figure
</span></span></span><span class="line"><span class="cl"><span class="s2">        The figure object for the plot
</span></span></span><span class="line"><span class="cl"><span class="s2">    ax : axes
</span></span></span><span class="line"><span class="cl"><span class="s2">        The axes/axis variable of the figure.
</span></span></span><span class="line"><span class="cl"><span class="s2">    percentage : int, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">        This is the battery percentage - size of the fill. The default is 0.
</span></span></span><span class="line"><span class="cl"><span class="s2">    bat_ec : str, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">        The edge color of the battery/cell. The default is &#34;grey&#34;.
</span></span></span><span class="line"><span class="cl"><span class="s2">    tip_fc : str, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">        The fill/face color of the tip of battery. The default is &#34;none&#34;.
</span></span></span><span class="line"><span class="cl"><span class="s2">    tip_ec : str, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">        The edge color of the tip of battery. The default is &#34;grey&#34;.
</span></span></span><span class="line"><span class="cl"><span class="s2">    bol_fc : str, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">        The fill/face color of the lightning bolt. The default is &#34;#fdfdfd&#34;.
</span></span></span><span class="line"><span class="cl"><span class="s2">    bol_ec : str, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">        The edge color of the lightning bolt. The default is &#34;grey&#34;.
</span></span></span><span class="line"><span class="cl"><span class="s2">    invert_perc : bool, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">        A flag to invert the percentage shown inside the battery. The default is False
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    None.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">fig</span><span class="o">.</span><span class="n">set_size_inches</span><span class="p">((</span><span class="mi">15</span><span class="p">,</span> <span class="mi">15</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="n">xlim</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">20</span><span class="p">),</span> <span class="n">ylim</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s2">&#34;off&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">invert_perc</span> <span class="o">==</span> <span class="kc">True</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">percentage</span> <span class="o">=</span> <span class="mi">100</span> <span class="o">-</span> <span class="n">percentage</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># color options - #fc3d2e red &amp; #53d069 green &amp; #f5c54e yellow</span>
</span></span><span class="line"><span class="cl">        <span class="n">bat_fc</span> <span class="o">=</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;#fc3d2e&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="n">percentage</span> <span class="o">&lt;=</span> <span class="mi">20</span>
</span></span><span class="line"><span class="cl">            <span class="k">else</span> <span class="s2">&#34;#53d069&#34;</span> <span class="k">if</span> <span class="n">percentage</span> <span class="o">&gt;=</span> <span class="mi">80</span> <span class="k">else</span> <span class="s2">&#34;#f5c54e&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">        Static battery and tip of battery
</span></span></span><span class="line"><span class="cl"><span class="s2">        &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="n">battery</span> <span class="o">=</span> <span class="n">FancyBboxPatch</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mf">2.1</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">            <span class="mi">10</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="mf">0.8</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;round, pad=0.2, rounding_size=0.5&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">fc</span><span class="o">=</span><span class="s2">&#34;none&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">ec</span><span class="o">=</span><span class="n">bat_ec</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">fill</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">ls</span><span class="o">=</span><span class="s2">&#34;-&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">lw</span><span class="o">=</span><span class="mf">1.5</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">tip</span> <span class="o">=</span> <span class="n">Wedge</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mf">15.35</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">),</span> <span class="mf">0.2</span><span class="p">,</span> <span class="mi">270</span><span class="p">,</span> <span class="mi">90</span><span class="p">,</span> <span class="n">fc</span><span class="o">=</span><span class="s2">&#34;none&#34;</span><span class="p">,</span> <span class="n">ec</span><span class="o">=</span><span class="n">bat_ec</span><span class="p">,</span> <span class="n">fill</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="s2">&#34;-&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">3</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">add_artist</span><span class="p">(</span><span class="n">battery</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">add_artist</span><span class="p">(</span><span class="n">tip</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">        Filling the battery cell with the data
</span></span></span><span class="line"><span class="cl"><span class="s2">        &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="n">filler</span> <span class="o">=</span> <span class="n">FancyBboxPatch</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mf">5.1</span><span class="p">,</span> <span class="mf">2.13</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="n">percentage</span> <span class="o">/</span> <span class="mi">10</span><span class="p">)</span> <span class="o">-</span> <span class="mf">0.2</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="mf">0.74</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;round, pad=0.2, rounding_size=0.5&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">fc</span><span class="o">=</span><span class="n">bat_fc</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">ec</span><span class="o">=</span><span class="n">bat_fc</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">fill</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">ls</span><span class="o">=</span><span class="s2">&#34;-&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">add_artist</span><span class="p">(</span><span class="n">filler</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">        Adding a lightning bolt in the centre of the cell
</span></span></span><span class="line"><span class="cl"><span class="s2">        &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="n">verts</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mf">10.5</span><span class="p">,</span> <span class="mf">3.1</span><span class="p">),</span>  <span class="c1"># top</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mf">8.5</span><span class="p">,</span> <span class="mf">2.4</span><span class="p">),</span>  <span class="c1"># left</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mf">9.5</span><span class="p">,</span> <span class="mf">2.4</span><span class="p">),</span>  <span class="c1"># left mid</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mi">9</span><span class="p">,</span> <span class="mf">1.9</span><span class="p">),</span>  <span class="c1"># bottom</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mi">11</span><span class="p">,</span> <span class="mf">2.6</span><span class="p">),</span>  <span class="c1"># right</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mf">2.6</span><span class="p">),</span>  <span class="c1"># right mid</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="mf">10.5</span><span class="p">,</span> <span class="mf">3.1</span><span class="p">),</span>  <span class="c1"># top</span>
</span></span><span class="line"><span class="cl">        <span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">codes</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="n">Path</span><span class="o">.</span><span class="n">MOVETO</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">Path</span><span class="o">.</span><span class="n">LINETO</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">Path</span><span class="o">.</span><span class="n">LINETO</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">Path</span><span class="o">.</span><span class="n">LINETO</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">Path</span><span class="o">.</span><span class="n">LINETO</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">Path</span><span class="o">.</span><span class="n">LINETO</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">Path</span><span class="o">.</span><span class="n">CLOSEPOLY</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="n">path</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="n">verts</span><span class="p">,</span> <span class="n">codes</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">bolt</span> <span class="o">=</span> <span class="n">PathPatch</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">fc</span><span class="o">=</span><span class="n">bol_fc</span><span class="p">,</span> <span class="n">ec</span><span class="o">=</span><span class="n">bol_ec</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">1.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">add_artist</span><span class="p">(</span><span class="n">bolt</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="kn">import</span> <span class="nn">traceback</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="nb">print</span><span class="p">(</span><span class="s2">&#34;EXCEPTION FOUND!!! SAFELY EXITING!!! Find the details below:&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">traceback</span><span class="o">.</span><span class="n">print_exc</span><span class="p">()</span></span></span></code></pre>
</div>
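<p>The colour thresholds and the fill width used inside <code>draw_battery()</code> can be checked on their own, without drawing anything. The following is a minimal sketch that mirrors those two calculations; the helper name <code>battery_fill()</code> is hypothetical and is not part of the original code.</p>

```python
def battery_fill(percentage, invert_perc=False):
    """Return (fill colour, filler width) for a battery percentage.

    Hypothetical helper: it only repeats the thresholds and the width
    formula that draw_battery() applies internally.
    """
    if invert_perc:
        percentage = 100 - percentage
    if percentage <= 20:
        colour = "#fc3d2e"  # red: 20% or less
    elif percentage >= 80:
        colour = "#53d069"  # green: 80% or more
    else:
        colour = "#f5c54e"  # yellow: everything in between
    # The battery body is 10 data units wide, so 10% maps to 1 unit;
    # 0.2 is subtracted exactly as in draw_battery().
    width = (percentage / 10) - 0.2
    return colour, width


for value in (10, 50, 90):
    print(value, battery_fill(value), battery_fill(value, invert_perc=True))
```

<p>Printing a few values this way is a quick check that <code>invert_perc=True</code> swaps the red and green ends of the scale, as intended.</p>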
<h2 id="reading-the-data"><span style="text-decoration: underline">Reading the Data</span><a class="headerlink" href="#reading-the-data" title="Link to this heading">#</a></h2>
<p>With the function in place, we can now put it to use, and for that we need data. In our example, the dataset lists Liverpool players and the minutes they have played in the past two seasons. The data was collected from <a href="http://www.fbref.com">Football Reference aka FBRef</a>.</p>
<p>We use the <code>read_excel()</code> function from the pandas library to read our dataset, which is stored as an Excel file.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_excel</span><span class="p">(</span><span class="s2">&#34;Liverpool Minutes Played.xlsx&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>Now, let us look at the data by listing the first five rows of our dataset:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">data</span><span class="o">.</span><span class="n">head</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/visualising-usage-using-batteries/head_data.PNG" alt="The first 5 rows of our dataset"></p>
<h2 id="plotting-our-data"><span style="text-decoration: underline">Plotting our data</span><a class="headerlink" href="#plotting-our-data" title="Link to this heading">#</a></h2>
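<p>Now that everything is ready, we go ahead and plot the data. We have 25 players in our dataset, so a 5 x 5 grid of axes fits them exactly. We&rsquo;ll also add some headers and set the colors accordingly. Note that <code>draw_battery()</code> later resizes the figure to 15 x 15 inches, so the <code>figsize</code> passed here is only the initial size.</p>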


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">facecolor</span> <span class="o">=</span> <span class="s2">&#34;#00001a&#34;</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">set_facecolor</span><span class="p">(</span><span class="n">facecolor</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="mf">0.35</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mf">0.95</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;Liverpool: Player Usage/Involvement&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">color</span><span class="o">=</span><span class="s2">&#34;white&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">size</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">fontname</span><span class="o">=</span><span class="s2">&#34;Libre Baskerville&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">fontweight</span><span class="o">=</span><span class="s2">&#34;bold&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="mf">0.25</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mf">0.92</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;Data from 19/20 and 20/21 | Battery percentage indicate usage | less battery = played more/ more involved&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">color</span><span class="o">=</span><span class="s2">&#34;white&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">size</span><span class="o">=</span><span class="mi">12</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">fontname</span><span class="o">=</span><span class="s2">&#34;Libre Baskerville&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
<p>We have now filled in the headers, figure size and so on. The next step is to draw on every axes, i.e. one battery per player. <code>p</code> is the counter used to step through the dataframe and fetch each player&rsquo;s data. The <code>draw_battery()</code> call draws the battery, and we add the required labels alongside it: the player name and the usage rate/percentage in this case.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">p</span> <span class="o">=</span> <span class="mi">0</span>  <span class="c1"># The variable that&#39;ll iterate through each row of the dataframe (for every player)</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span><span class="o">.</span><span class="n">text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="mi">10</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="mi">4</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nb">str</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">iloc</span><span class="p">[</span><span class="n">p</span><span class="p">,</span> <span class="mi">0</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">            <span class="n">color</span><span class="o">=</span><span class="s2">&#34;white&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">size</span><span class="o">=</span><span class="mi">14</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">fontname</span><span class="o">=</span><span class="s2">&#34;Lora&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">va</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span><span class="o">.</span><span class="n">set_facecolor</span><span class="p">(</span><span class="n">facecolor</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">draw_battery</span><span class="p">(</span><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">],</span> <span class="nb">round</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">iloc</span><span class="p">[</span><span class="n">p</span><span class="p">,</span> <span class="mi">8</span><span class="p">]),</span> <span class="n">invert_perc</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">        Add the battery percentage as text if a label is required
</span></span></span><span class="line"><span class="cl"><span class="s2">        &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span><span class="o">.</span><span class="n">text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="mi">5</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="mf">0.9</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;Usage - &#34;</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="mi">100</span> <span class="o">-</span> <span class="nb">round</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">iloc</span><span class="p">[</span><span class="n">p</span><span class="p">,</span> <span class="mi">8</span><span class="p">])))</span> <span class="o">+</span> <span class="s2">&#34;%&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">fontsize</span><span class="o">=</span><span class="mi">12</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">color</span><span class="o">=</span><span class="s2">&#34;white&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">p</span> <span class="o">+=</span> <span class="mi">1</span></span></span></code></pre>
</div>
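<p>The nested loop above assumes exactly 25 rows. For a dataset of a different size, one option is to derive the grid from the number of rows, walk the axes with <code>axes.flat</code>, and hide the unused cells. The following is only a sketch under stated assumptions: the stand-in dataframe, its column names <code>Player</code> and <code>Percentage</code>, and the <code>Agg</code> backend are all assumptions for illustration, not the original data or setup.</p>

```python
import math

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data: 7 players with a made-up percentage column.
data = pd.DataFrame(
    {
        "Player": [f"Player {k}" for k in range(1, 8)],
        "Percentage": [12, 35, 48, 60, 75, 88, 95],
    }
)

# Smallest square grid that fits every player (3 x 3 for 7 rows).
side = math.ceil(math.sqrt(len(data)))
fig, axes = plt.subplots(side, side, figsize=(3 * side, 3 * side), squeeze=False)

cells = list(axes.flat)
for cell, row in zip(cells, data.itertuples(index=False)):
    cell.set(xlim=(0, 20), ylim=(0, 5))
    cell.axis("off")
    cell.text(10, 4, row.Player, ha="center", va="center")
    # draw_battery(fig, cell, round(row.Percentage), invert_perc=True) would go here.

# Hide the grid cells that have no player.
for cell in cells[len(data):]:
    cell.set_visible(False)

used = sum(cell.get_visible() for cell in cells)
print(side, used)
```

<p>Pairing axes and rows with <code>zip</code> removes the manual counter, and the loop stops on its own when the rows run out.</p>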
<p>With the plot almost done, we add some final touches; this part is entirely optional. Since the visualisation is focused on Liverpool players, I add Liverpool&rsquo;s logo and my watermark. Crediting the data source/provider is also good ethical practice, so we do that as well before displaying the plot.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">liv</span> <span class="o">=</span> <span class="n">Image</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">&#34;Liverpool.png&#34;</span><span class="p">,</span> <span class="s2">&#34;r&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">liv</span> <span class="o">=</span> <span class="n">liv</span><span class="o">.</span><span class="n">resize</span><span class="p">((</span><span class="mi">80</span><span class="p">,</span> <span class="mi">80</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">liv</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">liv</span><span class="p">)</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">float</span><span class="p">)</span> <span class="o">/</span> <span class="mi">255</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">figimage</span><span class="p">(</span><span class="n">liv</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">890</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="mf">0.11</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="mf">0.08</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;viz: Rithwik Rajendran/@rithwikrajendra&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">color</span><span class="o">=</span><span class="s2">&#34;lightgrey&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">size</span><span class="o">=</span><span class="mi">14</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">fontname</span><span class="o">=</span><span class="s2">&#34;Lora&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="mf">0.8</span><span class="p">,</span> <span class="mf">0.08</span><span class="p">,</span> <span class="s2">&#34;data: FBRef/Statsbomb&#34;</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;lightgrey&#34;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">14</span><span class="p">,</span> <span class="n">fontname</span><span class="o">=</span><span class="s2">&#34;Lora&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p>So, we have the plot below. You can customise the design as you want in the <code>draw_battery()</code> function: change the size, colours, shapes, etc.</p>
<p><img src="/matplotlib/visualising-usage-using-batteries/Liverpool_Usage_Chart.png" alt="Usage Chart Liverpool"></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC'21: Final Report]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_final/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_quarter/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Quarter Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_prequarter/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Pre-Quarter Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_midterm/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Mid-Term Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_introduction/?utm_source=atom_feed" rel="related" type="text/html" title="Aitik Gupta joins as a Student Developer under GSoC&#39;21" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2020_final_work_product/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2020 Work Product - Baseline Images Problem" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_2021_final/</id>
            
            
            <published>2021-08-17T17:36:40+05:30</published>
            <updated>2021-08-17T17:36:40+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Google Summer of Code 2021: Final Report - Aitik Gupta</blockquote><p><strong><ins>Matplotlib: Revisiting Text/Font Handling</ins></strong></p>
<p>To kick things off for the final report, here&rsquo;s a <a href="https://user-images.githubusercontent.com/43996118/129448683-bc136398-afeb-40ac-bbb7-0576757baf3c.jpg">meme</a> to nudge about the <a href="/tags/gsoc/">previous blogs</a>.</p>
<h2 id="about-matplotlib">About Matplotlib<a class="headerlink" href="#about-matplotlib" title="Link to this heading">#</a></h2>
<p>Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations, which has become a <em>de-facto Python plotting library</em>.</p>
<p>Much of the implementation behind its font manager is inspired by <a href="https://www.w3.org/">W3C</a> compliant algorithms, allowing users to interact with font properties like <code>font-size</code>, <code>font-weight</code>, <code>font-family</code>, etc.</p>
<h4 id="however-the-way-matplotlib-handled-fonts-and-general-text-layout-was-not-ideal-which-is-what-summer-2021-was-all-about">However, the way Matplotlib handled fonts and general text layout was not ideal, which is what Summer 2021 was all about.<a class="headerlink" href="#however-the-way-matplotlib-handled-fonts-and-general-text-layout-was-not-ideal-which-is-what-summer-2021-was-all-about" title="Link to this heading">#</a></h4>
<blockquote>
<p>By &ldquo;not ideal&rdquo;, I do not mean that the library has design flaws, but that the design was engineered in the early 2000s, and is now <em>outdated</em>.</p>
</blockquote>
<p>(..more on this later)</p>
<h3 id="about-the-project">About the Project<a class="headerlink" href="#about-the-project" title="Link to this heading">#</a></h3>
<p>(PS: here&rsquo;s <a href="https://docs.google.com/document/d/11PrXKjMHhl0rcQB4p_W9JY_AbPCkYuoTT0t85937nB0/view#heading=h.feg5pv3x59u2">the link</a> to my GSoC proposal, if you&rsquo;re interested)</p>
<p>Overall, the project was divided into two major subgoals:</p>
<ol>
<li>Font Subsetting</li>
<li>Font Fallback</li>
</ol>
<p>But before we take each of them on, we should get an idea of some basic font terminology (of which there is a <em>lot</em>, and which is rightly <em>confusing</em>).</p>
<p>The <a href="https://github.com/matplotlib/matplotlib/pull/20346/files">PR: Clarify/Improve docs on family-names vs generic-families</a> brings about a bit of clarity about some of these terms. The next section has a linked PR which also explains the types of fonts and how that is relevant to Matplotlib.</p>
<h2 id="font-subsetting">Font Subsetting<a class="headerlink" href="#font-subsetting" title="Link to this heading">#</a></h2>
<p>An easy-to-read guide on Fonts and Matplotlib was created with <a href="https://github.com/matplotlib/matplotlib/pull/20450">PR: [Doc] Font Types and Font Subsetting</a>, which is currently live at <a href="https://matplotlib.org/devdocs/users/fonts.html">Matplotlib&rsquo;s DevDocs</a>.</p>
<p>Taking an excerpt from one of my previous blogs (and <a href="https://matplotlib.org/devdocs/users/fonts.html#subsetting">the doc</a>):</p>
<blockquote>
<p>Fonts can be considered as a collection of these glyphs, so ultimately the goal of subsetting is to find out which glyphs are <ins>required</ins> for a certain array of characters, and embed <ins>only those</ins> within the output.</p>
</blockquote>
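<p>As a rough illustration of the idea (not Matplotlib&rsquo;s actual implementation), the sketch below picks out the glyphs a string needs from a character map. The <code>cmap</code> dictionary and the <code>glyphs_needed</code> helper are hypothetical names made up for this example; a real font stores this mapping in its own tables.</p>

```python
def glyphs_needed(text, cmap):
    """Return the sorted glyph names required to render `text`.

    `cmap` is a hypothetical mapping from Unicode code point to glyph name;
    characters missing from it are simply skipped in this sketch.
    """
    return sorted({cmap[ord(ch)] for ch in text if ord(ch) in cmap})


# A toy "font" with four glyphs (assumed data, for illustration only).
toy_cmap = {ord("a"): "a", ord("b"): "b", ord("c"): "c", ord(" "): "space"}

# Only the glyphs that the text actually uses would be embedded.
subset = glyphs_needed("abba", toy_cmap)
print(subset)
```

<p>Here the glyph for <code>c</code> and the space glyph never make it into the subset, which is the whole point: the output carries only what the text needs.</p>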
<p>PDF, PS/EPS and SVG output document formats are special, in that <strong>the text within them can be <ins>editable</ins></strong>, i.e., one can copy/search text from documents (e.g., from a PDF file) if the text is editable.</p>
<h3 id="matplotlib-and-subsetting">Matplotlib and Subsetting<a class="headerlink" href="#matplotlib-and-subsetting" title="Link to this heading">#</a></h3>
<p>The PDF, PS/EPS and SVG backends used to support font subsetting, <em>only for a few types</em>. What that means is, before Summer &lsquo;21, Matplotlib could generate Type 3 subsets for PDF, PS/EPS backends, but it <ins><em>could not</em></ins> generate Type 42 / TrueType subsets.</p>
<p>With <a href="https://github.com/matplotlib/matplotlib/pull/20391">PR: Type42 subsetting in PS/PDF</a> merged in, users can expect their PDF/PS/EPS documents to contain subsetted glyphs from the original fonts.</p>
<p>This is especially beneficial for people who wish to use <ins>commercial</ins> (or <a href="https://en.wikipedia.org/wiki/CJK_characters">CJK</a>) fonts. Licenses for many fonts <strong><em>require</em></strong> subsetting such that they can’t be trivially copied from the output files generated from Matplotlib.</p>
<h2 id="font-fallback">Font Fallback<a class="headerlink" href="#font-fallback" title="Link to this heading">#</a></h2>
<p>Matplotlib was designed to work with a single font at runtime. A user <em>could</em> specify a <code>font.family</code>, which was supposed to correspond to <a href="https://www.w3schools.com/cssref/pr_font_font-family.asp">CSS</a> properties, but that was only used to find a <em>single</em> font present on the user&rsquo;s system.</p>
<p>Once that font was found (and one almost always was, since Matplotlib ships with a set of default fonts), all the user text was rendered only through that font, which used to give out &ldquo;<ins>tofu</ins>&rdquo; if a character wasn&rsquo;t found.</p>
<hr>
<p>It might seem like an <em>outdated</em> approach for text rendering, now that we have these concepts like font-fallback, <ins>but these concepts weren&rsquo;t very well discussed in the early 2000s</ins>. Even getting a single font to work <em>was considered a hard engineering problem</em>.</p>
<p>This was primarily because of the lack of <strong>any standardization</strong> for representation of fonts (Adobe had their own font representation, and so did Apple, Microsoft, etc.)</p>
<table>
  <thead>
      <tr>
          <th><img src="https://user-images.githubusercontent.com/43996118/128605750-9d76fa4a-ce57-45c6-af23-761334d48ef7.png" alt="Previous"></th>
          <th><img src="https://user-images.githubusercontent.com/43996118/128605746-9f79ebeb-c03d-407e-9e27-c3203a210908.png" alt="After"></th>
      </tr>
  </thead>
  <tbody>
  </tbody>
</table>
<p align="middle">
    <ins>Previous</ins> (notice <i>Tofus</i>) VS  <ins>After</ins> (CJK font as fallback)
</p>
<p>To migrate from a font-first approach to a text-first approach, there are multiple steps involved:</p>
<h3 id="parsing-the-whole-font-family">Parsing the whole font family<a class="headerlink" href="#parsing-the-whole-font-family" title="Link to this heading">#</a></h3>
<p>The very first (and crucial!) step is to get to a point where we have multiple font paths (ideally individual font files for the whole family). That is achieved with either:</p>
<ul>
<li><a href="https://github.com/matplotlib/matplotlib/pull/20496">PR: [with findfont diff] Parsing all families in font_manager</a>, or</li>
<li><a href="https://github.com/matplotlib/matplotlib/pull/20549">PR: [without findfont diff] Parsing all families in font_manager</a></li>
</ul>
<p>Quoting one of my <a href="../gsoc_2021_prequarter/">previous</a> blogs:</p>
<blockquote>
<p>Don’t break, a lot at stake!</p>
</blockquote>
<p>My first approach was to change the existing public <code>findfont</code> API to incorporate multiple filepaths. Since Matplotlib has a <em>huge</em> userbase, there&rsquo;s a high chance this would break a chunk of people&rsquo;s workflows:</p>
<p align="center">
  <img src="https://user-images.githubusercontent.com/43996118/129636132-47b141b3-f149-49b7-b0c0-67c256bd6ee1.png" alt="FamilyParsingFlowChart" width="60%" />
  First PR (left), Second PR (right)
</p>
<h3 id="ft2font-overhaul">FT2Font Overhaul<a class="headerlink" href="#ft2font-overhaul" title="Link to this heading">#</a></h3>
<p>Once we get a list of font paths, we need to change the internal representation of a &ldquo;font&rdquo;. Matplotlib has a utility called FT2Font, which is written in C++, and used with wrappers as a Python extension, which in turn is used throughout the backends. For all intents and purposes, it used to mean: <code>FT2Font === SingleFont</code> (if you&rsquo;re interested, here&rsquo;s a <a href="https://user-images.githubusercontent.com/43996118/128352387-76a3f52a-20fc-4853-b624-0c91844fc785.png">meme</a> about how FT2Font was named!)</p>
<p>But that is not the case anymore, here&rsquo;s a flowchart to explain what happens now:</p>
<p align="center">
  <img src="https://user-images.githubusercontent.com/43996118/129720023-14f5d67f-f279-433f-ad78-e5eccb6c784a.png" alt="FamilyParsingFlowChart" width="100%" />
  Font-Fallback Algorithm
</p>
<p>With <a href="https://github.com/matplotlib/matplotlib/pull/20740">PR: Implement Font-Fallback in Matplotlib</a>, every FT2Font object has a <code>std::vector&lt;FT2Font *&gt; fallback_list</code>, which is used for filling the parent cache, as can be seen in the self-explanatory flowchart.</p>
<p>For simplicity, only one type of cache (<ins>character -&gt; FT2Font</ins>) is shown, whereas in the actual implementation there are two types of caches: the one shown above, and another for glyphs (<ins>glyph_id -&gt; FT2Font</ins>).</p>
<blockquote>
<p>Note: Only the parent&rsquo;s APIs are used in some backends, so for each of the individual public functions like <code>load_glyph</code>, <code>load_char</code>, <code>get_kerning</code>, etc., we find the FT2Font object which has that glyph from the parent FT2Font cache!</p>
</blockquote>
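<p>The lookup described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: <code>SketchFont</code>, <code>charset</code> and <code>font_for</code> are hypothetical names invented here, and the real logic lives in the C++ <code>FT2Font</code> class, which this sketch does not reproduce.</p>

```python
class SketchFont:
    """Hypothetical stand-in for a single font; this is not FT2Font."""

    def __init__(self, name, charset, fallbacks=()):
        self.name = name
        self.charset = set(charset)       # characters this font has glyphs for
        self.fallbacks = list(fallbacks)  # tried in order after the font itself
        self.char_cache = {}              # character -> font chosen to render it

    def font_for(self, char):
        """Return the font that should render `char`, filling the cache."""
        if char in self.char_cache:
            return self.char_cache[char]
        chosen = self  # default: the parent renders it (possibly as tofu)
        for font in [self, *self.fallbacks]:
            if char in font.charset:
                chosen = font
                break
        self.char_cache[char] = chosen
        return chosen


cjk = SketchFont("cjk", "你好")
parent = SketchFont("latin", "Helo ", fallbacks=[cjk])

# Each character is rendered by the first font in the chain that has it.
for ch in "Hello 你好":
    print(repr(ch), parent.font_for(ch).name)
```

<p>Callers only ever talk to the parent object, which mirrors the note above: the parent&rsquo;s cache decides which font in the chain serves each character.</p>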
<h3 id="multi-font-embedding-in-pdfpseps">Multi-Font embedding in PDF/PS/EPS<a class="headerlink" href="#multi-font-embedding-in-pdfpseps" title="Link to this heading">#</a></h3>
<p>Now that we have multiple fonts to render a string, we also need to embed them for those special backends (i.e., PDF/PS, etc.). This was done with some patches to specific backends:</p>
<ul>
<li><a href="https://github.com/matplotlib/matplotlib/pull/20804">PR: Implement multi-font embedding for PDF Backend</a></li>
<li><a href="https://github.com/matplotlib/matplotlib/pull/20832">PR: Implement multi-font embedding for PS Backend</a></li>
</ul>
<p>With this, one could create a PDF or a PS/EPS document with multiple fonts which are embedded (and subsetted!).</p>
<h2 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h2>
<p>From small contributions to eventually working on a core module of such a huge library, the road was not what I had imagined, and I learnt a lot while designing solutions to these problems.</p>
<h4 id="the-work-i-did-would-eventually-end-up-affecting-every-single-matplotlib-user">The work I did would eventually end up affecting every single Matplotlib user.<a class="headerlink" href="#the-work-i-did-would-eventually-end-up-affecting-every-single-matplotlib-user" title="Link to this heading">#</a></h4>
<p>&hellip;since all plots will work their way through the new codepath!</p>
<p>I think that single statement is worth the <ins>whole GSoC project</ins>.</p>
<h3 id="pull-request-statistics">Pull Request Statistics<a class="headerlink" href="#pull-request-statistics" title="Link to this heading">#</a></h3>
<p>For the sake of statistics (and to make GSoC sound a bit less intimidating), here&rsquo;s a list of contributions I made to Matplotlib <ins>before Summer &lsquo;21</ins>, most of which are only a few lines of diff:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: center">Created At</th>
          <th>PR Title</th>
          <th style="text-align: center">Diff</th>
          <th style="text-align: center">Status</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: center">Nov 2, 2020</td>
          <td><a href="https://github.com/matplotlib/matplotlib/pull/18870">Expand ScalarMappable.set_array to accept array-like inputs</a></td>
          <td style="text-align: center">(+28 −4)</td>
          <td style="text-align: center">MERGED</td>
      </tr>
      <tr>
          <td style="text-align: center">Nov 8, 2020</td>
          <td><a href="https://github.com/matplotlib/matplotlib/pull/18916">Add overset and underset support for mathtext</a></td>
          <td style="text-align: center">(+71 −0)</td>
          <td style="text-align: center">MERGED</td>
      </tr>
      <tr>
          <td style="text-align: center">Nov 14, 2020</td>
          <td><a href="https://github.com/matplotlib/matplotlib/pull/18947">Strictly increasing check with test coverage for streamplot grid</a></td>
          <td style="text-align: center">(+54 −2)</td>
          <td style="text-align: center">MERGED</td>
      </tr>
      <tr>
          <td style="text-align: center">Jan 11, 2021</td>
          <td><a href="https://github.com/matplotlib/matplotlib/pull/19271">WIP: Add support to edit subplot configurations via textbox</a></td>
          <td style="text-align: center">(+51 −11)</td>
          <td style="text-align: center">DRAFT</td>
      </tr>
      <tr>
          <td style="text-align: center">Jan 18, 2021</td>
          <td><a href="https://github.com/matplotlib/matplotlib/pull/19314">Fix over/under mathtext symbols</a></td>
          <td style="text-align: center">(+7,459 −4,169)</td>
          <td style="text-align: center">MERGED</td>
      </tr>
      <tr>
          <td style="text-align: center">Feb 11, 2021</td>
          <td><a href="https://github.com/matplotlib/matplotlib/pull/19497">Add overset/underset whatsnew entry</a></td>
          <td style="text-align: center">(+28 −17)</td>
          <td style="text-align: center">MERGED</td>
      </tr>
      <tr>
          <td style="text-align: center">May 15, 2021</td>
          <td><a href="https://github.com/matplotlib/matplotlib/pull/20235">Warn user when mathtext font is used for ticks</a></td>
          <td style="text-align: center">(+28 −0)</td>
          <td style="text-align: center">MERGED</td>
      </tr>
  </tbody>
</table>
<p>Here&rsquo;s a list of PRs I opened <ins>during Summer'21</ins>:</p>
<ul>
<li>[Status: ✅] <a href="https://github.com/matplotlib/matplotlib/pull/20346">Clarify/Improve docs on family-names vs generic-families</a></li>
<li>[Status: ✅] <a href="https://github.com/matplotlib/matplotlib/pull/20367">Add parse_math in Text and default it False for TextBox</a></li>
<li>[Status: ✅] <a href="https://github.com/matplotlib/matplotlib/pull/20391">Type42 subsetting in PS/PDF</a></li>
<li>[Status: ✅] <a href="https://github.com/matplotlib/matplotlib/pull/20450">[Doc] Font Types and Font Subsetting</a></li>
<li>[Status: 🚧] <a href="https://github.com/matplotlib/matplotlib/pull/20496">[with findfont diff] Parsing all families in font_manager</a></li>
<li>[Status: 🚧] <a href="https://github.com/matplotlib/matplotlib/pull/20549">[without findfont diff] Parsing all families in font_manager</a></li>
<li>[Status: 🚧] <a href="https://github.com/matplotlib/matplotlib/pull/20740">Implement Font-Fallback in Matplotlib</a></li>
<li>[Status: 🚧] <a href="https://github.com/matplotlib/matplotlib/pull/20804">Implement multi-font embedding for PDF Backend</a></li>
<li>[Status: 🚧] <a href="https://github.com/matplotlib/matplotlib/pull/20832">Implement multi-font embedding for PS Backend</a></li>
</ul>
<h2 id="acknowledgements">Acknowledgements<a class="headerlink" href="#acknowledgements" title="Link to this heading">#</a></h2>
<p>From learning about software engineering fundamentals from <a href="https://github.com/tacaswell">Tom</a> to learning about nitty-gritty details about font representations from <a href="https://github.com/jkseppan">Jouni</a>;</p>
<p>From learning through <a href="https://github.com/anntzer">Antony</a>&rsquo;s patches and pointers to receiving amazing feedback on these blogs from <a href="https://github.com/story645">Hannah</a>, it has been an adventure! 💯</p>
<p><em>Special Mentions: <a href="https://github.com/sauerburger">Frank</a>, <a href="https://github.com/srijan-paul">Srijan</a> and <a href="https://github.com/tfidfwastaken">Atharva</a> for their helping hands!</em></p>
<p>And lastly, <em>you</em>, the reader; if you&rsquo;ve been following my <a href="/tags/gsoc/">previous blogs</a>, or if you&rsquo;ve landed at this one directly, I thank you nevertheless. (one last <a href="https://user-images.githubusercontent.com/43996118/126441988-5a2067fd-055e-44e5-86e9-4dddf47abc9d.png">meme</a>, I promise!)</p>
<p>I know I speak for every developer out there, when I say <ins><strong><em>it means a lot</em></strong></ins> when you choose to look at their journey or their work product; it could as well be a tiny website, or it could be as big as designing a complete library!</p>
<hr>
<blockquote>
<p>I&rsquo;m grateful to <a href="https://matplotlib.org/">Matplotlib</a> (under the parent organisation: <a href="https://numfocus.org/">NumFOCUS</a>), and of course, <a href="https://summerofcode.withgoogle.com/">Google Summer of Code</a> for this incredible learning opportunity.</p>
</blockquote>
<p>Farewell, reader! :&rsquo;)</p>
<p align="center">
  <img src="https://user-images.githubusercontent.com/43996118/118876008-5e6dd580-b90a-11eb-96db-0abc930c6993.png" alt="MatplotlibGSoC" />
  Consider contributing to Matplotlib (Open Source in general) ❤️
</p>
<h4 id="note-this-blog-post-is-also-available-at-my-personal-website">NOTE: This blog post is also available at my <a href="https://aitikgupta.github.io/gsoc-final/">personal website</a>.<a class="headerlink" href="#note-this-blog-post-is-also-available-at-my-personal-website" title="Link to this heading">#</a></h4>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[My Summer of Code 2021]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/?utm_source=atom_feed" rel="related" type="text/html" title="Completing the Asadpour Algorithm" />
                <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="related" type="text/html" title="Looking at the Big Picture" />
                <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/</id>
            
            
            <published>2021-08-16T00:00:00+00:00</published>
            <updated>2021-08-16T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Review of my entire summer implementing the Asadpour ATSP Algorithm</blockquote><p>Welcome! This post is not going to discuss technical implementation details or theoretical work for my Google Summer of Code project, but will rather serve as a summary and recap of the work that I did this summer.</p>
<p>I am very happy with the work I was able to accomplish and believe that I successfully completed my project.</p>
<h2 id="overview">Overview<a class="headerlink" href="#overview" title="Link to this heading">#</a></h2>
<p>My project was titled NetworkX: Implementing the Asadpour Asymmetric Traveling Salesman Problem Algorithm.
The updated abstract given on the Summer of Code <a href="https://summerofcode.withgoogle.com/dashboard/project/5352909442646016/details/">project page</a> is below.</p>
<blockquote>
<p>This project seeks to implement the asymmetric traveling salesman problem algorithm developed by Asadpour et al., originally published in 2010 and revised in 2017.
The project is broken into multiple methods, each of which has a set timetable during the project.
We start by solving the Held-Karp relaxation using the Ascent method from the original paper by Held and Karp.
Assuming the result is fractional, we continue into the Asadpour algorithm (integral solutions are optimal by definition and immediately returned).
We approximate the distribution of spanning trees on the undirected support of the Held Karp solution using a maximum entropy rounding method to construct a distribution of trees.
Roughly speaking, the probability of sampling any given tree is proportional to the product of all its edge lambda values.
We sample 2 log <em>n</em> trees from the distribution using an iterative approach developed by V. G. Kulkarni and choose the tree with the smallest cost after returning direction to the arcs.
Finally, the minimum tree is augmented using a minimum network flow algorithm and shortcut down to an <em>O(log n / log log n)</em> approximation of the minimum Hamiltonian cycle.</p>
</blockquote>
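<p>To make the &ldquo;product of edge lambda values&rdquo; statement concrete, here is a small sketch on a triangle. It enumerates the spanning trees explicitly, which is only feasible for toy graphs; the project itself samples trees iteratively instead. The lambda values and the <code>tree_probabilities</code> helper are assumptions made up for this example.</p>

```python
from math import prod


def tree_probabilities(trees, lam):
    """Normalise the per-tree products of edge lambda values into probabilities."""
    weights = [prod(lam[edge] for edge in tree) for tree in trees]
    total = sum(weights)
    return [weight / total for weight in weights]


# Assumed lambda values on the three edges of a triangle.
lam = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 3.0}

# The three spanning trees of a triangle, each given as a list of edges.
trees = [
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("a", "c")],
    [("b", "c"), ("a", "c")],
]

probs = tree_probabilities(trees, lam)
print(probs)
```

<p>The tree weights are 2, 3 and 6, so the tree built from the two heaviest-lambda edges is sampled with probability 6/11.</p>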
<p>My proposal PDF for the 2021 Summer of Code can be <a href="https://drive.google.com/file/d/1XGrjupLYWioz-Nf8Vp63AeuBVApdkwSa/view?usp=sharing">found here</a>.</p>
<p>All of my changes and additions to NetworkX are part of <a href="https://github.com/networkx/networkx/pull/4740">this pull request</a> and can also be found on <a href="https://github.com/mjschwenne/networkx/tree/bothTSP">this branch</a> in my fork of the GitHub repository, but I will be discussing the changes and commits in more detail later.
Also note that for the commits I listed in each section, this is an incomplete list only hitting on focused commits to that function or its tests.
For the complete list, please reference the pull request or the <code>bothTSP</code> GitHub branch on my fork of NetworkX.</p>
<p>My contributions to NetworkX this summer consist predominantly of the following functions and classes, each of which I will discuss in their own sections of this blog post.
Functions and classes which are front-facing are also linked to the <a href="https://networkx.org/documentation/networkx-2.7.1/index.html">developer documentation</a> for NetworkX in the list below and for their section headers.</p>
<ul>
<li><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.tree.mst.SpanningTreeIterator.html"><code>SpanningTreeIterator</code></a></li>
<li><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.tree.branchings.ArborescenceIterator.html"><code>ArborescenceIterator</code></a></li>
<li><code>held_karp_ascent</code></li>
<li><code>spanning_tree_distribution</code></li>
<li><code>sample_spanning_tree</code></li>
<li><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.approximation.traveling_salesman.asadpour_atsp.html"><code>asadpour_atsp</code></a></li>
</ul>
<p>These functions have also been unit tested, and those tests will be integrated into NetworkX once the pull request is merged.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>The following papers are where all of these algorithms originate from, and they were of course instrumental in the completion of this project.</p>
<p>[1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An O (log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, SODA ’10, Society for Industrial and Applied Mathematics, 2010, p. 379 - 389 <a href="https://dl.acm.org/doi/abs/10.5555/1873601.1873633">https://dl.acm.org/doi/abs/10.5555/1873601.1873633</a>.</p>
<p>[2] J. Edmonds, <em>Optimum Branchings</em>, Journal of Research of the National Bureau of Standards, 1967, Vol. 71B, p.233-240, <a href="https://archive.org/details/jresv71Bn4p233">https://archive.org/details/jresv71Bn4p233</a></p>
<p>[3] M. Held, R.M. Karp, <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
<p>[4] G.K. Janssens, K. Sörensen, <em>An algorithm to generate all spanning trees in order of increasing cost</em>, Pesquisa Operacional, 2005-08, Vol. 25 (2), p. 219-229, <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en">https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en</a></p>
<p>[5] V. G. Kulkarni, <em>Generating random combinatorial objects</em>, Journal of algorithms, 11 (1990), p. 185–207.</p>
<h2 id="spanningtreeiterator"><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.tree.mst.SpanningTreeIterator.html"><code>SpanningTreeIterator</code></a><a class="headerlink" href="#spanningtreeiterator" title="Link to this heading">#</a></h2>
<p>The <code>SpanningTreeIterator</code> was the first contribution I completed as part of my GSoC project.
This class takes a graph and returns every spanning tree in it in order of increasing cost, which makes it a direct implementation of [4].</p>
<p>The interesting thing about this iterator is that it is not used as part of the Asadpour algorithm, but served as an intermediate step so that I could develop the <code>ArborescenceIterator</code> which is required for the Held Karp relaxation.
It works by partitioning the edges of the graph as either included, excluded or open and then finding the minimum spanning tree which respects the partition data on the graph edges.
In order to get this to work, I created a new minimum spanning tree function called <code>kruskal_mst_edges_partition</code> which does exactly that.
To prevent redundancy, all Kruskal minimum spanning trees now use this function (the original <code>kruskal_mst_edges</code> function is now just a wrapper for the partitioned version).
Once a spanning tree is returned from the iterator, the partition data for that tree is split so that the union of the newly generated partitions is the set of all spanning trees in the partition except the returned minimum spanning tree.</p>
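<p>The partition scheme can be sketched in plain Python. This is a simplified illustration in the spirit of [4], not the NetworkX code: <code>kruskal_partition</code> and <code>spanning_trees_by_cost</code> are hypothetical names, nodes are assumed to be numbered from 0, and edges are plain <code>(u, v, weight)</code> tuples.</p>

```python
import heapq
from itertools import count


def kruskal_partition(n, edges, included, excluded):
    """Minimum spanning tree on nodes 0..n-1 that respects a partition.

    `edges` is a list of (u, v, weight); `included` and `excluded` are sets of
    (u, v) pairs. Returns (cost, tree_edges), or None if no spanning tree fits.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Included edges go first, then open edges by weight; excluded are dropped.
    candidates = [e for e in edges if (e[0], e[1]) not in excluded]
    candidates.sort(key=lambda e: ((e[0], e[1]) not in included, e[2]))

    cost, tree = 0, []
    for u, v, w in candidates:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
            cost += w
    if len(tree) != n - 1 or not included.issubset(tree):
        return None
    return cost, tree


def spanning_trees_by_cost(n, edges):
    """Yield (cost, tree_edges) for every spanning tree, cheapest first."""
    tie = count()  # tie-breaker so the heap never compares edge lists
    first = kruskal_partition(n, edges, frozenset(), frozenset())
    if first is None:
        return
    heap = [(first[0], next(tie), first[1], frozenset(), frozenset())]
    while heap:
        cost, _, tree, included, excluded = heapq.heappop(heap)
        yield cost, tree
        # Split the partition: child i excludes the i-th open tree edge and
        # includes the open tree edges before it, so children never overlap.
        grown = set(included)
        for edge in tree:
            if edge in included:
                continue
            child_inc, child_exc = frozenset(grown), excluded | {edge}
            result = kruskal_partition(n, edges, child_inc, child_exc)
            if result is not None:
                heapq.heappush(
                    heap, (result[0], next(tie), result[1], child_inc, child_exc)
                )
            grown.add(edge)


triangle = [(0, 1, 1), (1, 2, 2), (0, 2, 3)]
for cost, tree in spanning_trees_by_cost(3, triangle):
    print(cost, tree)
```

<p>Keeping the pending partitions in a heap keyed on the cost of their minimum spanning tree is what lets the trees come out in order of increasing cost.</p>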
<p>As I mentioned earlier, the <code>SpanningTreeIterator</code> is not directly used in my GSoC project, but I still decided to implement it to understand the partition process and be able to directly use the examples from [4] before moving onto the <code>ArborescenceIterator</code>.
I&rsquo;m sure this class will be useful to other NetworkX users, and it provided a strong foundation on which to build the <code>ArborescenceIterator</code>.</p>
<p><strong>Blog Posts about <code>SpanningTreeIterator</code></strong></p>
<p>5 Jun 2021 - <a href="../finding-all-minimum-arborescences">Finding All Minimum Arborescences</a></p>
<p>10 Jun 2021 - <a href="../implementing-the-iterators">Implementing The Iterators</a></p>
<p><strong>Commits about <code>SpanningTreeIterator</code></strong></p>
<p>Now, at the beginning of this project, my commit messages were not very good&hellip;
I had some problems with merge conflicts after I accidentally committed to the wrong branch, and this was the first time I&rsquo;d used a pre-commit hook.</p>
<p>I have not changed the commit messages here, so that you may be amused by my thoroughly unhelpful messages, but I did annotate them to provide a more accurate description of each commit.</p>
<p><a href="https://github.com/mjschwenne/networkx/commit/495458842d3ec798c6ea52dc1c8089b9a5ce3de5">Testing</a> - <em>Rewrote Kruskal&rsquo;s algorithm to respect partitions and tested that while stubbing the iterators in a separate file</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/3d81e36c8313013a3ae4c4dfc6517c3bde8d826e">I&rsquo;m not entirely sure how the commit hook works&hellip;</a> - <em>Added test cases and finalized implementation of Spanning Tree Iterator in the incorrect file</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/d481f757125a699f69bf5c16790d2e727e3cc159">Moved iterators into the correct files to maintain proper codebase visibility</a> - <em>Realized that the iterators need to be in <code>mst.py</code> and <code>branchings.py</code> respectively to keep private functions hidden</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5503203433bc875df8c0de5d827bda7bed1589e2">Documentation update for the iterators</a> - <em>No explanation needed</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/337804ee38b2c1ac3964447a39d67184081deb01">Update mst.py to accept suggestion</a> - <em>Accepted doc string edit from code review</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/97b2da1b5499ecbfd15ef2abd385e50f94c6ba97">Cleaned code, merged functions if possible and opened partition functionality to all</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/aef90dfcbb8b8424c6ed887311b4825559d0a398">Implement suggestions from boothby</a></p>
<h2 id="arborescenceiterator"><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.tree.branchings.ArborescenceIterator.html"><code>ArborescenceIterator</code></a><a class="headerlink" href="#arborescenceiterator" title="Link to this heading">#</a></h2>
<p>The <code>ArborescenceIterator</code> is a modified version of the algorithm discussed in [4] so that it iterates over the spanning arborescences.</p>
<p>This iterator was a bit more difficult to implement, but that is due to how the minimum spanning arborescence algorithm is structured rather than the partition scheme not being applicable to directed graphs.
In fact the partition scheme is identical to the undirected <code>SpanningTreeIterator</code>, but Edmonds&rsquo; algorithm is more complex and there are several edge cases about how nodes can be contracted and what it means for respecting the partition data.
In order to fully understand the NetworkX implementation, I had to read the original Edmonds paper, [2].</p>
<p>The most notable change was that when the iterator writes the next partition onto the edges of the graph just before Edmonds&rsquo; algorithm is executed, if any incoming edge is marked as included, all of the others are marked as excluded.
This is an implicit part of the <code>SpanningTreeIterator</code>, but needed to be explicitly done here so that if the vertex in question was merged during Edmonds&rsquo; algorithm we could not choose two of the incoming edges to the same vertex once the merging was reversed.</p>
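<p>That extra step can be sketched as follows. This is only an illustration: the function name <code>close_incoming_edges</code> is hypothetical, and the partition states are plain strings standing in for whatever representation the library uses.</p>

```python
def close_incoming_edges(edges, partition):
    """Hypothetical helper, not part of NetworkX.

    `edges` is a list of directed (u, v) pairs and `partition` maps some of
    them to "included", "excluded" or "open" (missing edges count as open).
    Returns a copy in which every vertex that has an included incoming edge
    has all of its other incoming edges excluded.
    """
    result = dict(partition)
    heads = {v for (u, v), state in partition.items() if state == "included"}
    for u, v in edges:
        if v in heads and result.get((u, v), "open") != "included":
            result[(u, v)] = "excluded"
    return result


edges = [(0, 2), (1, 2), (3, 2), (0, 1)]
print(close_incoming_edges(edges, {(0, 2): "included"}))
# {(0, 2): 'included', (1, 2): 'excluded', (3, 2): 'excluded'}
```

<p>An arborescence has at most one incoming edge per vertex, so once one is forced in, no other edge into that vertex can appear in any tree of the partition.</p>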
<p>As a final note, the <code>ArborescenceIterator</code> has one more initial parameter than the <code>SpanningTreeIterator</code>, which is the ability to give it an initial partition and iterate over all spanning arborescences with cost greater than the initial partition.
This was used as part of the branch and bound method, but is no longer a part of my Asadpour algorithm implementation.</p>
<p><strong>Blog Posts about <code>ArborescenceIterator</code></strong></p>
<p>5 Jun 2021 - <a href="../finding-all-minimum-arborescences">Finding All Minimum Arborescences</a></p>
<p>10 Jun 2021 - <a href="../implementing-the-iterators">Implementing The Iterators</a></p>
<p><strong>Commits about <code>ArborescenceIterator</code></strong></p>
<p>My commits listed here are still annotated and much of the work was done at the same time.</p>
<p><a href="https://github.com/mjschwenne/networkx/commit/495458842d3ec798c6ea52dc1c8089b9a5ce3de5">Testing</a> - <em>Rewrote Kruskal&rsquo;s algorithm to respect partitions and tested that while stubbing the iterators in a separate file</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/d481f757125a699f69bf5c16790d2e727e3cc159">Moved iterators into the correct files to maintain proper codebase visibility</a> - <em>Realized that the iterators need to be in <code>mst.py</code> and <code>branchings.py</code> respectively to keep private functions hidden</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/73cade29568f9e10303fb901c97ac52b1d45b8aa">Including Black reformat</a> - <em>Modified Edmonds&rsquo; algorithm to respect partitions</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/ae1c1031980f7e3c3854d718c8813b226d2e8d42">Modified the ArborescenceIterator to accept init partition</a> - <em>No explanation needed</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5503203433bc875df8c0de5d827bda7bed1589e2">Documentation update for the iterators</a> - <em>No explanation needed</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/b44a5ab9c8d5ac86db446213d7b9712e5b9aac81">Update branchings.py accept doc string edit</a> - <em>No explanation needed</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/97b2da1b5499ecbfd15ef2abd385e50f94c6ba97">Cleaned code, merged functions if possible and opened partition functionality to all</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/55688deb9a84bc7a77aecc556a63ff80dc41c56f">Implemented review suggestions from rossbar</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/aef90dfcbb8b8424c6ed887311b4825559d0a398">Implement suggestions from boothby</a></p>
<h2 id="held_karp_ascent"><code>held_karp_ascent</code><a class="headerlink" href="#held_karp_ascent" title="Link to this heading">#</a></h2>
<p>The Held Karp relaxation was the most difficult part of my GSoC project and the part that I was the most worried about going into this May.</p>
<p>My plans on how to solve the relaxation evolved over the course of the summer as well, finally culminating in <code>held_karp_ascent</code>.
In my GSoC proposal, I discuss using <code>scipy</code> to solve the relaxation, but the Held Karp relaxation is a semi-infinite linear problem (that is, it is finite but exponential) so I would quickly surpass the capabilities of virtually any computer that the code would be run on.
Fortunately I realized that while I was still writing my proposal and was able to change it.
Next, I wanted to use the ellipsoid algorithm because that is the suggested method in the Asadpour paper [1].</p>
<p>As it happens, the ellipsoid algorithm is not implemented in <code>numpy</code> or <code>scipy</code>, and after discussing the practicality of implementing the algorithm as part of this project, we decided that a robust ellipsoid solver was a GSoC project unto itself and beyond the scope of the Asadpour algorithm.
Another method was needed, and was found.
In the original paper by Held and Karp [3], they present three different algorithms for solving the relaxation: the column-generation technique, the ascent method and the branch and bound method.
After reading the paper and comparing all of the methods, I decided that the branch and bound method was the best in terms of performance and wanted to implement that one.</p>
<p>The branch and bound method is a modified version of the ascent method, so I started by implementing the ascent method, then the branch and bound around it.
This had the extra benefit of allowing me to compare the two and determine which is actually better.</p>
<p>Implementing the ascent method proved difficult.
There were a number of subtle bugs in finding the minimum 1-arborescences and in finding the value of epsilon, the latter because I had not recognized all of the valid edge substitutions in the graph.
More information about these problems can be found in my post titled <em>Understanding the Ascent Method</em>.
Even after this the ascent method was not working properly, but I decided to move on to the branch and bound method in hopes of learning more about the process so that I could fix the ascent method.</p>
<p>That is exactly what happened!
While debugging the branch and bound method, I realized that my function for finding the set of minimum 1-arborescences would stop searching too soon and possibly miss the minimum 1-arborescences.
Once I fixed that bug, both the ascent as well as the branch and bound method started to produce the correct results.</p>
<p>But which one would be used in the final project?</p>
<p>Well, that came down to which output was more compatible with the rest of the Asadpour algorithm.
The ascent method could find a fractional solution where the edges are not totally in or out of the solution while the branch and bound method would take the time to ensure that the solution was integral.
As it would happen, the Asadpour algorithm expects a fractional solution to the Held Karp relaxation, so in the end the ascent method won out and the branch and bound method was removed from the project.</p>
<p>All of this is detailed in the (many) blog posts I wrote on this topic, which are listed below.</p>
<p><strong>Blog posts about the Held Karp relaxation</strong></p>
<p>My first two posts were about the <code>scipy</code> solution and the ellipsoid algorithm.</p>
<p>11 Apr 2021 - <a href="../held-karp-relaxation">Held Karp Relaxation</a></p>
<p>8 May 2021 - <a href="../held-karp-separation-oracle">Held Karp Separation Oracle</a></p>
<p>This next post discusses the merits of each algorithm presented in the original Held and Karp paper [3].</p>
<p>3 Jun 2021 - <a href="../a-closer-look-at-held-karp">A Closer Look At Held Karp</a></p>
<p>And finally, the last three Held Karp related posts are about the debugging of the algorithms I did implement.</p>
<p>22 Jun 2021 - <a href="../understanding-the-ascent-method">Understanding The Ascent Method</a></p>
<p>28 Jun 2021 - <a href="../implementing-the-held-karp-relaxation">Implementing The Held Karp Relaxation</a></p>
<p>7 Jul 2021 - <a href="../finalizing-held-karp">Finalizing Held Karp</a></p>
<p><strong>Commits about the Held Karp relaxation</strong></p>
<p>Annotations only provided if needed.</p>
<p><a href="https://github.com/networkx/networkx/pull/4740/commits/716437f6ccbbd6c77a7a01b38d330f899c333f0a">Grabbing black reformats</a> - <em>Initial Ascent method implementation</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/cd28eb71676ecc34c7af6f2e0f8980ad6ae89f00">Working on debugging ascent method plus black reformats</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/660e4d3f04a0b4ce28e152af7f8c7df84e1961b3">Ascent method terminating, but at non-optimal solution</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/8314c3c28d205ed5a7d6316904f4db0265d93942">minor edits</a> - <em>Removed some debug statements</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/f7dcb54ce17ec3646e7d3c33f909f6b382608532">Fixed termination condition, still given non-optimal result</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/beccc98c362eb8bdddc42b72af0d669ad082e468">Minor bugfix, still non-optimal result</a> - <em>Ensured reported answer is the cycle if multiple options</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/68ffad5c70811a702ade569817a1f3a14c33a1af">Fixed subtle bug in find_epsilon()</a> - <em>Fixed the improper substitute detection bug</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/a4f1442dcf2c6f69dcf03dacf0ed38183cdc7ddb">Cleaned code and tried something which didn&rsquo;t work</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/644d14ac6ce327ce577592e566153c0117c6dcb6">Black formats</a> - <em>Initial branch and bound implementation</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/288bb5324cceb11e94396e435616c70b87926f69">Branch and bound returning optimal solution</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/242b53da0e00326ece75304a4ad8fb89e9ba8a25">black formatting changes</a> - <em>Split ascent and branch and bound methods into different functions</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/adbf930c23271c17a4d2fed6fbcd03552799793c">Performance tweaks and testing fractional answers</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/d3a45122bba3240d933a2b4275173f7e8a987cfa">Fixed test bug, I hope</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/37d6219887bff444d9f29e38526965ec4cc0687d">Asadpour output for ascent method</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/bcfb0ebcbe552524e44f9c85e353b53b1711e028">Removed branch and bound method. One unit test misbehaving</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/b529389be5263144b5755f8e4589216606e37484">Added asymmetric fractional test for the ascent method</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/c6cedc1f9d53a0c486c0196041188ae1b9c740d4">Removed printn statements and tweaked final test to be more asymmetric</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/b6bec0dada9ff67dc1cf28f5ae0fe3b1df490dc5">Changed HK to only report on the support of the answer</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/837d0448d38936278cfa9fdb7d8cb636eb8552c3">documentation update</a></p>
<h2 id="spanning_tree_distribution"><code>spanning_tree_distribution</code><a class="headerlink" href="#spanning_tree_distribution" title="Link to this heading">#</a></h2>
<p>Once we have the support of the Held Karp relaxation, we calculate edge weights $\gamma$ for the support so that the probability of any tree being sampled is proportional to the product of $e^{\gamma}$ across its edges.
This is called a maximum entropy distribution in the Asadpour paper.
This procedure was included in the Asadpour paper [1] on page 386.</p>
<blockquote>
<ol>
<li>Set $\gamma = \vec{0}$.</li>
<li>While there exists an edge $e$ with $q_e(\gamma) &gt; (1 + \epsilon)z_e$:</li>
</ol>
<ul>
<li>Compute $\delta$ such that if we define $\gamma'$ as $\gamma_e' = \gamma_e - \delta$ and $\gamma_f' = \gamma_f$ for all $f \in E \setminus \{e\}$, then $q_e(\gamma') = (1 + \epsilon / 2)z_e$</li>
<li>Set $\gamma \leftarrow \gamma'$</li>
</ul>
<ol start="3">
<li>Output $\tilde{\gamma} := \gamma$.</li>
</ol>
</blockquote>
<p>Where $q_e(\gamma)$ is the probability that any given edge $e$ will be in a sampled spanning tree chosen with probability proportional to $\exp(\gamma(T))$.
$\delta$ is also given as</p>
<p>$$
\delta = \ln \frac{q_e(\gamma)(1-(1+\epsilon/2)z_e)}{(1-q_e(\gamma))(1+\epsilon/2)z_e}
$$</p>
<p>so the Asadpour paper did almost all of the heavy lifting for this function.
However, they were not very clear on how to calculate $q_e(\gamma)$ other than that Kirchhoff&rsquo;s Matrix Tree Theorem can be used.</p>
<p>My original method for calculating $q_e(\gamma)$ was to apply Kirchhoff&rsquo;s Theorem to the original Laplacian matrix and the Laplacian produced once the edge $e$ is contracted from the graph.
Testing quickly showed that once the edge is contracted from the graph, it cannot affect the value of the Laplacian and thus after subtracting $\delta$ the probability of that edge would increase rather than decrease.
Multiplying my original value of $q_e(\gamma)$ by $\exp(\gamma_e)$ proved to be the solution here for reasons extensively discussed in my blog post <em>The Entropy Distribution</em> and in particular the &ldquo;Update! (28 July 2021)&rdquo; section.</p>
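<p>To make the calculation concrete, here is a small standard-library sketch of the ratio described above: the weighted tree count of the graph with $e$ contracted, multiplied by $\exp(\gamma_e)$, divided by the weighted tree count of the whole graph. The function names are hypothetical, the determinant is computed with simple Gaussian elimination rather than <code>numpy</code>, and $\gamma$ is assumed to be a dictionary with one entry per undirected edge.</p>

```python
from math import exp


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def tree_weight_sum(nodes, edges):
    """Sum over spanning trees of the product of edge weights (Kirchhoff).

    `edges` is a list of (u, v, weight) triples; parallel edges are allowed.
    """
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    lap = [[0.0] * size for _ in range(size)]
    for u, v, w in edges:
        i, j = index[u], index[v]
        lap[i][i] += w
        lap[j][j] += w
        lap[i][j] -= w
        lap[j][i] -= w
    # Any cofactor gives the same value; drop the last row and column.
    return determinant([row[:-1] for row in lap[:-1]])


def edge_probability(nodes, gamma, e):
    """Hypothetical helper: q_e(gamma) for an edge e = (a, b)."""
    weighted = [(u, v, exp(g)) for (u, v), g in gamma.items()]
    total = tree_weight_sum(nodes, weighted)
    # Contract e: relabel b as a and drop the self loops this creates.
    a, b = e
    merged_nodes = [n for n in nodes if n != b]
    merged = []
    for u, v, w in weighted:
        u, v = (a if u == b else u), (a if v == b else v)
        if u != v:
            merged.append((u, v, w))
    return exp(gamma[e]) * tree_weight_sum(merged_nodes, merged) / total


uniform = {(0, 1): 0.0, (1, 2): 0.0, (0, 2): 0.0}
print(round(edge_probability([0, 1, 2], uniform, (0, 1)), 4))  # 0.6667
```

<p>On a triangle with all $\gamma_e = 0$ there are three equally likely spanning trees and each edge appears in two of them, which matches the printed value of two thirds.</p>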
<p><strong>Blog posts about <code>spanning_tree_distribution</code></strong></p>
<p>13 Jul 2021 - <a href="../entropy-distribution-setup">Entropy Distribution Setup</a></p>
<p>20 Jul 2021 - <a href="../entropy-distribution">The Entropy Distribution</a></p>
<p><strong>Commits about <code>spanning_tree_distribution</code></strong></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/da1f5cf688277426575115e3328e16d8f5b29a3c">Draft of spanning_tree_distribution</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/b6bec0dada9ff67dc1cf28f5ae0fe3b1df490dc5">Changed HK to only report on the support of the answer</a> - <em>Needing to limit $\gamma$ to only the support of the Held Karp relaxation is what caused this change</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/0fcf0b3ecfc3704db17830eeeae72a67b4182ffb">Fixed contraction bug by changing to MultiGraph. Problem with prob &gt; 1</a> - <em>Because the probability is only</em> proportional <em>to the product of the edge weights, this was not actually a problem</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/e820d4f921268ff0d55f913624bcd402c90244b2">Black reformats</a> - <em>Rewrote the test and cleaned the code</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/2195002e9394bcb2c47876809cfbbec3c05b1008">Fixed pypi test error</a> - <em>The pypi tests do not have <code>numpy</code> or <code>scipy</code> and I forgot to flag the test to be skipped if they are not available</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/e4cd4f17311e8d908f016cea45f03b1b3e35822e">Further testing of dist fix</a> - <em>Fixed function to multiply $q_e(\gamma)$ by $\exp(\gamma_e)$ and implemented exception if $\delta$ ever misbehaves</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/68f0cf95565bcdce0aec4678e3af9815e23b494e">Can sample spanning trees</a> - <em>Streamlined finding $q_e(\gamma)$ using new helper function</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/837d0448d38936278cfa9fdb7d8cb636eb8552c3">documentation update</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/aef90dfcbb8b8424c6ed887311b4825559d0a398">Implement suggestions from boothby</a></p>
<h2 id="sample_spanning_tree"><code>sample_spanning_tree</code><a class="headerlink" href="#sample_spanning_tree" title="Link to this heading">#</a></h2>
<p>What good is a spanning tree distribution if we can&rsquo;t sample from it?</p>
<p>While the Asadpour paper [1] provides a rough outline of the sampling process, the bulk of their methodology comes from the Kulkarni paper, <em>Generating random combinatorial objects</em> [5].
That paper had a much more detailed explanation and even this pseudocode from page 202.</p>
<blockquote>
<p>$U = \emptyset,$ $V = E$<br>
Do $i = 1$ to $N$;<br>
$\qquad$Let $a = n(G(U, V))$<br>
$\qquad\qquad a' = n(G(U \cup \{i\}, V))$<br>
$\qquad$Generate $Z \sim U[0, 1]$<br>
$\qquad$If $Z \leq \alpha_i \times \left(a' / a\right)$<br>
$\qquad\qquad$then $U = U \cup \{i\}$,<br>
$\qquad\qquad$else $V = V - \{i\}$<br>
$\qquad$end.<br>
Stop. $U$ is the required spanning tree.</p>
</blockquote>
<p>The only real difficulty here was tracking how the nodes were being contracted.
My first attempt was a mess of <code>if</code> statements and the like, but switching it to a merge-find data structure (or disjoint set data structure) proved to be a wise decision.</p>
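<p>As a rough illustration of why a merge-find structure helps, the sketch below relabels the edges of a graph after a set of chosen edges has been contracted. The class and function names are hypothetical and this is not the code from my pull request, just the general idea.</p>

```python
class DisjointSet:
    """Minimal merge-find structure tracking which nodes were contracted."""

    def __init__(self, nodes):
        self.parent = {n: n for n in nodes}

    def find(self, n):
        # Walk to the representative, halving the path on the way up.
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]
            n = self.parent[n]
        return n

    def union(self, u, v):
        self.parent[self.find(u)] = self.find(v)


def contracted_edges(edges, chosen):
    """Hypothetical helper: relabel `edges` after contracting `chosen`.

    Self loops created by the contraction are dropped; parallel edges stay.
    """
    nodes = {n for e in edges for n in e}
    merged = DisjointSet(nodes)
    for u, v in chosen:
        merged.union(u, v)
    relabeled = [(merged.find(u), merged.find(v)) for u, v in edges]
    return [(u, v) for u, v in relabeled if u != v]


edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(contracted_edges(edges, [(0, 1)]))  # [(1, 2), (1, 2), (2, 3)]
```

<p>Each contraction is a single <code>union</code> call, and asking which merged node an original node now belongs to is a single <code>find</code>, so none of the case analysis from the <code>if</code>-statement version is needed.</p>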
<p>Of course, it is one thing to be able to sample a spanning tree and another entirely to know if the sampling technique matches the expected distribution.
My first test for <code>sample_spanning_tree</code> just sampled a large number of trees (50000) and then printed the percent error from the normalized distribution of spanning trees.
With a sample size of 50000 all of the errors were under 10%, but I still wanted to find a better test.</p>
<p>From my AP statistics class in high school I remembered the $\chi^2$ (chi-squared) test and realized that it would be perfect here.
<code>scipy</code> even had the ability to conduct one.
By converting to a chi-squared test I was able to reduce the sample size down to 1200 (near the minimum required sample size to have a valid chi-squared test) and use a proper hypothesis test at the $\alpha = 0.01$ significance level.
Unfortunately, the test would still fail 1% of the time until I added the <code>@py_random_state</code> decorator to <code>sample_spanning_tree</code>, which lets the test pass in a <code>Random</code> object to produce repeatable results.</p>
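<p>The idea behind the test can be sketched with the standard library alone. The actual test used <code>scipy</code>; here the statistic is computed by hand, the names are hypothetical, and the sketch is restricted to three categories (for example, the three spanning trees of a triangle) because the $\chi^2$ distribution with two degrees of freedom has the simple closed-form tail probability $e^{-x/2}$.</p>

```python
import random
from collections import Counter
from math import exp


def chi_squared_statistic(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all categories."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)


def sample_and_test(weights, sample_size, seed):
    """Hypothetical helper: sample categories and test the fit.

    Draws `sample_size` categories with probability proportional to
    `weights` and returns the chi-squared statistic with its p-value.
    The p-value formula exp(-x / 2) is exact only for two degrees of
    freedom, so exactly three categories are required.
    """
    if len(weights) != 3:
        raise ValueError("this sketch only handles three categories")
    rng = random.Random(seed)  # a seeded generator makes the run repeatable
    names = list(weights)
    draws = rng.choices(names, weights=[weights[n] for n in names], k=sample_size)
    observed = Counter(draws)
    total = sum(weights.values())
    expected = {n: sample_size * weights[n] / total for n in names}
    statistic = chi_squared_statistic(observed, expected)
    return statistic, exp(-statistic / 2)


tree_weights = {"tree_a": 2.0, "tree_b": 2.0, "tree_c": 1.0}
first = sample_and_test(tree_weights, 1200, seed=42)
second = sample_and_test(tree_weights, 1200, seed=42)
print(first == second)  # True
```

<p>The null hypothesis is rejected when the p-value falls below the chosen significance level. With an unseeded generator a correct sampler would still be rejected about 1% of the time at $\alpha = 0.01$, which is exactly the flakiness that fixing the seed removes.</p>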
<p><strong>Blog posts about <code>sample_spanning_tree</code></strong></p>
<p>21 Jul 2021 - <a href="../preliminaries-for-sampling-a-spanning-tree">Preliminaries For Sampling A Spanning Tree</a></p>
<p>28 Jul 2021 - <a href="../sampling-a-spanning-tree">Sampling A Spanning Tree</a></p>
<p><strong>Commits about <code>sample_spanning_tree</code></strong></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/68f0cf95565bcdce0aec4678e3af9815e23b494e">Can sample spanning trees</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/3cca2b5bfdf001b1613f8e803f78c9fb380adc59">Developing test for sampling spanning tree</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/274e2c5908f337941ee5234d727fd307257a9b85">Changed sample_spanning_tree test to Chi squared test</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/7ebc6d874ec703a46dfc40f195fa84594bb9582c">Adding test cases</a> - <em>Implemented <code>@py_random_state</code> decorator</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/837d0448d38936278cfa9fdb7d8cb636eb8552c3">documentation update</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<h2 id="asadpour_atsp"><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.approximation.traveling_salesman.asadpour_atsp.html"><code>asadpour_atsp</code></a><a class="headerlink" href="#asadpour_atsp" title="Link to this heading">#</a></h2>
<p>This function was the last piece of the puzzle, connecting all of the others together and producing the final result!</p>
<p>Implementation of this function was actually rather smooth.
The only technical difficulty I had was reading the support of the <code>flow_dict</code> and the theoretical difficulties were adapting the <code>min_cost_flow</code> function to solve the minimum circulation problem.
Oh, and that if the flow is greater than 1 I need to add parallel edges to the graph so that it is still Eulerian.</p>
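<p>The parallel-edge point can be illustrated with a small sketch. It assumes a flow dictionary in the nested form <code>flow_dict[u][v]</code> and, as a simplification, that it only holds the extra flow on edges outside the sampled tree. The helper names are hypothetical.</p>

```python
from collections import Counter


def add_flow_edges(tree_edges, flow_dict):
    """Hypothetical helper: directed multigraph edges of tree plus flow.

    `flow_dict[u][v]` is the integer flow on edge (u, v); an edge carrying
    flow k contributes k parallel copies.
    """
    edges = list(tree_edges)
    for u, targets in flow_dict.items():
        for v, flow in targets.items():
            edges.extend([(u, v)] * flow)
    return edges


def is_balanced(edges):
    """True when every node has equal in-degree and out-degree."""
    out_degree = Counter(u for u, _ in edges)
    in_degree = Counter(v for _, v in edges)
    return out_degree == in_degree


tree = [(0, 1), (0, 2)]
flow = {2: {1: 1}, 1: {0: 2}}
print(is_balanced(add_flow_edges(tree, flow)))  # True
print(is_balanced(tree + [(2, 1), (1, 0)]))  # False
```

<p>A connected directed multigraph has an Eulerian circuit exactly when every node is balanced, so recording the edge from 1 to 0 only once, as in the second call, would leave nodes 0 and 1 unbalanced.</p>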
<p>A brief overview of the whole algorithm is given below:</p>
<ol>
<li>Solve the Held Karp relaxation and symmetrize the result to make it undirected.</li>
<li>Calculate the maximum entropy spanning tree distribution on the Held Karp support graph.</li>
<li>Sample $2 \lceil \ln n \rceil$ spanning trees and record the one with the smallest weight before reintroducing direction to the edges.</li>
<li>Find the minimum cost circulation to create an Eulerian graph containing the sampled tree.</li>
<li>Take the Eulerian walk of that graph and shortcut the answer.</li>
<li>Return the shortcut answer.</li>
</ol>
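<p>Two of the steps above are simple enough to sketch directly: the number of trees to sample in step 3 and the shortcutting in step 5. The function names are hypothetical, and the circuit is assumed to be given as a list of nodes in visiting order.</p>

```python
from math import ceil, log


def number_of_samples(n):
    """Number of spanning trees to sample for a graph with n nodes."""
    return 2 * ceil(log(n))


def shortcut(circuit):
    """Hypothetical helper: skip repeated nodes in an Eulerian circuit."""
    seen = set()
    tour = []
    for node in circuit:
        if node not in seen:
            seen.add(node)
            tour.append(node)
    tour.append(tour[0])  # close the cycle
    return tour


print(number_of_samples(10))  # 6
print(shortcut([0, 1, 2, 1, 3, 0]))  # [0, 1, 2, 3, 0]
```

<p>Shortcutting never increases the cost of the tour as long as the edge weights satisfy the triangle inequality, since each skipped detour is replaced by a direct edge.</p>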
<p><strong>Blog posts about <code>asadpour_atsp</code></strong></p>
<p>29 Jul 2021 - <a href="../looking-at-the-big-picture">Looking At The Big Picture</a></p>
<p>10 Aug 2021 - <a href="../completing-the-asadpour-algorithm">Completing The Asadpour Algorithm</a></p>
<p><strong>Commits about <code>asadpour_atsp</code></strong></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/2c1dc57542cc9651b5443f6015fb94b94bc2f7cd">untested implementation of asadpour_tsp</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/454c82ca61ab4746b57c6681449f8ea08f96d557">Fixed issue reading flow_dict</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/328a4f3b2669fa9890d2c08a4d72f0f9bb7573dc">Fixed runtime errors in asadpour_tsp</a> - <em>The general traveling salesman problem function assumed graphs were undirected. This does not work with an ATSP algorithm</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/1d345054a20a88b3115af900972a0145d708d8b5">black reformats</a> - <em>Fixed parallel edges from flow support bug</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/7ebc6d874ec703a46dfc40f195fa84594bb9582c">Adding test cases</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/837d0448d38936278cfa9fdb7d8cb636eb8552c3">documentation update</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/11fef147246eb3374568515a4b29aeee5a9f469d">One new test and check</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/6db9f7692fc5294ac206fa331242fe679cbfb7d7">Fixed rounding error with tests</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/55688deb9a84bc7a77aecc556a63ff80dc41c56f">Implemented review suggestions from rossbar</a></p>
<h2 id="future-involvement-with-networkx">Future Involvement with NetworkX<a class="headerlink" href="#future-involvement-with-networkx" title="Link to this heading">#</a></h2>
<p>Overall, I really enjoyed this Summer of Code.
I was able to branch out, continue to learn Python and learn more about graphs and graph algorithms, which is an area of interest for me.</p>
<p>Assuming that I have any amount of free time this coming fall semester, I&rsquo;d love to stay involved with NetworkX.
In fact, there are already some things that I have in mind even though my current code works as is.</p>
<ul>
<li>
<p>Move <code>sample_spanning_tree</code> to <code>mst.py</code> and rename it to <code>random_spanning_tree</code>.
The ability to sample random spanning trees is not a part of the greater NetworkX library and could be useful to others.
One of my mentors mentioned it being relevant to <a href="https://en.wikipedia.org/wiki/Steiner_tree_problem">Steiner trees</a> and if I can help other developers and users out, I will.</p>
</li>
<li>
<p>Adapt <code>sample_spanning_tree</code> so that it can use both additive and multiplicative weight functions.
The Asadpour algorithm only needs the multiplicative weight, but the Kulkarni paper [5] does talk about using an additive weight function which may be more useful to other NetworkX users.</p>
</li>
<li>
<p>Move my Kirchhoff&rsquo;s Matrix Tree Theorem helper function to <code>laplacian_matrix.py</code> so that other NetworkX users can access it.</p>
</li>
<li>
<p>Investigate the following article about the Held Karp relaxation.
While I have no definite evidence for this one, I do believe that the Held Karp relaxation is the slowest part of my implementation of the Asadpour algorithm and thus is the best place for improving it.
The ascent method I am using comes from the original Held and Karp paper [3], but they did release a part II which may have better algorithms in it.
The citation is given below.</p>
<p>M. Held, R.M. Karp, <em>The traveling-salesman problem and minimum spanning trees: Part II</em>. Mathematical Programming, 1971, 1(1), p. 6–25. <a href="https://doi.org/10.1007/BF01584070">https://doi.org/10.1007/BF01584070</a></p>
</li>
<li>
<p>Refactor the <code>Edmonds</code> class in <code>branchings.py</code>.
That class is the implementation for Edmonds&rsquo; branching algorithm but uses an iterative approach rather than the recursive one discussed in Edmonds&rsquo; paper [2].
I did also agree to work with another person, <a href="https://github.com/lkora">lkora</a>, to help rework this class and possibly add a <code>minimum_maximal_branching</code> function to find the minimum branching which still connects as many nodes as possible.
This would be analogous to a spanning forest in an undirected graph.
At the moment, neither of us has had time to start such work.
For more information please reference issue <a href="https://github.com/networkx/networkx/issues/4836">#4836</a>.</p>
</li>
</ul>
<p>While there are areas of this problem which I can improve upon, it is important for me to remember that this project was still a complete success.
NetworkX now has an algorithm to approximate the traveling salesman problem in asymmetric or directed graphs.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Completing the Asadpour Algorithm]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="related" type="text/html" title="Looking at the Big Picture" />
                <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/</id>
            
            
            <published>2021-08-10T00:00:00+00:00</published>
            <updated>2021-08-10T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Implementation details for asadpour_atsp</blockquote><p>My implementation of <code>asadpour_atsp</code> is now working!
Recall that my pseudo code for this function from my last post was</p>

<div class="highlight">
  <pre>def asadpour_tsp
    Input: A complete graph G with weight being the attribute key for the edge weights.
    Output: A list of edges which form the approximate ATSP solution.

    z_star = held_karp(G)
    # test to see if z_star is a graph or dict
    if type(z_star) is nx.DiGraph
        return z_star.edges

    z_support = nx.MultiGraph()
    for u, v in z_star
        if not in z_support.edges
            edge_weight = min(G[u][v][weight], G[v][u][weight])
            z_support.add_edge(u, v, weight=edge_weight)
    gamma = spanning_tree_distribution(z_support, z_star)

    for u, v in z_support.edges
        z_support[u][v][lambda] = exp(gamma[(u, v)])

    for _ in range 1 to 2 ceil(log(n))
        sampled_tree = sample_spanning_tree(G)
        sampled_tree_weight = sampled_tree.size()
        if sampled_tree_weight &lt; minimum_sampled_tree_weight
            minimum_sampled_tree = sampled_tree.copy()
            minimum_sampled_tree_weight = sampled_tree_weight

    t_star = nx.DiGraph
    for u, v, d in minimum_sampled_tree.edges(data=weight)
        if d == G[u][v][weight]
            t_star.add_edge(u, v, weight=d)
        else
            t_star.add_edge(v, u, weight=d)

    for n in t_star
        node_demands[n] = t_star.out_degree(n) - t_star.in_degree(n)

    nx.set_node_attributes(G, node_demands)
    flow_dict = nx.min_cost_flow(G)

    for u, v in flow_dict
        if edge not in t_star.edges and flow_dict[u, v] &gt; 0
            t_star.add_edge(u, v)
    eulerian_circuit = nx.eulerian_circuit(t_star)
    return _shortcutting(eulerian_circuit)</pre>
</div>

<p>And this was more or less correct.
A few issues were present, as they always were going to be.</p>
<p>First, my largest issue came from a part of a word being in parentheses in the Asadpour paper on page 385.</p>
<blockquote>
<p>This integral circulation $f^*$ corresponds to a directed (multi)graph $H$ which contains $\vec{T}^*$.</p>
</blockquote>
<p>Basically if the minimum flow is ever larger than 1 along an edge, I need to add that many parallel edges in order to ensure that everything is still Eulerian.
This became a problem quickly while developing my test cases, as shown in the example below.</p>
<center><img src="example-multiflow.png" alt="Example of correct and incorrect circulation from the directed spanning tree"/></center>
<p>As you can see, for the incorrect circulation, vertices 2 and 3 are not Eulerian as their in and out degrees do not match.</p>
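<p>A minimal sketch of why the parallel edges matter, using only the standard library rather than the NetworkX code itself. The tree arcs and flow values are invented for illustration and are not the ones in the figure, and <code>build_arcs</code> and <code>is_balanced</code> are hypothetical helper names.</p>

```python
from collections import Counter


def build_arcs(tree_arcs, flow, parallel=True):
    """Return the arcs of the circulation: tree arcs plus arcs carrying flow.

    With parallel=True an arc carrying k units of flow is added k times.
    With parallel=False it is added at most once, which is the mistake
    described above.
    """
    arcs = list(tree_arcs)
    for arc, units in flow.items():
        copies = units if parallel else min(units, 1)
        arcs.extend([arc] * copies)
    return arcs


def is_balanced(arcs):
    """True when every vertex has equal in and out degree.

    This only checks the degree condition; connectivity is assumed to come
    from the spanning tree.
    """
    out_deg = Counter(u for u, _ in arcs)
    in_deg = Counter(v for _, v in arcs)
    return all(out_deg[n] == in_deg[n] for n in set(out_deg) | set(in_deg))


# Invented example: an oriented spanning tree on four vertices and a flow
# that needs two units along the arc (3, 0).
tree = [(0, 1), (0, 2), (1, 3)]
flow = {(2, 3): 1, (3, 0): 2}

print(is_balanced(build_arcs(tree, flow)))                  # parallel arcs kept
print(is_balanced(build_arcs(tree, flow, parallel=False)))  # parallel arcs dropped
```

<p>Dropping the second copy of the arc carrying two units of flow leaves its endpoints with mismatched in and out degrees, so no Eulerian circuit exists.</p>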
<p>All of the others were just minor points where the pseudo code didn&rsquo;t directly translate into Python (because, after all, it isn&rsquo;t Python).</p>
<h2 id="understanding-the-output">Understanding the Output<a class="headerlink" href="#understanding-the-output" title="Link to this heading">#</a></h2>
<p>The first thing I did once <code>asadpour_atsp</code> was working was take the fractional, symmetric Held Karp relaxation test graph and run it through the general <code>traveling_salesman_problem</code> function.
Since there are random numbers involved here, the results were always within the $O(\log n / \log \log n)$ approximation factor but differed from run to run.
Three examples are shown below.</p>
<center><img src="example-tours.png" alt="Three possible ATSP tours on an example graph"/></center>
<p>The first thing we want to check is the approximation ratio.
We know that the minimum cost output of the <code>traveling_salesman_problem</code> function is 304 (this is actually lower than the optimal tour in the undirected version; more on this later).
Next we need to know what our maximum approximation factor is.
Now, the Asadpour algorithm is $O(\log n / \log \log n)$ which for our six vertex graph would be $\ln(6) / \ln(\ln(6)) \approx 3.0723$.
However, on page 386 they give the coefficients of the approximation as $(2 + 8 \log n / \log \log n)$ which would be $2 + 8 \times \ln(6) / \ln(\ln(6)) \approx 26.5784$.
(Remember that all $\log$&rsquo;s in the Asadpour paper refer to the natural logarithm.)
All of our examples are well below even the lower limit.</p>
<p>For example 1:</p>
<p>$$
\begin{array}{r l}
\text{actual}: &amp; 504 \\\
\text{expected}: &amp; 304 \\\
\text{approx. factor}: &amp; \frac{504}{304} \approx 1.6579 &lt; 3.0723
\end{array}
$$</p>
<p>Example 2:</p>
<p>$$
\begin{array}{r l}
\text{actual}: &amp; 404 \\\
\text{expected}: &amp; 304 \\\
\text{approx. factor}: &amp; \frac{404}{304} \approx 1.3289 &lt; 3.0723
\end{array}
$$</p>
<p>Example 3:</p>
<p>$$
\begin{array}{r l}
\text{actual}: &amp; 304 \\\
\text{expected}: &amp; 304 \\\
\text{approx. factor}: &amp; \frac{304}{304} = 1.0000 &lt; 3.0723
\end{array}
$$</p>
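<p>The arithmetic above can be reproduced with a few lines of Python. This is only a check of the numbers quoted in this post; the variable names are my own.</p>

```python
import math

n = 6  # vertices in the test graph

# All logs in the Asadpour paper are natural logarithms.
log_ratio = math.log(n) / math.log(math.log(n))
order_bound = log_ratio          # the O(log n / log log n) term on its own
paper_bound = 2 + 8 * log_ratio  # with the coefficients from page 386

optimal = 304
ratios = {cost: cost / optimal for cost in (504, 404, 304)}

print(f"order bound: {order_bound:.4f}, paper bound: {paper_bound:.4f}")
for cost, ratio in ratios.items():
    print(f"tour cost {cost}: ratio {ratio:.4f}, within bound: {order_bound >= ratio}")
```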
<p>At this point, you&rsquo;ve probably noticed that the examples given are, strictly speaking, <em>not</em> Hamiltonian cycles: they visit vertices multiple times.
This is because the graph we have is not complete.
The Asadpour algorithm only works on complete graphs, so the <code>traveling_salesman_problem</code> function finds the shortest cost path between every pair of vertices and inserts the missing edges.
In fact, if the <code>asadpour_atsp</code> function is given an incomplete graph, it will raise an exception.
Take example three, since there is only one repeated vertex, 5.</p>
<p>Behind the scenes, the graph is complete and the solution may contain the dashed edge in the image below.</p>
<center><img src="complete-bypass.png" alt="Reversing an edge bypass to translate the TSP back to the original graph"/></center>
<p>But that edge is not in the original graph, so during the post-processing done by the <code>traveling_salesman_problem</code> function, the red edges are inserted instead of the dashed edge.</p>
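<p>A rough sketch of that completion and post-processing idea, assuming a plain Floyd-Warshall pass in place of whatever shortest path routine <code>traveling_salesman_problem</code> actually uses. The three vertex graph and the helper names <code>complete_with_shortest_paths</code> and <code>expand_tour</code> are hypothetical.</p>

```python
import itertools


def complete_with_shortest_paths(nodes, weights):
    """All-pairs shortest costs plus the node sequence behind each cost.

    `weights` maps the arcs (u, v) that exist in the original graph to their
    weight. Uses Floyd-Warshall, so k must be the outermost loop variable.
    """
    dist = {(u, v): float("inf") for u in nodes for v in nodes}
    path = {}
    for (u, v), w in weights.items():
        dist[u, v] = w
        path[u, v] = [u, v]
    for k, u, v in itertools.product(nodes, repeat=3):
        if u != v and dist[u, v] > dist[u, k] + dist[k, v]:
            dist[u, v] = dist[u, k] + dist[k, v]
            path[u, v] = path[u, k] + path[k, v][1:]
    return dist, path


def expand_tour(tour, path):
    """Replace each hop of the tour by the original-graph path it stands for."""
    expanded = [tour[0]]
    for u, v in zip(tour, tour[1:]):
        expanded.extend(path[u, v][1:])
    return expanded


# Invented graph: a path 0 - 1 - 2 with no direct arc between 0 and 2.
nodes = [0, 1, 2]
weights = {(0, 1): 1, (1, 0): 1, (1, 2): 1, (2, 1): 1}
dist, path = complete_with_shortest_paths(nodes, weights)

# A tour found on the completed graph uses the bypass arc (2, 0).
print(expand_tour([0, 1, 2, 0], path))
```

<p>The expanded tour revisits vertex 1, which is the same effect that makes the example tours above visit some vertices more than once.</p>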
<h2 id="testing-the-asadpour-algorithm">Testing the Asadpour Algorithm<a class="headerlink" href="#testing-the-asadpour-algorithm" title="Link to this heading">#</a></h2>
<p>Before I could write any tests, I needed to ensure that the tests were consistent from execution to execution.
At the time, this was not the case since there were random numbers being generated in order to sample the spanning trees.
So I had to learn how to use the <code>@py_random_state</code> decorator.</p>
<p>When this decorator is added to the top of a function, we pass it either the position of the argument in the function signature or the name of the keyword for that argument.
It then takes that argument and configures a Python <code>Random</code> object based on the input parameter.</p>
<ul>
<li>Parameter is <code>None</code>, use a new <code>Random</code> object.</li>
<li>Parameter is an <code>int</code>, use a new <code>Random</code> object with that seed.</li>
<li>Parameter is a <code>Random</code> object, use that object as is.</li>
</ul>
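<p>The three rules can be written out as a small standard library function. This is a stand-in that mirrors the list above, not the NetworkX source, and <code>resolve_random_state</code> is a hypothetical name.</p>

```python
import random


def resolve_random_state(seed=None):
    """Turn the `seed` argument into a random.Random object."""
    if seed is None:
        return random.Random()      # new generator, seeded by the system
    if isinstance(seed, int):
        return random.Random(seed)  # new generator with a fixed seed
    if isinstance(seed, random.Random):
        return seed                 # the caller's generator, used as is
    raise ValueError(f"cannot build a Random object from {seed!r}")


# Two generators built from the same int produce the same first number.
print(resolve_random_state(56).random() == resolve_random_state(56).random())
```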
<p>So I changed the function signature of <code>sample_spanning_tree</code> to have <code>random=None</code> at the end.
For most use cases, the default value will not be changed and the results will be different every time the method is called.
For my tests, however, I can give it an <code>int</code> seed so that the same tree is sampled every time, which creates repeatable behaviour.
Since the <code>sample_spanning_tree</code> function is not visible outside of the <code>traveling_salesman</code> file, I also had to create a pass-through parameter for <code>asadpour_atsp</code> so that my seed could have any effect.</p>
<p>Once this was done, I modified the test for <code>sample_spanning_tree</code> so that it would not have a 1 in 100 chance of spontaneously failing.
At first I just passed it an <code>int</code>, but that forced every tree sampled to be the same (since the edges were shuffled the same and sampled from the same sequence of numbers) and the test failed.
So I tweaked it to use a <code>Random</code> object from the <code>random</code> module and this worked well.</p>
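<p>The difference between the two kinds of seed can be seen with a toy sampler. <code>sample_edge</code> below is a hypothetical stand-in for <code>sample_spanning_tree</code>; it only shuffles a list of edges and returns the first one.</p>

```python
import random


def sample_edge(edges, seed=None):
    """Toy sampler: shuffle the edges and return the first one."""
    rng = seed if isinstance(seed, random.Random) else random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    return shuffled[0]


edges = [(u, v) for u in range(6) for v in range(u + 1, 6)]

# An int builds a freshly seeded generator on every call, so every draw
# is identical.
int_draws = [sample_edge(edges, 56) for _ in range(20)]

# A shared Random object keeps advancing between calls, so the draws vary,
# yet the whole sequence is still reproducible from the initial seed.
rng = random.Random(56)
shared_draws = [sample_edge(edges, rng) for _ in range(20)]

print(len(set(int_draws)), len(set(shared_draws)))
```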
<p>From here, I wrap the complete <code>asadpour_atsp</code> parameters I want in another function <code>fixed_asadpour</code> like this:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">fixed_asadpour</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">nx_app</span><span class="o">.</span><span class="n">asadpour_atsp</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="p">,</span> <span class="mi">56</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">path</span> <span class="o">=</span> <span class="n">nx_app</span><span class="o">.</span><span class="n">traveling_salesman_problem</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">,</span> <span class="n">cycle</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">method</span><span class="o">=</span><span class="n">fixed_asadpour</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
<p>I tested using both <code>traveling_salesman_problem</code> and <code>asadpour_atsp</code>.
The tests included:</p>
<ul>
<li>The fractional, symmetric Held Karp graph from above.</li>
<li>A real world example using airline prices between six cities (also uses non-integer node names).</li>
<li>The same real world example but asking for a path not a cycle.</li>
<li>Using a disconnected graph (raises exception).</li>
<li>Using an incomplete graph (raises exception).</li>
<li>Using an integral Held Karp solution (returns directly after Held Karp with exact solution).</li>
<li>Using an impossible graph (one vertex has only out edges).</li>
</ul>
<h2 id="bonus-feature">Bonus Feature<a class="headerlink" href="#bonus-feature" title="Link to this heading">#</a></h2>
<p>There is even a bonus feature!
The <code>asadpour_atsp</code> function accepts a fourth argument, <code>source</code>!
Since both of the return paths use the <code>eulerian_circuit</code> and <code>_shortcutting</code> functions, I can pass a <code>source</code> vertex to the circuit function and ensure that the returned path starts at and returns to the desired vertex.</p>
<p>Access it by wrapping the method; just be sure that the source vertex is in the graph to avoid an exception.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">fixed_asadpour</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">nx_app</span><span class="o">.</span><span class="n">asadpour_atsp</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="p">,</span> <span class="n">source</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">path</span> <span class="o">=</span> <span class="n">nx_app</span><span class="o">.</span><span class="n">traveling_salesman_problem</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">,</span> <span class="n">cycle</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">method</span><span class="o">=</span><span class="n">fixed_asadpour</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
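<p>To see why the starting vertex of the circuit decides the starting vertex of the tour, here is a sketch of the shortcutting idea. The private <code>_shortcutting</code> helper is not shown in this post, so <code>shortcut</code> below is my own stand-in and the circuit is invented.</p>

```python
def shortcut(circuit_nodes):
    """Keep the first visit to each vertex, then return to the start."""
    seen = set()
    tour = []
    for node in circuit_nodes:
        if node not in seen:
            seen.add(node)
            tour.append(node)
    tour.append(tour[0])  # close the cycle at the starting vertex
    return tour


# Node sequence of an Eulerian circuit that starts at the source vertex 0.
circuit = [0, 1, 2, 1, 3, 0]
print(shortcut(circuit))
```

<p>Because the first vertex of the circuit is never skipped, starting the Eulerian circuit at <code>source</code> is enough to make the final tour start and end there.</p>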
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An O (log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, SODA ’10, Society for Industrial and Applied Mathematics, 2010, <a href="https://dl.acm.org/doi/abs/10.5555/1873601.1873633">https://dl.acm.org/doi/abs/10.5555/1873601.1873633</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC'21: Quarter Progress]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_quarter/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_prequarter/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Pre-Quarter Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_midterm/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Mid-Term Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_introduction/?utm_source=atom_feed" rel="related" type="text/html" title="Aitik Gupta joins as a Student Developer under GSoC&#39;21" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2020_final_work_product/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2020 Work Product - Baseline Images Problem" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_5/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 3 Blog 1" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_2021_quarter/</id>
            
            
            <published>2021-08-03T18:48:00+05:30</published>
            <updated>2021-08-03T18:48:00+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Quarter Progress with Google Summer of Code 2021 project under NumFOCUS: Aitik Gupta</blockquote><p><strong>“<ins>Matplotlib, I want 多个汉字 in between my text.</ins>”</strong></p>
<p>Let&rsquo;s say you asked Matplotlib to render a plot with some label containing 多个汉字 (multiple Chinese characters) in between your English text.</p>
<p>Or conversely, let&rsquo;s say you use a Chinese font with Matplotlib, but you had English text in between (which is quite common).</p>
<blockquote>
<p>Assumption: the Chinese font doesn&rsquo;t have those English glyphs, and vice versa</p>
</blockquote>
<p>With this short writeup, I&rsquo;ll talk about what a migration from a font-first to a text-first approach in Matplotlib looks like, which ideally solves the above problem.</p>
<h3 id="have-the-fonts">Have the fonts?<a class="headerlink" href="#have-the-fonts" title="Link to this heading">#</a></h3>
<p>Logically, the very first step to solving this would be to ask whether you <em>have</em> multiple fonts, right?</p>
<p>Matplotlib doesn&rsquo;t ship <a href="https://en.wikipedia.org/wiki/List_of_CJK_fonts">CJK</a> (Chinese Japanese Korean) fonts, which ideally contain these Chinese glyphs. It does try to cover most grounds with the <a href="https://matplotlib.org/stable/users/dflt_style_changes.html#normal-text">default font</a> it ships with, however.</p>
<p>So if you don&rsquo;t have a font to render your Chinese characters, go ahead and install one! Matplotlib will find your installed fonts (after rebuilding the cache, that is).</p>
<h3 id="parse-the-fonts">Parse the fonts<a class="headerlink" href="#parse-the-fonts" title="Link to this heading">#</a></h3>
<p>This is where things get interesting, and what my <a href="../gsoc_2021_prequarter/">previous writeup</a> was all about..</p>
<blockquote>
<p>Parsing the whole family to get multiple fonts for given font properties</p>
</blockquote>
<h2 id="ft2font-magic">FT2Font Magic!<a class="headerlink" href="#ft2font-magic" title="Link to this heading">#</a></h2>
<p>To give you an idea about how things used to work for Matplotlib:</p>
<ol>
<li>A single font was chosen <em>at draw time</em>
(fixed: re <a href="../gsoc_2021_prequarter/">previous writeup</a>)</li>
<li>Every character displayed in your document was rendered by only that font
(partially fixed: re <ins><em>this writeup</em></ins>)</li>
</ol>
<blockquote>
<p>FT2Font is a matplotlib-to-font module, which provides a high-level Python API to interact with a <em>single font&rsquo;s operations</em> like read/draw/extract/etc.</p>
</blockquote>
<p>Being written in C++, the module needs wrappers around it to be converted into a <a href="https://docs.python.org/3/extending/extending.html">Python extension</a> using Python&rsquo;s C-API.</p>
<blockquote>
<p>It allows us to use C++ functions directly from Python!</p>
</blockquote>
<p>So wherever you see a use of font within the library (by library I mean the readable Python codebase XD), you could have derived that:</p>

<div class="highlight">
  <pre>FT2Font === SingleFont</pre>
</div>

<p>Things are a bit different now, however..</p>
<h2 id="designing-a-multi-font-system">Designing a multi-font system<a class="headerlink" href="#designing-a-multi-font-system" title="Link to this heading">#</a></h2>
<p>FT2Font is basically itself a wrapper around a library called <a href="https://www.freetype.org/">FreeType</a>, which is a freely available software library to render fonts.</p>
<p align="center">
    <figure>
        <img src="https://user-images.githubusercontent.com/43996118/128352387-76a3f52a-20fc-4853-b624-0c91844fc785.png" alt="FT2Font Naming" />
        <figcaption style="text-align: center; font-style: italic;">How FT2Font was named</figcaption>
    </figure>
</p>
<p>In my initial proposal.. while looking around how FT2Font is structured, I figured:</p>

<div class="highlight">
  <pre>Oh, looks like all we need are Faces!</pre>
</div>

<blockquote>
<p>If you don&rsquo;t know what faces/glyphs/ligatures are, head over to why <a href="https://gankra.github.io/blah/text-hates-you/">Text Hates You</a>. I can guarantee you&rsquo;ll definitely enjoy some real life examples of why text rendering is hard. 🥲</p>
</blockquote>
<p>Anyway, if you already know what Faces are, it might strike you:</p>
<p>If we already have all the faces we need from multiple fonts (let&rsquo;s say we created a child of FT2Font.. which only <ins>tracks the faces</ins> for its families), we should be able to render everything from that parent FT2Font right?</p>
<p>As I later figured out while finding segfaults in implementing this design:</p>

<div class="highlight">
  <pre>Each FT2Font is linked to a single FT_Library object!</pre>
</div>

<p>If you try to load the face/glyph/character (basically anything) from a different FT2Font object.. you&rsquo;ll run into serious segfaults. (because one object linked to an <code>FT_Library</code> can&rsquo;t really access another object which has its own <code>FT_Library</code>)</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1">// face is linked to FT2Font; which is
</span></span></span><span class="line"><span class="cl"><span class="c1">// linked to a single FT_Library object
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="n">FT_Face</span> <span class="n">face</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">get_face</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="n">FT_Get_Glyph</span><span class="p">(</span><span class="n">face</span><span class="o">-&gt;</span><span class="n">glyph</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">placeholder</span><span class="p">);</span> <span class="c1">// works like a charm
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>
</span></span><span class="line"><span class="cl"><span class="c1">// somehow get another FT2Font&#39;s face
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="n">FT_Face</span> <span class="n">family_face</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">get_family_member</span><span class="p">()</span><span class="o">-&gt;</span><span class="n">get_face</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="n">FT_Get_Glyph</span><span class="p">(</span><span class="n">family_face</span><span class="o">-&gt;</span><span class="n">glyph</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">placeholder</span><span class="p">);</span> <span class="c1">// segfaults!
</span></span></span></code></pre>
</div>
<p>Realizing this took a good amount of time! After this I quickly came up with a recursive approach, wherein we:</p>
<ol>
<li>Create a list of FT2Font objects within Python, and pass it down to FT2Font</li>
<li>FT2Font will hold pointers to its families via a <br>
<code>std::vector&lt;FT2Font *&gt; fallback_list</code></li>
<li>Find if the character we want is available in the current font
<ol>
<li>If the character is available, use that FT2Font to render that character</li>
<li>If the character isn&rsquo;t found, go to step 3 again, but now iterate through the <code>fallback_list</code></li>
</ol>
</li>
<li>That&rsquo;s it!</li>
</ol>
<p>A quick overhaul of the above piece of code^</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kt">bool</span> <span class="nf">ft_get_glyph</span><span class="p">(</span><span class="n">FT_Glyph</span> <span class="o">&amp;</span><span class="n">placeholder</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">	<span class="n">FT_Error</span> <span class="n">not_found</span> <span class="o">=</span> <span class="n">FT_Get_Glyph</span><span class="p">(</span><span class="k">this</span><span class="o">-&gt;</span><span class="n">get_face</span><span class="p">()</span><span class="o">-&gt;</span><span class="n">glyph</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">placeholder</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">	<span class="k">if</span> <span class="p">(</span><span class="n">not_found</span><span class="p">)</span> <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">	<span class="k">else</span> <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// within driver code
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="k">for</span> <span class="p">(</span><span class="n">uint</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o">&lt;</span><span class="n">fallback_list</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">	<span class="c1">// iterate through all FT2Font objects
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>	<span class="kt">bool</span> <span class="n">was_found</span> <span class="o">=</span> <span class="n">fallback_list</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">-&gt;</span><span class="n">ft_get_glyph</span><span class="p">(</span><span class="n">placeholder</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">	<span class="k">if</span> <span class="p">(</span><span class="n">was_found</span><span class="p">)</span> <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span></span></span></code></pre>
</div>
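<p>The same fallback loop, sketched in Python for brevity rather than in the C++ of the real module. <code>ToyFont</code> and <code>pick_fonts</code> are hypothetical names, and a font is reduced here to the set of characters it covers; the actual FT2Font lookup goes through FreeType.</p>

```python
class ToyFont:
    """Stand-in for an FT2Font: a name plus the characters it can render."""

    def __init__(self, name, charset):
        self.name = name
        self.charset = set(charset)

    def has_char(self, char):
        return char in self.charset


def pick_fonts(text, fallback_list):
    """Pair each character with the first font in the list that covers it."""
    chosen = []
    for char in text:
        font = next((f for f in fallback_list if f.has_char(char)), None)
        chosen.append((char, font.name if font else None))
    return chosen


latin = ToyFont("latin", "abcdefghijklmnopqrstuvwxyz ")
cjk = ToyFont("cjk", "多个汉字")

# Characters missing from the first font fall through to the next one.
print(pick_fonts("a 多 b", [latin, cjk]))
```

<p>The order of <code>fallback_list</code> matters: a character covered by several fonts is always rendered by the earliest one.</p>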
<p>With the idea surrounding this implementation, the <a href="https://matplotlib.org/stable/api/backend_agg_api.html">Agg backend</a> is able to render a document (either through a GUI or as a PNG) with multiple fonts!</p>
<p align="center">
    <figure>
        <img src="https://user-images.githubusercontent.com/43996118/128347495-1f4f858d-33d3-4119-8732-5b26c4e9ca2a.png" alt="ChineseInBetween" />
        <figcaption style="text-align: center; font-style: italic;">PNG straight outta Matplotlib!</figcaption>
    </figure>
</p>
<h2 id="python-c-api-is-hard-at-first">Python C-API is hard, at first!<a class="headerlink" href="#python-c-api-is-hard-at-first" title="Link to this heading">#</a></h2>
<p>I&rsquo;ve spent days at Python C-API&rsquo;s <a href="https://docs.python.org/3/c-api/arg.html">argument doc</a>, and it&rsquo;s hard to get what you need at first, ngl.</p>
<p>But, with the help of some amazing people in the GSoC community (<a href="https://srijan-paul.github.io/">@srijan-paul</a>, <a href="https://atharvaraykar.me/">@atharvaraykar</a>) and amazing mentors, blockers begone!</p>
<h2 id="so-are-we-done">So are we done?<a class="headerlink" href="#so-are-we-done" title="Link to this heading">#</a></h2>
<p>Oh no. XD</p>
<p>Things work just fine for the Agg backend, but to generate a PDF/PS/SVG with multiple fonts is another story altogether! I think I&rsquo;ll save that for later.</p>
<p align="center">
    <figure>
        <img src="https://user-images.githubusercontent.com/43996118/128350093-13695b91-5ad2-4f96-91f5-8373ee7a189e.gif" alt="ThankYouDwight" />
        <figcaption style="text-align: center; font-style: italic;">If you've been following the progress so far, mayn you're awesome!</figcaption>
    </figure>
</p>
<h4 id="note-this-blog-post-is-also-available-at-my-personal-website">NOTE: This blog post is also available at my <a href="https://aitikgupta.github.io/gsoc-quarter/">personal website</a>.<a class="headerlink" href="#note-this-blog-post-is-also-available-at-my-personal-website" title="Link to this heading">#</a></h4>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Looking at the Big Picture]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/</id>
            
            
            <published>2021-07-29T00:00:00+00:00</published>
            <updated>2021-07-29T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Preliminaries for the final Asadpour algorithm function in NetworkX</blockquote><p>Well, we&rsquo;re finally at the point in this GSoC project where the end is glimmering on the horizon.
I have completed the Held Karp relaxation, generating a spanning tree distribution and now sampling from that distribution.
That means that it is time to start thinking about how to link these separate components into one algorithm.</p>
<p>Recall that from the Asadpour paper the overview of the algorithm is</p>
<blockquote>
<hr>
<p><strong>Algorithm 1</strong> An $O(\log n / \log \log n)$-approximation algorithm for the ATSP</p>
<hr>
<p><strong>Input:</strong> A set $V$ consisting of $n$ points and a cost function $c\ :\ V \times V \rightarrow \mathbb{R}^+$ satisfying the triangle inequality.</p>
<p><strong>Output:</strong> $O(\log n / \log \log n)$-approximation of the asymmetric traveling salesman problem instance described by $V$ and $c$.</p>
<ol>
<li>Solve the Held-Karp LP relaxation of the ATSP instance to get an optimum extreme point solution $x^*$.
Define $z^*$ as in (5), making it a symmetrized and scaled down version of $x^*$.
Vector $z^*$ can be viewed as a point in the spanning tree polytope of the undirected graph on the support of $x^*$ that one obtains after disregarding the directions of arcs (See Section 3.)</li>
<li>Let $E$ be the support graph of $z^*$ when the direction of the arcs are disregarded.
Find weights $\{\tilde{\gamma}_e\}_{e \in E}$ such that the exponential distribution on the spanning trees, $\tilde{p}(T) \propto \exp(\sum_{e \in T} \tilde{\gamma}_e)$ (approximately) preserves the marginals imposed by $z^*$, i.e. for any edge $e \in E$,
<center>$\sum_{T \in \mathcal{T} : T \ni e} \tilde{p}(T) \leq (1 + \epsilon) z^*_e$,</center>
for a small enough value of $\epsilon$.
(In this paper we show that $\epsilon = 0.2$ suffices for our purpose. See Section 7 and 8 for a description of how to compute such a distribution.)
</li>
<li>Sample $2\lceil \log n \rceil$ spanning trees $T_1, \dots, T_{2\lceil \log n \rceil}$ from $\tilde{p}(.)$.
For each of these trees, orient all its edges so as to minimize its cost with respect to our (asymmetric) cost function $c$.
Let $T^*$ be the tree whose resulting cost is minimal among all of the sampled trees.</li>
<li>Find a minimum cost integral circulation that contains the oriented tree $\vec{T}^*$.
Shortcut this circulation to a tour and output it. (See Section 4.)</li>
</ol>
<hr>
</blockquote>
<p>We are now firmly in the steps 3 and 4 area.
Going all the way back to my post on 24 May 2021 titled <a href="../networkx-function-stubs">Networkx Function stubs</a> the only function left is <code>asadpour_tsp</code>, the main function which needs to accomplish this entire algorithm.
But before we get to creating pseudo code for it there is still step 4 which needs a thorough examination.</p>
<h2 id="circulation-and-shortcutting">Circulation and Shortcutting<a class="headerlink" href="#circulation-and-shortcutting" title="Link to this heading">#</a></h2>
<p>Once we have sampled enough spanning trees from the graph and converted the minimum one into $\vec{T}^*$ we need to find the minimum cost integral circulation in the graph which contains $\vec{T}^*$.
While NetworkX has a minimum cost flow function, namely <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.flow.min_cost_flow.html"><code>min_cost_flow</code></a>, it is not suitable for the Asadpour algorithm out of the box.
The problem here is that we do not have node demands, we have edge demands.
However, after some reading and discussion with one of my mentors Dan, we can convert the current problem into one which can be solved using the <code>min_cost_flow</code> function.</p>
<p>The problem that we are trying to solve is called the minimum cost circulation problem and the one which <code>min_cost_flow</code> is able to solve is the, well, minimum cost flow problem.
As it happens, these are equivalent problems, so I can convert the minimum cost circulation into a minimum cost flow problem by transforming the minimum edge demands into node demands.</p>
<p>Recall that at this point we have a directed minimum sampled spanning tree $\vec{T}^*$ and that the flow through each of the edges in $\vec{T}^*$ needs to be at least one.
From the perspective of a flow problem, $\vec{T}^*$ is moving some flow around the graph.
However, in order to augment $\vec{T}^*$ into an Eulerian graph so that we can walk it, we need to counteract this flow so that the net flow for each node is 0 $(f(\delta^+(v)) = f(\delta^-(v))$ in the Asadpour paper).</p>
<p>So, we find the net flow of each node and then assign its demand to be the negative of that number so that the flow will balance at the node in question.
If the total flow at any node $i$ is $\delta^+(i) - \delta^-(i)$ then the demand we assign to that node is $\delta^-(i) - \delta^+(i)$.
Once we assign the demands to the nodes we can temporarily ignore the edge lower capacities to find the minimum flow.</p>
<p>For more information on the conversion process, please see [2].</p>
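<p>To make the demand assignment concrete, here is a minimal sketch in plain Python (no NetworkX, so it runs on its own). The function name <code>node_demands</code> and the four-node tree are made up for illustration. It uses the sign from the pseudo code further down this post, out-degree minus in-degree, which matches the NetworkX convention that a positive demand is flow a node wants to receive.</p>

```python
from collections import defaultdict


def node_demands(tree_arcs):
    """Map each node to its out-degree minus its in-degree in the oriented tree."""
    demands = defaultdict(int)
    for u, v in tree_arcs:
        demands[u] += 1  # one arc leaves u
        demands[v] -= 1  # one arc enters v
    return dict(demands)


# Hypothetical oriented tree: 0 -> 1, 1 -> 2 and 1 -> 3
t_star_arcs = [(0, 1), (1, 2), (1, 3)]
demands = node_demands(t_star_arcs)
print(demands)
```

<p>The demands always sum to zero because every arc adds one to its tail and subtracts one from its head, which is what a feasible flow problem requires.</p>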
<p>After the minimum flow is found, we take the support of the flow and add it to the $\vec{T}^*$ to create a multigraph $H$.
Now we know that $H$ is weakly connected (it contains $\vec{T^*}$) and that it is Eulerian because for every node the in-degree is equal to the out-degree.
A closed eulerian walk or eulerian circuit can be found in this graph with <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.euler.eulerian_circuit.html"><code>eulerian_circuit</code></a>.</p>
<p>Here is an example of this process on a simple graph.
I suspect that the flow will not always be the back edges from the spanning tree and that the only reason that is the case here is due to the small number of vertices.</p>
<center><img src="example-min-flow.png" alt="Example of finding the minimum flow on a directed spanning tree"/></center>
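<p>The Eulerian condition on $H$ can be checked by counting degrees. The sketch below is only an illustration: the arcs in <code>tree</code> and <code>flow_support</code> are assumed values rather than the ones in the figure, and <code>is_balanced</code> is a hypothetical helper, not a NetworkX function.</p>

```python
from collections import Counter


def is_balanced(arcs):
    """Return True when every node has equal in-degree and out-degree."""
    out_degree = Counter(u for u, _ in arcs)
    in_degree = Counter(v for _, v in arcs)
    nodes = set(out_degree) | set(in_degree)
    return all(out_degree[n] == in_degree[n] for n in nodes)


# Assumed oriented tree and an assumed flow support that cancels its imbalance
tree = [(0, 1), (1, 2), (1, 3)]
flow_support = [(2, 1), (3, 0)]

tree_alone = is_balanced(tree)
tree_plus_flow = is_balanced(tree + flow_support)
print(tree_alone, tree_plus_flow)
```

<p>The tree by itself is never balanced, since its leaves have degree one, which is why the flow arcs have to be added before looking for a circuit.</p>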
<p>Finally, we take the eulerian circuit and shortcut it.
On the plus side, the shortcutting process is the same as in the Christofides algorithm, so it already exists as the <code>_shortcutting</code> helper function in the traveling salesman file.
This is really where it is critical that the triangle inequality holds so that the shortcutting cannot increase the cost of the circulation.</p>
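<p>As a rough picture of what shortcutting does, the following stand-alone sketch walks a circuit and skips every node it has already visited. It is a stand-in written for this post, not the actual <code>_shortcutting</code> helper, and the circuit is an assumed example.</p>

```python
def shortcut(circuit_arcs):
    """Turn an Eulerian circuit, given as a list of arcs, into a tour."""
    start = circuit_arcs[0][0]
    tour = [start]
    seen = {start}
    for _, v in circuit_arcs:
        if v not in seen:  # skip nodes the walk has already visited
            seen.add(v)
            tour.append(v)
    tour.append(start)  # return to the start to close the cycle
    return tour


# Assumed Eulerian circuit that passes through node 1 twice
circuit = [(0, 1), (1, 2), (2, 1), (1, 3), (3, 0)]
tour = shortcut(circuit)
print(tour)
```

<p>Here the walk 2 → 1 → 3 is replaced by the direct arc 2 → 3, and the triangle inequality is what guarantees that this replacement is no more expensive.</p>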
<h2 id="pseudo-code-for-asadpour_tsp">Pseudo code for asadpour_tsp<a class="headerlink" href="#pseudo-code-for-asadpour_tsp" title="Link to this heading">#</a></h2>
<p>Let&rsquo;s start with the function signature.</p>

<div class="highlight">
  <pre>def asadpour_tsp
    Input: A complete graph G with weight being the attribute key for the edge weights.
    Output: A list of edges which form the approximate ATSP solution.</pre>
</div>

<p>This is exactly what we&rsquo;d expect: take a complete graph $G$ satisfying the triangle inequality and return the edges in the approximate solution to the asymmetric traveling salesman problem.
Recall from my post <a href="../networkx-function-stubs">Networkx Function Stubs</a> that the primary traveling salesman function, <code>traveling_salesman_problem</code>, will ensure that we are given a complete graph that follows the triangle inequality by using all-pairs shortest path calculations, and will handle whether we are expected to return a true cycle or only a path.</p>
<p>The first step in the Asadpour algorithm is the Held Karp relaxation.
I am planning on editing the flow of the algorithm here a bit.
If the Held Karp relaxation finds an integer solution, then we know that is one of the optimal TSP routes so there is no point in continuing the algorithm: we can just return that as an optimal solution.
However, if the Held Karp relaxation finds a fractional solution we will press on with the algorithm.</p>

<div class="highlight">
  <pre>    z_star = held_karp(G)
    # test to see if z_star is a graph or dict
    if type(z_star) is nx.DiGraph
        return z_star.edges</pre>
</div>

<p>Once we have the Held Karp solution, we create the undirected support of <code>z_star</code> for the next step of creating the exponential distribution of spanning trees.</p>

<div class="highlight">
  <pre>    z_support = nx.MultiGraph()
    for u, v in z_star
        if (u, v) not in z_support.edges
            edge_weight = min(G[u][v][weight], G[v][u][weight])
            z_support.add_edge(u, v, weight=edge_weight)
    gamma = spanning_tree_distribution(z_support, z_star)</pre>
</div>

<p>This completes steps 1 and 2 in the Asadpour overview at the top of this post.
Next we sample $2 \lceil \log n \rceil$ spanning trees.</p>

<div class="highlight">
  <pre>    for u, v in z_support.edges
        z_support[u][v][lambda] = exp(gamma[(u, v)])

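    minimum_sampled_tree_weight = infinity
    for _ in range 1 to 2 ceil(log(n))
        # sample from the support graph, which carries the lambda values
        sampled_tree = sample_spanning_tree(z_support)
        # size() without a weight key only counts edges
        sampled_tree_weight = sampled_tree.size(weight)
        if sampled_tree_weight &lt; minimum_sampled_tree_weight
            minimum_sampled_tree = sampled_tree.copy()
            minimum_sampled_tree_weight = sampled_tree_weight</pre>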
</div>

<p>Now that we have the minimum sampled tree, we need to orient the edge directions to keep the cost equal to that minimum tree.
We can do this by iterating over the edges in <code>minimum_sampled_tree</code> and checking the edge weights in the original graph $G$.
Using $G$ is required here if we did not record the minimum direction which is a possibility when we create <code>z_support</code>.</p>

<div class="highlight">
  <pre>    t_star = nx.DiGraph()
    for u, v, d in minimum_sampled_tree.edges(data=weight)
        if d == G[u][v][weight]
            t_star.add_edge(u, v, weight=d)
        else
            t_star.add_edge(v, u, weight=d)</pre>
</div>

<p>Next we create a mapping of nodes to node demands for the minimum cost flow problem which was discussed earlier in this post.
I think that using a dict is the best option as it can be passed into <a href="https://networkx.org/documentation/stable/reference/generated/networkx.classes.function.set_node_attributes.html"><code>set_node_attributes</code></a> all at once before finding the minimum cost flow.</p>

<div class="highlight">
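  <pre>    node_demands = {}
    for n in t_star
        node_demands[n] = t_star.out_degree(n) - t_star.in_degree(n)

    # min_cost_flow reads the demands from a named node attribute
    nx.set_node_attributes(G, node_demands, "demand")
    flow_dict = nx.min_cost_flow(G, "demand")</pre>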
</div>

<p>Take the Eulerian circuit and shortcut it on the way out.
Here we can add the support of the flow directly to <code>t_star</code> to simulate adding the two graphs together.</p>

<div class="highlight">
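  <pre>    # flow_dict is a dict of dicts keyed by the tail and then the head of each arc
    for u in flow_dict
        for v in flow_dict[u]
            if (u, v) not in t_star.edges and flow_dict[u][v] &gt; 0
                t_star.add_edge(u, v)
    eulerian_circuit = nx.eulerian_circuit(t_star)
    return _shortcutting(eulerian_circuit)</pre>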
</div>

<p>That should be it.
Once the code for <code>asadpour_tsp</code> is written it will need to be tested.
I&rsquo;m not sure how I&rsquo;m going to create the test cases yet, but I do plan on testing it using real world airline ticket prices as that is my go to example for the asymmetric traveling salesman problem.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] A. Asadpour, M. X. Goemans, A. Mądry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061.</p>
<p>[2] D. Williamson, <em>ORIE 633 Network Flows Lecture 11</em>, 11 Oct 2007, <a href="https://people.orie.cornell.edu/dpw/orie633/LectureNotes/lecture11.pdf">https://people.orie.cornell.edu/dpw/orie633/LectureNotes/lecture11.pdf</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Sampling a Spanning Tree]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/</id>
            
            
            <published>2021-07-28T00:00:00+00:00</published>
            <updated>2021-07-28T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Implementation details for sample_spanning_tree</blockquote><p>The heavy lifting I did in the preliminary post certainly paid off here!
In just one day I was able to implement <code>sample_spanning_tree</code> and its two helper functions.</p>
<h2 id="krichhoffs">krichhoffs<a class="headerlink" href="#krichhoffs" title="Link to this heading">#</a></h2>
<p>This was a very easy function to implement.
It followed exactly from the pseudo code and was working with <code>spanning_tree_distribution</code> before I started on <code>sample_spanning_tree</code>.</p>
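<p>For readers who have not seen the weighted form of Kirchhoff&rsquo;s theorem: the sum, over all spanning trees, of the product of the edge weights in each tree equals the determinant of the weighted Laplacian with one row and column removed. The sketch below is a plain Python illustration of that statement, not the helper from my branch; the name <code>total_tree_weight</code> and the example weights are assumptions.</p>

```python
from fractions import Fraction


def total_tree_weight(nodes, weighted_edges):
    """Sum over all spanning trees of the product of their edge weights."""
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    laplacian = [[Fraction(0)] * size for _ in range(size)]
    for u, v, w in weighted_edges:
        i, j = index[u], index[v]
        laplacian[i][i] += w
        laplacian[j][j] += w
        laplacian[i][j] -= w
        laplacian[j][i] -= w
    # Remove the last row and column, then take the determinant of what is left
    minor = [row[:-1] for row in laplacian[:-1]]
    order = len(minor)
    det = Fraction(1)
    for col in range(order):
        pivot = next((r for r in range(col, order) if minor[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular minor: the graph is disconnected
        if pivot != col:
            minor[col], minor[pivot] = minor[pivot], minor[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= minor[col][col]
        for r in range(col + 1, order):
            factor = minor[r][col] / minor[col][col]
            for c in range(col, order):
                minor[r][c] -= factor * minor[col][c]
    return det


# Triangle with assumed weights 2, 3 and 5: the trees weigh 6, 10 and 15
triangle = [(0, 1, 2), (1, 2, 3), (0, 2, 5)]
total = total_tree_weight([0, 1, 2], triangle)
print(total)
```

<p>Exact fractions are used so that the small example can be checked by hand; a numerical determinant would be the natural choice on larger graphs.</p>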
<h2 id="sample_spanning_tree">sample_spanning_tree<a class="headerlink" href="#sample_spanning_tree" title="Link to this heading">#</a></h2>
<p>This function was more difficult than I originally anticipated.
The code for the main body of the function only needed minor tweaks to work with the specifics of Python, such as <code>shuffle</code> operating in place and returning <code>None</code>, and some details about how sets work.
For example, I add edge $e$ to $U$ before calling <code>prepare_graph</code> on it and then switch the <code>if</code> statement to be the inverse to remove $e$ from $U$.
Those portions are functionally the same.
The issues I had with this function <em>all</em> stem back to contracting multiple nodes in a row and how that affects the graph.</p>
<p>As a side note, the <code>contracted_edge</code> function in NetworkX is a wrapper for <code>contracted_nodes</code> and the latter has a <code>copy</code> keyword argument that is assumed to be <code>True</code> by the former function.
It was a trivial change to extend this functionality to <code>contracted_edge</code> but in the end I used <code>contracted_nodes</code> so the whole thing is moot.</p>
<p>First recall how edge contraction, or in this case node contraction, works.
Two nodes are merged into one which is connected by the same edges which connected the original two nodes.
Edges between those two nodes become self loops, but in this case I prevented the creation of self loops as directed by Kulkarni.
If a node which is not contracted has edges to both of the contracted nodes, we insert a parallel edge between them.
I wrote about my struggles with NetworkX&rsquo;s API for the graph classes in a past post titled <a href="../entropy-distribution">The Entropy Distribution</a>.</p>
<p>For NetworkX&rsquo;s implementation, we would call <code>nx.contracted_nodes(G, u, v)</code> and <code>u</code> and <code>v</code> would always be merged into <code>u</code>, so <code>v</code> is the node which is no longer in the graph.</p>
<p>Now imagine that we have three edges to contract because they are all in $U$ which look like the following.</p>
<center><img src="multiple-contraction.png" alt="Example subgraph with multiple edges to contract"></center>
<p>If we process this from left to right, we first contract nodes 0 and 1.
At this point, the $\{1, 2\}$ edge no longer exists in $G$ as node 1 itself has been removed.
However, we would still need to contract the new $\{0, 2\}$ edge which is equivalent to the old $\{1, 2\}$ edge.</p>
<p>My first attempt to solve this was&hellip; messy and didn&rsquo;t work well.
I developed an <code>if-elif</code> chain for which endpoints of the contracting edge no longer existed in the graph and tried to use dict comprehension to force a dict to always be up to date with which vertices were equivalent to each other.
It didn&rsquo;t work and was very messy.</p>
<p>Fortunately there was a better solution.
This next bit of code I actually first used in my Graph Algorithms class from last semester.
In particular it is the merge-find or disjoint set data structure from the components algorithm (code can be found <a href="https://github.com/mjschwenne/GraphAlgorithms/blob/main/src/Components.py">here</a> and more information about the data structure <a href="https://en.wikipedia.org/wiki/Disjoint-set_data_structure">here</a>).</p>
<p>Basically we create a mapping from a node to that node&rsquo;s representative.
In this case a node&rsquo;s representative is the node still in $G$ into which the input node has been merged through a series of contractions.
In the above example, once node 1 is merged into node 0, 0 would become node 1&rsquo;s representative.
We search recursively through the <code>merged_nodes</code> dict until we find a node which is not in the dict, meaning that it is still its own representative and therefore in the graph.
This will let us handle a representative node later being merged into another node.
Finally, we take advantage of path compression so that lookup times remain good as the number of entries in <code>merged_nodes</code> grows.</p>
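<p>The lookup described above can be sketched in a few lines. This is a simplified illustration rather than the code in my branch: <code>merged_nodes</code> is the dict named in the text, while the function name <code>find_representative</code> and the sequence of merges are assumptions.</p>

```python
def find_representative(merged_nodes, node):
    """Return the node still in the graph that `node` has been merged into."""
    if node not in merged_nodes:
        return node  # not merged into anything, so it is its own representative
    root = find_representative(merged_nodes, merged_nodes[node])
    merged_nodes[node] = root  # path compression: point straight at the root
    return root


merged_nodes = {}
merged_nodes[1] = 0  # contract {0, 1}: node 1 is merged into node 0
merged_nodes[0] = 2  # later, node 0 is itself merged into node 2

rep = find_representative(merged_nodes, 1)
print(rep, merged_nodes)
```

<p>After the call, node 1 points directly at node 2, so the next lookup for it takes a single step.</p>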
<p>This worked well once I caught a bug where the <code>prepare_graph</code> function tried to contract a node with itself.
After that, the function ran and returned a result, but the result could have one or two more edges than needed, which of course means it is not a tree.
I was testing on the symmetric fractional Held Karp graph by the way, so with six nodes it should have five edges per tree.</p>
<p>I seeded the random number generator for one of the seven edge results and started to debug!
Recall that once we generate a uniform decimal between 0 and 1 we compare it to</p>
<p>$$
\lambda_e \times \frac{K_{G \backslash \{e\}}}{K_G}
$$</p>
<p>where $K$ is the result of Kirchhoff&rsquo;s Theorem on the subscripted graph.
One probability that caught my eye had the fractional component equal to 1.
This means that adding $e$ to the set of contracted edges had no effect on whether that edge should be included in the final spanning tree.
Closer inspection revealed that the edge $e$ in question already could not be picked for the spanning tree: since it did not exist in $G$, it could not exist in $G \backslash \{e\}$.</p>
<p>Imagine the following situation.
We have three edges to contract but they form a cycle of length three.</p>
<center><img src="contraction-cycle.png" alt="Example of the contraction of a cycle in a subgraph"></center>
<p>If we contract $\{0, 1\}$ and then $\{0, 2\}$ what does that mean for $\{1, 2\}$?
Well, $\{1, 2\}$ would become a self loop on vertex 0 but we are deleting self loops so it cannot exist.
It has to have a probability of 0.
Yet in the current implementation of the function, it would have a probability of $\lambda_{\{1, 2\}}$.
So, I have to check to see if a representative edge exists for the edge we are considering in the current iteration of the main for loop.</p>
<p>The solution to this is to return the merge-find data structure with the prepared graph for $G$ and then check that an edge with endpoints at the two representatives for the endpoints of the original edge exists.
If so, use the Kirchhoff value as normal, but if not, make <code>G_e_total_tree_weight</code> equal to zero so that this edge cannot be picked.
Finally I was able to sample trees from <code>G</code> consistently, but did they match the expected probabilities?</p>
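<p>The existence check can be illustrated on the three-cycle from the figure. This is a sketch under assumptions: the helper names are hypothetical, edges are stored as frozensets for simplicity, and an extra node 3 attached to the old node 2 is added so that one edge survives the contractions.</p>

```python
def find(merged_nodes, node):
    """Follow merges until reaching a node that is still in the graph."""
    while node in merged_nodes:
        node = merged_nodes[node]
    return node


def representative_edge_exists(edges, merged_nodes, u, v):
    """Check whether the edge {u, v} survives in the contracted graph."""
    ru, rv = find(merged_nodes, u), find(merged_nodes, v)
    if ru == rv:
        return False  # both endpoints were merged together: a deleted self loop
    return frozenset((ru, rv)) in edges


# Contract {0, 1} and then {0, 2}; only the edge to node 3 remains
merged_nodes = {1: 0, 2: 0}
contracted_edges = {frozenset((0, 3))}

cycle_edge = representative_edge_exists(contracted_edges, merged_nodes, 1, 2)
other_edge = representative_edge_exists(contracted_edges, merged_nodes, 2, 3)
print(cycle_edge, other_edge)
```

<p>The edge $\{1, 2\}$ is reported as missing, which is the case where the tree weight is set to zero, while $\{2, 3\}$ is found under its representative endpoints.</p>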
<h2 id="testing-sample_spanning_tree">Testing sample_spanning_tree<a class="headerlink" href="#testing-sample_spanning_tree" title="Link to this heading">#</a></h2>
<p>The first test I was working with sampled one tree and checked to see if it was actually a tree.
I first expanded it to sample 1000 trees and make sure that they were all trees.
At this point, I was confident that the function would always return a tree, but I still needed to check the tree distribution.</p>
<p>So after a lot of difficulty writing the test itself to check which of the 75 possible spanning trees I had sampled I was ready to check the actual distribution.
First, the test iterates over all the spanning trees, records the products of edge weights and normalizes the data.
(Remember that the actual probability is only <em>proportional</em> to the product of edge weights).
Then I sample 50000 trees and record the actual frequency.
Next, it calculates the percent error from the expected probability to the actual frequency.
The sample size is so large because at 1000 trees the percent error was all over the place but, as the Law of Large Numbers dictates, the larger sample shows the actual results converging to the expected results so I do believe that the function is working.</p>
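<p>The expected side of that comparison can be sketched without NetworkX by enumerating spanning trees by brute force, which is only reasonable on tiny graphs. The function names and the weighted triangle are assumptions made for this illustration; the real test ran on the six-node Held Karp graph.</p>

```python
from itertools import combinations
from math import prod


def spanning_trees(nodes, edges):
    """Yield every set of len(nodes) - 1 edges that contains no cycle."""
    for subset in combinations(edges, len(nodes) - 1):
        parent = {n: n for n in nodes}

        def find(n):
            while parent[n] != n:
                n = parent[n]
            return n

        acyclic = True
        for u, v in subset:
            root_u, root_v = find(u), find(v)
            if root_u == root_v:
                acyclic = False  # this edge would close a cycle
                break
            parent[root_u] = root_v
        if acyclic:
            yield subset


def expected_probabilities(nodes, weights):
    """Normalize the product of edge weights over all spanning trees."""
    products = {
        tree: prod(weights[e] for e in tree)
        for tree in spanning_trees(nodes, list(weights))
    }
    total = sum(products.values())
    return {tree: p / total for tree, p in products.items()}


# Assumed weights on a triangle: the three trees have products 6, 10 and 15
weights = {(0, 1): 2, (1, 2): 3, (0, 2): 5}
probabilities = expected_probabilities([0, 1, 2], weights)
for tree, p in probabilities.items():
    print(tree, round(p, 4))
```

<p>The sampled frequencies are then compared against these normalized values tree by tree.</p>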
<p>That being said, seeing the percent error converge to be less than 15% for all 75 spanning trees is not a very rigorous test.
I can either implement a formal test using the percent error or try to create a Chi squared test using scipy.</p>
<h3 id="update-29-july-2021">Update! (29 July 2021)<a class="headerlink" href="#update-29-july-2021" title="Link to this heading">#</a></h3>
<p>This morning I was able to get a Chi squared test working and it was definitely the correct decision.
I was able to reduce the sample size from 50,000 to 1200 which is a near minimum sample.
In order to run a Chi squared test you need an expected frequency of at least 5 for all of the categories, so I had to find the number of samples that guarantees that for a tree with a probability of about 0.4%, which was 1163 and which I rounded up to 1200.</p>
<p>I am testing at the 0.01 significance level, so this test may fail without reason 1% of the time, but it is still a good overall test for the distribution.</p>
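<p>The sample size arithmetic and the test statistic are both short enough to sketch in plain Python. Everything here is illustrative: the function names are hypothetical, and 0.0043 is an assumed stand-in for the &ldquo;about 0.4%&rdquo; probability, chosen because it reproduces the 1163 figure. The actual test used scipy.</p>

```python
from math import ceil


def minimum_samples(smallest_probability, minimum_expected=5):
    """Smallest sample count giving every category the minimum expected frequency."""
    return ceil(minimum_expected / smallest_probability)


def chi_squared_statistic(observed, expected):
    """Pearson's statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


samples = minimum_samples(0.0043)  # assumed value for the least likely tree
statistic = chi_squared_statistic([10, 20], [15, 15])
print(samples, round(statistic, 4))
```

<p>The statistic is then compared against the Chi squared distribution with one fewer degree of freedom than the number of categories.</p>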
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Mądry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, SODA ’10,
Society for Industrial and Applied Mathematics, 2010, pp. 379-389, <a href="https://dl.acm.org/doi/abs/10.5555/1873601.1873633">https://dl.acm.org/doi/abs/10.5555/1873601.1873633</a>.</p>
<p>V. G. Kulkarni, <em>Generating random combinatorial objects</em>, Journal of algorithms, 11 (1990), pp. 185–207.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[The Python Graph Gallery: hundreds of python charts with reproducible code.]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/python-graph-gallery.com/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/codeswitching-visualization/?utm_source=atom_feed" rel="related" type="text/html" title="Visualizing Code-Switching with Step Charts" />
                <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="related" type="text/html" title="Draw all graphs of N nodes" />
                <link href="https://blog.scientific-python.org/matplotlib/stellar-chart-alternative-radar-chart/?utm_source=atom_feed" rel="related" type="text/html" title="Stellar Chart, a Type of Chart to Be on Your Radar" />
                <link href="https://blog.scientific-python.org/matplotlib/ipcc-sr15/?utm_source=atom_feed" rel="related" type="text/html" title="Figures in the IPCC Special Report on Global Warming of 1.5°C (SR15)" />
                <link href="https://blog.scientific-python.org/matplotlib/elementary-cellular-automata/?utm_source=atom_feed" rel="related" type="text/html" title="Elementary Cellular Automata" />
            
                <id>https://blog.scientific-python.org/matplotlib/python-graph-gallery.com/</id>
            
            
            <published>2021-07-24T14:06:57+02:00</published>
            <updated>2021-07-24T14:06:57+02:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>The Python Graph Gallery is a website that displays hundreds of chart examples made with python. It goes from very basic to highly customized examples and is based on common viz libraries like matplotlib, seaborn or plotly.</blockquote><p>Data visualization is a key step in a data science pipeline. <a href="https://www.python.org">Python</a> offers great possibilities when it comes to representing some data graphically, but it can be hard and time-consuming to create the appropriate chart.</p>
<p>The <a href="https://www.python-graph-gallery.com">Python Graph Gallery</a> is here to help. It displays many examples, always providing the reproducible code. It lets you build the desired chart in minutes.</p>
<h1 id="about-400-charts-in-40-sections">About 400 charts in 40 sections<a class="headerlink" href="#about-400-charts-in-40-sections" title="Link to this heading">#</a></h1>
<p>The gallery currently provides more than <a href="https://www.python-graph-gallery.com/all-charts/">400 chart examples</a>. Those examples are organized in 40 sections, one for each chart type: <a href="https://www.python-graph-gallery.com/scatter-plot/">scatterplot</a>, <a href="https://www.python-graph-gallery.com/boxplot/">boxplot</a>, <a href="https://www.python-graph-gallery.com/barplot/">barplot</a>, <a href="https://www.python-graph-gallery.com/treemap/">treemap</a> and so on. Those chart types are organized in 7 big families as suggested by <a href="https://www.data-to-viz.com">data-to-viz.com</a>: one for each visualization purpose.</p>
<p>It is important to note that not only the most common chart types are covered. Lesser known charts like <a href="https://www.python-graph-gallery.com/chord-diagram/">chord diagrams</a>, <a href="https://www.python-graph-gallery.com/streamchart/">streamgraphs</a> or <a href="https://www.python-graph-gallery.com/bubble-map/">bubble maps</a> are also available.</p>
<p><img src="/matplotlib/python-graph-gallery.com/sections-overview.png" alt="overview of the python graph gallery sections"></p>
<h1 id="master-the-basics">Master the basics<a class="headerlink" href="#master-the-basics" title="Link to this heading">#</a></h1>
<p>Each section always starts with some very basic examples. They let you understand how to build a chart type in a few seconds. Hopefully applying the same technique on another dataset will thus be very quick.</p>
<p>For instance, the <a href="https://www.python-graph-gallery.com/scatter-plot/">scatterplot section</a> starts with this <a href="https://matplotlib.org/">matplotlib</a> example. It shows how to create a dataset with <a href="https://pandas.pydata.org/">pandas</a> and plot it with the <code>plot()</code> function. The main graph arguments like <code>linestyle</code> and <code>marker</code> are described to make sure the code is understandable.</p>
<p><a href="https://www.python-graph-gallery.com/130-basic-matplotlib-scatterplot"><em>blogpost overview</em>:</a></p>
<p><img src="/matplotlib/python-graph-gallery.com/scatterplot-example.png" alt="a basic scatterplot example"></p>
<h1 id="matplotlib-customization">Matplotlib customization<a class="headerlink" href="#matplotlib-customization" title="Link to this heading">#</a></h1>
<p>The gallery uses several libraries like <a href="https://www.python-graph-gallery.com/seaborn/">seaborn</a> or <a href="https://www.python-graph-gallery.com/plotly/">plotly</a> to produce its charts, but is mainly focused on matplotlib. Matplotlib comes with great flexibility and lets you build any kind of chart without limits.</p>
<p>A <a href="https://www.python-graph-gallery.com/matplotlib/">whole page</a> is dedicated to matplotlib. It describes how to solve recurring issues like customizing <a href="https://www.python-graph-gallery.com/191-custom-axis-on-matplotlib-chart">axes</a> or <a href="https://www.python-graph-gallery.com/190-custom-matplotlib-title">titles</a>, adding <a href="https://www.python-graph-gallery.com/193-annotate-matplotlib-chart">annotations</a> (see below) or even using <a href="https://www.python-graph-gallery.com/custom-fonts-in-matplotlib">custom fonts</a>.</p>
<p><img src="/matplotlib/python-graph-gallery.com/annotations.png" alt="annotation examples"></p>
<p>The gallery is also full of non-straightforward examples. For instance, it has a <a href="https://www.python-graph-gallery.com/streamchart-basic-matplotlib">tutorial</a> explaining how to build a streamchart with matplotlib. It is based on the <code>stackplot()</code> function and adds some smoothing to it:</p>
<p><img src="/matplotlib/python-graph-gallery.com/streamchart.png" alt="stream chart with python and matplotlib"></p>
<p>Last but not least, the gallery also displays some publication ready charts. They usually involve a lot of matplotlib code, but showcase the fine grain control one has over a plot.</p>
<p>Here is an example with a post inspired by <a href="https://www.r-graph-gallery.com/web-violinplot-with-ggstatsplot.html">Tuo Wang</a>&rsquo;s work for the tidyTuesday project. (Code translated from R available <a href="https://www.python-graph-gallery.com/web-ggbetweenstats-with-matplotlib">here</a>)</p>
<p><img src="/matplotlib/python-graph-gallery.com/boxplot.png" alt="python violin and boxplot example"></p>
<h1 id="contributing">Contributing<a class="headerlink" href="#contributing" title="Link to this heading">#</a></h1>
<p>The python graph gallery is an ever growing project. It is open-source, with all its related code hosted on <a href="https://github.com/holtzy/The-Python-Graph-Gallery">github</a>.</p>
<p>Contributions are very welcome to the gallery. Each blogpost is just a jupyter notebook so suggestions should be very easy to make through issues or pull requests!</p>
<h1 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h1>
<p>The <a href="https://www.python-graph-gallery.com">python graph gallery</a> is a project developed by <a href="https://www.yan-holtz.com">Yan Holtz</a> in his free time. It can help you improve your technical skills when it comes to visualizing data with python.</p>
<p>The gallery belongs to an ecosystem of educational websites. <a href="https://www.data-to-viz.com">Data to viz</a> describes best practices in data visualization, while the <a href="https://www.r-graph-gallery.com">R</a>, <a href="https://www.python-graph-gallery.com">python</a> and <a href="https://www.d3-graph-gallery.com">d3.js</a> graph galleries provide technical help to build charts with the 3 most common tools.</p>
<p>For any question regarding the project, please say hi on twitter at <a href="https://twitter.com/R_Graph_Gallery">@R_Graph_Gallery</a>!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="graphs" label="graphs" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Preliminaries for Sampling a Spanning Tree]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/</id>
            
            
            <published>2021-07-21T00:00:00+00:00</published>
            <updated>2021-07-21T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A close examination of the mathematics required to sample a random spanning tree from a graph</blockquote><p>In order to test the exponential distribution that I generate using <code>spanning_tree_distribution</code>, I need to be able to sample a tree from the distribution.
The primary citation used in the Asadpour paper is <em>Generating Random Combinatorial Objects</em> by V. G. Kulkarni (1990).
While I was not able to find an online copy of this article, the Michigan Tech library did have a copy that I was able to read.</p>
<h2 id="does-the-kulkarni-algorithm-work-with-asadpour">Does the Kulkarni Algorithm work with Asadpour?<a class="headerlink" href="#does-the-kulkarni-algorithm-work-with-asadpour" title="Link to this heading">#</a></h2>
<p>Kulkarni gave a general overview of the algorithm in Section 2, but Section 5 is titled &lsquo;Random Spanning Trees&rsquo; and starts on page 200.
First, let&rsquo;s check that the preliminaries for the Kulkarni paper on page 200 match the Asadpour algorithm.</p>
<blockquote>
<p>Let $G = (V, E)$ be an undirected network of $M$ nodes and $N$ arcs&hellip;
Let $\mathfrak{B}$ be the set of all spanning trees in $G$.
Let $\alpha_i$ be the positive weight of arc $i \in E$.
Define the weight $w(B)$ of a spanning tree $B \in \mathfrak{B}$ as</p>
<p>$$w(B) = \prod_{i \in B} \alpha_i$$</p>
<p>Also define</p>
<p>$$n(G) = \sum_{B \in \mathfrak{B}} w(B)$$</p>
<p>In this section we describe an algorithm to generate $B \in \mathfrak{B}$ so that</p>
<p>$$P\{B \text{ is generated}\} = \frac{w(B)}{n(G)}$$</p>
</blockquote>
<p>Immediately we can see that $\mathfrak{B}$ is the same as $\mathcal{T}$ from the Asadpour paper, the set of all spanning trees.
The weight of each edge is $\alpha_i$ for Kulkarni and $\lambda_e$ for Asadpour.
As for the probability of a tree being proportional to the product of the weights on its edges, the Asadpour paper states on page 382</p>
<blockquote>
<p>Given $\lambda_e \geq 0$ for $e \in E$, a <em>$\lambda$-random tree</em> $T$ of $G$ is a tree $T$ chosen from the set of all spanning trees of $G$ with probability proportional to $\prod_{e \in T} \lambda_e$.</p>
</blockquote>
<p>So this is not a concern.
Finally, $n(G)$ can be written as</p>
<p>$$\sum_{T \in \mathcal{T}} \prod_{e \in T} \lambda_e$$</p>
<p>which does appear several times throughout the Asadpour paper.
Thus the preliminaries between the Kulkarni and Asadpour papers align.</p>
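<p>As a sanity check on these definitions, here is a minimal brute-force sketch in plain Python. It assumes a tiny graph given as a list of hypothetical <code>(u, v, lam)</code> tuples (a triangle with made-up integer weights), enumerates every spanning tree, and computes $w(B)$, $n(G)$ and $w(B) / n(G)$ directly. The helper names are illustrative only and come from neither paper nor NetworkX.</p>

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def is_spanning_tree(nodes, tree_edges):
    """True when the edges are acyclic and there are exactly len(nodes) - 1 of them."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v, _ in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # this edge would close a cycle
            return False
        parent[ru] = rv
    return len(tree_edges) == len(nodes) - 1


def tree_distribution(nodes, edges):
    """Return ({tree: w(B) / n(G)}, n(G)) by enumerating all spanning trees."""
    trees = [
        t for t in combinations(edges, len(nodes) - 1) if is_spanning_tree(nodes, t)
    ]
    weights = [prod(lam for _, _, lam in t) for t in trees]  # w(B)
    total = sum(weights)  # n(G)
    return {t: Fraction(w) / total for t, w in zip(trees, weights)}, total


# Hypothetical example: a triangle whose edges carry weights 1, 2 and 3.
nodes = [0, 1, 2]
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3)]
probabilities, n_G = tree_distribution(nodes, edges)
for tree, p in probabilities.items():
    print([e[:2] for e in tree], p)
```

<p>Enumeration like this only scales to very small graphs, but it gives an independent reference against which a faster implementation can be compared.</p>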
<h2 id="the-kulkarni-algorithm">The Kulkarni Algorithm<a class="headerlink" href="#the-kulkarni-algorithm" title="Link to this heading">#</a></h2>
<p>The specialized version of the general algorithm which Kulkarni gives is Algorithm A8 on page 202.</p>
<blockquote>
<p>$U = \emptyset,$ $V = E$<br>
Do $i = 1$ to $N$;<br>
$\qquad$Let $a = n(G(U, V))$<br>
$\qquad\qquad a' = n(G(U \cup \{i\}, V))$<br>
$\qquad$Generate $Z \sim U[0, 1]$<br>
$\qquad$If $Z \leq \alpha_i \times \left(a' / a\right)$<br>
$\qquad\qquad$then $U = U \cup \{i\}$,<br>
$\qquad\qquad$else $V = V - \{i\}$<br>
$\qquad$end.<br>
Stop. $U$ is the required spanning tree.</p>
</blockquote>
<p>Now we have to understand this algorithm so we can create pseudo code for it.
First as a notational explanation, the statement &ldquo;Generate $Z \sim U[0, 1]$&rdquo; means picking a uniformly random variable over the interval $[0, 1]$ which is independent of all the random variables generated before it (See page 188 of Kulkarni for more information).
The built-in python module <a href="https://docs.python.org/3/library/random.html"><code>random</code></a> can be used here.
Looking at real-valued distributions, I believe that using <code>random.uniform(0, 1)</code> is preferable to <code>random.random()</code> since the latter can never generate a &lsquo;1&rsquo; and that is explicitly part of the interval discussed in the Kulkarni paper, although the <code>random</code> documentation notes that <code>uniform</code> only includes its upper end point depending on floating-point rounding, so in practice the two behave almost identically.</p>
<p>The other notational oddity would be statements similar to $G(U, V)$, which in this case does not refer to a graph with $U$ as the vertex set and $V$ as the edge set, as $U$ and $V$ are both subsets of the full edge set $E$.</p>
<p>$G(U, V)$ is defined in the Kulkarni paper on page 201 as</p>
<blockquote>
<p>Let $G(U, V)$ be a subgraph of $G$ obtained by deleting arcs that are not in $V$, and collapsing arcs that are in $U$ (i.e., identifying the end nodes of arcs in $U$) and deleting all self-loops resulting from these deletions and collapsing.</p>
</blockquote>
<p>This language seems a bit&hellip; clunky, especially for the edges in $U$.
In this case, &ldquo;collapsing arcs that are in $U$&rdquo; would be contracting those edges without self loops.
Fortunately, this functionality is a part of NetworkX using <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.minors.contracted_edge.html#networkx.algorithms.minors.contracted_edge"><code>networkx.algorithms.minors.contracted_edge</code></a> with the <code>self_loops</code> keyword argument set to <code>False</code>.</p>
<p>As for the edges in $E - V$, this can be easily accomplished by using <a href="https://networkx.org/documentation/stable/reference/classes/generated/networkx.MultiGraph.remove_edges_from.html"><code>networkx.MultiGraph.remove_edges_from</code></a>.</p>
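<p>The two NetworkX calls above can be combined into a rough sketch of building $G(U, V)$. This is an illustration under assumptions, not the final implementation: it assumes a <code>MultiGraph</code>, edges given as <code>(u, v, key)</code> tuples, and that every edge in $U$ is also in $V$. The <code>merged</code> bookkeeping is a hypothetical addition that redirects later edges to the node that survived an earlier contraction.</p>

```python
import networkx as nx


def prepare_graph(G, U, V):
    """Sketch of G(U, V) for a MultiGraph: drop edges outside V, contract edges in U."""
    result = G.copy()
    # Delete every (u, v, key) edge that is not in V.
    result.remove_edges_from(set(result.edges(keys=True)) - set(V))
    merged = {}  # contracted-away node -> node it was merged into

    def rep(node):
        while node in merged:
            node = merged[node]
        return node

    for u, v, *_ in U:
        ru, rv = rep(u), rep(v)
        if ru != rv:  # skip edges that earlier contractions turned into self-loops
            result = nx.contracted_edge(result, (ru, rv), self_loops=False)
            merged[rv] = ru
    return result


# Hypothetical example: a 4-cycle 0-1-2-3 with the chord 0-2.
G = nx.MultiGraph([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])
all_edges = set(G.edges(keys=True))
U = {(0, 1, 0)}
V = {e for e in all_edges if set(e[:2]) != {2, 3}}  # everything except edge 2-3
H = prepare_graph(G, U, V)
print(sorted(H.nodes), H.number_of_edges(0, 2))
```

<p>Because the input is a <code>MultiGraph</code>, contracting $(0, 1)$ leaves two parallel edges between nodes 0 and 2, which is what the Laplacian needs to see.</p>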
<p>Once we have generated $G(U, V)$, we need to find $n(G(U, V))$.
This can be done with something we are already familiar with: Kirchhoff&rsquo;s Tree Matrix Theorem.
All we need to do is create the Laplacian matrix and then find the determinant of the first cofactor.
This code will probably be taken directly from the <code>spanning_tree_distribution</code> function.
Actually, this is a good place to create a broader helper function called <code>kirchhoffs</code> which takes a graph and returns the total weight of its spanning trees; it would then be used as part of <code>q</code> in <code>spanning_tree_distribution</code> and in <code>sample_spanning_tree</code>.</p>
<p>From here we compare $Z$ to $\alpha_i \left(a' / a\right)$ to see if that edge is added to the tree or discarded.
Understanding the process of the algorithm gives context to the meaning of $U$ and $V$.
$U$ is the set of edges which we have decided to include in the spanning tree while $V$ is the set of edges yet to be considered for $U$ (roughly speaking).</p>
<p>Now there is still a bit of ambiguity in the algorithm that Kulkarni gives, mainly about $i$.
In the loop condition, $i$ is an integer from 1 to $N$, the number of arcs in the graph, but it is later added to $U$, so it has to be an edge.
Referencing the Asadpour paper, it starts its description of sampling the $\lambda$-random tree on page 383 by saying &ldquo;The idea is to order the edges $e_1, \dots, e_m$ of $G$ arbitrarily and process them one by one&rdquo;.
So I believe that the edge interpretation is correct and the integer notation used in Kulkarni was assuming that a mapping of the edges to $\{1, 2, \dots, N\}$ has occurred.</p>
<h2 id="sample_spanning_tree-pseudo-code">sample_spanning_tree pseudo code<a class="headerlink" href="#sample_spanning_tree-pseudo-code" title="Link to this heading">#</a></h2>
<p>Time to write some pseudo code!
Starting with the function signature</p>

<div class="highlight">
  <pre>def sample_spanning_tree
    Input: A multigraph G whose edges contain a lambda value stored at lambda_key
    Output: A new graph which is a spanning tree of G</pre>
</div>

<p>Next up is a bit of initialization</p>

<div class="highlight">
  <pre>    U = set()
    V = set(G.edges)
    shuffled_edges = shuffle(G.edges)</pre>
</div>

<p>Now the definitions of <code>U</code> and <code>V</code> come directly from Algorithm A8, but <code>shuffled_edges</code> is new.
My thoughts are that this will be what we use for $i$.
We shuffle the edges of the graph and then in the loop we iterate over the edges within <code>shuffled_edges</code>.
Next we have the loop.</p>

<div class="highlight">
  <pre>    for edge e in shuffled_edges
        G_total_tree_weight = kirchhoffs(prepare_graph(G, U, V))
        G_i_total_tree_weight = kirchhoffs(prepare_graph(G, U | {e}, V))
        z = uniform(0, 1)
        if z &lt;= e[lambda_key] * G_i_total_tree_weight / G_total_tree_weight
            U = U | {e}
            if len(U) == G.number_of_nodes - 1
                # Spanning tree complete, no need to continue to consider edges.
                spanning_tree = nx.Graph
                spanning_tree.add_edges_from(U)
                return spanning_tree
        else
            V = V - {e}</pre>
</div>

<p>The main loop body does use two other functions which are not part of the standard NetworkX libraries, <code>kirchhoffs</code> and <code>prepare_graph</code>.
As I mentioned before, <code>kirchhoffs</code> will apply Kirchhoff&rsquo;s Theorem to the graph.
Pseudo code for this is below and is strongly based on the existing code in <code>q</code> of <code>spanning_tree_distribution</code>, which will be updated to use this new helper.</p>

<div class="highlight">
  <pre>def kirchhoffs
    Input: A multigraph G and weight key, weight
    Output: The total weight of the graph&#39;s spanning trees

    G_laplacian = laplacian_matrix(G, weight=weight)
    # Remove the first row and the first column to get the first cofactor
    G_laplacian = G_laplacian.delete(0, 0)
    G_laplacian = G_laplacian.delete(0, 1)

    return det(G_laplacian)</pre>
</div>

<p>The process for the other helper, <code>prepare_graph</code>, is given as well.</p>

<div class="highlight">
  <pre>def prepare_graph
    Input: A graph G, set of contracted edges U and edges which are not removed V
    Output: A subgraph of G in which all edges in U are contracted and edges not in V are
            removed

    result = G.copy()
    edges_to_remove = set(result.edges).difference(V)
    result.remove_edges_from(edges_to_remove)

    for edge e in U
        result = nx.contracted_edge(result, e, self_loops=False)

    return result</pre>
</div>
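<p>For readers who want something runnable, the pseudo code can be approximated in plain Python without NetworkX. This is a sketch under stated assumptions rather than the eventual implementation: edges are hypothetical <code>(u, v, lam)</code> tuples addressed by index, contraction is done with union-find labels instead of <code>contracted_edge</code>, determinants use exact <code>Fraction</code> arithmetic, and an explicit cycle check (an addition that is not spelled out in Algorithm A8) rejects any edge whose end points are already joined by $U$.</p>

```python
import random
from fractions import Fraction


def cofactor_det(lap):
    """Determinant of the Laplacian with its first row and column removed."""
    m = [row[1:] for row in lap[1:]]
    det = Fraction(1)
    for col in range(len(m)):
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det


def make_find(nodes, edges, U):
    """Return a function mapping each node to its super-node after contracting U."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for i in U:
        parent[find(edges[i][0])] = find(edges[i][1])
    return find


def total_tree_weight(nodes, edges, U, V):
    """n(G(U, V)): contract the edges in U, keep only edges in V, drop self-loops."""
    find = make_find(nodes, edges, U)
    index = {s: k for k, s in enumerate(sorted({find(v) for v in nodes}))}
    lap = [[Fraction(0)] * len(index) for _ in index]
    for i in V:
        a, b = index[find(edges[i][0])], index[find(edges[i][1])]
        if a != b:
            lam = Fraction(edges[i][2])
            lap[a][a] += lam
            lap[b][b] += lam
            lap[a][b] -= lam
            lap[b][a] -= lam
    return cofactor_det(lap)


def sample_spanning_tree(nodes, edges, rng=random):
    """Return the sorted edge indices of one sampled spanning tree."""
    order = list(range(len(edges)))
    rng.shuffle(order)
    U, V = set(), set(order)
    for i in order:
        find = make_find(nodes, edges, U)
        if find(edges[i][0]) == find(edges[i][1]):
            V.discard(i)  # assumption: an edge closing a cycle with U is never taken
            continue
        a = total_tree_weight(nodes, edges, U, V)
        a_prime = total_tree_weight(nodes, edges, U | {i}, V)
        if edges[i][2] * a_prime / a >= rng.uniform(0, 1):
            U.add(i)
            if len(U) == len(nodes) - 1:
                break  # spanning tree complete
        else:
            V.discard(i)
    return sorted(U)


# Hypothetical example: a six-node graph with every lambda set to 1.
nodes = list(range(6))
prism = [(0, 1, 1), (0, 2, 1), (0, 5, 1), (1, 2, 1), (1, 4, 1),
         (2, 3, 1), (3, 4, 1), (3, 5, 1), (4, 5, 1)]
tree = sample_spanning_tree(nodes, prism, random.Random(0))
print([prism[i][:2] for i in tree])
```

<p>Recomputing both determinants for every edge is wasteful, but it keeps the sketch close to the structure of Algorithm A8.</p>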

<p>There is one other change to the NetworkX API that I would like to make.
At the moment, <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.minors.contracted_edge.html"><code>networkx.algorithms.minors.contracted_edge</code></a> is programmed to always return a copy of a graph.
Since I need to be contracting multiple edges at once, it would make a lot more sense to do the contraction in place.
I would like to add an optional keyword argument to <code>contracted_edge</code> called <code>copy</code> which will default to <code>True</code> so that the overall functionality will not change but I will be able to perform in place contractions.</p>
<h2 id="next-steps">Next Steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>The most obvious one is to implement the functions that I have laid out in the pseudo code step, but testing is still a concerning area.
My best bet is to sample, say, 1000 trees and check that the frequency of each tree is proportional to the product of all of the lambdas on its edges.</p>
<p>That actually just caused me to think of a new test of <code>spanning_tree_distribution</code>.
If I generate the distribution and then iterate over all of the spanning trees with a <code>SpanningTreeIterator</code>, I can sum the total probability of each tree being sampled, and if that is not 1 (or very close to it) then I do not have a valid distribution over the spanning trees.</p>
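<p>Both test ideas can be sketched with a small harness. Everything here is hypothetical: the triangle, its lambdas, the tolerance, and in particular <code>stand_in_sampler</code>, which simply draws from the known distribution so that the sketch runs on its own. In a real test the sampler under test would take its place.</p>

```python
import random
from collections import Counter


def frequencies_match(sampler, expected, n_samples=1000, tolerance=0.05):
    """Draw n_samples trees and compare each observed frequency with its expected probability."""
    counts = Counter(sampler() for _ in range(n_samples))
    return all(
        tolerance >= abs(counts[tree] / n_samples - p) for tree, p in expected.items()
    )


# Triangle with made-up lambdas 1, 2, 3 on edges a, b, c; each tree is a pair of edges.
tree_weights = {("a", "b"): 1 * 2, ("a", "c"): 1 * 3, ("b", "c"): 2 * 3}
total = sum(tree_weights.values())
expected = {tree: w / total for tree, w in tree_weights.items()}

# Second idea from the text: a valid distribution sums to (very nearly) 1.
distribution_is_valid = 1e-9 >= abs(sum(expected.values()) - 1)

# Stand-in sampler so the sketch is self-contained.
rng = random.Random(2021)
trees, weights = zip(*tree_weights.items())
stand_in_sampler = lambda: rng.choices(trees, weights=weights)[0]

print(distribution_is_valid)
print(frequencies_match(stand_in_sampler, expected, n_samples=5000))
```

<p>A fixed tolerance is the simplest possible criterion; a chi-squared test would be a more principled choice when there are many trees with small probabilities.</p>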
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Mądry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, SODA ’10,
Society for Industrial and Applied Mathematics, 2010, pp. 379-389, <a href="https://dl.acm.org/doi/abs/10.5555/1873601.1873633">https://dl.acm.org/doi/abs/10.5555/1873601.1873633</a>.</p>
<p>V. G. Kulkarni, <em>Generating random combinatorial objects</em>, Journal of Algorithms, 11 (1990), pp. 185–207.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[The Entropy Distribution]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/entropy-distribution/</id>
            
            
            <published>2021-07-20T00:00:00+00:00</published>
            <updated>2021-07-20T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Details on implementing the entropy distribution</blockquote><p>Implementing <code>spanning_tree_distribution</code> proved to have some NetworkX difficulties and one algorithmic difficulty.
Recall that the algorithm for creating the distribution is given in the Asadpour paper as</p>
<blockquote>
<ol>
<li>Set $\gamma = \vec{0}$.</li>
<li>While there exists an edge $e$ with $q_e(\gamma) &gt; (1 + \epsilon) z_e$:
<ul>
<li>Compute $\delta$ such that if we define $\gamma'$ as $\gamma_e' = \gamma_e - \delta$, and $\gamma_f' = \gamma_f$ for all $f \in E\ \backslash \{e\}$, then $q_e(\gamma') = (1 + \epsilon/2)z_e$.</li>
<li>Set $\gamma \leftarrow \gamma'$.</li>
</ul>
</li>
<li>Output $\tilde{\gamma} := \gamma$.</li>
</ol>
</blockquote>
<p>Now, the procedure that I laid out in my last blog titled <a href="../entropy-distribution-setup">Entropy Distribution Setup</a> worked well for the while loop portion.
All of my difficulties with the NetworkX API happened in the <code>q</code> inner function.</p>
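<p>To make the loop concrete, here is a minimal sketch that runs it on a triangle. It is not the <code>spanning_tree_distribution</code> implementation: <code>q</code> is computed by brute-force enumeration rather than with Kirchhoff&rsquo;s theorem, $\lambda_e = e^{\gamma_e}$ is assumed, the values in <code>z</code> are made up, and $\delta$ is taken to be $\ln\frac{q_e (1 - t)}{(1 - q_e) t}$ with $t = (1 + \epsilon/2) z_e$, which follows from how the odds $q_e / (1 - q_e)$ scale with $\lambda_e$.</p>

```python
from itertools import combinations
from math import exp, log, prod


def spanning_trees(nodes, edges):
    """Brute-force list of spanning trees, each a tuple of edge indices."""
    trees = []
    for combo in combinations(range(len(edges)), len(nodes) - 1):
        parent = {v: v for v in nodes}

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        acyclic = True
        for i in combo:
            ru, rv = find(edges[i][0]), find(edges[i][1])
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            trees.append(combo)
    return trees


def q(e, gamma, trees):
    """Probability that edge e is in a tree drawn with weight prod(exp(gamma_f))."""
    weight = lambda t: prod(exp(gamma[i]) for i in t)
    return sum(weight(t) for t in trees if e in t) / sum(weight(t) for t in trees)


def spanning_tree_distribution(nodes, edges, z, epsilon=0.2, max_iter=1000):
    trees = spanning_trees(nodes, edges)
    gamma = [0.0] * len(edges)  # step 1
    for _ in range(max_iter):
        violated = [
            e for e in range(len(edges)) if q(e, gamma, trees) > (1 + epsilon) * z[e]
        ]
        if not violated:
            return gamma  # step 3
        e = violated[0]
        q_e, target = q(e, gamma, trees), (1 + epsilon / 2) * z[e]
        delta = log(q_e * (1 - target) / ((1 - q_e) * target))
        gamma[e] -= delta  # step 2
    raise RuntimeError("gamma did not converge within max_iter iterations")


# Hypothetical input: a triangle with made-up z values that sum to n - 1 = 2.
nodes = [0, 1, 2]
edges = [(0, 1), (1, 2), (0, 2)]
z = [0.9, 0.6, 0.5]
gamma = spanning_tree_distribution(nodes, edges, z)
print(gamma)
```

<p>The iteration cap is a safety net added for the sketch; the algorithm as quoted has no such limit.</p>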
<p>After I programmed the function, I of course needed to run it and at first I was just printing the <code>gamma</code> dict out so that I could see what the values for each edge were.
My first test uses the symmetric fractional Held-Karp solution and, to my surprise, every value of $\gamma$ returned as 0.
I didn&rsquo;t think that this was intended behavior because if it was, there would be no reason to include this step in the overall Asadpour algorithm, so I started to dig around the code with PyCharm&rsquo;s debugger.
The results were, as I suspected, not correct.
I was running Kirchhoff&rsquo;s tree matrix theorem on the original graph, so the returned probabilities were an order of magnitude smaller than the values of $z_e$ that I was comparing them to.
Additionally, all of the values were the same so I knew that this was a problem and not that the first edge I checked had unusually small probabilities.</p>
<p>So, I returned to the Asadpour paper and started to ask myself questions like</p>
<ul>
<li>Do I need to normalize the Held-Karp answer in some way?</li>
<li>Do I need to consider edges outside of $E$ (the undirected support of the Held-Karp relaxation solution) or only work with the edges in $E$?</li>
</ul>
<p>It was pretty easy to dismiss the first question: if normalization were required, it would be mentioned in the Asadpour paper, and without a description of how to normalize, my chances of finding the &lsquo;correct&rsquo; way to do so would be next to none.
The second question did take some digging.
The sections of the Asadpour paper which talk about using Kirchhoff&rsquo;s theorem all discuss it using the graph $G$, which is why I was originally using all edges in $G$ rather than the edges in $E$.
A few hints pointed to the fact that I needed to only consider the edges in $E$, the first being the algorithm overview which states</p>
<blockquote>
<p>Find weights $\{\tilde{\gamma}_e\}_{e \in E}$</p>
</blockquote>
<p>In particular the $e \in E$ statement says that I do not need to consider the edges which are not in $E$.
Secondly, Lemma 7.2 starts by stating</p>
<blockquote>
<p>Let $G = (V, E)$ be a graph with weights $\gamma_e$ for $e \in E$</p>
</blockquote>
<p>Based on the current state of the function and these hints, I decided to reduce the input graph to <code>spanning_tree_distribution</code> to only edges with $z_e &gt; 0$.
Running the test on the symmetric fractional solution now, it still returned $\gamma = \vec{0}$ but the probabilities it was comparing were much closer during that first iteration.
Due to the fact that I do not have an example graph and distribution to work with, this could be the correct answer, but the fact that every value was the same still confused me.</p>
<p>My next step was to determine the actual probability of an edge being in the spanning trees for the first iteration when $\gamma = \vec{0}$.
This can be easily done with my <code>SpanningTreeIterator</code> and exploits the fact that $\gamma = \vec{0}$ is equivalent to $\lambda_e = 1\ \forall\ e \in E$, so every spanning tree has the same weight and we can just iterate over the spanning trees and count how often each edge appears.</p>
<p>That script is listed below</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">edges</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">from_edgelist</span><span class="p">(</span><span class="n">edges</span><span class="p">,</span> <span class="n">create_using</span><span class="o">=</span><span class="n">nx</span><span class="o">.</span><span class="n">Graph</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">edge_frequency</span> <span class="o">=</span> <span class="p">{}</span>
</span></span><span class="line"><span class="cl"><span class="n">sp_count</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">tree</span> <span class="ow">in</span> <span class="n">nx</span><span class="o">.</span><span class="n">SpanningTreeIterator</span><span class="p">(</span><span class="n">G</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">sp_count</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="n">tree</span><span class="o">.</span><span class="n">edges</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">e</span> <span class="ow">in</span> <span class="n">edge_frequency</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge_frequency</span><span class="p">[</span><span class="n">e</span><span class="p">]</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge_frequency</span><span class="p">[</span><span class="n">e</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">edge_frequency</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="nb">print</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="sa">f</span><span class="s2">&#34;(</span><span class="si">{</span><span class="n">u</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">v</span><span class="si">}</span><span class="s2">): </span><span class="si">{</span><span class="n">edge_frequency</span><span class="p">[(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">)]</span><span class="si">}</span><span class="s2"> / </span><span class="si">{</span><span class="n">sp_count</span><span class="si">}</span><span class="s2"> = </span><span class="si">{</span><span class="n">edge_frequency</span><span class="p">[(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">)]</span> <span class="o">/</span> <span class="n">sp_count</span><span class="si">}</span><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span></span></span></code></pre>
</div>
<p>This output revealed that the probabilities returned by <code>q</code> should vary from edge to edge and that the correct solution for $\gamma$ is certainly not $\vec{0}$.</p>

<div class="highlight">
  <pre>(networkx-dev) mjs@mjs-ubuntu:~/Workspace$ python3 spanning_tree_frequency.py
(0, 1): 40 / 75 = 0.5333333333333333
(0, 2): 40 / 75 = 0.5333333333333333
(0, 5): 45 / 75 = 0.6
(1, 4): 45 / 75 = 0.6
(2, 3): 45 / 75 = 0.6
(1, 2): 40 / 75 = 0.5333333333333333
(5, 3): 40 / 75 = 0.5333333333333333
(5, 4): 40 / 75 = 0.5333333333333333
(4, 3): 40 / 75 = 0.5333333333333333</pre>
</div>

<p>Let&rsquo;s focus on that first edge, $(0, 1)$.
My brute force script says that it appears in 40 of the 75 spanning trees of the below graph where each edge is labelled with its $z_e$ value.</p>
<center><img src="test-graph-z-e.png" alt="probabilities over the example graph"/></center>
<p>Yet <code>q</code> was saying that the edge was in 24 of 75 spanning trees.
Since the denominator was correct, I decided to focus on the numerator, which is the number of spanning trees in $G\ \backslash\ \{(0, 1)\}$, the graph $G$ with the edge $(0, 1)$ contracted.
That graph would be the following.</p>
<center><img src="contracted-graph.png" alt="contracting on the least likely edge"/></center>
<p>An argument can be made that this graph should have a self-loop on vertex 0, but this does not affect the Laplacian matrix in any way so it is omitted here.
Basically, the $[0, 0]$ entry of the adjacency matrix would be 1 and the degree of vertex 0 would be 5 and $5 - 1 = 4$ which is what the entry would be without the self loop.</p>
<p>What was happening was that I was giving <code>nx.contracted_edge</code> a graph of the Graph class (not a directed graph since $E$ is undirected) and was getting a graph of the Graph class back.
The Graph class does not support multiple edges between two nodes so the returned graph only had one edge between node 0 and node 2 which was affecting the overall Laplacian matrix and thus the number of spanning trees.
Switching from a Graph to a MultiGraph did the trick, but this subtle change should be mentioned in the NetworkX documentation for the function, linked <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.minors.contracted_edge.html?">here</a>.
I definitely believed that if I contracted an edge the output should automatically include both of the $(0, 2)$ edges.
An argument can be made for changing the default behavior to match this, but at the very least the documentation should explain this problem.</p>
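<p>The difference is easy to reproduce. The short sketch below contracts $(0, 1)$ on the example graph twice, once as a <code>Graph</code> and once as a <code>MultiGraph</code>, and compares the edge counts; only the choice of <code>self_loops=False</code> is an assumption about how the contraction was called.</p>

```python
import networkx as nx

edges = [(0, 1), (0, 2), (0, 5), (1, 2), (1, 4), (2, 3), (3, 4), (3, 5), (4, 5)]

# Same contraction, two graph classes.
simple = nx.contracted_edge(nx.Graph(edges), (0, 1), self_loops=False)
multi = nx.contracted_edge(nx.MultiGraph(edges), (0, 1), self_loops=False)

print("Graph edges after contraction:     ", simple.number_of_edges())
print("MultiGraph edges after contraction:", multi.number_of_edges())
print("Parallel (0, 2) edges kept:        ", multi.number_of_edges(0, 2))
```

<p>The <code>Graph</code> result silently merges the two edges between nodes 0 and 2 into one, so its Laplacian undercounts the spanning trees of the contracted graph.</p>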
<p>Now the <code>q</code> function was returning the correct $40 / 75$ answer for $(0, 1)$ and correct values for the rest of the edges so long as all of the $\gamma_e$&rsquo;s were 0.
But the test was erroring out with a <code>ValueError</code> when I tried to compute $\delta$.
<code>q</code> was returning a probability of an edge being in a sampled spanning tree of more than 1, which is clearly impossible but also caused the denominator of $\delta$ to become negative and violate the domain of the natural log.</p>
<p>During my investigation of this problem, I noticed that after computing $\delta$ and subtracting it from $\gamma_e$, it did not have the desired effect on $q_e$.
Recall that we define $\delta$ so that $\gamma_e - \delta$ yields a $q_e$ of $(1 + \epsilon / 2) z_e$.
In other words, the effect of $\delta$ is to decrease an edge probability which is too high, but in my current implementation it was having the opposite effect.
The value of $q_{(0, 1)}$ was going from 0.5333 to just over 0.6.
If I let this trend continue, the program would eventually hit one of those cases where $q_e \geq 1$ and crash.</p>
<p>Here I can use edge $(0, 1)$ as an example to show the problem.
The original Laplacian matrix for $G$ with $\gamma = \vec{0}$ is</p>
<p>$$
\begin{bmatrix}
3 &amp; -1 &amp; -1 &amp; 0 &amp; 0 &amp; -1 \\\
-1 &amp; 3 &amp; -1 &amp; 0 &amp; -1 &amp; 0 \\\
-1 &amp; -1 &amp; 3 &amp; -1 &amp; 0 &amp; 0 \\\
0 &amp; 0 &amp; -1 &amp; 3 &amp; -1 &amp; -1 \\\
0 &amp; -1 &amp; 0 &amp; -1 &amp; 3 &amp; -1 \\\
-1 &amp; 0 &amp; 0 &amp; -1 &amp; -1 &amp; 3
\end{bmatrix}
$$</p>
<p>and the Laplacian for $G\ \backslash\ \{(0, 1)\}$ is</p>
<p>$$
\begin{bmatrix}
4 &amp; -2 &amp; -1 &amp; -1 &amp; 0 \\\
-2 &amp; 3 &amp; 0 &amp; 0 &amp; -1 \\\
-1 &amp; 0 &amp; 3 &amp; -1 &amp; -1 \\\
-1 &amp; 0 &amp; -1 &amp; 3 &amp; -1 \\\
0 &amp; -1 &amp; -1 &amp; -1 &amp; 3
\end{bmatrix}
$$</p>
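<p>Both Laplacians can be rebuilt and checked with a few lines of plain Python. This is only a sketch: the <code>contract</code> helper and its node relabelling are hypothetical, so the rows may appear in a different order than in the matrices above, which does not change the determinant of the first cofactor.</p>

```python
from fractions import Fraction


def laplacian(n, edges):
    """Weighted Laplacian of a multigraph on nodes 0..n-1; edges are (u, v, weight)."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        if u == v:
            continue  # self-loops are left out, as in the text
        lap[u][u] += w
        lap[v][v] += w
        lap[u][v] -= w
        lap[v][u] -= w
    return lap


def first_cofactor_det(lap):
    """Delete the first row and column, then take the determinant exactly."""
    m = [row[1:] for row in lap[1:]]
    det = Fraction(1)
    for col in range(len(m)):
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det


def contract(n, edges, u, v):
    """Merge node v into u, drop the resulting self-loops, relabel nodes to 0..n-2."""
    relabel = {old: new for new, old in enumerate(x for x in range(n) if x != v)}
    relabel[v] = relabel[u]
    merged = [(relabel[a], relabel[b], w) for a, b, w in edges]
    return n - 1, [e for e in merged if e[0] != e[1]]


G_edges = [(0, 1, 1), (0, 2, 1), (0, 5, 1), (1, 2, 1), (1, 4, 1),
           (2, 3, 1), (3, 4, 1), (3, 5, 1), (4, 5, 1)]
denominator = first_cofactor_det(laplacian(6, G_edges))
numerator = first_cofactor_det(laplacian(*contract(6, G_edges, 0, 1)))
print(numerator, "/", denominator)
```

<p>Using <code>Fraction</code> keeps the counts exact; a floating-point determinant would be faster but can be off by rounding error.</p>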
<p>The determinant of the first cofactor is how we get the $40 / 75$.
Now consider the Laplacian matrices after we updated $\gamma_{(0, 1)}$ for the first time.
The one for $G$ becomes</p>
<p>$$
\begin{bmatrix}
2.74 &amp; -0.74 &amp; -1 &amp; 0 &amp; 0 &amp; -1 \\\
-0.74 &amp; 2.74 &amp; -1 &amp; 0 &amp; -1 &amp; 0 \\\
-1 &amp; -1 &amp; 3 &amp; -1 &amp; 0 &amp; 0 \\\
0 &amp; 0 &amp; -1 &amp; 3 &amp; -1 &amp; -1 \\\
0 &amp; -1 &amp; 0 &amp; -1 &amp; 3 &amp; -1 \\\
-1 &amp; 0 &amp; 0 &amp; -1 &amp; -1 &amp; 3
\end{bmatrix}
$$</p>
<p>and its first cofactor determinant is reduced from 75 to 64.6.
What do we expect the value of the matrix for $G\ \backslash\ \{(0, 1)\}$ to be?
Well, we know that the final value of $q_e$ needs to be $(1 + \epsilon / 2) z_e$ or $1.1 \times 0.41\overline{6}$ which is $0.458\overline{3}$.
So</p>
<p>$$
\begin{array}{r c l}
\displaystyle\frac{x}{64.6} &amp;=&amp; 0.458\overline{3} \\\
x &amp;=&amp; 29.608\overline{3}
\end{array}
$$</p>
<p>and the value of the first cofactor determinant should be $29.608\overline{3}$.
However, the contracted Laplacian for $(0, 1)$ after the value of $\gamma_e$ is updated is</p>
<p>$$
\begin{bmatrix}
4 &amp; -2 &amp; -1 &amp; -1 &amp; 0 \\\
-2 &amp; 3 &amp; 0 &amp; 0 &amp; -1 \\\
-1 &amp; 0 &amp; 3 &amp; -1 &amp; -1 \\\
-1 &amp; 0 &amp; -1 &amp; 3 &amp; -1 \\\
0 &amp; -1 &amp; -1 &amp; -1 &amp; 3
\end{bmatrix}
$$</p>
<p>the <strong>same as before!</strong>
The only edge with a different $\gamma_e$ than before is $(0, 1)$, but since it is the contracted edge it is no longer in the graph any more and thus cannot affect the value of the first cofactor&rsquo;s determinant!</p>
<p>But if we change the algorithm to add $\delta$ to $\gamma_e$ rather than subtract it, the first cofactor determinant for the Laplacian of $G\ \backslash\ \{e\}$ will not change, but the first cofactor determinant for the Laplacian of $G$ will increase.
This reduces the overall probability of picking $e$ in a spanning tree.
And, if we use the same formula for $\delta$ as before for our example of $(0, 1)$, then $q_{(0, 1)}$ becomes $0.449307$.
Recall our target value of $0.458\overline{3}$.
This answer has a $-1.97\%$ error.</p>
<p>$$
\begin{array}{r c l}
\text{error} &amp;=&amp; \frac{0.449307 - 0.458333}{0.458333} \times 100 \\\
&amp;=&amp; \frac{-0.009026}{0.458333} \times 100 \\\
&amp;=&amp; -0.019693 \times 100 \\\
&amp;=&amp; -1.9693\%
\end{array}
$$</p>
<p>Also, the test now completes without error.</p>
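<p>The arithmetic behind that number can be reproduced with the standard library alone. The sketch below assumes the tree counts from the text (75 spanning trees in total, 40 of them through $(0, 1)$) and $z_e = 5/12$, which is $0.41\overline{6}$. The variable names are mine.</p>

```python
import math

EPSILON = 0.2
z_e = 5 / 12             # 0.41666..., the z value for edge (0, 1)
trees_with_e = 40        # first cofactor of the contracted Laplacian
trees_total = 75         # first cofactor of the Laplacian of G
trees_without_e = trees_total - trees_with_e

q_e = trees_with_e / trees_total
target = (1 + EPSILON / 2) * z_e
delta = math.log(q_e * (1 - target) / ((1 - q_e) * target))

# Adding delta to gamma_e multiplies the weight of every tree through e by
# exp(delta) in G, while the contracted determinant stays at 40.
q_after_adding = trees_with_e / (trees_without_e + trees_with_e * math.exp(delta))
error = (q_after_adding - target) / target * 100

print(f"delta = {delta:.6f}")
print(f"q_e after adding delta = {q_after_adding:.6f}")
print(f"error = {error:.4f} percent")
```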
<h2 id="update-28-july-2021">Update! (28 July 2021)<a class="headerlink" href="#update-28-july-2021" title="Link to this heading">#</a></h2>
<p>Further research and discussion with my mentors revealed just how flawed my original analysis was.
In the next step, sampling the spanning trees, adding anything to $\gamma$ would directly increase the probability that the edge would be sampled.
That being said, the original problem that I found was still an issue.</p>
<p>Going back to the notion that we need a graph in which every spanning tree maps to a spanning tree of the original graph which contains the desired edge, this is still the key idea which lets us use Kirchhoff&rsquo;s Matrix Tree Theorem.
And, contracting the edge will still give a graph in which every spanning tree can be mapped to a corresponding spanning tree which includes $e$.
However, the weights of those spanning trees in $G \backslash \{e\}$ do not quite map between the two graphs.</p>
<p>Recall that we are dealing with a multiplicative weight function, so the final weight of a tree is the product of all the $\lambda$&rsquo;s on its edges.</p>
<p>$$
c(T) = \prod_{f \in T} \lambda_f
$$</p>
<p>The above statement can be expanded into</p>
<p>$$
c(T) = \lambda_1 \times \lambda_2 \times \dots \times \lambda_{n - 1}
$$</p>
<p>with some arbitrary ordering $1, 2, \dots, n - 1$ of the edges of $T$, where $n$ is the number of vertices.
Because the ordering of the edges is arbitrary and multiplication is commutative, we can assume without loss of generality that the desired edge $e$ is the last one in the sequence.</p>
<p>The weight of any spanning tree in $G \backslash \{e\}$ cannot include that last $\lambda$ because that edge does not exist in the contracted graph.
Therefore, in order to convert the weight of a tree in $G \backslash \{e\}$ into the weight of the corresponding tree in $G$, we need to multiply $\lambda_e$ back into the weight of the contracted tree.
So, we can now state that</p>
<p>$$
c(T) = \lambda_e \prod_{f \in T'} \lambda_f\ \forall\ T' \in G \backslash \{e\}
$$</p>
<p>or that for every spanning tree $T'$ in $G \backslash \{e\}$, the cost of the corresponding tree $T \ni e$ in $G$ is the product of the $\lambda$&rsquo;s on the edges of $T'$ times the weight of the desired edge.
Now recall that $q_e(\gamma)$ is</p>
<p>$$
\frac{\sum_{T \ni e} \exp(\gamma(T))}{\sum_{T \in \mathcal{T}} \exp(\gamma(T))}
$$</p>
<p>In particular we are dealing with the numerator of the above fraction and using $\lambda_e = \exp(\gamma_e)$ we can rewrite it as</p>
<p>$$
\sum_{T \ni e} \exp(\gamma(T)) = \sum_{T \ni e} \prod_{f \in T} \lambda_f
$$</p>
<p>Since we now know that we are missing the $\lambda_e$ term, we can add it into the expression.</p>
<p>$$
\sum_{T \ni e} \lambda_e \times \prod_{f \in T, f \not= e} \lambda_f
$$</p>
<p>Using the rules of summation, we can pull the $\lambda_e$ factor out of the summation to get</p>
<p>$$
\lambda_e \times \sum_{T \ni e} \prod_{f \in T, f \not= e} \lambda_f
$$</p>
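<p>For a graph this small the factoring can be checked by brute force, which also gives an oracle that does not depend on any determinant. The sketch below is my own: it enumerates every spanning tree of the example graph (the edge list is read off the Laplacian for $G$), and uses $\lambda_{(0, 1)} = 77/104$, which is the exact value of $\exp(-\delta) \approx 0.74$ for the first update.</p>

```python
import math
from fractions import Fraction
from itertools import combinations

EDGES = [(0, 1), (0, 2), (0, 5), (1, 2), (1, 4), (2, 3), (3, 4), (3, 5), (4, 5)]
N = 6


def is_spanning_tree(edges):
    """N - 1 edges form a spanning tree exactly when they contain no cycle."""
    parent = list(range(N))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


e = (0, 1)
lam = {f: Fraction(1) for f in EDGES}
lam[e] = Fraction(77, 104)  # exp(-delta) for the first update of the example

trees = [t for t in combinations(EDGES, N - 1) if is_spanning_tree(t)]
with_e = [t for t in trees if e in t]

total = sum(math.prod(lam[f] for f in t) for t in trees)
numerator = sum(math.prod(lam[f] for f in t) for t in with_e)
# Same sum with lambda_e pulled out in front of the summation.
factored = lam[e] * sum(math.prod(lam[f] for f in t if f != e) for t in with_e)

print(len(trees), len(with_e))  # 75 40
print(numerator == factored)    # True
print(numerator / total)        # 11/24
```

<p>The last line is exactly the target $(1 + \epsilon / 2) z_e = 0.458\overline{3}$, but only because the factor of $\lambda_e$ is kept in the numerator.</p>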
<p>And since we know that applying Kirchhoff&rsquo;s Theorem to $G \backslash \{e\}$ will yield everything except the factor of $\lambda_e$, we can just multiply it back manually.
This would let the pseudo code for <code>q</code> become</p>

<div class="highlight">
  <pre>def q
    input: e, the edge of interest

    # Create the laplacian matrices
    write lambda = exp(gamma) into the edges of G
    G_laplace = laplacian(G, lambda)
    G_e = nx.contracted_edge(G, e)
    G_e_laplace = laplacian(G_e, lambda)

    # Delete a row and column from each matrix to make a cofactor matrix
    G_laplace.delete((0, 0))
    G_e_laplace.delete((0, 0))

    # Calculate the determinant of the cofactor matrices
    det_G_laplace = G_laplace.det
    det_G_e_laplace = G_e_laplace.det

    # return q_e
    return lambda_e * det_G_e_laplace / det_G_laplace</pre>
</div>
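<p>A runnable version of that pseudo code, as a sketch only: it is not the NetworkX implementation, it uses exact fractions and a naive determinant instead of a linear algebra library, and the helper names <code>det</code>, <code>cofactor_det</code> and <code>relabel</code> are made up for this example.</p>

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def cofactor_det(n, weighted_edges):
    """Build the weighted Laplacian on n nodes and return its first cofactor."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for (u, v), w in weighted_edges:
        lap[u][v] -= w
        lap[v][u] -= w
        lap[u][u] += w
        lap[v][v] += w
    return det([row[1:] for row in lap[1:]])


def q(e, lam, n):
    """q_e from two determinants, with lambda_e multiplied back in."""
    u, v = e

    def relabel(x):
        x = u if x == v else x        # merge v into u
        return x - 1 if x > v else x  # close the gap left by v

    whole = list(lam.items())
    contracted = [((relabel(a), relabel(b)), w) for (a, b), w in whole if (a, b) != e]
    return lam[e] * cofactor_det(n - 1, contracted) / cofactor_det(n, whole)


EDGES = [(0, 1), (0, 2), (0, 5), (1, 2), (1, 4), (2, 3), (3, 4), (3, 5), (4, 5)]
lam = {f: Fraction(1) for f in EDGES}
print(q((0, 1), lam, 6))         # 8/15, i.e. 40 / 75
lam[(0, 1)] = Fraction(77, 104)  # exact exp(-delta) for the first update
print(q((0, 1), lam, 6))         # 11/24, the target (1 + epsilon / 2) * z_e
```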

<p>Making this small change to <code>q</code> worked very well.
I was able to change back to subtracting $\delta$ as the Asadpour paper does and even added a check to the code so that every time we update a value in $\gamma$ we know that $\delta$ has had the correct effect.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># Check that delta had the desired effect</span>
</span></span><span class="line"><span class="cl"><span class="n">new_q_e</span> <span class="o">=</span> <span class="n">q</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">desired_q_e</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">+</span> <span class="n">EPSILON</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span> <span class="o">*</span> <span class="n">z_e</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="nb">round</span><span class="p">(</span><span class="n">new_q_e</span><span class="p">,</span> <span class="mi">8</span><span class="p">)</span> <span class="o">!=</span> <span class="nb">round</span><span class="p">(</span><span class="n">desired_q_e</span><span class="p">,</span> <span class="mi">8</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">raise</span> <span class="ne">Exception</span></span></span></code></pre>
</div>
<p>And the test passes without fail!</p>
<h2 id="whats-next">What&rsquo;s Next<a class="headerlink" href="#whats-next" title="Link to this heading">#</a></h2>
<p>I technically do not know if this distribution is correct until I can start to sample from it.
I have written the test I have been working with into a proper test but since my oracle is the program itself, the only way it can fail is if I change the function&rsquo;s behavior without knowing it.</p>
<p>So I must press onwards to write <code>sample_spanning_tree</code> and get a better test for both of those functions.</p>
<p>As for the tests of <code>spanning_tree_distribution</code>, I would of course like to add more test cases.
However, if the Held Karp relaxation returns a cycle as an answer, then there will be $n - 1$ path spanning trees and the notion of creating this distribution in the first place is moot, as we have already found a solution to the ATSP.
I really need more truly fractional Held Karp solutions to expand the test of these next two functions.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Mądry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC'21: Pre-Quarter Progress]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_prequarter/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_midterm/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC&#39;21: Mid-Term Progress" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_introduction/?utm_source=atom_feed" rel="related" type="text/html" title="Aitik Gupta joins as a Student Developer under GSoC&#39;21" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2020_final_work_product/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2020 Work Product - Baseline Images Problem" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_5/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 3 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_4/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 2" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_2021_prequarter/</id>
            
            
            <published>2021-07-19T07:32:05+05:30</published>
            <updated>2021-07-19T07:32:05+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Pre-Quarter Progress with Google Summer of Code 2021 project under NumFOCUS: Aitik Gupta</blockquote><p><strong>“<ins>Well? Did you get it working?!</ins>”</strong></p>
<p>Before I answer that question, if you&rsquo;re missing the context, check out my <a href="../gsoc_2021_midterm/">previous blog</a>&rsquo;s last few lines.. promise it won&rsquo;t take you more than 30 seconds to get the whole problem!</p>
<p>With this short writeup, I intend to talk about <em>what</em> we did and <em>why</em> we did, what we did. XD</p>
<h2 id="ostrich-algorithm">Ostrich Algorithm<a class="headerlink" href="#ostrich-algorithm" title="Link to this heading">#</a></h2>
<p>Ring any bells? Remember OS (Operating Systems)? It&rsquo;s one of the core CS subjects which I bunked then and regret now. (╥﹏╥)</p>
<p>The <a href="https://en.wikipedia.org/wiki/Ostrich_algorithm">Wikipedia page</a> has a 2-liner explanation if you have no idea what an Ostrich Algorithm is.. but I know most of y&rsquo;all won&rsquo;t bother clicking it XD, so here goes:</p>
<blockquote>
<p>Ostrich algorithm is a strategy of ignoring potential problems by &ldquo;sticking one&rsquo;s head in the sand and pretending there is no problem&rdquo;</p>
</blockquote>
<p>An important thing to note: it is used when it is more <strong>cost-effective</strong> to <em>allow the problem to occur than to attempt its prevention</em>.</p>
<p>As you might&rsquo;ve guessed by now, we ultimately ended up with the <em>not-so-clean</em> API (more on this later).</p>
<h2 id="what-was-the-problem">What was the problem?<a class="headerlink" href="#what-was-the-problem" title="Link to this heading">#</a></h2>
<p>The highest level overview of the problem was:</p>

<div class="highlight">
  <pre>❌ fontTools -&gt; buffer -&gt; ttconv_with_buffer
✅ fontTools -&gt; buffer -&gt; tempfile -&gt; ttconv_with_file</pre>
</div>

<p>The first approach created corrupted outputs, whereas the second approach worked fine. A point to note here is that the first approach is better in terms of separating <em>reading</em> the file from <em>parsing</em> the data.</p>
<ol>
<li><a href="https://github.com/fonttools/fonttools">fontTools</a> handles the Type42 subsetting for us, whereas <a href="https://github.com/matplotlib/matplotlib/tree/master/extern/ttconv">ttconv</a> handles the embedding.</li>
<li><code>ttconv_with_buffer</code> is a modification of the original <code>ttconv_with_file</code> that allows it to take a file buffer as input instead of a file path.</li>
</ol>
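<p>For anyone curious what the second (working) route looks like in general, here is a minimal sketch using only the standard library. It is not the Matplotlib code: <code>convert_from_path</code> is a hypothetical stand-in for any converter that only accepts a file path, and the bytes are fake.</p>

```python
import io
import os
import tempfile


def convert_from_path(path):
    """Stand-in for a converter that only accepts a file path (hypothetical)."""
    with open(path, "rb") as handle:
        return len(handle.read())


def convert_buffer_via_tempfile(buffer):
    """Spill an in-memory buffer to a temporary file, then hand over its path."""
    tmp = tempfile.NamedTemporaryFile(suffix=".ttf", delete=False)
    try:
        tmp.write(buffer.getvalue())
        tmp.close()  # close first so the converter can reopen it on any platform
        return convert_from_path(tmp.name)
    finally:
        tmp.close()
        os.unlink(tmp.name)


subset = io.BytesIO(b"\x00\x01\x00\x00 fake font bytes")
print(convert_buffer_via_tempfile(subset))
```

<p>The extra trip through the disk is the price of leaving the path-based converter untouched.</p>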
<p>You might be tempted to say:</p>
<blockquote>
<p>&ldquo;Well, <code>ttconv_with_buffer</code> must be wrongly modified, duh.&rdquo;</p>
</blockquote>
<p>Logically, yes. <code>ttconv</code> was designed to work with a file-path and not a file-object (buffer), and modifying a codebase <strong>written in 1998</strong> turned out to be a larger pain than we anticipated.</p>
<h4 id="it-came-to-a-point-where-one-of-my-mentors-decided-to-implement-everything-in-python">It came to a point where one of my mentors decided to implement everything in Python!<a class="headerlink" href="#it-came-to-a-point-where-one-of-my-mentors-decided-to-implement-everything-in-python" title="Link to this heading">#</a></h4>
<p>He even did, but <ins>the effort</ins> needed to get it to production (or to fix the <code>ttconv</code> embedding) was ⋙ the effort of just getting on with the second method. That damn ostrich really helped us get out of that debugging hell. 🙃</p>
<h2 id="font-fallback---initial-steps">Font Fallback - initial steps<a class="headerlink" href="#font-fallback---initial-steps" title="Link to this heading">#</a></h2>
<p>Finally, we&rsquo;re onto the second subgoal for the summer: <a href="https://www.w3schools.com/css/css_font_fallbacks.asp">Font Fallback</a>!</p>
<p>To give an idea about how things work right now:</p>
<ol>
<li>User asks Matplotlib to use certain font families, specified by:</li>
</ol>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">matplotlib</span><span class="o">.</span><span class="n">rcParams</span><span class="p">[</span><span class="s2">&#34;font.family&#34;</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;list&#34;</span><span class="p">,</span> <span class="s2">&#34;of&#34;</span><span class="p">,</span> <span class="s2">&#34;font&#34;</span><span class="p">,</span> <span class="s2">&#34;families&#34;</span><span class="p">]</span></span></span></code></pre>
</div>
<ol start="2">
<li>This list is used to search for available fonts on a user&rsquo;s system.</li>
<li>However, in current (and previous) versions of Matplotlib:
<blockquote>
<p><ins>As soon as a font is found by iterating the font-family, <strong>all text</strong> is rendered by that <em>and only that</em> font.</ins></p>
</blockquote>
</li>
</ol>
<p>You can immediately see the problem with this approach: using the same font for every character means that any glyph which isn&rsquo;t present in that font will not be rendered, and a small rectangle called &ldquo;tofu&rdquo; is spat out instead (read the first line <a href="https://www.google.com/get/noto/">here</a>).</p>
<p>And that is exactly the first milestone! That is, parsing the <em><ins>entire list</ins></em> of font families to get an intermediate representation of a multi-font interface.</p>
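<p>To make the difference concrete, here is a toy sketch in plain Python. Nothing in it is Matplotlib API: the font names and the <code>FONT_COVERAGE</code> table are invented, and real fonts expose their coverage through their character maps rather than a dictionary like this.</p>

```python
# Hypothetical coverage tables: font name -> set of characters it has glyphs for.
FONT_COVERAGE = {
    "Latin Sans": set("abcdefghijklmnopqrstuvwxyz "),
    "Greek Sans": set("αβγδ"),
}


def font_first(text, families):
    """Current behaviour: the first family found renders every character."""
    font = families[0]
    return [(ch, font if ch in FONT_COVERAGE[font] else "tofu") for ch in text]


def text_first(text, families):
    """Fallback: each character takes the first family that has a glyph for it."""
    result = []
    for ch in text:
        font = next((f for f in families if ch in FONT_COVERAGE[f]), "tofu")
        result.append((ch, font))
    return result


families = ["Latin Sans", "Greek Sans"]
print(font_first("a β", families))  # β ends up as tofu
print(text_first("a β", families))  # β falls back to "Greek Sans"
```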
<h2 id="dont-break-a-lot-at-stake">Don&rsquo;t break, a lot at stake!<a class="headerlink" href="#dont-break-a-lot-at-stake" title="Link to this heading">#</a></h2>
<p>Imagine if you had the superpower to change the Python standard library&rsquo;s internal functions, <em>without</em> consulting anybody. Let&rsquo;s say you wanted to write a solution by hooking in and changing the implementation of <code>str(&quot;dumb&quot;)</code> so that it returns:</p>

<div class="highlight">
  <pre>&gt;&gt;&gt; str(&#34;dumb&#34;)
[&#34;d&#34;, &#34;u&#34;, &#34;m&#34;, &#34;b&#34;]</pre>
</div>

<p>Pretty &ldquo;<ins>dumb</ins>&rdquo;, right? xD</p>
<p>For your use case it might work fine, but it would also mean breaking the <em>entire</em> Python userbase&rsquo;s workflow, not to mention the 1000000+ libraries that depend on the original functionality.</p>
<p>On a similar note, Matplotlib has a public API known as <code>findfont(prop: str)</code>, which when given a string (or <a href="https://matplotlib.org/stable/api/font_manager_api.html#matplotlib.font_manager.FontProperties">FontProperties</a>) finds you a font that best matches the given properties in your system.</p>
<p>It is used <ins>throughout the library</ins>, as well as at multiple other places, including downstream libraries. Being naive as I was, I changed this function signature and submitted the <a href="https://github.com/matplotlib/matplotlib/pull/20496">PR</a>. 🥲</p>
<p>Had an insightful discussion about this with my mentors, and soon enough raised the <a href="https://github.com/matplotlib/matplotlib/pull/20549">other PR</a>, which didn&rsquo;t touch the <code>findfont</code> API at all.</p>
<hr>
<p>One last thing to note: Even if we do complete the first milestone, we wouldn&rsquo;t be done yet, since this is just parsing the entire list to get multiple fonts..</p>
<p>We still need to migrate the library&rsquo;s internal implementation from <strong>font-first</strong> to <strong>text-first</strong>!</p>
<p>But that&rsquo;s for later, for now:
<img src="https://user-images.githubusercontent.com/43996118/126441988-5a2067fd-055e-44e5-86e9-4dddf47abc9d.png" alt="Bernie Sanders with text that read ‘I am once again thanking you for reading.’"></p>
<h4 id="note-this-blog-post-is-also-available-at-my-personal-website">NOTE: This blog post is also available at my <a href="https://aitikgupta.github.io/gsoc-pre-quarter/">personal website</a>.<a class="headerlink" href="#note-this-blog-post-is-also-available-at-my-personal-website" title="Link to this heading">#</a></h4>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Entropy Distribution Setup]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/</id>
            
            
            <published>2021-07-13T00:00:00+00:00</published>
            <updated>2021-07-13T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Preliminaries for the entropy distribution over spanning trees</blockquote><p>Finally moving on from the Held Karp relaxation, we arrive at the second step of the Asadpour asymmetric traveling salesman problem algorithm.
Referencing Algorithm 1 from the Asadpour paper, we are now <em>finally</em> on step two.</p>
<blockquote>
<hr>
<p><strong>Algorithm 1</strong> An $O(\log n / \log \log n)$-approximation algorithm for the ATSP</p>
<hr>
<p><strong>Input:</strong> A set $V$ consisting of $n$ points and a cost function $c\ :\ V \times V \rightarrow \mathbb{R}^+$ satisfying the triangle inequality.</p>
<p><strong>Output:</strong> $O(\log n / \log \log n)$-approximation of the asymmetric traveling salesman problem instance described by $V$ and $c$.</p>
<ol>
<li>Solve the Held-Karp LP relaxation of the ATSP instance to get an optimum extreme point solution $x^*$.
Define $z^*$ as in (5), making it a symmetrized and scaled down version of $x^*$.
Vector $z^*$ can be viewed as a point in the spanning tree polytope of the undirected graph on the support of $x^*$ that one obtains after disregarding the directions of arcs (See Section 3.)</li>
<li>Let $E$ be the support graph of $z^*$ when the direction of the arcs are disregarded.
Find weights $\{\tilde{\gamma}_e\}_{e \in E}$ such that the exponential distribution on the spanning trees, $\tilde{p}(T) \propto \exp(\sum_{e \in T} \tilde{\gamma}_e)$ (approximately) preserves the marginals imposed by $z^*$, i.e. for any edge $e \in E$,
$$\sum_{T \in \mathcal{T} : T \ni e} \tilde{p}(T) \leq (1 + \epsilon) z^*_e$$
for a small enough value of $\epsilon$.
(In this paper we show that $\epsilon = 0.2$ suffices for our purpose. See Section 7 and 8 for a description of how to compute such a distribution.)</li>
<li>Sample $2\lceil \log n \rceil$ spanning trees $T_1, \dots, T_{2\lceil \log n \rceil}$ from $\tilde{p}(.)$.
For each of these trees, orient all its edges so as to minimize its cost with respect to our (asymmetric) cost function $c$.
Let $T^*$ be the tree whose resulting cost is minimal among all of the sampled trees.</li>
<li>Find a minimum cost integral circulation that contains the oriented tree $\vec{T}^*$.
Shortcut this circulation to a tour and output it. (See Section 4.)</li>
</ol>
<hr>
</blockquote>
<p>Sections 7 and 8 provide two different methods to find the desired probability distribution, with section 7 using a combinatorial approach and section 8 the ellipsoid method.
Considering that there is no ellipsoid solver in the scientific python ecosystem, and my mentors and I have already decided not to implement one within this project, I will be using the method in section 7.</p>
<p>The algorithm given in section 7 is as follows:</p>
<blockquote>
<ol>
<li>Set $\gamma = \vec{0}$.</li>
<li>While there exists an edge $e$ with $q_e(\gamma) &gt; (1 + \epsilon) z_e$:
<ul>
<li>Compute $\delta$ such that if we define $\gamma'$ as $\gamma_e' = \gamma_e - \delta$, and $\gamma_f' = \gamma_f$ for all $f \in E\ \backslash \{e\}$, then $q_e(\gamma') = (1 + \epsilon/2)z_e$.</li>
<li>Set $\gamma \leftarrow \gamma'$.</li>
</ul>
</li>
<li>Output $\tilde{\gamma} := \gamma$.</li>
</ol>
</blockquote>
<p>This structure is fairly straightforward, but we need to know what $q_e(\gamma)$ is and how to calculate $\delta$.</p>
<p>Finding $\delta$ is very easy, as the formula is given in the Asadpour paper.
(I did not realize this when I wrote my GSoC proposal, so I re-derived the equation for $\delta$ myself. Fortunately, my formula matches the one in the paper.)</p>
<p>$$
\delta = \ln \frac{q_e(\gamma)(1 - (1 + \epsilon / 2)z_e)}{(1 - q_e(\gamma))(1 + \epsilon / 2) z_e}
$$</p>
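<p>As a sketch of where this formula comes from (my own reconstruction, not quoted from the paper), split the spanning trees into those which contain $e$ and those which do not. Writing $A = \sum_{T \ni e} \exp(\gamma(T))$ and $B = \sum_{T \not\ni e} \exp(\gamma(T))$, two symbols introduced only for this aside, subtracting $\delta$ from $\gamma_e$ scales $A$ by $\exp(-\delta)$ and leaves $B$ alone.</p>
<p>$$
\begin{array}{r c l}
q_e(\gamma) = \displaystyle\frac{A}{A + B} &amp;\Rightarrow&amp; \displaystyle\frac{A}{B} = \frac{q_e(\gamma)}{1 - q_e(\gamma)} \\\
\displaystyle\frac{\exp(-\delta) A}{\exp(-\delta) A + B} &amp;=&amp; (1 + \epsilon / 2) z_e \\\
\exp(\delta) &amp;=&amp; \displaystyle\frac{A}{B} \cdot \frac{1 - (1 + \epsilon / 2) z_e}{(1 + \epsilon / 2) z_e}
\end{array}
$$</p>
<p>Substituting the first row into the last and taking the logarithm gives the expression for $\delta$ above.</p>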
<p>Notice that the formula for $\delta$ is reliant on $q_e(\gamma)$.
The paper defines $q_e(\gamma)$ as</p>
<p>$$
q_e(\gamma) = \frac{\sum_{T \ni e} \exp(\gamma(T))}{\sum_{T \in \mathcal{T}} \exp(\gamma(T))}
$$</p>
<p>where $\gamma(T) = \sum_{f \in T} \gamma_f$.</p>
<p>The first thing that I noticed is that in the denominator the summation is over all spanning trees in the graph. For the complete graphs we will be working with, the number of spanning trees is exponential in the number of vertices, so a &lsquo;brute force&rsquo; approach here is useless.
Fortunately, Asadpour and team realized we can use Kirchhoff&rsquo;s matrix tree theorem to our advantage.</p>
<p>As an aside about Kirchhoff&rsquo;s matrix tree theorem, I was not familiar with this theorem before this project so I had to do a bit of reading about it.
Basically, if you have a Laplacian matrix (the degree matrix minus the adjacency matrix), the absolute value of any cofactor is the number of spanning trees in the graph.
This was something completely unexpected to me, and I think that it is very cool that this type of connection exists.</p>
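<p>Here is a small illustration of the theorem in plain Python. It is a sketch for tiny graphs only: the determinant is a naive Laplace expansion, the Laplacian is built as degree minus adjacency so that the cofactor comes out positive, and the function names are my own.</p>

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def count_spanning_trees(n, edges):
    """Matrix tree theorem: a cofactor of the Laplacian counts the spanning trees."""
    lap = [[0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    # Delete row 0 and column 0, then take the determinant.
    return det([row[1:] for row in lap[1:]])


k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(count_spanning_trees(4, k4))      # 16, matching Cayley's formula 4 ** (4 - 2)
print(count_spanning_trees(4, cycle4))  # 4, one tree per removed edge
```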
<p>The details of using Kirchhoff&rsquo;s theorem are given in section 5.3.
We will be using a weighted laplacian $L$ defined by</p>
<p>$$
L_{i, j} = \left\{
\begin{array}{l l}
-\lambda_e &amp; e = (i, j) \in E \\\
\sum_{e \in \delta(\{i\})} \lambda_e &amp; i = j \\\
0 &amp; \text{otherwise}
\end{array}
\right.
$$</p>
<p>where $\lambda_e = \exp(\gamma_e)$.</p>
<p>Now, we know that applying Kirchhoff&rsquo;s theorem to $L$ will return</p>
<p>$$
\sum_{T \in \mathcal{T}} \prod_{e \in T} \lambda_e
$$</p>
<p>but which part of $q_e(\gamma)$ is that?</p>
<p>If we apply $\lambda_e = \exp(\gamma_e)$, we find that</p>
<p>$$
\begin{array}{r c l}
\sum_{T \in \mathcal{T}} \prod_{e \in T} \lambda_e &amp;=&amp; \sum_{T \in \mathcal{T}} \prod_{e \in T} \exp(\gamma_e) \\\
&amp;=&amp; \sum_{T \in \mathcal{T}} \exp\left(\sum_{e \in T} \gamma_e\right) \\\
&amp;=&amp; \sum_{T \in \mathcal{T}} \exp(\gamma(T))
\end{array}
$$</p>
<p>Moving from the first row to the second row can be a confusing step, but essentially we are exploiting the properties of exponents.
Recall that $\exp(x) = e^x$, so we could have written the product as $\prod_{e \in T} e^{\gamma_e}$, but this introduces ambiguity as we would have multiple meanings of $e$.
Now, labeling the edges of the spanning tree $T$ as $e_1, e_2, \dots, e_{n-1}$, that product can be expanded as</p>
<p>$$
\prod_{e \in T} e^{\gamma_e} = e^{\gamma_{e_1}} \times e^{\gamma_{e_2}} \times \dots \times e^{\gamma_{e_{n-1}}}
$$</p>
<p>Each exponential factor has the same base, so we can collapse that into</p>
<p>$$
e^{\gamma_{e_1} + \gamma_{e_2} + \dots + \gamma_{e_{n-1}}}
$$</p>
<p>which is also</p>
<p>$$
e^{\sum_{e \in T} \gamma_e}
$$</p>
<p>but we know that $\sum_{e \in T} \gamma_e$ is $\gamma(T)$, so it becomes</p>
<p>$$
e^{\gamma(T)} = \exp(\gamma(T))
$$</p>
<p>Once we put that back into the summation we arrive at the denominator in $q_e(\gamma)$, $\sum_{T \in \mathcal{T}} \exp(\gamma(T))$.</p>
<p>Next, we need to find the numerator for $q_e(\gamma)$.
Just as before, a &lsquo;brute force&rsquo; approach would be exponential in complexity, so we have to find a better way.
Well, the only difference between the numerator and denominator is the condition on the outer summation, with $T \in \mathcal{T}$ being changed to $T \ni e$, or every tree containing edge $e$.</p>
<p>There is a way to use Kirchhoff&rsquo;s matrix tree theorem here as well.
Suppose we had a graph in which every spanning tree could be mapped in a one-to-one fashion onto a spanning tree in the original graph which contains the desired edge $e$; we could then apply the theorem to that graph instead.
In order for a spanning tree to contain edge $e$, we know that the endpoints of $e$, $(u, v)$, will be directly connected to each other.
So we are then interested in every spanning tree in which we reach vertex $u$ and then leave from vertex $v$.
(As opposed to the spanning trees where we reach vertex $u$ and then leave from that same vertex).
In a sense, we are treating vertices $u$ and $v$ as the same vertex.
We can apply this literally by <em>contracting</em> $e$ from the graph, creating $G / \{e\}$.
Every spanning tree in $G / \{e\}$ can be uniquely mapped onto a spanning tree in $G$ which contains the edge $e$.</p>
<p>From here, the logic to show that a cofactor from $L$ is actually the numerator of $q_e(\gamma)$ parallels the logic for the denominator.</p>
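<p>The one-to-one mapping can be sanity checked by counting. The sketch below is a brute force check of my own, not part of the planned implementation: the graph (a triangular prism) is a made-up example, and spanning trees are found by testing every subset of $n - 1$ edges for cycles. Edges are addressed by position so that the parallel edges created by the contraction are counted separately.</p>

```python
from itertools import combinations


def count_spanning_trees(n, edges, must_contain=None):
    """Brute force: count acyclic (n - 1)-subsets, addressed by edge position."""
    count = 0
    for subset in combinations(range(len(edges)), n - 1):
        if must_contain is not None and must_contain not in subset:
            continue
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        for i in subset:
            ru, rv = find(edges[i][0]), find(edges[i][1])
            if ru == rv:
                break  # this edge would close a cycle
            parent[ru] = rv
        else:
            count += 1
    return count


edges = [(0, 1), (0, 2), (0, 5), (1, 2), (1, 4), (2, 3), (3, 4), (3, 5), (4, 5)]
u, v = edges[0]  # contract e = (0, 1)


def merged(x):
    x = u if x == v else x        # merge v into u
    return x - 1 if x > v else x  # close the gap left by v


contracted = [(merged(a), merged(b)) for a, b in edges[1:]]

print(count_spanning_trees(6, edges, must_contain=0))  # trees of G through e
print(count_spanning_trees(5, contracted))             # trees of G / {e}
```

<p>Both counts agree, which is what the mapping between the two sets of trees requires.</p>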
<p>At this point, we have all of the needed information to create some pseudo code for the next function in the Asadpour method, <code>spanning_tree_distribution()</code>.
Here I will use an inner function <code>q()</code> to find $q_e$.</p>

<div class="highlight">
  <pre>def spanning_tree_distribution
    input: z, the symmetrized and scaled output of the Held Karp relaxation.
    output: gamma, the maximum entropy exponential distribution for sampling spanning trees
           from the graph.

    def q
        input: e, the edge of interest

        # Create the laplacian matrices
        write lambda = exp(gamma) into the edges of G
        G_laplace = laplacian(G, lambda)
        G_e = nx.contracted_edge(G, e)
        G_e_laplace = laplacian(G_e, lambda)

        # Delete a row and column from each matrix to make a cofactor matrix
        G_laplace.delete((0, 0))
        G_e_laplace.delete((0, 0))

        # Calculate the determinant of the cofactor matrices
        det_G_laplace = G_laplace.det
        det_G_e_laplace = G_e_laplace.det

        # return q_e
        return det_G_e_laplace / det_G_laplace

    # initialize the gamma vector
    gamma = 0 vector of length G.size

    while true
        # We will iterate over the edges in z until we complete the
        # for loop without changing a value in gamma. This will mean
        # that there is not an edge with q_e &gt; 1.2 * z_e
        valid_count = 0
        # Search for an edge with q_e &gt; 1.2 * z_e
        for e in z
            q_e = q(e)
            z_e = z[e]
            if q_e &gt; 1.2 * z_e
                delta = ln(q_e * (1 - 1.1 * z_e) / ((1 - q_e) * 1.1 * z_e))
                gamma[e] -= delta
            else
                valid_count &#43;= 1
        if valid_count == number of edges in z
            break

    return gamma</pre>
</div>
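<p>To see the loop run end to end, here is a toy version in plain Python. It is a sketch under loud assumptions: <code>q</code> is computed straight from its definition by enumerating every spanning tree instead of using Kirchhoff&rsquo;s theorem, so it only works on tiny graphs, and the triangle and its $z$ values are made-up test data chosen so that the marginals sum to $n - 1$.</p>

```python
import math
from itertools import combinations

EPSILON = 0.2


def spanning_trees(nodes, edges):
    """Yield every spanning tree as a tuple of edges (brute force, tiny graphs only)."""
    for candidate in combinations(edges, len(nodes) - 1):
        parent = {x: x for x in nodes}

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for u, v in candidate:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            yield candidate


def spanning_tree_distribution(nodes, z):
    """Follow the section 7 loop, using the definition of q_e directly."""
    trees = list(spanning_trees(nodes, list(z)))
    gamma = {e: 0.0 for e in z}

    def q(e):
        weight = {t: math.exp(sum(gamma[f] for f in t)) for t in trees}
        return sum(w for t, w in weight.items() if e in t) / sum(weight.values())

    changed = True
    while changed:
        changed = False
        for e, z_e in z.items():
            q_e = q(e)
            if q_e > (1 + EPSILON) * z_e:
                target = (1 + EPSILON / 2) * z_e
                gamma[e] -= math.log(q_e * (1 - target) / ((1 - q_e) * target))
                changed = True
    return gamma


# A triangle with marginals that sum to n - 1 = 2 (made-up test data).
z = {(0, 1): 0.5, (0, 2): 0.7, (1, 2): 0.8}
gamma = spanning_tree_distribution([0, 1, 2], z)
print(gamma)
```

<p>On this input only $\gamma_{(0, 1)}$ moves, and a single update is enough for every edge to satisfy $q_e \leq 1.2 z_e$.</p>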

<h2 id="next-steps">Next Steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>The clear next step is to implement the function <code>spanning_tree_distribution</code> using the pseudo code above as an outline.
I will start by writing <code>q</code> and testing it with the same graphs which I am using to test the Held Karp relaxation.
Once <code>q</code> is complete, the rest of the function seems fairly straightforward.</p>
<p>One thing that I am concerned about is my ability to test <code>spanning_tree_distribution</code>.
There are no examples given in the Asadpour research paper and no other easy resources which I could turn to in order to find an oracle.</p>
<p>The only method that I can think of right now would be to complete this function, then complete <code>sample_spanning_tree</code>.
Once both functions are complete, I can sample a large number of spanning trees to find an experimental probability for each tree, then run a statistical test (such as an h-test) to see if the probability of each tree is proportional to $\exp(\gamma(T))$, which is the desired distribution.
An alternative test would be to use the marginals in the distribution and manually check that</p>
<p>$$
\sum_{T \in \mathcal{T} : T \ni e} p(T) \leq (1 + \epsilon) z^*_e,\ \forall\ e \in E
$$</p>
<p>where $p(T)$ is the experimental data from the sampled trees.</p>
<p>Both methods seem very computationally intensive and because they are sampling from a probability distribution they may fail randomly due to an unlikely sample.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Mądry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Finalizing the Held-Karp Relaxation]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/</id>
            
            
            <published>2021-07-07T00:00:00+00:00</published>
            <updated>2021-07-07T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Picking which method to use for the final implementation of the Asadpour algorithm in NetworkX</blockquote><p>This <em>should</em> be my final post about the Held-Karp relaxation!
Since my last post titled <a href="../implementing-the-held-karp-relaxation">Implementing The Held Karp Relaxation</a>, I have been testing both the ascent method as well as the branch and bound method.</p>
<p>My first test was to use a truly asymmetric graph rather than a directed graph where the cost in each direction happened to be the same.
In order to create such a test, I needed to know the solution to any such proposed graphs.
I wrote a Python script called <code>brute_force_optimal_tour.py</code> which generates a random graph, prints its adjacency matrix and then checks every possible combination of edges to find the optimal tour.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">combinations</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">math</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">random</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">is_1_arborescence</span><span class="p">(</span><span class="n">G</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns true if `G` is a 1-arborescence
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">G</span><span class="o">.</span><span class="n">number_of_edges</span><span class="p">()</span> <span class="o">==</span> <span class="n">G</span><span class="o">.</span><span class="n">order</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="ow">and</span> <span class="nb">max</span><span class="p">(</span><span class="n">d</span> <span class="k">for</span> <span class="n">n</span><span class="p">,</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">in_degree</span><span class="p">())</span> <span class="o">&lt;=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">        <span class="ow">and</span> <span class="n">nx</span><span class="o">.</span><span class="n">is_weakly_connected</span><span class="p">(</span><span class="n">G</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Generate a random adjacency matrix</span>
</span></span><span class="line"><span class="cl"><span class="n">size</span> <span class="o">=</span> <span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">7</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">G_array</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">empty</span><span class="p">(</span><span class="n">size</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">int</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">size</span><span class="p">[</span><span class="mi">0</span><span class="p">]):</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]):</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">r</span> <span class="o">==</span> <span class="n">c</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">G_array</span><span class="p">[</span><span class="n">r</span><span class="p">][</span><span class="n">c</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">        <span class="n">G_array</span><span class="p">[</span><span class="n">r</span><span class="p">][</span><span class="n">c</span><span class="p">]</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Print that adjacency matrix</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">G_array</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">from_numpy_array</span><span class="p">(</span><span class="n">G_array</span><span class="p">,</span> <span class="n">create_using</span><span class="o">=</span><span class="n">nx</span><span class="o">.</span><span class="n">DiGraph</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">num_nodes</span> <span class="o">=</span> <span class="n">G</span><span class="o">.</span><span class="n">order</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">combo_count</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="n">min_weight_tour</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl"><span class="n">min_tour_weight</span> <span class="o">=</span> <span class="n">math</span><span class="o">.</span><span class="n">inf</span>
</span></span><span class="line"><span class="cl"><span class="n">test_combo</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">DiGraph</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">combo</span> <span class="ow">in</span> <span class="n">combinations</span><span class="p">(</span><span class="n">G</span><span class="o">.</span><span class="n">edges</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">),</span> <span class="n">G</span><span class="o">.</span><span class="n">order</span><span class="p">()):</span>
</span></span><span class="line"><span class="cl">    <span class="n">combo_count</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">    <span class="n">test_combo</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">test_combo</span><span class="o">.</span><span class="n">add_weighted_edges_from</span><span class="p">(</span><span class="n">combo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Test to see if test_combo is a tour.</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># This means first that it is an 1-arborescence</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="ow">not</span> <span class="n">is_1_arborescence</span><span class="p">(</span><span class="n">test_combo</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="k">continue</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># It also means that every vertex has a degree of 2</span>
</span></span><span class="line"><span class="cl">    <span class="n">arborescence_weight</span> <span class="o">=</span> <span class="n">test_combo</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="s2">&#34;weight&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="nb">len</span><span class="p">([</span><span class="n">n</span> <span class="k">for</span> <span class="n">n</span><span class="p">,</span> <span class="n">deg</span> <span class="ow">in</span> <span class="n">test_combo</span><span class="o">.</span><span class="n">degree</span> <span class="k">if</span> <span class="n">deg</span> <span class="o">==</span> <span class="mi">2</span><span class="p">])</span> <span class="o">==</span> <span class="n">num_nodes</span>
</span></span><span class="line"><span class="cl">        <span class="ow">and</span> <span class="n">arborescence_weight</span> <span class="o">&lt;</span> <span class="n">min_tour_weight</span>
</span></span><span class="line"><span class="cl">    <span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># Tour found</span>
</span></span><span class="line"><span class="cl">        <span class="n">min_weight_tour</span> <span class="o">=</span> <span class="n">test_combo</span><span class="o">.</span><span class="n">copy</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="n">min_tour_weight</span> <span class="o">=</span> <span class="n">arborescence_weight</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="sa">f</span><span class="s2">&#34;Minimum tour found with weight </span><span class="si">{</span><span class="n">min_tour_weight</span><span class="si">}</span><span class="s2"> from </span><span class="si">{</span><span class="n">combo_count</span><span class="si">}</span><span class="s2"> combinations of edges</span><span class="se">\n</span><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">min_weight_tour</span><span class="o">.</span><span class="n">edges</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;(</span><span class="si">{</span><span class="n">u</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">v</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">d</span><span class="si">}</span><span class="s2">)&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<h2 id="everything-is-cool-with-the-ascent-method">Everything is Cool with the Ascent Method<a class="headerlink" href="#everything-is-cool-with-the-ascent-method" title="Link to this heading">#</a></h2>
<p>This is useful information even though the ascent method returns a vector, because if the ascent method returns this solution (i.e. $f(\pi) = 0$), we can calculate that vector from the edges in the solution without having to explicitly enumerate the dict returned by <code>held_karp_ascent()</code>.</p>
<p>The first output from the program was a six vertex graph and is presented below.</p>

<div class="highlight">
  <pre>~ time python3 brute_force_optimal_tour.py
[[ 0 45 39 92 29 31]
 [72  0  4 12 21 60]
 [81  6  0 98 70 53]
 [49 71 59  0 98 94]
 [74 95 24 43  0 47]
 [56 43  3 65 22  0]]
Minimum tour found with weight 144.0 from 593775 combinations of edges

(0, 5, 31)
(5, 4, 22)
(1, 3, 12)
(3, 0, 49)
(2, 1, 6)
(4, 2, 24)

real	0m9.596s
user	0m9.689s
sys     0m0.241s</pre>
</div>
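<p>As a cross-check on the six vertex result above, here is a minimal sketch which enumerates tours as orderings of the vertices instead of combinations of edges. This is not the script I used; <code>brute_force_tour</code> is a hypothetical helper. Fixing vertex 0 as the start leaves only $5! = 120$ orderings for six vertices, compared to the 593775 edge combinations checked above.</p>

```python
from itertools import permutations


def brute_force_tour(weights):
    """Return (weight, tour) of the cheapest directed Hamiltonian cycle.

    `weights[u][v]` is the cost of the edge from u to v.
    """
    n = len(weights)
    best_weight, best_tour = float("inf"), None
    # Every cycle passes through vertex 0, so fix it as the starting point.
    for order in permutations(range(1, n)):
        tour = (0,) + order
        weight = sum(weights[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if best_weight > weight:
            best_weight, best_tour = weight, tour
    return best_weight, best_tour


# The six vertex adjacency matrix printed above.
six_vertex = [
    [0, 45, 39, 92, 29, 31],
    [72, 0, 4, 12, 21, 60],
    [81, 6, 0, 98, 70, 53],
    [49, 71, 59, 0, 98, 94],
    [74, 95, 24, 43, 0, 47],
    [56, 43, 3, 65, 22, 0],
]
weight, tour = brute_force_tour(six_vertex)
print(weight, tour)
```

<p>The tour it reports, read as a cycle, should use the same six edges as the output above.</p>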

<p>First I checked that the ascent method was returning a solution with the same weight, 144, which it was.
Also, every entry in the vector was $0.8\overline{3}$, which is $\frac{5}{6}$, the scaling factor from the Asadpour paper, so I know that it was finding the exact solution.
Because of this, my test in <code>test_traveling_salesman.py</code> checks that for all edges in the solution edge set both $(u, v)$ and $(v, u)$ are equal to $\frac{5}{6}$.</p>
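<p>A minimal sketch of how that expected vector can be built from the tour edges, assuming the vector is a dict keyed by <code>(u, v)</code> tuples and that the scaling factor is $\frac{n - 1}{n}$ for an $n$ vertex graph. The helper name <code>expected_vector</code> is hypothetical and the exact structure returned by <code>held_karp_ascent()</code> may differ.</p>

```python
import math


def expected_vector(tour_edges, n):
    """Build the expected scaled vector for an exact tour solution.

    Both (u, v) and (v, u) receive the scaling factor (n - 1) / n.
    """
    scale = (n - 1) / n
    z = {}
    for u, v in tour_edges:
        z[(u, v)] = scale
        z[(v, u)] = scale
    return z


# Edges of the optimal six vertex tour found by the brute force script.
tour_edges = [(0, 5), (5, 4), (1, 3), (3, 0), (2, 1), (4, 2)]
z = expected_vector(tour_edges, 6)
print(len(z), all(math.isclose(value, 5 / 6) for value in z.values()))
```

<p>A test can then compare this dict entry by entry against the output of the ascent method, using a tolerance such as <code>math.isclose</code> rather than exact float equality.</p>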
<p>For my next test, I created a $7 \times 7$ matrix to test with, and as expected the Python script took much longer to run.</p>

<div class="highlight">
  <pre>~ time python3 brute_force_optimal_tour.py
[[ 0 26 63 59 69 31 41]
 [62  0 91 53 75 87 47]
 [47 82  0 90 15  9 18]
 [68 19  5  0 58 34 93]
 [11 58 53 55  0 61 79]
 [88 75 13 76 98  0 40]
 [41 61 55 88 46 45  0]]
Minimum tour found with weight 190.0 from 26978328 combinations of edges

(0, 1, 26)
(1, 3, 53)
(3, 2, 5)
(2, 5, 9)
(5, 6, 40)
(4, 0, 11)
(6, 4, 46)

real	7m28.979s
user	7m29.048s
sys     0m0.245s</pre>
</div>

<p>Once again, the value of $f(\pi)$ hit 0, so the ascent method returned an exact solution and my testing procedure was the same as for the six vertex graph.</p>
<h2 id="trouble-with-branch-and-bound">Trouble with Branch and Bound<a class="headerlink" href="#trouble-with-branch-and-bound" title="Link to this heading">#</a></h2>
<p>The branch and bound method was not working well with the two example graphs I generated.
First, on the seven vertex matrix, I programmed the test and let it run&hellip; and run&hellip; and run&hellip; until I stopped it at just over an hour of execution time.
If it took one eighth of that time to brute force the solution, then the branch and bound method truly is not efficient.</p>
<p>I moved to the six vertex graph with high hopes, since I already had a six vertex graph which was correctly executing in a reasonable amount of time.
The six vertex graph created a large number of exceptions and errors when I ran the tests.
I was able to determine why the errors were being generated, but the context did not conform with my expectations for the branch and bound method.</p>
<p>Basically, <code>direction_of_ascent_kilter()</code> was finding a vertex which was out-of-kilter and returning the corresponding direction of ascent, but <code>find_epsilon()</code> was not finding any valid cross over edges and returning a maximum direction of travel of $\infty$.
While I could change the default value for the return value of <code>find_epsilon()</code> to zero, that would not solve the problem because the value of the vector $\pi$ would get stuck and the program would enter an infinite loop.</p>
<p>I do have an analogy for this situation.
Imagine that you are in an unfamiliar city and you have to meet somebody at the tallest building in that city.
However, you don&rsquo;t know the address and have no way to get a GPS route to that building.
Instead of wandering around aimlessly, you decide to scan the skyline for the tallest building you can see and start walking down the street which is the closest to matching that direction.
Additionally, you have the ability to tell, for any given direction, how far down the chosen street to go before you need to re-evaluate and pick a new street.</p>
<p>This hypothetical is a better approximation of the ascent method, but the problem here can be demonstrated nonetheless.</p>
<ul>
<li>Determining if you are at the tallest building is running the linear program to see if the direction of ascent still exists.</li>
<li>Picking the street to go down is the same as finding the direction of ascent.</li>
<li>Finding out how far to go down that street is the same as finding epsilon.</li>
</ul>
<p>After this procedure works for a while, you suddenly find yourself in an unusual situation.
You can still see the tallest building, so you know you are not there yet.
You know what street will take you closer to the building, but for some reason you cannot move down that street.</p>
<p>From my understanding of the ascent and branch and bound methods, if the direction of ascent exists, then we have to be able to move some amount in that direction without fail, but the branch and bound method was failing to provide an adequate distance to move.</p>
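<p>The three steps of the analogy can be sketched as a generic loop. This is only a toy, one dimensional illustration with hypothetical names (<code>ascend</code>, <code>toward_peak</code>, <code>distance_to_peak</code>), not the NetworkX implementation. It shows why a step size of zero or $\infty$ has to be treated as an error rather than silently accepted, since otherwise the loop either stalls or jumps nowhere useful.</p>

```python
import math


def ascend(x, find_direction, find_step, max_iterations=1000):
    """Repeatedly move along a direction of ascent until none exists."""
    for _ in range(max_iterations):
        # "Can I still see a taller building?"
        direction = find_direction(x)
        if direction is None:
            return x
        # "How far do I walk down this street?"
        step = find_step(x, direction)
        if not (step > 0 and math.isfinite(step)):
            raise RuntimeError(f"direction {direction} exists but step is {step}")
        x = x + step * direction
    raise RuntimeError("no convergence within max_iterations")


PEAK = 3.0


def toward_peak(x):
    """Toy direction finder: point at PEAK, or None once we are there."""
    if x == PEAK:
        return None
    return 1.0 if PEAK > x else -1.0


def distance_to_peak(x, direction):
    """Toy step finder: walk exactly as far as the peak."""
    return abs(PEAK - x)


print(ascend(0.0, toward_peak, distance_to_peak))
```

<p>Swapping <code>distance_to_peak</code> for a function which always returns <code>math.inf</code> reproduces the situation above in miniature: a direction exists, but no usable distance, and the loop raises instead of spinning forever.</p>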
<p>Considering the trouble with the branch and bound method, and that it is not going to be used in the final Asadpour algorithm, I plan on removing it from the NetworkX pull request and moving onwards using only the ascent method for the rest of the Asadpour algorithm.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Mądry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061.</p>
<p>M. Held, R. M. Karp, <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC'21: Mid-Term Progress]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_midterm/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_introduction/?utm_source=atom_feed" rel="related" type="text/html" title="Aitik Gupta joins as a Student Developer under GSoC&#39;21" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2020_final_work_product/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2020 Work Product - Baseline Images Problem" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_5/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 3 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_4/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_3/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 1" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_2021_midterm/</id>
            
            
            <published>2021-07-02T08:32:05+05:30</published>
            <updated>2021-07-02T08:32:05+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Mid-Term Progress with Google Summer of Code 2021 project under NumFOCUS: Aitik Gupta</blockquote><p><strong>&quot;<ins>Aitik, how is your GSoC going?</ins>&quot;</strong></p>
<p>Well, it&rsquo;s been a while since I last wrote. But I wasn&rsquo;t spending time watching <em>Loki</em> either! (that&rsquo;s a lie.)</p>
<p>During this period the project took on some interesting (and stressful) curves, which I intend to talk about in this small writeup.</p>
<h2 id="new-mentor">New Mentor!<a class="headerlink" href="#new-mentor" title="Link to this heading">#</a></h2>
<p>In the first week of the coding period, I met one of my new mentors, <a href="https://github.com/jkseppan">Jouni</a>. Without him, along with <a href="https://github.com/tacaswell">Tom</a> and <a href="https://github.com/anntzer">Antony</a>, the project wouldn&rsquo;t have moved <em>an inch</em>.</p>
<p>It was initially Jouni&rsquo;s <a href="https://github.com/matplotlib/matplotlib/pull/18143">PR</a> which was my starting point of the first milestone in my proposal, <ins>Font Subsetting</ins>.</p>
<h2 id="what-is-font-subsetting-anyway">What is Font Subsetting anyway?<a class="headerlink" href="#what-is-font-subsetting-anyway" title="Link to this heading">#</a></h2>
<p>As was proposed by Tom, a good way to understand something is to document your journey along the way! (well, that&rsquo;s what GSoC wants us to follow anyway right?)</p>
<p>Taking an excerpt from one of the paragraphs I wrote <a href="https://github.com/matplotlib/matplotlib/blob/a94f52121cea4194a5d6f6fc94eafdfb03394628/doc/users/fonts.rst#subsetting">here</a>:</p>
<blockquote>
<p>Font Subsetting can be used before generating documents, to embed only the <em>required</em> glyphs within the documents. Fonts can be considered as a collection of these glyphs, so ultimately the goal of subsetting is to find out which glyphs are required for a certain array of characters, and embed only those within the output.</p>
</blockquote>
<p>Now this may seem straightforward, right?</p>
<h4 id="wrong">Wrong.<a class="headerlink" href="#wrong" title="Link to this heading">#</a></h4>
<p>The glyph programs can call their own subprograms, for example, characters like <code>ä</code> could be composed by calling subprograms for <code>a</code> and <code>¨</code>; or <code>→</code> could be composed by a program that changes the display matrix and calls the subprogram for <code>←</code>.</p>
<p>Since the subsetter has to find out <em>all such subprograms</em> being called by <em>every glyph</em> included in the subset, this is a generally difficult problem!</p>
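<p>To make the problem concrete, here is a toy sketch of that closure computation, assuming the glyph dependencies are already available as a plain dict. The glyph names and the <code>glyph_closure</code> helper are hypothetical. In a real font these dependencies are buried inside the glyph programs themselves, which is exactly what makes the problem hard and why a dedicated library is a better fit.</p>

```python
def glyph_closure(requested, components):
    """Return every glyph needed to draw `requested`, following references.

    `components` maps a glyph name to the glyphs its program calls.
    """
    needed = set()
    stack = list(requested)
    while stack:
        glyph = stack.pop()
        if glyph in needed:
            # Already handled; this also guards against reference cycles.
            continue
        needed.add(glyph)
        stack.extend(components.get(glyph, ()))
    return needed


# Hypothetical dependency table mirroring the examples above.
components = {
    "adieresis": ["a", "dieresis"],
    "arrowright": ["arrowleft"],
}
print(sorted(glyph_closure(["adieresis"], components)))
# ['a', 'adieresis', 'dieresis']
```

<p>Leaving out <code>a</code> or <code>dieresis</code> from the subset would produce a font where <code>ä</code> is listed but cannot be drawn.</p>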
<p>Something which one of my mentors said which <em>really</em> stuck with me:</p>
<blockquote>
<p>Matplotlib isn&rsquo;t a font library, and shouldn&rsquo;t try to be one.</p>
</blockquote>
<p>It&rsquo;s really easy to fall into the trap of trying to do <em>everything</em> within your own project, which ends up <em>hurting</em> the project instead.</p>
<p>Since this holds true even for Matplotlib, it uses external dependencies like <a href="https://www.freetype.org/">FreeType</a>, <a href="https://github.com/sandflow/ttconv">ttconv</a>, and the newly proposed <a href="https://github.com/fonttools/fonttools">fontTools</a> to handle font subsetting, embedding, rendering, and related stuff.</p>
<p>PS: If that font stuff didn&rsquo;t make sense, I would recommend going through a friendly tutorial I wrote, which is all about <a href="https://matplotlib.org/stable/users/fonts.html">Matplotlib and Fonts</a>!</p>
<h2 id="unexpected-complications">Unexpected Complications<a class="headerlink" href="#unexpected-complications" title="Link to this heading">#</a></h2>
<p>Matplotlib uses an external dependency <code>ttconv</code> which was initially forked into Matplotlib&rsquo;s repository <strong>in 2003</strong>!</p>
<blockquote>
<p>ttconv was a standalone commandline utility for converting TrueType fonts to subsetted Type 3 fonts (among other features) written in 1995, which Matplotlib forked in order to make it work as a library.</p>
</blockquote>
<p>Over time, there were a lot of issues with it which were either hard to fix, or didn&rsquo;t attract a lot of attention. (See the above paragraph for a valid reason.)</p>
<p>One major utility which is still used is <code>convert_ttf_to_ps</code>, which takes a <em>font path</em> as input and converts it into a Type 3 or Type 42 PostScript font, which can be embedded within PS/EPS output documents. The guide I wrote (<a href="https://matplotlib.org/stable/users/fonts.html">link</a>) contains decent descriptions, the differences between these types of fonts, etc.</p>
<h4 id="so-we-need-to-convert-that-font-path-input-to-a-font-buffer-input">So we need to convert that <em>font path</em> input to a <em>font buffer</em> input.<a class="headerlink" href="#so-we-need-to-convert-that-font-path-input-to-a-font-buffer-input" title="Link to this heading">#</a></h4>
<p>Why do we need to? Type 42 subsetting isn&rsquo;t really supported by ttconv, so we use a new dependency called fontTools, whose &lsquo;full-time job&rsquo; is to subset Type 42 fonts for us (among other things).</p>
<blockquote>
<p>It provides us with a font buffer, however ttconv expects a font path to embed that font</p>
</blockquote>
<p>Easily enough, this can be done by Python&rsquo;s <code>tempfile.NamedTemporaryFile</code>:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">with</span> <span class="n">tempfile</span><span class="o">.</span><span class="n">NamedTemporaryFile</span><span class="p">(</span><span class="n">suffix</span><span class="o">=</span><span class="s2">&#34;.ttf&#34;</span><span class="p">)</span> <span class="k">as</span> <span class="n">tmp</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># fontdata is the subsetted buffer</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># returned from fontTools</span>
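</span></span><span class="line"><span class="cl">    <span class="n">tmp</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">fontdata</span><span class="o">.</span><span class="n">getvalue</span><span class="p">())</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># flush so the bytes are on disk before the path is reopened</span>
</span></span><span class="line"><span class="cl">    <span class="n">tmp</span><span class="o">.</span><span class="n">flush</span><span class="p">()</span>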
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># TODO: allow convert_ttf_to_ps</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># to input file objects (BytesIO)</span>
</span></span><span class="line"><span class="cl">    <span class="n">convert_ttf_to_ps</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">os</span><span class="o">.</span><span class="n">fsencode</span><span class="p">(</span><span class="n">tmp</span><span class="o">.</span><span class="n">name</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">fh</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">fonttype</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">glyph_ids</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span></span></span></code></pre>
</div>
<p><strong><em>But this is far from a clean API, in terms of separating reading the file from parsing the data.</em></strong></p>
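<p>For anyone who wants to try the buffer-to-path workaround in isolation, here is a self-contained sketch. It assumes a stand-in <code>consume_font_path</code> function in place of <code>convert_ttf_to_ps</code> and uses a temporary directory instead of <code>NamedTemporaryFile</code>; both choices, and the helper name <code>with_buffer_as_path</code>, are mine for illustration and not necessarily what Matplotlib does.</p>

```python
import io
import os
import tempfile


def consume_font_path(path):
    """Stand-in for a path-only API such as ttconv; returns the file size."""
    with open(path, "rb") as fh:
        return len(fh.read())


def with_buffer_as_path(fontdata, consumer):
    """Write an in-memory buffer to a temporary file and pass its path on."""
    with tempfile.TemporaryDirectory() as tmpdir:
        tmp_path = os.path.join(tmpdir, "subset.ttf")
        # Closing the file before calling the consumer guarantees that all
        # bytes are on disk and that the path can be reopened on any platform.
        with open(tmp_path, "wb") as tmp:
            tmp.write(fontdata.getvalue())
        return consumer(tmp_path)


# Dummy bytes standing in for the subsetted font returned by fontTools.
fontdata = io.BytesIO(b"\x00\x01\x00\x00" + b"\x00" * 20000)
print(with_buffer_as_path(fontdata, consume_font_path))
```

<p>The temporary directory and its contents are removed as soon as the <code>with</code> block exits, so the consumer has to finish reading before it returns.</p>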
<p>What we <em>ideally</em> want is to pass the buffer down to <code>convert_ttf_to_ps</code>, and modify the embedding code of <code>ttconv</code> (written in C++). And <em>here</em> we come across a lot of unexplored codebase, <em>which wasn&rsquo;t touched a lot ever since it was forked</em>.</p>
<p>Funnily enough, just yesterday, after spending a lot of quality time, my mentors and I figured out that the <strong>whole logging system of ttconv was broken</strong>, all because of a single debugging function. 🥲</p>
<hr>
<p>This is still an ongoing problem that we need to tackle over the coming weeks; hopefully, by the next time I write one of these blogs, it will be resolved!</p>
<p>Again, thanks a ton for spending time reading these blogs. :D</p>
<h4 id="note-this-blog-post-is-also-available-at-my-personal-website">NOTE: This blog post is also available at my <a href="https://aitikgupta.github.io/gsoc-mid/">personal website</a>.<a class="headerlink" href="#note-this-blog-post-is-also-available-at-my-personal-website" title="Link to this heading">#</a></h4>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Implementing the Held-Karp Relaxation]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/</id>
            
            
            <published>2021-06-28T00:00:00+00:00</published>
            <updated>2021-06-28T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Implementation details for the ascent method to solve the Held-Karp relaxation</blockquote><p>I have now completed my implementation of the ascent and the branch and bound methods detailed in the 1970 paper <em>The Traveling-Salesman Problem and Minimum Spanning Trees</em> by Michael Held and Richard M. Karp.
In my last post, titled <a href="../understanding-the-ascent-method">Understanding the Ascent Method</a>, I completed the first iteration of the ascent method, found an important bug in the <code>find_epsilon()</code> method, and found a more efficient way to determine substitutes in the graph.
However, the solution being returned was still not the optimal one.</p>
<p>After discussing my options with my GSoC mentors, I decided to move on to the branch and bound method anyway, hoping that because the method is more human-computable and the paper by Held and Karp gives a worked example, I would be able to find the remaining flaws.
Fortunately, this was indeed the case and I was able to correctly implement the branch and bound method and fix the last problem with the ascent method.</p>
<h2 id="initial-implementation-of-the-branch-and-bound-method">Initial Implementation of the Branch and Bound Method<a class="headerlink" href="#initial-implementation-of-the-branch-and-bound-method" title="Link to this heading">#</a></h2>
<p>The branch and bound method follows from the ascent method, but tweaks how we determine the direction of ascent and simplifies the expression used for $\epsilon$.
As a reminder, we use the notion of an <em>out-of-kilter</em> vertex to find directions of ascent which are unit vectors or negative unit vectors.
An out-of-kilter vertex is a vertex which is consistently not connected enough or connected too much in the set of minimum 1-arborescences of a graph.
The formal definition is given on page 1151 as</p>
<blockquote>
<p>Vertex $i$ is said to be <em>out-of-kilter high</em> at the point $\pi$, if, for all $k \in K(\pi), v_{ik} \geqq 1$;
similarly, vertex $i$ is <em>out-of-kilter low</em> at the point $\pi$ if, for all $k \in K(\pi), v_{ik} = -1$.</p>
</blockquote>
<p>Here $v_{ik}$ is the degree of vertex $i$ in the $k$th minimum 1-arborescence minus two.
First, I created a function called <code>direction_of_ascent_kilter()</code> which returns a direction of ascent based on whether a vertex is out-of-kilter.
However, I did not use the method mentioned in the paper by Held and Karp, which is to find a member of $K(\pi, u_i)$, where $u_i$ is the unit vector with 1 in the $i$th location, and check whether vertex $i$ has a degree of 1 or more than 2.
Instead, I knew that I could find the elements of $K(\pi)$ with existing code, so I decided to check the value of $v_{ik}$ for all $k \in K(\pi)$ and, once it is determined whether a vertex is out-of-kilter, simply move on to the next vertex.</p>
<p>Once I have a mapping of all vertices to their kilter state, I find one which is out-of-kilter and return the corresponding direction of ascent.</p>
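<p>The kilter check described above can be sketched in a few lines of plain Python. This is only an illustration and not the code in NetworkX: the names <code>kilter_states()</code> and <code>kilter_direction()</code>, the edge-list representation of a 1-arborescence and the four-node toy data are all assumptions made for this example. It also assumes that an out-of-kilter high vertex gives a positive unit vector and an out-of-kilter low vertex a negative one.</p>

```python
def kilter_states(arborescences, nodes):
    """Map each node to "high", "low" or None (in kilter).

    `arborescences` is a list of edge lists, one per minimum 1-arborescence.
    """
    states = {}
    for node in nodes:
        # v_ik = (total degree of the node in the k-th 1-arborescence) - 2
        v = [
            sum((node == tail) + (node == head) for tail, head in edges) - 2
            for edges in arborescences
        ]
        if all(value >= 1 for value in v):
            states[node] = "high"
        elif all(value == -1 for value in v):
            states[node] = "low"
        else:
            states[node] = None
    return states


def kilter_direction(arborescences, nodes):
    """Return a unit or negative unit direction of ascent, or None."""
    states = kilter_states(arborescences, nodes)
    for index, node in enumerate(nodes):
        if states[node] is not None:
            direction = [0] * len(nodes)
            direction[index] = 1 if states[node] == "high" else -1
            return direction
    return None


# Two made-up 1-arborescences on four nodes; node 3 is a leaf in both.
toy = [
    [(0, 1), (1, 2), (2, 0), (2, 3)],
    [(0, 2), (2, 1), (1, 0), (1, 3)],
]
print(kilter_direction(toy, [0, 1, 2, 3]))  # [0, 0, 0, -1]
```

<p>Node 3 has degree 1 in both toy 1-arborescences, so it is out-of-kilter low and the returned direction has a $-1$ in its position.</p>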
<p>The changes to <code>find_epsilon()</code> were very minor, basically removing the denominator from the formula for $\epsilon$ and adding a check to see if we have a negative direction of ascent so that the crossover distances become positive and thus valid.</p>
<p>The brand new function which was needed was <code>branch()</code>, which well&hellip; branches according to the Held and Karp paper.
The first thing it does is run the linear program from the ascent method to determine if a direction of ascent exists.
If the direction does exist, branch.
If not, search the set of minimum 1-arborescences for a tour and then branch if it does not exist.
The branch process itself is rather simple: find the first open edge (an edge not in the partition sets $X$ and $Y$) and then create two new configurations where that edge is either included or excluded respectively.</p>
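<p>As a rough sketch of that branching step, assuming the partition sets $X$ and $Y$ are stored as frozensets of edges (the function name and the data layout here are hypothetical, not taken from my NetworkX code):</p>

```python
def branch_on_first_open_edge(edges, included, excluded):
    """Split a configuration on the first edge in neither X nor Y.

    Returns two (X, Y) pairs, or None when every edge is already fixed.
    """
    for edge in edges:
        if edge not in included and edge not in excluded:
            with_edge = (included | {edge}, excluded)
            without_edge = (included, excluded | {edge})
            return with_edge, without_edge
    return None


edges = [(0, 1), (0, 2), (1, 2)]
X = frozenset({(0, 1)})
Y = frozenset()
print(branch_on_first_open_edge(edges, X, Y))
```

<p>Here the first open edge is <code>(0, 2)</code>, so one child configuration adds it to $X$ and the other adds it to $Y$. The $\pi$ vector and bound value which a full configuration also carries are left out of this sketch.</p>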
<p>Finally, the overall structure of the algorithm, written in pseudocode, is</p>

<div class="highlight">
  <pre>Initialize pi to be the zero vector.
Add the configuration (∅, ∅, pi, w(0)) to the configuration priority queue.
while configuration_queue is not empty:
    config = configuration_queue.get()
    dir_ascent = direction_of_ascent_kilter()
    if dir_ascent is None:
        branch()
        if solution returned by branch is not None:
            return solution
    else:
        max_dist = find_epsilon()
        update pi
        update edge weights
        update config pi and bound value</pre>
</div>

<h2 id="debugging-the-branch-and-bound-method">Debugging the Branch and Bound Method<a class="headerlink" href="#debugging-the-branch-and-bound-method" title="Link to this heading">#</a></h2>
<p>My initial implementation of the branch and bound method returned the same incorrect solution as the ascent method, but with different edge weights.
As a reminder, I wanted a solution which looked like this:</p>
<center><img src="expected-solution.png" alt="Expected solution for the Held-Karp relaxation for the example graph"/></center>
<p>and I now had two algorithms returning this solution:</p>
<center><img src="found-solution.png" width=350 alt="Solution found by the branch-and-bound method"/></center>
<p>As I mentioned before, the branch and bound method is more human-computable than the ascent method, so I decided to compare the execution of my implementation against the one given in [1].
Below, the left side is the data from the Held and Karp paper and on the right my program&rsquo;s execution on the directed version.</p>
<table>
  <thead>
      <tr>
          <th>Undirected Graph</th>
          <th>Directed Graph</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Iteration 1:</td>
          <td></td>
      </tr>
      <tr>
          <td>Starting configuration: $(\emptyset, \emptyset, \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 \end{bmatrix}, 196)$</td>
          <td>Starting configuration: $(\emptyset, \emptyset, \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 \end{bmatrix}, 196)$</td>
      </tr>
      <tr>
          <td>Minimum 1-Trees:</td>
          <td>Minimum 1-Arborescences:</td>
      </tr>
      <tr>
          <td><img src="/networkx/atsp/implementing-the-held-karp-relaxation/minimum-1-trees-iteration-1.png" alt="minimum 1 trees"></td>
          <td><img src="/networkx/atsp/implementing-the-held-karp-relaxation/minimum-1-arborescences-iteration-1.png" alt="minimum 1 arborescences"></td>
      </tr>
      <tr>
          <td>Vertex 3 out-of-kilter LOW</td>
          <td>Vertex 3 out-of-kilter LOW</td>
      </tr>
      <tr>
          <td>$d = \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; -1 &amp; 0 &amp; 0 \end{bmatrix}$</td>
          <td>$d = \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; -1 &amp; 0 &amp; 0 \end{bmatrix}$</td>
      </tr>
      <tr>
          <td>$\epsilon(\pi, d) = 5$</td>
          <td>$\epsilon(\pi, d) = 5$</td>
      </tr>
      <tr>
          <td>New configuration: $(\emptyset, \emptyset, \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; -5 &amp; 0 &amp; 0 \end{bmatrix}, 201)$</td>
          <td>New configuration: $(\emptyset, \emptyset, \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; -5 &amp; 0 &amp; 0 \end{bmatrix}, 212)$</td>
      </tr>
      <tr>
          <td></td>
          <td></td>
      </tr>
      <tr>
          <td>Iteration 2:</td>
          <td></td>
      </tr>
      <tr>
          <td>Minimum 1-Trees:</td>
          <td>Minimum 1-Arborescences:</td>
      </tr>
      <tr>
          <td><img src="/networkx/atsp/implementing-the-held-karp-relaxation/minimum-1-trees-iteration-2.png" alt="minimum 1 trees"></td>
          <td><img src="/networkx/atsp/implementing-the-held-karp-relaxation/minimum-1-arborescences-iteration-2.png" alt="minimum 1 arborescences"></td>
      </tr>
  </tbody>
</table>
<p>In order to get these results, I forbade the program from connecting vertex 0 to the same vertex with both its incoming and its outgoing edge.
However, it is very clear that from the start, iteration two was not going to be the same.</p>
<p>I noticed that in the first iteration, there were twice as many 1-arborescences as 1-trees and that the difference was that the cycle can be traversed in both directions.
This creates a mapping between 1-trees and 1-arborescences.
In the second iteration, there are not twice as many 1-arborescences and that mapping is not present.
Vertex 0 always connects to vertex 3 in the arborescences and vertex 5 in the trees.
Additionally, the costs of the 1-arborescences are higher than the costs of the 1-trees.</p>
<p>From working on the ascent method, I knew that the choice of root node in the arborescences affects the total cost.
I now wondered if a minimum 1-arborescence could come from a non-minimum spanning arborescence.
As it turns out, the answer is yes.</p>
<p>In order to test this hypothesis, I created a simple Python script using a modified version of <code>k_pi()</code>.
The entire thing is longer than I&rsquo;d like to put here, but the gist was simple: iterate over <em>all</em> of the spanning arborescences in the graph, tracking the minimum weight, and then print the minimum 1-arborescences that this program finds to compare them to the ones that the unaltered one finds.</p>
<p>The output is below:</p>

<div class="highlight">
  <pre>Adding arborescence with weight 212.0
Adding arborescence with weight 212.0
Adding arborescence with weight 212.0
Adding arborescence with weight 204.0
Adding arborescence with weight 204.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Found 6 minimum 1-arborescences

(1, 5, 30)
(2, 1, 41)
(2, 3, 21)
(4, 2, 35)
(5, 0, 52)
(0, 4, 17)

(1, 2, 41)
(2, 3, 21)
(2, 4, 35)
(4, 0, 17)
(5, 1, 30)
(0, 5, 52)

(2, 3, 21)
(2, 4, 35)
(4, 0, 17)
(5, 1, 30)
(5, 2, 41)
(0, 5, 52)

(2, 4, 35)
(3, 2, 16)
(4, 0, 17)
(5, 1, 30)
(5, 3, 46)
(0, 5, 52)

(2, 3, 21)
(3, 5, 41)
(4, 2, 35)
(5, 1, 30)
(5, 0, 52)
(0, 4, 17)

(2, 3, 21)
(2, 5, 41)
(4, 2, 35)
(5, 1, 30)
(5, 0, 52)
(0, 4, 17)</pre>
</div>

<p>This was very enlightening.
The 1-arborescences of weight 212 were the ones that my branch and bound method was using in the second iteration, but not the true minimum ones.
Graphically, those six 1-arborescences look like this:</p>
<center><img src="true-minimum-arborescences.png" alt="The true set of minimum arborescences in the example graph"/></center>
<p>And suddenly that mapping between the 1-trees and 1-arborescences is back!
But why can minimum 1-arborescences come from non-minimum spanning arborescences?
Remember that we create 1-arborescences by finding spanning arborescences on the vertex set $\{2, 3, \dots, n\}$ and then attaching the missing vertex with an edge to the root of the spanning arborescence and with its minimum weight incoming edge.</p>
<p>This means that even among the true minimum spanning arborescences, the final weight of the 1-arborescence can vary based on the cost of connecting &lsquo;vertex 1&rsquo; to the root of the arborescence.
I already had to deal with this issue earlier in the implementation of the ascent method.
Now suppose that not every vertex in the graph is a root of an arborescence in the set of minimum spanning arborescences.
Let the <em>minimum</em> root be the root vertex of the arborescence which is the cheapest to connect to and the <em>maximum</em> root the root vertex which is the most expensive to connect to.
If we needed to, we could order the roots from minimum to maximum based on the weight of the edge from &lsquo;vertex 1&rsquo; to that root.</p>
<p>Finally, suppose that considering only the set of minimum spanning arborescences results in a set of minimum 1-arborescences which do not use the minimum root and whose total cost is $c$ more than the cost of the minimum spanning arborescence plus the cost of connecting to the minimum root.
Continue to consider spanning arborescences in increasing weight, such as the ones returned by the <code>ArborescenceIterator</code>.
Eventually the <code>ArborescenceIterator</code> will return a spanning arborescence which has the minimum root.
If the cost of the minimum spanning arborescence is $c_{min}$ and the cost of this arborescence is less than $c_{min} + c$, then a new minimum 1-arborescence has been found from a non-minimum spanning arborescence.</p>
<p>It is obviously impractical to consider all of the spanning arborescences in the graph, but because <code>ArborescenceIterator</code> returns arborescences in order of increasing weight, there is a weight beyond which it is impossible to produce a minimum 1-arborescence.</p>
<p>Let the cost of a minimum spanning arborescence be $c_{min}$ and the total costs of connecting the roots range from $r_{min}$ to $r_{max}$.
The worst case cost of the minimum 1-arborescence is $c_{min} + r_{max}$ which would connect the minimum spanning arborescence to the most expensive root and the best case minimum 1-arborescence would be $c_{min} + r_{min}$.
As for the weight of the spanning arborescence itself, once it exceeds $c_{min} + r_{max} - r_{min}$ we know that, even if it uses the minimum root, the total weight will be greater than the worst case minimum 1-arborescence, so that is the bound which we use with the <code>ArborescenceIterator</code>.</p>
<p>After implementing this boundary for checking spanning arborescences to find minimum 1-arborescences, both methods executed successfully on the test graph.</p>
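<p>To make the cutoff concrete, here is a small sketch of the search with that bound. The function name, the <code>(weight, root)</code> pairs standing in for what an iterator over spanning arborescences would yield, and all of the numbers are made up for illustration. The cost of the minimum weight incoming edge of the excluded vertex is the same for every 1-arborescence, so it is left out.</p>

```python
def minimum_one_arborescence_weight(spanning, root_cost):
    """Scan spanning arborescences in nondecreasing weight with a cutoff.

    `spanning` yields (weight, root) pairs; `root_cost[root]` is the cost of
    the edge from the excluded vertex to that root.
    """
    r_min = min(root_cost.values())
    r_max = max(root_cost.values())
    c_min = None
    best = None
    examined = 0
    for weight, root in spanning:
        if c_min is None:
            c_min = weight  # the first arborescence is a minimum one
        if weight > c_min + r_max - r_min:
            break  # even the cheapest root cannot beat c_min + r_max
        examined += 1
        total = weight + root_cost[root]
        if best is None or best > total:
            best = total
    return best, examined


# Hypothetical data: the minimum spanning arborescences use expensive roots.
spanning = [(100, "a"), (100, "b"), (104, "c"), (110, "c"), (130, "c")]
root_cost = {"a": 20, "b": 15, "c": 5}
print(minimum_one_arborescence_weight(spanning, root_cost))  # (109, 4)
```

<p>In this made-up example the best total, 109, comes from the spanning arborescence of weight 104, which is not a minimum spanning arborescence, and the scan stops as soon as a weight exceeds $100 + 20 - 5 = 115$.</p>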
<h2 id="next-steps">Next Steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>Now that both the ascent and branch and bound methods are working, they must be tested both for accuracy and performance.
Surprisingly, on the test graph I have been using, which is originally from the Held and Karp paper, the ascent method is between 2 and 3 times faster than the branch and bound method.
However, this six vertex graph is small and the branch and bound method may yet have better performance on larger graphs.
I will have to create larger test graphs and then select whichever method has better performance overall.</p>
<p>Additionally, this is an example where $f(\pi)$, the gap between a tour and 1-arborescence, converges to 0.
This is not always the case, so I will need to test on an example where the minimum gap is greater than 0.</p>
<p>Finally, the output of my Held-Karp relaxation program is a tour.
This is just one part of the Asadpour algorithm for the asymmetric traveling salesperson problem, and that algorithm takes a modified vector which is produced based on the final result of the relaxation.
I still need to convert the output to match the expectation of the overall algorithm I am seeking to implement this Summer of Code.</p>
<p>I hope to move onto the next step of the Asadpour algorithm on either June 30th or July 1st.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] Held, M., Karp, R.M. <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Understanding the Ascent Method]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/</id>
            
            
            <published>2021-06-22T00:00:00+00:00</published>
            <updated>2021-06-22T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A deep dive into the ascent method for the Held-Karp relaxation</blockquote><p>It has been far longer than I would have preferred since I wrote a blog post.
As I expected in my original GSoC proposal, the Held-Karp relaxation is proving to be quite difficult to implement.</p>
<p>My mentors and I agreed that I would implement the branch and bound method discussed in Held and Karp&rsquo;s 1970 paper <em>The Traveling-Salesman Problem and Minimum Spanning Trees</em>, which first required the implementation of the ascent method because it is used in the branch and bound method.
For the last week and a half I have been implementing and debugging the ascent method and wanted to take some time to reflect on what I have learned.</p>
<p>I will start by saying that as of the writing of this post, my version of the ascent method is not giving what I expect to be the optimal solution.
For my testing, I took the graph which Held and Karp use in their example of the branch and bound method, a weighted $\mathcal{K}_6$, and converted it to a directed but symmetric version given by the following adjacency matrix.</p>
<p>$$
\begin{bmatrix}
0 &amp; 97 &amp; 60 &amp; 73 &amp; 17 &amp; 52 \\\
97 &amp; 0 &amp; 41 &amp; 52 &amp; 90 &amp; 30 \\\
60 &amp; 41 &amp; 0 &amp; 21 &amp; 35 &amp; 41 \\\
73 &amp; 52 &amp; 21 &amp; 0 &amp; 95 &amp; 46 \\\
17 &amp; 90 &amp; 35 &amp; 95 &amp; 0 &amp; 81 \\\
52 &amp; 30 &amp; 41 &amp; 46 &amp; 81 &amp; 0
\end{bmatrix}
$$</p>
<p>The original solution is an undirected tour but in the directed version, the expected solutions depend on which way they are traversed.
Both of these cycles have a total weight of 207.</p>
<center><img src="expected-solution.png" alt="Expected solutions for the weighted K_6 used in the Held and Karp paper"/></center>
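<p>Because the graph only has six vertices, the expected weight of 207 can be double-checked by brute force over all directed tours starting at vertex 0. This is just a sanity check written for this post using the adjacency matrix above; the helper names are made up and it has nothing to do with how the Held-Karp relaxation itself works.</p>

```python
from itertools import permutations

# Adjacency matrix of the directed but symmetric test graph.
WEIGHTS = [
    [0, 97, 60, 73, 17, 52],
    [97, 0, 41, 52, 90, 30],
    [60, 41, 0, 21, 35, 41],
    [73, 52, 21, 0, 95, 46],
    [17, 90, 35, 95, 0, 81],
    [52, 30, 41, 46, 81, 0],
]


def tour_weight(tour, weights):
    """Total weight of the closed tour visiting `tour` in order."""
    return sum(weights[u][v] for u, v in zip(tour, tour[1:] + tour[:1]))


def optimal_tours(weights):
    """Return the minimum tour weight and every directed tour attaining it."""
    rest = range(1, len(weights))
    tours = [(0,) + p for p in permutations(rest)]
    best = min(tour_weight(t, weights) for t in tours)
    return best, [t for t in tours if tour_weight(t, weights) == best]


best, tours = optimal_tours(WEIGHTS)
print(best)   # 207
print(tours)  # [(0, 4, 2, 3, 1, 5), (0, 5, 1, 3, 2, 4)]
```

<p>The two tours found are the same cycle traversed in opposite directions.</p>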
<p>This is the cycle returned by the program, which has a total weight of 246.</p>
<center><img src="found-solution.png" width=350 alt="The current solution being found by the program"/></center>
<p>All of this code goes into the function <code>_held_karp()</code> within <code>traveling_salesman.py</code> in NetworkX, and I tried to follow the algorithm outlined in the paper as closely as I could.
The <code>_held_karp()</code> function itself has three inner functions, <code>k_pi()</code>, <code>direction_of_ascent()</code> and <code>find_epsilon()</code> which represent the main three steps used in each iteration of the ascent method.</p>
<h2 id="k_pi"><code>k_pi()</code><a class="headerlink" href="#k_pi" title="Link to this heading">#</a></h2>
<p><code>k_pi()</code> uses the <code>ArborescenceIterator</code> I implemented during the first week of coding for the Summer of Code to find all of the minimum 1-arborescences in the graph.
My original assessment of creating 1-arborescences was slightly incorrect.
I stated that</p>
<blockquote>
<p>In order to connect vertex 1, we would choose the outgoing arc with the smallest cost and the incoming arc with the smallest cost.</p>
</blockquote>
<p>In reality, this method would produce graphs which are almost arborescences based solely on the fact that the outgoing arc would almost certainly create a vertex with two incoming arcs.
Instead, we need to connect vertex 1 with the incoming edge of lowest cost and the edge connecting to the root node of the arborescence on nodes $\{2, 3, \dots, n\}$, so that the in-degree constraint is not violated.</p>
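<p>A minimal sketch of that construction, assuming the spanning arborescence is given as a list of edges together with its root (the function name and signature are hypothetical; the weights are the adjacency matrix of the test graph):</p>

```python
WEIGHTS = [
    [0, 97, 60, 73, 17, 52],
    [97, 0, 41, 52, 90, 30],
    [60, 41, 0, 21, 35, 41],
    [73, 52, 21, 0, 95, 46],
    [17, 90, 35, 95, 0, 81],
    [52, 30, 41, 46, 81, 0],
]


def one_arborescence(arborescence, root, excluded, weights):
    """Attach `excluded` to a spanning arborescence on the other nodes."""
    others = [n for n in range(len(weights)) if n != excluded]
    # Cheapest incoming edge for the excluded vertex.
    tail = min(others, key=lambda n: weights[n][excluded])
    # The outgoing edge is forced: it must enter the root, which is the
    # only vertex of the arborescence without an incoming edge.
    edges = list(arborescence) + [(tail, excluded), (excluded, root)]
    return edges, sum(weights[u][v] for u, v in edges)


# Spanning arborescence on nodes 1..5 rooted at node 4.
edges, total = one_arborescence([(2, 3), (2, 5), (4, 2), (5, 1)], 4, 0, WEIGHTS)
print(total)  # 161

# An arborescence rooted at node 2 forces the more expensive edge (0, 2).
_, total_root_2 = one_arborescence([(2, 3), (2, 4), (2, 5), (5, 1)], 2, 0, WEIGHTS)
print(total_root_2)  # 204
```

<p>The second call uses an arborescence invented for this example; it shows how the forced edge to the root changes the total weight.</p>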
<p>For the test graph on the first iteration of the ascent method, <code>k_pi()</code> returned 10 1-arborescences but the costs were not all the same.
Notice that because we have no agency in choosing the outgoing edge of vertex 1, the total cost of the 1-arborescence will vary by the difference between the cheapest root to connect to and the most expensive root to connect to.
My original version of this function was not very efficient: it created a 1-arborescence from each of the minimum spanning arborescences and then iterated over them to delete all of the non-minimum ones.</p>
<p>Yesterday I re-wrote this function so that once a 1-arborescence of lower weight is found, it deletes all of the current minimum ones in favor of the new one, and it does not add any 1-arborescences of greater weight to the set of minimum 1-arborescences.</p>
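<p>The single-pass bookkeeping looks roughly like the following. It is a generic sketch with a made-up function name and made-up labels rather than the body of <code>k_pi()</code>, and it compares weights with exact equality, which real code working with floating point weights may want to relax.</p>

```python
def keep_minimum(candidates):
    """Return the minimum weight and every item attaining it, in one pass."""
    best_weight = None
    best_items = []
    for weight, item in candidates:
        if best_weight is None or best_weight > weight:
            # Strictly lighter: discard everything collected so far.
            best_weight = weight
            best_items = [item]
        elif weight == best_weight:
            best_items.append(item)
        # Heavier candidates are ignored.
    return best_weight, best_items


print(keep_minimum([(165, "a"), (161, "b"), (174, "c"), (161, "d")]))
# (161, ['b', 'd'])
```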
<p>The real reason that I re-wrote the method was to try something new in hopes of pushing the program from a suboptimal solution to the optimal one.
As I mentioned earlier, the forced choice of connecting to the root node created 1-arborescences of different weight.
I suspected then that different choices of vertex 1 would be able to create 1-arborescences of even lower weight than just arbitrarily using the one returned by <code>next(G.__iter__())</code>.
So I wrapped all of <code>k_pi()</code> with a <code>for</code> loop over the vertices of the graph and found that the choice of vertex 1 made a difference.</p>

<div class="highlight">
  <pre>Excluded node: 0, Total Weight: 161.0
Chosen incoming edge for node 0: (4, 0), chosen outgoing edge for node 0: (0, 4)
(2, 3, 21)
(2, 5, 41)
(4, 2, 35)
(4, 0, 17)
(5, 1, 30)
(0, 4, 17)

Excluded node: 0, Total Weight: 161.0
Chosen incoming edge for node 0: (4, 0), chosen outgoing edge for node 0: (0, 4)
(1, 5, 30)
(2, 1, 41)
(2, 3, 21)
(4, 2, 35)
(4, 0, 17)
(0, 4, 17)

Excluded node: 1, Total Weight: 174.0
Chosen incoming edge for node 1: (5, 1), chosen outgoing edge for node 1: (1, 5)
(2, 3, 21)
(2, 4, 35)
(4, 0, 17)
(5, 2, 41)
(5, 1, 30)
(1, 5, 30)

Excluded node: 2, Total Weight: 187.0
Chosen incoming edge for node 2: (3, 2), chosen outgoing edge for node 2: (2, 3)
(0, 4, 17)
(3, 5, 46)
(3, 2, 21)
(5, 0, 52)
(5, 1, 30)
(2, 3, 21)

Excluded node: 3, Total Weight: 165.0
Chosen incoming edge for node 3: (2, 3), chosen outgoing edge for node 3: (3, 2)
(1, 5, 30)
(2, 1, 41)
(2, 4, 35)
(2, 3, 21)
(4, 0, 17)
(3, 2, 21)

Excluded node: 3, Total Weight: 165.0
Chosen incoming edge for node 3: (2, 3), chosen outgoing edge for node 3: (3, 2)
(2, 4, 35)
(2, 5, 41)
(2, 3, 21)
(4, 0, 17)
(5, 1, 30)
(3, 2, 21)

Excluded node: 4, Total Weight: 178.0
Chosen incoming edge for node 4: (0, 4), chosen outgoing edge for node 4: (4, 0)
(0, 5, 52)
(0, 4, 17)
(1, 2, 41)
(2, 3, 21)
(5, 1, 30)
(4, 0, 17)

Excluded node: 4, Total Weight: 178.0
Chosen incoming edge for node 4: (0, 4), chosen outgoing edge for node 4: (4, 0)
(0, 5, 52)
(0, 4, 17)
(2, 3, 21)
(5, 1, 30)
(5, 2, 41)
(4, 0, 17)

Excluded node: 5, Total Weight: 174.0
Chosen incoming edge for node 5: (1, 5), chosen outgoing edge for node 5: (5, 1)
(1, 2, 41)
(1, 5, 30)
(2, 3, 21)
(2, 4, 35)
(4, 0, 17)
(5, 1, 30)</pre>
</div>

<p>Note that because my test graph is symmetric it likes to make cycles with only two nodes.
The weights of these 1-arborescences range from 161 to 187, so I ran the test, which had been taking about 300 ms, using the new approach&hellip; and the program did not terminate.
I created breakpoints in PyCharm after 200 iterations of the ascent method and found that the program was stuck in a loop where it alternated between two different minimum 1-arborescences.
This was a long shot and it did not work out, so I reverted the code to always pick the same vertex for vertex 1.</p>
<p>Either way, the fact that I had almost entirely re-written this function without a change in output suggests that this function is not the source of the problem.</p>
<h2 id="direction_of_ascent"><code>direction_of_ascent()</code><a class="headerlink" href="#direction_of_ascent" title="Link to this heading">#</a></h2>
<p>This was the one function which has pseudocode in the Held and Karp paper:</p>
<blockquote>
<ol>
<li>Set $d$ equal to the zero $n$-vector.</li>
<li>Find a 1-tree $T^k$ such that $k \in K(\pi, d)$. [A method of executing Step 2 follows from the results of Section 6 (the greedy algorithm).]</li>
<li>If $\sum_{i=1}^{i=n} d_i v_{i k} &gt; 0$, STOP.</li>
<li>$d_i \rightarrow d_i + v_{i k}$, for $i = 2, 3, \dots, n$</li>
<li>GO TO 2.</li>
</ol>
</blockquote>
<p>Using this as a guide, the implementation of this function was simple until I got to the terminating condition, which is a linear program discussed on page 1149 as</p>
<blockquote>
<p>Thus, when failure to terminate is suspected, it is necessary to check whether no direction of ascent exists; by the Minkowski-Farkas lemma this is equivalent to the existence of nonnegative coefficients $\alpha_k$ such that</p>
<p>$ \sum_{k \in K(\pi)} \alpha_kv_{i k} = 0, \quad i = 1, 2, \dots, n $</p>
<p>This can be checked by linear programming.</p>
</blockquote>
<p>While I was able to implement this without much issue, one <em>very</em> important constraint of the linear program was not mentioned here, but rather the page before during a proof.
That constraint is</p>
<p>$$
\sum_{k \in K(\pi)} \alpha_k = 1
$$</p>
<p>Once I had spent several hours trying to debug the original linear program and noticed the missing constraint, the linear program started to behave correctly, terminating the program when a tour is found.</p>
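<p>Putting the quoted condition and the extra constraint together, the feasibility problem can be restated as the system below. This is my own restatement rather than a quotation from the paper: no direction of ascent exists exactly when the system has a solution. Without the normalization, $\alpha = 0$ trivially satisfies the first set of equations, so the check could never tell the two cases apart.</p>
<p>$$
\sum_{k \in K(\pi)} \alpha_k v_{i k} = 0 \quad (i = 1, 2, \dots, n), \qquad \sum_{k \in K(\pi)} \alpha_k = 1, \qquad \alpha_k \geq 0 \quad (k \in K(\pi))
$$</p>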
<h2 id="find_epsilon"><code>find_epsilon()</code><a class="headerlink" href="#find_epsilon" title="Link to this heading">#</a></h2>
<p>This function requires a completely different implementation compared to the one described in the Held and Karp paper.</p>
<p>The basic idea in both my implementation for directed graphs and the description for undirected graphs is finding edges which are substitutes for each other, or an edge outside the 1-arborescence which can replace an edge in the arborescence and will result in a 1-arborescence.</p>
<p>The undirected version uses the idea of fundamental cycles in the tree to find the substitutes, and I tried to use this idea as well with the <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.cycles.find_cycle.html"><code>find_cycle()</code></a> function in the NetworkX library.
I executed the first iteration of the ascent method by hand and noticed that what I computed for all of the possible values of $\epsilon$ and what the program found did not match.
I had found several that it had missed and it found several that I missed.
For the example graph, I found that the following edge pairs are substitutes, where the first edge is not in the 1-arborescence and the second is the edge in the 1-arborescence which it can replace, using the minimum 1-arborescence below.</p>
<center><img src="minimum-1-arborescence.png" width=350 alt="1-arborescence after the first iteration of the ascent method"/></center>
<p>$$
\begin{array}{l}
(0, 1) \rightarrow (2, 1) \text{ valid: } \epsilon = 56 \\\
(0, 2) \rightarrow (4, 2) \text{ valid: } \epsilon = 25 \\\
(0, 3) \rightarrow (2, 3) \text{ valid: } \epsilon = 52 \\\
(0, 5) \rightarrow (1, 5) \text{ valid: } \epsilon = \frac{30 - 52}{0 - 0} \text{, not valid} \\\
(1, 3) \rightarrow (2, 3) \text{ valid: } \epsilon = 15.5 \\\
(2, 5) \rightarrow (1, 5) \text{ valid: } \epsilon = 5.5 \\\
(3, 1) \rightarrow (2, 1) \text{ valid: } \epsilon = 5.5 \\\
(3, 5) \rightarrow (1, 5) \text{ valid: } \epsilon = \frac{30 - 46}{-1 + 1} \text{, not valid} \\\
(4, 1) \rightarrow (2, 1) \text{ valid: } \epsilon = \frac{41 - 90}{1 - 1} \text{, not valid} \\\
(4, 3) \rightarrow (2, 3) \text{ valid: } \epsilon = \frac{30 - 95}{1 - 1} \text{, not valid} \\\
(4, 5) \rightarrow (1, 5) \text{ valid: } \epsilon = -25.5 \text{, not valid (negative }\epsilon) \\\
(5, 3) \rightarrow (2, 3) \text{ valid: } \epsilon = 25 \\\
\end{array}
$$</p>
<p>I missed the following substitutes which the program did find.</p>
<p>$$
\begin{array}{l}
(1, 0) \rightarrow (4, 0) \text{ valid: } \epsilon = 80 \\\
(1, 4) \rightarrow (0, 4) \text{ valid: } \epsilon = 73 \\\
(2, 0) \rightarrow (4, 0) \text{ valid: } \epsilon = \frac{17 - 60}{1 - 1} \text{, not valid} \\\
(2, 4) \rightarrow (0, 4) \text{ valid: } \epsilon = -18 \text{, not valid (negative }\epsilon) \\\
(3, 0) \rightarrow (4, 0) \text{ valid: } \epsilon = 28 \\\
(3, 4) \rightarrow (0, 4) \text{ valid: } \epsilon = 78 \\\
(5, 0) \rightarrow (4, 0) \text{ valid: } \epsilon = 35 \\\
(5, 4) \rightarrow (0, 4) \text{ valid: } \epsilon = \frac{17 - 81}{0 - 0} \text{, not valid} \\\
\end{array}
$$</p>
<p>Notice that the pairs with a zero denominator never cross over when we move in the direction of ascent, so they are not valid substitutions.
Additionally, $\epsilon$ is a distance, and a negative distance does not make sense.
If we interpret a negative distance as a positive distance in the opposite direction, then needing to move that way would mean that the direction of ascent vector should be pointing the other way.</p>
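<p>A minimal sketch of these two rejection rules, assuming each candidate $\epsilon$ is available as a numerator and a denominator. The function name <code>usable_epsilon</code> is hypothetical and is not part of the NetworkX code; the zero-denominator fractions and the values -18 and 28 come from the tables above, while the denominator of 1 used for the last two calls is made up for illustration.</p>

```python
def usable_epsilon(numerator, denominator):
    """Return epsilon, or None if the substitution has to be rejected."""
    if denominator == 0:
        # The pair never crosses over in the direction of ascent.
        return None
    epsilon = numerator / denominator
    if epsilon < 0:
        # A negative distance would mean moving against the direction of ascent.
        return None
    return epsilon


# Zero-denominator entries taken from the tables above.
print(usable_epsilon(30 - 52, 0 - 0))  # None
print(usable_epsilon(17 - 60, 1 - 1))  # None
# Values from the tables written over a hypothetical denominator of 1.
print(usable_epsilon(-18, 1))  # None
print(usable_epsilon(28, 1))  # 28.0
```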
<p>My list did not match the program&rsquo;s list because <code>find_cycle()</code> did not always return the fundamental cycle containing the new edge.
If I called <code>find_cycle()</code> on a vertex in the other cycle in the graph (in this case $\{(0, 4), (4, 0)\}$), it would return that cycle rather than the true fundamental cycle.</p>
<p>This prompted me to think about what really determines if edges in a 1-arborescence are substitutes for each other.
In every case where a substitute was valid, both of those edges lead to the same vertex.
If they did not, then the degree constraint of the arborescence would be violated because we did not replace the edge leading into a node with another edge leading into the same node.
This is true regardless of whether the edges are part of the same fundamental cycle.</p>
<p>Thus, <code>find_epsilon()</code> now takes every edge that is in the graph but not in the chosen 1-arborescence $k \in K(\pi, d)$, finds the other edge in $k$ pointing to the same vertex, swaps them, and then checks that the degree constraint is not violated, that the result has the correct number of edges, and that it is still connected.
This method is more efficient and it found more valid substitutions as well, so I was hopeful that it would finally bring the returned solution down to the optimal solution; perhaps the old method had been missing the correct value of $\epsilon$ on even just one of the iterations.</p>
<p>It did not.</p>
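<p>The pairing step of the revised <code>find_epsilon()</code> can be sketched as follows. This is an illustration only, not the actual implementation: the helper name <code>candidate_substitutes</code> and the toy graphs are hypothetical, and the follow-up checks (degree constraint, edge count, connectivity) are left out.</p>

```python
import networkx as nx


def candidate_substitutes(G, k):
    """Yield (new_edge, old_edge) pairs whose heads are the same vertex.

    G is the full directed graph and k is the current 1-arborescence,
    both given as nx.DiGraph objects.
    """
    for u, v in G.edges:
        if k.has_edge(u, v):
            continue  # only edges outside k can be swapped in
        for w in k.predecessors(v):
            # (w, v) is the edge of k that enters the same vertex v
            yield (u, v), (w, v)


# Hypothetical toy data, not the example graph from this post.
G = nx.DiGraph([(0, 1), (1, 2), (2, 0), (0, 2), (1, 0)])
k = nx.DiGraph([(0, 1), (1, 2), (2, 0)])
pairs = sorted(candidate_substitutes(G, k))
print(pairs)
```

<p>Because the pairing only looks at the head of each edge, it does not depend on which cycle <code>find_cycle()</code> happens to return.</p>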
<h2 id="next-steps">Next Steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>At this point I have no clear way forward, only two unappealing options.</p>
<ul>
<li>I found the problem with <code>find_epsilon()</code> by executing the first iteration of the ascent method by hand. It took about 90 minutes.
I could try to continue this process and hope that while iteration 1 is executing correctly I find some other bug in the code, but I doubt that I will ever reach the 9 iterations the program needs
to find the faulty solution.</li>
<li>Move on to the branch and bound part of the Held-Karp relaxation.
My hope is that, because Held and Karp give a complete execution of the branch and bound method, I will be able to use it to trace a complete execution of the relaxation and find the flaw in
the ascent method that way.</li>
</ul>
<p>I will be discussing the next steps with my GSoC mentors soon.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>Held, M., Karp, R.M. <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Implementing the Iterators]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/</id>
            
            
            <published>2021-06-10T00:00:00+00:00</published>
            <updated>2021-06-10T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Implementation details about SpanningTreeIterator and ArborescenceIterator</blockquote><p>We are coming to the end of the first week of coding for the Summer of Code, and I have implemented two new, but related, features in NetworkX.
In this post, I will discuss how I implemented them, some of the challenges and how I tested them.
Those two new features are a spanning tree iterator and a spanning arborescence iterator.</p>
<p>The arborescence iterator is the feature that I will be using directly in my GSoC project, but I thought that it was a good idea to implement the spanning tree iterator first, as it would be easier and I could refer directly back to the research paper as needed.
The partition schemes of the two are the same, so what I learned while figuring it out for spanning trees would port directly to the arborescence iterator, where I could focus on modifying Edmonds&rsquo; algorithm to respect the partition.</p>
<h2 id="spanning-tree-iterator">Spanning Tree Iterator<a class="headerlink" href="#spanning-tree-iterator" title="Link to this heading">#</a></h2>
<p>This was the first of the new features.
It follows the algorithm detailed in a paper by Sörensen and Janssens from 2005 titled <em>An Algorithm to Generate all Spanning Trees of a Graph in Order of Increasing Cost</em> which can be found <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en&amp;format=pdf">here</a> [2].</p>
<p>Now, I needed to tweak the implementation of the algorithm because I wanted to implement a Python iterator, so that somebody can write</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">for</span> <span class="n">tree</span> <span class="ow">in</span> <span class="n">nx</span><span class="o">.</span><span class="n">SpanningTreeIterator</span><span class="p">(</span><span class="n">G</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>and that loop would return spanning trees starting with the ones of minimum cost and climbing to the ones of maximum cost.</p>
<p>In order to implement this feature, my first step was to ensure that once I knew the edge partition of the graph, I could find a minimum spanning tree which respected the partition.
As a brief reminder, the edge partition defines two disjoint sets of edges: one whose edges <em>must</em> appear in the resulting spanning tree and one whose edges <em>cannot</em> appear in it.
Edges which are neither included in nor excluded from the spanning tree are called open.</p>
<p>The easiest algorithm to adapt for this is Kruskal&rsquo;s algorithm.
The included edges are all added to the spanning tree first, and then the algorithm can join the components created with the included edges using the open edges.</p>
<p>This was easy to implement in NetworkX.
Kruskal&rsquo;s algorithm in NetworkX is a generator which returns the edges in the minimum spanning tree one at a time using a sorted list of edges.
All that I had to do was change the sorting so that the included edges were always at the front of that list; the algorithm would then always select them for the spanning tree, regardless of weight.</p>
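<p>The idea of sorting included edges to the front can be sketched in plain Python. This is a standalone illustration, not the NetworkX implementation: the function name <code>partition_kruskal</code>, its signature, and the tuple-based edge format are all hypothetical.</p>

```python
def partition_kruskal(nodes, edges, included=(), excluded=()):
    """Return a minimum spanning tree respecting the partition, or None.

    edges is a list of (u, v, weight) tuples for an undirected graph;
    included and excluded are collections of (u, v) pairs.
    """
    included = {frozenset(e) for e in included}
    excluded = {frozenset(e) for e in excluded}
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    candidates = [e for e in edges if frozenset(e[:2]) not in excluded]
    # False sorts before True, so included edges come first; open edges
    # follow in order of increasing weight.
    candidates.sort(key=lambda e: (frozenset(e[:2]) not in included, e[2]))

    tree = []
    for u, v, w in candidates:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v, w))
        elif frozenset((u, v)) in included:
            return None  # the included edges contain a cycle
    return tree if len(tree) == len(nodes) - 1 else None


# Hypothetical triangle graph.
nodes = ["a", "b", "c"]
edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3)]
print(partition_kruskal(nodes, edges))
print(partition_kruskal(nodes, edges, included=[("a", "c")]))
print(partition_kruskal(nodes, edges, excluded=[("a", "b")]))
```

<p>Calling it with an empty partition gives an ordinary minimum spanning tree, which is the same observation that lets the normal Kruskal&rsquo;s implementation become a thin wrapper.</p>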
<p>Additionally, since an ordinary spanning tree of a graph is a partitioned tree where the partition has no included or excluded edges, I was able to convert the normal Kruskal&rsquo;s implementation into a wrapper for my partition-respecting one in order to reduce redundant code.</p>
<p>As for the partitioning process itself, that proved to be a bit more tricky, mostly because of my own limited Python experience.
(I have only been working with Python since the start of the calendar year.)
In order to implement the partitioning scheme I needed an ordered data structure and chose the <a href="https://docs.python.org/3/library/queue.html"><code>PriorityQueue</code></a> class.
This was convenient, but for elements whose minimum spanning trees had the same weight it tried to compare the dictionaries holding the edge data, which is not a supported operation.
Thus, I implemented a dataclass where only the weight of the spanning tree is comparable.
This means that for ties in spanning tree weight, the oldest partition with that weight is considered first.</p>
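<p>A minimal sketch of such a dataclass, using the standard <code>dataclasses</code> pattern for priority queue items. The class and field names here are chosen for illustration and are assumptions, not necessarily the names used in the NetworkX code.</p>

```python
from dataclasses import dataclass, field
from queue import PriorityQueue


@dataclass(order=True)
class Partition:
    # Only this field takes part in comparisons.
    mst_weight: float
    # Edge data is carried along but ignored when ordering.
    partition_dict: dict = field(compare=False)


queue = PriorityQueue()
queue.put(Partition(17, {("a", "b"): "included"}))
queue.put(Partition(17, {("a", "b"): "excluded"}))  # tie: no TypeError
queue.put(Partition(15, {}))

weights = [queue.get().mst_weight for _ in range(3)]
print(weights)  # [15, 17, 17]
```

<p>With plain <code>(weight, dict)</code> tuples the second <code>put</code> could fall through to comparing the dictionaries and raise a <code>TypeError</code>. The sketch only shows that ties no longer fail; it does not check the order in which tied items come back out.</p>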
<p>Once the implementation details were ironed out, I moved on to testing.
At the time of this writing, I have tested the <code>SpanningTreeIterator</code> on the sample graph in the Sörensen and Janssens paper.
That graph is</p>
<center><img src="tree-example.png" alt="Example Graph"/></center>
<p>It has eight spanning trees, ranging in weight from 17 to 23, all of which are shown below.</p>
<center>
<img src="eight-spanning-trees-1.png" alt="Four of the eight spanning trees on the sample graph"/>
<img src="eight-spanning-trees-2.png" alt="Four of the eight spanning trees on the sample graph"/>
</center>
<p>Since this graph only has a few spanning trees, it was easy to explicitly test that each graph returned from the iterator was the next one in the sequence.
The iterator also works backwards, so calling</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">for</span> <span class="n">tree</span> <span class="ow">in</span> <span class="n">nx</span><span class="o">.</span><span class="n">SpanningTreeIterator</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">minimum</span><span class="o">=</span><span class="kc">False</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>starts with the maximum spanning tree and works down to the minimum spanning tree.</p>
<p>The code for the spanning tree iterator can be found <a href="https://github.com/mjschwenne/networkx/blob/bothTSP/networkx/algorithms/tree/mst.py">here</a> starting around line 761.</p>
<h2 id="arborescence-iterator">Arborescence Iterator<a class="headerlink" href="#arborescence-iterator" title="Link to this heading">#</a></h2>
<p>The arborescence iterator is what I actually need for my GSoC project, and as expected was more complicated to implement.
In my original post titled <a href="../finding-all-minimum-arborescences">Finding All Minimum Arborescences</a>, I discussed cases that Edmonds&rsquo; algorithm [1] would need to handle and proposed a change to the <code>desired_edge</code> method.</p>
<p>These changes were easy to make, but they were not the full extent of the changes needed, as I had originally thought.
The original graph from Edmonds&rsquo; 1967 paper is below</p>
<center><img src="digraph-example.png" alt="An example directed graph"/></center>
<p>In my first test, which was limited to the minimum spanning arborescence of a random partition I created, the results were close.
Below, the blue edges are included and the red one is excluded.</p>
<center><img src="digraph-partition.png" alt="A small partition on the edges of the example graph"/></center>
<p>The minimum spanning arborescence that was initially returned is shown below.</p>
<center><img src="digraph-partition-msa.png" alt="Minimum spanning arborescence respecting the above partition"/></center>
<p>While the $(3, 0)$ edge is properly excluded and the $(2, 3)$ edge is included, the $(6, 2)$ edge is not present in the arborescence (shown as a dashed edge).
Tracking this problem down was a hassle, but it comes from the way Edmonds&rsquo; algorithm works: a cycle, which would have been present if the $(6, 2)$ edge were included, is collapsed into a single vertex as the algorithm moves to the next iteration.
Once that cycle is collapsed into a vertex, the algorithm still has to choose how to access that vertex, and the choice is based on the best edge as before (this is step I1 in [1]).
Then, when the algorithm expands the cycle out, it will remove the edge which is</p>
<ul>
<li>Wholly contained inside the cycle and,</li>
<li>Directed towards the vertex which is the &lsquo;access point&rsquo; for the cycle.</li>
</ul>
<p>In this case, that would be $(6, 2)$, shown in red in the next image.
Represented visually, the cycle with incoming edges would look like</p>
<center><img src="digraph-cycle.png" alt="Problematic cycle with partition edges"/></center>
<p>That cycle would be collapsed into a new vertex, $N$, for which the incoming edge with weight 12 would be selected.</p>
<center><img src="digraph-collapsed-cycle.png" alt="Same cycle after the Edmonds algorithm collapsed it"/></center>
<p>In this example we want to forbid the algorithm from picking the edge with weight 12, so that when the cycle is reconstructed the included edge $(6, 2)$ is still present.
Once we make one of the incoming edges an included edge, we know from the definition of an arborescence that we cannot reach that vertex via any other edge.
They are all effectively excluded, so once we find an included edge directed towards a vertex we can make all of the other incoming edges excluded.</p>
<p>Returning to the example, the collapsed vertex $N$ would have the edge of weight 12 excluded and would pick the edge of weight 13.</p>
<center><img src="digraph-collapsed-forbidden-cycle.png" alt="Solution to tracking bad cycles in the arborescence"/></center>
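<p>The rule that an included incoming edge closes off its target vertex can be sketched as follows. The helper name <code>arcs_to_exclude</code> is hypothetical; the edge list is the one used in the brute force script later in this post, and $(2, 3)$ and $(6, 2)$ are the included edges discussed above.</p>

```python
import networkx as nx

edgelist = [
    (0, 2), (0, 4), (1, 0), (1, 5), (2, 1), (2, 3),
    (2, 5), (3, 0), (3, 4), (3, 6), (4, 7), (5, 6),
    (5, 8), (6, 2), (6, 8), (7, 3), (7, 6), (8, 7),
]


def arcs_to_exclude(G, included):
    """Return every arc that competes with an included arc for its head."""
    included = set(included)
    newly_excluded = set()
    for _, head in included:
        for tail in G.predecessors(head):
            if (tail, head) not in included:
                newly_excluded.add((tail, head))
    return newly_excluded


G = nx.DiGraph(edgelist)
print(sorted(arcs_to_exclude(G, [(2, 3), (6, 2)])))  # [(0, 2), (7, 3)]
```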
<p>At this point the iterator would find 236 arborescences with cost ranging from 96 to 125.
I thought that I was very close to being finished and I knew that the cost of the minimum spanning arborescence was 96, until I checked to see what the weight of the maximum spanning arborescence was: 131.</p>
<p>This means that I was removing partitions which contained a valid arborescence before they were added to the priority queue.
My <code>check_partition</code> method within the <code>ArborescenceIterator</code> was doing the following:</p>
<ul>
<li>Count the number of included and excluded incoming edges for each vertex.</li>
<li>Save all of the included edges to a list to be checked for cycles.</li>
<li>If there was more than one included edge or all of the edges were excluded, return <code>False</code>.</li>
<li>If there was one included edge, make all of the others excluded.</li>
</ul>
<p>Rather than try to debug what I thought was a good method, I decided to change my process.
I moved the last bullet point into the <code>write_partition</code> method and then stopped using the <code>check_partition</code> method.
If an edge partition does not have a spanning arborescence, the <code>partition_spanning_arborescence</code> function will return <code>None</code> and I discard the partition.
This approach is more computationally intensive, but it increased the number of returned spanning arborescences from 236 to 680 and the range expanded to the proper 96 - 131.</p>
<p>But how do I know that it isn&rsquo;t skipping arborescences within that range?
Since 680 arborescences is too many to explicitly check, I decided to write another test case.
This one would check that the number of arborescences was correct and that the sequence never decreases.</p>
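<p>A check of this kind might look like the sketch below. The function name <code>check_iterator_output</code> is hypothetical, and the two small digraphs only stand in for what an iterator would return; the cost of each tree is taken from <code>size(weight=...)</code>, which sums the edge weights.</p>

```python
import networkx as nx


def check_iterator_output(trees, expected_count, weight="weight"):
    """Return True if there are expected_count trees in nondecreasing cost order."""
    costs = [tree.size(weight=weight) for tree in trees]
    never_decreases = all(a <= b for a, b in zip(costs, costs[1:]))
    return len(costs) == expected_count and never_decreases


# Hypothetical stand-ins for iterator output: two tiny arborescences.
cheap = nx.DiGraph()
cheap.add_weighted_edges_from([(0, 1, 2), (1, 2, 3)])
costly = nx.DiGraph()
costly.add_weighted_edges_from([(0, 1, 2), (0, 2, 5)])

print(check_iterator_output([cheap, costly], 2))  # True
print(check_iterator_output([costly, cheap], 2))  # False
```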
<p>In order to check the number of arborescences, I decided to take a brute force approach.
There are</p>
<p>$$
\binom{18}{8} = 43{,}758
$$</p>
<p>possible combinations of edges which could be arborescences.
That&rsquo;s a lot of combinations, more than I wanted to check by hand, so I wrote a short Python script.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">combinations</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">edgelist</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">combo_count</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="n">arbor_count</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">combo</span> <span class="ow">in</span> <span class="n">combinations</span><span class="p">(</span><span class="n">edgelist</span><span class="p">,</span> <span class="mi">8</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">combo_count</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">    <span class="n">combo_test</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">DiGraph</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">combo_test</span><span class="o">.</span><span class="n">add_edges_from</span><span class="p">(</span><span class="n">combo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">nx</span><span class="o">.</span><span class="n">is_arborescence</span><span class="p">(</span><span class="n">combo_test</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">arbor_count</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="sa">f</span><span class="s2">&#34;There are </span><span class="si">{</span><span class="n">combo_count</span><span class="si">}</span><span class="s2"> possible combinations of eight edges which &#34;</span>
</span></span><span class="line"><span class="cl">    <span class="sa">f</span><span class="s2">&#34;could be an arborescence.&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;Of those </span><span class="si">{</span><span class="n">combo_count</span><span class="si">}</span><span class="s2"> combinations, </span><span class="si">{</span><span class="n">arbor_count</span><span class="si">}</span><span class="s2"> are arborescences.&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>The output of this script is</p>

<div class="highlight">
  <pre>There are 43758 possible combinations of eight edges which could be an arborescence.
Of those 43758 combinations, 680 are arborescences.</pre>
</div>

<p>So now I know how many arborescences were in the graph, and it matched the number returned from the iterator.
Thus, I believe that the iterator is working well.</p>
<p>The iterator code is <a href="https://github.com/mjschwenne/networkx/blob/bothTSP/networkx/algorithms/tree/branchings.py">here</a> and starts around line 783.
It can be used in the same way as the spanning tree iterator.</p>
<p><a href="https://mjschwenne.github.io/assets/iterator-output.pdf">Attached</a> is a sample output from the iterator detailing all 680 arborescences of the test graph.
Since Jekyll will not let me put up the txt file, I had to convert it into a PDF, which takes 127 pages to show the 6800 lines of output from displaying all of the arborescences.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] J. Edmonds, <em>Optimum Branchings</em>, Journal of Research of the National Bureau of Standards, 1967, Vol. 71B, p.233-240, <a href="https://archive.org/details/jresv71Bn4p233">https://archive.org/details/jresv71Bn4p233</a></p>
<p>[2] G.K. Janssens, K. Sörensen, <em>An algorithm to generate all spanning trees in order of increasing cost</em>, Pesquisa Operacional, 2005-08, Vol. 25 (2), p. 219-229, <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en">https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Finding all Minimum Arborescences]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/</id>
            
            
            <published>2021-06-05T00:00:00+00:00</published>
            <updated>2021-06-05T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Exploring an algorithm to generate arborescences in ascending order</blockquote><p>There is only one thing that I need to figure out before the first coding period for GSoC starts on Monday: how to find <em>all</em> of the minimum arborescences of a graph.
This is the set $K(\pi)$ in the Held and Karp paper from 1970 which can be refined down to $K(\pi, d)$ or $K_{X, Y}(\pi)$ as needed.
For more information as to why I need to do this, please see my last post <a href="../a-closer-look-at-held-karp">here</a>.</p>
<p>This is a place where my contributions to NetworkX to implement the Asadpour algorithm [1] for the directed traveling salesman problem will be useful to the rest of the NetworkX community (I hope).
The research paper that I am going to base this on is <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en&amp;format=pdf">this</a> 2005 paper by Sörensen and Janssens titled <em>An Algorithm to Generate all Spanning Trees of a Graph in Order of Increasing Cost</em> [4].</p>
<p>The basic idea here is to implement their algorithm and then generate spanning trees until we find the first one whose cost is greater than that of the first tree generated, which we know is a minimum; at that point we have found all of the minimum spanning trees.
I know what you guys are saying, &ldquo;Matt, this paper discusses <em>spanning trees</em>, not spanning arborescences, how is this helpful?&rdquo;.
Well, the heart of this algorithm is to partition the edges into excluded edges, which cannot appear in the tree; included edges, which must appear in the tree; and open edges, which may appear in the tree but are not required to.
Once we have a partition, we need to be able to find a minimum spanning tree or minimum spanning arborescence that respects the partitioned edges.</p>
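<p>The stopping rule from the start of this section, generating trees in order of increasing cost until the cost first rises, can be sketched independently of how the trees are produced. The function name <code>all_minimum</code> is hypothetical, and plain numbers stand in for trees so that the example is self-contained.</p>

```python
def all_minimum(ordered_trees, cost):
    """Collect the leading run of trees that share the minimum cost.

    ordered_trees must yield trees in order of nondecreasing cost.
    """
    iterator = iter(ordered_trees)
    try:
        first = next(iterator)
    except StopIteration:
        return []
    minimum_cost = cost(first)
    minimum_trees = [first]
    for tree in iterator:
        if cost(tree) > minimum_cost:
            break  # every later tree costs at least this much
        minimum_trees.append(tree)
    return minimum_trees


# Stand-in "trees": plain numbers acting as their own cost.
print(all_minimum([17, 17, 18, 20], cost=lambda t: t))  # [17, 17]
```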
<p>In NetworkX, the minimum spanning arborescences are generated using Chu-Liu/Edmonds’ Algorithm developed by Yoeng-Jin Chu and Tseng-Hong Liu in 1965 and independently by Jack Edmonds in 1967.
I believe that Edmonds&rsquo; Algorithm [2] can be modified to require an arc to be either included or excluded from the resulting spanning arborescence, thus allowing me to implement Sörensen and Janssens&rsquo; algorithm for directed graphs.</p>
<p>First, let&rsquo;s explore whether the partition scheme discussed in the Sörensen and Janssens paper [4] will work for a directed graph.
The critical ideas for creating the partitions are given on pages 221 and 222 and are as follows:</p>
<blockquote>
<p>Given an MST of a partition, this partition can be split into a set of resulting partitions in such a way that the following statements hold:</p>
<ul>
<li>the intersection of any two resulting partitions is the empty set,</li>
<li>the MST of the original partition is not an element of any of the resulting partitions,</li>
<li>the union of the resulting partitions is equal to the original partition, minus the MST of the original partition.</li>
</ul>
</blockquote>
<p>In order to achieve these conditions, they define the generation of the partitions using this definition for a minimum spanning tree</p>
<p>$$
s(P) = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), \dots, (t_{n-r-1}, v_{n-r-1})\}
$$</p>
<p>where the $(i, j)$ edges are the included edges of the original partition and the $(t, v)$ are from the open edges of the original partition.
Now, to create the next set of partitions, take the $(t, v)$ edges one at a time in sequence: each edge is an excluded edge in the first partition it appears in and an included edge in all subsequent partitions.
This will produce something to the effect of</p>
<p>$$
\begin{array}{l}
P_1 = \{(i_1, j_1), \dots, (i_r, j_r), (\overline{m_1, p_1}), \dots, (\overline{m_l, p_l}), (\overline{t_1, v_1})\} \\\
P_2 = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), (\overline{m_1, p_1}), \dots, (\overline{m_l, p_l}), (\overline{t_2, v_2})\} \\\
P_3 = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), (t_2, v_2), (\overline{m_1, p_1}), \dots, (\overline{m_l, p_l}), (\overline{t_3, v_3})\} \\\
\vdots \\\
\begin{multline*}
P_{n-r-1} = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), \dots, (t_{n-r-2}, v_{n-r-2}), (\overline{m_1, p_1}), \dots, (\overline{m_l, p_l}), \\\
(\overline{t_{n-r-1}, v_{n-r-1}})\}
\end{multline*} \\\
\end{array}
$$</p>
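<p>The splitting rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name <code>split_partition</code> and the list-based representation are hypothetical, and the open edges of the minimum spanning tree are assumed to be given in the order $(t_1, v_1), \dots, (t_{n-r-1}, v_{n-r-1})$.</p>

```python
def split_partition(included, excluded, mst_open_edges):
    """Return the child partitions as (included, excluded) pairs of lists."""
    children = []
    promoted = []  # open MST edges already turned into included edges
    for edge in mst_open_edges:
        child_included = list(included) + promoted
        child_excluded = list(excluded) + [edge]
        children.append((child_included, child_excluded))
        # Every later child must contain this edge.
        promoted = promoted + [edge]
    return children


# Hypothetical partition with one included edge, one excluded edge and
# three open edges taken from its minimum spanning tree.
children = split_partition(
    included=[("i1", "j1")],
    excluded=[("m1", "p1")],
    mst_open_edges=[("t1", "v1"), ("t2", "v2"), ("t3", "v3")],
)
for child_included, child_excluded in children:
    print("included:", child_included, "excluded:", child_excluded)
```

<p>Each child excludes exactly one more edge than its parent, which is why the parent&rsquo;s minimum spanning tree cannot belong to any of the children.</p>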
<p>Now, if we extend this to a directed graph, our included and excluded edges become included and excluded arcs, but the definition of the spanning arborescence of a partition does not change.
Let $s_a(P)$ be the minimum spanning arborescence of a partition $P$.
Then</p>
<p>$$
s_a(P) = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), \dots, (t_{n-r-1}, v_{n-r-1})\}
$$</p>
<p>$s_a(P)$ still consists of all of the included arcs of the partition and a subset of its open arcs.
If we partition in the same manner as the Sörensen and Janssens paper [4], then for any two resulting partitions there is an arc which one of them includes and the other excludes.
No spanning arborescence can both include and exclude an arc, so the intersection of any two resulting partitions is empty.</p>
<p>Clearly the original arborescence, which includes all of the $(t_1, v_1), \dots, (t_{n-r-1}, v_{n-r-1})$, cannot be an element of any of the resulting partitions, since each of them excludes one of those arcs.</p>
<p>Finally, there is the claim that the union of the resulting partitions is the original partition minus the original minimum spanning tree.
Being honest here, this claim took a while for me to understand.
In fact, I had a whole paragraph talking about how this claim doesn&rsquo;t make sense before all of a sudden I realized that it does.
The important thing to remember here is that the union of the partitions isn&rsquo;t the union of the sets of included and excluded edges (which is where I went wrong the first time); it is a union of sets of spanning trees.
The original partition contains many spanning trees, one or more of which are minimum, and each tree in the partition is a distinct subset of the edges of the original graph.
Now, because each of the resulting partitions excludes one of the edges of the original partition&rsquo;s minimum spanning tree, we know that the original minimum spanning tree is <em>not</em> an element of the union of the resulting partitions.
However, every other spanning tree in the original partition differs from the selected minimum one by at least one edge, so it is a member of one of the resulting partitions: specifically, the one whose excluded edge is the first edge of the selected minimum spanning tree which it does not contain.</p>
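<p>For a small undirected example the three conditions can be checked by brute force. The sketch below is my own illustration and not code from the paper: the graph, its weights and the helper <code>spanning_trees</code> are all made up, and enumerating every edge subset is only sensible for tiny graphs.</p>

```python
from itertools import combinations


def spanning_trees(nodes, edges):
    """Yield every spanning tree of a small graph as a frozenset of edges."""
    for subset in combinations(edges, len(nodes) - 1):
        parent = {v: v for v in nodes}

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        # n - 1 acyclic edges on n nodes always form a spanning tree.
        if acyclic:
            yield frozenset(subset)


nodes = [0, 1, 2, 3]
weights = {(0, 1): 1, (0, 2): 4, (0, 3): 3, (1, 2): 2, (1, 3): 5, (2, 3): 6}
trees = set(spanning_trees(nodes, list(weights)))
mst = min(trees, key=lambda t: sum(weights[e] for e in t))

# Split the root partition (nothing included, nothing excluded) on the MST edges.
children = []
included = frozenset()
for edge in sorted(mst):
    children.append((included, frozenset({edge})))
    included = included | {edge}

# The spanning trees belonging to each child partition.
members = [
    {t for t in trees if inc.issubset(t) and exc.isdisjoint(t)}
    for inc, exc in children
]

disjoint = all(a.isdisjoint(b) for a, b in combinations(members, 2))
mst_absent = all(mst not in m for m in members)
covers_rest = set().union(*members) == trees - {mst}
print(disjoint, mst_absent, covers_rest)
```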
<p>So now we know that this same partition scheme which works for undirected graphs will work for directed ones.
We need to modify Edmonds’ algorithm to mandate that certain arcs be included and others excluded.
To start, a review of this algorithm is in order.
The original description of the algorithm is given on pages 234 and 235 of Jack Edmonds&rsquo; 1967 paper <em>Optimum Branchings</em> [2] and roughly speaking it has three major steps.</p>
<ol>
<li>For each vertex $v$, find the incoming arc with the smallest weight and place that arc in a bucket $E^i$ and the vertex in a bucket $D^i$.
Repeat this step until either (a) $E^i$ no longer qualifies as a branching or (b) all vertices of the graph are in $D^i$.
If (a) occurs, go to step 2, otherwise go to step 3.</li>
<li>If $E^i$ no longer qualifies as a branching then it must contain a cycle.
Contract all of the vertices of the cycle into one new one, say $v_1^{i + 1}$.
Every edge which has one endpoint in the cycle has that endpoint replaced with $v_1^{i + 1}$ and its cost updated.
Using this new graph $G^{i + 1}$, create buckets $D^{i + 1}$ containing the nodes in both $G^{i + 1}$ and $D^i$ and $E^{i + 1}$ containing edges in both $G^{i + 1}$ and $E^i$
(i.e. remove the edges and vertices which are affected by the creation of $G^{i + 1}$.)
Return to step 1 and apply it to graph $G^{i + 1}$.</li>
<li>Once this step is reached, we have a smaller graph for which we have found a minimum spanning arborescence.
Now we need to un-contract all of the cycles to return to the original graph.
To do this, check whether the node $v_1^{i + 1}$ is the root of the arborescence or not.
<ul>
<li>$v_1^{i + 1}$ is the root: Remove the arc of maximum weight from the cycle represented by $v_1^{i + 1}$.</li>
<li>$v_1^{i + 1}$ is not the root: There is a single arc directed towards $v_1^{i + 1}$ which translates into an arc directed to one of the vertices in the cycle represented by $v_1^{i + 1}$.
Because $v_1^{i + 1}$ represents a cycle, there is another arc wholly internal to the cycle which is directed into the same vertex as the incoming edge to the cycle.
Delete the internal one to break the cycle.
Repeat until the original graph has been restored.</li>
</ul>
</li>
</ol>
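<p>To make steps 1 and 2 a little more concrete, here is a minimal sketch of the greedy selection and the cycle check. It is not the NetworkX implementation: it assumes a fixed root vertex, stores the arcs in a plain dictionary, and the names <code>cheapest_incoming</code> and <code>find_cycle</code> are hypothetical. Contraction and expansion (the rest of steps 2 and 3) are left out.</p>

```python
def cheapest_incoming(arcs, root):
    """Pick the minimum-weight incoming arc for every non-root vertex.

    arcs maps (u, v) to a weight. The result maps v to (u, weight).
    """
    best = {}
    for (u, v), w in arcs.items():
        if v == root or u == v:
            continue
        if v not in best or best[v][1] > w:
            best[v] = (u, w)
    return best


def find_cycle(best):
    """Return the vertices of a cycle among the chosen arcs, or None."""
    for start in best:
        seen = []
        v = start
        while v in best and v not in seen:
            seen.append(v)
            v = best[v][0]  # walk backwards along the chosen arc
        if v in seen:
            return seen[seen.index(v):]
    return None


arcs = {("r", "a"): 5, ("b", "a"): 1, ("a", "b"): 1, ("r", "b"): 4, ("b", "c"): 2}
best = cheapest_incoming(arcs, root="r")
cycle = find_cycle(best)
print(best)
print(cycle)
```

<p>Here the cheapest arcs into <code>a</code> and <code>b</code> point at each other, so the selection is not a branching and the two vertices would be contracted in step 2.</p>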
<p>Now that we are familiar with the minimum arborescence algorithm, we can discuss modifying it to force it to include certain edges or reject others.
The changes will be primarily located in step 1.
Under the normal operation of the algorithm, the consideration which happens at each vertex might look like this.</p>
<center><img src="edmonds-normal.png" alt="Edmonds algorithm selecting edge without restrictions"/></center>
<p>Here the bolded arrow is chosen by the algorithm as it is the incoming arc with minimum weight.
Now, if we were required to include a different arc, say the weight 6 arc, we would want the algorithm to pick that arc even though it is, strictly speaking, not optimal.
Similarly, if the arc of weight 2 was excluded we would also want to pick the arc of weight 6.
Below, the excluded arc is drawn as a dashed line.</p>
<center><img src="edmonds-one-required.png" alt="Edmonds algorithm forced to pick a non-optimal arc"/></center>
<p>But realistically, these are routine cases that would not be difficult to implement.
A more interesting case would be if all of the arcs were excluded or if more than one were included.</p>
<center><img src="edmonds-all-excluded.png" alt="Edmonds algorithm which cannot pick any arc"/></center>
<p>In this case, there is no spanning arborescence for the partition because the graph is not connected.
The Sörensen and Janssens paper characterizes these as <em>empty</em> partitions and they are ignored.</p>
<center><img src="edmonds-multiple-required.png" alt="Edmonds algorithm which must pick more than one arc"/></center>
<p>In this case, things start to get a bit tricky.
With two (or more) included arcs leading to this vertex, the result is by definition not an arborescence since, according to Edmonds on page 233,</p>
<blockquote>
<p>A branching is a forest whose edges are directed so that each is directed toward a different node. An arborescence is a connected branching.</p>
</blockquote>
<p>At first I thought that this case could be valid because it might result in the creation of a cycle, but I realize now that step 3 of Edmonds’ algorithm would remove one of those arcs anyway.
Thus, any partition with multiple included arcs leading to a single vertex is empty by definition.
While there are ways in which the algorithm can handle the inclusion of multiple arcs, one (or more) of them will, by the definition of an arborescence, be deleted by the end of the algorithm.</p>
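<p>A check for this situation only needs the heads of the included arcs. The following is a minimal sketch, assuming arcs are <code>(tail, head)</code> tuples; the function name is hypothetical.</p>

```python
from collections import Counter


def has_conflicting_included_arcs(included_arcs):
    """Return True if two or more included arcs point at the same vertex."""
    in_degree = Counter(head for _, head in included_arcs)
    return any(count > 1 for count in in_degree.values())


print(has_conflicting_included_arcs([("a", "c"), ("b", "c")]))
print(has_conflicting_included_arcs([("a", "b"), ("b", "c")]))
```

<p>The first call reports a conflict because both arcs enter <code>c</code>; the second does not.</p>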
<p>I propose that these partitions are screened out before we hand off to Edmonds&rsquo; algorithm to find the arborescences.
As such, Edmonds&rsquo; algorithm will need to be modified for the cases of at most one included edge per vertex and any number of excluded edges per vertex.
The critical part of altering Edmonds&rsquo; Algorithm is contained within the <code>desired_edge</code> function in the NetworkX implementation starting on line 391 in <code>algorithms.tree.branchings</code>.
The whole function is as follows.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">desired_edge</span><span class="p">(</span><span class="n">v</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Find the edge directed toward v with maximal weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">edge</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl">    <span class="n">weight</span> <span class="o">=</span> <span class="o">-</span><span class="n">INF</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">data</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">in_edges</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">keys</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">new_weight</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">attr</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">new_weight</span> <span class="o">&gt;</span> <span class="n">weight</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">weight</span> <span class="o">=</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge</span> <span class="o">=</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">edge</span><span class="p">,</span> <span class="n">weight</span></span></span></code></pre>
</div>
<p>The function would be changed to automatically return an included arc and to skip any excluded arcs.
Because this is an inner function, we can access parameters passed to the parent function, such as something along the lines of <code>partition=None</code>, where the value of <code>partition</code> is the name of the edge attribute which is <code>True</code> if the arc is included and <code>False</code> if it is excluded.
Open edges would not need this attribute or could use <code>None</code>.
Creating an enum is also possible, which would unify the language, once I talk to my GSoC mentors about how it would fit into the NetworkX ecosystem.
A revised version of <code>desired_edge</code> using the <code>True</code> and <code>False</code> scheme would then look like this:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">desired_edge</span><span class="p">(</span><span class="n">v</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Find the edge directed toward v with maximal weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">edge</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl">    <span class="n">weight</span> <span class="o">=</span> <span class="o">-</span><span class="n">INF</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">data</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">in_edges</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">keys</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">new_weight</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">attr</span><span class="p">]</span>
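</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">data</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">partition</span><span class="p">)</span> <span class="ow">is</span> <span class="kc">True</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="c1"># An included arc is returned immediately, whatever its weight.</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">),</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># Excluded arcs (False) are skipped; open arcs have no attribute or None.</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">new_weight</span> <span class="o">&gt;</span> <span class="n">weight</span> <span class="ow">and</span> <span class="n">data</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">partition</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">False</span><span class="p">:</span>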
</span></span><span class="line"><span class="cl">            <span class="n">weight</span> <span class="o">=</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge</span> <span class="o">=</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">edge</span><span class="p">,</span> <span class="n">weight</span></span></span></code></pre>
</div>
<p>And a version using the enum might look like</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">desired_edge</span><span class="p">(</span><span class="n">v</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Find the edge directed toward v with maximal weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">edge</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl">    <span class="n">weight</span> <span class="o">=</span> <span class="o">-</span><span class="n">INF</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">data</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">in_edges</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">keys</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">new_weight</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">attr</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">data</span><span class="p">[</span><span class="n">partition</span><span class="p">]</span> <span class="ow">is</span> <span class="n">Partition</span><span class="o">.</span><span class="n">INCLUDED</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">),</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">new_weight</span> <span class="o">&gt;</span> <span class="n">weight</span> <span class="ow">and</span> <span class="n">data</span><span class="p">[</span><span class="n">partition</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="n">Partition</span><span class="o">.</span><span class="n">EXCLUDED</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">weight</span> <span class="o">=</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge</span> <span class="o">=</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">edge</span><span class="p">,</span> <span class="n">weight</span></span></span></code></pre>
</div>
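<p>The snippets above rely on variables from the enclosing NetworkX function, so they cannot be run on their own. As a self-contained sketch of the same selection rule, assume a hypothetical <code>Partition</code> enum with three states and incoming edges given as <code>(u, v, weight, state)</code> tuples; none of these names are an existing NetworkX API.</p>

```python
from enum import Enum


class Partition(Enum):
    """Hypothetical edge states, used for illustration only."""

    OPEN = 0
    INCLUDED = 1
    EXCLUDED = 2


def pick_incoming(in_edges):
    """Pick the incoming edge to use from (u, v, weight, state) tuples."""
    best = None
    for u, v, weight, state in in_edges:
        if state is Partition.INCLUDED:
            # An included edge is a forced choice, whatever its weight.
            return (u, v, weight)
        if state is Partition.EXCLUDED:
            # An excluded edge is never a candidate.
            continue
        if best is None or weight > best[2]:
            best = (u, v, weight)
    return best


open_and_excluded = [
    ("a", "v", 2, Partition.OPEN),
    ("b", "v", 6, Partition.OPEN),
    ("c", "v", 9, Partition.EXCLUDED),
]
one_included = [
    ("a", "v", 2, Partition.INCLUDED),
    ("b", "v", 6, Partition.OPEN),
]
all_excluded = [("a", "v", 2, Partition.EXCLUDED)]

print(pick_incoming(open_and_excluded))
print(pick_incoming(one_included))
print(pick_incoming(all_excluded))
```

<p>Like <code>desired_edge</code>, this picks the edge of maximal weight among the candidates. The last call returns <code>None</code>, which corresponds to a vertex that cannot be reached and therefore to an empty partition.</p>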
<p>Once Edmonds&rsquo; algorithm has been modified to be able to use partitions, the pseudocode from the Sörensen and Janssens paper would be applicable.</p>

<div class="highlight">
  <pre>Input: Graph G(V, E) and weight function w
Output: Output_File (all spanning trees of G, sorted in order of increasing cost)

List = {A}
Calculate_MST(A)
while MST ≠ ∅ do
	Get partition Ps in List that contains the smallest spanning tree
	Write MST of Ps to Output_File
	Remove Ps from List
	Partition(Ps)</pre>
</div>

<p>And the corresponding <code>Partition</code> function being</p>

<div class="highlight">
  <pre>P1 = P2 = P
for each edge i in P do
	if i not included in P and not excluded from P then
		make i excluded from P1
		make i include in P2
		Calculate_MST(P1)
		if Connected(P1) then
			add P1 to List
		P1 = P2</pre>
</div>

<p>I would need to restructure the first block of pseudocode, as I would like it to be a Python iterator, so that a <code>for</code> loop can iterate through the spanning arborescences in order of increasing cost and stop once the cost increases, which limits it to only the minimum spanning arborescences.</p>
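<p>As a rough idea of what such an iterator could look like, here is a sketch for the simpler undirected case. It is not the planned NetworkX code: it uses a partition-aware Kruskal&rsquo;s algorithm in place of the modified Edmonds&rsquo; algorithm, a heap in place of <code>List</code>, and the names <code>partition_mst</code> and <code>spanning_trees_by_cost</code> are hypothetical.</p>

```python
import heapq
from itertools import count


def partition_mst(nodes, weights, included, excluded):
    """Kruskal's algorithm restricted to a partition; None if no tree exists."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Included edges go first, then the open edges by increasing weight.
    order = sorted(included) + sorted(
        (e for e in weights if e not in included and e not in excluded),
        key=weights.get,
    )
    tree = []
    for u, v in order:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
        elif (u, v) in included:
            return None  # the included edges contain a cycle
    return tree if len(tree) == len(nodes) - 1 else None


def spanning_trees_by_cost(nodes, weights):
    """Yield (cost, tree) for every spanning tree, cheapest first."""
    tie = count()  # tie-breaker so the heap never compares trees
    heap = []

    def push(included, excluded):
        tree = partition_mst(nodes, weights, included, excluded)
        if tree is not None:  # empty partitions are dropped
            cost = sum(weights[e] for e in tree)
            heapq.heappush(heap, (cost, next(tie), tree, included, excluded))

    push(frozenset(), frozenset())
    while heap:
        cost, _, tree, included, excluded = heapq.heappop(heap)
        yield cost, tree
        for edge in tree:
            if edge in included:
                continue
            push(included, excluded | {edge})
            included = included | {edge}


nodes = [0, 1, 2, 3]
weights = {(0, 1): 1, (0, 2): 4, (0, 3): 3, (1, 2): 2, (1, 3): 5, (2, 3): 6}
results = list(spanning_trees_by_cost(nodes, weights))

# Keep only the minimum spanning trees by stopping once the cost increases.
minimum_trees = []
for cost, tree in spanning_trees_by_cost(nodes, weights):
    if cost > results[0][0]:
        break
    minimum_trees.append(tree)
print(len(results), len(minimum_trees))
```

<p>Because it is a generator, the loop at the end only computes as many trees as it consumes.</p>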
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] A. Asadpour, M. X. Goemans, A. Mądry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), p. 1043-1061, <a href="https://homes.cs.washington.edu/~shayan/atsp.pdf">https://homes.cs.washington.edu/~shayan/atsp.pdf</a>.</p>
<p>[2] J. Edmonds, <em>Optimum Branchings</em>, Journal of Research of the National Bureau of Standards, 1967, Vol. 71B, p.233-240, <a href="https://archive.org/details/jresv71Bn4p233">https://archive.org/details/jresv71Bn4p233</a></p>
<p>[3] M. Held, R.M. Karp, <em>The traveling-salesman problem and minimum spanning trees</em>, Operations research, 1970-11-01, Vol.18 (6), p.1138-1162, <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
<p>[4] G.K. Janssens, K. Sörensen, <em>An algorithm to generate all spanning trees in order of increasing cost</em>, Pesquisa Operacional, 2005-08, Vol. 25 (2), p. 219-229, <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en">https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[A Closer Look at the Held-Karp Relaxation]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/</id>
            
            
            <published>2021-06-03T00:00:00+00:00</published>
            <updated>2021-06-03T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Looking for a new method to solve the Held-Karp relaxation from the original Held and Karp paper</blockquote><p>After talking with my GSoC mentors about what we all believe to be the most difficult part of the Asadpour algorithm, the Held-Karp relaxation, we came to several conclusions:</p>
<ul>
<li>The Asadpour paper recommends using the ellipsoid method so that their algorithm runs in polynomial time.
We do not need a polynomial-time algorithm, just one with a reasonable execution time.
An example of this would be the ellipsoid algorithm versus the simplex algorithm.
While the simplex algorithm is exponential in the worst case, in practice it is almost always faster than the ellipsoid algorithm.</li>
<li>Our interest in the ellipsoid algorithm was not based on performance, but rather on its ability to handle a linear program with an exponential number of constraints.
This is done with a separation oracle; see my post <a href="../held-karp-separation-oracle">here</a> for more information about the oracle.</li>
<li>Implementing a robust ellipsoid algorithm solver (something notably missing from the scientific Python ecosystem) is a GSoC project unto itself and beyond the scope of this project for NetworkX.</li>
</ul>
<p>Thus, alternative methods for solving the Held-Karp relaxation needed to be investigated.
To this end, we turned to the original 1970 paper by Held and Karp, <em>The Traveling Salesman Problem and Minimum Spanning Trees</em>, to see how they proposed solving the relaxation (note that this paper was published before the ellipsoid algorithm was applied to linear programming in 1979).
The Held and Karp paper discusses three methods for solving the relaxation:</p>
<ul>
<li><strong>Column Generating:</strong> An older method of solving very large linear programs where only the variables that influence the optimal solution need to be examined.</li>
<li><strong>Ascent Method:</strong> A method based around maximizing the dual of the linear program which is best described as seeking the direction of ascent for the objective function in a similar way to the notion of a gradient in multivariate calculus.</li>
<li><strong>Branch and Bound:</strong> This method has the most theoretical benefits and seeks to augment the ascent method to avoid the introduction of fractional weights which are the largest contributors to a slow convergence rate.</li>
</ul>
<p>But before we explore the methods that Held and Karp discuss, we need to ensure that these methods still apply to solving the Held-Karp relaxation within the context of the Asadpour paper.
The definition of the Held-Karp relaxation that I have been using on this blog comes from the Asadpour paper, section 3 and is listed below.</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{a} c(a)x_a \\\
\text{s.t.} &amp; x(\delta^+(U)) \geqslant 1 &amp; \forall\ U \subset V \text{ and } U \not= \emptyset \\\
&amp; x(\delta^+(v)) = x(\delta^-(v)) = 1 &amp; \forall\ v \in V \\\
&amp; x_a \geqslant 0 &amp; \forall\ a
\end{array}
$$</p>
<p>The closest match to this program in the Held and Karp paper is their linear program 3, which is a linear programming representation of the entire traveling salesman problem, not solely the relaxed version.
Note that Held and Karp were dealing with the symmetric TSP (STSP) while Asadpour is addressing the asymmetric or directed TSP (ATSP).</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{1 \leq i &lt; j \leq n} c_{i j}x_{i j} \\\
\text{s.t.} &amp; \sum_{j &gt; i} x_{i j} + \sum_{j &lt; i} x_{j i} = 2 &amp; (i = 1, 2, \dots, n) \\\
&amp; \sum_{\substack{i \in S \\\ j \in S \\\ i &lt; j}} x_{i j} \leq |S| - 1 &amp; \text{for any proper subset } S \subset \{2, 3, \dots, n\} \\\
&amp; 0 \leq x_{i j} \leq 1 &amp; (1 \leq i &lt; j \leq n) \\\
&amp; x_{i j} \text{ integer} \\\
\end{array}
$$</p>
<p>The last two constraints of the second linear program bound the variables and force them to be integers, which keeps the program within the scope of the original problem, while the first two constraints do most of the work in finding a TSP tour.
Additionally, replacing the last two constraints with $x_{i j} \geq 0$ <em>is</em> the Held-Karp relaxation.
The first constraint, $\sum_{j &gt; i} x_{i j} + \sum_{j &lt; i} x_{j i} = 2$, ensures that every vertex in the resulting tour has exactly two incident edges: one to get there and one to leave by.
This matches the second constraint in the Asadpour ATSP relaxation.
The second constraint in the Held and Karp formulation is another form of the subtour elimination constraint seen in the Asadpour linear program.</p>
<p>Held and Karp also state that</p>
<blockquote>
<p>In this section, we show that minimizing the gap $f(\pi)$ is equivalent to solving this program <em>without</em> the integer constraints.</p>
</blockquote>
<p>on page 1141, so it would appear that solving one of the equivalent programs that Held and Karp formulate should work here.</p>
<h2 id="column-generation-technique">Column Generation Technique<a class="headerlink" href="#column-generation-technique" title="Link to this heading">#</a></h2>
<p>The Column Generation technique seeks to solve linear program 2 from the Held and Karp paper, stated as</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{k} c_ky_k \\\
\text{s.t.} &amp; y_k \geq 0 &amp; \forall\ k \\\
&amp; \sum_k y_k = 1 \\\
&amp; \sum_k (-v_{i k})y_k = 0 &amp; (i = 2, 3, \dots, n - 1) \\\
\end{array}
$$</p>
<p>Where $v_{i k}$ is the degree of vertex $i$ in 1-Tree $k$ minus two, or $v_{i k} = d_{i k} - 2$ and each variable $y_k$ corresponds to a 1-Tree $T^k$.
The associated cost $c_k$ for each tree is the weight of $T^k$.</p>
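<p>Computing $v_{i k}$ for a given 1-Tree is just a degree count. Below is a minimal sketch, assuming the vertices are numbered $1$ to $n$ and the 1-Tree is given as a list of edges; <code>degree_deviation</code> is a hypothetical name.</p>

```python
def degree_deviation(one_tree_edges, n):
    """Return [v_1, ..., v_n] with v_i = d_i - 2 for a 1-Tree on vertices 1..n."""
    degree = [0] * (n + 1)
    for i, j in one_tree_edges:
        degree[i] += 1
        degree[j] += 1
    return [d - 2 for d in degree[1:]]


# A tour has degree two everywhere, so every entry is zero.
tour = [(1, 2), (2, 3), (3, 4), (1, 4)]
# A tree on vertices 2, 3, 4 (a star at 2) plus two edges at vertex 1.
one_tree = [(2, 3), (2, 4), (1, 2), (1, 3)]

print(degree_deviation(tour, 4))
print(degree_deviation(one_tree, 4))
```

<p>Since a 1-Tree on $n$ vertices has $n$ edges, the entries always sum to zero, and they are all zero exactly when the 1-Tree is a tour.</p>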
<p>The rest of this method uses a simplex algorithm to solve the linear program.
We only focus on the edges which are in each of the 1-Trees, giving each column the form</p>
<p>$$
\begin{bmatrix}
1 &amp; -v_{2k} &amp; -v_{3k} &amp; \dots &amp; -v_{n-1,k}
\end{bmatrix}^T
$$</p>
<p>and the column which enters the solution is that of the 1-Tree for which $c_k + \theta + \sum_{j=2}^{n-1} \pi_jv_{j k}$ is a minimum, where $\theta$ and $\pi_j$ come from the vector of &lsquo;shadow prices&rsquo; given by $(\theta, \pi_2, \pi_3, \dots, \pi_{n-1})$.
Now the basis is $(n - 1) \times (n - 1)$ and we can find the 1-Tree to add to the basis using a minimum 1-Tree algorithm which Held and Karp say can be done in $O(n^2)$ steps.</p>
<p>I am already <a href="https://github.com/mjschwenne/GraphAlgorithms/blob/main/src/Simplex.py">familiar</a> with the simplex method, so I will not detail its implementation here.</p>
<h3 id="performance-of-the-column-generation-technique">Performance of the Column Generation Technique<a class="headerlink" href="#performance-of-the-column-generation-technique" title="Link to this heading">#</a></h3>
<p>This technique is slow to converge.
Held and Karp programmed it on an IBM/360 and were able to solve problems consistently for up to $n = 12$.
Now, on a modern computer the clock rate is somewhere between 210 and 101,500 times faster (depending on the model of IBM/360 used), so we expect better performance, but cannot say at this time how much of an improvement.</p>
<p>They also talk about a heuristic procedure in which a vertex is eliminated from the program whenever the choice of its adjacent vertices was &lsquo;evident&rsquo;.
Technical details for the heuristic were essentially non-existent, but</p>
<blockquote>
<p>The procedure showed promise on examples up to $n = 48$, but was not explored systematically</p>
</blockquote>
<h2 id="ascent-method">Ascent Method<a class="headerlink" href="#ascent-method" title="Link to this heading">#</a></h2>
<p>This paper from Held and Karp is about minimizing $f(\pi)$ where $f(\pi)$ is the gap between the permuted 1-Trees and a TSP tour.
One way to do this is to maximize the dual of $f(\pi)$ which is written as $\text{max}_{\pi}\ w(\pi)$ where</p>
<p>$$
w(\pi) = \text{min}_k\ (c_k + \sum_{i=1}^{i=n} \pi_iv_{i k})
$$</p>
<p>This method uses the set of indices of 1-Trees that are of minimum weight with respect to the weights $\overline{c}_{i j} = c_{i j} + \pi_i + \pi_j$.</p>
<p>$$
K(\pi) = \{k\ |\ w(\pi) = c_k + \sum_{i=1}^{i=n} \pi_i v_{i k}\}
$$</p>
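<p>When the 1-Trees are listed explicitly, $w(\pi)$ and $K(\pi)$ follow directly from these definitions. The sketch below is for illustration only: a real implementation would never enumerate every 1-Tree, the three 1-Trees and their costs are made up, and the name <code>w_and_K</code> is hypothetical. Integer data is assumed so that the equality test is exact.</p>

```python
def w_and_K(costs, v, pi):
    """Return (w(pi), K(pi)) for explicitly listed 1-Trees.

    costs[k] is c_k, v[k][i] is v_ik and pi[i] is pi_i (0-based lists).
    """
    values = [
        c + sum(p * x for p, x in zip(pi, vk))
        for c, vk in zip(costs, v)
    ]
    w = min(values)
    K = {k for k, value in enumerate(values) if value == w}
    return w, K


costs = [10, 12, 10]
v = [
    [0, 1, 0, -1],  # a 1-Tree with one vertex of degree 3 and one leaf
    [0, 0, 0, 0],   # a tour
    [0, -1, 1, 0],
]

print(w_and_K(costs, v, [0, 0, 0, 0]))
print(w_and_K(costs, v, [0, 1, 0, -1]))
```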
<p>If $\pi$ is not a maximum point of $w$, then there will be a vector $d$ called the direction of ascent at $\pi$.
This is theorem 3 and a proof is given on page 1148.
Let the functions $\Delta(\pi, d)$ and $K(\pi, d)$ be defined as below.</p>
<p>$$
\begin{array}{l}
\Delta(\pi, d) = \text{min}_{k \in K(\pi)}\ \sum_{i=1}^{i=n} d_iv_{i k} \\\
K(\pi, d) = \{k\ |\ k \in K(\pi) \text{ and } \sum_{i=1}^{i=n} d_iv_{i k} = \Delta(\pi, d)\}
\end{array}
$$</p>
<p>Now for a sufficiently small $\epsilon$, $K(\pi + \epsilon d) = K(\pi, d)$ and $w(\pi + \epsilon d) = w(\pi) + \epsilon \Delta(\pi, d)$. In other words, when $\Delta(\pi, d) &gt; 0$ the value of $w$ increases, and the 1-Trees that stay minimum are the ones whose weight grows most slowly in the direction $d$, so we maintain the low weight 1-Trees and progress farther towards the optimal value.
Finally, let $\epsilon(\pi, d)$ be the following quantity</p>
<p>$$
\epsilon(\pi, d) = \text{max}\ \{\epsilon\ |\text{ for } \epsilon' &lt; \epsilon,\ K(\pi + \epsilon'd) = K(\pi, d)\}
$$</p>
<p>So in other words, $\epsilon(\pi, d)$ is the maximum distance in the direction of $d$ that we can travel to maintain the desired behavior.</p>
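<p>These quantities can also be computed directly for a small, explicitly listed set of 1-Trees. The sketch below uses made-up integer data and hypothetical function names. The formula in <code>step_length</code> is my own reading of the definition of $\epsilon(\pi, d)$ (the first point at which a 1-Tree outside $K(\pi, d)$ catches up), not something quoted from Held and Karp.</p>

```python
from fractions import Fraction


def tree_values(costs, v, pi):
    """c_k + sum_i pi_i * v_ik for every listed 1-Tree k."""
    return [c + sum(p * x for p, x in zip(pi, vk)) for c, vk in zip(costs, v)]


def slope(vk, d):
    """sum_i d_i * v_ik, the growth rate of 1-Tree k in direction d."""
    return sum(di * x for di, x in zip(d, vk))


def delta_and_K(costs, v, pi, d):
    """Return (w(pi), Delta(pi, d), K(pi, d))."""
    values = tree_values(costs, v, pi)
    w = min(values)
    K_pi = [k for k, value in enumerate(values) if value == w]
    delta = min(slope(v[k], d) for k in K_pi)
    K_pi_d = {k for k in K_pi if slope(v[k], d) == delta}
    return w, delta, K_pi_d


def step_length(costs, v, pi, d):
    """Distance along d at which another 1-Tree first becomes minimum."""
    values = tree_values(costs, v, pi)
    w, delta, _ = delta_and_K(costs, v, pi, d)
    candidates = [
        Fraction(value - w, delta - slope(vk, d))
        for value, vk in zip(values, v)
        if delta > slope(vk, d)  # only slower-growing 1-Trees can catch up
    ]
    return min(candidates) if candidates else None


costs = [10, 12, 10]
v = [[0, 1, 0, -1], [0, 0, 0, 0], [0, -1, 1, 0]]
pi = [0, 0, 0, 0]
d = [0, 1, 2, 0]

print(delta_and_K(costs, v, pi, d))
print(step_length(costs, v, pi, d))
# Moving one unit along d raises w by Delta(pi, d) = 1.
print(min(tree_values(costs, v, [0, 1, 2, 0])))
```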
<p>If we can find $d$ and $\epsilon$ then we can set $\pi = \pi + \epsilon d$ and move to the next iteration of the ascent method.
Held and Karp did give a protocol for finding $d$ on page 1149.</p>
<ol>
<li>Set $d$ equal to the zero $n$-vector.</li>
<li>Find a 1-tree $T^k$ such that $k \in K(\pi, d)$.</li>
<li>If $\sum_{i=1}^{i=n} d_iv_{i k} &gt; 0$ STOP.</li>
<li>$d_i \leftarrow d_i + v_{i k},$ for $i = 2, 3, \dots, n$</li>
<li>GO TO 2.</li>
</ol>
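<p>The five steps translate almost line for line into Python if, purely for illustration, the 1-Trees in $K(\pi)$ are listed explicitly. The data and the name <code>find_direction</code> are hypothetical, and because the procedure may fail to terminate the sketch gives up after a fixed number of iterations, which is my own addition.</p>

```python
def find_direction(costs, v, pi, max_iterations=100):
    """Follow the five steps; return d, or None if no direction was found."""
    n = len(pi)
    values = [c + sum(p * x for p, x in zip(pi, vk)) for c, vk in zip(costs, v)]
    w = min(values)
    K_pi = [k for k, value in enumerate(values) if value == w]

    def slope(k, d):
        return sum(di * x for di, x in zip(d, v[k]))

    d = [0] * n  # step 1
    for _ in range(max_iterations):
        # Step 2: a member of K(pi, d) minimises the slope over K(pi).
        k = min(K_pi, key=lambda j: slope(j, d))
        # Step 3: stop once even the slowest 1-Tree grows along d.
        if slope(k, d) > 0:
            return d
        # Step 4: update d_i for i = 2, ..., n (index 0 is vertex 1).
        for i in range(1, n):
            d[i] += v[k][i]
        # Step 5: go back to step 2.
    return None


v = [[0, 1, 0, -1], [0, 0, 0, 0], [0, -1, 1, 0]]

# The tour (index 1) costs more than the other two 1-Trees, so w can still rise.
print(find_direction([10, 12, 10], v, [0, 0, 0, 0]))
# Here the tour is itself a minimum 1-Tree, so no direction of ascent exists.
print(find_direction([10, 10, 10], v, [0, 0, 0, 0]))
```

<p>In the second call the tour has slope zero in every direction, so the stopping test in step 3 can never succeed and the iteration limit is what ends the loop.</p>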
<p>There are two things which must be refined about this procedure in order to make it implementable in Python.</p>
<ul>
<li>How do we find the 1-Tree mentioned in step 2?</li>
<li>How do we know when there is no direction of ascent? (i.e. how do we know when we are at the maximal value of $w(\pi)$?)</li>
</ul>
<p>Held and Karp have provided guidance on both of these points.
In section 6 on matroids, we are told to use a method developed by Dijkstra in <em>A Note on Two Problems in Connexion with Graphs</em>, but in this particular case that is not the most helpful.
I have found this document, but there is a function called <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.tree.branchings.minimum_spanning_arborescence.html"><code>minimum_spanning_arborescence</code></a> already within NetworkX which we can use to create a minimum 1-Arborescence.
That process would be to find a minimum spanning arborescence on only the vertices in $\{2, 3, \dots, n\}$ and then connect vertex 1 to create the cycle.
In order to connect vertex 1, we would choose the outgoing arc with the smallest cost and the incoming arc with the smallest cost.</p>
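<p>A minimal sketch of that construction is below, assuming a complete <code>nx.DiGraph</code> with a <code>weight</code> attribute on every arc. The helper name <code>minimum_1_arborescence</code> and the sample costs are hypothetical; only <code>minimum_spanning_arborescence</code> comes from NetworkX.</p>

```python
import networkx as nx


def minimum_1_arborescence(G, root, weight="weight"):
    """Sketch: min arborescence on G minus root, plus cheapest arcs at root."""
    rest = G.subgraph([u for u in G if u != root]).copy()
    arborescence = nx.minimum_spanning_arborescence(rest, attr=weight)
    result = nx.DiGraph()
    for u, v in arborescence.edges():
        # Look the cost up in G rather than relying on copied attributes.
        result.add_edge(u, v, **{weight: G[u][v][weight]})
    # Cheapest arc leaving the root and cheapest arc entering it.
    _, head, w_out = min(G.out_edges(root, data=weight), key=lambda arc: arc[2])
    tail, _, w_in = min(G.in_edges(root, data=weight), key=lambda arc: arc[2])
    result.add_edge(root, head, **{weight: w_out})
    result.add_edge(tail, root, **{weight: w_in})
    return result


# Hypothetical complete digraph on four vertices with asymmetric costs.
costs = {
    (1, 2): 4, (2, 1): 6, (1, 3): 2, (3, 1): 9, (1, 4): 7, (4, 1): 3,
    (2, 3): 5, (3, 2): 1, (2, 4): 8, (4, 2): 2, (3, 4): 6, (4, 3): 4,
}
G = nx.DiGraph()
G.add_weighted_edges_from((u, v, c) for (u, v), c in costs.items())
T = minimum_1_arborescence(G, root=1)
print(sorted(T.edges(data="weight")))
```

<p>The result has exactly $n$ arcs: $n - 2$ from the arborescence on the other vertices and two at vertex 1.</p>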
<p>Finally, at the maximum value of $w(\pi)$, there is no direction of ascent and the procedure outlined by Held and Karp will not terminate.
Their article states on page 1149 that</p>
<blockquote>
<p>Thus, when failure to terminate is suspected, it is necessary to check whether no direction of ascent exists; by the Minkowski-Farkas lemma this is equivalent to the existence of nonnegative coefficients $\alpha_k$ such that</p>
<p>$ \sum_{k \in K(\pi)} \alpha_kv_{i k} = 0, \quad i = 1, 2, \dots, n $</p>
<p>This can be checked by linear programming.</p>
</blockquote>
<p>While it is nice that they gave that summation, the rest of the linear program would have been useful too.
The entire linear program would be written as follows</p>
<p>$$
\begin{array}{c l l}
\text{max} &amp; \sum_k \alpha_k \\\
\text{s.t.} &amp; \sum_{k \in K(\pi)} \alpha_k v_{i k} = 0 &amp; \forall\ i \in \{1, 2, \dots, n\} \\\
&amp; \alpha_k \geq 0 &amp; \forall\ k \\\
\end{array}
$$</p>
<p>This linear program is not in standard form, but it is not difficult to convert it.
First, change the maximization to a minimization by minimizing the negative.</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_k -\alpha_k \\\
\text{s.t.} &amp; \sum_{k \in K(\pi)} \alpha_k v_{i k} = 0 &amp; \forall\ i \in \{1, 2, \dots, n\} \\\
&amp; \alpha_k \geq 0 &amp; \forall\ k \\\
\end{array}
$$</p>
<p>While the constraint is not intuitively in standard form, a closer look reveals that it is.
Each column in the matrix form will be for one entry of $\alpha_k$, and each row will represent a different value of $i$, or a different vertex.
The one constraint is actually a collection of very similar ones which could be written as</p>
<p>$$
\begin{array}{c l}
\text{min} &amp; \sum_k -\alpha_k \\\
\text{s.t.} &amp; \sum_{k \in K(\pi)} \alpha_k v_{1 k} = 0 \\\
&amp; \sum_{k \in K(\pi)} \alpha_k v_{2 k} = 0 \\\
&amp; \vdots \\\
&amp; \sum_{k \in K(\pi)} \alpha_k v_{n k} = 0 \\\
&amp; \alpha_k \geq 0 &amp; \forall\ k \\\
\end{array}
$$</p>
<p>Because all of the summations must equal zero, no slack or surplus variables are required, so the constraint matrix for this program is $n \times |K(\pi)|$.
The $n$ obviously has a linear growth rate, but I&rsquo;m not sure how big to expect $|K(\pi)|$ to become.
$K(\pi)$ is the set of minimum 1-Trees, so I believe that it will be manageable.
This linear program can be solved using the built-in <a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linprog.html"><code>linprog</code></a> function in the SciPy library.</p>
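<p>Below is a hedged sketch of how that check might look with <code>linprog</code>. One detail is my own assumption and not from Held and Karp: as written, any nonzero feasible $\alpha$ can be scaled up indefinitely, so I add the normalization $\sum_k \alpha_k = 1$ and treat the program as a pure feasibility check with a zero objective. The function name is hypothetical.</p>

```python
from scipy.optimize import linprog


def no_direction_of_ascent(v):
    """v[k][i] holds v_ik for every 1-tree k in K(pi)."""
    n, m = len(v[0]), len(v)
    # One equality row per vertex i, one column per alpha_k.
    A_eq = [[v[k][i] for k in range(m)] for i in range(n)]
    b_eq = [0] * n
    # Assumed normalization (not in the paper's statement): sum_k alpha_k = 1.
    A_eq.append([1] * m)
    b_eq.append(1)
    result = linprog(c=[0] * m, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    # Status 0 means a feasible point was found, 2 means infeasible.
    return result.status == 0


print(no_direction_of_ascent([[0, 1, -1, 0], [0, -1, 1, 0]]))
print(no_direction_of_ascent([[0, 1, -1, 0], [0, 1, 0, -1]]))
```

<p>In the first call the two vectors cancel, so nonnegative coefficients exist and there is no direction of ascent. In the second call both 1-Trees give vertex 2 a degree of 3, so no such coefficients exist.</p>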
<p>As an implementation note, to start with I would probably check the terminating condition every iteration, but eventually we can find a number of iterations it has to execute before it starts to check for the terminating condition to save computational power.</p>
<p>One possible difficulty with the terminating condition is that we need to run the linear program with data from every minimum 1-Tree or 1-Arborescence, which means that we need to be able to generate all of the minimum 1-Trees.
There does not seem to be an easy way to do this within NetworkX at the moment.
Looking through the tree algorithms <a href="https://networkx.org/documentation/stable/reference/algorithms/tree.html">here</a> they seem exclusively focused on finding <em>one</em> minimum branching of the required type and not <em>all</em> of those branchings.</p>
<p>Now we have to find $\epsilon$.
Theorem 4 on page 1150 states that</p>
<blockquote>
<p>Let $k$ be any element of $K(\pi, d)$, where $d$ is a direction of ascent at $\pi$.
Then
$\epsilon(\pi, d) = \text{min}\{\epsilon\ |\text{ for some pair } (e, e'),\ e' \text{ is a substitute for } e \text{ in } T^k \\\ \text{ and } e \text{ and } e' \text{ cross over at } \epsilon \}$</p>
</blockquote>
<p>The first step then is to determine if $e$ and $e'$ are substitutes.
$e'$ is a substitute for $e$ if for a 1-Tree $T^k$, $(T^k - \{e\}) \cup \{e'\}$ is also a 1-Tree.
The edges $e = \{r, s\}$ and $e' = \{i, j\}$ cross over at $\epsilon$ if the pairs $(\overline{c}_{i j}, d_i + d_j)$ and $(\overline{c}_{r s}, d_r + d_s)$ are different but</p>
<p>$$
\overline{c}_{i j} + \epsilon(d_i + d_j) = \overline{c}_{r s} + \epsilon(d_r + d_s)
$$</p>
<p>From that equation, we can derive a formula for $\epsilon$.</p>
<p>$$
\begin{array}{r c l}
\overline{c}_{i j} + \epsilon(d_i + d_j) &amp;=&amp; \overline{c}_{r s} + \epsilon(d_r + d_s) \\\
\epsilon(d_i + d_j) &amp;=&amp; \overline{c}_{r s} + \epsilon(d_r + d_s) - \overline{c}_{i j} \\\
\epsilon(d_i + d_j) - \epsilon(d_r + d_s) &amp;=&amp; \overline{c}_{r s} - \overline{c}_{i j} \\\
\epsilon\left((d_i + d_j) - (d_r + d_s)\right) &amp;=&amp; \overline{c}_{r s} - \overline{c}_{i j} \\\
\epsilon(d_i + d_j - d_r - d_s) &amp;=&amp; \overline{c}_{r s} - \overline{c}_{i j} \\\
\epsilon &amp;=&amp; \displaystyle \frac{\overline{c}_{r s} - \overline{c}_{i j}}{d_i + d_j - d_r - d_s}
\end{array}
$$</p>
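<p>As a quick sanity check of the final formula, here is a small sketch that evaluates it with exact arithmetic. The function name and the sample numbers are hypothetical, and integer inputs are assumed so that <code>Fraction</code> can be used.</p>

```python
from fractions import Fraction


def crossover_epsilon(c_ij, c_rs, d_i, d_j, d_r, d_s):
    """Return the epsilon at which e' = {i, j} and e = {r, s} cross over.

    Returns None when the two cost lines are parallel (equal slopes).
    """
    denominator = d_i + d_j - d_r - d_s
    if denominator == 0:
        return None
    return Fraction(c_rs - c_ij, denominator)


eps = crossover_epsilon(c_ij=5, c_rs=3, d_i=-1, d_j=0, d_r=1, d_s=0)
# Both perturbed costs agree at the crossover point.
assert 5 + eps * (-1 + 0) == 3 + eps * (1 + 0)
print(eps)
```

<p>Returning <code>None</code> for equal slopes avoids a division by zero when the two edges never cross over.</p>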
<p>So we can now find $\epsilon$ for any pair of edges which are substitutes for each other, but we need to be able to find substitutes in the 1-Tree.
We know that $e'$ is a substitute for $e$ if and only if $e$ and $e'$ are both incident to vertex 1 or $e$ is in a cycle of $T^k \cup \{e'\}$ that does not pass through vertex 1.
In a more formal sense, we are trying to find edges in the same fundamental cycle as $e'$.
A fundamental cycle is created when any edge not in a spanning tree is added to that spanning tree.
Because the endpoints of this edge are connected by one, unique path this creates a unique cycle.
In order to find this cycle, we will take advantage of <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.cycles.find_cycle.html"><code>find_cycle</code></a> within the NetworkX library.</p>
<p>Below is a pseudocode procedure that uses Theorem 4 to find $\epsilon(\pi, d)$ that I sketched out.
It is not well optimized, but will find $\epsilon(\pi, d)$.</p>

<div class="highlight">
  <pre># Input: An element k of K(pi, d), the vector pi and the vector d.
# Output: epsilon(pi, d) using Theorem 4 on page 1150.

min_epsilon = infinity
for each edge e = (i, j) in the graph G:
	if e is in k:
		continue
	add e to k
	# find_cycle searches from a vertex, so start at an endpoint of e
	c = find_cycle(k, j)
	for each edge a = (r, s) in c other than e:
		slope = d[i] &#43; d[j] - d[r] - d[s]
		if slope == 0:
			# identical pairs or parallel lines never cross over
			continue
		epsilon = (a[cost] - e[cost]) / slope
		if epsilon &gt; 0:
			# only crossings ahead of pi in the direction d matter
			min_epsilon = min(min_epsilon, epsilon)
	remove e from k
return min_epsilon</pre>
</div>

<h3 id="performance-of-the-ascent-method">Performance of the Ascent Method<a class="headerlink" href="#performance-of-the-ascent-method" title="Link to this heading">#</a></h3>
<p>The ascent method is also slow, but would be better on a modern computer.
When Held and Karp programmed it, they tested it on some small problems up to 25 vertices and while the time per iteration was small, the number of iterations grew quickly.
They do not comment on whether this is a better method than the Column Generation technique, but do point out that they did not determine whether this method <em>always</em> converges to a maximum point of $w(\pi)$.</p>
<h2 id="branch-and-bound-method">Branch and Bound Method<a class="headerlink" href="#branch-and-bound-method" title="Link to this heading">#</a></h2>
<p>After talking with my GSoC mentors, we believe that this is the best method we can implement for the Held-Karp relaxation as needed by the Asadpour algorithm.
The ascent method is embedded within this method, so the in-depth exploration of the previous method is required to implement this one.
Most of the notation in this method is reused from the ascent method.</p>
<p>The branch and bound method utilizes the concept that a vertex can be out-of-kilter.
A vertex $i$ is out-of-kilter high if</p>
<p>$$
\forall\ k \in K(\pi),\ v_{i k} \geq 1
$$</p>
<p>Similarly, vertex $i$ is out-of-kilter low if</p>
<p>$$
\forall\ k \in K(\pi),\ v_{i k} = -1
$$</p>
<p>Remember that $v_{i k}$ is the degree of the vertex minus 2.
We know that all the vertices have a degree of at least one, otherwise the 1-Tree $T^k$ would not be connected.
An out-of-kilter high vertex has a degree of 3 or higher in every minimum 1-Tree and an out-of-kilter low vertex has a degree of only one in all of the minimum 1-Trees.
Our goal is a minimum 1-Tree where every vertex has a degree of 2.</p>
<p>If we know that a vertex is out-of-kilter in either direction, we know the direction of ascent and that direction is a unit vector.
Let $u_i$ be an $n$-dimensional unit vector with 1 in the $i$-th coordinate.
$u_i$ is the direction of ascent if vertex $i$ is out-of-kilter high and $-u_i$ is the direction of ascent if vertex $i$ is out-of-kilter low.</p>
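<p>The definitions above translate almost directly into code. The sketch below assumes that every vector $v_k$ for $k \in K(\pi)$ is known, which is the hard part in practice. The function name is hypothetical and vertices are 0-indexed here.</p>

```python
def out_of_kilter_direction(i, v):
    """Return u_i, -u_i, or None for vertex index i (0-based).

    v : list of the vectors v_k for every minimum 1-tree k in K(pi).
    """
    n = len(v[0])
    unit = [1 if j == i else 0 for j in range(n)]
    if all(v_k[i] >= 1 for v_k in v):  # degree 3 or more in every tree
        return unit
    if all(v_k[i] == -1 for v_k in v):  # degree 1 in every tree
        return [-x for x in unit]
    return None  # vertex i is not out-of-kilter


v = [[0, 1, -1, 0], [1, 1, -1, -1]]
print(out_of_kilter_direction(1, v))
print(out_of_kilter_direction(2, v))
print(out_of_kilter_direction(0, v))
```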
<p>Corollaries 3 and 4 from page 1151 also show that finding $\epsilon(\pi, d)$ is simpler when a vertex is out-of-kilter as well.</p>
<blockquote>
<p><em>Corollary 3.</em> Assume vertex $i$ is out-of-kilter low and let $k$ be an element of $K(\pi, -u_i)$.
Then $\epsilon(\pi, -u_i) = \text{min} (\overline{c}_{i j} - \overline{c}_{r s})$ such that $\{i, j\}$ is a substitute for $\{r, s\}$ in $T^k$ and $i \not\in \{r, s\}$.</p>
</blockquote>
<blockquote>
<p><em>Corollary 4.</em> Assume vertex $r$ is out-of-kilter high.
Then $\epsilon(\pi, u_r) = \text{min} (\overline{c}_{i j} - \overline{c}_{r s})$ such that $\{i, j\}$ is a substitute for $\{r, s\}$ in $T^k$ and $r \not\in \{i, j\}$.</p>
</blockquote>
<p>These corollaries can be implemented with a modified version of the pseudocode listing above for finding $\epsilon$ in the ascent method section.</p>
<p>Once there are no more out-of-kilter vertices, the direction of ascent is not a unit vector and fractional weights are introduced.
This is the cause of a major slow down in the convergence of the ascent method to the optimal solution, so it should be avoided if possible.</p>
<p>Before we can discuss implementation details, there are still some more preliminaries to be reviewed.
Let $X$ and $Y$ be disjoint sets of edges in the graph.
Then let $\mathsf{T}(X, Y)$ denote the set of 1-Trees which include all edges in $X$ but none of the edges in $Y$.
Finally, define $w_{X, Y}(\pi)$ and $K_{X, Y}(\pi)$ as follows.</p>
<p>$$
w_{X, Y}(\pi) = \text{min}_{k \in \mathsf{T}(X, Y)} (c_k + \sum_{i=1}^{i=n} \pi_i v_{i k}) \\\
K_{X, Y}(\pi) = \{k \in \mathsf{T}(X, Y)\ |\ c_k + \sum_{i=1}^{i=n} \pi_i v_{i k} = w_{X, Y}(\pi)\}
$$</p>
<p>From these functions, a revised definition of out-of-kilter high and low arises, allowing a vertex to be out-of-kilter relative to $X$ and $Y$.</p>
<p>During the branch and bound method, the branches are tracked in a list where each entry has the following format.</p>
<p>$$[X, Y, \pi, w_{X, Y}(\pi)]$$</p>
<p>Where $X$ and $Y$ are the disjoint sets discussed earlier, $\pi$ is the vector we are using to perturb the edge weights and $w_{X, Y}(\pi)$ is the <em>bound</em> of the entry.</p>
<p>At each iteration of the method, we consider the list entry with the minimum bound and try to find an out-of-kilter vertex.
If we find one, we apply one iteration of the ascent method using the simplified unit vector as the direction of ascent.
Here we can take advantage of integral weights if they exist.
Perhaps the documentation for the Asadpour implementation in NetworkX should state that integral edge weights will perform better but that claim will have to be supported by our testing.</p>
<p>If there is not an out-of-kilter vertex, we still need to find the direction of ascent in order to determine if we are at the maximum of $w(\pi)$.
If the direction of ascent exists, we branch.
If there is no direction of ascent, we search for a tour among $K_{X, Y}(\pi)$ and if none is found, we also branch.</p>
<p>The branching process is as follows.
From entry $[X, Y, \pi, w_{X, Y}(\pi)]$ an edge $e \not\in X \cup Y$ is chosen (Held and Karp do not give any criteria to branch on, so I believe the choice can be arbitrary) and the parent entry is replaced with two other entries of the forms</p>
<p>$$
[X \cup \{e\}, Y^*, \pi, w_{X \cup \{e\}, Y^*}(\pi)] \quad \text{and} \quad [X^*, Y \cup \{e\}, \pi, w_{X^*, Y \cup \{e\}}(\pi)]
$$</p>
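<p>One way the list of entries might be kept is as a heap ordered by bound, so that the entry with the minimum bound is always at the front. The sketch below is only an illustration of that bookkeeping: <code>push</code>, <code>branch</code> and <code>toy_bound</code> are hypothetical names, <code>toy_bound</code> is an arbitrary stand-in for computing $w_{X, Y}(\pi)$, and the starred sets $X^*$ and $Y^*$ are simplified to plain $X$ and $Y$.</p>

```python
import heapq
from itertools import count

tie_breaker = count()  # keeps heap comparisons away from the sets


def push(entries, X, Y, pi, bound):
    heapq.heappush(entries, (bound, next(tie_breaker), X, Y, pi))


def branch(entries, X, Y, pi, e, bound_fn):
    """Replace the parent [X, Y, pi, w] by its two children on edge e."""
    push(entries, X | {e}, Y, pi, bound_fn(X | {e}, Y, pi))
    push(entries, X, Y | {e}, pi, bound_fn(X, Y | {e}, pi))


def toy_bound(X, Y, pi):
    # Arbitrary stand-in; a real bound would minimize over T(X, Y).
    return 10 + 2 * len(X) + len(Y)


entries = []
push(entries, frozenset(), frozenset(), (0, 0, 0), 10)
bound, _, X, Y, pi = heapq.heappop(entries)  # entry with the minimum bound
branch(entries, X, Y, pi, (1, 2), toy_bound)
print([(entry[0], set(entry[2]), set(entry[3])) for entry in sorted(entries)])
```

<p>Using <code>frozenset</code> for $X$ and $Y$ means each child gets its own copy of the sets without touching the parent.</p>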
<p>An example of the branch and bound method is given on pages 1153 through 1156 in the Held and Karp paper.</p>
<p>In order to implement this method, in addition to modifying some of the details of the ascent method, we need to be able to do the following.</p>
<ul>
<li>Determine whether a vertex is out-of-kilter in either direction with respect to $X$ and $Y$.</li>
<li>Search $K_{X, Y}(\pi)$ for a tour.</li>
</ul>
<p>The Held and Karp paper states that in order to find an out-of-kilter vertex, all we need to do is test the unit vectors.
If, for an arbitrary member $k$ of $K(\pi, u_i)$, $v_{i k} \geq 1$, then vertex $i$ is out-of-kilter high, and the analogous test with $-u_i$ and $v_{i k} = -1$ identifies out-of-kilter low vertices.
From this process we can find out-of-kilter vertices by sequentially checking the $u_i$&rsquo;s in an $O(n^2)$ procedure.</p>
<p>Searching $K_{X, Y}(\pi)$ for a tour would be easy if we could enumerate that set of minimum 1-Trees.
While I know how to find one of the minimum 1-Trees, or a member of $K(\pi)$, I am not sure how to find elements in $K(\pi, d)$ or even all of the members of $K(\pi)$.
Using the properties in the Held and Karp paper, I do know how to refine $K(\pi)$ into $K(\pi, d)$ and $K(\pi)$ into $K_{X, Y}(\pi)$.
This will have to be a blog post for another time.</p>
<p>The most promising research paper I have been able to find on this problem is <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en&amp;format=pdf">this</a> 2005 paper by Sörensen and Janssens titled <em>An Algorithm to Generate all Spanning Trees of a Graph in Order of Increasing Cost</em>.
From here we generate spanning trees or arborescences until the cost moves upward at which point we have found all elements of $K(\pi)$.</p>
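<p>The stopping rule in that last sentence can be sketched as follows, assuming some generator already yields <code>(cost, tree)</code> pairs in order of increasing cost. That generator is the hard part and is not shown. The function name and the sample data are hypothetical.</p>

```python
from itertools import takewhile


def minimum_cost_trees(trees_by_cost):
    """Collect trees from a non-decreasing (cost, tree) stream while the
    cost still equals the first, minimum, cost."""
    stream = iter(trees_by_cost)
    try:
        best_cost, first_tree = next(stream)
    except StopIteration:
        return []
    ties = takewhile(lambda pair: pair[0] == best_cost, stream)
    return [first_tree] + [tree for _, tree in ties]


print(minimum_cost_trees([(7, "A"), (7, "B"), (9, "C"), (9, "D")]))
```

<p>With floating point edge weights the exact equality test would likely need a tolerance.</p>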
<h3 id="performance-of-the-branch-and-bound-method">Performance of the Branch and Bound Method<a class="headerlink" href="#performance-of-the-branch-and-bound-method" title="Link to this heading">#</a></h3>
<p>Held and Karp did not program this method.
We have some reason to believe that this method will perform the best, because it is designed as an improvement over the ascent method. The ascent method was tested (somewhat) up to $n = 25$, which is still better than the column generation technique, which was only consistently able to solve up to $n = 12$.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An o(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061, <a href="https://homes.cs.washington.edu/~shayan/atsp.pdf">https://homes.cs.washington.edu/~shayan/atsp.pdf</a>.</p>
<p>Held, M., Karp, R.M. <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[NetworkX Function Stubs]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/</id>
            
            
            <published>2021-05-24T00:00:00+00:00</published>
            <updated>2021-05-24T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Draft function stubs for the Asadpour method to use in the NetworkX API</blockquote><p>Now that my proposal was accepted by NetworkX for the 2021 Google Summer of Code (GSoC), I can get more into the technical details of how I plan to implement the Asadpour algorithm within NetworkX.</p>
<p>In this post I am going to outline my thought process for the control scheme of my implementation and create function stubs according to my GSoC proposal.
Most of the work for this project will happen in <code>networkx.algorithms.approximation.traveling_salesman.py</code>, where I will finish the last algorithm for the Traveling Salesman Problem so it can be merged into the project. The main function in <code>traveling_salesman.py</code> is</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">traveling_salesman_problem</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">,</span> <span class="n">nodes</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">cycle</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">method</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    ...
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : NetworkX graph
</span></span></span><span class="line"><span class="cl"><span class="s2">        Undirected possibly weighted graph
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    nodes : collection of nodes (default=G.nodes)
</span></span></span><span class="line"><span class="cl"><span class="s2">        collection (list, set, etc.) of nodes to visit
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    weight : string, optional (default=&#34;weight&#34;)
</span></span></span><span class="line"><span class="cl"><span class="s2">        Edge data key corresponding to the edge weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">        If any edge does not have this attribute the weight is set to 1.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    cycle : bool (default: True)
</span></span></span><span class="line"><span class="cl"><span class="s2">        Indicates whether a cycle should be returned, or a path.
</span></span></span><span class="line"><span class="cl"><span class="s2">        Note: the cycle is the approximate minimal cycle.
</span></span></span><span class="line"><span class="cl"><span class="s2">        The path simply removes the biggest edge in that cycle.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    method : function (default: None)
</span></span></span><span class="line"><span class="cl"><span class="s2">        A function that returns a cycle on all nodes and approximates
</span></span></span><span class="line"><span class="cl"><span class="s2">        the solution to the traveling salesman problem on a complete
</span></span></span><span class="line"><span class="cl"><span class="s2">        graph. The returned cycle is then used to find a corresponding
</span></span></span><span class="line"><span class="cl"><span class="s2">        solution on `G`. `method` should be callable; take inputs
</span></span></span><span class="line"><span class="cl"><span class="s2">        `G`, and `weight`; and return a list of nodes along the cycle.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        Provided options include :func:`christofides`, :func:`greedy_tsp`,
</span></span></span><span class="line"><span class="cl"><span class="s2">        :func:`simulated_annealing_tsp` and :func:`threshold_accepting_tsp`.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        If `method is None`: use :func:`christofides` for undirected `G` and
</span></span></span><span class="line"><span class="cl"><span class="s2">        :func:`threshold_accepting_tsp` for directed `G`.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        To specify parameters for these provided functions, construct lambda
</span></span></span><span class="line"><span class="cl"><span class="s2">        functions that state the specific value. `method` must have 2 inputs.
</span></span></span><span class="line"><span class="cl"><span class="s2">        (See examples).
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    ...
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span></span></span></code></pre>
</div>
<p>All user calls to find an approximation to the traveling salesman problem will go through this function.
My implementation of the Asadpour algorithm will also need to be compatible with this function.
<code>traveling_salesman_problem</code> will handle creating a new, complete graph using the weight of the shortest path between nodes $u$ and $v$ as the weight of that arc, so we know that by the time the graph is passed to the Asadpour algorithm it is a complete digraph which satisfies the triangle inequality.
The main function also handles the <code>nodes</code> and <code>cycle</code> parameters by only copying the necessary nodes into the complete digraph before calling the requested method and afterwards searching for and removing the largest arc within the returned cycle.
Thus, the parent function for the Asadpour algorithm only needs to deal with the graph itself and the weights or costs of the arcs in the graph.</p>
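<p>To illustrate how a method plugs into the main function, here is a small usage sketch with one of the existing options, <code>greedy_tsp</code>. The graph is a hypothetical example of an incomplete graph, and the lambda follows the docstring&rsquo;s advice for fixing extra parameters while still taking the graph and <code>weight</code> as its two inputs.</p>

```python
import networkx as nx
from networkx.algorithms import approximation as approx

# Hypothetical incomplete graph: a weighted path 0 - 1 - 2 - 3.
G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 1), (1, 2, 2), (2, 3, 3)])

# method takes (G, weight); a lambda pins any extra parameters.
method = lambda H, weight: approx.greedy_tsp(H, weight=weight, source=0)
tour = approx.traveling_salesman_problem(G, nodes=[0, 3], method=method)
print(tour)
```

<p>The method itself only ever sees the complete graph on the requested nodes, and the main function expands the returned cycle back into a walk on the original graph.</p>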
<p>My controlling function will have the following signature and I have included a draft of the docstring as well.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">asadpour_tsp</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns an O( log n / log log n ) approximate solution to the traveling
</span></span></span><span class="line"><span class="cl"><span class="s2">    salesman problem.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    This approximate solution is one of the best known approximations for
</span></span></span><span class="line"><span class="cl"><span class="s2">    the asymmetric traveling salesman problem developed by Asadpour et al,
</span></span></span><span class="line"><span class="cl"><span class="s2">    [1]_. The algorithm first solves the Held-Karp relaxation to find a
</span></span></span><span class="line"><span class="cl"><span class="s2">    lower bound for the weight of the cycle. Next, it constructs an
</span></span></span><span class="line"><span class="cl"><span class="s2">    exponential distribution of undirected spanning trees where the
</span></span></span><span class="line"><span class="cl"><span class="s2">    probability of an edge being in the tree corresponds to the weight of
</span></span></span><span class="line"><span class="cl"><span class="s2">    that edge using a maximum entropy rounding scheme. Next we sample that
</span></span></span><span class="line"><span class="cl"><span class="s2">    distribution $2 </span><span class="se">\\\\\\</span><span class="s2">log n$ times and save the minimum sampled tree once
</span></span></span><span class="line"><span class="cl"><span class="s2">    the direction of the arcs is added back to the edges. Finally,
</span></span></span><span class="line"><span class="cl"><span class="s2">    we augment then short circuit that graph to find the approximate tour
</span></span></span><span class="line"><span class="cl"><span class="s2">    for the salesman.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : nx.DiGraph
</span></span></span><span class="line"><span class="cl"><span class="s2">        The graph should be a complete weighted directed graph.
</span></span></span><span class="line"><span class="cl"><span class="s2">        The distance between all pairs of nodes should be included.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    weight : string, optional (default=&#34;weight&#34;)
</span></span></span><span class="line"><span class="cl"><span class="s2">        Edge data key corresponding to the edge weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">        If any edge does not have this attribute the weight is set to 1.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    cycle : list of nodes
</span></span></span><span class="line"><span class="cl"><span class="s2">        Returns the cycle (list of nodes) that a salesman can follow to minimize
</span></span></span><span class="line"><span class="cl"><span class="s2">        the total weight of the trip.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Raises
</span></span></span><span class="line"><span class="cl"><span class="s2">    ------
</span></span></span><span class="line"><span class="cl"><span class="s2">    NetworkXError
</span></span></span><span class="line"><span class="cl"><span class="s2">        If `G` is not complete, the algorithm raises an exception.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    References
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    .. [1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi,
</span></span></span><span class="line"><span class="cl"><span class="s2">       An o(log n/log log n)-approximation algorithm for the asymmetric
</span></span></span><span class="line"><span class="cl"><span class="s2">       traveling salesman problem, Operations research, 65 (2017),
</span></span></span><span class="line"><span class="cl"><span class="s2">       pp. 1043–1061
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>Following my GSoC proposal, the next function is <code>held_karp</code>, which will solve the Held-Karp relaxation on the complete digraph using the ellipsoid method (See my last two posts <a href="../held-karp-relaxation">here</a> and <a href="../held-karp-separation-oracle">here</a> for my thoughts on why and how to accomplish this).
Solving the Held-Karp relaxation is the first step in the algorithm.</p>
<p>Recall that the Held-Karp relaxation is defined as the following linear program:</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{a} c(a)x_a \\\
\text{s.t.} &amp; x(\delta^+(U)) \geqslant 1 &amp; \forall\ U \subset V \text{ and } U \not= \emptyset \\\
&amp; x(\delta^+(v)) = x(\delta^-(v)) = 1 &amp; \forall\ v \in V \\\
&amp; x_a \geqslant 0 &amp; \forall\ a
\end{array}
$$</p>
<p>and that it has exponentially many constraints (one for every nonempty proper subset of the vertices), so it is too large to be solved in conventional forms.
The algorithm uses the solution to the Held-Karp relaxation to create a vector $z^*$ which is a symmetrized and slightly scaled down version of the true Held-Karp solution $x^*$.
$z^*$ is defined as</p>
<p>$$
z^*_{\{u, v\}} = \frac{n - 1}{n} \left(x^*_{uv} + x^*_{vu}\right)
$$</p>
<p>and since this is what the algorithm uses to build the rest of the approximation, this should be one of the return values from <code>held_karp</code>.
I will also return the value of the cost of $x^*$, which is denoted as $c(x^*)$ or $OPT_{HK}$ in the Asadpour paper [1].</p>
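<p>The symmetrization itself is simple once $x^*$ is known. A minimal sketch, assuming $x^*$ is stored as a dict keyed by arc <code>(u, v)</code> and that a missing arc means a value of zero; the function name and sample values are hypothetical.</p>

```python
def symmetrize(x, n):
    """Return z* keyed by frozenset({u, v}) from arc values x[(u, v)]."""
    scale = (n - 1) / n
    z = {}
    for (u, v), value in x.items():
        edge = frozenset((u, v))
        # Both x_uv and x_vu accumulate into the same undirected edge.
        z[edge] = z.get(edge, 0) + scale * value
    return z


x = {(1, 2): 0.5, (2, 1): 0.5, (2, 3): 1.0}
z = symmetrize(x, n=3)
print(z[frozenset((1, 2))], z[frozenset((2, 3))])
```

<p>Keying the result by <code>frozenset</code> makes the edge $\{u, v\}$ independent of the order of its endpoints.</p>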
<p>Additionally, the separation oracle will be defined as an inner function within <code>held_karp</code>.
At the present moment I am not sure what the exact parameters for the separation oracle, <code>sep_oracle</code>, will be, but they should include the point the algorithm wishes to test, and the oracle will need to access the graph the algorithm is relaxing.
In particular, I&rsquo;m not sure <em>yet</em> how I will represent the hyperplane which is returned by the separation oracle.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">_held_karp</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Solves the Held-Karp relaxation of the input complete digraph and scales
</span></span></span><span class="line"><span class="cl"><span class="s2">    the output solution for use in the Asadpour [1]_ ATSP algorithm.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    The Held-Karp relaxation defines the lower bound for solutions to the
</span></span></span><span class="line"><span class="cl"><span class="s2">    ATSP, although it does return a fractional solution. This is used in the
</span></span></span><span class="line"><span class="cl"><span class="s2">    Asadpour algorithm as an initial solution which is later rounded to an
</span></span></span><span class="line"><span class="cl"><span class="s2">    integral tree within the spanning tree polytopes. This function solves
</span></span></span><span class="line"><span class="cl"><span class="s2">    the relaxation with the ellipsoid method for linear programs.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : nx.DiGraph
</span></span></span><span class="line"><span class="cl"><span class="s2">        The graph should be a complete weighted directed graph.
</span></span></span><span class="line"><span class="cl"><span class="s2">        The distance between all pairs of nodes should be included.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    weight : string, optional (default=&#34;weight&#34;)
</span></span></span><span class="line"><span class="cl"><span class="s2">        Edge data key corresponding to the edge weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">        If any edge does not have this attribute the weight is set to 1.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    OPT : float
</span></span></span><span class="line"><span class="cl"><span class="s2">        The cost for the optimal solution to the Held-Karp relaxation
</span></span></span><span class="line"><span class="cl"><span class="s2">    z_star : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">        A symmetrized and scaled version of the optimal solution to the
</span></span></span><span class="line"><span class="cl"><span class="s2">        Held-Karp relaxation for use in the Asadpour algorithm
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    References
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    .. [1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi,
</span></span></span><span class="line"><span class="cl"><span class="s2">       An O(log n/log log n)-approximation algorithm for the asymmetric
</span></span></span><span class="line"><span class="cl"><span class="s2">       traveling salesman problem, Operations research, 65 (2017),
</span></span></span><span class="line"><span class="cl"><span class="s2">       pp. 1043–1061
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">def</span> <span class="nf">sep_oracle</span><span class="p">(</span><span class="n">point</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">        The separation oracle used in the ellipsoid algorithm to solve the
</span></span></span><span class="line"><span class="cl"><span class="s2">        Held-Karp relaxation.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        This &#39;black-box&#39; takes a point and checks to see if it violates any
</span></span></span><span class="line"><span class="cl"><span class="s2">        of the Held-Karp constraints, which are defined as
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">            - The out-degree of all non-empty subsets of $V$ is at least one.
</span></span></span><span class="line"><span class="cl"><span class="s2">            - The in-degree and out-degree of each vertex in $V$ is equal to
</span></span></span><span class="line"><span class="cl"><span class="s2">              one. Note that if a vertex has more than one incoming or
</span></span></span><span class="line"><span class="cl"><span class="s2">              outgoing arcs the values of each could be less than one so long
</span></span></span><span class="line"><span class="cl"><span class="s2">              as they sum to one.
</span></span></span><span class="line"><span class="cl"><span class="s2">            - The current value for each arc is greater
</span></span></span><span class="line"><span class="cl"><span class="s2">              than or equal to zero.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">        ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">        point : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">            The point in n dimensional space we wish to test to see if it
</span></span></span><span class="line"><span class="cl"><span class="s2">            violates any of the Held-Karp constraints.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">        -------
</span></span></span><span class="line"><span class="cl"><span class="s2">        numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">            The hyperplane which was the most violated by `point`, i.e. the
</span></span></span><span class="line"><span class="cl"><span class="s2">            hyperplane defining the polytope of spanning trees which `point`
</span></span></span><span class="line"><span class="cl"><span class="s2">            was farthest from, None if no constraints are violated.
</span></span></span><span class="line"><span class="cl"><span class="s2">        &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="k">pass</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
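<p>To make the three constraint families from the <code>sep_oracle</code> docstring concrete, here is a brute-force sketch that checks them directly. It is not the intended oracle: it enumerates every non-empty proper subset of $V$, so it only works on tiny graphs, and it returns the first violated constraint it finds rather than the most violated hyperplane. The function name, the dictionary representation of the point and the returned tuples are all assumptions made for illustration.</p>
<div class="highlight">
  <pre class="chroma"><code>from itertools import combinations


def violated_constraint(nodes, x, tol=1e-9):
    """Return a description of one violated Held-Karp constraint, or None.

    x maps arcs (u, v) to their values; missing arcs count as zero.
    All 2^n - 2 cuts are enumerated, so this is only usable on tiny graphs.
    """
    nodes = list(nodes)
    # Non-negativity of every arc value.
    for arc, value in x.items():
        if -value > tol:
            return ("negative arc", arc)
    # In-degree and out-degree of every vertex must equal one.
    for v in nodes:
        out_sum = sum(x.get((v, w), 0.0) for w in nodes if w != v)
        in_sum = sum(x.get((w, v), 0.0) for w in nodes if w != v)
        if abs(out_sum - 1) > tol or abs(in_sum - 1) > tol:
            return ("degree", v)
    # Every non-empty proper subset needs at least one unit leaving it.
    for size in range(1, len(nodes)):
        for subset in combinations(nodes, size):
            inside = set(subset)
            cut = sum(
                x.get((u, w), 0.0)
                for u in inside
                for w in nodes
                if w not in inside
            )
            if 1 - cut > tol:
                return ("cut", inside)
    return None


# Two disjoint 2-cycles satisfy the degree constraints but not the cut ones.
x_bad = {(0, 1): 1.0, (1, 0): 1.0, (2, 3): 1.0, (3, 2): 1.0}
# A single directed 4-cycle satisfies every constraint.
x_good = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
result_bad = violated_constraint(range(4), x_bad)
result_good = violated_constraint(range(4), x_good)
</code></pre>
</div>
<p>The first point is rejected because nothing leaves the subset containing nodes 0 and 1, which is exactly the kind of subtour the cut constraints rule out. The second point passes all three checks.</p>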
<p>Next the algorithm uses the symmetrized and scaled version of the Held-Karp solution to construct an exponential distribution of undirected spanning trees which preserves the marginal probabilities.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">_spanning_tree_distribution</span><span class="p">(</span><span class="n">z_star</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Solves the Maximum Entropy Convex Program in the Asadpour algorithm [1]_
</span></span></span><span class="line"><span class="cl"><span class="s2">    using the approach in section 7 to build an exponential distribution of
</span></span></span><span class="line"><span class="cl"><span class="s2">    undirected spanning trees.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    This algorithm ensures that the probability of any edge in a spanning
</span></span></span><span class="line"><span class="cl"><span class="s2">    tree is proportional to the sum of the probabilities of the trees
</span></span></span><span class="line"><span class="cl"><span class="s2">    containing that edge over the sum of the probabilities of all spanning
</span></span></span><span class="line"><span class="cl"><span class="s2">    trees of the graph.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    z_star : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">        The output of `_held_karp()`, a scaled version of the Held-Karp
</span></span></span><span class="line"><span class="cl"><span class="s2">        solution.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    gamma : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">        The probability distribution which approximately preserves the marginal
</span></span></span><span class="line"><span class="cl"><span class="s2">        probabilities of `z_star`.
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>Now that the algorithm has the distribution of spanning trees, we need to sample them.
Each sampled tree is a $\lambda$-random tree and can be sampled using algorithm A8 in [2].</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">_sample_spanning_tree</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">gamma</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Sample one spanning tree from the distribution defined by `gamma`,
</span></span></span><span class="line"><span class="cl"><span class="s2">    roughly using algorithm A8 in [1]_ .
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    We &#39;shuffle&#39; the edges in the graph, and then probabilistically
</span></span></span><span class="line"><span class="cl"><span class="s2">    determine whether to add the edge conditioned on all of the previous
</span></span></span><span class="line"><span class="cl"><span class="s2">    edges which were added to the tree. Probabilities are calculated using
</span></span></span><span class="line"><span class="cl"><span class="s2">    Kirchhoff&#39;s Matrix Tree Theorem and a weighted Laplacian matrix.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : nx.Graph
</span></span></span><span class="line"><span class="cl"><span class="s2">        An undirected version of the original graph.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    gamma : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">        The probabilities associated with each of the edges in the undirected
</span></span></span><span class="line"><span class="cl"><span class="s2">        graph `G`.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    nx.Graph
</span></span></span><span class="line"><span class="cl"><span class="s2">        A spanning tree using the distribution defined by `gamma`.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    References
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    .. [1] V. Kulkarni, Generating random combinatorial objects, Journal of
</span></span></span><span class="line"><span class="cl"><span class="s2">       algorithms, 11 (1990), pp. 185–207
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>At this point there is only one function left to discuss, <code>laplacian_matrix</code>.
This function already exists within NetworkX at <code>networkx.linalg.laplacianmatrix.laplacian_matrix</code>, and even though this is relatively simple to implement, I&rsquo;d rather use an existing version than create duplicate code within the project.
A deeper look at the function signature reveals</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="nd">@not_implemented_for</span><span class="p">(</span><span class="s2">&#34;directed&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">laplacian_matrix</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">nodelist</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Returns the Laplacian matrix of G.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    The graph Laplacian is the matrix L = D - A, where
</span></span></span><span class="line"><span class="cl"><span class="s2">    A is the adjacency matrix and D is the diagonal matrix of node degrees.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : graph
</span></span></span><span class="line"><span class="cl"><span class="s2">       A NetworkX graph
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    nodelist : list, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">       The rows and columns are ordered according to the nodes in nodelist.
</span></span></span><span class="line"><span class="cl"><span class="s2">       If nodelist is None, then the ordering is produced by G.nodes().
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    weight : string or None, optional (default=&#39;weight&#39;)
</span></span></span><span class="line"><span class="cl"><span class="s2">       The edge data key used to compute each value in the matrix.
</span></span></span><span class="line"><span class="cl"><span class="s2">       If None, then each edge has weight 1.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    L : SciPy sparse matrix
</span></span></span><span class="line"><span class="cl"><span class="s2">      The Laplacian matrix of G.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Notes
</span></span></span><span class="line"><span class="cl"><span class="s2">    -----
</span></span></span><span class="line"><span class="cl"><span class="s2">    For MultiGraph/MultiDiGraph, the edges weights are summed.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    See Also
</span></span></span><span class="line"><span class="cl"><span class="s2">    --------
</span></span></span><span class="line"><span class="cl"><span class="s2">    to_numpy_array
</span></span></span><span class="line"><span class="cl"><span class="s2">    normalized_laplacian_matrix
</span></span></span><span class="line"><span class="cl"><span class="s2">    laplacian_spectrum
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span></span></span></code></pre>
</div>
<p>Which is exactly what I need, <em>except</em> the decorator states that it does not support directed graphs and this algorithm deals with those types of graphs.
Fortunately, our distribution of spanning trees is for trees in a directed graph <em>once the direction is disregarded</em>, so we can actually use the existing function.
The definition given in the Asadpour paper [1], is</p>
<p>$$
L_{i,j} = \left\{
\begin{array}{l l}
-\lambda_e &amp; e = (i, j) \in E \\\
\sum_{e \in \delta({i})} \lambda_e &amp; i = j \\\
0 &amp; \text{otherwise}
\end{array}
\right.
$$</p>
<p>Where $E$ is defined as &ldquo;Let $E$ be the support of graph of $z^*$ when the direction of the arcs are disregarded&rdquo; on page 5 of the Asadpour paper.
Thus, I can use the existing method without having to create a new one, which will save time and effort on this GSoC project.</p>
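<p>As a sketch of how this Laplacian would be used, the snippet below builds $L$ from the definition above in plain Python and applies Kirchhoff&rsquo;s Matrix Tree Theorem: the determinant of $L$ with one row and the matching column removed equals the sum, over all spanning trees, of the product of their $\lambda_e$ values. The helper names and the dictionary of $\lambda_e$ values are hypothetical; in the project itself the existing <code>laplacian_matrix</code> function would take the place of <code>weighted_laplacian</code>.</p>
<div class="highlight">
  <pre class="chroma"><code>from fractions import Fraction
from itertools import combinations


def weighted_laplacian(nodes, lam):
    """L[i][j] = -lambda_e for e = {i, j}; diagonal = sum of incident lambda_e."""
    index = {v: i for i, v in enumerate(nodes)}
    L = [[Fraction(0)] * len(nodes) for _ in nodes]
    for (u, v), weight in lam.items():
        i, j = index[u], index[v]
        L[i][j] -= weight
        L[j][i] -= weight
        L[i][i] += weight
        L[j][j] += weight
    return L


def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [row[:] for row in matrix]
    det = Fraction(1)
    for col in range(len(m)):
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            # A row swap flips the sign of the determinant.
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det


def total_tree_weight(nodes, lam):
    """Matrix Tree Theorem: a cofactor of L sums lambda-products over all trees."""
    L = weighted_laplacian(nodes, lam)
    minor = [row[1:] for row in L[1:]]
    return determinant(minor)


def edge_marginal(nodes, lam, edge):
    """Probability that `edge` is in a tree drawn proportionally to its lambda-product."""
    without = {e: w for e, w in lam.items() if e != edge}
    return 1 - total_tree_weight(nodes, without) / total_tree_weight(nodes, lam)


# Complete undirected graph on four nodes with every lambda_e equal to one.
nodes = [0, 1, 2, 3]
lam = {edge: Fraction(1) for edge in combinations(nodes, 2)}
tree_total = total_tree_weight(nodes, lam)
marginal = edge_marginal(nodes, lam, (0, 1))
</code></pre>
</div>
<p>With every $\lambda_e = 1$ the total is 16, the number of spanning trees of the complete graph on four nodes, and each edge appears in a tree with probability $\frac{1}{2}$. The marginal is computed by removing the edge and subtracting the remaining share of the total, which is one way to obtain the conditional probabilities needed when sampling a tree edge by edge.</p>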
<p>In addition to being discussed here, these function stubs have been added to my fork of <code>NetworkX</code> on the <code>bothTSP</code> branch.
The commit, <a href="https://github.com/mjschwenne/networkx/commit/d3a3db8823804faa3edbf8bfa0f4b12459143ac8"><code>Added function stubs and draft docstrings for the Asadpour algorithm</code></a> is visible on my GitHub using that link.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043–1061, <a href="https://homes.cs.washington.edu/~shayan/atsp.pdf">https://homes.cs.washington.edu/~shayan/atsp.pdf</a>.</p>
<p>[2] V. Kulkarni, <em>Generating random combinatorial objects</em>, Journal of Algorithms, 11 (1990), pp. 185–207.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Aitik Gupta joins as a Student Developer under GSoC'21]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_2021_introduction/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2020_final_work_product/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2020 Work Product - Baseline Images Problem" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_5/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 3 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_4/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_3/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_2/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 2" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_2021_introduction/</id>
            
            
            <published>2021-05-19T20:03:57+05:30</published>
            <updated>2021-05-19T20:03:57+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>An introduction to Aitik Gupta, Google Summer of Code 2021 intern under the parent organisation NumFOCUS</blockquote><p><strong><ins>The day of the result was a very, very long day.</ins></strong></p>
<p>With this small writeup, I intend to talk about everything before <em>that day</em>, my experiences, my journey, and the role of Matplotlib throughout!</p>
<h2 id="about-me">About Me<a class="headerlink" href="#about-me" title="Link to this heading">#</a></h2>
<p>I am a third-year undergraduate student currently pursuing a Dual Degree (B.Tech + M.Tech) in Information Technology at the Indian Institute of Information Technology, Gwalior.</p>
<p>During my sophomore year, my interests started expanding in the domain of Machine Learning, where I learnt about various amazing open-source libraries like <em>NumPy</em>, <em>SciPy</em>, <em>pandas</em>, and <em>Matplotlib</em>! Gradually, in my third year, I explored the field of Computer Vision during my internship at a startup, where a big chunk of my work was to integrate their native C++ codebase to Android via JNI calls.</p>
<p>To put what I learnt during the internship into practice, I worked on my own research along with a <a href="https://linkedin.com/in/aaditagarwal">friend from my university</a>. The paper was accepted at CoDS-COMAD’21 and is published in the ACM Digital Library. (<a href="https://dl.acm.org/doi/abs/10.1145/3430984.3430986">Link</a>, if anyone&rsquo;s interested)</p>
<p>During this period, I also picked up a knack for open source and started looking at various issues (and pull requests) in libraries, including OpenCV [<a href="https://github.com/opencv/opencv/issues?q=author%3Aaitikgupta&#43;">contributions</a>] and NumPy [<a href="https://github.com/numpy/numpy/issues?q=author%3Aaitikgupta&#43;">contributions</a>].</p>
<p>I quickly got involved in Matplotlib’s community; it was very welcoming and beginner-friendly.</p>
<p><strong>Fun fact: Its dev call was the very first I attended with people from all around the world!</strong></p>
<h2 id="first-contributions">First Contributions<a class="headerlink" href="#first-contributions" title="Link to this heading">#</a></h2>
<p>We all mess up. My <a href="https://github.com/opencv/opencv/pull/18440">very first PR</a> to an organisation like OpenCV went horribly, and to this day it looks like this:
<img src="https://user-images.githubusercontent.com/43996118/118848259-35d6e300-b8ec-11eb-8cdc-387e9f5a37a3.png" alt="OpenCV_PR"></p>
<p>In all honesty, I added a single commit with only a few lines of diff.</p>
<blockquote>
<p>However, I pulled all the changes from upstream <code>master</code> to my working branch, whereas the PR was to be made against the <code>3.4</code> branch.</p>
</blockquote>
<p>I&rsquo;m sure I could&rsquo;ve done tons of things to solve it, but at that time I couldn&rsquo;t do anything - imagine the anxiety!</p>
<p>At this point when I look back at those fumbled PRs, I feel like they were important for my learning process.</p>
<p><strong>Fun Fact: Because of one of these initial contributions, I got a shiny little badge [<a href="https://github.com/readme/nasa-ingenuity-helicopter">Mars 2020 Helicopter Contributor</a>] on GitHub!</strong></p>
<img src="https://github.githubassets.com/images/modules/profile/badge--mars-64.png" style="width: 25%">
<h2 id="getting-started-with-matplotlib">Getting started with Matplotlib<a class="headerlink" href="#getting-started-with-matplotlib" title="Link to this heading">#</a></h2>
<p>Around the first weeks of November last year, while scanning through the <code>Good First Issue</code> and <code>New Feature</code> labels, I noticed a pattern: most <ins>Mathtext</ins>-related issues were unattended.</p>
<p>To make it simple, Mathtext is a part of Matplotlib which parses mathematical expressions and provides TeX-like outputs, for example:
<span><img src="https://matplotlib.org/stable/_images/mathmpl/math-050e387807.png" style="width: 25%"></span></p>
<p>I scanned the related source code to try to figure out how to solve those Mathtext issues. Eventually, with the help of maintainers reviewing the PRs and <ins>a lot of verbose discussions</ins> on GitHub issues/pull requests and on the <a href="https://gitter.im/matplotlib/matplotlib">Gitter</a> channel, I was able to get my initial PRs merged!</p>
<h2 id="learning-throughout-the-process">Learning throughout the process<a class="headerlink" href="#learning-throughout-the-process" title="Link to this heading">#</a></h2>
<p>Most of us use libraries without understanding their underlying structure, which can sometimes cause downstream bugs!</p>
<p>While I was studying Matplotlib&rsquo;s architecture, I figured that I could use the same idea for one of my <a href="https://aitikgupta.github.io/swi-ml/">own projects</a>!</p>
<p>Matplotlib uses a global dictionary-like object named <code>rcParams</code>. I used a smaller interface, similar to <code>rcParams</code>, in <a href="https://pypi.org/project/swi-ml/">swi-ml</a> - a small Python library I wrote, implementing a subset of ML algorithms, with a <ins>switchable backend</ins>.</p>
<h2 id="where-does-gsoc-fit">Where does GSoC fit?<a class="headerlink" href="#where-does-gsoc-fit" title="Link to this heading">#</a></h2>
<p>Around January, I had a conversation with one of the maintainers (hey <a href="https://github.com/anntzer">Antony</a>!) about the long list of issues with the current ways of handling text and fonts in the library.</p>
<p>After I compiled them into an ordered list, and after a few tweaks from the maintainers, the <a href="https://github.com/matplotlib/matplotlib/wiki/GSOC-2021-ideas">GSoC Idea-List</a> for Matplotlib was born, and so was my journey of building a strong proposal!</p>
<h2 id="about-the-project">About the Project<a class="headerlink" href="#about-the-project" title="Link to this heading">#</a></h2>
<h4 id="proposal-link-google-docs-will-stay-alive-after-gsoc-gsoc-website-not-so-sure">Proposal Link: <a href="https://docs.google.com/document/d/11PrXKjMHhl0rcQB4p_W9JY_AbPCkYuoTT0t85937nB0/edit?usp=sharing">Google Docs</a> (will stay alive after GSoC), <a href="https://storage.googleapis.com/summerofcode-prod.appspot.com/gsoc/core_project/doc/6319153410998272_1617936740_GSoC_Proposal_-_Matplotlib.pdf?Expires=1621539234&amp;GoogleAccessId=summerofcode-prod%40appspot.gserviceaccount.com&amp;Signature=QU8uSdPnXpa%2FooDtzVnzclz809LHjh9eU7Y7iR%2FH1NM32CBgzBO4%2FFbMeDmMsoic91B%2BKrPZEljzGt%2Fx9jtQeCR9X4O53JJLPVjw9Bg%2Fzb2YKjGzDk0oFMRPXjg9ct%2BV58PD6f4De1ucqARLtHGjis5jhK1W08LNiHAo88NB6BaL8Q5hqcTBgunLytTNBJh5lW2kD8eR2WeENnW9HdIe53aCdyxJkYpkgILJRoNLCvp111AJGC3RLYba9VKeU6w2CdrumPfRP45FX6fJlrKnClvxyf5VHo3uIjA3fGNWIQKwGgcd1ocGuFN3YnDTS4xkX3uiNplwTM4aGLQNhtrMqA%3D%3D">GSoC Website</a> (not so sure)<a class="headerlink" href="#proposal-link-google-docs-will-stay-alive-after-gsoc-gsoc-website-not-so-sure" title="Link to this heading">#</a></h4>
<h3 id="revisiting-textfont-handling">Revisiting Text/Font Handling<a class="headerlink" href="#revisiting-textfont-handling" title="Link to this heading">#</a></h3>
<p>The aim of the project is divided into 3 subgoals:</p>
<ol>
<li>
<p><strong>Font-Fallback</strong>: A redesigned text-first font interface - essentially trying every font family in the list before rendering a &ldquo;tofu&rdquo; (the empty box drawn for a missing glyph).</p>
<p><em>(similar to specifying <ins>font-family in CSS</ins>!)</em></p>
</li>
<li>
<p><strong>Font Subsetting</strong>: Every exported PS/PDF would contain embedded glyphs subsetted from the whole font.</p>
<p><em>(imagine a plot with just a single letter &ldquo;a&rdquo;: would you like the PDF you exported from Matplotlib to <ins>embed the whole font</ins> file within it?)</em></p>
</li>
<li>
<p>Most mpl backends would use the <ins>unified TeX exporting</ins> mechanism</p>
</li>
</ol>
<p><strong>Mentors</strong> <a href="https://github.com/tacaswell">Thomas A Caswell</a>, <a href="https://github.com/anntzer">Antony Lee</a>, <a href="https://github.com/story645">Hannah</a>.</p>
<p>Thanks a lot for spending time reading the blog! I&rsquo;ll be back with my progress in subsequent posts.</p>
<h5 id="note-this-blog-post-is-also-available-at-my-personal-website">NOTE: This blog post is also available at my <a href="https://aitikgupta.github.io/gsoc-intro/">personal website</a>!<a class="headerlink" href="#note-this-blog-post-is-also-available-at-my-personal-website" title="Link to this heading">#</a></h5>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Held-Karp Separation Oracle]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/</id>
            
            
            <published>2021-05-08T00:00:00+00:00</published>
            <updated>2021-05-08T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Considering creating a separation oracle for the Held-Karp relaxation</blockquote><p>Continuing the theme of my last post, we know that the Held-Karp relaxation in the Asadpour Asymmetric Traveling Salesman Problem cannot be practically written into the standard matrix form of a linear program.
Thus, we need a different method to solve the relaxation, which is where the ellipsoid method comes into play.
The ellipsoid method can be used to solve semi-infinite linear programs, which is what the Held-Karp relaxation is.</p>
<p>One of the keys to the ellipsoid method is the separation oracle.
From the perspective of the algorithm itself, the oracle is a black-box program which takes a vector and determines</p>
<ul>
<li>Whether the vector is in the linear program&rsquo;s feasible region.</li>
<li>If not, it returns a hyperplane with the given point on one side and the linear program&rsquo;s feasible region on the other.</li>
</ul>
<p>In the most basic form, the ellipsoid method is a decision algorithm rather than an optimization algorithm, so it terminates once a single, but almost certainly nonoptimal, vector within the feasible region is found.
However, we can convert the ellipsoid method into an algorithm which is truly an optimization one.
What this means for us is that we can assume that the separation oracle will return a hyperplane.</p>
<p>The hyperplane that the oracle returns is then used to construct the next ellipsoid in the algorithm, which is of smaller volume and contains a half-ellipsoid from the originating ellipsoid.
This is, however, a topic for another post.
Right now I want to focus on this &lsquo;black-box&rsquo; separation oracle.</p>
<p>The Held-Karp relaxation is semi-infinite because, for a graph with $n$ vertices, there are $2^n + 2n$ constraints in the program.
A naive approach to the separation oracle would be to check each constraint individually for the input vector, creating a program with $O(2^n)$ running time.
While it would terminate eventually, it certainly would take a <em>long</em> time to do so.</p>
<p>So, we look for a more efficient way to do this.
Recall from the Asadpour paper [1] that the Held-Karp relaxation is the following linear program.</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{a} c(a)x_a \\\
\text{s.t.} &amp; x(\delta^+(U)) \geqslant 1 &amp; \forall\ U \subset V \text{ and } U \not= \emptyset \\\
&amp; x(\delta^+(v)) = x(\delta^-(v)) = 1 &amp; \forall\ v \in V \\\
&amp; x_a \geqslant 0 &amp; \forall\ a
\end{array}
$$</p>
<p>The first set of constraints ensures that the output of the relaxation is connected.
This is called <em>subtour elimination</em>, and it prevents a solution with multiple disconnected clusters by ensuring that every set of vertices has at least one total outgoing arc (we are currently dealing with fractional arcs).
From the perspective of the separation oracle, we do not care about the sets of vertices for which $x(\delta^+(U)) \geqslant 1$ holds; we only need to find one subset of the vertices where $x(\delta^+(U)) &lt; 1$.</p>
<p>In order to find such a set of vertices $U \subset V$ where $x(\delta^+(U)) &lt; 1$, we can find the subset $U$ with the smallest value of $x(\delta^+(U))$ over all $U \subset V$.
That is, find the <em>global minimum cut</em> in the complete digraph using the edge capacities given by the input vector to the separation oracle.
Using lecture notes by Michel X. Goemans [2] (who is also one of the authors of the Asadpour algorithm this project seeks to implement), we can find such a minimum cut with $2(n - 1)$ maximum flow calculations.</p>
<p>The algorithm described in section 6.4 of the lecture notes [2] is fairly simple.
Let $S$ and $T = V \setminus S$ be the two sides of the global minimum cut of the graph.
First, we pick an arbitrary vertex $s$ in the graph.
By definition, $s$ is either in $S$ or it is in $T$.
We now iterate through every other vertex $t$ in the graph, and compute the $s-t$ and $t-s$ minimum cuts.
If $s \in S$, then one of the choices of $t$ will produce the global minimum cut as an $s-t$ cut, and the case where $s \not\in S$, that is $s \in T$, is covered by the $t-s$ cuts.</p>
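<p>The procedure above can be sketched in plain Python. This is only an illustration under stated assumptions, not the NetworkX implementation: the helper names <code>max_flow</code> and <code>global_min_cut_value</code> are hypothetical, the graph is given as a dense capacity matrix holding the $x_a$ values, and a simple Edmonds-Karp routine stands in for whichever efficient max-flow algorithm is actually used.</p>

```python
from collections import deque


def max_flow(capacity, s, t):
    """Edmonds-Karp maximum flow on a dense capacity matrix (list of lists).

    By max-flow/min-cut duality the returned value is also the value of
    the minimum s-t cut.
    """
    n = len(capacity)
    residual = [row[:] for row in capacity]
    total = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [None] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] is None:
            u = queue.popleft()
            for v in range(n):
                if parent[v] is None and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] is None:
            return total  # no augmenting path is left
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        total += bottleneck


def global_min_cut_value(capacity):
    """Smallest directed cut value, using 2(n - 1) max-flow computations."""
    s = 0  # arbitrary fixed vertex
    best = float("inf")
    for t in range(1, len(capacity)):
        best = min(best, max_flow(capacity, s, t), max_flow(capacity, t, s))
    return best


# One directed tour 0 -> 1 -> 2 -> 0: every cut carries exactly one arc.
tour = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
# Two disjoint 2-cycles: the cut between them is empty.
split = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
print(global_min_cut_value(tour), global_min_cut_value(split))
```

<p>In this sketch the oracle would report a violated subtour elimination constraint whenever the returned value is below 1, as it is for the second input.</p>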
<p>According to Goemans [2], the complexity of finding the global min cut in a weighted digraph, using an efficient maxflow algorithm, is $O(mn^2\log(n^2/m))$.</p>
<p>The second constraint can be checked in $O(n)$ time with a simple loop.
It makes sense to actually check this one first, as it is computationally simpler; if one of these conditions is violated, we can return the corresponding hyperplane faster.</p>
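<p>A minimal sketch of that check, assuming the input vector is stored as a dictionary mapping arcs <code>(u, v)</code> to their fractional values. The function name <code>violated_degree_constraint</code> and the tolerance <code>tol</code> are hypothetical. Note that with this storage the totals are accumulated in one pass over the arcs before the $O(n)$ loop over the vertices.</p>

```python
def violated_degree_constraint(x, n, tol=1e-9):
    """Return (vertex, "out" or "in", total) for the first violation, else None.

    x maps arcs (u, v) to fractional values; vertices are 0 .. n - 1.
    """
    out_total = [0.0] * n
    in_total = [0.0] * n
    for (u, v), value in x.items():
        out_total[u] += value  # contributes to x(delta^+(u))
        in_total[v] += value   # contributes to x(delta^-(v))
    for v in range(n):
        if abs(out_total[v] - 1.0) > tol:
            return v, "out", out_total[v]
        if abs(in_total[v] - 1.0) > tol:
            return v, "in", in_total[v]
    return None


feasible = {(0, 1): 1.0, (1, 2): 1.0, (2, 0): 1.0}
broken = {(0, 1): 1.0, (1, 2): 0.5, (2, 0): 1.0}
print(violated_degree_constraint(feasible, 3))
print(violated_degree_constraint(broken, 3))
```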
<p>Now we have reduced the complexity of the oracle from $O(2^n)$ to the same as finding the global min cut, $O(mn^2\log(n^2/m))$, which is substantially better.
For example, let us consider an initial graph with 100 vertices.
Using the $O(2^n)$ method, there are $1.2677 \times 10^{30}$ subsets $U$ that we need to check, <em>times</em> whatever the complexity of actually determining whether a subset violates $x(\delta^+(U)) \geqslant 1$ is.
For that same complete digraph on 100 vertices, we know that $n = 100$ and $m = \binom{100}{2} = 4950$.
Using the global min cut approach, the complexity, which includes finding the max flow as well as the number of times it needs to be found, is $15117042$ or $1.5117 \times 10^7$, which is faster by a factor of $10^{23}$.</p>
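<p>The figures in this comparison can be reproduced with a few lines of Python. The variable names are illustrative, and the base-10 logarithm is an assumption: it is the base that reproduces the value quoted above.</p>

```python
from math import comb, log10

n = 100
m = comb(n, 2)  # 4950, the arc count used in the text

naive_subsets = 2**n  # subsets the naive oracle would have to examine
min_cut_bound = m * n**2 * log10(n**2 / m)  # base-10 logarithm assumed

print(f"{naive_subsets:.4e}")                  # about 1.2677e+30
print(f"{min_cut_bound:.4e}")                  # about 1.5117e+07
print(f"{naive_subsets / min_cut_bound:.1e}")  # on the order of 1e23
```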
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] A. Asadpour, M. X. Goemans, A. Mądry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061, <a href="https://homes.cs.washington.edu/~shayan/atsp.pdf">https://homes.cs.washington.edu/~shayan/atsp.pdf</a>.</p>
<p>[2] M. X. Goemans, <em>Lecture notes on flows and cuts</em>, Handout 18, Massachusetts Institute of Technology, Cambridge, MA, 2009 <a href="http://www-math.mit.edu/~goemans/18433S09/flowscuts.pdf">http://www-math.mit.edu/~goemans/18433S09/flowscuts.pdf</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Held-Karp Relaxation]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/</id>
            
            
            <published>2021-04-21T00:00:00+00:00</published>
            <updated>2021-04-21T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Brief explanation of the Held-Karp relaxation and why it cannot be solved directly</blockquote><p>In linear programming, we sometimes need to take what would be an integer program and &lsquo;relax&rsquo; it, or unbound the values of the variables so that they are continuous.
One particular application of this process is the Held-Karp relaxation used in the first part of the Asadpour algorithm for the Asymmetric Traveling Salesman Problem, where we find the lower bound of the approximation.
Normally the relaxation is written as follows.</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{a} c(a)x_a \\\
\text{s.t.} &amp; x(\delta^+(U)) \geqslant 1 &amp; \forall\ U \subset V \text{ and } U \not= \emptyset \\\
&amp; x(\delta^+(v)) = x(\delta^-(v)) = 1 &amp; \forall\ v \in V \\\
&amp; x_a \geqslant 0 &amp; \forall\ a
\end{array}
$$</p>
<p>This is a convenient way to write the program, but if we want to solve it, and we definitely do, we need it written in standard form for a linear program.
Standard form is represented using a matrix for the set of constraints and vectors for the objective function.
It is shown below.</p>
<p>$$
\begin{array}{c l}
\text{min} &amp; Z = c^TX \\\
\text{s.t.} &amp; AX = b \\\
&amp; X \geqslant 0
\end{array}
$$</p>
<p>Here $c$ is the coefficient vector for the objective function, $X$ is the vector for the values of all of the variables, $A$ is the coefficient matrix for the constraints and $b$ is a vector of what the constraints are equal to.
Once a linear program is in this form there are efficient algorithms which can solve it.</p>
<p>In the Held-Karp relaxation, the objective function is a summation, so we can expand it term by term.
If there are $n$ edges, then it becomes</p>
<p>$$
\sum_{a} c(a)x_a = c(1)x_1 + c(2)x_2 + c(3)x_3 + \dots + c(n)x_n
$$</p>
<p>Where $c(a)$ is the weight of that edge in the graph.
From here it is easy to convert the objective function into two vectors which satisfy the standard form.</p>
<p>$$
\begin{array}{rcl}
c &amp;=&amp; \begin{bmatrix}
c_1 &amp; c_2 &amp; c_3 &amp; \dots &amp; c_n
\end{bmatrix}^T \\\
X &amp;=&amp; \begin{bmatrix}
x_1 &amp; x_2 &amp; x_3 &amp; \dots &amp; x_n
\end{bmatrix}^T
\end{array}
$$</p>
<p>Now we have to convert the constraints to be in standard form.
First and foremost, notice that the Held-Karp relaxation contains $x_a \geqslant 0\ \forall\ a$ and the standard form uses $X \geqslant 0$, so these constraints match already and no work is needed.
As for the others&hellip; well they do need some work.</p>
<p>We start with the first constraint in the Held-Karp relaxation, $x(\delta^+(U)) \geqslant 1\ \forall\ U \subset V$ and $U \not= \emptyset$.
This constraint specifies that every nonempty proper subset $U$ of the vertex set $V$ must have at least one arc with its tail in $U$ and its head not in $U$.
For any given $\delta^+(U)$, which is defined in the paper as $\delta^+(U) = \{a = (u, v) \in A: u \in U, v \not\in U\}$ where $A$ in this set is the set of all arcs in the graph, the coefficients on arcs not in $\delta^+(U)$ are zero.
Arcs in $\delta^+(U)$ have a coefficient of $1$ as their full weight is counted as part of $x(\delta^+(U))$.
We know that there are about $2^{|V|}$ subsets of the vertex set $V$, so this constraint adds that many rows to the constraint matrix $A$.</p>
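<p>To make the shape of one such row concrete, here is a small sketch that builds the coefficient row for a single subset $U$ on the complete digraph with three vertices. The helper name <code>cut_row</code> and the arc ordering are assumptions made for illustration only.</p>

```python
from itertools import permutations


def cut_row(U, arcs):
    """Coefficient row for x(delta^+(U)) >= 1: 1 for arcs leaving U, else 0."""
    U = set(U)
    return [1 if u in U and v not in U else 0 for (u, v) in arcs]


# All arcs of the complete digraph on vertices 0, 1, 2, in a fixed order:
# (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)
arcs = list(permutations(range(3), 2))

print(cut_row({0}, arcs))     # [1, 1, 0, 0, 0, 0]
print(cut_row({0, 1}, arcs))  # [0, 1, 0, 1, 0, 0]
```

<p>One row like this is needed for every subset $U$, which is where the exponential row count comes from.</p>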
<p>Moving to the next constraint, $x(\delta^+(v)) = x(\delta^-(v)) = 1$, we first need to split it in two.</p>
<p>$$
\begin{array}{rcl}
x(\delta^+(v)) &amp;=&amp; 1 \\\
x(\delta^-(v)) &amp;=&amp; 1
\end{array}
$$</p>
<p>Similar to the last constraint, each of these says that the number of arcs leaving (respectively entering) a vertex in the graph needs to equal one.
For each vertex $v$ we find all the arcs which start at $v$ and those are the members of $\delta^+(v)$, so they have a weight of 1 and all others have a weight of zero.
The opposite is true for $\delta^-(v)$: every arc which has its head at $v$ has a weight or coefficient of 1 while the rest have a weight of zero.
This adds $2 \times |V|$ rows to $A$, the coefficient matrix, which brings the total to $2^{|V|} + 2|V|$ rows.</p>
<h2 id="the-impossible-size-of-a">The Impossible Size of $A$<a class="headerlink" href="#the-impossible-size-of-a" title="Link to this heading">#</a></h2>
<p>We already know that $A$ will have $2^{|V|} + 2|V|$ rows.
But how many columns will $A$ have?
We know that each arc is a variable so at least $|E|$ columns, but in a traditional matrix form of a linear program, we have to introduce slack and surplus variables so that $AX = b$ and not $AX \geqslant b$ or any other inequality operation.
The $2|V|$ rows already comply with this requirement, but the rows created with every subset of $V$ do <em>not</em>; those rows only require that $x(\delta^+(U)) \geqslant 1$, so we introduce a surplus variable for each of these rows, bringing the column count to $|E| + 2^{|V|}$.</p>
<p>Now, the Held-Karp relaxation performed in the Asadpour algorithm is done on the complete bi-directed graph.
For a graph with $n$ vertices, there will be $2 \times \binom{n}{2}$ arcs in the graph.
With this, $A$ is a</p>
<p>$$
\left(2^n + 2n \right)\times \left(2\binom{n}{2} + 2^n\right)
$$</p>
<p>matrix.
This is <em>very</em> large.
For $n = 100$ there are $1.606 \times 10^{60}$ elements in the matrix.
Allocating a measly 8 bytes per entry still consumes over $1.28 \times 10^{52}$ gigabytes of memory.</p>
<p>This is an impossible amount of memory for any computer that we could run NetworkX on.</p>
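<p>As a quick sanity check of these numbers, the size can be computed directly with Python&rsquo;s arbitrary-precision integers. This sketch assumes 8 bytes per entry and $10^9$ bytes per gigabyte, which is the combination that reproduces the memory figure quoted above; the variable names are illustrative.</p>

```python
from math import comb

n = 100
rows = 2**n + 2 * n              # subset constraints plus degree constraints
cols = 2 * comb(n, 2) + 2**n     # arc variables plus surplus variables
entries = rows * cols

gigabytes = entries * 8 / 1e9    # assumes 8 bytes per entry

print(f"{entries:.3e}")    # about 1.607e+60
print(f"{gigabytes:.2e}")  # about 1.29e+52
```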
<h2 id="solution">Solution<a class="headerlink" href="#solution" title="Link to this heading">#</a></h2>
<p>The Held-Karp relaxation <em>must</em> be solved in the Asadpour Asymmetric Traveling Salesman Problem Algorithm, but clearly putting it into standard form is not possible.
This means that we will not be able to use SciPy&rsquo;s linprog method which I was hoping to use.
I will instead have to research and write an ellipsoid method solver, which hopefully will be able to solve the Held-Karp relaxation in both polynomial time and a practical amount of memory.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Stellar Chart, a Type of Chart to Be on Your Radar]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/stellar-chart-alternative-radar-chart/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/ipcc-sr15/?utm_source=atom_feed" rel="related" type="text/html" title="Figures in the IPCC Special Report on Global Warming of 1.5°C (SR15)" />
                <link href="https://blog.scientific-python.org/matplotlib/codeswitching-visualization/?utm_source=atom_feed" rel="related" type="text/html" title="Visualizing Code-Switching with Step Charts" />
                <link href="https://blog.scientific-python.org/matplotlib/elementary-cellular-automata/?utm_source=atom_feed" rel="related" type="text/html" title="Elementary Cellular Automata" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-fractals/?utm_source=atom_feed" rel="related" type="text/html" title="Animate Your Own Fractals in Python with Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-polar-plot/?utm_source=atom_feed" rel="related" type="text/html" title="Animated polar plot with oceanographic data" />
            
                <id>https://blog.scientific-python.org/matplotlib/stellar-chart-alternative-radar-chart/</id>
            
            
            <published>2021-01-10T20:29:40+00:00</published>
            <updated>2021-01-10T20:29:40+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Learn how to create a simple stellar chart, an alternative to the radar chart.</blockquote><p>In May 2020, Alexandre Morin-Chassé published a blog post about the <strong>stellar chart</strong>. This type of chart is an (approximately) direct alternative to the <strong>radar chart</strong> (also known as web, spider, star, or cobweb chart) — you can read more about this chart <a href="https://medium.com/nightingale/the-stellar-chart-an-elegant-alternative-to-radar-charts-ae6a6931a28e">here</a>.</p>
<p><img src="/matplotlib/stellar-chart-alternative-radar-chart/radar_stellar_chart.png" alt="Comparison of a radar chart and a stellar chart"></p>
<p>In this tutorial, we will see how we can create a quick-and-dirty stellar chart. First of all, let&rsquo;s get the necessary modules/libraries, as well as prepare a dummy dataset (with just a single record).</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">chain</span><span class="p">,</span> <span class="n">zip_longest</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">math</span> <span class="kn">import</span> <span class="n">ceil</span><span class="p">,</span> <span class="n">pi</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">data</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="s2">&#34;V1&#34;</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="s2">&#34;V2&#34;</span><span class="p">,</span> <span class="mi">10</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="s2">&#34;V3&#34;</span><span class="p">,</span> <span class="mi">9</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="s2">&#34;V4&#34;</span><span class="p">,</span> <span class="mi">12</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="s2">&#34;V5&#34;</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="s2">&#34;V6&#34;</span><span class="p">,</span> <span class="mi">14</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="s2">&#34;V7&#34;</span><span class="p">,</span> <span class="mi">15</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="s2">&#34;V8&#34;</span><span class="p">,</span> <span class="mi">25</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span></span></span></code></pre>
</div>
<p>We will also need some helper functions, namely a function to round up to the nearest 10 (<code>round_up()</code>) and a function to join two sequences (<code>even_odd_merge()</code>). In the latter, the values of the first sequence (a list or a tuple, basically) will fill the even positions and the values of the second the odd ones.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">round_up</span><span class="p">(</span><span class="n">value</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    &gt;&gt;&gt; round_up(25)
</span></span></span><span class="line"><span class="cl"><span class="s2">    30
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">ceil</span><span class="p">(</span><span class="n">value</span> <span class="o">/</span> <span class="mf">10.0</span><span class="p">))</span> <span class="o">*</span> <span class="mi">10</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">even_odd_merge</span><span class="p">(</span><span class="n">even</span><span class="p">,</span> <span class="n">odd</span><span class="p">,</span> <span class="n">filter_none</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    &gt;&gt;&gt; list(even_odd_merge([1,3], [2,4]))
</span></span></span><span class="line"><span class="cl"><span class="s2">    [1, 2, 3, 4]
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">filter_none</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="p">(</span><span class="n">x</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">chain</span><span class="o">.</span><span class="n">from_iterable</span><span class="p">(</span><span class="n">zip_longest</span><span class="p">(</span><span class="n">even</span><span class="p">,</span> <span class="n">odd</span><span class="p">))</span> <span class="k">if</span> <span class="n">x</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">chain</span><span class="o">.</span><span class="n">from_iterable</span><span class="p">(</span><span class="n">zip_longest</span><span class="p">(</span><span class="n">even</span><span class="p">,</span> <span class="n">odd</span><span class="p">))</span></span></span></code></pre>
</div>
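<p>To see what the <code>filter_none</code> flag does, here is a short, self-contained usage sketch. It restates <code>even_odd_merge()</code> as a variant that drops the padding with an explicit <code>is not None</code> test, which is an assumption of this example, and merges sequences of unequal length.</p>

```python
from itertools import chain, zip_longest


def even_odd_merge(even, odd, filter_none=True):
    """Interleave two sequences; optionally drop the None padding."""
    merged = chain.from_iterable(zip_longest(even, odd))
    if filter_none:
        return (item for item in merged if item is not None)
    return merged


# zip_longest pads the shorter sequence with None.
print(list(even_odd_merge([1, 3, 5], [2, 4])))                     # [1, 2, 3, 4, 5]
print(list(even_odd_merge([1, 3, 5], [2, 4], filter_none=False)))  # [1, 2, 3, 4, 5, None]
# Falsy values such as 0 are kept; only None is removed.
print(list(even_odd_merge([0, 2], [1, 3])))                        # [0, 1, 2, 3]
```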
<p>That said, to plot <code>data</code> on a stellar chart, we need to apply some transformations, as well as calculate some auxiliary values. So, let&rsquo;s start by creating a function (<code>prepare_angles()</code>) to calculate the angle of each axis on the chart (<code>N</code> corresponds to the number of variables to be plotted).</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">prepare_angles</span><span class="p">(</span><span class="n">N</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">angles</span> <span class="o">=</span> <span class="p">[</span><span class="n">n</span> <span class="o">/</span> <span class="n">N</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">pi</span> <span class="k">for</span> <span class="n">n</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">N</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Repeat the first angle to close the circle</span>
</span></span><span class="line"><span class="cl">    <span class="n">angles</span> <span class="o">+=</span> <span class="n">angles</span><span class="p">[:</span><span class="mi">1</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">angles</span></span></span></code></pre>
</div>
<p>Next, we need a function (<code>prepare_data()</code>) responsible for adjusting the original data (<code>data</code>) and separating it into several easy-to-use objects.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">prepare_data</span><span class="p">(</span><span class="n">data</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">labels</span> <span class="o">=</span> <span class="p">[</span><span class="n">d</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">data</span><span class="p">]</span>  <span class="c1"># Variable names</span>
</span></span><span class="line"><span class="cl">    <span class="n">values</span> <span class="o">=</span> <span class="p">[</span><span class="n">d</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">for</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">data</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Repeat the first value to close the circle</span>
</span></span><span class="line"><span class="cl">    <span class="n">values</span> <span class="o">+=</span> <span class="n">values</span><span class="p">[:</span><span class="mi">1</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">N</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">labels</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">angles</span> <span class="o">=</span> <span class="n">prepare_angles</span><span class="p">(</span><span class="n">N</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">labels</span><span class="p">,</span> <span class="n">values</span><span class="p">,</span> <span class="n">angles</span><span class="p">,</span> <span class="n">N</span></span></span></code></pre>
</div>
<p>Lastly, for this specific type of chart, we require a function (<code>prepare_stellar_aux_data()</code>) that, from the previously calculated angles, prepares two lists of auxiliary values: a list of <strong>intermediate angles</strong> for each pair of angles (<code>stellar_angles</code>) and a list of small <strong>constant values</strong> (<code>stellar_values</code>), which will act as the values of the variables to be plotted in order to achieve the <strong>star-like shape</strong> intended for the stellar chart.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">prepare_stellar_aux_data</span><span class="p">(</span><span class="n">angles</span><span class="p">,</span> <span class="n">ymax</span><span class="p">,</span> <span class="n">N</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">angle_midpoint</span> <span class="o">=</span> <span class="n">pi</span> <span class="o">/</span> <span class="n">N</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">stellar_angles</span> <span class="o">=</span> <span class="p">[</span><span class="n">angle</span> <span class="o">+</span> <span class="n">angle_midpoint</span> <span class="k">for</span> <span class="n">angle</span> <span class="ow">in</span> <span class="n">angles</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">]]</span>
</span></span><span class="line"><span class="cl">    <span class="n">stellar_values</span> <span class="o">=</span> <span class="p">[</span><span class="mf">0.05</span> <span class="o">*</span> <span class="n">ymax</span><span class="p">]</span> <span class="o">*</span> <span class="n">N</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">stellar_angles</span><span class="p">,</span> <span class="n">stellar_values</span></span></span></code></pre>
</div>
<p>At this point, we already have all the necessary <em>ingredients</em> for the stellar chart, so let&rsquo;s move on to the Matplotlib side of this tutorial. In terms of <strong>aesthetics</strong>, we can rely on a function (<code>draw_peripherals()</code>) designed for this specific purpose (feel free to customize it!).</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">draw_peripherals</span><span class="p">(</span><span class="n">ax</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">angles</span><span class="p">,</span> <span class="n">ymax</span><span class="p">,</span> <span class="n">outer_color</span><span class="p">,</span> <span class="n">inner_color</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># X-axis</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xticks</span><span class="p">(</span><span class="n">angles</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xticklabels</span><span class="p">(</span><span class="n">labels</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">outer_color</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Y-axis</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_yticks</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="n">ymax</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_yticklabels</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="n">ymax</span><span class="p">,</span> <span class="mi">10</span><span class="p">),</span> <span class="n">color</span><span class="o">=</span><span class="n">inner_color</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">7</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_ylim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">ymax</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_rlabel_position</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Both axes</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_axisbelow</span><span class="p">(</span><span class="kc">True</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Boundary line</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;polar&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="n">outer_color</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Grid lines</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">xaxis</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">inner_color</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="s2">&#34;-&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">yaxis</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">inner_color</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="s2">&#34;-&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>To <strong>plot the data</strong> and orchestrate (almost) all the steps necessary to have a stellar chart, we just need one last function: <code>draw_stellar()</code>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">draw_stellar</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">labels</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">values</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">angles</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">N</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">shape_color</span><span class="o">=</span><span class="s2">&#34;tab:blue&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">outer_color</span><span class="o">=</span><span class="s2">&#34;slategrey&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">inner_color</span><span class="o">=</span><span class="s2">&#34;lightgrey&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Limit the Y-axis according to the data to be plotted</span>
</span></span><span class="line"><span class="cl">    <span class="n">ymax</span> <span class="o">=</span> <span class="n">round_up</span><span class="p">(</span><span class="nb">max</span><span class="p">(</span><span class="n">values</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Get the lists of angles and variable values</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># with the necessary auxiliary values injected</span>
</span></span><span class="line"><span class="cl">    <span class="n">stellar_angles</span><span class="p">,</span> <span class="n">stellar_values</span> <span class="o">=</span> <span class="n">prepare_stellar_aux_data</span><span class="p">(</span><span class="n">angles</span><span class="p">,</span> <span class="n">ymax</span><span class="p">,</span> <span class="n">N</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">all_angles</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">even_odd_merge</span><span class="p">(</span><span class="n">angles</span><span class="p">,</span> <span class="n">stellar_angles</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">all_values</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">even_odd_merge</span><span class="p">(</span><span class="n">values</span><span class="p">,</span> <span class="n">stellar_values</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Apply the desired style to the figure elements</span>
</span></span><span class="line"><span class="cl">    <span class="n">draw_peripherals</span><span class="p">(</span><span class="n">ax</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">angles</span><span class="p">,</span> <span class="n">ymax</span><span class="p">,</span> <span class="n">outer_color</span><span class="p">,</span> <span class="n">inner_color</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Draw (and fill) the star-shaped outer line/area</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">all_angles</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">all_values</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">linewidth</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">linestyle</span><span class="o">=</span><span class="s2">&#34;solid&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">solid_joinstyle</span><span class="o">=</span><span class="s2">&#34;round&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="n">shape_color</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">fill</span><span class="p">(</span><span class="n">all_angles</span><span class="p">,</span> <span class="n">all_values</span><span class="p">,</span> <span class="n">shape_color</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Add a small hole in the center of the chart</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="s2">&#34;o&#34;</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;white&#34;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span></span></span></code></pre>
</div>
<p>Finally, let&rsquo;s get our chart on a <em>blank canvas</em> (figure).</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">dpi</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="mi">111</span><span class="p">,</span> <span class="n">polar</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>  <span class="c1"># Don&#39;t forget the projection!</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">draw_stellar</span><span class="p">(</span><span class="n">ax</span><span class="p">,</span> <span class="o">*</span><span class="n">prepare_data</span><span class="p">(</span><span class="n">data</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/stellar-chart-alternative-radar-chart/stellar_chart.png" alt="Example of a stellar chart"></p>
<p>It&rsquo;s done! You now have an example of a stellar chart and the boilerplate code to add this type of chart to your <em>repertoire</em>. If you end up creating your own stellar charts, feel free to share them with the <em>world</em> (and <a href="https://twitter.com/joaompalmeiro">me</a>!). I hope you found this tutorial useful and interesting!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Figures in the IPCC Special Report on Global Warming of 1.5°C (SR15)]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/ipcc-sr15/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="related" type="text/html" title="Creating the Warming Stripes in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/codeswitching-visualization/?utm_source=atom_feed" rel="related" type="text/html" title="Visualizing Code-Switching with Step Charts" />
                <link href="https://blog.scientific-python.org/matplotlib/elementary-cellular-automata/?utm_source=atom_feed" rel="related" type="text/html" title="Elementary Cellular Automata" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-fractals/?utm_source=atom_feed" rel="related" type="text/html" title="Animate Your Own Fractals in Python with Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-polar-plot/?utm_source=atom_feed" rel="related" type="text/html" title="Animated polar plot with oceanographic data" />
            
                <id>https://blog.scientific-python.org/matplotlib/ipcc-sr15/</id>
            
            
            <published>2020-12-31T08:32:45+01:00</published>
            <updated>2020-12-31T08:32:45+01:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Many figures in the IPCC SR15 were generated using Matplotlib.
The data and open-source notebooks were published to increase the transparency and reproducibility of the analysis.</blockquote><h2 id="background">Background<a class="headerlink" href="#background" title="Link to this heading">#</a></h2>
<figure style="float: right; ">
  <a href="https://www.ipcc.ch/sr15">
  <img src="IPCC-SR15-cover.jpg" style="width: 180px; "/>
  <figcaption style="text-align: center; color: grey; font-size: small">
  Cover of the IPCC SR15</figcaption></a>
</figure>
<p>The IPCC&rsquo;s <em>Special Report on Global Warming of 1.5°C</em> (SR15), published in October 2018,
presented the latest research on anthropogenic climate change.
It was written in response to the goal, set in the UNFCCC&rsquo;s 2015 &ldquo;Paris Agreement&rdquo;, of</p>
<blockquote>
<p>holding the increase in the global average temperature to well below 2 °C
above pre-industrial levels and to pursue efforts to limit the temperature increase to 1.5 °C [&hellip;].</p>
</blockquote>
<p>cf. <a href="https://unfccc.int/process-and-meetings/the-paris-agreement/the-paris-agreement">Article 2.1.a of the Paris Agreement</a></p>
<p>As part of the SR15 assessment, an ensemble of quantitative, model-based scenarios
was compiled to underpin the scientific analysis.
Many of the headline statements widely reported by media
are based on this scenario ensemble, including the finding that</p>
<blockquote>
<p>global net anthropogenic CO2 emissions decline by ~45% from 2010 levels by 2030</p>
</blockquote>
<p>in all pathways limiting global warming to 1.5°C
(cf. <a href="https://www.ipcc.ch/sr15/chapter/spm/">statement C.1</a> in the <em>Summary For Policymakers</em>).</p>
<h2 id="open-source-notebooks-for-transparency-and-reproducibility-of-the-assessment">Open-source notebooks for transparency and reproducibility of the assessment<a class="headerlink" href="#open-source-notebooks-for-transparency-and-reproducibility-of-the-assessment" title="Link to this heading">#</a></h2>
<p>When preparing the SR15, the authors wanted to go beyond previous reports
not just in the scientific rigor and scope of the analysis,
but also by establishing new standards of openness, transparency and reproducibility.</p>
<p>The scenario ensemble was made accessible via the interactive
<a href="http://data.ene.iiasa.ac.at/iamc-1.5c-explorer/#/workspaces"><em>IAMC 1.5°C Scenario Explorer</em></a>, in line with the
<a href="https://www.go-fair.org/fair-principles/">FAIR principles for scientific data management and stewardship</a>.
The process for compiling, validating and analyzing the scenario ensemble
was described in an open-access manuscript published in <em>Nature Climate Change</em>
(doi: <a href="https://doi.org/10.1038/s41558-018-0317-4">10.1038/s41558-018-0317-4</a>).</p>
<p>In addition, the Jupyter notebooks generating many of the headline statements,
tables and figures (using Matplotlib) were released under an open-source license
to facilitate a better understanding of the analysis
and enable reuse for subsequent research.
The notebooks are available in <a href="https://data.ene.iiasa.ac.at/sr15_scenario_analysis">rendered format</a>
and on <a href="https://github.com/iiasa/ipcc_sr15_scenario_analysis">GitHub</a>.</p>
<figure style="width: 600px ">
  <img src="sr15-fig2.4.png" style="width: 600px; "/>
  <figcaption style="text-align: center; color: grey; font-size: small">
  Figure 2.4 of the IPCC SR15, showing the range of assumptions of socioeconomic drivers<br />
  across the IAMC 1.5°C Scenario Ensemble<br />
  Drawn with Matplotlib, source code available <a href="https://data.ene.iiasa.ac.at/sr15_scenario_analysis/assessment/sr15_2.3.1_range_of_assumptions.html">here</a>
  </figcaption>
</figure>
<figure style="width: 600px ">
  <img src="sr15-fig2.15.png" style="width: 600px; "/>
  <figcaption style="text-align: center; color: grey; font-size: small">
  Figure 2.15 of the IPCC SR15, showing the primary energy development in illustrative pathways<br />
  Drawn with Matplotlib, source code available <a href="https://data.ene.iiasa.ac.at/sr15_scenario_analysis/assessment/sr15_2.4.2.1_primary_energy_marker-scenarios.html">here</a>
  </figcaption>
</figure>
<h2 id="a-package-for-scenario-analysis--visualization">A package for scenario analysis &amp; visualization<a class="headerlink" href="#a-package-for-scenario-analysis--visualization" title="Link to this heading">#</a></h2>
<p>To facilitate reusability of the scripts and plotting utilities
developed for the SR15 analysis, we started the open-source Python package <strong>pyam</strong>
as a toolbox for working with scenarios from integrated-assessment and energy system models.</p>
<p>The package is a wrapper for <a href="https://pandas.pydata.org">pandas</a> and Matplotlib
geared towards several data formats commonly used in energy modelling.
<a href="https://pyam-iamc.readthedocs.io">Read the docs!</a></p>
<p><a href="https://pyam-iamc.readthedocs.io"><img src="pyam-header.png"></a></p>
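<p>As a rough illustration of the kind of data such a toolbox deals with, the sketch below uses plain pandas and Matplotlib rather than pyam itself. It assumes a wide table with <code>model</code>, <code>scenario</code>, <code>region</code>, <code>variable</code> and <code>unit</code> columns plus one column per year. That layout, the names and all numbers are made up for illustration and are not taken from the SR15 scenario ensemble.</p>


<div class="highlight">
  <pre class="chroma"><code>import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical scenario data: every name and number here is invented
id_cols = ["model", "scenario", "region", "variable", "unit"]
wide = pd.DataFrame(
    [
        ["model_a", "scen_low", "World", "Primary Energy", "EJ/yr", 500.0, 520.0, 510.0],
        ["model_a", "scen_high", "World", "Primary Energy", "EJ/yr", 500.0, 560.0, 640.0],
    ],
    columns=id_cols + [2010, 2020, 2030],
)

# Reshape from one column per year to one row per (scenario, year) pair
tidy = wide.melt(id_vars=id_cols, var_name="year", value_name="value")
tidy["year"] = tidy["year"].astype(int)

# Draw one line per (model, scenario) combination
fig, ax = plt.subplots()
for (model, scenario), group in tidy.groupby(["model", "scenario"]):
    group = group.sort_values("year")
    ax.plot(group["year"], group["value"], marker="o", label=f"{model} / {scenario}")

ax.set_xlabel("Year")
ax.set_ylabel("Primary Energy (EJ/yr)")
ax.legend()
fig.savefig("scenario_sketch.png")</code></pre>
</div>
<p>pyam wraps this kind of reshaping and plotting behind its own interface; the documentation linked above describes the actual functions.</p>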
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="academia" label="academia" />
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoD: Developing Matplotlib Entry Paths]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsod-developing-matplotlib-entry-paths/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/codeswitching-visualization/?utm_source=atom_feed" rel="related" type="text/html" title="Visualizing Code-Switching with Step Charts" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_2020_final_work_product/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC 2020 Work Product - Baseline Images Problem" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_5/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 3 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_4/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/elementary-cellular-automata/?utm_source=atom_feed" rel="related" type="text/html" title="Elementary Cellular Automata" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsod-developing-matplotlib-entry-paths/</id>
            
            
            <published>2020-12-08T08:16:42-08:00</published>
            <updated>2020-12-08T08:16:42-08:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>This is my first post contribution to Matplotlib.</blockquote><h1 id="introduction">Introduction<a class="headerlink" href="#introduction" title="Link to this heading">#</a></h1>
<p>This year’s Google Season of Docs (GSoD) gave me the opportunity to work with the open source organization Matplotlib. In early summer, I submitted my proposal, Developing Matplotlib Entry Paths, with the goal of improving the documentation through an alternative approach to writing.</p>
<p>I set out to identify more closely with users by providing real-world contexts for examples and programming. My purpose was to lower the barrier to entry for others beginning to use the Python library by taking an expository approach. I focused on aligning with users based on consistent derived purposes and a foundation of task-based empathy.</p>
<p>The project began during the community bonding phase with learning the fundamentals of building documentation and working with open source code. I later distributed usability testing surveys to the community and consolidated the findings. From these results, I developed two new documents for merging into the Matplotlib repository: a Getting Started introductory tutorial and a lean Style Guide for the documentation.</p>
<h1 id="project-report">Project Report<a class="headerlink" href="#project-report" title="Link to this heading">#</a></h1>
<p>Throughout this year’s Season of Docs with Matplotlib, I learned a great deal about working on open source projects, contributed community surveys and interviews with subject matter experts as part of documentation usability testing, and produced a comprehensive introductory guide to improve entry-level content, along with an initial style guide section.</p>
<p>As a new user of Git and GitHub, I faced a learning curve in building the documentation locally on my machine. Cloning repositories and familiarizing myself with commits and pull requests took the bulk of the first few weeks of this project. When I ran into errors and had to troubleshoot broken branches, it was excellent to be able to lean on my mentors to resolve these issues. Platforms like Gitter, Zoom, and HackMD were key in keeping communication timely and concise. I was fortunate to be able to reach the team for help as soon as I had problems.</p>
<p>With programming, I was not a completely fresh face to Python and Matplotlib. However, installing the library from the source and breaking down functionality to core essentials helped me grow in my understanding of not only the fundamentals, but also the terminology. Tackling everything through my own experience of using Python and then also having suggestions and advice from the development team accelerated the ideas and implementations I aimed to work towards.</p>
<p>The formats and standards of reStructuredText files and Sphinx compatibility were unfamiliar to me at first. While building the documentation and reading through existing content, I learned to make the most of the available features for the material I had in mind for users new to Matplotlib. Making use of tables and embedded code examples allowed me to be more flexible in visual layout and navigation.</p>
<p>During the beginning stages of the project, I was able to incorporate usability testing of the current documentation. By reaching out to communities on Twitter, Reddit, and various Slack channels, I compiled and consolidated findings that helped shape the language and focus of the new content. I summarized and shared the community’s responses, along with separate informational interviews conducted with subject matter experts in my area. These data points helped justify and support decisions about the scope and direction of the language and content.</p>
<p>By the end of the project, I had met our agreed-upon expectations for the documentation. The main goal was a Getting Started tutorial to introduce Matplotlib and give it context for new users. In addition, through the documentation as well as the meetings with the community, we identified a missing element: a Style Guide. Though a comprehensive document for the entire library was out of the scope of the project, I put together, in conjunction with the featured task, a lean version that serves as a foundational resource for writing Matplotlib documentation.</p>
<p>The two sections are part of a current pull request to merge into Matplotlib’s repository. I have already worked through smaller changes to the content and am working with the community in moving forward with the process.</p>
<h1 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h1>
<p>This Season of Docs proposal began as a vision of ideals I hoped to share and work towards with an organization and has become a technical writing experience full of growth and camaraderie. I am pleased with the progress I have made and cannot thank the team enough for the leadership and mentorship they provided. It is fulfilling and rewarding to both appreciate and be appreciated within a team.</p>
<p>In addition, the opportunity put together by the team at Google to foster collaboration among skilled contributors cannot be overstated. Highlighting the accomplishments of these new teams raises the bar for the open source community.</p>
<h1 id="details">Details<a class="headerlink" href="#details" title="Link to this heading">#</a></h1>
<h2 id="acknowledgements">Acknowledgements<a class="headerlink" href="#acknowledgements" title="Link to this heading">#</a></h2>
<p>Special thanks to Emily Hsu, Joe McEwen, and Smriti Singh for their time and responses, fellow Matplotlib Season of Docs writer Bruno Beltran for his insight and guidance, and the Matplotlib development team mentors Tim, Tom, and Hannah for their patience, support, and approachability for helping a new technical writer like me with my own Getting Started.</p>
<h2 id="external-links">External Links<a class="headerlink" href="#external-links" title="Link to this heading">#</a></h2>
<ul>
<li><a href="https://github.com/matplotlib/matplotlib/pull/18873">Getting Started GSoD Pull Request</a></li>
<li><a href="https://docs.google.com/forms/d/e/1FAIpQLSfPX13wXNOV5LM4OoHUYT3xtSZzVQ6I3ZA4cvz5P6DKuph4aw/viewform?usp=sf_link">Matplotlib User Survey</a></li>
<li><a href="https://docs.google.com/spreadsheets/d/1z_bAu7hG-IgtFkM5uPezkUHQvi6gsWKxoDnh0Hz1K5U/edit?usp=sharing">User Survey Responses</a></li>
<li><a href="https://docs.google.com/spreadsheets/d/15EzVNmWVn2SjCUBc-Kt5Y0_entLgvWRMRYy8syt_-Xg/edit?usp=sharing">User Survey Open Questions</a></li>
<li><a href="https://hackmd.io/cSNb2JhrSo26zJGag3bvLg">HackMD GSoD Meeting Agenda</a></li>
</ul>
<h2 id="about-me">About Me<a class="headerlink" href="#about-me" title="Link to this heading">#</a></h2>
<p>My name is <a href="https://www.linkedin.com/in/jeromefuertevillegas/">Jerome Villegas</a> and I&rsquo;m a technical writer based in Seattle. I worked in education and education-adjacent fields for several years before transitioning to the technical communication industry. My career has taken me to Taiwan to teach English and work in publishing, then to New York City to work in higher education, and back to Seattle, where I worked at a private school.</p>
<p>Since leaving my job, I&rsquo;ve taken to supporting my family while studying technical writing at the University of Washington and supplementing that knowledge by learning programming on the side. Together with a former classmate, I have worked with the UX writing community in the Pacific Northwest. We host interview sessions, moderate sessions at conferences, and generate content analyzing trends and patterns in UX/tech writing.</p>
<p>If you&rsquo;d like to see what I&rsquo;ve got going on, you can find work I&rsquo;ve done at my <a href="https://jeromefvillegas.wordpress.com">personal site</a> and see what we&rsquo;re up to at <a href="https://teamshiftj.wordpress.com">shift J</a>. Thanks for reading!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsod" label="GSoD" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Visualizing Code-Switching with Step Charts]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/codeswitching-visualization/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="related" type="text/html" title="Draw all graphs of N nodes" />
                <link href="https://blog.scientific-python.org/matplotlib/elementary-cellular-automata/?utm_source=atom_feed" rel="related" type="text/html" title="Elementary Cellular Automata" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-fractals/?utm_source=atom_feed" rel="related" type="text/html" title="Animate Your Own Fractals in Python with Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-polar-plot/?utm_source=atom_feed" rel="related" type="text/html" title="Animated polar plot with oceanographic data" />
                <link href="https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/?utm_source=atom_feed" rel="related" type="text/html" title="Emoji Mosaic Art" />
            
                <id>https://blog.scientific-python.org/matplotlib/codeswitching-visualization/</id>
            
            
            <published>2020-09-26T19:41:21-07:00</published>
            <updated>2020-09-26T19:41:21-07:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Learn how to easily create step charts through examining the multilingualism of pop group WayV</blockquote><p><img src="/matplotlib/codeswitching-visualization/Image1.png" alt="Frequency of Code-Switching 200403 WayV Instagram Live, a step chart. The figure shows a plot of ‘cumulative number of times of code-switching’ against the ‘duration of Instagram Live (in seconds)’. There are four members in the livestream: Yangyang (represented with a dark red line), Hendery (represented with a pink line), Ten (represented with a light blue line), Kun (represented with a dark blue line)."></p>
<h1 id="introduction">Introduction<a class="headerlink" href="#introduction" title="Link to this heading">#</a></h1>
<p>Code-switching is the practice of alternating between two or more languages in the context of a single conversation, either consciously or unconsciously. As someone who grew up bilingual and is currently learning other languages, I find code-switching a fascinating facet of communication from not only a purely linguistic perspective, but also a social one. In particular, I&rsquo;ve personally found that code-switching often helps build a sense of community and familiarity in a group and that the unique ways in which speakers code-switch with each other greatly contribute to shaping group dynamics.</p>
<p>This is something that&rsquo;s evident in seven-member pop boy group WayV. Aside from their discography, artistry, and group chemistry, WayV is well known among fans and non-fans alike for their multilingualism and code-switching, which many fans have affectionately dubbed &ldquo;WayV language.&rdquo; Every member in the group is fluent in both Mandarin and Korean, and at least one member in the group is fluent in one or more of the following: English, Cantonese, Thai, Wenzhounese, and German. It&rsquo;s an impressive trait that&rsquo;s become a trademark of WayV as they&rsquo;ve quickly drawn a global audience since their debut in January 2019. Their multilingualism is reflected in their music as well. On top of their regular album releases in Mandarin, WayV has also released singles in Korean and English, with their latest single &ldquo;Bad Alive (English Ver.)&rdquo; being a mix of English, Korean, and Mandarin.</p>
<p>As an independent translator who translates WayV content into English, I&rsquo;ve become keenly aware of the true extent and rate of WayV&rsquo;s code-switching when communicating with each other. In a lot of their content, WayV frequently switches between three or more languages every couple of seconds, a phenomenon that can make translating quite challenging at times, but also extremely rewarding and fun. I wanted to be able to present this aspect of WayV in a way that would both highlight their linguistic skills and present this dimension of their group dynamic in a more concrete, quantitative, and visually intuitive manner, beyond just stating that &ldquo;they code-switch a lot.&rdquo; This prompted me to make step charts, which are perfect for displaying data that changes at irregular intervals but remains constant between the changes, in hopes of enriching the viewer&rsquo;s experience and making a potentially abstract concept more understandable and readily consumable. With a step chart, the viewer can see more readily how a group communicates, and cross-sections of the graph allow a rudimentary look into how multilinguals influence each other in code-switching.</p>
<h1 id="tutorial">Tutorial<a class="headerlink" href="#tutorial" title="Link to this heading">#</a></h1>
<p>This tutorial on creating step charts uses one of WayV&rsquo;s livestreams as an example. There were four members in this livestream and a total of eight languages/dialects spoken. I will go through the basic steps of creating a step chart that depicts the frequency of code-switching for just one member. A full code chunk that shows how to layer two or more step chart lines in one graph to depict code-switching for multiple members can be found near the end.</p>
<h2 id="dataset">Dataset<a class="headerlink" href="#dataset" title="Link to this heading">#</a></h2>
<p>First, we import the required libraries and load the data into a Pandas dataframe.</p>
<pre><code>import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
</code></pre>
<p>This dataset includes the timestamp of every switch (in seconds) and the language of switch for one speaker.</p>
<pre><code>df_h = pd.read_csv(&quot;WayVHendery.csv&quot;)
HENDERY = df_h.reset_index()
HENDERY.head()
</code></pre>
<table>
  <thead>
      <tr>
          <th>index</th>
          <th>time</th>
          <th>lang</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>0</td>
          <td>2</td>
          <td>ENG</td>
      </tr>
      <tr>
          <td>1</td>
          <td>3</td>
          <td>KOR</td>
      </tr>
      <tr>
          <td>2</td>
          <td>10</td>
          <td>ENG</td>
      </tr>
      <tr>
          <td>3</td>
          <td>13</td>
          <td>MAND</td>
      </tr>
      <tr>
          <td>4</td>
          <td>15</td>
          <td>ENG</td>
      </tr>
  </tbody>
</table>
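<p>If you do not have the CSV at hand, the first five rows shown above can be rebuilt inline to follow along. This is a minimal sketch: the values are copied from the table, and building the frame by hand is only a stand-in for <code>pd.read_csv</code>. It also shows why <code>reset_index()</code> matters here: the new <code>index</code> column acts as a running counter of switches (starting at 0), which later becomes the y-value of the step chart.</p>

```python
import pandas as pd

# The five rows shown in the table above, typed in by hand instead of read from the CSV
df_h = pd.DataFrame({
    "time": [2, 3, 10, 13, 15],                    # timestamp of each switch, in seconds
    "lang": ["ENG", "KOR", "ENG", "MAND", "ENG"],  # language switched to
})

# reset_index() copies the default 0..n-1 index into a column named "index"
HENDERY = df_h.reset_index()
print(HENDERY)
```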
<h2 id="plotting">Plotting<a class="headerlink" href="#plotting" title="Link to this heading">#</a></h2>
<p>With the dataset loaded, we can now set up our graph by choosing the figure size, DPI, font size, and axis limits. We can also play around with the aesthetics, such as modifying the colors of our plot. These few simple steps easily transform the default all-white graph into a more visually appealing one.</p>
<h3 id="without-customization">Without Customization<a class="headerlink" href="#without-customization" title="Link to this heading">#</a></h3>
<pre><code>fig, ax = plt.subplots(figsize = (20,12))
</code></pre>
<p><img src="/matplotlib/codeswitching-visualization/fig1.png" alt="An all-white graph with the x and y axis defined on the range [0, 1]."></p>
<h3 id="with-customization">With Customization<a class="headerlink" href="#with-customization" title="Link to this heading">#</a></h3>
<pre><code>sns.set(rc={'axes.facecolor':'aliceblue', 'figure.facecolor':'c'})
fig, ax = plt.subplots(figsize = (20,12), dpi = 300)

plt.xlabel(&quot;Duration of Instagram Live (seconds)&quot;, fontsize = 18)
plt.ylabel(&quot;Cumulative Number of Times of Code-Switching&quot;, fontsize = 18)

plt.xlim(0, 570)
plt.ylim(0, 85)
</code></pre>
<p><img src="/matplotlib/codeswitching-visualization/fig2.png" alt="A styled, blank graph with the ‘cumulative number of times of code-switching’ values on the y-axis and the ‘duration of Instagram Live (in seconds)’ values on the x-axis"></p>
<p>Following this, we can easily draw our step chart line with ax.step (the Axes counterpart of matplotlib.pyplot.step), passing in the x and y values and setting the legend label, the color of the line, and its width.</p>
<pre><code>ax.step(HENDERY.time, HENDERY.index, label = &quot;HENDERY&quot;, color = &quot;palevioletred&quot;, linewidth = 4)
</code></pre>
<p><img src="/matplotlib/codeswitching-visualization/fig3.png" alt="A graph with the ‘cumulative number of times of code-switching’ values on the y-axis and the ‘duration of Instagram Live (in seconds)’ values on the x-axis showing a step chart line (in pink) for Hendery."></p>
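<p>The <code>where</code> parameter of <code>ax.step</code> controls on which side of each data point the jump is drawn. The chart above relies on the default, <code>where="pre"</code>. The sketch below uses made-up timestamps (not the livestream data) and <code>where="post"</code>, an alternative in which the line only rises once a switch has actually happened; which one reads better is a matter of taste.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up switch times (seconds) and their running count, for illustration only
times = [2, 3, 10, 13, 15]
counts = [0, 1, 2, 3, 4]

fig, ax = plt.subplots()
# where="post": the count stays flat until the next timestamp, then jumps
(line,) = ax.step(times, counts, where="post", color="palevioletred", linewidth=4)
print(line.get_drawstyle())
```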
<h2 id="labeling">Labeling<a class="headerlink" href="#labeling" title="Link to this heading">#</a></h2>
<p>Of course, we want to know not only how many switches there were and when they occurred, but also to what language the member switched. For this, we can write a for loop that labels each switch with its respective language as recorded in our dataset.</p>
<pre><code>for x,y,z in zip(HENDERY[&quot;time&quot;], HENDERY[&quot;index&quot;], HENDERY[&quot;lang&quot;]):
    label = z
    ax.annotate(label, #text
                 (x,y), #label coordinate
                 textcoords = &quot;offset points&quot;, #how to position text
                 xytext = (15,-5), #distance from text to coordinate (x,y)
                 ha = &quot;center&quot;, #alignment
                 fontsize = 8.5) #font size of text
</code></pre>
<p><img src="/matplotlib/codeswitching-visualization/fig4.png" alt="Language labels for each step on the graph. Languages include English, Korean, Mandarin, and German. The graph has the ‘cumulative number of times of code-switching’ values on the y-axis and the ‘duration of Instagram Live (in seconds)’ values on the x-axis showing a step chart line (in pink) for Hendery."></p>
<h2 id="final-touches">Final Touches<a class="headerlink" href="#final-touches" title="Link to this heading">#</a></h2>
<p>Now add a title, save the graph, and there you have it!</p>
<pre><code>plt.title(&quot;WayV Livestream Code-Switching&quot;, fontsize = 35)

fig.savefig(&quot;wayv_codeswitching.png&quot;, bbox_inches = &quot;tight&quot;, facecolor = fig.get_facecolor())
</code></pre>
<p>Below is the complete code for layering step chart lines for multiple speakers in one graph. You can see how easy it is to take the code for visualizing the code-switching of one speaker and adapt it to visualizing that of multiple speakers. In addition, you can see that I&rsquo;ve intentionally left the title blank so I can incorporate external graphic adjustments after I created the chart in Matplotlib, such as the addition of my social media handle and the use of a specific font I wanted, which you can see in the final graph. With visualizations being all about communicating information, I believe using Matplotlib in conjunction with simple elements of graphic design can be another way to make whatever you&rsquo;re presenting that little bit more effective and personal, especially when you&rsquo;re doing so on social media platforms.</p>
<h2 id="complete-code-for-step-chart-of-multiple-speakers">Complete Code for Step Chart of Multiple Speakers<a class="headerlink" href="#complete-code-for-step-chart-of-multiple-speakers" title="Link to this heading">#</a></h2>
<pre><code># Assumes YANGYANG, HENDERY, TEN, and KUN were each loaded like HENDERY above (read_csv, then reset_index)
# Initialize graph color and size
sns.set(rc={'axes.facecolor':'aliceblue', 'figure.facecolor':'c'})

fig, ax = plt.subplots(figsize = (20,12), dpi = 120)

# Set up axes and labels
plt.xlabel(&quot;Duration of Instagram Live (seconds)&quot;, fontsize = 18)
plt.ylabel(&quot;Cumulative Number of Times of Code-Switching&quot;, fontsize = 18)

plt.xlim(0, 570)
plt.ylim(0, 85)

# Layer step charts for each speaker
ax.step(YANGYANG.time, YANGYANG.index, label = &quot;YANGYANG&quot;, color = &quot;firebrick&quot;, linewidth = 4)
ax.step(HENDERY.time, HENDERY.index, label = &quot;HENDERY&quot;, color = &quot;palevioletred&quot;, linewidth = 4)
ax.step(TEN.time, TEN.index, label = &quot;TEN&quot;, color = &quot;mediumpurple&quot;, linewidth = 4)
ax.step(KUN.time, KUN.index, label = &quot;KUN&quot;, color = &quot;mediumblue&quot;, linewidth = 4)

# Add legend
ax.legend(fontsize = 17)

# Label each data point with the language switch
for i in (KUN, TEN, HENDERY, YANGYANG): #for each dataset
    for x,y,z in zip(i[&quot;time&quot;], i[&quot;index&quot;], i[&quot;lang&quot;]): #looping within the dataset
        label = z
        ax.annotate(label, #text
                     (x,y), #label coordinate
                     textcoords = &quot;offset points&quot;, #how to position text
                     xytext = (15,-5), #distance from text to coordinate (x,y)
                     ha = &quot;center&quot;, #alignment
                     fontsize = 8.5) #font size of text

# Add title (blank to leave room for external graphics)
plt.title(&quot;\n\n&quot;, fontsize = 35)

# Save figure
fig.savefig(&quot;wayv_codeswitching.png&quot;, bbox_inches = &quot;tight&quot;, facecolor = fig.get_facecolor())
</code></pre>
<p><img src="/matplotlib/codeswitching-visualization/Image1.png" alt="Frequency of Code-Switching 200403 WayV Instagram Live, a step chart. The figure shows a plot of ‘cumulative number of times of code-switching’ against the ‘duration of Instagram Live (in seconds)’. There are four members in the livestream: Yangyang (represented with a dark red line), Hendery (represented with a pink line), Ten (represented with a light blue line), Kun (represented with a dark blue line)."></p>
<p>Languages/dialects: Korean (KOR), English (ENG), Mandarin (MAND), German (GER), Cantonese (CANT), Hokkien (HOKK), Teochew (TEO), Thai (THAI)</p>
<p>186 total switches! That&rsquo;s approximately one code-switch in the group every 2.95 seconds.</p>
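<p>As a quick sanity check on that figure, assuming it was computed as stream duration divided by total switches (the exact duration is not stated here), 186 switches at 2.95 seconds each imply a stream of roughly 549 seconds, which is consistent with the x-axis limit of 570 used above.</p>

```python
total_switches = 186        # from the text above
seconds_per_switch = 2.95   # from the text above

# Implied stream length if the average was computed as duration / switches
implied_duration = total_switches * seconds_per_switch
print(round(implied_duration, 1))


def average_interval(duration_seconds, n_switches):
    """Average number of seconds between switches."""
    return duration_seconds / n_switches


print(round(average_interval(implied_duration, total_switches), 2))
```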
<p>And voilà! There you have it: a brief guide on how to make step charts. While I utilized step charts here to visualize code-switching, you can use them to visualize whatever data you would like. Please feel free to contact me <a href="https://twitter.com/WayVSubs2019">here</a> if you have any questions or comments. I hope you enjoyed this tutorial, and thank you so much for reading!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="graphs" label="graphs" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC 2020 Work Product - Baseline Images Problem]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_2020_final_work_product/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_5/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 3 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_4/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_3/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_2/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_1/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 1" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_2020_final_work_product/</id>
            
            
            <published>2020-08-16T09:47:51+05:30</published>
            <updated>2020-08-16T09:47:51+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Final Work Product Report for the Google Summer of Code 2020 for the Baseline Images Problem</blockquote><p>Google Summer of Code 2020 is complete. Hurray!! This post discusses the progress made over the three months of the coding period, from 1 June to 24 August 2020, on the project <code>Baseline Images Problem</code> for the <code>matplotlib</code> organization under the umbrella of <code>NumFOCUS</code>.</p>
<h2 id="project-details">Project Details:<a class="headerlink" href="#project-details" title="Link to this heading">#</a></h2>
<p>This project helps with the difficulty in adding/modifying tests which require a baseline image. Baseline images are problematic because</p>
<ul>
<li>Baseline images cause the repo size to grow rather quickly.</li>
<li>Baseline images force matplotlib contributors to pin to a somewhat old version of FreeType because nearly every release of FreeType causes tiny rasterization changes that would entail regenerating all baseline images (and thus cause even more repo size growth).</li>
</ul>
<p>So, the idea is to not store the baseline images in the repository, but instead to generate them from the existing tests.</p>
<h2 id="creation-of-the-matplotlib_baseline_images-package">Creation of the matplotlib_baseline_images package<a class="headerlink" href="#creation-of-the-matplotlib_baseline_images-package" title="Link to this heading">#</a></h2>
<p>We created the <code>matplotlib_baseline_images</code> package. It is included in the sub-wheels directory so that more packages can be added to the same directory in the future, if needed. The <code>matplotlib_baseline_images</code> package contains baseline images for both <code>matplotlib</code> and <code>mpl_toolkits</code>.
The package can be installed with <code>python3 -mpip install matplotlib_baseline_images</code>.</p>
<h2 id="creation-of-the-matplotlib-baseline-image-generation-flag">Creation of the matplotlib baseline image generation flag<a class="headerlink" href="#creation-of-the-matplotlib-baseline-image-generation-flag" title="Link to this heading">#</a></h2>
<p>In the previous months, we successfully created the <code>generate_missing</code> command line flag for baseline image generation for <code>matplotlib</code> and <code>mpl_toolkits</code>. Initially, it generated the <code>matplotlib</code> and the <code>mpl_toolkits</code> baseline images. We have since modified the existing flow so that it also generates any baseline images that are missing after fetching changes from the <code>master</code> branch with <code>git pull</code> or after running <code>git checkout -b feature_branch</code>.</p>
<p>Now, both image generation on a fresh install of matplotlib and generation of missing baseline images work with <code>python3 -mpytest lib/matplotlib --matplotlib_baseline_image_generation</code> for the <code>lib/matplotlib</code> folder and <code>python3 -mpytest lib/mpl_toolkits --matplotlib_baseline_image_generation</code> for the <code>lib/mpl_toolkits</code> folder.</p>
<h2 id="documentation">Documentation<a class="headerlink" href="#documentation" title="Link to this heading">#</a></h2>
<p>We have written documentation explaining the following scenarios:</p>
<ol>
<li>How to generate the baseline images on a fresh install of matplotlib?</li>
<li>How to generate the missing baseline images on fetching changes from master?</li>
<li>How to install the <code>matplotlib_baseline_images</code> package to be used for testing by the developer?</li>
<li>How to intentionally change an image?</li>
</ol>
<h2 id="links-to-the-work-done">Links to the work done<a class="headerlink" href="#links-to-the-work-done" title="Link to this heading">#</a></h2>
<ul>
<li><a href="https://github.com/matplotlib/matplotlib/issues/16447">Issue</a></li>
<li><a href="https://github.com/matplotlib/matplotlib/pull/17793">Pull Request</a></li>
<li><a href="/tags/gsoc/">Blog Posts</a></li>
</ul>
<h2 id="mentors">Mentors<a class="headerlink" href="#mentors" title="Link to this heading">#</a></h2>
<ul>
<li>Thomas A Caswell</li>
<li>Hannah</li>
<li>Antony Lee</li>
</ul>
<p>I am grateful to be part of such a great community. The project is really interesting and challenging :)</p>
<p>Thanks to Thomas, Antony and Hannah for helping me complete this project.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC Coding Phase 3 Blog 1]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_5/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_4/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_3/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_2/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_1/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/introductory-gsoc2020-post/?utm_source=atom_feed" rel="related" type="text/html" title="Sidharth Bansal joined as GSoC&#39;20 intern" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_5/</id>
            
            
            <published>2020-08-08T09:47:51+05:30</published>
            <updated>2020-08-08T09:47:51+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Progress Report for the first half of the Google Summer of Code 2020 Phase 3 for the Baseline Images Problem</blockquote><p>Google Summer of Code 2020&rsquo;s second evaluation is complete. I passed!!! Hurray! We are now midway through the final coding period. This post discusses the progress made in the first two weeks of the third coding period, from 26 July to 9 August 2020.</p>
<h2 id="completion-of-the-modification-logic-for-the-matplotlib_baseline_images-package">Completion of the modification logic for the matplotlib_baseline_images package<a class="headerlink" href="#completion-of-the-modification-logic-for-the-matplotlib_baseline_images-package" title="Link to this heading">#</a></h2>
<p>In the previous months, we successfully created the <code>matplotlib_baseline_image_generation</code> command line flag for baseline image generation for <code>matplotlib</code> and <code>mpl_toolkits</code>. It generated the matplotlib and the matplotlib toolkit baseline images successfully. We have now modified the existing flow to generate any baseline images that are missing after fetching changes from the <code>master</code> branch with <code>git pull</code> or after running <code>git checkout -b feature_branch</code>.</p>
<p>We initially thought of creating a command line flag <code>generate_baseline_images_for_test &quot;test_a,test_b&quot;</code>, but on analyzing the approach we concluded that the developer would not know which test names to pass along with the flag. So, we tried generating the missing images with a <code>generate_missing</code> flag that takes no test names. This worked successfully.</p>
<h2 id="adopting-reusability-and-do-not-repeat-yourself-dry-principles">Adopting reusability and Do not Repeat Yourself (DRY) Principles<a class="headerlink" href="#adopting-reusability-and-do-not-repeat-yourself-dry-principles" title="Link to this heading">#</a></h2>
<p>Later, we refactored the <code>matplotlib_baseline_image_generation</code> and <code>generate_missing</code> command line flags into a single command line flag, <code>matplotlib_baseline_image_generation</code>, as the logic was similar for both of them. Now, both image generation on a fresh install of matplotlib and generation of missing baseline images work with <code>python3 -mpytest lib/matplotlib --matplotlib_baseline_image_generation</code> for the <code>lib/matplotlib</code> folder and <code>python3 -mpytest lib/mpl_toolkits --matplotlib_baseline_image_generation</code> for the <code>lib/mpl_toolkits</code> folder.</p>
<h2 id="writing-the-documentation">Writing the documentation<a class="headerlink" href="#writing-the-documentation" title="Link to this heading">#</a></h2>
<p>We have written documentation explaining the following scenarios:</p>
<ol>
<li>How to generate the baseline images on a fresh install of matplotlib?</li>
<li>How to generate the missing baseline images on fetching changes from master?</li>
<li>How to install the <code>matplotlib_baseline_images</code> package to be used for testing by the developer?</li>
<li>How to intentionally change an image?</li>
</ol>
<h2 id="refactoring-and-improving-the-code-quality-before-merging">Refactoring and improving the code quality before merging<a class="headerlink" href="#refactoring-and-improving-the-code-quality-before-merging" title="Link to this heading">#</a></h2>
<p>Right now, we are refactoring the code and maintaining a clean git history. The <a href="https://github.com/matplotlib/matplotlib/pull/17793">current PR</a> is under review, and I am working on the suggested changes. We are trying to get it merged :)</p>
<h2 id="daily-meet-ups">Daily Meet-ups<a class="headerlink" href="#daily-meet-ups" title="Link to this heading">#</a></h2>
<p>Meetings are held Monday to Thursday at <a href="https://everytimezone.com/">11:00pm IST</a> via Zoom. Meeting notes are kept on HackMD.</p>
<p>I am grateful to be part of such a great community. The project is really interesting and challenging :) Thanks to Thomas, Antony and Hannah for helping me so far.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC Coding Phase 2 Blog 2]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_4/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_3/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 2 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_2/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_1/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/introductory-gsoc2020-post/?utm_source=atom_feed" rel="related" type="text/html" title="Sidharth Bansal joined as GSoC&#39;20 intern" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-rsef/?utm_source=atom_feed" rel="related" type="text/html" title="Elliott Sales de Andrade hired as Matplotlib Software Research Engineering Fellow" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_4/</id>
            
            
            <published>2020-07-23T19:47:51+05:30</published>
            <updated>2020-07-23T19:47:51+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Progress Report for the second half of the Google Summer of Code 2020 Phase 2 for the Baseline Images Problem</blockquote><p>Google Summer of Code 2020&rsquo;s second evaluation is nearly complete, and we are about to start the final coding phase. This post discusses the progress made in the last two weeks of the second coding period, from 13 July to 26 July 2020.</p>
<h2 id="modular-approach-towards-removal-of-matplotlib-baseline-images">Modular approach towards removal of matplotlib baseline images<a class="headerlink" href="#modular-approach-towards-removal-of-matplotlib-baseline-images" title="Link to this heading">#</a></h2>
<p>We have divided the work into two parts, as discussed in the <a href="../gsoc_coding_phase_blog_3/">previous blog</a>. The first part is the generation of the baseline images, discussed below. The second part is the modification of the baseline images, which will be implemented in the last phase of the Google Summer of Code 2020.</p>
<h2 id="generation-of-the-matplotlib-baseline-images">Generation of the matplotlib baseline images<a class="headerlink" href="#generation-of-the-matplotlib-baseline-images" title="Link to this heading">#</a></h2>
<p>Now, we have started removing the use of the <code>matplotlib_baseline_images</code> package. After the changes proposed in the <a href="https://github.com/matplotlib/matplotlib/pull/17557">previous PR</a>, the developer will have no baseline images on a fresh install of matplotlib. So, the developer will need to generate the matplotlib baseline images locally before running the matplotlib tests.
The images can be generated by the image comparison tests by passing the <code>matplotlib_baseline_image_generation</code> flag on the command line. Once these images have been generated for the first time, they can be used as the baseline images for later comparisons. This is the main principle adopted.</p>
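<p>The generate-once-then-compare principle can be sketched in a few lines. This is not matplotlib&rsquo;s implementation: <code>check_image</code> is a hypothetical helper, the file names are made up, and a byte-for-byte comparison stands in for the tolerance-based image comparison that the real tests use.</p>

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def check_image(result: Path, baseline: Path, generate: bool) -> str:
    """Hypothetical helper: store a missing baseline, otherwise compare to it."""
    if generate and not baseline.exists():
        baseline.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(result, baseline)
        return "generated"
    if not baseline.exists():
        return "missing"
    # Byte-for-byte comparison stands in for a real image comparison with tolerance
    same = filecmp.cmp(result, baseline, shallow=False)
    return "match" if same else "mismatch"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    result = root / "result_images" / "plot.png"      # made-up paths
    baseline = root / "baseline_images" / "plot.png"
    result.parent.mkdir(parents=True)
    result.write_bytes(b"fake image bytes")

    first = check_image(result, baseline, generate=True)    # no baseline yet
    second = check_image(result, baseline, generate=False)  # baseline now exists
    result.write_bytes(b"changed image bytes")              # simulate a changed plot
    third = check_image(result, baseline, generate=False)

print(first, second, third)
```

<p>The first call stores the baseline, and later calls only compare against it, so a changed result is reported instead of silently replacing the stored image.</p>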
<h2 id="completion-of-the-generation-of-images-for-the-matplotlib-directory">Completion of the generation of images for the matplotlib directory<a class="headerlink" href="#completion-of-the-generation-of-images-for-the-matplotlib-directory" title="Link to this heading">#</a></h2>
<p>We successfully created the <code>matplotlib_baseline_image_generation</code> flag at the beginning of the second evaluation period, but the images were not created in the <code>baseline_images</code> directory inside the <code>matplotlib</code> and <code>mpl_toolkits</code> directories; instead, they were created in the <code>result_images</code> directory. So, we implemented this functionality. The images are now created directly in the <code>lib/matplotlib/tests/baseline_images</code> directory during the baseline image generation step. That step uses the <code>python3 -mpytest lib/matplotlib --matplotlib_baseline_image_generation</code> command. Afterwards, running the tests with <code>python3 -mpytest lib/matplotlib</code> will start the image comparison.</p>
<p>Right now, the <code>matplotlib_baseline_image_generation</code> flag works for the <code>matplotlib</code> directory. We are trying to achieve the same functionality for the <code>mpl_toolkits</code> directory.</p>
<h2 id="future-goals">Future Goals<a class="headerlink" href="#future-goals" title="Link to this heading">#</a></h2>
<p>Once the generation of the baseline images for the <code>mpl_toolkits</code> directory is completed in the <a href="https://github.com/matplotlib/matplotlib/pull/17793">current PR</a>, we will move on to the modification of the baseline images in the third coding phase. Modification will be further divided into two subtasks, both to be implemented in the last phase of GSoC: addition of the new baseline image and deletion of the previous baseline image.</p>
<h2 id="daily-meet-ups">Daily Meet-ups<a class="headerlink" href="#daily-meet-ups" title="Link to this heading">#</a></h2>
<p>Meetings are held Monday to Thursday at <a href="https://everytimezone.com/">11:00pm IST</a> via Zoom. Meeting notes are kept on HackMD.</p>
<p>I am grateful to be part of such a great community. The project is really interesting and challenging :) Thanks to Thomas, Antony and Hannah for helping me so far.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Elementary Cellular Automata]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/elementary-cellular-automata/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/animated-fractals/?utm_source=atom_feed" rel="related" type="text/html" title="Animate Your Own Fractals in Python with Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-polar-plot/?utm_source=atom_feed" rel="related" type="text/html" title="Animated polar plot with oceanographic data" />
                <link href="https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/?utm_source=atom_feed" rel="related" type="text/html" title="Emoji Mosaic Art" />
                <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="related" type="text/html" title="Draw all graphs of N nodes" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib Cyberpunk Style" />
            
                <id>https://blog.scientific-python.org/matplotlib/elementary-cellular-automata/</id>
            
            
            <published>2020-07-14T15:48:23-04:00</published>
            <updated>2020-07-14T15:48:23-04:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A brief tour through the world of elementary cellular automata</blockquote><p><a href="https://en.wikipedia.org/wiki/Cellular_automaton">Cellular automata</a> are discrete models, typically on a grid, which evolve in time. Each grid cell takes one of a finite number of states, such as 0 or 1, which is updated based on a certain set of rules. A specific cell uses information from the surrounding cells, called its <em>neighborhood</em>, to determine what changes should be made. In general, cellular automata can be defined in any number of dimensions. A famous two-dimensional example is <a href="https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life">Conway&rsquo;s Game of Life</a> in which cells &ldquo;live&rdquo; and &ldquo;die&rdquo;, sometimes producing beautiful patterns.</p>
<p>In this post we will be looking at a one-dimensional example known as an <a href="https://en.wikipedia.org/wiki/Elementary_cellular_automaton">elementary cellular automaton</a>, popularized by <a href="https://en.wikipedia.org/wiki/Stephen_Wolfram">Stephen Wolfram</a> in the 1980s.</p>
<p><img src="/matplotlib/elementary-cellular-automata/ca-bar.png" alt="A row of cells, arranged side by side, each of which is colored black or white. The first few cells are white, then one black cell, and alternates in a similar pattern."></p>
<p>Imagine a row of cells, arranged side by side, each of which is colored black or white. We label black cells 1 and white cells 0, resulting in an array of bits. As an example, let&rsquo;s consider a random array of 20 bits.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">RandomState</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">data</span> <span class="o">=</span> <span class="n">rng</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">20</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">data</span><span class="p">)</span></span></span></code></pre>
</div>
<pre><code>[0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 1 1 1 0]
</code></pre>
<p>To update the state of our cellular automaton we will need to define a set of rules.
A given cell \(C\) only knows about the state of its left and right neighbors, labeled \(L\) and \(R\) respectively. We can define a function or rule, \(f(L, C, R)\), which maps the states of these three cells to either 0 or 1.</p>
<p>Since our input cells are binary values there are \(2^3=8\) possible inputs into the function.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">8</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">binary_repr</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span></span></span></code></pre>
</div>
<pre><code>000
001
010
011
100
101
110
111
</code></pre>
<p>For each input triplet, we can assign 0 or 1 to the output. The output of \(f\) is the value which will replace the current cell \(C\) in the next time step. Since each of the 8 inputs is assigned an output independently, there are \(2^{2^3} = 2^8 = 256\) possible rules for updating a cell. Stephen Wolfram introduced a naming convention, now known as the <a href="https://en.wikipedia.org/wiki/Wolfram_code">Wolfram Code</a>, for the update rules in which each rule is represented by an 8-bit binary number.</p>
<p>For example, &ldquo;Rule 30&rdquo; can be constructed by first converting the rule number to binary and then building an array of its bits:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">rule_number</span> <span class="o">=</span> <span class="mi">30</span>
</span></span><span class="line"><span class="cl"><span class="n">rule_string</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">binary_repr</span><span class="p">(</span><span class="n">rule_number</span><span class="p">,</span> <span class="mi">8</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">rule</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="nb">int</span><span class="p">(</span><span class="n">bit</span><span class="p">)</span> <span class="k">for</span> <span class="n">bit</span> <span class="ow">in</span> <span class="n">rule_string</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">rule</span><span class="p">)</span></span></span></code></pre>
</div>
<pre><code>[0 0 0 1 1 1 1 0]
</code></pre>
<p>By convention the Wolfram code associates the leading bit with &lsquo;111&rsquo; and the final bit with &lsquo;000&rsquo;. For rule 30 the relationship between the input, rule index and output is as follows:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">8</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">triplet</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">binary_repr</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;input:</span><span class="si">{</span><span class="n">triplet</span><span class="si">}</span><span class="s2">, index:</span><span class="si">{</span><span class="mi">7</span><span class="o">-</span><span class="n">i</span><span class="si">}</span><span class="s2">, output </span><span class="si">{</span><span class="n">rule</span><span class="p">[</span><span class="mi">7</span><span class="o">-</span><span class="n">i</span><span class="p">]</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<pre><code>input:000, index:7, output 0
input:001, index:6, output 1
input:010, index:5, output 1
input:011, index:4, output 1
input:100, index:3, output 1
input:101, index:2, output 0
input:110, index:1, output 0
input:111, index:0, output 0
</code></pre>
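<p>The table above also shows where the rule number comes from. As a minimal sketch (the names <code>total</code> and <code>value</code> are only illustrative), reading each input triplet as a binary number gives the power of two that its output bit contributes, and summing those contributions should recover the rule number:</p>

```python
import numpy as np

rule_number = 30
rule = np.array([int(bit) for bit in np.binary_repr(rule_number, 8)])

# The input triplet, read as the binary number `value`, selects the
# output stored at index 7 - value. That output bit is worth 2**value.
total = 0
for value in range(8):
    output = int(rule[7 - value])
    total += output * 2**value

print(total)  # 30
```

<p>Running the same loop with any other <code>rule_number</code> between 0 and 255 should reproduce that number.</p>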
<p>We can define a function which maps the input cell information to the associated rule index. Essentially we are converting the binary input to decimal and subtracting it from 7 to match the ordering of the Wolfram code.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">rule_index</span><span class="p">(</span><span class="n">triplet</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">L</span><span class="p">,</span> <span class="n">C</span><span class="p">,</span> <span class="n">R</span> <span class="o">=</span> <span class="n">triplet</span>
</span></span><span class="line"><span class="cl">    <span class="n">index</span> <span class="o">=</span> <span class="mi">7</span> <span class="o">-</span> <span class="p">(</span><span class="mi">4</span> <span class="o">*</span> <span class="n">L</span> <span class="o">+</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">C</span> <span class="o">+</span> <span class="n">R</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">index</span><span class="p">)</span></span></span></code></pre>
</div>
<p>Now we can take in any input and look up the output based on our rule, for example:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">rule</span><span class="p">[</span><span class="n">rule_index</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">))]</span></span></span></code></pre>
</div>
<pre><code>0
</code></pre>
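<p>As a sanity check on the lookup, Rule 30 is commonly written as the Boolean expression \(L \oplus (C \lor R)\). The sketch below redefines <code>rule</code> and <code>rule_index</code> so that it runs on its own, and compares the table lookup with that expression for all eight inputs. The name <code>matches</code> is only illustrative.</p>

```python
from itertools import product

import numpy as np

rule = np.array([int(bit) for bit in np.binary_repr(30, 8)])


def rule_index(triplet):
    L, C, R = triplet
    return int(7 - (4 * L + 2 * C + R))


# Compare the table lookup with L XOR (C OR R) for every possible input.
matches = all(
    int(rule[rule_index((L, C, R))]) == (L ^ (C | R))
    for L, C, R in product((0, 1), repeat=3)
)
print(matches)  # True
```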
<p>Finally, we can use NumPy to create a data structure containing all the triplets for our state array and apply the function across the appropriate axis to determine our new state.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">all_triplets</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">stack</span><span class="p">([</span><span class="n">np</span><span class="o">.</span><span class="n">roll</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">data</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">roll</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)])</span>
</span></span><span class="line"><span class="cl"><span class="n">new_data</span> <span class="o">=</span> <span class="n">rule</span><span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">apply_along_axis</span><span class="p">(</span><span class="n">rule_index</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">all_triplets</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">new_data</span><span class="p">)</span></span></span></code></pre>
</div>
<pre><code>[1 1 1 0 1 1 1 0 1 1 1 0 0 1 1 0 1 0 0 1]
</code></pre>
<p>That is the process for a single update of our cellular automaton.</p>
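<p><code>np.apply_along_axis</code> calls <code>rule_index</code> once per cell from Python. Because the index is plain arithmetic, the same update can also be written with whole-array operations. A minimal sketch, assuming the same 20-bit state and Rule 30 as above (the names <code>left</code>, <code>right</code> and <code>index</code> are illustrative):</p>

```python
import numpy as np

data = np.array([0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0])
rule = np.array([int(bit) for bit in np.binary_repr(30, 8)])

# np.roll wraps around, so the first and last cells are neighbors.
left = np.roll(data, 1)
right = np.roll(data, -1)

# Same formula as rule_index, applied to every cell at once.
index = 7 - (4 * left + 2 * data + right)
new_data = rule[index]
print(new_data)
```

<p>This should print the same array as the <code>apply_along_axis</code> version.</p>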
<p>To do many updates and record the state over time, we will create a function.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">CA_run</span><span class="p">(</span><span class="n">initial_state</span><span class="p">,</span> <span class="n">n_steps</span><span class="p">,</span> <span class="n">rule_number</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">rule_string</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">binary_repr</span><span class="p">(</span><span class="n">rule_number</span><span class="p">,</span> <span class="mi">8</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">rule</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="nb">int</span><span class="p">(</span><span class="n">bit</span><span class="p">)</span> <span class="k">for</span> <span class="n">bit</span> <span class="ow">in</span> <span class="n">rule_string</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">m_cells</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">initial_state</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">CA_run</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="n">n_steps</span><span class="p">,</span> <span class="n">m_cells</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">CA_run</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="p">:]</span> <span class="o">=</span> <span class="n">initial_state</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">step</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">n_steps</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">all_triplets</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">stack</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span>
</span></span><span class="line"><span class="cl">                <span class="n">np</span><span class="o">.</span><span class="n">roll</span><span class="p">(</span><span class="n">CA_run</span><span class="p">[</span><span class="n">step</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="p">:],</span> <span class="mi">1</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">                <span class="n">CA_run</span><span class="p">[</span><span class="n">step</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="p">:],</span>
</span></span><span class="line"><span class="cl">                <span class="n">np</span><span class="o">.</span><span class="n">roll</span><span class="p">(</span><span class="n">CA_run</span><span class="p">[</span><span class="n">step</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="p">:],</span> <span class="o">-</span><span class="mi">1</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">            <span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">CA_run</span><span class="p">[</span><span class="n">step</span><span class="p">,</span> <span class="p">:]</span> <span class="o">=</span> <span class="n">rule</span><span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">apply_along_axis</span><span class="p">(</span><span class="n">rule_index</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">all_triplets</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">CA_run</span></span></span></code></pre>
</div>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">initial</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">data</span> <span class="o">=</span> <span class="n">CA_run</span><span class="p">(</span><span class="n">initial</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">30</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">data</span><span class="p">)</span></span></span></code></pre>
</div>
<pre><code>[[0. 1. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 1. 1. 0.]
 [1. 1. 1. 0. 1. 1. 1. 0. 1. 1. 1. 0. 0. 1. 1. 0. 1. 0. 0. 1.]
 [0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 1. 1. 1. 0. 0. 1. 1. 1. 1.]
 [1. 0. 0. 1. 1. 1. 0. 1. 1. 1. 1. 1. 0. 0. 1. 1. 1. 0. 0. 0.]
 [1. 1. 1. 1. 0. 0. 0. 1. 0. 0. 0. 0. 1. 1. 1. 0. 0. 1. 0. 1.]
 [0. 0. 0. 0. 1. 0. 1. 1. 1. 0. 0. 1. 1. 0. 0. 1. 1. 1. 0. 1.]
 [1. 0. 0. 1. 1. 0. 1. 0. 0. 1. 1. 1. 0. 1. 1. 1. 0. 0. 0. 1.]
 [0. 1. 1. 1. 0. 0. 1. 1. 1. 1. 0. 0. 0. 1. 0. 0. 1. 0. 1. 1.]
 [0. 1. 0. 0. 1. 1. 1. 0. 0. 0. 1. 0. 1. 1. 1. 1. 1. 0. 1. 0.]
 [1. 1. 1. 1. 1. 0. 0. 1. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0. 1. 1.]]
</code></pre>
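<p>The trailing dots in the printed output appear because <code>np.zeros</code> creates a floating-point array by default. For a quick look in the terminal, the rows can be cast to integers and drawn as text. A small sketch using the first two rows printed above, with <code>#</code> for 1 and <code>.</code> for 0 (an arbitrary choice of characters):</p>

```python
import numpy as np

# First two rows of the simulation output, stored as floats as in CA_run.
rows = np.array(
    [
        [0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
        [1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1],
    ],
    dtype=float,
)

# Cast to integers, then map each cell to a character.
lines = ["".join("#" if cell else "." for cell in row) for row in rows.astype(int)]
for line in lines:
    print(line)
```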
<h2 id="lets-get-visual">Let&rsquo;s Get Visual<a class="headerlink" href="#lets-get-visual" title="Link to this heading">#</a></h2>
<p>For larger simulations, interesting patterns start to emerge. To visualize our simulation results we will use the <code>ax.matshow</code> function.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">rcParams</span><span class="p">[</span><span class="s2">&#34;image.cmap&#34;</span><span class="p">]</span> <span class="o">=</span> <span class="s2">&#34;binary&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">RandomState</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">data</span> <span class="o">=</span> <span class="n">CA_run</span><span class="p">(</span><span class="n">rng</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">300</span><span class="p">),</span> <span class="mi">150</span><span class="p">,</span> <span class="mi">30</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="mi">9</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">matshow</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="kc">False</span><span class="p">)</span></span></span></code></pre>
</div>
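<p><img src="/matplotlib/elementary-cellular-automata/output_18_0.png" alt="Black-and-white image of a Rule 30 simulation: 300 cells evolved for 150 steps from a random initial state."></p>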
<h2 id="learning-the-rules">Learning the Rules<a class="headerlink" href="#learning-the-rules" title="Link to this heading">#</a></h2>
<p>With the code set up to produce the simulation, we can now start to explore the properties of these different rules. Wolfram separated the rules into four classes which are outlined below.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">plot_CA_class</span><span class="p">(</span><span class="n">rule_list</span><span class="p">,</span> <span class="n">class_label</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">RandomState</span><span class="p">(</span><span class="n">seed</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="p">,</span> <span class="n">axs</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="mi">1</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">rule_list</span><span class="p">),</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mf">3.5</span><span class="p">),</span> <span class="n">constrained_layout</span><span class="o">=</span><span class="kc">True</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">initial</span> <span class="o">=</span> <span class="n">rng</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">ax</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">axs</span><span class="o">.</span><span class="n">ravel</span><span class="p">()):</span>
</span></span><span class="line"><span class="cl">        <span class="n">data</span> <span class="o">=</span> <span class="n">CA_run</span><span class="p">(</span><span class="n">initial</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="n">rule_list</span><span class="p">[</span><span class="n">i</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">set_title</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;Rule </span><span class="si">{</span><span class="n">rule_list</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">matshow</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="kc">False</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="n">class_label</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">16</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">fig</span><span class="p">,</span> <span class="n">axs</span></span></span></code></pre>
</div>
<h3 id="class-one">Class One<a class="headerlink" href="#class-one" title="Link to this heading">#</a></h3>
<p>Cellular automata which rapidly converge to a uniform state.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">_</span> <span class="o">=</span> <span class="n">plot_CA_class</span><span class="p">([</span><span class="mi">4</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">172</span><span class="p">],</span> <span class="s2">&#34;Class One&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
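<p><img src="/matplotlib/elementary-cellular-automata/output_22_0.png" alt="Three side-by-side panels titled Rule 4, Rule 32 and Rule 172 under the heading Class One, each showing 100 cells evolved for 100 steps."></p>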
<h3 id="class-two">Class Two<a class="headerlink" href="#class-two" title="Link to this heading">#</a></h3>
<p>Cellular automata which rapidly converge to a repetitive or stable state.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">_</span> <span class="o">=</span> <span class="n">plot_CA_class</span><span class="p">([</span><span class="mi">50</span><span class="p">,</span> <span class="mi">108</span><span class="p">,</span> <span class="mi">173</span><span class="p">],</span> <span class="s2">&#34;Class Two&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
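<p><img src="/matplotlib/elementary-cellular-automata/output_24_0.png" alt="Three side-by-side panels titled Rule 50, Rule 108 and Rule 173 under the heading Class Two, each showing 100 cells evolved for 100 steps."></p>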
<h3 id="class-three">Class Three<a class="headerlink" href="#class-three" title="Link to this heading">#</a></h3>
<p>Cellular automata which appear to remain in a random state.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">_</span> <span class="o">=</span> <span class="n">plot_CA_class</span><span class="p">([</span><span class="mi">60</span><span class="p">,</span> <span class="mi">106</span><span class="p">,</span> <span class="mi">150</span><span class="p">],</span> <span class="s2">&#34;Class Three&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
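<p><img src="/matplotlib/elementary-cellular-automata/output_26_0.png" alt="Three side-by-side panels titled Rule 60, Rule 106 and Rule 150 under the heading Class Three, each showing 100 cells evolved for 100 steps."></p>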
<h3 id="class-four">Class Four<a class="headerlink" href="#class-four" title="Link to this heading">#</a></h3>
<p>Cellular automata which form areas of repetitive or stable states, but also form structures that interact with each other in complicated ways.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">_</span> <span class="o">=</span> <span class="n">plot_CA_class</span><span class="p">([</span><span class="mi">54</span><span class="p">,</span> <span class="mi">62</span><span class="p">,</span> <span class="mi">110</span><span class="p">],</span> <span class="s2">&#34;Class Four&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
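<p><img src="/matplotlib/elementary-cellular-automata/output_28_0.png" alt="Three side-by-side panels titled Rule 54, Rule 62 and Rule 110 under the heading Class Four, each showing 100 cells evolved for 100 steps."></p>
<p>Amazingly, the interacting structures which emerge from rule 110 have been shown to be capable of <a href="https://en.wikipedia.org/wiki/Turing_machine">universal computation</a>.</p>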
<p>In all the examples above a random initial state was used, but another interesting case is when a single 1 is initialized with all other values set to zero.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">initial</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="mi">300</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">initial</span><span class="p">[</span><span class="mi">300</span> <span class="o">//</span> <span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl"><span class="n">data</span> <span class="o">=</span> <span class="n">CA_run</span><span class="p">(</span><span class="n">initial</span><span class="p">,</span> <span class="mi">150</span><span class="p">,</span> <span class="mi">30</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">matshow</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="kc">False</span><span class="p">)</span></span></span></code></pre>
</div>
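<p><img src="/matplotlib/elementary-cellular-automata/output_31_0.png" alt="Black-and-white image of a Rule 30 simulation: 300 cells evolved for 150 steps, started from a single black cell in the center."></p>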
<p>For certain rules, the emergent structures interact in chaotic and interesting ways.</p>
<p>I hope you enjoyed this brief look into the world of elementary cellular automata, and are inspired to make some pretty pictures of your own.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC Coding Phase 2 Blog 1]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_3/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_2/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 2" />
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_1/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/introductory-gsoc2020-post/?utm_source=atom_feed" rel="related" type="text/html" title="Sidharth Bansal joined as GSoC&#39;20 intern" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-rsef/?utm_source=atom_feed" rel="related" type="text/html" title="Elliott Sales de Andrade hired as Matplotlib Software Research Engineering Fellow" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-fractals/?utm_source=atom_feed" rel="related" type="text/html" title="Animate Your Own Fractals in Python with Matplotlib" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_3/</id>
            
            
            <published>2020-07-11T19:47:51+05:30</published>
            <updated>2020-07-11T19:47:51+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Progress Report for the first half of the Google Summer of Code 2020 Phase 2 for the Baseline Images Problem</blockquote><p>Google Summer of Code 2020&rsquo;s first evaluation is complete. I passed!!! Hurray! We are now midway through the second evaluation period. This post discusses the progress made during the first two weeks of the second coding period, from 30 June to 12 July 2020.</p>
<h2 id="completion-of-the-matplotlib_baseline_images-package">Completion of the matplotlib_baseline_images package<a class="headerlink" href="#completion-of-the-matplotlib_baseline_images-package" title="Link to this heading">#</a></h2>
<p>We successfully created the matplotlib_baseline_images package. It contains the matplotlib and matplotlib toolkit baseline images. The baseline images are symlinked, the related changes for Travis, AppVeyor, Azure Pipelines, etc. are functional, and tests/test_data has been created as discussed in the previous blog post. The PR has been reviewed and the suggested changes have been made.</p>
<h2 id="modular-approach-towards-removal-of-matplotlib-baseline-images">Modular approach towards removal of matplotlib baseline images<a class="headerlink" href="#modular-approach-towards-removal-of-matplotlib-baseline-images" title="Link to this heading">#</a></h2>
<p>We have divided the work into two parts. The first part is the generation of the baseline images, discussed below. The second part is the modification of the baseline images, which happens when some baseline images get modified due to <code>git push</code> or <code>git merge</code>. Modification of baseline images will be further divided into two subtasks: addition of the new baseline image and deletion of the previous baseline image. This will be discussed in the second half of the second phase of the Google Summer of Code 2020.</p>
<h2 id="generation-of-the-matplotlib-baseline-images">Generation of the matplotlib baseline images<a class="headerlink" href="#generation-of-the-matplotlib-baseline-images" title="Link to this heading">#</a></h2>
<p>After the changes proposed in the <a href="https://github.com/matplotlib/matplotlib/pull/17557">previous PR</a>, a developer will have no baseline images on a fresh install of matplotlib and will need to install the sub-wheel matplotlib_baseline_images package to run the matplotlib tests. We have now started removing the use of the matplotlib_baseline_images package. This requires the two steps discussed above.
The images can be generated by the image comparison tests themselves. Once the images have been generated for the first time, they can be used as the baseline images in later comparisons. This is the main principle adopted. The images are first created in the <code>result_images</code> directory and then moved to the <code>lib/matplotlib/tests/baseline_images</code> directory. After that, running pytest starts the image comparison.</p>
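<p>The &ldquo;generate first, then move&rdquo; step can be sketched roughly as follows. This is only an illustration and not the code from the PR: <code>promote_generated_images</code> is a hypothetical helper, it copies rather than moves so the results stay available, and the suffix filter assumes that comparison artifacts are named <code>*-expected.png</code> and <code>*-failed-diff.png</code>.</p>

```python
import shutil
import tempfile
from pathlib import Path


def promote_generated_images(result_dir, baseline_dir):
    """Copy freshly generated test images into the baseline directory.

    Hypothetical helper: the suffix filter below assumes that comparison
    artifacts are named '*-expected.png' and '*-failed-diff.png'.
    """
    result_dir, baseline_dir = Path(result_dir), Path(baseline_dir)
    copied = []
    for image in sorted(result_dir.rglob("*.png")):
        if image.stem.endswith(("-expected", "-failed-diff")):
            continue  # skip artifacts left over from an earlier comparison
        target = baseline_dir / image.relative_to(result_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image, target)
        copied.append(target)
    return copied


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "result_images" / "test_axes"
    results.mkdir(parents=True)
    for name in ("plot.png", "plot-expected.png", "plot-failed-diff.png"):
        (results / name).write_bytes(b"")
    baseline = Path(tmp) / "baseline_images"
    copied = promote_generated_images(Path(tmp) / "result_images", baseline)
    copied_names = [path.relative_to(baseline).as_posix() for path in copied]

print(copied_names)
```

<p>Only <code>test_axes/plot.png</code> is copied in the demonstration, and the sub-directory layout is preserved.</p>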
<h2 id="created-commandline-flags-for-baseline-images-creation">Created commandline flags for baseline images creation<a class="headerlink" href="#created-commandline-flags-for-baseline-images-creation" title="Link to this heading">#</a></h2>
<p>I learned about pytest hooks and fixtures. I built a command-line flag, <code>matplotlib_baseline_image_generation</code>, which creates the baseline images in the <code>result_images</code> directory. The full command is <code>python3 -m pytest --matplotlib_baseline_image_generation</code>. To do this, we made changes to <code>conftest.py</code> and also added markers to the <code>image_comparison</code> decorator.</p>
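<p>For readers unfamiliar with pytest hooks, a custom flag of this kind is usually registered through the <code>pytest_addoption</code> hook in <code>conftest.py</code>. The sketch below shows the general pattern only. It is not the code from the PR, and <code>generation_requested</code> is a hypothetical helper name.</p>

```python
# conftest.py (sketch)

FLAG = "--matplotlib_baseline_image_generation"


def pytest_addoption(parser):
    """Standard pytest hook: register the custom command-line flag."""
    parser.addoption(
        FLAG,
        action="store_true",
        default=False,
        help="write generated images instead of comparing against baselines",
    )


def generation_requested(config):
    """Hypothetical helper: True when the flag was passed on the command line."""
    return bool(config.getoption(FLAG))
```

<p>A fixture or the <code>image_comparison</code> decorator could then consult <code>generation_requested(request.config)</code> to decide whether to compare images or only write them.</p>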
<h2 id="learning-more-about-the-git-and-virtual-environments">Learning more about the Git and virtual environments<a class="headerlink" href="#learning-more-about-the-git-and-virtual-environments" title="Link to this heading">#</a></h2>
<p>I learned about git worktree and the scenarios in which it is useful. I also learned more about virtual environments and why they are needed in different scenarios.</p>
<h2 id="future-goals">Future Goals<a class="headerlink" href="#future-goals" title="Link to this heading">#</a></h2>
<p>Once the generation of the baseline images is completed in the <a href="https://github.com/matplotlib/matplotlib/pull/17793">current PR</a>, we will move to the modification of the baseline images in the second half of the second coding phase.</p>
<h2 id="daily-meet-ups">Daily Meet-ups<a class="headerlink" href="#daily-meet-ups" title="Link to this heading">#</a></h2>
<p>Meetings are held Monday to Thursday at <a href="https://everytimezone.com/">11:00pm IST</a> via Zoom. Meeting notes are kept on HackMD.</p>
<p>I am grateful to be part of such a great community. The project is really interesting and challenging :) Thanks Thomas, Antony and Hannah for helping me so far.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Animate Your Own Fractals in Python with Matplotlib]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/animated-fractals/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/animated-polar-plot/?utm_source=atom_feed" rel="related" type="text/html" title="Animated polar plot with oceanographic data" />
                <link href="https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/?utm_source=atom_feed" rel="related" type="text/html" title="Emoji Mosaic Art" />
                <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="related" type="text/html" title="Draw all graphs of N nodes" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib Cyberpunk Style" />
                <link href="https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib for Making Diagrams" />
            
                <id>https://blog.scientific-python.org/matplotlib/animated-fractals/</id>
            
            
            <published>2020-07-04T00:06:36+02:00</published>
            <updated>2020-07-04T00:06:36+02:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Discover the bizarre geometry of fractals and learn how to make an animated visualization of these marvels using Python and Matplotlib&rsquo;s Animation API.</blockquote><p>Imagine zooming into an image over and over and never running out of finer details. It may sound bizarre, but the mathematical
concept of <a href="https://en.wikipedia.org/wiki/Fractal">fractals</a> opens the door to this intricate infinity. This
strange geometry exhibits the same or similar patterns irrespective of scale. We can see one fractal example
in the image above.</p>
<p><em>Fractals</em> may seem difficult to understand due to their peculiarity, but that&rsquo;s not the case. As Benoit Mandelbrot,
one of the founding fathers of fractal geometry, said in his legendary
<a href="https://www.ted.com/talks/benoit_mandelbrot_fractals_and_the_art_of_roughness?language=en">TED Talk</a>:</p>
<blockquote>
<p>A surprising aspect is that the rules of this geometry are extremely short. You crank the formulas several times and
at the end, you get things like this (pointing to a stunning plot)</p>
<p>&ndash; <cite>Benoit Mandelbrot</cite></p>
</blockquote>
<p>In this tutorial blog post, we will see how to construct fractals in Python and animate them using the amazing
<em>Matplotlib&rsquo;s</em> Animation API. First, we will demonstrate the convergence of the <em>Mandelbrot Set</em> with an
enticing animation. In the second part, we will analyze one interesting property of the <em>Julia Set</em>. Stay tuned!</p>
<h1 id="intuition">Intuition<a class="headerlink" href="#intuition" title="Link to this heading">#</a></h1>
<p>We all have a common sense of the concept of similarity. We say two objects are similar to each other if they share
some common patterns.</p>
<p>This notion is not limited to comparing two different objects. We can also compare different parts of the
same object. Take a leaf, for instance: we know very well that its left side mirrors its right side, i.e. the leaf
is symmetrical.</p>
<p>In mathematics, this phenomenon is known as <a href="https://en.wikipedia.org/wiki/Self-similarity">self-similarity</a>. It means
a given object is similar (completely or to some extent) to some smaller part of itself. One remarkable example is the
<a href="https://isquared.digital/visualizations/2020-06-15-koch-curve/">Koch Snowflake</a>, as shown in the image below:</p>
<p><img src="/matplotlib/animated-fractals/snowflake.png" alt="An orange Koch Snowflake. It has 6 bulges which themselves have 3 sub-bulges. These sub-bulges have another 3 sub-sub bulges."></p>
<p>We can infinitely magnify some part of it and the same pattern will repeat over and over again. This is how fractal
geometry is defined.</p>
<h1 id="animated-mandelbrot-set">Animated Mandelbrot Set<a class="headerlink" href="#animated-mandelbrot-set" title="Link to this heading">#</a></h1>
<p>The <a href="https://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a> is defined over the set of <em>complex numbers</em>. It consists
of all complex numbers <strong>c</strong>, such that the sequence <strong>zᵢ₊₁ = zᵢ² + c, z₀ = 0</strong> is bounded. This means that, no matter how many
iterations we perform, the absolute value never exceeds a given limit. At first sight, it might
seem odd and simple, but in fact, it has some mind-blowing properties.</p>
<p>The <em>Python</em> implementation is quite straightforward, as given in the code snippet below:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">mandelbrot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">threshold</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Calculates whether the number c = x + i*y belongs to the
</span></span></span><span class="line"><span class="cl"><span class="s2">    Mandelbrot set. In order to belong, the sequence z[i + 1] = z[i]**2 + c
</span></span></span><span class="line"><span class="cl"><span class="s2">    must not diverge after &#39;threshold&#39; number of steps. The sequence diverges
</span></span></span><span class="line"><span class="cl"><span class="s2">    if the absolute value of z[i+1] is greater than 4.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    :param float x: the x component of the initial complex number
</span></span></span><span class="line"><span class="cl"><span class="s2">    :param float y: the y component of the initial complex number
</span></span></span><span class="line"><span class="cl"><span class="s2">    :param int threshold: the number of iterations to consider it converged
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># initial conditions</span>
</span></span><span class="line"><span class="cl">    <span class="n">c</span> <span class="o">=</span> <span class="nb">complex</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">z</span> <span class="o">=</span> <span class="nb">complex</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">threshold</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">z</span> <span class="o">=</span> <span class="n">z</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">c</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nb">abs</span><span class="p">(</span><span class="n">z</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mf">4.0</span><span class="p">:</span>  <span class="c1"># it diverged</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="n">i</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">threshold</span> <span class="o">-</span> <span class="mi">1</span>  <span class="c1"># it didn&#39;t diverge</span></span></span></code></pre>
</div>
<p>As we can see, we set the maximum number of iterations encoded in the variable <code>threshold</code>. If the magnitude of the
sequence at some iteration exceeds <strong>4</strong>, we consider it as diverged (<strong>c</strong> does not belong to the set) and return the
iteration number at which this occurred. If this never happens (<strong>c</strong> belongs to the set), we return the index of the
last iteration, <code>threshold - 1</code>.</p>
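<p>To make the return values concrete, here is a small worked example. It restates the same escape rule in a compact form that takes <strong>c</strong> directly as a complex number. The names <code>escape_iteration</code> and <code>bailout</code> are introduced here for illustration only.</p>

```python
def escape_iteration(c, threshold, bailout=4.0):
    """Same rule as `mandelbrot` above, taking c as a complex number."""
    z = 0j
    for i in range(threshold):
        z = z**2 + c
        if abs(z) > bailout:
            return i  # diverged at iteration i
    return threshold - 1  # never exceeded the bailout value


for c in (0j, -1 + 0j, 1 + 0j, 2 + 0j):
    print(c, escape_iteration(c, threshold=50))
```

<p>For <strong>c = 0</strong> and <strong>c = -1</strong> the sequence stays bounded, so the function returns 49. For <strong>c = 1</strong> the sequence is 1, 2, 5 and the function returns 2. For <strong>c = 2</strong> it is 2, 6 and the function returns 1.</p>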
<p>We can use the information about the number of iterations before the sequence diverges. All we have to do
is associate this number with a color relative to the maximum number of loops. Thus, for all complex numbers
<strong>c</strong> in some lattice of the complex plane, we can make a nice animation of the convergence process as a function
of the maximum allowed iterations.</p>
<p>One particularly interesting area is the <em>3x3</em> lattice starting at position -2 and -1.5 for the <em>real</em> and
<em>imaginary</em> axis respectively. We can observe the process of convergence as the number of allowed iterations increases.
This is easily achieved using <em>Matplotlib&rsquo;s</em> Animation API, as shown with the following code:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.animation</span> <span class="k">as</span> <span class="nn">animation</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">x_start</span><span class="p">,</span> <span class="n">y_start</span> <span class="o">=</span> <span class="o">-</span><span class="mi">2</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.5</span>  <span class="c1"># an interesting region starts here</span>
</span></span><span class="line"><span class="cl"><span class="n">width</span><span class="p">,</span> <span class="n">height</span> <span class="o">=</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span>  <span class="c1"># for 3 units up and right</span>
</span></span><span class="line"><span class="cl"><span class="n">density_per_unit</span> <span class="o">=</span> <span class="mi">250</span>  <span class="c1"># how many pixels per unit</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># real and imaginary axis</span>
</span></span><span class="line"><span class="cl"><span class="n">re</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="n">x_start</span><span class="p">,</span> <span class="n">x_start</span> <span class="o">+</span> <span class="n">width</span><span class="p">,</span> <span class="n">width</span> <span class="o">*</span> <span class="n">density_per_unit</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">im</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="n">y_start</span><span class="p">,</span> <span class="n">y_start</span> <span class="o">+</span> <span class="n">height</span><span class="p">,</span> <span class="n">height</span> <span class="o">*</span> <span class="n">density_per_unit</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>  <span class="c1"># instantiate a figure to draw</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">axes</span><span class="p">()</span>  <span class="c1"># create an axes object</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">animate</span><span class="p">(</span><span class="n">i</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>  <span class="c1"># clear axes object</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xticks</span><span class="p">([],</span> <span class="p">[])</span>  <span class="c1"># clear x-axis ticks</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_yticks</span><span class="p">([],</span> <span class="p">[])</span>  <span class="c1"># clear y-axis ticks</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">X</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">empty</span><span class="p">((</span><span class="nb">len</span><span class="p">(</span><span class="n">re</span><span class="p">),</span> <span class="nb">len</span><span class="p">(</span><span class="n">im</span><span class="p">)))</span>  <span class="c1"># re-initialize the array-like image</span>
</span></span><span class="line"><span class="cl">    <span class="n">threshold</span> <span class="o">=</span> <span class="nb">round</span><span class="p">(</span><span class="mf">1.15</span> <span class="o">**</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span>  <span class="c1"># calculate the current threshold</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># iterations for the current threshold</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">re</span><span class="p">)):</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">im</span><span class="p">)):</span>
</span></span><span class="line"><span class="cl">            <span class="n">X</span><span class="p">[</span><span class="n">k</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">mandelbrot</span><span class="p">(</span><span class="n">re</span><span class="p">[</span><span class="n">k</span><span class="p">],</span> <span class="n">im</span><span class="p">[</span><span class="n">j</span><span class="p">],</span> <span class="n">threshold</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># associate colors to the iterations with an interpolation</span>
</span></span><span class="line"><span class="cl">    <span class="n">img</span> <span class="o">=</span> <span class="n">ax</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">X</span><span class="o">.</span><span class="n">T</span><span class="p">,</span> <span class="n">interpolation</span><span class="o">=</span><span class="s2">&#34;bicubic&#34;</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="s2">&#34;magma&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">[</span><span class="n">img</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">anim</span> <span class="o">=</span> <span class="n">animation</span><span class="o">.</span><span class="n">FuncAnimation</span><span class="p">(</span><span class="n">fig</span><span class="p">,</span> <span class="n">animate</span><span class="p">,</span> <span class="n">frames</span><span class="o">=</span><span class="mi">45</span><span class="p">,</span> <span class="n">interval</span><span class="o">=</span><span class="mi">120</span><span class="p">,</span> <span class="n">blit</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">anim</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="s2">&#34;mandelbrot.gif&#34;</span><span class="p">,</span> <span class="n">writer</span><span class="o">=</span><span class="s2">&#34;imagemagick&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>We make animations in <em>Matplotlib</em> using the <code>FuncAnimation</code> class from the <em>Animation</em> API. We need to specify
the <code>figure</code> on which we draw a predefined number of consecutive <code>frames</code>. A predetermined <code>interval</code> expressed in
milliseconds defines the delay between the frames.</p>
<p>In this context, the <code>animate</code> function plays a central role, where the input argument is the frame number, starting
from 0. This means that, in order to animate, we always have to think in terms of frames. Hence, we use the frame number
to calculate the variable <code>threshold</code>, which is the maximum number of allowed iterations.</p>
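<p>To get a feeling for how quickly the iteration budget grows, the schedule used in <code>animate</code>, <code>round(1.15 ** (i + 1))</code>, can be tabulated on its own. The variable names below are chosen for this illustration.</p>

```python
frames = 45  # same number of frames as in the animation above

# one threshold per frame, using the same formula as in animate()
thresholds = [round(1.15 ** (i + 1)) for i in range(frames)]

print(thresholds[:10])  # the first frames allow only a handful of iterations
print(thresholds[-1])   # the last frame allows a few hundred
```

<p>The first ten thresholds are 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, while the final frame allows a little over 500 iterations. The geometric growth is why the early frames show only rough outlines and the late frames resolve fine detail.</p>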
<p>To represent our lattice we instantiate two arrays <code>re</code> and <code>im</code>: the former for the values on the <em>real</em> axis
and the latter for the values on the <em>imaginary</em> axis. The number of elements in these two arrays is defined by
the variable <code>density_per_unit</code> which defines the number of samples per unit step. The higher it is, the better
quality we get, but at a cost of heavier computation.</p>
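<p>With <code>density_per_unit = 250</code> the lattice has 750 × 750 = 562,500 points, and each of them is evaluated in a pure Python loop for every frame. One possible way to reduce that cost, shown here as an alternative sketch and not as the approach used in this post, is to update the whole lattice at once with NumPy. The function name <code>mandelbrot_grid</code> and the <code>bailout</code> parameter are assumptions made for this example.</p>

```python
import numpy as np


def mandelbrot_grid(re, im, threshold, bailout=4.0):
    """Escape iteration for every point of the lattice spanned by re and im.

    Returns an integer array of shape (len(im), len(re)), so rows follow the
    imaginary axis and columns follow the real axis.
    """
    c = re[np.newaxis, :] + 1j * im[:, np.newaxis]
    z = np.zeros_like(c)
    counts = np.full(c.shape, threshold - 1, dtype=int)
    active = np.ones(c.shape, dtype=bool)  # points that have not escaped yet

    for i in range(threshold):
        # advance only the points that are still bounded
        z[active] = z[active] ** 2 + c[active]
        escaped = np.zeros(c.shape, dtype=bool)
        escaped[active] = np.abs(z[active]) > bailout
        counts[escaped] = i  # record the iteration at which they escaped
        active[escaped] = False
    return counts


re = np.array([0.0, 1.0, 2.0])
im = np.array([0.0])
print(mandelbrot_grid(re, im, threshold=50))
```

<p>The small example prints <code>[[49 2 1]]</code>, matching what the scalar function returns for <strong>c = 0</strong>, <strong>c = 1</strong> and <strong>c = 2</strong>. Because rows already follow the imaginary axis, the result can be passed to <code>imshow</code> without transposing.</p>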
<p>Now, depending on the current <code>threshold</code>, for every complex number <strong>c</strong> in our lattice, we calculate the number of
iterations before the sequence <strong>zᵢ₊₁ = zᵢ² + c, z₀ = 0</strong> diverges. We save them in an initially empty matrix called <code>X</code>.
In the end, we <em>interpolate</em> the values in <code>X</code> and assign them a color drawn from a prearranged <em>colormap</em>.</p>
<p>After cranking the <code>animate</code> function multiple times we get a stunning animation as depicted below:</p>
<p><img src="/matplotlib/animated-fractals/mandelbrot.gif" alt="Mandelbrot set animation. The first few frames only show a few outlines of the Mandelbrot shape. The middle frames show a more defined shape. The last few frames show the characteristic Mandelbrot shape in a very clear way."></p>
<h1 id="animated-julia-set">Animated Julia Set<a class="headerlink" href="#animated-julia-set" title="Link to this heading">#</a></h1>
<p>The <a href="https://en.wikipedia.org/wiki/Julia_set">Julia Set</a> is quite similar to the <em>Mandelbrot Set</em>. Instead of setting
<strong>z₀ = 0</strong> and testing whether for some complex number <strong>c = x + i*y</strong> the sequence <strong>zᵢ₊₁ = zᵢ² + c</strong> is bounded, we
switch the roles a bit. We fix the value for <strong>c</strong>, we set an arbitrary initial condition <strong>z₀ = x + i*y</strong>, and we
observe the convergence of the sequence. The <em>Python</em> implementation is given below:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">julia_quadratic</span><span class="p">(</span><span class="n">zx</span><span class="p">,</span> <span class="n">zy</span><span class="p">,</span> <span class="n">cx</span><span class="p">,</span> <span class="n">cy</span><span class="p">,</span> <span class="n">threshold</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Calculates whether the number z[0] = zx + i*zy with a constant c = cx + i*cy
</span></span></span><span class="line"><span class="cl"><span class="s2">    belongs to the Julia set. In order to belong, the sequence
</span></span></span><span class="line"><span class="cl"><span class="s2">    z[i + 1] = z[i]**2 + c, must not diverge after &#39;threshold&#39; number of steps.
</span></span></span><span class="line"><span class="cl"><span class="s2">    The sequence diverges if the absolute value of z[i+1] is greater than 4.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    :param float zx: the x component of z[0]
</span></span></span><span class="line"><span class="cl"><span class="s2">    :param float zy: the y component of z[0]
</span></span></span><span class="line"><span class="cl"><span class="s2">    :param float cx: the x component of the constant c
</span></span></span><span class="line"><span class="cl"><span class="s2">    :param float cy: the y component of the constant c
</span></span></span><span class="line"><span class="cl"><span class="s2">    :param int threshold: the number of iterations to consider it converged
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># initial conditions</span>
</span></span><span class="line"><span class="cl">    <span class="n">z</span> <span class="o">=</span> <span class="nb">complex</span><span class="p">(</span><span class="n">zx</span><span class="p">,</span> <span class="n">zy</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">c</span> <span class="o">=</span> <span class="nb">complex</span><span class="p">(</span><span class="n">cx</span><span class="p">,</span> <span class="n">cy</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">threshold</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">z</span> <span class="o">=</span> <span class="n">z</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">c</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nb">abs</span><span class="p">(</span><span class="n">z</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mf">4.0</span><span class="p">:</span>  <span class="c1"># it diverged</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="n">i</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">threshold</span> <span class="o">-</span> <span class="mi">1</span>  <span class="c1"># it didn&#39;t diverge</span></span></span></code></pre>
</div>
<p>The setup is quite similar to the <em>Mandelbrot Set</em> implementation. The maximum number of iterations is
denoted as <code>threshold</code>. If the magnitude of the sequence never exceeds <strong>4</strong> within that many iterations, the number <strong>z₀</strong> is treated as belonging to
the <em>Julia Set</em>; otherwise it is not.</p>
<p>The number <strong>c</strong> gives us the freedom to analyze its impact on the convergence of the sequence, given that the
maximum number of iterations is fixed. One interesting range of values is <strong>c = r cos α + i × r sin α</strong>
with <strong>r=0.7885</strong> and <strong>α ∈ [0, 2π]</strong>.</p>
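<p>As a minimal sketch of this parametrization (the array names <code>alpha</code> and <code>c</code> are illustrative and not part of the original code), the values of <strong>c</strong> can be generated in one step with the polar form <strong>r × e^(iα)</strong>:</p>

```python
import numpy as np

r = 0.7885                               # radius used in the post
alpha = np.linspace(0, 2 * np.pi, 100)   # one angle per animation frame
c = r * np.exp(1j * alpha)               # r*cos(alpha) + i*r*sin(alpha)

cx, cy = c.real, c.imag                  # real and imaginary components of c
```

<p>Every value of <strong>c</strong> lies on a circle of radius <strong>r</strong>. Because <code>np.linspace</code> includes both endpoints by default, the first and last angles give the same <strong>c</strong>; passing <code>endpoint=False</code> would avoid the repeated frame.</p>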
<p>A good way to carry out this analysis is to create an animated visualization as the number <strong>c</strong> changes.
This <a href="https://isquared.digital/blog/2020-02-08-interactive-dataviz/">improves our visual perception</a> and
understanding of such abstract phenomena in a captivating manner. To do so, we use Matplotlib&rsquo;s <em>Animation API</em>, as
demonstrated in the code below:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.animation</span> <span class="k">as</span> <span class="nn">animation</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">x_start</span><span class="p">,</span> <span class="n">y_start</span> <span class="o">=</span> <span class="o">-</span><span class="mi">2</span><span class="p">,</span> <span class="o">-</span><span class="mi">2</span>  <span class="c1"># an interesting region starts here</span>
</span></span><span class="line"><span class="cl"><span class="n">width</span><span class="p">,</span> <span class="n">height</span> <span class="o">=</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span>  <span class="c1"># for 4 units up and right</span>
</span></span><span class="line"><span class="cl"><span class="n">density_per_unit</span> <span class="o">=</span> <span class="mi">200</span>  <span class="c1"># how many pixels per unit</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># real and imaginary axis</span>
</span></span><span class="line"><span class="cl"><span class="n">re</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="n">x_start</span><span class="p">,</span> <span class="n">x_start</span> <span class="o">+</span> <span class="n">width</span><span class="p">,</span> <span class="n">width</span> <span class="o">*</span> <span class="n">density_per_unit</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">im</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="n">y_start</span><span class="p">,</span> <span class="n">y_start</span> <span class="o">+</span> <span class="n">height</span><span class="p">,</span> <span class="n">height</span> <span class="o">*</span> <span class="n">density_per_unit</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">threshold</span> <span class="o">=</span> <span class="mi">20</span>  <span class="c1"># max allowed iterations</span>
</span></span><span class="line"><span class="cl"><span class="n">frames</span> <span class="o">=</span> <span class="mi">100</span>  <span class="c1"># number of frames in the animation</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># we represent c as c = r*cos(a) + i*r*sin(a) = r*e^{i*a}</span>
</span></span><span class="line"><span class="cl"><span class="n">r</span> <span class="o">=</span> <span class="mf">0.7885</span>
</span></span><span class="line"><span class="cl"><span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="n">frames</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>  <span class="c1"># instantiate a figure to draw</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">axes</span><span class="p">()</span>  <span class="c1"># create an axes object</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">animate</span><span class="p">(</span><span class="n">i</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>  <span class="c1"># clear axes object</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xticks</span><span class="p">([],</span> <span class="p">[])</span>  <span class="c1"># clear x-axis ticks</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_yticks</span><span class="p">([],</span> <span class="p">[])</span>  <span class="c1"># clear y-axis ticks</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">X</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">empty</span><span class="p">((</span><span class="nb">len</span><span class="p">(</span><span class="n">re</span><span class="p">),</span> <span class="nb">len</span><span class="p">(</span><span class="n">im</span><span class="p">)))</span>  <span class="c1"># the initial array-like image</span>
</span></span><span class="line"><span class="cl">    <span class="n">cx</span><span class="p">,</span> <span class="n">cy</span> <span class="o">=</span> <span class="n">r</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">]),</span> <span class="n">r</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">])</span>  <span class="c1"># the initial c number</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># iterations for the given threshold</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">re</span><span class="p">)):</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">im</span><span class="p">)):</span>
</span></span><span class="line"><span class="cl">            <span class="n">X</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">julia_quadratic</span><span class="p">(</span><span class="n">re</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">im</span><span class="p">[</span><span class="n">j</span><span class="p">],</span> <span class="n">cx</span><span class="p">,</span> <span class="n">cy</span><span class="p">,</span> <span class="n">threshold</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">img</span> <span class="o">=</span> <span class="n">ax</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">X</span><span class="o">.</span><span class="n">T</span><span class="p">,</span> <span class="n">interpolation</span><span class="o">=</span><span class="s2">&#34;bicubic&#34;</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="s2">&#34;magma&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">[</span><span class="n">img</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">anim</span> <span class="o">=</span> <span class="n">animation</span><span class="o">.</span><span class="n">FuncAnimation</span><span class="p">(</span><span class="n">fig</span><span class="p">,</span> <span class="n">animate</span><span class="p">,</span> <span class="n">frames</span><span class="o">=</span><span class="n">frames</span><span class="p">,</span> <span class="n">interval</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">blit</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">anim</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="s2">&#34;julia_set.gif&#34;</span><span class="p">,</span> <span class="n">writer</span><span class="o">=</span><span class="s2">&#34;imagemagick&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>The logic in the <code>animate</code> function is very similar to the previous example. We update the number <strong>c</strong> as a function
of the frame number. Based on that, we estimate the convergence of all complex numbers in the defined lattice, given the
fixed <code>threshold</code> of allowed iterations. As before, we save the results in an initially empty matrix <code>X</code> and
map them to a color relative to the maximum number of iterations. The resulting animation is illustrated below:</p>
<p><img src="/matplotlib/animated-fractals/julia_set.gif" alt="Julia Set Animation"></p>
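<p>The nested Python loops in <code>animate</code> call the scalar function once per pixel, which is slow for a dense lattice. As a hedged alternative, the same escape-time rule can be applied to the whole grid at once with NumPy. The helper name <code>julia_grid</code> is hypothetical and is not part of the original post:</p>

```python
import numpy as np


def julia_grid(re, im, c, threshold):
    """Escape-time counts for every point of the lattice (hypothetical helper)."""
    # z[i, j] = re[i] + 1j * im[j], the same layout as X[i, j] in the loop version
    z = re[:, None] + 1j * im[None, :]
    counts = np.full(z.shape, threshold - 1, dtype=int)  # default: it didn't diverge
    active = np.ones(z.shape, dtype=bool)                # points still being iterated

    for i in range(threshold):
        z[active] = z[active] ** 2 + c                   # update only the active points
        escaped = np.logical_and(active, np.abs(z) > 4.0)
        counts[escaped] = i                              # record the escape iteration
        active[escaped] = False                          # stop iterating escaped points
    return counts
```

<p>Under these assumptions, <code>X = julia_grid(re, im, complex(cx, cy), threshold)</code> could replace the double loop, and <code>X.T</code> can be passed to <code>imshow</code> as before.</p>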
<h1 id="summary">Summary<a class="headerlink" href="#summary" title="Link to this heading">#</a></h1>
<p>Fractals are really mind-boggling structures, as we saw in this post. First, we gave a general intuition
of fractal geometry. Then, we observed two types of fractals: the <em>Mandelbrot</em> and <em>Julia</em> sets. We implemented
them in Python and made interesting animated visualizations of their properties.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC Coding Phase 1 Blog 2]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_2/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_1/?utm_source=atom_feed" rel="related" type="text/html" title="GSoC Coding Phase 1 Blog 1" />
                <link href="https://blog.scientific-python.org/matplotlib/introductory-gsoc2020-post/?utm_source=atom_feed" rel="related" type="text/html" title="Sidharth Bansal joined as GSoC&#39;20 intern" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-rsef/?utm_source=atom_feed" rel="related" type="text/html" title="Elliott Sales de Andrade hired as Matplotlib Software Research Engineering Fellow" />
                <link href="https://blog.scientific-python.org/matplotlib/animated-polar-plot/?utm_source=atom_feed" rel="related" type="text/html" title="Animated polar plot with oceanographic data" />
                <link href="https://blog.scientific-python.org/matplotlib/pyplot-vs-object-oriented-interface/?utm_source=atom_feed" rel="related" type="text/html" title="Pyplot vs Object Oriented Interface" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_2/</id>
            
            
            <published>2020-06-24T16:47:51+05:30</published>
            <updated>2020-06-24T16:47:51+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Progress Report for the second half of the Google Summer of Code 2020 Phase 1 for the Baseline Images Problem</blockquote><p>Google Summer of Code 2020&rsquo;s first evaluation is about to conclude. This post discusses the progress made in the last two weeks of the first coding period, from 15 June to 30 June 2020.</p>
<h2 id="completion-of-the-demo-package">Completion of the demo package<a class="headerlink" href="#completion-of-the-demo-package" title="Link to this heading">#</a></h2>
<p>We successfully created the demo app and uploaded it to test.pypi. It contains a main and a secondary package. The main package is analogous to matplotlib and the secondary package is analogous to the matplotlib_baseline_images package, as discussed in the previous blog.</p>
<h2 id="learning-more-about-the-git-and-mpl-workflow">Learning more about the Git and mpl workflow<a class="headerlink" href="#learning-more-about-the-git-and-mpl-workflow" title="Link to this heading">#</a></h2>
<p>I learned that another way to bring the changes from master into a branch and resolve conflicts is to rebase on master. I also understood how to create modular commits inside a pull request, which makes the review process easier and the code easier to understand.</p>
<h2 id="creation-of-the-matplotlib_baseline_images-package">Creation of the matplotlib_baseline_images package<a class="headerlink" href="#creation-of-the-matplotlib_baseline_images-package" title="Link to this heading">#</a></h2>
<p>Then, we implemented similar changes to create the <code>matplotlib_baseline_images</code> package. Finally, we were successful in uploading it to <a href="https://test.pypi.org/project/matplotlib.baseline-images/3.3.0rc1/#history">test.pypi</a>. This package lives in the <code>sub-wheels</code> directory so that more packages can be added to the same directory if needed in the future. The <code>matplotlib_baseline_images</code> package contains baseline images for both <code>matplotlib</code> and <code>mpl_toolkits</code>.
Some changes were required in the main <code>matplotlib</code> package&rsquo;s setup.py so that it does not pick up information from the packages present in the <code>sub-wheels</code> directory.</p>
<h2 id="symlinking-the-baseline-images">Symlinking the baseline images<a class="headerlink" href="#symlinking-the-baseline-images" title="Link to this heading">#</a></h2>
<p>As the baseline images are moved out of the <code>lib/matplotlib</code> and <code>lib/mpl_toolkits</code> directories, we symlinked the locations where they are used, namely in <code>lib/matplotlib/testing/decorator.py</code>, <code>tools/triage_tests.py</code>, <code>lib/matplotlib/tests/__init__.py</code> and <code>lib/mpl_toolkits/tests/__init__.py</code>.</p>
<h2 id="creation-of-the-teststest_data-directory">Creation of the tests/test_data directory<a class="headerlink" href="#creation-of-the-teststest_data-directory" title="Link to this heading">#</a></h2>
<p>Some test data present in <code>baseline_images</code> does not need to be moved to the <code>matplotlib_baseline_images</code> package, so it is stored under the <code>lib/matplotlib/tests/test_data</code> folder.</p>
<h2 id="understanding-travis-appvoyer-and-azure-pipelines">Understanding Travis, AppVeyor and Azure Pipelines<a class="headerlink" href="#understanding-travis-appvoyer-and-azure-pipelines" title="Link to this heading">#</a></h2>
<p>I came across the Continuous Integration tools used at mpl. We tried to install <code>matplotlib</code> followed by the <code>matplotlib_baseline_images</code> package on all three: Travis, AppVeyor and Azure Pipelines.</p>
<h2 id="future-goals">Future Goals<a class="headerlink" href="#future-goals" title="Link to this heading">#</a></h2>
<p>Once the <a href="https://github.com/matplotlib/matplotlib/pull/17557">current PR</a> is merged, we will move to the <a href="https://github.com/matplotlib/matplotlib/issues/16447">Proposal for the baseline images problem</a>.</p>
<h2 id="daily-meet-ups">Daily Meet-ups<a class="headerlink" href="#daily-meet-ups" title="Link to this heading">#</a></h2>
<p>Daily meetings start at <a href="https://everytimezone.com/">11:00pm IST</a> via Zoom. Meeting notes are kept on HackMD.</p>
<p>I am grateful to be part of such a great community. The project is really interesting and challenging :) Thanks to Antony and Hannah for helping me so far.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Animated polar plot with oceanographic data]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/animated-polar-plot/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/?utm_source=atom_feed" rel="related" type="text/html" title="Emoji Mosaic Art" />
                <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="related" type="text/html" title="Draw all graphs of N nodes" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib Cyberpunk Style" />
                <link href="https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib for Making Diagrams" />
                <link href="https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/?utm_source=atom_feed" rel="related" type="text/html" title="Create Ridgeplots in Matplotlib" />
            
                <id>https://blog.scientific-python.org/matplotlib/animated-polar-plot/</id>
            
            
            <published>2020-06-12T09:56:36+02:00</published>
            <updated>2020-06-12T09:56:36+02:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>This post describes how to animate some oceanographic measurements in a tweaked polar plot</blockquote><p>The <strong>ocean</strong> is a key component of the Earth&rsquo;s climate system. It thus needs continuous real-time monitoring to help scientists better understand its dynamics and predict its evolution. All around the world, oceanographers have joined their efforts to set up a <a href="https://www.goosocean.org">Global Ocean Observing System</a>, of which <a href="http://www.argo.ucsd.edu/"><strong>Argo</strong></a> is a key component. Argo is a global network of nearly 4000 autonomous probes, or floats, measuring pressure, temperature and salinity from the surface to 2000m depth every 10 days. The floats are distributed almost randomly between the 60th parallels (see live coverage <a href="http://collab.umr-lops.fr/app/divaa/">here</a>). All data are collected by satellite in real time, processed by several data centers and finally merged into a single dataset (more than 2 million vertical profiles) made freely available to anyone.</p>
<p>In this particular case, we want to plot temperature data (at the surface and at 1000m depth) measured by those floats, for the period 2010-2020 and for the Mediterranean Sea. We want this plot to be circular and animated, which explains the title of this post: <strong>Animated polar plot</strong>.</p>
<p>First we need some data to work with. To retrieve our temperature values from Argo, we use <a href="https://argopy.readthedocs.io"><strong>Argopy</strong></a>, which is a Python library that aims to ease Argo data access, manipulation and visualization for standard users, as well as Argo experts and operators. Argopy returns <a href="http://xarray.pydata.org">xarray</a> dataset objects, which make our analysis much easier.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">argopy</span> <span class="kn">import</span> <span class="n">DataFetcher</span> <span class="k">as</span> <span class="n">ArgoDataFetcher</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">argo_loader</span> <span class="o">=</span> <span class="n">ArgoDataFetcher</span><span class="p">(</span><span class="n">cache</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Query surface and 1000m temp in Med sea with argopy</span>
</span></span><span class="line"><span class="cl"><span class="n">df1</span> <span class="o">=</span> <span class="n">argo_loader</span><span class="o">.</span><span class="n">region</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="o">-</span><span class="mf">1.2</span><span class="p">,</span> <span class="mf">29.0</span><span class="p">,</span> <span class="mf">28.0</span><span class="p">,</span> <span class="mf">46.0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mf">10.0</span><span class="p">,</span> <span class="s2">&#34;2009-12&#34;</span><span class="p">,</span> <span class="s2">&#34;2020-01&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span><span class="o">.</span><span class="n">to_xarray</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">df2</span> <span class="o">=</span> <span class="n">argo_loader</span><span class="o">.</span><span class="n">region</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="o">-</span><span class="mf">1.2</span><span class="p">,</span> <span class="mf">29.0</span><span class="p">,</span> <span class="mf">28.0</span><span class="p">,</span> <span class="mf">46.0</span><span class="p">,</span> <span class="mf">975.0</span><span class="p">,</span> <span class="mf">1025.0</span><span class="p">,</span> <span class="s2">&#34;2009-12&#34;</span><span class="p">,</span> <span class="s2">&#34;2020-01&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span><span class="o">.</span><span class="n">to_xarray</span><span class="p">()</span></span></span></code></pre>
</div>
<p>Here we create the arrays we&rsquo;ll use for plotting: we set up a weekly date array and extract the day of the year and the year itself, which will be useful later. Then, to build our temperature arrays, we use two very useful xarray methods: <code>where()</code> and <code>mean()</code>. Finally, we build a pandas DataFrame, because it&rsquo;s prettier!</p>

<div class="highlight">
  <pre># Weekly date array
daterange = np.arange(&#34;2010-01-01&#34;, &#34;2020-01-03&#34;, dtype=&#34;datetime64[7D]&#34;)
dayoftheyear = pd.DatetimeIndex(
    np.array(daterange, dtype=&#34;datetime64[D]&#34;) &#43; 3
).dayofyear  # middle of the week
activeyear = pd.DatetimeIndex(
    np.array(daterange, dtype=&#34;datetime64[D]&#34;) &#43; 3
).year  # extract year

# Init final arrays
tsurf = np.zeros(len(daterange))
t1000 = np.zeros(len(daterange))

# Filling arrays
for i in range(len(daterange)):
    i1 = (df1[&#34;TIME&#34;] &gt;= daterange[i]) &amp; (df1[&#34;TIME&#34;] &lt; daterange[i] &#43; 7)
    i2 = (df2[&#34;TIME&#34;] &gt;= daterange[i]) &amp; (df2[&#34;TIME&#34;] &lt; daterange[i] &#43; 7)
    tsurf[i] = df1.where(i1, drop=True)[&#34;TEMP&#34;].mean().values
    t1000[i] = df2.where(i2, drop=True)[&#34;TEMP&#34;].mean().values

# Creating dataframe
d = {&#34;date&#34;: np.array(daterange, dtype=&#34;datetime64[D]&#34;), &#34;tsurf&#34;: tsurf, &#34;t1000&#34;: t1000}
ndf = pd.DataFrame(data=d)
ndf.head()</pre>
</div>

<p>This produces:</p>

<div class="highlight">
  <pre>        date  tsurf  t1000
0 2009-12-31    0.0    0.0
1 2010-01-07    0.0    0.0
2 2010-01-14    0.0    0.0
3 2010-01-21    0.0    0.0
4 2010-01-28    0.0    0.0</pre>
</div>
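<p>The first date in the table is 2009-12-31 rather than 2010-01-01. A likely explanation is that NumPy counts <code>datetime64[7D]</code> bins from the Unix epoch (1970-01-01, a Thursday), so the start date is floored to the preceding bin boundary. The short sketch below illustrates this on one month; the names <code>weeks</code>, <code>starts</code> and <code>middles</code> are illustrative only:</p>

```python
import numpy as np

# Weekly bins covering January 2010, built the same way as daterange above
weeks = np.arange("2010-01-01", "2010-02-01", dtype="datetime64[7D]")
starts = weeks.astype("datetime64[D]")  # first day of each 7-day bin
middles = starts + 3                    # the "middle of the week" used in the post

print(starts[0], middles[0])
```

<p>Each bin starts on a Thursday, and adding 3 days moves the reference date to the Sunday in the middle of that bin.</p>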

<p>Then it&rsquo;s time to plot. For that, we first import what we need and set some useful variables.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">rcParams</span><span class="p">[</span><span class="s2">&#34;xtick.major.pad&#34;</span><span class="p">]</span> <span class="o">=</span> <span class="s2">&#34;17&#34;</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">rcParams</span><span class="p">[</span><span class="s2">&#34;axes.axisbelow&#34;</span><span class="p">]</span> <span class="o">=</span> <span class="kc">False</span>
</span></span><span class="line"><span class="cl"><span class="n">matplotlib</span><span class="o">.</span><span class="n">rc</span><span class="p">(</span><span class="s2">&#34;axes&#34;</span><span class="p">,</span> <span class="n">edgecolor</span><span class="o">=</span><span class="s2">&#34;w&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.lines</span> <span class="kn">import</span> <span class="n">Line2D</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.animation</span> <span class="kn">import</span> <span class="n">FuncAnimation</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">HTML</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">big_angle</span> <span class="o">=</span> <span class="mi">360</span> <span class="o">/</span> <span class="mi">12</span>  <span class="c1"># How we split our polar space</span>
</span></span><span class="line"><span class="cl"><span class="n">date_angle</span> <span class="o">=</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">((</span><span class="mi">360</span> <span class="o">/</span> <span class="mi">365</span><span class="p">)</span> <span class="o">*</span> <span class="n">dayoftheyear</span><span class="p">)</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">/</span> <span class="mi">180</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>  <span class="c1"># For a day, a corresponding angle</span>
</span></span><span class="line"><span class="cl"><span class="c1"># inner and outer ring limit values</span>
</span></span><span class="line"><span class="cl"><span class="n">inner</span> <span class="o">=</span> <span class="mi">10</span>
</span></span><span class="line"><span class="cl"><span class="n">outer</span> <span class="o">=</span> <span class="mi">30</span>
</span></span><span class="line"><span class="cl"><span class="c1"># setting our color values</span>
</span></span><span class="line"><span class="cl"><span class="n">ocean_color</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;#ff7f50&#34;</span><span class="p">,</span> <span class="s2">&#34;#004752&#34;</span><span class="p">]</span></span></span></code></pre>
</div>
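<p>The <code>date_angle</code> expression maps a day of the year onto the circle, with 365 days covering 360 degrees. As a small sketch with made-up sample days (the names <code>day_of_year</code> and <code>angle</code> are illustrative), the same mapping can be written with <code>np.deg2rad</code> and checked in isolation:</p>

```python
import numpy as np

day_of_year = np.array([1, 92, 183, 274, 365])  # sample days, roughly a quarter year apart
angle = np.deg2rad(360 / 365 * day_of_year)     # same mapping as date_angle, in radians

# Fraction of a full turn for each sample day
print(np.round(angle / (2 * np.pi), 3))
```

<p>Day 365 closes the circle exactly, so day 366 of a leap year would land slightly past a full turn.</p>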
<p>Now we want to style our axes the way we like. For that, we build a function <code>dress_axes</code> that will be called during the animation process. Here we plot some bars with an offset (a combination of <code>bottom</code> and, later, <code>ylim</code>). Those bars are actually our background, and the offset allows us to plot a legend in the middle of the plot.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">dress_axes</span><span class="p">(</span><span class="n">ax</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_facecolor</span><span class="p">(</span><span class="s2">&#34;w&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_theta_zero_location</span><span class="p">(</span><span class="s2">&#34;N&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_theta_direction</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Here is how we position the months labels</span>
</span></span><span class="line"><span class="cl">    <span class="n">middles</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="n">big_angle</span> <span class="o">/</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">360</span><span class="p">,</span> <span class="n">big_angle</span><span class="p">)</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">/</span> <span class="mi">180</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xticks</span><span class="p">(</span><span class="n">middles</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xticklabels</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;January&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;February&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;March&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;April&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;May&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;June&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;July&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;August&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;September&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;October&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;November&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;December&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_yticks</span><span class="p">([</span><span class="mi">15</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">25</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_yticklabels</span><span class="p">([</span><span class="s2">&#34;15°C&#34;</span><span class="p">,</span> <span class="s2">&#34;20°C&#34;</span><span class="p">,</span> <span class="s2">&#34;25°C&#34;</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Changing radial ticks angle</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_rlabel_position</span><span class="p">(</span><span class="mi">359</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="s2">&#34;both&#34;</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;w&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">plt</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="s2">&#34;x&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">plt</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="s2">&#34;y&#34;</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;w&#34;</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="s2">&#34;:&#34;</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Here is the bar plot that we use as background</span>
</span></span><span class="line"><span class="cl">    <span class="n">bars</span> <span class="o">=</span> <span class="n">ax</span><span class="o">.</span><span class="n">bar</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">middles</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">outer</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">width</span><span class="o">=</span><span class="n">big_angle</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">/</span> <span class="mi">180</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">bottom</span><span class="o">=</span><span class="n">inner</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="s2">&#34;lightgray&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">edgecolor</span><span class="o">=</span><span class="s2">&#34;w&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">zorder</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">plt</span><span class="o">.</span><span class="n">ylim</span><span class="p">([</span><span class="mi">2</span><span class="p">,</span> <span class="n">outer</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Custom legend</span>
</span></span><span class="line"><span class="cl">    <span class="n">legend_elements</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="n">Line2D</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="n">marker</span><span class="o">=</span><span class="s2">&#34;o&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">color</span><span class="o">=</span><span class="s2">&#34;w&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">label</span><span class="o">=</span><span class="s2">&#34;Surface&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">markerfacecolor</span><span class="o">=</span><span class="n">ocean_color</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="n">markersize</span><span class="o">=</span><span class="mi">15</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">Line2D</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="n">marker</span><span class="o">=</span><span class="s2">&#34;o&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">color</span><span class="o">=</span><span class="s2">&#34;w&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">label</span><span class="o">=</span><span class="s2">&#34;1000m&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">markerfacecolor</span><span class="o">=</span><span class="n">ocean_color</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="n">markersize</span><span class="o">=</span><span class="mi">15</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">handles</span><span class="o">=</span><span class="n">legend_elements</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">13</span><span class="p">,</span> <span class="n">frameon</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Main title for the figure</span>
</span></span><span class="line"><span class="cl">    <span class="n">plt</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;Mediterranean temperature from Argo profiles&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">fontsize</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">horizontalalignment</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span></span></span></code></pre>
</div>
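<p>The offset trick used for the background bars can be isolated in a smaller sketch. It assumes twelve equal sectors (so <code>big_angle</code> is taken to be 30 degrees) and a headless <code>Agg</code> backend; both are assumptions made for this illustration, while <code>inner</code> and <code>outer</code> reuse the values defined earlier.</p>

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

inner, outer = 10, 30  # same ring limits as above
big_angle = 360 / 12  # assumption: twelve equal sectors, one per month

fig = plt.figure(figsize=(4, 4))
ax = fig.add_subplot(111, polar=True)

# Centre of each sector, in radians
middles = np.arange(big_angle / 2, 360, big_angle) * np.pi / 180

# bottom=inner starts each bar at radius 10 instead of at the centre
bars = ax.bar(
    middles,
    outer,
    width=big_angle * np.pi / 180,
    bottom=inner,
    color="lightgray",
    edgecolor="w",
)

# The radial axis starts below `inner`, which leaves an empty disc in the middle
ax.set_ylim(2, outer)

print(len(bars), ax.get_ylim())
```

<p>The empty disc between the lower radial limit and <code>inner</code> is where the centred legend ends up in <code>dress_axes</code>.</p>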
<p>From there we can draw the empty frame of our plot.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="mi">111</span><span class="p">,</span> <span class="n">polar</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">dress_axes</span><span class="p">(</span><span class="n">ax</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/animated-polar-plot/axes_empty.png" alt="axesFrame"></p>
<p>Then it&rsquo;s finally time to plot our data. Since we want to animate the plot, we&rsquo;ll build a function that will be called by <code>FuncAnimation</code> later on. Because the state of the plot changes at every time step, we have to redress the axes for each frame, which is easy with our <code>dress_axes</code> function. Then we plot our temperature data using a basic <code>plot()</code>: thin lines for historical measurements, thicker lines for the current year.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">draw_data</span><span class="p">(</span><span class="n">i</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Clear</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">cla</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Redressing axes</span>
</span></span><span class="line"><span class="cl">    <span class="n">dress_axes</span><span class="p">(</span><span class="n">ax</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Limit between thin lines and thick line: this is basically the current date minus 51 weeks.</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Why 51 and not 52? That creates a small gap before the current date, which is prettier.</span>
</span></span><span class="line"><span class="cl">    <span class="n">i0</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">max</span><span class="p">([</span><span class="n">i</span> <span class="o">-</span> <span class="mi">51</span><span class="p">,</span> <span class="mi">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">date_angle</span><span class="p">[</span><span class="n">i0</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">ndf</span><span class="p">[</span><span class="s2">&#34;tsurf&#34;</span><span class="p">][</span><span class="n">i0</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;-&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="n">ocean_color</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">alpha</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">linewidth</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">date_angle</span><span class="p">[</span><span class="mi">0</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">ndf</span><span class="p">[</span><span class="s2">&#34;tsurf&#34;</span><span class="p">][</span><span class="mi">0</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;-&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="n">ocean_color</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">linewidth</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">date_angle</span><span class="p">[</span><span class="n">i0</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">ndf</span><span class="p">[</span><span class="s2">&#34;t1000&#34;</span><span class="p">][</span><span class="n">i0</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;-&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="n">ocean_color</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">alpha</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">linewidth</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">date_angle</span><span class="p">[</span><span class="mi">0</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">ndf</span><span class="p">[</span><span class="s2">&#34;t1000&#34;</span><span class="p">][</span><span class="mi">0</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;-&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="n">ocean_color</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">linewidth</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Plotting a line to spot the current date easily</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="n">date_angle</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">date_angle</span><span class="p">[</span><span class="n">i</span><span class="p">]],</span> <span class="p">[</span><span class="n">inner</span><span class="p">,</span> <span class="n">outer</span><span class="p">],</span> <span class="s2">&#34;k-&#34;</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Display the current year as a title, just beneath the suptitle</span>
</span></span><span class="line"><span class="cl">    <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">activeyear</span><span class="p">[</span><span class="n">i</span><span class="p">]),</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span> <span class="n">horizontalalignment</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Test it</span>
</span></span><span class="line"><span class="cl"><span class="n">draw_data</span><span class="p">(</span><span class="mi">322</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
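<p>The index arithmetic behind the thin and thick lines can be checked on its own. The sketch below uses a hypothetical helper, <code>line_windows</code>, and a plain list standing in for the weekly temperature series; neither is part of the original code.</p>

```python
def line_windows(i, window=51):
    """Return (thin, thick) slices for frame i (hypothetical helper)."""
    i0 = max(i - window, 0)  # never index before the first sample
    thin = slice(0, i + 1)  # full history up to the current frame
    thick = slice(i0, i + 1)  # the current sample plus up to `window` before it
    return thin, thick


samples = list(range(400))  # stand-in for the weekly temperature series
thin, thick = line_windows(322)
print(len(samples[thin]), len(samples[thick]))
```

<p>For early frames, where fewer than 51 samples exist, the clamp to zero makes both slices identical, so the thick line simply covers all the data available so far.</p>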
<p><img src="/matplotlib/animated-polar-plot/thumbnail.png" alt="oneplot"></p>
<p>Finally it&rsquo;s time to animate, using <code>FuncAnimation</code>. We can then save the result as an mp4 file or display it in our notebook with <code>HTML(anim.to_html5_video())</code>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">anim</span> <span class="o">=</span> <span class="n">FuncAnimation</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="p">,</span> <span class="n">draw_data</span><span class="p">,</span> <span class="n">interval</span><span class="o">=</span><span class="mi">40</span><span class="p">,</span> <span class="n">frames</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">daterange</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="n">repeat</span><span class="o">=</span><span class="kc">False</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># anim.save(&#39;ArgopyUseCase_MedTempAnimation.mp4&#39;)</span>
</span></span><span class="line"><span class="cl"><span class="n">HTML</span><span class="p">(</span><span class="n">anim</span><span class="o">.</span><span class="n">to_html5_video</span><span class="p">())</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/animated-polar-plot/animatedpolar.gif" alt="animation"></p>
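<p>Exporting to mp4 or HTML5 video typically relies on ffmpeg being installed. As an alternative sketch (not necessarily the method used for this post), an animation can be written to a GIF with Matplotlib&rsquo;s <code>PillowWriter</code>. The toy data, the function name <code>draw_frame</code>, the output file name and the headless backend below are all assumptions made for illustration.</p>

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation, PillowWriter

theta = np.linspace(0, 2 * np.pi, 60)
radius = 20 + 5 * np.sin(theta)  # toy data, not the Argo temperatures

fig = plt.figure(figsize=(3, 3))
ax = fig.add_subplot(111, polar=True)


def draw_frame(i):
    # Same clear-and-redraw pattern as draw_data, on toy data
    ax.cla()
    ax.set_ylim(2, 30)
    ax.plot(theta[: i + 1], radius[: i + 1], "-", linewidth=2)


anim = FuncAnimation(fig, draw_frame, frames=10, interval=40, repeat=False)

# Write a GIF with Pillow instead of an mp4 with ffmpeg
out_path = os.path.join(tempfile.mkdtemp(), "toy_polar.gif")
anim.save(out_path, writer=PillowWriter(fps=25))
print(os.path.getsize(out_path))
```

<p>With <code>interval=40</code> milliseconds between frames, a writer running at 25 frames per second keeps the saved file at the same playback speed as the live animation.</p>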
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[GSoC Coding Phase 1 Blog 1]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_1/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/introductory-gsoc2020-post/?utm_source=atom_feed" rel="related" type="text/html" title="Sidharth Bansal joined as GSoC&#39;20 intern" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-rsef/?utm_source=atom_feed" rel="related" type="text/html" title="Elliott Sales de Andrade hired as Matplotlib Software Research Engineering Fellow" />
                <link href="https://blog.scientific-python.org/matplotlib/pyplot-vs-object-oriented-interface/?utm_source=atom_feed" rel="related" type="text/html" title="Pyplot vs Object Oriented Interface" />
                <link href="https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/?utm_source=atom_feed" rel="related" type="text/html" title="Emoji Mosaic Art" />
                <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="related" type="text/html" title="Draw all graphs of N nodes" />
            
                <id>https://blog.scientific-python.org/matplotlib/gsoc_coding_phase_blog_1/</id>
            
            
            <published>2020-06-09T16:47:51+05:30</published>
            <updated>2020-06-09T16:47:51+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Progress Report for the first half of the Google Summer of Code 2020 Phase 1 for the Baseline Images Problem</blockquote><p>I, Sidharth Bansal, had been waiting since the end of March for the coding period to start so that I could get my hands dirty with the code. The coding period has finally started, and two weeks have passed. This blog post covers the progress made from 1 June to 14 June 2020.</p>
<h2 id="movement-from-mpl-test-and-mpl-packages-to-mpl-and-mpl-baseline-images-packages">Movement from mpl-test and mpl packages to mpl and mpl-baseline-images packages<a class="headerlink" href="#movement-from-mpl-test-and-mpl-packages-to-mpl-and-mpl-baseline-images-packages" title="Link to this heading">#</a></h2>
<p>Initially, we thought of creating <a href="https://github.com/matplotlib/matplotlib/pull/17434">mpl-test and mpl packages</a>. The mpl-test package would contain the test suite and baseline images, while the other package would contain the parts of the repository other than the test and baseline-image files and folders.
We then changed our decision to creating <a href="https://github.com/matplotlib/matplotlib/pull/17557">mpl and mpl-baseline-images packages</a>, as we don&rsquo;t need a separate package for the entire test suite. Our main aim was to eliminate baseline_images from the repository. The mpl-baseline-images package will contain the data (the baseline images) and related information. The other package will contain the files and folders other than the baseline images.
We are now trying to create the following structure for the repository:</p>

<div class="highlight">
  <pre>mpl/
  setup.py
  lib/mpl/...
  lib/mpl/tests/...  [contains the tests .py files]
  baseline_images/
    setup.py
    data/...  [contains the image files]</pre>
</div>

<p>It will involve:</p>
<ul>
<li>Symlinking baseline images out.</li>
<li>Creating a wheel/sdist with just the baseline images; uploading it to testpypi (so that one can do <code>pip install mpl-baseline-images</code>).</li>
</ul>
<h2 id="following-prototype-modelling">Following prototype modelling<a class="headerlink" href="#following-prototype-modelling" title="Link to this heading">#</a></h2>
<p>I am creating a prototype first with two packages: a main package and a sub-wheel package. Once the demo app works well on <a href="https://test.pypi.org/">Test PyPI</a>, we can make similar changes to the main mpl repository.
The structure of the demo app, shown below, is analogous to the work needed to separate the baseline images into a new mpl-baseline-images package:</p>

<div class="highlight">
  <pre>testrepo/
  setup.py
  lib/testpkg/__init__.py
  baseline_images/setup.py
  baseline_images/testdata.txt</pre>
</div>

<p>This will also include the related MANIFEST files and setup.cfg.template files. The setup.py will also contain the logic for excluding the baseline-images folder from the main mpl package.</p>
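<p>As a rough illustration of the intended split, the sketch below sorts the files of the demo layout into those that belong to the main package and those that belong to the sub-wheel. The helper <code>split_files</code> is hypothetical; it only illustrates the idea and is not how the actual setup.py implements the exclusion.</p>

```python
from pathlib import PurePosixPath

# Files of the demo layout above, relative to testrepo/
repo_files = [
    "setup.py",
    "lib/testpkg/__init__.py",
    "baseline_images/setup.py",
    "baseline_images/testdata.txt",
]


def split_files(paths, subpackage_dir="baseline_images"):
    """Separate main-package files from those of the sub-wheel (hypothetical helper)."""
    main, sub = [], []
    for path in paths:
        top_level = PurePosixPath(path).parts[0]
        (sub if top_level == subpackage_dir else main).append(path)
    return main, sub


main_files, baseline_files = split_files(repo_files)
print(main_files)
print(baseline_files)
```

<p>Everything under <code>baseline_images/</code> ends up in the second list, which corresponds to what the separate mpl-baseline-images wheel would ship.</p>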
<h2 id="following-enhancements-over-iterations">Following Enhancements over iterations<a class="headerlink" href="#following-enhancements-over-iterations" title="Link to this heading">#</a></h2>
<p>After the <a href="https://github.com/matplotlib/matplotlib/pull/17557">current PR</a> is merged, we will focus on eliminating the baseline-images from the mpl-baseline-images package. Then we will make similar changes for Travis CI.</p>
<h2 id="bi-weekly-meet-ups-scheduled">Bi weekly meet-ups scheduled<a class="headerlink" href="#bi-weekly-meet-ups-scheduled" title="Link to this heading">#</a></h2>
<p>Meetings are held every Tuesday and every Friday at <a href="https://everytimezone.com/">8:30pm IST</a> via <a href="https://zoom.us/j/95996536871">Zoom</a>. Meeting notes are available on <a href="https://hackmd.io/pY25bSkCSRymk_7nX68xtw">HackMD</a>.</p>
<p>I am grateful to be part of such a great community. The project is really interesting and challenging :) Thanks to Antony and Hannah for helping me so far.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Pyplot vs Object Oriented Interface]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/pyplot-vs-object-oriented-interface/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/?utm_source=atom_feed" rel="related" type="text/html" title="Emoji Mosaic Art" />
                <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="related" type="text/html" title="Draw all graphs of N nodes" />
                <link href="https://blog.scientific-python.org/matplotlib/introductory-gsoc2020-post/?utm_source=atom_feed" rel="related" type="text/html" title="Sidharth Bansal joined as GSoC&#39;20 intern" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib Cyberpunk Style" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-rsef/?utm_source=atom_feed" rel="related" type="text/html" title="Elliott Sales de Andrade hired as Matplotlib Software Research Engineering Fellow" />
            
                <id>https://blog.scientific-python.org/matplotlib/pyplot-vs-object-oriented-interface/</id>
            
            
            <published>2020-05-27T20:21:30+05:30</published>
            <updated>2020-05-27T20:21:30+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>This post describes the difference between the pyplot and object oriented interface to make plots.</blockquote><h2 id="generating-the-data-points">Generating the data points<a class="headerlink" href="#generating-the-data-points" title="Link to this heading">#</a></h2>
<p>To get acquainted with the basics of plotting with <code>matplotlib</code>, let&rsquo;s plot the distance an object in free fall travels over time, along with its velocity at each time step.</p>
<p>If you have ever studied physics, you will recognize this as a classic case of Newton&rsquo;s equations of motion, where</p>
<p>$$ v = a \times t $$</p>
<p>$$ S = 0.5 \times a \times t^{2} $$</p>
<p>We will assume an initial velocity of zero.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">time</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="mf">10.0</span><span class="p">,</span> <span class="mf">0.2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">velocity</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">distance</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span></span></span></code></pre>
</div>
<p>We know that under free-fall, all objects move with the constant acceleration of $$g = 9.8~m/s^2$$</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">g</span> <span class="o">=</span> <span class="mf">9.8</span>  <span class="c1"># m/s^2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">velocity</span> <span class="o">=</span> <span class="n">g</span> <span class="o">*</span> <span class="n">time</span>
</span></span><span class="line"><span class="cl"><span class="n">distance</span> <span class="o">=</span> <span class="mf">0.5</span> <span class="o">*</span> <span class="n">g</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">power</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span></span></span></code></pre>
</div>
<p>The above code gives us two <code>numpy</code> arrays populated with the distance and velocity data points.</p>
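<p>As a quick sanity check on the two formulas, velocity should be the time derivative of distance. The sketch below rebuilds the same arrays and compares a numerical derivative from <code>np.gradient</code> against the analytic velocity. The names <code>numeric_velocity</code> and <code>matches</code> are introduced here only for illustration.</p>


<div class="highlight">
  <pre class="chroma"><code>import numpy as np

g = 9.8  # m/s^2
time = np.arange(0.0, 10.0, 0.2)
velocity = g * time
distance = 0.5 * g * np.power(time, 2)

# Differentiate distance numerically with respect to time.
numeric_velocity = np.gradient(distance, time)

# Central differences are exact for a quadratic, so the interior points
# should agree; the two end points use one-sided differences and are skipped.
matches = np.allclose(numeric_velocity[1:-1], velocity[1:-1])
print(matches)</code></pre>
</div>
<p>If the two arrays disagree in the interior, one of the formulas was probably mistyped.</p>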
<h2 id="pyplot-vs-object-oriented-interface">Pyplot vs. Object-Oriented interface<a class="headerlink" href="#pyplot-vs-object-oriented-interface" title="Link to this heading">#</a></h2>
<p>When using <code>matplotlib</code> we have two approaches:</p>
<ol>
<li><code>pyplot</code> interface / functional interface.</li>
<li>Object-Oriented interface (OO).</li>
</ol>
<h3 id="pyplot-interface">Pyplot Interface<a class="headerlink" href="#pyplot-interface" title="Link to this heading">#</a></h3>
<p>On the surface, <code>matplotlib</code> is made to imitate MATLAB&rsquo;s method of generating plots, through an interface called <code>pyplot</code>. All the <code>pyplot</code> commands modify the same figure. This is a state-based interface, where the state (i.e., the figure) is preserved across function calls (i.e., the functions that modify the figure). This interface allows us to generate plots quickly and easily. Its state-based nature lets us add elements to the plot or modify it whenever we need to.</p>
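<p>To see this state in action, here is a minimal sketch that uses <code>plt.gcf()</code> and <code>plt.gca()</code> to ask <code>pyplot</code> which figure and axes it currently considers active. The data values and the variable names <code>same_figure</code> and <code>label</code> are made up for illustration.</p>


<div class="highlight">
  <pre class="chroma"><code>import matplotlib.pyplot as plt

fig = plt.figure()                 # this becomes the current figure
plt.plot([0, 1, 2], [0, 1, 4])     # draws on the current figure, creating axes as needed

ax = plt.gca()                     # the axes that pyplot is currently targeting
same_figure = plt.gcf() is fig     # pyplot still points at the figure created above

plt.xlabel('Time')                 # modifies the current axes, with no object passed in
label = ax.get_xlabel()

print(same_figure, label)</code></pre>
</div>
<p>Every <code>plt.*</code> call above acts on whatever figure and axes are current, which is why none of them needs to be told what to modify.</p>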
<p>This interface shares a lot of similarities in syntax and methodology with MATLAB. For example, if we want to plot a blue line where each data point is marked with a circle, we can use the string <code>'bo-'</code>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">9</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">distance</span><span class="p">,</span> <span class="s2">&#34;bo-&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&#34;Time&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&#34;Distance&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">([</span><span class="s2">&#34;Distance&#34;</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">)</span></span></span></code></pre>
</div>
<p>The plot shows how much distance was covered by the free-falling object with each passing second.</p>
<p><img src="/matplotlib/pyplot-vs-object-oriented-interface/figure/just-distance.png" alt="Fig. 1.1"></p>
<div class="image-caption">
<b>Fig. 1.1</b> The amount of distance travelled in each second is increasing, which is a direct result of increasing velocity due to the gravitational acceleration.
</div>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">9</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">velocity</span><span class="p">,</span> <span class="s2">&#34;go-&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&#34;Time&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&#34;Velocity&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">([</span><span class="s2">&#34;Velocity&#34;</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">)</span></span></span></code></pre>
</div>
<p>The plot below shows us how the velocity is increasing.</p>
<p><img src="/matplotlib/pyplot-vs-object-oriented-interface/figure/just-velocity.png" alt="Fig. 1.2"></p>
<div class="image-caption">
<b>Fig. 1.2</b> Velocity is increasing in fixed steps, due to a "constant" acceleration.
</div>
<p>Let&rsquo;s try to see what kind of plot we get when we plot both distance and velocity in the same plot.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">9</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">velocity</span><span class="p">,</span> <span class="s2">&#34;g-&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">distance</span><span class="p">,</span> <span class="s2">&#34;b-&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&#34;Distance and Velocity&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&#34;Time&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">([</span><span class="s2">&#34;Velocity&#34;</span><span class="p">,</span> <span class="s2">&#34;Distance&#34;</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/pyplot-vs-object-oriented-interface/figure/distance-and-velocity-same-axes.png" alt="png"></p>
<p>Here, we run into some obvious and serious issues. Since both quantities share the same axis but have very different magnitudes, the graph looks disproportionate. What we need to do is separate the two quantities onto two different axes. This is where the second approach to making plots comes into play.</p>
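<p>To put numbers on the mismatch, the sketch below recomputes the two arrays and compares their ranges. From the equations of motion, $$ S / v = 0.5 \times t $$, so the gap between the two curves keeps widening over time. The names <code>ratio</code> and <code>grows_as_half_t</code> are illustrative.</p>


<div class="highlight">
  <pre class="chroma"><code>import numpy as np

g = 9.8  # m/s^2
time = np.arange(0.0, 10.0, 0.2)
velocity = g * time
distance = 0.5 * g * np.power(time, 2)

# Largest value of each quantity on the shared Y-axis.
print(velocity.max(), distance.max())

# Ratio of distance to velocity; t = 0 is skipped to avoid dividing zero by zero.
ratio = distance[1:] / velocity[1:]
grows_as_half_t = np.allclose(ratio, 0.5 * time[1:])
print(grows_as_half_t)</code></pre>
</div>
<p>By the end of the time range the distance is roughly five times the velocity, so a single Y-axis squeezes the velocity line into the bottom of the plot.</p>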
<p>Also, the <code>pyplot</code> approach doesn&rsquo;t really scale when we need to make multiple plots or intricate plots that require a lot of customisation. However, internally <code>matplotlib</code> has an Object-Oriented interface that can be accessed just as easily, and which allows us to reuse objects.</p>
<h3 id="object-oriented-interface">Object-Oriented Interface<a class="headerlink" href="#object-oriented-interface" title="Link to this heading">#</a></h3>
<p>When using the OO interface, it helps to know how <code>matplotlib</code> structures its plots. The final plot that we see as the output is a <code>Figure</code> object. The <code>Figure</code> object is the top-level container for all the other elements that make up the graphic image. These &ldquo;other&rdquo; elements are called <code>Artists</code>. The <code>Figure</code> object can be thought of as a canvas, upon which different artists act to create the final graphic image. A <code>Figure</code> can contain any number of artists.</p>
<p><img src="/matplotlib/pyplot-vs-object-oriented-interface/figure/anatomy-of-a-figure.png" alt="png"></p>
<p>Things to note about the anatomy of a figure are:</p>
<ol>
<li>All of the items labelled in <em>blue</em> are <code>Artists</code>. <code>Artists</code> are basically all the elements that are rendered onto the figure. This can include text, patches (like arrows and shapes), etc. Thus, all the following <code>Figure</code>, <code>Axes</code> and <code>Axis</code> objects are also Artists.</li>
<li>Each plot that we see in a figure is an <code>Axes</code> object. The <code>Axes</code> object holds the actual data that we are going to display. It also contains the X- and Y-axis labels and a title. Each <code>Axes</code> object contains two or more <code>Axis</code> objects.</li>
<li>The <code>Axis</code> objects set the data limits. They also contain the ticks and tick labels. Ticks are the marks that we see on an axis.</li>
</ol>
<p>Understanding this hierarchy of <code>Figure</code>, <code>Artist</code>, <code>Axes</code> and <code>Axis</code> is immensely important, because it plays a crucial role in how we make an animation in <code>matplotlib</code>.</p>
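<p>The hierarchy can also be inspected directly. The following is a small sketch, with made-up data, that creates one figure with one plot and checks how the resulting objects relate to each other. The variable name <code>all_artists</code> is introduced only for this example.</p>


<div class="highlight">
  <pre class="chroma"><code>import matplotlib.pyplot as plt
from matplotlib.artist import Artist
from matplotlib.axes import Axes
from matplotlib.axis import Axis
from matplotlib.figure import Figure

fig, ax = plt.subplots()
(line,) = ax.plot([0, 1, 2], [0, 1, 4])

# The Figure is the top-level container and keeps a list of its Axes.
print(isinstance(fig, Figure), fig.axes == [ax])

# Each Axes owns its Axis objects: one for X and one for Y.
print(isinstance(ax, Axes), isinstance(ax.xaxis, Axis), isinstance(ax.yaxis, Axis))

# Everything drawn on the figure, including the containers, is an Artist.
all_artists = all(
    isinstance(obj, Artist) for obj in (fig, ax, ax.xaxis, ax.yaxis, line)
)
print(all_artists)</code></pre>
</div>
<p>Holding on to references such as <code>fig</code>, <code>ax</code> and <code>line</code> is what makes the OO interface convenient: each object can be modified later without relying on which figure happens to be current.</p>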
<p>Now that we understand how plots are generated, we can easily solve the problem we faced earlier. For the velocity and distance plot to make more sense, we need to plot each quantity against a separate axis with a different scale. Thus, we will need one parent <code>Figure</code> object and two <code>Axes</code> objects.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s2">&#34;distance (m)&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s2">&#34;time&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">distance</span><span class="p">,</span> <span class="s2">&#34;blue&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax2</span> <span class="o">=</span> <span class="n">ax1</span><span class="o">.</span><span class="n">twinx</span><span class="p">()</span>  <span class="c1"># create another y-axis sharing a common x-axis</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax2</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s2">&#34;velocity (m/s)&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span><span class="o">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s2">&#34;time&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">velocity</span><span class="p">,</span> <span class="s2">&#34;green&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">set_size_inches</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">set_dpi</span><span class="p">(</span><span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/pyplot-vs-object-oriented-interface/figure/distance-and-velocity-different-axes-unfinished.png" alt="png"></p>
<p>This plot is still not very intuitive. We should add a grid and a legend. We can also change the color of the axis labels and tick labels to match the color of the lines.</p>
<p>But something very weird happens when we try to turn on the grid, which you can see <a href="https://github.com/whereistejas/whereistejas.github.io/blob/master/assets/jupyter-nb/double-pendulum-part-1-basics-of-plotting.ipynb">here</a> at Cell 8. The grid lines don&rsquo;t align with the tick labels on both Y-axes. The tick values that <code>matplotlib</code> calculates on its own are not suitable for our needs, so we will have to calculate them ourselves.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s2">&#34;distance (m)&#34;</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;blue&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s2">&#34;time&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">distance</span><span class="p">,</span> <span class="s2">&#34;blue&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">set_yticks</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">*</span><span class="n">ax1</span><span class="o">.</span><span class="n">get_ybound</span><span class="p">(),</span> <span class="mi">10</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="s2">&#34;y&#34;</span><span class="p">,</span> <span class="n">labelcolor</span><span class="o">=</span><span class="s2">&#34;blue&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">xaxis</span><span class="o">.</span><span class="n">grid</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="o">.</span><span class="n">yaxis</span><span class="o">.</span><span class="n">grid</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax2</span> <span class="o">=</span> <span class="n">ax1</span><span class="o">.</span><span class="n">twinx</span><span class="p">()</span>  <span class="c1"># create another y-axis sharing a common x-axis</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax2</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s2">&#34;velocity (m/s)&#34;</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;green&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span><span class="o">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s2">&#34;time&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax2</span><span class="o">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="s2">&#34;y&#34;</span><span class="p">,</span> <span class="n">labelcolor</span><span class="o">=</span><span class="s2">&#34;green&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">time</span><span class="p">,</span> <span class="n">velocity</span><span class="p">,</span> <span class="s2">&#34;green&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span><span class="o">.</span><span class="n">set_yticks</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">*</span><span class="n">ax2</span><span class="o">.</span><span class="n">get_ybound</span><span class="p">(),</span> <span class="mi">10</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">set_size_inches</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">set_dpi</span><span class="p">(</span><span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">legend</span><span class="p">([</span><span class="s2">&#34;Distance&#34;</span><span class="p">,</span> <span class="s2">&#34;Velocity&#34;</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p>The command <code>ax1.set_yticks(np.linspace(*ax1.get_ybound(), 10))</code> calculates the tick values for us. Let&rsquo;s break this down to see what is happening:</p>
<ol>
<li>The <code>np.linspace</code> function creates <code>n</code> evenly spaced values between a specified lower and upper limit, with both limits included.</li>
<li>The method <code>ax1.get_ybound()</code> returns a pair containing the lower and upper bounds of that particular axis (which in our case is the Y-axis).</li>
<li>In Python, the <code>*</code> operator unpacks a <code>list</code> or <code>tuple</code> when it is placed before one in a function call. Thus, it will pass a list <code>[1, 2, 3, 4]</code> as the separate arguments <code>1, 2, 3, 4</code>. This is an immensely powerful feature.</li>
<li>Thus, we are asking <code>np.linspace</code> for 10 evenly spaced tick values that span the interval between the lower and upper bounds, which divides it into 9 equal parts.</li>
<li>We provide this array to the <code>set_yticks</code> method.</li>
</ol>
<p>The same process is repeated for the second axis.</p>
<p><img src="/matplotlib/pyplot-vs-object-oriented-interface/figure/distance-and-velocity-different-axes-finished.png" alt="png"></p>
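<p>The tick calculation described above can be tried in isolation. This is a minimal sketch in which <code>bounds</code> is a hypothetical pair standing in for the value returned by <code>ax1.get_ybound()</code>; the real bounds depend on the data and on the margins <code>matplotlib</code> adds.</p>


<div class="highlight">
  <pre class="chroma"><code>import numpy as np

# Hypothetical (lower, upper) pair standing in for ax1.get_ybound().
bounds = (0.0, 450.0)

# The * unpacks the pair, so this is the same as np.linspace(0.0, 450.0, 10).
ticks = np.linspace(*bounds, 10)

print(len(ticks))      # number of tick values
print(np.diff(ticks))  # spacing between neighbouring ticks</code></pre>
</div>
<p>Because both Y-axes get the same number of ticks, evenly spaced from the bottom bound to the top bound, the ticks on the two axes sit at the same heights and each grid line drawn for one axis lines up with a tick on the other.</p>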
<h2 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h2>
<p>In this part, we covered some basics of plotting with <code>matplotlib</code>, including the two basic approaches to making plots. In the next part, we will cover how to make simple animations. If you like the content of this blog post, or you have any suggestions or comments, drop me an email or tweet or ping me on IRC. Nowadays, you will find me hanging around #matplotlib on Freenode. Thanks!</p>
<h2 id="after-thoughts">After-thoughts<a class="headerlink" href="#after-thoughts" title="Link to this heading">#</a></h2>
<p>This post is part of a series I&rsquo;m doing on my personal <a href="http://whereistejas.me">blog</a>. The series is about how to animate things using Python&rsquo;s <code>matplotlib</code> library. <code>matplotlib</code> has excellent <a href="https://matplotlib.org/3.2.1/contents.html">documentation</a>, where you can find details on each of the methods I have used in this blog post. Also, I will be publishing each part of this series in the form of a Jupyter notebook, which can be found <a href="https://github.com/whereistejas/whereistejas.github.io/blob/master/assets/jupyter-nb/Part-1-basics-of-plotting.ipynb">here</a>.</p>
<p>The series will have three posts which will cover:</p>
<ol>
<li>Part 1 - How to make plots using <code>matplotlib</code>.</li>
<li>Part 2 - Basic animation using <code>FuncAnimation</code>.</li>
<li>Part 3 - Optimizations to make animations faster (blitting).</li>
</ol>
<p>I would like to say a few words about the methodology of this series:</p>
<ol>
<li>Each part will have a list of references at the end of the post, mostly leading to appropriate pages of the documentation and helpful blogs written by other people. <strong>THIS IS THE MOST IMPORTANT PART</strong>. The sooner you get used to reading the documentation, the better.</li>
<li>The code written here is meant to show you how you can piece everything together. I will try my best to describe the nuances of my implementations and the tiny lessons I learned.</li>
</ol>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<ol>
<li><a href="https://youtu.be/bD05uGo_sVI">Python Generators (YouTube)</a></li>
<li><a href="https://medium.com/@kapil.mathur1987/matplotlib-an-introduction-to-its-object-oriented-interface-a318b1530aed">Matplotlib: An Introduction to its Object-Oriented Interface</a></li>
<li><a href="https://matplotlib.org/3.2.1/tutorials/introductory/lifecycle.html">Lifecycle of a Plot</a></li>
<li><a href="https://matplotlib.org/faq/usage_faq.html">Basic Concepts</a></li>
</ol>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Emoji Mosaic Art]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="related" type="text/html" title="Draw all graphs of N nodes" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib Cyberpunk Style" />
                <link href="https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib for Making Diagrams" />
                <link href="https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/?utm_source=atom_feed" rel="related" type="text/html" title="Create Ridgeplots in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/?utm_source=atom_feed" rel="related" type="text/html" title="Create a Tesla Cybertruck That Drives" />
            
                <id>https://blog.scientific-python.org/matplotlib/emoji-mosaic-art/</id>
            
            
            <published>2020-05-24T19:11:01+05:30</published>
            <updated>2020-05-24T19:11:01+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Applied image manipulation to create procedural art.</blockquote><p>A while back, I came across this cool <a href="https://github.com/willdady/emosaic">repository</a> to create emoji-art from images. I wanted to use it to transform my mundane Facebook profile picture to something more snazzy. The only trouble? It was written in Rust.</p>
<p>So instead of going through the process of installing Rust, I decided to take the easy route and spin up some code to do the same in Python using <a href="https://matplotlib.org/">matplotlib</a>.</p>
<p><del>Because that&rsquo;s what anyone sane would do, right?</del></p>
<p>In this post, I&rsquo;ll try to explain my process as we attempt to recreate mosaics similar to the one below. I&rsquo;ve aimed this post at people who&rsquo;ve worked with <em>some</em> sort of image data before; but really, anyone can follow along.</p>
<p><img src="/matplotlib/emoji-mosaic-art/warhol.png" alt="Emoji mosaic by Will Dady based on Andy Warhol’s Multiple Marilyns."></p>
<h2 id="packages">Packages<a class="headerlink" href="#packages" title="Link to this heading">#</a></h2>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">tqdm</span> <span class="kn">import</span> <span class="n">tqdm</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">scipy</span> <span class="kn">import</span> <span class="n">spatial</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib</span> <span class="kn">import</span> <span class="n">cm</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">scipy</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;Matplotlib:</span><span class="si">{</span><span class="n">matplotlib</span><span class="o">.</span><span class="n">__version__</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;Numpy:</span><span class="si">{</span><span class="n">np</span><span class="o">.</span><span class="n">__version__</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;Scipy: </span><span class="si">{</span><span class="n">scipy</span><span class="o">.</span><span class="n">__version__</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">## Matplotlib: &#39;3.2.1&#39;</span>
</span></span><span class="line"><span class="cl"><span class="c1">## Numpy: &#39;1.18.1&#39;</span>
</span></span><span class="line"><span class="cl"><span class="c1">## Scipy: &#39;1.4.1&#39;</span></span></span></code></pre>
</div>
<p>Let&rsquo;s read in our image:</p>

<div class="highlight">
  <pre>img = plt.imread(r&#34;naomi_32.png&#34;, 1)
dim = img.shape[0] ##we&#39;ll need this later
plt.imshow(img)</pre>
</div>

<p><img src="/matplotlib/emoji-mosaic-art/save_100.png" alt="Naomi Watts Cannes (2014) Licensed under Creative Commons attributed to Georges Biard"></p>
<p><strong>Note</strong>: <em>The image displayed above is 100x100 but we&rsquo;ll use a 32x32 version from here on since that&rsquo;s gonna suffice for all our needs.</em></p>
<p>So really, what <em>is</em> an image? To numpy and matplotlib (and for almost every image processing library out there), it is, essentially, just a matrix (say A), where every individual pixel (p) is an element of A. If it&rsquo;s a grayscale image, every pixel (p) is just a single number (or a scalar) - in the range [0,1] if float, or [0,255] if integer. If it&rsquo;s not grayscale - like in our case - every pixel is a vector of either dimension 3 - <strong>Red</strong> (R), <strong>Green</strong> (G), and <strong>Blue</strong> (B), or dimension 4 - RGBA (A stands for <strong>Alpha</strong>, which is basically transparency).</p>
<p>If anything is unclear so far, I&rsquo;d strongly suggest going through a post like <a href="https://matplotlib.org/3.1.1/tutorials/introductory/images.html">this</a> or <a href="http://scipy-lectures.org/advanced/image_processing/">this</a>. Knowing that an image can be represented as a matrix (or a <code>numpy array</code>) greatly helps us as almost every transformation of the image can be represented in terms of matrix maths.</p>
<p>To prove my point, let&rsquo;s look at <code>img</code> a little.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1">## Let&#39;s check the type of img</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="nb">type</span><span class="p">(</span><span class="n">img</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># &lt;class &#39;numpy.ndarray&#39;&gt;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">## The shape of the array img</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">img</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># (32, 32, 4)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">## The value of the first pixel of img</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">img</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="c1"># [128 144 117 255]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">## Let&#39;s view the color of the first pixel</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">color</span> <span class="o">=</span> <span class="n">img</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">/</span> <span class="mf">255.0</span>  <span class="c1">##RGBA only accepts values in the 0-1 range</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">fill</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">)</span></span></span></code></pre>
</div>
<p>That should give you a square filled with the color of the first pixel of <code>img</code>.</p>
<p><img src="/matplotlib/emoji-mosaic-art/first_pixel.png" alt="img[0][0]" title="The first pixel of the image we read in"></p>
<h2 id="methodology">Methodology<a class="headerlink" href="#methodology" title="Link to this heading">#</a></h2>
<p>We want to go from a plain image to an image full of emojis - or in other words, <strong>an image of images</strong>. Essentially, we&rsquo;re going to replace all pixels with emojis. However, to ensure that our new emoji-image looks like the original image and not just random smiley faces, the trick is to make sure that <em>every pixel is replaced by an emoji which has a similar color to that pixel</em>. That&rsquo;s what gives the result the look of a mosaic.</p>
<p>&lsquo;Similar&rsquo; really just means that the <strong>mean</strong> (median is also worth trying) color of the emoji should be close to the pixel it replaces.</p>
<p>So how do you find the mean color of an entire image? Easy. We just take all the RGBA arrays and average the Rs together, and then the Gs together, and then the Bs together, and then the As together (the As, by the way, are just all 1 in our case, so the mean is also going to be 1). Here&rsquo;s that idea expressed formally:</p>
<p>\[ (r, g, b)_{\mu}=\left(\frac{\left(r_{1}+r_{2}+\ldots+r_{N}\right)}{N}, \frac{\left(g_{1}+g_{2}+\ldots+g_{N}\right)}{N}, \frac{\left(b_{1}+b_{2}+\ldots+b_{N}\right)}{N}\right) \]</p>
<p>The resulting color would be a single array of RGBA values: \[ [r_{\mu}, g_{\mu}, b_{\mu}, 1] \]</p>
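<p>To make the formula concrete, here&rsquo;s a minimal sketch on a made-up 2x2 RGBA image (the array <code>tiny</code> is purely illustrative and not part of the emoji data). Averaging over the two pixel axes leaves one value per channel.</p>

```python
import numpy as np

# A toy 2x2 RGBA "image" with float channels in [0, 1]; alpha is 1 everywhere.
tiny = np.array(
    [
        [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0]],
        [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]],
    ]
)

# Average over the two pixel axes (rows, columns); the channel axis is kept.
mean_color = tiny.mean(axis=(0, 1))

# Each of R, G and B averages to 0.5 here, and alpha stays at 1.
print(mean_color)
```

<p>The same <code>mean(axis=(0, 1))</code> call is what gets applied to every emoji in Step I.2.</p>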
<p>So now our steps become somewhat like this:</p>
<p><strong>Part I</strong> - Get emoji matches</p>
<ol>
<li>Find a bunch of emojis.</li>
<li>Find the mean of the emojis.</li>
<li>For each pixel in the image, find the emoji closest to it (wrt color), and replace pixel with that emoji (say, E).</li>
</ol>
<p><strong>Part II</strong> - Reshape emojis to image</p>
<ol>
<li>Reshape the flattened array of all Es back to the shape of our image.</li>
<li>Concatenate all emojis into a single array (reduce dimensions).</li>
</ol>
<p>That&rsquo;s pretty much it!</p>
<h3 id="step-i1---our-emoji-bank">Step I.1 - Our Emoji bank<a class="headerlink" href="#step-i1---our-emoji-bank" title="Link to this heading">#</a></h3>
<p>I took care of this for you beforehand with a bit of BeautifulSoup and requests magic. Our emoji collection is a numpy array of shape <code>1506, 16, 16, 4</code> - that&rsquo;s 1506 emojis with each being a 16x16 array of RGBA values. You can find it <a href="https://github.com/sharmaabhishekk/emoji-mosaic-mpl">here</a>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">emoji_array</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="s2">&#34;emojis_16.npy&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">emoji_array</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1">## 1506, 16, 16, 4</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">##plt.imshow(emoji_array[0]) ##to view the first emoji</span></span></span></code></pre>
</div>
<h3 id="step-i2---calculate-the-mean-rgba-value-of-all-emojis">Step I.2 - Calculate the mean RGBA value of all emojis.<a class="headerlink" href="#step-i2---calculate-the-mean-rgba-value-of-all-emojis" title="Link to this heading">#</a></h3>
<p>We&rsquo;ve seen the formula above; here&rsquo;s the numpy code for it. We&rsquo;re gonna iterate over all the 1506 emojis and create an array <code>emoji_mean_array</code> out of them.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">emoji_mean_array</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">ar</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="k">for</span> <span class="n">ar</span> <span class="ow">in</span> <span class="n">emoji_array</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>  <span class="c1">##`np.median(ar, axis=(0,1))` for median instead of mean</span></span></span></code></pre>
</div>
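<p>As a side note, the list comprehension isn&rsquo;t strictly needed: numpy can average over the height and width axes of every emoji in one call. Here&rsquo;s a small sketch, using a random stand-in array (<code>fake_emojis</code> is an assumption, not the real emoji bank), that checks both versions agree.</p>

```python
import numpy as np

# Stand-in for the real emoji bank: 5 random "emojis", each 16x16 RGBA.
rng = np.random.default_rng(0)
fake_emojis = rng.random((5, 16, 16, 4))

# Loop version, as in the post.
looped = np.array([ar.mean(axis=(0, 1)) for ar in fake_emojis])

# Single call: average over the height and width axes of every emoji at once.
batched = fake_emojis.mean(axis=(1, 2))

print(batched.shape)
print(np.allclose(looped, batched))
```

<p>For 1506 small emojis the difference in speed hardly matters, so pick whichever reads better to you.</p>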
<h3 id="step-i3---finding-closest-emoji-match-for-all-pixels">Step I.3 - finding closest emoji match for all pixels<a class="headerlink" href="#step-i3---finding-closest-emoji-match-for-all-pixels" title="Link to this heading">#</a></h3>
<p>The easiest way to do that would be to use SciPy&rsquo;s <strong><code>KDTree</code></strong> to create a <code>tree</code> object of all the average RGBA values we calculated in Step I.2. This enables us to perform a fast lookup for every pixel using the <code>query</code> method. Here&rsquo;s how the code for that looks -</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">tree</span> <span class="o">=</span> <span class="n">spatial</span><span class="o">.</span><span class="n">KDTree</span><span class="p">(</span><span class="n">emoji_mean_array</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">indices</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="n">flattened_img</span> <span class="o">=</span> <span class="n">img</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">img</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>  <span class="c1">##shape = (1024, 4)</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">pixel</span> <span class="ow">in</span> <span class="n">tqdm</span><span class="p">(</span><span class="n">flattened_img</span><span class="p">,</span> <span class="n">desc</span><span class="o">=</span><span class="s2">&#34;Matching emojis&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">_</span><span class="p">,</span> <span class="n">index</span> <span class="o">=</span> <span class="n">tree</span><span class="o">.</span><span class="n">query</span><span class="p">(</span><span class="n">pixel</span><span class="p">)</span>  <span class="c1">##returns distance and index of closest match.</span>
</span></span><span class="line"><span class="cl">    <span class="n">indices</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">index</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">emoji_matches</span> <span class="o">=</span> <span class="n">emoji_array</span><span class="p">[</span><span class="n">indices</span><span class="p">]</span>  <span class="c1">##our emoji_matches</span></span></span></code></pre>
</div>
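<p>The <code>query</code> method also accepts a whole array of points, so the per-pixel loop can be collapsed into one call. The sketch below uses random stand-ins (<code>fake_means</code> and <code>fake_pixels</code> are assumptions in place of <code>emoji_mean_array</code> and <code>flattened_img</code>) and cross-checks the result against a brute-force distance computation.</p>

```python
import numpy as np
from scipy import spatial

rng = np.random.default_rng(0)
fake_means = rng.random((50, 4))      # stand-in for emoji_mean_array
fake_pixels = rng.random((8 * 8, 4))  # stand-in for flattened_img

tree = spatial.KDTree(fake_means)

# One query per pixel, as in the loop above.
looped = np.array([tree.query(pixel)[1] for pixel in fake_pixels])

# One query for all pixels: returns an array of distances and an array of indices.
_, batched = tree.query(fake_pixels)

# Brute-force check: Euclidean distance from every pixel to every mean color.
dists = np.linalg.norm(fake_pixels[:, None, :] - fake_means[None, :, :], axis=-1)
brute = dists.argmin(axis=1)

print(np.array_equal(looped, batched), np.array_equal(batched, brute))
```

<p>The batched call drops the <code>tqdm</code> progress bar, which is a fair trade if the lookup finishes quickly anyway.</p>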
<h3 id="step-ii1">Step II.1<a class="headerlink" href="#step-ii1" title="Link to this heading">#</a></h3>
<p>The final step is to reshape the array a little more to enable us to plot it using the imshow function. As you can see above, to loop over the pixels we had to flatten the image out into the <code>flattened_img</code>. Now we have to sort of un-flatten it back; to make sure it&rsquo;s back in the form of an image. Fortunately, using numpy&rsquo;s <code>reshape</code> function makes this easy.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">resized_ar</span> <span class="o">=</span> <span class="n">emoji_matches</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">dim</span><span class="p">,</span> <span class="n">dim</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>  <span class="c1">##dim is what we got earlier when we read in the image</span></span></span></code></pre>
</div>
<h3 id="step-ii2">Step II.2<a class="headerlink" href="#step-ii2" title="Link to this heading">#</a></h3>
<p>The last bit is the trickiest. The problem with the output we&rsquo;ve got so far is that it&rsquo;s too nested. Or in simpler terms, what we have is an image where every individual pixel is itself an image. That&rsquo;s all fine but it&rsquo;s not valid input for imshow and if we try to pass it in, it tells us exactly that.</p>

<div class="highlight">
  <pre>TypeError: Invalid shape (32, 32, 16, 16, 4) for image data</pre>
</div>

<p>To grasp our problem intuitively, think about it this way. What we have right now are lots of images like these:</p>
<p><img src="/matplotlib/emoji-mosaic-art/chopped_face.png" alt="“Chopped raccoon img”" title="Image from Scipy under BSD License"></p>
<p>What we want is to merge them all together. Like so:</p>
<p><img src="/matplotlib/emoji-mosaic-art/rejoined_face.png" alt="“Rejoined raccoon img”"></p>
<p>To think about it slightly more technically, what we have right now is a <em>five</em> dimensional array. What we need is to reshape it in such a way that it&rsquo;s - at maximum - <em>three</em> dimensional. However, it&rsquo;s not as easy as a simple <code>np.reshape</code> (I&rsquo;d suggest you go ahead and try that anyway).</p>
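<p>If you&rsquo;d rather see the failure on something tiny first, here&rsquo;s a sketch with a made-up 2x2 grid of 2x2 single-channel tiles (the <code>tiles</code> array is illustrative only). A plain reshape keeps the flat memory order, so every tile gets smeared across a row instead of staying a square block.</p>

```python
import numpy as np

# A 2x2 grid of 2x2 single-channel tiles; tiles[i, j] is the tile in grid row i, column j.
tiles = np.arange(16).reshape(2, 2, 2, 2)
print(tiles[0, 0])  # the top-left tile

# Naive attempt: ask for the final 4x4 shape directly.
naive = tiles.reshape(4, 4)
print(naive)

# The top-left 2x2 corner of the result should be the top-left tile, but it is not.
print(np.array_equal(naive[:2, :2], tiles[0, 0]))
```

<p>The first row of <code>naive</code> is the whole first tile flattened out, which is exactly the scrambling we need to avoid.</p>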
<p>Don&rsquo;t worry though, we have Stack Overflow to the rescue! This excellent <a href="https://stackoverflow.com/questions/52730668/concatenating-multiple-images-into-one/52733370#52733370">answer</a> does exactly that. You don&rsquo;t have to go through it, I have copied the relevant code in here.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">np_block_2D</span><span class="p">(</span><span class="n">chops</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Converts list of chopped images to one single image&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">block</span><span class="p">([[[</span><span class="n">x</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">row</span><span class="p">]</span> <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">chops</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">final_img</span> <span class="o">=</span> <span class="n">np_block_2D</span><span class="p">(</span><span class="n">resized_ar</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">final_img</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1">## (512, 512, 4)</span></span></span></code></pre>
</div>
<p>The shape looks correct enough. Let&rsquo;s try to plot it.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">final_img</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/emoji-mosaic-art/final_image.png" alt="“Emoji Mosaic final_img”"></p>
<p><strong>Et Voilà</strong></p>
<p>Of course, the result looks a little <em>meh</em> but that&rsquo;s because we only used 32x32 emojis. Here&rsquo;s what the same code would do with 10000 emojis (100x100).</p>
<p><img src="/matplotlib/emoji-mosaic-art/final_image_100.png" alt="“Emoji Mosaic full_size”"></p>
<p>Better?</p>
<p>Now, let&rsquo;s try and create nine of these emoji-images and grid them together.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">canvas</span><span class="p">(</span><span class="n">gray_scale_img</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Plot a 3x3 matrix of the images using different colormaps
</span></span></span><span class="line"><span class="cl"><span class="s2">        param gray_scale_img: a square gray_scale_image
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="p">,</span> <span class="n">axes</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">nrows</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">ncols</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">13</span><span class="p">,</span> <span class="mi">8</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">axes</span> <span class="o">=</span> <span class="n">axes</span><span class="o">.</span><span class="n">flatten</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">cmaps</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;BuPu_r&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;bone&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;CMRmap&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;magma&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;afmhot&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;ocean&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;inferno&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;PuRd_r&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;gist_gray&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">cmap</span><span class="p">,</span> <span class="n">ax</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">cmaps</span><span class="p">,</span> <span class="n">axes</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">cmapper</span> <span class="o">=</span> <span class="n">cm</span><span class="o">.</span><span class="n">get_cmap</span><span class="p">(</span><span class="n">cmap</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">rgba_image</span> <span class="o">=</span> <span class="n">cmapper</span><span class="p">(</span><span class="n">gray_scale_img</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">single_plot</span><span class="p">(</span><span class="n">rgba_image</span><span class="p">,</span> <span class="n">ax</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># ax.imshow(rgba_image) ##try this if you just want to plot the plain image in different color spaces, comment the single_plot call above</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">set_axis_off</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">plt</span><span class="o">.</span><span class="n">subplots_adjust</span><span class="p">(</span><span class="n">hspace</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">wspace</span><span class="o">=-</span><span class="mf">0.2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">fig</span><span class="p">,</span> <span class="n">axes</span></span></span></code></pre>
</div>
<p>The code does mostly the same stuff as before. To get the different colours, I used a simple hack. I first converted the image to grayscale and then used 9 different colormaps on it. Then I used the RGBA values returned by each colormap as the pixel values of our new input image. After that, the only part left is to feed the new input image through the pipeline we&rsquo;ve discussed so far (that&rsquo;s the job of the <code>single_plot</code> call, whose definition isn&rsquo;t shown in this post) and that gives us our emoji-image.</p>
<p>Here&rsquo;s what that looks like:</p>
<p><img src="/matplotlib/emoji-mosaic-art/final_3x3_tile.png" alt="“Emoji Mosaic 3x3_grid”"></p>
<p><em>Pretty</em></p>
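<p>The grayscale-then-colormap trick described above can be sketched roughly like this. Two assumptions to flag: the input here is a random stand-in image (<code>rgba</code>), and the grayscale conversion is a plain average of the R, G and B channels, which may not be what the repository does.</p>

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the input photo: a random 32x32 RGBA image with floats in [0, 1].
rng = np.random.default_rng(0)
rgba = rng.random((32, 32, 4))

# Assumed grayscale conversion: a plain average of the R, G and B channels.
gray = rgba[..., :3].mean(axis=-1)  # shape (32, 32), values in [0, 1]

# Calling a colormap on a float array in [0, 1] returns RGBA floats per pixel.
cmapper = plt.get_cmap("magma")
recolored = cmapper(gray)

print(gray.shape, recolored.shape)
```

<p>The result is a float RGBA array in the 0-1 range; whether it needs rescaling before the emoji matching depends on how your emoji bank stores its colors.</p>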
<h2 id="conclusion">Conclusion<a class="headerlink" href="#conclusion" title="Link to this heading">#</a></h2>
<p>Some final thoughts to wrap this up.</p>
<ul>
<li>
<p>I&rsquo;m not sure if my way to get different colours using different cmaps is what people usually do. I&rsquo;m almost certain there&rsquo;s a better way and if you know one, please submit a PR to the repo (link below).</p>
</li>
<li>
<p>Iterating over every pixel is not really the best idea. We got away with it since it&rsquo;s just 1024 (32x32) pixels but for images with higher resolution, we&rsquo;d have to either iterate over grids of images at once (say a 3x3 or 2x2 window) or resize the image itself to a more workable shape. I prefer the latter since that way we can also just resize it to a square shape in the same call which also has the additional advantage of fitting in nicely in our 3x3 mosaic. I&rsquo;ll leave the readers to work that out themselves using numpy (and, no, please don&rsquo;t use <code>cv2.resize</code>).</p>
</li>
<li>
<p>The <code>KDTree</code> was not part of my initial code. Initially, I&rsquo;d just looped over every emoji for every pixel and then calculated the Euclidean distance (using <code>np.linalg.norm(a-b)</code>). As you can probably imagine, the nested loop in there slowed down the code tremendously - even a 32x32 emoji-image took around 10 minutes to run. Right now, the same image takes ~19 seconds. Guess that&rsquo;s the power of a proper nearest-neighbor lookup structure for you all.</p>
</li>
<li>
<p>It&rsquo;s worth messing around with median instead of mean to get the RGBA values of the emojis. Most emojis are circular in shape and hence there&rsquo;s a lot of space left outside the area of the circular region which sort of waters down the average color in turn watering down the end result. Considering the median might sort out this problem for some images which aren&rsquo;t very rich.</p>
</li>
<li>
<p>While I&rsquo;ve tried to go in a linear manner with (what I hope was) a good mix of explanation and code, I&rsquo;d strongly suggest looking at the full code in the repository <a href="https://github.com/sharmaabhishekk/emoji-mosaic-mpl">here</a> in case you feel like I sprung anything on you.</p>
</li>
</ul>
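<p>To illustrate the mean-versus-median point from the list above, here&rsquo;s a toy sketch. The &ldquo;emoji&rdquo; is an assumption: a flat yellow disc on a fully transparent, all-zero background, which is much simpler than a real emoji with shading and outlines.</p>

```python
import numpy as np

# Toy "emoji": a solid yellow disc on a fully transparent, all-zero background.
size = 16
yy, xx = np.mgrid[:size, :size]

toy = np.zeros((size, size, 4))
toy[...] = [1.0, 0.8, 0.0, 1.0]                # yellow, fully opaque
outside = np.hypot(xx - 7.5, yy - 7.5) > 7.5   # pixels outside the disc
toy[outside] = 0.0                             # transparent black corners

mean_color = toy.mean(axis=(0, 1))
median_color = np.median(toy, axis=(0, 1))

print(mean_color)    # pulled towards zero by the empty corners
print(median_color)  # stays at the disc color, since the disc covers most pixels
```

<p>The median only helps like this when the dominant color covers more than half of the pixels, so treat it as something to try rather than a guaranteed fix.</p>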
<hr>
<p>I hope you enjoyed this post and learned something from it. If you have any feedback, criticism, questions, please feel free to DM me on <a href="https://twitter.com/abhisheksh_98">Twitter</a> or email me (preferably the former since I&rsquo;m almost always on there). Thank you, and take care!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="art" label="art" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Draw all graphs of N nodes]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib Cyberpunk Style" />
                <link href="https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib for Making Diagrams" />
                <link href="https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/?utm_source=atom_feed" rel="related" type="text/html" title="Create Ridgeplots in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/?utm_source=atom_feed" rel="related" type="text/html" title="Create a Tesla Cybertruck That Drives" />
                <link href="https://blog.scientific-python.org/matplotlib/an-inquiry-into-matplotlib-figures/?utm_source=atom_feed" rel="related" type="text/html" title="An Inquiry Into Matplotlib&#39;s Figures" />
            
                <id>https://blog.scientific-python.org/matplotlib/draw-all-graphs-of-n-nodes/</id>
            
            
            <published>2020-05-07T09:05:32+01:00</published>
            <updated>2020-05-07T09:05:32+01:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A fun project about drawing all possible differently-looking (not isomorphic) graphs of N nodes.</blockquote><p>The other day I was homeschooling my kids, and they asked me: &ldquo;Daddy, can you draw us all possible non-isomorphic graphs of 3 nodes&rdquo;? Or maybe I asked them that? Either way, we happily drew all possible graphs of 3 nodes, but already for 4 nodes it got hard, and for 5 nodes - <a href="https://www.graphclasses.org/smallgraphs.html#nodes5">plain impossible</a>!</p>
<p>So I thought: let me try to write a brute-force program to do it! I spent a few hours sketching some smart dynamic programming solution to generate these graphs, and went nowhere, as apparently the <a href="http://www.cs.columbia.edu/~cs4205/files/CM9.pdf">problem is quite hard</a>. I gave up, and decided to go with a naive approach:</p>
<ol>
<li>Generate all graphs of N nodes, even if some of them look the same (are isomorphic). For \(N\) nodes, there are \(\frac{N(N-1)}{2}\) potential edges to connect these nodes, so it&rsquo;s like generating a bunch of binary numbers. Simple!</li>
<li>Write a program to tell if two graphs are isomorphic, then remove all duplicates, unworthy of being presented in the final picture.</li>
</ol>
<p>This strategy seemed more reasonable, but writing a &ldquo;graph-comparator&rdquo; still felt like a cumbersome task, and more importantly, this part would itself be slow, as I&rsquo;d still have to go through a whole tree of options for every graph comparison. So after some more head-scratching, I decided to simplify it even further, and use the fact that these days the memory is cheap:</p>
<ol>
<li>Generate all possible graphs (some of them isomorphic to each other, meaning that they would look like repeats if plotted on a figure).</li>
<li>For each graph, generate its &ldquo;description&rdquo; (like an <a href="https://en.wikipedia.org/wiki/Adjacency_matrix">adjacency matrix</a>, or an edge list), and check if a graph with this description is already in a hash table of descriptions we have seen. If yes, skip it, we got its portrait already!</li>
<li>If, however, the graph is unique, include it in the picture, and also generate all possible &ldquo;descriptions&rdquo; of it, up to node permutation, and add them to the hash table, to make sure no other graph of this particular shape would ever be included in our pretty picture again.</li>
</ol>
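<p>As a rough illustration of the three steps above, here is a minimal sketch built on the standard library rather than on the functions developed in this post. The names <code>relabelings</code> and <code>unique_graphs</code> are made up for this example, and, unlike the final program, it keeps disconnected graphs and isolated nodes:</p>

```python
from itertools import combinations, permutations


def relabelings(edges, n):
    """All sorted edge-tuple descriptions of one graph under node renumbering."""
    out = set()
    for p in permutations(range(n)):
        out.add(tuple(sorted(tuple(sorted((p[i], p[j]))) for i, j in edges)))
    return out


def unique_graphs(n):
    """Keep one representative per shape, remembering every relabeling in a set."""
    all_edges = list(combinations(range(n), 2))  # every potential edge, i before j
    seen = set()   # hash table of descriptions already covered (hypothetical name)
    kept = []
    for size in range(len(all_edges) + 1):
        for edges in combinations(all_edges, size):  # step 1: every edge subset
            if edges in seen:                        # step 2: known shape, skip it
                continue
            kept.append(edges)                       # step 3: new shape, remember
            seen |= relabelings(edges, n)            # all of its relabelings
    return kept
```

<p>For \(n=3\) this keeps four shapes: the empty graph, a single edge, a path, and a triangle.</p>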
<p>For the first task, I went with the edge list, which made the task identical to <a href="https://www.geeksforgeeks.org/generate-all-the-binary-strings-of-n-bits/">generating all binary numbers</a> of length \(\frac{N(N-1)}{2}\) with a recursive function, except instead of writing zeroes you skip edges, and instead of writing ones, you include them. Below is the function that does the trick, and has an additional bonus of listing all edges in a neat orderly way. For every edge \(i \rightarrow j\) we can be sure that \(i\) is lower than \(j\), and also that edges are sorted as words in a dictionary. Which is good, as it restricts the set of possible descriptions a bit, which will simplify our life later.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">make_graphs</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">i</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Make a graph recursively, by either including, or skipping each edge.
</span></span></span><span class="line"><span class="cl"><span class="s2">    Edges are given in lexicographical order by construction.&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">out</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">i</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>  <span class="c1"># First call</span>
</span></span><span class="line"><span class="cl">        <span class="n">out</span> <span class="o">=</span> <span class="p">[[(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)]</span> <span class="o">+</span> <span class="n">r</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">make_graphs</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="n">n</span><span class="p">,</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">    <span class="k">elif</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">out</span> <span class="o">+=</span> <span class="p">[[(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)]</span> <span class="o">+</span> <span class="n">r</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">make_graphs</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="n">n</span><span class="p">,</span> <span class="n">i</span><span class="o">=</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="n">j</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">        <span class="n">out</span> <span class="o">+=</span> <span class="p">[</span><span class="n">r</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">make_graphs</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="n">n</span><span class="p">,</span> <span class="n">i</span><span class="o">=</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="n">j</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">    <span class="k">elif</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">out</span> <span class="o">=</span> <span class="n">make_graphs</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="n">n</span><span class="p">,</span> <span class="n">i</span><span class="o">=</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">out</span> <span class="o">=</span> <span class="p">[[]]</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">out</span></span></span></code></pre>
</div>
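<p>If you run this function for a small number of nodes (say, \(N=3\)), you can see how it generates all possible graph topologies, but that some of the descriptions would actually lead to identical pictures, if drawn (graphs 2 and 3 in the list below). Note that the first call always includes the edge <code>(0, 1)</code>; nothing is lost by that, as any graph with at least one edge can be renumbered to contain it.</p>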

<div class="highlight">
  <pre>[(0, 1), (0, 2), (1, 2)]
[(0, 1), (0, 2)]
[(0, 1), (1, 2)]
[(0, 1)]</pre>
</div>
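<p>The binary-number analogy can also be made literal. The following is only a sketch of that idea, not the recursive function from this post: <code>graphs_from_bitmasks</code> is a hypothetical name, and it does not force the edge <code>(0, 1)</code>, so it yields every one of the \(2^{\frac{N(N-1)}{2}}\) edge subsets, the empty one included.</p>

```python
from itertools import combinations


def graphs_from_bitmasks(n):
    """Yield every edge list on n nodes, one per integer bitmask."""
    edges = list(combinations(range(n), 2))  # potential edges in dictionary order
    for mask in range(2 ** len(edges)):
        # Bit k of the mask decides whether potential edge number k is included.
        yield [e for k, e in enumerate(edges) if (mask >> k) % 2 == 1]
```

<p>Each integer plays the role of one binary number: a one in position \(k\) includes edge number \(k\), a zero skips it.</p>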

<p>Also, while building a graph from edges means that we&rsquo;ll never get lonely unconnected points, we can get graphs that are smaller than \(n\) nodes (the last graph in the list above), or graphs that have unconnected parts. The latter is impossible for \(n=3\), but starting with \(n=4\) we would get things like <code>[(0,1), (2,3)]</code>, which is technically a graph, but you cannot exactly wear it as a piece of jewelry, as it would fall apart. So at this point I decided to only visualize connected graphs (those that come in one piece) of exactly \(n\) vertices.</p>
<p>To continue with the plan, we now need to make a function that for every graph would generate a family of its &ldquo;alternative representations&rdquo; (given the constraints of our generator), to make sure duplicates would not slip under the radar. First we need a permutation function, to permute the nodes (you could also use the built-in <code>itertools.permutations</code>, but coding this one from scratch is always fun, isn&rsquo;t it?). Here&rsquo;s the permutation generator:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">perm</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;All permutations of n elements.&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">s</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">perm</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="nb">tuple</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="ow">not</span> <span class="n">s</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="p">[[]]</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">[[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">p</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">s</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">perm</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="nb">tuple</span><span class="p">([</span><span class="n">k</span> <span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="n">s</span> <span class="k">if</span> <span class="n">k</span> <span class="o">!=</span> <span class="n">i</span><span class="p">]))]</span></span></span></code></pre>
</div>
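<p>For comparison, the standard library offers the same thing ready-made in <code>itertools.permutations</code>. A small sketch, where <code>perm_stdlib</code> is a made-up wrapper name that returns lists to match the output format of <code>perm</code> above:</p>

```python
from itertools import permutations


def perm_stdlib(n):
    """All permutations of range(n) as lists, in lexicographic order."""
    return [list(p) for p in permutations(range(n))]
```

<p>Since the input <code>range(n)</code> is already sorted, the permutations come out in dictionary order.</p>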
<p>Now, for any given graph description, we can permute its nodes, sort the \(i,j\) within each edge, sort the edges themselves, remove duplicate alt-descriptions, and remember the list of potential impostors:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">permute</span><span class="p">(</span><span class="n">g</span><span class="p">,</span> <span class="n">n</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Create a set of all possible isomorphic codes for a graph,
</span></span></span><span class="line"><span class="cl"><span class="s2">    as nice hashable tuples. All edges are i&lt;j, and sorted lexicographically.&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">ps</span> <span class="o">=</span> <span class="n">perm</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">out</span> <span class="o">=</span> <span class="nb">set</span><span class="p">([])</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">ps</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">out</span><span class="o">.</span><span class="n">add</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="nb">tuple</span><span class="p">(</span><span class="nb">sorted</span><span class="p">([(</span><span class="n">p</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">p</span><span class="p">[</span><span class="n">j</span><span class="p">])</span> <span class="k">if</span> <span class="n">p</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">&lt;</span> <span class="n">p</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="k">else</span> <span class="p">(</span><span class="n">p</span><span class="p">[</span><span class="n">j</span><span class="p">],</span> <span class="n">p</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">g</span><span class="p">]))</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">list</span><span class="p">(</span><span class="n">out</span><span class="p">)</span></span></span></code></pre>
</div>
<p>Say, for an input description of <code>[(0, 1), (0, 2)]</code>, the function above returns three &ldquo;synonyms&rdquo;:</p>

<div class="highlight">
  <pre>((0, 1), (1, 2))
((0, 1), (0, 2))
((0, 2), (1, 2))</pre>
</div>

<p>I suspect there should be a neater way to code that, to avoid using the <code>list → set → list</code> pipeline to get rid of duplicates, but hey, it works!</p>
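<p>One possible alternative (a sketch, not the notebook&rsquo;s code): instead of remembering every relabeling, reduce each graph to a single canonical description, here the smallest one in sorted order, and keep only that in the set. The helper name <code>canonical</code> is hypothetical.</p>

```python
from itertools import permutations


def canonical(g, n):
    """Smallest sorted edge tuple over all renumberings of the n nodes."""
    return min(
        tuple(sorted(tuple(sorted((p[i], p[j]))) for i, j in g))
        for p in permutations(range(n))
    )


# Two descriptions of the same 3-node path collapse to one key.
assert canonical([(0, 1), (0, 2)], 3) == canonical([(0, 1), (1, 2)], 3)
```

<p>This stores one key per shape instead of up to \(n!\) of them, but it pays for all \(n!\) permutations on every candidate graph, so it trades memory for time.</p>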
<p>At this point, the only thing that&rsquo;s missing is the function to check whether the graph comes in one piece, which can be done with a famous and neat algorithm called &ldquo;<a href="https://en.wikipedia.org/wiki/Disjoint-set_data_structure">Union-Find</a>&rdquo;. I won&rsquo;t describe it here in detail, but in short, it goes through all edges and connects nodes to each other in a special way; then counts how many separate connected components (like, chunks of the graph) remain in the end. If all nodes are in one chunk, we like it. If not, I don&rsquo;t want to see it in my pictures!</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">connected</span><span class="p">(</span><span class="n">g</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Check if the graph is fully connected, with Union-Find.&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">nodes</span> <span class="o">=</span> <span class="nb">set</span><span class="p">([</span><span class="n">i</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="n">g</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">e</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">    <span class="n">roots</span> <span class="o">=</span> <span class="p">{</span><span class="n">node</span><span class="p">:</span> <span class="n">node</span> <span class="k">for</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">nodes</span><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">def</span> <span class="nf">_root</span><span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="n">depth</span><span class="o">=</span><span class="mi">0</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">node</span> <span class="o">==</span> <span class="n">roots</span><span class="p">[</span><span class="n">node</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="n">depth</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="n">_root</span><span class="p">(</span><span class="n">roots</span><span class="p">[</span><span class="n">node</span><span class="p">],</span> <span class="n">depth</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">g</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">ri</span><span class="p">,</span> <span class="n">di</span> <span class="o">=</span> <span class="n">_root</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">rj</span><span class="p">,</span> <span class="n">dj</span> <span class="o">=</span> <span class="n">_root</span><span class="p">(</span><span class="n">j</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">ri</span> <span class="o">==</span> <span class="n">rj</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">di</span> <span class="o">&lt;=</span> <span class="n">dj</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">roots</span><span class="p">[</span><span class="n">ri</span><span class="p">]</span> <span class="o">=</span> <span class="n">rj</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">roots</span><span class="p">[</span><span class="n">rj</span><span class="p">]</span> <span class="o">=</span> <span class="n">ri</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">len</span><span class="p">(</span><span class="nb">set</span><span class="p">([</span><span class="n">_root</span><span class="p">(</span><span class="n">node</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">nodes</span><span class="p">]))</span> <span class="o">==</span> <span class="mi">1</span></span></span></code></pre>
</div>
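<p>For a quick sanity check of the idea, here is a compact, self-contained Union-Find variant. It is not the function above: it uses path halving instead of depth tracking, and the name <code>is_connected</code> is an assumption of this sketch.</p>

```python
def is_connected(edges):
    """True if all nodes mentioned in the edge list end up in one component."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)            # a new node starts as its own root
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving: skip one level up
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)          # union: hang one root under the other

    return len({find(x) for x in list(parent)}) == 1


print(is_connected([(0, 1), (1, 2)]))  # a 3-node path
print(is_connected([(0, 1), (2, 3)]))  # two separate edges
```

<p>The first call reports a connected graph, and the second one catches the &ldquo;falls apart&rdquo; example <code>[(0,1), (2,3)]</code> mentioned earlier.</p>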
<p>Now we can finally generate the &ldquo;overkill&rdquo; list of graphs, filter it, and plot the pics:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">filter</span><span class="p">(</span><span class="n">gs</span><span class="p">,</span> <span class="n">target_nv</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Filter all improper graphs: those with not enough nodes,
</span></span></span><span class="line"><span class="cl"><span class="s2">    those not fully connected, and those isomorphic to previously considered.&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">mem</span> <span class="o">=</span> <span class="nb">set</span><span class="p">({})</span>
</span></span><span class="line"><span class="cl">    <span class="n">gs2</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">g</span> <span class="ow">in</span> <span class="n">gs</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">nv</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="nb">set</span><span class="p">([</span><span class="n">i</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="n">g</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">e</span><span class="p">]))</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">nv</span> <span class="o">!=</span> <span class="n">target_nv</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="ow">not</span> <span class="n">connected</span><span class="p">(</span><span class="n">g</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nb">tuple</span><span class="p">(</span><span class="n">g</span><span class="p">)</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">mem</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">gs2</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">g</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="n">mem</span> <span class="o">|=</span> <span class="nb">set</span><span class="p">(</span><span class="n">permute</span><span class="p">(</span><span class="n">g</span><span class="p">,</span> <span class="n">target_nv</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">gs2</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Main body</span>
</span></span><span class="line"><span class="cl"><span class="n">NV</span> <span class="o">=</span> <span class="mi">6</span>
</span></span><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="n">make_graphs</span><span class="p">(</span><span class="n">NV</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="nb">filter</span><span class="p">(</span><span class="n">gs</span><span class="p">,</span> <span class="n">NV</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plot_graphs</span><span class="p">(</span><span class="n">gs</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="mi">14</span><span class="p">,</span> <span class="n">dotsize</span><span class="o">=</span><span class="mi">20</span><span class="p">)</span></span></span></code></pre>
</div>
<p>For plotting the graphs I wrote a small wrapper for the Matplotlib-based NetworkX visualizer, splitting the figure into lots of tiny little facets using the Matplotlib <code>subplot</code> command. The &ldquo;Kamada-Kawai&rdquo; layout below is a <a href="https://en.wikipedia.org/wiki/Force-directed_graph_drawing">popular and fast version of a spring-based layout</a> that makes the graphs look really nice.</p>


<div class="highlight">
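  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">plot_graphs</span><span class="p">(</span><span class="n">graphs</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="mi">14</span><span class="p">,</span> <span class="n">dotsize</span><span class="o">=</span><span class="mi">20</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Utility to plot a lot of graphs from an array of graphs.
</span></span></span><span class="line"><span class="cl"><span class="s2">    Each graph is a list of edges; each edge is a tuple.&#34;&#34;&#34;</span>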
</span></span><span class="line"><span class="cl">    <span class="n">n</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">graphs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="n">figsize</span><span class="p">,</span> <span class="n">figsize</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="o">.</span><span class="n">patch</span><span class="o">.</span><span class="n">set_facecolor</span><span class="p">(</span><span class="s2">&#34;white&#34;</span><span class="p">)</span>  <span class="c1"># To make copying possible (white background)</span>
</span></span><span class="line"><span class="cl">    <span class="n">k</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">plt</span><span class="o">.</span><span class="n">subplot</span><span class="p">(</span><span class="n">k</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">k</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">g</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">Graph</span><span class="p">()</span>  <span class="c1"># Generate a Networkx object</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="n">graphs</span><span class="p">[</span><span class="n">i</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">            <span class="n">g</span><span class="o">.</span><span class="n">add_edge</span><span class="p">(</span><span class="n">e</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">e</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">        <span class="n">nx</span><span class="o">.</span><span class="n">draw_kamada_kawai</span><span class="p">(</span><span class="n">g</span><span class="p">,</span> <span class="n">node_size</span><span class="o">=</span><span class="n">dotsize</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="nb">print</span><span class="p">(</span><span class="s2">&#34;.&#34;</span><span class="p">,</span> <span class="n">end</span><span class="o">=</span><span class="s2">&#34;&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
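<p>A side note on the grid: the wrapper above always uses a \((k+1) \times (k+1)\) grid, which leaves a spare row and column when the number of graphs is a perfect square. One way to size the grid tightly is sketched below. The function name <code>grid_side</code> is hypothetical, and <code>math.isqrt</code> assumes Python 3.8 or newer.</p>

```python
from math import isqrt


def grid_side(n_plots):
    """Smallest side of a square grid with room for n_plots panels."""
    side = isqrt(n_plots)          # integer square root, rounded down
    if side * side != n_plots:     # not a perfect square: one more row and column
        side += 1
    return side
```

<p>The result could then be passed as both the row and the column count to <code>plt.subplot</code>.</p>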
<p>Here are the results. To build the anticipation, let&rsquo;s start with something trivial: all graphs of 3 nodes:</p>
<p><img src="/matplotlib/draw-all-graphs-of-n-nodes/3nodes.png" alt="Two non isomorphic graphs with 3 nodes, the first graph connects all 3 nodes and creates a triangle. The second graph is a path graph with 3 nodes connected as a single path."></p>
<p>All graphs of 4 nodes:</p>
<p><img src="/matplotlib/draw-all-graphs-of-n-nodes/4nodes.png" alt="All six possible non isomorphic graphs with 4 nodes. The first graph is a complete graph with all 4 nodes connected to each other. The second one is a complete graph with one edge removed. The third graph is a triangle graph with one node attached with one of the nodes in the graph. The fourth graph is a star graph, with one central node connected to the other 3 nodes. The fifth one is a graph where the edges form a square. The sixth one is a path graph which connects all 4 nodes as a single path."></p>
<p>All graphs of 5 nodes:</p>
<p><img src="/matplotlib/draw-all-graphs-of-n-nodes/5nodes.png" alt="All 21 possibilities of non isomorphic graphs with 5 nodes. The different graphs show multiple possible structures from a complete graph of 5 nodes to a path graph of 5 nodes. Other structures present in this collection of graphs show a pentagon shaped graph, a star graph and others."></p>
<p>Generating the figures above is of course practically instantaneous on a decent computer, but for 6 nodes (below) it takes a few seconds:</p>
<p><img src="/matplotlib/draw-all-graphs-of-n-nodes/6nodes.png" alt="All 112 possibilities of non isomorphic graphs with 6 nodes. The different graphs show multiple possible structures from a complete graph of 6 nodes to a path graph of 6 nodes. Other structures present in this collection of graphs show a hexagon shaped graph, a star graph, two complete graphs with 4 nodes stacked on top of each other with the two complete 4 graphs sharing an edge and 107 other structures!"></p>
<p>For 7 nodes (below) it takes about 5-10 minutes. It&rsquo;s easy to see why: the brute-force approach generates on the order of \(2^{\frac{n(n-1)}{2}}\) candidate graphs, so the number of operations grows even faster than exponentially in \(n\)! Going from \(n-1\) to \(n\) nodes gives us \(n-1\) new edges to consider, which means that the time to run the program increases by a factor of \(\sim 2^{n-1}\). For \(n=7\) it brought me from seconds to minutes, for \(n=8\) it would have shifted me from minutes to hours, and for \(n=9\), from hours to months of computation. Isn&rsquo;t it fun? We are all specialists in exponential growth these days, so here you are :)</p>
<p><img src="/matplotlib/draw-all-graphs-of-n-nodes/7nodes.png" alt="All 853 possibilities of non isomorphic graphs with 7 nodes. The different graphs show multiple possible structures from a complete graph of 7 nodes to a path graph of 7 nodes. Other structures present in this collection show many star and kite-shaped graphs."></p>
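<p>To put numbers on that growth, here is a small sketch that counts all edge subsets for each \(n\), along with the factor gained by adding one more node. The name <code>candidate_count</code> is made up for this example, and it counts every subset, so it is an upper bound on what the generator in this post actually produces.</p>

```python
def candidate_count(n):
    """Number of edge subsets on n labelled nodes: 2 ** (n * (n - 1) / 2)."""
    return 2 ** (n * (n - 1) // 2)


for n in range(3, 10):
    # The ratio to the previous count is 2 ** (n - 1).
    ratio = candidate_count(n) // candidate_count(n - 1)
    print(n, candidate_count(n), ratio)
```

<p>For \(n=7\) this gives 2,097,152 candidates, and going from 7 to 8 nodes multiplies that by 128.</p>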
<p>The code is available as a <a href="https://github.com/khakhalin/Sketches/blob/master/classic/generate_all_graphs.ipynb">Jupyter Notebook on my GitHub</a>. I hope you enjoyed the pictures, and the read! Which of those charms above would bring the most luck? Which ones seem best for divination? Let me know what you think! :)</p>
<p>Contact me via <a href="https://twitter.com/ampanmdagaba">Twitter</a> or <a href="https://github.com/khakhalin">GitHub</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="graphs" label="graphs" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Sidharth Bansal joined as GSoC'20 intern]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/introductory-gsoc2020-post/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-rsef/?utm_source=atom_feed" rel="related" type="text/html" title="Elliott Sales de Andrade hired as Matplotlib Software Research Engineering Fellow" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib Cyberpunk Style" />
                <link href="https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib for Making Diagrams" />
                <link href="https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/?utm_source=atom_feed" rel="related" type="text/html" title="Create Ridgeplots in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/?utm_source=atom_feed" rel="related" type="text/html" title="Create a Tesla Cybertruck That Drives" />
            
                <id>https://blog.scientific-python.org/matplotlib/introductory-gsoc2020-post/</id>
            
            
            <published>2020-05-06T21:47:36+05:30</published>
            <updated>2020-05-06T21:47:36+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Introductory post about Sidharth Bansal, Google Summer of Code 2020 intern for the Baseline Image Problem project under NumFOCUS</blockquote><p>When I, Sidharth Bansal, heard that I had been selected for Google Summer of Code (GSoC) 2020 with Matplotlib under NumFOCUS, I was jumping and dancing. In this post, I talk about my past experiences, how I got selected for GSoC with Matplotlib, and my project details.
I am grateful to the community :)</p>
<h2 id="about-me">About me:<a class="headerlink" href="#about-me" title="Link to this heading">#</a></h2>
<p>I am currently pursuing a Bachelor of Technology in Software Engineering at Delhi Technological University, Delhi, India. I started my open-source journey with Public Lab, an open-source organization, as a full-stack Ruby on Rails web developer. I first did Google Summer of Code there. I built a Multi-Party Authentication System, which authenticates the user across multiple linked websites such as mapknitter.org and spectralworkbench.org through OmniAuth providers like Facebook, Twitter, Google, and GitHub. I also worked on a Multi-Tag Subscription project there. It lets users subscribe to tags or categories so that they are notified of subsequent posts in the categories they subscribed to earlier. I also mentored there for Google Code-in and GSoC last year, and I worked there as a freelancer.</p>
<p>Apart from this, I also successfully completed an internship in the Google Payments team at Google, India this year as a Software Engineering Intern. I built a PAN Collection Flow there. PAN (taxation number) information is collected from the user if the total amount claimed by the user through Scratch Cards in the current financial year exceeds PAN_LIMIT. I triggered the PAN UI at the time of scratching the reward and enabled Paisa-Offers to uplift their limit to grant Scratch Cards after crossing PAN_LIMIT. I used technologies such as Java, Guice, Android, Spanner Queues, Protocol Buffers, and JUnit.</p>
<p>I also have a keen interest in Machine Learning and Natural Language Processing and have done a couple of projects at my university. I have researched <code>Query Expansion using fuzzy logic</code> and will be publishing the work in some time. It involves the fuzzification of the traditional WordNet for query expansion.</p>
<p>Our paper <code>Experimental Comparison &amp; Scientometric Inspection of Research for Word Embeddings</code> was accepted in ESCI Journal and Springer LNN last week. It explains the ongoing trends in universal embeddings and compares them.</p>
<h2 id="getting-started-with-matplotlib">Getting started with matplotlib<a class="headerlink" href="#getting-started-with-matplotlib" title="Link to this heading">#</a></h2>
<p>I chose matplotlib as it is an organization with so much cool stuff relating to plotting. I have always wanted to work on such things. People are really friendly, always eager to help!</p>
<h2 id="taking-baby-steps">Taking Baby steps:<a class="headerlink" href="#taking-baby-steps" title="Link to this heading">#</a></h2>
<p>The first step is getting involved with the community. I started using the Gitter channel to get to know the maintainers. I started learning the different pieces that tie into the baseline image problem, beginning with the system architecture of matplotlib. Then I installed matplotlib and learned the cool tech stack around it, such as Sphinx, Python, and PyPI.</p>
<h2 id="keep-on-contributing-and-keep-on-learning">Keep on contributing and keep on learning:<a class="headerlink" href="#keep-on-contributing-and-keep-on-learning" title="Link to this heading">#</a></h2>
<p>Learning is a continuous task. Taking guidance from mentors about the various use case scenarios involved in the GSoC project helped me gain a lot of insights. I solved a couple of small issues. I learned about the code-review process followed here, the Sphinx documentation, and how releases work. I opened <a href="https://github.com/matplotlib/matplotlib/pulls?q=is%3Apr&#43;author%3ASidharthBansal&#43;is%3Aclosed">some PRs</a>. It was a great learning experience.</p>
<h2 id="about-the-project">About the Project:<a class="headerlink" href="#about-the-project" title="Link to this heading">#</a></h2>
<p><a href="https://github.com/matplotlib/matplotlib/issues/16447">The project</a> is about generating the baseline images instead of downloading them. The baseline images are problematic because every newly added image makes the repo size grow rather quickly. Also, the baseline images force matplotlib contributors to pin to a somewhat old version of FreeType, because nearly every release of FreeType causes tiny rasterization changes that would entail regenerating all baseline images. That, in turn, would cause even more repository size growth.
The idea is not to store the baseline images in the GitHub repo at all. It involves dividing the matplotlib package into two separate packages: mpl-test and mpl-notest. Mpl-test will have the test suite and related information. The functionality of the mpl plotting library will be present in mpl-notest. We will then create the logic for generating and grabbing the latest release. Some caching will be done too. We will then implement an analogous strategy for the CI.</p>
<p><strong>Mentor</strong> <a href="https://github.com/anntzer">Antony Lee</a></p>
<p>Thanks a lot for reading. I am having a great time coding with great people at Matplotlib. I will be right back with my work progress in subsequent posts.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="GSoC" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Matplotlib Cyberpunk Style]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib for Making Diagrams" />
                <link href="https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/?utm_source=atom_feed" rel="related" type="text/html" title="Create Ridgeplots in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/?utm_source=atom_feed" rel="related" type="text/html" title="Create a Tesla Cybertruck That Drives" />
                <link href="https://blog.scientific-python.org/matplotlib/an-inquiry-into-matplotlib-figures/?utm_source=atom_feed" rel="related" type="text/html" title="An Inquiry Into Matplotlib&#39;s Figures" />
                <link href="https://blog.scientific-python.org/matplotlib/custom-3d-engine/?utm_source=atom_feed" rel="related" type="text/html" title="Custom 3D engine in Matplotlib" />
            
                <id>https://blog.scientific-python.org/matplotlib/matplotlib-cyberpunk-style/</id>
            
            
            <published>2020-03-27T20:26:07+01:00</published>
            <updated>2020-03-27T20:26:07+01:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Futuristic neon glow for your next data visualization</blockquote><p><img src="/matplotlib/matplotlib-cyberpunk-style/figures/5.png" alt="A line graph styled with a dark background and neon glowing lines in the style of Cyberpunk."></p>
<h2 id="1---the-basis">1 - The Basis<a class="headerlink" href="#1---the-basis" title="Link to this heading">#</a></h2>
<p>Let&rsquo;s make up some numbers, put them in a Pandas dataframe and plot them:</p>
<pre><code>import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'A': [1, 3, 9, 5, 2, 1, 1],
                   'B': [4, 5, 5, 7, 9, 8, 6]})

df.plot(marker='o')
plt.show()
</code></pre>
<p><img src="/matplotlib/matplotlib-cyberpunk-style/figures/1.png" alt="A simple chart consisting of two lines, one blue line, and one orange line. The lines are on a white background."></p>
<h2 id="2---the-darkness">2 - The Darkness<a class="headerlink" href="#2---the-darkness" title="Link to this heading">#</a></h2>
<p>Not bad, but somewhat ordinary. Let&rsquo;s customize it by using Seaborn&rsquo;s dark style, as well as changing background and font colors:</p>
<pre><code>plt.style.use(&quot;seaborn-dark&quot;)  # renamed to &quot;seaborn-v0_8-dark&quot; in Matplotlib 3.6

for param in ['figure.facecolor', 'axes.facecolor', 'savefig.facecolor']:
    plt.rcParams[param] = '#212946'  # bluish dark grey

for param in ['text.color', 'axes.labelcolor', 'xtick.color', 'ytick.color']:
    plt.rcParams[param] = '0.9'  # very light grey

fig, ax = plt.subplots()  # create the Axes that the grid call below needs
df.plot(marker='o', ax=ax)
ax.grid(color='#2A3459')  # bluish dark grey, but slightly lighter than background
</code></pre>
<p><img src="/matplotlib/matplotlib-cyberpunk-style/figures/2.png" alt="A simple chart with a dark background consisting of two lines: A is the blue line and B is the orange line."></p>
<h2 id="3---the-light">3 - The Light<a class="headerlink" href="#3---the-light" title="Link to this heading">#</a></h2>
<p>It looks more interesting now, but we need our colors to shine more against the dark background:</p>
<pre><code>fig, ax = plt.subplots()
colors = [
    '#08F7FE',  # teal/cyan
    '#FE53BB',  # pink
    '#F5D300',  # yellow
    '#00ff41',  # matrix green
]
df.plot(marker='o', ax=ax, color=colors)
</code></pre>
<p><img src="/matplotlib/matplotlib-cyberpunk-style/figures/3.png" alt="A simple chart with a dark background consisting of two lines: A is the blue line and B is the purple line."></p>
<h2 id="4---the-glow">4 - The Glow<a class="headerlink" href="#4---the-glow" title="Link to this heading">#</a></h2>
<p>Now, how to get that neon look? To make it shine, we <em>redraw the lines multiple times</em>, with low alpha value and slightly increasing linewidth. The overlap creates the glow effect.</p>
<pre><code>n_lines = 10
diff_linewidth = 1.05
alpha_value = 0.03

for n in range(1, n_lines+1):

    df.plot(marker='o',
            linewidth=2+(diff_linewidth*n),
            alpha=alpha_value,
            legend=False,
            ax=ax,
            color=colors)
</code></pre>
<p><img src="/matplotlib/matplotlib-cyberpunk-style/figures/4.png" alt="A simple chart with a dark background consisting of two lines: A is the blue line and B is the purple line. However, they have a neon look and are both glowing."></p>
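<p>To see why such a low alpha still adds up to a visible glow, here is a small sketch of the arithmetic. It assumes ordinary alpha compositing, where each translucent layer lets a fraction <code>1 - alpha</code> of what is underneath show through; the helper <code>stacked_alpha</code> is a name made up for this illustration.</p>

```python
def stacked_alpha(alpha: float, layers: int) -> float:
    """Combined opacity of `layers` overlapping strokes, each with opacity `alpha`."""
    return 1 - (1 - alpha) ** layers


n_lines = 10
diff_linewidth = 1.05
alpha_value = 0.03

# Line widths (in points) of the translucent copies, narrowest first.
widths = [2 + diff_linewidth * n for n in range(1, n_lines + 1)]
print(f"narrowest copy: {widths[0]:.2f} pt, widest copy: {widths[-1]:.2f} pt")

# Close to the line all copies overlap; far from it only the widest ones reach.
for k in range(n_lines, 0, -1):
    print(f"{k:2d} overlapping copies -> opacity {stacked_alpha(alpha_value, k):.3f}")
```

<p>The opacity falls off step by step with distance from the line, which is what reads as a soft halo.</p>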
<h2 id="5---the-finish">5 - The Finish<a class="headerlink" href="#5---the-finish" title="Link to this heading">#</a></h2>
<p>For some more fine-tuning, we color the area below the line (via <code>ax.fill_between</code>) and adjust the axis limits.</p>
<p>Here&rsquo;s the full code:</p>
<pre><code>import pandas as pd
import matplotlib.pyplot as plt


plt.style.use(&quot;dark_background&quot;)

for param in ['text.color', 'axes.labelcolor', 'xtick.color', 'ytick.color']:
    plt.rcParams[param] = '0.9'  # very light grey

for param in ['figure.facecolor', 'axes.facecolor', 'savefig.facecolor']:
    plt.rcParams[param] = '#212946'  # bluish dark grey

colors = [
    '#08F7FE',  # teal/cyan
    '#FE53BB',  # pink
    '#F5D300',  # yellow
    '#00ff41',  # matrix green
]


df = pd.DataFrame({'A': [1, 3, 9, 5, 2, 1, 1],
                   'B': [4, 5, 5, 7, 9, 8, 6]})

fig, ax = plt.subplots()

df.plot(marker='o', color=colors, ax=ax)

# Redraw the data with low alpha and slightly increased linewidth:
n_shades = 10
diff_linewidth = 1.05
alpha_value = 0.3 / n_shades

for n in range(1, n_shades+1):

    df.plot(marker='o',
            linewidth=2+(diff_linewidth*n),
            alpha=alpha_value,
            legend=False,
            ax=ax,
            color=colors)

# Color the areas below the lines:
for column, color in zip(df, colors):
    ax.fill_between(x=df.index,
                    y1=df[column].values,
                    y2=[0] * len(df),
                    color=color,
                    alpha=0.1)

ax.grid(color='#2A3459')

ax.set_xlim([ax.get_xlim()[0] - 0.2, ax.get_xlim()[1] + 0.2])  # to not have the markers cut off
ax.set_ylim(0)

plt.show()
</code></pre>
<p><img src="/matplotlib/matplotlib-cyberpunk-style/figures/5.png" alt="A simple chart with a dark background consisting of two lines: A is the blue line and B is the purple line. However, they are neon, both glow, and the area below them glows as well."></p>
<p>If this helps you or if you have constructive criticism, I&rsquo;d be happy to hear about it! Please contact me via <a href="https://dhaitz.github.io">here</a> or <a href="https://twitter.com/d_haitz">here</a>. Thanks!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Elliott Sales de Andrade hired as Matplotlib Software Research Engineering Fellow]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/matplotlib-rsef/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib for Making Diagrams" />
                <link href="https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/?utm_source=atom_feed" rel="related" type="text/html" title="Create Ridgeplots in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/?utm_source=atom_feed" rel="related" type="text/html" title="Create a Tesla Cybertruck That Drives" />
                <link href="https://blog.scientific-python.org/matplotlib/an-inquiry-into-matplotlib-figures/?utm_source=atom_feed" rel="related" type="text/html" title="An Inquiry Into Matplotlib&#39;s Figures" />
                <link href="https://blog.scientific-python.org/matplotlib/custom-3d-engine/?utm_source=atom_feed" rel="related" type="text/html" title="Custom 3D engine in Matplotlib" />
            
                <id>https://blog.scientific-python.org/matplotlib/matplotlib-rsef/</id>
            
            
            <published>2020-03-20T15:51:00-04:00</published>
            <updated>2020-03-20T15:51:00-04:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>We have hired Elliott Sales de Andrade as the Matplotlib Software Research Engineering Fellow supported by the Chan Zuckerberg Initiative Essential Open Source Software for Science</blockquote><p>As has been discussed in detail in Nadia Eghbal&rsquo;s <a href="https://www.fordfoundation.org/work/learning/research-reports/roads-and-bridges-the-unseen-labor-behind-our-digital-infrastructure/">Roads and Bridges</a>, the CZI EOSS <a href="https://chanzuckerberg.com/rfa/essential-open-source-software-for-science/">program
announcement</a>, and in the NumFOCUS <a href="https://numfocus.org/programs/sustainability">sustainability program goals</a>, much of the critical software that science and industry are built on
is maintained by a primarily volunteer community. While this has worked, it is not sustainable in the long term for the health of many
projects or their contributors.</p>
<p>We are happy to announce that we have hired Elliott Sales de Andrade (<a href="https://github.com/QuLogic">QuLogic</a>)
as the <a href="https://github.com/matplotlib/CZI_2019-07_mpl">Matplotlib Software Research Engineering
Fellow</a> supported by
the <a href="https://chanzuckerberg.com/eoss/proposals/matplotlib-foundation-of-scientific-visualization-in-python/">Chan Zuckerberg Initiative Essential Open Source Software for
Science</a>
effective March 1, 2020!</p>
<p>Elliott has been contributing to a broad variety of Free and Open
Source projects for several years. He is an active Matplotlib
contributor and has had commit rights since October 2015. In addition
to working on Matplotlib, Elliott has contributed to a wide range of
projects in the Scientific Python software stack, both downstream and
upstream of Matplotlib, including
<a href="https://scitools.org.uk/cartopy/">Cartopy</a>,
<a href="https://obspy.org/">ObsPy</a>, and <a href="https://numpy.org/">NumPy</a>. Outside
of Python, Elliott is a developer on the <a href="https://pidgin.im/">Pidgin
project</a> and a packager for <a href="https://getfedora.org/">Fedora
Linux</a>. In his work on Matplotlib, he is interested in advancing
science through reproducible workflows and more accessible libraries.</p>
<p>We are already seeing a reduction in the backlog of open issues and
pull requests, which we hope will make the library easier to
contribute to and maintain long term. We also benefit from Elliott
having the bandwidth to maintain a library-wide view of all the
ongoing work and open bugs. Hiring Elliott as an RSEF is the
start of ensuring that Matplotlib is sustainable in the long term.</p>
<p>Looking forward to all the good work we are going to do this year!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="news" label="News" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Matplotlib for Making Diagrams]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/?utm_source=atom_feed" rel="related" type="text/html" title="Create Ridgeplots in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/?utm_source=atom_feed" rel="related" type="text/html" title="Create a Tesla Cybertruck That Drives" />
                <link href="https://blog.scientific-python.org/matplotlib/an-inquiry-into-matplotlib-figures/?utm_source=atom_feed" rel="related" type="text/html" title="An Inquiry Into Matplotlib&#39;s Figures" />
                <link href="https://blog.scientific-python.org/matplotlib/custom-3d-engine/?utm_source=atom_feed" rel="related" type="text/html" title="Custom 3D engine in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="related" type="text/html" title="Creating the Warming Stripes in Matplotlib" />
            
                <id>https://blog.scientific-python.org/matplotlib/mpl-for-making-diagrams/</id>
            
            
            <published>2020-02-19T12:57:07-05:00</published>
            <updated>2020-02-19T12:57:07-05:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>How to use Matplotlib to make diagrams.</blockquote><h1 id="matplotlib-for-diagrams">Matplotlib for diagrams<a class="headerlink" href="#matplotlib-for-diagrams" title="Link to this heading">#</a></h1>
<p>This is my first post for the Matplotlib blog so I wanted to lead
with an example of what I most love about it:
How much control Matplotlib gives you.
I like to use it as a programmable drawing tool that happens
to be good at plotting data.</p>
<p>The default layout for Matplotlib works great for a lot of things,
but sometimes you want to exert
more control. Sometimes you want to treat your figure window as
a blank canvas and create diagrams
to communicate your ideas. Here, we will walk through the process
for setting this up. Most of these tricks are detailed in
<a href="https://e2eml.school/matplotlib_framing.html">this cheat sheet for laying out plots</a>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span></span></span></code></pre>
</div>
<p>The first step is to choose the size of your canvas.</p>
<p>(Just a heads up, I love the metaphor
of the canvas, so that&rsquo;s how I am using the term here.
The Canvas object is a very specific
thing in the Matplotlib code base. That&rsquo;s not what I&rsquo;m referring to.)</p>
<p>I&rsquo;m planning to make a diagram that is 16 centimeters wide
and 9 centimeters high.
This will fit comfortably on a piece of A4 or US Letter paper
and will be almost twice as wide as it is high.
It also scales up nicely to fit on a wide-format slide presentation.</p>
<p>The <code>plt.figure()</code> function accepts a <code>figsize</code> argument,
a tuple of <code>(width, height)</code> in <strong>inches</strong>.
To convert from centimeters, we&rsquo;ll divide by 2.54.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig_width</span> <span class="o">=</span> <span class="mi">16</span>  <span class="c1"># cm</span>
</span></span><span class="line"><span class="cl"><span class="n">fig_height</span> <span class="o">=</span> <span class="mi">9</span>  <span class="c1"># cm</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="n">fig_width</span> <span class="o">/</span> <span class="mf">2.54</span><span class="p">,</span> <span class="n">fig_height</span> <span class="o">/</span> <span class="mf">2.54</span><span class="p">))</span></span></span></code></pre>
</div>
<p>The next step is to add an Axes object that we can draw on.
By default, Matplotlib will size and place the Axes to leave
a little border and room for x- and y-axis labels. However, we don&rsquo;t
want that this time around. We want our Axes to extend right up
to the edge of the Figure.</p>
<p>The <code>add_axes()</code> function lets us specify exactly where to place
our new Axes and how big to make it. It accepts a tuple of the format
<code>(left, bottom, width, height)</code>. The coordinate frame of the Figure
is always (0, 0) at the bottom left corner and (1, 1) at the upper right,
no matter what size of Figure you are working with. Positions, widths,
and heights all become fractions of the total width and height of the Figure.</p>
<p>To fill the Figure with our Axes entirely, we specify a left position of 0,
a bottom position of 0, a width of 1, and a height of 1.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">((</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span></span></span></code></pre>
</div>
<p>To make our diagram creation easier, we can set the axis limits so that
one unit in the figure equals one centimeter. This grants us
an intuitive way to control the size of objects in the diagram.
A circle with a radius of 2 will be drawn as a circle (not an ellipse)
in the final image and have a radius of 2 cm.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_xlim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">fig_width</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_ylim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">fig_height</span><span class="p">)</span></span></span></code></pre>
</div>
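<p>As a quick sanity check of the one-unit-per-centimeter idea, here is a minimal, self-contained sketch. It repeats the setup from above, compares the scale along both axes, and adds a circle with a 2 cm radius. The <code>Agg</code> backend line and the circle&rsquo;s position are my own choices for illustration.</p>

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Circle

fig_width = 16  # cm
fig_height = 9  # cm
fig = plt.figure(figsize=(fig_width / 2.54, fig_height / 2.54))
ax = fig.add_axes((0, 0, 1, 1))
ax.set_xlim(0, fig_width)
ax.set_ylim(0, fig_height)

# Data units per inch along each axis; equal values mean shapes are not stretched.
width_in, height_in = fig.get_size_inches()
x_units_per_inch = fig_width / width_in
y_units_per_inch = fig_height / height_in
print(x_units_per_inch, y_units_per_inch)

# A circle with a 2 cm radius, centered in the diagram.
circle = Circle((fig_width / 2, fig_height / 2), radius=2, fill=False)
ax.add_patch(circle)
```

<p>Both scales come out as 2.54 data units per inch, that is, one data unit per centimeter, so the circle stays a circle.</p>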
<p>We can also do away with the automatically generated ticks
and tick labels with this pair of calls.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">bottom</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">top</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">left</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">right</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">labelbottom</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">labeltop</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">labelleft</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">labelright</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span></span></span></code></pre>
</div>
<p>At this point we have a big blank space of exactly the right size and shape.
Now we can begin building our diagram. The foundation of the image will be
the background color. White is fine, but sometimes it&rsquo;s fun to mix it up.
<a href="https://e2eml.school/matplotlib_lines.html#color">Here are some ideas</a>
to get you started.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_facecolor</span><span class="p">(</span><span class="s2">&#34;antiquewhite&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>We can also add a border to the diagram to visually set it apart.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;top&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="s2">&#34;midnightblue&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;bottom&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="s2">&#34;midnightblue&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;left&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="s2">&#34;midnightblue&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;right&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="s2">&#34;midnightblue&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;top&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;bottom&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;left&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;right&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span></span></span></code></pre>
</div>
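<p>If the eight calls above feel repetitive, the same border can be set in a loop. This is just a more compact sketch of the same idea, built on a fresh figure so that it runs on its own; the <code>Agg</code> backend line is only there so that it works without a display.</p>

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_axes((0, 0, 1, 1))

# ax.spines maps "left", "right", "bottom", and "top" to their Spine objects.
for spine in ax.spines.values():
    spine.set_color("midnightblue")
    spine.set_linewidth(4)
```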
<p>Now we have a foundation and background in place
and we&rsquo;re finally ready to start drawing.
You have complete freedom to
<a href="https://e2eml.school/matplotlib_lines.html">draw curves and shapes</a>,
<a href="https://e2eml.school/matplotlib_points.html">place points</a>,
and <a href="https://e2eml.school/matplotlib_text.html">add text</a>
of any variety within our 16 x 9 garden walls.</p>
<p>Then when you&rsquo;re done, the last step is to save the figure as a
<code>.png</code> file. In this format it can be imported into whatever
document or presentation you&rsquo;re working on.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s2">&#34;blank_diagram.png&#34;</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/mpl-for-making-diagrams/blank_diagram.png" alt="Blank diagram example."></p>
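<p>As a rough check on what <code>dpi=300</code> means for the saved file, the sketch below assumes the figure size is given in centimeters and converted to inches by dividing by 2.54, as in this post. The pixel dimensions of the image should come out close to the size in inches times the dpi. The <code>Agg</code> backend and the in-memory buffer are choices made here so the example runs without a display or a file on disk; they are not part of the original workflow.</p>

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

fig_width_cm, fig_height_cm = 16, 9
dpi = 300

# Matplotlib sizes figures in inches, so convert from centimeters
fig = plt.figure(figsize=(fig_width_cm / 2.54, fig_height_cm / 2.54))
fig.add_axes((0, 0, 1, 1))

# Expected pixel size: inches times dots per inch
width_in, height_in = fig.get_size_inches()
expected_px = (width_in * dpi, height_in * dpi)

# Save to an in-memory buffer and read the image back to measure it
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=dpi)
buffer.seek(0)
image = mpimg.imread(buffer)
height_px, width_px = image.shape[:2]

print("expected:", expected_px)
print("measured:", (width_px, height_px))
```

<p>Doubling <code>dpi</code> doubles both pixel dimensions, which is a simple way to get a sharper image without changing the layout.</p>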
<p>If you&rsquo;re making a collection of diagrams,
you can make a convenient template for your blank canvas.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">blank_diagram</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig_width</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span> <span class="n">fig_height</span><span class="o">=</span><span class="mi">9</span><span class="p">,</span> <span class="n">bg_color</span><span class="o">=</span><span class="s2">&#34;antiquewhite&#34;</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;midnightblue&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="n">fig_width</span> <span class="o">/</span> <span class="mf">2.54</span><span class="p">,</span> <span class="n">fig_height</span> <span class="o">/</span> <span class="mf">2.54</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">((</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xlim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">fig_width</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_ylim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">fig_height</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_facecolor</span><span class="p">(</span><span class="n">bg_color</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">bottom</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">top</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">left</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">right</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">labelbottom</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">labeltop</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">labelleft</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">labelright</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;top&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="n">color</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;bottom&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="n">color</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;left&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="n">color</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;right&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_color</span><span class="p">(</span><span class="n">color</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;top&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;bottom&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;left&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">spines</span><span class="p">[</span><span class="s2">&#34;right&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span></span></span></code></pre>
</div>
<p>Then you can take that canvas and add arbitrary text, shapes, and lines.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">blank_diagram</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">x0</span> <span class="ow">in</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="n">x0</span><span class="p">,</span> <span class="n">x0</span> <span class="o">+</span> <span class="mi">3</span><span class="p">],</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">9</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s2">&#34;stripes.png&#34;</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/mpl-for-making-diagrams/stripes.png" alt="White rectangle with black stripes."></p>
<p>Or more intricately:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">blank_diagram</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">centers</span> <span class="o">=</span> <span class="p">[(</span><span class="mf">3.5</span><span class="p">,</span> <span class="mf">6.5</span><span class="p">),</span> <span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mf">6.5</span><span class="p">),</span> <span class="p">(</span><span class="mf">12.5</span><span class="p">,</span> <span class="mf">6.5</span><span class="p">),</span> <span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl"><span class="n">radii</span> <span class="o">=</span> <span class="mf">1.5</span>
</span></span><span class="line"><span class="cl"><span class="n">texts</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;</span><span class="se">\n</span><span class="s2">&#34;</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="s2">&#34;My roommate&#34;</span><span class="p">,</span> <span class="s2">&#34;is a Philistine&#34;</span><span class="p">,</span> <span class="s2">&#34;and a boor&#34;</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;</span><span class="se">\n</span><span class="s2">&#34;</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="s2">&#34;My roommate&#34;</span><span class="p">,</span> <span class="s2">&#34;ate the last&#34;</span><span class="p">,</span> <span class="s2">&#34;of the&#34;</span><span class="p">,</span> <span class="s2">&#34;cold cereal&#34;</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;</span><span class="se">\n</span><span class="s2">&#34;</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="s2">&#34;I am really&#34;</span><span class="p">,</span> <span class="s2">&#34;really hungry&#34;</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;</span><span class="se">\n</span><span class="s2">&#34;</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="s2">&#34;I&#39;m annoyed&#34;</span><span class="p">,</span> <span class="s2">&#34;at my roommate&#34;</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Draw circles with text in the center</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">center</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">centers</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">center</span>
</span></span><span class="line"><span class="cl">    <span class="n">theta</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">x</span> <span class="o">+</span> <span class="n">radii</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">theta</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">y</span> <span class="o">+</span> <span class="n">radii</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">theta</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="s2">&#34;midnightblue&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">x</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">y</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">texts</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">horizontalalignment</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">verticalalignment</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="s2">&#34;midnightblue&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Draw arrows connecting them</span>
</span></span><span class="line"><span class="cl"><span class="c1"># https://e2eml.school/matplotlib_text.html#annotate</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">centers</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="n">radii</span><span class="p">,</span> <span class="n">centers</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">1</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">centers</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">radii</span><span class="p">,</span> <span class="n">centers</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="n">arrowprops</span><span class="o">=</span><span class="nb">dict</span><span class="p">(</span><span class="n">arrowstyle</span><span class="o">=</span><span class="s2">&#34;-|&gt;&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">centers</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="n">radii</span><span class="p">,</span> <span class="n">centers</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">1</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">centers</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">radii</span><span class="p">,</span> <span class="n">centers</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">1</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="n">arrowprops</span><span class="o">=</span><span class="nb">dict</span><span class="p">(</span><span class="n">arrowstyle</span><span class="o">=</span><span class="s2">&#34;-|&gt;&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">centers</span><span class="p">[</span><span class="mi">3</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mf">0.7</span> <span class="o">*</span> <span class="n">radii</span><span class="p">,</span> <span class="n">centers</span><span class="p">[</span><span class="mi">3</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="mf">0.7</span> <span class="o">*</span> <span class="n">radii</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">centers</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="mf">0.7</span> <span class="o">*</span> <span class="n">radii</span><span class="p">,</span> <span class="n">centers</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">-</span> <span class="mf">0.7</span> <span class="o">*</span> <span class="n">radii</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">arrowprops</span><span class="o">=</span><span class="nb">dict</span><span class="p">(</span><span class="n">arrowstyle</span><span class="o">=</span><span class="s2">&#34;-|&gt;&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">centers</span><span class="p">[</span><span class="mi">3</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="mf">0.7</span> <span class="o">*</span> <span class="n">radii</span><span class="p">,</span> <span class="n">centers</span><span class="p">[</span><span class="mi">3</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="mf">0.7</span> <span class="o">*</span> <span class="n">radii</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">centers</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mf">0.7</span> <span class="o">*</span> <span class="n">radii</span><span class="p">,</span> <span class="n">centers</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">-</span> <span class="mf">0.7</span> <span class="o">*</span> <span class="n">radii</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">arrowprops</span><span class="o">=</span><span class="nb">dict</span><span class="p">(</span><span class="n">arrowstyle</span><span class="o">=</span><span class="s2">&#34;-|&gt;&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s2">&#34;causal.png&#34;</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/mpl-for-making-diagrams/causal.png" alt="A joke flow chart. The top level goes from left to right, and then both left and right edges lead to the bottom level. The top level is: 1. My roommate is a Philistine and a boor. 2. My roommate ate the last of the cold cereal. 3. I am really really hungry. The bottom level is: I’m annoyed at my roommate."></p>
<p>Once you get started on this path, you can start making
extravagantly annotated plots. It can elevate your data
presentations to true storytelling.</p>
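<p>As one small illustration of an annotated plot, the sketch below reuses the same <code>ax.annotate</code> call with an arrow style, this time pointing at the largest value of a curve. The curve, the label text, and the text offset are made up for the example.</p>

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# A made-up damped sine curve to annotate
x = np.linspace(0, 10, 200)
y = np.sin(x) * np.exp(-x / 5)
peak = np.argmax(y)

fig, ax = plt.subplots()
ax.plot(x, y, color="midnightblue")

# The arrow points from the label position (xytext) to the data point (xy)
note = ax.annotate(
    "largest value",
    xy=(x[peak], y[peak]),
    xytext=(x[peak] + 2, y[peak] + 0.1),
    arrowprops=dict(arrowstyle="-|>"),
    color="midnightblue",
)
```

<p>The figure can then be saved with <code>fig.savefig</code> in the same way as the diagrams above.</p>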
<p>Happy diagram building!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Create Ridgeplots in Matplotlib]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/?utm_source=atom_feed" rel="related" type="text/html" title="Create a Tesla Cybertruck That Drives" />
                <link href="https://blog.scientific-python.org/matplotlib/an-inquiry-into-matplotlib-figures/?utm_source=atom_feed" rel="related" type="text/html" title="An Inquiry Into Matplotlib&#39;s Figures" />
                <link href="https://blog.scientific-python.org/matplotlib/custom-3d-engine/?utm_source=atom_feed" rel="related" type="text/html" title="Custom 3D engine in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="related" type="text/html" title="Creating the Warming Stripes in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-in-data-driven-seo/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib in Data Driven SEO" />
            
                <id>https://blog.scientific-python.org/matplotlib/create-ridgeplots-in-matplotlib/</id>
            
            
            <published>2020-02-15T09:50:16+01:00</published>
            <updated>2020-02-15T09:50:16+01:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>This post details how to leverage gridspec to create ridgeplots in Matplotlib</blockquote><h1 id="introduction">Introduction<a class="headerlink" href="#introduction" title="Link to this heading">#</a></h1>
<p>This post outlines how we can leverage <a href="https://matplotlib.org/3.1.3/api/_as_gen/matplotlib.gridspec.GridSpec.html">gridspec</a> to create ridgeplots in Matplotlib. While this is a relatively straightforward tutorial, some experience working with sklearn would be beneficial. Covering sklearn itself would be a <em>vast</em> undertaking, so this will not be an sklearn tutorial; those interested can read through the docs <a href="https://scikit-learn.org/stable/user_guide.html">here</a>. However, I will use its <code>KernelDensity</code> class from <code>sklearn.neighbors</code>.</p>
<!--
# Contents
  - [Packages](#packages)
  - [Data](#data)
  - [GridSpec](gs1)
  - [Kernel Density Estimation](#kde)
  - [Overlapping Axes Objects](#gs2)
  - [Complete Snippet](#snippet)
 -->
<h3 id="packages">Packages <a id="packages"></a><a class="headerlink" href="#packages" title="Link to this heading">#</a></h3>

<div class="highlight">
  <pre>import pandas as pd
import numpy as np
from sklearn.neighbors import KernelDensity

import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.gridspec as grid_spec</pre>
</div>

<h3 id="data">Data <a id="data"></a><a class="headerlink" href="#data" title="Link to this heading">#</a></h3>
<p>I&rsquo;ll be using some mock data I created. You can grab the dataset from GitHub <a href="https://github.com/petermckeever/mock-data/blob/master/datasets/mock-european-test-results.csv">here</a> if you want to play along. The data looks at aptitude test scores broken down by country, age, and sex.</p>

<div class="highlight">
  <pre>data = pd.read_csv(&#34;mock-european-test-results.csv&#34;)</pre>
</div>

<table>
  <thead>
      <tr>
          <th>country</th>
          <th>age</th>
          <th>sex</th>
          <th>score</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Italy</td>
          <td>21</td>
          <td>female</td>
          <td>0.77</td>
      </tr>
      <tr>
          <td>Spain</td>
          <td>20</td>
          <td>female</td>
          <td>0.87</td>
      </tr>
      <tr>
          <td>Italy</td>
          <td>24</td>
          <td>female</td>
          <td>0.39</td>
      </tr>
      <tr>
          <td>United Kingdom</td>
          <td>20</td>
          <td>female</td>
          <td>0.70</td>
      </tr>
      <tr>
          <td>Germany</td>
          <td>20</td>
          <td>male</td>
          <td>0.25</td>
      </tr>
      <tr>
          <td>&hellip;</td>
          <td></td>
          <td></td>
          <td></td>
      </tr>
  </tbody>
</table>
<h3 id="gridspec">GridSpec <a id="gs1"></a><a class="headerlink" href="#gridspec" title="Link to this heading">#</a></h3>
<p>GridSpec is a Matplotlib class (in <code>matplotlib.gridspec</code>) that makes it easy to lay out subplots. We can control the number of subplots, their positions, height, width, and the spacing between them. As a basic example, let&rsquo;s create a quick template. The key parameters we&rsquo;ll be focusing on are <code>nrows</code>, <code>ncols</code>, and <code>width_ratios</code>.</p>
<p><code>nrows</code> and <code>ncols</code> divide our figure into areas we can add axes to. <code>width_ratios</code> controls the relative width of each of our columns. If we create something like <code>GridSpec(2,2,width_ratios=[2,1])</code>, we are dividing our figure into 2 rows and 2 columns and setting the width ratio to 2:1, i.e., the first column will be twice as wide as the second.</p>
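<p>To make the effect of <code>width_ratios</code> concrete, here is a minimal sketch that builds the same 2 x 2 grid and compares the widths of the two columns. The variable names are illustrative, and the <code>Agg</code> backend is only there so the snippet runs without a display.</p>

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.gridspec as grid_spec
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 6))
gs = grid_spec.GridSpec(2, 2, width_ratios=[2, 1], figure=fig)

# One axes in each column of the first row
left = fig.add_subplot(gs[0, 0])
right = fig.add_subplot(gs[0, 1])

# get_position() returns the axes bounding box in figure coordinates
left_width = left.get_position().width
right_width = right.get_position().width
ratio = left_width / right_width
print(round(ratio, 2))
```

<p>The printed ratio reflects the 2:1 split between the first and second column; the absolute widths also depend on the figure margins and the spacing between columns.</p>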
<p>What&rsquo;s great about GridSpec is that now that we have created those subsets, we are not <em>bound</em> to them, as we will see below.</p>
<p><strong>Note</strong>: I am using my own theme, so plots will look different. Creating custom themes is outside the scope of this tutorial (but I may write one in the future).</p>

<div class="highlight">
  <pre>gs = (grid_spec.GridSpec(2,2,width_ratios=[2,1]))

fig = plt.figure(figsize=(8,6))

ax = fig.add_subplot(gs[0:1,0])
ax1 = fig.add_subplot(gs[1:,0])
ax2 = fig.add_subplot(gs[0:,1:])

ax_objs = [ax,ax1,ax2]
labels = [&#34;ax&#34;,&#34;ax1&#34;,&#34;ax2&#34;]

# label each axes object at its center
for ax_obj, label in zip(ax_objs, labels):
    ax_obj.text(0.5,0.5,label,
                ha=&#34;center&#34;,color=&#34;red&#34;,
                fontweight=&#34;bold&#34;,size=20)

plt.show()</pre>
</div>

<p><img src="/matplotlib/create-ridgeplots-in-matplotlib/basic_template.png" alt="This is a chart with 3 subplot labeled ax, ax1, and ax2. It was created using GridSpec."></p>
<p>I won&rsquo;t get into more detail about what everything does here. If you are interested in learning more about figures, axes, and gridspec, Akash Palrecha has <a href="../an-inquiry-into-matplotlib-figures/">written a very nice guide here</a>.</p>
<h3 id="kernel-density-estimation">Kernel Density Estimation <a id="kde"></a><a class="headerlink" href="#kernel-density-estimation" title="Link to this heading">#</a></h3>
<p>We have a couple of options here. The easiest by far is to stick with the plotting methods built into pandas. All that&rsquo;s needed is to select the column and call <code>plot.kde</code>. This defaults to Scott&rsquo;s rule for the bandwidth, but you can choose Silverman&rsquo;s rule or supply your own value through the <code>bw_method</code> argument. Let&rsquo;s use GridSpec again to plot the distribution for each country. First we&rsquo;ll grab the unique country names and create a list of colors.</p>

<div class="highlight">
  <pre>countries = [x for x in np.unique(data.country)]
colors = [&#39;#0000ff&#39;, &#39;#3300cc&#39;, &#39;#660099&#39;, &#39;#990066&#39;, &#39;#cc0033&#39;, &#39;#ff0000&#39;]</pre>
</div>

<p>Next we&rsquo;ll loop through each country and color to plot our data. Unlike the above, we will not explicitly declare how many rows we want to plot. The reason for this is to make our code more dynamic. If we hard-code a specific number of rows and a specific number of axes objects, the code has to be rewritten whenever the data changes. This is a bit of an aside, but when creating visualizations, you should always aim to reduce and reuse. By reduce, we specifically mean lessening the number of variables we are declaring and the unnecessary code associated with that. We are plotting data for six countries; what happens if we get data for 20 countries? That&rsquo;s a lot of additional code. Relatedly, by not explicitly declaring those variables we make our code adaptable and ready to be scripted to automatically create new plots when new data of the same kind becomes available.</p>

<div class="highlight">
  <pre>gs = (grid_spec.GridSpec(len(countries),1))

fig = plt.figure(figsize=(8,6))

i = 0

#creating empty list
ax_objs = []

for country in countries:
    # creating new axes object and appending to ax_objs
    ax_objs.append(fig.add_subplot(gs[i:i&#43;1, 0:]))

    # plotting the distribution
    plot = (data[data.country == country]
            .score.plot.kde(ax=ax_objs[-1],color=&#34;#f0f0f0&#34;, lw=0.5)
           )

    # grabbing x and y data from the kde plot (the first line drawn on the axes)
    x, y = plot.get_lines()[0].get_data()

    # filling the space beneath the distribution
    ax_objs[-1].fill_between(x,y,color=colors[i])

    # setting uniform x and y lims
    ax_objs[-1].set_xlim(0, 1)
    ax_objs[-1].set_ylim(0,2.2)

    i &#43;= 1

plt.tight_layout()
plt.show()</pre>
</div>

<p><img src="/matplotlib/create-ridgeplots-in-matplotlib/grid_spec_distro.png" alt="6 kernel density estimate (KDE) charts, each with different shades ranging from blue to blood red."></p>
<p>We&rsquo;re not quite at ridge plots yet, but let&rsquo;s look at what&rsquo;s going on here. You&rsquo;ll notice instead of setting an explicit number of rows, we&rsquo;ve set it to the length of our countries list - <code>gs = (grid_spec.GridSpec(len(countries),1))</code>. This gives us flexibility for future plotting with the ability to plot more or fewer countries without needing to adjust the code.</p>
<p>Just inside the for loop we create each axes object: <code>ax_objs.append(fig.add_subplot(gs[i:i+1, 0:]))</code>. Before the loop we declared <code>i = 0</code>. Here we are saying create an axes object from row 0 to 1. The next time the loop runs it creates an axes object from row 1 to 2, then 2 to 3, 3 to 4, and so on.</p>
<p>Following this we can use <code>ax_objs[-1]</code> to access the last created axes object to use as our plotting area.</p>
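<p>As a small sketch of what the slice <code>gs[i:i+1, 0:]</code> selects, the snippet below prints the grid rows and columns each cell covers. The <code>labels</code> list is a hypothetical stand-in for <code>countries</code>, and <code>enumerate</code> replaces the manual counter purely for brevity.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import matplotlib.gridspec as grid_spec  # alias assumed to match the tutorial

labels = ["a", "b", "c"]  # hypothetical stand-in for the countries list
gs = grid_spec.GridSpec(len(labels), 1)
fig = plt.figure(figsize=(8, 6))

ax_objs = []
for i, label in enumerate(labels):
    cell = gs[i:i+1, 0:]  # one row, every column
    ax_objs.append(fig.add_subplot(cell))
    # rowspan and colspan report which grid rows and columns the cell covers
    print(label, list(cell.rowspan), list(cell.colspan))

# the same information, read back from the axes that were created
row_spans = [list(ax.get_subplotspec().rowspan) for ax in ax_objs]
```

<p>Each axes occupies exactly one row, so adding a country only means adding an item to the list.</p>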
<p>Next, we create the kde plot. We declare this as a variable so we can retrieve the x and y values to use in the <code>fill_between</code> that follows.</p>
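<p>To illustrate the bandwidth options mentioned earlier, here is a minimal sketch on synthetic data (the <code>scores</code> series is made up for illustration and is not the tutorial&rsquo;s dataset). It draws the default Scott estimate, a Silverman estimate, and one with a scalar <code>bw_method</code>, then reads the first curve back through the public <code>Line2D</code> API. pandas relies on SciPy for this, so SciPy is assumed to be installed.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
scores = pd.Series(rng.normal(0.5, 0.1, 300))  # synthetic data for illustration

fig, ax = plt.subplots()
scores.plot.kde(ax=ax, label="scott (default)")
scores.plot.kde(ax=ax, bw_method="silverman", label="silverman")
scores.plot.kde(ax=ax, bw_method=0.1, label="bw_method=0.1")
ax.legend()

# read the first plotted curve back through the public Line2D API
x, y = ax.get_lines()[0].get_data()
```

<p>A smaller scalar <code>bw_method</code> gives a wigglier curve, while a larger one smooths more aggressively.</p>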
<h3 id="overlapping-axes-objects">Overlapping Axes Objects <a id="gs2"></a><a class="headerlink" href="#overlapping-axes-objects" title="Link to this heading">#</a></h3>
<p>Once again using GridSpec, we can adjust the spacing between each of the subplots. We can do this by adding one line outside of the loop before <code>plt.tight_layout()</code>. The exact value will depend on your distribution, so feel free to experiment:</p>

<div class="highlight">
  <pre>gs.update(hspace= -0.5)</pre>
</div>

<p><img src="/matplotlib/create-ridgeplots-in-matplotlib/grid_spec_distro_overlap_1.png" alt="6 kernel density estimate (KDE) charts, each with different shades ranging from blue to blood red. However, the y axis object are overlapping."></p>
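<p>To see why a negative <code>hspace</code> produces overlap, the following sketch computes the row edges of a three-row grid after the update and measures how far the second row reaches into the first. The three-row grid is an arbitrary choice for illustration.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import matplotlib.gridspec as grid_spec  # alias assumed to match the tutorial

fig = plt.figure(figsize=(8, 6))
gs = grid_spec.GridSpec(3, 1)
gs.update(hspace=-0.5)

# edges are returned in figure coordinates, rows ordered from top to bottom
bottoms, tops, lefts, rights = gs.get_grid_positions(fig)

# positive when the top of row 1 sits above the bottom of row 0
overlap = tops[1] - bottoms[0]
print(overlap)
```

<p>With <code>hspace=0</code> the same difference would be zero, and with a positive value it would be negative, meaning a gap between rows.</p>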
<p>Now our axes objects are overlapping! Great-ish. Each axes object is hiding the one layered below it. We <em>could</em> just add <code>ax_objs[-1].axis(&quot;off&quot;)</code> to our for loop, but if we do that we will lose our xticklabels. Instead we will create a variable to access the background of each axes object, and we will loop through each line of the border (spine) to turn them off. As we <em>only</em> need the xticklabels for the final plot, we will add an if statement to handle that. We will also add in our country labels here. In our for loop we add:</p>

<div class="highlight">
  <pre># make background transparent
rect = ax_objs[-1].patch
rect.set_alpha(0)

# remove borders, axis ticks, and labels
ax_objs[-1].set_yticklabels([])
ax_objs[-1].set_ylabel(&#39;&#39;)

if i == len(countries)-1:
    pass
else:
    ax_objs[-1].set_xticklabels([])

spines = [&#34;top&#34;,&#34;right&#34;,&#34;left&#34;,&#34;bottom&#34;]
for s in spines:
    ax_objs[-1].spines[s].set_visible(False)

country = country.replace(&#34; &#34;,&#34;\n&#34;)
ax_objs[-1].text(-0.02,0,country,fontweight=&#34;bold&#34;,fontsize=14,ha=&#34;center&#34;)</pre>
</div>

<p><img src="/matplotlib/create-ridgeplots-in-matplotlib/grid_spec_distro_overlap_2.png" alt="6 kernel density estimate (KDE) charts, each representing one of 6 countries. France is blue, Germany is dark blue, Ireland is purple, Italy is a lighter shade of purple, Spain is red, and the United Kingdom is blood red.  The x-axis shows the distribution of children in each country listed."></p>
<p>As an alternative to the above, we can use the <code>KernelDensity</code> class from <code>sklearn.neighbors</code> to create our distribution. This gives us a bit more control over our bandwidth. The method here is taken from Jake VanderPlas&rsquo;s fantastic <em>Python Data Science Handbook</em>; you can read his full excerpt <a href="https://jakevdp.github.io/PythonDataScienceHandbook/05.13-kernel-density-estimation.html">here</a>. We can reuse most of the above code, but need to make a couple of changes. Rather than repeat myself, I&rsquo;ll add the full snippet here and you can see the changes and minor additions (added title, label to xaxis).</p>
<h3 id="complete-plot-snippet">Complete Plot Snippet <a id="snippet"></a><a class="headerlink" href="#complete-plot-snippet" title="Link to this heading">#</a></h3>

<div class="highlight">
  <pre>countries = [x for x in np.unique(data.country)]
colors = [&#39;#0000ff&#39;, &#39;#3300cc&#39;, &#39;#660099&#39;, &#39;#990066&#39;, &#39;#cc0033&#39;, &#39;#ff0000&#39;]

gs = grid_spec.GridSpec(len(countries),1)
fig = plt.figure(figsize=(16,9))

i = 0

ax_objs = []
for country in countries:
    country = countries[i]
    x = np.array(data[data.country == country].score)
    x_d = np.linspace(0,1, 1000)

    kde = KernelDensity(bandwidth=0.03, kernel=&#39;gaussian&#39;)
    kde.fit(x[:, None])

    logprob = kde.score_samples(x_d[:, None])

    # creating new axes object
    ax_objs.append(fig.add_subplot(gs[i:i&#43;1, 0:]))

    # plotting the distribution
    ax_objs[-1].plot(x_d, np.exp(logprob),color=&#34;#f0f0f0&#34;,lw=1)
    ax_objs[-1].fill_between(x_d, np.exp(logprob), alpha=1,color=colors[i])


    # setting uniform x and y lims
    ax_objs[-1].set_xlim(0,1)
    ax_objs[-1].set_ylim(0,2.5)

    # make background transparent
    rect = ax_objs[-1].patch
    rect.set_alpha(0)

    # remove borders, axis ticks, and labels
    ax_objs[-1].set_yticklabels([])

    if i == len(countries)-1:
        ax_objs[-1].set_xlabel(&#34;Test Score&#34;, fontsize=16,fontweight=&#34;bold&#34;)
    else:
        ax_objs[-1].set_xticklabels([])

    spines = [&#34;top&#34;,&#34;right&#34;,&#34;left&#34;,&#34;bottom&#34;]
    for s in spines:
        ax_objs[-1].spines[s].set_visible(False)

    adj_country = country.replace(&#34; &#34;,&#34;\n&#34;)
    ax_objs[-1].text(-0.02,0,adj_country,fontweight=&#34;bold&#34;,fontsize=14,ha=&#34;right&#34;)


    i &#43;= 1

gs.update(hspace=-0.7)

fig.text(0.07,0.85,&#34;Distribution of Aptitude Test Results from 18 – 24 year-olds&#34;,fontsize=20)

plt.tight_layout()
plt.show()</pre>
</div>

<p><img src="/matplotlib/create-ridgeplots-in-matplotlib/grid_spec_distro_overlap_3.png" alt="6 kernel density estimate (KDE) charts, each representing one of 6 countries. France is blue, Germany is dark blue, Ireland is purple, Italy is a lighter shade of purple, Spain is red, and the United Kingdom is blood red.  The x-axis shows the distribution of aptitude test results for 18 to 24 year olds in each country listed."></p>
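<p>The two steps in the snippet that tend to raise questions are the <code>[:, None]</code> reshaping and the <code>np.exp</code> call. Here is a minimal sketch on synthetic scores (made up for illustration, not the tutorial&rsquo;s data): scikit-learn expects a 2D array of samples, and <code>score_samples</code> returns log densities, so they have to be exponentiated before plotting.</p>

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
x = rng.normal(0.5, 0.1, 500)   # synthetic scores for illustration
x_d = np.linspace(0, 1, 1000)   # grid on which the density is evaluated

kde = KernelDensity(bandwidth=0.03, kernel="gaussian")
kde.fit(x[:, None])             # shape (500,) becomes (500, 1)

logprob = kde.score_samples(x_d[:, None])  # log of the estimated density
density = np.exp(logprob)                  # back to a plain density

# rough check that the curve integrates to about 1 over the grid
area = density.sum() * (x_d[1] - x_d[0])
print(round(area, 3))
```

<p>If the printed area is far from 1, the grid probably does not cover the data, or the bandwidth is pushing a lot of mass outside it.</p>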
<p>I&rsquo;ll finish this off with a little project to put the above code into practice. The data provided also contains information on whether the test taker was male or female. Using the above code as a template, see how you get on creating something like this:</p>
<p><img src="/matplotlib/create-ridgeplots-in-matplotlib/split_ridges.png" alt="Each of two sets of six kernel density estimate (KDE) charts shows six different countries. France is blue, Germany is dark blue, Ireland is purple, Italy is a lighter shade of purple, Spain is red, and the United Kingdom is blood red. The x-axis displays the distribution of 18 to 24 year olds’ aptitude test scores in each country presented. The distribution of the results of the aptitude test among males aged 18 to 24 in each country is shown on one of the two sets of six charts, while the distribution for females is shown on the other."></p>
<p>For those more ambitious, this could be turned into a split violin plot with males on one side and females on the other. Is there a way to combine the ridge and violin plot?</p>
<p>I&rsquo;d love to see what people come back with, so if you do create something, send it to me on Twitter <a href="http://twitter.com/petermckeever">here</a>!</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Create a Tesla Cybertruck That Drives]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/an-inquiry-into-matplotlib-figures/?utm_source=atom_feed" rel="related" type="text/html" title="An Inquiry Into Matplotlib&#39;s Figures" />
                <link href="https://blog.scientific-python.org/matplotlib/custom-3d-engine/?utm_source=atom_feed" rel="related" type="text/html" title="Custom 3D engine in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="related" type="text/html" title="Creating the Warming Stripes in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-in-data-driven-seo/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib in Data Driven SEO" />
                <link href="https://blog.scientific-python.org/matplotlib/using-matplotlib-to-advocate-for-postdocs/?utm_source=atom_feed" rel="related" type="text/html" title="Using Matplotlib to Advocate for Postdocs" />
            
                <id>https://blog.scientific-python.org/matplotlib/create-a-tesla-cybertruck-that-drives/</id>
            
            
            <published>2020-01-12T13:35:34-05:00</published>
            <updated>2020-01-12T13:35:34-05:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Learn how to create a Tesla Cybertruck with Matplotlib that drives via animation.</blockquote><p>My name is <a href="https://www.twitter.com/tedpetrou">Ted Petrou</a>, founder of <a href="https://www.dunderdata.com">Dunder Data</a>, and in this tutorial you will learn how to create the new <a href="https://www.tesla.com/cybertruck">Tesla Cybertruck</a> using Matplotlib. I was inspired by the image below which was originally created by <a href="https://twitter.com/lynnandtonic/status/1197989912970067969?lang=en">Lynn Fisher</a> (without Matplotlib).</p>
<p>Before going into detail, let&rsquo;s jump to the results. Here is the completed recreation of the Tesla Cybertruck that drives off the screen.</p>
<video width="700" height="500" controls>
  <source src="tesla_animate.mp4" type="video/mp4">
</video>
<h2 id="tutorial">Tutorial<a class="headerlink" href="#tutorial" title="Link to this heading">#</a></h2>
<p>A tutorial now follows containing all the steps that create a Tesla Cybertruck that drives. It covers the following topics:</p>
<ul>
<li>Figure and Axes setup</li>
<li>Adding shapes</li>
<li>Color gradients</li>
<li>Animation</li>
</ul>
<p>Understanding these topics should give you enough to start animating your own figures in Matplotlib. This tutorial is not suited for those with no Matplotlib experience. You need to understand the relationship between the Figure and Axes and how to use the object-oriented interface of Matplotlib.</p>
<h3 id="figure-and-axes-setup">Figure and Axes setup<a class="headerlink" href="#figure-and-axes-setup" title="Link to this heading">#</a></h3>
<p>We first create a Matplotlib Figure without any Axes (the plotting surface). The function <code>create_axes</code> adds an Axes to the Figure, sets the x-limits to be twice the y-limits (matching the 16 x 8 figure dimensions), fills in the background with two different dark colors using <code>fill_between</code>, and adds grid lines to make it easier to plot the objects in the exact place you desire. Set the <code>draft</code> parameter to <code>False</code> when you want to remove the grid lines, tick marks, and tick labels.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">Figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="mi">8</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">create_axes</span><span class="p">(</span><span class="n">draft</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_ylim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">set_xlim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">fill_between</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">y1</span><span class="o">=</span><span class="mf">0.36</span><span class="p">,</span> <span class="n">y2</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">fill_between</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">y1</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">y2</span><span class="o">=</span><span class="mf">0.36</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#101115&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="ow">not</span> <span class="n">draft</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">False</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s2">&#34;off&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">create_axes</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/create-a-tesla-cybertruck-that-drives/output_4_0.png" alt="png"></p>
<h3 id="shapes-in-matplotlib">Shapes in Matplotlib<a class="headerlink" href="#shapes-in-matplotlib" title="Link to this heading">#</a></h3>
<p>Most of the Cybertruck is composed of shapes (patches in Matplotlib terminology) - circles, rectangles, and polygons. These shapes are available in Matplotlib&rsquo;s <code>patches</code> module. After importing, we create instances of these patches and then call the <code>add_patch</code> method to add each patch to the Axes.</p>
<p>For the Cybertruck, I used three patches, <code>Polygon</code>, <code>Rectangle</code>, and <code>Circle</code>. They each have different parameters available in their constructor. I first constructed the body of the car as four polygons. Two other polygons were used for the rims. Each polygon is provided a list of x, y coordinates where the corner points are located. Matplotlib connects all the points in the order given and fills it in with the provided color.</p>
<p>Notice how the Axes is retrieved as the first line of the function. This is used throughout the tutorial.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.patches</span> <span class="kn">import</span> <span class="n">Polygon</span><span class="p">,</span> <span class="n">Rectangle</span><span class="p">,</span> <span class="n">Circle</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">create_body</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">top</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">([[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mf">0.66</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.6</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">]],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#DCDCDC&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">windows</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">0.74</span><span class="p">,</span> <span class="mf">0.54</span><span class="p">],</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mf">0.64</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.26</span><span class="p">,</span> <span class="mf">0.6</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.262</span><span class="p">,</span> <span class="mf">0.57</span><span class="p">]],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">windows_bottom</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">0.8</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">],</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mf">0.635</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.255</span><span class="p">,</span> <span class="mf">0.597</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.255</span><span class="p">,</span> <span class="mf">0.585</span><span class="p">]],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#474747&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">base</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.445</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.67</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.78</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.84</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.3</span><span class="p">,</span> <span class="mf">0.423</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.36</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.44</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.52</span><span class="p">,</span> <span class="mf">0.43</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.58</span><span class="p">,</span> <span class="mf">0.44</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.6</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#1E2329&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">left_rim</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.445</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.67</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.78</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.84</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.824</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.77</span><span class="p">,</span> <span class="mf">0.49</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.674</span><span class="p">,</span> <span class="mf">0.49</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">0.633</span><span class="p">,</span> <span class="mf">0.445</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#373E48&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">right_rim</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.3</span><span class="p">,</span> <span class="mf">0.423</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.36</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.44</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.52</span><span class="p">,</span> <span class="mf">0.43</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.504</span><span class="p">,</span> <span class="mf">0.43</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.436</span><span class="p">,</span> <span class="mf">0.498</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.364</span><span class="p">,</span> <span class="mf">0.498</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="p">[</span><span class="mf">1.312</span><span class="p">,</span> <span class="mf">0.423</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#4D586A&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">top</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">windows</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">windows_bottom</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">base</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">left_rim</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">right_rim</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">create_body</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/create-a-tesla-cybertruck-that-drives/output_6_0.png" alt="png"></p>
<h4 id="tires">Tires<a class="headerlink" href="#tires" title="Link to this heading">#</a></h4>
<p>I used three <code>Circle</code> patches for each of the tires. You must provide the center and radius. For the innermost circles (the &ldquo;spokes&rdquo;), I&rsquo;ve set the <code>zorder</code> to 99. The <code>zorder</code> determines the order of how plotting objects are layered on top of each other. The higher the number, the higher up on the stack of layers the object will be plotted. During the next step, we will draw some rectangles through the tires and they need to be plotted underneath these spokes.</p>
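<p>As a small aside before the tire code, here is a minimal sketch of how <code>zorder</code> decides layering regardless of the order in which patches are added. The shapes, positions, and colors are made up for illustration and are not part of the Cybertruck.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
from matplotlib.patches import Circle, Rectangle

fig, ax = plt.subplots()
ax.set_aspect("equal")

# added first, but drawn last because of its high zorder
spoke = Circle((0.5, 0.5), radius=0.1, color="white", zorder=99)
# added second with the default patch zorder of 1, so it ends up underneath
bar = Rectangle((0.2, 0.45), 0.6, 0.1, color="gray")

ax.add_patch(spoke)
ax.add_patch(bar)

# lower zorder is drawn first, so the last item ends up on top
draw_order = sorted(ax.patches, key=lambda p: p.get_zorder())
print([type(p).__name__ for p in draw_order])
```

<p>Without the explicit <code>zorder</code>, both patches would share the default value and the one added later would cover the earlier one.</p>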


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">create_tires</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">left_tire</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">0.724</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.075</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#202328&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">right_tire</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">1.404</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.075</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#202328&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">left_inner_tire</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">0.724</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.052</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#15191C&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">right_inner_tire</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">1.404</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.052</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#15191C&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">left_spoke</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">0.724</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.019</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#202328&#34;</span><span class="p">,</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">99</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">right_spoke</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">1.404</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.019</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#202328&#34;</span><span class="p">,</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">99</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">left_inner_spoke</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">0.724</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.011</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#131418&#34;</span><span class="p">,</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">99</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">right_inner_spoke</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">1.404</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.011</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#131418&#34;</span><span class="p">,</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">99</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">left_tire</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">right_tire</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">left_inner_tire</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">right_inner_tire</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">left_spoke</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">right_spoke</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">left_inner_spoke</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">right_inner_spoke</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">create_tires</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/create-a-tesla-cybertruck-that-drives/output_8_0.png" alt="png"></p>
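<p>The layering rule can be checked in isolation. The following is a minimal sketch, not part of the truck drawing: the <code>demo_</code> names and the coordinates are made up for illustration, and the <code>Agg</code> backend is chosen only so that it runs without a display. It relies on patches having a default <code>zorder</code> of 1.</p>

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.patches import Circle, Rectangle

demo_fig, demo_ax = plt.subplots()

# The spoke is added first but carries a high zorder.
demo_spoke = Circle((0.5, 0.5), radius=0.1, color="#202328", zorder=99)
# The bar is added second and keeps the default patch zorder.
demo_bar = Rectangle((0.2, 0.48), width=0.6, height=0.04, color="red")

demo_ax.add_patch(demo_spoke)
demo_ax.add_patch(demo_bar)

# Artists are drawn in ascending zorder, so the last entry ends up on top.
draw_order = sorted(demo_ax.patches, key=lambda patch: patch.get_zorder())
print([type(patch).__name__ for patch in draw_order])
```

<p>Even though the bar is added after the spoke, sorting by <code>zorder</code> puts the spoke last, so it is drawn on top of the bar.</p>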
<h4 id="axles">Axles<a class="headerlink" href="#axles" title="Link to this heading">#</a></h4>
<p>I used the <code>Rectangle</code> patch to represent the two &lsquo;axles&rsquo; (this isn&rsquo;t the correct term, but you&rsquo;ll see what I mean) going through each tire. You must provide a coordinate for the lower left corner, a width, and a height. You can also pass an angle (in degrees) to rotate the rectangle counterclockwise about that lower left corner. Notice that the axles pass under the spokes plotted earlier. This is due to their lower <code>zorder</code>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">create_axles</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">left_left_axle</span> <span class="o">=</span> <span class="n">Rectangle</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mf">0.687</span><span class="p">,</span> <span class="mf">0.427</span><span class="p">),</span> <span class="n">width</span><span class="o">=</span><span class="mf">0.104</span><span class="p">,</span> <span class="n">height</span><span class="o">=</span><span class="mf">0.005</span><span class="p">,</span> <span class="n">angle</span><span class="o">=</span><span class="mi">315</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#202328&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">left_right_axle</span> <span class="o">=</span> <span class="n">Rectangle</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mf">0.761</span><span class="p">,</span> <span class="mf">0.427</span><span class="p">),</span> <span class="n">width</span><span class="o">=</span><span class="mf">0.104</span><span class="p">,</span> <span class="n">height</span><span class="o">=</span><span class="mf">0.005</span><span class="p">,</span> <span class="n">angle</span><span class="o">=</span><span class="mi">225</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#202328&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">right_left_axle</span> <span class="o">=</span> <span class="n">Rectangle</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mf">1.367</span><span class="p">,</span> <span class="mf">0.427</span><span class="p">),</span> <span class="n">width</span><span class="o">=</span><span class="mf">0.104</span><span class="p">,</span> <span class="n">height</span><span class="o">=</span><span class="mf">0.005</span><span class="p">,</span> <span class="n">angle</span><span class="o">=</span><span class="mi">315</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#202328&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">right_right_axle</span> <span class="o">=</span> <span class="n">Rectangle</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="mf">1.441</span><span class="p">,</span> <span class="mf">0.427</span><span class="p">),</span> <span class="n">width</span><span class="o">=</span><span class="mf">0.104</span><span class="p">,</span> <span class="n">height</span><span class="o">=</span><span class="mf">0.005</span><span class="p">,</span> <span class="n">angle</span><span class="o">=</span><span class="mi">225</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#202328&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">left_left_axle</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">left_right_axle</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">right_left_axle</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">right_right_axle</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">create_axles</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/create-a-tesla-cybertruck-that-drives/output_10_0.png" alt="png"></p>
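<p>To see why these corner coordinates and angles line up with the tires, recall that a <code>Rectangle</code> is rotated counterclockwise about its lower left corner by default. The rough check below ignores the 0.005 height of each axle and uses a hypothetical helper, <code>axle_midpoint</code>, which is not part of Matplotlib. It suggests that both axles of the left tire cross close to the tire center at (0.724, 0.39).</p>

```python
import math


def axle_midpoint(xy, width, angle_deg):
    """Midpoint of the rectangle's long bottom edge after rotating about xy.

    Hypothetical helper for illustration only; the height is ignored.
    """
    theta = math.radians(angle_deg)
    return (
        xy[0] + 0.5 * width * math.cos(theta),
        xy[1] + 0.5 * width * math.sin(theta),
    )


tire_center = (0.724, 0.39)

# Same corner, width, and angle values as the two axles of the left tire.
left_left_mid = axle_midpoint((0.687, 0.427), width=0.104, angle_deg=315)
left_right_mid = axle_midpoint((0.761, 0.427), width=0.104, angle_deg=225)

print("tire center:    ", tire_center)
print("left_left_axle: ", left_left_mid)
print("left_right_axle:", left_right_mid)
```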
<h4 id="other-details">Other details<a class="headerlink" href="#other-details" title="Link to this heading">#</a></h4>
<p>The front bumper, head light, tail light, door lines, and window lines are added below. I used regular Matplotlib lines for some of these. Those lines are not patches: calling <code>ax.plot</code> adds them directly to the Axes, with no separate <code>add_patch</code> call.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">create_other_details</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># other details</span>
</span></span><span class="line"><span class="cl">    <span class="n">front</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.597</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.589</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.589</span><span class="p">,</span> <span class="mf">0.445</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.445</span><span class="p">]],</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#26272d&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">front_bottom</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.438</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.58</span><span class="p">,</span> <span class="mf">0.438</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.58</span><span class="p">,</span> <span class="mf">0.423</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.423</span><span class="p">]],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#26272d&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">head_light</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.597</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.589</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.589</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.62</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">]],</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span><span class="o">=</span><span class="s2">&#34;aqua&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">step</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">0.84</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.84</span><span class="p">,</span> <span class="mf">0.394</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.3</span><span class="p">,</span> <span class="mf">0.397</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.3</span><span class="p">,</span> <span class="mf">0.393</span><span class="p">]],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;#1E2329&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># doors</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mf">0.84</span><span class="p">,</span> <span class="mf">0.84</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.523</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mf">1.02</span><span class="p">,</span> <span class="mf">1.04</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.53</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mf">1.26</span><span class="p">,</span> <span class="mf">1.26</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.54</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mf">0.84</span><span class="p">,</span> <span class="mf">0.85</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.523</span><span class="p">,</span> <span class="mf">0.547</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mf">1.04</span><span class="p">,</span> <span class="mf">1.04</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.53</span><span class="p">,</span> <span class="mf">0.557</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mf">1.26</span><span class="p">,</span> <span class="mf">1.26</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.54</span><span class="p">,</span> <span class="mf">0.57</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># window lines</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mf">0.87</span><span class="p">,</span> <span class="mf">0.88</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.56</span><span class="p">,</span> <span class="mf">0.59</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">([</span><span class="mf">1.03</span><span class="p">,</span> <span class="mf">1.04</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.56</span><span class="p">,</span> <span class="mf">0.63</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;black&#34;</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># tail light</span>
</span></span><span class="line"><span class="cl">    <span class="n">tail_light</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">1.6</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.007</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;red&#34;</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.6</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">tail_light_center</span> <span class="o">=</span> <span class="n">Circle</span><span class="p">((</span><span class="mf">1.6</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">),</span> <span class="n">radius</span><span class="o">=</span><span class="mf">0.003</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;yellow&#34;</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.6</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">tail_light_up</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">1.597</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.6</span><span class="p">,</span> <span class="mf">0.6</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.603</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">]],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;red&#34;</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.4</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">tail_light_right</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">1.6</span><span class="p">,</span> <span class="mf">0.563</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.64</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.6</span><span class="p">,</span> <span class="mf">0.557</span><span class="p">]],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;red&#34;</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.4</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">tail_light_down</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">[[</span><span class="mf">1.597</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.6</span><span class="p">,</span> <span class="mf">0.52</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.603</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">]],</span> <span class="n">color</span><span class="o">=</span><span class="s2">&#34;red&#34;</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.4</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">front</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">front_bottom</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">head_light</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">step</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">tail_light</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">tail_light_center</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">tail_light_up</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">tail_light_right</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">tail_light_down</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">create_other_details</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/create-a-tesla-cybertruck-that-drives/output_12_0.png" alt="png"></p>
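<p>The difference between lines and patches can be seen on a throwaway Axes. This is a small sketch with made-up <code>demo_</code> names and the <code>Agg</code> backend assumed only so that it runs without a display. It reuses one door line and the tail light circle from the function above.</p>

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.patches import Circle

demo_fig, demo_ax = plt.subplots()

# ax.plot creates a Line2D and registers it on the Axes in one step.
(door_line,) = demo_ax.plot([0.84, 0.84], [0.42, 0.523], color="black", lw=0.5)

# A patch is only an object until it is explicitly added to the Axes.
tail_light = Circle((1.6, 0.56), radius=0.007, color="red", alpha=0.6)
patches_before = len(demo_ax.patches)
demo_ax.add_patch(tail_light)

print("lines on the Axes:", len(demo_ax.lines))
print("patches before and after add_patch:", patches_before, len(demo_ax.patches))
```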
<h4 id="color-gradients-for-the-head-light-beam">Color gradients for the head light beam<a class="headerlink" href="#color-gradients-for-the-head-light-beam" title="Link to this heading">#</a></h4>
<p>The head light beam has a distinct color gradient that dissipates into the night sky. This is challenging to implement. I found an <a href="https://wwww.twitter.com/tedpetrou">excellent answer on Stack Overflow from user Joe Kington</a> on how to do this. We begin by using the <code>imshow</code> method, which can create an image from a 3-dimensional array. Our image will simply be a rectangle of colors.</p>
<p>We create a 1 x 100 x 4 array that represents 1 row by 100 columns of points of RGBA (red, green, blue, alpha) values. Every point is given the same red, green, and blue values of (0, 1, 1), which represents the color &lsquo;aqua&rsquo;. The alpha value represents opacity and ranges between 0 and 1, with 0 being completely transparent (invisible) and 1 being opaque. We would like the opacity to decrease as the light extends further from the head light (that is, further to the left). The NumPy <code>linspace</code> function is used to create an array of 100 numbers increasing linearly from 0 to 1. This array will be set as the alpha values.</p>
<p>The <code>extent</code> parameter defines the rectangular region where the image will be shown. The four values correspond to xmin, xmax, ymin, and ymax. The 100 alpha values will be mapped to this region beginning from the left. The array of alphas begins at 0, which means that the far left of this rectangular region will be transparent. The opacity will increase moving to the right side of the rectangle, where it eventually reaches 1.</p>
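<p>The mapping from columns to x positions can be worked out by hand. The sketch below is only an illustration and assumes that the 100 columns split the range from xmin to xmax into equal-width cells, laid out left to right. The names <code>edges</code> and <code>centers</code> are made up here, and the x limits are the ones used in the function that follows.</p>

```python
import numpy as np

xmin, xmax = 0.3, 0.589  # horizontal part of the extent used for the beam
alphas = np.linspace(0, 1, 100)

# 100 equal-width cells need 101 edges; each cell is centered between two edges.
edges = np.linspace(xmin, xmax, alphas.size + 1)
centers = (edges[:-1] + edges[1:]) / 2

# Leftmost, middle, and rightmost columns: alpha grows as x approaches the head light.
for i in (0, 49, 99):
    print(f"column {i:2d}: x = {centers[i]:.4f}, alpha = {alphas[i]:.3f}")
```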


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.colors</span> <span class="k">as</span> <span class="nn">mcolors</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">create_headlight_beam</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">z</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">empty</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">rgb</span> <span class="o">=</span> <span class="n">mcolors</span><span class="o">.</span><span class="n">colorConverter</span><span class="o">.</span><span class="n">to_rgb</span><span class="p">(</span><span class="s2">&#34;aqua&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">alphas</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">z</span><span class="p">[:,</span> <span class="p">:,</span> <span class="p">:</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">rgb</span>
</span></span><span class="line"><span class="cl">    <span class="n">z</span><span class="p">[:,</span> <span class="p">:,</span> <span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">alphas</span>
</span></span><span class="line"><span class="cl">    <span class="n">im</span> <span class="o">=</span> <span class="n">ax</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">z</span><span class="p">,</span> <span class="n">extent</span><span class="o">=</span><span class="p">[</span><span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.589</span><span class="p">,</span> <span class="mf">0.501</span><span class="p">,</span> <span class="mf">0.505</span><span class="p">],</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">create_headlight_beam</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/create-a-tesla-cybertruck-that-drives/output_14_0.png" alt="png"></p>
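<p>To make the mapping between the alpha array and <code>extent</code> more concrete, here is a small sketch that works out where each of the 100 columns lands horizontally. It assumes the default <code>imshow</code> behaviour of splitting the extent into equal-width columns from left to right; the names <code>edges</code> and <code>centers</code> are introduced here only for illustration.</p>

```python
import numpy as np

xmin, xmax = 0.3, 0.589            # horizontal part of the extent used for the beam
alphas = np.linspace(0, 1, 100)    # one alpha value per column

# Assumption: the extent is divided into equal-width columns, left to right.
edges = np.linspace(xmin, xmax, alphas.size + 1)
centers = (edges[:-1] + edges[1:]) / 2

for k in (0, 49, 99):
    print(f"column {k}: x = {centers[k]:.4f}, alpha = {alphas[k]:.3f}")
```

<p>The leftmost column carries an alpha of 0 and the rightmost an alpha of 1, which is why the beam fades out as it moves away from the headlight.</p>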
<h4 id="headlight-cloud">Headlight Cloud<a class="headerlink" href="#headlight-cloud" title="Link to this heading">#</a></h4>
<p>The cloud of points surrounding the headlight beam is even more challenging to complete. This time, a 100 x 100 grid of points was used to control the opacity. The opacity decreases as the vertical distance from the center beam increases. Additionally, if a point was outside of the diagonal of the rectangle defined by <code>extent</code>, its opacity was set to 0.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">create_headlight_cloud</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">z2</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">empty</span><span class="p">((</span><span class="mi">100</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">rgb</span> <span class="o">=</span> <span class="n">mcolors</span><span class="o">.</span><span class="n">colorConverter</span><span class="o">.</span><span class="n">to_rgb</span><span class="p">(</span><span class="s2">&#34;aqua&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">z2</span><span class="p">[:,</span> <span class="p">:,</span> <span class="p">:</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">rgb</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">j</span><span class="p">,</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">)):</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">y</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mf">0.2</span><span class="p">,</span> <span class="mf">0.2</span><span class="p">,</span> <span class="mi">100</span><span class="p">))):</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="n">x</span> <span class="o">*</span> <span class="mf">0.2</span> <span class="o">&gt;</span> <span class="n">y</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="n">z2</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">-</span> <span class="p">(</span><span class="n">y</span> <span class="o">+</span> <span class="mf">0.8</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span>
</span></span><span class="line"><span class="cl">            <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="n">z2</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">    <span class="n">im2</span> <span class="o">=</span> <span class="n">ax</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">z2</span><span class="p">,</span> <span class="n">extent</span><span class="o">=</span><span class="p">[</span><span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.65</span><span class="p">,</span> <span class="mf">0.45</span><span class="p">,</span> <span class="mf">0.55</span><span class="p">],</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">create_headlight_cloud</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/create-a-tesla-cybertruck-that-drives/output_16_0.png" alt="png"></p>
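<p>The nested loops in <code>create_headlight_cloud</code> can also be written with NumPy broadcasting. The following is only a sketch of that alternative, not the approach used in this tutorial; it computes the same alpha grid, and the names <code>inside</code> and <code>alpha</code> are introduced here for illustration.</p>

```python
import numpy as np

x = np.linspace(0, 1, 100)                 # horizontal position, one value per column
y = np.abs(np.linspace(-0.2, 0.2, 100))    # vertical distance from the beam, one per row

# A point is inside the cone when x * 0.2 exceeds its vertical distance.
inside = np.greater(x[np.newaxis, :] * 0.2, y[:, np.newaxis])   # shape (100, 100)

# Inside the cone the opacity falls as the distance grows; outside it is 0.
alpha = np.where(inside, 1 - (y[:, np.newaxis] + 0.8) ** 2, 0.0)

print(alpha.shape, alpha.max())
```

<p>The resulting array could be assigned to the last channel of the RGBA array in a single statement instead of element by element.</p>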
<h3 id="creating-a-function-to-draw-the-car">Creating a Function to Draw the Car<a class="headerlink" href="#creating-a-function-to-draw-the-car" title="Link to this heading">#</a></h3>
<p>All of our work from above can be placed in a single function that draws the car. This will be used when initializing our animation. Notice that the first line of the function clears the Figure, which removes our Axes. If we don&rsquo;t clear the Figure, then we will keep adding more and more Axes each time this function is called. Since this is our final product, we set <code>draft</code> to <code>False</code>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">draw_car</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">create_axes</span><span class="p">(</span><span class="n">draft</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">create_body</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">create_tires</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">create_axles</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">create_other_details</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">create_headlight_beam</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">create_headlight_cloud</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">draw_car</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/create-a-tesla-cybertruck-that-drives/output_18_0.png" alt="png"></p>
<h2 id="animation">Animation<a class="headerlink" href="#animation" title="Link to this heading">#</a></h2>
<p>Animation in Matplotlib is fairly straightforward. You must create a function that updates the position of the objects in your figure for each frame. This function is called repeatedly for each frame.</p>
<p>In the <code>update</code> function below, we loop through each patch, line, and image in our Axes and reduce the x-value of each plotted object by .015. This has the effect of moving the truck to the left. The trickiest part was changing the x and y values for the rectangular tire &lsquo;axles&rsquo; so that it appeared that the tires were rotating. Some basic trigonometry helps calculate this.</p>
<p>Implicitly, Matplotlib passes the update function the frame number as an integer as the first argument. We accept this input as the parameter <code>frame_number</code>. We only use it in one place, and that is to do nothing during the first frame.</p>
<p>Finally, the <code>FuncAnimation</code> class from the animation module is used to construct the animation. We provide it our original Figure, the function to update the Figure (<code>update</code>), a function to initialize the Figure (<code>draw_car</code>), the total number of frames, and any extra arguments used during update (<code>fargs</code>).</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.animation</span> <span class="kn">import</span> <span class="n">FuncAnimation</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">update</span><span class="p">(</span><span class="n">frame_number</span><span class="p">,</span> <span class="n">x_delta</span><span class="p">,</span> <span class="n">radius</span><span class="p">,</span> <span class="n">angle</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">frame_number</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">patch</span> <span class="ow">in</span> <span class="n">ax</span><span class="o">.</span><span class="n">patches</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">patch</span><span class="p">,</span> <span class="n">Polygon</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">            <span class="n">arr</span> <span class="o">=</span> <span class="n">patch</span><span class="o">.</span><span class="n">get_xy</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">            <span class="n">arr</span><span class="p">[:,</span> <span class="mi">0</span><span class="p">]</span> <span class="o">-=</span> <span class="n">x_delta</span>
</span></span><span class="line"><span class="cl">        <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">patch</span><span class="p">,</span> <span class="n">Circle</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">            <span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">patch</span><span class="o">.</span><span class="n">get_center</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">            <span class="n">patch</span><span class="o">.</span><span class="n">set_center</span><span class="p">((</span><span class="n">x</span> <span class="o">-</span> <span class="n">x_delta</span><span class="p">,</span> <span class="n">y</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">patch</span><span class="p">,</span> <span class="n">Rectangle</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">            <span class="n">xd_old</span> <span class="o">=</span> <span class="o">-</span><span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">patch</span><span class="o">.</span><span class="n">angle</span> <span class="o">/</span> <span class="mi">180</span><span class="p">)</span> <span class="o">*</span> <span class="n">radius</span>
</span></span><span class="line"><span class="cl">            <span class="n">yd_old</span> <span class="o">=</span> <span class="o">-</span><span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">patch</span><span class="o">.</span><span class="n">angle</span> <span class="o">/</span> <span class="mi">180</span><span class="p">)</span> <span class="o">*</span> <span class="n">radius</span>
</span></span><span class="line"><span class="cl">            <span class="n">patch</span><span class="o">.</span><span class="n">angle</span> <span class="o">+=</span> <span class="n">angle</span>
</span></span><span class="line"><span class="cl">            <span class="n">xd</span> <span class="o">=</span> <span class="o">-</span><span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">patch</span><span class="o">.</span><span class="n">angle</span> <span class="o">/</span> <span class="mi">180</span><span class="p">)</span> <span class="o">*</span> <span class="n">radius</span>
</span></span><span class="line"><span class="cl">            <span class="n">yd</span> <span class="o">=</span> <span class="o">-</span><span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">patch</span><span class="o">.</span><span class="n">angle</span> <span class="o">/</span> <span class="mi">180</span><span class="p">)</span> <span class="o">*</span> <span class="n">radius</span>
</span></span><span class="line"><span class="cl">            <span class="n">x</span> <span class="o">=</span> <span class="n">patch</span><span class="o">.</span><span class="n">get_x</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">            <span class="n">y</span> <span class="o">=</span> <span class="n">patch</span><span class="o">.</span><span class="n">get_y</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">            <span class="n">x_new</span> <span class="o">=</span> <span class="n">x</span> <span class="o">-</span> <span class="n">x_delta</span> <span class="o">+</span> <span class="n">xd</span> <span class="o">-</span> <span class="n">xd_old</span>
</span></span><span class="line"><span class="cl">            <span class="n">y_new</span> <span class="o">=</span> <span class="n">y</span> <span class="o">+</span> <span class="n">yd</span> <span class="o">-</span> <span class="n">yd_old</span>
</span></span><span class="line"><span class="cl">            <span class="n">patch</span><span class="o">.</span><span class="n">set_x</span><span class="p">(</span><span class="n">x_new</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="n">patch</span><span class="o">.</span><span class="n">set_y</span><span class="p">(</span><span class="n">y_new</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">ax</span><span class="o">.</span><span class="n">lines</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">xdata</span> <span class="o">=</span> <span class="n">line</span><span class="o">.</span><span class="n">get_xdata</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="n">line</span><span class="o">.</span><span class="n">set_xdata</span><span class="p">(</span><span class="n">xdata</span> <span class="o">-</span> <span class="n">x_delta</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">image</span> <span class="ow">in</span> <span class="n">ax</span><span class="o">.</span><span class="n">images</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">extent</span> <span class="o">=</span> <span class="n">image</span><span class="o">.</span><span class="n">get_extent</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="n">extent</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-=</span> <span class="n">x_delta</span>
</span></span><span class="line"><span class="cl">        <span class="n">extent</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">-=</span> <span class="n">x_delta</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">animation</span> <span class="o">=</span> <span class="n">FuncAnimation</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="p">,</span> <span class="n">update</span><span class="p">,</span> <span class="n">init_func</span><span class="o">=</span><span class="n">draw_car</span><span class="p">,</span> <span class="n">frames</span><span class="o">=</span><span class="mi">110</span><span class="p">,</span> <span class="n">repeat</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">fargs</span><span class="o">=</span><span class="p">(</span><span class="mf">0.015</span><span class="p">,</span> <span class="mf">0.052</span><span class="p">,</span> <span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
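<p>The trigonometry in the <code>Rectangle</code> branch of <code>update</code> can be checked in isolation. The sketch below assumes that each axle rectangle is anchored at a corner lying <code>radius</code> away from the wheel center, in the direction given by the rectangle&rsquo;s angle; the helper names <code>axle_corner</code> and <code>step_corner</code> and the wheel center are hypothetical.</p>

```python
import numpy as np


def axle_corner(center, angle_deg, radius):
    """Anchor corner of an axle, offset from the wheel center as in `update`."""
    theta = np.deg2rad(angle_deg)
    return (center[0] - np.cos(theta) * radius, center[1] - np.sin(theta) * radius)


def step_corner(corner, angle_deg, d_angle, radius, x_delta):
    """Advance one frame: rotate by d_angle degrees and shift left by x_delta."""
    old = np.deg2rad(angle_deg)
    new = np.deg2rad(angle_deg + d_angle)
    x_new = corner[0] - x_delta + (np.cos(old) - np.cos(new)) * radius
    y_new = corner[1] + (np.sin(old) - np.sin(new)) * radius
    return (x_new, y_new), angle_deg + d_angle


center = (0.5, 0.2)                         # hypothetical wheel center
radius, d_angle, x_delta = 0.052, 4, 0.015  # same values as passed through fargs
corner, angle = axle_corner(center, 0, radius), 0
for _ in range(10):
    corner, angle = step_corner(corner, angle, d_angle, radius, x_delta)

# After 10 frames the corner should sit where a wheel moved 10 steps left puts it.
expected = axle_corner((center[0] - 10 * x_delta, center[1]), angle, radius)
print(np.allclose(corner, expected))
```

<p>Subtracting the old offset and adding the new one keeps the corner on a circle around the moving wheel center, which is what makes the tire appear to rotate.</p>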
<h3 id="save-animation">Save animation<a class="headerlink" href="#save-animation" title="Link to this heading">#</a></h3>
<p>Finally, we can save the animation as an mp4 file (you must have ffmpeg installed for this to work). We set the frames-per-second (<code>fps</code>) to 30. From above, the total number of frames is 110 (enough to move the truck off the screen) so the video will last nearly four seconds (110 / 30).</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">animation</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="s2">&#34;tesla_animate.mp4&#34;</span><span class="p">,</span> <span class="n">fps</span><span class="o">=</span><span class="mi">30</span><span class="p">,</span> <span class="n">bitrate</span><span class="o">=</span><span class="mi">3000</span><span class="p">)</span></span></span></code></pre>
</div>
<video width="700" height="500" controls>
  <source src="tesla_animate.mp4" type="video/mp4">
</video>
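<p>If ffmpeg might not be installed, one option is to check for it before saving and fall back to an animated GIF. This is a minimal sketch under that assumption; the helper <code>pick_writer</code> is hypothetical, and the GIF fallback relies on Pillow being installed.</p>

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
from matplotlib import animation


def pick_writer(fps=30):
    """Return a movie writer and a matching file extension (hypothetical helper)."""
    if animation.writers.is_available("ffmpeg"):
        return animation.FFMpegWriter(fps=fps, bitrate=3000), ".mp4"
    # Fallback: Pillow can write animated GIFs without ffmpeg.
    return animation.PillowWriter(fps=fps), ".gif"


writer, extension = pick_writer(fps=30)
duration = 110 / 30  # total frames divided by frames per second
print(f"saving as tesla_animate{extension}, about {duration:.1f} seconds long")
```

<p>The returned writer can then be passed to the animation&rsquo;s <code>save</code> method through its <code>writer</code> argument.</p>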
<h2 id="continue-animating">Continue Animating<a class="headerlink" href="#continue-animating" title="Link to this heading">#</a></h2>
<p>I encourage you to add more components to your Cybertruck animation to personalize the creation. I suggest encapsulating each addition with a function as done in this tutorial.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[An Inquiry Into Matplotlib's Figures]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/an-inquiry-into-matplotlib-figures/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/custom-3d-engine/?utm_source=atom_feed" rel="related" type="text/html" title="Custom 3D engine in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="related" type="text/html" title="Creating the Warming Stripes in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-in-data-driven-seo/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib in Data Driven SEO" />
                <link href="https://blog.scientific-python.org/matplotlib/using-matplotlib-to-advocate-for-postdocs/?utm_source=atom_feed" rel="related" type="text/html" title="Using Matplotlib to Advocate for Postdocs" />
            
                <id>https://blog.scientific-python.org/matplotlib/an-inquiry-into-matplotlib-figures/</id>
            
            
            <published>2019-12-24T11:25:42+05:30</published>
            <updated>2019-12-24T11:25:42+05:30</updated>
            
            
            <content type="html"><![CDATA[<blockquote>This guide dives deep into the inner workings of Matplotlib&rsquo;s Figures, Axes, subplots and the very amazing GridSpec!</blockquote><h1 id="preliminaries">Preliminaries<a class="headerlink" href="#preliminaries" title="Link to this heading">#</a></h1>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib</span> <span class="k">as</span> <span class="nn">mpl</span></span></span></code></pre>
</div>
<blockquote>
<p>A top-down runnable Jupyter Notebook with the exact contents of this blog can be found <a href="https://gist.github.com/akashpalrecha/4652e98c9b2f3f1961637be001dc0239">here</a>.</p>
</blockquote>
<blockquote>
<p>An interactive version of this guide can be accessed on <a href="https://colab.research.google.com/drive/1SOgWPI9HckTQ0zm47Ma-gYCTucMccTxg">Google Colab</a>.</p>
</blockquote>
<h1 id="a-word-before-we-get-started">A word before we get started&hellip;<a class="headerlink" href="#a-word-before-we-get-started" title="Link to this heading">#</a></h1>
<hr>
<p>Although a beginner can follow along with this guide, it is primarily meant for people who have at least a basic knowledge of how Matplotlib&rsquo;s plotting functionality works.</p>
<p>Essentially, if you know how to take 2 NumPy arrays and plot them (using an appropriate type of graph) on 2 different axes in a single figure and give it basic styling, you&rsquo;re good to go for the purposes of this guide.</p>
<p>If you feel you need some introduction to basic Matplotlib plotting, here&rsquo;s a great guide that can help you get a feel for <a href="https://matplotlib.org/devdocs/gallery/subplots_axes_and_figures/subplots_demo.html">introductory plotting using Matplotlib</a>.</p>
<p>From here on, I will be assuming that you have gained sufficient knowledge to follow along with this guide.</p>
<p>Also, in order to save everyone&rsquo;s time, I will keep my explanations short, terse and very much to the point, and sometimes leave it for the reader to interpret things (because that&rsquo;s what I&rsquo;ve done throughout this guide for myself anyway).</p>
<p>The primary driver in this whole exercise will be code and not text, and I encourage you to spin up a Jupyter notebook and type in and try out everything yourself to make the best use of this resource.</p>
<h2 id="what-this-guide-is-and-what-it-is-not">What this guide <em>is</em> and what it is <em>not</em>:<a class="headerlink" href="#what-this-guide-is-and-what-it-is-not" title="Link to this heading">#</a></h2>
<p>This is not a guide about how to beautifully plot different kinds of data using Matplotlib; the internet is more than full of such tutorials by people who can explain it way better than I can.</p>
<p>This article attempts to explain the workings of some of the foundations of any plot you create using Matplotlib.
We will mostly refrain from focusing on what data we are plotting and instead focus on the anatomy of our plots.</p>
<h1 id="setting-up">Setting up<a class="headerlink" href="#setting-up" title="Link to this heading">#</a></h1>
<p>Matplotlib has many styles available. We can see the available options using:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">style</span><span class="o">.</span><span class="n">available</span></span></span></code></pre>
</div>
<pre><code>['seaborn-dark',
 'seaborn-darkgrid',
 'seaborn-ticks',
 'fivethirtyeight',
 'seaborn-whitegrid',
 'classic',
 '_classic_test',
 'fast',
 'seaborn-talk',
 'seaborn-dark-palette',
 'seaborn-bright',
 'seaborn-pastel',
 'grayscale',
 'seaborn-notebook',
 'ggplot',
 'seaborn-colorblind',
 'seaborn-muted',
 'seaborn',
 'Solarize_Light2',
 'seaborn-paper',
 'bmh',
 'tableau-colorblind10',
 'seaborn-white',
 'dark_background',
 'seaborn-poster',
 'seaborn-deep']
</code></pre>
<p>We shall use <code>seaborn</code>. This is done like so:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">style</span><span class="o">.</span><span class="n">use</span><span class="p">(</span><span class="s2">&#34;seaborn&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
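<p>The style names above come from an older Matplotlib release. In Matplotlib 3.6 and later the bundled seaborn styles carry a <code>seaborn-v0_8</code> prefix, so the plain <code>seaborn</code> name may not be available. A hedged sketch that picks whichever name exists (the variables <code>candidates</code> and <code>style_name</code> are introduced here for illustration):</p>

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Try the newer style name first, then the older one, then fall back to the default.
candidates = ["seaborn-v0_8", "seaborn"]
found = [name for name in candidates if name in plt.style.available]
style_name = found[0] if found else "default"

plt.style.use(style_name)
print(style_name)
```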
<p>Let&rsquo;s get started!</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># Creating some fake data for plotting</span>
</span></span><span class="line"><span class="cl"><span class="n">xs</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="mi">400</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ys</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">xs</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">xc</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="mi">600</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">yc</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="n">xc</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span></span></span></code></pre>
</div>
<h1 id="exploration">Exploration<a class="headerlink" href="#exploration" title="Link to this heading">#</a></h1>
<p>The usual way to create a plot using Matplotlib goes somewhat like this:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="mi">8</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># `fig` is short for Figure. `ax` is short for Axes.</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xs</span><span class="p">,</span> <span class="n">ys</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">]</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xs</span><span class="p">,</span> <span class="n">ys</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">]</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xc</span><span class="p">,</span> <span class="n">yc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xc</span><span class="p">,</span> <span class="n">yc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s2">&#34;Basic plotting using Matplotlib&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_14_0.png" alt="png"></p>
<p>Our goal today is to take apart the previous snippet of code and understand all of the underlying building blocks well enough so that we can use them separately and in a much more powerful way.</p>
<p>If you&rsquo;re a beginner like I was before writing this guide, let me assure you: this is all very simple stuff.</p>
<p>Going through the <a href="https://matplotlib.org/api/_as_gen/matplotlib.pyplot.subplots.html?highlight=subplots#matplotlib.pyplot.subplots"><code>plt.subplots</code></a> documentation (hit <code>Shift+Tab+Tab</code> in a Jupyter notebook) reveals some of the other Matplotlib internals that it uses in order to give us the <code>Figure</code> and its <code>Axes</code>.</p>
<p>These include:</p>
<ol>
<li><code>plt.subplot</code></li>
<li><code>plt.figure</code></li>
<li><code>mpl.figure.Figure</code></li>
<li><code>mpl.figure.Figure.add_subplot</code></li>
<li><code>mpl.gridspec.GridSpec</code></li>
<li><code>mpl.axes.Axes</code></li>
</ol>
<p>Let&rsquo;s try and figure out what these functions / classes do.</p>
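<p>Before taking things apart, it can help to check what <code>plt.subplots</code> hands back. The following is a minimal sketch, assuming a non-interactive <code>Agg</code> backend (chosen here only so that the snippet runs without a display). It is an inspection aid rather than part of the original walkthrough:</p>

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, so no window is needed

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.axes import Axes
from matplotlib.figure import Figure

fig, ax = plt.subplots(2, 2, figsize=(16, 8))

# `fig` is a single Figure; `ax` is a 2x2 NumPy array of Axes objects,
# which is why the first snippet indexes it as ax[row, column].
is_figure = isinstance(fig, Figure)
is_array = isinstance(ax, np.ndarray)
grid_shape = ax.shape
all_axes = all(isinstance(a, Axes) for a in ax.flat)

print(is_figure, is_array, grid_shape, all_axes)
plt.close(fig)  # release the figure that pyplot keeps track of
```

<p>So the two names returned by <code>plt.subplots</code> already correspond to the two central classes in the list above.</p>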
<h1 id="what-is-a-figure-and-what-are-axes">What is a <code>Figure</code>? And what are <code>Axes</code>?<a class="headerlink" href="#what-is-a-figure-and-what-are-axes" title="Link to this heading">#</a></h1>
<p>A <a href="https://matplotlib.org/api/_as_gen/matplotlib.pyplot.figure.html?highlight=figure#matplotlib.pyplot.figure"><code>Figure</code></a> in Matplotlib is simply your main (imaginary) canvas. This is where you will be doing all your plotting / drawing / putting images and whatnot. This is the central object with which you will always be interacting. A figure has a size defined for it at the time of creation.</p>
<p>You can define a figure like so (both statements create a <code>Figure</code>; the <code>plt.figure</code> version additionally registers it with pyplot so that <code>plt.show()</code> can display it):</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">figure</span><span class="o">.</span><span class="n">Figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># OR</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span></span></span></code></pre>
</div>
<p>Notice the word <em>imaginary</em> above. What this means is that a Figure by itself does not have any place for you to plot. You need to attach/add an <a href="https://matplotlib.org/api/axes_api.html?highlight=matplotlib.axes.axes#matplotlib.axes.Axes"><code>Axes</code></a> to it to do any kind of plotting. You can put as many <code>Axes</code> objects as you want inside of any <code>Figure</code> you have created.</p>
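<p>To make the point concrete, here is a small sketch showing that a freshly created <code>Figure</code> holds no <code>Axes</code> at all until one is added. The rectangle passed to <code>add_axes</code> is an arbitrary value picked for illustration:</p>

```python
from matplotlib.figure import Figure

fig = Figure(figsize=(10, 10))
axes_before = len(fig.axes)  # nothing to draw on yet

# Arbitrary rectangle, chosen only for illustration.
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
axes_after = len(fig.axes)

print(axes_before, axes_after)
```

<p>The new <code>Axes</code> keeps a reference to its parent, so <code>ax.figure</code> is the same object as <code>fig</code>.</p>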
<p>An <code>Axes</code>:</p>
<ol>
<li>Has a space (like a blank page) where you can draw/plot data.</li>
<li>Has a parent <code>Figure</code>.</li>
<li>Has properties stating where it will be placed inside its parent <code>Figure</code>.</li>
<li>Has methods to draw/plot different kinds of data in different ways and add custom styles.</li>
</ol>
<p>You can create an <code>Axes</code> like so (both statements are equivalent):</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax1</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">axes</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mf">0.8</span><span class="p">,</span> <span class="mf">0.8</span><span class="p">],</span> <span class="n">facecolor</span><span class="o">=</span><span class="s2">&#34;red&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># OR</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mf">0.8</span><span class="p">,</span> <span class="mf">0.8</span><span class="p">],</span> <span class="n">facecolor</span><span class="o">=</span><span class="s2">&#34;red&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1">#</span></span></span></code></pre>
</div>
<p>The first parameter <code>fig</code> is simply a reference to the parent <code>Figure</code> to which the <code>Axes</code> will belong.<br>
The second parameter <code>rect</code> has four numbers: <code>[left_position, bottom_position, width, height]</code>. They define the position of the <code>Axes</code> inside the <code>Figure</code> and its width and height <em>with respect to the <code>Figure</code></em>. All these numbers are expressed as fractions of the figure width and height (so <code>0.8</code> means 80%).</p>
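<p>As a rough worked example of how <code>rect</code> is interpreted, the sketch below converts the fractional width and height of an <code>Axes</code> back into inches. The figure size and the rectangle are made-up values, and Matplotlib orders the four numbers as left, bottom, width, height:</p>

```python
from matplotlib.figure import Figure

fig = Figure(figsize=(10, 5))  # 10 inches wide, 5 inches tall

# rect = [left, bottom, width, height], all as fractions of the figure
ax = fig.add_axes([0.1, 0.2, 0.5, 0.6])

bbox = ax.get_position()  # the same rectangle, as a Bbox in figure fractions
fig_width_in, fig_height_in = fig.get_size_inches()

width_in = bbox.width * fig_width_in     # 0.5 of 10 inches
height_in = bbox.height * fig_height_in  # 0.6 of 5 inches

print(width_in, height_in)
```

<p>The same fraction therefore produces a different physical size depending on the figure it is attached to.</p>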
<p>A <code>Figure</code> simply holds a given number of <code>Axes</code> at any point of time.</p>
<p>We will go into some of these design decisions in a few moments.</p>
<h1 id="recreating-pltsubplots-with-basic-matplotlib-functionality">Recreating <code>plt.subplots</code> with basic Matplotlib functionality<a class="headerlink" href="#recreating-pltsubplots-with-basic-matplotlib-functionality" title="Link to this heading">#</a></h1>
<p>We will try and recreate the below plot using Matplotlib primitives as a way to understand them better. We&rsquo;ll try and be slightly creative by deviating a bit though.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s2">&#34;2x2 Grid&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<pre><code>Text(0.5, 0.98, '2x2 Grid')
</code></pre>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_20_1.png" alt="png"></p>
<h1 id="lets-create-our-first-plot-using-matplotlib-primitives">Let&rsquo;s create our first plot using Matplotlib primitives:<a class="headerlink" href="#lets-create-our-first-plot-using-matplotlib-primitives" title="Link to this heading">#</a></h1>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># We first need a figure, an imaginary canvas to put things on</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">Figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">6</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># Let&#39;s start with two Axes with an arbitrary position and size</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">],</span> <span class="n">facecolor</span><span class="o">=</span><span class="s2">&#34;red&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="n">facecolor</span><span class="o">=</span><span class="s2">&#34;blue&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>Now you need to add the <code>Axes</code> to <code>fig</code>. You should stop right here and think about why there would be a need to do this when <code>fig</code> is already a parent of <code>ax1</code> and <code>ax2</code>. Let&rsquo;s do this anyway and we&rsquo;ll go into the details afterwards.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">(</span><span class="n">ax2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">(</span><span class="n">ax1</span><span class="p">)</span></span></span></code></pre>
</div>
<pre><code>&lt;matplotlib.axes._axes.Axes at 0x1211dead0&gt;
</code></pre>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># As you can see the Axes are exactly where we specified.</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_25_0.png" alt="png"></p>
<p>That means you can do this now:</p>
<blockquote>
<p>Remark: Notice the <code>ax.reverse()</code> call in the snippet below. If I hadn&rsquo;t done that, the biggest plot would be added last, on top of every other plot, and you would just see a single, blank &lsquo;cyan&rsquo; colored plot.</p>
</blockquote>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">6</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="n">sizes</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mf">0.02</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">50</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">50</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">color</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="nb">hex</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">sizes</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">*</span> <span class="mi">255</span><span class="p">)))[</span><span class="mi">2</span><span class="p">:]</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">color</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">color</span> <span class="o">=</span> <span class="s2">&#34;0&#34;</span> <span class="o">+</span> <span class="n">color</span>
</span></span><span class="line"><span class="cl">    <span class="n">color</span> <span class="o">=</span> <span class="s2">&#34;#99&#34;</span> <span class="o">+</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">color</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">plt</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">sizes</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">sizes</span><span class="p">[</span><span class="n">i</span><span class="p">]],</span> <span class="n">facecolor</span><span class="o">=</span><span class="n">color</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">reverse</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">axes</span> <span class="ow">in</span> <span class="n">ax</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">(</span><span class="n">axes</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_27_0.png" alt="png"></p>
<p>The above example demonstrates why it is important to decouple the process of creation of an <code>Axes</code> and actually putting it onto a <code>Figure</code>.</p>
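<p>A smaller sketch of the same idea: an <code>Axes</code> that has only been constructed is not yet part of the figure, and the order of the <code>add_axes</code> calls is the order in which the figure stores them. Assuming equal z-order, Axes added later are drawn on top. The colors and rectangles are arbitrary:</p>

```python
from matplotlib.axes import Axes
from matplotlib.figure import Figure

fig = Figure(figsize=(6, 6))

# Constructing an Axes links it to `fig` but does not place it on the figure.
big = Axes(fig, [0, 0, 1, 1], facecolor="blue")
small = Axes(fig, [0.3, 0.3, 0.4, 0.4], facecolor="red")
created_only = list(fig.axes)  # still empty

# Add the big one first so the small one ends up on top of it.
fig.add_axes(big)
fig.add_axes(small)
stacking_order = list(fig.axes)

print(len(created_only), len(stacking_order))
```

<p>Swapping the two <code>add_axes</code> calls would hide the small red <code>Axes</code> behind the big blue one.</p>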
<p>Also, you can remove an <code>Axes</code> from the canvas area of a <code>Figure</code> like so:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">delaxes</span><span class="p">(</span><span class="n">ax</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>  <span class="c1"># pass a single Axes, not the whole list</span></span></span></code></pre>
</div>
<p>This can be useful when you want to compare the same primary data (GDP) to several secondary data sources (education, spending, etc.) one by one (you&rsquo;ll need to add and delete each graph from the Figure in succession).<br>
I also encourage you to look into the documentation for <code>Figure</code> and <code>Axes</code> and glance over the several methods available to them. This will help you know what parts of the wheel you do not need to rebuild when you&rsquo;re working with these objects the next time.</p>
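<p>Here is a minimal sketch of removing one <code>Axes</code> while keeping another. The names <code>primary</code> and <code>secondary</code> are placeholders invented for this example, standing in for the primary and secondary data panels mentioned above:</p>

```python
from matplotlib.figure import Figure

fig = Figure(figsize=(8, 4))

# Two side-by-side Axes; the rectangles are arbitrary.
primary = fig.add_axes([0.05, 0.1, 0.4, 0.8])
secondary = fig.add_axes([0.55, 0.1, 0.4, 0.8])

# Remove only the secondary panel from the figure.
fig.delaxes(secondary)
remaining = list(fig.axes)

print(len(remaining))
```

<p>After the call, only <code>primary</code> is left in <code>fig.axes</code>, and a different secondary panel could be added in its place.</p>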
<h2 id="recreating-our-subplots-literally-from-scratch">Recreating our subplots literally from scratch<a class="headerlink" href="#recreating-our-subplots-literally-from-scratch" title="Link to this heading">#</a></h2>
<p>This should now make sense. We can now recreate our original <code>plt.subplots(2, 2)</code> example using what we have learned so far.<br>
(Although this is definitely not the most convenient way to do it.)</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">figure</span><span class="o">.</span><span class="n">Figure</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s2">&#34;Recreating plt.subplots(2, 2)&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax1</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">axes</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">axes</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax3</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">axes</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mf">0.5</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax4</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">axes</span><span class="o">.</span><span class="n">Axes</span><span class="p">(</span><span class="n">fig</span><span class="o">=</span><span class="n">fig</span><span class="p">,</span> <span class="n">rect</span><span class="o">=</span><span class="p">[</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">(</span><span class="n">ax1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">(</span><span class="n">ax2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">(</span><span class="n">ax3</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">(</span><span class="n">ax4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_30_0.png" alt="png"></p>
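<p>The four hand-written rectangles follow a simple pattern, so they can also be computed. The helper below is a hypothetical sketch (<code>grid_rects</code>, <code>cell</code> and <code>pitch</code> are names made up for this example), using the same 0.42 cell size and 0.5 spacing as the snippet above:</p>

```python
from matplotlib.figure import Figure


def grid_rects(nrows, ncols, cell=0.42, pitch=0.5):
    """Return [left, bottom, width, height] rects, row by row from the top.

    Hypothetical helper: `cell` is the size of each Axes and `pitch` is the
    distance between the lower-left corners of neighbouring Axes.
    """
    rects = []
    for row in range(nrows):
        for col in range(ncols):
            left = col * pitch
            bottom = (nrows - 1 - row) * pitch  # row 0 sits at the top
            rects.append([left, bottom, cell, cell])
    return rects


fig = Figure()
rects = grid_rects(2, 2)
axes = [fig.add_axes(rect) for rect in rects]

print(rects)
```

<p>Doing this bookkeeping by hand gets tedious quickly, which is the motivation for a dedicated grid object.</p>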
<h2 id="using-gridspecgridspec">Using <code>gridspec.GridSpec</code><a class="headerlink" href="#using-gridspecgridspec" title="Link to this heading">#</a></h2>
<p>Docs : <a href="https://matplotlib.org/api/_as_gen/matplotlib.gridspec.GridSpec.html#matplotlib.gridspec.GridSpec">https://matplotlib.org/api/_as_gen/matplotlib.gridspec.GridSpec.html#matplotlib.gridspec.GridSpec</a></p>
<p><code>GridSpec</code> objects allow us more intuitive control over exactly how our plot is divided into subplots and what the size of each <code>Axes</code> is.<br>
You can essentially decide on a <strong>Grid</strong> which all your <code>Axes</code> will conform to when laying themselves out.<br>
Once you define a grid, or <code>GridSpec</code> so to say, you can use that object to <em>generate</em> new <code>Axes</code> conforming to the grid, which you can then add to your <code>Figure</code>.</p>
<p>Let&rsquo;s see how all of this works in code:</p>
<p>You can define a <code>GridSpec</code> object like so (both statements are equivalent):</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">gridspec</span><span class="o">.</span><span class="n">GridSpec</span><span class="p">(</span><span class="n">nrows</span><span class="p">,</span> <span class="n">ncols</span><span class="p">,</span> <span class="n">width_ratios</span><span class="o">=</span><span class="n">width_ratios</span><span class="p">,</span> <span class="n">height_ratios</span><span class="o">=</span><span class="n">height_ratios</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># OR</span>
</span></span><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">GridSpec</span><span class="p">(</span><span class="n">nrows</span><span class="p">,</span> <span class="n">ncols</span><span class="p">,</span> <span class="n">width_ratios</span><span class="o">=</span><span class="n">width_ratios</span><span class="p">,</span> <span class="n">height_ratios</span><span class="o">=</span><span class="n">height_ratios</span><span class="p">)</span></span></span></code></pre>
</div>
<p>More specifically:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">GridSpec</span><span class="p">(</span><span class="n">nrows</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">ncols</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">width_ratios</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="n">height_ratios</span><span class="o">=</span><span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span></span></span></code></pre>
</div>
<p><code>nrows</code> and <code>ncols</code> are pretty self-explanatory. <code>width_ratios</code> determines the relative width of each column. <code>height_ratios</code> does the same for the height of each row.
The whole <code>grid</code> will always distribute itself using all the space available to it inside of a figure (things change up a bit when you have multiple <code>GridSpec</code> objects for a single figure, but that&rsquo;s for you to explore!). And inside of a <code>grid</code>, all the Axes will conform to the sizes and ratios already defined.</p>
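<p>To see what the ratios translate to, the sketch below asks a <code>GridSpec</code> for the edges of its cells and compares the resulting column widths and row heights. It assumes default spacing and uses <code>GridSpec.get_grid_positions</code>, which reports positions in figure fractions:</p>

```python
import numpy as np
from matplotlib.figure import Figure
from matplotlib.gridspec import GridSpec

fig = Figure()
gs = GridSpec(nrows=3, ncols=3, width_ratios=[1, 2, 3], height_ratios=[3, 2, 1])

# Edges of the grid cells in figure fractions; rows are listed top to bottom.
bottoms, tops, lefts, rights = (np.asarray(v) for v in gs.get_grid_positions(fig))

col_widths = rights - lefts
row_heights = tops - bottoms

# Divide by the smallest cell to recover the relative sizes.
width_ratio = col_widths / col_widths.min()
height_ratio = row_heights / row_heights.min()

print(np.round(width_ratio, 3), np.round(height_ratio, 3))
```

<p>The recovered values match the ratios that were passed in: columns grow from left to right, and rows shrink from top to bottom.</p>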


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Taken from https://matplotlib.org/gallery/userdemo/demo_gridspec03.html#sphx-glr-gallery-userdemo-demo-gridspec03-py
</span></span></span><span class="line"><span class="cl"><span class="s2">    takes a figure and puts an &#39;axN&#39; label in the center of each Axes
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">ax</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="s2">&#34;ax</span><span class="si">%d</span><span class="s2">&#34;</span> <span class="o">%</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">),</span> <span class="n">va</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s2">&#34;center&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">ax</span><span class="o">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">labelbottom</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">labelleft</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span></span></span></code></pre>
</div>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># We will try and vary axis sizes here just to see what happens</span>
</span></span><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">gridspec</span><span class="o">.</span><span class="n">GridSpec</span><span class="p">(</span><span class="n">nrows</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">ncols</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">width_ratios</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">height_ratios</span><span class="o">=</span><span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span></span></span></code></pre>
</div>
<pre><code>&lt;Figure size 576x396 with 0 Axes&gt;
</code></pre>
<p>You can pass the cells of a <code>GridSpec</code> object (for example <code>gs[0]</code>) to a <code>Figure</code> to create subplots in your desired sizes and proportions like so:<br>
Notice how the sizes of the <code>Axes</code> relate to the ratios we defined when creating the Grid.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span><span class="p">,</span> <span class="n">ax2</span><span class="p">,</span> <span class="n">ax3</span><span class="p">,</span> <span class="n">ax4</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">0</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">1</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">2</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl">    <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">3</span><span class="p">]),</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_36_0.png" alt="png"></p>
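<p>If you&rsquo;d rather check the proportions numerically than by eye, here is a minimal sketch that is not part of the original notebook. It assumes a headless <code>Agg</code> backend and reads each <code>Axes</code> position back with <code>get_position()</code>; the variable names (<code>top_left</code>, <code>width_ratio</code> and so on) are my own.</p>

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

fig = plt.figure()
gs = GridSpec(nrows=2, ncols=2, width_ratios=[1, 2], height_ratios=[4, 1])
ax1, ax2, ax3, ax4 = [fig.add_subplot(spec) for spec in gs]

# Axes positions are expressed in figure-relative coordinates (0 to 1).
top_left = ax1.get_position()
top_right = ax2.get_position()
bottom_left = ax3.get_position()

width_ratio = top_right.width / top_left.width        # should be close to 2
height_ratio = top_left.height / bottom_left.height   # should be close to 4
print(width_ratio, height_ratio)
```

<p>The two printed numbers should come out very close to the <code>[1, 2]</code> and <code>[4, 1]</code> ratios passed to <code>GridSpec</code>.</p>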
<p>Here is a simpler way to do the same thing:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">add_gs_to_fig</span><span class="p">(</span><span class="n">fig</span><span class="p">,</span> <span class="n">gs</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;Adds all `SubplotSpec`s in `gs` to `fig`&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">g</span> <span class="ow">in</span> <span class="n">gs</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">g</span><span class="p">)</span></span></span></code></pre>
</div>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">add_gs_to_fig</span><span class="p">(</span><span class="n">fig</span><span class="p">,</span> <span class="n">gs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_39_0.png" alt="png"></p>
<p>That means you can now do this:<br>
(Notice how the <code>Axes</code> sizes increase from top-left to bottom-right)</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">14</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">length</span> <span class="o">=</span> <span class="mi">6</span>
</span></span><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">GridSpec</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">nrows</span><span class="o">=</span><span class="n">length</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">ncols</span><span class="o">=</span><span class="n">length</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">width_ratios</span><span class="o">=</span><span class="nb">list</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">length</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)),</span>
</span></span><span class="line"><span class="cl">    <span class="n">height_ratios</span><span class="o">=</span><span class="nb">list</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">length</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)),</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">add_gs_to_fig</span><span class="p">(</span><span class="n">fig</span><span class="p">,</span> <span class="n">gs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">ax</span> <span class="ow">in</span> <span class="n">fig</span><span class="o">.</span><span class="n">axes</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xs</span><span class="p">,</span> <span class="n">ys</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_41_0.png" alt="png"></p>
<h2 id="a-very-unexpected-observation-which-gives-us-yet-more-clarity-and-power">A very unexpected observation: (which gives us yet more clarity, and Power)<a class="headerlink" href="#a-very-unexpected-observation-which-gives-us-yet-more-clarity-and-power" title="Link to this heading">#</a></h2>
<p>Notice how, each time we run the same indexing expression, different addresses get printed for each piece of <code>gs</code>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">gs</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span></span></span></code></pre>
</div>
<pre><code>(&lt;matplotlib.gridspec.SubplotSpec at 0x1282a9e50&gt;,
 &lt;matplotlib.gridspec.SubplotSpec at 0x12942add0&gt;,
 &lt;matplotlib.gridspec.SubplotSpec at 0x12942a750&gt;,
 &lt;matplotlib.gridspec.SubplotSpec at 0x12a727e10&gt;)
</code></pre>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">gs</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span></span></span></code></pre>
</div>
<pre><code>(&lt;matplotlib.gridspec.SubplotSpec at 0x127d5c6d0&gt;,
 &lt;matplotlib.gridspec.SubplotSpec at 0x12b6d0b10&gt;,
 &lt;matplotlib.gridspec.SubplotSpec at 0x129fc6390&gt;,
 &lt;matplotlib.gridspec.SubplotSpec at 0x129fc6a50&gt;)
</code></pre>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span></span></span></code></pre>
</div>
<pre><code>&lt;matplotlib.gridspec.SubplotSpec object at 0x12951a610&gt; &lt;matplotlib.gridspec.SubplotSpec object at 0x12951a890&gt; &lt;matplotlib.gridspec.SubplotSpec object at 0x12951ac10&gt; &lt;matplotlib.gridspec.SubplotSpec object at 0x12951a150&gt;
</code></pre>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="n">gs</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span></span></span></code></pre>
</div>
<pre><code>&lt;matplotlib.gridspec.SubplotSpec object at 0x128fad4d0&gt; &lt;matplotlib.gridspec.SubplotSpec object at 0x1291ebbd0&gt; &lt;matplotlib.gridspec.SubplotSpec object at 0x1294f9850&gt; &lt;matplotlib.gridspec.SubplotSpec object at 0x128106250&gt;
</code></pre>
<p><strong>Let&rsquo;s understand why this happens:</strong></p>
<p><em>Notice how indexing into a group of <code>gs</code> cells at the same time also produces just one object instead of multiple objects:</em></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">gs</span><span class="p">[:,</span> <span class="p">:],</span> <span class="n">gs</span><span class="p">[:,</span> <span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="c1"># both output just one object each</span></span></span></code></pre>
</div>
<pre><code>(&lt;matplotlib.gridspec.SubplotSpec at 0x128116e50&gt;,
 &lt;matplotlib.gridspec.SubplotSpec at 0x128299290&gt;)
</code></pre>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># Let&#39;s try another `gs` object, this time a little more crowded</span>
</span></span><span class="line"><span class="cl"><span class="c1"># I chose the ratios randomly</span>
</span></span><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">gridspec</span><span class="o">.</span><span class="n">GridSpec</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">nrows</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">ncols</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">width_ratios</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="n">height_ratios</span><span class="o">=</span><span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
<p><em>All these operations print just one object. What is going on here?</em></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">gs</span><span class="p">[:,</span> <span class="mi">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">1</span><span class="p">:,</span> <span class="p">:</span><span class="mi">2</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">gs</span><span class="p">[:,</span> <span class="p">:])</span></span></span></code></pre>
</div>
<pre><code>&lt;matplotlib.gridspec.SubplotSpec object at 0x12a075fd0&gt;
&lt;matplotlib.gridspec.SubplotSpec object at 0x128cf0990&gt;
&lt;matplotlib.gridspec.SubplotSpec object at 0x12a075fd0&gt;
</code></pre>
<p>Let&rsquo;s try adding subplots to our <code>Figure</code> to <em>see</em> what&rsquo;s going on.<br>
We&rsquo;ll try a few different combinations to get an exact idea.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="n">ax1</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[:</span><span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax3</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[:,</span> <span class="mi">1</span><span class="p">:])</span>
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_54_0.png" alt="png"></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># ax1 = fig.add_subplot(gs[:2, 0])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax2</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax3</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[:,</span> <span class="mi">1</span><span class="p">:])</span>
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_55_0.png" alt="png"></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># ax1 = fig.add_subplot(gs[:2, 0])</span>
</span></span><span class="line"><span class="cl"><span class="c1"># ax2 = fig.add_subplot(gs[2, 0])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax3</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[:,</span> <span class="mi">1</span><span class="p">:])</span>
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_56_0.png" alt="png"></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># ax1 = fig.add_subplot(gs[:2, 0])</span>
</span></span><span class="line"><span class="cl"><span class="c1"># ax2 = fig.add_subplot(gs[2, 0])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax3</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[:,</span> <span class="mi">1</span><span class="p">:])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Notice the line below : You can overlay Axes using `GridSpec` too</span>
</span></span><span class="line"><span class="cl"><span class="n">ax4</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="n">gs</span><span class="p">[</span><span class="mi">2</span><span class="p">:,</span> <span class="mi">1</span><span class="p">:])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax4</span><span class="o">.</span><span class="n">set_facecolor</span><span class="p">(</span><span class="s2">&#34;orange&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_57_0.png" alt="png"></p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">add_gs_to_fig</span><span class="p">(</span><span class="n">fig</span><span class="p">,</span> <span class="n">gs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_58_0.png" alt="png"></p>
<p>Here&rsquo;s a summary of what this means:</p>
<ol>
<li><code>gs</code> can be used as a sort of <code>factory</code> for different kinds of <code>Axes</code>.</li>
<li>You give this <code>factory</code> an order by indexing into particular areas of the <code>Grid</code>. It gives back a single <code>SubplotSpec</code> object (check <code>type(gs[0])</code>) that helps you create an <code>Axes</code> covering all of the area you indexed into, combined into one unit.</li>
<li>Your <code>height</code> and <code>width</code> ratios for the indexed portion will determine the size of the <code>Axes</code> that gets generated.</li>
<li><code>Axes</code> will always maintain relative proportions according to your <code>height</code> and <code>width</code> ratios.</li>
<li>For all these reasons, I like <code>GridSpec</code>!</li>
</ol>
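<p>To make the &ldquo;one object for the whole indexed area&rdquo; idea concrete, here is a small sketch. It relies on the <code>rowspan</code> and <code>colspan</code> attributes of <code>SubplotSpec</code>, which I&rsquo;m assuming are available in your Matplotlib (they exist in recent releases); the names <code>spec</code>, <code>rows</code> and <code>cols</code> are only for illustration.</p>

```python
from matplotlib.gridspec import GridSpec

gs = GridSpec(nrows=3, ncols=3, width_ratios=[1, 2, 1], height_ratios=[4, 1, 3])

spec = gs[1:, :2]           # one SubplotSpec covering a 2x2 block of cells
rows = list(spec.rowspan)   # grid rows covered by the block
cols = list(spec.colspan)   # grid columns covered by the block
print(type(spec).__name__, rows, cols)
```

<p>So a slice does not hand back a list of cells: it hands back one <code>SubplotSpec</code> that remembers which rows and columns it spans.</p>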
<p>This ability to create different grid variations that <code>GridSpec</code> provides is probably the reason for the anomaly we saw a while ago (different addresses being printed each time).</p>
<p>It creates a new object every time you index into it, because it would be very troublesome to store every possible <code>SubplotSpec</code> object in memory (try counting the possibilities for a <code>GridSpec</code> of 10x10 and you&rsquo;ll see why).</p>
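<p>Here is a minimal sketch of that behaviour, separating identity from equality. It assumes (as is the case in the Matplotlib versions I know of) that <code>SubplotSpec</code> compares equal when two specs describe the same cells of the same grid; <code>first</code> and <code>second</code> are illustrative names.</p>

```python
from matplotlib.gridspec import GridSpec

gs = GridSpec(nrows=2, ncols=2)

first = gs[0]
second = gs[0]

same_object = first is second   # identity: are they the very same object?
same_cell = first == second     # equality: do they describe the same cell?
print(same_object, same_cell)
```

<p>The two results should differ: indexing twice builds two distinct objects, yet both describe the same cell of the grid.</p>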
<hr>
<h2 id="now-lets-finally-create-pltsubplots22-once-again-using-gridspec">Now let&rsquo;s finally create <code>plt.subplots(2,2)</code> once again using GridSpec<a class="headerlink" href="#now-lets-finally-create-pltsubplots22-once-again-using-gridspec" title="Link to this heading">#</a></h2>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">gs</span> <span class="o">=</span> <span class="n">mpl</span><span class="o">.</span><span class="n">gridspec</span><span class="o">.</span><span class="n">GridSpec</span><span class="p">(</span><span class="n">nrows</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">ncols</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">add_gs_to_fig</span><span class="p">(</span><span class="n">fig</span><span class="p">,</span> <span class="n">gs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">annotate_axes</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s2">&#34;We&#39;re done!&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="s2">&#34;yayy&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<pre><code>yayy
</code></pre>
<p><img src="/matplotlib/an-inquiry-into-matplotlib-figures/output_61_1.png" alt="png"></p>
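<p>If you want to convince yourself that the two approaches really give the same layout, here is a rough check (my own addition, assuming default settings and a headless <code>Agg</code> backend). It compares the position of every <code>Axes</code> created by <code>plt.subplots(2, 2)</code> with the ones built from a plain <code>GridSpec</code>.</p>

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.gridspec import GridSpec

# Layout produced by the convenience function.
fig_a, axes_a = plt.subplots(2, 2)

# Layout built by hand from a GridSpec.
fig_b = plt.figure()
gs = GridSpec(nrows=2, ncols=2)
axes_b = [fig_b.add_subplot(spec) for spec in gs]

# Each row holds (x0, y0, width, height) in figure-relative coordinates.
bounds_a = np.array([ax.get_position().bounds for ax in axes_a.flat])
bounds_b = np.array([ax.get_position().bounds for ax in axes_b])
same_layout = bool(np.allclose(bounds_a, bounds_b))
print(same_layout)
```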
<h1 id="what-you-should-try">What you should try:<a class="headerlink" href="#what-you-should-try" title="Link to this heading">#</a></h1>
<hr>
<p>Here are a few things I think you should go ahead and explore:</p>
<ol>
<li>Multiple <code>GridSpec</code> objects for the same <code>Figure</code>.</li>
<li>Deleting and adding <code>Axes</code> effectively and meaningfully.</li>
<li>All the methods available for <code>mpl.figure.Figure</code> and <code>mpl.axes.Axes</code> allowing us to manipulate their properties.</li>
<li>Kaggle Learn&rsquo;s Data visualization course is a great place to learn effective plotting using Python.</li>
<li>Armed with this knowledge, you will be able to use Matplotlib-based plotting tools such as <code>seaborn</code> and <code>pandas</code> with much more flexibility (many of their plotting functions accept an <code>Axes</code> object through an <code>ax</code> argument). I encourage you to explore these, and other libraries such as <code>plotly</code> and <code>altair</code>, too.</li>
</ol>
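<p>As a starting point for the first item, here is a minimal sketch of two <code>GridSpec</code> objects sharing one <code>Figure</code>. It assumes the <code>left</code> and <code>right</code> keyword arguments of <code>GridSpec</code> to confine each grid to its own half; the particular values and the names <code>left_gs</code> and <code>right_gs</code> are arbitrary choices of mine.</p>

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

fig = plt.figure(figsize=(8, 4))

# Two independent grids, each confined to its own half of the figure.
left_gs = GridSpec(nrows=2, ncols=1, left=0.05, right=0.45)
right_gs = GridSpec(nrows=1, ncols=2, left=0.55, right=0.95)

left_axes = [fig.add_subplot(spec) for spec in left_gs]
right_axes = [fig.add_subplot(spec) for spec in right_gs]
print(len(fig.axes))
```

<p>Each grid keeps its own rows, columns and ratios, so the two halves can be laid out independently of each other.</p>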
<p>This is the first time I&rsquo;ve written a technical guide for the internet, so it may not be as clean as tutorials generally are. I&rsquo;m open to any constructive criticism you may have for me (drop me an email at <a href="mailto:akashpalrecha@gmail.com">akashpalrecha@gmail.com</a>).</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Custom 3D engine in Matplotlib]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/custom-3d-engine/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="related" type="text/html" title="Creating the Warming Stripes in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/matplotlib-in-data-driven-seo/?utm_source=atom_feed" rel="related" type="text/html" title="Matplotlib in Data Driven SEO" />
                <link href="https://blog.scientific-python.org/matplotlib/using-matplotlib-to-advocate-for-postdocs/?utm_source=atom_feed" rel="related" type="text/html" title="Using Matplotlib to Advocate for Postdocs" />
            
                <id>https://blog.scientific-python.org/matplotlib/custom-3d-engine/</id>
            
            
            <published>2019-12-18T09:05:32+01:00</published>
            <updated>2019-12-18T09:05:32+01:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>3D rendering is really easy once you&rsquo;ve understood a few concepts. To demonstrate that, we&rsquo;ll design a simple custom 3D engine with 60 lines of Python and one Matplotlib call. That is, we&rsquo;ll render the bunny without using the 3D axis.</blockquote><p><img src="/matplotlib/custom-3d-engine/bunny.jpg" alt="A colourful outline of a bunny."></p>
<p>Matplotlib has a really nice <a href="https://matplotlib.org/mpl_toolkits/mplot3d/tutorial.html">3D
interface</a> with many
capabilities (and some limitations) that is quite popular among users. Yet, 3D
is still considered some kind of black magic by some users (or maybe
by the majority of users). I would thus like to explain in this post that 3D
rendering is really easy once you&rsquo;ve understood a few concepts. To demonstrate
that, we&rsquo;ll render the bunny above with 60 lines of Python and one Matplotlib
call. That is, without using the 3D axis.</p>
<p><strong>Advertisement</strong>: This post comes from an upcoming open access book on
scientific visualization using Python and Matplotlib. If you want to
support my work and get early access to the book, go to
<a href="https://github.com/rougier/scientific-visualization-book">https://github.com/rougier/scientific-visualization-book</a>.</p>
<h1 id="loading-the-bunny">Loading the bunny<a class="headerlink" href="#loading-the-bunny" title="Link to this heading">#</a></h1>
<p>First things first, we need to load our model. We&rsquo;ll use a <a href="/matplotlib/custom-3d-engine/bunny.obj">simplified
version</a> of the <a href="https://en.wikipedia.org/wiki/Stanford_bunny">Stanford
bunny</a>. The file uses the
<a href="https://en.wikipedia.org/wiki/Wavefront_.obj_file">Wavefront format</a>, which is
one of the simplest formats, so let&rsquo;s make a very simple (but error-prone)
loader that will just do the job for this post (and this model):</p>


<div class="highlight">
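  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">V</span><span class="p">,</span> <span class="n">F</span> <span class="o">=</span> <span class="p">[],</span> <span class="p">[]</span>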
</span></span><span class="line"><span class="cl"><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s2">&#34;bunny.obj&#34;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">   <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">f</span><span class="o">.</span><span class="n">readlines</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">       <span class="k">if</span> <span class="n">line</span><span class="o">.</span><span class="n">startswith</span><span class="p">(</span><span class="s1">&#39;#&#39;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">           <span class="k">continue</span>
</span></span><span class="line"><span class="cl">       <span class="n">values</span> <span class="o">=</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">       <span class="k">if</span> <span class="ow">not</span> <span class="n">values</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">           <span class="k">continue</span>
</span></span><span class="line"><span class="cl">       <span class="k">if</span> <span class="n">values</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="s1">&#39;v&#39;</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">           <span class="n">V</span><span class="o">.</span><span class="n">append</span><span class="p">([</span><span class="nb">float</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">values</span><span class="p">[</span><span class="mi">1</span><span class="p">:</span><span class="mi">4</span><span class="p">]])</span>
</span></span><span class="line"><span class="cl">       <span class="k">elif</span> <span class="n">values</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="s1">&#39;f&#39;</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">           <span class="n">F</span><span class="o">.</span><span class="n">append</span><span class="p">([</span><span class="nb">int</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">values</span><span class="p">[</span><span class="mi">1</span><span class="p">:</span><span class="mi">4</span><span class="p">]])</span>
</span></span><span class="line"><span class="cl"><span class="n">V</span><span class="p">,</span> <span class="n">F</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">V</span><span class="p">),</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">F</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span></span></span></code></pre>
</div>
<p><code>V</code> is now a set of vertices (3D points if you prefer) and <code>F</code> is a set of
faces (= triangles). Each triangle is described by 3 indices into the
vertices array. Now, let&rsquo;s normalize the vertices such that the overall bunny
fits the unit box:</p>

<div class="highlight">
  <pre>V = (V-(V.max(0)&#43;V.min(0))/2)/max(V.max(0)-V.min(0))</pre>
</div>
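<p>To see what this one-liner does, here is a minimal sketch on made-up data (the random points and the names <code>V_demo</code>, <code>center</code>, <code>extent</code> and <code>V_unit</code> are illustrative assumptions, not part of the bunny script). It centers the bounding box on the origin and divides by its largest side:</p>

```python
import numpy as np

# Hypothetical stand-in for the bunny vertices: 100 random 3D points
rng = np.random.default_rng(0)
V_demo = rng.uniform(-5, 20, size=(100, 3))

# Center of the axis-aligned bounding box
center = (V_demo.max(0) + V_demo.min(0)) / 2
# Largest side of the bounding box
extent = (V_demo.max(0) - V_demo.min(0)).max()

# Same operation as the one-liner, split into named steps
V_unit = (V_demo - center) / extent

# Every coordinate now lies between -0.5 and +0.5
print(V_unit.min(0), V_unit.max(0))
```

<p>Dividing by the single largest extent (rather than per axis) keeps the proportions of the model intact.</p>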

<p>Now, we can have a first look at the model by keeping only the x,y coordinates of the vertices and dropping the z coordinate. To do this we can use the powerful
<a href="https://matplotlib.org/3.1.1/api/collections_api.html#matplotlib.collections.PolyCollection">PolyCollection</a>
object, which efficiently renders a collection of non-regular
polygons. Since we want to render a bunch of triangles, this is a perfect
match. So let&rsquo;s first extract the triangles and get rid of the <code>z</code> coordinate:</p>

<div class="highlight">
  <pre>T = V[F][...,:2]</pre>
</div>
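<p>The indexing above is compact, so here is a small sketch of what happens to the array shapes. The toy mesh (<code>V_toy</code>, <code>F_toy</code>) is a made-up example, not the bunny data:</p>

```python
import numpy as np

# Hypothetical toy mesh: four vertices and two triangles sharing an edge
V_toy = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.5],
                  [1.0, 1.0, 1.0],
                  [0.0, 1.0, 0.5]])
F_toy = np.array([[0, 1, 2],
                  [0, 2, 3]])

# Fancy indexing replaces each vertex index by its coordinates:
# the result has shape (faces, 3 corners, xyz)
triangles = V_toy[F_toy]

# Keep only x and y for every corner: shape (faces, 3 corners, xy)
T_toy = triangles[..., :2]

print(triangles.shape, T_toy.shape)
```

<p>The resulting array has exactly the layout PolyCollection expects: one list of 2D corner points per polygon.</p>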

<p>And we can now render it:</p>

<div class="highlight">
  <pre>fig = plt.figure(figsize=(6,6))
ax = fig.add_axes([0,0,1,1], xlim=[-1,&#43;1], ylim=[-1,&#43;1],
                  aspect=1, frameon=False)
collection = PolyCollection(T, closed=True, linewidth=0.1,
                            facecolor=&#34;None&#34;, edgecolor=&#34;black&#34;)
ax.add_collection(collection)
plt.show()</pre>
</div>

<p>You should obtain something like this (<a href="/matplotlib/custom-3d-engine/bunny-1.py">bunny-1.py</a>):</p>
<p><img src="/matplotlib/custom-3d-engine/bunny-1.png" alt="A black and white outline of a bunny facing left side."></p>
<h1 id="perspective-projection">Perspective Projection<a class="headerlink" href="#perspective-projection" title="Link to this heading">#</a></h1>
<p>The rendering we&rsquo;ve just made is actually an <a href="https://en.wikipedia.org/wiki/Orthographic_projection">orthographic
projection</a> while the
top bunny uses a <a href="https://en.wikipedia.org/wiki/3D_projection#Perspective_projection">perspective projection</a>:</p>
<p><img src="/matplotlib/custom-3d-engine/projections.png" alt="Difference in perspective projection and orthographic projection. The near clip plane appears smaller in the perspective projective than in the orthographic projection."></p>
<p>In both cases, the proper way of defining a projection is first to define a
viewing volume, that is, the volume in the 3D space we want to render on the
screen. To do that, we need to consider 6 clipping planes (left, right, top,
bottom, far, near) that enclose the viewing volume (frustum) relative to the
camera. If we define a camera position and a viewing direction, each plane can
be described by a single scalar. Once we have this viewing volume, we can
project onto the screen using either the orthographic or the perspective
projection.</p>
<p>Fortunately for us, these projections are quite well known and can be expressed
using 4x4 matrices:</p>

<div class="highlight">
  <pre>def frustum(left, right, bottom, top, znear, zfar):
    M = np.zeros((4, 4), dtype=np.float32)
    M[0, 0] = &#43;2.0 * znear / (right - left)
    M[1, 1] = &#43;2.0 * znear / (top - bottom)
    M[2, 2] = -(zfar &#43; znear) / (zfar - znear)
    M[0, 2] = (right &#43; left) / (right - left)
    M[1, 2] = (top &#43; bottom) / (top - bottom)
    M[2, 3] = -2.0 * znear * zfar / (zfar - znear)
    M[3, 2] = -1.0
    return M

def perspective(fovy, aspect, znear, zfar):
    h = np.tan(0.5*np.radians(fovy)) * znear
    w = h * aspect
    return frustum(-w, w, -h, h, znear, zfar)</pre>
</div>

<p>For the perspective projection, we also need to specify the aperture angle that
(more or less) sets the size of the near plane relative to the far
plane. Consequently, for large apertures, you&rsquo;ll get a lot of &ldquo;deformations&rdquo;.</p>
<p>However, if you look at the two functions above, you&rsquo;ll realize they return 4x4
matrices while our coordinates are 3D. How do we use these matrices, then? The
answer is <a href="https://en.wikipedia.org/wiki/Homogeneous_coordinates">homogeneous
coordinates</a>. To make
a long story short, homogeneous coordinates are the best way to deal with transformations
and projections in 3D. In our case, because we&rsquo;re dealing with vertices (and
not vectors), we only need to add 1 as the fourth coordinate (<code>w</code>) to all our
vertices. Then we can apply the perspective transformation using a matrix
product.</p>

<div class="highlight">
  <pre>V = np.c_[V, np.ones(len(V))] @ perspective(25,1,1,100).T</pre>
</div>

<p>As a last step, we need to re-normalize the homogeneous coordinates. This means we
divide each transformed vertex by its last component (<code>w</code>) so that
<code>w</code>=1 for every vertex.</p>

<div class="highlight">
  <pre>V /= V[:,3].reshape(-1,1)</pre>
</div>
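<p>The division by <code>w</code> is what actually produces the perspective effect. As a minimal sketch (the two test points and the names <code>P</code>, <code>P_clip</code> and <code>P_ndc</code> are assumptions made for illustration), we can project two points that share the same x and y but sit at different depths:</p>

```python
import numpy as np

def frustum(left, right, bottom, top, znear, zfar):
    M = np.zeros((4, 4))
    M[0, 0] = 2.0 * znear / (right - left)
    M[1, 1] = 2.0 * znear / (top - bottom)
    M[2, 2] = -(zfar + znear) / (zfar - znear)
    M[0, 2] = (right + left) / (right - left)
    M[1, 2] = (top + bottom) / (top - bottom)
    M[2, 3] = -2.0 * znear * zfar / (zfar - znear)
    M[3, 2] = -1.0
    return M

def perspective(fovy, aspect, znear, zfar):
    h = np.tan(0.5 * np.radians(fovy)) * znear
    w = h * aspect
    return frustum(-w, w, -h, h, znear, zfar)

# Two hypothetical points with the same x and y, at depths z = -2 and z = -4
P = np.array([[0.2, 0.1, -2.0],
              [0.2, 0.1, -4.0]])

P_h = np.c_[P, np.ones(len(P))]               # append w = 1
P_clip = P_h @ perspective(25, 1, 1, 100).T   # projected, not yet divided
P_ndc = P_clip / P_clip[:, 3:4]               # divide every row by its w

print(P_clip[:, 3])   # w equals -z before the division
print(P_ndc[:, :2])   # the farther point lands closer to the center
```

<p>Because the projected <code>w</code> is just the distance in front of the camera, a point twice as far away ends up half as far from the center of the image.</p>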

<p>Now we can display the result again (<a href="/matplotlib/custom-3d-engine/bunny-2.py">bunny-2.py</a>):</p>
<p><img src="/matplotlib/custom-3d-engine/bunny-2.png" alt=""></p>
<p>Oh, weird result. What&rsquo;s wrong? What is wrong is that the camera is actually
inside the bunny. To have a proper rendering, we need to move the bunny away
from the camera or move the camera away from the bunny. Let&rsquo;s do the latter. The
camera is currently positioned at (0,0,0) and looking down the negative z axis
(because of the frustum transformation). Moving the camera back by 3.5 units is
equivalent to shifting every vertex by -3.5 along z, and this must be done
<strong>before the perspective transformation</strong>:</p>

<div class="highlight">
  <pre>V = V - (0,0,3.5)
V = np.c_[V, np.ones(len(V))] @ perspective(25,1,1,100).T
V /= V[:,3].reshape(-1,1)</pre>
</div>

<p>And now you should obtain (<a href="/matplotlib/custom-3d-engine/bunny-3.py">bunny-3.py</a>):</p>
<p><img src="/matplotlib/custom-3d-engine/bunny-3.png" alt=""></p>
<h1 id="model-view-projection-mvp">Model, view, projection (MVP)<a class="headerlink" href="#model-view-projection-mvp" title="Link to this heading">#</a></h1>
<p>It might not be obvious, but the last rendering actually uses a perspective
projection. To make it more obvious, we&rsquo;ll rotate the bunny around. To do
that, we need some rotation matrices (4x4) and we can define the
translation matrix at the same time:</p>

<div class="highlight">
  <pre>def translate(x, y, z):
    return np.array([[1, 0, 0, x],
                     [0, 1, 0, y],
                     [0, 0, 1, z],
                     [0, 0, 0, 1]], dtype=float)

def xrotate(theta):
    t = np.pi * theta / 180
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0,  0, 0],
                     [0, c, -s, 0],
                     [0, s,  c, 0],
                     [0, 0,  0, 1]], dtype=float)

def yrotate(theta):
    t = np.pi * theta / 180
    c, s = np.cos(t), np.sin(t)
    return  np.array([[ c, 0, s, 0],
                      [ 0, 1, 0, 0],
                      [-s, 0, c, 0],
                      [ 0, 0, 0, 1]], dtype=float)</pre>
</div>

<p>We&rsquo;ll now decompose the transformations we want to apply in terms of model
(local transformations), view (global transformations) and projection such that
we can compute a global MVP matrix that will do everything at once:</p>

<div class="highlight">
  <pre>model = xrotate(20) @ yrotate(45)
view  = translate(0,0,-3.5)
proj  = perspective(25, 1, 1, 100)
MVP   = proj  @ view  @ model</pre>
</div>

<p>and we now write:</p>

<div class="highlight">
  <pre>V = np.c_[V, np.ones(len(V))] @ MVP.T
V /= V[:,3].reshape(-1,1)</pre>
</div>
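<p>If the order of the matrices and the transposes looks arbitrary, the following sketch may help. It uses made-up points (<code>pts3</code>, <code>pts</code>) and leaves the projection out for brevity. It checks that the translation matrix reproduces the earlier subtraction, and that one combined matrix gives the same result as applying each stage in turn:</p>

```python
import numpy as np

def translate(x, y, z):
    return np.array([[1, 0, 0, x],
                     [0, 1, 0, y],
                     [0, 0, 1, z],
                     [0, 0, 0, 1]], dtype=float)

def yrotate(theta):
    t = np.pi * theta / 180
    c, s = np.cos(t), np.sin(t)
    return np.array([[ c, 0, s, 0],
                     [ 0, 1, 0, 0],
                     [-s, 0, c, 0],
                     [ 0, 0, 0, 1]], dtype=float)

# Five hypothetical vertices, then their homogeneous form (w = 1)
rng = np.random.default_rng(1)
pts3 = rng.uniform(-0.5, 0.5, size=(5, 3))
pts = np.c_[pts3, np.ones(len(pts3))]

model = yrotate(45)
view = translate(0, 0, -3.5)

# The translation matrix reproduces the earlier "V - (0,0,3.5)" shift
shifted = pts @ view.T
print(np.allclose(shifted[:, :3], pts3 - (0, 0, 3.5)))

# With row vectors, "view @ model" applies the model first, then the view
combined = pts @ (view @ model).T
sequential = (pts @ model.T) @ view.T
print(np.allclose(combined, sequential))
```

<p>The rightmost matrix in the product is the first one applied to the vertices, which is why the model matrix comes last in <code>proj @ view @ model</code>.</p>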

<p>You should obtain (<a href="/matplotlib/custom-3d-engine/bunny-4.py">bunny-4.py</a>):</p>
<p><img src="/matplotlib/custom-3d-engine/bunny-4.png" alt=""></p>
<p>Let&rsquo;s now play a bit with the aperture such that you can see the difference.
Note that we also have to adapt the distance to the camera in order for the bunnies to have the same apparent size (<a href="/matplotlib/custom-3d-engine/bunny-5.py">bunny-5.py</a>):</p>
<p><img src="/matplotlib/custom-3d-engine/bunny-5.png" alt=""></p>
<h1 id="depth-sorting">Depth sorting<a class="headerlink" href="#depth-sorting" title="Link to this heading">#</a></h1>
<p>Let&rsquo;s try now to fill the triangles (<a href="/matplotlib/custom-3d-engine/bunny-6.py">bunny-6.py</a>):</p>
<p><img src="/matplotlib/custom-3d-engine/bunny-6.png" alt=""></p>
<p>As you can see, the result is &ldquo;interesting&rdquo; and totally wrong. The problem is
that the PolyCollection draws the triangles in the order they are given
while we would like to have them drawn from back to front. This means we need to sort
them according to their depth. The good news is that we already computed this
information when we applied the MVP transformation. It is stored in the new z
coordinates. However, these z values are per vertex while we need to sort
the triangles. We&rsquo;ll thus take the mean z value of the three vertices as being representative of the
depth of a triangle. If triangles are relatively small and do not intersect,
this works beautifully:</p>

<div class="highlight">
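  <pre>V = V[F]                    # one (3 corners, xyzw) block per triangle
T =  V[:,:,:2]              # x,y of each corner
Z = -V[:,:,2].mean(axis=1)  # negated mean depth per triangle
I = np.argsort(Z)           # farthest triangles first
T = T[I,:]</pre>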
</div>
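<p>Here is a minimal sketch of that back-to-front ordering on three made-up triangles (<code>V_faces</code> and the other names are illustrative assumptions). After the perspective division, a larger z means farther from the camera, so negating the mean depth and sorting in ascending order puts the farthest triangle first:</p>

```python
import numpy as np

# Hypothetical projected triangles, shape (faces, 3 corners, xyzw).
# All three cover the same area and differ only in depth.
V_faces = np.array([
    [[0.0, 0.0, -0.2, 1.0], [1.0, 0.0, -0.2, 1.0], [0.0, 1.0, -0.2, 1.0]],  # near
    [[0.0, 0.0,  0.8, 1.0], [1.0, 0.0,  0.8, 1.0], [0.0, 1.0,  0.8, 1.0]],  # far
    [[0.0, 0.0,  0.3, 1.0], [1.0, 0.0,  0.3, 1.0], [0.0, 1.0,  0.3, 1.0]],  # middle
])

T_demo = V_faces[:, :, :2]                  # x,y of each corner
Z_demo = -V_faces[:, :, 2].mean(axis=1)     # negated mean depth per triangle
order = np.argsort(Z_demo)                  # ascending: farthest triangle first
T_sorted = T_demo[order]

print(order)
```

<p>The near triangle ends up last in <code>T_sorted</code>, so it is drawn on top of the others.</p>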

<p>And now everything is rendered right (<a href="/matplotlib/custom-3d-engine/bunny-7.py">bunny-7.py</a>):</p>
<p><img src="/matplotlib/custom-3d-engine/bunny-7.png" alt=""></p>
<p>Let&rsquo;s add some colors using the depth buffer. We&rsquo;ll color each triangle
according to its depth. The beauty of the PolyCollection object is that you can
specify the color of each triangle using a NumPy array, so let&rsquo;s just do
that:</p>

<div class="highlight">
  <pre>zmin, zmax = Z.min(), Z.max()
Z = (Z-zmin)/(zmax-zmin)
C = plt.get_cmap(&#34;magma&#34;)(Z)
I = np.argsort(Z)
T, C = T[I,:], C[I,:]</pre>
</div>
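<p>As a small sketch of what the colormap call returns (the three depth values are made up), a colormap applied to an array of numbers in the 0 to 1 range gives one RGBA row per value:</p>

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical per-triangle depths
depth = np.array([-0.8, -0.3, 0.2])

# Rescale to the 0..1 range expected by the colormap
lo, hi = depth.min(), depth.max()
depth01 = (depth - lo) / (hi - lo)

# One RGBA row per triangle
colors = plt.get_cmap("magma")(depth01)

print(colors.shape)
```

<p>Rescaling does not change the ordering of the values, so sorting on the normalized depth gives the same drawing order as before.</p>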

<p>And here is the colored result (<a href="/matplotlib/custom-3d-engine/bunny-8.py">bunny-8.py</a>):</p>
<p><img src="/matplotlib/custom-3d-engine/bunny-8.png" alt=""></p>
<p>The final script is 57 lines (but hardly readable):</p>

<div class="highlight">
  <pre>import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PolyCollection

def frustum(left, right, bottom, top, znear, zfar):
    M = np.zeros((4, 4), dtype=np.float32)
    M[0, 0] = &#43;2.0 * znear / (right - left)
    M[1, 1] = &#43;2.0 * znear / (top - bottom)
    M[2, 2] = -(zfar &#43; znear) / (zfar - znear)
    M[0, 2] = (right &#43; left) / (right - left)
    M[1, 2] = (top &#43; bottom) / (top - bottom)
    M[2, 3] = -2.0 * znear * zfar / (zfar - znear)
    M[3, 2] = -1.0
    return M
def perspective(fovy, aspect, znear, zfar):
    h = np.tan(0.5*np.radians(fovy)) * znear
    w = h * aspect
    return frustum(-w, w, -h, h, znear, zfar)
def translate(x, y, z):
    return np.array([[1, 0, 0, x], [0, 1, 0, y],
                     [0, 0, 1, z], [0, 0, 0, 1]], dtype=float)
def xrotate(theta):
    t = np.pi * theta / 180
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0,  0, 0], [0, c, -s, 0],
                     [0, s,  c, 0], [0, 0,  0, 1]], dtype=float)
def yrotate(theta):
    t = np.pi * theta / 180
    c, s = np.cos(t), np.sin(t)
    return  np.array([[ c, 0, s, 0], [ 0, 1, 0, 0],
                      [-s, 0, c, 0], [ 0, 0, 0, 1]], dtype=float)
V, F = [], []
with open(&#34;bunny.obj&#34;) as f:
    for line in f.readlines():
        if line.startswith(&#39;#&#39;):  continue
        values = line.split()
        if not values:            continue
        if values[0] == &#39;v&#39;:      V.append([float(x) for x in values[1:4]])
        elif values[0] == &#39;f&#39; :   F.append([int(x) for x in values[1:4]])
V, F = np.array(V), np.array(F)-1
V = (V-(V.max(0)&#43;V.min(0))/2) / max(V.max(0)-V.min(0))
MVP = perspective(25,1,1,100) @ translate(0,0,-3.5) @ xrotate(20) @ yrotate(45)
V = np.c_[V, np.ones(len(V))]  @ MVP.T
V /= V[:,3].reshape(-1,1)
V = V[F]
T =  V[:,:,:2]
Z = -V[:,:,2].mean(axis=1)
zmin, zmax = Z.min(), Z.max()
Z = (Z-zmin)/(zmax-zmin)
C = plt.get_cmap(&#34;magma&#34;)(Z)
I = np.argsort(Z)
T, C = T[I,:], C[I,:]
fig = plt.figure(figsize=(6,6))
ax = fig.add_axes([0,0,1,1], xlim=[-1,&#43;1], ylim=[-1,&#43;1], aspect=1, frameon=False)
collection = PolyCollection(T, closed=True, linewidth=0.1, facecolor=C, edgecolor=&#34;black&#34;)
ax.add_collection(collection)
plt.show()</pre>
</div>

<p>Now it&rsquo;s your turn to play. Starting from this simple script, you can achieve
interesting results:</p>
<p><img src="/matplotlib/custom-3d-engine/checkered-sphere.png" alt="">
<img src="/matplotlib/custom-3d-engine/platonic-solids.png" alt="">
<img src="/matplotlib/custom-3d-engine/surf.png" alt="">
<img src="/matplotlib/custom-3d-engine/bar.png" alt="">
<img src="/matplotlib/custom-3d-engine/contour.png" alt=""></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="3d" label="3D" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Matplotlib in Data Driven SEO]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/matplotlib-in-data-driven-seo/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="related" type="text/html" title="Creating the Warming Stripes in Matplotlib" />
                <link href="https://blog.scientific-python.org/matplotlib/using-matplotlib-to-advocate-for-postdocs/?utm_source=atom_feed" rel="related" type="text/html" title="Using Matplotlib to Advocate for Postdocs" />
            
                <id>https://blog.scientific-python.org/matplotlib/matplotlib-in-data-driven-seo/</id>
            
            
            <published>2019-12-04T17:23:24+01:00</published>
            <updated>2019-12-04T17:23:24+01:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>At Whites Agency we analyze big unstructured data to increase clients&rsquo; online visibility. We share our story of how we used Matplotlib to present complicated data in a simple and reader-friendly way.</blockquote><p><img src="/matplotlib/matplotlib-in-data-driven-seo/fig4.jpg" alt="A collection of visualizations displaying search engine optimization statistics including 1) a pie chart showing last minute is most popular 2) a bar chart showing popularity on different sites 3) a collection of circles displaying the frequency of sites that have features such as sitelinks and prices 4) a line chart displaying the google search rank for a collection of single word queries from first to twentieth."></p>
<p>Search Engine Optimization (SEO) is a process that aims to increase the quantity and quality of website traffic by ensuring a website can be found in search engines for phrases that are relevant to what the site is offering. Google is the most popular search engine in the world, and presence in the top search results is invaluable for any online business since click rates drop exponentially with ranking position. Since the beginning, specialized entities have been decoding the signals that influence position on the search engine results page (SERP), focusing on, e.g., the number of outlinks, the presence of keywords, or content length. The practices they developed typically resulted in better visibility, but needed to be constantly challenged because search engines change their algorithms frequently, sometimes daily. With the rapid advances in Big Data and machine learning, finding significant ranking factors has become increasingly difficult. Thus, the whole SEO field required a shift in which recommendations are backed by large-scale studies based on real data rather than old-fashioned practices. <a href="https://whites.agency/">Whites Agency</a> focuses strongly on Data-Driven SEO. We run many Big Data analyses which give us insights into multiple optimization opportunities.</p>
<p>The majority of the cases we are dealing with right now focus on data harvesting and analysis. Data presentation plays an important part, and from the beginning we needed a tool that would allow us to experiment with different forms of visualizations. Because our organization is Python-driven, Matplotlib was a straightforward choice for us. It is a mature project that offers flexibility and control. Among other features, Matplotlib figures can be easily exported not only to raster graphic formats (PNG, JPG) but also to vector ones (SVG, PDF, EPS), creating high-quality images that can be embedded in HTML code or LaTeX, or used by graphic designers. In one of our projects, Matplotlib was a part of the Python processing pipeline that automatically generated PDF summaries from an HTML template for individual clients. Every data visualization project has the same core presented in the figure below, where data is loaded from the database, processed in pandas or PySpark and finally visualized with Matplotlib.</p>
<p><img src="/matplotlib/matplotlib-in-data-driven-seo/fig1.png" alt="Data Visualization Pipeline at Whites Agency"></p>
<p>In what follows, we would like to share two insights from our studies. All figures were prepared in Matplotlib and in each case we set up a global style (overwritten if necessary):</p>

<div class="highlight">
  <pre>import matplotlib.pyplot as plt
from cycler import cycler

colors = [&#39;#00b2b8&#39;, &#39;#fa5e00&#39;, &#39;#404040&#39;, &#39;#78A3B3&#39;, &#39;#008F8F&#39;, &#39;#ADC9D6&#39;]

plt.rc(&#39;axes&#39;, grid=True, labelcolor=&#39;k&#39;, linewidth=0.8, edgecolor=&#39;#696969&#39;,
    labelweight=&#39;medium&#39;, labelsize=18)
plt.rc(&#39;axes.spines&#39;, left=False, right=False, top=False, bottom=True)
plt.rc(&#39;axes.formatter&#39;, use_mathtext=True)

plt.rcParams[&#39;axes.prop_cycle&#39;] = cycler(&#39;color&#39;, colors)

plt.rc(&#39;grid&#39;, alpha=1.0, color=&#39;#B2B2B2&#39;, linestyle=&#39;dotted&#39;, linewidth=1.0)
plt.rc(&#39;xtick.major&#39;, top=False, width=0.8, size=8.0)
plt.rc(&#39;ytick&#39;, left=False, color=&#39;k&#39;)
plt.rcParams[&#39;xtick.color&#39;] = &#39;k&#39;
plt.rc(&#39;font&#39;,family=&#39;Montserrat&#39;)
plt.rcParams[&#39;font.weight&#39;] = &#39;medium&#39;
plt.rcParams[&#39;xtick.labelsize&#39;] = 13
plt.rcParams[&#39;ytick.labelsize&#39;] = 13
plt.rcParams[&#39;lines.linewidth&#39;] = 2.0</pre>
</div>
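<p>If you would rather not change the defaults for the whole session, one possible alternative (a sketch, not how our pipeline is set up) is to collect the settings in a dict and apply them temporarily with <code>plt.rc_context</code>. The dict name <code>house_style</code> and the reduced set of keys are assumptions made for this example:</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from cycler import cycler

# Hypothetical subset of the style above, collected in a plain dict
house_style = {
    "axes.grid": True,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.prop_cycle": cycler("color", ["#00b2b8", "#fa5e00", "#404040"]),
    "lines.linewidth": 2.0,
}

default_width = plt.rcParams["lines.linewidth"]

# The settings only apply inside the with-block
with plt.rc_context(house_style):
    fig, ax = plt.subplots()
    line, = ax.plot([0, 1, 2], [1, 3, 2])
    styled_width = line.get_linewidth()
    styled_color = line.get_color()
plt.close(fig)

print(styled_width, styled_color)
```

<p>Once the block exits, <code>plt.rcParams</code> is back to its previous values, which is handy when one script produces figures in several styles.</p>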

<h1 id="case-1-website-speed-performance">Case 1: Website Speed Performance<a class="headerlink" href="#case-1-website-speed-performance" title="Link to this heading">#</a></h1>
<p>Our R&amp;D department analyzed a set of 10,000 potential customer intent phrases from the <em>Electronics</em> (eCommerce) and <em>News</em> domains (5000 phrases each). They scraped data from the Google ranking in a specific location (London, United Kingdom) for both mobile and desktop results [full study available <a href="https://whites.agency/blog/google-lighthouse-study-seo-ranking-factors-in-ecommerce-vs-news/">here</a>]. Based on those data, we identified the TOP 20 results that appeared in SERPs. Next, each page was audited with the <a href="https://developers.google.com/web/tools/lighthouse">Google Lighthouse tool</a>. Google Lighthouse is an open-source, automated tool for improving the quality of web pages that, among other things, collects information about website loading time. A single sample from our analysis, which shows variations of <em>Time to First Byte</em> (TTFB) as a function of Google position (grouped in threes), is presented below. TTFB measures the time it takes for a user&rsquo;s browser to receive the first byte of page content. Regardless of the device, the TTFB score is lowest for websites that occurred in the TOP 3 positions. The difference is significant, especially between the TOP 3 and 4-6 results. This suggests that Google favors websites that respond fast, so it is advisable to invest in website speed optimization.</p>
<p><img src="/matplotlib/matplotlib-in-data-driven-seo/fig2.png" alt="Time to first byte from Lighthouse study performed at Whites Agency."></p>
<p>The figure above uses the <code>fill_between</code> function from Matplotlib to draw a colored band that represents the 40-60th percentile range. A simple line plot with circle markers denotes the median (50th percentile). X-axis labels were assigned manually. The whole style is wrapped into a custom function that allows us to reproduce the whole figure in a single line of code. A sample is presented below:</p>

<div class="highlight">
  <pre>import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

# --------------------------------------------
# Set double column layout
# --------------------------------------------
fig, axx = plt.subplots(figsize=(20,6), ncols=2)

# --------------------------------------------
# Plot 50th percentile
# --------------------------------------------
line_kws = {
   &#39;lw&#39;: 4.0,
   &#39;marker&#39;: &#39;o&#39;,
   &#39;ms&#39;: 9,
   &#39;markerfacecolor&#39;: &#39;w&#39;,
   &#39;markeredgewidth&#39;: 2,
   &#39;c&#39;: &#39;#00b2b8&#39;
}

# just demonstration
axx[0].plot(x, y, label=&#39;Electronics&#39;, **line_kws)

# --------------------------------------------
# Plot 40-60th percentile
# --------------------------------------------
# make color lighter
cmap = LinearSegmentedColormap.from_list(&#39;whites&#39;, [&#39;#FFFFFF&#39;, &#39;#00b2b8&#39;])

# just demonstration
axx[0].fill_between(
   x, yl, yu,
   color=cmap(0.5),
   label=&#39;_nolegend_&#39;
)

# ---------------------------------------------
# Add x-axis labels
# ---------------------------------------------
# assigned manually, one label per group of positions
xtick_labels = [&#39;1-3&#39;,&#39;4-6&#39;,&#39;7-9&#39;,&#39;10-12&#39;,&#39;13-15&#39;,&#39;16-18&#39;,&#39;19-20&#39;]
for ax in axx:
   ax.set_xticklabels(xtick_labels)

# ----------------------------------------------
# Export figure
# ----------------------------------------------
fig.savefig(&#34;lighthouse.png&#34;, bbox_inches=&#39;tight&#39;, dpi=250)</pre>
</div>
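<p>The sample above leaves out the data (<code>x</code>, <code>y</code>, <code>yl</code>, <code>yu</code>). As a self-contained sketch, the version below generates synthetic numbers (they are random, not results from our study) and computes the median and the 40-60th percentile band with <code>np.percentile</code>:</p>

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Synthetic TTFB samples in seconds, one column per position group (not real data)
rng = np.random.default_rng(42)
xtick_labels = ['1-3', '4-6', '7-9', '10-12', '13-15', '16-18', '19-20']
samples = rng.gamma(shape=2.0, scale=0.3, size=(500, len(xtick_labels)))

# One x position per group; percentiles are taken over the samples of each group
x = np.arange(len(xtick_labels))
yl, y, yu = np.percentile(samples, [40, 50, 60], axis=0)

fig, ax = plt.subplots(figsize=(10, 6))
ax.fill_between(x, yl, yu, color='#00b2b8', alpha=0.5)           # 40-60th percentile band
ax.plot(x, y, marker='o', color='#00b2b8', label='Synthetic')    # median
ax.set_xticks(x)                   # fix the tick positions before labelling them
ax.set_xticklabels(xtick_labels)
ax.legend()
fig.canvas.draw()
```

<p>Calling <code>set_xticks</code> before <code>set_xticklabels</code> ties each label to a fixed position, so the labels stay aligned with the data if the axis limits change.</p>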

<h1 id="case-2-google-ads-ranking">Case 2: Google Ads ranking<a class="headerlink" href="#case-2-google-ads-ranking" title="Link to this heading">#</a></h1>
<p>Another example lets us draw insights from Google&rsquo;s paid campaigns (Ads). Our R&amp;D department scraped the first page in Google for more than 7600 queries and analyzed the ads that were present [study available only in <a href="https://agencjawhites.pl/aktualnosci/ponad-1000-graczy-walczy-o-polskiego-turyste-w-wyszukiwarce-google/">Polish</a>]. The queries were narrowed down to the <em>Travel</em> category. At the time of writing, each SERP can have up to 4 ads at the top and up to 3 ads at the bottom. Each ad is associated with a domain and has a headline, description, and optional extensions. Below we present the TOP 25 domains with the highest visibility on desktop computers. The Y-axis shows the name of a domain and the X-axis indicates how many ads are linked to that domain in total. We repeated the study 3 times and aggregated the counts, which is why the scale is much larger than 7600. In this project, the type of plot below allows us to summarize different brands&rsquo; ad campaign strategies and their advertising market shares. For example, <em>itaka</em> and <em>wakacje</em> have the strongest presence on both mobile and desktop, and most of their ads appear at the top. <em>neckermann</em> also ranks very high, but most of its ads appear at the bottom of search results.</p>
<p><img src="/matplotlib/matplotlib-in-data-driven-seo/fig3.png" alt="TOP 25 domains with the highest visibility on desktop computers."></p>
<p>The figure above is a standard horizontal bar plot that can be reproduced with the <code>barh</code> function in Matplotlib. Each y-tick has 4 different pieces (see legend). We also added automatically generated count numbers at the end of each bar for better readability. The code snippet is shown below:</p>

<div class="highlight">
  <pre>import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.colors import LinearSegmentedColormap, PowerNorm

# -----------------------------
# Set default colors
# -----------------------------
blues = LinearSegmentedColormap.from_list(name=&#39;WhitesBlues&#39;, colors=[&#39;#FFFFFF&#39;, &#39;#00B3B8&#39;], gamma=1.0)
oranges = LinearSegmentedColormap.from_list(name=&#39;WhitesOranges&#39;, colors=[&#39;#FFFFFF&#39;, &#39;#FB5E01&#39;], gamma=1.0)

# colors
desktop_top = blues(1.0)
desktop_bottom = oranges(1.0)
mobile_top = blues(0.5)
mobile_bottom = oranges(0.5)

# -----------------------------
# Prepare Figure
# -----------------------------
fig, ax = plt.subplots(figsize=(10,15))
ax.grid(False)

# -----------------------------
# Plot bars
# -----------------------------
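# just demonstration: yticks and cnt must be initialized before the loop
yticks = []  # y positions of the domain labels
cnt = 0      # y position of the current desktop bar (starting value is an assumption)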

for name in yticklabels:
    # tmp_desktop - DataFrame with desktop data
    # tmp_mobile - DataFrame with mobile data

    ax.barh(cnt, tmp_desktop[&#39;top&#39;], color=desktop_top, height=0.9)
    ax.barh(cnt, tmp_desktop[&#39;bottom&#39;], left=tmp_desktop[&#39;top&#39;], color=desktop_bottom, height=0.9)
    # text counter
    ax.text(tmp_desktop[&#39;all&#39;]&#43;100, cnt, &#34;%d&#34; % tmp_desktop[&#39;all&#39;], horizontalalignment=&#39;left&#39;,
            verticalalignment=&#39;center&#39;, fontsize=10)

    ax.barh(cnt-1, tmp_mobile[&#39;top&#39;], color=mobile_top, height=0.9)
    ax.barh(cnt-1, tmp_mobile[&#39;bottom&#39;], left=tmp_mobile[&#39;top&#39;], color=mobile_bottom, height=0.9)
    ax.text(tmp_mobile[&#39;all&#39;]&#43;100, cnt-1, &#34;%d&#34; % tmp_mobile[&#39;all&#39;], horizontalalignment=&#39;left&#39;,
            verticalalignment=&#39;center&#39;, fontsize=10)


    yticks.append(cnt)

    cnt = cnt - 2.5

# -----------------------------
# set labels
# -----------------------------
ax.set_yticks(yticks)
ax.set_yticklabels(yticklabels)

# -----------------------------
# Add legend manually
# -----------------------------
legend_elements = [
    mpatches.Patch(color=desktop_top, label=&#39;desktop top&#39;),
    mpatches.Patch(color=desktop_bottom, label=&#39;desktop bottom&#39;),
    mpatches.Patch(color=mobile_top, label=&#39;mobile top&#39;),
    mpatches.Patch(color=mobile_bottom, label=&#39;mobile bottom&#39;)
]

ax.legend(handles=legend_elements, fontsize=15)</pre>
</div>
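<p>Since the snippet above depends on our DataFrames, here is a reduced, self-contained sketch of the same stacking idea. The domain names and counts are invented, and for brevity it uses one color per segment instead of separate desktop and mobile shades:</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Made-up ad counts per domain: (top, bottom) for desktop and mobile
counts = {
    "example-a": {"desktop": (900, 150), "mobile": (700, 100)},
    "example-b": {"desktop": (400, 600), "mobile": (300, 450)},
    "example-c": {"desktop": (250, 50), "mobile": (200, 30)},
}

fig, ax = plt.subplots(figsize=(8, 5))
yticks, cnt = [], 0.0
for name, data in counts.items():
    # Desktop bar at cnt, mobile bar one unit below it
    for offset, device in ((0, "desktop"), (-1, "mobile")):
        top, bottom = data[device]
        ax.barh(cnt + offset, top, color="#00B3B8", height=0.9)
        # 'left' starts the second segment where the first one ends
        ax.barh(cnt + offset, bottom, left=top, color="#FB5E01", height=0.9)
        # Total count printed just past the end of the stacked bar
        ax.text(top + bottom + 10, cnt + offset, "%d" % (top + bottom),
                horizontalalignment="left", verticalalignment="center")
    yticks.append(cnt)
    cnt -= 2.5  # leave a gap before the next domain

ax.set_yticks(yticks)
ax.set_yticklabels(list(counts))
fig.canvas.draw()
```

<p>The <code>left</code> argument is what turns two separate <code>barh</code> calls into one stacked bar.</p>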

<h1 id="summary">Summary<a class="headerlink" href="#summary" title="Link to this heading">#</a></h1>
<p>This is just a sample from our studies and more can be found on our website. The Matplotlib library meets our needs in terms of visual capabilities and flexibility. It allows us to create standard plots in a single line of code, as well as experiment with different forms of graphs thanks to its lower-level features. With Matplotlib we can present complicated data in a simple and reader-friendly way.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="industry" label="industry" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Creating the Warming Stripes in Matplotlib]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/warming-stripes/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/matplotlib/using-matplotlib-to-advocate-for-postdocs/?utm_source=atom_feed" rel="related" type="text/html" title="Using Matplotlib to Advocate for Postdocs" />
            
                <id>https://blog.scientific-python.org/matplotlib/warming-stripes/</id>
            
            
            <published>2019-11-11T09:21:28+01:00</published>
            <updated>2019-11-11T09:21:28+01:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Ed Hawkins made this impressively simple plot to show how global temperatures have risen since 1880. Here is how to recreate it using Matplotlib.</blockquote><p><img src="/matplotlib/warming-stripes/warming-stripes.png" alt="A horizontal bar divided into stripes with colors ranging from white to shades of blue and shades of red. There is a clear tendency of shades of blue on the left side of the bar, and shades of red on the right side of the bar."></p>
<p>Earth&rsquo;s temperatures are rising and nothing shows this in a simpler,
more approachable graphic than the “Warming Stripes”.
Introduced by Prof. Ed Hawkins, they show the temperatures, either for
the global average or for your region, as colored bars from blue to red for the last 170 years, available at <a href="https://showyourstripes.info">#ShowYourStripes</a>.</p>
<p>The stripes have since become the logo of the <a href="https://scientistsforfuture.org">Scientists for Future</a>.
Here is how you can recreate this yourself using Matplotlib.</p>
<p>We are going to use the <a href="https://www.metoffice.gov.uk/hadobs/hadcrut4/index.html">HadCRUT4</a> dataset, published by the Met Office.
It uses combined sea and land surface temperatures.
The dataset used for the warming stripes is the annual global average.</p>
<p>First, let&rsquo;s import everything we are going to use.
The plot will consist of a bar for each year, colored using a custom
color map.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.patches</span> <span class="kn">import</span> <span class="n">Rectangle</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.collections</span> <span class="kn">import</span> <span class="n">PatchCollection</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">matplotlib.colors</span> <span class="kn">import</span> <span class="n">ListedColormap</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span></span></span></code></pre>
</div>
<p>Then we define our time limits, our reference period for
the neutral color and the range around it for maximum saturation.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">FIRST</span> <span class="o">=</span> <span class="mi">1850</span>
</span></span><span class="line"><span class="cl"><span class="n">LAST</span> <span class="o">=</span> <span class="mi">2018</span>  <span class="c1"># inclusive</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Reference period for the center of the color scale</span>
</span></span><span class="line"><span class="cl"><span class="n">FIRST_REFERENCE</span> <span class="o">=</span> <span class="mi">1971</span>
</span></span><span class="line"><span class="cl"><span class="n">LAST_REFERENCE</span> <span class="o">=</span> <span class="mi">2000</span>
</span></span><span class="line"><span class="cl"><span class="n">LIM</span> <span class="o">=</span> <span class="mf">0.7</span>  <span class="c1"># degrees</span></span></span></code></pre>
</div>
<p>Here we use pandas to read the fixed-width text file, keeping only the
first two columns: the year and the deviation from the
1961 to 1990 mean.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># data from</span>
</span></span><span class="line"><span class="cl"><span class="c1"># https://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/time_series/HadCRUT.4.6.0.0.annual_ns_avg.txt</span>
</span></span><span class="line"><span class="cl"><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_fwf</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;HadCRUT.4.6.0.0.annual_ns_avg.txt&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">index_col</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">usecols</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">names</span><span class="o">=</span><span class="p">[</span><span class="s2">&#34;year&#34;</span><span class="p">,</span> <span class="s2">&#34;anomaly&#34;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">    <span class="n">header</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">anomaly</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">loc</span><span class="p">[</span><span class="n">FIRST</span><span class="p">:</span><span class="n">LAST</span><span class="p">,</span> <span class="s2">&#34;anomaly&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">dropna</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="n">reference</span> <span class="o">=</span> <span class="n">anomaly</span><span class="o">.</span><span class="n">loc</span><span class="p">[</span><span class="n">FIRST_REFERENCE</span><span class="p">:</span><span class="n">LAST_REFERENCE</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">()</span></span></span></code></pre>
</div>
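<p>A minimal sketch of the slicing used above, on a hypothetical toy series
(the values and the name <code>toy</code> are made up, not HadCRUT data). It shows that
label-based slicing with <code>.loc</code> includes both end points, which is why
<code>LAST</code> and <code>LAST_REFERENCE</code> are inclusive.</p>

```python
import pandas as pd

# Hypothetical anomalies indexed by year (made-up values, not HadCRUT data)
toy = pd.Series(
    [0.10, 0.20, 0.30, 0.40, 0.50],
    index=pd.Index([1998, 1999, 2000, 2001, 2002], name="year"),
)

# Label-based slicing with .loc includes both the first and the last label
window = toy.loc[1999:2001]

# Mean over the window, analogous to the reference value computed above
toy_reference = float(window.mean())

print(window.index.tolist())  # [1999, 2000, 2001]
```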
<p>This is our custom colormap. We could also use one of
the <a href="https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html">colormaps</a> that come with <code>matplotlib</code>, e.g. <code>coolwarm</code> or <code>RdBu</code>.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># the colors in this colormap come from http://colorbrewer2.org</span>
</span></span><span class="line"><span class="cl"><span class="c1"># the 8 more saturated colors from the 9 blues / 9 reds</span>
</span></span><span class="line"><span class="cl"><span class="n">cmap</span> <span class="o">=</span> <span class="n">ListedColormap</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#08306b&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#08519c&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#2171b5&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#4292c6&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#6baed6&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#9ecae1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#c6dbef&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#deebf7&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#fee0d2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#fcbba1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#fc9272&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#fb6a4a&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#ef3b2c&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#cb181d&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#a50f15&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;#67000d&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
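<p>If you prefer a built-in colormap, a sketch along these lines should work.
Note that <code>RdBu</code> runs from red to blue, so the reversed variant
<code>RdBu_r</code> is the one that puts blue at the cold end. The name
<code>builtin_cmap</code> is only used here for illustration.</p>

```python
import matplotlib.pyplot as plt

# "RdBu_r" is the reversed RdBu map: blue for low values, red for high values
builtin_cmap = plt.get_cmap("RdBu_r")

# Calling a colormap with a float between 0 and 1 returns an RGBA tuple
low = builtin_cmap(0.0)   # color at the cold end
high = builtin_cmap(1.0)  # color at the warm end

print(low)
print(high)
```

<p>Such a colormap could be passed to <code>set_cmap</code> in place of the custom one.</p>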
<p>We create a figure with a single axes object that fills the full area
of the figure and does not have any axis ticks or labels.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_axes</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_axis_off</span><span class="p">()</span></span></span></code></pre>
</div>
<p>Finally, we create a bar for each year, assign the
data, colormap and color limits, and add the collection to the axes.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># create a collection with a rectangle for each year</span>
</span></span><span class="line"><span class="cl"><span class="n">col</span> <span class="o">=</span> <span class="n">PatchCollection</span><span class="p">([</span><span class="n">Rectangle</span><span class="p">((</span><span class="n">y</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="k">for</span> <span class="n">y</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">FIRST</span><span class="p">,</span> <span class="n">LAST</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># set data, colormap and color limits</span>
</span></span><span class="line"><span class="cl"><span class="n">col</span><span class="o">.</span><span class="n">set_array</span><span class="p">(</span><span class="n">anomaly</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">col</span><span class="o">.</span><span class="n">set_cmap</span><span class="p">(</span><span class="n">cmap</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">col</span><span class="o">.</span><span class="n">set_clim</span><span class="p">(</span><span class="n">reference</span> <span class="o">-</span> <span class="n">LIM</span><span class="p">,</span> <span class="n">reference</span> <span class="o">+</span> <span class="n">LIM</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">add_collection</span><span class="p">(</span><span class="n">col</span><span class="p">)</span></span></span></code></pre>
</div>
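<p>To see what the color limits do, here is a small sketch that reproduces the
mapping by hand with a linear <code>Normalize</code>, which is what a collection uses by
default. The names <code>toy_reference</code>, <code>toy_lim</code> and <code>toy_cmap</code> are
hypothetical stand-ins, and the two-color map only keeps the example short.</p>

```python
from matplotlib.colors import ListedColormap, Normalize

# Hypothetical stand-ins for the reference value and the limit used above
toy_reference = 0.0
toy_lim = 0.7

# Two colors only: dark blue for the cold half, dark red for the warm half
toy_cmap = ListedColormap(["#08306b", "#67000d"])

# Color limits correspond to a linear normalization over [vmin, vmax]
norm = Normalize(vmin=toy_reference - toy_lim, vmax=toy_reference + toy_lim)

cold = toy_cmap(norm(-0.7))    # lower limit maps to the first color
warm = toy_cmap(norm(0.7))     # upper limit maps to the last color
clipped = toy_cmap(norm(2.0))  # beyond the limits, the end color is reused
```

<p>Years warmer than <code>reference + LIM</code> therefore all share the most saturated red.</p>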
<p>Make sure the axes limits are correct and save the figure.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_ylim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">ax</span><span class="o">.</span><span class="n">set_xlim</span><span class="p">(</span><span class="n">FIRST</span><span class="p">,</span> <span class="n">LAST</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">fig</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s2">&#34;warming-stripes.png&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p><img src="/matplotlib/warming-stripes/warming-stripes.png" alt="Warming Stripes"></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="tutorials" label="tutorials" />
                             
                                <category scheme="taxonomy:Tags" term="academia" label="academia" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Using Matplotlib to Advocate for Postdocs]]></title>
            <link href="https://blog.scientific-python.org/matplotlib/using-matplotlib-to-advocate-for-postdocs/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/matplotlib/using-matplotlib-to-advocate-for-postdocs/</id>
            
            
            <published>2019-10-23T12:43:23-04:00</published>
            <updated>2019-10-23T12:43:23-04:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Advocating is all about communicating facts clearly. I used Matplotlib to show the financial struggles of postdocs in the Boston area.</blockquote><p>Postdocs are the <a href="https://en.wikipedia.org/wiki/Postdoctoral_researcher">workers of academia</a>.
They are the main players behind the majority of scientific papers published in
journals and conferences. Yet their effort is often not recognized in terms of
salary and benefits.</p>
<p>A few years ago, the NIH established stipend levels for undergraduate,
predoctoral and postdoctoral trainees and fellows, the so-called NIH guidelines.
Many universities and research institutes currently adopt these guidelines when
deciding how much to pay postdocs.</p>
<p>One of the key problems with the NIH guidelines is that they are set at a
national level. This means that a postdoc in Buffalo is paid the same as a postdoc in Boston,
even though <a href="https://www.mentalfloss.com/article/85668/11-most-affordable-cities-us">Buffalo is one of the most affordable cities to live in in the USA</a>,
while <a href="https://www.investopedia.com/articles/personal-finance/080916/top-10-most-expensive-cities-us.asp">Boston is one of the most expensive</a>.
Every year, the NIH releases new guidelines in which the stipends are slightly
increased. <strong>Do these adjustments help a postdoc in the Boston area
take home a bit more money?</strong></p>
<p>I used <a href="https://matplotlib.org">Matplotlib</a> to plot the NIH stipend levels
(y axis) for each year of postdoctoral experience (x axis) for the past 4 years
of NIH guidelines (color). I also took the inflation rate for each of the years 2017&ndash;2019
and increased the salaries of the previous year by that percentage (dashed lines).
<img src="/matplotlib/using-matplotlib-to-advocate-for-postdocs/gross_salary.png" alt="Plot of the NIH stipend level for postdoc versus the number of years of postdoc experience for every year from 2016 to 2019 inclusive. The x-axis ranges from 0 to 7 years of experience. The y-axis ranges from $45,000 USD to $60,000 USD. The plot also shows how these salaries would increase according to the rate of inflation during the same years. The overall message is that postdoc salaries are adjusted for inflation nationally."></p>
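<p>The inflation adjustment behind the dashed lines is simple arithmetic. Here is a
minimal sketch with hypothetical round numbers (these are not actual NIH stipends
or inflation figures, and the variable names are only illustrative).</p>

```python
# Hypothetical figures for illustration only, not NIH stipends or CPI data
previous_stipend = 50_000.0  # stipend in the previous year, in USD
inflation_rate = 0.02        # annual inflation expressed as a fraction
new_stipend = 51_500.0       # stipend in the new guidelines, in USD

# Previous stipend scaled by inflation (the kind of value a dashed line shows)
adjusted_stipend = previous_stipend * (1 + inflation_rate)

# Positive when the new guideline outpaces inflation, negative otherwise
real_raise = new_stipend - adjusted_stipend

print(round(adjusted_stipend, 2), round(real_raise, 2))
```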
<p>The data revealed that in 2017 the salaries of the most senior postdocs were
increased by just the inflation rate, while junior postdocs (up to 1 year
of experience) received an increase of more than 2.5 times the inflation rate. In
2018, all salaries were simply adjusted for inflation. In 2019, the increase was
slightly higher than inflation. So, overall, every year the NIH makes
sure that postdoc salaries are, at least, adjusted for inflation. Great!</p>
<p>As mentioned earlier, some cities in the US are more expensive than
others, Boston for example. To partially account for such differences when
looking at postdoc salaries, I subtracted from each salary the average rent
for a one-bedroom apartment in Boston.
Of course, rent also increases every year but, unfortunately for postdocs, <strong>rent
increases far more than inflation</strong>. The results are below.
<img src="/matplotlib/using-matplotlib-to-advocate-for-postdocs/gross_salary_minus_rent.png" alt="This plot is similar to the previous plot, but has adjusted the y-axis by subtracting the cost of rent in Boston. Plot of the NIH stipend level for postdocs minus the average rent in Boston versus the number of years of postdoc experience for every year from 2016 to 2019 inclusive. The x-axis ranges from 0 to 7 years of experience. The y-axis ranges from $8,000 USD to $22,000 USD. The overall message is that increases in rent in Boston have outpaced increases in NIH stipends."></p>
<p>It turns out that the best year for postdocs with at least one year of experience
was actually 2016. In the subsequent years, rent has eaten larger and
larger portions of the postdoc salary, resulting in postdocs paid in 2019 taking home
<strong>20% less money</strong> than postdocs paid in 2016 with the same experience.</p>
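<p>A comparison of this kind can be sketched as follows. The helper names and
all numbers are hypothetical, chosen only to make the arithmetic easy to follow;
they are not the data behind the plots. The sketch also assumes that rent is
quoted per month and subtracted for twelve months.</p>

```python
def after_rent(annual_stipend, monthly_rent):
    """Return what is left of a yearly stipend after twelve months of rent."""
    return annual_stipend - 12 * monthly_rent


def percent_change(old, new):
    """Return the relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


# Hypothetical numbers: the stipend rises, but the rent rises faster
earlier = after_rent(annual_stipend=50_000, monthly_rent=2_500)
later = after_rent(annual_stipend=52_000, monthly_rent=3_000)

# Negative values mean less money is left after paying rent
change = percent_change(earlier, later)

print(earlier, later, change)  # 20000 16000 -20.0
```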
<p>In the end, life is becoming financially harder and harder for postdocs in the Boston area.
These data should be taken into account by research institutes and universities,
which have the freedom to top up postdocs&rsquo; salaries to reflect the real cost
of living in different cities.</p>
<p>You can download the Jupyter notebook <a href="Postdoc%20salary%20Analysis.ipynb">here</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="academia" label="academia" />
                             
                                <category scheme="taxonomy:Tags" term="matplotlib" label="matplotlib" />
                            
                        
                    
                
            
        </entry>
    
</feed>
