<?xml version="1.0" encoding="utf-8"?> 
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-us">
    <generator uri="https://gohugo.io/" version="0.152.2">Hugo</generator><title type="html"><![CDATA[Traveling-Salesman-Problem on Blog]]></title>
    
    
    
            <link href="https://blog.scientific-python.org/tags/traveling-salesman-problem/" rel="alternate" type="text/html" title="html" />
            <link href="https://blog.scientific-python.org/tags/traveling-salesman-problem/atom.xml" rel="self" type="application/atom+xml" title="atom" />
    <updated>2026-04-04T04:32:36+00:00</updated>
    
    
    
    
        <id>https://blog.scientific-python.org/tags/traveling-salesman-problem/</id>
    
        
        <entry>
            <title type="html"><![CDATA[My Summer of Code 2021]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/?utm_source=atom_feed" rel="related" type="text/html" title="Completing the Asadpour Algorithm" />
                <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="related" type="text/html" title="Looking at the Big Picture" />
                <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/my-summer-of-code-2021/</id>
            
            
            <published>2021-08-16T00:00:00+00:00</published>
            <updated>2021-08-16T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Review of my entire summer implementing the Asadpour ATSP Algorithm</blockquote><p>Welcome! This post is not going to be discussing technical implementation details or theoretical work for my Google Summer of Code project, but rather serves as a summary and recap of the work that I did this summer.</p>
<p>I am very happy with the work I was able to accomplish and believe that I successfully completed my project.</p>
<h2 id="overview">Overview<a class="headerlink" href="#overview" title="Link to this heading">#</a></h2>
<p>My project was titled NetworkX: Implementing the Asadpour Asymmetric Traveling Salesman Problem Algorithm.
The updated abstract given on the Summer of Code <a href="https://summerofcode.withgoogle.com/dashboard/project/5352909442646016/details/">project page</a> is below.</p>
<blockquote>
<p>This project seeks to implement the asymmetric traveling salesman problem algorithm developed by Asadpour et al, originally published in 2010 and revised in 2017.
The project is broken into multiple methods, each of which has a set timetable during the project.
We start by solving the Held-Karp relaxation using the Ascent method from the original paper by Held and Karp.
Assuming the result is fractional, we continue into the Asadpour algorithm (integral solutions are optimal by definition and immediately returned).
We approximate the distribution of spanning trees on the undirected support of the Held Karp solution using a maximum entropy rounding method to construct a distribution of trees.
Roughly speaking, the probability of sampling any given tree is proportional to the product of all its edge lambda values.
We sample 2 log <em>n</em> trees from the distribution using an iterative approach developed by V. G. Kulkarni and choose the tree with the smallest cost after restoring direction to the arcs.
Finally, the minimum tree is augmented using a minimum network flow algorithm and shortcut down to an <em>O(log n / log log n)</em> approximation of the minimum Hamiltonian cycle.</p>
</blockquote>
<p>My proposal PDF for the 2021 Summer of Code can be <a href="https://drive.google.com/file/d/1XGrjupLYWioz-Nf8Vp63AeuBVApdkwSa/view?usp=sharing">found here</a>.</p>
<p>All of my changes and additions to NetworkX are part of <a href="https://github.com/networkx/networkx/pull/4740">this pull request</a> and can also be found on <a href="https://github.com/mjschwenne/networkx/tree/bothTSP">this branch</a> in my fork of the GitHub repository, but I will be discussing the changes and commits in more detail later.
Also note that the commits listed in each section are an incomplete list, highlighting only the commits focused on that function or its tests.
For the complete list, please reference the pull request or the <code>bothTSP</code> GitHub branch on my fork of NetworkX.</p>
<p>My contributions to NetworkX this summer consist predominantly of the following functions and classes, each of which I will discuss in their own sections of this blog post.
Functions and classes which are front-facing are also linked to the <a href="https://networkx.org/documentation/networkx-2.7.1/index.html">developer documentation</a> for NetworkX in the list below and for their section headers.</p>
<ul>
<li><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.tree.mst.SpanningTreeIterator.html"><code>SpanningTreeIterator</code></a></li>
<li><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.tree.branchings.ArborescenceIterator.html"><code>ArborescenceIterator</code></a></li>
<li><code>held_karp_ascent</code></li>
<li><code>spanning_tree_distribution</code></li>
<li><code>sample_spanning_tree</code></li>
<li><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.approximation.traveling_salesman.asadpour_atsp.html"><code>asadpour_atsp</code></a></li>
</ul>
<p>These functions have also been unit tested, and those tests will be integrated into NetworkX once the pull request is merged.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>The following papers are where all of these algorithms originate from, and they were of course instrumental in the completion of this project.</p>
<p>[1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An O (log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, SODA ’10, Society for Industrial and Applied Mathematics, 2010, p. 379 - 389 <a href="https://dl.acm.org/doi/abs/10.5555/1873601.1873633">https://dl.acm.org/doi/abs/10.5555/1873601.1873633</a>.</p>
<p>[2] J. Edmonds, <em>Optimum Branchings</em>, Journal of Research of the National Bureau of Standards, 1967, Vol. 71B, p.233-240, <a href="https://archive.org/details/jresv71Bn4p233">https://archive.org/details/jresv71Bn4p233</a></p>
<p>[3] M. Held, R.M. Karp, <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
<p>[4] G.K. Janssens, K. Sörensen, <em>An algorithm to generate all spanning trees in order of increasing cost</em>, Pesquisa Operacional, 2005-08, Vol. 25 (2), p. 219-229, <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en">https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en</a></p>
<p>[5] V. G. Kulkarni, <em>Generating random combinatorial objects</em>, Journal of algorithms, 11 (1990), p. 185–207.</p>
<h2 id="spanningtreeiterator"><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.tree.mst.SpanningTreeIterator.html"><code>SpanningTreeIterator</code></a><a class="headerlink" href="#spanningtreeiterator" title="Link to this heading">#</a></h2>
<p>The <code>SpanningTreeIterator</code> was the first contribution I completed as part of my GSoC project.
This class takes a graph and returns every spanning tree of that graph in order of increasing cost, which makes it a direct implementation of [4].</p>
<p>The interesting thing about this iterator is that it is not used as part of the Asadpour algorithm, but served as an intermediate step so that I could develop the <code>ArborescenceIterator</code> which is required for the Held Karp relaxation.
It works by partitioning the edges of the graph as either included, excluded or open and then finding the minimum spanning tree which respects the partition data on the graph edges.
In order to get this to work, I created a new minimum spanning tree function called <code>kruskal_mst_edges_partition</code> which does exactly that.
To prevent redundancy, all Kruskal minimum spanning tree computations now use this function (the original <code>kruskal_mst_edges</code> function is now just a wrapper for the partitioned version).
Once a spanning tree is returned from the iterator, the partition data for that tree is split so that the union of the newly generated partitions is the set of all spanning trees in the partition except the returned minimum spanning tree.</p>
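<p>To illustrate the idea (this is a toy sketch, not the NetworkX code; all names here are hypothetical), a partition-respecting Kruskal&rsquo;s algorithm forces the included edges into the tree first, skips the excluded edges entirely, and fills in the rest greedily:</p>

```python
# A minimal sketch of the partition idea behind the SpanningTreeIterator,
# not the NetworkX implementation: edges are marked INCLUDED, EXCLUDED,
# or left OPEN, and Kruskal's algorithm respects those marks.
OPEN, INCLUDED, EXCLUDED = 0, 1, -1

def partition_kruskal(nodes, edges, partition):
    """Minimum spanning tree respecting an edge partition.

    edges is a list of (u, v, weight) tuples; partition maps (u, v) to
    INCLUDED, EXCLUDED or OPEN (missing edges default to OPEN).
    """
    parent = {v: v for v in nodes}

    def find(v):  # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Included edges are forced into the tree before any open edge is
    # tried; excluded edges are never considered at all.
    forced = [e for e in edges if partition.get(e[:2]) == INCLUDED]
    candidates = sorted(
        (e for e in edges if partition.get(e[:2], OPEN) == OPEN),
        key=lambda e: e[2],
    )
    tree = []
    for u, v, w in forced + candidates:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree
```

<p>On a triangle graph, for example, excluding the cheapest edge or including the most expensive one steers the search toward the next spanning trees in the cost ordering, which is how the iterator walks through all of them.</p>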
<p>As I mentioned earlier, the <code>SpanningTreeIterator</code> is not directly used in my GSoC project, but I still decided to implement it to understand the partition process and be able to directly use the examples from [4] before moving onto the <code>ArborescenceIterator</code>.
I&rsquo;m sure this class will be useful to other users of NetworkX, and it provided a strong foundation to build the <code>ArborescenceIterator</code> on.</p>
<p><strong>Blog Posts about <code>SpanningTreeIterator</code></strong></p>
<p>5 Jun 2021 - <a href="../finding-all-minimum-arborescences">Finding All Minimum Arborescences</a></p>
<p>10 Jun 2021 - <a href="../implementing-the-iterators">Implementing The Iterators</a></p>
<p><strong>Commits about <code>SpanningTreeIterator</code></strong></p>
<p>Now, at the beginning of this project, my commit messages were not very good&hellip;
I had some problems with merge conflicts after I accidentally committed to the wrong branch, and this was the first time I&rsquo;d used a pre-commit hook.</p>
<p>I have not changed the commit messages here, so that you may be amused by my thoroughly unhelpful messages, but I did annotate them to provide a more accurate description of each commit.</p>
<p><a href="https://github.com/mjschwenne/networkx/commit/495458842d3ec798c6ea52dc1c8089b9a5ce3de5">Testing</a> - <em>Rewrote Kruskal&rsquo;s algorithm to respect partitions and tested that while stubbing the iterators in a separate file</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/3d81e36c8313013a3ae4c4dfc6517c3bde8d826e">I&rsquo;m not entirely sure how the commit hook works&hellip;</a> - <em>Added test cases and finalized implementation of Spanning Tree Iterator in the incorrect file</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/d481f757125a699f69bf5c16790d2e727e3cc159">Moved iterators into the correct files to maintain proper codebase visibility</a> - <em>Realized that the iterators need to be in <code>mst.py</code> and <code>branchings.py</code> respectively to keep private functions hidden</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5503203433bc875df8c0de5d827bda7bed1589e2">Documentation update for the iterators</a> - <em>No explanation needed</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/337804ee38b2c1ac3964447a39d67184081deb01">Update mst.py to accept suggestion</a> - <em>Accepted doc string edit from code review</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/97b2da1b5499ecbfd15ef2abd385e50f94c6ba97">Cleaned code, merged functions if possible and opened partition functionality to all</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/aef90dfcbb8b8424c6ed887311b4825559d0a398">Implement suggestions from boothby</a></p>
<h2 id="arborescenceiterator"><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.tree.branchings.ArborescenceIterator.html"><code>ArborescenceIterator</code></a><a class="headerlink" href="#arborescenceiterator" title="Link to this heading">#</a></h2>
<p>The <code>ArborescenceIterator</code> is a modified version of the algorithm discussed in [4] so that it iterates over the spanning arborescences.</p>
<p>This iterator was a bit more difficult to implement, but that is due to how the minimum spanning arborescence algorithm is structured rather than the partition scheme not being applicable to directed graphs.
In fact, the partition scheme is identical to that of the undirected <code>SpanningTreeIterator</code>, but Edmonds&rsquo; algorithm is more complex and there are several edge cases about how nodes can be contracted and what it means to respect the partition data.
In order to fully understand the NetworkX implementation, I had to read the original Edmonds paper, [2].</p>
<p>The most notable change was that when the iterator writes the next partition onto the edges of the graph just before Edmonds&rsquo; algorithm is executed, if any incoming edge is marked as included, all of the others are marked as excluded.
This is an implicit part of the <code>SpanningTreeIterator</code>, but needed to be done explicitly here so that, if the vertex in question was merged during Edmonds&rsquo; algorithm, we cannot choose two incoming edges to the same vertex once the merging is reversed.</p>
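<p>A minimal sketch of that rule (hypothetical names, not the NetworkX implementation): whenever some incoming edge of a node is marked as included, every other incoming edge of that node is explicitly excluded, since an arborescence has at most one incoming edge per node.</p>

```python
# Edge partition marks, as in the iterator design described above.
OPEN, INCLUDED, EXCLUDED = 0, 1, -1

def close_other_in_edges(arcs, partition):
    """Make the implicit exclusions explicit before running Edmonds'
    algorithm.

    arcs: iterable of (u, v) directed edges; partition: dict arc -> mark.
    Returns a new partition where every other incoming edge of a node
    with an INCLUDED incoming edge is marked EXCLUDED.
    """
    new_partition = dict(partition)
    included_heads = {v for (u, v), mark in partition.items() if mark == INCLUDED}
    for u, v in arcs:
        if v in included_heads and partition.get((u, v), OPEN) == OPEN:
            new_partition[(u, v)] = EXCLUDED
    return new_partition
```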
<p>As a final note, the <code>ArborescenceIterator</code> has one more initial parameter than the <code>SpanningTreeIterator</code>: the ability to give it an initial partition and iterate over all spanning arborescences with cost greater than that of the initial partition.
This was used as part of the branch and bound method, but is no longer a part of my Asadpour algorithm implementation.</p>
<p><strong>Blog Posts about <code>ArborescenceIterator</code></strong></p>
<p>5 Jun 2021 - <a href="../finding-all-minimum-arborescences">Finding All Minimum Arborescences</a></p>
<p>10 Jun 2021 - <a href="../implementing-the-iterators">Implementing The Iterators</a></p>
<p><strong>Commits about <code>ArborescenceIterator</code></strong></p>
<p>My commits listed here are still annotated, and much of the work was done at the same time as the <code>SpanningTreeIterator</code> work.</p>
<p><a href="https://github.com/mjschwenne/networkx/commit/495458842d3ec798c6ea52dc1c8089b9a5ce3de5">Testing</a> - <em>Rewrote Kruskal&rsquo;s algorithm to respect partitions and tested that while stubbing the iterators in a separate file</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/d481f757125a699f69bf5c16790d2e727e3cc159">Moved iterators into the correct files to maintain proper codebase visibility</a> - <em>Realized that the iterators need to be in <code>mst.py</code> and <code>branchings.py</code> respectively to keep private functions hidden</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/73cade29568f9e10303fb901c97ac52b1d45b8aa">Including Black reformat</a> - <em>Modified Edmonds&rsquo; algorithm to respect partitions</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/ae1c1031980f7e3c3854d718c8813b226d2e8d42">Modified the ArborescenceIterator to accept init partition</a> - <em>No explanation needed</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5503203433bc875df8c0de5d827bda7bed1589e2">Documentation update for the iterators</a> - <em>No explanation needed</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/b44a5ab9c8d5ac86db446213d7b9712e5b9aac81">Update branchings.py accept doc string edit</a> - <em>No explanation needed</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/97b2da1b5499ecbfd15ef2abd385e50f94c6ba97">Cleaned code, merged functions if possible and opened partition functionality to all</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/55688deb9a84bc7a77aecc556a63ff80dc41c56f">Implemented review suggestions from rossbar</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/aef90dfcbb8b8424c6ed887311b4825559d0a398">Implement suggestions from boothby</a></p>
<h2 id="held_karp_ascent"><code>held_karp_ascent</code><a class="headerlink" href="#held_karp_ascent" title="Link to this heading">#</a></h2>
<p>The Held Karp relaxation was the most difficult part of my GSoC project and the part that I was the most worried about going into this May.</p>
<p>My plans on how to solve the relaxation evolved over the course of the summer as well, finally culminating in <code>held_karp_ascent</code>.
In my GSoC proposal, I discuss using <code>scipy</code> to solve the relaxation, but the Held Karp relaxation is a linear program with an exponential number of constraints (technically finite, but far too many to enumerate directly), so I would quickly surpass the capabilities of virtually any computer that the code would be run on.
Fortunately I realized that while I was still writing my proposal and was able to change it.
Next, I wanted to use the ellipsoid algorithm because that is the suggested method in the Asadpour paper [1].</p>
<p>As it happens, the ellipsoid algorithm is not implemented in <code>numpy</code> or <code>scipy</code>, and after discussing the practicality of implementing the algorithm as part of this project, we decided that a robust ellipsoid solver was a GSoC project unto itself and beyond the scope of the Asadpour algorithm.
Another method was needed, and was found.
In the original paper by Held and Karp [3], they present three different algorithms for solving the relaxation: the column-generation technique, the ascent method, and the branch and bound method.
After reading the paper and comparing all of the methods, I decided that the branch and bound method was the best in terms of performance and wanted to implement that one.</p>
<p>The branch and bound method is a modified version of the ascent method, so I started by implementing the ascent method, then the branch and bound around it.
This had the extra benefit of allowing me to compare the two and determine which is actually better.</p>
<p>Implementing the ascent method proved difficult.
There were a number of subtle bugs in finding the minimum 1-arborescences and in finding the value of epsilon, caused by not recognizing all of the valid edge substitutions in the graph.
More information about these problems can be found in my post titled <em>Understanding the Ascent Method</em>.
Even after this the ascent method was not working properly, but I decided to move onto the branch and bound method in hopes of learning more about the process so that I could fix the ascent method.
<p>That is exactly what happened!
While debugging the branch and bound method, I realized that my function for finding the set of minimum 1-arborescences would stop searching too soon and possibly miss the minimum 1-arborescences.
Once I fixed that bug, both the ascent method and the branch and bound method started to produce the correct results.
<p>But which one would be used in the final project?</p>
<p>Well, that came down to which output was more compatible with the rest of the Asadpour algorithm.
The ascent method could find a fractional solution where the edges are not totally in or out of the solution while the branch and bound method would take the time to ensure that the solution was integral.
As it would happen, the Asadpour algorithm expects a fractional solution to the Held Karp relaxation, so in the end the ascent method won out and the branch and bound method was removed from the project.
<p>All of this is detailed in the (many) blog posts I wrote on this topic, which are listed below.</p>
<p><strong>Blog posts about the Held Karp relaxation</strong></p>
<p>My first two posts were about the <code>scipy</code> solution and the ellipsoid algorithm.</p>
<p>11 Apr 2021 - <a href="../held-karp-relaxation">Held Karp Relaxation</a></p>
<p>8 May 2021 - <a href="../held-karp-separation-oracle">Held Karp Separation Oracle</a></p>
<p>This next post discusses the merits of each algorithm presented in the original Held and Karp paper [3].</p>
<p>3 Jun 2021 - <a href="../a-closer-look-at-held-karp">A Closer Look At Held Karp</a></p>
<p>And finally, the last three Held Karp related posts are about the debugging of the algorithms I did implement.</p>
<p>22 Jun 2021 - <a href="../understanding-the-ascent-method">Understanding The Ascent Method</a></p>
<p>28 Jun 2021 - <a href="../implementing-the-held-karp-relaxation">Implementing The Held Karp Relaxation</a></p>
<p>7 Jul 2021 - <a href="../finalizing-held-karp">Finalizing Held Karp</a></p>
<p><strong>Commits about the Held Karp relaxation</strong></p>
<p>Annotations only provided if needed.</p>
<p><a href="https://github.com/networkx/networkx/pull/4740/commits/716437f6ccbbd6c77a7a01b38d330f899c333f0a">Grabbing black reformats</a> - <em>Initial Ascent method implementation</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/cd28eb71676ecc34c7af6f2e0f8980ad6ae89f00">Working on debugging ascent method plus black reformats</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/660e4d3f04a0b4ce28e152af7f8c7df84e1961b3">Ascent method terminating, but at non-optimal solution</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/8314c3c28d205ed5a7d6316904f4db0265d93942">minor edits</a> - <em>Removed some debug statements</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/f7dcb54ce17ec3646e7d3c33f909f6b382608532">Fixed termination condition, still given non-optimal result</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/beccc98c362eb8bdddc42b72af0d669ad082e468">Minor bugfix, still non-optimal result</a> - <em>Ensured reported answer is the cycle if multiple options</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/68ffad5c70811a702ade569817a1f3a14c33a1af">Fixed subtle bug in find_epsilon()</a> - <em>Fixed the improper substitute detection bug</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/a4f1442dcf2c6f69dcf03dacf0ed38183cdc7ddb">Cleaned code and tried something which didn&rsquo;t work</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/644d14ac6ce327ce577592e566153c0117c6dcb6">Black formats</a> - <em>Initial branch and bound implementation</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/288bb5324cceb11e94396e435616c70b87926f69">Branch and bound returning optimal solution</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/242b53da0e00326ece75304a4ad8fb89e9ba8a25">black formatting changes</a> - <em>Split ascent and branch and bound methods into different functions</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/adbf930c23271c17a4d2fed6fbcd03552799793c">Performance tweaks and testing fractional answers</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/d3a45122bba3240d933a2b4275173f7e8a987cfa">Fixed test bug, I hope</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/37d6219887bff444d9f29e38526965ec4cc0687d">Asadpour output for ascent method</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/bcfb0ebcbe552524e44f9c85e353b53b1711e028">Removed branch and bound method. One unit test misbehaving</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/b529389be5263144b5755f8e4589216606e37484">Added asymmetric fractional test for the ascent method</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/c6cedc1f9d53a0c486c0196041188ae1b9c740d4">Removed printn statements and tweaked final test to be more asymmetric</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/b6bec0dada9ff67dc1cf28f5ae0fe3b1df490dc5">Changed HK to only report on the support of the answer</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/837d0448d38936278cfa9fdb7d8cb636eb8552c3">documentation update</a></p>
<h2 id="spanning_tree_distribution"><code>spanning_tree_distribution</code><a class="headerlink" href="#spanning_tree_distribution" title="Link to this heading">#</a></h2>
<p>Once we have the support of the Held Karp relaxation, we calculate edge weights $\gamma$ for the support so that the probability of any tree being sampled is proportional to the product of $e^{\gamma_e}$ across its edges.
This is called a maximum entropy distribution in the Asadpour paper.
This procedure was included in the Asadpour paper [1] on page 386.</p>
<blockquote>
<ol>
<li>Set $\gamma = \vec{0}$.</li>
<li>While there exists an edge $e$ with $q_e(\gamma) &gt; (1 + \epsilon)z_e$:</li>
</ol>
<ul>
<li>Compute $\delta$ such that if we define $\gamma'$ as $\gamma_e' = \gamma_e - \delta$ and $\gamma_f' = \gamma_f$ for all $f \in E \backslash \{e\}$, then $q_e(\gamma') = (1 + \epsilon / 2)z_e$</li>
<li>Set $\gamma \leftarrow \gamma'$</li>
</ul>
<ol start="3">
<li>Output $\tilde{\gamma} := \gamma$.</li>
</ol>
</blockquote>
<p>Where $q_e(\gamma)$ is the probability that any given edge $e$ will be in a sampled spanning tree chosen with probability proportional to $\exp(\gamma(T))$.
$\delta$ is also given as</p>
<p>$$
\delta = \ln\left(\frac{q_e(\gamma)(1-(1+\epsilon/2)z_e)}{(1-q_e(\gamma))(1+\epsilon/2)z_e}\right)
$$</p>
<p>so the Asadpour paper did almost all of the heavy lifting for this function.
However, they were not very clear on how to calculate $q_e(\gamma)$ other than that Kirchhoff&rsquo;s Tree Matrix Theorem can be used.</p>
<p>My original method for calculating $q_e(\gamma)$ was to apply Kirchhoff&rsquo;s Theorem to the original Laplacian matrix and the Laplacian produced once the edge $e$ is contracted in the graph.
Testing quickly showed that once the edge is contracted, its weight can no longer affect the value of the Laplacian, and thus after subtracting $\delta$ the probability of that edge would increase rather than decrease.
Multiplying my original value of $q_e(\gamma)$ by $\exp(\gamma_e)$ proved to be the solution here, for reasons extensively discussed in my blog post <em>The Entropy Distribution</em> and in particular the &ldquo;Update! (28 July 2021)&rdquo; section.</p>
<p><strong>Blog posts about <code>spanning_tree_distribution</code></strong></p>
<p>13 Jul 2021 - <a href="../entropy-distribution-setup">Entropy Distribution Setup</a></p>
<p>20 Jul 2021 - <a href="../entropy-distribution">The Entropy Distribution</a></p>
<p><strong>Commits about <code>spanning_tree_distribution</code></strong></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/da1f5cf688277426575115e3328e16d8f5b29a3c">Draft of spanning_tree_distribution</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/b6bec0dada9ff67dc1cf28f5ae0fe3b1df490dc5">Changed HK to only report on the support of the answer</a> - <em>Needing to limit $\gamma$ to only the support of the Held Karp relaxation is what caused this change</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/0fcf0b3ecfc3704db17830eeeae72a67b4182ffb">Fixed contraction bug by changing to MultiGraph. Problem with prob &gt; 1</a> - <em>Because the probability is only</em> proportional <em>to the product of the edge weights, this was not actually a problem</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/e820d4f921268ff0d55f913624bcd402c90244b2">Black reformats</a> - <em>Rewrote the test and cleaned the code</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/2195002e9394bcb2c47876809cfbbec3c05b1008">Fixed pypi test error</a> - <em>The pypi tests do not have <code>numpy</code> or <code>scipy</code> and I forgot to flag the test to be skipped if they are not available</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/e4cd4f17311e8d908f016cea45f03b1b3e35822e">Further testing of dist fix</a> - <em>Fixed function to multiply $q_e(\gamma)$ by $\exp(\gamma_e)$ and implemented exception if $\delta$ ever misbehaves</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/68f0cf95565bcdce0aec4678e3af9815e23b494e">Can sample spanning trees</a> - <em>Streamlined finding $q_e(\gamma)$ using new helper function</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/837d0448d38936278cfa9fdb7d8cb636eb8552c3">documentation update</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/aef90dfcbb8b8424c6ed887311b4825559d0a398">Implement suggestions from boothby</a></p>
<h2 id="sample_spanning_tree"><code>sample_spanning_tree</code><a class="headerlink" href="#sample_spanning_tree" title="Link to this heading">#</a></h2>
<p>What good is a spanning tree distribution if we can&rsquo;t sample from it?</p>
<p>While the Asadpour paper [1] provides a rough outline of the sampling process, the bulk of their methodology comes from the Kulkarni paper, <em>Generating random combinatorial objects</em> [5].
That paper had a much more detailed explanation and even this pseudo code from page 202.</p>
<blockquote>
<p>$U = \emptyset,$ $V = E$<br>
Do $i = 1$ to $N$;<br>
$\qquad$Let $a = n(G(U, V))$<br>
$\qquad\qquad a' = n(G(U \cup \{i\}, V))$<br>
$\qquad$Generate $Z \sim U[0, 1]$<br>
$\qquad$If $Z \leq \alpha_i \times \left(a' / a\right)$<br>
$\qquad\qquad$then $U = U \cup \{i\}$,<br>
$\qquad\qquad$else $V = V - \{i\}$<br>
$\qquad$end.<br>
Stop. $U$ is the required spanning tree.</p>
</blockquote>
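<p>A toy rendering of this loop (a sketch under the assumption that the weighted spanning tree counts $n(G(U, V))$ are computed by brute force, which is only viable on very small graphs):</p>

```python
import math
import random
from itertools import combinations

# Sequential sampler in the spirit of Kulkarni's pseudo code above,
# not the NetworkX code: decide edge by edge whether it is in the tree.
def tree_count(nodes, edges, weights, include, allowed):
    """Sum of tree weights over spanning trees that contain every edge
    in `include` and use only edges from `allowed`."""
    n = len(nodes)
    total = 0.0
    for subset in combinations(allowed, n - 1):
        if not include <= set(subset):
            continue
        parent = {v: v for v in nodes}
        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v
        ok = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:  # cycle detected, not a spanning tree
                ok = False
                break
            parent[ru] = rv
        if ok:
            total += math.prod(weights[e] for e in subset)
    return total

def sample_tree(nodes, edges, weights, rng):
    U, V = set(), list(edges)
    for e in edges:
        a = tree_count(nodes, edges, weights, U, V)
        a_with = tree_count(nodes, edges, weights, U | {e}, V)
        # P(e in T | decisions so far) = alpha_e * a' / a; the alpha_e
        # factor is already folded into a_with because tree_count
        # multiplies in the edge weights.
        if rng.random() <= a_with / a:
            U.add(e)
        else:
            V.remove(e)
    return U

# Example: sample one tree with a seeded generator for repeatability.
rng = random.Random(0)
example_tree = sample_tree([0, 1, 2], [(0, 1), (1, 2), (0, 2)],
                           {(0, 1): 2.0, (1, 2): 1.0, (0, 2): 1.0}, rng)
```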
<p>The only real difficulty here was tracking how the nodes were being contracted.
My first attempt was a mess of <code>if</code> statements and the like, but switching it to a merge-find data structure (or disjoint set data structure) proved to be a wise decision.</p>
<p>Of course, it is one thing to be able to sample a spanning tree and another entirely to know if the sampling technique matches the expected distribution.
My first iteration of the test for <code>sample_spanning_tree</code> just sampled a large number of trees (50000) and then printed the percent error from the normalized distribution of spanning trees.
With a sample size of 50000 all of the errors were under 10%, but I still wanted to find a better test.</p>
<p>From my AP statistics class in high school I remembered the $X^2$ (Chi-squared) test and realized that it would be perfect here.
<code>scipy</code> even had the ability to conduct one.
By converting to a chi-squared test I was able to reduce the sample size down to 1200 (near the minimum required sample size to have a valid chi-squared test) and use a proper hypothesis test at the $\alpha = 0.01$ significance level.
Unfortunately, the test would still fail 1% of the time until I added the <code>@py_random_state</code> decorator to <code>sample_spanning_tree</code>, so that the test can pass in a <code>Random</code> object to produce repeatable results.</p>
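<p>The testing idea can be illustrated with a small, self-contained sketch (not the actual NetworkX test): draw samples from a known distribution using a seeded <code>Random</code> object, then compare the observed counts to the expected counts with the $X^2$ statistic and the $df = 2$, $\alpha = 0.01$ critical value of about 9.21.</p>

```python
import random

# Illustration of the chi-squared testing idea, with a hypothetical
# three-tree distribution standing in for the sampler's output.
def chi_squared_stat(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

rng = random.Random(0)
probs = [0.4, 0.4, 0.2]   # probabilities of sampling each "tree"
n_samples = 1200          # near the minimum needed for a valid test
counts = [0, 0, 0]
for _ in range(n_samples):
    counts[rng.choices(range(3), weights=probs)[0]] += 1

expected = [p * n_samples for p in probs]
stat = chi_squared_stat(counts, expected)
# Fail to reject the null hypothesis (samples match the distribution)
# when stat is below the df = 2, alpha = 0.01 critical value of 9.21.
matches = stat < 9.21
```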
<p><strong>Blog posts about <code>sample_spanning_tree</code></strong></p>
<p>21 Jul 2021 - <a href="../preliminaries-for-sampling-a-spanning-tree">Preliminaries For Sampling A Spanning Tree</a></p>
<p>28 Jul 2021 - <a href="../sampling-a-spanning-tree">Sampling A Spanning Tree</a></p>
<p><strong>Commits about <code>sample_spanning_tree</code></strong></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/68f0cf95565bcdce0aec4678e3af9815e23b494e">Can sample spanning trees</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/3cca2b5bfdf001b1613f8e803f78c9fb380adc59">Developing test for sampling spanning tree</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/274e2c5908f337941ee5234d727fd307257a9b85">Changed sample_spanning_tree test to Chi squared test</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/7ebc6d874ec703a46dfc40f195fa84594bb9582c">Adding test cases</a> - <em>Implemented <code>@py_random_state</code> decorator</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/837d0448d38936278cfa9fdb7d8cb636eb8552c3">documentation update</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<h2 id="asadpour_atsp"><a href="https://networkx.org/documentation/networkx-2.7.1/reference/algorithms/generated/networkx.algorithms.approximation.traveling_salesman.asadpour_atsp.html"><code>asadpour_atsp</code></a><a class="headerlink" href="#asadpour_atsp" title="Link to this heading">#</a></h2>
<p>This function was the last piece of the puzzle, connecting all of the others together and producing the final result!</p>
<p>Implementing this function was actually rather smooth.
The only technical difficulty was reading the support of the <code>flow_dict</code>, and the main theoretical difficulty was adapting the <code>min_cost_flow</code> function to solve the minimum cost circulation problem.
Oh, and if the flow along an edge is greater than 1, I need to add that many parallel edges to the graph so that it stays Eulerian.</p>
<p>A brief overview of the whole algorithm is given below:</p>
<ol>
<li>Solve the Held Karp relaxation and symmetrize the result to make it undirected.</li>
<li>Calculate the maximum entropy spanning tree distribution on the Held Karp support graph.</li>
<li>Sample $2 \lceil \ln n \rceil$ spanning trees and record the smallest-weight one before reintroducing direction to the edges.</li>
<li>Find the minimum cost circulation to create an Eulerian graph containing the sampled tree.</li>
<li>Take the Eulerian walk of that graph and shortcut the answer.</li>
<li>Return the shortcut answer.</li>
</ol>
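<p>The per-edge orientation in step 3 can be sketched as follows, using a plain cost dict for illustration (the real implementation reads the graph's weight attribute, and the function name here is my own):</p>

```python
def orient_tree(tree_edges, cost):
    """Direct each undirected tree edge the cheaper of its two ways.

    tree_edges: iterable of (u, v) pairs from the sampled spanning tree.
    cost: dict mapping directed pairs (u, v) to asymmetric edge costs.
    """
    directed = []
    for u, v in tree_edges:
        # Keep (u, v) if it is no more expensive than the reverse arc.
        directed.append((u, v) if cost[(u, v)] <= cost[(v, u)] else (v, u))
    return directed


cost = {(0, 1): 5, (1, 0): 2, (1, 2): 3, (2, 1): 7}
orient_tree([(0, 1), (1, 2)], cost)  # [(1, 0), (1, 2)]
```

<p>Because the choices are independent per edge, orienting each edge in its cheaper direction minimizes the total cost of the directed tree.</p>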
<p><strong>Blog posts about <code>asadpour_atsp</code></strong></p>
<p>29 Jul 2021 - <a href="../looking-at-the-big-picture">Looking At The Big Picture</a></p>
<p>10 Aug 2021 - <a href="../completing-the-asadpour-algorithm">Completing The Asadpour Algorithm</a></p>
<p><strong>Commits about <code>asadpour_atsp</code></strong></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/2c1dc57542cc9651b5443f6015fb94b94bc2f7cd">untested implementation of asadpour_tsp</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/454c82ca61ab4746b57c6681449f8ea08f96d557">Fixed issue reading flow_dict</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/328a4f3b2669fa9890d2c08a4d72f0f9bb7573dc">Fixed runtime errors in asadpour_tsp</a> - <em>The general traveling salesman problem function assumed graphs were undirected, which does not work with an ATSP algorithm</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/1d345054a20a88b3115af900972a0145d708d8b5">black reformats</a> - <em>Fixed parallel edges from flow support bug</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/7ebc6d874ec703a46dfc40f195fa84594bb9582c">Adding test cases</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/837d0448d38936278cfa9fdb7d8cb636eb8552c3">documentation update</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/11fef147246eb3374568515a4b29aeee5a9f469d">One new test and check</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/6db9f7692fc5294ac206fa331242fe679cbfb7d7">Fixed rounding error with tests</a></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/5f97de07821e49cc9ba4f9996ec6d1495eb268b7">Review suggestions from dshult</a> - <em>Implemented code review suggestions from one of my mentors</em></p>
<p><a href="https://github.com/mjschwenne/networkx/commit/55688deb9a84bc7a77aecc556a63ff80dc41c56f">Implemented review suggestions from rossbar</a></p>
<h2 id="future-involvement-with-networkx">Future Involvement with NetworkX<a class="headerlink" href="#future-involvement-with-networkx" title="Link to this heading">#</a></h2>
<p>Overall, I really enjoyed this Summer of Code.
I was able to branch out, continue to learn Python, and learn more about graphs and graph algorithms, an area of interest for me.</p>
<p>Assuming that I have any amount of free time this coming fall semester, I&rsquo;d love to stay involved with NetworkX.
In fact, there are already some things that I have in mind even though my current code works as is.</p>
<ul>
<li>
<p>Move <code>sample_spanning_tree</code> to <code>mst.py</code> and rename it to <code>random_spanning_tree</code>.
The ability to sample random spanning trees is not a part of the greater NetworkX library and could be useful to others.
One of my mentors mentioned it being relevant to <a href="https://en.wikipedia.org/wiki/Steiner_tree_problem">Steiner trees</a> and if I can help other developers and users out, I will.</p>
</li>
<li>
<p>Adapt <code>sample_spanning_tree</code> so that it can use both additive and multiplicative weight functions.
The Asadpour algorithm only needs the multiplicative weight, but the Kulkarni paper [5] does talk about using an additive weight function which may be more useful to other NetworkX users.</p>
</li>
<li>
<p>Move my Kirchhoff&rsquo;s matrix tree theorem helper function to <code>laplacian_matrix.py</code> so that other NetworkX users can access it.</p>
</li>
<li>
<p>Investigate the following article about the Held Karp relaxation.
While I have no hard evidence for this one, I believe that the Held Karp relaxation is the slowest part of my implementation of the Asadpour algorithm and thus the best place to look for improvements.
The ascent method I am using comes from the original Held and Karp paper [3], but they released a part II which may contain better algorithms.
The citation is given below.</p>
<p>M. Held, R.M. Karp, <em>The traveling-salesman problem and minimum spanning trees: Part II</em>. Mathematical Programming, 1971, 1(1), p. 6–25. <a href="https://doi.org/10.1007/BF01584070">https://doi.org/10.1007/BF01584070</a></p>
</li>
<li>
<p>Refactor the <code>Edmonds</code> class in <code>branchings.py</code>.
That class implements Edmonds&rsquo; branching algorithm, but uses an iterative approach rather than the recursive one discussed in Edmonds&rsquo; paper [2].
I have also agreed to work with another person, <a href="https://github.com/lkora">lkora</a>, to help rework this class and possibly add a <code>minimum_maximal_branching</code> function to find the minimum branching which still connects as many nodes as possible.
This would be analogous to a spanning forest in an undirected graph.
At the moment, neither of us have had time to start such work.
For more information please reference issue <a href="https://github.com/networkx/networkx/issues/4836">#4836</a>.</p>
</li>
</ul>
<p>While there are areas of this project which I can still improve upon, it is important for me to remember that it was a complete success.
NetworkX now has an algorithm to approximate the traveling salesman problem on asymmetric or directed graphs.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Completing the Asadpour Algorithm]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="related" type="text/html" title="Looking at the Big Picture" />
                <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/completing-the-asadpour-algorithm/</id>
            
            
            <published>2021-08-10T00:00:00+00:00</published>
            <updated>2021-08-10T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Implementation details for asadpour_atsp</blockquote><p>My implementation of <code>asadpour_atsp</code> is now working!
Recall that my pseudo code for this function from my last post was</p>

<div class="highlight">
  <pre>def asadpour_tsp
    Input: A complete graph G with weight being the attribute key for the edge weights.
    Output: A list of edges which form the approximate ATSP solution.

    z_star = held_karp(G)
    # test to see if z_star is a graph or dict
    if type(z_star) is nx.DiGraph
        return z_star.edges

    z_support = nx.MultiGraph()
    for u, v in z_star
        if (u, v) not in z_support.edges
            edge_weight = min(G[u][v][weight], G[v][u][weight])
            z_support.add_edge(u, v, weight=edge_weight)
    gamma = spanning_tree_distribution(z_support, z_star)

    for u, v in z_support.edges
        z_support[u][v][lambda] = exp(gamma[(u, v)])

    for _ in range 1 to 2 ceil(log(n))
        sampled_tree = sample_spanning_tree(G)
        sampled_tree_weight = sampled_tree.size()
        if sampled_tree_weight &lt; minimum_sampled_tree_weight
            minimum_sampled_tree = sampled_tree.copy()
            minimum_sampled_tree_weight = sampled_tree_weight

    t_star = nx.DiGraph()
    for u, v, d in minimum_sampled_tree.edges(data=weight)
        if d == G[u][v][weight]
            t_star.add_edge(u, v, weight=d)
        else
            t_star.add_edge(v, u, weight=d)

    for n in t_star
        node_demands[n] = t_star.out_degree(n) - t_star.in_degree(n)

    nx.set_node_attributes(G, node_demands)
    flow_dict = nx.min_cost_flow(G)

    for u, v in flow_dict
        if (u, v) not in t_star.edges and flow_dict[u, v] &gt; 0
            t_star.add_edge(u, v)
    eulerian_circuit = nx.eulerian_circuit(t_star)
    return _shortcutting(eulerian_circuit)</pre>
</div>

<p>And this was more or less correct.
A few issues were present, as there always are.</p>
<p>First, my largest issue came from part of a word being in parentheses in the Asadpour paper on page 385.</p>
<blockquote>
<p>This integral circulation $f^*$ corresponds to a directed (multi)graph $H$ which contains $\vec{T}^*$.</p>
</blockquote>
<p>Basically, if the minimum flow along an edge is ever larger than 1, I need to add that many parallel edges in order to ensure that everything is still Eulerian.
This became a problem quickly while developing my test cases, as shown in the example below.</p>
<center><img src="example-multiflow.png" alt="Example of correct and incorrect circulation from the directed spanning tree"/></center>
<p>As you can see, in the incorrect circulation vertices 2 and 3 are not Eulerian, as their in- and out-degrees do not match.</p>
<p>All of the others were just minor points where the pseudo code didn&rsquo;t directly translate into python (because, after all, it isn&rsquo;t python).</p>
<h2 id="understanding-the-output">Understanding the Output<a class="headerlink" href="#understanding-the-output" title="Link to this heading">#</a></h2>
<p>The first thing I did once <code>asadpour_atsp</code> was working was to take the fractional, symmetric Held Karp relaxation test graph and run it through the general <code>traveling_salesman_problem</code> function.
Since there are random numbers involved here, the results were always within the $O(\log n / \log \log n)$ approximation factor but were different.
Three examples are shown below.</p>
<center><img src="example-tours.png" alt="Three possible ATSP tours on an example graph"/></center>
<p>The first thing we want to check is the approximation ratio.
We know that the minimum cost output of the <code>traveling_salesman_problem</code> function is 304 (this is actually lower than the optimal tour in the undirected version; more on this later).
Next, we need to know our maximum approximation factor.
Now, the Asadpour algorithm is $O(\log n / \log \log n)$, which for our six vertex graph would be $\ln(6) / \ln(\ln(6)) \approx 3.0723$.
However, on page 386 the authors give the constant of the approximation as $(2 + 8 \log n / \log \log n)$, which would be $2 + 8 \times \ln(6) / \ln(\ln(6)) \approx 26.5784$.
(Remember that all $\log$&rsquo;s in the Asadpour paper refer to the natural logarithm.)
All of our examples are well below even the lower bound.</p>
<p>For example 1:</p>
<p>$$
\begin{array}{r l}
\text{actual}: &amp; 504 \\\
\text{expected}: &amp; 304 \\\
\text{approx. factor}: &amp; \frac{504}{304} \approx 1.6578 &lt; 3.0723
\end{array}
$$</p>
<p>Example 2:</p>
<p>$$
\begin{array}{r l}
\text{actual}: &amp; 404 \\\
\text{expected}: &amp; 304 \\\
\text{approx. factor}: &amp; \frac{404}{304} \approx 1.3289 &lt; 3.0723
\end{array}
$$</p>
<p>Example 3:</p>
<p>$$
\begin{array}{r l}
\text{actual}: &amp; 304 \\\
\text{expected}: &amp; 304 \\\
\text{approx. factor}: &amp; \frac{304}{304} = 1.0000 &lt; 3.0723
\end{array}
$$</p>
<p>At this point, you&rsquo;ve probably noticed that the examples given are, strictly speaking, <em>not</em> Hamiltonian cycles: they visit some vertices multiple times.
This is because the graph we have is not complete.
The Asadpour algorithm only works on complete graphs, so the <code>traveling_salesman_problem</code> function finds the shortest cost path between every pair of vertices and inserts the missing edges.
In fact, if the <code>asadpour_atsp</code> function is given an incomplete graph, it will raise an exception.
Take example three, where there is only one repeated vertex, 5.</p>
<p>Behind the scenes, the graph is complete and the solution may contain the dashed edge in the below image.</p>
<center><img src="complete-bypass.png" alt="Reversing an edge bypass to translate the TSP back to the original graph"/></center>
<p>But that edge is not in the original graph, so during the post-processing done by the <code>traveling_salesman_problem</code> function, the red edges are inserted instead of the dashed edge.</p>
<h2 id="testing-the-asadpour-algorithm">Testing the Asadpour Algorithm<a class="headerlink" href="#testing-the-asadpour-algorithm" title="Link to this heading">#</a></h2>
<p>Before I could write any tests, I needed to ensure that the tests were consistent from execution to execution.
At the time, this was not the case since there were random numbers being generated in order to sample the spanning trees.
So I had to learn how to use the <code>@py_random_state</code> decorator.</p>
<p>When this decorator is added to the top of a function, we pass it either the position of the argument in the function signature or the name of the keyword for that argument.
It then takes that argument and configures a Python <code>Random</code> object based on the input parameter.</p>
<ul>
<li>Parameter is <code>None</code>, use a new <code>Random</code> object.</li>
<li>Parameter is an <code>int</code>, use a new <code>Random</code> object with that seed.</li>
<li>Parameter is a <code>Random</code> object, use that object as is.</li>
</ul>
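<p>In simplified form, the resolution logic behaves like this sketch (the real decorator lives in <code>networkx.utils</code> and is more general; the function name here is my own):</p>

```python
import random


def resolve_random_state(seed=None):
    """Simplified sketch of the argument handling described above."""
    if seed is None:
        return random.Random()       # fresh, unseeded generator
    if isinstance(seed, int):
        return random.Random(seed)   # repeatable generator from a seed
    if isinstance(seed, random.Random):
        return seed                  # use the caller's generator as-is
    raise ValueError(f"cannot create a random state from {seed!r}")
```

<p>Passing the same <code>int</code> therefore yields the same draw sequence on every call, while passing a shared <code>Random</code> object lets successive calls advance one generator.</p>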
<p>So I changed the function signature of <code>sample_spanning_tree</code> to have <code>random=None</code> at the end.
For most use cases the default will not be changed and the results will differ on every call, but if we give it an <code>int</code>, the same tree will be sampled every time.
For my tests, I can instead give it a seeded <code>Random</code> object to create repeatable behaviour.
Since the <code>sample_spanning_tree</code> function is not visible outside of the <code>traveling_salesman</code> file, I also had to create a pass-through parameter for <code>asadpour_atsp</code> so that my seed could have any effect.</p>
<p>Once this was done, I modified the test for <code>sample_spanning_tree</code> so that it would not have a 1 in 100 chance of spontaneously failing.
At first I just passed it an <code>int</code>, but that forced every tree sampled to be the same (since the edges were shuffled the same and sampled from the same sequence of numbers) and the test failed.
So I tweaked it to use a <code>Random</code> object from the random package and this worked well.</p>
<p>From here, I wrap the complete <code>asadpour_atsp</code> parameters I want in another function <code>fixed_asadpour</code> like this:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">fixed_asadpour</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">nx_app</span><span class="o">.</span><span class="n">asadpour_atsp</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="p">,</span> <span class="mi">56</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">path</span> <span class="o">=</span> <span class="n">nx_app</span><span class="o">.</span><span class="n">traveling_salesman_problem</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">,</span> <span class="n">cycle</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">method</span><span class="o">=</span><span class="n">fixed_asadpour</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
<p>I tested using both <code>traveling_salesman_problem</code> and <code>asadpour_atsp</code>.
The tests included:</p>
<ul>
<li>The fractional, symmetric Held Karp graph from above.</li>
<li>A real world example using airline prices between six cities (also uses non-integer node names).</li>
<li>The same real world example but asking for a path not a cycle.</li>
<li>Using a disconnected graph (raises exception).</li>
<li>Using an incomplete graph (raises exception).</li>
<li>Using an integral Held Karp solution (returns directly after Held Karp with exact solution).</li>
<li>Using an impossible graph (one vertex has only out edges).</li>
</ul>
<h2 id="bonus-feature">Bonus Feature<a class="headerlink" href="#bonus-feature" title="Link to this heading">#</a></h2>
<p>There is even a bonus feature!
The <code>asadpour_atsp</code> function accepts a fourth argument, <code>source</code>!
Since both of the return paths use the <code>eulerian_circuit</code> and <code>_shortcutting</code> functions, I can pass a <code>source</code> vertex to the circuit function and ensure that the returned tour starts and ends at the desired vertex.</p>
<p>Access it by wrapping the method; just be sure that the source vertex is in the graph to avoid an exception.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">fixed_asadpour</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">nx_app</span><span class="o">.</span><span class="n">asadpour_atsp</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="p">,</span> <span class="n">source</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">path</span> <span class="o">=</span> <span class="n">nx_app</span><span class="o">.</span><span class="n">traveling_salesman_problem</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">,</span> <span class="n">cycle</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">method</span><span class="o">=</span><span class="n">fixed_asadpour</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span></span></span></code></pre>
</div>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An O (log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, SODA ’10, Society for Industrial and Applied Mathematics, 2010, <a href="https://dl.acm.org/doi/abs/10.5555/1873601.1873633">https://dl.acm.org/doi/abs/10.5555/1873601.1873633</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Looking at the Big Picture]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/looking-at-the-big-picture/</id>
            
            
            <published>2021-07-29T00:00:00+00:00</published>
            <updated>2021-07-29T00:00:00+00:00</updated>
            
            
<content type="html"><![CDATA[<blockquote>Preliminaries for the final Asadpour algorithm function in NetworkX</blockquote><p>Well, we&rsquo;re finally at the point in this GSoC project where the end is glimmering on the horizon.
I have completed the Held Karp relaxation, generating a spanning tree distribution and now sampling from that distribution.
That means that it is time to start thinking about how to link these separate components into one algorithm.</p>
<p>Recall that from the Asadpour paper the overview of the algorithm is</p>
<blockquote>
<hr>
<p><strong>Algorithm 1</strong> An $O(\log n / \log \log n)$-approximation algorithm for the ATSP</p>
<hr>
<p><strong>Input:</strong> A set $V$ consisting of $n$ points and a cost function $c\ :\ V \times V \rightarrow \mathbb{R}^+$ satisfying the triangle inequality.</p>
<p><strong>Output:</strong> $O(\log n / \log \log n)$-approximation of the asymmetric traveling salesman problem instance described by $V$ and $c$.</p>
<ol>
<li>Solve the Held-Karp LP relaxation of the ATSP instance to get an optimum extreme point solution $x^*$.
Define $z^*$ as in (5), making it a symmetrized and scaled down version of $x^*$.
Vector $z^*$ can be viewed as a point in the spanning tree polytope of the undirected graph on the support of $x^*$ that one obtains after disregarding the directions of arcs (See Section 3.)</li>
<li>Let $E$ be the support graph of $z^*$ when the direction of the arcs are disregarded.
Find weights ${\tilde{\gamma}}_{e \in E}$ such that the exponential distribution on the spanning trees, $\tilde{p}(T) \propto \exp(\sum_{e \in T} \tilde{\gamma}_e)$ (approximately) preserves the marginals imposed by $z^*$, i.e. for any edge $e \in E$,
<center>$\sum\_{T \in \mathcal{T} : T \ni e} \tilde{p}(T) \leq (1 + \epsilon) z^\*\_e$,</center>
for a small enough value of $\epsilon$.
(In this paper we show that $\epsilon = 0.2$ suffices for our purpose. See Section 7 and 8 for a description of how to compute such a distribution.)
</li>
<li>Sample $2\lceil \log n \rceil$ spanning trees $T_1, \dots, T_{2\lceil \log n \rceil}$ from $\tilde{p}(.)$.
For each of these trees, orient all its edges so as to minimize its cost with respect to our (asymmetric) cost function $c$.
Let $T^*$ be the tree whose resulting cost is minimal among all of the sampled trees.</li>
<li>Find a minimum cost integral circulation that contains the oriented tree $\vec{T}^*$.
Shortcut this circulation to a tour and output it. (See Section 4.)</li>
</ol>
<hr>
</blockquote>
<p>We are now firmly in the steps 3 and 4 area.
Going all the way back to my post on 24 May 2021 titled <a href="../networkx-function-stubs">Networkx Function stubs</a> the only function left is <code>asadpour_tsp</code>, the main function which needs to accomplish this entire algorithm.
But before we get to writing pseudocode for it, step 4 still needs a thorough examination.</p>
<h2 id="circulation-and-shortcutting">Circulation and Shortcutting<a class="headerlink" href="#circulation-and-shortcutting" title="Link to this heading">#</a></h2>
<p>Once we have sampled enough spanning trees from the graph and converted the minimum one into $\vec{T}^*$ we need to find the minimum cost integral circulation in the graph which contains $\vec{T}^*$.
While NetworkX has a minimum cost flow function, namely <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.flow.min_cost_flow.html"><code>min_cost_flow</code></a>, it is not suitable for the Asadpour algorithm out of the box.
The problem here is that we do not have node demands, we have edge demands.
However, after some reading and discussion with one of my mentors Dan, we can convert the current problem into one which can be solved using the <code>min_cost_flow</code> function.</p>
<p>The problem that we are trying to solve is called the minimum cost circulation problem and the one which <code>min_cost_flow</code> is able to solve is the, well, minimum cost flow problem.
As it happens, these are equivalent problems, so I can convert the minimum cost circulation into a minimum cost flow problem by transforming the minimum edge demands into node demands.</p>
<p>Recall that at this point we have a directed minimum sampled spanning tree $\vec{T}^*$ and that the flow through each of the edges in $\vec{T}^*$ needs to be at least one.
From the perspective of a flow problem, $\vec{T}^*$ is moving some flow around the graph.
However, in order to augment $\vec{T}^*$ into an Eulerian graph so that we can walk it, we need to counteract this flow so that the net flow for each node is 0 ($f(\delta^+(v)) = f(\delta^-(v))$ in the Asadpour paper).</p>
<p>So, we find the net tree flow at each node and assign a demand that cancels it, so that the total flow balances at the node in question.
If the net outflow of $\vec{T}^*$ at node $i$ is $\delta^+(i) - \delta^-(i)$, then the demand we assign to that node is also $\delta^+(i) - \delta^-(i)$, since NetworkX treats a positive demand as net flow the node must receive.
Once we assign the demands to the nodes, we can temporarily ignore the edge lower capacities to find the minimum flow.</p>
<p>For more information on the conversion process, please see [2].</p>
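<p>Under NetworkX's convention that a positive demand means the node must receive net flow, the edge-demand-to-node-demand conversion can be sketched as follows (a stand-alone illustration, not the library code; the function name is my own):</p>

```python
def circulation_node_demands(tree_edges):
    """Demand for each node: out-degree minus in-degree in the directed tree.

    A node that the tree pushes net flow out of must receive that much back
    from the circulation (positive demand), and vice versa for a node the
    tree pushes net flow into (negative demand, i.e. a source).
    """
    demands = {}
    for u, v in tree_edges:
        demands[u] = demands.get(u, 0) + 1  # one more unit leaving u
        demands[v] = demands.get(v, 0) - 1  # one more unit entering v
    return demands


# A small directed tree: 0 -> 1, 1 -> 2, 1 -> 3
circulation_node_demands([(0, 1), (1, 2), (1, 3)])
# {0: 1, 1: 1, 2: -1, 3: -1} -- the demands always sum to zero
```

<p>These demands can then be set as node attributes before handing the graph to <code>min_cost_flow</code>.</p>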
<p>After the minimum flow is found, we take the support of the flow and add it to the $\vec{T}^*$ to create a multigraph $H$.
Now we know that $H$ is weakly connected (it contains $\vec{T^*}$) and that it is Eulerian because for every node the in-degree is equal to the out-degree.
A closed eulerian walk or eulerian circuit can be found in this graph with <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.euler.eulerian_circuit.html"><code>eulerian_circuit</code></a>.</p>
<p>Here is an example of this process on a simple graph.
I suspect that the flow will not always use the back edges of the spanning tree, and that it only does so here due to the small number of vertices.</p>
<center><img src="example-min-flow.png" alt="Example of finding the minimum flow on a directed spanning tree"/></center>
<p>Finally, we take the Eulerian circuit and shortcut it.
On the plus side, the shortcutting process is the same as in the Christofides algorithm, so it already exists as the <code>_shortcutting</code> helper function in the traveling salesman file.
This is where it is critical that the triangle inequality holds, so that shortcutting cannot increase the cost of the circulation.</p>
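<p>The shortcutting step itself amounts to walking the Eulerian circuit and skipping any vertex already visited; a sketch of the idea (not the <code>_shortcutting</code> helper itself):</p>

```python
def shortcut_circuit(circuit):
    """Collapse an Eulerian circuit (a list of arcs) into a tour.

    Repeated vertices are skipped; the triangle inequality guarantees the
    direct edge is no more expensive than the detour it replaces.
    """
    seen = set()
    tour = []
    for u, _ in circuit:
        if u not in seen:
            seen.add(u)
            tour.append(u)
    tour.append(tour[0])  # close the cycle
    return tour


# Circuit 0 -> 1 -> 2 -> 1 -> 3 -> 0 revisits vertex 1; shortcutting removes it.
shortcut_circuit([(0, 1), (1, 2), (2, 1), (1, 3), (3, 0)])
# [0, 1, 2, 3, 0]
```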
<h2 id="pseudo-code-for-asadpour_tsp">Pseudo code for asadpour_tsp<a class="headerlink" href="#pseudo-code-for-asadpour_tsp" title="Link to this heading">#</a></h2>
<p>Let&rsquo;s start with the function signature.</p>

<div class="highlight">
  <pre>def asadpour_tsp
    Input: A complete graph G with weight being the attribute key for the edge weights.
    Output: A list of edges which form the approximate ATSP solution.</pre>
</div>

<p>This is exactly what we&rsquo;d expect, take a complete graph $G$ satisfying the triangle inequality and return the edges in the approximate solution to the asymmetric traveling salesman problem.
Recall from my post <a href="../networkx-function-stubs">Networkx Function Stubs</a> that the primary traveling salesman function, <code>traveling_salesman_problem</code>, will ensure that we are given a complete graph satisfying the triangle inequality by using all-pairs shortest path calculations, and will handle whether we are expected to return a true cycle or only a path.</p>
<p>The first step in the Asadpour algorithm is the Held Karp relaxation.
I am planning on editing the flow of the algorithm here a bit.
If the Held Karp relaxation finds an integer solution, then we know it is one of the optimal TSP routes and there is no point in continuing the algorithm: we can just return it as an optimal solution.
However, if the Held Karp relaxation finds a fractional solution we press on with the rest of the algorithm.</p>

<div class="highlight">
  <pre>    z_star = held_karp(G)
    # test to see if z_star is a graph or dict
    if type(z_star) is nx.DiGraph
        return z_star.edges</pre>
</div>

<p>Once we have the Held Karp solution, we create the undirected support of <code>z_star</code> for the next step of creating the exponential distribution of spanning trees.</p>

<div class="highlight">
  <pre>    z_support = nx.MultiGraph()
    for u, v in z_star
        if (u, v) not in z_support.edges
            edge_weight = min(G[u][v][weight], G[v][u][weight])
            z_support.add_edge(u, v, weight=edge_weight)
    gamma = spanning_tree_distribution(z_support, z_star)</pre>
</div>
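<p>Filled in with concrete (made-up) numbers, the support-building step could look like the following, where <code>z_star</code> maps directed edges to their fractional Held Karp values:</p>

```python
import networkx as nx

# Hypothetical Held Karp output: fractional values on directed edges.
z_star = {(0, 1): 0.5, (1, 0): 0.5, (1, 2): 1.0, (2, 0): 1.0}

# Toy asymmetric weights, for illustration only.
G = nx.DiGraph()
G.add_weighted_edges_from(
    [(0, 1, 2), (1, 0, 3), (1, 2, 1), (2, 1, 4), (2, 0, 5), (0, 2, 6)]
)

# Undirected support: one edge per unordered pair, keeping the cheaper
# of the two directions as its weight.
z_support = nx.MultiGraph()
for u, v in z_star:
    if (u, v) not in z_support.edges:
        w = min(G[u][v]["weight"], G[v][u]["weight"])
        z_support.add_edge(u, v, weight=w)
```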

<p>This completes steps 1 and 2 in the Asadpour overview at the top of this post.
Next we sample $2 \lceil \log n \rceil$ spanning trees.</p>

<div class="highlight">
  <pre>    for u, v in z_support.edges
        z_support[u][v][lambda] = exp(gamma[(u, v)])

    minimum_sampled_tree = None
    minimum_sampled_tree_weight = infinity
    for _ in range 1 to 2 ceil(log(n))
        sampled_tree = sample_spanning_tree(z_support)
        sampled_tree_weight = sampled_tree.size(weight)
        if sampled_tree_weight &lt; minimum_sampled_tree_weight
            minimum_sampled_tree = sampled_tree.copy()
            minimum_sampled_tree_weight = sampled_tree_weight</pre>
</div>

<p>Now that we have the minimum sampled tree, we need to orient its edges while keeping the cost equal to that of the minimum tree.
We can do this by iterating over the edges in <code>minimum_sampled_tree</code> and checking the edge weights in the original graph $G$.
Using $G$ is required here because we did not record which direction was cheaper when we created <code>z_support</code>.</p>

<div class="highlight">
  <pre>    t_star = nx.DiGraph()
    for u, v, d in minimum_sampled_tree.edges(data=weight)
        if d == G[u][v][weight]
            t_star.add_edge(u, v, weight=d)
        else
            t_star.add_edge(v, u, weight=d)</pre>
</div>
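<p>On a toy graph with invented weights, this orientation step behaves as follows; because the undirected tree stored the cheaper direction's weight, matching that weight in $G$ recovers the direction:</p>

```python
import networkx as nx

# Toy asymmetric weights, for illustration only.
G = nx.DiGraph()
G.add_weighted_edges_from([(0, 1, 2), (1, 0, 3), (1, 2, 5), (2, 1, 1)])

# Undirected minimum tree carrying the cheaper direction's weight.
minimum_sampled_tree = nx.Graph()
minimum_sampled_tree.add_weighted_edges_from([(0, 1, 2), (1, 2, 1)])

# Orient each tree edge in whichever direction matches its weight in G.
t_star = nx.DiGraph()
for u, v, d in minimum_sampled_tree.edges(data="weight"):
    if d == G[u][v]["weight"]:
        t_star.add_edge(u, v, weight=d)
    else:
        t_star.add_edge(v, u, weight=d)
```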

<p>Next we create a mapping of nodes to node demands for the minimum cost flow problem which was discussed earlier in this post.
I think that using a dict is the best option as it can be passed into <a href="https://networkx.org/documentation/stable/reference/generated/networkx.classes.function.set_node_attributes.html"><code>set_node_attributes</code></a> all at once before finding the minimum cost flow.</p>

<div class="highlight">
  <pre>    node_demands = dict()
    for n in t_star
        node_demands[n] = t_star.out_degree(n) - t_star.in_degree(n)

    nx.set_node_attributes(G, node_demands)
    flow_dict = nx.min_cost_flow(G)</pre>
</div>
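<p>For reference, here is <code>min_cost_flow</code> on a tiny made-up instance; NetworkX reads the <code>demand</code> node attribute (negative to send, positive to receive) and the <code>weight</code> edge attribute:</p>

```python
import networkx as nx

# Tiny illustrative instance: node 0 must send one unit to node 2.
G = nx.DiGraph()
G.add_edge(0, 1, weight=1)
G.add_edge(1, 2, weight=1)
G.add_edge(0, 2, weight=5)
nx.set_node_attributes(G, {0: -1, 1: 0, 2: 1}, "demand")

# The unit routes along 0 -> 1 -> 2 (cost 2) instead of 0 -> 2 (cost 5).
flow_dict = nx.min_cost_flow(G)  # flow_dict[u][v] is the flow on (u, v)
```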

<p>Here we can add the support of the flow directly to <code>t_star</code> to simulate adding the two graphs together, then take the Eulerian circuit and shortcut it on the way out.</p>

<div class="highlight">
  <pre>    for u, v in flow_dict
        if (u, v) not in t_star.edges and flow_dict[u][v] &gt; 0
            t_star.add_edge(u, v)
    eulerian_circuit = nx.eulerian_circuit(t_star)
    return _shortcutting(eulerian_circuit)</pre>
</div>

<p>That should be it.
Once the code for <code>asadpour_tsp</code> is written it will need to be tested.
I&rsquo;m not sure how I&rsquo;m going to create the test cases yet, but I do plan on testing it using real world airline ticket prices, as that is my go-to example for the asymmetric traveling salesman problem.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061.</p>
<p>D. Williamson, <em>ORIE 633 Network Flows Lecture 11</em>, 11 Oct 2007, <a href="https://people.orie.cornell.edu/dpw/orie633/LectureNotes/lecture11.pdf">https://people.orie.cornell.edu/dpw/orie633/LectureNotes/lecture11.pdf</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Sampling a Spanning Tree]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="related" type="text/html" title="Preliminaries for Sampling a Spanning Tree" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/sampling-a-spanning-tree/</id>
            
            
            <published>2021-07-28T00:00:00+00:00</published>
            <updated>2021-07-28T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Implementation details for sample_spanning_tree</blockquote><p>The heavy lifting I did in the preliminary post certainly paid off here!
In just one day I was able to implement <code>sample_spanning_tree</code> and its two helper functions.</p>
<h2 id="krichhoffs">krichhoffs<a class="headerlink" href="#krichhoffs" title="Link to this heading">#</a></h2>
<p>This was a very easy function to implement.
It followed exactly from the pseudo code and was working with <code>spanning_tree_distribution</code> before I started on <code>sample_spanning_tree</code>.</p>
<h2 id="sample_spanning_tree">sample_spanning_tree<a class="headerlink" href="#sample_spanning_tree" title="Link to this heading">#</a></h2>
<p>This function was more difficult than I originally anticipated.
The code for the main body of the function only needed minor tweaks to work with the specifics of Python, such as <code>shuffle</code> operating in place and returning <code>None</code>, and some details about how sets work.
For example, I add edge $e$ to $U$ before calling <code>prepare_graph</code> on it, and then invert the <code>if</code> condition so that $e$ is removed from $U$ in the other branch.
Those portions are functionally the same.
The issues I had with this function <em>all</em> stem from contracting multiple nodes in a row and how that affects the graph.</p>
<p>As a side note, the <code>contracted_edge</code> function in NetworkX is a wrapper for <code>contracted_nodes</code>, and the latter has a <code>copy</code> keyword argument that is assumed to be <code>True</code> by the former function.
It was a trivial change to extend this functionality to <code>contracted_edge</code>, but in the end I used <code>contracted_nodes</code> so the whole thing is moot.</p>
<p>First, recall how edge contraction, or in this case node contraction, works.
Two nodes are merged into a single node which inherits all of the edges that were incident to the original two.
Edges between those two nodes would become self loops, but in this case I prevented the creation of self loops as directed by Kulkarni.
If a node which is not being contracted has edges to both of the contracted nodes, we insert parallel edges between it and the merged node.
I struggled with NetworkX&rsquo;s API for the graph classes in a past post titled <a href="../entropy-distribution">The Entropy Distribution</a>.</p>
<p>For NetworkX&rsquo;s implementation, we would call <code>nx.contracted_nodes(G, u, v)</code> and <code>u</code> and <code>v</code> would always be merged into <code>u</code>, so <code>v</code> is the node which is no longer in the graph.</p>
<p>Now imagine that we have three edges to contract because they are all in $U$ which look like the following.</p>
<center><img src="multiple-contraction.png" alt="Example subgraph with multiple edges to contract"></center>
<p>If we process this from left to right, we first contract nodes 0 and 1.
At this point, the edge $\{1, 2\}$ no longer exists in $G$ as node 1 itself has been removed.
However, we would still need to contract the new $\{0, 2\}$ edge which is equivalent to the old $\{1, 2\}$ edge.</p>
<p>My first attempt to solve this was&hellip; messy, and it didn&rsquo;t work well.
I developed an <code>if-elif</code> chain over which endpoints of the contracting edge no longer existed in the graph and tried to use a dict comprehension to keep a dict of equivalent vertices up to date, without success.</p>
<p>Fortunately there was a better solution.
This next bit of code I actually first used in my Graph Algorithms class from last semester.
In particular it is the merge-find or disjoint set data structure from the components algorithm (code can be found <a href="https://github.com/mjschwenne/GraphAlgorithms/blob/main/src/Components.py">here</a> and more information about the data structure <a href="https://en.wikipedia.org/wiki/Disjoint-set_data_structure">here</a>).</p>
<p>Basically we create a mapping from a node to that node&rsquo;s representative.
In this case a node&rsquo;s representative is the node still in $G$ into which it has been merged through a series of contractions.
In the above example, once node 1 is merged into node 0, 0 would become node 1&rsquo;s representative.
We search recursively through the <code>merged_nodes</code> dict until we find a node which is not in the dict, meaning that it is still its own representative and therefore in the graph.
This will let us handle a representative node later being merged into another node.
Finally, we take advantage of path compression so that lookup times remain good as the number of entries in <code>merged_nodes</code> grows.</p>
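<p>A minimal version of that representative lookup (not the exact code from the pull request) might look like this, with the second loop performing the path compression:</p>

```python
def find_representative(node, merged_nodes):
    """Follow the chain of contractions until reaching a node that is
    still its own representative, i.e. still present in the graph."""
    root = node
    while root in merged_nodes:
        root = merged_nodes[root]
    # Path compression: point every node on the chain straight at the
    # root so that future lookups are a single dict access.
    while node in merged_nodes:
        next_node = merged_nodes[node]
        merged_nodes[node] = root
        node = next_node
    return root

# After merging node 1 into node 0 and then node 0 into node 2:
merged_nodes = {1: 0, 0: 2}
assert find_representative(1, merged_nodes) == 2
assert merged_nodes == {1: 2, 0: 2}  # chain compressed
```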
<p>This worked well once I caught a bug where the <code>prepare_graph</code> function tried to contract a node with itself.
However, the function was running and returning a result, but that result could have one or two more edges than needed, which of course means it is not a tree.
I was testing on the symmetric fractional Held Karp graph, by the way, so with six nodes each tree should have five edges.</p>
<p>I seeded the random number generator for one of the seven edge results and started to debug!
Recall that once we generate a uniform decimal between 0 and 1 we compare it to</p>
<p>$$
\lambda_e \times \frac{K_{G \backslash \{e\}}}{K_G}
$$</p>
<p>where $K$ is the result of Kirchhoff&rsquo;s Theorem on the subscripted graph.
One probability that caught my eye had the fractional component equal to 1.
This means that adding $e$ to the set of contracted edges had no effect on whether that edge should be included in the final spanning tree.
Closer inspection revealed that the edge $e$ in question could not be picked for the spanning tree at all: since it did not exist in $G$, it could not exist in $G \backslash \{e\}$ either.</p>
<p>Imagine the following situation.
We have three edges to contract but they form a cycle of length three.</p>
<center><img src="contraction-cycle.png" alt="Example of the contraction of a cycle in a subgraph"></center>
<p>If we contract $\{0, 1\}$ and then $\{0, 2\}$, what does that mean for $\{1, 2\}$?
Well, $\{1, 2\}$ would become a self loop on vertex 0, but we are deleting self loops so it cannot exist.
It has to have a probability of 0.
Yet in the current implementation of the function, it would have a probability of $\lambda_{\{1, 2\}}$.
So, I have to check whether a representative edge exists for the edge we are considering in the current iteration of the main for loop.</p>
<p>The solution is to return the merge-find data structure along with the prepared graph for $G$ and then check that an edge exists between the two representatives of the endpoints of the original edge.
If so, use the Kirchhoff value as normal, but if not, set <code>G_e_total_tree_weight</code> to zero so that this edge cannot be picked.
Finally I was able to sample trees from <code>G</code> consistently, but did they match the expected probabilities?</p>
<h2 id="testing-sample_spanning_tree">Testing sample_spanning_tree<a class="headerlink" href="#testing-sample_spanning_tree" title="Link to this heading">#</a></h2>
<p>The first test I wrote sampled one tree and checked that it was actually a tree.
I then expanded it to sample 1000 trees and make sure that they were all trees.
At this point I was confident that the function would always return a tree, but I still needed to check the tree distribution.</p>
<p>So, after a lot of difficulty writing the test itself to determine which of the 75 possible spanning trees I had sampled, I was ready to check the actual distribution.
First, the test iterates over all the spanning trees, records the products of edge weights and normalizes the data.
(Remember that the actual probability is only <em>proportional</em> to the product of edge weights).
Then I sample 50,000 trees and record the actual frequency of each.
Next, the test calculates the percent error from the expected probability to the observed frequency.
The sample size is so large because at 1000 trees the percent error was all over the place, but, as the Law of Large Numbers dictates, the larger sample shows the observed results converging to the expected ones, so I believe that the function is working.</p>
<p>That being said, seeing the percent error converge to be less than 15% for all 75 spanning trees is not a very rigorous test.
I can either implement a formal test using the percent error or try to create a Chi squared test using scipy.</p>
<h3 id="update-29-july-2021">Update! (29 July 2021)<a class="headerlink" href="#update-29-july-2021" title="Link to this heading">#</a></h3>
<p>This morning I was able to get a Chi squared test working and it was definitely the correct decision.
I was able to reduce the sample size from 50,000 to 1200, which is close to the minimum.
In order to run a Chi squared test you need an expected frequency of at least 5 in every category, so I had to find the number of samples that guarantees this for a tree with a probability of about 0.4%, which was 1163; I rounded it up to 1200.</p>
<p>I am testing at the 0.01 significance level, so this test may fail without reason 1% of the time, but it is still an overall good test of the distribution.</p>
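<p>For the record, the statistic itself is easy to compute by hand (the counts below are invented, not my actual samples); <code>scipy.stats.chisquare</code> computes the same statistic along with a p-value:</p>

```python
# Made-up observed counts for five categories with a uniform
# expectation of 200 each (1000 samples total).
observed = [210, 180, 195, 215, 200]
expected = [200, 200, 200, 200, 200]

# Pearson's chi-squared statistic: sum of (obs - exp)^2 / exp.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# The critical value for len(observed) - 1 = 4 degrees of freedom at
# the 0.01 significance level is 13.277, so we fail to reject here.
reject = chi_sq > 13.277
```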
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, SODA ’10,
Society for Industrial and Applied Mathematics, 2010, pp. 379-389, <a href="https://dl.acm.org/doi/abs/10.5555/1873601.1873633">https://dl.acm.org/doi/abs/10.5555/1873601.1873633</a>.</p>
<p>V. G. Kulkarni, <em>Generating random combinatorial objects</em>, Journal of algorithms, 11 (1990), pp. 185–207.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Preliminaries for Sampling a Spanning Tree]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="related" type="text/html" title="The Entropy Distribution" />
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/preliminaries-for-sampling-a-spanning-tree/</id>
            
            
            <published>2021-07-21T00:00:00+00:00</published>
            <updated>2021-07-21T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A close examination of the mathematics required to sample a random spanning tree from a graph</blockquote><p>In order to test the exponential distribution that I generate using <code>spanning_tree_distribution</code>, I need to be able to sample a tree from the distribution.
The primary citation used in the Asadpour paper is <em>Generating Random Combinatorial Objects</em> by V. G. Kulkarni (1990).
While I was not able to find an online copy of this article, the Michigan Tech library did have a copy that I was able to read.</p>
<h2 id="does-the-kulkarni-algorithm-work-with-asadpour">Does the Kulkarni Algorithm work with Asadpour?<a class="headerlink" href="#does-the-kulkarni-algorithm-work-with-asadpour" title="Link to this heading">#</a></h2>
<p>Kulkarni gave a general overview of the algorithm in Section 2, but Section 5 is titled &ldquo;Random Spanning Trees&rdquo; and starts on page 200.
First, let&rsquo;s check that the preliminaries for the Kulkarni paper on page 200 match the Asadpour algorithm.</p>
<blockquote>
<p>Let $G = (V, E)$ be an undirected network of $M$ nodes and $N$ arcs&hellip;
Let $\mathfrak{B}$ be the set of all spanning trees in $G$.
Let $\alpha_i$ be the positive weight of arc $i \in E$.
Define the weight $w(B)$ of a spanning tree $B \in \mathfrak{B}$ as</p>
<p>$$w(B) = \prod_{i \in B} \alpha_i$$</p>
<p>Also define</p>
<p>$$n(G) = \sum_{B \in \mathfrak{B}} w(B)$$</p>
<p>In this section we describe an algorithm to generate $B \in \mathfrak{B}$ so that</p>
<p>$$P\{B \text{ is generated}\} = \frac{w(B)}{n(G)}$$</p>
</blockquote>
<p>Immediately we can see that $\mathfrak{B}$ is the same as $\mathcal{T}$ from the Asadpour paper, the set of all spanning trees.
The weight of each edge is $\alpha_i$ for Kulkarni and $\lambda_e$ to Asadpour.
As for the product of the weights of the graph being the probability, the Asadpour paper states on page 382</p>
<blockquote>
<p>Given $\lambda_e \geq 0$ for $e \in E$, a <em>$\lambda$-random tree</em> $T$ of $G$ is a tree $T$ chosen from the set of all spanning trees of $G$ with probability proportional to $\prod_{e \in T} \lambda_e$.</p>
</blockquote>
<p>So this is not a concern.
Finally, $n(G)$ can be written as</p>
<p>$$\sum_{T \in \mathcal{T}} \prod_{e \in T} \lambda_e$$</p>
<p>which does appear several times throughout the Asadpour paper.
Thus the preliminaries between the Kulkarni and Asadpour papers align.</p>
<h2 id="the-kulkarni-algorithm">The Kulkarni Algorithm<a class="headerlink" href="#the-kulkarni-algorithm" title="Link to this heading">#</a></h2>
<p>The specialized version of the general algorithm which Kulkarni gives is Algorithm A8 on page 202.</p>
<blockquote>
<p>$U = \emptyset$, $V = E$<br>
Do $i = 1$ to $N$;<br>
$\qquad$Let $a = n(G(U, V))$<br>
$\qquad\qquad a' = n(G(U \cup \{i\}, V))$<br>
$\qquad$Generate $Z \sim U[0, 1]$<br>
$\qquad$If $Z \leq \alpha_i \times \left(a' / a\right)$<br>
$\qquad\qquad$then $U = U \cup \{i\}$,<br>
$\qquad\qquad$else $V = V - \{i\}$<br>
$\qquad$end.<br>
Stop. $U$ is the required spanning tree.</p>
</blockquote>
<p>Now we have to understand this algorithm so we can create pseudo code for it.
First as a notational explanation, the statement &ldquo;Generate $Z \sim U[0, 1]$&rdquo; means picking a uniformly random variable over the interval $[0, 1]$ which is independent of all the random variables generated before it (See page 188 of Kulkarni for more information).
The built-in python module <a href="https://docs.python.org/3/library/random.html"><code>random</code></a> can be used here.
Looking at real-valued distributions, I believe that using <code>random.uniform(0, 1)</code> is preferable to <code>random.random()</code>, since the latter can never generate a 1 and that endpoint is explicitly part of the interval discussed in the Kulkarni paper.</p>
<p>The other notational oddity is statements similar to $G(U, V)$, which in this case does not refer to a graph with vertex set $U$ and edge set $V$: here $U$ and $V$ are both subsets of the full edge set $E$.</p>
<p>$G(U, V)$ is defined in the Kulkarni paper on page 201 as</p>
<blockquote>
<p>Let $G(U, V)$ be a subgraph of $G$ obtained by deleting arcs that are not in $V$, and collapsing arcs that are in $U$ (i.e., identifying the end nodes of arcs in $U$) and deleting all self-loops resulting from these deletions and collapsing.</p>
</blockquote>
<p>This language seems a bit&hellip; clunky, especially for the edges in $U$.
In this case, &ldquo;collapsing arcs that are in $U$&rdquo; would be contracting those edges without self loops.
Fortunately, this functionality is a part of NetworkX using <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.minors.contracted_edge.html#networkx.algorithms.minors.contracted_edge"><code>networkx.algorithms.minors.contracted_edge</code></a> with the <code>self_loops</code> keyword argument set to <code>False</code>.</p>
<p>As for the edges in $E - V$, this can be easily accomplished by using <a href="https://networkx.org/documentation/stable/reference/classes/generated/networkx.MultiGraph.remove_edges_from.html"><code>networkx.MultiGraph.remove_edges_from</code></a>.</p>
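<p>A small demonstration of this call (on a throwaway triangle, not the algorithm's graph): contracting one edge with <code>self_loops=False</code> merges its endpoints, keeps the remaining edges as parallels, and drops the would-be self loop:</p>

```python
import networkx as nx

# Triangle as a multigraph; contract the edge (0, 1).
G = nx.MultiGraph([(0, 1), (1, 2), (0, 2)])
H = nx.contracted_edge(G, (0, 1), self_loops=False)

# Node 1 is merged into node 0; both remaining edges now run between
# the merged node 0 and node 2, and no self loop is created.
```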
<p>Once we have generated $G(U, V)$, we need to find $n(G(U, V))$.
This can be done with something we are already familiar with: Kirchhoff&rsquo;s Tree Matrix Theorem.
All we need to do is create the Laplacian matrix and then find the determinant of the first cofactor.
This code will probably be taken directly from the <code>spanning_tree_distribution</code> function.
Actually, this is a place to create a broader helper function called <code>krichhoffs</code> which will take a graph and return the total weight of its spanning trees, which can then be used as part of <code>q</code> in <code>spanning_tree_distribution</code> and in <code>sample_spanning_tree</code>.</p>
<p>From here we compare $Z$ to $\alpha_i \left(a' / a\right)$ to see if that edge is added to the tree or discarded.
Understanding the process of the algorithm gives context to the meaning of $U$ and $V$.
$U$ is the set of edges which we have decided to include in the spanning tree while $V$ is the set of edges yet to be considered for $U$ (roughly speaking).</p>
<p>Now there is still a bit of ambiguity in the algorithm that Kulkarni gives, mainly about $i$.
In the loop condition, $i$ is an integer from 1 to $N$, the number of arcs in the graph, but it is later added to $U$ so it has to be an edge.
Referencing the Asadpour paper, it starts its description of sampling the $\lambda$-random tree on page 383 by saying &ldquo;The idea is to order the edges $e_1, \dots, e_m$ of $G$ arbitrarily and process them one by one&rdquo;.
So I believe that the edge interpretation is correct and the integer notation used in Kulkarni assumes that the edges have been mapped to $\{1, 2, \dots, N\}$.</p>
<h2 id="sample_spanning_tree-pseudo-code">sample_spanning_tree pseudo code<a class="headerlink" href="#sample_spanning_tree-pseudo-code" title="Link to this heading">#</a></h2>
<p>Time to write some pseudo code!
Starting with the function signature</p>

<div class="highlight">
  <pre>def sample_spanning_tree
    Input: A multigraph G whose edges contain a lambda value stored at lambda_key
    Output: A new graph which is a spanning tree of G</pre>
</div>

<p>Next up is a bit of initialization</p>

<div class="highlight">
  <pre>    U = set()
    V = set(G.edges)
    shuffled_edges = shuffle(G.edges)</pre>
</div>

<p>Now the definitions of <code>U</code> and <code>V</code> come directly from Algorithm A8, but <code>shuffled_edges</code> is new.
My thoughts are that this will be what we use for $i$.
We shuffle the edges of the graph and then in the loop we iterate over the edges within <code>shuffled_edges</code>.
Next we have the loop.</p>

<div class="highlight">
  <pre>    for edge e in shuffled_edges
        G_total_tree_weight = krichhoffs(prepare_graph(G, U, V))
        G_e_total_tree_weight = krichhoffs(prepare_graph(G, U + {e}, V))
        z = uniform(0, 1)
        if z &lt;= e[lambda_key] * G_e_total_tree_weight / G_total_tree_weight
            U = U + {e}
            if len(U) == G.number_of_nodes - 1
                # Spanning tree complete, no need to continue to consider edges.
                spanning_tree = nx.Graph()
                spanning_tree.add_edges_from(U)
                return spanning_tree
        else
            V = V - {e}</pre>
</div>

<p>The main loop body uses two other functions which are not part of the standard NetworkX library, <code>krichhoffs</code> and <code>prepare_graph</code>.
As I mentioned before, <code>krichhoffs</code> applies Kirchhoff&rsquo;s Theorem to the graph.
Pseudo code for it is below, strongly based on the existing code in <code>q</code> of <code>spanning_tree_distribution</code>, which will be updated to use this new helper.</p>

<div class="highlight">
  <pre>def krichhoffs
    Input: A multigraph G and weight key, weight
    Output: The total weight of the graph&#39;s spanning trees

    G_laplacian = laplacian_matrix(G, weight=weight)
    G_laplacian = G_laplacian.delete(0, 0)
    G_laplacian = G_laplacian.delete(0, 1)

    return det(G_laplacian)</pre>
</div>
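<p>A runnable version of this helper could look like the sketch below; to keep it self-contained it builds the weighted Laplacian by hand with NumPy rather than calling <code>laplacian_matrix</code>, but the computation is the same:</p>

```python
import networkx as nx
import numpy as np

def krichhoffs(G, weight="weight"):
    """Total weight of G's spanning trees, via the determinant of the
    first cofactor of the weighted Laplacian (Kirchhoff's theorem)."""
    index = {n: i for i, n in enumerate(G)}
    L = np.zeros((len(index), len(index)))
    for u, v, w in G.edges(data=weight, default=1):
        i, j = index[u], index[v]
        L[i, j] -= w  # parallel edges accumulate, as required
        L[j, i] -= w
        L[i, i] += w
        L[j, j] += w
    # Delete the first row and column, then take the determinant.
    return float(round(np.linalg.det(L[1:, 1:])))

# Sanity check: K4 has 4^{4-2} = 16 spanning trees (Cayley's formula).
print(krichhoffs(nx.complete_graph(4)))  # 16.0
```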

<p>The process for the other helper, <code>prepare_graph</code> is also given.</p>

<div class="highlight">
  <pre>def prepare_graph
    Input: A graph G, set of contracted edges U and edges which are not removed V
    Output: A subgraph of G in which all edges in U are contracted and edges not in V are
			removed

    result = G.copy()
    edges_to_remove = set(result.edges).difference(V)
    result.remove_edges_from(edges_to_remove)

    for edge e in U
        result = nx.contracted_edge(result, e, self_loops=False)

    return result</pre>
</div>
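<p>As a simple-graph sketch of this helper (a multigraph version would also need edge keys and tracking of merged endpoints), the two steps translate directly:</p>

```python
import networkx as nx

def prepare_graph(G, U, V):
    """Return G(U, V): delete the edges outside V, then contract the
    edges of U, discarding any resulting self loops."""
    result = G.copy()
    result.remove_edges_from(set(result.edges()) - set(V))
    for u, v in U:
        # An edge may vanish once an earlier contraction merges its
        # endpoints, so guard before contracting.
        if result.has_edge(u, v):
            result = nx.contracted_edge(result, (u, v), self_loops=False)
    return result

# Square with a diagonal; contract (0, 1) and keep every edge in V.
G = nx.Graph([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])
H = prepare_graph(G, U={(0, 1)}, V=set(G.edges()))
```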

<p>There is one other change to the NetworkX API that I would like to make.
At the moment, <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.minors.contracted_edge.html"><code>networkx.algorithms.minors.contracted_edge</code></a> is programmed to always return a copy of a graph.
Since I need to contract multiple edges in a row, it would make a lot more sense to do the contraction in place.
I would like to add an optional keyword argument to <code>contracted_edge</code> called <code>copy</code> which will default to <code>True</code> so that the overall functionality will not change but I will be able to perform in place contractions.</p>
<h2 id="next-steps">Next Steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>The most obvious one is to implement the functions that I have laid out in the pseudo code step, but testing is still a concerning area.
My best bet is to sample, say, 1000 trees and check that the frequency of each tree is proportional to the product of the lambdas on its edges.</p>
<p>That actually just caused me to think of a new test for <code>spanning_tree_distribution</code>.
If I generate the distribution and then iterate over all of the spanning trees with a <code>SpanningTreeIterator</code>, I can sum the total probability of each tree being sampled, and if that is not 1 (or very close to it) then I do not have a valid distribution over the spanning trees.</p>
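<p>That check can be sketched on a toy triangle (with invented $\lambda$ values stored under the <code>weight</code> key for simplicity): the products over each spanning tree, once normalized, must sum to one.</p>

```python
import math
import networkx as nx

# Toy graph with made-up lambda values stored as edge weights.
G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 1), (1, 2, 2), (0, 2, 3)])

# Product of edge values for every spanning tree of the triangle.
tree_weights = [
    math.prod(w for _, _, w in T.edges(data="weight"))
    for T in nx.SpanningTreeIterator(G)
]

total = sum(tree_weights)  # the normalizing constant
probabilities = [w / total for w in tree_weights]
```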
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, SODA ’10,
Society for Industrial and Applied Mathematics, 2010, pp. 379-389, <a href="https://dl.acm.org/doi/abs/10.5555/1873601.1873633">https://dl.acm.org/doi/abs/10.5555/1873601.1873633</a>.</p>
<p>V. G. Kulkarni, <em>Generating random combinatorial objects</em>, Journal of algorithms, 11 (1990), pp. 185–207.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[The Entropy Distribution]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="related" type="text/html" title="Entropy Distribution Setup" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/entropy-distribution/</id>
            
            
            <published>2021-07-20T00:00:00+00:00</published>
            <updated>2021-07-20T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Details on implementing the entropy distribution</blockquote><p>Implementing <code>spanning_tree_distribution</code> proved to have some NetworkX difficulties and one algorithmic difficulty.
Recall that the algorithm for creating the distribution is given in the Asadpour paper as</p>
<blockquote>
<ol>
<li>Set $\gamma = \vec{0}$.</li>
<li>While there exists an edge $e$ with $q_e(\gamma) &gt; (1 + \epsilon) z_e$:
<ul>
<li>Compute $\delta$ such that if we define $\gamma&rsquo;$ as $\gamma_e&rsquo; = \gamma_e - \delta$, and $\gamma_f&rsquo; = \gamma_f$ for all $f \in E\ \backslash {e}$, then $q_e(\gamma&rsquo;) = (1 + \epsilon/2)z_e$.</li>
<li>Set $\gamma \leftarrow \gamma&rsquo;$.</li>
</ul>
</li>
<li>Output $\tilde{\gamma} := \gamma$.</li>
</ol>
</blockquote>
<p>Now, the procedure that I laid out in my last blog titled <a href="../entropy-distribution-setup">Entropy Distribution Setup</a> worked well for the while loop portion.
All of my difficulties with the NetworkX API happened in the <code>q</code> inner function.</p>
<p>After I programmed the function, I of course needed to run it and at first I was just printing the <code>gamma</code> dict out so that I could see what the values for each edge were.
My first test uses the symmetric fractional Held-Karp solution and, to my surprise, every value of $\gamma$ came back as 0.
I didn&rsquo;t think that this was intended behavior because if it was, there would be no reason to include this step in the overall Asadpour algorithm, so I started to dig around the code with PyCharm&rsquo;s debugger.
The results were, as I suspected, not correct.
I was running Kirchhoff&rsquo;s matrix tree theorem on the original graph, so the returned probabilities were an order of magnitude smaller than the values of $z_e$ that I was comparing them to.
Additionally, all of the values were the same so I knew that this was a problem and not that the first edge I checked had unusually small probabilities.</p>
<p>So, I returned to the Asadpour paper and started to ask myself questions like</p>
<ul>
<li>Do I need to normalize the Held Karp answer in some way?</li>
<li>Do I need to consider edges outside of $E$ (the undirected support of the Held Karp relaxation solution) or only work with the edges in $E$?</li>
</ul>
<p>It was pretty easy to dismiss the first question: if normalization were required, it would be mentioned in the Asadpour paper, and without a description of how to normalize, my chances of finding the `correct&rsquo; way to do so would be next to impossible.
The second question did take some digging.
The sections of the Asadpour paper which talk about using Kirchhoff&rsquo;s theorem all discuss it using the graph $G$, which is why I was originally using all edges in $G$ rather than only the edges in $E$.
A few hints pointed to the fact that I needed to only consider the edges in $E$, the first being the algorithm overview which states</p>
<blockquote>
<p>Find weights ${\tilde{\gamma}}_{e \in E}$</p>
</blockquote>
<p>In particular the $e \in E$ statement says that I do not need to consider the edges which are not in $E$.
Secondly, Lemma 7.2 starts by stating</p>
<blockquote>
<p>Let $G = (V, E)$ be a graph with weights $\gamma_e$ for $e \in E$</p>
</blockquote>
<p>Based on the current state of the function and these hints, I decided to reduce the input graph to <code>spanning_tree_distribution</code> to only edges with $z_e &gt; 0$.
Running the test on the symmetric fractional solution now, it still returned $\gamma = \vec{0}$ but the probabilities it was comparing were much closer during that first iteration.
Due to the fact that I do not have an example graph and distribution to work with, this could be the correct answer, but the fact that every value was the same still confused me.</p>
<p>My next step was to determine the actual probability of an edge being in the spanning trees for the first iteration when $\gamma = \vec{0}$.
This can be easily done with my <code>SpanningTreeIterator</code> and exploits the fact that $\gamma = \vec{0}$ implies $\lambda_e = 1\ \forall\ e \in E$, so we can just iterate over the spanning trees and count how often each edge appears.</p>
<p>That script is listed below</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">edges</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">from_edgelist</span><span class="p">(</span><span class="n">edges</span><span class="p">,</span> <span class="n">create_using</span><span class="o">=</span><span class="n">nx</span><span class="o">.</span><span class="n">Graph</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">edge_frequency</span> <span class="o">=</span> <span class="p">{}</span>
</span></span><span class="line"><span class="cl"><span class="n">sp_count</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">tree</span> <span class="ow">in</span> <span class="n">nx</span><span class="o">.</span><span class="n">SpanningTreeIterator</span><span class="p">(</span><span class="n">G</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">sp_count</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="n">tree</span><span class="o">.</span><span class="n">edges</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">e</span> <span class="ow">in</span> <span class="n">edge_frequency</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge_frequency</span><span class="p">[</span><span class="n">e</span><span class="p">]</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge_frequency</span><span class="p">[</span><span class="n">e</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">edge_frequency</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="nb">print</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="sa">f</span><span class="s2">&#34;(</span><span class="si">{</span><span class="n">u</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">v</span><span class="si">}</span><span class="s2">): </span><span class="si">{</span><span class="n">edge_frequency</span><span class="p">[(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">)]</span><span class="si">}</span><span class="s2"> / </span><span class="si">{</span><span class="n">sp_count</span><span class="si">}</span><span class="s2"> = </span><span class="si">{</span><span class="n">edge_frequency</span><span class="p">[(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">)]</span> <span class="o">/</span> <span class="n">sp_count</span><span class="si">}</span><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span></span></span></code></pre>
</div>
<p>This output revealed that the probabilities returned by <code>q</code> should vary from edge to edge and that the correct solution for $\gamma$ is certainly not $\vec{0}$.</p>

<div class="highlight">
  <pre>(networkx-dev) mjs@mjs-ubuntu:~/Workspace$ python3 spanning_tree_frequency.py
(0, 1): 40 / 75 = 0.5333333333333333
(0, 2): 40 / 75 = 0.5333333333333333
(0, 5): 45 / 75 = 0.6
(1, 4): 45 / 75 = 0.6
(2, 3): 45 / 75 = 0.6
(1, 2): 40 / 75 = 0.5333333333333333
(5, 3): 40 / 75 = 0.5333333333333333
(5, 4): 40 / 75 = 0.5333333333333333
(4, 3): 40 / 75 = 0.5333333333333333</pre>
</div>

<p>Let&rsquo;s focus on that first edge, $(0, 1)$.
My brute force script says that it appears in 40 of the 75 spanning trees of the below graph where each edge is labelled with its $z_e$ value.</p>
<center><img src="test-graph-z-e.png" alt="probabilities over the example graph"/></center>
<p>Yet <code>q</code> was saying that the edge was in 24 of 75 spanning trees.
Since the denominator was correct, I decided to focus on the numerator which is the number of spanning trees in $G\ \backslash\ \{(0, 1)\}$.
That graph would be the following.</p>
<center><img src="contracted-graph.png" alt="contracting on the least likely edge"/></center>
<p>An argument can be made that this graph should have a self-loop on vertex 0, but a self-loop does not affect the Laplacian matrix in any way, so it is omitted here.
Basically, the $[0, 0]$ entry of the adjacency matrix would be 1 and the degree of vertex 0 would be 5, and $5 - 1 = 4$, which is exactly the entry we get without the self-loop.</p>
<p>What was happening was that I was giving <code>nx.contracted_edge</code> a graph of the Graph class (not a directed graph since $E$ is undirected) and was getting a graph of the Graph class back.
The Graph class does not support multiple edges between two nodes so the returned graph only had one edge between node 0 and node 2 which was affecting the overall Laplacian matrix and thus the number of spanning trees.
Switching from a Graph to a MultiGraph did the trick, but this subtle change should be mentioned in the NetworkX documentation for the function, linked <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.minors.contracted_edge.html?">here</a>.
I definitely believed that if I contracted an edge, the output should automatically include both of the $(0, 2)$ edges.
An argument can be made for changing the default behavior to match this, but at the very least the documentation should explain this problem.</p>
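<p>A minimal example of the pitfall (my own, not from the NetworkX docs) makes the difference concrete: contracting an edge of a <code>Graph</code> silently merges the resulting parallel edges, while a <code>MultiGraph</code> keeps both.</p>

```python
import networkx as nx

# Triangle graph: contracting (0, 1) should leave two parallel (0, 2) edges.
G = nx.Graph([(0, 1), (0, 2), (1, 2)])

simple = nx.contracted_edge(G, (0, 1), self_loops=False)
multi = nx.contracted_edge(nx.MultiGraph(G), (0, 1), self_loops=False)

# The Graph result collapses the parallel edges, skewing the Laplacian;
# the MultiGraph result keeps both.
print(simple.number_of_edges())  # 1
print(multi.number_of_edges())   # 2
```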
<p>Now the <code>q</code> function was returning the correct $40 / 75$ answer for $(0, 1)$ and correct values for the rest of the edges so long as all of the $\gamma_e$&rsquo;s were 0.
But the test was erroring out with a <code>ValueError</code> when I tried to compute $\delta$.
<code>q</code> was returning a probability of an edge being in a sampled spanning tree of more than 1, which is clearly impossible but also caused the denominator of $\delta$ to become negative and violate the domain of the natural log.</p>
<p>During my investigation of this problem, I noticed that after computing $\delta$ and subtracting it from $\gamma_e$, it did not have the desired effect on $q_e$.
Recall that we define $\delta$ so that $\gamma_e - \delta$ yields a $q_e$ of $(1 + \epsilon / 2) z_e$.
In other words, the effect of $\delta$ is to decrease an edge probability which is too high, but in my current implementation it was having the opposite effect.
The value of $q_{(0, 1)}$ was going from 0.5333 to just over 0.6.
If I let this trend continue, the program would eventually hit one of those cases where $q_e \geq 1$ and crash.</p>
<p>Here I can use edge $(0, 1)$ as an example to show the problem.
The original Laplacian matrix for $G$ with $\gamma = \vec{0}$ is</p>
<p>$$
\begin{bmatrix}
3 &amp; -1 &amp; -1 &amp; 0 &amp; 0 &amp; -1 \\\
-1 &amp; 3 &amp; -1 &amp; 0 &amp; -1 &amp; 0 \\\
-1 &amp; -1 &amp; 3 &amp; -1 &amp; 0 &amp; 0 \\\
0 &amp; 0 &amp; -1 &amp; 3 &amp; -1 &amp; -1 \\\
0 &amp; -1 &amp; 0 &amp; -1 &amp; 3 &amp; -1 \\\
-1 &amp; 0 &amp; 0 &amp; -1 &amp; -1 &amp; 3
\end{bmatrix}
$$</p>
<p>and the Laplacian for $G\ \backslash\ \{(0, 1)\}$ is</p>
<p>$$
\begin{bmatrix}
4 &amp; -2 &amp; -1 &amp; -1 &amp; 0 \\\
-2 &amp; 3 &amp; 0 &amp; 0 &amp; -1 \\\
-1 &amp; 0 &amp; 3 &amp; -1 &amp; -1 \\\
-1 &amp; 0 &amp; -1 &amp; 3 &amp; -1 \\\
0 &amp; -1 &amp; -1 &amp; -1 &amp; 3
\end{bmatrix}
$$</p>
<p>The determinant of the first cofactor is how we get the $40 / 75$.
Now consider the Laplacian matrices after we updated $\gamma_{(0, 1)}$ for the first time.
The one for $G$ becomes</p>
<p>$$
\begin{bmatrix}
2.74 &amp; -0.74 &amp; -1 &amp; 0 &amp; 0 &amp; -1 \\\
-0.74 &amp; 2.74 &amp; -1 &amp; 0 &amp; -1 &amp; 0 \\\
-1 &amp; -1 &amp; 3 &amp; -1 &amp; 0 &amp; 0 \\\
0 &amp; 0 &amp; -1 &amp; 3 &amp; -1 &amp; -1 \\\
0 &amp; -1 &amp; 0 &amp; -1 &amp; 3 &amp; -1 \\\
-1 &amp; 0 &amp; 0 &amp; -1 &amp; -1 &amp; 3
\end{bmatrix}
$$</p>
<p>and its first cofactor determinant is reduced from 75 to 61.6.
What do we expect the value of the matrix for $G\ \backslash\ \{(0, 1)\}$ to be?
Well, we know that the final value of $q_e$ needs to be $(1 + \epsilon / 2) z_e$ or $1.1 \times 0.41\overline{6}$ which is $0.458\overline{3}$.
So</p>
<p>$$
\begin{array}{r c l}
\displaystyle\frac{x}{61.6} &amp;=&amp; 0.458\overline{3} \\\
x &amp;=&amp; 28.2\overline{3}
\end{array}
$$</p>
<p>and the value of the first cofactor determinant should be $28.2\overline{3}$.
However, the contracted Laplacian for $(0, 1)$ after the value of $\gamma_e$ is updated is</p>
<p>$$
\begin{bmatrix}
4 &amp; -2 &amp; -1 &amp; -1 &amp; 0 \\\
-2 &amp; 3 &amp; 0 &amp; 0 &amp; -1 \\\
-1 &amp; 0 &amp; 3 &amp; -1 &amp; -1 \\\
-1 &amp; 0 &amp; -1 &amp; 3 &amp; -1 \\\
0 &amp; -1 &amp; -1 &amp; -1 &amp; 3
\end{bmatrix}
$$</p>
<p>the <strong>same as before!</strong>
The only edge with a different $\gamma_e$ than before is $(0, 1)$, but since it is the contracted edge it is no longer in the graph any more and thus cannot affect the value of the first cofactor&rsquo;s determinant!</p>
<p>But if we change the algorithm to add $\delta$ to $\gamma_e$ rather than subtract it, the determinant of the first cofactor of the Laplacian of $G\ \backslash\ \{e\}$ will not change, but the determinant of the first cofactor of the Laplacian of $G$ will increase.
This reduces the overall probability of picking $e$ in a spanning tree.
And, if we happen to use the same formula for $\delta$ as before for our example of $(0, 1)$ then $q_{(0, 1)}$ becomes $0.449307$.
Recall our target value of $0.458\overline{3}$.
This answer has about a $-1.97\%$ error.</p>
<p>$$
\begin{array}{r c l}
\text{error} &amp;=&amp; \frac{0.449307 - 0.458333}{0.458333} \times 100 \\\
&amp;=&amp; \frac{-0.009026}{0.458333} \times 100 \\\
&amp;=&amp; -0.019693 \times 100 \\\
&amp;=&amp; -1.9693\%
\end{array}
$$</p>
<p>Also, the test now completes without error.</p>
<h2 id="update-28-july-2021">Update! (28 July 2021)<a class="headerlink" href="#update-28-july-2021" title="Link to this heading">#</a></h2>
<p>Further research and discussion with my mentors revealed just how flawed my original analysis was.
In the next step, sampling the spanning trees, adding anything to $\gamma$ would directly increase the probability that the edge would be sampled.
That being said, the original problem that I found was still an issue.</p>
<p>Going back to the notion that we want a graph whose spanning trees map to the spanning trees of $G$ which contain the desired edge, this is still the key idea that lets us use Kirchhoff&rsquo;s matrix tree theorem.
And contracting the edge will still give a graph in which every spanning tree can be mapped to a corresponding spanning tree which includes $e$.
However, the weights of those spanning trees in $G \backslash \{e\}$ do not quite map between the two graphs.</p>
<p>Recall that we are dealing with a multiplicative weight function, so the final weight of a tree is the product of all the $\lambda$&rsquo;s on its edges.</p>
<p>$$
c(T) = \prod_{e \in T} \lambda_e
$$</p>
<p>The above statement can be expanded into</p>
<p>$$
c(T) = \lambda_{e_1} \times \lambda_{e_2} \times \dots \times \lambda_{e_{n-1}}
$$</p>
<p>with some arbitrary ordering of the edges $1, 2, \dots |E|$.
Because the ordering of the edges is arbitrary and due to the associative property of multiplication, we can assume without loss of generality that the desired edge $e$ is the last one in the sequence.</p>
<p>Any spanning tree in $G \backslash \{e\}$ cannot include that last $\lambda$ in it because that edge does not exist in the graph.
Therefore, in order to convert the weight of a tree in $G \backslash \{e\}$ into the weight of its counterpart in $G$, we need to multiply $\lambda_e$ back into the weight of the contracted tree.
So, we can now state that</p>
<p>$$
c(T \in \mathcal{T}: T \ni e) = \lambda_e \prod_{f \in T} \lambda_f\ \forall\ T \in G \backslash \{e\}
$$</p>
<p>or that for all trees in $G \backslash \{e\}$, the cost of the corresponding tree in $G$ is the product of its edge $\lambda$&rsquo;s times the weight of the desired edge.
Now recall that $q_e(\gamma)$ is</p>
<p>$$
\frac{\sum_{T \ni e} \exp(\gamma(T))}{\sum_{T \in \mathcal{T}} \exp(\gamma(T))}
$$</p>
<p>In particular we are dealing with the numerator of the above fraction and using $\lambda_e = \exp(\gamma_e)$ we can rewrite it as</p>
<p>$$
\sum_{T \ni e} \exp(\gamma(T)) = \sum_{T \ni e} \prod_{f \in T} \lambda_f
$$</p>
<p>Since we now know that we are missing the $\lambda_e$ term, we can add it into the expression.</p>
<p>$$
\sum_{T \ni e} \lambda_e \times \prod_{f \in T, f \not= e} \lambda_f
$$</p>
<p>Using the rules of summation, we can pull the $\lambda_e$ factor out of the summation to get</p>
<p>$$
\lambda_e \times \sum_{T \ni e} \prod_{f \in T, f \not= e} \lambda_f
$$</p>
<p>And since applying Kirchhoff&rsquo;s theorem to $G \backslash \{e\}$ yields everything except the factor of $\lambda_e$, we can just multiply it back in manually.
This lets the pseudocode for <code>q</code> become</p>

<div class="highlight">
  <pre>def q
    input: e, the edge of interest

    # Create the laplacian matrices
    write lambda = exp(gamma) into the edges of G
    G_laplace = laplacian(G, lambda)
    G_e = nx.contracted_edge(G, e)
    G_e_laplace = laplacian(G_e, lambda)

    # Delete a row and column from each matrix to make a cofactor matrix
    G_laplace.delete((0, 0))
    G_e_laplace.delete((0, 0))

    # Calculate the determinant of the cofactor matrices
    det_G_laplace = G_laplace.det
    det_G_e_laplace = G_e_laplace.det

    # return q_e
    return lambda_e * det_G_e_laplace / det_G_laplace</pre>
</div>
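<p>Translated into Python, that pseudocode might look like the following sketch (my own naming, using NumPy for the cofactor determinants; not the final NetworkX implementation):</p>

```python
import networkx as nx
import numpy as np


def q(G, e):
    """Probability that edge e appears in a sampled spanning tree of G.

    Assumes G is an undirected MultiGraph whose edges carry a "lambda"
    attribute equal to exp(gamma_e).
    """
    # Total lambda-weight over all spanning trees via Kirchhoff's matrix
    # tree theorem: delete row and column 0, take the determinant.
    L = nx.laplacian_matrix(G, weight="lambda").toarray()
    det_G = np.linalg.det(L[1:, 1:])

    # Same computation on the contracted graph, then multiply the
    # missing lambda_e back in as derived above.
    G_e = nx.contracted_edge(G, e, self_loops=False)
    L_e = nx.laplacian_matrix(G_e, weight="lambda").toarray()
    det_G_e = np.linalg.det(L_e[1:, 1:])

    lambda_e = G[e[0]][e[1]][0]["lambda"]
    return lambda_e * det_G_e / det_G


# The six-node example graph from this post with gamma = 0, so every
# lambda is 1; q((0, 1)) should come out to 40 / 75.
edges = [(0, 1), (0, 2), (0, 5), (1, 2), (1, 4), (2, 3), (3, 4), (3, 5), (4, 5)]
G = nx.MultiGraph(edges)
for u, v, k in G.edges(keys=True):
    G[u][v][k]["lambda"] = 1.0

print(round(q(G, (0, 1)), 4))  # 0.5333
```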

<p>Making this small change to <code>q</code> worked very well.
I was able to change back to subtracting $\delta$ as the Asadpour paper does, and even added a check to the code so that every time we update a value in $\gamma$ we know that $\delta$ has had the correct effect.


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="c1"># Check that delta had the desired effect</span>
</span></span><span class="line"><span class="cl"><span class="n">new_q_e</span> <span class="o">=</span> <span class="n">q</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">desired_q_e</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">+</span> <span class="n">EPSILON</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span> <span class="o">*</span> <span class="n">z_e</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="nb">round</span><span class="p">(</span><span class="n">new_q_e</span><span class="p">,</span> <span class="mi">8</span><span class="p">)</span> <span class="o">!=</span> <span class="nb">round</span><span class="p">(</span><span class="n">desired_q_e</span><span class="p">,</span> <span class="mi">8</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">raise</span> <span class="ne">Exception</span></span></span></code></pre>
</div>
<p>And the test passes without fail!</p>
<h2 id="whats-next">What&rsquo;s Next<a class="headerlink" href="#whats-next" title="Link to this heading">#</a></h2>
<p>I technically do not know if this distribution is correct until I can start to sample from it.
I have written the test I have been working with into a proper test, but since my oracle is the program itself, the only way it can fail is if I change the function&rsquo;s behavior without knowing it.</p>
<p>So I must press onwards to write <code>sample_spanning_tree</code> and get a better test for both of those functions.</p>
<p>As for the tests of <code>spanning_tree_distribution</code>, I would of course like to add more test cases.
However, if the Held-Karp relaxation returns a cycle as an answer, then there will only be $n$ spanning trees, each of them a path, and creating this distribution is moot in the first place since we have already found a solution to the ATSP.
I really need more truly fractional Held-Karp solutions to expand the tests of these next two functions.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043&ndash;1061.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Entropy Distribution Setup]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="Finalizing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/entropy-distribution-setup/</id>
            
            
            <published>2021-07-13T00:00:00+00:00</published>
            <updated>2021-07-13T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Preliminaries for the entropy distribution over spanning trees</blockquote><p>Finally moving on from the Held Karp relaxation, we arrive at the second step of the Asadpour asymmetric traveling salesman problem algorithm.
Referencing Algorithm 1 from the Asadpour paper, we are now <em>finally</em> on step two.</p>
<blockquote>
<hr>
<p><strong>Algorithm 1</strong> An $O(\log n / \log \log n)$-approximation algorithm for the ATSP</p>
<hr>
<p><strong>Input:</strong> A set $V$ consisting of $n$ points and a cost function $c\ :\ V \times V \rightarrow \mathbb{R}^+$ satisfying the triangle inequality.</p>
<p><strong>Output:</strong> $O(\log n / \log \log n)$-approximation of the asymmetric traveling salesman problem instance described by $V$ and $c$.</p>
<ol>
<li>Solve the Held-Karp LP relaxation of the ATSP instance to get an optimum extreme point solution $x^*$.
Define $z^*$ as in (5), making it a symmetrized and scaled down version of $x^*$.
Vector $z^*$ can be viewed as a point in the spanning tree polytope of the undirected graph on the support of $x^*$ that one obtains after disregarding the directions of arcs (See Section 3.)</li>
<li>Let $E$ be the support graph of $z^*$ when the direction of the arcs are disregarded.
Find weights ${\tilde{\gamma}}_{e \in E}$ such that the exponential distribution on the spanning trees, $\tilde{p}(T) \propto \exp(\sum_{e \in T} \tilde{\gamma}_e)$ (approximately) preserves the marginals imposed by $z^*$, i.e. for any edge $e \in E$,
$$\sum_{T \in \mathcal{T} : T \ni e} \tilde{p}(T) \leq (1 + \epsilon) z^*_e$$
for a small enough value of $\epsilon$.
(In this paper we show that $\epsilon = 0.2$ suffices for our purpose. See Section 7 and 8 for a description of how to compute such a distribution.)</li>
<li>Sample $2\lceil \log n \rceil$ spanning trees $T_1, \dots, T_{2\lceil \log n \rceil}$ from $\tilde{p}(.)$.
For each of these trees, orient all its edges so as to minimize its cost with respect to our (asymmetric) cost function $c$.
Let $T^*$ be the tree whose resulting cost is minimal among all of the sampled trees.</li>
<li>Find a minimum cost integral circulation that contains the oriented tree $\vec{T}^*$.
Shortcut this circulation to a tour and output it. (See Section 4.)</li>
</ol>
<hr>
</blockquote>
<p>Sections 7 and 8 provide two different methods to find the desired probability distribution, with section 7 using a combinatorial approach and section 8 the ellipsoid method.
Considering that there is no ellipsoid solver in the scientific Python ecosystem, and that my mentors and I have already decided not to implement one within this project, I will be using the method in section 7.</p>
<p>The algorithm given in section 7 is as follows:</p>
<blockquote>
<ol>
<li>Set $\gamma = \vec{0}$.</li>
<li>While there exists an edge $e$ with $q_e(\gamma) &gt; (1 + \epsilon) z_e$:
<ul>
<li>Compute $\delta$ such that if we define $\gamma&rsquo;$ as $\gamma_e&rsquo; = \gamma_e - \delta$, and $\gamma_f&rsquo; = \gamma_f$ for all $f \in E\ \backslash {e}$, then $q_e(\gamma&rsquo;) = (1 + \epsilon/2)z_e$.</li>
<li>Set $\gamma \leftarrow \gamma&rsquo;$.</li>
</ul>
</li>
<li>Output $\tilde{\gamma} := \gamma$.</li>
</ol>
</blockquote>
<p>This structure is fairly straightforward, but we need to know what $q_e(\gamma)$ is and how to calculate $\delta$.</p>
<p>Finding $\delta$ is very easy; the formula is given in the Asadpour paper.
(Although I did not realize this at the time that I wrote my GSoC proposal and re-derived the equation for delta. Fortunately, my formula matches the one in the paper.)</p>
<p>$$
\delta = \ln \frac{q_e(\gamma)(1 - (1 + \epsilon / 2)z_e)}{(1 - q_e(\gamma))(1 + \epsilon / 2) z_e}
$$</p>
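<p>As a quick numeric sketch of this formula (with made-up values $q_e = 8/15$ and $z_e = 5/12$, and the paper&rsquo;s $\epsilon = 0.2$):</p>

```python
import math

# Hypothetical inputs: these are illustrative, not taken from the paper.
EPSILON = 0.2
q_e = 8 / 15   # current marginal probability of the edge
z_e = 5 / 12   # Held-Karp marginal for the edge

target = (1 + EPSILON / 2) * z_e  # the marginal we want after the update
delta = math.log(q_e * (1 - target) / ((1 - q_e) * target))
print(round(delta, 4))  # 0.3006
```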
<p>Notice that the formula for $\delta$ is reliant on $q_e(\gamma)$.
The paper defines $q_e(\gamma)$ as</p>
<p>$$
q_e(\gamma) = \frac{\sum_{T \ni e} \exp(\gamma(T))}{\sum_{T \in \mathcal{T}} \exp(\gamma(T))}
$$</p>
<p>where $\gamma(T) = \sum_{f \in T} \gamma_f$.</p>
<p>The first thing that I noticed is that the summation in the denominator is over all spanning trees in the graph, which for the complete graphs we will be working with is exponential, so a `brute force&rsquo; approach here is useless.
Fortunately, Asadpour and team realized we can use Kirchhoff&rsquo;s matrix tree theorem to our advantage.</p>
<p>As an aside about Kirchhoff&rsquo;s matrix tree theorem, I was not familiar with this theorem before this project so I had to do a bit of reading about it.
Basically, if you take the laplacian matrix of a graph (the degree matrix minus the adjacency matrix), the absolute value of any cofactor is the number of spanning trees in the graph.
This was something completely unexpected to me, and I think that it is very cool that this type of connection exists.</p>
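<p>As a quick illustration (my own toy example, not from the paper): Cayley&rsquo;s formula says the complete graph $K_4$ has $4^{4-2} = 16$ spanning trees, and a cofactor of its laplacian agrees.</p>

```python
import networkx as nx
import numpy as np

K4 = nx.complete_graph(4)
# Laplacian: degree matrix minus adjacency matrix.
L = nx.laplacian_matrix(K4).toarray().astype(float)
# Any cofactor works; here we delete row 0 and column 0.
num_trees = round(np.linalg.det(L[1:, 1:]))
print(num_trees)  # 16, matching Cayley's formula 4^(4-2)
```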
<p>The details of using Kirchhoff&rsquo;s theorem are given in section 5.3.
We will be using a weighted laplacian $L$ defined by</p>
<p>$$
L_{i, j} = \left\{
\begin{array}{l l}
-\lambda_e &amp; e = (i, j) \in E \\\
\sum_{e \in \delta({i})} \lambda_e &amp; i = j \\\
0 &amp; \text{otherwise}
\end{array}
\right.
$$</p>
<p>where $\lambda_e = \exp(\gamma_e)$.</p>
<p>Now, we know that applying Kirchhoff&rsquo;s theorem to $L$ will return</p>
<p>$$
\sum_{T \in \mathcal{T}} \prod_{e \in T} \lambda_e
$$</p>
<p>but which part of $q_e(\gamma)$ is that?</p>
<p>If we apply $\lambda_e = \exp(\gamma_e)$, we find that</p>
<p>$$
\begin{array}{r c l}
\sum_{T \in \mathcal{T}} \prod_{e \in T} \lambda_e &amp;=&amp; \sum_{T \in \mathcal{T}} \prod_{e \in T} \exp(\gamma_e) \\\
&amp;=&amp; \sum_{T \in \mathcal{T}} \exp\left(\sum_{e \in T} \gamma_e\right) \\\
&amp;=&amp; \sum_{T \in \mathcal{T}} \exp(\gamma(T)) \\\
\end{array}
$$</p>
<p>Moving from the first row to the second row may seem confusing, but essentially we are exploiting the properties of exponents.
Recall that $\exp(x) = e^x$, so we could have written the product as $\prod_{e \in T} e^{\gamma_e}$, but this introduces ambiguity since $e$ would then have multiple meanings.
Now, for the edges $e_1, e_2, \dots, e_{n-1}$ of the spanning tree $T$, that product can be expanded as</p>
<p>$$
\prod_{e \in T} e^{\gamma_e} = e^{\gamma_{e_1}} \times e^{\gamma_{e_2}} \times \dots \times e^{\gamma_{e_{n-1}}}
$$</p>
<p>Each exponential factor has the same base, so we can collapse that into</p>
<p>$$
e^{\gamma_{e_1} + \gamma_{e_2} + \dots + \gamma_{e_{n-1}}}
$$</p>
<p>which is also</p>
<p>$$
e^{\sum_{e \in T} \gamma_e}
$$</p>
<p>but we know that $\sum_{e \in T} \gamma_e$ is $\gamma(T)$, so it becomes</p>
<p>$$
e^{\gamma(T)} = \exp(\gamma(T))
$$</p>
<p>Once we put that back into the summation we arrive at the denominator in $q_e(\gamma)$, $\sum_{T \in \mathcal{T}} \exp(\gamma(T))$.</p>
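<p>This identity is easy to verify numerically on a small graph. The sketch below (my own, with arbitrary made-up $\gamma$ values) compares a cofactor of the weighted laplacian of $K_4$ against a brute-force sum of $\exp(\gamma(T))$ over all of its spanning trees; the two quantities agree.</p>

```python
import math
from itertools import combinations

import networkx as nx
import numpy as np

# Made-up example: K4 with arbitrary gamma values on the edges.
G = nx.complete_graph(4)
for i, (u, v) in enumerate(G.edges()):
    G[u][v]["lam"] = math.exp(0.1 * i)  # lambda_e = exp(gamma_e)

# Cofactor of the weighted laplacian: delete row 0 and column 0.
L = nx.laplacian_matrix(G, weight="lam").toarray()
cofactor = np.linalg.det(L[1:, 1:])

# Brute force the same quantity: sum over every spanning tree of the
# product of its edge lambdas, i.e. the sum over T of exp(gamma(T)).
brute = sum(
    math.prod(G[u][v]["lam"] for u, v in T)
    for T in combinations(G.edges(), G.number_of_nodes() - 1)
    if nx.is_tree(G.edge_subgraph(T))
)
```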
<p>Next, we need to find the numerator for $q_e(\gamma)$.
Just as before, a &lsquo;brute force&rsquo; approach would be exponential in complexity, so we have to find a better way.
The only difference between the numerator and the denominator is the condition on the outer summation: $T \in \mathcal{T}$ is changed to $T \ni e$, i.e. every tree containing the edge $e$.</p>
<p>There is a way to use Kirchhoff&rsquo;s matrix tree theorem here as well.
Suppose we had a graph in which each spanning tree could be mapped, in a one-to-one fashion, onto a spanning tree of the original graph containing the desired edge $e$.
In order for a spanning tree to contain the edge $e$, its endpoints $(u, v)$ must be directly connected to each other.
So we are interested in every spanning tree in which we reach vertex $u$ and then leave from vertex $v$
(as opposed to the spanning trees where we reach vertex $u$ and then leave from that same vertex).
In a sense, we are treating vertices $u$ and $v$ as the same vertex.
We can apply this literally by <em>contracting</em> $e$ from the graph, creating $G / {e}$.
Every spanning tree in this graph can be uniquely mapped from $G / {e}$ onto a spanning tree in $G$ which contains the edge $e$.</p>
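<p>As a toy check of my own: $K_4$ has $16$ spanning trees, and by symmetry $16 \cdot 3 / 6 = 8$ of them contain any fixed edge. Counting spanning trees of the contracted graph gives the same answer. Note that contraction can create parallel edges, so a <code>MultiGraph</code> is needed to keep the correspondence one-to-one:</p>

```python
import networkx as nx
import numpy as np


def count_spanning_trees(G):
    # Kirchhoff: any cofactor of the laplacian counts spanning trees.
    L = nx.laplacian_matrix(G).toarray().astype(float)
    return round(np.linalg.det(L[1:, 1:]))


K4 = nx.MultiGraph(nx.complete_graph(4))
# Contract the edge (0, 1); keep parallel edges, but discard the
# self loop that the contraction creates.
contracted = nx.contracted_edge(K4, (0, 1), self_loops=False)

print(count_spanning_trees(K4), count_spanning_trees(contracted))  # 16 8
```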
<p>From here, the logic to show that a cofactor of the weighted laplacian of $G / {e}$ (multiplied by $\lambda_e$, since contracting $e$ removes that factor from each tree&rsquo;s weight) is the numerator of $q_e(\gamma)$ parallels the logic for the denominator.</p>
<p>At this point, we have all of the needed information to create some pseudo code for the next function in the Asadpour method, <code>spanning_tree_distribution()</code>.
Here I will use an inner function <code>q()</code> to find $q_e$.</p>

<div class="highlight">
  <pre>def spanning_tree_distribution
    input: z, the symmetrized and scaled output of the Held Karp relaxation.
    output: gamma, the maximum entropy exponential distribution for sampling spanning trees
           from the graph.

    def q
        input: e, the edge of interest

        # Create the laplacian matrices
        write lambda = exp(gamma) into the edges of G
        G_laplace = laplacian(G, lambda)
        G_e = nx.contracted_edge(G, e)
        G_e_laplace = laplacian(G_e, lambda)

        # Delete a row and column from each matrix to make a cofactor matrix
        G_laplace.delete((0, 0))
        G_e_laplace.delete((0, 0))

        # Calculate the determinant of the cofactor matrices
        det_G_laplace = G_laplace.det
        det_G_e_laplace = G_e_laplace.det

        # Return q_e. Multiply by lambda_e = exp(gamma_e) because
        # contracting e removed its factor from each tree's weight
        return exp(gamma[e]) * det_G_e_laplace / det_G_laplace

    # initialize the gamma vector
    gamma = 0 vector of length G.size

    while true
        # We will iterate over the edges in z until we complete the
        # for loop without changing a value in gamma. This will mean
        # that there is no edge with q_e &gt; 1.2 * z_e
        valid_count = 0
        # Search for an edge with q_e &gt; 1.2 * z_e
        for e in z
            q_e = q(e)
            z_e = z[e]
            if q_e &gt; 1.2 * z_e
                delta = ln(q_e * (1 - 1.1 * z_e) / ((1 - q_e) * 1.1 * z_e))
                gamma[e] -= delta
            else
                valid_count &#43;= 1
        if valid_count == number of edges in z
            break

    return gamma</pre>
</div>
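<p>Fleshing the pseudo code out into a rough Python sketch (my own, and untested against the paper; it assumes <code>z</code> covers every edge of <code>G</code>, and restores the factor $\lambda_e$ that contracting $e$ removes from each tree&rsquo;s weight):</p>

```python
import math

import networkx as nx
import numpy as np

EPSILON = 0.2  # epsilon = 0.2 yields the 1.2 and 1.1 factors above


def _tree_weight(G, weight):
    # Weighted matrix tree theorem: a cofactor of the weighted
    # laplacian is the sum over spanning trees of prod of edge weights.
    L = nx.laplacian_matrix(G, weight=weight).toarray()
    return float(np.linalg.det(L[1:, 1:]))


def q(G, e, gamma):
    # q_e(gamma): fraction of total tree weight from trees containing e.
    for (u, v), g in gamma.items():
        G[u][v]["lam"] = math.exp(g)
    # Parallel edges must survive the contraction, hence MultiGraph.
    G_e = nx.contracted_edge(nx.MultiGraph(G), e, self_loops=False)
    # Contracting e removes lambda_e from each tree, so restore it.
    return math.exp(gamma[e]) * _tree_weight(G_e, "lam") / _tree_weight(G, "lam")


def spanning_tree_distribution(G, z):
    # z: dict mapping each edge of G to its scaled Held-Karp value.
    gamma = {e: 0.0 for e in z}
    while True:
        changed = False
        for e, z_e in z.items():
            q_e = q(G, e, gamma)
            if q_e > (1 + EPSILON) * z_e:
                target = (1 + EPSILON / 2) * z_e
                gamma[e] -= math.log(q_e * (1 - target) / ((1 - q_e) * target))
                changed = True
        if not changed:
            return gamma
```

<p>On a small sanity example such as $K_3$ with $z_e = 2/3$ on every edge, the marginals already satisfy $q_e \leq 1.2 z_e$, so the sketch returns $\gamma = \vec{0}$ immediately.</p>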

<h2 id="next-steps">Next Steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>The clear next step is to implement the function <code>spanning_tree_distribution</code> using the pseudo code above as an outline.
I will start by writing <code>q</code> and testing it with the same graphs which I am using to test the Held Karp relaxation.
Once <code>q</code> is complete, the rest of the function seems fairly straightforward.</p>
<p>One thing that I am concerned about is my ability to test <code>spanning_tree_distribution</code>.
There are no examples given in the Asadpour research paper and no other easy resources which I could turn to in order to find an oracle.</p>
<p>The only method that I can think of right now would be to complete this function, then complete <code>sample_spanning_tree</code>.
Once both functions are complete, I can sample a large number of spanning trees to find an experimental probability for each tree, then run a statistical test (such as an h-test) to see if the probability of each tree is near $\exp(\gamma(T))$ which is the desired distribution.
An alternative test would be to use the marginals of the distribution and manually check that</p>
<p>$$
\sum_{T \in \mathcal{T} : T \ni e} p(T) \leq (1 + \epsilon) z^*_e,\ \forall\ e \in E
$$</p>
<p>where $p(T)$ is the experimental data from the sampled trees.</p>
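<p>That empirical marginal check could look something like the following sketch (the function and its inputs are my own invention; sampled trees are represented as collections of edges):</p>

```python
from collections import Counter


def marginals_ok(sampled_trees, z, epsilon=0.2):
    # sampled_trees: iterable of edge collections, one per sampled tree.
    # The empirical marginal of e is the fraction of samples containing
    # e; it should stay below (1 + epsilon) * z_e, up to sampling noise.
    trees = list(sampled_trees)
    counts = Counter(e for T in trees for e in T)
    return {e: counts[e] / len(trees) <= (1 + epsilon) * z_e
            for e, z_e in z.items()}
```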
<p>Both methods seem very computationally intensive and because they are sampling from a probability distribution they may fail randomly due to an unlikely sample.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043&ndash;1061.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Finalizing the Held-Karp Relaxation]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Implementing the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/finalizing-held-karp/</id>
            
            
            <published>2021-07-07T00:00:00+00:00</published>
            <updated>2021-07-07T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Picking which method to use for the final implementation of the Asadpour algorithm in NetworkX</blockquote><p>This <em>should</em> be my final post about the Held-Karp relaxation!
Since my last post, titled <a href="../implementing-the-held-karp-relaxation">Implementing The Held Karp Relaxation</a>, I have been testing both the ascent method and the branch and bound method.</p>
<p>My first test was to use a truly asymmetric graph rather than a directed graph where the cost in each direction happened to be the same.
In order to create such a test, I needed to know the solution to any such proposed graphs.
I wrote a python script called <code>brute_force_optimal_tour.py</code> which will generate a random graph, print its adjacency matrix and then check every possible combination of edges to find the optimal tour.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">combinations</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">math</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">random</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">is_1_arborescence</span><span class="p">(</span><span class="n">G</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns true if `G` is a 1-arborescence
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">G</span><span class="o">.</span><span class="n">number_of_edges</span><span class="p">()</span> <span class="o">==</span> <span class="n">G</span><span class="o">.</span><span class="n">order</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="ow">and</span> <span class="nb">max</span><span class="p">(</span><span class="n">d</span> <span class="k">for</span> <span class="n">n</span><span class="p">,</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">in_degree</span><span class="p">())</span> <span class="o">&lt;=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">        <span class="ow">and</span> <span class="n">nx</span><span class="o">.</span><span class="n">is_weakly_connected</span><span class="p">(</span><span class="n">G</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Generate a random adjacency matrix</span>
</span></span><span class="line"><span class="cl"><span class="n">size</span> <span class="o">=</span> <span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">7</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">G_array</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">empty</span><span class="p">(</span><span class="n">size</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">int</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">size</span><span class="p">[</span><span class="mi">0</span><span class="p">]):</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]):</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">r</span> <span class="o">==</span> <span class="n">c</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">G_array</span><span class="p">[</span><span class="n">r</span><span class="p">][</span><span class="n">c</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">        <span class="n">G_array</span><span class="p">[</span><span class="n">r</span><span class="p">][</span><span class="n">c</span><span class="p">]</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Print that adjacency matrix</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">G_array</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">from_numpy_array</span><span class="p">(</span><span class="n">G_array</span><span class="p">,</span> <span class="n">create_using</span><span class="o">=</span><span class="n">nx</span><span class="o">.</span><span class="n">DiGraph</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">num_nodes</span> <span class="o">=</span> <span class="n">G</span><span class="o">.</span><span class="n">order</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">combo_count</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="n">min_weight_tour</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl"><span class="n">min_tour_weight</span> <span class="o">=</span> <span class="n">math</span><span class="o">.</span><span class="n">inf</span>
</span></span><span class="line"><span class="cl"><span class="n">test_combo</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">DiGraph</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">combo</span> <span class="ow">in</span> <span class="n">combinations</span><span class="p">(</span><span class="n">G</span><span class="o">.</span><span class="n">edges</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">),</span> <span class="n">G</span><span class="o">.</span><span class="n">order</span><span class="p">()):</span>
</span></span><span class="line"><span class="cl">    <span class="n">combo_count</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">    <span class="n">test_combo</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">test_combo</span><span class="o">.</span><span class="n">add_weighted_edges_from</span><span class="p">(</span><span class="n">combo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Test to see if test_combo is a tour.</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># This means first that it is an 1-arborescence</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="ow">not</span> <span class="n">is_1_arborescence</span><span class="p">(</span><span class="n">test_combo</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="k">continue</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># It also means that every vertex has a degree of 2</span>
</span></span><span class="line"><span class="cl">    <span class="n">arborescence_weight</span> <span class="o">=</span> <span class="n">test_combo</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="s2">&#34;weight&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="nb">len</span><span class="p">([</span><span class="n">n</span> <span class="k">for</span> <span class="n">n</span><span class="p">,</span> <span class="n">deg</span> <span class="ow">in</span> <span class="n">test_combo</span><span class="o">.</span><span class="n">degree</span> <span class="k">if</span> <span class="n">deg</span> <span class="o">==</span> <span class="mi">2</span><span class="p">])</span> <span class="o">==</span> <span class="n">num_nodes</span>
</span></span><span class="line"><span class="cl">        <span class="ow">and</span> <span class="n">arborescence_weight</span> <span class="o">&lt;</span> <span class="n">min_tour_weight</span>
</span></span><span class="line"><span class="cl">    <span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># Tour found</span>
</span></span><span class="line"><span class="cl">        <span class="n">min_weight_tour</span> <span class="o">=</span> <span class="n">test_combo</span><span class="o">.</span><span class="n">copy</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="n">min_tour_weight</span> <span class="o">=</span> <span class="n">arborescence_weight</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="sa">f</span><span class="s2">&#34;Minimum tour found with weight </span><span class="si">{</span><span class="n">min_tour_weight</span><span class="si">}</span><span class="s2"> from </span><span class="si">{</span><span class="n">combo_count</span><span class="si">}</span><span class="s2"> combinations of edges</span><span class="se">\n</span><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">min_weight_tour</span><span class="o">.</span><span class="n">edges</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;(</span><span class="si">{</span><span class="n">u</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">v</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">d</span><span class="si">}</span><span class="s2">)&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<h2 id="everything-is-cool-with-the-ascent-method">Everything is Cool with the Ascent Method<a class="headerlink" href="#everything-is-cool-with-the-ascent-method" title="Link to this heading">#</a></h2>
<p>This is useful information because even though the ascent method returns a vector, if it finds this exact solution (i.e. $f(\pi) = 0$) we can calculate that vector directly from the edges in the solution without having to explicitly enumerate the dict returned by <code>held_karp_ascent()</code>.</p>
<p>The first output from the program was a six vertex graph and is presented below.</p>

<div class="highlight">
  <pre>~ time python3 brute_force_optimal_tour.py
[[ 0 45 39 92 29 31]
 [72  0  4 12 21 60]
 [81  6  0 98 70 53]
 [49 71 59  0 98 94]
 [74 95 24 43  0 47]
 [56 43  3 65 22  0]]
Minimum tour found with weight 144.0 from 593775 combinations of edges

(0, 5, 31)
(5, 4, 22)
(1, 3, 12)
(3, 0, 49)
(2, 1, 6)
(4, 2, 24)

real	0m9.596s
user	0m9.689s
sys     0m0.241s</pre>
</div>

<p>First I checked that the ascent method was returning a solution with the same weight, 144, which it was.
Also, every entry in the vector was $0.8\overline{3}$, which is $\frac{5}{6}$, the scaling factor from the Asadpour paper, so I know that it was finding the exact solution.
Because of this, my test in <code>test_traveling_salesman.py</code> checks that for all edges in the solution edge set both $(u, v)$ and $(v, u)$ are equal to $\frac{5}{6}$.</p>
<p>For my next test, I created a $7 \times 7$ matrix to test with, and as expected the running time of the python script was much slower.</p>

<div class="highlight">
  <pre>~ time python3 brute_force_optimal_tour.py
[[ 0 26 63 59 69 31 41]
 [62  0 91 53 75 87 47]
 [47 82  0 90 15  9 18]
 [68 19  5  0 58 34 93]
 [11 58 53 55  0 61 79]
 [88 75 13 76 98  0 40]
 [41 61 55 88 46 45  0]]
Minimum tour found with weight 190.0 from 26978328 combinations of edges

(0, 1, 26)
(1, 3, 53)
(3, 2, 5)
(2, 5, 9)
(5, 6, 40)
(4, 0, 11)
(6, 4, 46)

real	7m28.979s
user	7m29.048s
sys     0m0.245s</pre>
</div>

<p>Once again, the value of $f(\pi)$ hit 0, so the ascent method returned an exact solution and my testing procedure was the same as for the six vertex graph.</p>
<h2 id="trouble-with-branch-and-bound">Trouble with Branch and Bound<a class="headerlink" href="#trouble-with-branch-and-bound" title="Link to this heading">#</a></h2>
<p>The branch and bound method was not working well with the two example graphs I generated.
First, on the seven vertex matrix, I programmed the test and let it run&hellip; and run&hellip; and run&hellip; until I stopped it at just over an hour of execution time.
If it took only one eighth of that time to brute force the solution, then the branch and bound method truly is not efficient.</p>
<p>I moved to the six vertex graph with high hopes; after all, I already had a six vertex graph which was correctly executing in a reasonable amount of time.
However, the six vertex graph created a large number of exceptions and errors when I ran the tests.
I was able to determine why the errors were being generated, but the context did not conform to my expectations for the branch and bound method.</p>
<p>Basically, <code>direction_of_ascent_kilter()</code> was finding a vertex which was out-of-kilter and returning the corresponding direction of ascent, but <code>find_epsilon()</code> was not finding any valid cross over edges and returning a maximum direction of travel of $\infty$.
While I could change the default value for the return value of <code>find_epsilon()</code> to zero, that would not solve the problem because the value of the vector $\pi$ would get stuck and the program would enter an infinite loop.</p>
<p>I do have an analogy for this situation.
Imagine that you are in an unfamiliar city and you have to meet somebody at the tallest building in that city.
However, you don&rsquo;t know the address and have no way to get a GPS route to that building.
Instead of wandering around aimlessly, you decide to scan the skyline for the tallest building you can see and start walking down the street which is the closest to matching that direction.
Additionally, you have the ability to tell, at any given point, how far down the chosen street to go before you need to re-evaluate and pick a new street.</p>
<p>This hypothetical is a better approximation of the ascent method, but the problem can be demonstrated nonetheless.</p>
<ul>
<li>Determining if you are at the tallest building is running the linear program to see if the direction of ascent still exists.</li>
<li>Picking the street to go down is the same as finding the direction of ascent.</li>
<li>Finding out how far to go down that street is the same as finding epsilon.</li>
</ul>
<p>After this procedure works for a while, you suddenly find yourself in an unusual situation.
You can still see the tallest building, so you know you are not there yet.
You know what street will take you closer to the building, but for some reason you cannot move down that street.</p>
<p>From my understanding of the ascent and branch and bound methods, if the direction of ascent exists, then we have to be able to move some amount in that direction without fail, but the branch and bound method was failing to provide an adequate distance to move.</p>
<p>Considering the trouble with the branch and bound method, and the fact that it is not going to be used in the final Asadpour algorithm, I plan on removing it from the NetworkX pull request and moving onward using only the ascent method for the rest of the project.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043&ndash;1061.</p>
<p>M. Held and R. M. Karp, <em>The traveling-salesman problem and minimum spanning trees</em>, Operations Research, 18 (1970), pp. 1138&ndash;1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Implementing the Held-Karp Relaxation]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="related" type="text/html" title="Understanding the Ascent Method" />
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/implementing-the-held-karp-relaxation/</id>
            
            
            <published>2021-06-28T00:00:00+00:00</published>
            <updated>2021-06-28T00:00:00+00:00</updated>
            
            
<content type="html"><![CDATA[<blockquote>Implementation details for the ascent method to solve the Held-Karp relaxation</blockquote><p>I have now completed my implementation of the ascent method and the branch and bound method detailed in the 1970 paper <em>The Traveling-Salesman Problem and Minimum Spanning Trees</em> by Michael Held and Richard M. Karp.
In my last post, titled <a href="../understanding-the-ascent-method">Understanding the Ascent Method</a>, I completed the first iteration of the ascent method and found an important bug in the <code>find_epsilon()</code> method and found a more efficient way to determine substitutes in the graph.
However the solution being given was still not the optimal solution.</p>
<p>After discussing my options with my GSoC mentors, I decided to move onto the branch and bound method anyways with the hope that because the method is more human-computable and an example was given in the paper by Held and Karp that I would be able to find the remaining flaws.
Fortunately, this was indeed the case and I was able to correctly implement the branch and bound method and fix the last problem with the ascent method.</p>
<h2 id="initial-implementation-of-the-branch-and-bound-method">Initial Implementation of the Branch and Bound Method<a class="headerlink" href="#initial-implementation-of-the-branch-and-bound-method" title="Link to this heading">#</a></h2>
<p>The branch and bound method follows from the ascent method, but tweaks how we determine the direction of ascent and simplifies the expression used for $\epsilon$.
As a reminder, we use the notion of an <em>out-of-kilter</em> vertex to find directions of ascent which are unit vectors or negative unit vectors.
An out-of-kilter vertex is a vertex which is consistently not connected enough or connected too much in the set of minimum 1-arborescences of a graph.
The formal definition is given on page 1151 as</p>
<blockquote>
<p>Vertex $i$ is said to be <em>out-of-kilter high</em> at the point $\pi$, if, for all $k \in K(\pi), v_{ik} \geqq 1$;
similarly, vertex $i$ is <em>out-of-kilter low</em> at the point $\pi$ if, for all $k \in K(\pi), v_{ik} = -1$.</p>
</blockquote>
<p>where $v_{ik}$ is the degree of vertex $i$ in 1-arborescence $k$ minus two.
First, I created a function called <code>direction_of_ascent_kilter()</code> which returns a direction of ascent based on whether a vertex is out-of-kilter.
However, I did not use the method mentioned in the paper by Held and Karp, which is to find a member of $K(\pi, u_i)$, where $u_i$ is the unit vector with 1 in the $i$th location, and check whether vertex $i$ has a degree of one or of more than two.
Instead, knowing that I could find the elements of $K(\pi)$ with existing code, I decided to check the value of $v_{ik}$ for all $k \in K(\pi)$ and, once a vertex is determined to be out-of-kilter, simply move on to the next vertex.</p>
<p>Once I have a mapping from every vertex to its kilter state, I find one which is out-of-kilter and return the corresponding direction of ascent.</p>
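<p>As a rough sketch of that check (this is an illustration, not the NetworkX code; the name <code>kilter_states()</code> and the degree-dict representation are assumptions), classifying each vertex against every minimum 1-arborescence might look like:</p>

```python
def kilter_states(vertices, min_arborescences):
    """Classify each vertex as out-of-kilter high, low, or in-kilter.

    `min_arborescences` is assumed to be a list of dicts mapping each
    vertex to its degree in one minimum 1-arborescence.
    """
    states = {}
    for i in vertices:
        # v_ik is the degree of vertex i in 1-arborescence k, minus two
        v_ik = [k[i] - 2 for k in min_arborescences]
        if all(v >= 1 for v in v_ik):
            states[i] = "high"
        elif all(v == -1 for v in v_ik):
            states[i] = "low"
        else:
            states[i] = "in-kilter"
    return states
```

<p>A vertex out-of-kilter low then yields the direction $-u_i$, and one out-of-kilter high yields $u_i$.</p>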
<p>The changes to <code>find_epsilon()</code> were very minor, basically removing the denominator from the formula for $\epsilon$ and adding a check to see if we have a negative direction of ascent so that the crossover distances become positive and thus valid.</p>
<p>The brand new function which was needed was <code>branch()</code>, which, well&hellip; branches according to the Held and Karp paper.
The first thing it does is run the linear program from the ascent method to determine whether a direction of ascent exists.
If the direction does exist, branch.
If not, search the set of minimum 1-arborescences for a tour, and branch only if no tour exists.
The branch process itself is rather simple: find the first open edge (an edge not in the partition sets $X$ and $Y$) and then create two new configurations in which that edge is included or excluded, respectively.</p>
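<p>The splitting step itself can be sketched as follows (a simplified illustration with configurations as plain tuples; the helper name <code>split_on_open_edge()</code> is an assumption, and my real <code>branch()</code> also re-runs the linear program as described above):</p>

```python
def split_on_open_edge(config, edges):
    """Split a configuration on its first open edge.

    A configuration is (X, Y, pi, bound), where X holds the edges forced
    into the 1-arborescence and Y the edges forced out.  Returns the
    include child and the exclude child, or None if every edge is decided.
    """
    X, Y, pi, bound = config
    for e in edges:
        if e not in X and e not in Y:
            return (X | {e}, Y, pi, bound), (X, Y | {e}, pi, bound)
    return None
```
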
<p>Finally, the overall structure of the algorithm, written in pseudocode, is:</p>

<div class="highlight">
  <pre>Initialize pi to be the zero vector.
Add the configuration (∅, ∅, pi, w(0)) to the configuration priority queue.
while configuration_queue is not empty:
    config = configuration_queue.get()
    dir_ascent = direction_of_ascent_kilter()
    if dir_ascent is None:
        branch()
        if solution returned by branch is not None:
            return solution
    else:
        max_dist = find_epsilon()
        update pi
        update edge weights
        update config pi and bound value</pre>
</div>

<h2 id="debugging-the-branch-and-bound-method">Debugging the Branch and Bound Method<a class="headerlink" href="#debugging-the-branch-and-bound-method" title="Link to this heading">#</a></h2>
<p>My initial implementation of the branch and bound method returned the same incorrect solution as the ascent method, but with different edge weights.
As a reminder, I wanted a solution which looked like this:</p>
<center><img src="expected-solution.png" alt="Expected solution for the Held-Karp relaxation for the example graph"/></center>
<p>and I now had two algorithms returning this solution:</p>
<center><img src="found-solution.png" width=350 alt="Solution found by the branch-and-bound method"/></center>
<p>As I mentioned before, the branch and bound method is more human-computable than the ascent method, so I decided to follow the execution of my implementation with the one given in [1].
Below, the left side is the data from the Held and Karp paper and on the right my program&rsquo;s execution on the directed version.</p>
<table>
  <thead>
      <tr>
          <th>Undirected Graph</th>
          <th>Directed Graph</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Iteration 1:</td>
          <td></td>
      </tr>
      <tr>
          <td>Starting configuration: $(\emptyset, \emptyset, \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 \end{bmatrix}, 196)$</td>
          <td>Starting configuration: $(\emptyset, \emptyset, \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 \end{bmatrix}, 196)$</td>
      </tr>
      <tr>
          <td>Minimum 1-Trees:</td>
          <td>Minimum 1-Arborescences:</td>
      </tr>
      <tr>
          <td><img src="/networkx/atsp/implementing-the-held-karp-relaxation/minimum-1-trees-iteration-1.png" alt="minimum 1 trees"></td>
          <td><img src="/networkx/atsp/implementing-the-held-karp-relaxation/minimum-1-arborescences-iteration-1.png" alt="minimum 1 arborescences"></td>
      </tr>
      <tr>
          <td>Vertex 3 out-of-kilter LOW</td>
          <td>Vertex 3 out-of-kilter LOW</td>
      </tr>
      <tr>
          <td>$d = \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; -1 &amp; 0 &amp; 0 \end{bmatrix}$</td>
          <td>$d = \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; -1 &amp; 0 &amp; 0 \end{bmatrix}$</td>
      </tr>
      <tr>
          <td>$\epsilon(\pi, d) = 5$</td>
          <td>$\epsilon(\pi, d) = 5$</td>
      </tr>
      <tr>
          <td>New configuration: $(\emptyset, \emptyset, \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; -5 &amp; 0 &amp; 0 \end{bmatrix}, 201)$</td>
          <td>New configuration: $(\emptyset, \emptyset, \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; -5 &amp; 0 &amp; 0 \end{bmatrix}, 212)$</td>
      </tr>
      <tr>
          <td></td>
          <td></td>
      </tr>
      <tr>
          <td>Iteration 2:</td>
          <td></td>
      </tr>
      <tr>
          <td>Minimum 1-Trees:</td>
          <td>Minimum 1-Arborescences:</td>
      </tr>
      <tr>
          <td><img src="/networkx/atsp/implementing-the-held-karp-relaxation/minimum-1-trees-iteration-2.png" alt="minimum 1 trees"></td>
          <td><img src="/networkx/atsp/implementing-the-held-karp-relaxation/minimum-1-arborescences-iteration-2.png" alt="minimum 1 arborescences"></td>
      </tr>
  </tbody>
</table>
<p>In order to get these results, I forbade the program from connecting vertex 0 to the same vertex for both its incoming and outgoing edges.
However, it is very clear that, from the start, iteration two was not going to be the same.</p>
<p>I noticed that in the first iteration there were twice as many 1-arborescences as 1-trees, the difference being that the cycle can be traversed in both directions.
This creates a mapping between 1-trees and 1-arborescences.
In the second iteration, there are not twice as many 1-arborescences, and that mapping is not present.
Vertex 0 always connects to vertex 3 in the arborescences and to vertex 5 in the trees.
Additionally, the costs of the 1-arborescences are higher than the costs of the 1-trees.</p>
<p>From working on the ascent method, I knew that the choice of root node in the arborescences affects the total cost.
I now wondered whether a minimum 1-arborescence could come from a non-minimum spanning arborescence.
As it turns out, the answer is yes.</p>
<p>In order to test this hypothesis, I created a simple Python script using a modified version of <code>k_pi()</code>.
The entire thing is longer than I&rsquo;d like to put here, but the gist was simple: iterate over <em>all</em> of the spanning arborescences in the graph, tracking the minimum weight, and then print the minimum 1-arborescences this program finds so they can be compared to the ones the unaltered one finds.</p>
<p>The output is below:</p>

<div class="highlight">
  <pre>Adding arborescence with weight 212.0
Adding arborescence with weight 212.0
Adding arborescence with weight 212.0
Adding arborescence with weight 204.0
Adding arborescence with weight 204.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Adding arborescence with weight 196.0
Found 6 minimum 1-arborescences

(1, 5, 30)
(2, 1, 41)
(2, 3, 21)
(4, 2, 35)
(5, 0, 52)
(0, 4, 17)

(1, 2, 41)
(2, 3, 21)
(2, 4, 35)
(4, 0, 17)
(5, 1, 30)
(0, 5, 52)

(2, 3, 21)
(2, 4, 35)
(4, 0, 17)
(5, 1, 30)
(5, 2, 41)
(0, 5, 52)

(2, 4, 35)
(3, 2, 16)
(4, 0, 17)
(5, 1, 30)
(5, 3, 46)
(0, 5, 52)

(2, 3, 21)
(3, 5, 41)
(4, 2, 35)
(5, 1, 30)
(5, 0, 52)
(0, 4, 17)

(2, 3, 21)
(2, 5, 41)
(4, 2, 35)
(5, 1, 30)
(5, 0, 52)
(0, 4, 17)</pre>
</div>

<p>This was very enlightening.
The 1-arborescences of weight 212 were the ones that my branch and bound method was using in the second iteration, but not the true minimum ones.
Graphically, those six 1-arborescences look like this:</p>
<center><img src="true-minimum-arborescences.png" alt="The true set of minimum 1-arborescences in the example graph"/></center>
<p>And suddenly that mapping between the 1-trees and 1-arborescences is back!
But why can minimum 1-arborescences come from non-minimum spanning arborescences?
Remember that we create 1-arborescences by finding spanning arborescences on the vertex set $\{2, 3, \dots, n\}$ and then connecting the missing vertex to the root of the spanning arborescence and adding its minimum-weight incoming edge.</p>
<p>This means that even among the true minimum spanning arborescences, the final weight of the 1-arborescence can vary based on the cost of connecting &lsquo;vertex 1&rsquo; to the root of the arborescence.
I already had to deal with this issue earlier in the implementation of the ascent method.
Now suppose that not every vertex in the graph is a root of an arborescence in the set of minimum spanning arborescences.
Let the <em>minimum</em> root be the root vertex of the arborescence which is the cheapest to connect to and the <em>maximum</em> root the root vertex which is the most expensive to connect to.
If we needed to, we could order the roots from minimum to maximum based on the weight of the edge from &lsquo;vertex 1&rsquo; to that root.</p>
<p>Finally, suppose that considering only the set of minimum spanning arborescences results in a set of minimum 1-arborescences which do not use the minimum root and have a total cost $c$ more than the cost of the minimum spanning arborescence plus the cost of connecting to the minimum root.
Continue to consider spanning arborescences in increasing weight, such as the ones returned by the <code>ArborescenceIterator</code>.
Eventually the <code>ArborescenceIterator</code> will return a spanning arborescence which has the minimum root.
If the cost of the minimum spanning arborescence is $c_{min}$ and the cost of this arborescence is less than $c_{min} + c$, then a new minimum 1-arborescence has been found from a non-minimum spanning arborescence.</p>
<p>It is obviously impractical to consider all of the spanning arborescences in the graph, but because <code>ArborescenceIterator</code> returns arborescences in order of increasing weight, there is a weight after which it is impossible to produce a minimum 1-arborescence.</p>
<p>Let the cost of a minimum spanning arborescence be $c_{min}$ and the costs of connecting to the roots range from $r_{min}$ to $r_{max}$.
The worst-case cost of the minimum 1-arborescence is $c_{min} + r_{max}$, which would connect the minimum spanning arborescence to the most expensive root, and the best-case minimum 1-arborescence would cost $c_{min} + r_{min}$.
As for the weight of the spanning arborescence itself, once it exceeds $c_{min} + r_{max} - r_{min}$ we know that, even if it uses the minimum root, its total weight will be greater than the worst-case minimum 1-arborescence, so that is the bound with which we use the <code>ArborescenceIterator</code>.</p>
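<p>Numerically the stopping weight is easy to compute; the sketch below (the helper name <code>iterator_stop_weight()</code> is hypothetical) captures the reasoning:</p>

```python
def iterator_stop_weight(c_min, root_costs):
    """Weight above which a spanning arborescence cannot yield a minimum
    1-arborescence.

    `c_min` is the minimum spanning arborescence cost and `root_costs`
    are the costs of connecting 'vertex 1' to each candidate root.
    """
    r_min, r_max = min(root_costs), max(root_costs)
    # The worst-case minimum 1-arborescence costs c_min + r_max; past
    # c_min + r_max - r_min even the cheapest root cannot beat that.
    return c_min + r_max - r_min
```
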
<p>After implementing this boundary for checking spanning arborescences to find minimum 1-arborescences, both methods executed successfully on the test graph.</p>
<h2 id="next-steps">Next Steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>Now that both the ascent and branch and bound methods are working, they must be tested both for accuracy and performance.
Surprisingly, on the test graph I have been using, which is originally from the Held and Karp paper, the ascent method is between 2 and 3 times faster than the branch and bound method.
However, this six vertex graph is small and the branch and bound method may yet have better performance on larger graphs.
I will have to create larger test graphs and then select whichever method has better performance overall.</p>
<p>Additionally, this is an example where $f(\pi)$, the gap between a tour and 1-arborescence, converges to 0.
This is not always the case, so I will need to test on an example where the minimum gap is greater than 0.</p>
<p>Finally, the output of my Held Karp relaxation program is a tour.
This is just one part of the Asadpour asymmetric traveling salesperson problem and that algorithm takes a modified vector which is produced based on the final result of the relaxation.
I still need to convert the output to match the expectation of the overall algorithm I am seeking to implement this summer of code.</p>
<p>I hope to move onto the next step of the Asadpour algorithm on either June 30th or July 1st.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] Held, M., Karp, R.M. <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Understanding the Ascent Method]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="related" type="text/html" title="implementing the Iterators" />
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/understanding-the-ascent-method/</id>
            
            
            <published>2021-06-22T00:00:00+00:00</published>
            <updated>2021-06-22T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>A deep dive into the ascent method for the Held-Karp relaxation</blockquote><p>It has been far longer than I would have preferred since I wrote a blog post.
As I expected in my original GSoC proposal, the Held-Karp relaxation is proving to be quite difficult to implement.</p>
<p>My mentors and I agreed to implement the branch and bound method discussed in Held and Karp&rsquo;s 1970 paper <em>The Traveling-Salesman Problem and Minimum Spanning Trees</em>, which first required implementing the ascent method because it is used within the branch and bound method.
For the last week and a half I have been implementing and debugging the ascent method, and I wanted to take some time to reflect on what I have learned.</p>
<p>I will start by saying that, as of the writing of this post, my version of the ascent method is not giving what I expect to be the optimal solution.
For my testing, I took the graph which Held and Karp use in their example of the branch and bound method, a weighted $\mathcal{K}_6$, and converted it to a directed but symmetric version given by the following adjacency matrix.</p>
<p>$$
\begin{bmatrix}
0 &amp; 97 &amp; 60 &amp; 73 &amp; 17 &amp; 52 \\\
97 &amp; 0 &amp; 41 &amp; 52 &amp; 90 &amp; 30 \\\
60 &amp; 41 &amp; 0 &amp; 21 &amp; 35 &amp; 41 \\\
73 &amp; 52 &amp; 21 &amp; 0 &amp; 95 &amp; 46 \\\
17 &amp; 90 &amp; 35 &amp; 95 &amp; 0 &amp; 81 \\\
52 &amp; 30 &amp; 41 &amp; 46 &amp; 81 &amp; 0
\end{bmatrix}
$$</p>
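<p>For reference, the directed version can be materialized from the matrix as a plain dictionary of arc weights (an <code>nx.DiGraph</code> could be filled the same way; this is a minimal sketch, not my test code):</p>

```python
# Adjacency matrix of the weighted K_6 from Held and Karp's example
M = [
    [0, 97, 60, 73, 17, 52],
    [97, 0, 41, 52, 90, 30],
    [60, 41, 0, 21, 35, 41],
    [73, 52, 21, 0, 95, 46],
    [17, 90, 35, 95, 0, 81],
    [52, 30, 41, 46, 81, 0],
]
# Directed but symmetric: arcs (i, j) and (j, i) carry the same weight
arcs = {(i, j): M[i][j] for i in range(6) for j in range(6) if i != j}
assert all(arcs[j, i] == w for (i, j), w in arcs.items())
```
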
<p>The original solution is an undirected tour, but in the directed version there are two expected solutions, depending on which way the tour is traversed.
Both of these cycles have a total weight of 207.</p>
<center><img src="expected-solution.png" alt="Expected solutions for the weighted K_6 used in the Held and Karp paper"/></center>
<p>This is the cycle returned by the program, which has a total weight of 246.</p>
<center><img src="found-solution.png" width=350 alt="The current solution being found by the program"/></center>
<p>All of this code goes into the function <code>_held_karp()</code> within <code>traveling_salesman.py</code> in NetworkX, and I tried to follow the algorithm outlined in the paper as closely as I could.
The <code>_held_karp()</code> function itself has three inner functions, <code>k_pi()</code>, <code>direction_of_ascent()</code> and <code>find_epsilon()</code>, which represent the three main steps used in each iteration of the ascent method.</p>
<h2 id="k_pi"><code>k_pi()</code><a class="headerlink" href="#k_pi" title="Link to this heading">#</a></h2>
<p><code>k_pi()</code> uses the <code>ArborescenceIterator</code> I implemented during the first week of coding for the Summer of Code to find all of the minimum 1-arborescences in the graph.
My original assessment of creating 1-arborescences was slightly incorrect.
I stated that</p>
<blockquote>
<p>In order to connect vertex 1, we would choose the outgoing arc with the smallest cost and the incoming arc with the smallest cost.</p>
</blockquote>
<p>In reality, this method would produce graphs which are almost arborescences, based solely on the fact that the outgoing arc would almost certainly create a vertex with two incoming arcs.
Instead, we need to connect vertex 1 with the incoming edge of lowest cost and with the edge to the root node of the arborescence on nodes $\{2, 3, \dots, n\}$ so that the in-degree constraint is not violated.</p>
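<p>A minimal sketch of this construction (the helper name and the edge-list representation are assumptions for illustration, not the NetworkX code):</p>

```python
def to_one_arborescence(arb_edges, root, v1, nodes, weight):
    """Extend a spanning arborescence on `nodes` - {v1} into a 1-arborescence.

    We add the cheapest arc into v1 plus the arc from v1 to the root of
    the arborescence, so every vertex keeps an in-degree of exactly one.
    """
    u = min((x for x in nodes if x != v1), key=lambda x: weight[x, v1])
    return list(arb_edges) + [(u, v1), (v1, root)]
```
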
<p>For the test graph on the first iteration of the ascent method, <code>k_pi()</code> returned 10 1-arborescences, but the costs were not all the same.
Notice that because we have no agency in choosing the outgoing edge of vertex 1, the total cost of the 1-arborescence will vary by the difference between the cheapest root to connect to and the most expensive root to connect to.
My original writing of this function was not very efficient: it created the 1-arborescences from all of the minimum spanning arborescences and then iterated over them to delete all of the non-minimum ones.</p>
<p>Yesterday I re-wrote this function so that once a 1-arborescence of lower weight is found, it deletes all of the current minimum ones in favor of the new one, and it does not add any 1-arborescences of greater weight to the set of minimum 1-arborescences.</p>
<p>The real reason that I re-wrote the method was to try something new in hopes of pushing the program from a suboptimal solution to the optimal one.
As I mentioned earlier, the forced choice of connecting to the root node created 1-arborescences of different weights.
I suspected, then, that different choices of vertex 1 would be able to create 1-arborescences of even lower weight than just arbitrarily using the one returned by <code>next(G.__iter__())</code>.
So I wrapped all of <code>k_pi()</code> in a <code>for</code> loop over the vertices of the graph and found that the choice of vertex 1 made a difference.</p>

<div class="highlight">
  <pre>Excluded node: 0, Total Weight: 161.0
Chosen incoming edge for node 0: (4, 0), chosen outgoing edge for node 0: (0, 4)
(2, 3, 21)
(2, 5, 41)
(4, 2, 35)
(4, 0, 17)
(5, 1, 30)
(0, 4, 17)

Excluded node: 0, Total Weight: 161.0
Chosen incoming edge for node 0: (4, 0), chosen outgoing edge for node 0: (0, 4)
(1, 5, 30)
(2, 1, 41)
(2, 3, 21)
(4, 2, 35)
(4, 0, 17)
(0, 4, 17)

Excluded node: 1, Total Weight: 174.0
Chosen incoming edge for node 1: (5, 1), chosen outgoing edge for node 1: (1, 5)
(2, 3, 21)
(2, 4, 35)
(4, 0, 17)
(5, 2, 41)
(5, 1, 30)
(1, 5, 30)

Excluded node: 2, Total Weight: 187.0
Chosen incoming edge for node 2: (3, 2), chosen outgoing edge for node 2: (2, 3)
(0, 4, 17)
(3, 5, 46)
(3, 2, 21)
(5, 0, 52)
(5, 1, 30)
(2, 3, 21)

Excluded node: 3, Total Weight: 165.0
Chosen incoming edge for node 3: (2, 3), chosen outgoing edge for node 3: (3, 2)
(1, 5, 30)
(2, 1, 41)
(2, 4, 35)
(2, 3, 21)
(4, 0, 17)
(3, 2, 21)

Excluded node: 3, Total Weight: 165.0
Chosen incoming edge for node 3: (2, 3), chosen outgoing edge for node 3: (3, 2)
(2, 4, 35)
(2, 5, 41)
(2, 3, 21)
(4, 0, 17)
(5, 1, 30)
(3, 2, 21)

Excluded node: 4, Total Weight: 178.0
Chosen incoming edge for node 4: (0, 4), chosen outgoing edge for node 4: (4, 0)
(0, 5, 52)
(0, 4, 17)
(1, 2, 41)
(2, 3, 21)
(5, 1, 30)
(4, 0, 17)

Excluded node: 4, Total Weight: 178.0
Chosen incoming edge for node 4: (0, 4), chosen outgoing edge for node 4: (4, 0)
(0, 5, 52)
(0, 4, 17)
(2, 3, 21)
(5, 1, 30)
(5, 2, 41)
(4, 0, 17)

Excluded node: 5, Total Weight: 174.0
Chosen incoming edge for node 5: (1, 5), chosen outgoing edge for node 5: (5, 1)
(1, 2, 41)
(1, 5, 30)
(2, 3, 21)
(2, 4, 35)
(4, 0, 17)
(5, 1, 30)</pre>
</div>

<p>Note that because my test graph is symmetric, it likes to make cycles with only two nodes.
The weights of these 1-arborescences range from 161 to 178, so I tried to run the test which had been taking about 300 ms using the new approach&hellip; and the program was non-terminating.
I created breakpoints in PyCharm after 200 iterations of the ascent method and found that the program was stuck in a loop where it alternated between two different minimum 1-arborescences.
This was a long shot, and it did not work out, so I reverted the code to always pick the same vertex for vertex 1.</p>
<p>Either way, the fact that I had almost entirely re-written this function without a change in output suggests that it is not the source of the problem.</p>
<h2 id="direction_of_ascent"><code>direction_of_ascent()</code><a class="headerlink" href="#direction_of_ascent" title="Link to this heading">#</a></h2>
<p>This was the one function which has pseudocode in the Held and Karp paper:</p>
<blockquote>
<ol>
<li>Set $d$ equal to the zero $n$-vector.</li>
<li>Find a 1-tree $T^k$ such that $k \in K(\pi, d)$. [A method of executing Step 2 follows from the results of Section 6 (the greedy algorithm).]</li>
<li>If $\sum_{i=1}^{i=n} d_i v_{i k} &gt; 0$, STOP.</li>
<li>$d_i \rightarrow d_i + v_{i k}$, for $i = 2, 3, \dots, n$</li>
<li>GO TO 2.</li>
</ol>
</blockquote>
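<p>Translated almost line for line into Python, the quoted steps look like the sketch below, where <code>min_one_tree_degrees(d)</code> stands in for the greedy algorithm of Step 2 and is an assumption of this sketch (it should return the vertex degrees of some 1-tree in $K(\pi, d)$):</p>

```python
def direction_of_ascent(n, min_one_tree_degrees):
    """Steps 1-5 from the paper, with an added early exit when the
    1-tree is already a tour (every degree equals two)."""
    d = [0] * n  # Step 1: the zero n-vector
    while True:
        v = [deg - 2 for deg in min_one_tree_degrees(d)]  # Step 2
        if all(x == 0 for x in v):
            return None  # the 1-tree is a tour; no direction of ascent
        if sum(d[i] * v[i] for i in range(n)) > 0:  # Step 3
            return d
        for i in range(1, n):  # Step 4 (vertex 1, index 0, is skipped)
            d[i] += v[i]
```

<p>Note that, as the paper warns, this loop can fail to terminate on its own, which is why the linear-programming check below is needed.</p>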
<p>Using this as a guide, the implementation of this function was simple until I got to the terminating condition, which is a linear program discussed on page 1149 as</p>
<blockquote>
<p>Thus, when failure to terminate is suspected, it is necessary to check whether no direction of ascent exists; by the Minkowski-Farkas lemma this is equivalent to the existence of nonnegative coefficients $\alpha_k$ such that</p>
<p>$ \sum_{k \in K(\pi)} \alpha_kv_{i k} = 0, \quad i = 1, 2, \dots, n $</p>
<p>This can be checked by linear programming.</p>
</blockquote>
<p>While I was able to implement this without much issue, one <em>very</em> important constraint of the linear program was not mentioned here, but rather on the preceding page, during a proof.
That constraint is</p>
<p>$$
\sum_{k \in K(\pi)} \alpha_k = 1
$$</p>
<p>Only after I had spent several hours trying to debug the original linear program did I notice the missing constraint. With it added, the linear program started to behave correctly, terminating the ascent method when a tour is found.</p>
<h2 id="find_epsilon"><code>find_epsilon()</code><a class="headerlink" href="#find_epsilon" title="Link to this heading">#</a></h2>
<p>This function requires a completely different implementation compared to the one described in the Held and Karp paper.</p>
<p>The basic idea in both my implementation for directed graphs and the description for undirected graphs is finding edges which are substitutes for each other, or an edge outside the 1-arborescence which can replace an edge in the arborescence and will result in a 1-arborescence.</p>
<p>The undirected version uses the idea of fundamental cycles in the tree to find the substitutes, and I tried to use this idea as well with the <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.cycles.find_cycle.html"><code>find_cycle()</code></a> function in the NetworkX library.
I executed the first iteration of the ascent method by hand and noticed that what I computed for all of the possible values of $\epsilon$ and what the program found did not match.
I had found several that it had missed and it found several that I missed.
For the example graph, I found that the following edge pairs are substitutes where the first edge is not in the 1-arborescence and the second one is the one in the 1-arborescence which it can replace using the below minimum 1-arborescence.</p>
<center><img src="minimum-1-arborescence.png" width=350 alt="1-arborescence after the first iteration of the ascent method"/></center>
<p>$$
\begin{array}{l}
(0, 1) \rightarrow (2, 1) \text{ valid: } \epsilon = 56 \\\
(0, 2) \rightarrow (4, 2) \text{ valid: } \epsilon = 25 \\\
(0, 3) \rightarrow (2, 3) \text{ valid: } \epsilon = 52 \\\
(0, 5) \rightarrow (1, 5) \text{ valid: } \epsilon = \frac{30 - 52}{0 - 0} \text{, not valid} \\\
(1, 3) \rightarrow (2, 3) \text{ valid: } \epsilon = 15.5 \\\
(2, 5) \rightarrow (1, 5) \text{ valid: } \epsilon = 5.5 \\\
(3, 1) \rightarrow (2, 1) \text{ valid: } \epsilon = 5.5 \\\
(3, 5) \rightarrow (1, 5) \text{ valid: } \epsilon = \frac{30 - 46}{-1 + 1} \text{, not valid} \\\
(4, 1) \rightarrow (2, 1) \text{ valid: } \epsilon = \frac{41 - 90}{1 - 1} \text{, not valid} \\\
(4, 3) \rightarrow (2, 3) \text{ valid: } \epsilon = \frac{30 - 95}{1 - 1} \text{, not valid} \\\
(4, 5) \rightarrow (1, 5) \text{ valid: } \epsilon = -25.5 \text{, not valid (negative }\epsilon) \\\
(5, 3) \rightarrow (2, 3) \text{ valid: } \epsilon = 25 \\\
\end{array}
$$</p>
<p>I missed the following substitutes which the program did find.</p>
<p>$$
\begin{array}{l}
(1, 0) \rightarrow (4, 0) \text{ valid: } \epsilon = 80 \\\
(1, 4) \rightarrow (0, 4) \text{ valid: } \epsilon = 73 \\\
(2, 0) \rightarrow (4, 0) \text{ valid: } \epsilon = \frac{17 - 60}{1 - 1} \text{, not valid} \\\
(2, 4) \rightarrow (0, 4) \text{ valid: } \epsilon = -18 \text{, not valid (negative }\epsilon) \\\
(3, 0) \rightarrow (4, 0) \text{ valid: } \epsilon = 28 \\\
(3, 4) \rightarrow (0, 4) \text{ valid: } \epsilon = 78 \\\
(5, 0) \rightarrow (4, 0) \text{ valid: } \epsilon = 35 \\\
(5, 4) \rightarrow (0, 4) \text{ valid: } \epsilon = \frac{17 - 81}{0 - 0} \text{, not valid} \\\
\end{array}
$$</p>
<p>Notice that some substitutions do not cross over if we move in the direction of ascent; these are the pairs which have a zero denominator.
Additionally, $\epsilon$ is a distance, and a negative distance does not make sense here.
Interpreting a negative distance as a positive distance in the opposite direction: if we needed to move that way, the direction of ascent vector would be pointing the other way.</p>
<p>The reason that my list did not match the program&rsquo;s was that <code>find_cycle()</code> did not always return the fundamental cycle containing the new edge.
If I called <code>find_cycle()</code> on a vertex in the other cycle in the graph (in this case $\{(0, 4), (4, 0)\}$), it would return that cycle rather than the true fundamental cycle.</p>
<p>This prompted me to think about what really determines whether edges in a 1-arborescence are substitutes for each other.
In every case where a substitute was valid, both of those edges led to the same vertex.
If they did not, then the degree constraint of the arborescence would be violated, because we did not replace the edge leading into a node with another edge leading into the same node.
This is true regardless of whether the edges are part of the same fundamental cycle.</p>
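<p>The pairing logic boils down to a few lines; this sketch (hypothetical name, arc lists as input) shows the same-head rule before any of the follow-up validity checks:</p>

```python
def substitute_pairs(k_edges, all_edges):
    """Pair each arc outside the 1-arborescence `k_edges` with the arc of
    the 1-arborescence entering the same head vertex; only such pairs can
    be swapped without breaking the in-degree constraint."""
    into = {v: (u, v) for u, v in k_edges}  # unique incoming arc per vertex
    in_k = set(k_edges)
    return [((u, v), into[v]) for u, v in all_edges
            if (u, v) not in in_k and v in into]
```
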
<p>Thus, <code>find_epsilon()</code> now takes every edge in the graph but not in the chosen 1-arborescence $k \in K(\pi, d)$, finds the other edge in $k$ pointing to the same vertex, swaps them, and then checks that the degree constraint is not violated, that the result has the correct number of edges, and that it is still connected.
This is a more efficient method, and it found more valid substitutions as well, so I was hopeful that it would finally bring the returned solution down to the optimal one, perhaps because the correct value of $\epsilon$ had been missed on even just one of the iterations.</p>
<p>It did not.</p>
<h2 id="next-steps">Next Steps<a class="headerlink" href="#next-steps" title="Link to this heading">#</a></h2>
<p>At this point I have no clear course forward, only two unappealing options.</p>
<ul>
<li>I found the problem with <code>find_epsilon()</code> by executing the first iteration of the ascent method by hand. It took about 90 minutes.
I could try to continue this process and hope that while iteration 1 is executing correctly I find some other bug in the code, but I doubt that I will ever reach the 9 iterations the program needs
to find the faulty solution.</li>
<li>Move on to the branch and bound part of the Held-Karp relaxation.
My hope is that because Held and Karp give a complete execution of the branch and bound method that I will be able to use that to trace a complete execution of the relaxation and find the flaw in
the ascent method that way.</li>
</ul>
<p>I will be discussing the next steps with my GSoC mentors soon.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>Held, M., Karp, R.M. <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Implementing the Iterators]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="related" type="text/html" title="Finding all Minimum Arborescences" />
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/implementing-the-iterators/</id>
            
            
            <published>2021-06-10T00:00:00+00:00</published>
            <updated>2021-06-10T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Implementation details about SpanningTreeIterator and ArborescenceIterator</blockquote><p>We are coming to the end of the first week of coding for the Summer of Code, and I have implemented two new, but related, features in NetworkX.
In this post, I will discuss how I implemented them, some of the challenges, and how I tested them.
Those two new features are a spanning tree iterator and a spanning arborescence iterator.</p>
<p>The arborescence iterator is the feature that I will be using directly in my GSoC project, but I thought that it was a good idea to implement the spanning tree iterator first, as it would be easier and I could refer directly back to the research paper as needed.
The partition schemes of the two are the same, so once I figured it out for spanning trees, what I learned there would port directly into the arborescence iterator, where I could focus on modifying Edmonds&rsquo; algorithm to respect the partition.</p>
<h2 id="spanning-tree-iterator">Spanning Tree Iterator<a class="headerlink" href="#spanning-tree-iterator" title="Link to this heading">#</a></h2>
<p>This was the first of the new features.
It follows the algorithm detailed in a paper by Sörensen and Janssens from 2005 titled <em>An Algorithm to Generate all Spanning Trees of a Graph in Order of Increasing Cost</em> which can be found <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en&amp;format=pdf">here</a> [2].</p>
<p>Now, I needed to tweak the implementation of the algorithm because I wanted to implement a python iterator, so somebody can write</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">for</span> <span class="n">tree</span> <span class="ow">in</span> <span class="n">nx</span><span class="o">.</span><span class="n">SpanningTreeIterator</span><span class="p">(</span><span class="n">G</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>and that loop would return spanning trees starting with the ones of minimum cost and climbing to the ones of maximum cost.</p>
<p>In order to implement this feature, my first step was to ensure that, once I knew what the edge partition of the graph was, I could find a minimum spanning tree which respected the partition.
As a brief reminder, the edge partition creates two disjoint sets of edges, one of which <em>must</em> appear in the resulting spanning tree and one of which <em>cannot</em> appear in it.
Edges which are neither included nor excluded from the spanning tree are called open.</p>
<p>The easiest algorithm with which to implement this is Kruskal&rsquo;s algorithm.
The included edges are all added to the spanning tree first, and then the algorithm can join the components created by the included edges using the open edges.</p>
<p>This was easy to implement in NetworkX.
The Kruskal&rsquo;s algorithm in NetworkX is a generator which returns the edges in the minimum spanning tree one at a time using a sorted list of edges.
All that I had to do was change the sorting process so that the included edges were always at the front of that list; then the algorithm would always select them for the spanning tree, regardless of weight.</p>
<p>Additionally, since the ordinary minimum spanning tree of a graph is just the partitioned tree whose partition has no included or excluded edges, I was able to convert the normal Kruskal&rsquo;s implementation into a wrapper for my partition-respecting one in order to reduce redundant code.</p>
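<p>As a rough sketch of this idea (not the NetworkX code itself; the <code>EdgePartition</code> labels, the union-find helper, and the edge-tuple format are illustrative assumptions), the partition-respecting Kruskal&rsquo;s algorithm might look like:</p>

```python
# Sketch of Kruskal's algorithm respecting an edge partition: included
# edges sort before all open edges (so they are always chosen first),
# and excluded edges are never considered.
from enum import Enum


class EdgePartition(Enum):
    OPEN = 0
    INCLUDED = 1
    EXCLUDED = 2


def _find(parent, u):
    # Union-find root lookup with path halving.
    while parent[u] != u:
        parent[u] = parent[parent[u]]
        u = parent[u]
    return u


def partition_kruskal(nodes, edges):
    """Return a spanning tree respecting the partition.

    ``edges`` is a list of ``(u, v, weight, label)`` tuples.
    """
    parent = {v: v for v in nodes}
    usable = [e for e in edges if e[3] != EdgePartition.EXCLUDED]
    # Included edges first regardless of weight, then open edges by weight.
    usable.sort(key=lambda e: (e[3] != EdgePartition.INCLUDED, e[2]))
    tree = []
    for u, v, w, _label in usable:
        ru, rv = _find(parent, u), _find(parent, v)
        if ru != rv:  # adding this edge does not create a cycle
            parent[ru] = rv
            tree.append((u, v, w))
    return tree
```

<p>Making the ordinary minimum spanning tree a thin wrapper is then just a call with every edge labeled open.</p>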
<p>As for the partitioning process itself, that proved to be a bit more tricky, mostly due to my own limited python experience.
(I have only been working with python since the start of the calendar year.)
In order to implement the partitioning scheme I needed an ordered data structure, and I chose the <a href="https://docs.python.org/3/library/queue.html"><code>PriorityQueue</code></a> class.
This was convenient, but for elements whose minimum spanning trees had the same weight it tried to compare the dictionaries holding the edge data, which is not a supported operation.
Thus, I implemented a dataclass in which only the weight of the spanning tree is comparable.
This means that for ties in spanning tree weight, the oldest partition with that weight is considered first.</p>
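<p>The pattern I used is the standard <code>dataclasses</code> trick of marking the non-comparable field with <code>field(compare=False)</code>; the field names below are illustrative, not the actual iterator internals:</p>

```python
# Sketch of a queue element where only the spanning tree weight takes
# part in comparisons, so PriorityQueue never compares the edge-data
# dictionaries of tied partitions.
from dataclasses import dataclass, field
from queue import PriorityQueue


@dataclass(order=True)
class Partition:
    mst_weight: float
    partition: dict = field(compare=False)  # edge -> inclusion status


queue = PriorityQueue()
queue.put(Partition(19, {(1, 2): "included"}))
queue.put(Partition(17, {(0, 1): "excluded"}))
queue.put(Partition(19, {(2, 3): "included"}))  # tie: no dict comparison
print(queue.get().mst_weight)  # prints 17, the cheapest partition
```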
<p>Once the implementation details were ironed out, I moved on to testing.
At the time of this writing, I have tested the <code>SpanningTreeIterator</code> on the sample graph in the Sörensen and Janssens paper.
That graph is</p>
<center><img src="tree-example.png" alt="Example Graph"/></center>
<p>It has eight spanning trees, ranging in weight from 17 to 23 which are all shown below.</p>
<center>
<img src="eight-spanning-trees-1.png" alt="Four of the eight spanning trees on the sample graph"/>
<img src="eight-spanning-trees-2.png" alt="Four of the eight spanning trees on the sample graph"/>
</center>
<p>Since this graph only has a few spanning trees, it was easy to explicitly test that each tree returned from the iterator was the next one in the sequence.
The iterator also works backwards, so calling</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">for</span> <span class="n">tree</span> <span class="ow">in</span> <span class="n">nx</span><span class="o">.</span><span class="n">SpanningTreeIterator</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">minimum</span><span class="o">=</span><span class="kc">False</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>starts with the maximum spanning tree and works down to the minimum spanning tree.</p>
<p>The code for the spanning tree iterator can be found <a href="https://github.com/mjschwenne/networkx/blob/bothTSP/networkx/algorithms/tree/mst.py">here</a> starting around line 761.</p>
<h2 id="arborescence-iterator">Arborescence Iterator<a class="headerlink" href="#arborescence-iterator" title="Link to this heading">#</a></h2>
<p>The arborescence iterator is what I actually need for my GSoC project and, as expected, it was more complicated to implement.
In my original post titled <a href="../finding-all-minimum-arborescences">Finding All Minimum Arborescences</a>, I discussed cases that Edmonds&rsquo; algorithm [1] would need to handle and proposed a change to the <code>desired_edge</code> method.</p>
<p>These changes were easy to make but, contrary to what I originally thought, they were not the full extent of the changes needed.
The original graph from Edmonds&rsquo; 1967 paper is below.</p>
<center><img src="digraph-example.png" alt="An example directed graph"/></center>
<p>In my first test, which was limited to the minimum spanning arborescence of a random partition I created, the results were close.
Below, the blue edges are included and the red one is excluded.</p>
<center><img src="digraph-partition.png" alt="A small partition on the edges of the example graph"/></center>
<p>The minimum spanning arborescence initially is shown below.</p>
<center><img src="digraph-partition-msa.png" alt="Minimum spanning arborescence respecting the above partition"/></center>
<p>While the $(3, 0)$ edge is properly excluded and the $(2, 3)$ edge is included, the $(6, 2)$ edge is not present in the arborescence (shown as a dashed edge).
Tracking this problem down was a hassle, but the way Edmonds&rsquo; algorithm works, the cycle which would have been present if the $(6, 2)$ edge were included is collapsed into a single vertex as the algorithm moves to the next iteration.
Once that cycle is collapsed into a vertex, the algorithm still has to choose how to access that vertex, and the choice is based on the best edge as before (this is step I1 in [1]).
Then, when the algorithm expands the cycle back out, it will remove the edge which is</p>
<ul>
<li>Wholly contained inside the cycle and,</li>
<li>Directed towards the vertex which is the &lsquo;access point&rsquo; for the cycle.</li>
</ul>
<p>In this case, that edge would be $(6, 2)$, shown in red in the next image.
Represented visually, the cycle with its incoming edges would look like</p>
<center><img src="digraph-cycle.png" alt="Problematic cycle with partition edges"/></center>
<p>And that would be collapsed into a new vertex, $N$ from which the incoming edge with weight 12 would be selected.</p>
<center><img src="digraph-collapsed-cycle.png" alt="Same cycle after the Edmonds algorithm collapsed it"/></center>
<p>In this example we want to forbid the algorithm from picking the edge with weight 12, so that when the cycle is reconstructed the included edge $(6, 2)$ is still present.
Once we make one of the incoming edges an included edge, we know from the definition of an arborescence that we cannot reach that vertex by any other edge.
They are all effectively excluded, so once we find an included edge directed towards a vertex we can make all of the other incoming edges excluded.</p>
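<p>That rule is simple to state in code. A minimal sketch (the label strings and the arc-dictionary representation are assumptions for illustration):</p>

```python
# Sketch of the rule above: once an arc into a vertex is included, every
# other open arc into that vertex can safely be marked excluded, because
# an arborescence allows at most one incoming arc per vertex.

def propagate_inclusions(arcs):
    """``arcs`` maps (u, v) to 'included', 'excluded' or 'open'."""
    has_included = {v for (u, v), label in arcs.items() if label == "included"}
    for (u, v), label in arcs.items():
        if label == "open" and v in has_included:
            arcs[(u, v)] = "excluded"
    return arcs
```

<p>Applied to the example, including $(6, 2)$ immediately excludes every other arc directed into vertex 2, which is exactly the behavior we want when the cycle is collapsed.</p>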
<p>Returning to the example, the collapsed vertex $N$ would have the edge of weight 12 excluded and would pick the edge of weight 13.</p>
<center><img src="digraph-collapsed-forbidden-cycle.png" alt="Solution to tracking bad cycles in the arborescence"/></center>
<p>At this point the iterator would find 236 arborescences with cost ranging from 96 to 125.
I thought that I was very close to being finished and I knew that the cost of the minimum spanning arborescence was 96, until I checked to see what the weight of the maximum spanning arborescence was: 131.</p>
<p>This meant that I was removing partitions which contained a valid arborescence before they were added to the priority queue.
My <code>check_partition</code> method within the <code>ArborescenceIterator</code> was doing the following:</p>
<ul>
<li>Count the number of included and excluded incoming edges for each vertex.</li>
<li>Save all of the included edges to a list to be checked for cycles.</li>
<li>If there was more than one included edge or all of the edges were excluded, return <code>False</code>.</li>
<li>If there was one included edge, make all of the others excluded.</li>
</ul>
<p>Rather than try to debug what I thought was a good method, I decided to change my process.
I moved the last bullet point into the <code>write_partition</code> method and then stopped using the <code>check_partition</code> method.
If an edge partition does not have a spanning arborescence, the <code>partition_spanning_arborescence</code> function will return <code>None</code> and I discard the partition.
This approach is more computationally intensive, but it increased the number of returned spanning arborescences from 236 to 680, and the range expanded to the proper 96 to 131.</p>
<p>But how do I know that it isn&rsquo;t skipping arborescences within that range?
Since 680 arborescences are too many to check explicitly, I decided to write another test case.
This one would check that the number of arborescences was correct and that the sequence of weights never decreases.</p>
<p>In order to check the number of arborescences, I decided to take a brute force approach.
There are</p>
<p>$$
\binom{18}{8} = 43,758
$$</p>
<p>possible combinations of edges which could be arborescences.
That&rsquo;s a lot of combintation, more than I wanted to check by hand so I wrote a short python script.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">combinations</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="nn">nx</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">edgelist</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">combo_count</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="n">arbor_count</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">combo</span> <span class="ow">in</span> <span class="n">combinations</span><span class="p">(</span><span class="n">edgelist</span><span class="p">,</span> <span class="mi">8</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">combo_count</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">    <span class="n">combo_test</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">DiGraph</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">combo_test</span><span class="o">.</span><span class="n">add_edges_from</span><span class="p">(</span><span class="n">combo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">nx</span><span class="o">.</span><span class="n">is_arborescence</span><span class="p">(</span><span class="n">combo_test</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">arbor_count</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="sa">f</span><span class="s2">&#34;There are </span><span class="si">{</span><span class="n">combo_count</span><span class="si">}</span><span class="s2"> possible combinations of eight edges which &#34;</span>
</span></span><span class="line"><span class="cl">    <span class="sa">f</span><span class="s2">&#34;could be an arborescence.&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;Of those </span><span class="si">{</span><span class="n">combo_count</span><span class="si">}</span><span class="s2"> combinations, </span><span class="si">{</span><span class="n">arbor_count</span><span class="si">}</span><span class="s2"> are arborescences.&#34;</span><span class="p">)</span></span></span></code></pre>
</div>
<p>The output of this script is</p>

<div class="highlight">
  <pre>There are 43758 possible combinations of eight edges which could be an arborescence.
Of those 43758 combinations, 680 are arborescences.</pre>
</div>

<p>So now I know how many arborescences were in the graph, and it matched the number returned from the iterator.
Thus, I believe that the iterator is working well.</p>
<p>The iterator code is <a href="https://github.com/mjschwenne/networkx/blob/bothTSP/networkx/algorithms/tree/branchings.py">here</a> and starts around line 783.
It can be used in the same way as the spanning tree iterator.</p>
<p><a href="https://mjschwenne.github.io/assets/iterator-output.pdf">Attached</a> is a sample output from the iterator detailing all 680 arborescences of the test graph.
Since Jekyll will not let me put up the txt file, I had to convert it into a pdf, which runs 127 pages to show the 6800 lines of output listing all of the arborescences.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] J. Edmonds, <em>Optimum Branchings</em>, Journal of Research of the National Bureau of Standards, 1967, Vol. 71B, p.233-240, <a href="https://archive.org/details/jresv71Bn4p233">https://archive.org/details/jresv71Bn4p233</a></p>
<p>[2] G.K. Janssens, K. Sörensen, <em>An algorithm to generate all spanning trees in order of increasing cost</em>, Pesquisa Operacional, 2005-08, Vol. 25 (2), p. 219-229, <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en">https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Finding all Minimum Arborescences]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="related" type="text/html" title="A Closer Look at the Held-Karp Relaxation" />
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/finding-all-minimum-arborescences/</id>
            
            
            <published>2021-06-05T00:00:00+00:00</published>
            <updated>2021-06-05T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Exploring an algorithm to generate arborescences in ascending order</blockquote><p>There is only one thing that I need to figure out before the first coding period for GSoC starts on Monday: how to find <em>all</em> of the minimum arborescences of a graph.
This is the set $K(\pi)$ in the Held and Karp paper from 1970 which can be refined down to $K(\pi, d)$ or $K_{X, Y}(\pi)$ as needed.
For more information as to why I need to do this, please see my last post <a href="../a-closer-look-at-held-karp">here</a>.</p>
<p>This is a place where my contributions to NetworkX to implement the Asadpour algorithm [1] for the directed traveling salesman problem will be useful to the rest of the NetworkX community (I hope).
The research paper that I am going to template this off of is <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en&amp;format=pdf">this</a> 2005 paper by Sörensen and Janssens titled <em>An Algorithm to Generate all Spanning Trees of a Graph in Order of Increasing Cost</em> [4].</p>
<p>The basic idea here is to implement their algorithm and then generate spanning trees until we find the first one with a cost greater than that of the first one generated, which we know is a minimum, so that we have found all of the minimum spanning trees.
I know what you guys are saying: &ldquo;Matt, this paper discusses <em>spanning trees</em>, not spanning arborescences, how is this helpful?&rdquo;
Well, the heart of this algorithm is to partition the edges of the graph into excluded edges which cannot appear in the tree, included edges which must appear in the tree, and open edges which can be, but are not required to be, in the tree.
Once we have a partition, we need to be able to find a minimum spanning tree or minimum spanning arborescence that respects the partitioned edges.</p>
<p>In NetworkX, the minimum spanning arborescences are generated using Chu-Liu/Edmonds’ Algorithm developed by Yoeng-Jin Chu and Tseng-Hong Liu in 1965 and independently by Jack Edmonds in 1967.
I believe that Edmonds&rsquo; Algorithm [2] can be modified to require an arc to be either included or excluded from the resulting spanning arborescence, thus allowing me to implement Sörensen and Janssens&rsquo; algorithm for directed graphs.</p>
<p>First, let&rsquo;s explore whether the partition scheme discussed in the Sörensen and Janssens paper [4] will work for a directed graph.
The critical ideas for creating the partitions are given on pages 221 and 222 and are as follows:</p>
<blockquote>
<p>Given an MST of a partition, this partition can be split into a set of resulting partitions in such a way that the following statements hold:</p>
<ul>
<li>the intersection of any two resulting partitions is the empty set,</li>
<li>the MST of the original partition is not an element of any of the resulting partitions,</li>
<li>the union of the resulting partitions is equal to the original partition, minus the MST of the original partition.</li>
</ul>
</blockquote>
<p>In order to achieve these conditions, they define the generation of the partitions using this definition for a minimum spanning tree</p>
<p>$$
s(P) = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), \dots, (t_{n-r-1}, v_{n-r-1})\}
$$</p>
<p>where the $(i, j)$ edges are the included edges of the original partition and the $(t, v)$ are from the open edges of the original partition.
Now, to create the next set of partitions, take the $(t, v)$ edges sequentially and introduce them one at a time, making each edge an excluded edge in the first partition it appears in and an included edge in all subsequent partitions.
This will produce something to the effect of</p>
<p>$$
\begin{array}{l}
P_1 = \{(i_1, j_1), \dots, (i_r, j_r), (\overline{m_1, p_1}), \dots, (\overline{m_l, p_l}), (\overline{t_1, v_1})\} \\\
P_2 = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), (\overline{m_1, p_1}), \dots, (\overline{m_l, p_l}), (\overline{t_2, v_2})\} \\\
P_3 = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), (t_2, v_2), (\overline{m_1, p_1}), \dots, (\overline{m_l, p_l}), (\overline{t_3, v_3})\} \\\
\vdots \\\
\begin{multline*}
P_{n-r-1} = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), \dots, (t_{n-r-2}, v_{n-r-2}), (\overline{m_1, p_1}), \dots, (\overline{m_l, p_l}), \\\
(\overline{t_{n-r-1}, v_{n-r-1}})\}
\end{multline*} \\\
\end{array}
$$</p>
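<p>The splitting rule can also be sketched as a short function. Here a partition is represented as a pair of included and excluded edge sets, which is an illustrative choice rather than the paper&rsquo;s notation:</p>

```python
# Sketch of the partition-splitting rule: walking over the open edges
# (t_1, v_1), ... used by the minimum spanning tree, each child
# partition P_i excludes edge t_i and includes edges t_1 .. t_{i-1}.

def split_partition(included, excluded, tree_open_edges):
    """Return the child partitions as (included, excluded) set pairs."""
    children = []
    for i, edge in enumerate(tree_open_edges):
        new_included = set(included) | set(tree_open_edges[:i])
        new_excluded = set(excluded) | {edge}
        children.append((new_included, new_excluded))
    return children
```

<p>Each child forbids one edge of the tree, so the original tree belongs to none of them, while any other spanning tree of the parent partition lands in the child that excludes an edge it does not use.</p>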
<p>Now, if we extend this to a directed graph, our included and excluded edges become included and excluded arcs, but the definition of the spanning arborescence of a partition does not change.
Let $s_a(P)$ be the minimum spanning arborescence of a partition $P$.
Then</p>
<p>$$
s_a(P) = \{(i_1, j_1), \dots, (i_r, j_r), (t_1, v_1), \dots, (t_{n-r-1}, v_{n-r-1})\}
$$</p>
<p>$s_a(P)$ is still constructed from all of the included arcs of the partition and a subset of the open arcs of that partition.
If we partition in the same manner as the Sörensen and Janssens paper [4], then there cannot be a spanning tree which both includes and excludes a given edge, and this conflict exists for every pair of the resulting partitions, so their intersections are empty.</p>
<p>Clearly the original arborescence, which includes all of the edges $(t_1, v_1), \dots, (t_{n-r-1}, v_{n-r-1})$, cannot be an element of any of the resulting partitions.</p>
<p>Finally, there is the claim that the union of the resulting partitions is the original partition minus the original minimum spanning tree.
Being honest here, this claim took a while for me to understand.
In fact, I had a whole paragraph talking about how this claim doesn&rsquo;t make sense before I suddenly realized that it does.
The important thing to remember is that the union of all of the partitions isn&rsquo;t the union of their sets of included and excluded edges (which is where I went wrong the first time); each partition is a set of spanning trees.
The original partition contains many spanning trees, one or more of which are minimum, and each tree in the partition is a unique subset of the edges of the original graph.
Now, because each of the resulting partitions excludes one of the edges of the original partition&rsquo;s minimum spanning tree, we know that the original minimum spanning tree is <em>not</em> an element of the union of the resulting partitions.
However, every other spanning tree in the original partition differs from the selected minimum one by at least one edge, so it is a member of at least one of the resulting partitions, specifically the one whose excluded edge is an edge of the selected minimum spanning tree which that tree does not contain.</p>
<p>So now we know that this same partition scheme which works for undirected graphs will work for directed ones.
We need to modify Edmonds’ algorithm to mandate that certain arcs be included and others excluded.
To start, a review of this algorithm is in order.
The original description of the algorithm is given on pages 234 and 235 of Jack Edmonds&rsquo; 1967 paper <em>Optimum Branchings</em> [2] and roughly speaking it has three major steps.</p>
<ol>
<li>For each vertex $v$, find the incoming arc with the smallest weight and place that arc in a bucket $E^i$ and the vertex in a bucket $D^i$.
Repeat this step until either (a) $E^i$ no longer qualifies as a branching or (b) all vertices of the graph are in $D^i$.
If (a) occurs, go to step 2, otherwise go to step 3.</li>
<li>If $E^i$ no longer qualifies as a branching then it must contain a cycle.
Contract all of the vertices of the cycle into one new one, say $v_1^{i + 1}$.
Every edge which has one endpoint in the cycle has that endpoint replaced with $v_1^{i + 1}$ and its cost updated.
Using this new graph $G^{i + 1}$, create buckets $D^{i + 1}$ containing the nodes in both $G^{i + 1}$ and $D^i$, and $E^{i + 1}$ containing the edges in both $G^{i + 1}$ and $E^i$
(i.e. remove the edges and vertices which are affected by the creation of $G^{i + 1}$).
Return to step 1 and apply it to graph $G^{i + 1}$.</li>
<li>Once this step is reached, we have a smaller graph for which we have found a minimum spanning arborescence.
Now we need to un-contract all of the cycles to return to the original graph.
To do this, check whether the node $v_1^{i + 1}$ is the root of the arborescence or not.
<ul>
<li>$v_1^{i + 1}$ is the root: Remove the arc of maximum weight from the cycle represented by $v_1^{i + 1}$.</li>
<li>$v_1^{i + 1}$ is not the root: There is a single arc directed towards $v_1^{i + 1}$ which translates into an arc directed to one of the vertices in the cycle represented by $v_1^{i + 1}$.
Because $v_1^{i + 1}$ represents a cycle, there is another arc wholly internal to the cycle which is directed into the same vertex as the incoming edge to the cycle.
Delete the internal one to break the cycle.
Repeat until the original graph has been restored.</li>
</ul>
</li>
</ol>
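<p>As a sanity check on step 1, the greedy selection can be sketched in a few lines of plain Python (the arc-list representation and function name here are my own illustration, not NetworkX code):</p>

```python
# Sketch of step 1 for a *minimum* arborescence: for each non-root
# vertex, keep the minimum-weight incoming arc.  Arcs are (u, v, weight)
# tuples and the graph below is toy data.
def cheapest_incoming_arcs(arcs, root):
    """Return {v: (u, v, w)}, i.e. the bucket E^i of selected arcs."""
    best = {}
    for u, v, w in arcs:
        if v == root:
            continue  # the root of an arborescence has no incoming arc
        if v not in best or w < best[v][2]:
            best[v] = (u, v, w)
    return best


arcs = [(0, 1, 5), (0, 2, 1), (2, 1, 2), (1, 2, 4)]
selected = cheapest_incoming_arcs(arcs, root=0)
```

<p>If the selected arcs contain a cycle, step 2 contracts it and step 1 runs again on the contracted graph.</p>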
<p>Now that we are familiar with the minimum arborescence algorithm, we can discuss modifying it to force it to include certain edges or reject others.
The changes will be primarily located in step 1.
Under the normal operation of the algorithm, the consideration which happens at each vertex might look like this.</p>
<center><img src="edmonds-normal.png" alt="Edmonds algorithm selecting an edge without restrictions"/></center>
<p>Here the bolded arrow is chosen by the algorithm, as it is the incoming arc with minimum weight.
Now, if we were required to include a different edge, say the weight 6 arc, we would want the algorithm to pick that arc even though doing so is, strictly speaking, not optimal.
Similarly, if the arc of weight 2 were excluded, we would also want to pick the arc of weight 6.
Below, the excluded arc is drawn as a dashed line.</p>
<center><img src="edmonds-one-required.png" alt="Edmonds algorithm forced to pick a non-optimal arc"/></center>
<p>Realistically, these are routine cases that would not be difficult to implement.
A more interesting case arises when all of the arcs are excluded, or when more than one is included.</p>
<center><img src="edmonds-all-excluded.png" alt="Edmonds algorithm which cannot pick any arc"/></center>
<p>In this case, there is no spanning arborescence for the partition because the graph is not connected.
The Sörensen and Janssens paper characterizes these as <em>empty</em> partitions, and they are ignored.</p>
<center><img src="edmonds-multiple-required.png" alt="Edmonds algorithm which must pick more than one arc"/></center>
<p>Here, things start to get a bit tricky.
With two (or more) included arcs leading to this vertex, the result is by definition not an arborescence, since according to Edmonds on page 233</p>
<blockquote>
<p>A branching is a forest whose edges are directed so that each is directed toward a different node. An arborescence is a connected branching.</p>
</blockquote>
<p>At first I thought this case might be valid because it could result in the creation of a cycle, but I realize now that in step 3 of Edmonds&rsquo; algorithm one of those arcs would be removed anyway.
Thus, any partition with multiple included arcs leading to a single vertex is empty by definition.
While there are ways in which the algorithm could handle the inclusion of multiple arcs, one (or more) of them would, by the definition of an arborescence, be deleted by the end of the algorithm.</p>
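<p>Screening for these trivially empty partitions before calling Edmonds&rsquo; algorithm could look something like the following sketch (the tuple representation and the status labels are illustrative, not an existing NetworkX interface):</p>

```python
# A partition that forces two or more included arcs into the same head
# vertex can never contain an arborescence, so it is empty by definition.
def forces_parallel_inclusions(partition_arcs):
    """partition_arcs: iterable of (u, v, status) tuples, where status is
    one of "included", "excluded" or "open" (labels chosen for this sketch).
    """
    seen_heads = set()
    for u, v, status in partition_arcs:
        if status == "included":
            if v in seen_heads:
                return True  # a second included arc points at vertex v
            seen_heads.add(v)
    return False
```

<p>Partitions for which this returns <code>True</code> would simply never be handed to the arborescence search.</p>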
<p>I propose that these partitions be screened out before we hand off to Edmonds&rsquo; algorithm to find the arborescences.
As such, Edmonds&rsquo; algorithm only needs to be modified to handle at most one included edge per vertex and any number of excluded edges per vertex.
The critical part of altering Edmonds&rsquo; Algorithm is contained within the <code>desired_edge</code> function in the NetworkX implementation starting on line 391 in <code>algorithms.tree.branchings</code>.
The whole function is as follows.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">desired_edge</span><span class="p">(</span><span class="n">v</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Find the edge directed toward v with maximal weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">edge</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl">    <span class="n">weight</span> <span class="o">=</span> <span class="o">-</span><span class="n">INF</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">data</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">in_edges</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">keys</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">new_weight</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">attr</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">new_weight</span> <span class="o">&gt;</span> <span class="n">weight</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">weight</span> <span class="o">=</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge</span> <span class="o">=</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">edge</span><span class="p">,</span> <span class="n">weight</span></span></span></code></pre>
</div>
<p>The function would be changed to automatically return an included arc and to skip over any excluded arcs.
Because this is an inner function, we can access parameters passed to the parent function, such as something along the lines of <code>partition=None</code>, where the value of <code>partition</code> names the edge attribute which is <code>true</code> if the arc is included and <code>false</code> if it is excluded.
Open edges would not need this attribute, or could use <code>None</code>.
Creating an enum is also possible, which would unify the language; I will talk to my GSoC mentors about how it would fit into the NetworkX ecosystem.
A revised version of <code>desired_edge</code> using the <code>true</code> and <code>false</code> scheme would then look like this:</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">desired_edge</span><span class="p">(</span><span class="n">v</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Find the edge directed toward v with maximal weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">edge</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl">    <span class="n">weight</span> <span class="o">=</span> <span class="o">-</span><span class="n">INF</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">data</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">in_edges</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">keys</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">new_weight</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">attr</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">data</span><span class="p">[</span><span class="n">partition</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">),</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">new_weight</span> <span class="o">&gt;</span> <span class="n">weight</span> <span class="ow">and</span> <span class="ow">not</span> <span class="n">data</span><span class="p">[</span><span class="n">partition</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">            <span class="n">weight</span> <span class="o">=</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge</span> <span class="o">=</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">edge</span><span class="p">,</span> <span class="n">weight</span></span></span></code></pre>
</div>
<p>And a version using the enum might look like</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">desired_edge</span><span class="p">(</span><span class="n">v</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Find the edge directed toward v with maximal weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">edge</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl">    <span class="n">weight</span> <span class="o">=</span> <span class="o">-</span><span class="n">INF</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">data</span> <span class="ow">in</span> <span class="n">G</span><span class="o">.</span><span class="n">in_edges</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">keys</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">new_weight</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">attr</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">data</span><span class="p">[</span><span class="n">partition</span><span class="p">]</span> <span class="ow">is</span> <span class="n">Partition</span><span class="o">.</span><span class="n">INCLUDED</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">),</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">new_weight</span> <span class="o">&gt;</span> <span class="n">weight</span> <span class="ow">and</span> <span class="n">data</span><span class="p">[</span><span class="n">partition</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="n">Partition</span><span class="o">.</span><span class="n">EXCLUDED</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">weight</span> <span class="o">=</span> <span class="n">new_weight</span>
</span></span><span class="line"><span class="cl">            <span class="n">edge</span> <span class="o">=</span> <span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">new_weight</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">edge</span><span class="p">,</span> <span class="n">weight</span></span></span></code></pre>
</div>
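<p>The <code>Partition</code> enum referenced above does not exist in NetworkX yet; a minimal sketch of what I have in mind is:</p>

```python
from enum import Enum


class Partition(Enum):
    """Hypothetical values for the partition edge attribute."""
    OPEN = 0      # the edge may be freely chosen or rejected
    INCLUDED = 1  # the edge must appear in every arborescence considered
    EXCLUDED = 2  # the edge may not appear in any arborescence considered
```

<p>An open edge could then carry <code>Partition.OPEN</code> rather than omitting the attribute entirely, which avoids missing-key lookups inside <code>desired_edge</code>.</p>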
<p>Once Edmonds&rsquo; algorithm has been modified to be able to use partitions, the pseudocode from the Sörensen and Janssens paper would be applicable.</p>

<div class="highlight">
  <pre>Input: Graph G(V, E) and weight function w
Output: Output_File (all spanning trees of G, sorted in order of increasing cost)

List = {A}
Calculate_MST(A)
while MST ≠ ∅ do
	Get partition Ps in List that contains the smallest spanning tree
	Write MST of Ps to Output_File
	Remove Ps from List
	Partition(Ps)</pre>
</div>

<p>And the corresponding <code>Partition</code> function being</p>

<div class="highlight">
  <pre>P1 = P2 = P
for each edge i in P do
	if i not included in P and not excluded from P then
		make i excluded from P1
	make i included in P2
		Calculate_MST(P1)
		if Connected(P1) then
			add P1 to List
		P1 = P2</pre>
</div>

<p>I would need to change the format of the first code block: I would like it to be a Python iterator so that a <code>for</code> loop can step through all of the spanning arborescences and stop once the cost increases, limiting the output to only minimum spanning arborescences.</p>
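<p>The shape of that iterator can be sketched independently of the arborescence machinery. In the sketch below, <code>minimum_tree</code> and <code>split</code> are placeholder callbacks standing in for the modified Edmonds&rsquo; algorithm and the <code>Partition</code> procedure; only the best-first bookkeeping is real:</p>

```python
import heapq
from itertools import count


def trees_in_increasing_order(root_partition, minimum_tree, split):
    """Yield (cost, tree) pairs in order of increasing cost.

    minimum_tree(partition) returns (cost, tree), or None for an empty
    partition; split(partition, tree) yields the child partitions from
    the Partition() procedure of Sorensen and Janssens.
    """
    tiebreak = count()  # prevents the heap from comparing partitions
    heap = []
    found = minimum_tree(root_partition)
    if found is not None:
        heapq.heappush(heap, (found[0], next(tiebreak), found[1], root_partition))
    while heap:
        cost, _, tree, partition = heapq.heappop(heap)
        yield cost, tree
        for child in split(partition, tree):
            found = minimum_tree(child)
            if found is not None:  # empty partitions are skipped
                heapq.heappush(heap, (found[0], next(tiebreak), found[1], child))
```

<p>A caller can then <code>break</code> out of its <code>for</code> loop as soon as the yielded cost exceeds that of the first tree, restricting the output to minimum spanning arborescences only.</p>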
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] A. Asadpour, M. X. Goemans, A. Madry, S. Oveis Gharan, and A. Saberi, <em>An O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061, <a href="https://homes.cs.washington.edu/~shayan/atsp.pdf">https://homes.cs.washington.edu/~shayan/atsp.pdf</a>.</p>
<p>[2] J. Edmonds, <em>Optimum Branchings</em>, Journal of Research of the National Bureau of Standards, 1967, Vol. 71B, pp. 233-240, <a href="https://archive.org/details/jresv71Bn4p233">https://archive.org/details/jresv71Bn4p233</a></p>
<p>[3] M. Held and R. M. Karp, <em>The traveling-salesman problem and minimum spanning trees</em>, Operations Research, 1970, Vol. 18 (6), pp. 1138-1162, <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
<p>[4] G. K. Janssens and K. Sörensen, <em>An algorithm to generate all spanning trees in order of increasing cost</em>, Pesquisa Operacional, 2005, Vol. 25 (2), pp. 219-229, <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en">https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[A Closer Look at the Held-Karp Relaxation]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="related" type="text/html" title="NetworkX Function Stubs" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/a-closer-look-at-held-karp/</id>
            
            
            <published>2021-06-03T00:00:00+00:00</published>
            <updated>2021-06-03T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Looking for a new method to solve the Held-Karp relaxation from the original Held and Karp paper</blockquote><p>After talking with my GSoC mentors about what we all believe to be the most difficult part of the Asadpour algorithm, the Held-Karp relaxation, we came to several conclusions:</p>
<ul>
<li>The Asadpour paper recommends using the ellipsoid method so that their algorithm runs in polynomial time.
We do not need a polynomial-time algorithm, just one with reasonable execution time.
The ellipsoid algorithm versus the simplex algorithm is an example of this distinction:
while the simplex algorithm has exponential worst-case complexity, in practice it is almost always faster than the ellipsoid algorithm.</li>
<li>Our interest in the ellipsoid algorithm was not based on performance, but rather the ability for the ellipsoid algorithm to be able to handle a linear program with an exponential number of constraints.
This was done with a separation oracle, see my post <a href="../held-karp-separation-oracle">here</a> for more information about the oracle.</li>
<li>Implementing a robust ellipsoid algorithm solver (something notably missing from the scientific Python ecosystem) would be a GSoC project unto itself and is beyond the scope of this project for NetworkX.</li>
</ul>
<p>Thus, alternative methods for solving the Held-Karp relaxation needed to be investigated.
To this end, we turned to the original 1970 paper by Held and Karp, <em>The Traveling Salesman Problem and Minimum Spanning Trees</em> to see how they proposed solving the relaxation (Note that this paper was published before the ellipsoid algorithm was applied to linear programming in 1979).
The Held and Karp paper discusses three methods for solving the relaxation:</p>
<ul>
<li><strong>Column Generating:</strong> An older method of solving very large linear programs where only the variables that influence the optimal solution need to be examined.</li>
<li><strong>Ascent Method:</strong> A method based around maximizing the dual of the linear program which is best described as seeking the direction of ascent for the objective function in a similar way to the notion of a gradient in multivariate calculus.</li>
<li><strong>Branch and Bound:</strong> This method has the most theoretical benefits and seeks to augment the ascent method to avoid the introduction of fractional weights which are the largest contributors to a slow convergence rate.</li>
</ul>
<p>But before we explore the methods that Held and Karp discuss, we need to ensure that these methods still apply to solving the Held-Karp relaxation within the context of the Asadpour paper.
The definition of the Held-Karp relaxation that I have been using on this blog comes from the Asadpour paper, section 3 and is listed below.</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{a} c(a)x_a \\\
\text{s.t.} &amp; x(\delta^+(U)) \geqslant 1 &amp; \forall\ U \subset V \text{ and } U \not= \emptyset \\\
&amp; x(\delta^+(v)) = x(\delta^-(v)) = 1 &amp; \forall\ v \in V \\\
&amp; x_a \geqslant 0 &amp; \forall\ a
\end{array}
$$</p>
<p>The closest match to this program in the Held Karp paper is their linear program 3, which is a linear programming representation of the entire traveling salesman problem, not solely the relaxed version.
Note that Held and Karp were dealing with the symmetric TSP (STSP) while Asadpour is addressing the asymmetric or directed TSP (ATSP).</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{1 \leq i &lt; j \leq n} c_{i j}x_{i j} \\\
\text{s.t.} &amp; \sum_{j &gt; i} x_{i j} + \sum_{j &lt; i} x_{j i} = 2 &amp; (i = 1, 2, \dots, n) \\\
&amp; \sum_{i \in S\\\ j \in S\\\ i &lt; j} x_{i j} \leq |S| - 1 &amp; \text{for any proper subset } S \subset {2, 3, \dots, n} \\\
&amp; 0 \leq x_{i j} \leq 1 &amp; (1 \leq i &lt; j \leq n) \\\
&amp; x_{i j} \text{integer} \\\
\end{array}
$$</p>
<p>The last two constraints of the second linear program keep the variables correctly bounded and fit within the scope of the original problem, while the first two constraints do most of the work in finding a TSP tour.
Additionally, changing the last two constraints to $x_{i j} \geq 0$ <em>is</em> the Held-Karp relaxation.
The first constraint, $\sum_{j &gt; i} x_{i j} + \sum_{j &lt; i} x_{j i} = 2$, ensures that for every vertex in the resulting tour there is one edge to get there and one edge to leave by.
This matches the second constraint in the Asadpour ATSP relaxation.
The second constraint in the Held Karp formulation is another form of the subtour elimination constraint seen in the Asadpour linear program.</p>
<p>Held and Karp also state that</p>
<blockquote>
<p>In this section, we show that minimizing the gap $f(\pi)$ is equivalent to solving this program <em>without</em> the integer constraints.</p>
</blockquote>
<p>on page 1141, so it would appear that solving one of the equivalent programs that Held and Karp forumalate should work here.</p>
<h2 id="column-generation-technique">Column Generation Technique<a class="headerlink" href="#column-generation-technique" title="Link to this heading">#</a></h2>
<p>The Column Generation technique seeks to solve linear program 2 from the Held and Karp paper, stated as</p>
<p>$$
\begin{array}{c l}
\text{min} &amp; \sum_{k} c_ky_k \\\
\text{s.t.} &amp; y_k \geq 0 \\\
&amp; \sum_k y_k = 1 \\\
&amp; \sum_{i = 2}^{n - 1} (-v_{i k})y_k = 0 \\\
\end{array}
$$</p>
<p>Where $v_{i k}$ is the degree of vertex $i$ in 1-Tree $k$ minus two, or $v_{i k} = d_{i k} - 2$ and each variable $y_k$ corresponds to a 1-Tree $T^k$.
The associated cost $c_k$ for each tree is the weight of $T^k$.</p>
<p>The rest of this method uses a simplex algorithm to solve the linear program.
We only focus on the edges which are in each of the 1-Trees, giving each column the form</p>
<p>$$
\begin{bmatrix}
1 &amp; -v_{2k} &amp; -v_{3k} &amp; \dots &amp; -v_{n-1,k}
\end{bmatrix}^T
$$</p>
<p>and the column which enters the solution is the one corresponding to the 1-Tree for which $c_k + \theta + \sum_{j=2}^{n-1} \pi_jv_{j k}$ is a minimum, where $\theta$ and $\pi_j$ come from the vector of &lsquo;shadow prices&rsquo; given by $(\theta, \pi_2, \pi_3, \dots, \pi_{n-1})$.
The basis is now $(n - 1) \times (n - 1)$ and we can find the 1-Tree to add to it using a minimum 1-Tree algorithm, which Held and Karp say can be done in $O(n^2)$ steps.</p>
<p>I am already <a href="https://github.com/mjschwenne/GraphAlgorithms/blob/main/src/Simplex.py">familiar</a> with the simplex method, so I will not detail its implementation here.</p>
<h3 id="performance-of-the-column-generation-technique">Performance of the Column Generation Technique<a class="headerlink" href="#performance-of-the-column-generation-technique" title="Link to this heading">#</a></h3>
<p>This technique is slow to converge.
Held and Karp programmed it on an IBM/360 and were able to solve problems consistently for up to $n = 12$.
On a modern computer the clock rate is somewhere between 210 and 101,500 times faster (depending on the model of IBM/360 used), so we expect better performance, but cannot say at this time how large the improvement will be.</p>
<p>They also talk about a heuristic procedure in which a vertex is eliminated from the program whenever the choice of its adjacent vertices was &lsquo;evident&rsquo;.
Technical details for the heuristic were essentially non-existent, but
<blockquote>
<p>The procedure showed promise on examples up to $n = 48$, but was not explored systematically</p>
</blockquote>
<h2 id="ascent-method">Ascent Method<a class="headerlink" href="#ascent-method" title="Link to this heading">#</a></h2>
<p>This paper from Held and Karp is about minimizing $f(\pi)$ where $f(\pi)$ is the gap between the permuted 1-Trees and a TSP tour.
One way to do this is to maximize the dual of $f(\pi)$ which is written as $\text{max}_{\pi}\ w(\pi)$ where</p>
<p>$$
w(\pi) = \text{min}_k\ (c_k + \sum_{i=1}^{i=n} \pi_iv_{i k})
$$</p>
<p>This method uses the set of indices of 1-Trees that are of minimum weight with respect to the weights $\overline{c}_{i j} = c_{i j} + \pi_i + \pi_j$.</p>
<p>$$
K(\pi) = {k\ |\ w(\pi) = c_k + \sum_{i=1}^{i=n} \pi_i v_{i k}}
$$</p>
<p>If $\pi$ is not a maximum point of $w$, then there will be a vector $d$ called the direction of ascent at $\pi$.
This is theorem 3 and a proof is given on page 1148.
Let the functions $\Delta(\pi, d)$ and $K(\pi, d)$ be defined as below.</p>
<p>$$
\Delta(\pi, d) = \text{min}_{k \in K(\pi)}\ \sum_{i=1}^{i=n} d_iv_{i k} \\\
K(\pi, d) = {k\ |\ k \in K(\pi) \text{ and } \sum_{i=1}^{i=n} d_iv_{i k} = \Delta(\pi, d)}
$$</p>
<p>Now for a sufficiently small $\epsilon$, $K(\pi + \epsilon d) = K(\pi, d)$ and $w(\pi + \epsilon d) = w(\pi) + \epsilon \Delta(\pi, d)$, or the value of $w(\pi)$ increases and the growth rate of the minimum 1-Trees is at its smallest so we maintain the low weight 1-Trees and progress farther towards the optimal value.
Finally, let $\epsilon(\pi, d)$ be the following quantity</p>
<p>$$
\epsilon(\pi, d) = \text{max}\ {\epsilon\ |\text{ for } \epsilon&rsquo; &lt; \epsilon,\ K(\pi + \epsilon&rsquo;d) = K(\pi, d)}
$$</p>
<p>So in other words, $\epsilon(\pi, d)$ is the maximum distance in the direction of $d$ that we can travel to maintain the desired behavior.</p>
<p>If we can find $d$ and $\epsilon$ then we can set $\pi = \pi + \epsilon d$ and move to the next iteration of the ascent method.
Held and Karp did give a protocol for finding $d$ on page 1149.</p>
<ol>
<li>Set $d$ equal to the zero $n$-vector.</li>
<li>Find a 1-tree $T^k$ such that $k \in K(\pi, d)$.</li>
<li>If $\sum_{i=1}^{i=n} d_iv_{i k} &gt; 0$ STOP.</li>
<li>$d_i \leftarrow d_i + v_{i k},$ for $i = 2, 3, \dots, n$</li>
<li>GO TO 2.</li>
</ol>
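<p>Assuming a helper that can produce the $v$-vector of some 1-tree in $K(\pi, d)$ (which is itself the hard part), the protocol translates almost line for line into Python; the function names here are mine:</p>

```python
def direction_of_ascent(n, v_vector_in_K):
    """Steps 1 through 5 from page 1149 of Held and Karp.

    v_vector_in_K(d) is a placeholder callback: it must return a list v,
    indexed 1..n (index 0 unused), holding v_i = (degree of i) - 2 for
    some minimum 1-tree in K(pi, d).
    """
    d = [0.0] * (n + 1)                      # step 1: d <- zero vector
    while True:
        v = v_vector_in_K(d)                 # step 2: a 1-tree in K(pi, d)
        if sum(d[i] * v[i] for i in range(1, n + 1)) > 0:
            return d                         # step 3: d is a direction of ascent
        for i in range(2, n + 1):            # step 4: d_i <- d_i + v_ik
            d[i] += v[i]
        # step 5: go to step 2; note the loop does not terminate when no
        # direction of ascent exists, hence the separate check Held and
        # Karp describe for that situation
```

<p>With a single 1-tree in $K(\pi, d)$ whose $v$-vector is nonzero, the loop returns that vector after one update, matching the intuition that the degree surplus itself points uphill.</p>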
<p>There are two things which must be refined about this procedure in order to make it implementable in Python.</p>
<ul>
<li>How do we find the 1-Tree mentioned in step 2?</li>
<li>How do we know when there is no direction of ascent? (i.e. how do we know when we are at the maximal value of $w(\pi)$?)</li>
</ul>
<p>Held and Karp have provided guidance on both of these points.
In section 6 on matroids, we are told to use a method developed by Dijkstra in <em>A Note on Two Problems in Connexion with Graphs</em>, but in this particular case that is not the most helpful.
I have found that paper, but NetworkX already contains a function called <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.tree.branchings.minimum_spanning_arborescence.html"><code>minimum_spanning_arborescence</code></a> which we can use to create a minimum 1-Arborescence.
That process would be to find a minimum spanning arborescence on only the vertices in ${2, 3, \dots, n}$ and then connect vertex 1 to create the cycle.
In order to connect vertex 1, we would choose the outgoing arc with the smallest cost and the incoming arc with the smallest cost.</p>
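<p>Reconnecting vertex 1 is simple enough to sketch over plain arc tuples (the helper name and toy data are mine; in practice the arborescence would come from <code>minimum_spanning_arborescence</code>):</p>

```python
# Sketch: extend a spanning arborescence on V - {1} to a 1-arborescence
# by adding vertex 1's cheapest outgoing and cheapest incoming arcs.
# Arcs are (u, v, weight) tuples.
def attach_vertex_one(all_arcs, arborescence):
    cheapest_out = min((a for a in all_arcs if a[0] == 1), key=lambda a: a[2])
    cheapest_in = min((a for a in all_arcs if a[1] == 1), key=lambda a: a[2])
    return arborescence + [cheapest_out, cheapest_in]


arcs = [(1, 2, 4), (1, 3, 2), (2, 1, 7), (3, 1, 3), (2, 3, 1), (3, 2, 5)]
one_arborescence = attach_vertex_one(arcs, [(2, 3, 1)])
```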
<p>Finally, at the maximum value of $w(\pi)$, there is no direction of ascent and the procedure outlined by Held and Karp will not terminate.
Their article states on page 1149 that</p>
<blockquote>
<p>Thus, when failure to terminate is suspected, it is necessary to check whether no direction of ascent exists; by the Minkowski-Farkas lemma this is equivalent to the existence of nonnegative coefficients $\alpha_k$ such that</p>
<p>$ \sum_{k \in K(\pi)} \alpha_kv_{i k} = 0, \quad i = 1, 2, \dots, n $</p>
<p>This can be checked by linear programming.</p>
</blockquote>
<p>While it is nice that they gave that summation, the rest of the linear program would have been useful too.
The entire linear program can be written as follows:</p>
<p>$$
\begin{array}{c l l}
\text{max} &amp; \sum_k \alpha_k \\\
\text{s.t.} &amp; \sum_{k \in K(\pi)} \alpha_k v_{i k} = 0 &amp; \forall\ i \in {1, 2, \dots n} \\\
&amp; \alpha_k \geq 0 &amp; \forall\ k \\\
\end{array}
$$</p>
<p>This linear program is not in standard form, but it is not difficult to convert it.
First, change the maximization to a minimization by minimizing the negative.</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_k -\alpha_k \\\
\text{s.t.} &amp; \sum_{k \in K(\pi)} \alpha_k v_{i k} = 0 &amp; \forall\ i \in {1, 2, \dots n} \\\
&amp; \alpha_k \geq 0 &amp; \forall\ k \\\
\end{array}
$$</p>
<p>While the constraint does not look like standard form at first glance, a closer look reveals that it is.
Each column in the matrix form corresponds to one entry of $\alpha_k$, and each row represents a different value of $i$, or a different vertex.
The one constraint is actually a collection of very similar ones, which could be written as</p>
<p>$$
\begin{array}{c l}
\text{min} &amp; \sum_k -\alpha_k \\\
\text{s.t.} &amp; \sum_{k \in K(\pi)} \alpha_k v_{1 k} = 0 \\\
&amp; \sum_{k \in K(\pi)} \alpha_k v_{2 k} = 0 \\\
&amp; \vdots \\\
&amp; \sum_{k \in K(\pi)} \alpha_k v_{n k} = 0 \\\
&amp; \alpha_k \geq 0 &amp; \forall\ k \\\
\end{array}
$$</p>
<p>Because all of the summations must equal zero, no slack and surplus variables are required, so the constraint matrix for this program is $n \times k$.
The $n$ obviously grows linearly, but I&rsquo;m not sure how big to expect $k$ to become.
$k$ indexes the set of minimum 1-Trees, so I believe that it will be manageable.
This linear program can be solved using the <a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linprog.html"><code>linprog</code></a> function in the SciPy library.</p>
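<p>As a sketch of that check with SciPy (the matrices below are made-up toys, not $v_{i k}$ values from real 1-Trees): if the program is unbounded then a nontrivial nonnegative $\alpha$ exists and, by the lemma, there is no direction of ascent; if the optimum is $0$ then only $\alpha = 0$ is feasible.</p>

```python
import numpy as np
from scipy.optimize import linprog


def no_direction_of_ascent(V):
    """V has one row per vertex i and one column per 1-tree k, holding v_ik.

    Maximize sum(alpha) subject to V @ alpha = 0 and alpha >= 0.  Any
    nontrivial solution can be scaled up, so the program is unbounded
    exactly when such an alpha exists.
    """
    n, k = V.shape
    # linprog minimizes, so negate the objective; default bounds are alpha >= 0
    res = linprog(c=-np.ones(k), A_eq=V, b_eq=np.zeros(n))
    return res.status == 3  # status 3 means the program is unbounded


# Columns (1, -1) and (-1, 1) cancel, so alpha = (t, t) works for any t >= 0.
cancelling = no_direction_of_ascent(np.array([[1.0, -1.0], [-1.0, 1.0]]))
# Independent columns admit only alpha = 0, so a direction of ascent exists.
independent = no_direction_of_ascent(np.array([[1.0, 0.0], [0.0, 1.0]]))
```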
<p>As an implementation note, I would probably start by checking the terminating condition on every iteration; eventually we can determine a number of iterations to run before starting to check the terminating condition, to save computational power.</p>
<p>One possible difficulty with the terminating condition is that we need to run the linear program with data from every minimum 1-Trees or 1-Arborescences, which means that we need to be able to generate all of the minimum 1-Trees.
There does not seem to be an easy way to do this within NetworkX at the moment.
Looking through the tree algorithms <a href="https://networkx.org/documentation/stable/reference/algorithms/tree.html">here</a> they seem exclusively focused on finding <em>one</em> minimum branching of the required type and not <em>all</em> of those branchings.</p>
<p>Now we have to find $\epsilon$.
Theorem 4 on page 1150 states that</p>
<blockquote>
<p>Let $k$ be any element of $K(\pi, d)$, where $d$ is a direction of ascent at $\pi$.
Then
$\epsilon(\pi, d) = \text{min} \{\epsilon\ |\ \text{for some pair } (e, e&rsquo;),\ e&rsquo; \text{ is a substitute for } e \text{ in } T^k \text{ and } e \text{ and } e&rsquo; \text{ cross over at } \epsilon \}$</p>
</blockquote>
<p>The first step then is to determine if $e$ and $e&rsquo;$ are substitutes.
$e&rsquo;$ is a substitute for $e$ if, for a 1-Tree $T^k$, $(T^k - \{e\}) \cup \{e&rsquo;\}$ is also a 1-Tree.
The edges $e = \{r, s\}$ and $e&rsquo; = \{i, j\}$ cross over at $\epsilon$ if the pairs $(\overline{c}_{i j}, d_i + d_j)$ and $(\overline{c}_{r s}, d_r + d_s)$ are different but</p>
<p>$$
\overline{c}_{i j} + \epsilon(d_i + d_j) = \overline{c}_{r s} + \epsilon(d_r + d_s)
$$</p>
<p>From that equation, we can derive a formula for $\epsilon$.</p>
<p>$$
\begin{array}{r c l}
\overline{c}_{i j} + \epsilon(d_i + d_j) &amp;=&amp; \overline{c}_{r s} + \epsilon(d_r + d_s) \\\
\epsilon(d_i + d_j) &amp;=&amp; \overline{c}_{r s} + \epsilon(d_r + d_s) - \overline{c}_{i j} \\\
\epsilon(d_i + d_j) - \epsilon(d_r + d_s) &amp;=&amp; \overline{c}_{r s} - \overline{c}_{i j} \\\
\epsilon\left((d_i + d_j) - (d_r + d_s)\right) &amp;=&amp; \overline{c}_{r s} - \overline{c}_{i j} \\\
\epsilon(d_i + d_j - d_r - d_s) &amp;=&amp; \overline{c}_{r s} - \overline{c}_{i j} \\\
\epsilon &amp;=&amp; \displaystyle \frac{\overline{c}_{r s} - \overline{c}_{i j}}{d_i + d_j - d_r - d_s}
\end{array}
$$</p>
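<p>As a quick sanity check of the algebra, plugging made-up values into the final formula recovers an $\epsilon$ at which both sides of the cross over equation agree.</p>

```python
# Illustrative values only: reduced costs and d components for the pair
# e' = {i, j} and e = {r, s}.
c_ij, c_rs = 5.0, 9.0
d_i, d_j, d_r, d_s = 3.0, 2.0, 1.0, 2.0

# epsilon from the derived formula: (9 - 5) / (3 + 2 - 1 - 2) = 2.0
epsilon = (c_rs - c_ij) / (d_i + d_j - d_r - d_s)

# Both sides of the cross-over equation agree at this epsilon.
lhs = c_ij + epsilon * (d_i + d_j)  # 5 + 2 * 5 = 15
rhs = c_rs + epsilon * (d_r + d_s)  # 9 + 2 * 3 = 15
```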
<p>So we can now find $\epsilon$ for any pair of edges which are substitutes for each other, but we still need to be able to find substitutes in the 1-Tree.
We know that $e&rsquo;$ is a substitute for $e$ if and only if $e$ and $e&rsquo;$ are both incident to vertex 1 or $e$ is in a cycle of $T^k \cup \{e&rsquo;\}$ that does not pass through vertex 1.
In a more formal sense, we are trying to find edges in the same fundamental cycle as $e&rsquo;$.
A fundamental cycle is created when any edge not in a spanning tree is added to that spanning tree.
Because the endpoints of this edge are already connected by a unique path in the tree, adding it creates a unique cycle.
In order to find this cycle, we will take advantage of <a href="https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.cycles.find_cycle.html"><code>find_cycle</code></a> within the NetworkX library.</p>
<p>Below is a pseudocode procedure that uses Theorem 4 to find $\epsilon(\pi, d)$ that I sketched out.
It is not well optimized, but will find $\epsilon(\pi, d)$.</p>

<div class="highlight">
  <pre># Input: a minimum 1-tree T_k (an element of K(pi, d)) as an nx.Graph,
#        the graph G with the reduced edge costs, and the direction of
#        ascent d as a mapping from vertex to component.
# Output: epsilon(pi, d) using Theorem 4 on page 1150.
import math

import networkx as nx

def find_epsilon(G, T_k, d, weight="weight"):
    min_epsilon = math.inf
    for i, j in G.edges:
        if T_k.has_edge(i, j):
            continue
        # Adding e' = {i, j} to the tree creates a unique fundamental cycle.
        T_k.add_edge(i, j, **G[i][j])
        cycle = nx.find_cycle(T_k, source=j)
        for r, s, *_ in cycle:
            if {r, s} == {i, j}:
                continue
            denominator = d[i] + d[j] - d[r] - d[s]
            if denominator == 0:
                continue  # these pairs never cross over; avoids dividing by zero
            epsilon = (G[r][s][weight] - G[i][j][weight]) / denominator
            min_epsilon = min(min_epsilon, epsilon)
        T_k.remove_edge(i, j)
    return min_epsilon</pre>
</div>

<h3 id="performance-of-the-ascent-method">Performance of the Ascent Method<a class="headerlink" href="#performance-of-the-ascent-method" title="Link to this heading">#</a></h3>
<p>The ascent method is also slow, but would likely fare better on a modern computer.
When Held and Karp programmed it, they tested it on small problems of up to 25 vertices, and while the time per iteration was small, the number of iterations grew quickly.
They do not comment on whether this is a better method than the Column Generation technique, but do point out that they did not determine if this method <em>always</em> converges to a maximum point of $w(\pi)$.</p>
<h2 id="branch-and-bound-method">Branch and Bound Method<a class="headerlink" href="#branch-and-bound-method" title="Link to this heading">#</a></h2>
<p>After talking with my GSoC mentors, we believe that this is the best method we can implement for the Held-Karp relaxation as needed by the Asadpour algorithm.
The ascent method is embedded within this method, so the in-depth exploration of the previous method is required to implement this one.
Most of the notation in this method is reused from the ascent method.</p>
<p>The branch and bound method utilizes the concept that a vertex can be out-of-kilter.
A vertex $i$ is out-of-kilter high if</p>
<p>$$
\forall\ k \in K(\pi),\ v_{i k} \geq 1
$$</p>
<p>Similarly, vertex $i$ is out-of-kilter low if</p>
<p>$$
\forall\ k \in K(\pi),\ v_{i k} = -1
$$</p>
<p>Remember that $v_{i k}$ is the degree of the vertex minus 2.
We know that all the vertices have a degree of at least one, otherwise the 1-Tree $T^k$ would not be connected.
An out-of-kilter high vertex has a degree of 3 or higher in every minimum 1-Tree and an out-of-kilter low vertex has a degree of only one in all of the minimum 1-Trees.
Our goal is a minimum 1-Tree where every vertex has a degree of 2.</p>
<p>If we know that a vertex is out-of-kilter in either direction, we know the direction of ascent and that direction is a unit vector.
Let $u_i$ be an $n$-dimensional unit vector with 1 in the $i$-th coordinate.
$u_i$ is the direction of ascent if vertex $i$ is out-of-kilter high and $-u_i$ is the direction of ascent if vertex $i$ is out-of-kilter low.</p>
<p>Corollaries 3 and 4 from page 1151 also show that finding $\epsilon(\pi, d)$ is simpler when a vertex is out-of-kilter as well.</p>
<blockquote>
<p><em>Corollary 3.</em> Assume vertex $i$ is out-of-kilter low and let $k$ be an element of $K(\pi, -u_i)$.
Then $\epsilon(\pi, -u_i) = \text{min} (\overline{c}_{i j} - \overline{c}_{r s})$ such that $\{i, j\}$ is a substitute for $\{r, s\}$ in $T^k$ and $i \not\in \{r, s\}$.</p>
</blockquote>
<blockquote>
<p><em>Corollary 4.</em> Assume vertex $r$ is out-of-kilter high.
Then $\epsilon(\pi, u_r) = \text{min} (\overline{c}_{i j} - \overline{c}_{r s})$ such that $\{i, j\}$ is a substitute for $\{r, s\}$ in $T^k$ and $r \not\in \{i, j\}$.</p>
</blockquote>
<p>These corollaries can be implemented with a modified version of the pseudocode listing above for finding $\epsilon$ in the ascent method section.</p>
<p>Once there are no more out-of-kilter vertices, the direction of ascent is not a unit vector and fractional weights are introduced.
This is the cause of a major slow down in the convergence of the ascent method to the optimal solution, so it should be avoided if possible.</p>
<p>Before we can discuss implementation details, there are still some more preliminaries to be reviewed.
Let $X$ and $Y$ be disjoint sets of edges in the graph.
Then let $\mathsf{T}(X, Y)$ denote the set of 1-Trees which include all edges in $X$ but none of the edges in $Y$.
Finally, define $w_{X, Y}(\pi)$ and $K_{X, Y}(\pi)$ as follows.</p>
<p>$$
w_{X, Y}(\pi) = \text{min}_{k \in \mathsf{T}(X, Y)} (c_k + \sum_{i=1}^{i=n} \pi_i v_{i k}) \\\
K_{X, Y}(\pi) = \{k\ |\ c_k + \sum \pi_i v_{i k} = w_{X, Y}(\pi)\}
$$</p>
<p>From these functions, a revised definition of out-of-kilter high and low arise, allowing a vertex to be out-of-kilter relative to $X$ and $Y$.</p>
<p>During the completion of the branch and bound method, the branches are tracked in a list where each entry has the following format.</p>
<p>$$[X, Y, \pi, w_{X, Y}(\pi)]$$</p>
<p>Here $X$ and $Y$ are the disjoint sets discussed earlier, $\pi$ is the vector we are using to perturb the edge weights, and $w_{X, Y}(\pi)$ is the <em>bound</em> of the entry.</p>
<p>At each iteration of the method, we consider the list entry with the minimum bound and try to find an out-of-kilter vertex.
If we find one, we apply one iteration of the ascent method using the simplified unit vector as the direction of ascent.
Here we can take advantage of integral weights if they exist.
Perhaps the documentation for the Asadpour implementation in NetworkX should state that integral edge weights will perform better but that claim will have to be supported by our testing.</p>
<p>If there is not an out-of-kilter vertex, we still need to find the direction of ascent in order to determine if we are at the maximum of $w(\pi)$.
If the direction of ascent exists, we branch.
If there is no direction of ascent, we search for a tour among $K_{X, Y}(\pi)$ and if none is found, we also branch.</p>
<p>The branching process is as follows.
From entry $[X, Y, \pi, w_{X, Y}(\pi)]$ an edge $e \not\in X \cup Y$ is chosen (Held and Karp do not give any criteria to branch on, so I believe the choice can be arbitrary) and the parent entry is replaced with two other entries of the forms</p>
<p>$$
[X \cup \{e\}, Y^*, \pi, w_{X \cup \{e\}, Y^*}(\pi)] \quad \text{and} \quad [X^*, Y \cup \{e\}, \pi, w_{X^*, Y \cup \{e\}}(\pi)]
$$</p>
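<p>The branching step itself is easy to sketch. Assuming the entries are kept in a min-heap keyed on the bound, and that <code>w</code> is some callable computing $w_{X, Y}(\pi)$ (both are assumptions for illustration, not the final design), one branch looks like this.</p>

```python
import heapq

def branch(entries, edges, w):
    """Pop the minimum-bound entry and replace it with its two children."""
    bound, X, Y, pi = heapq.heappop(entries)
    # Held and Karp give no branching criterion, so take any free edge.
    e = next(edge for edge in edges if edge not in X | Y)
    for new_X, new_Y in ((X | {e}, Y), (X, Y | {e})):
        heapq.heappush(entries, (w(new_X, new_Y, pi), new_X, new_Y, pi))

# Toy bound function and a single root entry [X, Y, pi, w_{X,Y}(pi)],
# stored as (bound, X, Y, pi) so the heap orders on the bound.
w = lambda X, Y, pi: 10.0 + len(X) + 2 * len(Y)
entries = [(10.0, frozenset(), frozenset(), None)]
branch(entries, [(0, 1), (1, 2), (2, 0)], w)
```

After one call, the root entry is replaced by the two children that respectively include and exclude the chosen edge.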
<p>An example of the branch and bound method is given on pages 1153 through 1156 in the Held and Karp paper.</p>
<p>In order to implement this method, we need to be able to do the following, in addition to modifying some of the details of the ascent method.</p>
<ul>
<li>Determine whether a vertex is out-of-kilter in either direction with respect to $X$ and $Y$.</li>
<li>Search $K_{X, Y}(\pi)$ for a tour.</li>
</ul>
<p>The Held and Karp paper states that in order to find an out-of-kilter vertex, all we need to do is test the unit vectors: vertex $i$ is out-of-kilter high if, for an arbitrary member $k$ of $K(\pi, u_i)$, $v_{i k} \geq 1$, and the appropriate inverse holds for out-of-kilter low.
From this process we can find out-of-kilter vertices by sequentially checking the $u_i$&rsquo;s in an $O(n^2)$ procedure.</p>
<p>Searching $K_{X, Y}(\pi)$ for a tour would be easy if we could enumerate that set of minimum 1-Trees.
While I know how to find one of the minimum 1-Trees, or a member of $K(\pi)$, I am not sure how to find elements of $K(\pi, d)$ or even all of the members of $K(\pi)$.
Using the properties in the Held and Karp paper, I do know how to refine $K(\pi)$ into $K(\pi, d)$ and $K(\pi)$ into $K_{X, Y}(\pi)$.
This will have to be a blog post for another time.</p>
<p>The most promising research paper I have been able to find on this problem is <a href="https://www.scielo.br/j/pope/a/XHswBwRwJyrfL88dmMwYNWp/?lang=en&amp;format=pdf">this</a> 2005 paper by Sörensen and Janssens titled <em>An Algorithm to Generate all Spanning Trees of a Graph in Order of Increasing Cost</em>.
Using their method, we generate spanning trees or arborescences in order of increasing cost until the cost increases, at which point we have found all elements of $K(\pi)$.</p>
<h3 id="performance-of-the-branch-and-bound-method">Performance of the Branch and Bound Method<a class="headerlink" href="#performance-of-the-branch-and-bound-method" title="Link to this heading">#</a></h3>
<p>Held and Karp did not program this method.
We have some reason to believe that the performance of this method will be the best of the three: it is designed to be an improvement over the ascent method, which was tested (somewhat) up to $n = 25$, which in turn is better than the column generation technique that was only consistently able to solve problems up to $n = 12$.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An o(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061, <a href="https://homes.cs.washington.edu/~shayan/atsp.pdf">https://homes.cs.washington.edu/~shayan/atsp.pdf</a>.</p>
<p>Held, M., Karp, R.M. <em>The traveling-salesman problem and minimum spanning trees</em>. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. <a href="https://www.jstor.org/stable/169411">https://www.jstor.org/stable/169411</a></p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[NetworkX Function Stubs]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Separation Oracle" />
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/networkx-function-stubs/</id>
            
            
            <published>2021-05-24T00:00:00+00:00</published>
            <updated>2021-05-24T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Draft function stubs for the Asadpour method to use in the NetworkX API</blockquote><p>Now that my proposal was accepted by NetworkX for the 2021 Google Summer of Code (GSoC), I can get more into the technical details of how I plan to implement the Asadpour algorithm within NetworkX.</p>
<p>In this post I am going to outline my thought process for the control scheme of my implementation and create function stubs according to my GSoC proposal.
Most of the work for this project will happen in <code>networkx.algorithms.approximation.traveling_salesman.py</code>, where I will finish the last algorithm for the Traveling Salesman Problem so it can be merged into the project. The main function in <code>traveling_salesman.py</code> is</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">traveling_salesman_problem</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">,</span> <span class="n">nodes</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">cycle</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">method</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    ...
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : NetworkX graph
</span></span></span><span class="line"><span class="cl"><span class="s2">        Undirected possibly weighted graph
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    nodes : collection of nodes (default=G.nodes)
</span></span></span><span class="line"><span class="cl"><span class="s2">        collection (list, set, etc.) of nodes to visit
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    weight : string, optional (default=&#34;weight&#34;)
</span></span></span><span class="line"><span class="cl"><span class="s2">        Edge data key corresponding to the edge weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">        If any edge does not have this attribute the weight is set to 1.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    cycle : bool (default: True)
</span></span></span><span class="line"><span class="cl"><span class="s2">        Indicates whether a cycle should be returned, or a path.
</span></span></span><span class="line"><span class="cl"><span class="s2">        Note: the cycle is the approximate minimal cycle.
</span></span></span><span class="line"><span class="cl"><span class="s2">        The path simply removes the biggest edge in that cycle.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    method : function (default: None)
</span></span></span><span class="line"><span class="cl"><span class="s2">        A function that returns a cycle on all nodes and approximates
</span></span></span><span class="line"><span class="cl"><span class="s2">        the solution to the traveling salesman problem on a complete
</span></span></span><span class="line"><span class="cl"><span class="s2">        graph. The returned cycle is then used to find a corresponding
</span></span></span><span class="line"><span class="cl"><span class="s2">        solution on `G`. `method` should be callable; take inputs
</span></span></span><span class="line"><span class="cl"><span class="s2">        `G`, and `weight`; and return a list of nodes along the cycle.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        Provided options include :func:`christofides`, :func:`greedy_tsp`,
</span></span></span><span class="line"><span class="cl"><span class="s2">        :func:`simulated_annealing_tsp` and :func:`threshold_accepting_tsp`.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        If `method is None`: use :func:`christofides` for undirected `G` and
</span></span></span><span class="line"><span class="cl"><span class="s2">        :func:`threshold_accepting_tsp` for directed `G`.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        To specify parameters for these provided functions, construct lambda
</span></span></span><span class="line"><span class="cl"><span class="s2">        functions that state the specific value. `method` must have 2 inputs.
</span></span></span><span class="line"><span class="cl"><span class="s2">        (See examples).
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    ...
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span></span></span></code></pre>
</div>
<p>All user calls to find an approximation to the traveling salesman problem will go through this function.
My implementation of the Asadpour algorithm will also need to be compatible with this function.
<code>traveling_salesman_problem</code> will handle creating a new, complete graph using the weight of the shortest path between nodes $u$ and $v$ as the weight of that arc, so we know that by the time the graph is passed to the Asadpour algorithm it is a complete digraph which satisfies the triangle inequality.
The main function also handles the <code>nodes</code> and <code>cycle</code> parameters by only copying the necessary nodes into the complete digraph before calling the requested method, and afterwards searching for and removing the largest arc within the returned cycle.
Thus, the parent function for the Asadpour algorithm only needs to deal with the graph itself and the weights or costs of the arcs in the graph.</p>
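<p>For example, the existing interface can already be exercised with one of the provided methods; once <code>asadpour_tsp</code> exists it would be passed the same way. The lambda below pins <code>greedy_tsp</code>&rsquo;s extra <code>source</code> parameter so that <code>method</code> takes exactly <code>G</code> and <code>weight</code>, as the docstring requires (this assumes NetworkX 2.6 or later, where these functions ship).</p>

```python
import networkx as nx
from networkx.algorithms.approximation import greedy_tsp, traveling_salesman_problem

# A small complete weighted graph.
G = nx.complete_graph(4)
for u, v in G.edges:
    G[u][v]["weight"] = 1 + abs(u - v)

# `method` must accept exactly `G` and `weight`, so wrap greedy_tsp in a
# lambda to fix its `source` parameter.
cycle = traveling_salesman_problem(
    G, cycle=True, method=lambda G, weight: greedy_tsp(G, weight=weight, source=0)
)
# `cycle` is a closed tour: it visits every node and repeats the start node.
```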
<p>My controlling function will have the following signature and I have included a draft of the docstring as well.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">asadpour_tsp</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns an O( log n / log log n ) approximate solution to the traveling
</span></span></span><span class="line"><span class="cl"><span class="s2">    salesman problem.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    This approximate solution is one of the best known approximations for
</span></span></span><span class="line"><span class="cl"><span class="s2">    the asymmetric traveling salesman problem developed by Asadpour et al,
</span></span></span><span class="line"><span class="cl"><span class="s2">    [1]_. The algorithm first solves the Held-Karp relaxation to find a
</span></span></span><span class="line"><span class="cl"><span class="s2">    lower bound for the weight of the cycle. Next, it constructs an
</span></span></span><span class="line"><span class="cl"><span class="s2">    exponential distribution of undirected spanning trees where the
</span></span></span><span class="line"><span class="cl"><span class="s2">    probability of an edge being in the tree corresponds to the weight of
</span></span></span><span class="line"><span class="cl"><span class="s2">    that edge using a maximum entropy rounding scheme. Next we sample that
</span></span></span><span class="line"><span class="cl"><span class="s2">    distribution $2 \log n$ times and save the minimum sampled tree once
</span></span></span><span class="line"><span class="cl"><span class="s2">    the direction of the arcs is added back to the edges. Finally,
</span></span></span><span class="line"><span class="cl"><span class="s2">    we augment and then short circuit that graph to find the approximate tour
</span></span></span><span class="line"><span class="cl"><span class="s2">    for the salesman.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : nx.DiGraph
</span></span></span><span class="line"><span class="cl"><span class="s2">        The graph should be a complete weighted directed graph.
</span></span></span><span class="line"><span class="cl"><span class="s2">        The distance between all pairs of nodes should be included.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    weight : string, optional (default=&#34;weight&#34;)
</span></span></span><span class="line"><span class="cl"><span class="s2">        Edge data key corresponding to the edge weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">        If any edge does not have this attribute the weight is set to 1.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    cycle : list of nodes
</span></span></span><span class="line"><span class="cl"><span class="s2">        Returns the cycle (list of nodes) that a salesman can follow to minimize
</span></span></span><span class="line"><span class="cl"><span class="s2">        the total weight of the trip.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Raises
</span></span></span><span class="line"><span class="cl"><span class="s2">    ------
</span></span></span><span class="line"><span class="cl"><span class="s2">    NetworkXError
</span></span></span><span class="line"><span class="cl"><span class="s2">        If `G` is not complete, the algorithm raises an exception.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    References
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    .. [1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi,
</span></span></span><span class="line"><span class="cl"><span class="s2">       An o(log n/log log n)-approximation algorithm for the asymmetric
</span></span></span><span class="line"><span class="cl"><span class="s2">       traveling salesman problem, Operations research, 65 (2017),
</span></span></span><span class="line"><span class="cl"><span class="s2">       pp. 1043–1061
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>Following my GSoC proposal, the next function is <code>held_karp</code>, which will solve the Held-Karp relaxation on the complete digraph using the ellipsoid method (See my last two posts <a href="../held-karp-relaxation">here</a> and <a href="../held-karp-separation-oracle">here</a> for my thoughts on why and how to accomplish this).
Solving the Held-Karp relaxation is the first step in the algorithm.</p>
<p>Recall that the Held-Karp relaxation is defined as the following linear program:</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{a} c(a)x_a \\\
\text{s.t.} &amp; x(\delta^+(U)) \geqslant 1 &amp; \forall\ U \subset V \text{ and } U \not= \emptyset \\\
&amp; x(\delta^+(v)) = x(\delta^-(v)) = 1 &amp; \forall\ v \in V \\\
&amp; x_a \geqslant 0 &amp; \forall\ a
\end{array}
$$</p>
<p>and that it is a semi-infinite program so it is too large to be solved in conventional forms.
The algorithm uses the solution to the Held-Karp relaxation to create a vector $z^*$ which is a symmetrized and slightly scaled down version of the true Held-Karp solution $x^*$.
$z^*$ is defined as</p>
<p>$$
z^*_{\{u, v\}} = \frac{n - 1}{n} \left(x^*_{uv} + x^*_{vu}\right)
$$</p>
<p>and since this is what the algorithm using to build the rest of the approximation, this should be one of the return values from <code>held_karp</code>.
I will also return the value of the cost of $x^*$, which is denoted as $c(x^*)$ or $OPT_{HK}$ in the Asadpour paper [1].</p>
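<p>The symmetrization itself is a one-liner once $x^*$ is stored as a matrix. The sketch below assumes a dense $n \times n$ array indexed by vertex, which may or may not be the representation the final implementation ends up using.</p>

```python
import numpy as np

def symmetrize(x_star):
    """Scale and symmetrize a Held-Karp solution x* into z* per the
    formula above: z*_{u,v} = (n - 1) / n * (x*_{uv} + x*_{vu})."""
    n = x_star.shape[0]
    return (n - 1) / n * (x_star + x_star.T)

# A made-up fractional solution on n = 3 vertices.
x_star = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])
z_star = symmetrize(x_star)  # each off-diagonal entry becomes (2/3) * 1.0
```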
<p>Additionally, the separation oracle will be defined as an inner function within <code>held_karp</code>.
At the present moment I am not sure what the exact parameters for the separation oracle, <code>sep_oracle</code>, will be, but it should take the point the algorithm wishes to test and will need access to the graph the algorithm is relaxing.
In particular, I&rsquo;m not sure <em>yet</em> how I will represent the hyperplane which is returned by the separation oracle.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">_held_karp</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Solves the Held-Karp relaxation of the input complete digraph and scales
</span></span></span><span class="line"><span class="cl"><span class="s2">    the output solution for use in the Asadpour [1]_ ATSP algorithm.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    The Held-Karp relaxation defines the lower bound for solutions to the
</span></span></span><span class="line"><span class="cl"><span class="s2">    ATSP, although it does return a fractional solution. This is used in the
</span></span></span><span class="line"><span class="cl"><span class="s2">    Asadpour algorithm as an initial solution which is later rounded to an
</span></span></span><span class="line"><span class="cl"><span class="s2">    integral tree within the spanning tree polytope. This function solves
</span></span></span><span class="line"><span class="cl"><span class="s2">    the relaxation with the ellipsoid method for linear programs.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : nx.DiGraph
</span></span></span><span class="line"><span class="cl"><span class="s2">        The graph should be a complete weighted directed graph.
</span></span></span><span class="line"><span class="cl"><span class="s2">        The distance between all pairs of nodes should be included.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    weight : string, optional (default=&#34;weight&#34;)
</span></span></span><span class="line"><span class="cl"><span class="s2">        Edge data key corresponding to the edge weight.
</span></span></span><span class="line"><span class="cl"><span class="s2">        If any edge does not have this attribute the weight is set to 1.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    OPT : float
</span></span></span><span class="line"><span class="cl"><span class="s2">        The cost for the optimal solution to the Held-Karp relaxation
</span></span></span><span class="line"><span class="cl"><span class="s2">    z_star : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">        A symmetrized and scaled version of the optimal solution to the
</span></span></span><span class="line"><span class="cl"><span class="s2">        Held-Karp relaxation for use in the Asadpour algorithm
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    References
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    .. [1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi,
</span></span></span><span class="line"><span class="cl"><span class="s2">       An o(log n/log log n)-approximation algorithm for the asymmetric
</span></span></span><span class="line"><span class="cl"><span class="s2">       traveling salesman problem, Operations research, 65 (2017),
</span></span></span><span class="line"><span class="cl"><span class="s2">       pp. 1043–1061
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">def</span> <span class="nf">sep_oracle</span><span class="p">(</span><span class="n">point</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">        The separation oracle used in the ellipsoid algorithm to solve the
</span></span></span><span class="line"><span class="cl"><span class="s2">        Held-Karp relaxation.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        This &#39;black-box&#39; takes a point and checks whether it violates any
</span></span></span><span class="line"><span class="cl"><span class="s2">        of the Held-Karp constraints, which are defined as
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">            - The out-degree of all non-empty subsets of $V$ is at least one.
</span></span></span><span class="line"><span class="cl"><span class="s2">            - The in-degree and out-degree of each vertex in $V$ is equal to
</span></span></span><span class="line"><span class="cl"><span class="s2">              one. Note that if a vertex has multiple incoming or
</span></span></span><span class="line"><span class="cl"><span class="s2">              outgoing arcs, the value of each arc can be less than one so
</span></span></span><span class="line"><span class="cl"><span class="s2">              long as they sum to one.
</span></span></span><span class="line"><span class="cl"><span class="s2">            - The current value for each arc is greater
</span></span></span><span class="line"><span class="cl"><span class="s2">              than zero.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">        ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">        point : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">            The point in n dimensional space we wish to test to see if it
</span></span></span><span class="line"><span class="cl"><span class="s2">            violates any of the Held-Karp constraints.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">        Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">        -------
</span></span></span><span class="line"><span class="cl"><span class="s2">        numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">            The hyperplane which was the most violated by `point`, i.e. the
</span></span></span><span class="line"><span class="cl"><span class="s2">            hyperplane defining the polytope of spanning trees which `point`
</span></span></span><span class="line"><span class="cl"><span class="s2">            was farthest from, or None if no constraints are violated.
</span></span></span><span class="line"><span class="cl"><span class="s2">        &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="k">pass</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
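<p>As a rough sketch of what the &ldquo;symmetrized and scaled&rdquo; output in the docstring above means (an illustration only, not the project&rsquo;s implementation; the dict-of-arcs representation and the helper name are my assumptions):</p>

```python
# Hypothetical sketch: merge opposite arcs of a Held-Karp solution `x`
# (a dict mapping directed arcs (u, v) to fractional values) into one
# undirected edge, scaling by (n - 1) / n as in the Asadpour paper.
def symmetrize_and_scale(x, n):
    z_star = {}
    for (u, v), value in x.items():
        edge = frozenset((u, v))
        # adding x[(u, v)] and x[(v, u)] onto the same key symmetrizes x
        z_star[edge] = z_star.get(edge, 0.0) + (n - 1) / n * value
    return z_star
```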
<p>Next the algorithm uses the symmetrized and scaled version of the Held-Karp solution to construct an exponential distribution of undirected spanning trees which preserves the marginal probabilities.</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">_spanning_tree_distribution</span><span class="p">(</span><span class="n">z_star</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Solves the Maximum Entropy Convex Program in the Asadpour algorithm [1]_
</span></span></span><span class="line"><span class="cl"><span class="s2">    using the approach in section 7 to build an exponential distribution of
</span></span></span><span class="line"><span class="cl"><span class="s2">    undirected spanning trees.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    This algorithm ensures that the probability of any edge in a spanning
</span></span></span><span class="line"><span class="cl"><span class="s2">    tree is proportional to the sum of the probabilities of the trees
</span></span></span><span class="line"><span class="cl"><span class="s2">    containing that edge over the sum of the probabilities of all spanning
</span></span></span><span class="line"><span class="cl"><span class="s2">    trees of the graph.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    z_star : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">        The output of `_held_karp()`, a scaled version of the Held-Karp
</span></span></span><span class="line"><span class="cl"><span class="s2">        solution.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    gamma : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">        The probability distribution which approximately preserves the marginal
</span></span></span><span class="line"><span class="cl"><span class="s2">        probabilities of `z_star`.
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
<p>Now that the algorithm has the distribution of spanning trees, we need to sample them.
Each sampled tree is a $\lambda$-random tree and can be sampled using algorithm A8 in [2].</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">_sample_spanning_tree</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">gamma</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">    Sample one spanning tree from the distribution defined by `gamma`,
</span></span></span><span class="line"><span class="cl"><span class="s2">    roughly using algorithm A8 in [1]_ .
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    We &#39;shuffle&#39; the edges in the graph, and then probabilistically
</span></span></span><span class="line"><span class="cl"><span class="s2">    determine whether to add the edge conditioned on all of the previous
</span></span></span><span class="line"><span class="cl"><span class="s2">    edges which were added to the tree. Probabilities are calculated using
</span></span></span><span class="line"><span class="cl"><span class="s2">    Kirchhoff&#39;s Matrix Tree Theorem and a weighted Laplacian matrix.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : nx.Graph
</span></span></span><span class="line"><span class="cl"><span class="s2">        An undirected version of the original graph.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    gamma : numpy array
</span></span></span><span class="line"><span class="cl"><span class="s2">        The probabilities associated with each of the edges in the undirected
</span></span></span><span class="line"><span class="cl"><span class="s2">        graph `G`.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    nx.Graph
</span></span></span><span class="line"><span class="cl"><span class="s2">        A spanning tree using the distribution defined by `gamma`.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    References
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    .. [1] V. Kulkarni, Generating random combinatorial objects, Journal of
</span></span></span><span class="line"><span class="cl"><span class="s2">       algorithms, 11 (1990), pp. 185–207
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="k">pass</span></span></span></code></pre>
</div>
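<p>To make the target distribution concrete, here is a brute-force sampler that is only feasible on tiny graphs: it enumerates every spanning tree and draws one with probability proportional to $\prod_{e \in T} \lambda_e$. The helper name and the <code>frozenset</code>-keyed $\lambda$ mapping are assumptions for illustration; the real <code>_sample_spanning_tree</code> avoids the enumeration by using Kirchhoff&rsquo;s theorem.</p>

```python
import itertools
import math
import random

import networkx as nx

def sample_tree_bruteforce(G, lam, rng=None):
    # Enumerate every edge subset of size n - 1, keep the ones that form
    # a spanning tree, and sample proportional to prod(lambda_e).
    rng = rng or random.Random(0)
    n = G.number_of_nodes()
    trees, weights = [], []
    for edges in itertools.combinations(G.edges, n - 1):
        T = nx.Graph(edges)
        if T.number_of_nodes() == n and nx.is_connected(T):
            trees.append(edges)
            weights.append(math.prod(lam[frozenset(e)] for e in edges))
    return rng.choices(trees, weights=weights, k=1)[0]
```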
<p>At this point there is only one function left to discuss, <code>laplacian_matrix</code>.
This function already exists within NetworkX at <code>networkx.linalg.laplacianmatrix.laplacian_matrix</code>, and even though this is relatively simple to implement, I&rsquo;d rather use an existing version than create duplicate code within the project.
A deeper look at the function signature reveals</p>


<div class="highlight">
  <pre class="chroma"><code><span class="line"><span class="cl"><span class="nd">@not_implemented_for</span><span class="p">(</span><span class="s2">&#34;directed&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">laplacian_matrix</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">nodelist</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">weight</span><span class="o">=</span><span class="s2">&#34;weight&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Returns the Laplacian matrix of G.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    The graph Laplacian is the matrix L = D - A, where
</span></span></span><span class="line"><span class="cl"><span class="s2">    A is the adjacency matrix and D is the diagonal matrix of node degrees.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Parameters
</span></span></span><span class="line"><span class="cl"><span class="s2">    ----------
</span></span></span><span class="line"><span class="cl"><span class="s2">    G : graph
</span></span></span><span class="line"><span class="cl"><span class="s2">       A NetworkX graph
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    nodelist : list, optional
</span></span></span><span class="line"><span class="cl"><span class="s2">       The rows and columns are ordered according to the nodes in nodelist.
</span></span></span><span class="line"><span class="cl"><span class="s2">       If nodelist is None, then the ordering is produced by G.nodes().
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    weight : string or None, optional (default=&#39;weight&#39;)
</span></span></span><span class="line"><span class="cl"><span class="s2">       The edge data key used to compute each value in the matrix.
</span></span></span><span class="line"><span class="cl"><span class="s2">       If None, then each edge has weight 1.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Returns
</span></span></span><span class="line"><span class="cl"><span class="s2">    -------
</span></span></span><span class="line"><span class="cl"><span class="s2">    L : SciPy sparse matrix
</span></span></span><span class="line"><span class="cl"><span class="s2">      The Laplacian matrix of G.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    Notes
</span></span></span><span class="line"><span class="cl"><span class="s2">    -----
</span></span></span><span class="line"><span class="cl"><span class="s2">    For MultiGraph/MultiDiGraph, the edges weights are summed.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">    See Also
</span></span></span><span class="line"><span class="cl"><span class="s2">    --------
</span></span></span><span class="line"><span class="cl"><span class="s2">    to_numpy_array
</span></span></span><span class="line"><span class="cl"><span class="s2">    normalized_laplacian_matrix
</span></span></span><span class="line"><span class="cl"><span class="s2">    laplacian_spectrum
</span></span></span><span class="line"><span class="cl"><span class="s2">    &#34;&#34;&#34;</span></span></span></code></pre>
</div>
<p>This is exactly what I need, <em>except</em> that the decorator states it does not support directed graphs, and this algorithm deals with exactly those graphs.
Fortunately, our distribution of spanning trees is for trees in a directed graph <em>once the direction is disregarded</em>, so we can actually use the existing function.
The definition given in the Asadpour paper [1] is</p>
<p>$$
L_{i,j} = \left\{
\begin{array}{l l}
-\lambda_e &amp; e = (i, j) \in E \\\
\sum_{e \in \delta({i})} \lambda_e &amp; i = j \\\
0 &amp; \text{otherwise}
\end{array}
\right.
$$</p>
<p>Where $E$ is defined as &ldquo;Let $E$ be the support of graph of $z^*$ when the direction of the arcs are disregarded&rdquo; on page 5 of the Asadpour paper.
Thus, I can use the existing method without having to create a new one, which will save time and effort on this GSoC project.</p>
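<p>As a quick sanity check that the existing function suits our needs, we can store each $\lambda_e$ as an edge attribute, build the weighted Laplacian, and apply Kirchhoff&rsquo;s Matrix Tree Theorem: deleting any row and the matching column and taking the determinant yields the weighted count of spanning trees. The attribute name <code>lam</code> is arbitrary.</p>

```python
import networkx as nx
import numpy as np

# A triangle with every lambda_e = 1; Kirchhoff's theorem should count
# its 3 spanning trees.
G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)], weight="lam")

L = nx.laplacian_matrix(G, weight="lam").toarray().astype(float)
# any cofactor of the Laplacian equals the weighted spanning tree count
tree_count = np.linalg.det(L[1:, 1:])
```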
<p>In addition to being discussed here, these function stubs have been added to my fork of <code>NetworkX</code> on the <code>bothTSP</code> branch.
The commit, <a href="https://github.com/mjschwenne/networkx/commit/d3a3db8823804faa3edbf8bfa0f4b12459143ac8"><code>Added function stubs and draft docstrings for the Asadpour algorithm</code></a> is visible on my GitHub using that link.</p>
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An o(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061, <a href="https://homes.cs.washington.edu/~shayan/atsp.pdf">https://homes.cs.washington.edu/~shayan/atsp.pdf</a>.</p>
<p>[2] V. Kulkarni, <em>Generating random combinatorial objects</em>, Journal of algorithms, 11 (1990), pp. 185–207</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Held-Karp Separation Oracle]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
                <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="related" type="text/html" title="Held-Karp Relaxation" />
            
                <id>https://blog.scientific-python.org/networkx/atsp/held-karp-separation-oracle/</id>
            
            
            <published>2021-05-08T00:00:00+00:00</published>
            <updated>2021-05-08T00:00:00+00:00</updated>
            
            
            <content type="html"><![CDATA[<blockquote>Considering creating a separation oracle for the Held-Karp relaxation</blockquote><p>Continuing the theme of my last post, we know that the Held-Karp relaxation in the Asadpour Asymmetric Traveling Salesman Problem cannot be practically written into the standard matrix form of a linear program.
Thus, we need a different method to solve the relaxation, which is where the ellipsoid method comes into play.
The ellipsoid method can be used to solve semi-infinite linear programs, which is what the Held-Karp relaxation is.</p>
<p>One of the keys to the ellipsoid method is the separation oracle.
From the perspective of the algorithm itself, the oracle is a black-box program which takes a vector and determines</p>
<ul>
<li>Whether the vector is in the linear program&rsquo;s feasible region.</li>
<li>If not, it returns a hyperplane with the given point on one side and the linear program&rsquo;s feasible region on the other.</li>
</ul>
<p>In the most basic form, the ellipsoid method is a decision algorithm rather than an optimization algorithm, so it terminates once a single, but almost certainly nonoptimal, vector within the feasible region is found.
However, we can convert the ellipsoid method into an algorithm which is truly an optimization one.
What this means for us is that we can assume that the separation oracle will return a hyperplane.</p>
<p>The hyperplane that the oracle returns is then used to construct the next ellipsoid in the algorithm, which is of smaller volume and contains a half-ellipsoid from the originating ellipsoid.
This is, however, a topic for another post.
Right now I want to focus on this &lsquo;black-box&rsquo; separation oracle.</p>
<p>The reason that the Held-Karp relaxation is semi-infinite is that for a graph with $n$ vertices, there are $2^n + 2n$ constraints in the program.
A naive approach to the separation oracle would be to check each constraint individually for the input vector, creating a program with $O(2^n)$ running time.
While it would terminate eventually, it certainly would take a <em>long</em> time to do so.</p>
<p>So, we look for a more efficient way to do this.
Recall from the Asadpour paper [1] that the Held-Karp relaxation is the following linear program.</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{a} c(a)x_a \\\
\text{s.t.} &amp; x(\delta^+(U)) \geqslant 1 &amp; \forall\ U \subset V \text{ and } U \not= \emptyset \\\
&amp; x(\delta^+(v)) = x(\delta^-(v)) = 1 &amp; \forall\ v \in V \\\
&amp; x_a \geqslant 0 &amp; \forall\ a
\end{array}
$$</p>
<p>The first set of constraints ensures that the output of the relaxation is connected.
This is called <em>subtour elimination</em>, and it prevents a solution with multiple disconnected clusters by ensuring that every set of vertices has at least one total outgoing arc (we are currently dealing with fractional arcs).
From the perspective of the separation oracle, we do not care about all of the sets of vertices for which $x(\delta^+(U)) \geqslant 1$; we only need to find one subset of the vertices where $x(\delta^+(U)) &lt; 1$.</p>
<p>In order to find such a set of vertices $U \subset V$ where $x(\delta^+(U)) &lt; 1$, we can find the subset $U$ with the smallest value of $x(\delta^+(U))$ over all $U \subset V$.
That is, find the <em>global minimum cut</em> in the complete digraph using the edge capacities given by the input vector to the separation oracle.
Using lecture notes by Michel X. Goemans [2] (who is also one of the authors of the Asadpour algorithm this project seeks to implement), we can find such a minimum cut with $2(n - 1)$ maximum flow calculations.</p>
<p>The algorithm described in section 6.4 of the lecture notes [2] is fairly simple.
Let $S$ and $T$ be subsets of $V$ such that the cut from $S$ to $T$ is the global minimum cut for the graph.
First, we pick an arbitrary $s$ in the graph.
By definition, $s$ is either in $S$ or it is in $T$.
We now iterate through every other vertex $t$ in the graph and compute the $s-t$ and $t-s$ minimum cuts.
If $s \in S$, then one of the choices of $t$ will produce the global minimum cut, and the case where $s \not\in S$, i.e. $s \in T$, is covered by the $t-s$ cuts.</p>
<p>According to Goemans [2], the complexity of finding the global min cut in a weighted digraph, using an efficient max-flow algorithm, is $O(mn^2\log(n^2/m))$.</p>
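<p>The procedure from the lecture notes can be sketched directly with NetworkX&rsquo;s built-in max-flow machinery (a simplified illustration which assumes the arc values are stored in a <code>capacity</code> attribute; the function name is mine):</p>

```python
import networkx as nx

def global_min_cut_value(G):
    # Fix an arbitrary s, then take the best s-t and t-s minimum cut over
    # every other vertex t: 2(n - 1) max-flow computations in total.
    nodes = list(G.nodes)
    s, best = nodes[0], float("inf")
    for t in nodes[1:]:
        for u, v in ((s, t), (t, s)):
            cut_value, _ = nx.minimum_cut(G, u, v)
            best = min(best, cut_value)
    return best
```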
<p>The second constraint can be checked in $O(n)$ time with a simple loop.
It makes sense to actually check this one first as it is computationally simpler, and thus if one of these constraints is violated we can return the violating hyperplane faster.</p>
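<p>That $O(n)$ check is straightforward if we arrange the candidate point as an $n \times n$ matrix of arc values (a sketch; the matrix layout is an assumption about how the point is stored):</p>

```python
import numpy as np

def degree_constraints_ok(x, tol=1e-9):
    # x[u, v] holds the fractional value of arc (u, v); each row sum is
    # x(delta^+(v)) and each column sum is x(delta^-(v)).
    out_ok = np.allclose(x.sum(axis=1), 1, atol=tol)
    in_ok = np.allclose(x.sum(axis=0), 1, atol=tol)
    return bool(out_ok and in_ok)
```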
<p>Now we have reduced the complexity of the oracle from $O(2^n)$ to the same as finding the global min cut, $O(mn^2\log(n^2/m))$ which is substantially better.
For example, let us consider an initial graph with 100 vertices.
Using the $O(2^n)$ method, that is $1.2677 \times 10^{30}$ subsets $U$ that we need to check, <em>times</em> whatever the complexity of actually determining whether the point violates $x(\delta^+(U)) \geqslant 1$.
For that same complete digraph on 100 vertices, we know that $n = 100$ and $m = \binom{100}{2} = 4950$.
Using the global min cut approach, the complexity, which accounts for both finding each max flow and the number of times one must be found, is $15117042$ or about $1.5117 \times 10^7$, which is faster by a factor of roughly $10^{23}$.</p>
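<p>This back-of-the-envelope comparison is easy to reproduce (the base of the logarithm in $O(mn^2\log(n^2/m))$ is asymptotically irrelevant; base 10 is assumed below since it roughly matches the figure above):</p>

```python
import math

n = 100
m = math.comb(n, 2)                       # 4950 vertex pairs
naive = 2 ** n                            # subsets checked one by one
min_cut = m * n ** 2 * math.log10(n ** 2 / m)
speedup = naive / min_cut
```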
<h2 id="references">References<a class="headerlink" href="#references" title="Link to this heading">#</a></h2>
<p>[1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi, <em>An o(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem</em>, Operations Research, 65 (2017), pp. 1043-1061, <a href="https://homes.cs.washington.edu/~shayan/atsp.pdf">https://homes.cs.washington.edu/~shayan/atsp.pdf</a>.</p>
<p>[2] M. X. Goemans, <em>Lecture notes on flows and cuts</em>, Handout 18, Massachusetts Institute of Technology, Cambridge, MA, 2009 <a href="http://www-math.mit.edu/~goemans/18433S09/flowscuts.pdf">http://www-math.mit.edu/~goemans/18433S09/flowscuts.pdf</a>.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Held-Karp Relaxation]]></title>
            <link href="https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/?utm_source=atom_feed" rel="alternate" type="text/html" />
            
            
                <id>https://blog.scientific-python.org/networkx/atsp/held-karp-relaxation/</id>
            
            
            <published>2021-04-21T00:00:00+00:00</published>
            <updated>2021-04-21T00:00:00+00:00</updated>
            
            
<content type="html"><![CDATA[<blockquote>Brief explanation of the Held-Karp relaxation and why it cannot be solved directly</blockquote><p>In linear programming, we sometimes need to take what would be an integer program and &lsquo;relax&rsquo; it, allowing the variables to take continuous values.
One particular application of this process is the Held-Karp relaxation used in the first part of the Asadpour algorithm for the Asymmetric Traveling Salesman Problem, where it provides a lower bound for the approximation.
Normally the relaxation is written as follows.</p>
<p>$$
\begin{array}{c l l}
\text{min} &amp; \sum_{a} c(a)x_a \\\
\text{s.t.} &amp; x(\delta^+(U)) \geqslant 1 &amp; \forall\ U \subset V \text{ and } U \not= \emptyset \\\
&amp; x(\delta^+(v)) = x(\delta^-(v)) = 1 &amp; \forall\ v \in V \\\
&amp; x_a \geqslant 0 &amp; \forall\ a
\end{array}
$$</p>
<p>This is a convenient way to write the program, but if we want to solve it, and we definitely do, we need it written in standard form for a linear program.
Standard form is represented using a matrix for the set of constraints and vectors for the objective function.
It is shown below</p>
<p>$$
\begin{array}{c l}
\text{min} &amp; Z = c^TX \\\
\text{s.t.} &amp; AX = b \\\
&amp; X \geqslant 0
\end{array}
$$</p>
<p>Where $c$ is the coefficient vector for the objective function, $X$ is the vector of the values of all of the variables, $A$ is the coefficient matrix for the constraints, and $b$ is a vector of what the constraints are equal to.
Once a linear program is in this form there are efficient algorithms which can solve it.</p>
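<p>For instance, SciPy&rsquo;s <code>linprog</code> accepts exactly this shape. A toy program (not the Held-Karp relaxation; the numbers are made up) minimizing $x_1 + 2x_2$ subject to $x_1 + x_2 = 1$ and $x_1 - x_3 = 0.5$, where $x_3$ is a surplus variable encoding $x_1 \geqslant 0.5$:</p>

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0, 0.0])             # the surplus variable costs nothing
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, -1.0]])
b = np.array([1.0, 0.5])
# minimize c^T X subject to A X = b, X >= 0
result = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))
```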
<p>In the Held-Karp relaxation, the objective function is a summation, which we can expand term by term.
If there are $n$ arcs, then it becomes</p>
<p>$$
\sum_{a} c(a)x_a = c(1)x_1 + c(2)x_2 + c(3)x_3 + \dots + c(n)x_n
$$</p>
<p>Where $c(a)$ is the weight of arc $a$ in the graph.
From here it is easy to convert the objective function into two vectors which satisfy the standard form.</p>
<p>$$
\begin{array}{rCl}
c &amp;=&amp; \begin{bmatrix}
c_1 &amp; c_2 &amp; c_3 &amp; \dots &amp; c_n
\end{bmatrix}^T \\\
X &amp;=&amp; \begin{bmatrix}
x_1 &amp; x_2 &amp; x_3 &amp; \dots &amp; x_n
\end{bmatrix}^T
\end{array}
$$</p>
<p>Now we have to convert the constraints to be in standard form.
First and foremost, notice that the Held-Karp relaxation contains $x_a \geqslant 0\ \forall\ a$ and the standard form uses $X \geqslant 0$, so these constraints match already and no work is needed.
As for the others&hellip; well they do need some work.</p>
<p>Starting with the first constraint in the Held-Karp relaxation, $x(\delta^+(U)) \geqslant 1\ \forall\ U \subset V$ and $U \not= \emptyset$.
This constraint specifies that for every subset of the vertex set $V$, that subset must have at least one arc with its tail in $U$ and its head not in $U$.
For any given $U$, the paper defines $\delta^+(U) = \{a = (u, v) \in A: u \in U, v \not\in U\}$, where $A$ is the set of all arcs in the graph, so the coefficients on arcs not in $\delta^+(U)$ are zero.
Arcs in $\delta^+(U)$ have a coefficient of $1$ as their full weight is counted as part of $x(\delta^+(U))$.
We know that there are about $2^{|V|}$ subsets of the vertex set $V$, so this constraint adds that many rows to the constraint matrix $A$.</p>
<p>Moving to the next constraint, $x(\delta^+(v)) = x(\delta^-(v)) = 1$, we first need to split it in two.</p>
<p>$$
\begin{array}{rCl}
x(\delta^+(v)) &amp;=&amp; 1 \\\
x(\delta^-(v)) &amp;=&amp; 1
\end{array}
$$</p>
<p>Similar to the last constraint, each of these say that the number of arcs entering and leaving a vertex in the graph need to equal one.
For each vertex $v$ we find all the arcs which start at $v$ and those are the members of $\delta^+(v)$, so they have a weight of 1 and all others have a weight of zero.
The opposite is true for $\delta^-(v)$, every vertex which has a head on $v$ has a weight or coefficient of 1 while the rest have a weight of zero.
This adds $2 \times |V|$ rows to $A$, the coefficient matrix which brings the total to $2^{|V|} + 2|V|$ rows.</p>
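<p>For a tiny graph we can build these rows explicitly. The sketch below uses the $2^{|V|} - 2$ non-empty proper subsets for the subtour rows and one column per arc, ignoring slack and surplus columns for now; all names are illustrative:</p>

```python
import itertools

import networkx as nx

G = nx.complete_graph(3, create_using=nx.DiGraph)
arcs = list(G.edges)
nodes = list(G.nodes)

rows = []
# subtour elimination: one row per non-empty proper subset U of V,
# with a 1 for every arc leaving U
for r in range(1, len(nodes)):
    for U in map(set, itertools.combinations(nodes, r)):
        rows.append([1 if u in U and v not in U else 0 for u, v in arcs])
# degree constraints: an out-degree and an in-degree row per vertex
for v in nodes:
    rows.append([1 if u == v else 0 for u, _ in arcs])
    rows.append([1 if w == v else 0 for _, w in arcs])
```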
<h2 id="the-impossible-size-of-a">The Impossible Size of $A$<a class="headerlink" href="#the-impossible-size-of-a" title="Link to this heading">#</a></h2>
<p>We already know that $A$ will have $2^{|V|} + 2|V|$ rows.
But how many columns will $A$ have?
We know that each arc is a variable, so there are at least $|E|$ columns, but in a traditional matrix form of a linear program, we have to introduce slack and surplus variables so that $AX = b$ and not $AX \geqslant b$ or any other inequality.
The $2|V|$ rows already comply with this requirement, but the rows created from the subsets of $V$ do <em>not</em>; those rows only require that $x(\delta^+(U)) \geqslant 1$, so we introduce a surplus variable for each of them, bringing the column count to $|E| + 2^{|V|}$.</p>
<p>Now, the Held-Karp relaxation performed in the Asadpour algorithm is done on the complete bi-directed graph.
For a graph with $n$ vertices, there will be $2 \times \binom{n}{2}$ arcs in the graph.
The updated value for the size of $A$ is then that it is a</p>
<p>$$
\left(2^n + 2n \right)\times \left(2\binom{n}{2} + 2^n\right)
$$</p>
<p>matrix.
This is <em>very</em> large.
For $n = 100$ there are $1.606 \times 10^{60}$ elements in the matrix.
Allocating a measly 8 bits per entry still consumes roughly $1.6 \times 10^{51}$ gigabytes of memory.</p>
<p>This is an impossible amount of memory for any computer that we could run NetworkX on.</p>
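<p>The arithmetic above is simple to verify (one byte, i.e. 8 bits, per entry is assumed):</p>

```python
import math

n = 100
rows = 2 ** n + 2 * n
cols = 2 * math.comb(n, 2) + 2 ** n
entries = rows * cols
gigabytes = entries / 10 ** 9              # one byte per entry
```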
<h2 id="solution">Solution<a class="headerlink" href="#solution" title="Link to this heading">#</a></h2>
<p>The Held-Karp relaxation <em>must</em> be solved in the Asadpour Asymmetric Traveling Salesman Problem algorithm, but clearly putting it into standard form is not possible.
This means that we will not be able to use SciPy&rsquo;s linprog method, as I had hoped.
I will instead have to research and write an ellipsoid method solver, which hopefully will be able to solve the Held-Karp relaxation in both polynomial time and a practical amount of memory.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="taxonomy:Tags" term="gsoc" label="gsoc" />
                             
                                <category scheme="taxonomy:Tags" term="networkx" label="networkx" />
                             
                                <category scheme="taxonomy:Tags" term="traveling-salesman-problem" label="traveling-salesman-problem" />
                            
                        
                    
                
            
        </entry>
    
</feed>
