<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Inductio Ex Machina</title>
	<atom:link href="http://conflate.net/inductio/feed/" rel="self" type="application/rss+xml" />
	<link>http://conflate.net/inductio</link>
	<description>Thoughts on Machine Learning and Inference</description>
	<pubDate>Mon, 21 Jul 2008 11:18:19 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>Evaluation Methods for Machine Learning</title>
		<link>http://conflate.net/inductio/2008/07/evaluation-methods-for-machine-learning/</link>
		<comments>http://conflate.net/inductio/2008/07/evaluation-methods-for-machine-learning/#comments</comments>
		<pubDate>Mon, 21 Jul 2008 11:18:19 +0000</pubDate>
		<dc:creator>Mark Reid</dc:creator>
		
		<category><![CDATA[Philosophy]]></category>

		<guid isPermaLink="false">http://conflate.net/inductio/?p=45</guid>
		<description><![CDATA[Some thoughts on the workshop on evaluation methods that I attended as part of ICML 2008 in Helsinki.]]></description>
			<content:encoded><![CDATA[<p>Although I wasn&#8217;t able to attend the talks at <a href="http://icml2008.cs.helsinki.fi/" onclick="javascript:urchinTracker ('/outbound/article/icml2008.cs.helsinki.fi');">ICML 2008</a> I was able to participate in the <a href="http://www.site.uottawa.ca/ICML08WS/" onclick="javascript:urchinTracker ('/outbound/article/www.site.uottawa.ca');">Workshop on Evaluation Methods for Machine Learning</a> run by William Klement, <a href="http://www.site.uottawa.ca/~cdrummon/" onclick="javascript:urchinTracker ('/outbound/article/www.site.uottawa.ca');">Chris Drummond</a>, and <a href="http://www.site.uottawa.ca/~nat/" onclick="javascript:urchinTracker ('/outbound/article/www.site.uottawa.ca');">Nathalie Japkowicz</a>.</p>

<p>This workshop at ICML was a continuation of previous workshops held at AAAI that aim to cast a critical eye on the methods used in machine learning to experimentally evaluate the performance of algorithms.</p>

<p>It kicked off with a series of mini debates with Nathalie and Chris articulating the opposing sides. The questions included the following:</p>

<ul>
<li>Should we change how evaluation is done?</li>
<li>Is evaluation central to empirical work?</li>
<li>Are statistical tests critical to evaluation?</li>
<li>Are the UCI data sets sufficient for evaluation?</li>
</ul>

<p>There were three papers I particularly liked: <a href="http://www.ailab.si/janez/" onclick="javascript:urchinTracker ('/outbound/article/www.ailab.si');">Janez Demsar</a>&#8217;s talk &#8220;<a href="http://www.site.uottawa.ca/ICML08WS/papers/J_Demsar.pdf" onclick="javascript:urchinTracker ('/outbound/article/www.site.uottawa.ca');">On the Appropriateness of Statistical Tests in Machine Learning</a>&#8221;, <a href="http://www.cs.cmu.edu/~elaw/" onclick="javascript:urchinTracker ('/outbound/article/www.cs.cmu.edu');">Edith Law</a>&#8217;s &#8220;<a href="http://www.site.uottawa.ca/ICML08WS/papers/E_Law.pdf" onclick="javascript:urchinTracker ('/outbound/article/www.site.uottawa.ca');">The Problem of Accuracy as an Evaluation Criterion</a>&#8221;, and <a href="http://www.site.uottawa.ca/~cdrummon/" onclick="javascript:urchinTracker ('/outbound/article/www.site.uottawa.ca');">Chris Drummond</a>&#8217;s call for a mild-mannered revolution &#8220;<a href="http://www.site.uottawa.ca/ICML08WS/papers/C_Drummond.pdf" onclick="javascript:urchinTracker ('/outbound/article/www.site.uottawa.ca');">Finding a Balance between Anarchy and Orthodoxy</a>&#8221;.</p>

<p>Janez&#8217;s talk touched on a number of criticisms that <a href="http://conflate.net/inductio/2008/04/the-earth-is-round/" >I had found in Jacob Cohen&#8217;s paper &#8220;The Earth is Round (p &lt; 0.05)&#8221;</a> making the case that people often incorrectly report and incorrectly interpret p-values for statistical tests. Unfortunately, as Janez points out, since machine learning is a discipline that (rightly) places emphasis on results it is difficult as a reviewer to reject a paper that presents an ill-motivated and confusing idea if its authors have shown that, statistically, it outperforms similar approaches.</p>

<p>Edith&#8217;s talk argued that accuracy is sometimes a poor measure of performance, making all this concern over whether we are constructing statistical tests for it (or AUC) moot. In particular, for tasks like salient region detection in images, language translation and music tagging there is no single correct region, translation or tag. Whether a particular region/translation/tag is &#8220;correct&#8221; is impossible to determine independently of the more difficult tasks of image recognition/language understanding/music identification. Solving these for the purposes of evaluation would make a solution to the smaller tasks redundant. Instead of focusing on evaluation of the smaller tasks, Edith suggests ways in which games that humans play on the web &#8212; such as the <a href="http://www.espgame.org/" onclick="javascript:urchinTracker ('/outbound/article/www.espgame.org');">ESP Game</a> &#8212; can be used to evaluate machine performance on these tasks by playing learning algorithms against humans.</p>

<p>Finally, Chris&#8217;s talk made the bold claim that the way we approach evaluation in machine learning is an &#8220;impoverished realization of a controversial methodology&#8221;, namely statistical hypothesis testing. &#8220;Impoverished&#8221; because when we do do hypothesis testing it is in the narrowest of senses, mainly to test that my algorithm is better than yours on this handful of data sets. &#8220;Controversial&#8221; since many believe science to have social, exploratory and accidental aspects &#8212; much more than just the clinical proposing of hypotheses for careful testing.</p>

<p>What these papers and the workshop as a whole showed me was how unresolved my position is on these and other questions regarding evaluation. On the one hand I spent a lot of time painstakingly setting up, running and analysing experiments for my <a href="http://www.library.unsw.edu.au/~thesis/adt-NUN/public/adt-NUN20070512.173744/index.html" onclick="javascript:urchinTracker ('/outbound/article/www.library.unsw.edu.au');">PhD research</a> on inductive transfer in order to evaluate the methods I was proposing. I taught myself how to correctly control for confounding factors, use the <a href="http://en.wikipedia.org/wiki/Bonferroni_correction" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">Bonferroni correction</a> to adjust significance levels and other esoterica of statistical testing. Applying all these procedures carefully to my work felt very scientific and I was able to create many pretty graphs and tables replete with confidence intervals, p-values and the like. On the other hand &#8212; and with sufficient hindsight &#8212; it&#8217;s not clear how much value this type of analysis added to the thesis overall (apart from demonstrating to my reviewers that I could do it).</p>

<p>The dilemma is this: when one algorithm or approach clearly dominates another, details such as p-values, t-tests and the like only obscure the results; and when two algorithms are essentially indistinguishable, using &#8220;significance&#8221; levels to pry them apart seems to be grasping at straws.</p>

<p>That&#8217;s not to say that we should get rid of empirical evaluation altogether. Rather, we should carefully choose (or create) our data sets and empirical questions so as to gain as much insight as possible and go beyond &#8220;my algorithm is better than yours&#8221;. Statistical tests should not mark the end of an experimental evaluation but rather act as a starting point for further questions and carefully constructed experiments that resolve those questions.</p>
]]></content:encoded>
			<wfw:commentRss>http://conflate.net/inductio/2008/07/evaluation-methods-for-machine-learning/feed/</wfw:commentRss>
		</item>
		<item>
		<title>ICML Discussion Site</title>
		<link>http://conflate.net/inductio/2008/07/icml-discussion-site/</link>
		<comments>http://conflate.net/inductio/2008/07/icml-discussion-site/#comments</comments>
		<pubDate>Tue, 01 Jul 2008 15:40:23 +0000</pubDate>
		<dc:creator>Mark Reid</dc:creator>
		
		<category><![CDATA[Community]]></category>

		<guid isPermaLink="false">http://conflate.net/inductio/?p=44</guid>
		<description><![CDATA[A little while ago, John Langford suggested that a discussion site be set up for ICML that allows attendees and others to talk about the accepted papers.

Having played around with various wiki systems and discussion sites in the past, I volunteered to help set something up. As John has noted on his blog the discussion [...]]]></description>
			<content:encoded><![CDATA[<p>A little while ago, John Langford <a href="http://hunch.net/?p=327" onclick="javascript:urchinTracker ('/outbound/article/hunch.net');">suggested</a> that a discussion site be set up for ICML that allows attendees and others to talk about the accepted papers.</p>

<p>Having played around with various wiki systems and discussion sites in the past, I volunteered to help set something up. As John has <a href="http://hunch.net/?p=335" onclick="javascript:urchinTracker ('/outbound/article/hunch.net');">noted on his blog</a> the discussion site is now <a href="http://conflate.net/icml" >up and running</a>.</p>

<p>The main aim with this first attempt was to provide basic functionality: papers can be browsed by author, title and keyword; each paper has a discussion thread where anyone can leave comments. There are no comments at the time of writing this but I&#8217;m hoping this will change once the conference gets underway.</p>

<p>Provided there are no disasters, the site will remain up for as long as it is useful. Ultimately, I&#8217;d like to add earlier conference proceedings to the site and ensure future conferences can be added as well. We will see how it goes this year and incorporate any feedback into future versions of the site.</p>

<p>For those interested in the technical details, I used <a href="http://wiki.splitbrain.org/wiki:dokuwiki" onclick="javascript:urchinTracker ('/outbound/article/wiki.splitbrain.org');">DokuWiki</a> as the engine for the site along with a number of plugins, most importantly the <a href="http://wiki.splitbrain.org/plugin:discussion" onclick="javascript:urchinTracker ('/outbound/article/wiki.splitbrain.org');">discussion plugin</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://conflate.net/inductio/2008/07/icml-discussion-site/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Visualising 19th Century Reading in Australia</title>
		<link>http://conflate.net/inductio/2008/06/visualising-reading/</link>
		<comments>http://conflate.net/inductio/2008/06/visualising-reading/#comments</comments>
		<pubDate>Tue, 17 Jun 2008 03:10:34 +0000</pubDate>
		<dc:creator>Mark Reid</dc:creator>
		
		<category><![CDATA[Application]]></category>

		<category><![CDATA[Exposition]]></category>

		<category><![CDATA[Reading]]></category>

		<category><![CDATA[books]]></category>

		<category><![CDATA[PCA]]></category>

		<category><![CDATA[Processing]]></category>

		<category><![CDATA[Visualisation]]></category>

		<guid isPermaLink="false">http://conflate.net/inductio/?p=40</guid>
		<description><![CDATA[A description of a visualisation of some 19th century Australian borrowing records from the Australian Common Readers Project.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently spent a bit of time collaborating with my wife on a research project. Research collaboration by couples is not new but given that Julieanne is a <a href="http://cass.anu.edu.au/humanities/school_sites/staff.php" onclick="javascript:urchinTracker ('/outbound/article/cass.anu.edu.au');">lecturer in the English program</a> and I&#8217;m part of the <a href="http://csl.cecs.anu.edu.au/" onclick="javascript:urchinTracker ('/outbound/article/csl.cecs.anu.edu.au');">computer sciences laboratory</a>, this piece of joint research is a little unusual.</p>

<p>The rest of this post describes the intersection of our interests &#8212; data from the Australian Common Reader Project &#8212; and the visualisation tool I wrote to explore it. The tool itself is based on a simple application of linear Principal Component Analysis (PCA). I&#8217;ll attempt to explain it here in such a way that readers who have not studied this technique might still be able to make use of the tool.</p>

<h4>The Australian Common Reader Project</h4>

<p>One of Julieanne&#8217;s research interests is the Australian audience of the late 19th and early 20th centuries. As part of her PhD, she made use of an amazing database that is part of the <a href="http://www.api-network.com/hosted/acrp/" onclick="javascript:urchinTracker ('/outbound/article/www.api-network.com');">Australian Common Reader Project</a> &#8212; a project that has collected and entered library borrowing records from Australian libraries along with annotations about when books were borrowed, their genres, borrower occupations, author information, <i>etc</i>. This sort of information makes it possible for Australian literature and cultural studies academics to ask empirical questions about Australian readers&#8217; relationship with books and periodicals.</p>

<p>Ever on the lookout for <a href="http://conflate.net/inductio/2008/02/a-meta-index-of-data-sets/" >interesting data sets</a>, I suggested that we apply some basic data analysis tools to the database to see what kind of relationships between books and borrowers we might find. When asked if we could have access to the database, <a href="http://www.humanities.curtin.edu.au/staff.cfm/t.dolin" onclick="javascript:urchinTracker ('/outbound/article/www.humanities.curtin.edu.au');">Tim Dolin</a> graciously agreed and enlisted <a href="http://www.humanities.curtin.edu.au/staff.cfm/j.ensor" onclick="javascript:urchinTracker ('/outbound/article/www.humanities.curtin.edu.au');">Jason Ensor</a> to help with our technical questions.</p>

<h4>Books and Borrowers</h4>

<p>After an initial inspection, my first thought was to try to visualise the similarity of the books in the database as measured by the number of borrowers they have in common. 
The full database contains 99,692 loans of 7,078 different books from 11 libraries, made by 2,642 different people. To make this more manageable, I focused on books that had at least 20 different borrowers and only considered people who had borrowed at least one of these books.
This distilled the database down to a simple table with each row representing one of 1,616 books and each column representing one of 2,473 people.</p>

<table class="aligncenter">
<caption>Table 1: A portion of the book and borrower table. A 1 indicates that the borrower (column)
borrowed the book (row) at least once. A 0 indicates that the borrower never borrowed the book.
</caption>
<tr><th rowspan="2" class="title">Book<br/>ID</th><th colspan="4" class="title">Borrower ID</th></tr>
<tr><th>1</th><th>2</th><th>&#8230;</th><th>2,473</th></tr>
<tr><th>1</th><td>1</td><td>0</td><td>&#8230;</td><td>1</td></tr>
<tr><th>2</th><td>1</td><td>1</td><td>&#8230;</td><td>0</td></tr>
<tr><th>3</th><td>0</td><td>0</td><td>&#8230;</td><td>1</td></tr>
<tr><th>&#8230;</th><td>&#8230;</td><td>&#8230;</td><td>&#8230;</td><td>&#8230;</td></tr>
<tr><th>1,616</th><td>1</td><td>1</td><td>&#8230;</td><td>1</td></tr>
</table>

<p>Conceptually, each cell in the table contains a 1 if the person associated with the cell&#8217;s column borrowed the book associated with the cell&#8217;s row. If there was no such loan between a given book and borrower the corresponding cell contains a 0. For example, Table 1 shows that book 2 was borrowed (at least once) by borrower 1 but never by borrower 2,473.</p>
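<p>The construction of a table like Table 1 can be sketched in a few lines of Python. This is only an illustration, not the SQL preprocessing actually used: the function name <code>build_table</code> and the assumption that loans arrive as (book, borrower) pairs are hypothetical.</p>

```python
import numpy as np

def build_table(loans, min_borrowers=20):
    """Build a 0/1 book-by-borrower table from loan records.

    `loans` is assumed (hypothetically) to be an iterable of
    (book_id, borrower_id) pairs, one pair per loan.
    """
    # Collect the set of distinct borrowers for every book.
    borrowers_of = {}
    for book, person in loans:
        borrowers_of.setdefault(book, set()).add(person)

    # Keep only books with enough distinct borrowers ...
    books = sorted(b for b, ps in borrowers_of.items() if len(ps) >= min_borrowers)
    # ... and only the people who borrowed at least one of those books.
    people = sorted(set().union(*(borrowers_of[b] for b in books)))

    column = {p: j for j, p in enumerate(people)}
    table = np.zeros((len(books), len(people)), dtype=int)
    for i, b in enumerate(books):
        for p in borrowers_of[b]:
            table[i, column[p]] = 1  # repeat loans still give a single 1
    return books, people, table
```

<p>With the threshold of 20 borrowers described above, a table built this way would have one row per retained book and one column per retained borrower.</p>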

<h4>Book Similarity</h4>

<p>The table view of the books and their borrowers does not readily lend itself to insight. The approach we took to get a better picture of this information was to plot each book as a point on a graph so that similar books are placed closer together than dissimilar books. To do this, a notion of what &#8220;similar books&#8221; means is required.</p>

<p>Mathematically, row <img src='/inductio/wp-content/plugins/latexrender/pictures/865c0c0b4ab0e063e5caa3387c1a8741_1.0pt.png' title='i' alt='i'  style="vertical-align:-1.0pt;" > of Table 1 can be represented as a vector <img src='/inductio/wp-content/plugins/latexrender/pictures/65e8f93906568bf486ac075da3136b42_2.49998pt.png' title='\mathbf{b}_i' alt='\mathbf{b}_i'  style="vertical-align:-2.49998pt;" > of 1s and 0s. The value of the cell in the <img src='/inductio/wp-content/plugins/latexrender/pictures/363b122c528f54df4a0446b6bab05515_2.94444pt.png' title='j' alt='j'  style="vertical-align:-2.94444pt;" ><sup>th</sup> column of that row will be denoted <img src='/inductio/wp-content/plugins/latexrender/pictures/0a12651908668b936d408556cc9f74db_3.86108pt.png' title='b_{i,j}' alt='b_{i,j}'  style="vertical-align:-3.86108pt;" >. For example, the 2<sup>nd</sup> row in the table can be written as the vector <img src='/inductio/wp-content/plugins/latexrender/pictures/0d8af057b3c54400b66cc79af2916f3a_3.5pt.png' title='\mathbf{b}_2 = (1,1,\ldots,0)' alt='\mathbf{b}_2 = (1,1,\ldots,0)'  style="vertical-align:-3.5pt;" > and the value in its first column is <img src='/inductio/wp-content/plugins/latexrender/pictures/eea1387f63bb33e4e592c7e3565aa0db_3.86108pt.png' title='b_{2,1} = 1' alt='b_{2,1} = 1'  style="vertical-align:-3.86108pt;" >.</p>

<p>A crude measure of the similarity between book 1 and book 2 can be computed from this table by counting how many borrowers they have in common. That is, the number of columns that have a <code>1</code> in the row for book 1 and the row for book 2.</p>

<p>In terms of the vector representation, this similarity measure is simply the &#8220;<a href="http://en.wikipedia.org/wiki/Inner_product_space" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">inner product</a>&#8221; between <img src='/inductio/wp-content/plugins/latexrender/pictures/f6076202b94ea8b8b6db2d5cb364ba08_2.49998pt.png' title='\mathbf{b}_1' alt='\mathbf{b}_1'  style="vertical-align:-2.49998pt;" > and <img src='/inductio/wp-content/plugins/latexrender/pictures/ac9f410fbffe1ccaca1cc5a2944baf8d_2.49998pt.png' title='\mathbf{b}_2' alt='\mathbf{b}_2'  style="vertical-align:-2.49998pt;" > and is written <img src='/inductio/wp-content/plugins/latexrender/pictures/0fcf061ceaa6fc34348254be077c6084_3.86108pt.png' title='\left&lt;\mathbf{b}_1,\mathbf{b}_2\right&gt; = b_{1,1}b_{2,1} + \cdots + b_{1,N}b_{2,N}' alt='\left&lt;\mathbf{b}_1,\mathbf{b}_2\right&gt; = b_{1,1}b_{2,1} + \cdots + b_{1,N}b_{2,N}'  style="vertical-align:-3.86108pt;" > where N = 2,473 is the total number of borrowers.</p>

<p>It turns out that simply counting the number of borrowers two books have in common is not a great measure of similarity. The problem is that two very popular books, each with 100 borrowers, that only share 10% of their borrowers would be considered as similar as two books, each with 10 borrowers, that share all of their borrowers. An easy way to correct this is to &#8220;normalise&#8221; the borrower counts by making sure the similarity of a book with itself is always equal to 1. A common way of doing this is by dividing the inner product of two books by the &#8220;size&#8221; of each of the vectors for those books.</p>

<p>Mathematically, we will denote the size of a book vector <img src='/inductio/wp-content/plugins/latexrender/pictures/65e8f93906568bf486ac075da3136b42_2.49998pt.png' title='\mathbf{b}_i' alt='\mathbf{b}_i'  style="vertical-align:-2.49998pt;" > as <img src='/inductio/wp-content/plugins/latexrender/pictures/bd5421e9469469622d074174997073a1_4.05008pt.png' title='\|\mathbf{b}_i\| = \sqrt{\left&lt;\mathbf{b}_i,\mathbf{b}_i\right&gt;}' alt='\|\mathbf{b}_i\| = \sqrt{\left&lt;\mathbf{b}_i,\mathbf{b}_i\right&gt;}'  style="vertical-align:-4.05008pt;" >. The similarity between two books then becomes:</p>

<p><center>
<img src='/inductio/wp-content/plugins/latexrender/pictures/490f044820658ce4d7350b5f5498ffd9_10.7206pt.png' title='\displaystyle&#13;&#10;    \text{sim}(\mathbf{b}_i,\mathbf{b}_j) &#13;&#10;     = \frac{\left&lt;\mathbf{b}_i,\mathbf{b}_j\right&gt;}{\|\mathbf{b}_i\|\|\mathbf{b}_j\|}&#13;&#10;' alt='\displaystyle&#13;&#10;    \text{sim}(\mathbf{b}_i,\mathbf{b}_j) &#13;&#10;     = \frac{\left&lt;\mathbf{b}_i,\mathbf{b}_j\right&gt;}{\|\mathbf{b}_i\|\|\mathbf{b}_j\|}&#13;&#10;'  style="vertical-align:-10.7206pt;" >
</center></p>
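<p>As a minimal sketch of this formula (using NumPy, which is my choice for illustration and not necessarily what the original analysis used), the normalised similarity can be computed as follows. The example vectors are made up to mirror the popular-versus-small books scenario described earlier.</p>

```python
import numpy as np

def sim(b_i, b_j):
    """Inner product of two book vectors divided by the product of their sizes.

    Assumes neither vector is all zeros, i.e. every book has a borrower.
    """
    b_i = np.asarray(b_i, dtype=float)
    b_j = np.asarray(b_j, dtype=float)
    return float(b_i @ b_j / (np.linalg.norm(b_i) * np.linalg.norm(b_j)))

# Two popular books with 100 borrowers each, sharing only 10 of them.
popular_a = np.zeros(200)
popular_a[:100] = 1
popular_b = np.zeros(200)
popular_b[90:190] = 1

# Two small books with the same 10 borrowers.
small_a = np.zeros(200)
small_a[:10] = 1
small_b = small_a.copy()

print(round(sim(popular_a, popular_b), 3))
print(round(sim(small_a, small_b), 3))
```

<p>Both pairs share exactly 10 borrowers, so the raw inner product cannot tell them apart, but the normalised similarity is 0.1 for the popular pair and 1 for the small pair.</p>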

<h4>Principal Component Analysis</h4>

<p>Now that we have a similarity measure between books the idea is to create a plot of points &#8212; one per book &#8212; so that similar books are placed close together and dissimilar books are kept far apart.</p>

<p>A standard technique for doing this is <a href="http://en.wikipedia.org/wiki/Principal_components_analysis" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">Principal Component Analysis</a>. Intuitively, this technique aims to find a way of reducing the number of coordinates in each book vector  in such a way that when the similarity between two books is computed using these smaller vectors it is as close as possible to the original similarity. That is, PCA creates a new table that represents books in terms of only two columns.</p>

<table class="aligncenter">
<caption>Table 2: A portion of the book table after PCA. The values in the two new columns (PCA IDs) can be used to plot the books.
</caption>
<tr><th rowspan="2" class="title">Book<br/>ID</th><th colspan="2" class="title">PCA ID</th></tr>
<tr>                 <th>1</th><th>2</th></tr>
<tr><th>1</th><td>-8.2</td><td>2.3</td></tr>
<tr><th>2</th><td>0.4</td><td>-4.3</td></tr>
<tr><th>3</th><td>-1.3</td><td>-3.7</td></tr>
<tr><th>&#8230;</th><td>&#8230;</td><td>&#8230;</td></tr>
<tr><th>1,616</th><td>2.2</td><td>-5.6</td></tr>
</table>

<p>Table 2 gives an example of the book table after PCA that reduces the book vectors (rows) from 2,473 to two entries. The PCA columns cannot be as easily interpreted as the borrower columns in Table 1, but their values are such that the similarities between books computed from Table 2 are roughly the same as those computed from Table 1. That is, if <img src='/inductio/wp-content/plugins/latexrender/pictures/cda9fb280e5c02cdc4d5ef3c359b68fe_3.5pt.png' title='\mathbf{c}_1 = (-8.2,2.3)' alt='\mathbf{c}_1 = (-8.2,2.3)'  style="vertical-align:-3.5pt;" > and <img src='/inductio/wp-content/plugins/latexrender/pictures/e6d2369d0605621d0ab577fbb97240e4_3.5pt.png' title='\mathbf{c}_2=(0.4,-4.3)' alt='\mathbf{c}_2=(0.4,-4.3)'  style="vertical-align:-3.5pt;" > are the vectors
for the first two rows of Table 2 then <img src='/inductio/wp-content/plugins/latexrender/pictures/4d1ffafc74b72569015a8a2ce809317e_3.5pt.png' title='\text{sim}(\mathbf{c}_1,\mathbf{c}_2)' alt='\text{sim}(\mathbf{c}_1,\mathbf{c}_2)'  style="vertical-align:-3.5pt;" >
would be close to <img src='/inductio/wp-content/plugins/latexrender/pictures/96c218f9795bf29f5e3654f3fbd19d1d_3.5pt.png' title='\text{sim}(\mathbf{b}_1,\mathbf{b}_2)' alt='\text{sim}(\mathbf{b}_1,\mathbf{b}_2)'  style="vertical-align:-3.5pt;" >, the similarity of the
first two rows in Table 1.<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup></p>
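<p>For readers who want to see the reduction step concretely, here is a rough sketch of a two-component PCA using NumPy&#8217;s singular value decomposition. It is not the code used for the plots in this post. Scaling each row to unit length first (so that inner products equal the normalised similarity) and centring the columns are both assumptions made for this illustration, and the name <code>pca_coordinates</code> is hypothetical.</p>

```python
import numpy as np

def pca_coordinates(table, n_components=2):
    """Reduce each row of a 0/1 book-by-borrower table to a few coordinates."""
    X = np.asarray(table, dtype=float)
    # Assumption: scale rows to unit length so that inner products between
    # rows equal the normalised similarity. Rows must not be all zeros.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Standard PCA step: centre every column on its mean.
    X = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    # Coordinates of each book along the leading directions (same as X @ Vt.T).
    return U[:, :n_components] * S[:n_components]

# A tiny made-up table: 4 books (rows) by 4 borrowers (columns).
toy_table = [[1, 1, 0, 0],
             [1, 1, 1, 0],
             [0, 0, 1, 1],
             [0, 1, 1, 1]]
coords = pca_coordinates(toy_table)
print(coords.shape)
```

<p>Each row of <code>coords</code> plays the role of a row in Table 2: two numbers per book that can be used directly as plotting coordinates.</p>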

<h4>Visualising the Data</h4>

<p>Figure 1 shows a plot of the PCA reduced book data. Each circle represents one of the 1,616 books, plotted according to the coordinates in a table like Table 2. The size of each circle indicates how many borrowers each book had and its colour indicates which library the book belongs to.<sup id="fnref:2"><a href="#fn:2" rel="footnote">2</a></sup></p>

<div class="image">
<img src="http://conflate.net/inductio/wp-content/uploads/2008/06/all_libraries.png" alt="Plot of the books across all libraries in the ACRP database" width="550" class="aligncenter wp-image-43" />
<p>Figure 1: A PCA plot of all the books in the ACRP database coloured according to which library they belong to. The size of each circle indicates the number of borrowers of the corresponding book.</p>
</div>

<p>One immediate observation is that books are clustered according to which library they belong to. This is not too surprising since the books in a library limit what borrowers from that library can read. This means it is likely that two voracious readers that frequent the same library will read the same books. This, in turn, will mean the similarity of two books from a library will be higher than books from different libraries as there are very few borrowers that use more than one library.</p>

<h4>Drilling Down and Interacting</h4>

<p>To get a better picture of the data, we decided to focus on books from a single library to avoid this clustering. The library we focused on was the <a href="http://en.wikipedia.org/wiki/Lambton,_New_South_Wales" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">Lambton</a> Miners&#8217; and Mechanics&#8217; Institute in New South Wales. This library had the largest number of loans (20,253) and so was most likely to have interesting similarity data.</p>

<p>There are a total of 789 books in the Lambton institute and 469 borrowers of those books. A separate PCA reduction was performed on this restricted part of the database to create a plot of only the Lambton books.</p>

<p>To make it easier to explore this data, I wrote a simple tool that allows a viewer to interact with the PCA plot. A screenshot from this tool is shown in Figure 2. Once again, larger circles represent books with a larger number of borrowers.</p>

<p>Clicking on the figure will open a new window and, after a short delay, the tool will run. The same page can also be accessed from <a href="/inductio/wp-content/public/acrp/">this link</a>.</p>

<div class="image">
<a href='http://conflate.net/inductio/wp-content/public/acrp/' target="_"><img src="http://conflate.net/inductio/wp-content/uploads/2008/06/acrp.png" alt="Click to open visualisation applet" width="550" class="aligncenter wp-image-41" /></a>
<p>Figure 2: A screenshot of the ACRP visualisation tool showing books from the Lambton Institute. Click the image to run the tool in a new window.</p>
</div>

<p>Instructions describing how to use the tool can be found below it. 
In a nutshell: hovering over a circle will reveal the title of the book corresponding to that circle; clicking on a circle will draw lines to its most similar neighbours; altering the &#8220;Borrowers&#8221; bar will only show books with at least that many borrowers; and altering the &#8220;Similarity&#8221; bar will only draw lines to books with at least that proportion of borrowers in common.</p>
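<p>The filtering behind the two bars can be sketched roughly as below. This is a Python illustration of the idea only, not the Processing code of the tool itself; the function <code>visible_links</code> and its parameters are hypothetical names.</p>

```python
import numpy as np

def visible_links(table, clicked, min_borrowers, min_similarity):
    """Indices of books to link to the clicked book, most similar first.

    `table` is a 0/1 book-by-borrower table in which every book has at
    least one borrower (an assumption of this sketch).
    """
    X = np.asarray(table, dtype=float)
    counts = X.sum(axis=1)      # number of borrowers per book
    norms = np.sqrt(counts)     # size of a 0/1 vector is sqrt(number of 1s)
    sims = (X @ X[clicked]) / (norms * norms[clicked])

    # "Borrowers" bar: hide books with too few borrowers.
    keep = counts >= min_borrowers
    # "Similarity" bar: only link to sufficiently similar books.
    keep = keep & (sims >= min_similarity)
    keep[clicked] = False       # never link a book to itself

    order = np.argsort(-sims)   # most similar first
    return [int(i) for i in order if keep[i]]

# A tiny made-up table: 4 books (rows) by 4 borrowers (columns).
toy_table = [[1, 1, 1, 0],
             [1, 1, 0, 0],
             [0, 0, 0, 1],
             [1, 1, 1, 1]]
print(visible_links(toy_table, clicked=0, min_borrowers=2, min_similarity=0.5))
```

<p>Raising either threshold can only shrink the list of linked books, which matches how the plot thins out as the bars are moved.</p>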

<h4>Future Work and Distant Reading</h4>

<p>Julieanne and I are still at the early stages of our research using the ACRP database. The use of PCA for visualisation was a first step in our pursuit of what <a href="http://en.wikipedia.org/wiki/Franco_Moretti" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">Franco Moretti</a> calls &#8220;distant reading&#8221; &#8212; looking at books as objects and how they are read rather than the &#8220;close reading&#8221; of the text of individual books.</p>

<p>Now that we have this tool, we are able to quickly explore relationships between these books based on the reading habits of Australians at the turn of the century. Of course, there are many caveats that apply to any patterns we might see in these plots. For instance, the similarity between books is only based on the habits of a small number of readers and will be influenced by the peculiarities of the libraries and the books they chose to buy. For this reason, these plots are not intended to provide conclusive answers to questions we might ask.</p>

<p>Instead we hope that exploring the ACRP database in this way will lead us to interesting questions about particular pairs or groups of books that can be followed up by a more thorough analysis of their readers, their text as well as other historical and cultural factors about them.</p>

<h4>Data and Code</h4>

<p>For the technically minded, I have made the code I used to do the visualisation available on <a href="http://github.com/mreid/acrp/tree/master" onclick="javascript:urchinTracker ('/outbound/article/github.com');">GitHub</a>. It is a combination of <a href="http://en.wikipedia.org/wiki/SQL" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">SQL</a> for data preprocessing, <a href="http://www.r-project.org/" onclick="javascript:urchinTracker ('/outbound/article/www.r-project.org');">R</a> for the PCA reduction and <a href="http://processing.org/" onclick="javascript:urchinTracker ('/outbound/article/processing.org');">Processing</a> for creating the visualisation tool. You will also find a number of images and some notes at the same location.</p>

<p>Access to the data that the code acts upon is not mine to give, so the code is primarily to show how I did the visualisation rather than a way to let others analyse the data. If the founders of the <a href="http://www.api-network.com/hosted/acrp/" onclick="javascript:urchinTracker ('/outbound/article/www.api-network.com');">ACRP</a> project decide to release the data to the public at a later date I will link to it from here.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>Technically, the guarantee of the &#8220;closeness&#8221; of the similarity measures only holds on average, that is, over all possible pairs of books. There is no guarantee any particular pair&#8217;s
similarity is estimated well.&#160;<a href="#fnref:1" rev="footnote">&#8617;</a></p>
</li>

<li id="fn:2">
<p>A book can belong to more than one library. In this case one library is chosen at random to determine a circle&#8217;s colour.&#160;<a href="#fnref:2" rev="footnote">&#8617;</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://conflate.net/inductio/2008/06/visualising-reading/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Constructive and Classical Mathematics</title>
		<link>http://conflate.net/inductio/2008/06/constructive-and-classical-mathematics/</link>
		<comments>http://conflate.net/inductio/2008/06/constructive-and-classical-mathematics/#comments</comments>
		<pubDate>Thu, 12 Jun 2008 02:08:12 +0000</pubDate>
		<dc:creator>Mark Reid</dc:creator>
		
		<category><![CDATA[Philosophy]]></category>

		<guid isPermaLink="false">http://conflate.net/inductio/?p=42</guid>
		<description><![CDATA[I have a (very) amateur interest in the philosophy of mathematics. My interest was recently piqued again after finishing the very readable &#8220;Introducing Philosophy of Mathematics&#8221; by Michèle Friend. Since then, I&#8217;ve been a lot more aware of terms like &#8220;constructivist&#8221;, &#8220;realist&#8221;, and &#8220;formalist&#8221; as they apply to mathematics.

Today, I was flicking through the entry [...]]]></description>
			<content:encoded><![CDATA[<p>I have a (very) amateur interest in the philosophy of mathematics. My interest was recently piqued again after finishing the very readable &#8220;<a href="http://www.librarything.com/work/3362656/book/17581191" onclick="javascript:urchinTracker ('/outbound/article/www.librarything.com');">Introducing Philosophy of Mathematics</a>&#8221; by Michèle Friend. Since then, I&#8217;ve been a lot more aware of terms like &#8220;constructivist&#8221;, &#8220;realist&#8221;, and &#8220;formalist&#8221; as they apply to mathematics.</p>

<p>Today, I was flicking through the entry on &#8220;<a href="http://plato.stanford.edu/entries/mathematics-constructive/" onclick="javascript:urchinTracker ('/outbound/article/plato.stanford.edu');">Constructivist Mathematics</a>&#8221; in the <a href="http://plato.stanford.edu/" onclick="javascript:urchinTracker ('/outbound/article/plato.stanford.edu');">Stanford Encyclopedia of Philosophy</a> and found a simple example of some of the problems with the non-constructive take on what disjunction means in mathematical statements. The article calls it &#8220;well-worn&#8221; but I hadn&#8217;t seen it before.</p>

<p>Consider the statement:</p>

<blockquote>
  <p>There exist irrational numbers a and b such that a<sup>b</sup> is rational.</p>
</blockquote>

<p>The article gives a slick proof that this statement is true by invoking the <a href="http://en.wikipedia.org/wiki/Law_of_the_excluded_middle" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">law of the excluded middle</a> (LEM). That is, every number must be either rational or irrational.</p>

<p>Now consider <img src='/inductio/wp-content/plugins/latexrender/pictures/e2f7d525aaaddd64948ae1e8b7bab16d_2.32779pt.png' title='\sqrt{2}^\sqrt{2}' alt='\sqrt{2}^\sqrt{2}'  style="vertical-align:-2.32779pt;" >.  By the LEM, this must be rational or irrational:</p>

<ul>
<li><p>Case 1: If it is rational then we have proved the statement since we know <img src='/inductio/wp-content/plugins/latexrender/pictures/664fcbd2c335fd506336dfcb487d258a_2.32779pt.png' title='a = b = \sqrt{2}' alt='a = b = \sqrt{2}'  style="vertical-align:-2.32779pt;" > is irrational.</p></li>
<li><p>Case 2: If <img src='/inductio/wp-content/plugins/latexrender/pictures/e2f7d525aaaddd64948ae1e8b7bab16d_2.32779pt.png' title='\sqrt{2}^\sqrt{2}' alt='\sqrt{2}^\sqrt{2}'  style="vertical-align:-2.32779pt;" > is irrational then choosing <img src='/inductio/wp-content/plugins/latexrender/pictures/de1e6484fedeff4a4f18353d9ccc3209_2.32779pt.png' title='a = \sqrt{2}^\sqrt{2}' alt='a = \sqrt{2}^\sqrt{2}'  style="vertical-align:-2.32779pt;" > and <img src='/inductio/wp-content/plugins/latexrender/pictures/f433fa929f79d2e2d072be919fd3c404_2.32779pt.png' title='b = \sqrt{2}' alt='b = \sqrt{2}'  style="vertical-align:-2.32779pt;" > as our two irrational numbers gives <img src='/inductio/wp-content/plugins/latexrender/pictures/83dff9386b02b4082277e2a31140876b_2.32779pt.png' title='{\sqrt{2}^{\sqrt{2}^\sqrt{2}}} = {\sqrt{2}^2} = 2' alt='{\sqrt{2}^{\sqrt{2}^\sqrt{2}}} = {\sqrt{2}^2} = 2'  style="vertical-align:-2.32779pt;" > &#8212; a rational number.</p></li>
</ul>
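
<p>The exponent arithmetic in Case 2 can be spelled out step by step. This is only a sketch in LaTeX notation: it assumes the tower is read as <code>(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}</code>, and it uses the rule <code>(x^y)^z = x^{yz}</code>, which holds for positive real <code>x</code>:</p>

<pre><code>\left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}}
  = \sqrt{2}^{\sqrt{2}\cdot\sqrt{2}}
  = \sqrt{2}^{2}
  = 2</code></pre>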

<p>Either way, we&#8217;ve proven the existence of two irrational numbers yielding a rational one.
The problem is that this argument is non-constructive, so we don&#8217;t know which of case 1 and case 2 holds; we only know that one of them must<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>. This is a simple case of <i>reductio ad absurdum</i> in disguise.</p>

<p>As a born-again computer scientist (my undergraduate degree was in pure maths and my PhD in computer science) I&#8217;ve become increasingly suspicious of these sorts of proof and more <a href="http://en.wikipedia.org/wiki/Constructivism_%28mathematics%29" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">constructivist</a> &#8212; even <a href="http://en.wikipedia.org/wiki/Intuitionism" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">intuitionist</a> &#8212; in my tastes. I think the seed of doubt was planted during the awkward discussions of the <a href="http://en.wikipedia.org/wiki/Axiom_of_choice" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">Axiom of Choice</a> in my functional analysis lectures. The sense of unease is summed up nicely in the following joke:</p>

<blockquote>
  <p>The Axiom of Choice is obviously true, the well-ordering principle obviously false, 
  and who can tell about Zorn&#8217;s lemma?</p>
</blockquote>

<p>Of course, all those concepts are equivalent but that&#8217;s far from intuitive.</p>

<p>I don&#8217;t think I&#8217;m extremist enough to take a wholeheartedly computational view of mathematics &#8212; denying all but the computable real numbers and functions, thereby making <a href="http://math.andrej.com/2006/03/27/sometimes-all-functions-are-continuous/" onclick="javascript:urchinTracker ('/outbound/article/math.andrej.com');">all functions continuous</a> &#8212; but it is a tempting view of the subject.</p>

<p>In machine learning, I think there is a fairly pragmatic take on the philosophy of mathematics. For example, classical theorems from functional analysis are used to derive results involving kernels but when it comes to implementation, estimations and approximations are used with abandon. In my opinion, this is a <a href="http://www.daniel-lemire.com/blog/archives/2008/06/05/why-pure-theory-is-wasteful/" onclick="javascript:urchinTracker ('/outbound/article/www.daniel-lemire.com');">healthy way for the theory in this area to proceed</a>. As in physics, if the experimental work reveals inconsistencies with a theory, revisit the maths. If that doesn&#8217;t work, <a href="http://diveintomark.org/archives/2008/06/11/purity" onclick="javascript:urchinTracker ('/outbound/article/diveintomark.org');">talk to the philosophers</a>.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>It turns out, by <a href="http://en.wikipedia.org/wiki/Gelfond's_theorem" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">Gelfond&#8217;s Theorem</a>, that <img src='/inductio/wp-content/plugins/latexrender/pictures/e2f7d525aaaddd64948ae1e8b7bab16d_2.32779pt.png' title='\sqrt{2}^\sqrt{2}' alt='\sqrt{2}^\sqrt{2}'  style="vertical-align:-2.32779pt;" > is transcendental and therefore irrational, so the second case alone proves the statement. However, I&#8217;m not sure what machinery is required to prove Gelfond&#8217;s theorem.&#160;<a href="#fnref:1" rev="footnote">&#8617;</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://conflate.net/inductio/2008/06/constructive-and-classical-mathematics/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Research-Changing Books</title>
		<link>http://conflate.net/inductio/2008/05/research-changing-books/</link>
		<comments>http://conflate.net/inductio/2008/05/research-changing-books/#comments</comments>
		<pubDate>Mon, 26 May 2008 06:45:44 +0000</pubDate>
		<dc:creator>Mark Reid</dc:creator>
		
		<category><![CDATA[Reading]]></category>

		<guid isPermaLink="false">http://conflate.net/inductio/?p=38</guid>
		<description><![CDATA[In response to a post by Peter Turney, I list the books I feel shaped my research career.]]></description>
			<content:encoded><![CDATA[<p>A <a href="http://apperceptual.wordpress.com/2008/05/25/the-book-that-changed-my-life/" onclick="javascript:urchinTracker ('/outbound/article/apperceptual.wordpress.com');">recent post by Peter Turney</a> lists the books that have influenced his research. As well as compiling a great list of books that are now on my mental &#8220;must read one day&#8221; list, he makes a crucial point for compiling such a list:</p>

<blockquote>
  <p>If a reader cannot point to some tangible outcome from reading a book, 
  then the reader may be overestimating the personal impact of the book.</p>
</blockquote>

<p>With that in mind I tried to think of which books had a substantial impact on my research career.</p>

<p>Although I can barely remember any of it now, the <a href="http://www.geocities.com/rmelick/prg.txt" onclick="javascript:urchinTracker ('/outbound/article/www.geocities.com');">manual</a> that came with the Commodore Vic 20 computer, which I read when I was around seven, got me hooked on programming. In primary and secondary school it was that book and the subsequent Commodore 64 and Amiga manuals that set me on the road to studying computer science and maths.</p>

<p>In my second year at university I had the great fortune of being recommended Hofstadter&#8217;s &#8220;<a href="http://www.librarything.com/work/5619/book/12512722" onclick="javascript:urchinTracker ('/outbound/article/www.librarything.com');">Gödel, Escher, Bach</a>&#8221; by a fellow student. It is centrally responsible for getting me to start thinking about thinking and subsequently doing a PhD in machine learning. The fanciful but extremely well written detours into everything from genetics to Zen Buddhism also broadened my horizons immensely.</p>

<p>I. J. Good&#8217;s &#8220;<a href="http://www.librarything.com/work/2542774/book/12420041" onclick="javascript:urchinTracker ('/outbound/article/www.librarything.com');">The Estimation of Probabilities</a>&#8221; was the tiny 1965 monograph I bought second-hand for $2 that made my thesis take a huge change in direction by giving it a Bayesian flavour. I now realise that a lot of that work has since been superseded by much more sophisticated Bayesian methods but sometimes finding a theory before it has been over-polished means that there is much more expository writing to aid intuition. It also helps that Good is a fabulous technical writer.</p>

<p>Philosophically, Nelson Goodman&#8217;s &#8220;<a href="http://www.librarything.com/work/70761/book/12419989" onclick="javascript:urchinTracker ('/outbound/article/www.librarything.com');">Fact, Fiction and Forecast</a>&#8221; also shaped my thinking about induction quite a lot. His ideas on the &#8220;virtuous circle&#8221; of basing current induction on the successes and failures of the past provided me with a philosophical basis for the transfer learning aspects of my research. I found his views a refreshing alternative to Popper&#8217;s (also personally influential) take on induction in &#8220;<a href="http://www.librarything.com/work/68144/book/31001290" onclick="javascript:urchinTracker ('/outbound/article/www.librarything.com');">The Logic of Scientific Discovery</a>&#8221;. Whereas Popper beautifully characterises the line between metaphysical and scientific theories, Goodman tries to give an explanation of <em>how</em> we might practically come up with new theories in the first place given that there will be, in general, countless that adequately fit the available data. In a nutshell, his theory of &#8220;entrenchment&#8221; says that we accrete a network of terms by induction and use these terms as features for future induction depending on how successful they were when used in past inductive leaps. This is a view of induction in line with Hume&#8217;s &#8220;habits of the mind&#8221; and one I find quite satisfying.</p>

<p>While not directly related to machine learning or computer science, there are a few other books that helped me form opinions on the process of research in general. I read Scott&#8217;s &#8220;<a href="http://www.librarything.com/work/1093218/book/31001976" onclick="javascript:urchinTracker ('/outbound/article/www.librarything.com');">Write to the Point</a>&#8221; over a decade ago now but it still makes me stop, look at my writing and simplify it. My attitude to presenting technical ideas was also greatly influenced by reading Feynman&#8217;s &#8220;<a href="http://www.librarything.com/work/27937/book/12512712" onclick="javascript:urchinTracker ('/outbound/article/www.librarything.com');">QED</a>&#8221; lectures. They are a perfect example of communicating extremely deep and difficult ideas to a non-technical audience without condescension or misrepresentation. Finally, I read Kennedy&#8217;s &#8220;<a href="http://www.librarything.com/work/252530/book/20392830" onclick="javascript:urchinTracker ('/outbound/article/www.librarything.com');">Academic Duty</a>&#8221; just as I started my current post-doc and found it immensely insightful. I plan to reread it as I (hopefully) hit various milestones in my academic career.</p>

<p>Of course, as with Peter, there are innumerable other books, papers and web pages that have shaped my thinking but the ones above are the ones that leap to mind when I think about how my research interests have developed over time.</p>
]]></content:encoded>
			<wfw:commentRss>http://conflate.net/inductio/2008/05/research-changing-books/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
