<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Stop! Using! Bad! Numbers!</title>
	<atom:link href="http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/</link>
	<description>historical romance on the blog</description>
	<lastBuildDate>Tue, 07 Feb 2012 22:36:55 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
	<item>
		<title>By: Magdalen</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11030</link>
		<dc:creator>Magdalen</dc:creator>
		<pubDate>Sat, 16 Jan 2010 13:50:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11030</guid>
		<description>I have to laugh -- I wrote a review once (for the Journal of the American Statistical Association) of a book called &quot;Lies, Damned Lies, and Statistics.&quot;  (My most persistent memory of that experience is that I got the author&#039;s address wrong because I didn&#039;t know then that there was a Halifax in England as well as Nova Scotia...)

The one aspect to Attributor&#039;s study that worries me: sampling methodology.  As we know from political surveys, the better the sampling methodology the more likely the conclusions about the entire group will prove to be accurate.  What makes this study particularly tricky is that sampling has taken place at two times:  first when the pirating sites decide which books to, uh, offer, and second when Attributor decides which of those pirates and their &quot;offerings&quot; to track.

I&#039;m not a statistician, obviously.  But common sense would suggest that the pirates are going to select primarily bestsellers to pirate.  (Which sets your nightmare, Courtney, as a bad joke: Good news: your book is selling very well.  Bad news: it&#039;s selling well enough to be pirated.)

That&#039;s not to say that there isn&#039;t somewhere a pirated copy available of less-than-best sellers.

I have no doubt that the statistics (i.e., the actual math) used by the Attributor study are consistent and appropriate.  But sampling theory is more than getting your chi-square tests right.  This is a complicated problem, and when a company has a motive for its research, getting the result to appear both transparent and devoid of bias is that much harder.  (Which is why Consumer Reports is so valuable.)

I know this has been mentioned already and I apologize for restating the obvious, but I&#039;m not convinced that people willing to download pirated material would actually pay for that material if the pirated stuff were not available.  Therefore, any monetary conclusion of the effect of piracy on publishing would have to have a third level of sampling: sampling the downloaders.  It doesn&#039;t sound like Attributor did that.  (The fact that the downloaders are stealing intellectual property might make it a bit harder to get them to cooperate with a study.  That&#039;s why sampling experts get paid well...)</description>
		<content:encoded><![CDATA[<p>I have to laugh &#8212; I wrote a review once (for the Journal of the American Statistical Association) of a book called &#8220;Lies, Damned Lies, and Statistics.&#8221;  (My most persistent memory of that experience is that I got the author&#8217;s address wrong because I didn&#8217;t know then that there was a Halifax in England as well as Nova Scotia&#8230;)</p>
<p>The one aspect to Attributor&#8217;s study that worries me: sampling methodology.  As we know from political surveys, the better the sampling methodology the more likely the conclusions about the entire group will prove to be accurate.  What makes this study particularly tricky is that sampling has taken place at two times:  first when the pirating sites decide which books to, uh, offer, and second when Attributor decides which of those pirates and their &#8220;offerings&#8221; to track.</p>
<p>I&#8217;m not a statistician, obviously.  But common sense would suggest that the pirates are going to select primarily bestsellers to pirate.  (Which sets your nightmare, Courtney, as a bad joke: Good news: your book is selling very well.  Bad news: it&#8217;s selling well enough to be pirated.)</p>
<p>That&#8217;s not to say that there isn&#8217;t somewhere a pirated copy available of less-than-best sellers.</p>
<p>I have no doubt that the statistics (i.e., the actual math) used by the Attributor study are consistent and appropriate.  But sampling theory is more than getting your chi-square tests right.  This is a complicated problem, and when a company has a motive for its research, getting the result to appear both transparent and devoid of bias is that much harder.  (Which is why Consumer Reports is so valuable.)</p>
<p>I know this has been mentioned already and I apologize for restating the obvious, but I&#8217;m not convinced that people willing to download pirated material would actually pay for that material if the pirated stuff were not available.  Therefore, any monetary conclusion of the effect of piracy on publishing would have to have a third level of sampling: sampling the downloaders.  It doesn&#8217;t sound like Attributor did that.  (The fact that the downloaders are stealing intellectual property might make it a bit harder to get them to cooperate with a study.  That&#8217;s why sampling experts get paid well&#8230;)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rich</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11028</link>
		<dc:creator>Rich</dc:creator>
		<pubDate>Fri, 15 Jan 2010 23:25:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11028</guid>
		<description>@Carolyn,

Please do send me a copy of the study you reference.  If it&#039;s from Brian O&#039;Leary, then I have it and agree with one of its main conclusions: that the peer-to-peer threat is overstated - our study was exclusively focused on the one-click hosting sites (e.g. rapidshare)</description>
		<content:encoded><![CDATA[<p>@Carolyn,</p>
<p>Please do send me a copy of the study you reference.  If it&#8217;s from Brian O&#8217;Leary, then I have it and agree with one of its main conclusions: that the peer-to-peer threat is overstated &#8211; our study was exclusively focused on the one-click hosting sites (e.g. rapidshare)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sarah M. Anderson</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11026</link>
		<dc:creator>Sarah M. Anderson</dc:creator>
		<pubDate>Fri, 15 Jan 2010 20:30:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11026</guid>
		<description>I always enjoy it when people who don&#039;t know you personally somehow think that they can talk circles around you, CM. It&#039;s fun to watch them fail. 

I agree with Carolyn. I fail to see how shortening the title would make it less misleading.</description>
		<content:encoded><![CDATA[<p>I always enjoy it when people who don&#8217;t know you personally somehow think that they can talk circles around you, CM. It&#8217;s fun to watch them fail. </p>
<p>I agree with Carolyn. I fail to see how shortening the title would make it less misleading.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Carolyn Jewel</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11025</link>
		<dc:creator>Carolyn Jewel</dc:creator>
		<pubDate>Fri, 15 Jan 2010 17:30:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11025</guid>
		<description>Wow. Attributor&#039;s response to their misleading title is &quot;It needed to be short&quot;? 

I&#039;m glad Attributor weighed in on this since it&#039;s their work that&#039;s under discussion. While I appreciate the clarifications, I don&#039;t see that all that much has been clarified.

I&#039;m really surprised that they think frontlist titles are representative of books that get pirated. As an author myself, it&#039;s two of my oldest OOP books that I see pirated the most. In fact, it&#039;s the backlist of the really famous authors that downloaders seem to be most interested in.

The fact is, there isn&#039;t yet a sufficiently rigorous study of the issue and the ONLY person to attempt a valid study that might hold up to scrutiny has come to the opposite conclusion from Attributor.</description>
		<content:encoded><![CDATA[<p>Wow. Attributor&#8217;s response to their misleading title is &#8220;It needed to be short&#8221;? </p>
<p>I&#8217;m glad Attributor weighed in on this since it&#8217;s their work that&#8217;s under discussion. While I appreciate the clarifications, I don&#8217;t see that all that much has been clarified.</p>
<p>I&#8217;m really surprised that they think frontlist titles are representative of books that get pirated. As an author myself, it&#8217;s two of my oldest OOP books that I see pirated the most. In fact, it&#8217;s the backlist of the really famous authors that downloaders seem to be most interested in.</p>
<p>The fact is, there isn&#8217;t yet a sufficiently rigorous study of the issue and the ONLY person to attempt a valid study that might hold up to scrutiny has come to the opposite conclusion from Attributor.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rich</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11021</link>
		<dc:creator>Rich</dc:creator>
		<pubDate>Thu, 14 Jan 2010 22:53:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11021</guid>
		<description>Hi Courtney - happy to respond!

1) We are definitely using the uploads as our market share figure, because only four of the sites publish the download figures, eliminating a straight apples-to-apples comparison.  We also considered using web traffic as a proxy but found that it would have inflated the numbers substantially given Rapidshare&#039;s enormous web traffic (e.g.  http://siteanalytics.compete.com/rapidshare.com+4shared.com/)  In fact, many we spoke to thought the results were too conservative because of this.

2) We debated whether to disclose the source for this, but chose to keep it confidential to avoid embarrassing any specific publishers.

3)  Mea culpa on the headline.  It needed to be short :-) 

4)  We did this to increase transparency so everyone can see for themselves.  We&#039;re planning to remove these at the end of the day from the post.  In reality, it&#039;s very easy to find these books.

5)  Yes indeed and thanks for asking.  We have a Ph.D. in statistics on our staff. 

I hope this helps, and I appreciate that you dug into this.  Most of the questions are unfortunately much less thought out.</description>
		<content:encoded><![CDATA[<p>Hi Courtney &#8211; happy to respond!</p>
<p>1) We are definitely using the uploads as our market share figure, because only four of the sites publish the download figures, eliminating a straight apples-to-apples comparison.  We also considered using web traffic as a proxy but found that it would have inflated the numbers substantially given Rapidshare&#8217;s enormous web traffic (e.g.  <a href="http://siteanalytics.compete.com/rapidshare.com+4shared.com/" rel="nofollow">http://siteanalytics.compete.com/rapidshare.com+4shared.com/</a>)  In fact, many we spoke to thought the results were too conservative because of this.</p>
<p>2) We debated whether to disclose the source for this, but chose to keep it confidential to avoid embarrassing any specific publishers.</p>
<p>3)  Mea culpa on the headline.  It needed to be short <img src='http://www.courtneymilan.com/ramblings/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  </p>
<p>4)  We did this to increase transparency so everyone can see for themselves.  We&#8217;re planning to remove these at the end of the day from the post.  In reality, it&#8217;s very easy to find these books.</p>
<p>5)  Yes indeed and thanks for asking.  We have a Ph.D. in statistics on our staff. </p>
<p>I hope this helps, and I appreciate that you dug into this.  Most of the questions are unfortunately much less thought out.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Courtney Milan</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11020</link>
		<dc:creator>Courtney Milan</dc:creator>
		<pubDate>Thu, 14 Jan 2010 22:26:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11020</guid>
		<description>Rich, thanks for stopping by. I do have a number of questions.

First, can you explain for me how you translate take down notifications into market share for download? At best, the number of take down notifications you send to a site shows the relative percentage of UPLOADS, or, more likely, the relative persistence of uploaders. It doesn&#039;t in any way quantify the relative percentage of DOWNLOADS those sites represent.

Second, how did you determine what percentage of book sales a particular catalog represented?

Third, why did Attributor title its blog post with a claim that was clearly disclaimed in the text of the study itself? Because, you know, when your press release says &quot;Online Piracy Costs Publishers Nearly $3 Billion,&quot; that seems pretty misleading, especially since people so rarely read the fine print.

Fourth, given your stated objectives, don&#039;t you think it&#039;s a little awkward to release a piracy study where you provide clearly labeled links to pages where people can download the copyrighted materials in question?

Finally, does the person who conducted the study have any formal training in either economics or statistical methods?</description>
		<content:encoded><![CDATA[<p>Rich, thanks for stopping by. I do have a number of questions.</p>
<p>First, can you explain for me how you translate take down notifications into market share for download? At best, the number of take down notifications you send to a site shows the relative percentage of UPLOADS, or, more likely, the relative persistence of uploaders. It doesn&#8217;t in any way quantify the relative percentage of DOWNLOADS those sites represent.</p>
<p>Second, how did you determine what percentage of book sales a particular catalog represented?</p>
<p>Third, why did Attributor title its blog post with a claim that was clearly disclaimed in the text of the study itself? Because, you know, when your press release says &#8220;Online Piracy Costs Publishers Nearly $3 Billion,&#8221; that seems pretty misleading, especially since people so rarely read the fine print.</p>
<p>Fourth, given your stated objectives, don&#8217;t you think it&#8217;s a little awkward to release a piracy study where you provide clearly labeled links to pages where people can download the copyrighted materials in question?</p>
<p>Finally, does the person who conducted the study have any formal training in either economics or statistical methods?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rich</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11019</link>
		<dc:creator>Rich</dc:creator>
		<pubDate>Thu, 14 Jan 2010 22:16:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11019</guid>
		<description>Hi Courtney,

I work for Attributor and just came across your blog.  We certainly struck a nerve with you and I&#039;m not sure we&#039;ll ever agree on the study, but I wanted to clarify a few things.  You can always reach me at rich(at)attributor.com too.

While not 100% relevant, I wanted to point out that we offer a free service at fairshare.cc which is a partnership with the Creative Commons to allow bloggers and freelancers to see who is reusing their content.  We have also started the Fair Syndication Consortium to move past the takedown mentality and help newspapers and other publishers collect a fair share of revenue made from their work as it is reused across the Internet.

In response to some of your points

1)  None of our customers&#039; titles were included in the study.  We grabbed what we believe is representative of the industry and most were frontlist titles.

2)  The market share is based on the 52,000 successful takedowns we have sent since our service was launched in July &#039;09.  This is mentioned in a footnote but we should have made this more explicit.

3)  The projection to total U.S. books was indeed tricky and it&#039;s not perfect. The 913 titles were from publishers whose  *entire catalog* represented 13.5% of U.S. Book Sales.  So we assumed that those 913 titles represented these publishers&#039; entire catalog.  We believe this is a conservative approach but invite other ideas or approaches.


I&#039;m happy to engage further - as you note, we were pretty transparent on our methodology and were clear about these numbers representing potential losses.  

We&#039;ve been in this business for a little over six months and, by any measure, it&#039;s a big and growing problem.  

Thanks for letting me respond!</description>
		<content:encoded><![CDATA[<p>Hi Courtney,</p>
<p>I work for Attributor and just came across your blog.  We certainly struck a nerve with you and I&#8217;m not sure we&#8217;ll ever agree on the study, but I wanted to clarify a few things.  You can always reach me at rich(at)attributor.com too.</p>
<p>While not 100% relevant, I wanted to point out that we offer a free service at fairshare.cc which is a partnership with the Creative Commons to allow bloggers and freelancers to see who is reusing their content.  We have also started the Fair Syndication Consortium to move past the takedown mentality and help newspapers and other publishers collect a fair share of revenue made from their work as it is reused across the Internet.</p>
<p>In response to some of your points</p>
<p>1)  None of our customers&#8217; titles were included in the study.  We grabbed what we believe is representative of the industry and most were frontlist titles.</p>
<p>2)  The market share is based on the 52,000 successful takedowns we have sent since our service was launched in July &#8217;09.  This is mentioned in a footnote but we should have made this more explicit.</p>
<p>3)  The projection to total U.S. books was indeed tricky and it&#8217;s not perfect. The 913 titles were from publishers whose  *entire catalog* represented 13.5% of U.S. Book Sales.  So we assumed that those 913 titles represented these publishers&#8217; entire catalog.  We believe this is a conservative approach but invite other ideas or approaches.</p>
<p>I&#8217;m happy to engage further &#8211; as you note, we were pretty transparent on our methodology and were clear about these numbers representing potential losses.  </p>
<p>We&#8217;ve been in this business for a little over six months and, by any measure, it&#8217;s a big and growing problem.  </p>
<p>Thanks for letting me respond!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Courtney Milan</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11018</link>
		<dc:creator>Courtney Milan</dc:creator>
		<pubDate>Thu, 14 Jan 2010 21:36:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11018</guid>
		<description>Actually, Theresa, let me put it another way.

I don&#039;t think the problem with piracy is that we need bigger punishments. The punishments we have now are quite, quite massive--many of these downloaders could get socked with damages of millions of dollars.

The problem is that these punishments are too hard to enforce.

We need the equivalent of a traffic ticket for downloaders--a fine, easy to levy, of a small amount (say $100), where the enforcement agency has an incentive to both levy the fines and collect the money.

That would do more to combat piracy than actually keelhauling one a year.</description>
		<content:encoded><![CDATA[<p>Actually, Theresa, let me put it another way.</p>
<p>I don&#8217;t think the problem with piracy is that we need bigger punishments. The punishments we have now are quite, quite massive&#8211;many of these downloaders could get socked with damages of millions of dollars.</p>
<p>The problem is that these punishments are too hard to enforce.</p>
<p>We need the equivalent of a traffic ticket for downloaders&#8211;a fine, easy to levy, of a small amount (say $100), where the enforcement agency has an incentive to both levy the fines and collect the money.</p>
<p>That would do more to combat piracy than actually keelhauling one a year.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Courtney Milan</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11017</link>
		<dc:creator>Courtney Milan</dc:creator>
		<pubDate>Thu, 14 Jan 2010 21:22:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11017</guid>
		<description>&lt;i&gt;That said, I still want to keelhaul all the pirates and torch all the sites that foster piracy.&lt;/i&gt;

I don&#039;t, and I really don&#039;t approve of this kind of language.

And you know why?

I don&#039;t think that&#039;s appropriate punishment for the measure of harm. 

Sorry. I know you don&#039;t really mean that we should tie ropes to their legs and toss them in the water, to have their face ripped to shreds by the barnacles on the bottom of a boat, assuming they don&#039;t drown first. But when I write a post saying &quot;stop with the hyperbole,&quot; I&#039;m also going to call it out when it occurs in the comments.

This part of the hyperbolic talk is just as bad. I don&#039;t think someone should be keelhauled if they rob a bank of a million bucks and shoot two innocent bystanders. I certainly won&#039;t support it because someone clicked &quot;download.&quot;

Pirates should be subject to reasonable civil penalties, and if appropriate, criminal ones, that are commensurate with the harm inflicted.

Someone not giving me my twenty cents in royalties just doesn&#039;t justify physical harm.

Period.</description>
		<content:encoded><![CDATA[<p><i>That said, I still want to keelhaul all the pirates and torch all the sites that foster piracy.</i></p>
<p>I don&#8217;t, and I really don&#8217;t approve of this kind of language.</p>
<p>And you know why?</p>
<p>I don&#8217;t think that&#8217;s appropriate punishment for the measure of harm. </p>
<p>Sorry. I know you don&#8217;t really mean that we should tie ropes to their legs and toss them in the water, to have their face ripped to shreds by the barnacles on the bottom of a boat, assuming they don&#8217;t drown first. But when I write a post saying &#8220;stop with the hyperbole,&#8221; I&#8217;m also going to call it out when it occurs in the comments.</p>
<p>This part of the hyperbolic talk is just as bad. I don&#8217;t think someone should be keelhauled if they rob a bank of a million bucks and shoot two innocent bystanders. I certainly won&#8217;t support it because someone clicked &#8220;download.&#8221;</p>
<p>Pirates should be subject to reasonable civil penalties, and if appropriate, criminal ones, that are commensurate with the harm inflicted.</p>
<p>Someone not giving me my twenty cents in royalties just doesn&#8217;t justify physical harm.</p>
<p>Period.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Theresa Stevens</title>
		<link>http://www.courtneymilan.com/ramblings/2010/01/14/stop-using-bad-numbers/comment-page-1/#comment-11016</link>
		<dc:creator>Theresa Stevens</dc:creator>
		<pubDate>Thu, 14 Jan 2010 21:16:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.courtneymilan.com/ramblings/?p=1037#comment-11016</guid>
		<description>Yes, it&#039;s a shady, self-serving study with questionable metrics and nonsensical conclusions. That said, I still want to keelhaul all the pirates and torch all the sites that foster piracy.</description>
		<content:encoded><![CDATA[<p>Yes, it&#8217;s a shady, self-serving study with questionable metrics and nonsensical conclusions. That said, I still want to keelhaul all the pirates and torch all the sites that foster piracy.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

