<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: R annoyances</title>
	<atom:link href="http://www.win-vector.com/blog/2010/03/r-annoyances/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.win-vector.com/blog/2010/03/r-annoyances/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=r-annoyances</link>
	<description>The Applied Theorist&#039;s Point of View</description>
	<lastBuildDate>Wed, 25 Jan 2012 05:27:08 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Zhonghao Yu</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2209</link>
		<dc:creator>Zhonghao Yu</dc:creator>
		<pubDate>Mon, 19 Apr 2010 15:51:44 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2209</guid>
		<description>Just try help(&#039;[&#039;) to find the help for indexing.</description>
		<content:encoded><![CDATA[<p>Just try help(&#8216;[&#8216;) to find the help for indexing.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jmount</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2085</link>
		<dc:creator>jmount</dc:creator>
		<pubDate>Tue, 30 Mar 2010 15:21:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2085</guid>
		<description>&lt;a href=&quot;#comment-2081&quot; rel=&quot;nofollow&quot;&gt;@Jeromy Anglim &lt;/a&gt; Some good points and thanks for the comment.  Just one amplification from my end.  I agree some facility to return a single row as a vector is a useful feature (in principle).   I deliberately used a cumbersome c(FALSE,TRUE,FALSE) notation (instead of just the index number: 2) to emphasize how the return-type changes due to mere changes of values of the input data (instead of in an orderly way due to the calling type/style).</description>
		<content:encoded><![CDATA[<p><a href="#comment-2081" rel="nofollow">@Jeromy Anglim </a> Some good points and thanks for the comment.  Just one amplification from my end.  I agree some facility to return a single row as a vector is a useful feature (in principle).   I deliberately used a cumbersome c(FALSE,TRUE,FALSE) notation (instead of just the index number: 2) to emphasize how the return-type changes due to mere changes of values of the input data (instead of in an orderly way due to the calling type/style).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeromy Anglim</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2081</link>
		<dc:creator>Jeromy Anglim</dc:creator>
		<pubDate>Tue, 30 Mar 2010 00:54:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2081</guid>
		<description>Interesting post.  I agree that learning R can be frustrating at times.
As a social scientist, I came from a background using MS Windows, MS Word, SPSS, and a somewhat surface-level understanding of statistics.
R has a habit of pushing me towards thinking like a statistician. 
The modelling notation, the requirement to create solutions from smaller tools, the way that R interfaces really well with LaTeX but poorly with MS Word, the use R makes of Unix-derived tools, these were all challenges at first. But ultimately they provide a far superior way of conducting data analysis.

I imagine a similar set of points could be made for R users coming from a computer science background. Learning R will probably encourage you to learn more about statistics and think more like an intelligent statistician.

The principle of least surprise depends on your expectations, which are in turn related to your prior training.
Returning a vector when subsetting a single row or column of a matrix is often useful behaviour.
When using lapply or sapply over the elements of a list, it&#039;s important to distinguish between &quot;[]&quot; and &quot;[[]]&quot;.</description>
		<content:encoded><![CDATA[<p>Interesting post.  I agree that learning R can be frustrating at times.<br />
As a social scientist, I came from a background using MS Windows, MS Word, SPSS, and a somewhat surface-level understanding of statistics.<br />
R has a habit of pushing me towards thinking like a statistician.<br />
The modelling notation, the requirement to create solutions from smaller tools, the way that R interfaces really well with LaTeX but poorly with MS Word, the use R makes of Unix-derived tools, these were all challenges at first. But ultimately they provide a far superior way of conducting data analysis.</p>
<p>I imagine a similar set of points could be made for R users coming from a computer science background. Learning R will probably encourage you to learn more about statistics and think more like an intelligent statistician.</p>
<p>The principle of least surprise depends on your expectations, which are in turn related to your prior training.<br />
Returning a vector when subsetting a single row or column of a matrix is often useful behaviour.<br />
When using lapply or sapply over the elements of a list, it&#8217;s important to distinguish between &#8220;[]&#8221; and &#8220;[[]]&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jmount</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2056</link>
		<dc:creator>jmount</dc:creator>
		<pubDate>Wed, 24 Mar 2010 15:04:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2056</guid>
		<description>Been asking around and running into a lot of gotchas in productionizing R models, such as the rumored &quot;Hadley Wickham says never to use factors&quot; (the likely reason being it is next to impossible to guarantee you get the same transformation and indexing when loading new data to score).  We hadn&#039;t run into this before because we had been selling Java implementations of models generated in R (so we managed the data transformations by hand).  What we had not expected was that &quot;let&#039;s keep things simple and let the client use R&quot; would be so perilous.</description>
		<content:encoded><![CDATA[<p>Been asking around and running into a lot of gotchas in productionizing R models, such as the rumored &#8220;Hadley Wickham says never to use factors&#8221; (the likely reason being it is next to impossible to guarantee you get the same transformation and indexing when loading new data to score).  We hadn&#8217;t run into this before because we had been selling Java implementations of models generated in R (so we managed the data transformations by hand).  What we had not expected was that &#8220;let&#8217;s keep things simple and let the client use R&#8221; would be so perilous.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jmount</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2049</link>
		<dc:creator>jmount</dc:creator>
		<pubDate>Mon, 22 Mar 2010 16:26:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2049</guid>
		<description>A sincere thank you to Tom, who posted very good solutions to the specific problems I discussed.  It helps a lot (getting the right result out is much, much safer than trying to repair a damaged result).  However, it is still frustrating to have to distinguish between [] and [[]] (neither of which can be easily searched on in the R help system) and to have to add a third drop=FALSE argument to all matrix selections.  These requirements are an example of what one of my groups used to call &quot;you forgot to set the &#039;do not lose&#039; flag.&quot;   This described a function call that failed because you forgot to set a useless legacy argument to the one value that allowed the function to work properly.  Or in Alice in Wonderland terms: if [[]] and [rows,,drop=FALSE] are the natural operations then why don&#039;t they have more succinct names like [] and [rows,]?  (The Alice in Wonderland analogy: if rule 42 is the &quot;oldest rule in the book&quot; then why isn&#039;t it called &quot;rule 1?&quot;)</description>
		<content:encoded><![CDATA[<p>A sincere thank you to Tom, who posted very good solutions to the specific problems I discussed.  It helps a lot (getting the right result out is much, much safer than trying to repair a damaged result).  However, it is still frustrating to have to distinguish between [] and [[]] (neither of which can be easily searched on in the R help system) and to have to add a third drop=FALSE argument to all matrix selections.  These requirements are an example of what one of my groups used to call &#8220;you forgot to set the &#8216;do not lose&#8217; flag.&#8221;   This described a function call that failed because you forgot to set a useless legacy argument to the one value that allowed the function to work properly.  Or in Alice in Wonderland terms: if [[]] and [rows,,drop=FALSE] are the natural operations then why don&#8217;t they have more succinct names like [] and [rows,]?  (The Alice in Wonderland analogy: if rule 42 is the &#8220;oldest rule in the book&#8221; then why isn&#8217;t it called &#8220;rule 1?&#8221;)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: William Pietri</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2045</link>
		<dc:creator>William Pietri</dc:creator>
		<pubDate>Sat, 20 Mar 2010 19:44:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2045</guid>
		<description>Wow, I&#039;m glad it&#039;s not just me.

I needed to do some pretty basic statistical stuff on a couple hundred thousand points of data. The kind of thing that you can almost do in oocalc if you are willing to put up with a bunch of manual work. So I thought I&#039;d try R and level up my stats skills a little.

I blew maybe a day between the command line and some of the graphical interfaces that purport to make things better, and got absolutely nowhere; one of the hurdles was types, and what could be applied to what. In the end it was easier just for me to code my own tools in Ruby. Scott&#039;s comment perfectly summed up my feelings: &quot;As a design, R looks like a sort of sneeze.&quot;</description>
		<content:encoded><![CDATA[<p>Wow, I&#8217;m glad it&#8217;s not just me.</p>
<p>I needed to do some pretty basic statistical stuff on a couple hundred thousand points of data. The kind of thing that you can almost do in oocalc if you are willing to put up with a bunch of manual work. So I thought I&#8217;d try R and level up my stats skills a little.</p>
<p>I blew maybe a day between the command line and some of the graphical interfaces that purport to make things better, and got absolutely nowhere; one of the hurdles was types, and what could be applied to what. In the end it was easier just for me to code my own tools in Ruby. Scott&#8217;s comment perfectly summed up my feelings: &#8220;As a design, R looks like a sort of sneeze.&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tom</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2044</link>
		<dc:creator>tom</dc:creator>
		<pubDate>Sat, 20 Mar 2010 19:41:08 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2044</guid>
		<description>With respect to the matrix example: This should be:

m[c(FALSE,TRUE,FALSE), , drop = FALSE]
     [,1] [,2] [,3]
[1,]    2    1    0

class(m[c(FALSE,TRUE,FALSE), , drop = FALSE])
[1] &quot;matrix&quot;</description>
		<content:encoded><![CDATA[<p>With respect to the matrix example: This should be:</p>
<p>m[c(FALSE,TRUE,FALSE), , drop = FALSE]<br />
     [,1] [,2] [,3]<br />
[1,]    2    1    0</p>
<p>class(m[c(FALSE,TRUE,FALSE), , drop = FALSE])<br />
[1] &#8220;matrix&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tom</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2043</link>
		<dc:creator>tom</dc:creator>
		<pubDate>Sat, 20 Mar 2010 19:38:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2043</guid>
		<description>You should use [[

l[[varName]]
[1] 1 2 3

class(l[[varName]])
[1] &quot;numeric&quot;</description>
		<content:encoded><![CDATA[<p>You should use [[</p>
<p>l[[varName]]<br />
[1] 1 2 3</p>
<p>class(l[[varName]])<br />
[1] &#8220;numeric&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jmount</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2042</link>
		<dc:creator>jmount</dc:creator>
		<pubDate>Sat, 20 Mar 2010 19:23:48 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2042</guid>
		<description>Scott, I also love statically type-checked languages (some of my &quot;secret&quot; projects are in Scala).  People go on and on about how much they love dynamic and untyped languages (where every function has all the power and pain of a macro), but some day they will learn that 5 minutes of preparation (dealing with types and their declarations) can save a lot of debugging (and also produces better-documented code).</description>
		<content:encoded><![CDATA[<p>Scott, I also love statically type-checked languages (some of my &#8220;secret&#8221; projects are in Scala).  People go on and on about how much they love dynamic and untyped languages (where every function has all the power and pain of a macro), but some day they will learn that 5 minutes of preparation (dealing with types and their declarations) can save a lot of debugging (and also produces better-documented code).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Scott Locklin</title>
		<link>http://www.win-vector.com/blog/2010/03/r-annoyances/comment-page-1/#comment-2041</link>
		<dc:creator>Scott Locklin</dc:creator>
		<pubDate>Sat, 20 Mar 2010 19:20:36 +0000</pubDate>
		<guid isPermaLink="false">http://www.win-vector.com/blog/?p=1407#comment-2041</guid>
		<description>One of the great ironies of R is that it has a syntax very similar to that of ML. ML is a programming language that makes type errors of this kind completely impossible. Try it some time when you have nothing better to do; it&#039;s frustrating in a different way (i.e., you wish you could turn off the strict type safety and have it do dumb R-style typecasts), but it produces ironclad code which won&#039;t ever faceplant you on a dumb type error. 
As a design, R looks like a sort of sneeze. Probably stuff like you describe is a legacy of S+/ATT days (it would be interesting to check), but either way, I have come around to a reasonably good way of dealing with it. Pretend to be a dumb statistician that doesn&#039;t know anything about programming, and that&#039;s the &quot;right way&quot; to do stuff in R.

FWIW, the Clojure dudes are trying to build an R-style thing called &quot;Incanter.&quot; I&#039;m a bit skeptical, but it would be hard to do worse as a design than R.
http://data-sorcery.org/</description>
		<content:encoded><![CDATA[<p>One of the great ironies of R is that it has a syntax very similar to that of ML. ML is a programming language that makes type errors of this kind completely impossible. Try it some time when you have nothing better to do; it&#8217;s frustrating in a different way (i.e., you wish you could turn off the strict type safety and have it do dumb R-style typecasts), but it produces ironclad code which won&#8217;t ever faceplant you on a dumb type error.<br />
As a design, R looks like a sort of sneeze. Probably stuff like you describe is a legacy of S+/ATT days (it would be interesting to check), but either way, I have come around to a reasonably good way of dealing with it. Pretend to be a dumb statistician that doesn&#8217;t know anything about programming, and that&#8217;s the &#8220;right way&#8221; to do stuff in R.</p>
<p>FWIW, the Clojure dudes are trying to build an R-style thing called &#8220;Incanter.&#8221; I&#8217;m a bit skeptical, but it would be hard to do worse as a design than R.<br />
<a href="http://data-sorcery.org/" rel="nofollow">http://data-sorcery.org/</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>

