Read the book R For Dummies - Vries Andrie de - Page 12

Part I
Getting Started with R Programming
Chapter 3
The Fundamentals of R
Keeping Your Code Readable

You may wonder why you should bother about reading code. You wrote the code yourself, so you should know what it does, right? You do now, but will you remember what you did if you have to redo that analysis six months from now on new data? Besides, you may have to share your scripts with other people, and what seems obvious to you may be far less obvious to them.

Some of the rules you’re about to see aren’t that strict. In fact, you can get away with almost anything in R, but that doesn’t mean it’s a good idea. In this section, we explain why you should avoid some constructs even though they aren’t strictly wrong.

Following naming conventions

R is very liberal when it comes to names for objects and functions. This freedom is a great blessing and a great burden at the same time. Nobody is obliged to follow strict rules, so everybody who programs something in R can basically do as he or she pleases.

Choosing a correct name

Although almost anything is allowed when giving names to objects, there are still a few rules in R that you can’t ignore:

✔ Names must start with a letter or a dot. If you start a name with a dot, the second character can’t be a digit.
✔ Names should contain only letters, numbers, underscore characters (_), and dots (.). Although you can force R to accept other characters in names, you shouldn’t, because these characters often have a special meaning in R.
✔ You can’t use the following special keywords as names:
● break
● else
● FALSE
● for
● function
● if
● Inf
● NA
● NaN
● next
● NULL
● repeat
● TRUE
● while
R is case sensitive, which means that, for R, lastname and Lastname are two different objects. If R tells you it can’t find an object or function and you’re sure it should be there, check to make sure you used the right case.
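
As a rough sketch of these rules, you can let R itself judge a name. The candidate names below are made up for illustration, and comparing each name with the result of make.names() is just one convenient way to test validity, not the only one:

```r
# Candidate names to check; these strings are invented for illustration.
candidates <- c("lastname", ".hidden", ".2fast", "2fast", "total_sales", "for")

# make.names() returns its input unchanged only if it is already a valid name.
is.valid <- candidates == make.names(candidates)
names(is.valid) <- candidates
is.valid   # TRUE for lastname, .hidden, total_sales; FALSE for .2fast, 2fast, for

# Case matters: lastname exists after this assignment, Lastname does not
# (assuming nothing else in your session defines an object called Lastname).
lastname <- "de Vries"
exists("lastname")
exists("Lastname")
```

The names .2fast and 2fast fail because of the rule about starting characters, and for fails because it is a reserved keyword.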

Choosing a clear name

When you start writing code, it’s tempting to use short, generic names like x. There’s nothing wrong with that, as long as it is clear what each object represents. But that might become difficult when all your objects have a single letter name. Likewise, calling your datasets data1, data2, and so forth may be a bit confusing for the person who has to read your code later on, even if it makes all kinds of sense to you now. Remember: You could be the one who, in three months, is trying to figure out exactly what you were trying to achieve. Using descriptive names will allow you to keep your code readable.
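
To make the difference concrete, here is a small sketch that does the same calculation twice. The rainfall figures and all object names are invented for illustration:

```r
# Hard to follow: the names say nothing about the contents.
x <- c(12.5, 14.1, 13.8)
y <- mean(x)

# Easier to follow: the same computation with descriptive names.
rainfall.mm <- c(12.5, 14.1, 13.8)
average.rainfall.mm <- mean(rainfall.mm)
```

Both versions give the same result, but only the second one tells a reader what the numbers represent and in which unit they were measured.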

Although you can name an object almost whatever you want, some names will cause less trouble than others. You may have noticed that none of the functions we’ve used until now are mentioned as being off-limits (see the preceding section). That’s right: If you want to call an object paste, you’re free to do so:

> paste <- paste("This gets","confusing")
> paste
[1] "This gets confusing"
> paste("Don’t","you","think?")
[1] "Don’t you think?"

R almost always will know perfectly well when you want the vector paste and when you need the function paste(). That doesn’t mean it’s a good idea to use the same name for both items, though. In some cases, doing so can cause unexpected errors. So if you can avoid giving the name of a function to an object, you should.
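
One case where the shared name does bite is when you pass the name as an argument instead of calling it directly. The following sketch wraps everything in local() so it doesn't leave a stray paste object in your workspace; do.call() is used here only as one example of a function that receives another function as an argument:

```r
shadow.demo <- local({
  paste <- paste("This gets", "confusing")   # a character vector named paste

  # A direct call still works: R skips non-functions when it looks up a call.
  direct <- paste("Still", "fine")

  # Passing the name as an argument hands over the vector, not the function,
  # so do.call() has nothing it can call and signals an error.
  indirect <- tryCatch(
    do.call(paste, list("Not", "fine")),
    error = function(e) "error"
  )

  c(direct = direct, indirect = indirect)
})

shadow.demo
```

The direct call returns "Still fine", whereas the do.call() attempt ends up in the error handler.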

One situation in which you can really get into trouble is when you use capital F or T as an object name. You can do it, but you’re likely to break code at some point. Although it’s a very bad idea, T and F are all too often used as abbreviations for TRUE and FALSE, respectively. Unlike TRUE and FALSE, though, T and F are not reserved keywords; they’re ordinary objects that R predefines with the values TRUE and FALSE. So, if you assign something else to T, R finds your object first and never gets to the predefined one, and any code that still expects T to mean TRUE will fail from this point on. Never use F or T, not as an object name and not as an abbreviation.
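
Here is a minimal sketch of how that can go wrong. The vector rainfall is invented for illustration, and the assignment to T sits inside local() so that it can't leak into the rest of your session:

```r
rainfall <- c(1, NA, 3)

shadowed <- local({
  T <- 0                        # allowed, because T is an ordinary object
  mean(rainfall, na.rm = T)     # na.rm is now 0, which R treats as FALSE
})

spelled.out <- mean(rainfall, na.rm = TRUE)

shadowed      # NA, because the missing value was not removed
spelled.out   # 2

# TRUE itself is reserved, so R refuses the assignment.
assign.attempt <- tryCatch(
  eval(parse(text = "TRUE <- 0")),
  error = function(e) "refused"
)
assign.attempt
```

The call with na.rm = T silently returns NA instead of raising an error, which is exactly what makes this kind of mistake hard to spot.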

Choosing a naming style

If you have experience in programming, you’ve probably heard of camel case before. Camel case is a way of giving longer names to objects and functions. You capitalize the first letter of every word in the name except the first to improve the readability. So, you can have a veryLongVariableName and still be able to read it.

Unlike many other languages, R doesn’t use the dot (.) as an operator, so the dot can be used in names for objects as well. This style is called dotted style, where you write everything in lowercase and separate words or terms in a name with a dot. In fact, in R, many function names use dotted style. You’ve met a function like this earlier in the chapter: print.default(). Some package authors also use an underscore instead of a dot.

print.default() is the default method for the print() function. You can find information on the arguments on the Help page for print.default().

You’re not obligated to use dotted style; you can use whatever style you want. We use dotted style throughout this book for objects, and camel case for functions. R uses dotted style for many base functions and objects, but because some parts of the internal mechanisms of R rely on that dot, you’re safer using camel case for functions. Whenever you see a dot, though, you don’t have to wonder what it does – it’s just part of the name.
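
The following sketch shows one way in which R relies on the dot. The class temperature and both function names are hypothetical, made up only to illustrate the point:

```r
# A made-up object of class "temperature" (hypothetical, for illustration).
reading <- structure(list(value = 21.5), class = "temperature")

# Dotted name: R reads this as the format() method for class "temperature".
format.temperature <- function(x, ...) paste0(x$value, " degrees")
dispatched <- format(reading)   # format() hands the work to format.temperature()

# Camel case name: can't be mistaken for a method of another function.
formatTemperature <- function(x) paste0(x$value, " degrees")
explicit <- formatTemperature(reading)

dispatched
explicit
```

Both calls return "21.5 degrees", but only the dotted version gets picked up automatically by format(). That's handy when you intend it and surprising when you don't, which is why camel case is the safer choice for your own functions.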
