John’s guide to tricky English grammar rules
The title is pretty self-explanatory: these are some rules and examples of tricky, confusing, frequently abused/ignored, or difficult-to-remember aspects of English grammar. Many of the example sentences are about science or medicine because I edit biomedical research manuscripts written by foreigners, to improve the language and grammar to the level of native English speakers, so I’ve used their errors to teach from.
Table of contents
1. Serial commas and the “Oxford” comma
2. Numbers: Words vs. numerals
3. Hyphens and En dashes
4. Literally vs. figuratively
5. Singular/plural noun–verb agreement
6. Tricky uses of commas in complex sentences
1. Serial commas and the inclusion of the final (or “Oxford”) comma
I start with this because it is important to me. Or, maybe, it’s important to me that people agree with me. In truth, those are probably the same thing. Serial commas are the commas that come in between items of a list. Perhaps you were taught that before the “and” or “or” preceding the final item of a list, you shouldn’t put a comma because the “and” or “or” fulfills the function of this comma in addition to alerting you that the last item is coming up. I think this is a bunch of hokum. Despite the advocacy of the final comma by Strunk & White and the Oxford University Press, the minds of many schoolchildren seem to have been poisoned with the no-final-comma rule, much to the English-speaking world’s detriment.
The problem is that following this rule 100% of the time leads to awkward, confusing, and even ambiguous sentences, whereas inserting a final comma before the last item is always at least as clear; is, in some cases, as the examples below show, clearer or at least easier to follow; and, in some cases, is the only way to definitively express the meaning of the sentence. According to definition, if it isn’t applicable at all times and in all situations, it isn’t a rule; it’s a preference, and a misguided one at that.
a. The following sentences or sentence fragments do not have different or ambiguous meanings without the final serial comma, but they’re really bulky and are therefore easier to follow with it added in:
The authors are affiliated with the medical school, the Department of Cellular and Developmental Biology and Laboratory and Industrial Products, Inc.
If you can’t admit this sentence reads very awkwardly with so many “ands” and would be improved by the insertion of a final comma, you have much worse problems than following the wrong comma rules.
This is characterized by the disruption of polarized tubular epithelial cell morphology, de novo mesenchymal gene expression and actin reorganization and increased cell migration and invasion.
There are four noun clauses separated by three “ands,” and not a comma in sight. Foolish and awkward. The middle pair of noun clauses (“mesenchymal gene expression” and “actin reorganization”) are both phenomena that happen for the first time (de novo) during the process in question, and the last two noun clauses (“cell migration” and “invasion”) are both modified by “increased.” Put a comma before the second “and,” and it looks like a normal sentence that an English speaker might be proud of writing.
EMDB was established by the Protein Databank in Europe (PDBe) at the European Bioinformatics Institute, the Research Collaboratory for Structural Bioinformatics (RCSB) at Rutgers and the National Center for Macromolecular Imaging (NCMI) at Baylor College of Medicine.
A final comma would make the separation between these long, bulky noun clauses a little easier to follow.
b. This is a good example of how the consistent use of the Oxford comma can avoid temporary confusion as to the meaning of a sentence:
In addition to endogenously synthesized cholesterol, the absorption of dietary cholesterol and the reabsorption of biliary cholesterol in the small intestine also contribute to the regulation of the plasma cholesterol level.
If the author of this sentence is not a habitual Oxford comma user, the reader doesn’t know if “endogenously synthesized cholesterol, the absorption of dietary cholesterol and the reabsorption of biliary cholesterol in the small intestine” is a list of three items, or if the comma separates an introductory phrase from an independent clause beginning with two noun clauses. It turns out it was the latter. If the reader knows the author is consistent in his use of the final serial comma, there would (should) be no confusion.
c. The meanings of the following sentences are impossible to determine definitively without a properly placed final comma:
The team focused on hiring and training, setting and meeting deadlines, expansion and reorganization and visibility in the retail market.
Which efforts were focused on the retail market? Visibility, or reorganization and visibility? Not everything the hypothetical team or company does is necessarily directed at the retail market, so some of their efforts could be business-to-business, and some activities could simply apply to internal affairs. Without the final comma, it is not clear what the reorganization and visibility efforts refer to.
Plant peroxisomes play a physiological role in the biosynthesis of the signaling molecule jasmonic acid, β-oxidation of indole-butyric acid and sulfur and polyamine metabolism.
Which process does the sulfur belong to? Is it part of the β-oxidation, or does it go with polyamine metabolism? You have to know what β-oxidation is to know that it can’t be applied to sulfur, but this comes from technical knowledge, not from anything the sentence conveyed. Without prior, specialized knowledge, it is not possible to determine the meaning of this sentence as written. I know no sentence is written in a vacuum, and you have to have some knowledge of the context and the meaning of the words to understand any sentence, but if you don’t understand something, that doesn’t mean the sentence didn’t properly convey its intended meaning; the above sentences don’t convey their intended meaning regardless of your specialized knowledge, and the reason for this is their lack of the final comma.
Wikipedia has a probably unintentionally funny article about the serial comma, whose humor stems from the fact that it is so inconsistent and bad, and that its authors were surely oblivious to this. It gives some good examples of the ambiguity that often results from forgoing the final comma:
Consider the possibly apocryphal book dedication quoted by Teresa Nielsen Hayden:
To my parents, Ayn Rand and God.
There is ambiguity about the writer’s parentage, because Ayn Rand and God can be read as in apposition to my parents, leading the reader to believe that the writer refers to Ayn Rand and God as his or her parents. A comma before and removes the ambiguity:
To my parents, Ayn Rand, and God.
Consider also:
My favourite types of sandwiches are pastrami, ham, cream cheese and peanut butter and jelly.
That’s all well and good. But the article then attempts to pin the blame for ambiguity on the inclusion of the final comma in some cases, such as the very same book dedication:
To my mother, Ayn Rand, and God
The serial comma after Ayn Rand creates ambiguity about the writer’s mother, because the proper-noun phrase Ayn Rand could be read as in apposition to my mother (with the commas fulfilling a parenthetical function), resulting in the interpretation “To my mother (who is Ayn Rand) and to God”. (Normally in such a case a writer should be trusted to explicitly include the second ‘to’ in order to relieve this ambiguity.) [emphasis added]
Yeah, no kidding: it isn’t the comma that would be creating the ambiguity, it’s the boneheaded author who doesn’t proofread her own work with a critical eye to detect ambiguities like that! It should be understood that if the author means, “To my mother (Ayn Rand) and to God,” she will either add the necessary “to” before “God” or will use parentheses where such ambiguity could arise. Additionally, in some cases a rearrangement might help, though it doesn’t here: “To Ayn Rand (my mother) and to God.” There is nothing wrong with parentheses, and nothing wrong with the final comma, either.
Clearly, the inclusion of the final comma in lists is more versatile—is, in fact, universally appropriate—whereas the absent-comma option quite frequently yields sentences that are harder to follow or even ambiguous, making it less appropriate some percentage of the time and therefore not suitable to be called a rule at all.
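For anyone who builds lists in software rather than by hand, the rule is easy to automate. The following is only a minimal sketch in Python; the function name `join_with_serial_comma` and its `conjunction` parameter are made up for illustration and do not come from any library.

```python
def join_with_serial_comma(items, conjunction="and"):
    """Join list items into prose, always keeping the final (Oxford) comma.

    Hypothetical helper for illustration only.
    """
    items = list(items)
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        # Two items are a pair, not a series, so no comma is needed.
        return f"{items[0]} {conjunction} {items[1]}"
    # Three or more items: comma after every item, including the one
    # before the conjunction.
    return ", ".join(items[:-1]) + f", {conjunction} {items[-1]}"


print(join_with_serial_comma(["my parents", "Ayn Rand", "God"]))
print(join_with_serial_comma(
    ["pastrami", "ham", "cream cheese", "peanut butter and jelly"]
))
```

Because the comma is always inserted, an item that itself contains “and” (peanut butter and jelly) stays visibly separate from its neighbors.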
2. When to write out the word for a number and when you can use the numeral
Unfortunately, there is no consensus on this topic, and there never will be. It’s important not to find one person or web page who tells you what you want to hear and go with that. For instance, some people who act like they know what they’re doing don’t even know the definitions of the words “number” and “numeral.” (Those morans got it exactly backwards. I’d be embarrassed if I were them.) A number is an abstract concept for a quantity, and a numeral is a written symbol for expressing that quantity. Roman numerals and Arabic numerals are two examples of different ways of expressing the same mathematical concepts (numbers). A word, made up of letters, is another way to express a number in writing.
In the most formal writing, it is basically never appropriate to use a numeral. If you’re doing that kind of writing, you’re probably not reading this because you’re already an accomplished writer, grammarian, and/or editor. Other style guides allow numerals for years and really long numbers (in the millions or billions, for example). Here are the rules that I think are the most agreed-upon and best for when numerals are allowed in formal writing:
- Use numerals for all numbers with more than one digit. This includes decimals and fractions.
- Any time you use a numeral for a certain noun (for instance, when that noun has a quantity of 10 or more), you can use a numeral for the same noun in the same sentence, even if the other number is less than 10. Example sentence:
The two largest groups contained 12 and 8 people, respectively.
Other things being equal, you would normally write “eight” instead of “8,” but “people” is modified by both 12 and 8 in this sentence, so to be uniform, it’s okay to break this rule and write the smaller number the same way you write the larger number. On the other hand, “groups” is only modified by one number (two), so it must be written as a word. On the third hand, it is acceptable to maintain document-wide uniformity and write “eight” the same way you do every other single-digit number in every other sentence.
A similar type of situation is the use of an integer and a decimal together:
a 1.7–2-fold increase
Normally you might write 2-fold as “two-fold” (though there are style guides that treat “fold” as a unit like % or other measured quantity), but in combination with—even in the same sentence as—that decimal, you must use a numeral.
- An abbreviated unit (kg, mM, h, %, etc.) requires a numeral; conversely, spelled-out numbers require the unit to be written out in full. This is only relevant at the beginning of sentences. That is, if you have to spell out the word at the beginning of the sentence, you cannot use an abbreviation for the unit. This might feel a little awkward when you have to write, “Ten microliters of PCR product was added…”, but it looks a hell of a lot better than beginning a sentence with, “10 μL.”
- When the number modifies any measured or counted unit, it is acceptable to use a numeral. Usually this involves an abbreviated unit, but other units that you might not always abbreviate, such as hour, day, week, or month, can take numerals as well. This might only be relevant to scientific publications, where many numbers and measurements are used, whereas other writings don’t contain a lot of numbers and would look fine with occasional numbers written as words. In scientific papers, this rule helps avoid the awkwardness of “at the 1-mo time point” or “the 3-d samples.” Writing “1-month” and “3-day” is clearer and less weird-looking, is often more uniform with the other numerals used in your methods or results sections, and is consistent because units will always be preceded by numerals.
Other counted nouns in scientific papers, such as group or mouse, are not units, so their numbers follow the other rules of formal writing.
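The rules in this section can be summarized as a small decision procedure. The Python sketch below is one possible reading of them, not an authoritative style checker. The function name and its flags are hypothetical, and it assumes that the optional same-noun uniformity rule is always applied when its flag is set.

```python
def numeral_or_word(value, has_unit=False, sentence_start=False,
                    same_noun_has_numeral=False):
    """Return "numeral" or "word" for a number in formal writing.

    Hypothetical helper; the flags mirror the rules described above.
    """
    if sentence_start:
        # A sentence should not open with a numeral, so spell the number
        # out (and write the unit in full as well).
        return "word"
    if has_unit:
        # Measured or counted units (mL, %, hour, day...) take numerals.
        return "numeral"
    if same_noun_has_numeral:
        # Assumption: we opt in to the optional uniformity rule
        # ("12 and 8 people").
        return "numeral"
    if isinstance(value, float) and not value.is_integer():
        # Decimals are always written as numerals.
        return "numeral"
    if abs(value) >= 10:
        # More than one digit.
        return "numeral"
    return "word"


print(numeral_or_word(2))                              # "two largest groups"
print(numeral_or_word(8, same_noun_has_numeral=True))  # "12 and 8 people"
print(numeral_or_word(10, has_unit=True, sentence_start=True))
```

The order of the checks matters: the sentence-start rule comes first because it overrides the unit rule, which is why “Ten microliters” beats “10 μL” at the head of a sentence.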
3. Hyphens and En dashes
a. The function of the hyphen
The hyphen is a compounder or combiner. It is very versatile and is woefully underused in the modern English-speaking world.
- When a two-word phrase consisting of an adjective and a noun is used as a single adjective, it should be hyphenated:
one-way street
real-world example
5-mL solution
- Also, two nouns that are used as a single descriptor (adjective) should be hyphenated. Often we might think of the first noun as fulfilling the function of an adjective, the second noun as a noun, and the combination of the two as a single, different adjective.
Air traffic was so dense that afternoon that air-traffic control could hardly cope. (Hat tip: BBC.)
Density-gradient centrifugation
Insertion-mutant populations
She has been with her computer-programmer boyfriend for two years.
The DNA-methylation differences were more subtle than the protein-phosphorylation ones.
Clearly it is often better to rearrange the sentence than to use some noun–noun modifiers, but that doesn’t make them grammatically incorrect.
- Many several-word phrases must be hyphenated when they precede their noun but must remain un-hyphenated when they follow their noun:
The teacher gave us some easy-to-remember rules.
The rules were easy to remember.
The Atlanta-to-Houston flight
The flight from Atlanta to Houston
Here’s an example that’s always bothered me. Why do radio stations claim they are “new”? Or that they are the “new #1 hit music station”? I know stations that have literally been around for four or five years that still use that same damn slogan. But the grammarian in me thinks, Well, if you hyphenate the words properly, the statement could still be technically correct…
A new-#1-hit music station is a music station that plays new #1 hits. (Presumably they play other things, too, seeing as how there is only one #1 hit at a given time.)
A new, number-1, hit-music station is a radio station that plays hit music, is new, and is the number-1 station in town.
If you can point to a single radio marketing employee in the entire United States who knows the difference between the two or could punctuate either one of those sentences properly, I have a newborn unicorn and a shiny pot of gold to give you.
- Adverbial modifiers (adverb–adjective combinations) should not be hyphenated, except in common cases where the word “well” or “better” or “best” precedes its noun:
A poorly defined mechanism
A well-defined mechanism
A mechanism that is well defined
A relatively unknown protein
A better-known protein
A protein that is better known
The most easily recognizable member of the group [adverb–adverb–adjective!]
The best-known member of the group
He is best known for his accomplishments in…
The well-known chef opened a new restaurant.
The new restaurant’s chef is well known.
- When numbers like the following are written out in words, they must be hyphenated (the word “and” is also verboten!):
Twenty-three
One hundred sixty-four
Three thousand four hundred ninety-nine
- Any usage of the prefixes “self” or “ex” must be hyphenated:
self-confidence
self-appointed
self-diagnose
ex-wife
- When a prefix ends with the same letter that the base word begins with, or the non-hyphenated compound word is likely to be read wrong for another reason, it’s usually best to hyphenate the compound word:
anti-inflammatory instead of antiinflammatory (though proinflammatory is okay)
pro-oncogenic instead of prooncogenic (though antioncogenic is okay)
de-ice instead of deice, which is never okay
re-sign instead of resign
- Dangling hyphens are good to use when multiple prefixes precede the same base noun but you don’t want to repeat the base noun every time, or when you want to attach a single prefix to multiple base nouns but don’t want to repeat the prefix every time.
macro- and microeconomics
pre- and post-WWII
Ron Paul’s anti-Obama, -McCain, -Clinton, and -Bush campaign
I particularly liked this example from the EDline editors mailing list, cited by Wikipedia:
…a large number of adjectives…were used to describe [ships protected by iron or steel armor]: iron- or steel- or armor-plated, -cased, -clothed, -sided, and many others….
That Wikipedia article on the hyphen is quite good and long, so I recommend reading it for a fuller description and explanation of its uses.
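The rule above for written-out numbers (hyphenate the tens-and-ones pair, never insert “and”) is mechanical enough to express in code. Here is a minimal Python sketch under that reading. The function name `number_to_words` is hypothetical, and the 0 to 9,999 range is an arbitrary limit chosen to keep the example short.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 9999 (hypothetical helper)."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only handles 0 through 9999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        # The only hyphen goes between the tens word and the ones word.
        return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        head = f"{ONES[hundreds]} hundred"
        # Note: no "and" between the hundreds and the remainder.
        return head if rest == 0 else f"{head} {number_to_words(rest)}"
    thousands, rest = divmod(n, 1000)
    head = f"{ONES[thousands]} thousand"
    return head if rest == 0 else f"{head} {number_to_words(rest)}"


for n in (23, 164, 3499):
    print(number_to_words(n).capitalize())
```

Run on 23, 164, and 3499, the loop reproduces the three example numbers listed in the rule.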
b. The function of the En dash
The En dash (–, HTML character code &amp;ndash; or &amp;#8211;) is a connector and a comparer. It’s the neglected middle child between the hyphen and the Em dash. (The Em dash is what we normally think of when we think of a dash, —.) So named because it is roughly the width of the letter n, the En dash is used to signify a range, to connect things, and to replace the hyphen in compound adjectives of more than two words (see part c). Wikipedia has a really good article on dashes.
- In the more traditional, first two uses, the En dash replaces the word “to” or “and”:
allowed to grow for 2–3 days, not 2-3 days, because you mean “2 to 3”
dose–response curve, not dose-response curve, because it is a “dose-and-response” curve
cell–cell communication, not cell-cell, because you mean cell-to-cell
Atlanta–Houston flight, not Atlanta-Houston flight, because it is an Atlanta-to-Houston flight
President Jimmy Carter (1977–1981)
the Supreme Court’s 5–4 decision
a score of 31–27
- Lastly, the minus sign looks identical to the En dash on most computers and word processors, so it is best to write negative numbers with an En dash instead of a hyphen:
–80°C
a change of –20%
It is important to note that if you’re giving a range of negative numbers, or at any other time the En dash might confuse the reader, you can’t use it to specify a range:
–15 to –20 (negative fifteen to negative twenty)
–15 to 20 (negative fifteen to twenty)
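The range and negative-number conventions in this subsection can be sketched in a few lines of Python. This is just an illustration of the rules as stated here, with hypothetical function names: an En dash joins the endpoints of a range, an En dash stands in for the minus sign, and the word “to” replaces the range dash whenever an endpoint is negative.

```python
EN_DASH = "\u2013"  # the En dash character


def format_number(n):
    """Write a negative number with an En dash instead of a hyphen."""
    return f"{EN_DASH}{abs(n)}" if n < 0 else str(n)


def format_range(low, high):
    """Format a numeric range (hypothetical helper for illustration)."""
    if low < 0 or high < 0:
        # A range dash next to a minus sign is confusing, so spell out "to".
        return f"{format_number(low)} to {format_number(high)}"
    return f"{low}{EN_DASH}{high}"


print(format_range(2, 3) + " days")
print(format_range(-15, -20))
print(format_range(-15, 20))
```

The three calls print the forms used in the examples above: an En dash range for 2 and 3, and “to” for both ranges that involve a negative endpoint.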
c. Compound modifiers that require an En dash to perform the function of a hyphen
- The following types of constructions require an En dash and cannot use a hyphen because one side or another of the compound adjective has two words (or already contains a hyphen):
embryonic stem cell–focused research [With a hyphen, it would be conveying the idea of the noun clause "cell-focused research" being modified by "embryonic stem," which doesn't really make sense.]
germ cell–derived
high pressure–sensitive component
mouse organelle–specific component (it is specific to mouse organelles, not a mouse component that is specific to organelles)
- Contrast the first example with a re-writing that uses ESC as an abbreviation for embryonic stem cell. It has the same meaning but requires a hyphen, not an En dash, because the compound modifier is only two words:
ESC-derived, not ESC–derived
I’m not 100% positive that this is right, but I think it’s best to use a hyphen instead of an En dash in these types of phrases where the full term and its parenthetical abbreviation (or a clarifying word in parentheses) precede a hyphen, if and only if they are both one word:
MitoTracker (MT)-stained cells
tyrosine (Tyr)-phosphorylated proteins
vehicle (DMSO)-treated rats
I don’t think it would be wrong to use an En dash instead in those examples.
- If your phrasing calls for a dangling hyphen or En dash, simply use them the way you normally would if they weren’t dangling:
the high pressure– and heat-sensitive gauge, not the high-pressure- and heat-sensitive gauge
the angiotensinogen II– and VEGF-induced effects
- There’s also the classic example of Bart Simpson asking the following to Mrs. Krabappel:
How would I go about creating a half-man–half-monkey–type creature? Or: How would I go about creating a half-man, half-monkey–type creature?
Either way uses an En dash before “type.”
The take-home message: The En connects, the Em separates, and the hyphen compounds.
4. Don’t use “literally” when you literally mean “figuratively”
This is one of my biggest pet peeves in the world because it is so ridiculous. Why do people say “literally” when they mean its exact opposite, “figuratively”? Here are some examples I’ve encountered that I can remember:
“My boyfriend literally hit the ceiling.”
“We have literally created a monster.”
“This is literally a tortoise–hare situation.” (Michael Wilbon, PTI, December 14, 2009, on the late-surging Chargers posing a potential threat to the 13–0 Colts, should they meet in the playoffs.)
People, please learn what the word “literally” means, and don’t use it foolishly.
5. Singular/plural noun–verb agreement
When you are conveying the idea that a plural noun is equal to another, singular noun, the verb is going to disagree with one of them. That’s okay. Observe:
One possibility not previously explored is mutations in two different genes.
It might be better to re-word some sentences like that, but it doesn’t make them incorrect.
6. Tricky comma uses in complex sentences
When “and” precedes a dependent or independent clause that begins with an introductory phrase that itself is normally offset by commas, people have a hard time knowing where to put commas and how many to use. The best thing to remember is that when an independent clause is preceded by one of the seven coordinating conjunctions, a comma should precede the conjunction, but a comma shouldn’t precede a dependent clause (unless there are other reasons to put a comma there). Observe the following correct sentences:
In general, the mice produced 10 to 12 pups, and to avoid the influence of litter size on this phenotype, we only used dams whose litter size was 10 pups.